hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<heller__> this does two things: 1) it improves the performance of the pool itself by avoiding a modulo operation and 2) it achieves a better distribution for the thread data pointers
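A minimal sketch of the multiplicative Fibonacci hashing idea mentioned above, assuming a power-of-two pool size. The names `pool_bits`, `pool_size`, `golden` and `fibhash_index` are hypothetical and are not taken from the HPX sources; the actual fibhash branch may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pool size: 2^7 = 128 slots. A power of two lets a shift
// replace the modulo operation.
constexpr unsigned pool_bits = 7;
constexpr std::size_t pool_size = std::size_t(1) << pool_bits;
static_assert(pool_bits > 0 && pool_bits < 64, "shift must stay in range");

// 2^64 divided by the golden ratio, rounded to an odd integer.
constexpr std::uint64_t golden = 11400714819323198485ull;

// Map an address to a pool slot. The multiplication mixes the low
// (alignment-dominated) bits of the pointer into the high bits, and the
// shift keeps only the top pool_bits bits as the index.
inline std::size_t fibhash_index(void const* p)
{
    auto const key =
        static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p));
    auto const idx =
        static_cast<std::size_t>((key * golden) >> (64 - pool_bits));
    assert(idx < pool_size);    // holds by construction of the shift
    return idx;
}
```

The index could then select one lock from a fixed array, for example `locks[fibhash_index(ptr)]`, where `locks` is an assumed array of `pool_size` spinlocks.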
<hkaiser> nice
Anushi1998 has joined #ste||ar
<hkaiser> speeds up things by what? 5%?
<hkaiser> nice
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed fibhash from ca6f5ec to fb442be: https://github.com/STEllAR-GROUP/hpx/commits/fibhash
<ste||ar-github> hpx/fibhash fb442be Thomas Heller: Improving spinlook pool by using a multiplicative fibonacci based hash...
ste||ar-github has left #ste||ar [#ste||ar]
<heller__> yeah, between 4 and 5 percent for 1 and 2 cores
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed fibhash from fb442be to bb57609: https://github.com/STEllAR-GROUP/hpx/commits/fibhash
<ste||ar-github> hpx/fibhash bb57609 Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
ste||ar-github has left #ste||ar [#ste||ar]
<heller__> so the other cool thing is that this is a build without thread descriptions
<heller__> hkaiser: so, there is one idea that has been spinning around in my head for the last two days: What if we don't steal suspended threads anymore? This would have a few advantages: direct support for thread_local, and rescheduling suspended threads comes without any synchronization overhead
<hkaiser> heller__: what about load balancing?
<heller__> in that scheme, we only balance load for tasks that have not been scheduled yet
<hkaiser> ahh
<hkaiser> sure, worth trying
<heller__> very interesting. applying the same to the lcos::local::spinlock_pool actually makes the benchmark slower again
<hkaiser> heh
eschnett_ has joined #ste||ar
diehlpk has joined #ste||ar
Anushi1998 has quit [Ping timeout: 244 seconds]
diehlpk has quit [Ping timeout: 276 seconds]
jaafar has joined #ste||ar
eschnett_ has quit [Quit: eschnett_]
hkaiser has quit [Quit: bye]
quaz0r has joined #ste||ar
Vir has quit [Ping timeout: 264 seconds]
V|r has joined #ste||ar
david_pfander has joined #ste||ar
jaafar has quit [Ping timeout: 260 seconds]
_diers_ has quit [Quit: _diers_]
mdiers_ has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco force-pushed cuda_cmake_doc from 66cb967 to f318fa3: https://github.com/STEllAR-GROUP/hpx/commits/cuda_cmake_doc
<ste||ar-github> hpx/cuda_cmake_doc f318fa3 John Biddiscombe: Note that cuda support requires cmake 3.9, enforce it in CMake
ste||ar-github has left #ste||ar [#ste||ar]
mcopik has quit [Ping timeout: 268 seconds]
mcopik has joined #ste||ar
mcopik has quit [Ping timeout: 276 seconds]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco force-pushed cuda_cmake_doc from f318fa3 to c8d6aa0: https://github.com/STEllAR-GROUP/hpx/commits/cuda_cmake_doc
<ste||ar-github> hpx/cuda_cmake_doc c8d6aa0 John Biddiscombe: Note that cuda support requires cmake 3.9, enforce it in CMake
ste||ar-github has left #ste||ar [#ste||ar]
jbjnr_ has quit [Read error: Connection reset by peer]
mcopik has joined #ste||ar
heller__ has quit [Remote host closed the connection]
jbjnr has joined #ste||ar
hkaiser has joined #ste||ar
heller_ has joined #ste||ar
<heller_> hkaiser: howdy
<hkaiser> heyho
<heller_> hkaiser: 7 to 10% now
<hkaiser> nice
<heller_> getting there ;)
<heller_> all just constant overheads as it seems
<hkaiser> nod
<hkaiser> those benchmarks need a factor of 10 to draw even with the equivalent plain Python code
<heller_> ugh, ok
<heller_> that's insane
<hkaiser> ;-)
<hkaiser> I already improved this by a factor of 60
<hkaiser> so all of the low hanging fruit is probably eaten
<hkaiser> but chipping off the odd 10 or more percent is helpful in any case
<heller_> ok
<heller_> I would like to see the impact on the nightly tester
<hkaiser> so thanks for doing it! much appreciated!
<hkaiser> (didn't mean to discourage you)
<heller_> you should know how I work by now :P
<hkaiser> ;-)
<heller_> hkaiser: where do I find the python equivalent, btw?
<hkaiser> nightly tester: needs to go through a PR
<hkaiser> are your changes HPX or Phylanx?
<hkaiser> heller_: ^^
<heller_> hkaiser: HPX only
<hkaiser> k
<hkaiser> then it has to be on hpx master... I'll talk to Kevin today to see if we can set up something more flexible
<heller_> ok
<heller_> gonna take a walk ... not sleeping enough isn't cool ;)
<hkaiser> heller_: can you show me your changes?
<heller_> sure, one sec
<heller_> hkaiser: on the fibhash branch
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed fibhash from bb57609 to 136b30f: https://github.com/STEllAR-GROUP/hpx/commits/fibhash
<ste||ar-github> hpx/fibhash 9923bd8 Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
<ste||ar-github> hpx/fibhash 05237b0 Thomas Heller: Getting rid of modulo operation
<ste||ar-github> hpx/fibhash 29b000b Thomas Heller: Switching pool to non lco version
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> hkaiser: the thread_specific_ptr really only added overheads due to an extra indirection
<heller_> hkaiser: I think this is a leftover from before thread_local, where you could only store PODs
<hkaiser> heller_: yes
<hkaiser> heller_: I tried to make this very same change - that will not work on all platforms
<hkaiser> Windows does not allow exporting thread_local variables
<hkaiser> so this will have to be platform specific
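One common way to sidestep the export restriction discussed above is to export an accessor function instead of the `thread_local` variable itself. This is only a sketch under that assumption: `SKETCH_EXPORT`, `current_worker_id` and `bump_worker_id` are hypothetical names, and nothing here is meant to describe what HPX actually does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical export macro; a real project would use its own.
#if defined(_WIN32)
#define SKETCH_EXPORT __declspec(dllexport)
#else
#define SKETCH_EXPORT
#endif

// The thread_local object lives inside the function, so only the function
// symbol crosses the library boundary. Each thread that calls the accessor
// gets a reference to its own copy, with no extra heap indirection.
SKETCH_EXPORT std::size_t& current_worker_id()
{
    thread_local std::size_t id = 0;
    return id;
}

// Small helper showing read-modify-write through the accessor.
inline std::size_t bump_worker_id()
{
    return ++current_worker_id();
}
```

The trade-off is a function call per access on platforms where a directly exported variable would have been possible, which is one reason such a change may end up platform specific.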
<heller_> hkaiser: ok, let's fix this then
<heller_> I'll isolate the changes
<hkaiser> thanks
<hkaiser> also, the spinlock used for gid_type is an OS mutex now?
<heller_> yes
<hkaiser> interesting
<heller_> less overhead ... if there is contention, we don't want to go back to the scheduling loop
<hkaiser> but this way it gives up the whole timeslice
<heller_> yes
<heller_> there is also the threads::get_self check
mdiers_ has quit [Remote host closed the connection]
<heller_> I'll check it again
<hkaiser> heller_: right, get_self could be the culprit
aserio has joined #ste||ar
<heller_> which could be better after the changes to our TLS variables
<hkaiser> we could try to just assert(get_self() != nullptr)
mdiers_ has joined #ste||ar
<heller_> no
<hkaiser> right, tcp pp
<heller_> and bootstrapping
<heller_> what we could do for the TCP parcelport is to not start up an io service thread, but poll explicitly in background_work
<hkaiser> that's orthogonal...
<hkaiser> this also loses the reactive style of the pp
<heller_> yes and no
<heller_> the PP itself will remain the same, it's just not driven by the OS, but by us
<hkaiser> k
<hkaiser> I'd keep this change separate, though
<heller_> it is not orthogonal in the sense that with this change, even the TCP write and read handlers would be executed inside of an HPX thread
<heller_> of course
eschnett has joined #ste||ar
<hkaiser> gtg now
<hkaiser> ttyl
<heller_> ok
hkaiser has quit [Quit: bye]
<jbjnr> what are you trying to fix?
<jbjnr> (just curious)
mdiers_ has quit [Ping timeout: 252 seconds]
mdiers_ has joined #ste||ar
jaafar has joined #ste||ar
aserio1 has joined #ste||ar
hkaiser has joined #ste||ar
aserio has quit [Ping timeout: 276 seconds]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell created remove_thread_specific_ptr (+1 new commit): https://github.com/STEllAR-GROUP/hpx/commit/0633dc0cf40e
<ste||ar-github> hpx/remove_thread_specific_ptr 0633dc0 Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
ste||ar-github has left #ste||ar [#ste||ar]
aserio1 has quit [Ping timeout: 260 seconds]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed remove_thread_specific_ptr from 0633dc0 to 26ed46b: https://github.com/STEllAR-GROUP/hpx/commits/remove_thread_specific_ptr
<ste||ar-github> hpx/remove_thread_specific_ptr 26ed46b Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
ste||ar-github has left #ste||ar [#ste||ar]
<mbremer> hkaiser: are you ready?
<hkaiser> mbremer: skype?
<mbremer> yup
nikunj has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed remove_thread_specific_ptr from 26ed46b to 4286305: https://github.com/STEllAR-GROUP/hpx/commits/remove_thread_specific_ptr
<ste||ar-github> hpx/remove_thread_specific_ptr 4286305 Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> hkaiser: alright, this patch should deal with TLS
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell opened pull request #3498: Remove thread specific ptr (master...remove_thread_specific_ptr) https://github.com/STEllAR-GROUP/hpx/pull/3498
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed fibhash from 136b30f to e29168d: https://github.com/STEllAR-GROUP/hpx/commits/fibhash
<ste||ar-github> hpx/fibhash 214507e Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
<ste||ar-github> hpx/fibhash e29168d Thomas Heller: Getting rid of modulo operation
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell opened pull request #3499: Improving spinlock pools (master...fibhash) https://github.com/STEllAR-GROUP/hpx/pull/3499
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> hkaiser: PRs submitted, please have a look at the TLS stuff and check whether that branch still compiles and links for you
<hkaiser> heller_: will try
<heller_> hkaiser: the fibhash branch still has lcos::local::spinlock_pool
<hkaiser> k
<heller_> trying different scenarios right now
<heller_> will let you know
<hkaiser> heller_: will have time over the WE only, though
<heller_> ok
<heller_> I'd like to have those in the release. I think they are low risk changes
<hkaiser> heller_: you'd have to convince simbergm
<heller_> yeah
<simbergm> hkaiser (IRC): heller_ (IRC) probably not too hard to convince me ;)
<simbergm> I'll have a look at the pr as well
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell pushed 1 new commit to fibhash: https://github.com/STEllAR-GROUP/hpx/commit/c9c13645ad5f30fed15bf8ad72cad16d5a3680d9
<ste||ar-github> hpx/fibhash c9c1364 Thomas Heller: Switching pool to non lco version for better latency
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> simbergm: two: #3498 and #3499
<heller_> simbergm: we might want to look at the future overhead tests as well
david_pfander has quit [Ping timeout: 276 seconds]
<simbergm> heller_ (IRC): sure, thanks!
<zao> == building and installing MPI/GCC/7.3.0-2.30/OpenMPI/3.1.1/HPX/1.2.0-rc1-g38ecfb0ec6c-cxx14...
<zao> == COMPLETED: Installation ended successfully
<zao> I am disappointed in you people, this is way too stable and boring :D
<heller_> wait until my commits get in to break stuff :P
jaafar_ has joined #ste||ar
jaafar has quit [Ping timeout: 264 seconds]
jaafar_ is now known as jaafar
aserio has joined #ste||ar
mcopik has quit [Ping timeout: 276 seconds]
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
parsa[[w]] has joined #ste||ar
mcopik has joined #ste||ar
parsa[w] has quit [Ping timeout: 276 seconds]
hkaiser has joined #ste||ar
parsa[[w]] has quit [Read error: Connection reset by peer]
parsa[w] has joined #ste||ar
aserio has quit [Quit: aserio]
mcopik has quit [Ping timeout: 252 seconds]
mcopik has joined #ste||ar
hkaiser has quit [Ping timeout: 260 seconds]
hkaiser has joined #ste||ar