00:00
<
heller__ >
this does two things: 1) it improves the performance of the pool itself by avoiding a modulo operation and 2) it achieves a better distribution for the thread data pointers
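For readers unfamiliar with the technique, the following is a minimal sketch of multiplicative (Fibonacci) hashing, not the code from the fibhash branch. It assumes a 64-bit platform and a power-of-two pool size; `pool_bits`, `fibhash_index`, `modulo_index` and `distinct_slots` are hypothetical names chosen for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed pool size: a power of two, so the slot index can be taken from the
// top bits of the product instead of computing `hash % pool_size`.
constexpr unsigned pool_bits = 7;    // 2^7 = 128 slots (hypothetical)
constexpr std::size_t pool_size = std::size_t(1) << pool_bits;

// 2^64 divided by the golden ratio, as a 64-bit odd integer.
constexpr std::uint64_t golden_ratio_64 = 0x9E3779B97F4A7C15ull;

// Fibonacci hashing: one multiplication and one shift, no division.
inline std::size_t fibhash_index(std::uintptr_t addr) noexcept
{
    auto const key = static_cast<std::uint64_t>(addr);
    return static_cast<std::size_t>(
        (key * golden_ratio_64) >> (64 - pool_bits));
}

// Plain modulo mapping, shown only for comparison.
inline std::size_t modulo_index(std::uintptr_t addr) noexcept
{
    return static_cast<std::size_t>(addr % pool_size);
}

// Count how many different slots are hit by `count` addresses that are
// `stride` bytes apart (a stand-in for aligned thread data pointers).
template <typename IndexFn>
std::size_t distinct_slots(
    IndexFn index_of, std::uintptr_t stride, std::size_t count)
{
    std::bitset<pool_size> used;
    for (std::size_t i = 0; i != count; ++i)
    {
        std::size_t const slot =
            index_of(static_cast<std::uintptr_t>(i) * stride);
        assert(slot < pool_size);
        used.set(slot);
    }
    return used.count();
}
```

With addresses spaced 64 bytes apart, `modulo_index` can only ever reach slots 0 and 64 in this setup, while `fibhash_index` spreads the same addresses over many more slots. That is the kind of distribution effect the message refers to.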
00:01
Anushi1998 has joined #ste||ar
00:02
<
hkaiser >
speeds up things by what? 5%?
00:03
ste||ar-github has joined #ste||ar
00:03
<
ste||ar-github >
hpx/fibhash fb442be Thomas Heller: Improving spinlook pool by using a multiplicative fibonacci based hash...
00:03
ste||ar-github has left #ste||ar [#ste||ar]
00:04
<
heller__ >
yeah, between 4 and 5 percent for 1 and 2 cores
00:14
ste||ar-github has joined #ste||ar
00:14
<
ste||ar-github >
hpx/fibhash bb57609 Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
00:14
ste||ar-github has left #ste||ar [#ste||ar]
00:19
<
heller__ >
so the other cool thing is that this is a build without thread descriptions
00:25
<
heller__ >
hkaiser: so, there is one idea that has been spinning around in my head for the last two days: what if we don't steal suspended threads anymore? This would have a few advantages: direct support of thread_local, and rescheduling suspended threads comes without any synchronization overhead
00:25
<
hkaiser >
heller__: what about load balancing?
00:26
<
heller__ >
in that scheme, we only balance load for tasks that have not been scheduled yet
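The proposed scheme can be pictured with a small model. This is a sketch under stated assumptions, not HPX's scheduler: `worker_queues`, `add_new`, `add_resumed` and `steal_from` are hypothetical names, and the model simply assumes that the list of resumed tasks is only ever touched by the owning worker.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

using task = std::function<void()>;

struct worker_queues
{
    std::mutex pending_mtx;
    std::deque<task> pending;     // never started: other workers may steal
    std::vector<task> resumed;    // previously suspended: owner only (assumed)

    void add_new(task t)
    {
        std::lock_guard<std::mutex> lock(pending_mtx);
        pending.push_back(std::move(t));
    }

    // No lock here: in this model only the owning worker calls this.
    void add_resumed(task t) { resumed.push_back(std::move(t)); }

    // Load balancing looks only at the victim's not-yet-started tasks.
    bool steal_from(worker_queues& victim)
    {
        task stolen;
        {
            std::lock_guard<std::mutex> lock(victim.pending_mtx);
            if (victim.pending.empty())
                return false;
            stolen = std::move(victim.pending.back());
            victim.pending.pop_back();
        }
        add_new(std::move(stolen));
        return true;
    }
};

// Small demonstration: two new tasks and one resumed task on the victim.
std::size_t demo_stolen_count()
{
    worker_queues victim, thief;
    victim.add_new([] {});
    victim.add_new([] {});
    victim.add_resumed([] {});

    std::size_t stolen = 0;
    while (thief.steal_from(victim))
        ++stolen;

    assert(victim.resumed.size() == 1);    // the resumed task stayed put
    return stolen;
}
```

Because a task that has started never changes workers in this model, its `thread_local` state stays valid across suspension, which is the first advantage named above.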
00:26
<
hkaiser >
sure, worth trying
00:27
<
heller__ >
very interesting. applying the same to the lcos::local::spinlock_pool actually makes the benchmark slower again
01:27
eschnett_ has joined #ste||ar
02:26
diehlpk has joined #ste||ar
02:45
Anushi1998 has quit [Ping timeout: 244 seconds]
03:19
diehlpk has quit [Ping timeout: 276 seconds]
03:20
jaafar has joined #ste||ar
03:20
eschnett_ has quit [Quit: eschnett_]
03:25
hkaiser has quit [Quit: bye]
05:00
quaz0r has joined #ste||ar
05:56
Vir has quit [Ping timeout: 264 seconds]
05:56
V|r has joined #ste||ar
06:50
david_pfander has joined #ste||ar
07:03
jaafar has quit [Ping timeout: 260 seconds]
07:15
_diers_ has quit [Quit: _diers_]
07:35
mdiers_ has joined #ste||ar
07:52
ste||ar-github has joined #ste||ar
07:52
<
ste||ar-github >
hpx/cuda_cmake_doc f318fa3 John Biddiscombe: Note that cuda support requires cmake 3.9, enforce it in CMake
07:52
ste||ar-github has left #ste||ar [#ste||ar]
08:22
mcopik has quit [Ping timeout: 268 seconds]
09:28
mcopik has joined #ste||ar
09:58
mcopik has quit [Ping timeout: 276 seconds]
10:52
ste||ar-github has joined #ste||ar
10:52
<
ste||ar-github >
hpx/cuda_cmake_doc c8d6aa0 John Biddiscombe: Note that cuda support requires cmake 3.9, enforce it in CMake
10:52
ste||ar-github has left #ste||ar [#ste||ar]
11:02
jbjnr_ has quit [Read error: Connection reset by peer]
11:33
mcopik has joined #ste||ar
12:11
heller__ has quit [Remote host closed the connection]
12:33
jbjnr has joined #ste||ar
12:35
hkaiser has joined #ste||ar
12:41
heller_ has joined #ste||ar
12:41
<
heller_ >
hkaiser: howdy
12:41
<
heller_ >
hkaiser: 7 to 10% now
12:41
<
heller_ >
getting there ;)
12:42
<
heller_ >
all just constant overheads, it seems
12:42
<
hkaiser >
those benchmarks need a factor of 10 to draw even with the equivalent plain Python code
12:43
<
heller_ >
that's insane
12:44
<
hkaiser >
I already improved this by a factor of 60
12:44
<
hkaiser >
so all of the low hanging fruit is probably eaten
12:45
<
hkaiser >
but chipping off the odd 10 or more percent is helpful in any case
12:48
<
heller_ >
I would like to see the impact on the nightly tester
12:48
<
hkaiser >
so thanks for doing it! much appreciated!
12:48
<
hkaiser >
(didn't mean to discourage you)
12:48
<
heller_ >
you should know how I work by now :P
12:52
<
heller_ >
hkaiser: where do I find the python equivalent, btw?
12:54
<
hkaiser >
nightly tester: needs to go through a PR
12:54
<
hkaiser >
are your changes HPX or Phylanx?
12:54
<
hkaiser >
heller_: ^^
12:56
<
heller_ >
hkaiser: HPX only
12:56
<
hkaiser >
then it has to be on hpx master... I'll talk to Kevin today to see if we can set up something more flexible
12:58
<
heller_ >
gonna take a walk ... not sleeping enough isn't cool ;)
12:59
<
hkaiser >
heller_: can you show me your changes?
13:12
<
heller_ >
sure, one sec
13:13
<
heller_ >
hkaiser: on the fibhash branch
13:13
ste||ar-github has joined #ste||ar
13:13
<
ste||ar-github >
hpx/fibhash 9923bd8 Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
13:13
<
ste||ar-github >
hpx/fibhash 05237b0 Thomas Heller: Getting rid of modulo operation
13:13
<
ste||ar-github >
hpx/fibhash 29b000b Thomas Heller: Switching pool to non lco version
13:13
ste||ar-github has left #ste||ar [#ste||ar]
13:14
<
heller_ >
hkaiser: the thread_specific_ptr really only added overheads due to an extra indirection
13:28
<
heller_ >
hkaiser: I think this is a leftover from before thread_local, where you could only store PODs
13:34
<
hkaiser >
heller_: yes
13:38
<
hkaiser >
heller_: I tried to make this very same change - that will not work on all platforms
13:39
<
hkaiser >
windows does not allow for exporting thread_local variables
13:39
<
hkaiser >
so this will have to be platform specific
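The indirection being discussed can be illustrated with standard C++ alone. The sketch below is not HPX code: `worker_state`, `state_via_pointer`, `state_direct` and `bump_on_new_thread` are hypothetical names. It also keeps each thread-local variable inside a function, so that only the accessor would need to be exported from a shared library. That is one common way around the export restriction mentioned above, offered here as an assumption rather than as what the patch does.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical per-thread state with a non-trivial constructor, i.e. not a POD.
struct worker_state
{
    worker_state() : counter(0) {}
    int counter;
};

// Pointer-style TLS: the thread-local slot holds only a pointer, so every
// access loads the pointer first and then dereferences it.
inline worker_state& state_via_pointer()
{
    thread_local std::unique_ptr<worker_state> ptr;
    if (!ptr)
        ptr.reset(new worker_state());
    return *ptr;
}

// Direct thread_local object: no separate allocation, no extra indirection.
inline worker_state& state_direct()
{
    thread_local worker_state state;
    return state;
}

// Each new thread starts with its own, freshly constructed state.
int bump_on_new_thread(int times)
{
    int result = -1;
    std::thread t([&result, times] {
        for (int i = 0; i != times; ++i)
            ++state_direct().counter;
        result = state_direct().counter;
    });
    t.join();
    assert(result == times);
    return result;
}
```

Calling `bump_on_new_thread(3)` leaves the caller's own `state_direct().counter` untouched, since the increment happens on the new thread's copy.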
13:39
<
heller_ >
hkaiser: ok, let's fix this then
13:39
<
heller_ >
I'll isolate the changes
13:40
<
hkaiser >
also, the spinlock used for gid_type is an OS mutex now?
13:40
<
hkaiser >
interesting
13:40
<
heller_ >
less overhead ... if there is contention, we don't want to go back to the scheduling loop
13:41
<
hkaiser >
but this way it gives up the whole timeslice
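The trade-off in this exchange can be sketched with standard C++ primitives. This is an illustration only, not the HPX spinlock: `yielding_spinlock` and `count_with` are hypothetical names, and `std::this_thread::yield()` merely stands in for handing control back to a scheduling loop on contention. A `std::mutex` instead lets the operating system block the waiting thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Test-and-set lock that yields while the lock is held by someone else.
class yielding_spinlock
{
    std::atomic<bool> locked_{false};

public:
    void lock() noexcept
    {
        while (locked_.exchange(true, std::memory_order_acquire))
            std::this_thread::yield();    // stand-in for "back to the scheduler"
    }

    bool try_lock() noexcept
    {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept
    {
        locked_.store(false, std::memory_order_release);
    }
};

// Increment a shared counter from several threads under the given lock type.
template <typename Mutex>
long count_with(int threads, int iterations)
{
    assert(threads > 0 && iterations >= 0);
    Mutex m;
    long total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t != threads; ++t)
    {
        pool.emplace_back([&m, &total, iterations] {
            for (int i = 0; i != iterations; ++i)
            {
                std::lock_guard<Mutex> guard(m);
                ++total;
            }
        });
    }
    for (auto& th : pool)
        th.join();
    return total;
}
```

Both `count_with<yielding_spinlock>(4, 1000)` and `count_with<std::mutex>(4, 1000)` produce the same total; which one is faster depends on contention and on how long the critical section is, which is the point under debate here.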
13:48
<
heller_ >
there is also the threads::get_self check
13:48
mdiers_ has quit [Remote host closed the connection]
13:49
<
heller_ >
I'll check it again
13:50
<
hkaiser >
heller_: right, get_self could be the culprit
13:50
aserio has joined #ste||ar
13:50
<
heller_ >
which could be better after the changes to our TLS variables
13:50
<
hkaiser >
we could try to just assert(get_self() != nullptr)
13:50
mdiers_ has joined #ste||ar
13:51
<
hkaiser >
right, tcp pp
13:51
<
heller_ >
and bootstrapping
13:52
<
heller_ >
what we could do for the TCP parcelport is to not start up an io service thread, but poll explicitly in background_work
13:52
<
hkaiser >
that's orthogonal...
13:52
<
hkaiser >
this also loses the reactive style of the pp
13:52
<
heller_ >
yes and no
13:53
<
heller_ >
the PP itself will remain the same, it's just not driven by the OS, but by us
13:53
<
hkaiser >
I'd keep this change separate, though
13:53
<
heller_ >
it is not orthogonal in the sense that with this change, even the TCP write and read handlers would be executed inside of an HPX thread
13:53
<
heller_ >
of course
13:53
eschnett has joined #ste||ar
13:54
hkaiser has quit [Quit: bye]
13:54
<
jbjnr >
what are you trying to fix?
13:54
<
jbjnr >
(just curious)
13:57
mdiers_ has quit [Ping timeout: 252 seconds]
14:04
mdiers_ has joined #ste||ar
14:22
jaafar has joined #ste||ar
14:48
aserio1 has joined #ste||ar
14:49
hkaiser has joined #ste||ar
14:50
aserio has quit [Ping timeout: 276 seconds]
14:51
ste||ar-github has joined #ste||ar
14:51
<
ste||ar-github >
hpx/remove_thread_specific_ptr 0633dc0 Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
14:51
ste||ar-github has left #ste||ar [#ste||ar]
14:52
aserio1 has quit [Ping timeout: 260 seconds]
15:02
ste||ar-github has joined #ste||ar
15:02
<
ste||ar-github >
hpx/remove_thread_specific_ptr 26ed46b Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
15:02
ste||ar-github has left #ste||ar [#ste||ar]
15:03
<
mbremer >
hkaiser: are you ready?
15:03
<
hkaiser >
mbremer: skype?
15:15
nikunj has joined #ste||ar
15:22
ste||ar-github has joined #ste||ar
15:22
<
ste||ar-github >
hpx/remove_thread_specific_ptr 4286305 Thomas Heller: Use HPX_NATIVE_TLS instead of hpx::util::thread_specific_ptr...
15:22
ste||ar-github has left #ste||ar [#ste||ar]
15:24
<
heller_ >
hkaiser: alright, this patch should deal with TLS
15:24
ste||ar-github has joined #ste||ar
15:24
ste||ar-github has left #ste||ar [#ste||ar]
15:26
ste||ar-github has joined #ste||ar
15:26
<
ste||ar-github >
hpx/fibhash 214507e Thomas Heller: Improving spinlook pools by using a multiplicative fibonacci based hash...
15:26
<
ste||ar-github >
hpx/fibhash e29168d Thomas Heller: Getting rid of modulo operation
15:26
ste||ar-github has left #ste||ar [#ste||ar]
15:29
ste||ar-github has joined #ste||ar
15:29
ste||ar-github has left #ste||ar [#ste||ar]
15:29
<
heller_ >
hkaiser: PRs submitted, please have a look at the TLS stuff and check whether that branch still compiles and links for you
15:39
<
hkaiser >
heller_: will try
15:40
<
heller_ >
hkaiser: the fibhash branch still has lcos::local::spinlock_pool
15:40
<
heller_ >
trying different scenarios right now
15:40
<
heller_ >
will let you know
15:40
<
hkaiser >
heller_: will have time over the WE only, though
15:40
<
heller_ >
I'd like to have those in the release. I think they are low risk changes
15:45
<
hkaiser >
heller_: you'd have to convince simbergm
15:47
<
simbergm >
hkaiser (IRC): heller_ (IRC) probably not too hard to convince me ;)
15:47
<
simbergm >
I'll have a look at the pr as well
15:48
ste||ar-github has joined #ste||ar
15:48
<
ste||ar-github >
hpx/fibhash c9c1364 Thomas Heller: Switching pool to non lco version for better latency
15:48
ste||ar-github has left #ste||ar [#ste||ar]
15:48
<
heller_ >
simbergm: two: #3498 and #3499
15:49
<
heller_ >
simbergm: we might want to look at the future overhead tests as well
15:50
david_pfander has quit [Ping timeout: 276 seconds]
15:50
<
simbergm >
heller_ (IRC): sure, thanks!
15:54
<
zao >
== building and installing MPI/GCC/7.3.0-2.30/OpenMPI/3.1.1/HPX/1.2.0-rc1-g38ecfb0ec6c-cxx14...
15:54
<
zao >
== COMPLETED: Installation ended successfully
15:54
<
zao >
I am disappointed in you people, this is way too stable and boring :D
15:54
<
heller_ >
wait until my commits get in to break stuff :P
16:14
jaafar_ has joined #ste||ar
16:16
jaafar has quit [Ping timeout: 264 seconds]
16:47
jaafar_ is now known as jaafar
18:02
aserio has joined #ste||ar
18:37
mcopik has quit [Ping timeout: 276 seconds]
20:01
hkaiser has quit [Quit: bye]
20:25
eschnett has quit [Quit: eschnett]
20:30
parsa[[w]] has joined #ste||ar
20:33
mcopik has joined #ste||ar
20:34
parsa[w] has quit [Ping timeout: 276 seconds]
21:32
hkaiser has joined #ste||ar
21:39
parsa[[w]] has quit [Read error: Connection reset by peer]
21:42
parsa[w] has joined #ste||ar
21:54
aserio has quit [Quit: aserio]
22:12
mcopik has quit [Ping timeout: 252 seconds]
22:29
mcopik has joined #ste||ar
23:05
hkaiser has quit [Ping timeout: 260 seconds]
23:26
hkaiser has joined #ste||ar