K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
bita has quit [Remote host closed the connection]
diehlpk_work has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
<mdiers[m]> <ms[m] "or do you use any particular bui"> https://gist.github.com/m-diers/7681125b87afdcbddb147a020da96af8
<ms[m]> mdiers: ok, thanks
<ms[m]> that's unlikely to make a difference :/
heller1 has quit [Quit: Idle for 30+ days]
<mdiers[m]> ms: Are you having trouble reproducing it?
<ms[m]> mdiers: yep...
<mdiers[m]> Optimized debug build (-g -O2)?
<ms[m]> yep
<mdiers[m]> How many numa-domains?
<ms[m]> just two, so that's different from your setup
<ms[m]> does it reproduce in release mode as well?
<mdiers[m]> How many cores? I can adjust the example accordingly if needed.
<ms[m]> 2×18
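A minimal sketch (not from the gist) of querying the machine layout with hwloc, useful for comparing setups like the 2×18 box above; build with `c++ topo.cpp $(pkg-config --cflags --libs hwloc)`:

    #include <hwloc.h>
    #include <cstdio>

    int main()
    {
        hwloc_topology_t topo;
        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        // count NUMA domains, physical cores, and processing units
        int numa  = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE);
        int cores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
        int pus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);
        std::printf("NUMA domains: %d, cores: %d, PUs: %d\n",
            numa, cores, pus);

        hwloc_topology_destroy(topo);
        return 0;
    }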
<mdiers[m]> <ms[m] "does it reproduce in release mod"> Yes, The other one I have only for debugging.
<ms[m]> let me try that first to see if I can even trigger the segfault
<mdiers[m]> <ms[m] "2×18"> Perfect, on such a system it comes with us also increased to the crash. I will write the adjustment right away.
<mdiers[m]> There is now a second, different main_hpx().
<mdiers[m]> With this you should get 6 thread pools.
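An illustrative sketch of how several thread pools are typically set up via HPX's resource partitioner (the actual code is in the gist above; the pool names and the one-pool-per-NUMA-domain split here are assumptions):

    #include <hpx/hpx_init.hpp>
    #include <hpx/include/resource_partitioner.hpp>

    #include <cstddef>
    #include <string>

    int hpx_main(int, char**)
    {
        // work would be scheduled onto the custom pools here
        return hpx::finalize();
    }

    void setup_pools(hpx::resource::partitioner& rp,
        hpx::program_options::variables_map const&)
    {
        auto const& domains = rp.numa_domains();
        // keep domain 0 in the implicit "default" pool (it must not be
        // empty); give every further NUMA domain its own pool
        for (std::size_t i = 1; i < domains.size(); ++i)
        {
            std::string name = "numa-" + std::to_string(i);
            rp.create_thread_pool(
                name, hpx::resource::scheduling_policy::local_priority_fifo);
            rp.add_resource(domains[i], name);
        }
    }

    int main(int argc, char* argv[])
    {
        hpx::init_params params;
        params.rp_callback = &setup_pools;
        return hpx::init(hpx_main, argc, argv, params);
    }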
<mdiers[m]> ms: Does it work with the changes? Otherwise I need to see how I can improve it. But it is sensitive; it may run through a few times before it crashes.
<rori> <hkaiser "ms, rori: thanks for all the wor"> all ms work ;) thanks ms
<ms[m]> mdiers: still no crashes after a few hundred iterations
<ms[m]> even with the updated hpx_main
<ms[m]> there's probably something else going on
<ms[m]> just to check something else, what compiler and boost versions are you using?
<mdiers[m]> <ms[m] "even with the updated hpx_main"> I will try to make some adjustments to increase the crash frequency.
hkaiser has joined #ste||ar
<mdiers[m]> ms: Damn, the moon phase has changed; now it runs through again.
<ms[m]> mdiers: :D that's great and terrible
<hkaiser> mdiers[m]: did anything else change (except the moon phase)?
<mdiers[m]> ms: hkaiser: unfortunately I can only reproduce it poorly now. Maybe every 20th run it occurs within the first four iterations (with the -g -O2 build).
<mdiers[m]> Need a break.
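For a crash this flaky, a trivial rerun driver helps collect a failing run. A POSIX-only sketch (the reproducer name "./repro" is a placeholder; it relies on std::system returning a wait status):

    #include <cstdio>
    #include <cstdlib>
    #include <sys/wait.h>

    int main()
    {
        for (int run = 1; run <= 100; ++run)
        {
            // rerun the reproducer; stop on the first signal-induced exit
            int status = std::system("./repro");
            if (status != 0 && WIFSIGNALED(status))
            {
                std::printf("run %d died with signal %d\n",
                    run, WTERMSIG(status));
                return 1;
            }
        }
        std::puts("no crash in 100 runs");
        return 0;
    }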
akheir has joined #ste||ar
diehlpk_work has joined #ste||ar
<diehlpk_work> hkaiser, ms[m] RC1 compiled on Fedora
<diehlpk_work> HPX seems to work with gcc 11
<hkaiser> +1
shahrzad has joined #ste||ar
<ms[m]> hkaiser: thanks for looking at the codespell report
<ms[m]> I'll try to set up that builder to test hwloc
akheir1 has joined #ste||ar
akheir has quit [Read error: Connection reset by peer]
<hkaiser> ms[m]: I'm working on it
<hkaiser> (adding a special option for hwloc testing)
<ms[m]> hkaiser: ok, very nice, thank you!
<hkaiser> ms[m]: that actually revealed more problems...
<zao> I wonder what kind of glitch it is we're having on FreeBSD where the process hangs occasionally. Not going to investigate that any time soon tho.
<zao> Just noting for reference that there are more problems unrelated to hwloc on the platform.
<hkaiser> ms[m]: I'll add the option to the lsu clang-9 builder - is that ok?
<ms[m]> hkaiser: yeah, that should work fine
<ms[m]> don't remember which hwloc version that is though, does it matter?
<hkaiser> yes, we need hwloc >= V2
<hkaiser> ms[m]: ^^
<ms[m]> hkaiser: looks like it's 2.2.0, don't think rostam currently has any hwloc < 2
<hkaiser> ok, good
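For reference, hwloc can report both its compile-time and run-time API versions, so a small check along these lines (generic hwloc usage, not HPX code) catches a stray 1.x library:

    #include <hwloc.h>
    #include <cstdio>

    int main()
    {
        // HWLOC_API_VERSION is the headers' version (0x00020000 for 2.0);
        // hwloc_get_api_version() reports the loaded library's version.
        unsigned runtime = hwloc_get_api_version();
        std::printf("compile-time 0x%x, runtime 0x%x\n",
            unsigned(HWLOC_API_VERSION), runtime);
        if ((runtime >> 16) < 2)
        {
            std::fprintf(stderr, "hwloc >= 2 required\n");
            return 1;
        }
        return 0;
    }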
bita has joined #ste||ar
<bita> hkaiser, meeting?
hkaiser_ has joined #ste||ar
hkaiser has quit [Ping timeout: 272 seconds]
diehlpk_work has quit [Ping timeout: 240 seconds]
diehlpk_work has joined #ste||ar