aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
david_pfander has quit [Ping timeout: 255 seconds]
eschnett has joined #ste||ar
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
EverYoun_ has quit [Ping timeout: 255 seconds]
quaz0r has joined #ste||ar
parsa has joined #ste||ar
aserio has joined #ste||ar
<wash>
I am not at sc jbjnr
parsa has quit [Quit: Zzzzzzzzzzzz]
<K-ballo>
hkaiser, aserio: found my social security card
<hkaiser>
K-ballo: ;)
<hkaiser>
good
aserio has quit [Ping timeout: 248 seconds]
parsa has joined #ste||ar
parsa has quit [Read error: Connection reset by peer]
parsa has joined #ste||ar
parsa has quit [Client Quit]
aserio has joined #ste||ar
<aserio>
K-ballo: Yay!
aserio has quit [Client Quit]
aserio has joined #ste||ar
aserio has quit [Client Quit]
hkaiser has quit [Ping timeout: 248 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 250 seconds]
hkaiser has joined #ste||ar
hkaiser has quit [Client Quit]
hkaiser has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
gedaj has quit [Quit: leaving]
EverYoung has joined #ste||ar
pree has joined #ste||ar
pree has quit [Remote host closed the connection]
pree has joined #ste||ar
jaafar has quit [Ping timeout: 248 seconds]
jbjnr has quit [Remote host closed the connection]
parsa has joined #ste||ar
jbjnr has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
<msimberg>
heller, jbjnr: yt? could i ask for some advice/rubber ducking?
<jbjnr>
fire away
<jbjnr>
(not familiar with the phrase rubber ducking, but I'm willing to take a chance)
<msimberg>
i'm getting segfaults with asan where i didn't use to get them
<msimberg>
you're my rubber duck :) trying to figure out the obvious thing that i've missed
<msimberg>
i've tried a clean rebuild, tried a different version of gcc, and i know where the segfaults start (and the changes in that commit are relevant for this)
<msimberg>
but i don't know what i've changed because that commit is before i started
<msimberg>
it's not mixing gcc/clang this time...
<jbjnr>
well the obvious thing to do is disable stack overflow checks in hpx - then see if the segfault goes away I guess
<msimberg>
okay, thanks, that's a good start
<jbjnr>
HPX_HAVE_THREAD_STACKOVERFLOW_DETECTION being the possible culprit
<jbjnr>
HPX_WITH_STACKOVERFLOW_DETECTION:BOOL=OFF
<jbjnr>
or rather HPX_WITH_THREAD_STACKOVERFLOW_DETECTION:BOOL=OFF
<jbjnr>
the first one is left over crud I think and I should purge my cmake cache
<jbjnr>
hmmm. they both seem valid.
<msimberg>
i have both as well
<msimberg>
jbjnr: that worked, thanks!
<msimberg>
still a bit confused as i'm pretty sure i never changed that, but maybe i can't trust my memory anymore...
<msimberg>
will continue like this and see if i hit any other problems
pree has quit [Ping timeout: 260 seconds]
<heller>
msimberg: the problem here is that both asan and the stack overflow detection are setting a signal handler
<heller>
msimberg: there seems to be a race or something in there
<msimberg>
heller: thanks, sounds reasonable
<msimberg>
will put this on my list of things to look into eventually
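A minimal sketch of the conflict heller describes, assuming plain POSIX sigaction: a process keeps only one handler per signal, so whichever of two components installs its SIGSEGV handler last displaces the other. The handler names and helper functions below are hypothetical stand-ins for illustration, not the actual ASan or HPX code.

```cpp
#include <signal.h>
#include <cassert>

extern "C" {
typedef void (*handler_t)(int);

// Hypothetical stand-ins for two components that both want SIGSEGV.
void first_handler(int) {}
void second_handler(int) {}
}

// Install `handler` for SIGSEGV and return the handler that was active before.
handler_t install_segv_handler(handler_t handler)
{
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);
    (void) rc;
    return old.sa_handler;
}

// Query the currently installed SIGSEGV handler without changing it.
handler_t current_segv_handler()
{
    struct sigaction cur = {};
    sigaction(SIGSEGV, nullptr, &cur);
    return cur.sa_handler;
}

// Returns true if the second installation silently replaced the first one.
bool second_install_displaces_first()
{
    // Remember the original disposition so the demo leaves no trace.
    struct sigaction saved = {};
    sigaction(SIGSEGV, nullptr, &saved);

    install_segv_handler(first_handler);
    handler_t displaced = install_segv_handler(second_handler);
    bool result = displaced == &first_handler &&
        current_segv_handler() == &second_handler;

    sigaction(SIGSEGV, &saved, nullptr);
    return result;
}
```

A component that wants to coexist with another handler would have to keep the value returned by install_segv_handler and forward to it, which is where ordering problems between two independent installers can creep in.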
<msimberg>
now that I have it working again I'm seeing some new leaks on master (I'll try to fix them, first the old one though...)
<msimberg>
could we (I) add an asan (and other sanitizers if possible) build and run of (at least) hello world to the circleci builds? or buildbot?
<heller>
msimberg: Yes, please
<heller>
msimberg: we still need a more scalable testing infrastructure though
<heller>
so that we can use the machines at FAU and CSCS and LSU
<heller>
msimberg: which asan errors are you seeing?
<heller>
msimberg: /home/inf3/heller/programming/hpx/hpx/runtime/components/server/managed_component_base.hpp:294:23: runtime error: member access within misaligned address 0x7f424da417bc for type 'managed_component<hpx::lcos::detail::promise_lco<unsigned long, unsigned long>, hpx::components::detail::this_type>', which requires 8 byte alignment
<heller>
?
<msimberg>
heller: yeah, thought so, will wait until there's more news on the ci stuff from cscs
<msimberg>
right now there are two hwloc related leaks
<msimberg>
so not that one
<heller>
oh
<heller>
on master?
<heller>
bad
<heller>
the one above is with ubsan and not asan
<msimberg>
ah no, old master
<heller>
ok, you should check on newest master ;)
<msimberg>
on a branch, haven't rebased on latest master
<msimberg>
you've fixed that?
<msimberg>
will check again with latest master
<msimberg>
and haven't tried ubsan lately
<heller>
no, i haven't fixed anything there
<heller>
but you never know ...
<msimberg>
will check in any case, thanks!
parsa has quit [Ping timeout: 255 seconds]
K-ballo has joined #ste||ar
pree has joined #ste||ar
david_pfander has joined #ste||ar
<github>
[hpx] sithhell created fix_wrapper_heap (+1 new commit): https://git.io/vFXCM
<github>
hpx/fix_wrapper_heap 075dfaf Thomas Heller: Fixing wrapper_heap alignment problems...
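A minimal sketch of the kind of alignment problem in the ubsan report heller pasted earlier, where the address 0x7f424da417bc is 4 modulo 8 but the type requires 8 byte alignment. This is not the wrapper_heap code or the contents of the fix_wrapper_heap commit; the element type, the header size, and the helper names are assumptions made for illustration. It shows how rounding offsets up to the element's alignment keeps every slot carved out of a raw buffer properly aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `offset` up to the next multiple of `alignment` (must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

// True if `p` is suitably aligned for an object that needs `alignment` bytes.
inline bool is_aligned(void const* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical element type that requires 8 byte alignment.
struct alignas(8) element
{
    unsigned long value;
};

// Address of slot `index` in a raw buffer that starts with `header_size`
// bytes of bookkeeping. `base` itself is assumed to be aligned for `element`.
inline void* slot_address(
    unsigned char* base, std::size_t index, std::size_t header_size)
{
    std::size_t const first = align_up(header_size, alignof(element));
    std::size_t const stride = align_up(sizeof(element), alignof(element));
    return base + first + index * stride;
}
```

Without the align_up on the header size, a 4 byte header would place the first element at base + 4, which is exactly the "4 modulo 8" situation the sanitizer complains about.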
<jbjnr>
probably not enough detail, but might help the discussion
<hkaiser>
jbjnr: thanks!
<hkaiser>
very helpful
aserio has joined #ste||ar
EverYoung has joined #ste||ar
<heller>
hkaiser: regarding your race, I discovered that the wrapper_heap changes introduce UB due to producing unaligned addresses (see #3007). Could this be related to your issues with data flow?
parsa has joined #ste||ar
<heller>
Also, I think it would be worth exploring if the problem still exists when using when_all(...).then instead of dataflow, since that still uses the old traversal
<K-ballo>
it does?
<heller>
Yes
aserio has quit [Quit: aserio]
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser>
heller: the problem is definitely related to the new async traversal
EverYoun_ has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
jaafar has joined #ste||ar
hkaiser has quit [Quit: bye]
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
david_pfander has joined #ste||ar
parsa has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
EverYoun_ has quit [Ping timeout: 240 seconds]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
eschnett has quit [Ping timeout: 248 seconds]
eschnett has joined #ste||ar
pree has quit [Quit: Bye dudes]
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
david_pfander has quit [Ping timeout: 240 seconds]
<github>
[hpx] K-ballo created remove-get_full_machine_mask (+1 new commit): https://git.io/vF1LY