hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC2018: https://wp.me/p4pxJf-k1
<github>
[hpx] K-ballo opened pull request #3382: Fix usage of HPX_CAPTURE together with default value capture [=] (master...fix-hpx-capture-default) https://git.io/fNmp8
jaafar has quit [Ping timeout: 244 seconds]
diehlpk has joined #ste||ar
hkaiser has quit [Quit: bye]
K-ballo has quit [Quit: K-ballo]
nanashi55 has quit [Ping timeout: 264 seconds]
nanashi55 has joined #ste||ar
diehlpk has quit [Ping timeout: 240 seconds]
quaz0r has quit [Ping timeout: 248 seconds]
quaz0r has joined #ste||ar
jaafar has joined #ste||ar
jaafar has quit [Ping timeout: 264 seconds]
nikunj has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
<biddisco>
I just removed all memory order arguments on the assumption it defaults to sequential
<hkaiser>
right, it does
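For reference, a minimal sketch of what "defaults to sequential" means for std::atomic: when no memory order argument is passed, load() and store() use std::memory_order_seq_cst. The cancel_flag type below is a hypothetical stand-in for illustration, not the cancellation token from the HPX sources.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a cancellation flag; NOT the actual HPX class.
struct cancel_flag
{
    std::atomic<bool> cancelled{false};

    // No memory order argument: store() defaults to std::memory_order_seq_cst.
    void cancel() { cancelled.store(true); }

    // Same for load(): the default is std::memory_order_seq_cst.
    bool was_cancelled() const { return cancelled.load(); }

    // Explicitly relaxed variant, shown only for comparison.
    bool was_cancelled_relaxed() const
    {
        return cancelled.load(std::memory_order_relaxed);
    }
};
```

Dropping the explicit arguments can only strengthen the ordering, so if the crash persists afterwards, the memory order is unlikely to be the cause.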
<hkaiser>
biddisco: well, then the logic in cancellation token is wrong
<biddisco>
ok
<biddisco>
it's not code I've ever looked at until now
<hkaiser>
looks sane to me :/
<hkaiser>
ans the relaxed, that is
<hkaiser>
sans*
<biddisco>
it's the destructor that is segfaulting. this means the lambdas are doing something fishy
<biddisco>
I'm wondering if clang is optimizing something strangely
<biddisco>
debug mode is ok
<hkaiser>
yah
<hkaiser>
biddisco: try adding 'tok' as an explicit capture to the second lambda as well
<biddisco>
I think I tried that already earlier
<hkaiser>
ok
<biddisco>
let me try again
<hkaiser>
K-ballo: is destruction of shared_ptr thread safe?
<K-ballo>
only for that instance
<hkaiser>
I seem to remember that some operations were not
<K-ballo>
that's not what I meant
<hkaiser>
K-ballo: for that data instance or that shared_ptr instance
<K-ballo>
destroying the shared_ptr while using that some instance is bad
<K-ballo>
destroying the shared_ptr while some other instance is pointing to the same shared data is fine
<K-ballo>
some -> same
<hkaiser>
ok, then we should be fine
<K-ballo>
yes, we should
<hkaiser>
each lambda holds a copy
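A small sketch of the rule described above, assuming a hypothetical shared state (a plain atomic flag, not the real token): each thread gets its own shared_ptr instance pointing to the same data, and destroying those distinct instances concurrently is fine because the reference count in the control block is updated atomically.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Returns the use_count seen by the owner after both threads have finished.
long destroy_copies_concurrently()
{
    // Hypothetical shared state standing in for the token's data.
    auto state = std::make_shared<std::atomic<bool>>(false);

    // Each thread receives its own shared_ptr instance (passed by value),
    // uses it, and then releases it inside the thread.
    auto worker = [](std::shared_ptr<std::atomic<bool>> copy) {
        copy->store(true);
        copy.reset();    // drop this instance's reference
    };

    std::thread t1(worker, state);
    std::thread t2(worker, state);
    t1.join();
    t2.join();

    // Only the original instance is left.
    return state.use_count();
}
```

What would not be safe is two threads touching the same shared_ptr object (for example one resetting it while another copies it) without synchronization.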
<K-ballo>
somehow the control block is getting corrupted though
<biddisco>
did not fix it using tok, first, last, count instead of =
<hkaiser>
biddisco: nod
<hkaiser>
biddisco: another test would be to replace the shared_ptr in the token by a plain pointer and let the memory leak
<biddisco>
ok
<biddisco>
hkaiser: you win $10
<biddisco>
using a flat pointer makes the segfault go away
<hkaiser>
interesting
<hkaiser>
so something messes up the lifetime of the shared_ptr
<hkaiser>
biddisco: just to be on the safe side and to rule out std::shared_ptr being the problem on that platform - could you replace it with a boost::shared_ptr, pls?
<biddisco>
ok
<biddisco>
hkaiser: no boost::make_shared in my 1.67 any more
<biddisco>
hold on
<hkaiser>
use shared_ptr(new T()) instead
<biddisco>
wrong #include
<biddisco>
crashes with boost::make_shared and boost::shared_ptr. they probably #ifdef use std anyway
<K-ballo>
no, they wouldn't
<K-ballo>
they ship a number of features not in std
<hkaiser>
ok, so it's something with our code
<hkaiser>
surprise
<hkaiser>
:/
<zao>
I would not be surprised if Boost didn't give any hoots about PowerPC or other "alternative platforms".
<zao>
(long-running grudge from my side, of course)
<K-ballo>
we are corrupting the capture somehow
<biddisco>
does the cancellation token need a copy ctor?
<hkaiser>
the default one should be fine
<hkaiser>
K-ballo: the control block is not in the capture
<K-ballo>
that's calling the control block: __release_shared (this=0x17)
<hkaiser>
nod
<K-ballo>
uh, line 4504 is
<hkaiser>
so this pointer is somehow overwritten
<hkaiser>
biddisco: another idea is to replace the shared_ptr with an intrusive one, but that is more involved
<biddisco>
we need to understand why this fails before making random fixes really.
<biddisco>
(IMHO)
<hkaiser>
I'm still not sure if this is really our problem
<hkaiser>
biddisco: but I agree
<biddisco>
might be a compiler problem
<biddisco>
but how to be sure
<hkaiser>
with a plain pointer all is well (at least it doesn't expose the same behavior), that's the reason why an intrusive pointer might be good as well, as it's one pointer compared to two pointers in shared_ptr
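The "one pointer compared to two" remark can be checked with sizeof. This is only a sketch: token_state is a hypothetical placeholder type, and the two-pointer layout of std::shared_ptr (object pointer plus control block pointer) is typical of common implementations rather than guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical placeholder for whatever the token points to.
struct token_state
{
    bool cancelled = false;
};

// A raw (or intrusive-style) pointer is a single pointer wide.
constexpr std::size_t raw_size = sizeof(token_state*);

// std::shared_ptr usually stores two pointers: the object and the control
// block. That layout is implementation-defined.
constexpr std::size_t shared_size = sizeof(std::shared_ptr<token_state>);
```

A stray write that clobbers the second word of the capture would hit the control block pointer of a shared_ptr but miss a one-word pointer entirely, which fits the observation that the plain pointer hides the crash.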
<hkaiser>
assumptions
<K-ballo>
could you try a data breakpoint, to see what writes to that location?
<biddisco>
which location do you want
<K-ballo>
the one corresponding to the captured shared pointer
<hkaiser>
the second pointer in the shared_ptr that is captured in the second lambda
<biddisco>
ok. I'll try
<biddisco>
might not have it in my relwithdebinfo build
<hkaiser>
K-ballo: those lambdas could have been moved...
<hkaiser>
that would change the address for the data breakpoint
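To illustrate why a moved lambda invalidates a data breakpoint: the captured shared_ptr lives inside the closure object, so move-constructing the closure puts the capture at a new address. A minimal sketch with hypothetical names (tok here is a plain shared_ptr<int>, not the real token):

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns true if the captured shared_ptr sits at a different address
// after the closure has been moved into a new object.
bool capture_address_changes_on_move()
{
    auto tok = std::make_shared<int>(0);    // hypothetical captured state

    // The lambda reports the address of its own captured copy of tok.
    auto original = [tok] { return static_cast<const void*>(&tok); };
    const void* before = original();

    // Move-constructing creates a second closure object; its capture is a
    // distinct member at a distinct address.
    auto moved = std::move(original);
    const void* after = moved();

    return before != after;
}
```

A watchpoint set on the capture inside the original closure would therefore keep watching the moved-from object, not the one that is eventually destroyed.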
<biddisco>
hmmm
<biddisco>
value has been optimized out
<biddisco>
K-ballo: when I use debug mode, the bug does not happen