K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
K-ballo has quit [Quit: K-ballo]
diehlpk_work has quit [Remote host closed the connection]
<gnikunj[m]>
so it's because of trying to access a shared state that doesn't exist anymore
<gnikunj[m]>
so far I could only reproduce this error on rostam (with any compiler), not on my laptop, so things may vary for you too
<ms[m]>
gnikunj: interesting... your code looks sane, but it's a good test if that indeed does fail every now and then
<ms[m]>
the line in the backend that you say segfaults is definitely an odd place for it to segfault :/
<ms[m]>
I'll try to find some time to reproduce it
<ms[m]>
release or debug?
<gnikunj[m]>
both
<ms[m]>
👍️
<ms[m]>
fails pretty quickly or does it usually take a long time?
<gnikunj[m]>
it fails right on the spot
<ms[m]>
deterministically?
<ms[m]>
first time around?
<gnikunj[m]>
the test shouldn't take more than a few ms to execute anyway
<gnikunj[m]>
no, it's not deterministic. Sometimes it executes to completion, sometimes it segfaults
<ms[m]>
ok, thanks!
<gnikunj[m]>
that Kokkos::Cuda::finalize() error was linked to code similar to the above
<gnikunj[m]>
so if we solve this, things should work for my resilience library too
<ms[m]>
ok, sounds good
<ms[m]>
no promises on when I can have a look but I'll try to do it soon
<ms[m]>
zao: since you seem to have followed this freenode drama a bit more closely (and I realized only later that this is on freenode): would you recommend we move over to libera.chat? I get the impression that's the only sane thing to do
<gnikunj[m]>
thanks!
<zao>
ms[m]: Long-term, it feels like the reasonable thing to do. One may want to wait a bit for the dust to settle.
<ms[m]>
sounds good, thanks
K-ballo has joined #ste||ar
hkaiser has joined #ste||ar
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
nanmiao has joined #ste||ar
<pedro_barbosa[m]>
is there a way to print specific fields from the get_cuda_info function in HPXCL?
<pedro_barbosa[m]>
for example, the function returns the name of the device, memory size, cache and whatnot, and I want to print only the name of the device
<rachitt_shah[m]>
ms, hkaiser: can I create a Google Drive folder to consolidate PDFs/notes?
<rachitt_shah[m]>
This would be alongside the wiki, but would help me out with personal tracking. I'll add everyone to the folder.
<ms[m]>
rachitt_shah: yeah, that's ok
<gnikunj[m]>
hkaiser: could you point me to the timepoint code? I'll add CUDA asm code and use that instead.
<hkaiser>
pedro_barbosa[m]: not sure I understand your problem, care to elaborate?