K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<hkaiser>
could this be a problem with the test itself?
<hkaiser>
ms[m]: yt?
<gonidelis[m]>
hkaiser: still awake?
<hkaiser>
already awake ;-)
<gonidelis[m]>
hkaiser: wow... not even dawn there yet
<hkaiser>
4am
<ms[m]>
hkaiser: here
<ms[m]>
and good morning :)
<hkaiser>
hey, g'morning
<hkaiser>
wrt #4622 (sanitizers)
<hkaiser>
I enabled the sanitizer support for VS and there the leak is not reproduced
<hkaiser>
so I would suggest to simply suppress this leak from being reported, even more as it's a non-issue - it refers to some memory related to perf counters not being freed
<hkaiser>
those are created once and live until the end of the program anyways
<hkaiser>
ms[m]: ^^
<ms[m]>
hkaiser: ok, sounds fair
<ms[m]>
thanks for looking into it!
<ms[m]>
are you adding the suppression? or would you like me to do it?
<hkaiser>
I will try doing it
<hkaiser>
ms[m]: hmmm, I need the unmangled name
<hkaiser>
the logs show the mangled names only - can you help?
<hkaiser>
or v.v.
<ms[m]>
hkaiser: sure, I'll have a look
<hkaiser>
I have added the unmangled name for now - let's see if it works
<gnikunj[m]>
hkaiser: you're up early ;)
<gonidelis[m]>
gnikunj[m]: he is on a roll. not even lack of caffeine could stop him
<gnikunj[m]>
it seems like it :)
<gnikunj[m]>
hkaiser: strange. That test does not fail for me on loni.
<gnikunj[m]>
changing the probability of failure to 95% (to trigger the catch statement) also works just fine for me.
<gonidelis[m]>
hkaiser: ... it just compiled ... i can't describe my enthusiasm right now.
<gnikunj[m]>
gonidelis[m]: congrats!
<gonidelis[m]>
gnikunj[m]: thanks ;D !
<zao>
It's the Year of our Lord 2021 and people still mail whole lists to ask to be removed from them instead of using the mailman interface. :D
<gonidelis[m]>
wth was that? double account or sth ;p
<gnikunj[m]>
zao: ikr! I didn't think someone could email to get themselves removed from the list
<gonidelis[m]>
ahh... why doesn't reduce have a ranges counterpart?
<gonidelis[m]>
I actually cannot find any standards-oriented doc for either ranges::reduce or ranges::accumulate
<gonidelis[m]>
k-ballo: what's the catch?
<gonidelis[m]>
(they have not made it to the standard maybe --__-- ?)
<gonidelis[m]>
although we do have an impl somehow
<gnikunj[m]>
hahahaha! Do people not know that there's an easy way to remove themselves from the mailinglist. Shrug
<gonidelis[m]>
k-ballo: `std::size_t(42)` ?
<gnikunj[m]>
hkaiser: could you rerun the failing test? I'm still unable to reproduce it on a cluster.
<hkaiser>
gnikunj[m]: it happens with gcc 9 only, I believe
<gnikunj[m]>
I'm on gcc9.3.0 :/
<hkaiser>
k
<gnikunj[m]>
what should I do?
<hkaiser>
gnikunj[m]: if you need to rerun the tests, easiest is to push to the PR
<gnikunj[m]>
what's strange is that async replicate one does not fail while replay fails. They have the same exact structure.
<hkaiser>
nod
<hkaiser>
it fails on 5 platforms only
<hkaiser>
or 6
<gnikunj[m]>
do we need to set HPX_TEST(true) everywhere?
<gnikunj[m]>
i.e. I see that it's set to true in the catch statement only
<hkaiser>
hpx_test(true) doesn't do anything, never, ever
<gnikunj[m]>
so it's reaching HPX_TEST(false)?
<hkaiser>
yah, it points to line 82, I believe
<hkaiser>
that means that some unexpected exception was thrown
<gnikunj[m]>
yes. Alright, let me check the replay implementation. Otherwise, I'll build gcc 9 and then try to reproduce the error on rostam.
<hkaiser>
gnikunj[m]: ms[m] can probably give you the exect versions
<hkaiser>
exact* even
<gnikunj[m]>
yeah, that will help
<ms[m]>
gnikunj: is there randomness in that test? it's failed with gcc 7.5.0/boost 1.65.0/debug, clang 7(something)/boost/debug, gcc 9/boost 1.72.0/debug+release
<ms[m]>
the first one has generic coroutines+apex+papi enabled, but otherwise those are pretty vanilla configurations
<ms[m]>
I doubt it's compiler/boost specific...
<gnikunj[m]>
ms: I see. The thing is all other tests pass and the actual API is failing.
<gnikunj[m]>
which means there is some random behavior within the API somewhere that leads to it. Let me build with clang one
hkaiser has quit [Quit: bye]
<k-ballo[m]>
gonidelis that's the initial value
<gonidelis[m]>
k-ballo: that's what i am wondering... since we pass just a vector `c` I reckon it should be sth like `c.begin()`. haven't checked yet
<gonidelis[m]>
oh sorry i thought that was a question
<gonidelis[m]>
k-ballo: yeah that makes sense then
<gonidelis[m]>
cool
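Note on the `std::size_t(42)` argument discussed above: a minimal sketch, assuming the call in question has the shape of `std::reduce(first, last, init)`. The function names `reduce_with_init` and `reduce_example` are hypothetical and only used for illustration; the point is that the third argument is the initial value of the reduction, not an iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the elements of c on top of an initial value.
// The third argument of std::reduce is the initial value; its type
// (std::size_t here) is also the type of the result.
inline std::size_t reduce_with_init(std::vector<std::size_t> const& c)
{
    return std::reduce(c.begin(), c.end(), std::size_t(42));
}

// Hypothetical usage: 42 + 1 + 2 + 3 == 48.
inline void reduce_example()
{
    std::vector<std::size_t> c{1, 2, 3};
    assert(reduce_with_init(c) == 48);
}
```

Writing the initial value as `std::size_t(42)` rather than plain `42` keeps the reduction from being carried out in `int`.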
weilewei has joined #ste||ar
akheir has joined #ste||ar
shahrzad has joined #ste||ar
hkaiser has joined #ste||ar
<gonidelis[m]>
hkaiser: yt?
<hkaiser>
here now
<gonidelis[m]>
Since all the sequential calls eventually end up in `std::remove_if`
<hkaiser>
same as we've done for other algorithm implementations
<gonidelis[m]>
hkaiser: cool... that's what i expected. no prob
<gonidelis[m]>
it gets all the more interesting
<gonidelis[m]>
(plus it benefits the perf analysis project)
<gonidelis[m]>
hkaiser: oh one more thing since we've mentioned it. according to http://eel.is/c++draft/alg.remove , on the std overloads should I return the begin or the end iter? (it seems like it does not matter but I need to pick one since we have the same base impl for std and ranges)
<hkaiser>
do what the standard says
nanmiao11 has joined #ste||ar
nanmiao11 has left #ste||ar [#ste||ar]
<hkaiser>
you could let the base implementation return both and pick what you need for the algorithms themselves
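Note on the suggestion above: a rough sketch of a base implementation that returns both iterators, with each public overload picking what it needs. This is not HPX's actual code; `remove_if_base`, `std_style_remove_if` and `ranges_style_remove_if` are hypothetical names, and a plain `std::pair` stands in for the subrange a ranges overload would return. It relies on the standard behavior that `std::remove_if` returns the new logical end, while `std::ranges::remove_if` returns the range from that point to the original end.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical shared base: returns {new logical end, original end}.
template <typename Iter, typename Pred>
std::pair<Iter, Iter> remove_if_base(Iter first, Iter last, Pred pred)
{
    Iter new_end = std::remove_if(first, last, pred);
    return {new_end, last};
}

// std-style overload: keeps only the new logical end.
template <typename Iter, typename Pred>
Iter std_style_remove_if(Iter first, Iter last, Pred pred)
{
    return remove_if_base(first, last, pred).first;
}

// ranges-style overload: exposes both ends of the leftover tail.
template <typename Iter, typename Pred>
std::pair<Iter, Iter> ranges_style_remove_if(Iter first, Iter last, Pred pred)
{
    return remove_if_base(first, last, pred);
}

// Hypothetical usage: remove the even numbers, then erase the tail.
inline void remove_if_example()
{
    std::vector<int> v{1, 2, 3, 4, 5, 6};
    auto is_even = [](int x) { return x % 2 == 0; };
    auto tail = ranges_style_remove_if(v.begin(), v.end(), is_even);
    assert(tail.first - v.begin() == 3);
    assert(tail.second == v.end());
    v.erase(tail.first, tail.second);
    assert((v == std::vector<int>{1, 3, 5}));
}
```

Returning both iterators from the base costs nothing extra, since the original end is already at hand, and it avoids having two near-identical implementations.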
nanmiao11 has joined #ste||ar
<gonidelis[m]>
hkaiser: ok say no more ;)
nanmiao11 has quit [Quit: Connection closed]
nikunj has joined #ste||ar
nikunj has quit [Client Quit]
hkaiser has quit [Quit: bye]
weilewei has quit [Quit: Ping timeout (120 seconds)]
hkaiser has joined #ste||ar
<hkaiser>
gnikunj[m]: thanks for the fix!
<hkaiser>
tests are useful after all
<gnikunj[m]>
hkaiser: you're welcome :) I'm sorry it took this long
<gnikunj[m]>
they definitely are! I wouldn't have thought that the exception was thrown from the vector.at() function
<hkaiser>
well, that's what vector::at does ;-) otherwise you could use vector::operator[]
<gnikunj[m]>
I don't like using vector::operator[]. It is easier to work with an exception than to work with a seg fault :D
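Note on the `vector::at` versus `vector::operator[]` exchange above: a small sketch of the difference. `at_throws` and `at_example` are hypothetical names for illustration. `std::vector::at` checks the index and throws `std::out_of_range` on a bad one, whereas an out-of-range `operator[]` is undefined behavior, so it is deliberately not executed here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if v.at(i) reports an
// out-of-range index by throwing std::out_of_range.
inline bool at_throws(std::vector<int> const& v, std::size_t i)
{
    try
    {
        (void) v.at(i);    // bounds-checked access
    }
    catch (std::out_of_range const&)
    {
        return true;
    }
    return false;
}

// Hypothetical usage with a three-element vector.
inline void at_example()
{
    std::vector<int> v{10, 20, 30};
    assert(!at_throws(v, 2));    // last valid index
    assert(at_throws(v, 3));     // one past the end: at() throws
    // v[3] would be undefined behavior and may not crash at all.
}
```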
<hkaiser>
use the MS compiler, it gives you a nice debug std library that detects these issues
<gnikunj[m]>
I need a tutorial on working with MS VS before I feel comfortable with it. I have your configurations but the IDE seems completely new to me every time I try to shift to it :/
<gnikunj[m]>
maybe I'm just stupid and can't understand how it works
<gnikunj[m]>
let me try checking what the issue could be. It isn't something related to distributed stuff so I can do it locally
diehlpk_work_ has quit [Remote host closed the connection]
diehlpk_work_ has joined #ste||ar
<gnikunj[m]>
hkaiser: it says: Build: 4858-clang-newest (cscs(daint)) on 2021-01-19 16:15:11 on the top. My commit came after. I can't reach that page through the tests list as well. Which one of the jenkins test is it that is failing?
<hkaiser>
yes, some tests fail for unrelated reasons
<gnikunj[m]>
and here I thought I had it figured with the vector::at() exception
<gnikunj[m]>
let me debug it further
bita has joined #ste||ar
<gonidelis[m]>
0.0
<gonidelis[m]>
rostam down?
<gonidelis[m]>
negative
<gonidelis[m]>
there might be some spikes actually ;0
<gnikunj[m]>
hkaiser: stupid me. Changed the if statement for cases when exception occurs but forgot to change the if statement when predicate fails (hence the recurring failing test). I've added that and 2 handy assertions to ensure things work correctly now :)
<hkaiser>
;-)
<gnikunj[m]>
I really hope that I don't need to do anything else now :D
<hkaiser>
gnikunj[m]: thanks for looking into this
<gnikunj[m]>
I'd say no worries, but I'd wait till all the tests pass ;)