hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 264 seconds]
hkaiser_ has quit [Quit: Bye!]
diehlpk has joined #ste||ar
diehlpk has quit [Quit: Leaving.]
K-ballo has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
Yorlik_ is now known as Yorlik
hkaiser has joined #ste||ar
hkaiser has quit [Ping timeout: 260 seconds]
hkaiser has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 252 seconds]
K-ballo1 is now known as K-ballo
HHN93 has joined #ste||ar
HHN93 has quit [Ping timeout: 260 seconds]
parsa[fn] has quit [Ping timeout: 260 seconds]
parsa[fn] has joined #ste||ar
HHN93 has joined #ste||ar
HHN93 has quit [Ping timeout: 260 seconds]
tufei has quit [Quit: Leaving]
tufei has joined #ste||ar
prakhar42 has joined #ste||ar
HHN93 has joined #ste||ar
<HHN93>
for the CI/CD tests on github commits, if one of the tests fails, does it mask other failures too?
<HHN93>
or are all tests guaranteed to run?
<hkaiser>
it shouldn't
<hkaiser>
depends on the error
<HHN93>
what's wrong with our test suite though?
<hkaiser>
if a test fails to compile, it stops; if all tests compile, all will be run
<HHN93>
is it only timeout errors from segmented algorithms?
<hkaiser>
yah, those are known
<hkaiser>
not your fault
<HHN93>
any idea why they occur? is it some bug in our code?
<HHN93>
or bug in the test suite
<HHN93>
or bug in the distributed setup of rostam?
<hkaiser>
HHN93: we don't know yet - nobody has investigated. I think pansysk75[m] plans to have a look
<HHN93>
oh ok, I was thinking about having a look too. Wanted to know if there's somewhere I could start
<hkaiser>
not yet - I'm not sure if we should implement those
<HHN93>
I guess we could have par counterparts if the user guarantees the fold op is commutative
<HHN93>
using reduction
<hkaiser>
the only argument I have would be to provide implementations for users that can't rely on C++20/23
<hkaiser>
well, C++23 only
<hkaiser>
HHN93: not only
<HHN93>
associative*
<hkaiser>
I think the spec implies sequential execution, but I'm not 100% sure
<HHN93>
we can have an overload which accepts an exPolicy parameter
<HHN93>
I believe there was a talk where we claimed switching std:: to hpx:: should work, so implementing it seems to make sense
<hkaiser>
ok
<hkaiser>
HHN93: I'm not sure, however, how you could enforce commutativity
<hkaiser>
I also think we would require associativity as well
<HHN93>
we need the user to guarantee associativity
<HHN93>
my idea was to do reduction
<hkaiser>
we do have reduce()
<HHN93>
I am arguing that fold ops should be added to HPX just to maintain consistency with std:: - I believe that was our goal
<hkaiser>
nod, fair point
<HHN93>
adding an overload to run fold in par is an idea I have; personally I am not very sure about it either, as users might miss the fact that the op must be associative
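(As a sketch of the idea discussed above - not existing HPX API - a hypothetical `fold_left` overload taking an execution policy could simply forward to the `hpx::reduce` HPX already ships; the associativity precondition then stays on the caller, exactly as it does with `std::reduce`:)

```cpp
// Hypothetical sketch: a fold_left overload accepting an execution policy
// that delegates to the existing hpx::reduce. hpx::reduce is real HPX API;
// this fold_left overload is not - it is the proposal being discussed.
// Only valid when the caller guarantees `op` is associative, since a
// parallel reduction may regroup (and, with par, reorder) operands.
#include <hpx/hpx_main.hpp>    // lets a plain main() run inside the HPX runtime
#include <hpx/execution.hpp>
#include <hpx/numeric.hpp>     // hpx::reduce

#include <cstdint>
#include <functional>
#include <iostream>
#include <numeric>
#include <utility>
#include <vector>

template <typename ExPolicy, typename FwdIter, typename T, typename Op>
T fold_left(ExPolicy&& policy, FwdIter first, FwdIter last, T init, Op op)
{
    // delegate to the parallel reduction HPX provides today
    return hpx::reduce(std::forward<ExPolicy>(policy), first, last,
        std::move(init), std::move(op));
}

int main()
{
    std::vector<std::uint64_t> v(1000);
    std::iota(v.begin(), v.end(), 1);

    // std::plus is associative and commutative, so the parallel path is safe
    std::uint64_t sum = fold_left(
        hpx::execution::par, v.begin(), v.end(), std::uint64_t(0), std::plus<>{});

    std::cout << sum << '\n';    // 500500
    return 0;
}
```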
<HHN93>
I will try to look into whether there are more such bugs in other algorithms too and try to push fixes before the next release
<hkaiser>
HHN93: for 6216: didn't you say you wanted to create a test that verifies the fix?
<HHN93>
I am not able to come up with a single test to verify it; the random generator tests all lengths from 1 to 8, so it should work
<hkaiser>
why not?
prakhar42 has quit [Quit: Client closed]
<HHN93>
the bug depends on the number of threads
<hkaiser>
we can control the number of threads, can't we?
<HHN93>
I wasn't aware we could control the number of threads for tests, but I am not sure how it matters. We would like to test that the bug doesn't occur for any number of threads, right?
<hkaiser>
we would like to test that the problem is fixed at least for one case that we know failed before
<HHN93>
there is also quite a lot of nondeterministic behaviour with the bug; there were also out-of-bounds memory accesses, which meant you could end up with the correct answer even if the bug exists. Testing over a large number of situations seems like the best option
<hkaiser>
I don't disagree
<hkaiser>
what I'm saying is that we have certain cases that fail, why not prove that those are fixed?
<HHN93>
`we would like to test that the problem is fixed at least for one case that we know failed before`
<HHN93>
ok, I will add the known test; I agree it covers a lot of cases. If there is anything it doesn't, the randomised TCs should take care of it
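(For reference, a minimal sketch of what pinning the thread count for such a regression test could look like: the `hpx.os_threads` config entry, `hpx::init_params`, and the HPX testing macros are existing HPX API, while the checked computation below is only a placeholder - the actual failing input from 6216 isn't given here:)

```cpp
// Sketch of a regression test that forces the exact number of OS threads
// known to have reproduced the original failure. The computation inside
// hpx_main is a placeholder; a real test would run the algorithm and
// input from the bug report instead.
#include <hpx/init.hpp>
#include <hpx/modules/testing.hpp>

#include <numeric>
#include <vector>

int hpx_main(int, char**)
{
    // placeholder for the actual failing case from the bug report
    std::vector<int> v(8);
    std::iota(v.begin(), v.end(), 1);
    int sum = std::accumulate(v.begin(), v.end(), 0);

    HPX_TEST_EQ(sum, 36);

    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // force the configuration that used to fail, e.g. exactly 4 worker threads
    hpx::init_params init_args;
    init_args.cfg = {"hpx.os_threads=4"};

    HPX_TEST_EQ(hpx::init(hpx_main, argc, argv, init_args), 0);
    return hpx::util::report_errors();
}
```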
<HHN93>
do we intend to expand the data structures we provide to our users? or are we planning to work on libCDS integration instead?
<hkaiser>
as long as the CDS relies on lock-free operations, you can use any existing code with hpx - no need to integrate
<hkaiser>
for instance from tbb
<hkaiser>
the problem is that some CDS require interaction with the threading system (hazard pointers, for instance) - those need to be integrated into HPX's threading
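(To illustrate the distinction: a purely lock-free container - `boost::lockfree::queue` is used below as an example, it is not an HPX facility - relies only on atomics, so HPX tasks can share it without any integration work; hazard-pointer based structures, by contrast, need per-thread state hooked into HPX's scheduler, hence the integration effort:)

```cpp
// Sketch: HPX worker threads pushing into and popping from a shared
// lock-free queue with no HPX-specific integration, since the container
// never blocks or touches thread-local runtime state.
#include <hpx/future.hpp>      // hpx::async, hpx::future, hpx::wait_all
#include <hpx/hpx_main.hpp>

#include <boost/lockfree/queue.hpp>

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    // reserve enough nodes up front for all 400 items pushed below
    boost::lockfree::queue<int> queue(512);

    // producers: HPX tasks pushing into the shared lock-free queue
    std::vector<hpx::future<void>> producers;
    for (int t = 0; t != 4; ++t)
    {
        producers.push_back(hpx::async([&queue, t] {
            for (int i = 0; i != 100; ++i)
            {
                while (!queue.push(t * 100 + i))
                    ;    // retry if the node pool is momentarily exhausted
            }
        }));
    }
    hpx::wait_all(producers);

    // drain and count on the main (HPX) thread
    std::size_t count = 0;
    int value = 0;
    while (queue.pop(value))
        ++count;

    std::cout << count << " items\n";    // 400 items
    return 0;
}
```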