hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<zao> Do these errors on master ring a bell for anyone?
<zao> Seems like I'm on the bottom edge of supported compiler with Clang 3.8.1 vs. documented 3.8.
<zao> Building with my old testing container again, so heaven knows what it contains.
<zao> It ought to have worked in the past.
<K-ballo> is this cuda or something special?
<zao> No, plain CPU.
<K-ballo> looks like something I could have broken
<zao> `cmake ../hpx -DCMAKE_BUILD_TYPE=Debug -DBOOST_ROOT=/opt/boost-1.65.1-cxx14 -DCMAKE_CXX_COMPILER=clang++ -DHPX_WITH_CXX14=ON -G Ninja -DHPX_WITH_EXAMPLES=OFF -DHPX_WITH_DEPRECATION_WARNINGS=OFF`
<zao> Not sure how C++14-capable the Clang and libstdc++ are supposed to be.
<K-ballo> seems like libstdc++ pre 8.1 would be the cause
<zao> I'm gonna go to bed, just wanted to mention this while I remembered it and had the logs around.
<zao> Was actually going to see if I could reproduce my spurious failure of the MPI migrate_component, but didn't get quite that far :)
<K-ballo> i'll have to look into it.. libstdc++ seems to be asking whether it can construct an util::bound from an std::tuple
<K-ballo> the answer would definitely be no, but it's a hard error, a SFINAE-unfriendly one
<K-ballo> one of those bugs with single element tuple constructors
<K-ballo> should be able to reproduce with an older libstdc++, and add a workaround for it
<K-ballo> doesn't look like something my recent changes would have caused
K-ballo has quit [Quit: K-ballo]
eschnett has joined #ste||ar
hkaiser has joined #ste||ar
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
<Yorlik> How threadsafe is std::vector.push_back()? Could 2 threads accidentally write to the same position?
<Yorlik> OK - made a test and got an access violation - so it's not :(
<heller_> Yes, the std containers are not thread safe
<Yorlik> It's even more weird
<Yorlik> I am finding data in the vectors I never wrote to them
<Yorlik> The access violation went away after I did a sufficiently large reserve beforehand
<Yorlik> But the output from the vector did not only give me the jumbled numbers of the thread writes, but also an occasional dash(-) - though it is a vector of ints I was writing to
<Yorlik> output like this after an orderly start: -8421504511-84215045143-842150451-
<Yorlik> code is this for thread 1-8 , just replace the numbers: std::thread t1 ( [ & ] ( ) { for ( int i = 0; i < n; i++ ) { stuff.push_back ( 1 ); } } );
<Yorlik> so each thread should write its number
<Yorlik> I wonder if cout or the console are getting mad at me
<Yorlik> size of the vector is clearly too small - as expected
<heller_> It's not just the memory allocation
<heller_> In your case also the update of the size
<Yorlik> I just don't understand why it's printing dashes
<heller_> So multiple threads might update both the size and the last element at the same time
<heller_> Negative numbers
<Yorlik> The rest makes sense to me
<Yorlik> Owww
<heller_> It's undefined behavior
<Yorlik> That would mean we have partial writes mixed from threads
<heller_> Something like that
<Yorlik> Even a simple int write gets jumbled
<Yorlik> But I see no irregular numbers
<Yorlik> just 1-8 and the dash
<heller_> You push back at the same time. All threads update the size and only after that, they write at the same spot at the same time
<Yorlik> Look at this output: https://imgur.com/a/tOUGfVR
<Yorlik> I wonder how it would manage to get a negative 8 on occasion
<heller_> It's hardly possible to reason about UB
<Yorlik> It's just some weird order in that mess
<zao> Abandon all hope ye who invoke UB.
<Yorlik> lol
<Yorlik> It's really funny
<Yorlik> threads 1-8 just write 1-8
<Yorlik> Dwellers of the UnderDark ....
<zao> Consider separating your output with whitespace, you might have -842 bajillion
<heller_> Or check your cat
<Yorlik> No, that's funny
<zao> heller_: I don't trust that MPI migrate_component functionality btw. I found an eventually-failed run in one of my consoles and I'm somewhat sure it's a good OpenMPI build.
<Yorlik> It gives me a large negative number, but only with digits 1-8
<zao> Haven't gotten around reproducing it, as master's broken on my setup atm :)
<zao> That number is very specific in hex.
<heller_> zao: ok
<heller_> I'll investigate
<Yorlik> This is all so crazy .. Bottom line is I need a threadsafe vector and queue ...
<zao> Yorlik: It's 0xCDCDCDCD.
<Yorlik> Nice find !
<zao> Any sufficiently advanced C/C++ troubleshooter learns to recognize memory patterns by sight :)
<Yorlik> I'm not a sufficiently advanced C++ troubleshooter. Rather a sufficiently advanced C++ troublemaker. :O
<zao> heller_: The only evidence I have is https://gist.github.com/zao/b3d6db770515dc33ee2773f49ebeb194
<zao> It _can_ be one of the broken OpenMPI builds I manufactured, so it could be a false flag.
<Yorlik> Time to learn how to make an efficient threadsafe container. Any hints / pointers? I don't just want to copypaste code - rather a good tutorial if any of you know of one.
<zao> https://github.com/khizmax/libcds mentions a lot of literature.
<Yorlik> zao: Nice find - thanks !
<zao> I can't find the C-only library I've used in the past, I think they outlined a lot of stuff too.
<zao> In general, the less generic you can make your requirements, the higher the chance of a decent data structure existing for you.
<Yorlik> Makes sense
<zao> In particular around the number of consumers/producers, whether your data is POD and movable, etc.
<zao> Storing the data in-line in the buffer vs. storing in buckets or indirectly referenced by atomically sized pointer.
<Yorlik> The data structure I made is based on a vector which is indexed by a map holding keyed indices. It uses a freelist to save space and reuse slots. The items are linked, so I can use them as a queue, which has a target, and the target is in the map. So it's like a vector holding a bunch of queues automagically sorted
<Yorlik> But it's totally not thread safe
<Yorlik> Also using vector indices to link is not overly efficient
<Yorlik> It's a start though
<Yorlik> there are multiple producers and consumers, but only one consumer per list
<Yorlik> However - time to dig some
<heller_> Yorlik: and of course, check his blog
<Yorlik> Will do - thanks!
<heller_> if you tell me your email, I can give you something
hkaiser has joined #ste||ar
K-ballo has joined #ste||ar
<zao> K-ballo: Filed #3731 for the util::bound problem, btw.
<K-ballo> I'm trying an old gcc build now to see if it reproduces
<K-ballo> it does not
<K-ballo> zao: can you test a patch if I make one?
<zao> Sure.
<zao> g++ 6.3.0 is indeed happy, just ran a test.
<K-ballo> I tried 4.9, wasn't taking any chances
<K-ballo> which version is the one in your report?
<K-ballo> 6.3.0
<zao> Aye.
eschnett has joined #ste||ar
<zao> K-ballo: The build got past the problem runtime_impl.cpp file but failed on something later.
<K-ballo> related?
<K-ballo> there are a couple other potential pain points for buggy tuple
eschnett has quit [Quit: eschnett]
<K-ballo> nope, not related.. that's the one where we were abusing boost.atomic UB
<K-ballo> I suppose libstdc++6.3.0 doesn't have enough trivial members when compiled under clang or something?
<K-ballo> zao: is BOOST_ATOMIC_HAVE_GNU_128BIT_INTEGERS defined?
<zao> Where? CMakeCache, Boost headers, generated sources?
<K-ballo> mmh, no idea, let me see
<zao> The only mention of the define is in the HPX source.
<zao> I'm building with Boost 1.65.1 in C++14 mode, btw.
<zao> No idea which compiler, it's been in the container image for a long time.
<K-ballo> HPX_HAVE_GNU_128BIT_INTEGERS in <BUILD>/hpx/config/defines.hpp
<zao> Does not exist in that file.
<K-ballo> wonder whether that's ever set...
<zao> Closest thing I can find is hpx_add_config_define(BOOST_HAS_INT128) which is guarded in CMakeLists by APPLE
<K-ballo> I suspect it's leftover code
<K-ballo> it'd be useful if clang said how the type is not trivially copyable
<K-ballo> I suppose it could just be the `volatile` all over the place
<zao> Will check soon, coffee break.
eschnett has joined #ste||ar
<zao> K-ballo: An #error placed on that line is reached.
<K-ballo> ok.. that leaves the volatile hypothesis, which I cannot explain
jaafar has joined #ste||ar
<jaafar> Hi HPX team
<jaafar> Is there a FAQ for diagnosing performance issues with the parallel algorithms?
<jaafar> I tried out exclusive_scan and am finding it is slower with par than seq for data sizes from 10 to 10000000
<jaafar> My first thought of course is that I'm doing something wrong :)
<jaafar> So before I file a bug or whatever I wanted to ask
<jaafar> Getting off the train soon, back on in 30 minutes
eschnett has quit [Read error: Connection reset by peer]
eschnett has joined #ste||ar
jaafar has quit [Ping timeout: 240 seconds]
eschnett has quit [Read error: Connection reset by peer]
jaafar has joined #ste||ar
<jaafar> Well that took longer than I expected :)
<jaafar> If anyone has thoughts on my question from an hour ago I will gratefully receive them
eschnett has joined #ste||ar
<heller_> Difficult to tell
<heller_> I think a bug report with a test case would be best
<jaafar> OK thanks heller_ I'm willing to just file a bug
<jaafar> Anyone mind Google Benchmark or is there a different microbenchmark tool I should use
<heller_> I think that's fine
<jaafar> yay OK
<heller_> Self contained would be perfect, but I kinda like Google Benchmark as well
eschnett has quit [Read error: Connection reset by peer]
<heller_> A small report in the ticket should give enough information though
eschnett has joined #ste||ar
david_pfander has quit [Ping timeout: 246 seconds]
<jaafar> OK filed thanks
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
diehlpk_mobile has joined #ste||ar
<diehlpk_mobile> One of our SC 19 workshop papers was finally published
diehlpk_mobile has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
eschnett has quit [Quit: eschnett]
tianyi93 has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]
bita_ has joined #ste||ar
bita has quit [Ping timeout: 240 seconds]
maxwellr96 has quit [Ping timeout: 264 seconds]