hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<Yorlik> like put a global mutex around every single access?
<heller_> yes, for example
<heller_> not a global one though
<zao> heller_: A quick question before I hit the sack - I'm not completely familiar with the gitlab CI image, and I can't quite tell from the outside which C++ standard library the sanitizer branch ends up using with the sanitizers - is it libstdc++ or libc++?
<Yorlik> I was thinking about using that pattern with std containers
<heller_> zao: libc++
<zao> ah
<zao> thanks
<heller_> zao: sorry, libstdc++
<heller_> my bad
<zao> Oh :)
<heller_> not the llvm one ;)
<Yorlik> heller_: why not a global one? I meant a mutex used by all threads for reading and one for writing
<zao> Swamped with work at work building the competition so haven't gotten around trying the branch yet.
<heller_> Yorlik: class concurrent_vector { std::vector<T> data; mutex mtx; void push_back(...) { std::lock_guard<...> l(mtx); data.push_back(...); } };
<heller_> Yorlik: along those lines
<heller_> Yorlik: your attempt won't work
<heller_> zao: starpu, i guess?
<heller_> Yorlik: mutexes don't compose
<zao> Aye.
<heller_> what a relief ;)
<Yorlik> heller_: Using a wrapper class?
<heller_> Yorlik: yes
<zao> The group using it found out today that they were using a criminally broken toolchain.
<zao> We had apparently forgotten to hide our experimental "what happens if we run the latest GCC+CUDA+OpenMPI versions" test toolchain :D
<Yorlik> heller_: Are you saying it would work or not? My idea was to use a wrapper guarding the push_backs and such
<heller_> zao: nice
<heller_> zao: what are you using as for CI?
<heller_> Yorlik: yes sure, but keep the synchronization local to that class
<heller_> Yorlik: and you can't use two different mutexes to guard write and read accesses
<Yorlik> Yes, that was the plan.
<Yorlik> Right - makes sense.
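A minimal sketch of the wrapper discussed above, assuming C++14's std::shared_timed_mutex; the class name and interface are illustrative, not HPX code:

    #include <mutex>
    #include <shared_mutex>
    #include <vector>

    // One mutex, kept local to the class, guards every access. A shared
    // (reader/writer) mutex lets reads proceed concurrently while writes
    // stay exclusive -- unlike two independent mutexes, a reader here
    // always excludes a concurrent writer.
    template <typename T>
    class concurrent_vector
    {
        std::vector<T> data_;
        mutable std::shared_timed_mutex mtx_;

    public:
        void push_back(T value)
        {
            std::unique_lock<std::shared_timed_mutex> l(mtx_);    // exclusive
            data_.push_back(std::move(value));
        }

        // Return by value: handing out references would let callers touch
        // elements after the lock has been released.
        T at(std::size_t i) const
        {
            std::shared_lock<std::shared_timed_mutex> l(mtx_);    // shared
            return data_.at(i);
        }

        std::size_t size() const
        {
            std::shared_lock<std::shared_timed_mutex> l(mtx_);
            return data_.size();
        }
    };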
<heller_> Yorlik: you also have to watch out for things like pointer/reference stabilities and such
<Yorlik> You mean guarding against my references being invalidated by any reordering operations?
<Yorlik> I was pondering to use handles
<heller_> things like push_back and friends might invalidate any pointers into your container
<Yorlik> push_back?
<Yorlik> You mean if the allocator changes things?
<Yorlik> I feel i need a custom allocator ...
<heller_> no, if you don't have enough capacity, the implementation allocates new storage with enough capacity, copies over the old elements and then inserts the new element
<Yorlik> having a vector copy in the middle of an update cycle would be fun for sure
<Yorlik> :)
<heller_> a custom allocator won't help you there ;)
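A quick illustration of the invalidation heller_ describes; the two printed addresses will typically differ because push_back reallocated:

    #include <cstdio>
    #include <vector>

    int main()
    {
        std::vector<int> v;
        v.push_back(1);
        int* p = &v[0];

        // Push past the current capacity to force a reallocation.
        for (auto i = v.capacity(); i-- > 0;)
            v.push_back(0);

        // p now (usually) dangles: the elements were copied to new storage.
        std::printf("before: %p  after: %p\n", (void*) p, (void*) &v[0]);
    }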
<Yorlik> I have an idea for a workaround, but it requires a never-relocating allocator
<Yorlik> It's possible
<Yorlik> You reserve a HUGE amount of virtual address space beforehand
<Yorlik> and grow only
<Yorlik> and you malloc additional pages from that space as you go
<Yorlik> unfortunately it's OS-specific how to do it
<Yorlik> VirtualAlloc on Windows and I think mmap on Linux
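A sketch of that reserve-then-commit scheme, Linux flavor; on Windows the analogous calls are VirtualAlloc with MEM_RESERVE and then MEM_COMMIT. The helper names are made up:

    #include <sys/mman.h>
    #include <cstddef>

    // Reserve a huge span of virtual address space up front -- no physical
    // pages back it yet, so nothing is wasted but addresses.
    void* reserve_span(std::size_t bytes)
    {
        void* p = mmap(nullptr, bytes, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? nullptr : p;
    }

    // Commit pages on demand as the container grows; addr and bytes must be
    // page-aligned. The kernel supplies physical pages lazily on first touch.
    bool commit_pages(void* addr, std::size_t bytes)
    {
        return mprotect(addr, bytes, PROT_READ | PROT_WRITE) == 0;
    }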
eschnett has quit [Quit: eschnett]
<Yorlik> Which means I should probably just write my own managed vector
<Yorlik> Or a vector-ish data structure - rather a growable array, no erase
<heller_> yeah
<heller_> and that can be done with std::allocator for functional testing first
<Yorlik> I could use an insanely high reserve ofc
<heller_> right
<Yorlik> though ... does reserve actually reserve or does it allocate on demand?
<heller_> reserve actually reserves
<Yorlik> I assumed reserve does all the mallocs already
<heller_> yes
<Yorlik> means the space is wasted
<heller_> right
<Yorlik> I want a virtual reserve that does malloc on demand
<heller_> but replacing std::allocator with something hand rolled to lazily allocate the pages on demand, is trivial
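The shape such a hand-rolled allocator could take, reusing the commit helper sketched earlier; it bump-allocates out of the pre-reserved span and, being grow-only, never gives memory back. A converting constructor, equality operators, and alignment handling are omitted for brevity:

    #include <cstddef>
    #include <new>

    bool commit_pages(void* addr, std::size_t bytes);  // from the earlier sketch

    template <typename T>
    struct lazy_arena_allocator
    {
        using value_type = T;

        char* base;            // start of the reserved span (page-aligned)
        std::size_t* used;     // bump offset, shared between allocator copies

        T* allocate(std::size_t n)
        {
            std::size_t bytes = n * sizeof(T);
            // Round the committed range out to page boundaries (mprotect
            // requires page alignment); 4K pages assumed for brevity.
            std::size_t begin = *used & ~std::size_t(4095);
            std::size_t end = (*used + bytes + 4095) & ~std::size_t(4095);
            if (!commit_pages(base + begin, end - begin))
                throw std::bad_alloc();
            T* p = reinterpret_cast<T*>(base + *used);
            *used += bytes;
            return p;
        }

        void deallocate(T*, std::size_t) {}    // grow-only: never release
    };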
<Yorlik> Allright
<heller_> that's how our stacks work, btw
<Yorlik> You allocate virtually ?
<heller_> yes
<Yorlik> and do the hard alloc lazily?
<Yorlik> Nice !
<heller_> be aware of the initial overhead when accessing one of those non-allocated pages
<Yorlik> I am thinking of some sort of time sliced preventive allocation triggered by a watchdog
<heller_> the generated page faults have a significant impact
<Yorlik> like you watch the size and when its above a threshold you start allocating
<Yorlik> I'd like to prevent that from start
<heller_> the granularity will be the size of a page
<heller_> but you can control that (under linux at least)
<heller_> that is the page size
<Yorlik> That was my thought - a 4K-aligned vector allocating in steps of 4K
<heller_> those are all operations that can be done once the game is functional ;)
<heller_> or the core engine
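The page-fault cost heller_ warns about can be paid up front by touching each page before the hot path needs it, along the lines of Yorlik's watchdog idea; a minimal sketch, assuming 4K pages:

    #include <cstddef>

    // Touch one byte per page so the faults happen now (e.g. from a
    // background thread watching the fill level) rather than in the middle
    // of an update tick. Query the real page size in production code
    // (sysconf(_SC_PAGESIZE) on Linux, GetSystemInfo on Windows).
    void prefault(volatile char* p, std::size_t bytes,
                  std::size_t page_size = 4096)
    {
        for (std::size_t off = 0; off < bytes; off += page_size)
            p[off] = p[off];    // the write forces the page to be mapped in
    }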
<zao> For my own HPX testing, I still have the semi-automated Singularity-based thing I'm hacking on.
<heller_> and for work?
<zao> Nothing currently, just EasyBuild installations and some manual test suites for software we have ideas about performance/correctness for.
<zao> We're looking at CSCS's Reframe eventually to automate some of those.
<zao> As for the StarPU-using group, I have no idea what their methodology is. Infinite numbers of PhDs, I guess :)
<zao> There's a presentation of some milestone tomorrow, hoping to catch that.
<heller_> reframe looks awesome
K-ballo has quit [Quit: K-ballo]
eschnett has joined #ste||ar
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
<Yorlik> When continuously inserting random numbers into a std::set in batches, you'd expect *set.begin() to get smaller over time and *(--set.end()) to get bigger, right? But occasionally I am seeing the begin getting bigger. My set is a std::set<uint64_t> and I am printing values with %16X in printf. Though chances are I've made an error somewhere, I can't get my head around this. I am doing no deletions at
<Yorlik> all. Output is here: https://imgur.com/a/eSQaEY4
<Yorlik> I wonder if the uint64_t is too much for cout and it's just getting truncated
<Yorlik> the hex digits are never getting past 8
<Yorlik> or rather printf %16X not working (not cout)
<Yorlik> Yup, confirmed - %16X is not working.
<Yorlik> Had to split the printout into 2x32bit uints to work
<Yorlik> Thanks, IRC for being a nice rubberduck ;)
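For the record, the likely culprit: %X expects an unsigned int (32 bits on Windows), so handing it a uint64_t truncates the output and is formally undefined behavior, which matches the hex digits never getting past 8. The portable fix, avoiding the 2x32-bit split, is the PRIX64 macro from <cinttypes>:

    #include <cinttypes>
    #include <cstdio>

    int main()
    {
        std::uint64_t v = 0x123456789ABCDEF0u;
        // PRIX64 expands to the correct conversion specifier for uint64_t;
        // "%016" zero-pads to the full 16 hex digits.
        std::printf("%016" PRIX64 "\n", v);
    }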
akheir has quit [Remote host closed the connection]
akheir has joined #ste||ar
<jbjnr__> heller yt?
<jbjnr__> I just looked at some of the logs of jobs I ran last night, and I see exceptions (not many, but some) thrown because of "{what}: archive data bstream is too short: HPX(serialization_error)
<jbjnr__> " - (testing libfabric). I'm trying to remember what problem we had that caused this (bad data, I know, but there were certain specific issues that triggered it). If your memory is better and you recall it, please let me know. thanks
<Yorlik> Oh - someone reviewed :D - Thanks to you _heller :)
<Yorlik> Time to build the new HPX :)
<Yorlik> Argh - it doesn't compile
<Yorlik> Issue filed.
<heller_> jbjnr__: uh, don't remember, sorry
<jbjnr__> no worries. I'll dig into the code over the weekend. must be a race somewhere
<heller_> sure is
<jbjnr__> does master branch have occasional deadlocks? I see occasional hangs on exit on the moody camel branch and I'm not sure if it's new or present on master too.
<simbergm> jbjnr__: definitely
<jbjnr__> you mean present on master too?
<simbergm> jbjnr__: definitely deadlocks still on master
<jbjnr__> ta
<jbjnr__> who's fixing it?
<jbjnr__> :)
<heller_> Me
<heller_> I think I have most fixed on the sanitizers branch
<jbjnr__> \o/ yay for heller.
<jbjnr__> it might not be your fault - but you always fix it!!!
<jbjnr__> (It's always your fault btw)
K-ballo has joined #ste||ar
<heller_> jbjnr__: would be boring otherwise ;)
<simbergm> heller_: what's missing on the sanitizers branch btw? can we help?
<heller_> simbergm: there are some leaks still
<heller_> simbergm: i'll open a PR step by step
<heller_> and eventually we'll be able to have the sanitizers running along normally
<heller_> but the MPI version needs to be updated first...
<simbergm> awesome
<Yorlik> Anyone else having issues compiling the current master with tests? (Windows) It seems to compile now, but I have to compile it with -DHPX_WITH_TESTS=OFF
<Yorlik> Also had to disable a bit in a CMake File in the tests
<K-ballo> which tests are failing to compile?
<Yorlik> It begins with the CMakeFile - so it already breaks in the generation phase
<Yorlik> I filed an issue for that one
<Yorlik> But when disabling the offending lines (seems like a harmless bug) the tests break
<Yorlik> I'll have to start another compile to get the exact message
<Yorlik> Running my hackish fix compile at the moment
<Yorlik> Just wondering if there was a known issue
<K-ballo> odd, the file has not been touched for months
<K-ballo> Yorlik: when's the last time you had successful compilation with tests?
<Yorlik> I used tag 1.2.1
<Yorlik> I wanted that enhancement hkaiser made for custom allocators
<Yorlik> thats why I'm compiling
<Yorlik> 1.2.1 worked flawlessly
<Yorlik> No compiles in between
K-ballo has quit [Ping timeout: 250 seconds]
K-ballo has joined #ste||ar
<K-ballo> I don't usually compile with test and examples on windows, but I last did a couple weeks ago
<Yorlik> That probably was close to 1.2.1
<Yorlik> maybe even pre 1.2.1
<jbjnr__> tests are broken because of guided_pool_executor. I'm trying to fix it now, but still don't understand what's going on
<Yorlik> OK
<K-ballo> 1.2.1 wouldn't have most of the changes that happened since 1.2.0, couple weeks old master would
<K-ballo> generation with tests works here, might need more/less flags, or a clean build
<K-ballo> ye, needs vcpkg
<K-ballo> actually no.. should be unconditional
<Yorlik> I'm compiling without vcpkg
<K-ballo> are you sure that error is the very first error to pop up?
<Yorlik> Default options
<Yorlik> Yes
<Yorlik> But when I fix the CMake file the tests explode
<Yorlik> Tests off generates and compiles properly now
<K-ballo> I'm not interested in that, I don't think the test exploding could explain the cmake configuration error
<Yorlik> I also believe the CMake issue is an extra thing
<Yorlik> The target in the offending lines does not exist
<Yorlik> I made a full text search for it
<Yorlik> Weird
<Yorlik> Might be some bug causing CMake to silently fail maybe?
* Yorlik doesn't like silent fails
<K-ballo> I don't know, but we have a real problem there, commenting the offending line out won't help
<Yorlik> It consumes the list, but doesn't generate a target - I'll check my local files really quick
<K-ballo> are you still using your custom made special superbuild cmake flow?
<Yorlik> Just to exclude any weirdness on my side, but I tried after a hard reset too
<Yorlik> Yes - custom superbuild -but its pretty stable these days
<K-ballo> see if you can reproduce without it
<Yorlik> OK
<Yorlik> At least I can confirm my local file isn't corrupted - the list definitely has the test here too.
<Yorlik> Working on the conventional compile
<heller_> how did you upgrade to master?
<Yorlik> Clean Checkout and CMake Gui
<Yorlik> I used the git default gui - pull and merge
<Yorlik> err - fetch and merge ofc
hkaiser has joined #ste||ar
<Yorlik> K-ballo: It generated with the normal CMake gui - but a normal VS project.
<Yorlik> That was my normal CMake 13
<jbjnr__> K-ballo: I think I know what's wrong now. The async_execute used to be called with (function, dataflow_frame, result, predecessor (tuple of futures)), but now it is called only with (function, predecessor (tuple of futures)) - the dataflow_frame and result are gone. Not sure why the result was there, but I guess it was filled in by the caller
<Yorlik> Inside VS I am using the integrated CMake of VS and targeting Ninja
<Yorlik> I copied / moved the source tree which worked in place of the old one and it broke again
<Yorlik> So - there is some difference in the generation process / environment between inside and outside of VS
<Yorlik> I personally think I will pull all dependencies out of my superbuild and make my life easier
<Yorlik> My motivation to track these tidbits of broken build processes is not very high tbh.
<Yorlik> I'll just keep to the given standards and done.
<Yorlik> No one can check all variations and depths of all circumstances - but still: there is a possibility this issue indicates some weird problem.
<Yorlik> It's just a pity to lose the possibility to automate everything and having to resort to manual labour.
<heller_> FWIW, I can't reproduce your error on linux
<Yorlik> I think it's some weirdness in the depths of the system.
<Yorlik> I just can say even after completely removing my .vs and build directory and restarting VS it reproduced
<K-ballo> jbjnr__: dataflow uses post, does that get synthesized from async_execute?
<jbjnr__> yes, eventually.
<jbjnr__> I seem to have fixed it, but I have a lockup on exit - every time
<jbjnr__> not sure if that's related or different
<jbjnr__> (just my test)
<K-ballo> what was result? from what I can see it would have been called with the frame and a boolean constant
<K-ballo> we need to document in the dataflow implementation and the post implementation the places in which changes would affect the guided executor
<jbjnr__> I was puzzled by result. I just passed it through, but had no real idea what it was. I called it result, but maybe it was something else cos it was a different type from result_type
<jbjnr__> it's gone now anyway
<jbjnr__> whatever you cleaned up, made it better
<K-ballo> ah, it must have been that true/false_type for is_void
eschnett has joined #ste||ar
<jbjnr__> could be
<K-ballo> lucky you I left the tuple of futures there as a convenience, my intention is to eventually take that away too
<jbjnr__> if you remove the tuple and we just have a list of futures, that's probably ok, cos I can specialize on >1 futures
<K-ballo> no, I would remove all the things
<K-ballo> feed the executor a nullary callable
<jbjnr__> anyway, compilation is fixed now.
<K-ballo> if we were in 14 I could have done it, with 11 support it required too much machinery
<jbjnr__> ^^that might be a problem - if I can't introspect the args, I can't do late binding
<K-ballo> that's the underlying something I'm hoping to understand following the fix
eschnett has quit [Client Quit]
<jbjnr__> if and when you go down that path, remember this conversation and warn me
<K-ballo> my hands are tied now that I know the guided executor depends on deep implementation details
<jbjnr__> we can still discuss it and look for a way ...
<jbjnr__> (but not today please :)
<K-ballo> I will look for a way, but worst case scenario we'll annotate dataflow with comments not to change things, because other things depend on them
<jbjnr__> so you want dataflow to unwrap each future as it becomes ready, bind them to a new callable and keep unwrapping each layer until all futures are complete, then there's a final callable that is passed to the executor with all args bound in to it?
<K-ballo> mmh, I'm not sure I understand, but I think not..
<K-ballo> this is the implementation guided executor depends on: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/lcos/dataflow.hpp#L193-L199
<K-ballo> ideally I would capture `futures` rather than pass it as an argument
<hkaiser> jbjnr__: I'm not aware of any interface changes for the executors
<K-ballo> that would require either 14's lambda init captures, or a custom made callable
<K-ballo> looking back at this now, I'm surprised I didn't write a simple custom made callable... I did so for a few other cases
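The two options K-ballo mentions, side by side; toy names, not the actual dataflow code:

    #include <utility>

    // C++14: an init capture moves the tuple of futures into the closure,
    // so the executor can be handed a self-contained callable.
    template <typename F, typename Futures>
    auto make_frame_callable(F f, Futures futures)
    {
        return [f = std::move(f), futures = std::move(futures)]() mutable {
            return f(std::move(futures));
        };
    }

    // C++11: lambdas can only capture by copy or by reference, so the
    // same thing takes a hand-written callable.
    template <typename F, typename Futures>
    struct frame_callable
    {
        F f;
        Futures futures;

        auto operator()() -> decltype(f(std::move(futures)))
        {
            return f(std::move(futures));
        }
    };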
<jbjnr__> I see
<jbjnr__> I'll ponder it offline.
<hkaiser> K-ballo: post is normally not synthesized, but if the executor does not expose it, the customization points will use async_execute instead - true
<jbjnr__> what do you mean by 'synthesized' here?
<Yorlik> hkaiser: I closed the issue with my HPX build, since it obviously is specific to my setup. I still can build using the default way of using external CMake and targeting MSVC.
eschnett has joined #ste||ar
<hkaiser> Yorlik: ok
<hkaiser> jbjnr__: the executor customization points do different things depending on what functionality the used executor exposes
<hkaiser> jbjnr__: e.g. if the executor exposes post(), then the post_execute CP will use it, otherwise it will call the executor's async_execute instead
<hkaiser> ... and discard the returned future
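A hedged sketch of the fallback hkaiser describes, not HPX's actual implementation: detect whether the executor exposes post(), and otherwise route through async_execute, dropping the future:

    #include <type_traits>
    #include <utility>

    // Detection idiom: does Executor expose post(F)?
    template <typename Executor, typename F, typename = void>
    struct has_post : std::false_type {};

    template <typename Executor, typename F>
    struct has_post<Executor, F,
        decltype(void(std::declval<Executor&>().post(std::declval<F>())))>
        : std::true_type {};

    // Fire-and-forget customization point: prefer the executor's own
    // post(); otherwise fall back to async_execute and discard the future.
    template <typename Executor, typename F>
    typename std::enable_if<has_post<Executor, F>::value>::type
    post_execute(Executor& exec, F&& f)
    {
        exec.post(std::forward<F>(f));
    }

    template <typename Executor, typename F>
    typename std::enable_if<!has_post<Executor, F>::value>::type
    post_execute(Executor& exec, F&& f)
    {
        exec.async_execute(std::forward<F>(f));    // returned future dropped
    }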
<jbjnr__> ok. Just the terminology was confusing me
<jbjnr__> lockup problem solved. Just me being useless and leaving some debug code in there.
<jbjnr__> will do PR now
hkaiser has quit [Quit: bye]
<K-ballo> great
aserio has joined #ste||ar
<heller_> simbergm: first set of patches: https://github.com/STEllAR-GROUP/hpx/pull/3737
<simbergm> heller_: thanks!
<simbergm> various references/ids/whatever were released too early before?
Yorlik has quit [Read error: Connection reset by peer]
Yorlik has joined #ste||ar
<heller_> simbergm: yeah
hkaiser has joined #ste||ar
nikunj has joined #ste||ar
akheir has quit [Quit: Konversation terminated!]
akheir has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
aserio has joined #ste||ar
<heller_> hkaiser: upgraded the MPI version in our docker build container. The MPI migrate_component run looks pretty good now.
<hkaiser> heller_: good!
<hkaiser> and thanks!
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
aserio1 is now known as aserio
khuck has joined #ste||ar
<khuck> hkaiser: did you or adrian send a webex link for the meeting today?
<khuck> aserio: ^^
<aserio> I did
<aserio> give me a sec
<aserio> hkaiser: ^^
<aserio> khuck: ^^
khuck has quit []
eschnett has quit [Read error: Connection reset by peer]
eschnett has joined #ste||ar
hkaiser has quit [Quit: bye]
aserio has quit [Ping timeout: 252 seconds]
bibek has quit [Quit: Konversation terminated!]
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
aserio has quit [Quit: aserio]