hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
<parsa> K-ballo: how did people access an element in a shared_ptr<double[]>? https://wandbox.org/permlink/7nGopjMgdRAaxAVf works with recent GCCs but not 6.3
<K-ballo> that's a C++20 feature, isn't it?
<K-ballo> ptr.get()[i] I imagine, assuming that's good enough
<parsa> that doesn't work either: "invalid use of array with unspecified bounds"
<K-ballo> true, you'll have to go with double for element type
<K-ballo> sketchy
<hkaiser> parsa: use boost::shared_array<double>
<parsa> that worked. but it's deprecated
<parsa> boost::shared_array that is
<K-ballo> what is deprecated?
<K-ballo> ah, yeah, use boost::shared_ptr<double[]>, assuming
<K-ballo> ask glen on slack
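[editor's note] A minimal sketch of the workaround discussed above, for toolchains such as GCC 6.3 where shared_ptr<double[]> is not usable: keep double as the element type, supply an array deleter, and index through get(). The helper names make_shared_doubles and sum_first are hypothetical, chosen only for illustration. As far as the standard goes, std::shared_ptr<T[]> with operator[] arrived in C++17, and std::make_shared for arrays in C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate n zero-initialized doubles owned by a
// shared_ptr<double>. The explicit default_delete<double[]> makes sure
// delete[] (not delete) runs when the last owner goes away.
std::shared_ptr<double> make_shared_doubles(std::size_t n)
{
    assert(n > 0);
    return std::shared_ptr<double>(
        new double[n](), std::default_delete<double[]>());
}

// Hypothetical helper: element access goes through get()[i], which works
// because the element type is double rather than double[].
double sum_first(std::shared_ptr<double> const& p, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i != n; ++i)
        sum += p.get()[i];
    return sum;
}
```

The trade-off is that the pointer no longer carries "this is an array" in its type, so the length has to be tracked separately and the deleter must not be forgotten.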
hkaiser has quit [Quit: bye]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
jbjnr__ has joined #ste||ar
jbjnr_ has quit [Ping timeout: 240 seconds]
nikunj97 has joined #ste||ar
Yorlik has joined #ste||ar
hkaiser has joined #ste||ar
eschnett has joined #ste||ar
nikunj97 has quit [Ping timeout: 246 seconds]
nikunj has joined #ste||ar
eschnett has quit [Quit: eschnett]
nikunj has quit [Ping timeout: 250 seconds]
nikunj has joined #ste||ar
<K-ballo> seeing 'hpx::iostreams::detail::release_ostream': inconsistent dll linkage warning on master
<hkaiser> K-ballo: ok, will take care of this, thanks
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
<hkaiser> K-ballo: how to reproduce this? compiling the core modules + hello_world does not expose this warning
<hkaiser> nvm, rebuilding helps ;-)
eschnett has joined #ste||ar
<hkaiser> ...fixed now
parsa is now known as parsa_
parsa_ is now known as parsa
<jbjnr__> hkaiser: diehlpk_work daissgr - bad news rando lockups on startup on node counts of 256-512-1024 etc. Seems to be a boot time problem. No test code ever runs, never get any output. will look into bootup code this evening.
<jbjnr__> ^random
<jbjnr__> queue very full, hard to get nodes to test on :(
nikunj has quit [Ping timeout: 272 seconds]
<hkaiser> jbjnr__: could be manifesting our hangs
nikunj has joined #ste||ar
eschnett has quit [Quit: eschnett]
nikunj has quit [Ping timeout: 250 seconds]
<jbjnr__> hkaiser: what hangs are you referring to?
<hkaiser> jbjnr__: the hangs we see on CircleCI
<jbjnr__> these happen at start, I do not see any output from the main code. I will enable some debug logs in the bootstrapping and see if I can diagnose what's wrong - could it be the barrier? (Does BBB have one?)
nikunj has joined #ste||ar
<jbjnr__> do the circleci hangs happen on startup?
<hkaiser> I have the longstanding suspicion that the hangs we see happen during startup...
<jbjnr__> hmm
<jbjnr__> ^hmmmmm
<jbjnr__> :)
<hkaiser> :D
<jbjnr__> does any of the bootup code make use of the distributed barrier?
<hkaiser> jbjnr__: startup uses a barrier, yes
<jbjnr__> I could swap it with my one ...
<hkaiser> didn't heller fix that?
<hkaiser> there was a PR yesterday
nikunj has quit [Remote host closed the connection]
<jbjnr__> he cleaned up a memory leak but I'm not sure he changed the actual functionality
<hkaiser> he changed barrier
<hkaiser> keeping things alive longer...
<hkaiser> gtg
<jbjnr__> I'll try it out now
nikunj has joined #ste||ar
<nikunj> can anyone explain this runtime error to me: hpx::init: exception caught: option '--hpx:console' does not take any arguments
<nikunj> I added a few commandline options and when I try to use them, HPX throws the above error
<nikunj> if I choose not to add an explicit commandline value and let HPX choose the defaults I set, the output is as expected
<nikunj> found the issue, I wasn't writing commandline arguments correctly :/
<zao> :D
<jbjnr__> hkaiser: (just fyi merging the barrier fix does not seem to make any difference)
<hkaiser> jbjnr__: ok, good to know, thanks
nikunj has quit [Remote host closed the connection]
parsa is now known as parsa_
<jbjnr__> heller: I am going to take a look at the barrier code tomorrow. if you see this, could you please write a short explanation of the steps it uses so that I can go through it and see if there is a possibility of a hang. Thanks. I am a bit worried because I just had a quick look at barrier node and it is full of when_all's, vectors of futures, all sorts of other stuff and it's going to take me a bit of time to decipher
<jbjnr__> what's going on. No comments in the code to speak of.
<jbjnr__> goodnight all