aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
rod_t has left #ste||ar [#ste||ar]
EverYoung has quit [Ping timeout: 252 seconds]
<github> [hpx] khuck created fixing_ppc64le_clang_build (+1 new commit): https://git.io/vFefR
<github> hpx/fixing_ppc64le_clang_build a1d7f53 Kevin Huck: Adding uintstd.h header...
<github> [hpx] khuck opened pull request #2972: Adding uintstd.h header (master...fixing_ppc64le_clang_build) https://git.io/vFefV
ct-clmsn has joined #ste||ar
<github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vFeTn
<github> hpx/master ac3517b Hartmut Kaiser: Merge pull request #2946 from STEllAR-GROUP/fixing_2137...
K-ballo has quit [Quit: K-ballo]
wash has quit [Quit: leaving]
wash has joined #ste||ar
wash has quit [Client Quit]
wash has joined #ste||ar
wash is now known as Guest66456
Guest66456 has quit [Client Quit]
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
hkaiser has quit [Quit: bye]
parsa has joined #ste||ar
gedaj has quit [Read error: Connection reset by peer]
gedaj_ has joined #ste||ar
gedaj_ has quit [Quit: Leaving]
gedaj has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
ct-clmsn has quit [Ping timeout: 240 seconds]
gedaj has quit [Remote host closed the connection]
gedaj_ has joined #ste||ar
wash has joined #ste||ar
wash has quit [Quit: leaving]
wash has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
simbergm has joined #ste||ar
wash has quit [Quit: leaving]
david_pfander has joined #ste||ar
wash has joined #ste||ar
<simbergm> heller: yt?
<heller> simbergm: hey
simbergm is now known as msimberg
<heller> msimberg: only a couple of minutes
<heller> gtg to a lecture in 10
<msimberg> mmh, okay
<msimberg> so I did the fix that you suggested for the asan thing
<msimberg> and it works fine
<msimberg> but I was wondering if the scheduling loop can't throw?
<msimberg> and if frame_context is related to this somehow?
<msimberg> but we can discuss later as well
<msimberg> and I also have some more asan warnings which I can't make much sense of
<heller> Ok
<heller> Let's talk later
<heller> I'm done in 10 minutes
<msimberg> yeah, that's okay
<msimberg> let me know when you have time
<heller> Sorry, 2 hours
<msimberg> :)
<msimberg> also ok
<heller> Regarding throwing: yes, might happen, but really shouldn't
<heller> Other asan warnings: tell me
<msimberg> i'll make a gist, they're rather long
<msimberg> for the reset I was wondering whether it wouldn't make sense to put it at the end of thread_func instead
<msimberg> or use something like a unique_pointer
<msimberg> basically I'm just initializing hpx multiple times and get an AddressSanitizer: stack-buffer-overflow error with one of 3 or 4 stack traces
<msimberg> doesn't happen every time, but I easily get one if I initialize, say, 500 times
<zao> Nifty.
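For reference, a minimal sketch (assumed setup, not msimberg's actual test) of the kind of repeated-init stress run described above, using the plain hpx::init/hpx::finalize pattern:

    // Hypothetical stress test: start and shut down the HPX runtime many
    // times and let AddressSanitizer watch the init/shutdown path.
    #include <hpx/hpx_init.hpp>

    int hpx_main(int, char**)
    {
        return hpx::finalize();    // shut the runtime down again right away
    }

    int main(int argc, char* argv[])
    {
        for (int i = 0; i != 500; ++i)    // ~500 iterations, as mentioned above
        {
            if (hpx::init(argc, argv) != 0)
                return 1;
        }
        return 0;
    }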
<github> [hpx] msimberg opened pull request #2973: Fix small typos (master...typo-fixes) https://git.io/vFeRX
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
heller has joined #ste||ar
Bibek has quit [Remote host closed the connection]
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vFeuM
<github> hpx/gh-pages f7c0d2a StellarBot: Updating docs
Bibek has joined #ste||ar
pree has joined #ste||ar
pree has quit [Remote host closed the connection]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
<heller> msimberg: yes, might make sense
<heller> msimberg: looking at the asan errors right now
<heller> msimberg: you should really do a debug build to get more information there
pree has quit [Ping timeout: 248 seconds]
eschnett has joined #ste||ar
pree has joined #ste||ar
<msimberg> heller: thanks, I'll do that and let you know
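(For a reproduction like this, a sanitizer-enabled debug configuration would look something like -DCMAKE_BUILD_TYPE=Debug plus -DCMAKE_CXX_FLAGS="-fsanitize=address -fno-omit-frame-pointer", so the asan reports carry symbols and line numbers; these flags are illustrative, not the project's canonical sanitizer setup.)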
<wash[m]> Asplos reviews are in
<heller> wash[m]: uuuh!
<heller> wash[m]: can you forward them?
<wash[m]> Yes
<heller> wash[m]: or where do I find them?
<wash[m]> Working on it.. i only have my phone with me.
<wash[m]> One sec
<heller> wash[m]: accept or reject?
<wash[m]> Reject. The reviews are positive, but they think it's off topic for the conference
<heller> grrr
<wash[m]> Also they didn't like related work
<heller> hmm
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
K-ballo has joined #ste||ar
<msimberg> heller: I have, uhm, longer stacktraces
<msimberg> would you like to see them?
<heller> sure
<msimberg> hold on
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
<msimberg> without asan, after a while I get "the runtime system is not active (did you already call finalize?)", which I'm guessing could be related
pree has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<heller> msimberg: hmmm
<heller> msimberg: this is happening when registering locks
<heller> msimberg: it sounds a bit like hpx::init is still returning too early
<msimberg> heller: thanks, that already helps
<msimberg> do you think register_lock and init is the best place to start looking then? one of them was not during shutdown, but at load_components...
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
<github> [hpx] hkaiser closed pull request #2968: Cleaning up dataflow overload set (master...dataflow-api) https://git.io/vdNGj
<heller> msimberg: yeah, just look at the stack and observe the functions
<heller> msimberg: might be a race somewhere, access to invalid memory etc.
<msimberg> heller: I'll see what I can do, thanks again
<hkaiser> msimberg: generally, we need to unlock all mutexes before throwing an exception, we don't always do that
<msimberg> hkaiser: ok
<heller> asan is giving false positives atm when an exception is thrown
<msimberg> heller: hmm, so you think it's likely that that is what's happening now, or just as a general note?
pree has joined #ste||ar
<heller> msimberg: no, you'll get a different asan error then
<msimberg> ok
<msimberg> is it always the same in that case?
<msimberg> as in it's easy to recognize when that's happening?
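As an aside, a small illustration of hkaiser's unlock-before-throw point, sketched with plain std::mutex rather than HPX's registered lock types:

    #include <mutex>
    #include <stdexcept>

    std::mutex mtx;

    void throws_while_locked()
    {
        mtx.lock();
        throw std::runtime_error("oops");    // mtx stays locked after the throw
    }

    void unlocks_before_throwing()
    {
        std::unique_lock<std::mutex> l(mtx);
        // ... detect the error condition while holding the lock ...
        l.unlock();                          // release the mutex first
        throw std::runtime_error("oops");    // then throw
    }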
<msimberg> and hkaiser: to answer your question from monday: yes
pree has quit [Read error: Connection reset by peer]
<hkaiser> msimberg: uhh, what question? ;)
<msimberg> :)
<msimberg> meaning it waits for all queues to be empty and so it hangs when removing a pu
<heller> yes
<hkaiser> ahh, yes
<hkaiser> ok
<msimberg> or never quits the scheduler
<msimberg> but according to heller it wasn't a good idea to just wait for the single pu to have empty queues (hope I interpreted it correctly)
<msimberg> but it seems like that would still be the correct thing to do
<heller> yes, it makes sense
<msimberg> ok, then it's "just" a question of getting it to work
<heller> this reverted all the work I tried there
<hkaiser> msimberg: as always...
<heller> as it didn't really work ;)
<msimberg> heller: ok, thanks, that's helpful
<msimberg> hkaiser, heller: I also looked into suspending the pus with condition variables, and I guess each thread would have to have its own condition variable? i.e. we can't reuse the condition variable that is already there, because then we'd be waking up suspended pus when adding work or unsuspending other pus
<heller> msimberg: if you do the usual while(condition) cv.wait();
<heller> you could reuse it
<heller> but you'd wake up the threads too often, I guess
<msimberg> I see
<msimberg> what would be the argument against having separate condition variables for each pu?
<hkaiser> msimberg: yes, each thread already has its own CV
<hkaiser> I showed you where it sits
<msimberg> hmm, the one in scheduler_base?
<hkaiser> yes
<msimberg> but that would be one per pool, no?
<msimberg> or scheduler
<hkaiser> let me look
<hkaiser> I might mix up things
<hkaiser> but yah, we would need one CV per pu
<msimberg> hkaiser: ok
<msimberg> thanks
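A rough sketch, with hypothetical names (not the actual scheduler_base members), of the "usual while(condition) cv.wait()" pattern heller mentions, with one condition variable per pu as discussed:

    #include <condition_variable>
    #include <mutex>

    struct pu_state                        // hypothetical per-pu bookkeeping
    {
        std::mutex mtx;
        std::condition_variable cv;
        bool suspended = false;

        void wait_while_suspended()
        {
            std::unique_lock<std::mutex> lk(mtx);
            while (suspended)              // guards against spurious wake-ups
                cv.wait(lk);
        }

        void resume()
        {
            {
                std::lock_guard<std::mutex> lk(mtx);
                suspended = false;
            }
            cv.notify_one();               // wakes only this pu's worker thread
        }
    };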
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
<hkaiser> jbjnr: yt?
pree has joined #ste||ar
hkaiser has quit [Quit: bye]
aserio has joined #ste||ar
eschnett has quit [Quit: eschnett]
parsa has joined #ste||ar
jaafar has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
<msimberg> heller: continuing testing with tsan, I get "signal-unsafe call inside of a signal" in hpx::init ... hpx::util::runtime_configuration::init_stack_size (can give you a stack trace if you want)
<msimberg> real or false positive? or not important?
K-ballo has quit [Remote host closed the connection]
K-ballo1 has joined #ste||ar
K-ballo1 is now known as K-ballo
eschnett has joined #ste||ar
<heller> msimberg: I'm not sure how reliable tsan is
<heller> msimberg: signal unsafe sounds correct though
<zao> I'm still a bit curious as to what kind of clobber resulted in such bogus output from partitioned_vector_subview_test.. https://i.imgur.com/JQ26PR9.png
<zao> A solid three megabytes of strings from what seems to be the executable.
<msimberg> heller: ok, thanks
<msimberg> it also gave some harmless complaints about missing return for example here:
<msimberg> do you generally try to make it happy even though you know it's correct, or just ignore?
<msimberg> and do you already know that ubsan is not useful with hpx?
hkaiser has joined #ste||ar
<msimberg> if ubsan is useful...
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 248 seconds]
aserio1 is now known as aserio
<github> [hpx] hkaiser closed pull request #2972: Adding uintstd.h header (master...fixing_ppc64le_clang_build) https://git.io/vFefV
<zbyerly> is rostam down?
<zbyerly> nvm
aserio has quit [Quit: aserio]
hkaiser has quit [Read error: Connection reset by peer]
aserio has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Client Quit]
pree has quit [Ping timeout: 255 seconds]
david_pfander has quit [Ping timeout: 240 seconds]
parsa has joined #ste||ar
hkaiser has joined #ste||ar
<hkaiser> jbjnr: pls contact me once you read this
<heller> hkaiser: if I read the reviews correctly, they mostly complain that we aren't putting our work into the correct frame
<heller> Or fail to show the novelty
<heller> msimberg: ubsan is immensely useful to me
<heller> msimberg: if feasible, I try to fix the warnings, yes
<heller> ubsan is overly pedantic sometimes
<heller> msimberg: why do you find ubsan not helpful?
parsa has quit [Quit: Zzzzzzzzzzzz]
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
teonnik has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
<teonnik> Hi, is it possible to run hpx application using MPI calls directly? I tried running MPI_Allreduce with two processes but the application deadlocks?
<heller> teonnik: do you also require hpx networking?
<teonnik> I have an existing MPI program and would like to use HPX kernels for shared memory (on each node) parallelism. I would still like to keep the distributed (MPI) part intact.
<teonnik> When I installed HPX, I used the -DHPX_WITH_PARCELPORT_MPI=On option if I understood the question correctly.
hkaiser has joined #ste||ar
pree has joined #ste||ar
patg[w] has joined #ste||ar
EverYoung has joined #ste||ar
<pree> hkaiser : It's good to see your talk at CppCon 2017, it is awesome : )
hkaiser has quit [Read error: Connection reset by peer]
<K-ballo> he was overwhelmed with emotion to the point of connection error :)
<pree> :)
<patg[w]> lol
hkaiser has joined #ste||ar
jbjnr has quit [Read error: Connection reset by peer]
jbjnr has joined #ste||ar
<zao> Happiness overflow is undefined, optimized out.
<heller> teonnik: in that case, you should turn networking off completely
<teonnik> How do I do that?
<heller> teonnik: -DHPX_WITH_NETWORKING=Off
<teonnik> Thanks!
<heller> that's a cmake option
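(In other words, for an MPI application that only uses HPX for on-node parallelism, the configure step would look something like "cmake -DHPX_WITH_NETWORKING=Off <other options> <hpx-source-dir>", leaving inter-node communication entirely to MPI; this is just a sketch of the option mentioned above, not a complete build recipe.)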
rod_t has joined #ste||ar
<teonnik> Has anyone built HPX with clang 5.0.0? I get a linking error with boost.regex at the very end. There are also a number of warnings during compilation.
aserio has joined #ste||ar
<zao> Warnings are probably par for the course.
<zao> I should give it a try.
<zao> Did you build Boost with that compiler as well, and do you have any particular flags for building Boost and HPX?
<zbyerly> warning: compilation successful
<jbjnr> hkaiser: see pm
pree has quit [Ping timeout: 240 seconds]
parsa has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has joined #ste||ar
pree has joined #ste||ar
<jbjnr> teonnik: I compiled clang recently (6?) and then built all of hpx and its dependencies on our cray. my settings are here https://github.com/biddisco/biddisco.github.io/wiki/Daint-Clang
<jbjnr> you might not want all of my cmake flags, but it might give you an idea of boost+hpx compilation
<teonnik> Thanks, I'll have a look
<github> [hpx] khuck created khuck-patch-1 (+1 new commit): https://git.io/vFvXN
<github> hpx/khuck-patch-1 34a2515 Kevin Huck: Change version of pulled APEX to master...
<heller> teonnik: I'm building fine with clang 5 as well
<heller> teonnik: problem is a boost compiled with a different c++ standard
<zao> Do we go for C++17, while they go for stock?
<zao> Ah yes, we -std=c++17
<zao> Expecting my build to fail nicely soon then.
<teonnik> Boost was compiled with -std=c++11 if I remember correctly, clang 5.0.0 defaults to -std=c++14
<teonnik> Could that be a problem?
<zao> HPX goes for the best it can, unless you explicitly -DHPX_WITH_CXX14=ON or suchlike.
EverYoung has quit [Ping timeout: 246 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<jbjnr> teonnik: yes, boost needs to be compiled with the same flags as hpx, otherwise you see those regex type link errors
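(Concretely, if HPX is built as C++14 then Boost needs matching flags, e.g. building Boost with "b2 cxxflags=-std=c++14 ..." or, going the other way, pinning HPX with -DHPX_WITH_CXX14=ON as zao notes above; the exact invocations are illustrative.)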
EverYoung has quit [Ping timeout: 246 seconds]
hkaiser has quit [Ping timeout: 240 seconds]
pree has quit [Ping timeout: 260 seconds]
zbyerly has quit [Quit: Leafing]
<zao> Interestingly enough, my stock-built 1.65.1 and HPX (w/ inferred C++17) linked properly.
Vir has quit [Quit: ZNC - http://znc.in]
teonnik has quit [Ping timeout: 260 seconds]
<jbjnr> K-ballo: I've tried everything, but my then_execute won't get called. Grrrr.
<K-ballo> jbjnr: where can I see the declaration?
<jbjnr> is the version I'm currently experimenting with
Vir has joined #ste||ar
<K-ballo> that much looks ok
<jbjnr> I cut'n'pasted the contents of the function - not entirely sure they are the right way to do it - I only want a future<result_type> really, but I wasn't sure how to get it.
<jbjnr> anyway, it's getting called at all that's the trouble
<K-ballo> yeah I didn't look at the body of the function, it does not participate in the decision of whether the call is viable since it has a proper return type
<K-ballo> can I paste this in a test or something?
<jbjnr> let me see if I can put it into the other mini test ...
<jbjnr> (same link, updated)
<jbjnr> we need to trigger the then_execute to be called.
<K-ballo> jbjnr: any branch, or cscs?
<K-ballo> by "any" branch I meant master :P
<K-ballo> master compiles, let's see
<K-ballo> yeah, I see it happen
<jbjnr> yes. any branch. - but you know that now
<K-ballo> jbjnr: the executor used as argument is const, your then_execute non-const
<jbjnr> ooh
<K-ballo> your then_execute isn't viable, so it goes to the fallback
<K-ballo> why would that executor be const though?
<jbjnr> I do not see where the const is you refer to
<K-ballo> because then_execute_helper forces it to be const, that's for hkaiser
EverYoung has joined #ste||ar
teonnik has joined #ste||ar
<K-ballo> I suspect those consts are a left over from the first Executors approach
<jbjnr> how did you spot the const in the first place?
<K-ballo> I cheated
<K-ballo> with a direct call to the executor
<jbjnr> thanks for the help. I guess I will have to try to remove the consts, one by one until hpx compiles again ....
<jbjnr> or make my exec const
<K-ballo> remove the one I linked to, and the forward declaration to it in future.hpp, and it works
<jbjnr> nice. thanks
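A reduced illustration (hypothetical names, not the HPX sources) of what K-ballo spotted: a non-const then_execute member is not viable when the executor is passed around as const, so overload resolution silently picks the generic fallback instead:

    #include <utility>

    struct my_executor
    {
        template <typename F>
        void then_execute(F&& f)    // non-const: cannot be called on a const executor
        {
            std::forward<F>(f)();
        }
        // adding const here (or a const overload) makes the call viable again
    };

    template <typename F>
    void dispatch(my_executor const& exec, F&& f)
    {
        // exec.then_execute(std::forward<F>(f));   // error: discards qualifiers
        (void) exec;
        std::forward<F>(f)();                       // generic fallback runs instead
    }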
hkaiser has joined #ste||ar
<jbjnr> hkaiser: K-ballo has solved my problem with then_execute
<jbjnr> fyi
<hkaiser> tks
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<jbjnr> hmmm. it's not enough to remove just those two consts - there are more
<jbjnr> hkaiser: why are the executors const in async_execute and then_execute - etc?
<hkaiser> shrug
<K-ballo> those two were enough for the test case
<jbjnr> yup, but not for my real use-case
<jbjnr> I'll have to touch a bit more code, I think.
<jbjnr> though, actually .... let me try something
eschnett has quit [Quit: eschnett]
<K-ballo> I'm pretty sure those const are a leftover from the first iteration of Executors, which were pointer wrappers
<teonnik> The linking issue with clang 5.0.0 was indeed due to build inconsistencies between HPX and Boost. I rebuilt boost and now there are no problems.
<hkaiser> teonnik: good
<jbjnr> K-ballo: note that the reduced test where you return {} compiles, but the original with all the continuation guff doesn't
<K-ballo> I noticed
<jbjnr> K-ballo: ok. I had this instead of *this for the executor param in the make_continuation call
<jbjnr> compiles and runs ok . seems I am one step closer to my goal. thanks very much. One more beer to add to the huge pile that is building for you one day!
<hkaiser> parsa[w]: yt?
<parsa[w]> hkaiser: aye
teonnik has quit [Ping timeout: 260 seconds]
<hkaiser> the error reporting if blaze is not found is broken
<hkaiser> it says to set 'the following' variables without listing them
<hkaiser> parsa[w]: even if I set blaze_INCLUDE_DIR to the vcpkg dir it does not find anything
<parsa[w]> if you're using vcpkg it detects the folder without you having to do anything
<parsa[w]> i'm clueless about not printing the variable names, though. checking right now
<hkaiser> parsa[w]: phylanx_error() stops cmake
<parsa[w]> oh
<jbjnr> hkaiser: is this the correct way to handle then_execute inside the actual then_execute fn? https://gist.github.com/biddisco/93f1db06e57168366d72e16556b09248#file-gistfile1-txt-L93-L104
<parsa[w]> hkaiser: does this help? https://github.com/STEllAR-GROUP/phylanx/pull/81
eschnett has joined #ste||ar
<hkaiser> should help, yes, thanks
<hkaiser> I'll try later
<hkaiser> parsa[w]
<jbjnr> (what I don't like is that then_execute calls async_execute - which seems like a waste of some calls)
aserio has quit [Quit: aserio]
parsa has joined #ste||ar
<hkaiser> jaafar: only the emulation
<hkaiser> jbjnr: ^^
<jbjnr> emulation?
<jbjnr> then_execute creates a continuation - which ends up calling async
<jbjnr> async_execute
<jbjnr> would be better if then_execute just executed directly
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
patg[w] has quit [Quit: Leaving]
parsa has joined #ste||ar
<hkaiser> jbjnr: only if the executor does not implement then_execute directly, then it emulates the functionality based on what the executor provides
<hkaiser> not sure what you mean by 'execute directly', what does that mean?
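Very roughly, the emulation being discussed looks like the sketch below (assumed names, not HPX's actual executor customization points): when an executor only provides async_execute, then_execute is emulated by attaching a continuation to the predecessor future, and that continuation re-enters async_execute, which is the extra hop jbjnr is referring to.

    #include <hpx/include/future.hpp>
    #include <utility>

    template <typename Executor, typename F, typename Future>
    auto emulated_then_execute(Executor& exec, F&& f, Future predecessor)
    {
        return predecessor.then(
            [&exec, f = std::forward<F>(f)](Future&& pred) mutable
            {
                // goes through async_execute instead of running f directly
                return exec.async_execute(std::move(f), std::move(pred));
            });
    }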
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
rod_t has left #ste||ar [#ste||ar]