aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has quit [Ping timeout: 246 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
diehlpk has quit [Ping timeout: 240 seconds]
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
hkaiser has quit [Ping timeout: 248 seconds]
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
parsa has quit [Client Quit]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
gedaj has quit [Quit: leaving]
gedaj has joined #ste||ar
gedaj has quit [Quit: leaving]
gedaj has joined #ste||ar
gedaj has quit [Quit: leaving]
gedaj has joined #ste||ar
gedaj has quit [Quit: leaving]
gedaj has joined #ste||ar
gedaj has quit [Client Quit]
gedaj has joined #ste||ar
gedaj has quit [Client Quit]
gedaj has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has quit [Ping timeout: 252 seconds]
jaafar has quit [Ping timeout: 248 seconds]
david_pfander has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vFs6d
<github> hpx/gh-pages 397636a StellarBot: Updating docs
msimberg has quit [Ping timeout: 248 seconds]
msimberg has joined #ste||ar
<heller> msimberg: hey
<heller> msimberg: how is it going?
<jbjnr> group meeting - he'll be out at lunchtime :(
<msimberg> heller: I found the reason for the last (for now) lockup
<heller> msimberg: nice!
<msimberg> not sure what the best way to fix it is, but I have it somehow working at least
<heller> glad to hear you found the issues
<heller> ok, what's the last reason?
<msimberg> cleanup_threads wants the thread map to be empty
<msimberg> otherwise it returns false
<msimberg> so it was locking up when removing the first pu
<msimberg> because hpx_main goes on the first one
<heller> yeah...
<msimberg> and it stays in the thread map even if it gets stolen
<heller> yup, only destroyed threads get removed
<msimberg> so for now cleanup_threads returns true if terminated_items_count is 0
<heller> ok
<msimberg> I think that's nicer because it's a bit confusing that cleanup_threads cares about anything but cleaning up threads
<heller> I agree
<msimberg> I think it still works because the termination waits until there are 1 + os threads left
<heller> yes
<heller> termination was one of the bigger issues with my patches back then
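For context, a minimal sketch of the fix msimberg describes above, on a toy queue type. The member names (thread_map_, terminated_items_count_) are assumptions inferred from the conversation, not the actual HPX thread_queue source:

```cpp
#include <cstddef>
#include <map>

struct toy_thread_queue
{
    std::map<int, int> thread_map_;           // all live threads, incl. suspended ones
    std::size_t terminated_items_count_ = 0;  // threads waiting to be recycled

    void recycle_terminated() { /* delete data of terminated threads */ }

    bool cleanup_terminated()
    {
        recycle_terminated();

        // Old check (deadlocks when removing the first PU, because
        // hpx_main stays in this map even after being stolen):
        //   return thread_map_.empty();

        // New check: only care about threads that are actually
        // terminated and waiting to be cleaned up.
        return terminated_items_count_ == 0;
    }
};
```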
<msimberg> there are conflicting interests between dynamically removing a pu and removing one when shutting down the whole runtime
<msimberg> but gtg now
<msimberg> I'll push my changes to a branch if you want to try it out
<heller> msimberg: yup, that's why i said earlier that we should keep it separate
hkaiser has joined #ste||ar
msimberg has quit [Ping timeout: 248 seconds]
msimberg has joined #ste||ar
<heller> hkaiser: may I draw your attention to the failing binpacking tests on buildbot ;)?
<hkaiser> will look once I find the time
<heller> seems to be related to the local_new changes
<hkaiser> uhh ohh
<hkaiser> will look
<hkaiser> can't be much
<heller> at least they failed after the merge
<github> [hpx] sithhell created ubsan (+1 new commit): https://git.io/vFGJo
<github> hpx/ubsan afa254d Thomas Heller: Fixing integer overflows
<heller> hkaiser: do you want me to dry run the HPX tutorial?
<hkaiser> heller: what for?
<heller> aka proofread it ;)
<hkaiser> ahh, sure
<heller> just wanted to offer help
<github> [hpx] sithhell opened pull request #2979: Fixing integer overflows (master...ubsan) https://git.io/vFGJQ
<heller> and I am really interested in the jupyter integration
<hkaiser> nod
<hkaiser> I don't think this will go well :/
<heller> what's the problem there?
<jbjnr> (lol)
<hkaiser> jbjnr: you misunderstand - the problem is not the technology
<heller> do you have the material ready yet?
<hkaiser> think so, yes
<jbjnr> heller: are you doing the tutorial at HLRS next year?
<heller> jbjnr: if either you or jbjnr (or both!) do it with me, yes!
<jbjnr> I got into trouble for our one this year as nothing worked, so if you/we do one at HLRS I will prepare new material.
<jbjnr> I have some ideas ...
<heller> ok
<heller> what was the reaction?
<heller> jbjnr: i'll only say "yes" if I don't have to do it alone
<jbjnr> I'll get clearance and get back to you
<heller> thanks
<diehlpk_work> heller, I can also support you
hkaiser has quit [Quit: bye]
aserio has joined #ste||ar
<wash[m]> aserio: we got kicked off
<aserio> wash[m]: yea I tried to fix audio
<aserio> didn't work
<wash[m]> Call in by phone
<heller> Operation Bell call going on right now or in an hour?
eschnett has quit [Quit: eschnett]
patg[w] has joined #ste||ar
eschnett has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
hkaiser has joined #ste||ar
<hkaiser> jbjnr: here now
<heller> PSA: I updated the stellar-group/build_env:debian_clang image to use clang-5.0.0 + lld in the hope of faster build times
<heller> I am just now testing it with latest HPX master
<heller> please report any problems this upgrade might entail
<zao> heller: I'm wondering a bit... would there be value in being able to ask a bot to run a subset of tests repeatedly for a commit?
<zao> Feels like I'm reinventing plain boring buildbot/cdash otherwise.
diehlpk has joined #ste||ar
<heller> well
<zao> Trying to figure out how to get some value out of the Ryzen box :P
<heller> buildbot is able to do just that
<heller> theoretically
<heller> zao: what you describe sounds like a cronjob
<heller> as of the value: Yes, I think there is a lot of value there to identify more race conditions etc.
<zao> Traditional builds are of the "one build, one full testrun" kind, right?
<zao> I'm thinking of having a box where one can request soaking of some tests to build confidence.
<zao> And if it doesn't have any particular jobs, just track heads and run their suites.
<heller> that sounds incredibly valuable to me
patg[w] has quit [Quit: Leaving]
<heller> I'd also like to have an easy answer to the question: "since which commit did this test fail?"
<zao> That's a good one.
<zao> Things I need to figure out there are the interface to request jobs and what kind of queries one can make. Also how to run multiple HPXes on the same machine without them colliding port-wise.
<heller> easy as in: just tell me on the dashboard: which tests have been failing repeatedly (and since when), and which tests fail only occasionally, maybe starting with commit XXX
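A rough sketch of the soak runner zao is describing, assuming a test is just an executable whose non-zero exit status signals failure; the default test path and run count are placeholders, and a real bot would take its jobs from buildbot or GitHub events instead of the command line:

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    // Placeholder defaults; a real bot would pull these from a job queue.
    std::string cmd = argc > 1 ? argv[1] : "./my_test";
    int const runs = argc > 2 ? std::atoi(argv[2]) : 100;

    int failures = 0;
    for (int i = 0; i != runs; ++i)
    {
        // std::system's return value is platform-specific, but zero
        // conventionally means the test binary exited cleanly.
        if (std::system(cmd.c_str()) != 0)
            ++failures;
    }
    std::cout << failures << "/" << runs << " runs failed\n";
    return failures == 0 ? 0 : 1;
}
```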
<jbjnr> hkaiser: ok
<zao> I see.
<heller> multiple HPXes: only configuring the MPI parcelport should work
<zao> Would that exclude some TCP-only tests?
<heller> yes
<hkaiser> jbjnr: skype?
<heller> but there are no TCP-only tests
<zao> Ah.
<heller> everything that runs with TCP is also run with MPI
<zao> Could've sworn there were ones with "tcp" in the name :)
<zao> I guess that I might be able to shove the run into a container, and not have to care.
<heller> yes, and the same with MPI, if you had enabled it
<heller> thats another option
<heller> for the request interface: I'd look into the GitHub API. you can subscribe to multiple events
<heller> for example, PRs created, comments to commits made etc.
<heller> this should contain most information
<heller> and you could say in a comment: "zao, please soak test this commit"
<heller> or something along those lines
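A bare-bones illustration of heller's webhook suggestion, using only POSIX sockets so it stays dependency-free. A real bot would use a proper HTTP library, parse the JSON event payload, and verify GitHub's X-Hub-Signature header; port 8080 is an arbitrary choice:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    if (bind(srv, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        listen(srv, 16) != 0)
    {
        std::perror("bind/listen");
        return 1;
    }
    for (;;)
    {
        int cli = accept(srv, nullptr, nullptr);
        if (cli < 0)
            continue;
        char buf[65536];
        ssize_t n = read(cli, buf, sizeof(buf) - 1);
        if (n > 0)
        {
            buf[n] = '\0';
            std::fputs(buf, stdout);    // headers + JSON event payload
        }
        // Minimal response so GitHub marks the delivery as received.
        char const ok[] = "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n";
        (void) write(cli, ok, sizeof(ok) - 1);
        close(cli);
    }
}
```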
<heller> mighty herder of cats, please give me some confidence!
<zao> Ah yes, the stuff you can hook in if you control the repo?
<zao> I've just been looking at the RSS feed.
<heller> yes, this stuff
<heller> there is an rss feed?
<heller> nice
<heller> ok, no significant improvement for building core
<msimberg> heller: current state for the throttle test is here: https://github.com/msimberg/hpx/tree/fix-throttle-test
<msimberg> I also added a simple counter for the background threads, needs cleaning up though
<msimberg> if you have time I have a couple of questions about the scheduler
<zao> Things like https://github.com/stellar-group/hpx/commits/master.atom exist, but they only cover integrations and commits, sadly.
aserio has joined #ste||ar
wash has joined #ste||ar
diehlpk has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<heller> msimberg: thanks for the branch
EverYoung has quit [Remote host closed the connection]
<heller> msimberg: ask your questions
EverYoung has joined #ste||ar
<msimberg> heller: just as a warning I haven't run all the tests yet
<heller> Examples got a 4 minute improvement
<heller> msimberg: sure
<msimberg> when I was running some of the tests it never happened, but I would've thought the background thread would be "suspended" whenever it's not running
<heller> msimberg: when a background thread suspends
<heller> No
<msimberg> it seems to be pending most of the time
<msimberg> but how would that get triggered then?
<msimberg> but do you know...
<heller> It happens, when for example, the code executed needs to wait on something. For example calling future.get or waiting on a spinlock
<heller> The primary case where this happens is with some direct actions
wash has quit [Quit: leaving]
<heller> We then put the task onto the regular queues to avoid starvation there
wash has joined #ste||ar
<msimberg> ah okay, I guess this could get triggered in the background thread if there's some agas stuff or similar going on
<heller> Yes, for example
<msimberg> I'm running only on one node so never see it
<msimberg> ok
<msimberg> second, when would the scheduler be in the suspended state? there were very few places where state_suspended is used...
<heller> We currently don't use that, iirc
<msimberg> and the runtime and scheduler use the same state enum, right? some states seem unused?
<heller> Yes
<msimberg> there was one place for the this_thread_executor that uses it
<msimberg> anyway
<msimberg> I'm asking because it would be useful to use the suspended/suspending state for suspending with condition variables, but I wasn't sure if it's okay to use state_suspended or if it needs another one
<heller> I would say yes
<msimberg> okay, thanks
<msimberg> I'll try it out and see if it works anyway
<heller> Good
<msimberg> heller: one more to clarify
<heller> Yes
<heller> Ahhh, right
<msimberg> this piece of code doesn't explicitly schedule the background_thread (as it says)
<msimberg> based on what you said earlier, an LCO in this context would be e.g. a get that blocks and once that is ready it will really be added to the thread queues, correct?
<msimberg> i.e. it will be added to the queue even if background_thread itself wasn't originally in the queue
<heller> Ok, when an hpx thread suspends, it doesn't sit in any queue. The thread id is being held by some data structure (usually a condition_variable) which will set it back to pending eventually and put it into a queue
<heller> Right
<heller> You need another thread to have the thread transition from suspended to pending
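A toy model of the protocol heller describes; every name in it is invented for illustration, and none of it is the real HPX scheduler:

```cpp
#include <cassert>
#include <deque>
#include <vector>

enum class thread_state { pending, suspended };
using thread_id_type = int;

struct scheduler
{
    std::deque<thread_id_type> run_queue;
    std::vector<thread_state> states =
        std::vector<thread_state>(16, thread_state::pending);

    void schedule(thread_id_type id) { run_queue.push_back(id); }
    void set_state(thread_id_type id, thread_state s) { states[id] = s; }
};

struct toy_condition_variable
{
    std::vector<thread_id_type> waiters;

    // Runs on the suspending thread: it hands its id to the condition
    // variable and afterwards sits in *no* run queue.
    void wait(scheduler& sched, thread_id_type self)
    {
        sched.set_state(self, thread_state::suspended);
        waiters.push_back(self);
        // ... the real scheduler would switch to another task here ...
    }

    // Must run on a *different* thread: it flips the waiter back to
    // pending and only then re-enqueues it.
    void notify_one(scheduler& sched)
    {
        if (waiters.empty())
            return;
        thread_id_type id = waiters.back();
        waiters.pop_back();
        sched.set_state(id, thread_state::pending);
        sched.schedule(id);
    }
};

int main()
{
    scheduler sched;
    toy_condition_variable cv;
    cv.wait(sched, 7);               // thread 7 suspends; queues stay empty
    assert(sched.run_queue.empty());
    cv.notify_one(sched);            // another thread wakes it up
    assert(sched.run_queue.front() == 7);
}
```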
<msimberg> okay, that helps a lot, I haven't looked so much at the lcos yet
<msimberg> thanks!
<heller> Your welcome
<heller> You are
parsa has joined #ste||ar
jaafar has joined #ste||ar
hkaiser has quit [Quit: bye]
EverYoung has quit [Ping timeout: 252 seconds]
EverYoung has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
david_pfander has quit [Ping timeout: 260 seconds]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
aserio has quit [Ping timeout: 246 seconds]
hkaiser has joined #ste||ar
<github> [hpx] K-ballo force-pushed fixing-2904 from fd485f2 to 98dc643: https://git.io/vFGxe
<github> hpx/fixing-2904 98dc643 Agustin K-ballo Berge: Temporarily disable allocator rebinding in pack traversal
eschnett has quit [Ping timeout: 260 seconds]
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
eschnett has joined #ste||ar
parsa has joined #ste||ar
<heller> Docker compile is killing everything :/
hkaiser has quit [Quit: bye]
<github> [hpx] K-ballo force-pushed then-fwd-future from 75fcf30 to 654091a: https://git.io/vFq60
<github> hpx/then-fwd-future d631d51 Agustin K-ballo Berge: Fix future used with continuation on .then()
<github> hpx/then-fwd-future 96fa9e5 Agustin K-ballo Berge: Improve .then interface
<github> hpx/then-fwd-future e2fef6a Agustin K-ballo Berge: Improve error messages caused by misuse of .then
hkaiser has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
<heller> weeh, 30 minutes improvement
<K-ballo> heller: on what?
<heller> circle ci
<heller> 4:30 to 4:00
<K-ballo> what did you change/
<heller> I updated to clang 5.0.1 and lld
<K-ballo> sweet
<heller> do you know if that enables lto by default?
<K-ballo> no idea
<zao> I saw Chandler explicitly used -flto=thin when demoing in the recent pacific++ presentation, not sure what it does by default, if at all.
<zao> (that was also trunk)
EverYoun_ has joined #ste||ar
mbremer has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
<mbremer> Hi, I'm trying to figure out what percentage of my application is dedicated to overhead. Could I represent this by /threads/time/average-overhead divided by /threads/time/average?
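For reference, a hedged sketch of reading the two counters mbremer mentions from inside an application. The performance_counter client API here is written from memory and should be checked against the HPX docs; also, whether average-overhead divided by average is the right "overhead percentage" depends on whether /threads/time/average already includes the overhead:

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/include/performance_counters.hpp>
#include <iostream>

int main()
{
    using hpx::performance_counters::performance_counter;

    performance_counter overhead(
        "/threads{locality#0/total}/time/average-overhead");
    performance_counter average(
        "/threads{locality#0/total}/time/average");

    // Both counters report nanoseconds per HPX thread, so the units
    // cancel and the result is a plain fraction.
    double o = overhead.get_value<double>().get();
    double a = average.get_value<double>().get();
    std::cout << "overhead fraction: " << o / a << "\n";
    return 0;
}
```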
hkaiser has quit [Quit: bye]
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
parsa has joined #ste||ar
aserio has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
hkaiser has joined #ste||ar
mbremer has quit [Quit: Page closed]
aserio has quit [Ping timeout: 252 seconds]
EverYoun_ has joined #ste||ar
<heller> zao: ThinLTO is a Google optimization for multi-threaded linking
EverYoung has quit [Ping timeout: 258 seconds]
<zao> It was advertised as the kind of thing that enabled LTO at all.
<heller> zao: there is a cppcon talk about it. Pretty cool
<zao> I've got a lot to catch up on.
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
patg[w] has joined #ste||ar
patg[w] has quit [Quit: Leaving]
<parsa[w]> hkaiser: #99 is ready
<github> [hpx] K-ballo opened pull request #2981: Temporarily disable allocator rebinding in pack traversal (master...fixing-2904) https://git.io/vFZ0A
<K-ballo> why do I get an *error* about finalize being deprecated?
<hkaiser> uhh
<hkaiser> that's not right
<K-ballo> dataflow::finalize (with v1 executors)
<K-ballo> SDL checks, turns deprecated warnings into errors, I've hit this before
<jbjnr> hkaiser: what are the smart executors that you will talk about at the espm workshop at SC?
<jbjnr> I just got an email with the programme and saw an hpx paper mentioned
<jbjnr> I like the sound of "Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs"
aserio has joined #ste||ar
<hkaiser> jbjnr: it's Zahra's work, using ML for finding the right parameters
<hkaiser> K-ballo: ahh yes, that one is right
EverYoun_ has joined #ste||ar
<jbjnr> ML?
<jbjnr> right parameters for what? chunk sizes or something of that kind?
EverYoung has quit [Ping timeout: 258 seconds]
EverYoun_ has quit [Ping timeout: 258 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
gedaj has quit [Quit: leaving]
eschnett has quit [Quit: eschnett]
aserio has quit [Quit: aserio]