aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
EverYoun_ has joined #ste||ar
fane_faiz1 has quit [Ping timeout: 240 seconds]
EverYoung has quit [Ping timeout: 255 seconds]
EverYoun_ has quit [Ping timeout: 255 seconds]
fane_faiz1 has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
rod_t has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
diehlpk has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
EverYoung has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
rod_t has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
K-ballo has quit [Quit: K-ballo]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
EverYoung has joined #ste||ar
fane_faiz1 has quit [Ping timeout: 248 seconds]
EverYoung has quit [Ping timeout: 256 seconds]
diehlpk has quit [Ping timeout: 258 seconds]
hkaiser has quit [Quit: bye]
rod_t has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
rod_t has quit [Client Quit]
nanashi55 has quit [Ping timeout: 260 seconds]
nanashi55 has joined #ste||ar
EverYoung has joined #ste||ar
rod_t has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
jakemp has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
<github> [hpx] msimberg pushed 3 new commits to master: https://git.io/vbnBq
<github> hpx/master 9b1b369 Alex Hirsch: Fix setting of default build type
<github> hpx/master b8e7eb4 Mikael Simberg: Merge branch 'master' into fix-default-build-type
<github> hpx/master 60e9fa0 Mikael Simberg: Merge pull request #3033 from W4RH4WK/fix-default-build-type...
fane_faiz1 has joined #ste||ar
jbjnr has joined #ste||ar
jaafar has quit [Ping timeout: 260 seconds]
fane_faiz1 has quit [Ping timeout: 240 seconds]
<heller> jbjnr: the pycicle status looks great!
<jbjnr> thanks
<jbjnr> it now correctly scrapes the results from each build and should update github with the results.
<jbjnr> but - we might need a pycicle github account so that my face isn't attached to each one :)
<jbjnr> also, we might want pycicle-cscs/pycicle-fau/pycicle-lsu probably so that if builds from one succeed, but from another fail, the results don't get overwritten by the other one ...
<heller> Yes
<heller> I was thinking that we should report separately for each build
<heller> Such that we have something like: pycicle/cray(daint)/gcc-5.3/boost-1.63/...
<heller> Each site would get its own status then
<heller> Maybe coerce the config/build/test status into one?
<heller> So we could have something like: pycicle/linux(greina)/gcc-5.3.0/boost-1.65.1/release
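The naming scheme heller proposes can be sketched in a few lines of Python. This is only an illustration, not pycicle's actual code: the function names, parameters, and the example URL are hypothetical. The payload keys (state, context, target_url, description) follow GitHub's commit status API.

```python
def status_context(platform, machine, compiler, boost, build_type, prefix="pycicle"):
    """Build one status context string per configuration (hypothetical helper)."""
    return f"{prefix}/{platform}({machine})/{compiler}/boost-{boost}/{build_type}"


def status_payload(context, passed, target_url):
    """Assemble the body of a commit status update for one configuration.

    Nothing is sent here; posting the payload to GitHub is left out of the sketch.
    """
    return {
        "state": "success" if passed else "failure",
        "context": context,
        "target_url": target_url,
        "description": "all steps passed" if passed else "at least one step failed",
    }


ctx = status_context("linux", "greina", "gcc-5.3.0", "1.65.1", "release")
# The URL below is a placeholder, not a real dashboard link.
payload = status_payload(ctx, passed=True, target_url="http://example.invalid/build")
```

Because each configuration gets its own context string, a result from one site or configuration would not overwrite the result from another.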
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
<jbjnr> heller: I think, one status per build is too many, but one per PR on each machine would be ok
<jbjnr> so if Release AND debug AND boostv1 AND boostv2 AND etc all pass, then status=PASS, else fail ???
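The "all pass, else fail" rule jbjnr describes amounts to a logical AND over the per-configuration results. A minimal Python sketch, assuming results arrive as a mapping from a configuration name to a pass/fail flag (the function name, the mapping layout, and the "pending" case for an empty result set are assumptions, not pycicle behaviour):

```python
from typing import Mapping


def fold_status(results: Mapping[str, bool]) -> str:
    """Fold per-configuration results into a single status for one PR on one machine."""
    if not results:
        # Assumption: nothing has been reported yet, so do not claim success.
        return "pending"
    return "success" if all(results.values()) else "failure"


# Hypothetical configuration names, for illustration only.
example = {
    "release/boost-1.63": True,
    "debug/boost-1.63": True,
    "release/boost-1.65.1": False,
}
folded = fold_status(example)
```

The trade-off raised in the discussion is visible here: the folded value says that something failed, but not which configuration.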
<jbjnr> I will do a clang build setup on daint today
<heller> jbjnr: ok, if everything is coordinated inside pycicle then I am fine
<jbjnr> currently it is not, but that would be a next improvement - still working on getting a stable dashboard display etc
<heller> ok
<heller> the problem with one status per machine is that it is harder to digest which configuration actually failed
david_pfander has joined #ste||ar
<heller> jbjnr: if all checks pass, the status is folded anyways
<heller> i'd go for one status for each config
<jbjnr> I see. Might be a bit messy though
<jbjnr> with so many statuses
<heller> not if all pass ;)
<heller> one per configuration, also has the advantage, that the status URL could directly forward to the build id
<jbjnr> - we do not know the Build ID - !!!
<jbjnr> that's #1 problem with our cdash thingy
fane_faiz1 has joined #ste||ar
simbergm has joined #ste||ar
mcopik has quit [Ping timeout: 240 seconds]
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vbn6L
<github> hpx/gh-pages d29633d StellarBot: Updating docs
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
<simbergm> heller: the stacksize test seems to still have occasional failures :( I'll keep an eye on it
<simbergm> it may be only with gcc 4.9
<github> [hpx] msimberg closed pull request #3035: Make parallel unit test names match build target/folder names (master...fix-parallel-test-names) https://git.io/vbqB9
simbergm has quit [Ping timeout: 260 seconds]
fane_faiz1 has quit [Ping timeout: 248 seconds]
K-ballo has joined #ste||ar
fane_faiz1 has joined #ste||ar
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
heller has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
simbergm has joined #ste||ar
fane_faiz1 has quit [Ping timeout: 248 seconds]
simbergm has quit [Ping timeout: 240 seconds]
fane_faiz1 has joined #ste||ar
hkaiser has joined #ste||ar
simbergm has joined #ste||ar
fane_faiz2 has joined #ste||ar
fane_faiz1 has quit [Ping timeout: 248 seconds]
hkaiser has quit [Read error: Connection reset by peer]
hkaiser_ has joined #ste||ar
hkaiser_ has quit [Client Quit]
hkaiser has joined #ste||ar
<hkaiser> jbjnr: thanks for adding the github integration for pycicle
<hkaiser> jbjnr: may I ask however that the link posted to github will point to a more descriptive spot?
<hkaiser> for instance, for #3033 it points here: http://cdash.cscs.ch/index.php?project=HPX&date=2017-12-05&filtercount=1&field1=buildname/string&compare1=63&value1=3033-fix-default-build-type
<hkaiser> no way to find out what build that is and why the executables are not being built
<hkaiser> jbjnr: hmm for #3021 however it gives more information: http://cdash.cscs.ch/index.php?project=HPX&date=2017-12-06&filtercount=1&field1=buildname/string&compare1=63&value1=3021-segfault-fix
<hkaiser> jbjnr: also, why isn't there a test running for #3039?
<hkaiser> ahh, has conflicts, I see
eschnett has joined #ste||ar
<heller> is rostam down?
jakemp has quit [Ping timeout: 260 seconds]
<github> [hpx] hkaiser pushed 1 new commit to fixing_3027: https://git.io/vbcek
<github> hpx/fixing_3027 a67dcd3 Hartmut Kaiser: Fixing inspect problems
daissgr has quit [Ping timeout: 240 seconds]
fane_faiz2 has quit [Ping timeout: 246 seconds]
simbergm has quit [Ping timeout: 240 seconds]
daissgr has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
hkaiser has quit [Quit: bye]
Smasher has joined #ste||ar
aserio has joined #ste||ar
eschnett has quit [Quit: eschnett]
rod_t has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
eschnett has joined #ste||ar
jaafar has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
hkaiser has joined #ste||ar
Smasher has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
aserio has quit [Ping timeout: 264 seconds]
aserio has joined #ste||ar
<wash[m]> aserio: I won't be on the call today
<aserio> wash[m]: thanks for the heads up!
<jbjnr> hkaiser: sorry about the bogus github status link for the PR you mentioned. I believe that this was caused by my testing of the github status setting and I might have written some crap to several branches in the form of status updates. I think it is behaving as expected now. please tell me if you find any more like that.
<hkaiser> jbjnr: ok, will do - thanks
rod_t has joined #ste||ar
fane_faiz2 has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
<github> [hpx] biddisco created fixing_2092 (+1 new commit): https://git.io/vbczS
<github> hpx/fixing_2092 121753b John Biddiscombe: Do not bind test running to cmake test build rule
<zao> Load average: 183.29 70.55 26.69
<zao> Upside of things, I just spawned 50 parallel test runs of a single build. I hope that all tests are well-behaved and won't drive me into swap :P
<jbjnr> hkaiser: did you mention somewhere that you had subsumed the work I did on making future_then pass predicate using && into some other branch/work?
<hkaiser> yes
<hkaiser> #3039
<zao> Turned out that I indeed could rebind the Testing directory of a build tree, so they won't trample on each other.
<jbjnr> ok, great. Should I start testing that branch for my work?
<zao> Unless there's a test that writes to the build directory itself. I sure hope not :)
<jbjnr> zao: probably not, but not sure
EverYoung has joined #ste||ar
<jbjnr> most tests only write to std::cout/err
<hkaiser> jbjnr: yes
<hkaiser> pls
<jbjnr> some examples might create files ...
<jbjnr> hkaiser: ok. Will it work ok, or is there anything I need to look out for?
<zao> This is a lot of threads... oh boy.
EverYoung has quit [Ping timeout: 255 seconds]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
<hkaiser> jbjnr: circleci is green now, I think
<jbjnr> ?
<hkaiser> I meant for #3039
<jbjnr> aha. great. cdash still shows errors, but they are the same as master, so no surprises there. thanks
<jbjnr> I'll play with it soon. probably not tonight though
<hkaiser> jbjnr: sure, all scheduling is done through executors now
<jbjnr> awesome!
mcopik has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
david_pfander has quit [Ping timeout: 240 seconds]
rod_t has joined #ste||ar
rod_t has quit [Client Quit]
<zao> Soooo.... the machine appears hung solid :D
<jbjnr> surprise!
<jbjnr> when you check -you'll find partitioned_vector_xxx is the problem
rod_t has joined #ste||ar
<jbjnr> I just submitted an issue for that cos it brought nodes down here when I messed up this weekend.
<jbjnr> ok. I was wrong. sorry
<zao> (those processes are just the tip of the 50 task run, tho)
<zao> So quite possible that there's some naughty test down below, or this AMD machine is still not stable under load.
<zao> And oh, this is running tests, the build is long ago.
<zao> The pain of building partitioned_*, I know very well.
<zao> I just allocate a ton of swap and -j it anyway on this 8c16t machine w/ 32G of RAM :)
<zao> Time to head home and see if the house is on fire or not.
hkaiser has quit [Quit: bye]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
aserio has quit [Ping timeout: 240 seconds]
hkaiser has joined #ste||ar
eschnett has quit [Quit: eschnett]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<zao> jbjnr: This is completely normal, right? :D https://imgur.com/undefined
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
fane_faiz2 has quit [Ping timeout: 248 seconds]
rod_t has joined #ste||ar
aserio has joined #ste||ar
tjtn has joined #ste||ar
<tjtn> Hey all
<hkaiser> tjtn: hey
aserio has quit [Ping timeout: 248 seconds]
<tjtn> So I found out about Stellar through a repo on GitHub, and started poking around the site a bite
<tjtn> *bit
aserio has joined #ste||ar
<tjtn> Anyway, I'm working on an article for my blog (https://www.timnoetzel.com) on the user experience for developers who code complex systems, I was wondering if I could interview somebody from your team
<tjtn> Just a few short questions over email
<hkaiser> tjtn: sure
<hkaiser> whom would you like to talk to?
<tjtn> Basically the article would cover developer experience, and ideally would feature somebody who's got some strong opinions about it
<tjtn> Maybe one or two of you can PM me your emails and we can get a conversation going there?
<zao> Some of these tests are painfully low on CPU usage - https://i.imgur.com/d7dfh8W.png
hkaiser has quit [Quit: bye]
<zao> Gah, the compile tests fire in here... was there a knob to remove all those?
<zao> Suddenly, fifty clangs in my top.
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
hkaiser has joined #ste||ar
rod_t has joined #ste||ar
mcopik has quit [Ping timeout: 248 seconds]
<aserio> diehlpk_work: yt?
EverYoung has quit [Ping timeout: 255 seconds]
parsa[w] has quit [*.net *.split]
auviga has quit [*.net *.split]
hkaiser has quit [*.net *.split]
jbjnr has quit [*.net *.split]
nanashi55 has quit [Ping timeout: 260 seconds]
taeguk[m] has quit [Ping timeout: 240 seconds]
thundergroudon[m has quit [Ping timeout: 255 seconds]
autrilla has quit [Ping timeout: 255 seconds]
nanashi55 has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
parsa[w] has joined #ste||ar
auviga has joined #ste||ar
jbjnr has joined #ste||ar
mcopik has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
tjtn has quit [Read error: Connection reset by peer]
rod_t has joined #ste||ar
<zao> Hrm... I wonder if I'm running into the startup delay problem. Pretty much all of my tests use >10s, some as low as 5s.
<zao> Mostly yawning CPU-wise, even with 100 concurrent tests.
<zao> I guess I should pull something newer than e9f64ec3a1ddd72870f01a9e7fd0e33903ee117c
rod_t has quit [Client Quit]
<zao> Or it's just something particular to these tests.
<zao> [Wed Dec 6 20:37:30 2017] exclusive_scan_: page allocation stalls for 15868ms, order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
<zao> Hehe :)
aserio has joined #ste||ar
EverYoung has joined #ste||ar
rod_t has joined #ste||ar
eschnett has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<zao> It seems like most of the time, HPX binds to cores 1, 3, 5, 7; across all tests.
<zao> So concurrent test runs compete over the same physical cores.
<zao> That's 12 completely unused cores.
<zao> (well, HARTs, as there's 8 cores)
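The contention zao describes comes from every concurrent test run landing on the same few hardware threads. One way around it is to hand each run its own disjoint slice of logical CPUs. A minimal Python sketch of that partitioning, assuming 16 hardware threads and 4 per run as in the discussion; the function name is hypothetical, and whether HPX honours an externally imposed affinity mask is not established by the log:

```python
from typing import List


def disjoint_cpu_sets(total_cpus: int, cpus_per_run: int) -> List[List[int]]:
    """Split logical CPU ids 0..total_cpus-1 into non-overlapping groups.

    Leftover CPUs that do not fill a whole group are left unassigned.
    """
    if cpus_per_run <= 0:
        raise ValueError("cpus_per_run must be positive")
    return [
        list(range(start, start + cpus_per_run))
        for start in range(0, total_cpus - cpus_per_run + 1, cpus_per_run)
    ]


slots = disjoint_cpu_sets(16, 4)
# slots[i] could then be applied to run i, for example with
# os.sched_setaffinity(pid, slots[i]) on Linux; that step is not shown here.
```

The sketch ignores which logical ids are hyperthread siblings of the same physical core. That numbering is machine specific and would need to be read from the topology.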
<zao> Don't make me install SLURM on this god-forgotten machine just to get proper usage out of my cores.
rod_t has joined #ste||ar
<K-ballo> zao: fix it fix it fix it
<zao> I could "fix" it by building without hwloc, I guess :)
rod_t has quit [Client Quit]
<zao> I guess it's perfect for provoking the AMD problems, which seem to manifest if you peg a core and exercise its HT twin.
rod_t has joined #ste||ar
rod_t has quit [Client Quit]
hkaiser has joined #ste||ar
rod_t has joined #ste||ar
rod_t has quit [Client Quit]
EverYoun_ has joined #ste||ar
EverYoun_ has quit [Remote host closed the connection]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 250 seconds]
rod_t has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
rod_t has quit [Client Quit]
rod_t has joined #ste||ar
<github> [hpx] biddisco closed pull request #3052: Do not bind test running to cmake test build rule (master...fixing_2092) https://git.io/vbczH
<github> [hpx] biddisco deleted fixing_2092 at 121753b: https://git.io/vbCCA
rod_t has left #ste||ar ["Textual IRC Client: www.textualapp.com"]
aserio has quit [Quit: aserio]
EverYoun_ has quit [Ping timeout: 252 seconds]