hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
hkaiser has quit [Quit: bye]
nanashi55 has quit [Ping timeout: 244 seconds]
nanashi55 has joined #ste||ar
K-ballo has quit [Ping timeout: 244 seconds]
K-ballo has joined #ste||ar
david_pfander has joined #ste||ar
eschnett_ has quit [Read error: Connection reset by peer]
eschnett_ has joined #ste||ar
nikunj has joined #ste||ar
ste||ar-github has joined #ste||ar
ste||ar-github has left #ste||ar [#ste||ar]
<ste||ar-github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://github.com/STEllAR-GROUP/hpx/commit/3a281b1bf990911a408018945a23269a247797b4
<ste||ar-github> hpx/gh-pages 3a281b1 StellarBot: Updating docs
hkaiser has joined #ste||ar
<hkaiser> heller: g'morning
<heller> hkaiser: good mornin
<hkaiser> I'd like to merge #3472 asap
<hkaiser> it fixes master
<heller> hkaiser: no way!
<heller> hkaiser: I already did it
<hkaiser> :D
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell pushed 1 new commit to master: https://github.com/STEllAR-GROUP/hpx/commit/19534c7fa32dcbf053c2d966fa2521d3f9ce6a57
<ste||ar-github> hpx/master 19534c7 Thomas Heller: Merge pull request #3472 from STEllAR-GROUP/clang_test_workaround...
ste||ar-github has left #ste||ar [#ste||ar]
<hkaiser> thanks
<heller> no problem
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell deleted clang_test_workaround at c0f1254: https://github.com/STEllAR-GROUP/hpx/commit/c0f1254
ste||ar-github has left #ste||ar [#ste||ar]
eschnett_ is now known as eschnett
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
eschnett_ has joined #ste||ar
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
nikunj has quit [Remote host closed the connection]
aserio has quit [Ping timeout: 264 seconds]
<hkaiser> heller: any idea how to fix the jemalloc issue?
<hkaiser> (the hpx_option gone wrong)
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] hkaiser created fixing_jemalloc_prefix (+1 new commit): https://github.com/STEllAR-GROUP/hpx/commit/aa1f44bd72c2
<ste||ar-github> hpx/fixing_jemalloc_prefix aa1f44b Hartmut Kaiser: Fixing invalid cmake code if no jemalloc prefix was given
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] hkaiser opened pull request #3473: Fixing invalid cmake code if no jemalloc prefix was given (master...fixing_jemalloc_prefix) https://github.com/STEllAR-GROUP/hpx/pull/3473
ste||ar-github has left #ste||ar [#ste||ar]
<hkaiser> heller: this ^^ should solve it, pls verify
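(Context for the fix above: the bug was CMake code that assumed a jemalloc prefix variable was always set. A minimal sketch of the kind of guard such a fix typically adds — the variable name `JEMALLOC_ROOT` and the directory layout here are assumptions for illustration, not taken from the actual PR:)

```cmake
# Only emit prefix-dependent code when a jemalloc prefix was actually given;
# an unguarded reference to an empty JEMALLOC_ROOT expands to invalid CMake code.
if(JEMALLOC_ROOT)
  include_directories(${JEMALLOC_ROOT}/include)
  link_directories(${JEMALLOC_ROOT}/lib)
endif()
```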
aserio has joined #ste||ar
diehlpk has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
diehlpk has quit [Remote host closed the connection]
diehlpk has joined #ste||ar
david_pfander has quit [Ping timeout: 244 seconds]
aserio has quit [Ping timeout: 260 seconds]
diehlpk has quit [Ping timeout: 252 seconds]
aserio has joined #ste||ar
<heller> hkaiser: hey
<hkaiser> hey
<heller> appear.in?
<hkaiser> heller: I'm in
<hkaiser> heller: can't hear you
diehlpk has joined #ste||ar
hkaiser has quit [Quit: bye]
<heller> for the record: The MPI parcelport on Cray machines just sucks
<heller> plain and simple...
<heller> it needs to go ASAP
<khuck> heller: is libfabric ready for prime-time?
<heller> khuck: I think so yes.
<heller> I was using it on a 9k node run recently
<aserio> heller: slow is better than broken
<heller> even the TCP parcelport is faster than my implementation
<heller> than MPI with my application
<heller> aserio: and really, I am using the libfabric parcelport exclusively on cori, no problems there
<heller> consolidating the parcelports is something for post 1.2
<heller> ha
<heller> that's a very old comment ;)
<heller> verbs should work, as well as libfabric
diehlpk has quit [Ping timeout: 272 seconds]
<heller> grr, I take that back ... just got a segfault with libfabric :/
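(The exchange above is about swapping one parcelport for another at runtime. A hypothetical sketch of how that is typically done through HPX's `hpx.parcel.*` configuration section — the exact key names, especially for libfabric, are assumptions and should be checked against the docs of your HPX version:)

```ini
[hpx.parcel]
mpi.enable = 0
libfabric.enable = 1
```

Equivalent settings can usually also be passed on the command line, e.g. `--hpx:ini=hpx.parcel.mpi.enable=0`.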
hkaiser has joined #ste||ar
eschnett_ has quit [Quit: eschnett_]
mbremer has joined #ste||ar
jbjnr has quit [Read error: Connection reset by peer]
jbjnr has joined #ste||ar
mbremer has left #ste||ar [#ste||ar]
aserio has quit [Quit: aserio]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco force-pushed guided_executor from 11940fe to a3f161f: https://github.com/STEllAR-GROUP/hpx/commits/guided_executor
<ste||ar-github> hpx/guided_executor 47697eb John Biddiscombe: Add new guided_pool_executor that passes thread_schedule_hint to the scheduler...
<ste||ar-github> hpx/guided_executor 6665463 John Biddiscombe: Add overload for pool_numa_hint that can be used with lambdas
<ste||ar-github> hpx/guided_executor 58a2cd4 John Biddiscombe: Implementing then_execute for guided executor .then continuations with numa hints
ste||ar-github has left #ste||ar [#ste||ar]
quaz0r has quit [Ping timeout: 272 seconds]
<heller> khuck: do you remember if we have our input files still lying around somewhere?
<khuck> yes
<khuck> I remember :)
<khuck> can you access /project/projectdirs/xpress/hpx-lsu-cori-II ?
<heller> yes
<khuck> check the scaling subdirectory
<khuck> I copied your scripts, but hacked many of them
<khuck> my run results are in /global/cscratch1/sd/khuck/xpress/bell
<khuck> not sure if I ever got it to run to completion at scale
<khuck> (with apex)
<heller> hmm
<heller> I'm missing the restart file for 12 LoR
<khuck> ah
<khuck> let me look
<heller> got it
<heller> well, I calculated the file size...
<heller> should be 140 gigs
<khuck> kinda big.
<khuck> I think I always copied the files from your $SCRATCH and/or project directory
quaz0r has joined #ste||ar
quaz0r has quit [Ping timeout: 252 seconds]
<khuck> @hkaiser good news, or bad news?
<khuck> hkaiser: let's try that again... would you like the good news or the bad news?
<khuck> (the bad news isn't that bad)
akheir has quit [Quit: Leaving]
<hkaiser> khuck: what's the bad news?
<khuck> HPX still calls a handful of POSIX routines from worker threads
<heller> hkaiser: text pushed
<khuck> but phylanx no longer does
<hkaiser> heller: thanks! bravo! much appreciated!
<heller> :D
<hkaiser> khuck: thanks for this - I was not aware of those
<khuck> np
<heller> 1.4 GB/s aggregated bandwidth isn't that much ;)
<hkaiser> khuck: those calls look like they happen before the runtime is up, though
<khuck> hkaiser: they were on the OS thread that was a "worker". Like I said, they are pretty minor - they hardly showed up on the profile and I had to hunt for them
<khuck> sorry, across two different threads. but workers nonetheless
<hkaiser> khuck: ok
<hkaiser> I'll try to identify them
<heller> late module loading or ini parsing, maybe?
<hkaiser> the parse_command_line should happen right at the start, no idea why it shows up as running on a worker thread
<khuck> it's also possible it happens before that thread is labeled as a "worker"
quaz0r has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell closed pull request #3473: Fixing invalid cmake code if no jemalloc prefix was given (master...fixing_jemalloc_prefix) https://github.com/STEllAR-GROUP/hpx/pull/3473
ste||ar-github has left #ste||ar [#ste||ar]