K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<hkaiser>
gnikunj[m]: yt?
hkaiser has quit [Quit: bye]
bita has quit [Ping timeout: 264 seconds]
bita has joined #ste||ar
bita has quit [Ping timeout: 260 seconds]
heller1 has quit [Quit: Idle for 30+ days]
<gonidelis[m]>
what's the purpose of having just an `ExPolicy` here as an argument, compared to the rvalue ref argument in the `parallel()` overload?
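For reference, the general C++ difference between a by-value `ExPolicy` parameter and an `ExPolicy&&` parameter in a function template can be sketched as below. This is only an illustration of template argument deduction, not the HPX overloads being asked about: `seq_policy` and both function names are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an execution policy type (not HPX's real class).
struct seq_policy {};

// By-value parameter: ExPolicy is always deduced as the plain (decayed) type,
// so the function works on its own copy of the policy.
template <typename ExPolicy>
constexpr bool by_value_deduces_reference(ExPolicy)
{
    return std::is_reference_v<ExPolicy>;
}

// Forwarding reference: ExPolicy is deduced as T& for lvalue arguments and
// as T for rvalue arguments, so the value category can be forwarded on.
template <typename ExPolicy>
constexpr bool fwd_ref_deduces_reference(ExPolicy&&)
{
    return std::is_reference_v<ExPolicy>;
}
```

With the forwarding-reference form, code inside the template usually has to apply `std::decay_t<ExPolicy>` before inspecting the policy type, which the by-value form avoids.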
<gonidelis[m]>
K-ballo: yt??
hkaiser has joined #ste||ar
<gonidelis[m]>
hkaiser: please ping me whenever you have a spare 5 minutes within the day
<hkaiser>
gonidelis[m]: will do, need some coffee and breakfast first
<gonidelis[m]>
hkaiser: sure
parsa[m] has quit [*.net *.split]
parsa[m] has joined #ste||ar
gonidelis[m] has quit [Ping timeout: 244 seconds]
pedro_barbosa[m] has quit [Ping timeout: 240 seconds]
parsa[m] has quit [Ping timeout: 246 seconds]
gnikunj[m] has quit [Ping timeout: 244 seconds]
teonnik has quit [Ping timeout: 258 seconds]
klaus[m] has quit [Ping timeout: 258 seconds]
rori has quit [Ping timeout: 244 seconds]
ms[m] has quit [Ping timeout: 240 seconds]
k-ballo[m] has quit [Ping timeout: 240 seconds]
jpinto[m] has quit [Ping timeout: 240 seconds]
tiagofg[m] has quit [Ping timeout: 240 seconds]
gonidelis[m] has joined #ste||ar
gnikunj[m] has joined #ste||ar
teonnik has joined #ste||ar
klaus[m] has joined #ste||ar
parsa[m] has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
gnikunj[m] has quit [Ping timeout: 240 seconds]
gonidelis[m] has quit [Ping timeout: 240 seconds]
klaus[m] has quit [Ping timeout: 246 seconds]
teonnik has quit [Ping timeout: 260 seconds]
parsa[m] has quit [Ping timeout: 258 seconds]
rori has joined #ste||ar
jpinto[m] has joined #ste||ar
ms[m] has joined #ste||ar
k-ballo[m] has joined #ste||ar
pedro_barbosa[m] has joined #ste||ar
tiagofg[m] has joined #ste||ar
klaus[m] has joined #ste||ar
parsa[m] has joined #ste||ar
teonnik has joined #ste||ar
gonidelis[m] has joined #ste||ar
gnikunj[m] has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser>
ms[m]: yt?
<ms[m]>
hkaiser: hey
<hkaiser>
ms[m]: hey
<hkaiser>
ms[m]: I'm struggling with build system settings again
<hkaiser>
ms[m]: for instance jenkins/cscs/clang-oldest sets NETWORKING=OFF, does that mean that DISTRIBUTED_RUNTIME is off as well?
<hkaiser>
the actual question is, how do I disable tests/examples that have to run in distributed (num_localities > 1)?
<weilewei>
hkaiser different threads have different idling patterns, what should I look into?
<hkaiser>
that's not deterministic
<hkaiser>
but you know that some tasks are long running (walkers) leading to low idle-rates
<weilewei>
Yes
<hkaiser>
other tasks are short (accumulators)
<hkaiser>
look at APEX traces
<hkaiser>
what I'd suggest is to look into correlation of the overall idle-rate and execution time vs. number of walkers/accumulators
<weilewei>
I see, let me try to figure that out
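One way to quantify the suggested correlation is a Pearson coefficient over the sweep results, for example idle-rate against the number of walkers, or execution time against the number of accumulators. The sketch below is a generic implementation under that assumption; the function name `pearson` is hypothetical and the choice of Pearson (rather than, say, a rank correlation) is not something stated in the discussion.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Pearson correlation coefficient of two equally sized samples.
// Returns a value in [-1, 1]; throws if the input cannot be correlated.
double pearson(std::vector<double> const& x, std::vector<double> const& y)
{
    if (x.size() != y.size() || x.size() < 2)
        throw std::invalid_argument("need two samples of equal size >= 2");

    double const n = static_cast<double>(x.size());

    // Sample means.
    double mean_x = 0.0, mean_y = 0.0;
    for (std::size_t i = 0; i != x.size(); ++i)
    {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= n;
    mean_y /= n;

    // Sums of products of the deviations from the means.
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i != x.size(); ++i)
    {
        double const dx = x[i] - mean_x;
        double const dy = y[i] - mean_y;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }

    if (sxx == 0.0 || syy == 0.0)
        throw std::invalid_argument("a constant sample has no correlation");

    return sxy / std::sqrt(sxx * syy);
}
```

Each entry of `x` and `y` would come from one run of the sweep, e.g. `x[i]` the walker count and `y[i]` the measured overall idle-rate of that run.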
<weilewei>
The overall idle-rate, is it represented as /threads{locality#0/total}/idle-rate ?
<hkaiser>
weilewei: yes
<hkaiser>
actually for any locality#N
<hkaiser>
weilewei: if you specify /threads{locality#*}/idle-rate on the command line you'll see only the overall numbers
<weilewei>
I saw the printout at the end: /threads{locality#0/total}/idle-rate,1,61.689660,[s],6491,[0.01%], does it mean that the overall idle rate is 64.91%?
<weilewei>
hkaiser ^^
<hkaiser>
weilewei: yes
<hkaiser>
that's across all cores
<weilewei>
good, then I will do a parameter sweep with idle rate enabled
<hkaiser>
nod
<weilewei>
and another same experiment but without idle rate (Release build)
<hkaiser>
weilewei: the idle-rate sweep should be done using release as well
<weilewei>
Oh, I see, then I will build a Release version, good to know!
bita has joined #ste||ar
<weilewei>
hkaiser for the 1 walker and 1 accumulator case, I found the overall idle rate is 96.70%, and I noticed most of the per-thread idle rates are 99%, except one at 0.62%
<weilewei>
I think 47 physical cores are idle most of the time, because there is only 1 long-running walker
<gnikunj[m]>
hkaiser: turns out it was a stupid bad memory access in 1d stencil replay example. I've corrected it and also changed the parameters of other performance test examples to mitigate the time out issue. Things should build and execute as expected now!
<gnikunj[m]>
apologies for the 2-day delay
bita has quit [Ping timeout: 260 seconds]
<gnikunj[m]>
K-ballo: hkaiser really curious why std::span was introduced in the C++20 standard. What were std::array and std::vector missing? For one, I can think of initializing a vector from a C-style array. Two, compile-time initialization of elements (but a constexpr tuple can do compile-time initialization, so why this?)