hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
K-ballo has quit [Quit: K-ballo]
tufei has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]
hkaiser has quit [Quit: Bye!]
Yorlik has joined #ste||ar
Yorlik has quit [Ping timeout: 268 seconds]
zao has quit [Ping timeout: 245 seconds]
wash_ has quit [Ping timeout: 245 seconds]
wash_ has joined #ste||ar
zao has joined #ste||ar
K-ballo has joined #ste||ar
tufei has quit [Remote host closed the connection]
tufei has joined #ste||ar
<ms[m]> hi
hkaiser has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
<jedi18[m]> hkaiser: I think I had in fact run the benchmarks in debug mode. I ran them again properly in release mode and the result is what you predicted: ranges performs much better than par https://github.com/Jedi18/ranges_benchmarks/tree/main/plots
<hkaiser> jedi18[m]: cool!
<jedi18[m]> Hi LorenDB , fancy seeing you here :D
<hkaiser> there are still questions, but things make more sense, definitely
<LorenDB[m]> jedi18[m]: Oh hi! I've been around for a while, actually
<hkaiser> jedi18[m]: how can you get a speedup of 1200 and more on 16 cores?
<jedi18[m]> hkaiser: Yeah, I'll run some more benchmarks
<hkaiser> is there some auto-vectorization happening?
<jedi18[m]> On buran, so 48 cores
<hkaiser> anyways, 1200 is suspicious
<jedi18[m]> Hmm true
<hkaiser> most likely things get optimized away
<jedi18[m]> Yeah, anyway I'll do some more benchmarks on different nodes, with a higher number of iterations and maybe bigger sizes
<hkaiser> nod
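A minimal sketch of the sanity check behind the "1200 is suspicious" remark, assuming speedup is computed as sequential time divided by parallel time. The names `speedup` and `exceeds_core_count` are hypothetical and not taken from the benchmark repository. A measured speedup far above the core count (1200 on 48 cores) usually points at a measurement problem rather than at real scaling.

```cpp
#include <cassert>

// Hypothetical helper: speedup as the ratio of sequential to parallel time.
double speedup(double sequential_seconds, double parallel_seconds)
{
    assert(parallel_seconds > 0.0);
    return sequential_seconds / parallel_seconds;
}

// Hypothetical helper: flags a speedup larger than the number of cores.
// Modest superlinear speedups can happen (e.g. cache effects), so treat
// a positive result as a prompt to re-check the benchmark, not as proof.
bool exceeds_core_count(double measured_speedup, unsigned cores)
{
    return measured_speedup > static_cast<double>(cores);
}
```

With the numbers from the discussion, `exceeds_core_count(1200.0, 48)` returns true, which is the cue to check whether the compiler removed the measured work.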
<hkaiser> first I'd suggest making sure you measure what you're looking for
<hkaiser> the easiest way to keep the compiler from optimizing things away is to use the generated output in some way
<hkaiser> (outside the timing-scope)
<hkaiser> access the first and last element of the sequence or something
<hkaiser> assign it to a global variable
<jedi18[m]> Oh ok got it, I'll do that, thanks!
<jedi18[m]> I guess a global variable sum, to which I add the elements at random indices of the vector, should do, right?
<hkaiser> yes
<hkaiser> make the compiler think the output is used somewhere it can't see
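The advice above could be sketched roughly as follows, in plain standard C++ rather than the actual benchmark code. The names `g_sink` and `time_transform` and the loop body are assumptions made for illustration. The result is read outside the timed region and stored into a `volatile` global, so the compiler has to treat the output as used and cannot drop the loop that produces it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical global sink. Stores to a volatile object are observable
// behavior, so the compiler must actually perform them.
volatile long long g_sink = 0;

// Times one pass of a stand-in workload and returns the elapsed seconds.
double time_transform(std::vector<int> const& in, std::vector<int>& out)
{
    assert(!in.empty() && in.size() == out.size());

    auto const start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i != in.size(); ++i)
        out[i] = in[i] * 2 + 1;    // the work being measured
    auto const stop = std::chrono::steady_clock::now();

    // Outside the timing scope: consume the first and last element so the
    // output is used somewhere the optimizer cannot reason it away.
    g_sink = g_sink + out.front() + out.back();

    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_transform(in, out)` from `main` with two equally sized vectors is enough to try it. Summing elements at random indices into the sink, as suggested in the discussion, would serve the same purpose.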
Yorlik has joined #ste||ar
<jedi18[m]> Ok, also I changed it to use command line flags and the bash script, and it's so much more convenient to run now, so thanks xD
<jedi18[m]> Ok
<hkaiser> jedi18[m]: :D
Yorlik has quit [Read error: Connection reset by peer]
tufei has quit [Quit: Leaving]
tufei has joined #ste||ar
FunMiles_ has quit [Remote host closed the connection]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 264 seconds]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 260 seconds]
FunMiles has joined #ste||ar
FunMiles_ has joined #ste||ar
FunMiles has quit [Ping timeout: 256 seconds]
FunMiles_ has quit [Ping timeout: 260 seconds]
FunMiles has joined #ste||ar
FunMiles has quit [Remote host closed the connection]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 264 seconds]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 256 seconds]
parsa[fn] has joined #ste||ar
parsa[fn] has quit [Client Quit]