hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 246 seconds]
hkaiser has quit [*.net *.split]
Kalium has quit [*.net *.split]
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
Kalium has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo1 is now known as K-ballo
hkaiser has joined #ste||ar
<pansysk75[m]>
hkaiser: I was not able to reproduce the failed range_sort tests, on a configuration identical to the jenkins one (as far as i can tell)
<pansysk75[m]>
Just to confirm, I should be working on a "cuda" node (not on a "jenkins-cuda" one), correct?
<hkaiser>
what's the difference?
<pansysk75[m]>
the jenkins nodes are similar to the others, just "reserved" for running the CI?
<hkaiser>
no, jenkins just runs on rostam, nothing special - different builders do use different slurm partitions, though
<pansysk75[m]>
oh, i was looking at the "partition" naming when typing sinfo
<pansysk75[m]>
the node names are the ones on the right ("geev" for example)
<pansysk75[m]>
I mean, it behaves well, but as a user I would expect that decision to be up to me and not the hpx::sort impl
<pansysk75[m]>
* Check the pics below, where I set the number of chunks on parallel "transform" and "sort" (using static_chunk_size), and I get a maybe unexpected result in the 2nd case
<hkaiser>
pansysk75[m]: yes, I know that sort is different
<hkaiser>
how does the changed sort compare against the original version?
<pansysk75[m]>
by "changed" you mean?
<hkaiser>
the one that uses the chunking interface
<hkaiser>
I might have misunderstood what you said
<hkaiser>
did you actually change sort ?
<pansysk75[m]>
nope, but I'll get back to you with that
<hkaiser>
ahh
<hkaiser>
ok - cool
<pansysk75[m]>
I'm not that concerned about performance, more about uniformity (if that word exists)
<hkaiser>
both are important ;-)
<pansysk75[m]>
because setting a minimum chunk size is probably a good approach for all parallel algorithms (that's probably why static_chunk_size exists)
<hkaiser>
I agree
<pansysk75[m]>
will play around a bit and i'll get back to you
<hkaiser>
sort was contributed by somebody as a 'one off' contribution to HPX, we never got around to fully integrating it with the scheduling property facilities
<hkaiser>
also, sort should use projections (at least the range based version, IIRC), not sure if we actually support that ATM
<hkaiser>
pansysk75[m]: actually ... I think sort does use the chunk-size calculation...
<hkaiser>
I forgot that we implemented that after all
<pansysk75[m]>
It does do a calculation, and then takes the maximum with the magic number a few lines after that
<pansysk75[m]>
The majority of the algorithms call get_bulk_iteration_shape, which also does the same calculation
<hkaiser>
well, we don't need the iterator range (the shape) in this case
<pansysk75[m]>
So my concern is about
<pansysk75[m]>
1. Chopping off the chunk_size at an arbitrary number
<pansysk75[m]>
2. Repeating ourselves
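[Editor's sketch] The first concern can be illustrated in plain C++. This is not HPX code: the function name `effective_sort_chunk_size` and the `min_chunk` parameter are hypothetical, and `min_chunk` only stands in for the "magic number" mentioned above, whose actual value does not appear in this log.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch, not HPX code. `min_chunk` stands in for the fixed
// lower bound ("magic number") discussed above; its real value is not
// given in this log.
std::size_t effective_sort_chunk_size(std::size_t requested,
                                      std::size_t min_chunk)
{
    // The requested size survives only if it is at least the lower bound;
    // anything smaller is silently replaced by min_chunk.
    return std::max(requested, min_chunk);
}
```

Under these assumptions, a user who asks for a chunk size below the bound gets the bound instead, which is the non-uniform behaviour being questioned.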
<pansysk75[m]>
hkaiser: Agree on that
<hkaiser>
pansysk75[m]: perf gets really bad if the chunks are becoming too small
<pansysk75[m]>
hkaiser: (re: "perf gets really bad if the chunks are becoming too small") I agree
<pansysk75[m]>
That's the case with other algorithms as well, but we let the user take the hit when they make the mistake of calling a par impl on very small work, correct? (Chunks = 4*n_cores and all that jazz)
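[Editor's sketch] A rough illustration of the "Chunks = 4*n_cores" rule of thumb mentioned above, in plain C++. The helper name `default_chunk_size`, the rounding up, and the handling of empty input are assumptions made for this sketch; it is not the HPX implementation.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not HPX code: split `count` elements into roughly
// four chunks per core and return the resulting chunk size.
std::size_t default_chunk_size(std::size_t count, std::size_t cores)
{
    // Treat zero cores as one so the division below is always valid.
    std::size_t const chunks = 4 * std::max<std::size_t>(cores, 1);

    // Round up so that `chunks` chunks cover the whole input, and never
    // return a chunk size of zero.
    return std::max<std::size_t>((count + chunks - 1) / chunks, 1);
}
```

With very small inputs this rule yields chunks of a single element, so the per-chunk scheduling overhead is paid for almost no work; that is the cost the user is said to take when calling a parallel implementation on very small work.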
<hkaiser>
dkaratza[m]: I committed a minor change to your PR, should be fine now
<hkaiser>
pansysk75[m]: try it
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
Yorlik_ has quit [Ping timeout: 255 seconds]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 250 seconds]
K-ballo1 is now known as K-ballo
tufei_ has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]