hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 246 seconds]
hkaiser has quit [*.net *.split]
Kalium has quit [*.net *.split]
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
Kalium has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo1 is now known as K-ballo
hkaiser has joined #ste||ar
<pansysk75[m]> hkaiser: I was not able to reproduce the failed range_sort tests on a configuration identical to the jenkins one (as far as I can tell)
<pansysk75[m]> Just to confirm, I should be working on a "cuda" node (not on a "jenkins-cuda" one), correct?
<hkaiser> what's the difference?
<pansysk75[m]> the jenkins nodes are similar to the others, just "reserved" for running the CI?
<hkaiser> no, jenkins just runs on rostam, nothing special - different builders do use different slurm partitions, though
<pansysk75[m]> oh, I was looking at the "partition" naming when typing sinfo
<pansysk75[m]> the node names are the ones on the right ("geev" for example)
<hkaiser> the slurm partition a builder uses is defined here, e.g.: https://github.com/STEllAR-GROUP/hpx/blob/master/.jenkins/lsu/slurm-configuration-gcc-10-cuda-11.sh
<hkaiser> pansysk75[m]: I believe, however, that the test error is ephemeral and not caused by your changes
<hkaiser> could be an issue in the test itself, even - not the algorithm (even more so as the range algorithms simply dispatch to the iterator-based ones)
<hkaiser> and those are fine
<pansysk75[m]> yes, makes sense
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
karamemp[m] has quit [Quit: You have been kicked for being idle]
<pansysk75[m]> I noticed that a handful of algorithms behave differently when manually setting the chunk size.... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/5821f7e6c8118d812403ed2a00a811cd8ce06174>)
<pansysk75[m]> This seems like it's not conforming to the approach of the rest of the algorithms
<pansysk75[m]> Check the pics below, where I set the number of chunks on parallel "transform" and "sort" (using static_chunk_size), and I get a maybe unexpected result in the 2nd case
<pansysk75[m]> I mean, it behaves well, but as a user I would expect that decision to be up to me and not the hpx::sort impl
<pansysk75[m]> * Check the pics below, where I set the number of chunks on parallel "transform" and "sort", and I get a maybe unexpected result in the 2nd case
<pansysk75[m]> * Check the pics below, where I set the number of chunks on parallel "transform" and "sort" (using static_chunk_size), and I get a maybe unexpected result in the 2nd case
<hkaiser> pansysk75[m]: yes, I know that sort is different
<hkaiser> how does the changed sort compare against the original version?
<pansysk75[m]> by "changed" you mean?
<hkaiser> the one that uses the chunking interface
<hkaiser> I might have misunderstood what you said
<hkaiser> did you actually change sort ?
<pansysk75[m]> nope, but I'll get back to you with that
<hkaiser> ahh
<hkaiser> ok - cool
<pansysk75[m]> I'm not that concerned about performance, more about uniformity (if that word exists)
<hkaiser> both are important ;-)
<pansysk75[m]> because setting a minimum chunk size is probably a good approach for all parallel algorithms (that's why static_chunk_size probably exists)
<hkaiser> I agree
<pansysk75[m]> will play around a bit and i'll get back to you
<hkaiser> sort was contributed by somebody as a 'one off' contribution to HPX, we never got around to fully integrating it with the scheduling property facilities
<hkaiser> also, sort should use projections (at least the range based version, IIRC), not sure if we actually support that ATM
<hkaiser> pansysk75[m]: actually ... I think sort does use the chunk-size calculation...
<hkaiser> after all - I forgot that we did implement that
<pansysk75[m]> It does do a calculation, and then takes the maximum with the magic number a few lines after that
<pansysk75[m]> The majority of the algorithms call get_bulk_iteration_shape, which also does the same calculation
<hkaiser> well, we don't need the iterator range (the shape) in this case
<pansysk75[m]> So my concern is about
<pansysk75[m]> 1. Capping the chunk_size at an arbitrary number
<pansysk75[m]> 2. Repeating ourselves
<pansysk75[m]> hkaiser: Agree on that
<hkaiser> pansysk75[m]: perf gets really bad if the chunks are becoming too small
<pansysk75[m]> <hkaiser> "pansysk75: perf gets really..." <- I agree
<pansysk75[m]> That's the case with other algorithms as well, but we let the user take the hit when they make the mistake of calling a par impl on very small work, correct? (Chunks = 4*n_cores and all that jazz)
<hkaiser> dkaratza[m]: I committed a minor change to your PR, should be fine now
<hkaiser> pansysk75[m]: try it
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
Yorlik_ has quit [Ping timeout: 255 seconds]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 250 seconds]
K-ballo1 is now known as K-ballo
tufei_ has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]