#ste||ar on 2023-03-19 — irc logs at irclog.cct.lsu.edu

2021-08-06 22:55 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu

00:00 tufei_ has quit [Remote host closed the connection]

00:00 tufei has joined #ste||ar

01:18 K-ballo has quit [Ping timeout: 260 seconds]

01:20 K-ballo has joined #ste||ar

01:30 K-ballo1 has joined #ste||ar

01:30 K-ballo has quit [Ping timeout: 248 seconds]

01:30 K-ballo1 is now known as K-ballo

02:32 Yorlik_ has joined #ste||ar

02:35 Yorlik__ has quit [Ping timeout: 260 seconds]

02:39 Neeraj has joined #ste||ar

02:40 Neeraj has quit [Client Quit]

03:20 hkaiser has joined #ste||ar

03:48 hkaiser has quit [Quit: Bye!]

05:35 HHN93 has joined #ste||ar

06:50 HHN93 has quit [Quit: Client closed]

08:58 <pansysk75[m]> HHN93: You can use chrono (as you mentioned) and compare with the existing implementation. Testing different input sizes is also a good idea. You can use this as a quick starting point: https://github.com/Pansysk75/HPX-Performance-Benchmarks/tree/minimal_bench/src

09:01 <pansysk75[m]> Also important, make sure to test for multiple iterations, and discard the first few runs (they are going to be slower due to "cold cache")

09:02 <pansysk75[m]> But that's all, there is not much to it :)
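
The advice above (time with chrono, run several iterations, discard the first few runs, try different input sizes) could be sketched roughly as follows. This is only a minimal illustration using the standard library, not code from the linked benchmark repository; the helper names `average_seconds` and `time_accumulate`, the warm-up count of 3, and the iteration count of 10 are all assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: calls `f` `warmup` times without timing it (to get past
// cold-cache effects), then `iterations` more times, and returns the average
// wall-clock time in seconds over the timed runs only.
template <typename F>
double average_seconds(F&& f, std::size_t warmup, std::size_t iterations)
{
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    // Warm-up runs: executed, but their timings are discarded.
    for (std::size_t i = 0; i != warmup; ++i)
        f();

    std::chrono::duration<double> total{0.0};
    for (std::size_t i = 0; i != iterations; ++i)
    {
        auto const start = clock::now();
        f();
        total += clock::now() - start;
    }
    return total.count() / static_cast<double>(iterations);
}

// Hypothetical workload: sum a vector of `n` ints. In a real comparison this
// is where the existing and the new implementation would be called in turn.
inline double time_accumulate(std::size_t n)
{
    std::vector<int> v(n, 1);
    volatile long long sink = 0;    // keeps the result observable
    return average_seconds(
        [&] { sink = std::accumulate(v.begin(), v.end(), 0LL); }, 3, 10);
}
```

Calling `time_accumulate` for a range of sizes (for example 1000, 100000, 10000000) and printing the results gives a first comparison across input sizes; the same `average_seconds` wrapper can be applied to both implementations being compared.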

10:36 K-ballo1 has joined #ste||ar

10:36 K-ballo has quit [Ping timeout: 246 seconds]

10:36 K-ballo1 is now known as K-ballo

11:01 Neeraj has joined #ste||ar

11:02 Neeraj has quit [Client Quit]

12:39 hkaiser has joined #ste||ar

13:38 harshitpant1 has joined #ste||ar

13:39 <harshitpant1> Hi, hkaiser and gonidelis, I find the GSoC project "Bring the HPX distributed algorithms up to date" interesting. I have HPX built on my computer and have been looking at examples and tests to understand how things work in HPX. I also raised two minor PRs in February. I am interested not only in fixing the currently failing algorithms but also in proposing new segmented algorithms as suggested in the HPX GSoC Wiki page. In the past few days, I have been trying to figure out how the parallel algorithms are implemented in HPX and how the segmented algorithms work for partitioned vectors. If you're interested in mentoring that project this year, could you please share what I should be doing and looking at?

13:40 <hkaiser> harshitpant1: perhaps start with finding out why some of the segmented algorithm tests fail? that should give you a means to understand the code better

13:41 <hkaiser> harshitpant1: for instance https://cdash.cscs.ch/viewTest.php?onlyfailed&buildid=74722

13:41 <hkaiser> these tests fail on and off, so it's not really reproducible

13:42 <hkaiser> harshitpant1: otherwise, try to have a look at the examples and follow the code path to see where things are happening

13:43 <hkaiser> reading the docs might help as well, some of the distributed functionalities are described there

13:43 <harshitpant1> Yes, I tried to run these tests on my computer and it seems these tests sometimes fail and sometimes don't.

13:44 <harshitpant1> Is there some design document available for segmented algorithms implementation?

13:46 <hkaiser> harshitpant1: no, sorry - just the code

13:46 <hkaiser> well there are bits and pieces

13:46 <harshitpant1> Okay

13:47 <hkaiser> there is a paper written in 1998 by Matt Austern: Segmented Iterators and Hierarchical Algorithms

13:48 <harshitpant1> thanks

13:48 <hkaiser> we use this technique to implement the iterators for the partitioned vector

13:49 <harshitpant1> I will go through this and the other things you mentioned. Will get back to you later. Thanks!

13:49 <hkaiser> harshitpant1: I should be able to find the pdf for that paper, if needed

13:50 <harshitpant1> I think I can do that myself.

13:50 <harshitpant1> Thanks for your time.
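
The general idea behind the segmented (hierarchical) algorithms discussed above can be illustrated with a small sketch: the data lives in several segments, and an algorithm is written as an outer loop over segments that applies an ordinary flat algorithm inside each one. The code below is not HPX's implementation and does not use `partitioned_vector`; the `segmented` alias and the functions `segmented_accumulate`, `segmented_for_each`, and `segmented_example` are hypothetical names, and a plain vector of vectors stands in for a partitioned container.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a partitioned container: each inner vector plays
// the role of one segment (in a distributed setting, one partition).
template <typename T>
using segmented = std::vector<std::vector<T>>;

// Hierarchical accumulate: the outer loop walks the segments, the inner step
// runs the ordinary flat algorithm on each segment's local range.
template <typename T>
T segmented_accumulate(segmented<T> const& data, T init)
{
    for (auto const& segment : data)
        init = std::accumulate(segment.begin(), segment.end(), init);
    return init;
}

// Hierarchical for_each with the same two-level structure.
template <typename T, typename F>
void segmented_for_each(segmented<T>& data, F f)
{
    for (auto& segment : data)
        for (auto& element : segment)
            f(element);
}

// Small usage example; empty segments are handled naturally.
inline void segmented_example()
{
    segmented<int> data{{1, 2, 3}, {}, {4, 5}};
    assert(segmented_accumulate(data, 0) == 15);

    segmented_for_each(data, [](int& x) { x *= 2; });
    assert(segmented_accumulate(data, 0) == 30);
}
```

In this sketch everything is local and sequential; the point is only the two-level decomposition, which is what lets the per-segment step be dispatched to wherever that segment lives.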

13:54 harshitpant1 has quit [Quit: Client closed]

14:21 Neeraj has joined #ste||ar

14:22 <Neeraj> Hi hkaiser, yesterday you mentioned an issue related to me. Can you provide some information about that?

14:23 Neeraj93 has joined #ste||ar

14:23 <Neeraj93> or any other issues

14:25 Neeraj83 has joined #ste||ar

14:26 Neeraj has quit [Ping timeout: 260 seconds]

14:28 Neeraj93 has quit [Ping timeout: 260 seconds]

14:28 Yorlik__ has joined #ste||ar

14:31 Yorlik_ has quit [Ping timeout: 265 seconds]

14:32 <Neeraj83> I have more questions related to the project "Implement a faster associative container for GIDs". What are some potential trade-offs between different data structures like n-ary trees, tries, and radix trees, and how will we decide which one to use for the new implementation? How will we ensure that the new implementation maintains correctness and compatibility with the existing codebase and APIs?

14:32 <Neeraj83> Or can you provide more information on how the current hash map and binary search tree implementation is performing in practice, and what specific performance issues we're trying to address with the new implementation?

14:32 <hkaiser> Neeraj83: I was not able to find a representative example of the problem yet. I'll keep you posted

14:33 aryan_j[m] has quit [Ping timeout: 265 seconds]

14:37 <Neeraj83> I would also like your guidance, if you are available to assist me with writing my proposal for the GSoC project. I would greatly appreciate your help in ensuring that the proposal aligns with the project goals and expectations.

14:47 aryan_j[m] has joined #ste||ar

15:14 Neeraj83 has quit [Ping timeout: 260 seconds]

15:23 HHN93 has joined #ste||ar

15:33 <hkaiser> HHN93: you did ask yesterday how to do performance analysis

15:34 <HHN93> yes

15:34 <HHN93> 08:58 <pansysk75[m]> HHN93: You can use chrono (as you mentioned) and compare with the existing implementation. Testing different input sizes is also a good idea. You can use this as a quick starting point: https://github.com/Pansysk75/HPX-Performance-Benchmarks/tree/minimal_bench/src

15:34 <HHN93> 09:01 <pansysk75[m]> Also important, make sure to test for multiple iterations, and discard the first few runs (they are going to be slower due to "cold cache")

15:34 <HHN93> 09:02 <pansysk75[m]> But thats all, there is not much to it :)

15:35 <HHN93> I found this response in the logs

15:35 <hkaiser> HHN93: good

15:35 <hkaiser> pansysk75[m]: is a good source of information for those things. he has done a fair amount of measurements and performance analysis

15:36 <HHN93> sure, I will go through the code for benchmarking and try to post the results today or Tuesday

15:40 <satacker[m]> Is there any reason we should (or why did we not) consider using https://github.com/google/benchmark ?

16:10 HHN93 has quit [Ping timeout: 260 seconds]

16:50 HHN93 has joined #ste||ar

17:24 HHN93 has quit [Ping timeout: 260 seconds]

17:24 HHN93 has joined #ste||ar

17:35 HHN93 has quit [Quit: Client closed]

17:51 harshitpant1 has joined #ste||ar

18:12 K-ballo1 has joined #ste||ar

18:13 K-ballo has quit [Ping timeout: 276 seconds]

18:13 K-ballo1 is now known as K-ballo

19:14 harshitpant1 has quit [Quit: Client closed]

19:17 <hkaiser> satacker[m]: no particular reason

20:06 <hkaiser> Neeraj83: here is an example for the failing test: https://app.circleci.com/pipelines/github/STEllAR-GROUP/hpx/14287/workflows/11133811-c981-451a-b4f2-902e2f073e85/jobs/338992

20:19 HHN93 has joined #ste||ar

20:19 HHN93 has quit [Client Quit]

21:11 <sarkar_t[m]> Hi hkaiser , I went through some of the resources you shared to understand the S/R architecture along with Eric Niebler's talk from CppCon 2021. From that talk I understood the functioning of some of the sender adaptors like `then()`, `when_all()`, but the implementation was of libunifex.

21:12 <sarkar_t[m]> Can you please link some resources from which I can learn about the sender adaptors available in HPX and the way they are used?

21:29 <hkaiser> sarkar_t[m]: (almost) all of the S/R facilities are here: https://github.com/STEllAR-GROUP/hpx/tree/master/libs/core/execution/include/hpx/execution/algorithms

21:30 <hkaiser> sarkar_t[m]: also, there is the reference implementation for p2300: https://github.com/NVIDIA/stdexec

21:41 <sarkar_t[m]> <hkaiser> "sarkar_t: (almost) all of the S..." <- Thanks for this. I will look into this and understand the implementation.

21:41 <sarkar_t[m]> <hkaiser> "sarkar_t: also, there is the..." <- This looks like a nicely compiled resource to know about S/R. Thanks a lot for this!

21:43 <sarkar_t[m]> In the meantime, can you also point me to any small issue or task that can get me started with my project?

23:49 <hkaiser> sarkar_t[m]: I think you could start putting together a list of algorithms that already support executors that are based on s/r