hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
tufei_ has quit [Remote host closed the connection]
tufei has joined #ste||ar
K-ballo has quit [Ping timeout: 260 seconds]
K-ballo has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 248 seconds]
K-ballo1 is now known as K-ballo
Yorlik_ has joined #ste||ar
Yorlik__ has quit [Ping timeout: 260 seconds]
Neeraj has joined #ste||ar
Neeraj has quit [Client Quit]
hkaiser has joined #ste||ar
hkaiser has quit [Quit: Bye!]
HHN93 has joined #ste||ar
HHN93 has quit [Quit: Client closed]
<pansysk75[m]> HHN93: You can use chrono (as you mentioned) and compare with the existing implementation. Testing different input sizes is also a good idea. You can use this as a quick starting point: https://github.com/Pansysk75/HPX-Performance-Benchmarks/tree/minimal_bench/src
<pansysk75[m]> Also important, make sure to test for multiple iterations, and discard the first few runs (they are going to be slower due to "cold cache")
<pansysk75[m]> But that's all, there is not much to it :)
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 246 seconds]
K-ballo1 is now known as K-ballo
Neeraj has joined #ste||ar
Neeraj has quit [Client Quit]
hkaiser has joined #ste||ar
harshitpant1 has joined #ste||ar
<harshitpant1> Hi, hkaiser and gonidelis, I find the GSoC project "Bring the HPX distributed algorithms up to date" interesting. I have HPX built on my computer and have been looking at examples and tests to understand how things work in HPX. I also raised two minor PRs in February. I am interested in not only fixing the currently failing algorithms but also in
<harshitpant1> proposing new segmented algorithms as suggested in the HPX GSoC Wiki page. In the past few days, I have been trying to figure out how the parallel algorithms have been implemented in HPX and how the segmented algorithms work for partitioned vectors. If you're interested in mentoring that project this year, could you please share what things should
<harshitpant1> I be doing and looking at?
<hkaiser> harshitpant1: perhaps start with finding out why some of the segmented algorithm tests fail? that should give you a means to understand the code better
<hkaiser> these tests fail on and off, so it's not really reproducible
<hkaiser> harshitpant1: otherwise, try to have a look at the examples and follow the code path to see where things are happening
<hkaiser> reading the docs might help as well, some of the distributed functionalities are described there
<harshitpant1> Yes, I tried to run these tests on my computer and it seems these tests sometimes fail and sometimes don't.
<harshitpant1> Is there a design document available for the segmented algorithms implementation?
<hkaiser> harshitpant1: no, sorry - just the code
<hkaiser> well there are bits and pieces
<harshitpant1> Okay
<hkaiser> there is a paper written in 1998 by Matt Austern: Segmented Iterators and Hierarchical Algorithms
<harshitpant1> thanks
<hkaiser> we use this technique to implement the iterators for the partitioned vector
<harshitpant1> I will go through this and the other things you mentioned. Will get back to you later. Thanks!
<hkaiser> harshitpant1: I should be able to find the pdf for that paper, if needed
<harshitpant1> I think I can do that myself.
<harshitpant1> Thanks for your time.
harshitpant1 has quit [Quit: Client closed]
Neeraj has joined #ste||ar
<Neeraj> Hi hkaiser, yesterday you mentioned an issue related to me. Can you provide some information about that?
Neeraj93 has joined #ste||ar
<Neeraj93> or any other issues
Neeraj83 has joined #ste||ar
Neeraj has quit [Ping timeout: 260 seconds]
Neeraj93 has quit [Ping timeout: 260 seconds]
Yorlik__ has joined #ste||ar
Yorlik_ has quit [Ping timeout: 265 seconds]
<Neeraj83> I have more questions related to the project "Implement a faster associative container for GIDs". What are some potential trade-offs between different data structures like n-ary trees, tries, and radix trees, and how will we decide which
<Neeraj83> one to use for the new implementation? How will we ensure that the new implementation maintains
<Neeraj83> correctness and compatibility with the existing codebase and APIs?
<Neeraj83> Or can you provide more information on how the current hash map and binary search tree implementation is performing in practice, and what specific performance issues we're trying to address with the new implementation?
<hkaiser> Neeraj83: I was not able to find a representative example of the problem yet. I'll keep you posted
aryan_j[m] has quit [Ping timeout: 265 seconds]
<Neeraj83> I would also like your guidance, if you are available to assist me with writing my proposal for the GSoC project. I would greatly appreciate your help in ensuring that the proposal aligns with the project goals and expectations.
aryan_j[m] has joined #ste||ar
Neeraj83 has quit [Ping timeout: 260 seconds]
HHN93 has joined #ste||ar
<hkaiser> HHN93: you did ask yesterday how to do performance analysis
<HHN93> yes
<HHN93> 08:58 <pansysk75[m]> HHN93: You can use chrono (as you mentioned) and compare with the existing implementation. Testing different input sizes is also a good idea. You can use this as a quick starting point: https://github.com/Pansysk75/HPX-Performance-Benchmarks/tree/minimal_bench/src
<HHN93> 09:01 <pansysk75[m]> Also important, make sure to test for multiple iterations, and discard the first few runs (they are going to be slower due to "cold cache")
<HHN93> 09:02 <pansysk75[m]> But that's all, there is not much to it :)
<HHN93> I found this response in the logs
<hkaiser> HHN93: good
<hkaiser> pansysk75[m] is a good source of information for those things. He has done a fair amount of measurements and performance analysis
<HHN93> sure, I will go through the code for benchmarking and try to post the results today or Tuesday
<satacker[m]> Is there any reason we should (or why did we not) consider using https://github.com/google/benchmark ?
HHN93 has quit [Ping timeout: 260 seconds]
HHN93 has joined #ste||ar
HHN93 has quit [Ping timeout: 260 seconds]
HHN93 has joined #ste||ar
HHN93 has quit [Quit: Client closed]
harshitpant1 has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 276 seconds]
K-ballo1 is now known as K-ballo
harshitpant1 has quit [Quit: Client closed]
<hkaiser> satacker[m]: no particular reason
HHN93 has joined #ste||ar
HHN93 has quit [Client Quit]
<sarkar_t[m]> Hi hkaiser, I went through some of the resources you shared to understand the S/R architecture, along with Eric Niebler's talk from CppCon 2021. From that talk I understood how some of the sender adaptors like `then()` and `when_all()` work, but the implementation shown was libunifex's.
<sarkar_t[m]> Can you please link some resources where I can learn about the sender adaptors available in HPX and the way they are used?
<hkaiser> sarkar_t[m]: (almost) all of the S/R facilities are here: https://github.com/STEllAR-GROUP/hpx/tree/master/libs/core/execution/include/hpx/execution/algorithms
<hkaiser> sarkar_t[m]: also, there is the reference implementation for p2300: https://github.com/NVIDIA/stdexec
<sarkar_t[m]> <hkaiser> "sarkar_t: (almost) all of the S..." <- Thanks for this. I will look into this and understand the implementation.
<sarkar_t[m]> <hkaiser> "sarkar_t: also, there is the..." <- This looks like a nicely compiled resource to know about S/R. Thanks a lot for this!
<sarkar_t[m]> In the meantime, can you also point me to any small issue or task that can get me started with my project?
<hkaiser> sarkar_t[m]: I think you could start putting together a list of algorithms that already support executors that are based on s/r