hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
diehlpk_work has quit [Remote host closed the connection]
Yorlik__ has quit [Ping timeout: 265 seconds]
Yorlik has joined #ste||ar
Yorlik has quit [Ping timeout: 265 seconds]
Yorlik has joined #ste||ar
<dkaratza[m]>
hkaiser: I just sent you a first version of the GSoD proposal. If you have time to review it before our meeting, then we can discuss it tomorrow
<hkaiser>
dkaratza[m]: thanks! I'll have a look!
Yorlik has quit [Ping timeout: 250 seconds]
Yorlik has joined #ste||ar
Yorlik has quit [Ping timeout: 240 seconds]
Yorlik has joined #ste||ar
hkaiser has quit [Quit: Bye!]
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 265 seconds]
prakhar has joined #ste||ar
prakhar has quit [Quit: Client closed]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo1 is now known as K-ballo
<mdiers[m]>
I have a problem, probably with the parcelport_mpi; it does not occur with parcelport_tcp. When doing distributed calculations with e.g. four nodes, I do not always reliably get the is_ready state from the future, even though the call has been processed there. However, this occurs very rarely and is not reproducible, on average once every 30 minutes.
<mdiers[m]>
Are there special HPX logs for this area that could be enabled? Or other MPI environment variables (OMPI_MCA_???_verbose=100)?
hkaiser has joined #ste||ar
prakhar has joined #ste||ar
HHN93 has joined #ste||ar
<HHN93>
hkaiser how do I share performance analysis graphs?
<hkaiser>
one way would be an email exchange; if you have concise graphs, you could add them to the github PR (if related)
<hkaiser>
another way is a wiki page somewhere, a blog post is possible as well
<HHN93>
this sounds great
<HHN93>
for now I will try to add it to github
<hkaiser>
ok
HHN93 has quit [Ping timeout: 260 seconds]
prakhar has quit [Quit: Client closed]
HHN93 has joined #ste||ar
<HHN93>
hkaiser can you please review #6199, #6200
<hkaiser>
HHN93: will do
<HHN93>
the performance benefits seem to be minimal for par vs par_unseq. Any advice on how I can look into improving it?
HHN93 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
HHN93 has quit [Ping timeout: 260 seconds]
HHN93 has joined #ste||ar
<hkaiser>
HHN93: adding the #pragmas instructs the compiler to try figuring out what can be vectorized
<HHN93>
so if the compiler misinterprets it, performance benefits would be minimal?
<hkaiser>
so if the code inside the loop is complex or involves function invocations the compiler can't see through, the effect will be minimal as no or only minimal vectorization is happening
<HHN93>
oh ok
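(For illustration, a minimal sketch of the distinction hkaiser describes; opaque() is a made-up stand-in for a call the compiler cannot see through:)

    #include <cstddef>
    #include <vector>

    // deliberately declared only, as if defined in another translation unit,
    // so the compiler cannot see through the call (hypothetical helper)
    double opaque(double x);

    void scale(std::vector<double>& v, double f)
    {
        // simple, side-effect-free body: typically auto-vectorized at -O2/-O3
        for (std::size_t i = 0; i != v.size(); ++i)
            v[i] *= f;
    }

    void apply_opaque(std::vector<double>& v)
    {
        // the call hides possible side effects and dependencies, so most
        // compilers fall back to a scalar loop here
        for (std::size_t i = 0; i != v.size(); ++i)
            v[i] = opaque(v[i]);
    }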
<hkaiser>
HHN93: so I'd start looking at the generated assembly to see what gets vectorized and what not
<HHN93>
is this the reason why we wanted to add hpx support to godbolt?
<hkaiser>
HHN93: sure, that would make it much easier to experiment with hpx, wouldn't it?
<hkaiser>
HHN93: the assembly could be looked at outside of godbolt as well
<HHN93>
YES, but until then do we have to take a snippet of code and edit it accordingly before using godbolt?
<HHN93>
hmmm, are you suggesting a decompiler?
<hkaiser>
HHN93: just add the correct command line option to your compiler invocation and it will generate the assembly for you
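(Illustrative only: a small kernel plus standard GCC/Clang options for dumping the assembly and vectorization reports; none of this is HPX-specific and the file name is hypothetical:)

    // kernel.cpp -- compile e.g. with
    //   g++     -O3 -march=native -S -fverbose-asm kernel.cpp -o kernel.s
    //   g++     -O3 -march=native -fopt-info-vec-missed -c kernel.cpp
    //   clang++ -O3 -march=native -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -c kernel.cpp
    // then look for packed SIMD instructions (vmulpd, vfmadd231pd, ...) in kernel.s
    #include <cstddef>

    void axpy(double a, double const* x, double* y, std::size_t n)
    {
        // a straightforward candidate loop to check for vectorization
        for (std::size_t i = 0; i != n; ++i)
            y[i] = a * x[i] + y[i];
    }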
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
diehlpk_work has joined #ste||ar
HHN93 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
HHN93 has quit [Quit: Client closed]
tufei has quit [Remote host closed the connection]
<zao>
libc++ similarly has it in some LLVM repo somewhere, probably a monorepo knowing google :D
<zao>
Or on a machine where you've got an implementation, stepping into your headers and looking around.
<HHN93>
llvm-project has a separate directory named openmp, isn't that what I should be looking at?
<zao>
The header part for libstdc++ is typically down in `/usr/include/c++/11` or so.
<zao>
I have no idea what your project is or what you're supposed to investigate, but if it's about par/unseq stuff I would expect it to be more about the parallel algorithms part?
<zao>
std::async? std::transform? Not sure what you're doing right now.
<HHN93>
yes, I am trying to implement par_unseq for HPX algorithms
<HHN93>
there were no performance gains when using vectorization; hkaiser suggested that it might be because the compiler does not understand how to vectorize the code
<HHN93>
so I am going through the assembly code of simple cases of #pragma omp simd to see how the generated assembly is different
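(A minimal sketch of the kind of comparison HHN93 describes, assuming the hpx::execution::par_unseq policy is available in the HPX version in use; intended as an illustration, not a benchmark:)

    #include <hpx/hpx_main.hpp>   // runs main() on the HPX runtime
    #include <hpx/algorithm.hpp>
    #include <hpx/execution.hpp>

    #include <vector>

    int main()
    {
        std::vector<double> v(1'000'000, 1.0);

        // parallel only: chunks run on HPX threads as ordinary scalar code
        hpx::for_each(hpx::execution::par, v.begin(), v.end(),
            [](double& x) { x = 2.0 * x + 1.0; });

        // parallel + unsequenced: additionally allows interleaved iterations,
        // which is what lets the implementation add vectorization hints
        hpx::for_each(hpx::execution::par_unseq, v.begin(), v.end(),
            [](double& x) { x = 2.0 * x + 1.0; });

        return 0;
    }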
<Aarya[m]>
Hi, can anyone explain the difference between what is done here "https://github.com/STEllAR-GROUP/hpxMP" and what needs to be done for the "hpxMP: HPX threading system for LLVM OpenMP" project? rtohid
<hkaiser>
Aarya[m]: hpxMP is a new implementation of OpenMP independent of LLVM
<hkaiser>
what we would like to do is to 'port' the LLVM openmp runtime implementation to HPX
<hkaiser>
Aarya[m]: rtohid[m] will have all the details
<zao>
Is there anywhere we _can't_ sneak in HPX? :P
<zao>
HHN93: One trick I have when I don't quite know where something is in library code is that I write a small program that uses it and step through the execution into the library code or Go To Definition in an IDE.
<HHN93>
I am trying to see how using #pragma omp simd changes the generated executable; pragmas are just preprocessor directives, right? how can I step into it?
<hkaiser>
zao: that's the point - taking over the world one project at a time ;-)
HHN93 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
<Aarya[m]>
<hkaiser> "Aarya: hpxMP is a new implementa..." <- Is this implementation performing worse than the LLVM openMP implementation?
<hkaiser>
Aarya[m]: much worse
<hkaiser>
it also doesn't support many of the omp #pragmas properly
<hkaiser>
moving to the LLVM runtime would relieve us from ever having to worry about new omp #pragmas they might introduce, as the runtime would take care of that
<Aarya[m]>
So we just need to make alternatives for all the pthread calls available, and LLVM will handle everything else
<hkaiser>
exactly
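(A purely illustrative sketch of the "pthread alternatives" idea; the xmp_* wrapper names are invented for this example, and the real LLVM OpenMP runtime entry points are more involved:)

    #include <hpx/hpx_main.hpp>   // starts the HPX runtime around main()
    #include <hpx/thread.hpp>

    #include <cstdio>

    using xmp_thread = hpx::thread;  // stands in for a pthread_t-style handle

    void xmp_thread_create(xmp_thread& t, void (*fn)(void*), void* arg)
    {
        // pthread_create(&t, nullptr, fn, arg) would become:
        t = hpx::thread(fn, arg);
    }

    void xmp_thread_join(xmp_thread& t)
    {
        // pthread_join(t, nullptr) would become:
        t.join();
    }

    void worker(void* arg)
    {
        std::printf("worker %d running on an HPX thread\n", *static_cast<int*>(arg));
    }

    int main()
    {
        int id = 0;
        xmp_thread t;
        xmp_thread_create(t, &worker, &id);
        xmp_thread_join(t);
        return 0;
    }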
<gonidelis[m]>
<zao> "HHN93: One trick I have when I..." <- damn.... that's smart