hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
tall15421542 has joined #ste||ar
Abhishek09 has joined #ste||ar
<Abhishek09> diehlpk_work: hey, I want to discuss my project with Rod, but he isn't available. What should I do?
<Abhishek09> Are you not interested anymore either?
<hkaiser> Abhishek09: please take into account time zones
<hkaiser> it's 9pm here
<hkaiser> I doubt diehlpk_work is still available
<hkaiser> also, I'll talk to Rod tomorrow
<Abhishek09> hkaiser: Thanks. I want to discuss the project "Providing pip package for phylanx"
<Abhishek09> It is very difficult to talk synchronously via email
<hkaiser> sure
hkaiser has quit [Quit: bye]
Abhishek09 has quit [Remote host closed the connection]
<zao> Mailing list test successfully ignored!
tall15421542 has quit [Remote host closed the connection]
jaafar_ is now known as jaafar
K-ballo has joined #ste||ar
rori has joined #ste||ar
<rori> FYI I'm back from vacation early
<zao> yay?
<rori> ^^
Abhishek09 has joined #ste||ar
Abhishek09 has quit [Remote host closed the connection]
tall15421542 has joined #ste||ar
kordejong has quit [Quit: killed]
rori has quit [Quit: killed]
heller1 has quit [Quit: killed]
simbergm has quit [Quit: killed]
mdiers_ has quit [Quit: mdiers_]
mdiers_ has joined #ste||ar
mdiers_ has quit [Client Quit]
mdiers_ has joined #ste||ar
kordejong has joined #ste||ar
simbergm has joined #ste||ar
nikunj has joined #ste||ar
rori has joined #ste||ar
hkaiser has joined #ste||ar
tall15421542 has quit [Remote host closed the connection]
<mdiers_> hkaiser: hi, can I ask you again about my last problem?
<hkaiser> mdiers_: sure
<mdiers_> I'm looking at http://stellar.cct.lsu.edu/pubs/hpcmaspa2015.pdf right now...
<mdiers_> so, back to my old problem: after a few changes I now get /threads{locality#0/total/total}/time/average,1,59.471139,[s],370051,[ns]
<mdiers_> but unfortunately still /threads{locality#0/total/total}/idle-rate,1,59.471319,[s],4391,[0.01%]
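For context, the two lines above are HPX performance counter samples (the kind printed via --hpx:print-counter). Below is a minimal sketch of querying the same counters from application code, assuming the hpx::performance_counters::performance_counter client API and reusing the counter names from the log:

    #include <hpx/include/performance_counters.hpp>

    #include <cstdint>
    #include <iostream>

    // read the average thread execution time and the idle-rate for locality 0;
    // must be called from an HPX thread while the runtime is running
    void print_thread_counters()
    {
        hpx::performance_counters::performance_counter avg_time(
            "/threads{locality#0/total/total}/time/average");
        hpx::performance_counters::performance_counter idle_rate(
            "/threads{locality#0/total/total}/idle-rate");

        // values are reported in [ns] and in units of [0.01%], respectively
        std::int64_t t = avg_time.get_value<std::int64_t>(hpx::launch::sync);
        std::int64_t r = idle_rate.get_value<std::int64_t>(hpx::launch::sync);

        std::cout << "average task time: " << t << " ns, idle-rate: "
                  << (r / 100.0) << "%\n";
    }

An idle-rate sample of 4391 in units of 0.01% corresponds to roughly 44% idle time, which is what the discussion below attributes to insufficient parallelism.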
<mdiers_> I only have this problem on the EPYC with 48/64 cores.
<mdiers_> if I create thread pools with 4 cores each and run the computation on them, it works almost perfectly.
<hkaiser> mdiers_: good
<hkaiser> that tells you that you have sufficient parallelism for a couple of cores, but not for a large number of them
<hkaiser> you simply don't have sufficient concurrent work to keep all cores busy
<hkaiser> do more work or restructure your algorithms
<hkaiser> mdiers_: I'm sorry I have to run - let's talk a bit later
<mdiers_> ok, tkz
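A minimal sketch of the "thread pools with 4 cores each" setup mentioned above, assuming the HPX 1.4-era hpx::resource::partitioner API; the pool names are illustrative, and the first core is left in the default pool (which must not be empty):

    #include <hpx/hpx_init.hpp>
    #include <hpx/include/resource_partitioner.hpp>

    #include <cstddef>
    #include <string>

    int hpx_main()
    {
        // application work would be scheduled onto the per-pool executors here
        return hpx::finalize();
    }

    int main(int argc, char* argv[])
    {
        hpx::resource::partitioner rp(argc, argv);

        std::size_t core_count = 0;
        for (hpx::resource::numa_domain const& d : rp.numa_domains())
        {
            for (hpx::resource::core const& c : d.cores())
            {
                if (core_count != 0)    // keep core 0 in the default pool
                {
                    // every following group of four cores gets its own pool
                    std::string pool =
                        "pool-" + std::to_string((core_count - 1) / 4);
                    if ((core_count - 1) % 4 == 0)
                        rp.create_thread_pool(pool);
                    for (hpx::resource::pu const& p : c.pus())
                        rp.add_resource(p, pool);
                }
                ++core_count;
            }
        }

        return hpx::init();
    }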
hkaiser has quit [Quit: bye]
mdiers_ has quit [Quit: mdiers_]
mdiers_ has joined #ste||ar
hkaiser has joined #ste||ar
hkaiser has quit [Ping timeout: 256 seconds]
nikunj97 has joined #ste||ar
nikunj has quit [Ping timeout: 245 seconds]
nikunj has joined #ste||ar
K-ballo has quit [Ping timeout: 256 seconds]
nikunj97 has quit [Ping timeout: 265 seconds]
hkaiser has joined #ste||ar
nikunj97 has joined #ste||ar
nikunj97 has quit [Ping timeout: 260 seconds]
<mdiers_> hkaiser: yt?
<hkaiser> here
<hkaiser> mdiers_: ^^
<mdiers_> ;-)
<mdiers_> yes, you are right about the size of the problem. I compute the same problem on different data records in parallel.
<mdiers_> in my case, for example, 16 data records at once with 4 cores each: like in the example, 16 instances of a component, each then running on 4 cores.
<mdiers_> that is my idea; it differs slightly from the basic use of hpx, but hopefully it can be realized anyway.
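A minimal sketch of that "16 data records on 16 pools of 4 cores" idea, assuming pools named "pool-0" .. "pool-15" were created as in the partitioner sketch above and the HPX 1.4-era pool_executor (header and namespace may differ between versions); process_record is a hypothetical per-record workload:

    #include <hpx/include/async.hpp>
    #include <hpx/include/lcos.hpp>
    #include <hpx/runtime/threads/executors/pool_executor.hpp>

    #include <cstddef>
    #include <string>
    #include <vector>

    void process_record(std::size_t record)
    {
        // hypothetical per-record workload
    }

    void process_all_records(std::size_t num_records)
    {
        std::vector<hpx::future<void>> results;
        results.reserve(num_records);

        for (std::size_t i = 0; i != num_records; ++i)
        {
            // bind each record to one of the 4-core pools
            hpx::threads::executors::pool_executor exec(
                "pool-" + std::to_string(i % 16));
            results.push_back(hpx::async(exec, &process_record, i));
        }

        hpx::wait_all(results);
    }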
<hkaiser> everything is possible with HPX ;-)
<mdiers_> yes, that's why I switched to hpx
<mdiers_> because its basic idea should actually make this possible
<hkaiser> nod
<mdiers_> so what have I done wrong, or what can be improved, here: https://gist.github.com/m-diers/10e0bed1ac978c7f27713c7767f45acf
<hkaiser> VTune marking the scheduler as a hot spot is a red herring
<hkaiser> it is caused by the high idle-rate, which in turn is caused by too little parallelism
<mdiers_> yes, but I tested it with 256 data records divided over 4 tasks, and it was the same
<mdiers_> I think I have created some behavior that keeps the scheduler from working efficiently
<hkaiser> mdiers_: ok, I will try to look at your code in more detail tonight
<mdiers_> I see this behavior on EPYC nodes only; on 20- or 32-core Intel nodes I don't have it
<hkaiser> interesting
<mdiers_> We have enough different hardware to test with... The 24-core EPYC is my workstation, and the 64-core Rome is a borrowed test system for evaluating future purchases.
<mdiers_> is there any experience as to whether HPX loses performance when compiled with support for 128 cores instead of 64?
<mdiers_> in my example the workload can of course also be increased. and I have already supplied the parallel::for_loop in the workload with static_chunk_size((n + target.num_pus - 1) / target.num_pus)
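A minimal sketch of that chunking, assuming the HPX 1.4-era names hpx::parallel::for_loop and hpx::parallel::execution::static_chunk_size; n and num_pus stand in for the n and target.num_pus from the message above:

    #include <hpx/include/parallel_executor_parameters.hpp>
    #include <hpx/include/parallel_for_loop.hpp>

    #include <cstddef>

    void run_workload(std::size_t n, std::size_t num_pus)
    {
        // one chunk per processing unit: ceil(n / num_pus) iterations per chunk
        hpx::parallel::execution::static_chunk_size chunks(
            (n + num_pus - 1) / num_pus);

        hpx::parallel::for_loop(
            hpx::parallel::execution::par.with(chunks), std::size_t(0), n,
            [](std::size_t i) {
                // per-element work goes here
            });
    }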
nikunj has quit [Ping timeout: 258 seconds]
nikunj has joined #ste||ar
Abhishek09 has joined #ste||ar
<mdiers_> hkaiser: I will continue tomorrow; now I also suspect the memory accesses. Thanks a lot for the time you've invested. ;-)
<Abhishek09> hkaiser: have you contacted Rod?
<hkaiser> Abhishek09: not yet, thanks for reminding me
rtohid has joined #ste||ar
<rtohid> @Abhishek09, I am here, how can I help you?
<Abhishek09> rtohid: happy to see u
<rtohid> thanks, did you get my email yesterday?
<Abhishek09> I want to discuss your project "Providing pip package for phylanx"
<Abhishek09> rtohid: Yes, and I also replied
<Abhishek09> I think you have to use manylinux
<Abhishek09> because PyPI doesn't accept Linux binary wheels without manylinux
<rtohid> what is it, and why do you need it? could you give a brief description, please?
<Abhishek09> Kindly walk through this: https://www.python.org/dev/peps/pep-0599/#pypi-support
<Abhishek09> if we do without it, the package will be rejected, and we'd be bad PyPA citizens
<Abhishek09> The approach we discussed will work, but it won't be accepted
<rtohid> yes, but that's the distribution issue, isn't it?
<Abhishek09> Therefore we have to use manylinux; otherwise only an sdist would be possible
<Abhishek09> but I would prefer a wheel
<rtohid> let's start with packaging and not worry about distribution for now.
<Abhishek09> Why? If not, our work will be wasted
<Abhishek09> we have to plan everything before making any decision
<rtohid> I would start with building HPX and Phylanx first.
<Abhishek09> so that we are always working towards achieving our goals
<rtohid> For us, I guess, the goal is to package Phylanx. What is your goal?
<Abhishek09> to make Phylanx pip-installable
<rtohid> cool, so let's start by building it first.
<Abhishek09> in the same way that we discussed earlier
<Abhishek09> But how will you deal with the wheel for releasing on PyPI?
<rtohid> I would start from a docker image and try to figure out Phylanx's software dependencies.
<rtohid> the goal for now is not distributing the package. Let's do this step by step.
<Abhishek09> hpx, jemalloc, gcc, boost, pybind11,
<Abhishek09> blaze
<rtohid> great! can you please create a docker image with all these installed?
<Abhishek09> docker image or wheel file?
<rtohid> docker image. first we'd want to manually build the library and then we can automate the process step by step.
<Abhishek09> What is your plan? Can you explain it, so that I can research whether it is efficient or not?
<rtohid> start with a manual build, automate the process a few steps at a time and, eventually, package it.
<Abhishek09> is this a 3-month project or shorter?
<rtohid> I am not sure about the time.
<rtohid> sorry, I have to go to a meeting now. Please feel free to email me if you had any questions.
rtohid has left #ste||ar ["Konversation terminated!"]
hkaiser has quit [Quit: bye]
<Abhishek09> what is the name of the dnf package for phylanx?
Abhishek09 has quit [Remote host closed the connection]
hkaiser has joined #ste||ar
<diehlpk_work> hkaiser, We have scaling results for hpx + apex, hpx + apex + performance counters, hpx standalone, and hpx + performance counters + apex, for one up to 64 nodes
<diehlpk_work> Sagiv will compute the sub-grids per second, and we can compare the overhead of all of them
<hkaiser> diehlpk_work: very nice! thanks!
<diehlpk_work> hkaiser, Did you have time to read the CIC document?
<hkaiser> diehlpk_work: I read it over but had a hard time coming up with anything to add
K-ballo has joined #ste||ar