hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
tall15421542 has joined #ste||ar
Abhishek09 has joined #ste||ar
<Abhishek09> diehlpk_work: hey, I want to discuss my project with Rod, but he isn't available. What should I do?
<Abhishek09> Are you not interested anymore either?
<hkaiser> Abhishek09: please take into account time zones
<hkaiser> it's 9pm here
<hkaiser> I doubt diehlpk_work is still available
<hkaiser> also, I'll talk to Rod tomorrow
<Abhishek09> hkaiser: Thanks. I want to discuss the project "Providing pip package for phylanx"
<Abhishek09> It is very difficult to talk synchronously via email
<hkaiser> sure
hkaiser has quit [Quit: bye]
Abhishek09 has quit [Remote host closed the connection]
<zao> Mailing list test successfully ignored!
tall15421542 has quit [Remote host closed the connection]
jaafar_ is now known as jaafar
K-ballo has joined #ste||ar
rori has joined #ste||ar
<rori> FYI I'm back from vacation early
<zao> yay?
<rori> ^^
Abhishek09 has joined #ste||ar
Abhishek09 has quit [Remote host closed the connection]
tall15421542 has joined #ste||ar
kordejong has quit [Quit: killed]
rori has quit [Quit: killed]
heller1 has quit [Quit: killed]
simbergm has quit [Quit: killed]
mdiers_ has quit [Quit: mdiers_]
mdiers_ has joined #ste||ar
mdiers_ has quit [Client Quit]
mdiers_ has joined #ste||ar
kordejong has joined #ste||ar
simbergm has joined #ste||ar
nikunj has joined #ste||ar
rori has joined #ste||ar
hkaiser has joined #ste||ar
tall15421542 has quit [Remote host closed the connection]
<mdiers_> hkaiser: hi, can I ask you again about my last problem?
<hkaiser> mdiers_: sure
<mdiers_> I'm looking at http://stellar.cct.lsu.edu/pubs/hpcmaspa2015.pdf right now...
<mdiers_> so, back to my old problem: after a few changes I now get /threads{locality#0/total/total}/time/average,1,59.471139,[s],370051,[ns]
<mdiers_> but unfortunately still /threads{locality#0/total/total}/idle-rate,1,59.471319,[s],4391,[0.01%]
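For context, the two lines above are HPX performance counter samples (the kind printed via --hpx:print-counter). Below is a minimal sketch of querying the same counters from application code, assuming the hpx::performance_counters::performance_counter client API and reusing the counter names from the log:

    #include <hpx/include/performance_counters.hpp>

    #include <cstdint>
    #include <iostream>

    // read the average thread execution time and the idle-rate for locality 0;
    // must be called from an HPX thread while the runtime is running
    void print_thread_counters()
    {
        hpx::performance_counters::performance_counter avg_time(
            "/threads{locality#0/total/total}/time/average");
        hpx::performance_counters::performance_counter idle_rate(
            "/threads{locality#0/total/total}/idle-rate");

        // values are reported in [ns] and in units of [0.01%], respectively
        std::int64_t t = avg_time.get_value<std::int64_t>(hpx::launch::sync);
        std::int64_t r = idle_rate.get_value<std::int64_t>(hpx::launch::sync);

        std::cout << "average task time: " << t << " ns, idle-rate: "
                  << (r / 100.0) << "%\n";
    }

An idle-rate sample of 4391 in units of 0.01% corresponds to roughly 44% idle time, which is what the discussion below attributes to insufficient parallelism.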
<mdiers_> I only have this problem on the EPYC with 48/64 cores.
<mdiers_> if I create thread pools with 4 cores each and run the computation on them, it works almost perfectly.
<hkaiser> mdiers_: good
<hkaiser> that tells you that you have sufficient parallelism for a couple of cores, but not for a large number of them
<hkaiser> you simply don't have sufficient concurrent work to keep all cores busy
<hkaiser> do more work or restructure your algorithms
<hkaiser> mdiers_: I'm sorry I have to run - let's talk a bit later
<mdiers_> ok, tkz
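A minimal sketch of the "thread pools with 4 cores each" setup mentioned above, assuming the HPX 1.4-era hpx::resource::partitioner API; the pool names are illustrative, and the first core is left in the default pool (which must not be empty):

    #include <hpx/hpx_init.hpp>
    #include <hpx/include/resource_partitioner.hpp>

    #include <cstddef>
    #include <string>

    int hpx_main()
    {
        // application work would be scheduled onto the per-pool executors here
        return hpx::finalize();
    }

    int main(int argc, char* argv[])
    {
        hpx::resource::partitioner rp(argc, argv);

        std::size_t core_count = 0;
        for (hpx::resource::numa_domain const& d : rp.numa_domains())
        {
            for (hpx::resource::core const& c : d.cores())
            {
                if (core_count != 0)    // keep core 0 in the default pool
                {
                    // every following group of four cores gets its own pool
                    std::string pool =
                        "pool-" + std::to_string((core_count - 1) / 4);
                    if ((core_count - 1) % 4 == 0)
                        rp.create_thread_pool(pool);
                    for (hpx::resource::pu const& p : c.pus())
                        rp.add_resource(p, pool);
                }
                ++core_count;
            }
        }

        return hpx::init();
    }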
hkaiser has quit [Quit: bye]
mdiers_ has quit [Quit: mdiers_]
mdiers_ has joined #ste||ar
hkaiser has joined #ste||ar
hkaiser has quit [Ping timeout: 256 seconds]
nikunj97 has joined #ste||ar
nikunj has quit [Ping timeout: 245 seconds]
nikunj has joined #ste||ar
K-ballo has quit [Ping timeout: 256 seconds]
nikunj97 has quit [Ping timeout: 265 seconds]
hkaiser has joined #ste||ar
nikunj97 has joined #ste||ar
nikunj97 has quit [Ping timeout: 260 seconds]
<mdiers_> hkaiser: yt?
<hkaiser> here
<hkaiser> mdiers_: ^^
<mdiers_> ;-)
<mdiers_> yes, you are right about the size of the problem. I compute the same problem on different data records in parallel.
<mdiers_> in my case, for example, 16 data records at once with 4 cores each: like in the example, 16 instances of a component, each then running on 4 cores.
<mdiers_> that is my idea; it differs slightly from the basic use of hpx, but hopefully it can be realized anyway.
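A minimal sketch of that "16 data records on 16 pools of 4 cores" idea, assuming pools named "pool-0" .. "pool-15" were created as in the partitioner sketch above and the HPX 1.4-era pool_executor (header and namespace may differ between versions); process_record is a hypothetical per-record workload:

    #include <hpx/include/async.hpp>
    #include <hpx/include/lcos.hpp>
    #include <hpx/runtime/threads/executors/pool_executor.hpp>

    #include <cstddef>
    #include <string>
    #include <vector>

    void process_record(std::size_t record)
    {
        // hypothetical per-record workload
    }

    void process_all_records(std::size_t num_records)
    {
        std::vector<hpx::future<void>> results;
        results.reserve(num_records);

        for (std::size_t i = 0; i != num_records; ++i)
        {
            // bind each record to one of the 4-core pools
            hpx::threads::executors::pool_executor exec(
                "pool-" + std::to_string(i % 16));
            results.push_back(hpx::async(exec, &process_record, i));
        }

        hpx::wait_all(results);
    }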
<hkaiser> everything is possible with HPX ;-)
<mdiers_> yes, that's why I switched to hpx
<mdiers_> because its basic idea should actually make this possible
<hkaiser> nod
<mdiers_> so what have I done wrong, or what can be improved, here: https://gist.github.com/m-diers/10e0bed1ac978c7f27713c7767f45acf
<hkaiser> VTune marking the scheduler as a hot spot is a red herring
<hkaiser> it is caused by the high idle-rate, which in turn is caused by too little parallelism
<mdiers_> yes, but I tested it with 256 data records divided over 4 tasks, and it was the same
<mdiers_> I think I have created some behavior that keeps the scheduler from working efficiently
<hkaiser> mdiers_: ok, I will try to look at your code in more detail tonight
<mdiers_> I see this behavior on EPYC nodes only; on 20- or 32-core Intel nodes I don't have it
<hkaiser> interesting
<mdiers_> We have enough different hardware to test with... The 24-core EPYC is my workstation, and the 64-core Rome is a borrowed test system for evaluating future purchases.
<mdiers_> is there any experience as to whether HPX loses performance when compiled with support for 128 cores instead of 64?
<mdiers_> in my example the workload can of course also be increased. and I have already supplied the parallel::for_loop in the workload with static_chunk_size((n + target.num_pus - 1) / target.num_pus)
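A minimal sketch of that chunking, assuming the HPX 1.4-era names hpx::parallel::for_loop and hpx::parallel::execution::static_chunk_size; n and num_pus stand in for the n and target.num_pus from the message above:

    #include <hpx/include/parallel_executor_parameters.hpp>
    #include <hpx/include/parallel_for_loop.hpp>

    #include <cstddef>

    void run_workload(std::size_t n, std::size_t num_pus)
    {
        // one chunk per processing unit: ceil(n / num_pus) iterations per chunk
        hpx::parallel::execution::static_chunk_size chunks(
            (n + num_pus - 1) / num_pus);

        hpx::parallel::for_loop(
            hpx::parallel::execution::par.with(chunks), std::size_t(0), n,
            [](std::size_t i) {
                // per-element work goes here
            });
    }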
nikunj has quit [Ping timeout: 258 seconds]
nikunj has joined #ste||ar
Abhishek09 has joined #ste||ar
<mdiers_> hkaiser: I will continue tomorrow; now I also suspect the memory accesses. Thanks a lot for the time you've invested. ;-)
<Abhishek09> hkaiser: have you contacted Rod?
<hkaiser> Abhishek09: not yet, thanks for reminding me
rtohid has joined #ste||ar
<rtohid> @Abhishek09, I am here, how can I help you?
<Abhishek09> rtohid: happy to see u
<rtohid> thanks, did you get my email yesterday?
<Abhishek09> I want to discuss your project "Providing pip package for phylanx"
<Abhishek09> rtohid: Yes, and I also replied
<Abhishek09> I think you have to use manylinux
<Abhishek09> because PyPI doesn't accept Linux binary wheels without manylinux
<rtohid> what is it, and why do you need it? could you give a brief description, please?
<Abhishek09> Kindly walk through this: https://www.python.org/dev/peps/pep-0599/#pypi-support
<Abhishek09> if we do without it, the package will be rejected, and we'd be bad PyPA citizens
<Abhishek09> The approach we discussed will work, but it won't be accepted
<rtohid> yes, but that's the distribution issue, isn't it?
<Abhishek09> Therefore we have to use manylinux; otherwise only an sdist would be possible
<Abhishek09> but I would prefer a wheel
<rtohid> let's start with packaging and not worry about distribution for now.
<Abhishek09> Why? If not, our work will be wasted
<Abhishek09> we have to plan everything before making any decision
<rtohid> I would start with building HPX and Phylanx first.
<Abhishek09> so that we are always working towards achieving our goals
<rtohid> For us, I guess, the goal is to package Phylanx. What is your goal?
<Abhishek09> to make Phylanx pip-installable
<rtohid> cool, so let's start by building it first.
<Abhishek09> in the same way that we discussed earlier
<Abhishek09> But how will you deal with the wheel for releasing on PyPI?
<rtohid> I would start from a docker image and try to figure out Phylanx's software dependencies.
<rtohid> the goal for now is not distributing the package. Let's do this step by step.
<Abhishek09> hpx, jemalloc, gcc, boost, pybind11,
<Abhishek09> blaze
<rtohid> great! can you please create a docker image with all these installed?
<Abhishek09> docker image or wheel file?
<rtohid> docker image. first we'd want to manually build the library and then we can automate the process step by step.
<Abhishek09> What is your plan? Can you explain it, so that I can research whether it is efficient or not?
<rtohid> start with a manual build, automate the process a few steps at a time and, eventually, package it.
<Abhishek09> is this a 3-month project or shorter?
<rtohid> I am not sure about the time.
<rtohid> sorry, I have to go to a meeting now. Please feel free to email me if you had any questions.
rtohid has left #ste||ar ["Konversation terminated!"]
hkaiser has quit [Quit: bye]
<Abhishek09> what is the name of the dnf package for phylanx?
Abhishek09 has quit [Remote host closed the connection]
hkaiser has joined #ste||ar
<diehlpk_work> hkaiser, We have scaling results for hpx + apex, hpx + apex + performance counters, hpx standalone, and hpx + performance counters + apex, for one up to 64 nodes
<diehlpk_work> Sagiv will compute the sub-grids per second, and we can compare the overhead of all of them
<hkaiser> diehlpk_work: very nice! thanks!
<diehlpk_work> hkaiser, Did you have time to read the CIC document?
<hkaiser> diehlpk_work: I read it over but had a hard time coming up with anything to add
K-ballo has joined #ste||ar