<rori>
Hey, is anyone using the `HPX_WITH_VIM_YCM` option successfully?
<rori>
even though I verified that the corresponding directories are in the compile_commands.json
<rori>
I enabled the option at the configure step, ran `make configure_ycm` to copy the configuration file into the source dir, and added `let g:ycm_extra_conf_globlist = ['<path_to_my_project>/*']` to my `.vimrc`, but I still get some header-not-found errors
<hkaiser>
rori: I've never used this feature
<rori>
ok ^^ thanks !
karame_ has joined #ste||ar
<nan11>
Is Avah in irc?
nikunj97 has quit [Quit: Leaving]
gonidelis has joined #ste||ar
<hkaiser>
nan11: don't think so
<nan11>
Okay
<hkaiser>
Amy1: use std::experimental::simd for your delta_pos and pos arrays
<simbergm>
hkaiser: I wasn't, but that looks interesting as well...
weilewei has joined #ste||ar
<hkaiser>
could be a c++ standards issue
<simbergm>
ah, but it's probably the same issue
<hkaiser>
nod
<simbergm>
clang defaults to something really low whereas gcc defaults to 14
<hkaiser>
right
<simbergm>
so yeah, that should go away
weilewei has quit [Remote host closed the connection]
<hkaiser>
simbergm: I'll rebase John's PRs
<hkaiser>
simbergm: btw, did you ever merge your fix for the -1 index issue in one of the schedulers?
<simbergm>
hkaiser: sure, pycicle will automatically pick up the changes on master though (in case you want to save yourself some work)
weilewei has joined #ste||ar
<simbergm>
hmm, the one on the exception pr?
<hkaiser>
no
<simbergm>
or something else?
<hkaiser>
the local_thread_num issue
<simbergm>
the exception pr is merged
<simbergm>
well, there was a fix for the local thread num -1 issue on that pr, but I still have to go and check that all the other schedulers get the local thread num set
<hkaiser>
ok, then one of the problems on the APEX PR should go away
<weilewei>
hkaiser now the error of the G4 array between GPUDirect and baseline is down to 5e-15, a very acceptable range. So my logic is correct now, and I'm going to run more experiments to verify my implementation. The bigger error before was due to my unfamiliarity with the data processing tools, like hdf5 and the python interface for complex numbers... learned something new
<simbergm>
hkaiser: those might require the proper fix
<simbergm>
I can have a look in any case
<simbergm>
(the exception pr went in yesterday and the apex failures are still there from today)
<hkaiser>
simbergm: ok, thanks
<hkaiser>
weilewei: nice
<weilewei>
hkaiser do we happen to have access to any AMD GPU?
<hkaiser>
weilewei: we could ask Adrian ;-)
<weilewei>
hkaiser lol, true, just checking
<hkaiser>
weilewei: at some point we did have AMD GPUs in rostam, not sure if that's still the case
<weilewei>
hkaiser ok, maybe Ali knows
<hkaiser>
pls ask him
<Yorlik>
hkaiser: YT?
<hkaiser>
here
<hkaiser>
half-way
<Yorlik>
I did some thinking about this specific tree structure, a 64-ary tree
<Yorlik>
Its problems and its potential
<Yorlik>
Also a possible generalization and why Morton Code actually might be the answer
<Yorlik>
I first thought the Morton Codes are only useful to balance a newly constructed tree
<Yorlik>
But their power is much more
<Yorlik>
If you say N is the number of components you combine in a Morton Code
<Yorlik>
And you want to have this N-dimensional tree type
<Yorlik>
You end up with 2^N possible coordinates / child nodes
<Yorlik>
So - you enter dimensional explosion very quickly.
<Yorlik>
If the bitwise comparison of coordinates in such a tree is used to calculate the number in the array of child nodes, this number encodes all the comparisons you made
<hkaiser>
ok
<hkaiser>
optimizing again?
<Yorlik>
Listen first
<Yorlik>
If you write a Morton Code as a number in base 2^N, the digits represent the subsector indices through the tree - the entire path is encoded
<Yorlik>
Sry - my typing is horrible.
<Yorlik>
If you want to avoid chasing pointers at every level, because you cannot have an array for all coordinates, the Morton Code solves that
<Yorlik>
You avoid the dimensional explosion
<hkaiser>
k
<simbergm>
weilewei: did you get any reaction from john yesterday? he didn't reply to me either when I asked about it, but I can poke him about it again
<Yorlik>
If you have, say, a 64-bit tree, the amount of storage explodes - so you need a sparse structure in any case.
<Yorlik>
Just storing the coordinates becomes a problem.
<weilewei>
simbergm unfortunately no response from John yesterday
<Yorlik>
Also - traversing a tree is probably less efficient than quickly calculating a Morton Code using intrinsics/SIMD
<Yorlik>
And then looking it up in a skip list
<weilewei>
simbergm I guess he is ignoring it, never mind then.
<Yorlik>
OFC I'd have to measure, but the real problem is the consequences of dimensional explosion as you add coordinates.
<simbergm>
weilewei: I don't think so, he can just be a bit distracted sometimes
<weilewei>
simbergm thanks, if convenient, please poke him again. Thanks!
<nikunj97>
should I just replace apex with HPX everywhere?
<hkaiser>
nikunj97: not everywhere
<hkaiser>
only in one spot
<simbergm>
hrm
<hkaiser>
am I right?
<nikunj97>
let me just recompile it with the PR branch :/
<hkaiser>
sorry, if I misunderstand things
<simbergm>
sorry, yes, just one spot!
<simbergm>
thinking about something else
<nikunj97>
where do I make the change then?
<hkaiser>
nikunj97: it's in the PR
<simbergm>
since apex was built there should be the exported target in HPXModuleTargets.cmake
<simbergm>
but if linking to HPX::apex didn't work it's not going to work by changing it there either...
<simbergm>
I'll do a bit of digging
<nikunj97>
let me just recompile in that case
<simbergm>
if you don't mind potentially doing it again... I just want to check that my fix actually is correct (I admit I didn't test it)
<nikunj97>
simbergm, sure will do
<nikunj97>
will comment on the PR if things work
<simbergm>
ah, thanks
<simbergm>
(I did actually mean that I'll try it out myself, but if you don't mind trying I'm very happy as well)
<gonidelis>
Procedure-oriented question: when I have a GitHub project cloned onto my PC (say HPX), certain new compilation files are produced in the directory after some compilations. So what happens after I make some changes and want to push them back? What is a standard procedure with which I can push my new code explicitly and not the
<gonidelis>
Is it the project's business to produce the compilation files in an external dir, or is it my business to keep a copy of the original project?
<simbergm>
gonidelis: no compiled artifacts go into the git repository since they can 1) be rebuilt from the source files, 2) they are machine specific, 3) they can be big(!), etc.
<simbergm>
there are probably hundreds of reasons not to add compiled files
<simbergm>
hpx does not even allow building in the source directory which makes it a bit easier to not accidentally check in anything from the build directory
<simbergm>
if you have your build directory completely outside of the source directory git won't even let you add them, and if you have a "build" subdirectory in your source directory we have a gitignore rule to ignore any files in that directory
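The ignore rule simbergm mentions might look roughly like this (a hypothetical sketch; HPX's actual .gitignore may differ in detail):

```gitignore
# Ignore an in-source "build" subdirectory and everything in it
build/
```

The safer habit, as noted above, is an out-of-source build directory entirely outside the repository, so git never sees the generated files at all.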
<gonidelis>
thank you. You answered perfectly to all of my questions. I 'll keep your words in mind...
<nikunj97>
simbergm, there's no apex in HPXTargets.cmake
<nikunj97>
do I add that at the end?
<simbergm>
nikunj97: no, that means apex is not (correctly?) enabled
<simbergm>
adding it manually is a bad idea
<simbergm>
so you're on 428f0ad5f31 (msimberg-patch-5), now?
<nikunj97>
I used -DHPX_WITH_APEX=ON -DAPEX_WITH_OTF2
<nikunj97>
simbergm, I haven't built it yet
<nikunj97>
I thought, you wanted to make changes to the current build
<simbergm>
nikunj97: I was hoping you could, but that's not going to work if apex wasn't enabled in the first place
<nikunj97>
how do I enable it?
<simbergm>
so in your build directory, you definitely have HPX_WITH_APEX=ON (check CMakeCache.txt or ccmake, whatever you prefer)
<nikunj97>
did I do something wrong with the build?
<simbergm>
`-DHPX_WITH_APEX=ON` is correct, I'm just being thorough :)
<simbergm>
don't worry
<nikunj97>
HPX_WITH_APEX ON
<nikunj97>
from ccmake
<simbergm>
good
<simbergm>
then, in your build directory in lib/cmake/HPX/HPXTargets.cmake do you have any mention of HPX::hpx?
<simbergm>
or did you already check the build directory earlier?
<simbergm>
looks correct (although I recommend you don't use the global `include_directories` and friends commands, but that's unrelated; HPX_LIBRARY_DIR and HPX_INCLUDE_DIR are empty nowadays)
<hkaiser>
bita: yt?
<simbergm>
nikunj97: I'm running out of ideas, something is not using the right paths
<hkaiser>
bita: I will be a couple of minutes late for our meeting today
<nikunj97>
simbergm, I'm currently building your new PR. Let's see how it works
<simbergm>
do you have anything in LD_LIBRARY_PATH that might make it look like it's linking to the correct one, even though it was compiled against another install?
<nikunj97>
I only have path to nsimd in LD_LIBRARY_PATH
<simbergm>
ok, not that then...
<nikunj97>
coz I'm yet to write a findNsimd to link things correctly
<bita>
hkaiser, Okay :)
<simbergm>
nikunj97: I'll be away for a bit, but ping me if things still don't work with the pr
<nikunj97>
simbergm, will do
<gonidelis>
When I call hpx::async on an hpx::future, is it correct to say that a thread is invoked? Or is it just something from a higher-level architecture, say 'a future', for example?
<weilewei>
gonidelis I believe it will invoke an HPX user-level thread, and the task represented by the future will be executed in that new thread
<weilewei>
gonidelis ^^
gonidelis has quit [Ping timeout: 240 seconds]
<nikunj97>
simbergm, your PR worked for me!
<heller1>
gonidelis: you don't invoke async on a future
<heller1>
async returns a future
<heller1>
And yes, think in tasks, not in threads. A future represents an asynchronous result from a task
<heller1>
It does not necessarily have to be an OS thread or user-level thread that is carrying out the task. It could be an asynchronous copy, where some DMA engine performs the work, or a network request where your network card performs the work, and many more
gonidelis has joined #ste||ar
<gonidelis>
heller1 Yeah, *task* fits better as a term. Thank you...
<gonidelis>
The reason I ask is because I am writing a README for my matrix_multiplication sample program, and I would like to be as accurate as possible since I want to provide the code in public as a serious example of HPX's speed-boost possibilities compared to a sequential version
<heller1>
Be careful, matrix-matrix multiplication is a well-researched topic. While your implementation is most likely memory bound and an O(N^3) algorithm, the best implementations and algorithms perform way better, even the sequential ones ;)
nikunj97 has quit [Read error: Connection reset by peer]
<hkaiser>
yah, David did write a close to optimal mxm a while back, there should be a repository somewhere
<heller1>
Cool, I wasn't aware of that! We should promote that...
<gonidelis>
I agree. Maybe it would be better if I rephrase : "Dummy MxM multiplication". I am not trying to provide a fast MxM calculator but rather expose the time difference between a straightforward sequential MxM execution and a straightforward parallel MxM one. I would like it to be more of an exhibition example...
<heller1>
Sure
<heller1>
I really didn't mean to demotivate you... Just wanted to mention points of criticism one might have if you distribute such statements
<gonidelis>
no worries! You helped me document the goal of my project in a better manner actually ;)
<heller1>
On that note, it's really sad that no one has implemented the Strassen algorithm...
<heller1>
hkaiser: we should ask the students about that next year... Gives way more insight into task based programming
<gonidelis>
heller1 Do you think it would be useful if I try to implement it the next few days?
<gonidelis>
useful for the community*
<heller1>
If you have some time, go ahead. It's certainly useful for you.
<hkaiser>
gonidelis: mostly useful for you, I guess
<gonidelis>
Sure. I 'll be glad to give it a try.
nikunj97 has joined #ste||ar
<gonidelis>
Some thoughts on my current project though: I have implemented a sequential version compared to a parallel one utilizing a simple `hpx::async`. The sequential version seems to perform better than the parallel one. My guess is that because I compute each cell of the product matrix with a different task, the overhead dominates... Do you think making a
<gonidelis>
row-based parallelization would improve the performance?