#ste||ar on 2021-06-05 — irc logs at irclog.cct.lsu.edu

2020-09-17 16:16 K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:57 <srinivasyadav227> hkaiser: Hi, I could not understand this https://github.com/STEllAR-GROUP/hpx/pull/5328/files/de27828d2261374221a735fe8742f604e5b3868b#r645689054, could you please elaborate on it? By feature test, did you mean this https://github.com/STEllAR-GROUP/hpx/blob/de27828d2261374221a735fe8742f604e5b3868b/cmake/HPX_PerformCxxFeatureTests.cmake or this

00:57 <srinivasyadav227> https://github.com/STEllAR-GROUP/hpx/blob/de27828d2261374221a735fe8742f604e5b3868b/cmake/tests/cxx20_experimental_simd.cpp ?

00:58 <hkaiser> srinivasyadav227: sorry, I meant the .cpp test

00:59 <srinivasyadav227> okay :)

02:48 hkaiser has quit [Quit: bye]

05:31 nanmiao has quit [Ping timeout: 240 seconds]

05:43 <jedi18[m]> What's the decorated_iterator that is used in tests?

08:49 <gonidelis[m]> jedi18: let me come back to you in a while

10:01 <gonidelis[m]> jedi18: I reckon it's used in order to bind an iterator to a callback

11:11 <jedi18[m]> Oh ok got it, so we just bind a callback that throws at a random position when iterating through it

11:11 <jedi18[m]> Thanks
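
The idea discussed above (an iterator bound to a callback that throws partway through a traversal) can be sketched roughly as follows. This is a minimal illustration, not HPX's test utility: `callback_iterator` and `check_callback_iterator` are hypothetical names, the callback is assumed to fire on dereference, and the throw position is fixed instead of random to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a "decorated" iterator: it wraps a base iterator
// and invokes a user-supplied callback every time it is dereferenced.
template <typename BaseIter>
class callback_iterator
{
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = typename std::iterator_traits<BaseIter>::value_type;
    using difference_type =
        typename std::iterator_traits<BaseIter>::difference_type;
    using pointer = typename std::iterator_traits<BaseIter>::pointer;
    using reference = typename std::iterator_traits<BaseIter>::reference;

    callback_iterator() = default;
    explicit callback_iterator(BaseIter it, std::function<void()> cb = {})
      : it_(it), cb_(std::move(cb))
    {
    }

    reference operator*() const
    {
        if (cb_)
            cb_();    // may throw, which is the point for exception tests
        return *it_;
    }
    callback_iterator& operator++() { ++it_; return *this; }
    callback_iterator operator++(int) { auto tmp = *this; ++it_; return tmp; }

    friend bool operator==(callback_iterator const& a, callback_iterator const& b)
    {
        return a.it_ == b.it_;
    }
    friend bool operator!=(callback_iterator const& a, callback_iterator const& b)
    {
        return !(a == b);
    }

private:
    BaseIter it_{};
    std::function<void()> cb_;
};

// Usage: the callback throws on the fifth dereference (index 4).
inline void check_callback_iterator()
{
    std::vector<int> v(10, 1);
    std::size_t visited = 0;
    std::size_t const throw_at = 4;    // a test could pick this at random

    using iter = callback_iterator<std::vector<int>::iterator>;
    iter first(v.begin(), [&] {
        if (visited++ == throw_at)
            throw std::runtime_error("test");
    });
    iter last(v.end());

    bool caught = false;
    int sum = 0;
    try
    {
        for (iter it = first; it != last; ++it)
            sum += *it;
    }
    catch (std::runtime_error const&)
    {
        caught = true;
    }
    assert(caught);
    assert(sum == 4);    // four elements were read before the throw
}
```

Passing such iterators to an algorithm lets a test verify that an exception raised in the middle of the input range is propagated to the caller.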

12:46 hkaiser has joined #ste||ar

12:52 <srinivasyadav227> hkaiser: how are clang compilers in `jenkins/cscs/clang-apex` and `jenkins/cscs/clang-newest` different from other clang-compilers? this test https://github.com/STEllAR-GROUP/hpx/blob/09adb23fb0636dfacedc17a10c6bd816b1600c10/cmake/tests/cxx20_experimental_simd.cpp was correctly failing with my local clang compiler, but passing here https://github.com/STEllAR-GROUP/hpx/pull/5328 ?

12:53 <hkaiser> srinivasyadav227: all clang compilers on linux define the __GNUC__ macro

12:53 <hkaiser> but they also define __clang__ so you can check for that as well

12:54 <hkaiser> ie. #if defined(__GNUC__) && !defined(__clang__)

12:55 <hkaiser> and could you please add a comment there explaining things?

12:55 <srinivasyadav227> <hkaiser "ie. #if defined(__GNUC__) && !de"> thank you :)

12:55 <srinivasyadav227> <hkaiser "and could you please add a comme"> in .cpp test?

12:55 <hkaiser> yes

12:55 <srinivasyadav227> sure will do it soon :)

12:56 <hkaiser> thanks!
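
The macro check suggested above can be sketched as a small compiler-detection snippet. `SKETCH_COMPILER`, `is_gnuc_not_clang` and `check_compiler_detection` are hypothetical names introduced only for this illustration; the preprocessor condition itself is the one quoted in the log.

```cpp
#include <cassert>
#include <cstring>

// Clang defines __GNUC__ as well, so __clang__ has to be tested first
// to tell the two compilers apart.
#if defined(__clang__)
#define SKETCH_COMPILER "clang"
#elif defined(__GNUC__)
#define SKETCH_COMPILER "gcc"
#else
#define SKETCH_COMPILER "other"
#endif

// The condition quoted in the log: true when __GNUC__ is defined and the
// compiler is not clang.
#if defined(__GNUC__) && !defined(__clang__)
constexpr bool is_gnuc_not_clang = true;
#else
constexpr bool is_gnuc_not_clang = false;
#endif

// Both detection styles should agree on any compiler.
inline void check_compiler_detection()
{
    bool const named_gcc = std::strcmp(SKETCH_COMPILER, "gcc") == 0;
    assert(named_gcc == is_gnuc_not_clang);
}
```

A feature test guarded only by `defined(__GNUC__)` takes the same branch under clang as under GCC, which would explain a test that passes on one clang setup and fails on another depending on what the guarded code does.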

14:18 <srinivasyadav227> hkaiser gnikunj rori: iterative algorithms like fill, generate, copy have sequential overloads either in the same algorithm header (like this https://github.com/STEllAR-GROUP/hpx/blob/master/libs/parallelism/algorithms/include/hpx/parallel/algorithms/generate.hpp#L157-L164) or in the detail dir (like this https://github.com/STEllAR-GROUP/hpx/blob/master/libs/parallelism/algorithms/include/hpx/parallel/algorithms/detail/fill.hpp),

14:18 <srinivasyadav227> so do you have suggestions where I can implement simd overloads for these algorithms?

14:23 <gnikunj[m]> I don't think any of them is the right spot for the overloads to go in but hkaiser would be the right person to answer that

14:23 <srinivasyadav227> gnikunj: okay :)

14:49 <hkaiser> srinivasyadav227: we have started to isolate those sequential implementations into separate headers, but have not touched the ones that were inline with the algorithms

14:50 <hkaiser> those separate implementations are here: https://github.com/STEllAR-GROUP/hpx/tree/master/libs/parallelism/algorithms/include/hpx/parallel/algorithms/detail

14:50 <hkaiser> for instance https://github.com/STEllAR-GROUP/hpx/blob/master/libs/parallelism/algorithms/include/hpx/parallel/algorithms/detail/fill.hpp

14:57 <srinivasyadav227> hkaiser: okay, so should i add simd overloads in https://github.com/STEllAR-GROUP/hpx/tree/master/libs/parallelism/algorithms/include/hpx/parallel/algorithms/detail ?

15:09 <hkaiser> nod, sounds good

15:09 <hkaiser> srinivasyadav227: ^^

18:32 <gonidelis[m]> hkaiser: https://ibb.co/GsKB6sM that's what I call happiness

18:41 parsa has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

18:43 <hkaiser> gonidelis[m]: finally

18:44 <hkaiser> gonidelis[m]: what did you change to get this?

18:44 <gnikunj[m]> gonidelis[m]: so we're slower than cilk?

18:45 <gonidelis[m]> that's std::max

18:45 parsa has joined #ste||ar

18:45 <gnikunj[m]> aah, so why are we comparing cilk multicore with sequential std::max?

18:45 <gnikunj[m]> (I don't know the context of the discussion)

18:45 <gonidelis[m]> gnikunj[m]: it's kind of weird because I am just printing the same sequential std::max 6 times, but I am mostly using it as a ground sequential reference for Cilk

18:46 <gonidelis[m]> We wanted Cilk to run faster than <whatever sequential form of max> just to prove that it's performant

18:46 <gnikunj[m]> aah~

18:46 <gonidelis[m]> <whatever sequential form of max> could be either my custom loop, a cilk loop with 1 worker or just std::max(seq)

18:47 <gnikunj[m]> got it

18:47 <gonidelis[m]> hkaiser: I changed many things, but what set the stage was that I needed to use the std::max result

18:48 <gonidelis[m]> (damn I am nervous)

18:48 <hkaiser> ok, cool

18:49 <gonidelis[m]> My main suspicion is that although I tried using the result previously, I was missing something and I wasn't actually outputting the result of the loop I was timing

18:50 <hkaiser> makes sense
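
The pitfall described above (timing a computation whose result is never observed, so the compiler is free to drop the work) can be illustrated with a minimal sketch. The names `timed_result`, `timed_max` and `check_timed_max` are hypothetical, and `std::max_element` stands in for whatever sequential max was actually benchmarked.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <numeric>
#include <vector>

struct timed_result
{
    int value;         // the computed maximum
    double seconds;    // wall-clock time spent computing it
};

// Times a sequential max over a non-empty vector and returns the result
// together with the elapsed time, so the caller can (and should) use both.
inline timed_result timed_max(std::vector<int> const& data)
{
    auto const start = std::chrono::steady_clock::now();
    int const m = *std::max_element(data.begin(), data.end());
    auto const stop = std::chrono::steady_clock::now();
    return {m, std::chrono::duration<double>(stop - start).count()};
}

inline void check_timed_max()
{
    std::vector<int> data(1000000);
    std::iota(data.begin(), data.end(), 0);    // 0, 1, ..., 999999

    timed_result const r = timed_max(data);

    // Storing the value in a volatile object makes it an observable side
    // effect, so the computation cannot be discarded as unused.
    volatile int sink = r.value;
    assert(sink == 999999);
    assert(r.seconds >= 0.0);
}
```

Printing the result, as mentioned in the log, has the same effect as the volatile store: the value becomes observable and the timed work has to be carried out.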

18:53 <gonidelis[m]> Now I am going to get HPX in the arena: judgement day

19:50 <gnikunj[m]> ms: is there a guide somewhere on creating execution spaces in Kokkos?

21:03 <ms[m]> gnikunj: if you mean really implementing one, at least nothing I'm aware of

21:03 <gnikunj[m]> Yes, implementing one

21:04 <gnikunj[m]> Is there a set of functions an execution space should expose along with traits?

21:04 <gnikunj[m]> I have to implement a resilient execution space in Kokkos that takes a base execution space instance and works on it

21:23 <ms[m]> gnikunj: mainly some template specializations (ParallelFor/Reduce/Scan), with an execute member function (iirc), and possibly some typedefs here and there

21:24 <gnikunj[m]> Aah ok. I'll look into an implementation detail.

21:24 <ms[m]> there's a lot that an execution space can support still on top of that, but the plain ParallelFor is probably the best starting point
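
The shape ms describes (a per-space template specialization with an execute member function) might look roughly like the following mock. It deliberately does not include Kokkos: everything in the `sketch` namespace is hypothetical and only mirrors the pattern, so the real Kokkos template parameters, policies and required typedefs will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {    // hypothetical names; not the Kokkos API

    struct serial_space {};    // stand-in for a base execution space

    template <typename Base>
    struct wrapping_space      // stand-in for a space layered on a base space
    {
        using base_space = Base;
    };

    // Primary template, specialized once per execution space.
    template <typename Functor, typename Space>
    class parallel_for;

    template <typename Functor>
    class parallel_for<Functor, serial_space>
    {
    public:
        parallel_for(Functor f, std::size_t n) : f_(f), n_(n) {}
        void execute() const
        {
            for (std::size_t i = 0; i < n_; ++i)
                f_(i);
        }

    private:
        Functor f_;
        std::size_t n_;
    };

    // The wrapping space forwards to the base space's specialization; a
    // resilient space would presumably add its own checks around this call.
    template <typename Functor, typename Base>
    class parallel_for<Functor, wrapping_space<Base>>
    {
    public:
        parallel_for(Functor f, std::size_t n) : inner_(f, n) {}
        void execute() const { inner_.execute(); }

    private:
        parallel_for<Functor, Base> inner_;
    };

    // Front end that selects the specialization from the space type.
    template <typename Space, typename Functor>
    void run_for(std::size_t n, Functor f)
    {
        parallel_for<Functor, Space>(f, n).execute();
    }
}    // namespace sketch

inline void check_wrapping_space()
{
    std::vector<int> out(8, 0);
    int* p = out.data();
    using space = sketch::wrapping_space<sketch::serial_space>;
    sketch::run_for<space>(
        out.size(), [p](std::size_t i) { p[i] = static_cast<int>(i * 2); });
    assert(out[3] == 6);
    assert(out[7] == 14);
}
```

Starting with the plain for-loop case, as suggested in the log, keeps the first specialization small; reduce and scan would follow the same dispatch pattern.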

23:12 <gonidelis[m]> we have `--hpx:threads` and what else?

23:13 hkaiser has quit [Quit: bye]

23:14 chuanqiu has joined #ste||ar

23:53 chuanqiu has quit [Quit: Connection closed]