K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<srinivasyadav227>
so do you have suggestions for where I can implement SIMD overloads for these algorithms?
<gnikunj[m]>
I don't think any of them is the right spot for the overloads to go in but hkaiser would be the right person to answer that
<srinivasyadav227>
gnikunj: okay :)
<hkaiser>
srinivasyadav227: we have started to isolate those sequential implementations into separate headers, but have not touched the ones that were inline with the algorithms
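[Editor's sketch] One way to picture the layout hkaiser describes: the sequential loop lives in its own function (which could sit in its own header), and a SIMD-flavoured overload is selected by a tag argument. This is only an illustration under assumptions. The names `sequenced_tag`, `simd_tag`, `detail::sequential_count` and `my_count` are hypothetical and are not taken from HPX, and the "SIMD" overload just uses four independent accumulators that a compiler may vectorize, not explicit SIMD types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical policy tags, invented for this sketch (not HPX types).
struct sequenced_tag {};
struct simd_tag {};

namespace detail {
    // Plain sequential implementation; this part could live in its own header.
    template <typename It, typename T>
    std::size_t sequential_count(sequenced_tag, It first, It last, T const& value)
    {
        std::size_t total = 0;
        for (; first != last; ++first)
            total += (*first == value) ? 1 : 0;
        return total;
    }

    // Overload chosen for the SIMD tag. Assumes random-access iterators.
    // Four independent accumulators stand in for vector lanes.
    template <typename It, typename T>
    std::size_t sequential_count(simd_tag, It first, It last, T const& value)
    {
        assert(first <= last);
        std::size_t lanes[4] = {0, 0, 0, 0};
        std::size_t const n = static_cast<std::size_t>(last - first);
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4)
            for (std::size_t l = 0; l < 4; ++l)
                lanes[l] += (first[i + l] == value) ? 1 : 0;
        std::size_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
        for (; i < n; ++i)    // leftover elements that do not fill a chunk
            total += (first[i] == value) ? 1 : 0;
        return total;
    }
}

// The algorithm front end only forwards to the overload picked by the tag.
template <typename Policy, typename It, typename T>
std::size_t my_count(Policy policy, It first, It last, T const& value)
{
    return detail::sequential_count(policy, first, last, value);
}
```

Calling `my_count(simd_tag{}, v.begin(), v.end(), 2)` on a `std::vector<int>` picks the chunked overload, while `sequenced_tag{}` picks the plain loop; both return the same count.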
<hkaiser>
gonidelis[m]: what did you change to get this?
<gnikunj[m]>
gonidelis[m]: so we're slower than cilk?
<gonidelis[m]>
that's std::max
parsa has joined #ste||ar
<gnikunj[m]>
aah, so why are we comparing cilk multicore with sequential std::max?
<gnikunj[m]>
(I don't know the context of the discussion)
<gonidelis[m]>
gnikunj[m]: it's kind of weird because I am just printing the same sequential std::max 6 times, but I am mostly using it as a ground sequential reference for Cilk
<gonidelis[m]>
We wanted Cilk to run faster than <whatever sequential form of max> just to prove that it's performant
<gnikunj[m]>
aah~
<gonidelis[m]>
<whatever sequential form of max> could be either my custom loop, a cilk loop with 1 worker or just std::max(seq)
<gnikunj[m]>
got it
<gonidelis[m]>
hkaiser: I changed many things, but what set the stage was that I needed to use the std::max result
<gonidelis[m]>
(damn I am nervous)
<hkaiser>
ok, cool
<gonidelis[m]>
My primary suspicion is that although I tried using the result previously, I was missing something and wasn't actually outputting the result of the loop I was timing
<hkaiser>
makes sense
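[Editor's sketch] The pitfall gonidelis describes can be illustrated roughly as follows: if the value computed by a timed loop is never used, an optimizing compiler is free to drop the loop, and the timing becomes meaningless. The sketch below returns the maximum together with the elapsed time so the caller can print or check it. The struct `timed_max` and both function names are hypothetical, and it assumes that "std::max" in the chat stands for a max reduction such as `std::max_element`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper type: the result of one timed run.
struct timed_max
{
    int value;         // maximum that was found
    double seconds;    // wall-clock time spent in the reduction
};

// Times a hand-written sequential max loop.
timed_max time_custom_max(std::vector<int> const& data)
{
    assert(!data.empty());
    auto const start = std::chrono::steady_clock::now();
    int best = data[0];
    for (std::size_t i = 1; i < data.size(); ++i)
        best = std::max(best, data[i]);
    auto const stop = std::chrono::steady_clock::now();
    // Returning `best` lets the caller consume the result of the timed loop.
    return {best, std::chrono::duration<double>(stop - start).count()};
}

// Times std::max_element as the sequential reference.
timed_max time_std_max(std::vector<int> const& data)
{
    assert(!data.empty());
    auto const start = std::chrono::steady_clock::now();
    int const best = *std::max_element(data.begin(), data.end());
    auto const stop = std::chrono::steady_clock::now();
    return {best, std::chrono::duration<double>(stop - start).count()};
}
```

Printing `value` next to `seconds` for every variant also doubles as a sanity check that all of them compute the same maximum.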
<gonidelis[m]>
Now I am going to get HPX in the arena: judgement day
<gnikunj[m]>
ms: is there a guide somewhere on creating execution spaces in Kokkos?
<ms[m]>
gnikunj: if you mean really implementing one, at least nothing I'm aware of
<gnikunj[m]>
Yes, implementing one
<gnikunj[m]>
Is there a set of functions an execution space should expose along with traits?
<gnikunj[m]>
I have to implement a resilient execution space in Kokkos that takes a base execution space instance and works on it
<ms[m]>
gnikunj: mainly some template specializations (ParallelFor/Reduce/Scan), with an execute member function (iirc), and possibly some typedefs here and there
<gnikunj[m]>
Aah ok. I'll look into the implementation details.
<ms[m]>
there's still a lot more that an execution space can support on top of that, but the plain ParallelFor is probably the best starting point
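[Editor's sketch] The pattern ms describes (a template specialized per execution space, with an `execute` member function) and the wrapper gnikunj wants (a space that holds a base space and works on it) might look roughly like this. This is a self-contained mock that does not use Kokkos headers. Every name in it (`serial_space`, `resilient_space`, `parallel_for_impl`, `run_parallel_for`) is hypothetical, and the real Kokkos class and member names may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a base execution space (not a Kokkos type).
struct serial_space {};

// Hypothetical wrapper space that holds an instance of a base space.
template <typename Base>
struct resilient_space
{
    Base base;
};

// Primary template, left undefined: each space provides a specialization.
template <typename Functor, typename Space>
struct parallel_for_impl;

// Specialization for the base space: run the functor over [0, n).
template <typename Functor>
struct parallel_for_impl<Functor, serial_space>
{
    Functor f;
    std::size_t n;
    serial_space space;

    void execute() const
    {
        for (std::size_t i = 0; i < n; ++i)
            f(i);
    }
};

// Specialization for the wrapper: forward the work to the base space.
// A real resilient space would add its recovery logic around this call.
template <typename Functor, typename Base>
struct parallel_for_impl<Functor, resilient_space<Base>>
{
    Functor f;
    std::size_t n;
    resilient_space<Base> space;

    void execute() const
    {
        parallel_for_impl<Functor, Base>{f, n, space.base}.execute();
    }
};

// Front end: pick the specialization from the space type and execute it.
template <typename Space, typename Functor>
void run_parallel_for(Space space, std::size_t n, Functor f)
{
    parallel_for_impl<Functor, Space>{std::move(f), n, space}.execute();
}
```

With this mock, `run_parallel_for(resilient_space<serial_space>{}, n, f)` ends up in the `serial_space` specialization by way of the wrapper, which mirrors the "takes a base execution space instance and works on it" idea.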
<gonidelis[m]>
we have `--hpx:threads` and what else?