K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<gnikunj[m]>
hkaiser: yt?
<hkaiser>
here
<gnikunj[m]>
Hey! Could you help me with where to get the branch of kokkos that compiles for kokkos-resiliency?
<gnikunj[m]>
wait, it doesn't have altdev either :/
<gnikunj[m]>
should I use master instead?
<hkaiser>
nod, I think so
<hkaiser>
if this doesn't work, let's ask Nick for instructions
<gnikunj[m]>
ok. Thanks!
<gnikunj[m]>
I was going through the execution model of kokkos. From what I understand, we need to add something like an HPXResiliency execution space, right?
<hkaiser>
do we?
<gnikunj[m]>
and the HPXResiliency execution space can then simply invoke HPX functionalities to get resilience
<hkaiser>
the hpx-kokkos repository has executors that map onto kokkos
<gnikunj[m]>
I should look into one of the examples there then.
<gnikunj[m]>
hkaiser: hpx-kokkos looks as if we're using HPX and Kokkos together (instead of backend-frontend)
<hkaiser>
gnikunj[m]: sure, it's an HPX executor on top of kokkos
<gnikunj[m]>
so you're suggesting to use the async resilience executor on top of kokkos?
<hkaiser>
the resilience executor wraps any other executor to do the actual work
<hkaiser>
so yes, I'd start with using the resilience executor wrapped around the kokkos one
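[Editor's sketch of the wrapping idea described above: a resilience layer that holds another executor, lets it do the actual work, and re-runs the task if it fails. This is a standard-library-only illustration, not HPX's or hpx-kokkos's implementation. The names `inline_executor`, `replay_wrapper`, `execute` and `run_flaky` are hypothetical, and the sketch runs synchronously, whereas the real executors discussed here are asynchronous.]

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical base executor: runs the task inline on the calling thread.
struct inline_executor {
    template <typename F>
    auto execute(F&& f) const {
        return std::forward<F>(f)();
    }
};

// Hypothetical wrapper (NOT HPX's replay_executor): forwards work to a base
// executor and re-runs the task if it throws, up to max_attempts times.
template <typename BaseExecutor>
class replay_wrapper {
public:
    replay_wrapper(BaseExecutor base, std::size_t max_attempts)
      : base_(std::move(base)), max_attempts_(max_attempts) {
        assert(max_attempts_ > 0 && "need at least one attempt");
    }

    template <typename F>
    auto execute(F f) const {
        for (std::size_t attempt = 1;; ++attempt) {
            try {
                // The wrapped executor does the actual work.
                return base_.execute(f);
            } catch (...) {
                // Out of attempts: propagate the last error to the caller.
                if (attempt >= max_attempts_) throw;
            }
        }
    }

private:
    BaseExecutor base_;
    std::size_t max_attempts_;
};

// Demo task: fails twice with a transient error, then succeeds on the third call.
int run_flaky(std::size_t max_attempts) {
    int calls = 0;
    replay_wrapper<inline_executor> exec{inline_executor{}, max_attempts};
    return exec.execute([&calls] {
        if (++calls < 3) throw std::runtime_error("transient failure");
        return 42;
    });
}
```

[With three or more attempts `run_flaky` returns the task's result. With fewer, the last exception reaches the caller. Because the wrapper only needs the base executor to expose `execute`, any executor with that shape could be substituted, which is the composability point made in the conversation.]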
<hkaiser>
hpx::kokkos::kok is the execution policy
<hkaiser>
hpx::kokkos::default_executor is a simple executor
<hkaiser>
so use hpx::kokkos::kok.on(hpx::resiliency::experimental::replay_executor{hpx::kokkos::default_executor{}})
<gnikunj[m]>
aah, so that's what you meant
<hkaiser>
or rather make_replay_executor(exec, N)
<hkaiser>
I hope that works, not sure ;-)
<gnikunj[m]>
but do we not have to integrate it within kokkos-resilience itself?
<hkaiser>
not yet
<gnikunj[m]>
aaah, now it's starting to make sense!
<hkaiser>
kokkos-resilience does checkpointing, i.e. data resilience
<gnikunj[m]>
right. I was wondering how our resilience would be of any use compared to actual memory/persistent checkpointing
<hkaiser>
we might not need the kokkos execution policy, a simple async(make_replay_executor(...), ...) would be a good starting point
<gnikunj[m]>
and if we had to integrate it within kokkos-resilience, we would have to build another execution space. Within that execution space, we could use HPX's execution space functionalities to build the resilient variant.
<hkaiser>
gnikunj[m]: last phone call we agreed to first look into execution resiliency
<gnikunj[m]>
yes, I remember. But you asked me to get a prototype ready :P
<gnikunj[m]>
so this whole time I was thinking that I needed to integrate it within kokkos-resilience
<hkaiser>
yah, for the execution resiliency on gpus using kokkos
<gnikunj[m]>
right.
<gnikunj[m]>
wait. If that's all I need to do, we don't need to create a library or anything.
<gnikunj[m]>
Everything is within HPX and kokkos already. A simple cpp file that integrates the two and shows it working would do.
<gnikunj[m]>
is that the prototype idea you mentioned?
<hkaiser>
yes, exactly - let's do the minimum necessary work here
<gnikunj[m]>
got it ;)
<hkaiser>
it demonstrates the power of composable components
<gnikunj[m]>
you were right. This isn't full time work lol
<hkaiser>
don't tell them ;-)
<gnikunj[m]>
It's early morning here. But I should have a prototype working by the end of tomorrow.
<hkaiser>
sure, no rush
<gnikunj[m]>
(this is the first time I'm writing CUDA code. So expect doubts there)
<gnikunj[m]>
more like writing code for the GPU
<hkaiser>
sure
<gnikunj[m]>
thanks :D
<hkaiser>
we might run into related issues, so let's see
<gnikunj[m]>
yah. I'm expecting some. Also, it has been some time since I last looked into HPX custom executor implementations.
<gnikunj[m]>
Do you have example code I can look into?
<gnikunj[m]>
just to debug resilience executors in cases of errors
<gnikunj[m]>
hkaiser: one more thing. To use hpx-kokkos, do I need to first compile kokkos with standard HPX and then compile hpx-kokkos and give it a path to kokkos?
<gnikunj[m]>
to rephrase, to use hpx-kokkos functionalities, I need kokkos preinstalled, right?
<gnikunj[m]>
wait nvm. I should read the build instructions more carefully :/
<hkaiser>
gnikunj[m]: compiling kokkos with hpx just enables the hpx backend for it
<gonidelis[m]>
Do you mean "I" should read? or is it for both of us?
<hkaiser>
gonidelis[m]: well, I have read it several times ;-) what we usually do is for the student to register for a 'special reading' course with myself (gives you normal credits), where the student reads the book over a semester and we meet once a week to talk about it
<gonidelis[m]>
so is it like algorithmic programming 101?
<hkaiser>
also parallel shouldn't expect the same types as first and last
<gonidelis[m]>
hkaiser: that's the master branch... I have made the changes in mine but wow! you located the bogus line without even looking at the actual code...
<zao>
I hear that Hartmut is at least 60% wizard.
<gonidelis[m]>
lol
<Yorlik>
o/
<hkaiser>
Yorlik: hey
<Yorlik>
hkaiser: Heyo!
<hkaiser>
how's life?
<Yorlik>
Just saying a quick hello, since I haven't been on for the last three months. I'm taking over my father's estate (he's still alive) and there are tons of things to organize and plan.
<hkaiser>
good luck!
<Yorlik>
I just have this weekend at home - tomorrow I'm driving back.
<Yorlik>
I'm really missing programming - after moving there I can continue my project.
<Yorlik>
For now it unfortunately has to be on hold.
<Yorlik>
I saw HPX 1.6 is out. Any groundbreaking changes or additions?
<hkaiser>
Yorlik: nothing that should impact you
<hkaiser>
if it does, it's a bug ;-)
<Yorlik>
:D
<Yorlik>
Can't wait to get back to it - I'm really missing it.