K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
hkaiser: yt?
Hey! Could you help me with where to get the branch of kokkos that compiles for kokkos-resiliency?
wait, it doesn't have altdev either :/
should I use master instead?
nod, I think so
if this doesn't work, let's ask Nick for instructions
ok. Thanks!
I was going through the execution model of kokkos. From what I understand, we need to add something like HPXResiliency execution space, right?
do we?
and the HPXResiliency execution space can then simply invoke HPX functionalities to get resilience
the hpx-kokkos repository has executors that map onto kokkos
I should look into one of the examples there then.
hkaiser: hpx-kokkos looks as if we're using HPX and Kokkos together (instead of backend-frontend)
gnikunj[m]: sure, it's an HPX executor on top of kokkos
so you're suggesting to use the async resilience executor on top of kokkos?
the resilience executor wraps any other executor to do the actual work
so yes, I'd start with using the resilience executor wrapped around the kokkos one
hpx::kokkos::kok is the execution policy
hpx::kokkos::default_executor is a simple executor
so use hpx::kokkos::kok.on(hpx::resiliency::experimental::replay_executor{hpx::kokkos::default_executor{}})
aah, so that's what you meant
or rather make_replay_executor(exec, N)
I hope that works, not sure ;-)
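A minimal sketch of the wrapping idea discussed above, using only the C++ standard library and not the HPX or Kokkos APIs. The names inline_executor, replay_wrapper, make_replay_wrapper and flaky_task are hypothetical stand-ins invented for illustration. The sketch assumes that "replay" means re-running a task that throws, up to N attempts in total; the real executors may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a "simple executor": runs the task immediately
// on the calling thread and returns its result.
struct inline_executor {
    template <typename F>
    auto execute(F&& f) const -> decltype(std::forward<F>(f)()) {
        return std::forward<F>(f)();
    }
};

// Hypothetical wrapper: does no work itself, it forwards every task to the
// wrapped executor and re-runs the task if it throws, up to `attempts`
// attempts in total.
template <typename Executor>
class replay_wrapper {
public:
    replay_wrapper(Executor exec, std::size_t attempts)
      : exec_(std::move(exec)), attempts_(attempts) {
        assert(attempts_ > 0);
    }

    template <typename F>
    auto execute(F f) const -> decltype(f()) {
        // All attempts but the last swallow the exception and retry.
        for (std::size_t i = 1; i < attempts_; ++i) {
            try {
                return exec_.execute(f);
            } catch (...) {
                // Treat the failure as transient and try again.
            }
        }
        // Final attempt: any exception propagates to the caller.
        return exec_.execute(f);
    }

private:
    Executor exec_;
    std::size_t attempts_;
};

// Helper mirroring the make_*(exec, N) shape mentioned in the chat.
template <typename Executor>
replay_wrapper<Executor> make_replay_wrapper(Executor exec, std::size_t n) {
    return replay_wrapper<Executor>(std::move(exec), n);
}

// Test task: throws while *failures_left > 0, then returns 42.
struct flaky_task {
    int* failures_left;
    int operator()() const {
        if (*failures_left > 0) {
            --*failures_left;
            throw std::runtime_error("transient failure");
        }
        return 42;
    }
};
```

Because the wrapper only relies on the inner type having an execute member, any executor with that shape could be substituted for inline_executor, which is the composability point made in the conversation.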
but do we not have to integrate it within kokkos-resilience itself?
not yet
aaah, now it's starting to make sense!
kokkos-resilience does checkpointing, i.e. data resilience
right. I was wondering how our resilience would be of any use compared to actual memory/persistent checkpointing
we might not need the kokkos execution policy, a simple async(make_replay_executor(...), ...) would be a good starting point
and if we had to integrate it within kokkos-resilience, we would need to build another execution space. Within that execution space, we can use HPX's execution space functionalities to build the resilient variant.
gnikunj[m]: last phone call we agreed to first look into execution resiliency
yes, I remember. But you asked me to get a prototype ready :P
so I was wondering this whole time that I need to integrate it within kokkos-resilience
yah, for the execution resiliency on gpus using kokkos
wait. If that's all I need to do, we don't need to create a library or anything.
Everything is within HPX and kokkos already. A simple cpp file that integrates the two and shows it working would do.
is that the idea of prototype you mentioned?
yes, exactly - let's do the minimally necessary work here
got it ;)
it demonstrates the power of composable components
you were right. This isn't full time work lol
don't tell them ;-)
It's early morning here. But I should have a prototype working by the end of tomorrow.
sure, no rush
(this is the first time I'm writing CUDA code. So expect doubts there)
more like writing codes for the GPU
thanks :D
we might run into related issues, so let's see
yah. I'm expecting some. Also, it has been some time since I last looked into HPX custom executor implementations.
Do you have example code I can look into?
just to debug resilience executors in cases of errors
hkaiser: one more thing. To use hpx-kokkos, do I need to first compile kokkos with standard HPX and then compile hpx-kokkos and give it a path to kokkos?
to rephrase, to use hpx-kokkos functionalities, I need kokkos preinstalled, right?
wait nvm. I should read the build instructions more carefully :/
gnikunj[m]: compiling kokkos with hpx just enables the hpx backend for it
Do you mean "I" should read? or is it for both of us?
gonidelis[m]: well, I have read it several times ;-) what we usually do is for the student to register for a 'special reading' course with myself (gives you normal credits), where the student reads the book over a semester and we meet once a week to talk about it
so is it like algorithmic programming 101?
also parallel shouldn't expect the same types as first and last
hkaiser: that's the master branch... I have made the changes in mine but wow! you located the bogus line without even watching the actual code...
I hear that Hartmut is at least 60% wizard.
Yorlik: hey
hkaiser: Heyo!
how's life?
Just saying a quick hello, since I wasn't on for the last three months. I'm taking over my father's estate (he's still alive) and there's tons of things to organize and plan.
good luck!
I just have this weekend at home - tomorrow I'm driving back.
I'm really missing programming - after moving there I can continue my project.
For now it unfortunately has to be on hold.
I saw HPX 1.6 is out. Any groundbreaking changes or additions?
Yorlik: nothing that should impact you
if it does, it's a bug ;-)
Can't wait to get back to it - I'm really missing it.