hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
<gnikunj[m]>
hkaiser: could you point me to the implementation of the sender/receiver stuff in HPX pls
<gnikunj[m]>
Btw where is the actual sender class defined? I’m seeing sender derivatives that store a sender type and build upon it (bulk_sender et al) but not the base sender class
<hkaiser>
it's an old-fashioned HPX executor on top of s/r
<gnikunj[m]>
Thanks!!
<hkaiser>
a bit difficult to read as we deliberately didn't use the pipe syntax, so you have to read it inside-out
<gnikunj[m]>
Nah, that's fine. I (idk if that's just me) prefer not to use the pipe syntax in general
<hkaiser>
for s/r it's really more convenient and more readable
<hkaiser>
compare start_detached(then(schedule(sched), [](...){})) with schedule(sched) | then([](...){}) | start_detached
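For reference, a minimal compilable sketch of both spellings (assuming HPX's P2300-style implementation lives in hpx::execution::experimental and provides thread_pool_scheduler; exact names vary across HPX versions):

    // Sketch only: assumes HPX's senders/receivers in
    // hpx::execution::experimental; spellings vary across versions.
    #include <hpx/execution.hpp>
    #include <hpx/hpx_main.hpp>  // runs main() on the HPX runtime
    #include <iostream>
    #include <utility>

    namespace ex = hpx::execution::experimental;

    int main()
    {
        ex::thread_pool_scheduler sched;

        // nested form: has to be read inside-out
        ex::start_detached(
            ex::then(ex::schedule(sched), [] { std::cout << "nested\n"; }));

        // pipe form: reads left to right
        auto work = ex::schedule(sched)
            | ex::then([] { std::cout << "piped\n"; });
        ex::start_detached(std::move(work));

        return 0;
    }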
<gnikunj[m]>
Yeah, you're right. I guess the real bottleneck is that I'm not used to the pipe syntax yet 😅
<hkaiser>
gnikunj[m]: btw, Sanjay told me about charmlite and he wants to use it for the task_bench paper
<hkaiser>
also, for the stencil benchmark in taskbench we now beat charm ;-)
<gnikunj[m]>
Yeah, Simeng came by our little charmlite den. We don't think that's possible (we lack plenty of features rn), but she said she can implement it without sections (charm++'s way of handling global data structures)
<hkaiser>
nice
<gnikunj[m]>
hkaiser: Now that’s incredible!! What led to the increase in performance?
<hkaiser>
gnikunj[m]: btw, will you be able to join the hpx call on Thursday?
<hkaiser>
gnikunj[m]: careful analysis, mostly
<gnikunj[m]>
Yes, I will be joining the call this Thursday
<hkaiser>
we are better in the single-node case; comm is still bad
<gnikunj[m]>
I mean I’ll be in BTR starting Tuesday so I expect Giannis to wake me up :P
<hkaiser>
ok, good
<hkaiser>
I want him to join as well
<gnikunj[m]>
hkaiser: Yeah, we need to figure things out there.
<gnikunj[m]>
srinivasyadav227: not sure where it's coming from really.
<srinivasyadav227>
gnikunj: ok :) 😅
<gnikunj[m]>
I mean ik where it's coming from (fork-join-executor) but not sure if it's related to build time
<srinivasyadav227>
<gnikunj[m]> "srinivasyadav227: the errors are..." <- why there is no arm/arch64 related flag defined there ?, all are x86 based only ?
<gnikunj[m]>
Because we don't support cpuid-based feature detection on Arm
<gnikunj[m]>
those are features used when writing register-specific things (hence the tables that store the info)
<srinivasyadav227>
oh ok, so cpuid is generally only for x86-based systems?
<gnikunj[m]>
Yeah, I think so. Although you can identify other CPU-related info on Arm as well.
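For context, cpuid is literally an x86 instruction, which is why the feature tables are x86-only; a minimal sketch using GCC/Clang's <cpuid.h> (on Arm you would read system registers or ask the OS instead):

    // x86-only sketch: cpuid is an x86 instruction, exposed here via
    // GCC/Clang's <cpuid.h>.
    #include <cpuid.h>
    #include <cstdio>

    int main()
    {
        unsigned eax, ebx, ecx, edx;
        // leaf 1 returns the basic feature bits; SSE2 is EDX bit 26
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            std::printf("SSE2: %s\n", (edx & (1u << 26)) ? "yes" : "no");
        return 0;
    }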
<srinivasyadav227>
oh ok cool :)
<gnikunj[m]>
hkaiser: yt?
hkaiser has quit [Quit: Bye!]
FunMiles has joined #ste||ar
<FunMiles>
Are C++20 coroutines with co_await still available in HPX? How can I enable them, in particular on macOS?
hkaiser has joined #ste||ar
<FunMiles>
@hkaiser: Are C++20 coroutines with co_await as you showed them in CPPcon still available in HPX? How can I enable them?
<FunMiles>
When grepping the latest source, I'm not finding what I thought would have to be there for that feature to still be enabled.
<hkaiser>
FunMiles: just configuring with -DHPX_CXX_STANDARD=20 should do the trick
<FunMiles>
hkaiser: Trying that. But here's a quick question: how come a recursive `grep -rn promise_type *` only returns a match in `cmake/tests/cxx20_coroutines.cpp`? Isn't it necessary to define that type for C++20 coroutines?
<gnikunj[m]>
hkaiser: I went through the paper yesterday and have a couple questions.
<gnikunj[m]>
1. The connect function should take in a receiver of the same type as the sender, right? (like if you have a then sender, then you want to connect it to the then receiver)
<gnikunj[m]>
2. If the connect function returns an operation state, why do some receivers not have one? (I see that then_receiver doesn't have an operation_state struct defined in it)
<gnikunj[m]>
3. How do I retrieve the end value that was set by the receiver using set_value()?
<FunMiles>
hkaiser: Are you making use of asio's C++ coroutine features to provide the coroutine interface?
<hkaiser>
FunMiles: no asio is involved, let me give you the link to the integration, give me a sec
<hkaiser>
gnikunj[m]: 1) no, sender and receiver are independent and unrelated types
<hkaiser>
you can combine any sender with any receiver
<hkaiser>
2) I'm not sure I understand, what receiver do you refer to?
<hkaiser>
3) again, not sure I understand your question
<gnikunj[m]>
2) I was talking about the HPX implementation. For instance, then_receiver doesn't have an operation_state struct with a start function.
<gnikunj[m]>
Also, for 1), why would you want a bulk_receiver to take a then_sender?
<hkaiser>
then is a receiver, no?
<gnikunj[m]>
3) What I meant was - how do you get the final value? (equivalent to future.get())
<gnikunj[m]>
also, do you have time for a quick call? I think I'll be able to explain things better that way.
<hkaiser>
it is received by the lambda you pass to then, for instance
<gnikunj[m]>
sure, but what about the final lambda, which also returns a value? How do I get that?
<hkaiser>
for 2) there is a default operation_state that simply connects a sender with a receiver (i.e. just passes the values through); it is used when no special functionality is needed
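To illustrate why any sender can be connected to any receiver, a rough sketch in the tag_invoke style HPX uses (print_receiver is a hypothetical type, and the completion-function names differ between P2300 revisions, e.g. set_done vs set_stopped):

    // Sketch only: a hand-written receiver connected to an unrelated
    // sender chain via the connect/start CPOs.
    #include <hpx/execution.hpp>
    #include <exception>
    #include <iostream>
    #include <utility>

    namespace ex = hpx::execution::experimental;

    struct print_receiver
    {
        friend void tag_invoke(
            ex::set_value_t, print_receiver&&, int v) noexcept
        {
            std::cout << "got " << v << '\n';
        }
        friend void tag_invoke(ex::set_error_t, print_receiver&&,
            std::exception_ptr) noexcept
        {
            std::terminate();
        }
        friend void tag_invoke(ex::set_stopped_t, print_receiver&&) noexcept {}
    };

    int main()
    {
        // any sender ...
        auto sndr = ex::then(ex::just(41), [](int i) { return i + 1; });
        // ... connected to any receiver yields an operation state
        auto op = ex::connect(std::move(sndr), print_receiver{});
        ex::start(op);  // prints "got 42"
        return 0;
    }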
<gnikunj[m]>
are there sender algorithms that return it?
<hkaiser>
gnikunj[m]: you can use make_future to get a future for the overall value
<gnikunj[m]>
where is the default operation_state implemented in HPX?
<hkaiser>
sync_wait returns the value as well
<gnikunj[m]>
hkaiser: make_future isn't in P2300R3.
<gnikunj[m]>
wait, sync_wait isn't void?
<hkaiser>
no, it [make_future] isn't, but sync_wait is
<gnikunj[m]>
ohh crap, right
<gnikunj[m]>
sync_wait will return the final value. Now, it makes sense.
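A rough sketch of both ways of retrieving the terminal value (sync_wait's namespace and exact return type vary across P2300 revisions and HPX versions, and make_future is an HPX extension rather than part of P2300R3):

    // Sketch only: getting the final value out of a sender chain.
    #include <hpx/execution.hpp>
    #include <hpx/future.hpp>
    #include <hpx/hpx_main.hpp>
    #include <iostream>

    namespace ex = hpx::execution::experimental;
    namespace tt = hpx::this_thread::experimental;

    int main()
    {
        auto make_chain = [] {
            return ex::just(20) | ex::then([](int i) { return 2 * i + 2; });
        };

        // blocking: sync_wait drives the chain and hands back the value
        auto result = tt::sync_wait(make_chain());
        (void) result;  // exact return type differs between revisions

        // future-based: make_future wraps the chain in an hpx::future
        hpx::future<int> f = ex::make_future(make_chain());
        std::cout << f.get() << '\n';  // prints 42
        return 0;
    }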
<gnikunj[m]>
ok, but I still don't understand why you can connect any receiver with any sender
<gnikunj[m]>
like why would you connect a bulk receiver with a then sender?
<hkaiser>
gnikunj[m]: that's the whole point of this, combining arbitrary senders with arbitrary receivers
<hkaiser>
anonymous consumer/producer chains
<hkaiser>
give me a sec to locate the default operation_state
<gnikunj[m]>
when I thought of arbitrary connections, I thought more of transfer algorithms - like transferring execution from one scheduler context to another
<hkaiser>
sure, that's just a special receiver
<gnikunj[m]>
but transfer is a sender algorithm that returns a sender
<gnikunj[m]>
(I'm honestly a bit confused by the genericity and need clarification so I apologize if I sound absolutely stupid asking these questions)
<hkaiser>
well yes, it returns a new sender that sits on the target execution environment
<gnikunj[m]>
hkaiser: we're also missing some sender adaptors - transfer_when_all, on, upon_*
<hkaiser>
yes, that's correct
<hkaiser>
those are fairly new in p2300
<gnikunj[m]>
Btw, I'm traveling tomorrow morning so I won't be able to attend the meeting with Srinivas
<hkaiser>
sure, np
<FunMiles>
hkaiser: I am a bit confused about the file you sent. It's not in a regular hpx clone. Is it a different branch? The link says it's master, but then it's hpx-local
<gnikunj[m]>
hkaiser: do you have some time to discuss the sender/receiver stuff on Wednesday?
<hkaiser>
FunMiles: long story, for now just go with it, I think hpx-local will be back in hpx soon
<gnikunj[m]>
I'd like to discuss exactly on the implementation details. Meanwhile, I'll do some debug run to see the call stack of execution.
<hkaiser>
gnikunj[m]: sure
<gnikunj[m]>
ok, let me ask Katie to set up a time on wed
<FunMiles>
hkaiser: OK :) I'll give it a try.
<hkaiser>
FunMiles: sorry for the confusion
<FunMiles>
hkaiser: For the flag, is it HPX_CXX_STANDARD or CXX_STANDARD ?
<FunMiles>
CMake says it's unused.
<hkaiser>
sec
<hkaiser>
FunMiles: for any recent versions of HPX master it should be HPX_CXX_STANDARD=20
<hkaiser>
you can try HPX_WITH_CXX20=On, that should do the right thing even if it generates a warning
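For reference, both spellings mentioned here (which one applies depends on the checkout, per the discussion below):

    # recent HPX master
    cmake -DHPX_CXX_STANDARD=20 <src>
    # older spelling; should still work but may emit a warning
    cmake -DHPX_WITH_CXX20=On <src>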
<FunMiles>
-DHPXLocal_WITH_CXX_STANDARD=20
<hkaiser>
this will change soon, but for now it may work
<hkaiser>
again sorry, things are in a bit of limbo right now
<FunMiles>
hkaiser: I was watching the CppCon presentation about nano-coroutines by Gor Nishanov. He mentions that switching fibers takes microseconds, while coroutines take nanoseconds. Have you measured performance benefits from using C++20 coroutines?
<hkaiser>
FunMiles: we have not measured this, but using co_await with our futures doesn't have the nanosecond overhead Gor describes
<hkaiser>
this is not really using coroutines anyways
<hkaiser>
it's more syntactic sugar that allows turning future-based, continuation-style asynchronous code inside out, making it easier to read and write
<hkaiser>
it still has the same overhead as using futures without co_await
<FunMiles>
OK. Can you imagine that in the future, you could replace fibers with coroutines if all code could use purely co_await approaches?
<FunMiles>
Thus benefitting from faster switching?
<hkaiser>
FunMiles: nothing prevents you from using co_await based coroutines on top of our threading (fibers)
<hkaiser>
those things are orthogonal
<FunMiles>
More or less, it would require that I build another futures system, no? One of my conceptual gripes with C++20 coroutines is that the behavior is implicitly tied to the return type... what co_await does depends on the return type.
<hkaiser>
FunMiles: no
<hkaiser>
you could use light-weight awaitables as suggested by Gor and run them on any HPX thread
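A minimal sketch of such a light-weight awaitable (hypothetical task and ready_value types, plain C++20, nothing HPX-specific; a coroutine like this can be resumed from any HPX thread):

    // Sketch only: a Gor-style light-weight awaitable and a coroutine
    // return type just big enough to co_await it.
    #include <coroutine>
    #include <iostream>

    struct task
    {
        struct promise_type
        {
            task get_return_object() { return {}; }
            std::suspend_never initial_suspend() noexcept { return {}; }
            std::suspend_never final_suspend() noexcept { return {}; }
            void return_void() {}
            void unhandled_exception() {}
        };
    };

    struct ready_value  // trivial awaitable: never suspends
    {
        int value;
        bool await_ready() const noexcept { return true; }
        void await_suspend(std::coroutine_handle<>) const noexcept {}
        int await_resume() const noexcept { return value; }
    };

    task example()
    {
        int v = co_await ready_value{42};  // resumes immediately
        std::cout << v << '\n';
    }

    int main() { example(); return 0; }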
<FunMiles>
I'm going to have to explore that approach.
<hkaiser>
ok
<FunMiles>
what are the CMake exported targets? HPXLocal::hpx doesn't seem to exist
<FunMiles>
HPXLocal seems to be OK.
<FunMiles>
It does not seem to define the proper include directories.
<hkaiser>
HPX::hpx
<FunMiles>
target not found.
<hkaiser>
FunMiles: do you use hpx or hpx-local?
<hkaiser>
which repository?
<FunMiles>
hpx-local, since that's the one with the C++20 coroutine capable code.
<hkaiser>
ok
<hkaiser>
do you need the distributed functionalities of HPX?
<FunMiles>
Not at this stage, no.
<hkaiser>
ok
<hkaiser>
HPX::hpx_local is the target
<hkaiser>
I think ;-)
<FunMiles>
It is! :) Thanks
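A sketch of the consuming CMake, using the package and target names from this exchange (they were in flux at the time):

    find_package(HPXLocal REQUIRED)
    add_executable(my_app main.cpp)
    target_link_libraries(my_app PRIVATE HPX::hpx_local)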
<FunMiles>
Though I still have some issues.
<FunMiles>
I copied the code for transpose_await.cpp and some of the includes are found while some are not.
<FunMiles>
Some filenames have changed? It uses algorithm.hpp but it seems it's algorithms.hpp now with an 's'
<FunMiles>
There is no hpx/hpx.hpp ?
jehelset has quit [Ping timeout: 240 seconds]
<hkaiser>
FunMiles: you got caught in the middle of a large restructuring
<hkaiser>
could you give us some time to consolidate things?
<hkaiser>
for now, I'd suggest you use the hpx repository (not hpx-local)
<FunMiles>
hkaiser: OK. I will wait.
<hkaiser>
that should automatically pull in hpx-local
<FunMiles>
The regular hpx repository does not include the co_await support at the moment, right?
<hkaiser>
FunMiles: since it pulls hpx-local, it will
<hkaiser>
FunMiles: the story is that for various reasons there was a plan to split HPX into two repositories, hpx and hpx-local
<hkaiser>
for users of hpx nothing should change (the build system makes sure those are well integrated)
<hkaiser>
but for even more various reasons, this split (which happened in early December) might be reverted soon
<hkaiser>
that's what I meant by things being kind of in limbo; we need another 3-4 weeks to get back to normal