<simbergm>
hkaiser: yep, replying to it at the moment
<simbergm>
makes sense...
<simbergm>
two other options: serialization module holds just the basics for serialization and serialization_impls (needs a better name, but...) would hold the actual implementations
<simbergm>
or have x_serialization for each module x that has something that needs to be serialized
<simbergm>
we'll end up with lots more modules like that...
<zao>
Seems to build with GCC, ICC and MSVC.
<hkaiser>
simbergm: nod
<jbjnr>
thanks simbergm, zao, that's nice. It does the right thing with if(f) and marks it as unassigned. I'll use it
<simbergm>
but since serialization is independent of many things now keeping it the way you have it now might be okay
<simbergm>
hkaiser: ^
<hkaiser>
jbjnr: use hpx::util::function which has a reset function
<simbergm>
we can go with this for now and I'll deal with it later if it becomes a problem
<jbjnr>
thanks hkaiser
<hkaiser>
simbergm: what problems do you anticipate?
<simbergm>
hkaiser: no problem, just an unnecessary dependency
<hkaiser>
simbergm: we would have dependencies either way
<hkaiser>
simbergm: but yah, having a separate x_serialization module would solve this - I'm on the fence here
<simbergm>
well, anything that's local shouldn't need serialization
<hkaiser>
simbergm: is 'local' a compile-time property?
<simbergm>
but this is still much better than depending on all the rest of the distributed stuff (thank you!)
<simbergm>
potentially
<heller>
Even if you just have a local-only instance, you might still want to use serialization
<heller>
I'm not a fan of xxx_serialization modules
<simbergm>
for?
<hkaiser>
simbergm: saving local state, checkpointing
<heller>
I don't know, sending complex C++ data structures over the wire with, for example, MPI?
<heller>
Or any other networking library
<hkaiser>
I wanted to avoid having serialization depend on everything else
<heller>
serialization is a core module after all
<simbergm>
good points
<simbergm>
to be clear, I'm happy with this as it is, I just thought we could avoid that dependency from a quick look
<hkaiser>
simbergm: right - I thought about that but came up empty-handed
<hkaiser>
except by cheating
<hkaiser>
include the header explicitly and make the user add the dependency to serialization if needed
<simbergm>
"the header" = which one?
<heller>
That sounds scary
<hkaiser>
memory/serialization/intrusive_ptr.hpp
<hkaiser>
i.e. the code that actually depends on serialization
<simbergm>
oh... yeah, that sounds like nasty cheating
<heller>
Also, consider that the same would apply for the data structure module and functional
<hkaiser>
yes, that's on my list of things to do
<hkaiser>
but I wanted to have a decision on how to go ahead first
<jbjnr>
(if this conversation was in slack, you could embed snippets, links, etc. <sigh>)
<heller>
I don't like that at all, why not make serialization as central as datastructures and assert and friends
<hkaiser>
jbjnr: come on
<jbjnr>
because 99% of people won't use it on a node
<hkaiser>
heller: right, you would have to add serialization as a dependency for anything that requires serialization
<jbjnr>
an examples/serialization/mpi demo that encoded a set of vectors, sent them over the wire using MPI, and then decoded them would be quite a good selling point
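[Editor's note: a library-agnostic sketch of what such a demo boils down to. This deliberately does not use HPX's archive API; a flat byte buffer stands in for the archive, and the MPI_Send/MPI_Recv pair is elided. A real wire format would use fixed-width integer types rather than std::size_t.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Encode a set of vectors into a flat byte buffer: [count][size][data]...
// A real demo would hand this buffer to MPI_Send and decode after MPI_Recv.
std::vector<char> encode(std::vector<std::vector<double>> const& vs) {
    std::vector<char> buf;
    auto put = [&buf](void const* p, std::size_t n) {
        char const* c = static_cast<char const*>(p);
        buf.insert(buf.end(), c, c + n);
    };
    std::size_t count = vs.size();
    put(&count, sizeof(count));
    for (auto const& v : vs) {
        std::size_t size = v.size();
        put(&size, sizeof(size));
        put(v.data(), size * sizeof(double));
    }
    return buf;
}

std::vector<std::vector<double>> decode(std::vector<char> const& buf) {
    std::size_t pos = 0;
    auto get = [&](void* p, std::size_t n) {
        std::memcpy(p, buf.data() + pos, n);
        pos += n;
    };
    std::size_t count = 0;
    get(&count, sizeof(count));
    std::vector<std::vector<double>> vs(count);
    for (auto& v : vs) {
        std::size_t size = 0;
        get(&size, sizeof(size));
        v.resize(size);
        get(v.data(), size * sizeof(double));
    }
    return vs;
}
```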
<heller>
The alternative is to push that burden onto users, which isn't appealing either
<hkaiser>
heller: right
<heller>
And as said in the comment, it should be very lightweight
<hkaiser>
right, it is
<heller>
If it's not, then we need to fix that issue
<heller>
As in, don't pay for what you don't use.
<heller>
Building that module should take negligible time, and just including it should have only a minimal impact on compile times
<rori>
hey ! are the compression plugins still in use ?
<heller>
I'm not aware, why do you ask?
<rori>
to know whether I should spend time fixing the build ^^ but if you don't know, I will
<hkaiser>
rori: I think we should keep'em
<hkaiser>
simbergm, heller: so do we agree to leave things as proposed in the PR?
<hkaiser>
jbjnr: such an example is trivial
<simbergm>
hkaiser: yep, I'm happy with leaving it the way it is
<hkaiser>
simbergm: ok, thanks
<rori>
?
<jbjnr>
"such an example is trivial" - famous last words. I expect Boris Johnson used the same phrase when planning his brexit strategy.
<hkaiser>
lol
<heller>
hkaiser: ok
<hkaiser>
(jbjnr: he never intended to pull through with the brexit anyways ;-) )
<hkaiser>
jbjnr: but seriously: 2 lines serialization and 2 for deserialization plus serialization support for your types
<jbjnr>
he did and I'm glad it's only 2 lines.
<hkaiser>
ok
<jbjnr>
(he doesn't care at all about the country, only making himself look like superman and saving the world)
jbjnr_ has joined #ste||ar
<jaafar>
hkaiser: do you have some time to restart our conversation re: launch policies?
<jaafar>
I'll just stick my questions here and we can operate... asynchronously... haha
<jaafar>
"launch::sync should be synchronous, except for remote operations, where its equivalent to async().get()"
<jaafar>
I see scan_partitioner.hpp using it this way:
<jaafar>
which, it seems to me, is unlikely to block
<jaafar>
or if it does the algorithm works much differently than I thought :)
<jbjnr_>
launch::sync is used for a case like do_this.then(do_that) - normally, do_that is spawned as a new task and gets queued like all other tasks
<jbjnr_>
but with do_this.then(launch::sync, do_that), do_that is called directly on termination of do_this
<jaafar>
jbjnr_: can you explain how it works in the context of "dataflow"?
<jbjnr_>
it's like a future that's not really a future
<jbjnr_>
dataflow, the same
<jbjnr_>
dataflow is really a version of when_all(this1, this2).then(do_that)
<jaafar>
so I'm correct to say that dataflow(hpx::launch::sync, ...) is *itself* non-blocking
<jbjnr_>
so if you use sync on dataflow, then either this1 or this2 will call do_that
<jbjnr_>
(depending on which finishes second in this example)
<jaafar>
they will call, and not do it via supplying the promise value?
<jaafar>
so they are both in the same thread?
<jbjnr_>
yes, it just chains two tasks into one, but it still returns a future to the end of the second one, so it is nonblocking
<jaafar>
OK, I think I understand. dataflow() is itself non-blocking; the launch policy supplied simply tells what to do when the inputs are available
<jaafar>
not what happens right now
<jbjnr_>
this1 runs as one task in a thread, this2 runs as a task in a thread, 'that' runs in the same thread as the last one to finish
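[Editor's note: the behaviour jbjnr_ describes can be modelled without HPX. The toy below is not HPX's implementation; it only illustrates the point that with sync semantics the continuation is invoked inline by whichever producer supplies the last pending input, with no trip through a scheduler queue.]

```cpp
#include <cassert>
#include <functional>
#include <string>

// Toy model of dataflow(launch::sync, this1, this2).then-style chaining:
// the continuation fires directly inside input_ready() of the last input.
struct toy_dataflow {
    int pending;                   // inputs not yet ready
    std::function<void()> cont;    // the 'do_that' continuation
    void input_ready() {
        if (--pending == 0)
            cont();                // runs inline, on the caller's thread
    }
};

// demo() records the event order: 'that' runs immediately after the
// second input arrives, invoked by the same code path that supplied it.
std::string demo() {
    std::string log;
    toy_dataflow df{2, [&log] { log += "that;"; }};
    log += "this1;"; df.input_ready();   // first input: continuation waits
    log += "this2;"; df.input_ready();   // last input: continuation fires here
    return log;
}
```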
<jaafar>
OK great
<jbjnr_>
hold on ...
<jaafar>
and "async" would mark the task ready to go, but not actually switch to it
<jbjnr_>
correct
<jaafar>
I should say continuation
<jaafar>
What does "fork" do?
<jbjnr_>
fork stops the current task right now, then switches directly to the new one
<jaafar>
how is that different from "sync"?
<jbjnr_>
then resumes the old one afterwards
<jbjnr_>
sync runs one task when another one ends (but on the same thread)
<jbjnr_>
fork doesn't wait till one task ends, it interrupts the current task and switches to the new one
<jaafar>
when does that happen? at the call to dataflow(hpx::launch::fork, ...) or when the inputs are available?
<jbjnr_>
fork is probably meaningless in the context of a continuation
<jbjnr_>
dataflow(fork, ...) is probably meaningless!
<jaafar>
OK
<jbjnr_>
but async(fork, stuff)
<jbjnr_>
would be like
<jbjnr_>
this_thread.suspend, stuff.run_now
<jaafar>
seems like you could just call stuff()
<jbjnr_>
then resume this thread when stuff finishes
<jaafar>
why not just do the work directly?
<jbjnr_>
although technically this thread would go onto the queue, so it might not be resumed right away
<jaafar>
ah so here we are using "thread" in a special HPX way right?
<jaafar>
this is not std/boost::thread
<zao>
Sounds like a useful concept, but I'll save you my bikeshed on the name :D
<jbjnr_>
usually, you would do the work directly, but there might be a case where you've broken your application into "tasks" and might want to drop everything and fork
<jbjnr_>
I've never used it
<jaafar>
OK!
<jaafar>
Last question
<jbjnr_>
just calling the function directly would make more sense as you point out
<jaafar>
I think async policy can accept a priority argument
<jbjnr_>
yes, via an executor param
<jaafar>
How is that used?
<jbjnr_>
high, normal, low
<jbjnr_>
the scheduler maintains 3 queues
<jbjnr_>
and high Priority (HP) gets taken before normal or low
<zao>
Would there be any benefits in stack height?
<jbjnr_>
to use it, grep for thread_priority_critical
<jbjnr_>
in the code and look at an example
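[Editor's note: a toy model of the three-queue policy jbjnr_ describes, not HPX's actual scheduler. It only shows the draining rule: high-priority tasks are always taken before normal, and normal before low.]

```cpp
#include <cassert>
#include <deque>
#include <string>

// Three queues, one per priority; next() always prefers the highest
// non-empty queue, mirroring the high/normal/low policy described above.
enum priority { high = 0, normal = 1, low = 2 };

struct toy_scheduler {
    std::deque<std::string> queues[3];

    void add(priority p, std::string task) {
        queues[p].push_back(std::move(task));
    }

    // Pop the next task to run, or "" if all queues are empty.
    std::string next() {
        for (auto& q : queues) {
            if (!q.empty()) {
                std::string t = q.front();
                q.pop_front();
                return t;
            }
        }
        return "";
    }
};
```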
<jbjnr_>
zao:benefits where? when forking or sync?
<jaafar>
I found I could do finalitems.push_back(dataflow(hpx::launch::async(threads::thread_priority::thread_priority_low), ...))
<zao>
Forking.
<jaafar>
and get different results
<jbjnr_>
yes, fork creates a new stack frame, but calling a function uses the existing one - good thinking
<jaafar>
a new stack frame or a new stack?
<jbjnr_>
different results?
<jaafar>
performance results
<jbjnr_>
new stack means new stack frame in this context
<jbjnr_>
new memory with reassigned stack pointer to point to it
<jaafar>
I feel like calling a function generally creates a new stack frame :)
<jbjnr_>
true
<jbjnr_>
I confuse easily
<jaafar>
so the comment that "calling a function uses the existing one" confused me
<jaafar>
so does fork create a new stack?
<jbjnr_>
yes
<jaafar>
OK gotcha
<jaafar>
I guess I don't need to know about fork then
<jbjnr_>
no
<jaafar>
thanks for the explanation
<jbjnr_>
it's useless really
<jaafar>
so I could use thread priorities to manipulate the order my async tasks were scheduled in?
<jbjnr_>
someone must have a reason for it, as it was added to the standard proposal
<jaafar>
I was looking for that
<jbjnr_>
priority is your friend - we use a high_priority executor for all communications with MPI, for example, and also for tasks that generate many children and must be done first when they go into queues
<jbjnr_>
otherwise queues drain, then a parent task goes in and generates children, but the queues are temporarily empty
<jaafar>
This is very helpful. I think "sync" will be very useful to me. Some of the work needs to be done in the same thread if possible.
<jaafar>
and the priorities too
<jbjnr_>
sync is useful when you have a short task that must be run after another finishes, I use it to trigger other stuff like dataflow(sync, blah, blah, trigger_something)
<jbjnr_>
you know that the trigger will happen as soon as the tasks complete and it won't be created as a new 'trigger task' that goes to the back of the queue and waits for ages before happening
<jbjnr_>
but don't chain too many sync calls together, otherwise, as zao points out, your stack will be used up (I think) - sync doesn't create a new stack AFAIK
<jbjnr_>
and putting many sync calls one after the other prevents any work stealing from happening
<jbjnr_>
as you have just created very long functions really!
<jbjnr_>
gtg
<jbjnr_>
bbiab
<jaafar>
Yeah my big thing here is keeping the cache warm
<jaafar>
so I'd like the second phase of an algorithm to start on the same thread as soon as possible
<simbergm>
async is child stealing, fork is parent/continuation stealing
<jaafar>
simbergm: I understood from jbjnr_'s explanation that "sync" was continuation stealing
<simbergm>
jaafar: async(sync, f) is equivalent to a direct function call
<simbergm>
fork is continuation stealing because f is executed immediately on this thread but the continuation (what comes after async(fork, f)) can be stolen
<jaafar>
simbergm: I am trying to understand this in the context of dataflow(sync, ...) vs dataflow(fork, ...)
<jaafar>
the way jbjnr_ described it sounds like dataflow(sync, ...) is "continuation stealing" as the Wikipedia article described
<jaafar>
in the sense that the remaining arguments to dataflow() are executed immediately by whichever thread supplied the last required data
<jaafar>
without an intervening reschedule etc.
<jaafar>
do I have that right?
<jaafar>
(I do understand that async(sync, ...) is blocking, just trying to understand about dataflow)