hkaiser changed the topic of #ste||ar to: The topic is 'STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
adi_ahj has joined #ste||ar
adi_ahj has quit [Client Quit]
adi_ahj has joined #ste||ar
adi_ahj has quit [Client Quit]
diehlpk has joined #ste||ar
<diehlpk> hkaiser, Can you take care of the HPX and performance counter section for the SC20 paper?
<hkaiser> diehlpk: sure
<diehlpk> I added the template and the astro guys will add the scientific story by Thursday
<diehlpk> So we can read them before we meet on Monday
<hkaiser> cool
<hkaiser> thanks for pushing this
<diehlpk> I will finish the book proposal by Friday so you can edit it next week
<diehlpk> hkaiser, is the release branch the candidate for HPX 1.4?
<diehlpk> I will run a fedora build before we release
<hkaiser> yes, it's the release branch
<diehlpk> Ok, let me start a fedora build
adi_ahj has joined #ste||ar
hkaiser has quit [Quit: bye]
<diehlpk> Once these two builds are green, I am fine to build HPX 1.4 on the Fedora build system
adi_ahj has quit [Quit: adi_ahj]
<diehlpk> jbjnr_, Do you think you can get libfabric working with hpx 1.4 or any version later by the end of next week?
<diehlpk> We intend to start the first runs for the SC20 paper the week after
<diehlpk> and will you be able to contribute to the SC20 paper?
diehlpk has quit [Ping timeout: 260 seconds]
nikunj has joined #ste||ar
nikunj97 has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
<jbjnr_> diehlpk_work: I can try to get it working, but I have a lot on my plate at the moment so I won't make any promises
mdiers_ has joined #ste||ar
<mdiers_> Hello, short simple question: I need to bind an async task to a specific worker thread (without it switching between cores). Is there a short example?
<simbergm> mdiers_: hpx::threads::executors::default_executor exec(hpx::threads::thread_schedule_hint(thread_number)); hpx::async(exec, mytask);
<simbergm> that's the closest you can get to at the moment
<simbergm> with the default scheduler the task is not guaranteed to stay on the core (if it yields or suspends)
<simbergm> with `--hpx:queuing=static` or `--hpx:queuing=static-priority` it will stay, but no stealing will happen at all
<simbergm> jbjnr_ has a new scheduler that has some support for properly binding tasks to threads but it's still work in progress
<simbergm> would be good if you try it out though
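A minimal sketch of the pattern simbergm describes above, assuming the HPX 1.3/1.4-era names hpx::threads::executors::default_executor, hpx::threads::thread_schedule_hint, and hpx::get_worker_thread_num (header paths vary between versions, so the umbrella headers are used):
```cpp
// Sketch only: run a task with a worker-thread hint. With the default
// scheduler the hint is advisory, so the task may still be stolen.
#include <hpx/hpx_main.hpp>
#include <hpx/hpx.hpp>

#include <iostream>

int main()
{
    // Ask the scheduler to place the task on worker thread 1.
    hpx::threads::executors::default_executor exec(
        hpx::threads::thread_schedule_hint(1));

    auto f = hpx::async(exec, []() {
        // Prints the worker thread the task actually ran on.
        std::cout << hpx::get_worker_thread_num() << std::endl;
    });
    f.get();

    return 0;
}
```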
rori has joined #ste||ar
<mdiers_> simbergm: thanks, the first approach with `default_executor exec(thread_schedule_hint(1u)); while (true) { async(exec, [](){ cout << get_worker_thread_num() << endl; }); }` results in different thread ids
nikunj97 has quit [Read error: Connection reset by peer]
<simbergm> mdiers_: yeah, especially if that's the only work you're doing the tasks are definitely going to be stolen
<simbergm> that's why it's a "hint"...
<mdiers_> simbergm: ;-) as far as I understand it now. Otherwise I could do it via user-defined thread pools, like in resource_partitioner/tests/unit/named_pool_executor.cpp?
Dr_Q has joined #ste||ar
<Dr_Q> Hi, I would like to know how I could join the Stellar Group?
<Dr_Q> What are the prerequisites?
<simbergm> mdiers_: if you need guarantees that tasks will stay on a core the scheduler itself has to support it (like jbjnr_'s scheduler; it's `--hpx:queuing=shared-priority` I think together with a hint `thread_priority_bound`)
<simbergm> an executor isn't enough to guarantee that behaviour, it can only pass things like hints on to the actual scheduler
<simbergm> if you only need one thread you can create a custom thread pool with a single thread which also makes sure the tasks stay on that core
<simbergm> tasks are not stolen across different thread pools
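For the dedicated single-thread pool simbergm mentions, a rough sketch loosely following resource_partitioner/tests/unit/named_pool_executor.cpp; the pool name "bound-pool" is made up, and the exact resource partitioner and pool_executor calls are assumptions that differ between HPX versions:
```cpp
// Sketch only: a custom pool with a single PU. Tasks submitted to its
// executor stay on that pool, since work is not stolen across pools.
#include <hpx/hpx_init.hpp>
#include <hpx/hpx.hpp>

#include <iostream>

int hpx_main(int argc, char* argv[])
{
    // Executor bound to the named pool created in main() below.
    hpx::threads::executors::pool_executor exec("bound-pool");

    auto f = hpx::async(exec, []() {
        std::cout << hpx::get_worker_thread_num() << std::endl;
    });
    f.get();

    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // Carve one processing unit out into its own pool before the runtime starts.
    hpx::resource::partitioner rp(argc, argv);
    rp.create_thread_pool("bound-pool");
    rp.add_resource(rp.numa_domains()[0].cores()[0].pus()[0], "bound-pool");

    return hpx::init(argc, argv);
}
```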
<simbergm> Dr_Q: there's no formal way of joining the stellar group, it's more just a collection of people working on hpx, phylanx, and related projects
<simbergm> (the stellar group is actually two things: one is a research group at LSU, the other is the informal collection of people mentioned above...)
<simbergm> the best way is to start contributing to e.g. hpx ;)
<simbergm> Dr_Q: anything particular you're interested in?
Dr_Q is now known as jmgomez
<jmgomez> I am working on low-latency software, but I also need to focus on high performance and parallel programming. So I just wanted to contribute my experience and learn from you all.
jmgomez has left #ste||ar [#ste||ar]
<jbjnr_> mdiers_: please paste your example into a gist and I will modify it to use the bound tasks prototype with the new scheduler
<simbergm> hkaiser: yt?
weilewei has quit [Remote host closed the connection]
<mdiers_> jbjnr_: Thanks, I'll just be a moment, I just have to do a few other things.
hkaiser has joined #ste||ar
adi_ahj has joined #ste||ar
adi_ahj has quit [Quit: adi_ahj]
adi_ahj has joined #ste||ar
<jbjnr_> hkaiser: yt?
<hkaiser> here
<jbjnr_> hkaiser: dataflow doesn't pick up annotated tasks https://gist.github.com/biddisco/a99ba3613962b6cc1858cedbdfd80ff9
<jbjnr_> or rather apex doesn't display them - when they are inside dataflow
nikunj has joined #ste||ar
<jbjnr_> it is working for async
<hkaiser> ok
<jbjnr_> but not dataflow.
<hkaiser> does async work with your executor?
<jbjnr_> I will have to go through the code again, but if you have any quick tips, please share
<jbjnr_> async is fine and .then is fine
<hkaiser> ok
<jbjnr_> actually, I'm not certain that .then is ok
<jbjnr_> I'll check
<hkaiser> jbjnr_: dataflow creates the thread here:
<jbjnr_> looks like .then is also broken
<jbjnr_> balls.
<hkaiser> .then() dispatches through executor::post as well
<jbjnr_> thanks. I' will investigate
<hkaiser> jbjnr_: I can have a look, if you want - would need a small self-contained example, though
nikunj has quit [Ping timeout: 265 seconds]
<diehlpk_work> simbergm, The Fedora builds for HPX 1.4 are fine
hkaiser has quit [Quit: bye]
<jbjnr_> hkaiser: it looks like dataflow wraps the callable in so many other callables that the annotation type is no longer seen when it gets to the post level
<K-ballo> "so many"
<jbjnr_> well, 15 levels of call stack, maybe only a couple of actual invokes
nikunj has joined #ste||ar
<K-ballo> I like that it sounds like a matter of quantity, as if N levels of wrapping were ok but N+1 is too many
<jbjnr_> unfortunately, it only takes one level of wrapping for the type of the inner function to be lost
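For context, a minimal sketch of the annotation pattern being debugged, assuming hpx::util::annotated_function; the real reproducer (including the custom executor, which this omits) is in the gist linked above:
```cpp
// Sketch only: the same kind of annotated callable launched via async and
// via dataflow. Per the discussion, APEX shows the name for the async case,
// while dataflow's extra layers of wrapping hide it from executor::post.
#include <hpx/hpx_main.hpp>
#include <hpx/hpx.hpp>

#include <utility>

int main()
{
    auto work = hpx::util::annotated_function(
        [](int x) { return x + 1; }, "annotated_work");

    // Expected to show up in APEX as "annotated_work".
    hpx::future<int> f1 = hpx::async(work, 41);

    // dataflow hands the predecessor future itself to the continuation.
    auto cont = hpx::util::annotated_function(
        [](hpx::future<int> f) { return f.get() + 1; }, "annotated_cont");

    // Reportedly shows up unnamed: the annotation is lost in the wrapping.
    hpx::future<int> f2 = hpx::dataflow(cont, std::move(f1));

    (void) f2.get();
    return 0;
}
```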
<diehlpk_work> simbergm, I just checked and the failed builds on s390x are due to https://pagure.io/fedora-infrastructure/issue/8522
ritvik99 has joined #ste||ar
<simbergm> diehlpk_work: thanks, very good to hear!
ritvik99 has quit [Remote host closed the connection]
ritvik99 has joined #ste||ar
adi_ahj has quit [Quit: adi_ahj]
<mdiers_> jbjnr_: here is the small example: https://gist.github.com/m-diers/c557801344f5652e12a8850c51a23b21
hkaiser has joined #ste||ar
<diehlpk_work> nikunj, yt?
<nikunj> diehlpk_work, here
<nikunj> did you get the ssh access?
<diehlpk_work> Not yet
<nikunj> diehlpk_work, btw did you send them the email?
<diehlpk_work> Yes, I sent them the document and he told me he will forward it to Japan so they can prepare the testbed
<nikunj> great! so we should have the access soon. Did you tell Karame about it?
<diehlpk_work> Yes
ritvik99 has quit [Ping timeout: 265 seconds]
<nikunj> diehlpk_work, ok
<nikunj> meanwhile, do you have something that I should do?
<diehlpk_work> No, we have to wait for them
<nikunj> ok
nikunj has quit [Ping timeout: 268 seconds]
<heller> diehlpk_work: what's the idea for the SC paper?
<diehlpk_work> Follow up on parsa's SC workshop paper, do long-term runs, and show:
<diehlpk_work> 1) Show that AGAS is not expensive for large simulations
<diehlpk_work> 2) Show performance of libfabric by using hpx performance counters
<diehlpk_work> 3) APEX measurements
weilewei has joined #ste||ar
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
RostamLog has joined #ste||ar
<diehlpk_work> jbjnr_, Do you know the next deadline for the Piz Daint allocation proposals?
<diehlpk_work> simbergm, ?
<jbjnr_> every 6 months, so April 2020 I'd say.
RostamLog has joined #ste||ar
<diehlpk_work> The SC workshop paper is published
<heller> There's also a supermuc call
<heller> diehlpk_work: how are you planning to show that AGAS has no significant overhead? The important question here is: overhead compared to what?
<heller> diehlpk_work: and a negative point for not being open access ;)
<diehlpk_work> parsa, Can you send heller your sc19 workshop paper
rori has quit [Quit: bye]
<diehlpk_work> All my papers are available as preprints
nikunj has joined #ste||ar
<hkaiser> send it to me as well, I would like to add it to our publications page
<hkaiser> parsa: ^^
<hkaiser> heller: why isn't execution_agent::do_resume calling set_thread_state() instead of implementing it itself?
<heller> hkaiser: to avoid coroutine_self
adi_ahj has quit [Quit: adi_ahj]
<hkaiser> set_thread_state doesn't use coroutine_self
<heller> I thought it did...
<hkaiser> would you mind if I changed that?
<heller> Right, only the timed one
<heller> I'm impartial there. But IIRC the implementation I wrote is a bit shorter and clearer, at least that was my intention
adi_ahj has joined #ste||ar
<hkaiser> heller: yes
<hkaiser> the rescheduling if the thread is active is cleaner, I would adapt that
<hkaiser> the rest is essentially copy&paste
<heller> Ok, fair enough
<hkaiser> heller: also, I might have to make coroutines depend on the basic_execution module
<hkaiser> I hope this is ok
<hkaiser> I need access to the base agent from the coroutine
<heller> Hmm, that's not good
<hkaiser> right, I don't like that
<hkaiser> but if agent is the new self then we need that
<heller> If you need to access the base agent from the coroutine, why do you need to have the base agent depend on the coroutine then?
<heller> Why?
<hkaiser> no the other way around
<hkaiser> coroutine needs to depend on agent
<heller> Wouldn't the coroutine module need to depend on the agent?
<heller> So the other way around?
<heller> Which is as intended
adi_ahj has quit [Ping timeout: 265 seconds]
<hkaiser> yes
<hkaiser> ok
adi_ahj has joined #ste||ar
RostamLog has joined #ste||ar
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
adi_ahj has quit [Quit: adi_ahj]
nikunj has quit [Quit: Leaving]
weilewei has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar