hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik__ has joined #ste||ar
Yorlik_ has quit [Ping timeout: 265 seconds]
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
<ms[m]1>
hkaiser: is https://github.com/STEllAR-GROUP/hpx/pull/6251 about the destruction of the `barrier` happening concurrently with other threads calling `arrive`/`arrive_and_wait`? if yes, isn't that the concern of the owner of the `barrier` to keep it alive e.g. with a `shared_ptr` until all threads have arrived? seems to me like the same problem as in https://github.com/STEllAR-GROUP/hpx/pull/3749 (with `condition_variable`)
<hkaiser>
ms[m]1: you don't know when the last thread has exited the barrier
<hkaiser>
you could keep the barrier alive with a shared_ptr, yes - but that means that you would have to wrap all uses of a barrier in a shared_ptr
<ms[m]1>
not all uses, you may have cases where you know from other synchronization that the barrier has been arrived at
<hkaiser>
well, that sounds very convoluted to me
<ms[m]1>
but independently of the fix, is that the problem the PR is solving?
<hkaiser>
yes
<hkaiser>
and yes, it's the same as for CVs
<ms[m]1>
ok, thanks, makes sense
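A minimal sketch of the `shared_ptr` workaround mentioned above (illustrative only, not the fix in #6251; assumes HPX's `std::barrier`-conforming `hpx::barrier` and the umbrella `hpx/hpx.hpp` header):
```cpp
#include <hpx/hpx.hpp>
#include <cstddef>
#include <memory>
#include <vector>

void run_with_shared_barrier(std::size_t num_threads)
{
    // each task captures a copy of the shared_ptr, so the barrier stays
    // alive until the last task has destroyed its closure
    auto b = std::make_shared<hpx::barrier<>>(num_threads);

    std::vector<hpx::future<void>> tasks;
    for (std::size_t i = 0; i != num_threads; ++i)
    {
        tasks.push_back(hpx::async([b] { b->arrive_and_wait(); }));
    }
    hpx::wait_all(tasks);
}   // the barrier is destroyed only after the last copy of b goes away
```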
<hkaiser>
ms[m]1: bors is going out of commission; did you do anything about this for pika already?
<ms[m]1>
iiuc they won't take down the public bors instance until merge queue has feature parity
<ms[m]1>
but I've tried merge queue and it's fine as well; the only thing I missed was that there's no equivalent to "bors try", otherwise the transition is straightforward
<ms[m]1>
instead of having the required checks in bors.toml it uses the required checks in the repo settings
<hkaiser>
ok
<hkaiser>
do you have an example on what needs to be done there?
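For reference, a minimal sketch of the CI-side change (assuming a GitHub Actions workflow; the actual pika setup may differ):
```yaml
# with merge queue, the workflow must also trigger on the merge_group
# event so the queued merge commit gets tested; required checks then
# live in the repo's branch protection settings instead of bors.toml
on:
  pull_request:
  merge_group:
```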
<dkaratza[m]>
hkaiser: I read online that in openmp the completion of a task does not necessarily synchronize with the point where the `#pragma omp task` block ends. The only way to synchronize with the task is if you explicitly use `taskwait` or `taskgroup` or something like that.
<dkaratza[m]>
So, we could say that the equivalent of `#pragma omp task` in hpx is either `hpx::async` or even `hpx::post` (since by default openmp does not synchronize)? or would this be misleading?
<hkaiser>
hpx::post doesn't synchronize in any way on its own, so it's similar, I think
<hkaiser>
mdiers[m]: essentially calling the outer function repeatedly from inside the continuation
<dkaratza[m]>
ok, I will include both and will add a note to clarify that if users want any kind of synchronization they should go with async and not post
<hkaiser>
dkaratza[m]: yes
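A rough sketch of the correspondence being discussed (function names and task bodies are placeholders; `hpx::async` and `hpx::post` as in current HPX):
```cpp
#include <hpx/hpx.hpp>

void openmp_task_equivalents()
{
    // roughly: #pragma omp task ... followed by #pragma omp taskwait
    hpx::future<void> f = hpx::async([] { /* task body */ });
    f.get();    // explicit synchronization with task completion

    // roughly: #pragma omp task with no later taskwait -- fire-and-forget,
    // there is nothing to synchronize on afterwards
    hpx::post([] { /* task body */ });
}
```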
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
hkaiser_ has joined #ste||ar
hkaiser has quit [Ping timeout: 240 seconds]
<dkaratza[m]>
hkaiser_: I see openmp has multiple cases where they use #pragma omp parallel without always having a for loop, so I think I should include an example about this as well
<dkaratza[m]>
hkaiser_: without the pragmas, right?
<hkaiser_>
yes
<hkaiser_>
I left them for you to see the correspondence of code
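One possible way to mimic a loop-free `#pragma omp parallel` region in HPX (a sketch, assuming one task per worker thread is an acceptable stand-in for OpenMP's thread team):
```cpp
#include <hpx/hpx.hpp>
#include <cstddef>
#include <vector>

void parallel_region()
{
    std::size_t const n = hpx::get_num_worker_threads();

    std::vector<hpx::future<void>> region;
    region.reserve(n);
    for (std::size_t i = 0; i != n; ++i)
    {
        // body of the parallel region, executed once per "thread"
        region.push_back(hpx::async([] { /* region body */ }));
    }
    hpx::wait_all(region);    // implicit barrier ending the region
}
```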
<gdaiss[m]>
hkaiser_: Could you have another look at the DGX Jenkins PR? Al finished setting up the node with the fabric manager and the tests now pass locally for me - the only things left to do are to merge master into it (to get the fixes from the other Jenkins PR) and to change the CUDA architecture. Hopefully the pipeline succeeds after that and this PR can be merged as well :-)