hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
Nikunj__ has quit [Read error: Connection reset by peer]
hkaiser has quit [Quit: bye]
<mdiers[m]> My application has a performance drop of 30% between 11 Mar and 17 Apr. Can you find anything in the HPX tests over that period? Maybe it came in during the revision of the executors? (block_executor<local_priority_queue_attached_executor>, static_chunk_size)
<ms[m]> mdiers: very possible, but I can't see anything that stands out particularly (our performance tests aren't very comprehensive either)
<ms[m]> one question: are you explicitly using block_executor<local_priority_queue_attached_executor>, or are you leaving the executor as the default (the default is not local_priority_...)?
<mdiers[m]> ms: thanks for the hint. I will have a look now; I actually use the default. At some point I had a problem with the policy and had explicitly specified the default. I will adapt it.
<mdiers[m]> ms: on 11 Mar, the local_priority_queue_attached_executor was the default of the block_executor.
<ms[m]> mdiers: right, it changed in the executors cleanup pr
<ms[m]> so it's possible that I made the old default slower, but the new default hopefully faster...
<ms[m]> in any case, if you can try the new default (restricted_thread_pool_executor) that'd be great
<mdiers[m]> ms: I'm already on it
<ms[m]> thanks!
<mdiers[m]> ms: I test on a single-NUMA system. On the quad-NUMA system the new version now scales, but I cannot make a direct performance comparison.
<mdiers[m]> (re: "in any case, if you can try the ...") 1% difference between block_executor<local_priority_queue_attached_executor> and hpx::compute::host::block_executor<hpx::parallel::execution::restricted_thread_pool_executor>
<mdiers[m]> the same with a direct restricted_thread_pool_executor
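For readers following along, a minimal sketch of the comparison being discussed, assuming an HPX build from around this time; the headers, the numa_domains() setup, and the doubling lambda are illustrative assumptions, not mdiers' actual application:

```cpp
// Hypothetical sketch: switching the inner executor of a block_executor.
// Header names and namespaces may differ between HPX versions.
#include <hpx/hpx_main.hpp>
#include <hpx/include/compute.hpp>
#include <hpx/include/parallel_executors.hpp>
#include <hpx/include/parallel_for_each.hpp>

#include <vector>

int main()
{
    std::vector<double> data(1'000'000, 1.0);

    // One target per NUMA domain; block_executor distributes blocks of
    // iterations across these targets.
    auto numa_domains = hpx::compute::host::numa_domains();

    // Old default inner executor (pre executors-cleanup PR) would have been
    // local_priority_queue_attached_executor; the new default is
    // restricted_thread_pool_executor.
    hpx::compute::host::block_executor<
        hpx::parallel::execution::restricted_thread_pool_executor>
        exec(numa_domains);

    hpx::parallel::for_each(
        hpx::parallel::execution::par.on(exec).with(
            hpx::parallel::execution::static_chunk_size()),
        data.begin(), data.end(), [](double& x) { x *= 2.0; });

    return 0;
}
```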
hkaiser has joined #ste||ar
Nikunj__ has joined #ste||ar
Hashmi has joined #ste||ar
<ms[m]> mdiers: same bad performance or same good?
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
<ms[m]> hkaiser: no sleep, eh? :/
<hkaiser> ms[m]: yah :/
<ms[m]> we need more boring issues for you to take care of ("type: boring" or "type: makes hartmut fall asleep")
<hkaiser> lol
<mdiers[m]> ms: ahh, I'm going crazy: I'm rowing back; it's probably only my special case where I use the GPUs with OpenCL.
Nikunj__ has quit [Read error: Connection reset by peer]
<mdiers[m]> ms: so, I have now run the tests again cleanly, one after another. With or without OpenCL, I now get a performance loss of 20%.
<mdiers[m]> * My application has a performance drop of 20% between 11 Mar and 17 Apr. Can you find anything in the HPX tests over that period? Maybe it came in during the revision of the executors? (block_executor<local_priority_queue_attached_executor>, static_chunk_size)
Hashmi has quit [Quit: Connection closed for inactivity]
Nikunj__ has joined #ste||ar
<mdiers[m]> ms: will continue tomorrow morning
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 244 seconds]
nikunj has joined #ste||ar
nikunj97 has quit [Ping timeout: 246 seconds]
nikunj97 has joined #ste||ar
nikunj has quit [Ping timeout: 240 seconds]
akheir has joined #ste||ar
nan11 has joined #ste||ar
weilewei has joined #ste||ar
rtohid has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
<weilewei> hkaiser does the HPX MPI future feature need MPI when compiling HPX? If I do not provide an MPI module in my HPX build script, then it says "MPI could not be found but was requested by your configuration, please specify MPI_ROOT to point to the root of your MPI installation"
nikunj has joined #ste||ar
<weilewei> I set -DHPX_WITH_NETWORKING=OFF -DHPX_WITH_PARCELPORT_MPI=OFF
<hkaiser> weilewei: yes, it needs mpi
<weilewei> If I provide the MPI module in the script, then HPX will be built with networking on; however, I don't want it that way, because when I run my application it warns that MPI is started twice
<weilewei> hkaiser but in an earlier version, the HPX MPI future did not require MPI when building HPX, as far as I remember
<hkaiser> weilewei: it always required mpi, iirc
<weilewei> hkaiser then how should I avoid the error that mpi is started twice? because the main
<hkaiser> weilewei: no, if you disable networking, then networking will be off
<hkaiser> you see an error? you didn't say so
<weilewei> hkaiser right, I see the error "mpi is started twice" even when I disabled networking in hpx
<hkaiser> what error do you see? can I see the full output, pls?
<hkaiser> grrr
<hkaiser> weilewei: is that HPX master?
<weilewei> hkaiser I am using your branch fixing_4539
<hkaiser> weilewei: also, did you solve the hpx_wrap issue?
<hkaiser> weilewei: ok, I'll have a look later today
<weilewei> hkaiser yes, the compilation error goes away now
<weilewei> hkaiser thanks
<hkaiser> could you comment on the ticket, pls?
<weilewei> ok, will do
<hkaiser> weilewei: thanks
<hkaiser> I have closed the ticket now
<weilewei> hkaiser thanks.
nikunj97 has joined #ste||ar
nikunj has quit [Ping timeout: 244 seconds]
shahrzad has joined #ste||ar
<hkaiser> weilewei: yt?
<weilewei> hkaiser yes
<hkaiser> I can't reproduce the mpi_init issue with mpi_ring_async_executor_test, can you?
<weilewei> hkaiser let me try to run that test
bita has joined #ste||ar
bita_ has joined #ste||ar
shahrzad has quit [Ping timeout: 240 seconds]
<weilewei> hkaiser Am I missing any flags to build mpi_ring_async_executor_test? HPX_WITH_TESTS=ON, HPX_WITH_TESTS_UNIT=ON
<hkaiser> HPX_MPI_WITH_FUTURE=On?
<weilewei> yes, I do have this one
<hkaiser> it should be built (it builds for me)?
<hkaiser> HPX_MPI_WITH_TESTS=On (however this should be the default)
karame_ has joined #ste||ar
<weilewei> hkaiser let me clean up my build dir and try it again
shahrzad has joined #ste||ar
akheir has quit [*.net *.split]
K-ballo has quit [Remote host closed the connection]
K-ballo has joined #ste||ar
<shahrzad> hkaiser: Can I have a 15 min meeting with you today or tomorrow? I'm kinda stuck.
<hkaiser> sec
<hkaiser> shahrzad: how about today 12:30pm?
<shahrzad> hkaiser: It's great, thanks!
<hkaiser> shahrzad ok, same link as for our regular meeting
<shahrzad> hkaiser: Alright.
<hkaiser> weilewei: pls update from the branch, I fixed this 10 minutes ago
<weilewei> hkaiser ok, let me try it again
<weilewei> Another question is about the threaded ring G algorithm. I suspect MPI_Isend/recv might not be aware of threads locally. I am thinking of achieving the following scenario: in the multithreaded ring G, each local thread sends data to the corresponding thread in its right-hand neighbor, such that we have implicitly constructed multiple MPI communicators. For
<weilewei> example, we have two ranks, and each rank has 2 threads. Thread 0 from rank 0 issues MPI_Isend with a send_tag (thread_id+1 = 1) to thread 0 from rank 1, which has the matching recv_tag (thread_id+1 = 1). However, this threaded ring G algorithm just breaks. See the sample program:
<hkaiser> how does it 'break'?
<weilewei> either hangs, or some errors like this: https://gist.github.com/weilewei/e34c0ba562cb913a3eee94a413d986bc
<weilewei> or Cuda failure /__SMPI_build_dir_______________________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/ibvdevice/CudaIPCPool.h:205: 'invalid resource handle'
<weilewei> from my experience, it seems the send and recv ends are not matched together
shahrzad has quit [Ping timeout: 244 seconds]
bita_ has quit [Quit: Leaving]
karame_ has quit [Remote host closed the connection]
karame_ has joined #ste||ar
<hkaiser> weilewei: is your MPI implementation thread-safe? is it initialized using MPI_Init_thread?
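For reference, a minimal sketch of what hkaiser is asking about: initializing MPI via MPI_Init_thread and requesting MPI_THREAD_MULTIPLE so that several threads can issue MPI_Isend/MPI_Irecv concurrently. This is an illustrative assumption about the ring test's setup, not weilewei's actual code:

```cpp
// Hedged sketch: thread-safe MPI initialization for a per-thread ring.
#include <mpi.h>

#include <cstdio>

int main(int argc, char* argv[])
{
    int provided = 0;

    // Request full multi-threaded support instead of plain MPI_Init.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE)
    {
        std::fprintf(stderr,
            "MPI implementation does not support MPI_THREAD_MULTIPLE\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    // ... per-thread MPI_Isend/MPI_Irecv, e.g. tag = thread_id + 1 so that
    // thread i on this rank matches thread i on the right-hand neighbor ...

    MPI_Finalize();
    return 0;
}
```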
shahrzad has joined #ste||ar
<Yorlik> Can we maybe get rid of this annoying warning? include\hpx\threading\jthread.hpp(258): warning C4267: 'return': conversion from 'size_t' to 'unsigned int', possible loss of data
<Yorlik> Shall I make an issue for it?
<weilewei> hkaiser let me try again
<K-ballo> Yorlik: please, then share the link
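For context, a generic illustration of the narrowing MSVC flags as C4267 and the usual explicit-cast fix; the actual code at jthread.hpp(258) may look different, so the function name here is made up:

```cpp
// Hypothetical example of the C4267 pattern (size_t returned as unsigned int).
#include <cstddef>
#include <thread>

unsigned int worker_count()
{
    std::size_t n = std::thread::hardware_concurrency();
    // return n;                          // warning C4267 on 64-bit MSVC
    return static_cast<unsigned int>(n);  // explicit cast documents the narrowing
}
```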
<weilewei> hkaiser after adding MPI_Init_thread, it works just fine, thanks!
shahrzad has quit [Ping timeout: 264 seconds]
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
nan11 has quit [Remote host closed the connection]
nikunj97 has quit [Read error: Connection reset by peer]
nan11 has joined #ste||ar
rtohid has quit [Remote host closed the connection]
rtohid has joined #ste||ar
<hkaiser> jbjnr: pls see #4575 for a possible solution to your compilation issue
<hkaiser> nan11: ok, what am I looking for?
<nan11> 1. Is the input tiled vector correct? 2. Is the output for the different tiling types correct?
<hkaiser> nan11: why is the result 4 columns wide if the input has only 3 columns? (test_diag_1d_0)
<hkaiser> ahh, you're asking for the first sub-diagonal... - makes sense
<nan11> yep
<nan11> k=1
<hkaiser> looks fine to me
<nan11> Okay. Thanks
<diehlpk_work> hkaiser, see pm
<hkaiser> diehlpk_work: saw that - I'm working on a proposal right now - not much time for anything else...
<diehlpk_work> hkaiser, ok, but you will attend the meeting tomorrow?
<diehlpk_work> weilewei, Do you know how long Summit will be available before the new machine is up?
<weilewei> diehlpk_work until 2022
<weilewei> but the HPX hours for the DCA project on Summit will end 12/30/2020
<diehlpk_work> Ok, got that. It is more about what to do with these node hours
<weilewei> then we need to submit a renewal. But anyhow, my impression is that an evaluation of the HPX hours will take place at the end of this year
<weilewei> diehlpk_work ok
<diehlpk_work> Since the INCITE proposal next year might be for the new machine
<hkaiser> diehlpk_work: I'll try
rtohid has left #ste||ar [#ste||ar]
shahrzad has joined #ste||ar
shahrzad has quit [Ping timeout: 244 seconds]
shahrzad has joined #ste||ar
shahrzad has quit [Ping timeout: 240 seconds]
jaafar_ has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar