K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
nanmiao11 has joined #ste||ar
<hkaiser> tiagofg[m]: would you have a minimal full example?
<hkaiser> so you are loading a shared library that has an HPX component dynamically
<hkaiser> is HPX initialized from the executable or the shared library?
<hkaiser> tiagofg[m]: I think I understand why in such a scenario the above error would show
hkaiser has quit [Quit: bye]
bita has joined #ste||ar
nanmiao11 has quit [Remote host closed the connection]
akheir has quit [Quit: Leaving]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 256 seconds]
K-ballo1 is now known as K-ballo
bita has quit [Ping timeout: 240 seconds]
hkaiser has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
jbjnr_ has joined #ste||ar
jbjnr_ has quit [Ping timeout: 260 seconds]
jbjnr has joined #ste||ar
K-ballo has quit [Ping timeout: 246 seconds]
K-ballo has joined #ste||ar
<tiagofg[m]> hkaiser: HPX is initialized from an executable and I'm loading a shared library that has an HPX component dynamically, I guess that's it
<tiagofg[m]> hkaiser: here is the minimal example that fails at runtime
<K-ballo> mmh, globals in shared libraries
<hkaiser> tiagofg[m]: ok
<hkaiser> let me investigate this
<tiagofg[m]> hkaiser: thanks!
<hkaiser> tiagofg[m]: this is definitely a new use case, not sure it's currently supported
<K-ballo> outside the scope of the standard too
<hkaiser> K-ballo: well, there are more things in hpx outside the standard ;-)
<hkaiser> tiagofg[m]: please try adding --hpx:init=hpx.component_paths=<the_path_to_the_so> to the command line, see if this helps
<tiagofg[m]> hkaiser: ok
<hkaiser> tiagofg[m]: sorry, it's --hpx:ini=...
<tiagofg[m]> nope, the same error happens
<hkaiser> nod
<hkaiser> is the shared library in the same directory as the executable for you?
<hkaiser> tiagofg[m]: ^^ ?
<tiagofg[m]> no, I created a build directory and the executable is in there, but I can put everything in the same directory
<hkaiser> tiagofg[m]: also, do you have a HPX_REGISTER_COMPONENT_MODULE(); somewhere in a cpp file in the shared library?
<tiagofg[m]> I don't
<hkaiser> you need this for all shared libraries that expose a component
<tiagofg[m]> let me do that then
jbjnr has quit [Ping timeout: 264 seconds]
<hkaiser> you might have to use HPX_REGISTER_DYNAMIC_MODULE(); instead as the shared library is dynamically loaded, pls try both (one at a time)
<hkaiser> K-ballo: in my experience all compilers today handle globals in shared libraries in the same way - the 'global' in this case means global to the code in the shared library
<hkaiser> except if the data is dll-exported
<hkaiser> what's your experience with this?
<hkaiser> so if you're not careful you may run into ODR violations
<tiagofg[m]> hkaiser: I had set hpx.component_paths to the wrong path, it works with the correct path, without HPX_REGISTER_COMPONENT_MODULE() or HPX_REGISTER_DYNAMIC_MODULE()
<tiagofg[m]> problem solved for now!! thanks
<hkaiser> ahh, I think you should add that macro anyways (the DYNAMIC one)
<tiagofg[m]> hkaiser: right
<tiagofg[m]> thank you all for your time
<hkaiser> most welcome
<hkaiser> tiagofg[m]: as a hint - if you place the shared library in a ./hpx/ subdir to the executable you shouldn't need the command line option
<tiagofg[m]> hkaiser: ok very nice
<tiagofg[m]> hkaiser: so, what you meant was to create a directory named hpx inside the directory where the executable is, and put the .so inside ./hpx?
<hkaiser> yes
<tiagofg[m]> but if I run without --hpx:ini=... the error happens again, I think I'm doing it right
<hkaiser> ok
<hkaiser> even with the macro?
<tiagofg[m]> with HPX_REGISTER_COMPONENT_MODULE() yes, I tried with HPX_REGISTER_DYNAMIC_MODULE() but it fails to compile
<tiagofg[m]> let me see
<tiagofg[m]> you mean put HPX_REGISTER_DYNAMIC_MODULE() with a ';' after it?
<hkaiser> you might need the ';'
<tiagofg[m]> error: expected constructor, destructor, or type conversion before ‘;’
<tiagofg[m]> HPX_REGISTER_DYNAMIC_MODULE();
akheir has joined #ste||ar
jbjnr has joined #ste||ar
shahrzad has joined #ste||ar
nanmiao11 has joined #ste||ar
jbjnr_ has joined #ste||ar
jbjnr has quit [Ping timeout: 258 seconds]
<hkaiser> jbjnr_: yt?
<hkaiser> jbjnr_: here is an example of how to enable parcel coalescing for a specific action: https://github.com/STEllAR-GROUP/hpx/blob/master/tests/unit/parcelset/put_parcels_with_coalescing.cpp#L65
nanmiao11 has quit [Remote host closed the connection]
<diehlpk_work> hkaiser, Meeting?
bita has joined #ste||ar
weilewei has joined #ste||ar
nanmiao11 has joined #ste||ar
<weilewei> The DCA w/ HPX paper is available as a preprint: https://arxiv.org/abs/2010.07098 and retweets and likes are welcome: https://twitter.com/Lokwei9/status/1316734679232524288?s=20
<jbjnr_> hkaiser, thank
<jbjnr_> you
<ms[m]> hkaiser, jbjnr : see https://github.com/STEllAR-GROUP/hpx/blob/2380408df95b1c26a1f9f605f51603cd9a924c0e/libs/parallelism/lcos_local/tests/unit/local_dataflow_external_future.cpp for a minimal (broken) example of the problem with dataflow and special executors
<hkaiser> thanks ms[m]
<zao> weilewei: yay!
<weilewei> zao Thanks!!
<jbjnr_> Gosh - I didn't even get my name in the acknowledgments after all the work I did making DCA work with HPX.
<hkaiser> jbjnr_: uhh, that's unforgivable, I'm sure weilewei can still correct that
<jbjnr_> I'm crossing him off my christmas card list
<zao> No swiss chocolate for wei!
<weilewei> jbjnr_ ahh, apologies, I wasn't in charge of the acknowledgement section, and I will fix it now!
<hkaiser> so no chocolate for Ronnie ;-)
<weilewei> jbjnr_ just submitted a change request, should be in later tonight
<weilewei> huhhh, I still want chocolate, last year at SC, I got the best chocolate from CSCS booth, lol
<weilewei> Sad this year the SC will be virtual
Yorlik has quit [Read error: Connection reset by peer]
<hkaiser> ms[m]: yt?
nanmiao11 has quit [Remote host closed the connection]
jbjnr_ has quit [Quit: Leaving]
<ms[m]> hkaiser: here
<ms[m]> btw, I'm trying to reproduce the parallel fill hang, and the test passes just fine... very odd
<ms[m]> except that I think I know now what's wrong...
<hkaiser> what's wrong?
<ms[m]> it's --hpx:bind=none for running tests in parallel, and the block executor gets no threads to run on
<ms[m]> we should at least throw an error, but possibly just use the full machine mask instead of an empty mask when there's no binding
<hkaiser> ah
<ms[m]> I can try to do something about it...
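[editor's note] The two options ms[m] describes above (throw an error on an empty mask, or fall back to the full machine mask when there is no binding) might be sketched roughly as follows. This is only an illustration under stated assumptions: `pu_mask`, `max_pus`, `full_machine_mask`, `require_non_empty`, and `mask_or_full_machine` are hypothetical names made up for this note, and none of this is HPX's actual executor or topology code.

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a processing-unit affinity mask; not an HPX type.
constexpr std::size_t max_pus = 64;
using pu_mask = std::bitset<max_pus>;

// Build a mask with the lowest `num_pus` bits set, i.e. "the whole machine".
inline pu_mask full_machine_mask(std::size_t num_pus)
{
    if (num_pus > max_pus)
        throw std::invalid_argument("num_pus exceeds max_pus");

    pu_mask mask;
    for (std::size_t i = 0; i < num_pus; ++i)
        mask.set(i);
    return mask;
}

// Option 1: refuse to run on an empty mask instead of hanging silently.
inline pu_mask require_non_empty(pu_mask const& requested)
{
    if (requested.none())
        throw std::runtime_error("executor has no processing units to run on");
    return requested;
}

// Option 2: treat an empty mask (no binding) as "use the full machine".
inline pu_mask mask_or_full_machine(pu_mask const& requested, std::size_t num_pus)
{
    return requested.any() ? requested : full_machine_mask(num_pus);
}
```

With an empty mask and four processing units, `mask_or_full_machine` would yield a mask with bits 0 to 3 set, while `require_non_empty` would throw. Which behavior is preferable for the block executor is left open in the discussion.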
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
<hkaiser> ms[m]: I can change the test for now to use another executor as this is not relevant
khuck has joined #ste||ar
nanmiao11 has joined #ste||ar
<hkaiser> ms[m]: I changed it to use parallel_executor instead, should be fine now
<bita> hkaiser, I forgot to ask about the vtune today. Did you check it?
<hkaiser> ahh no, not yet
weilewei has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
<bita> :+1: