nikunj97 has quit [Read error: Connection reset by peer]
weilewei has quit [Remote host closed the connection]
bita__ has joined #ste||ar
bita_ has quit [Ping timeout: 260 seconds]
weilewei has joined #ste||ar
hkaiser has quit [Quit: bye]
bita__ has quit [Ping timeout: 260 seconds]
weilewei has quit [Remote host closed the connection]
<zao>
I've got some std-using code that has a mandatory heavy initialization step that I'm off-threading when constructing a thing, so that it might be ready when code needs it.
<zao>
Right now I've got a mutex that the worker acquires when initializing the data and which any later callers need to acquire to use the data, also passing in a void future as a barrier to ensure that the worker has acquired the mutex.
<zao>
Would it be more efficient if I instead used a future<void> to "guard" access to the data?
<zao>
So the init code would fill the future when the data is filled, and consumers would just have to get() it before using the data.
<zao>
Said initialization is prepopulating a hash map that is immutable after this initialization step, so consumers don't need mutual exclusion with each other.
<zao>
I don't really know what the costs involved with a future are.
karame_ has quit [Remote host closed the connection]
<zao>
(I should of course use HPX, but I'm not at that spot quite yet)
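(For reference, the setup described above, a worker that takes the mutex, a void future used only as a barrier so the lock is held before anyone can see the object, and readers that lock the same mutex, might look roughly like the sketch below; names such as Thing and build_map are made up for illustration.)

```cpp
// Sketch only: one possible shape of the mutex + void-future-barrier design.
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

struct Thing
{
    std::mutex mtx;
    std::unordered_map<std::string, int> data;   // filled once, then read-only
    std::thread worker;

    Thing()
    {
        std::promise<void> locked;
        std::future<void> barrier = locked.get_future();
        worker = std::thread([this, p = std::move(locked)]() mutable {
            std::lock_guard<std::mutex> lk(mtx);
            p.set_value();           // signal: the worker now holds the mutex
            data = build_map();      // heavy initialization (~16 s in the real code)
        });
        barrier.get();               // don't return before the lock is held
    }

    int lookup(std::string const& key)
    {
        std::lock_guard<std::mutex> lk(mtx);   // blocks until initialization is done
        return data.at(key);
    }

    ~Thing() { worker.join(); }

    // stand-in for the real up-front traversal
    static std::unordered_map<std::string, int> build_map() { return {{"a", 1}}; }
};
```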
<heller1>
zao: you can think of a future as something like this: condition variable + mutex with dynamic memory allocation and atomic reference counting
<heller1>
FWIW, you would need a shared_future<void> to guard your initialization, which allows you to call `get` multiple times
<zao>
Ah.
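(A rough sketch of that alternative, again with hypothetical names: the worker fulfils a promise once the map is built, readers get() a shared_future<void> before touching the now-immutable map, and no mutex of our own is needed.)

```cpp
// Sketch only: guarding the immutable map with a shared_future<void>.
#include <future>
#include <string>
#include <thread>
#include <unordered_map>

struct Thing
{
    std::unordered_map<std::string, int> data;   // immutable once published
    std::shared_future<void> ready;
    std::thread worker;

    Thing()
    {
        std::promise<void> done;
        ready = done.get_future().share();
        worker = std::thread([this, p = std::move(done)]() mutable {
            data = build_map();      // heavy initialization
            p.set_value();           // publish: readers may proceed
        });
    }

    int lookup(std::string const& key)
    {
        ready.get();                 // can be called any number of times
        return data.at(key);         // read-only, no locking between readers
    }

    ~Thing() { worker.join(); }

    static std::unordered_map<std::string, int> build_map() { return {{"a", 1}}; }
};
```

(Strictly speaking, each reader thread should hold its own copy of the shared_future to stay within the standard's thread-safety guarantees; all copies refer to the same shared state.)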
<heller1>
the recurring calls to get have an overhead of one indirection + mutex lock/unlock
<heller1>
in that ballpark, roughly
<zao>
Maybe I could make the fast path better by polling an atomic before going into the mutex path or something?
<zao>
Again, no clue about relative costs here.
<heller1>
yes, that could be done
<heller1>
of course also depends on the implementation of the future ... in HPX, the fastpath of get is just an atomic read
<heller1>
but absolutely depends on the usage pattern of your object(s)
<zao>
The concrete application here is a virtual filesystem, in which I need to traverse the whole thing up-front to generate a complete lookup-table from child object to parent object. This takes something like 16 seconds, so I off-thread it.
<zao>
It's used whenever I need to obtain the full path to an object, so not always used and typically for display purposes currently.
<zao>
It's worked quite nicely in the Rust implementation of this codebase, but I'm porting it to C++ for experience.
<heller1>
;)
<heller1>
how did you solve it in rust?
<heller1>
if it is just display purposes, I guess the atomic + shared_future<void> thing is a nice way to go
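(That combination could look like the sketch below, layered on the previous one, with hypothetical names again: the worker sets an atomic flag with release semantics after filling the map, and readers only fall back to the future's mutex/condition-variable slow path while the acquire load still reads false.)

```cpp
// Sketch only: atomic fast path in front of the shared_future guard.
#include <atomic>
#include <future>
#include <string>
#include <thread>
#include <unordered_map>

struct Thing
{
    std::atomic<bool> initialized{false};
    std::unordered_map<std::string, int> data;
    std::shared_future<void> ready;
    std::thread worker;

    Thing()
    {
        std::promise<void> done;
        ready = done.get_future().share();
        worker = std::thread([this, p = std::move(done)]() mutable {
            data = build_map();                                   // heavy init
            initialized.store(true, std::memory_order_release);   // fast-path flag
            p.set_value();                                        // wake blocked readers
        });
    }

    int lookup(std::string const& key)
    {
        if (!initialized.load(std::memory_order_acquire))   // fast path: one atomic load
            ready.get();                                    // slow path: future's mutex/cv
        return data.at(key);
    }

    ~Thing() { worker.join(); }

    static std::unordered_map<std::string, int> build_map() { return {{"a", 1}}; }
};
```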
<mdiers[m]>
I need a connection to python from an hpx application, especially to tensorflow. It is currently for a research project. Should I implement the interface with pybind11, or should I use Phylanx right away? Is Phylanx ready for production environments? I have seen that the first release is out now.
<heller1>
mdiers_: doesn't tensorflow have C++ bindings as well?
<heller1>
mdiers_: phylanx doesn't give you a connection from C++ to python. It gives you a python library which uses HPX. For connecting C++ to Python, I think pybind is the way to go
<heller1>
pybind11
<heller1>
(phylanx uses it too)
<heller1>
on that note, I don't know about the production readiness of phylanx
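(For a sense of scale, the kind of binding being suggested is small; a minimal, hypothetical pybind11 module, not mdiers's actual interface, might be no more than the following. Built as a Python extension it can be imported next to the existing tensorflow code.)

```cpp
// Sketch only: a thin pybind11 bridge into a C++/HPX application.
// Module and function names are made up for illustration.
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include <vector>

// hypothetical C++ entry point that the Python/tensorflow side would call
std::vector<double> run_simulation(std::vector<double> const& input)
{
    std::vector<double> out(input.size());
    for (std::size_t i = 0; i != input.size(); ++i)
        out[i] = 2.0 * input[i];    // stand-in for the real computation
    return out;
}

PYBIND11_MODULE(hpx_bridge, m)
{
    m.doc() = "thin pybind11 bridge into the C++/HPX application";
    m.def("run_simulation", &run_simulation, "run the C++ side on the given data");
}
```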
<heller1>
zao: that SharedLock is pretty neat. How do you deal with the situation where you see the hash map not being initialized?
<zao>
I block enough during construction so that the writer lock is always held before any readers get to see the object.
<heller1>
icky
<zao>
Yeah, a bit hacky :)
<mdiers[m]>
<sithhell[m] "mdiers_: doesn't tensorflow have"> yes, but the current tensorflow part of the project is implemented in python and it will stay that way until it is finished. after that there will be a port.
<heller1>
that's the kind of code which I have to debug nowadays which runs into all kinds of races and deadlocks because it was written 10 years ago with exactly those implicit assumptions
<heller1>
mdiers_: in that case, I would probably write a quick binding in pybind
<zao>
Don't tell anyone, but I actually had a bug where I didn't do that synchronization up-front and readers could sneak in if the init thread was delayed somehow :)
<heller1>
:P
<mdiers[m]>
<sithhell[m] "mdiers_: in that case, I would p"> thanks for the quick help
<heller1>
mdiers_: unless you want to give phylanx a try though
<mdiers[m]>
I haven't had time to create a minimal example of this yet. maybe you have an idea?
<zao>
Ho ho... tried to vcpkg install HPX... Additional packages (*) will be modified to complete this operation. Starting package 1/102: boost-vcpkg-helpers:x64-windows
<heller1>
zao: good luck
<heller1>
mdiers_: interesting. Doesn't ring a bell
<mdiers[m]>
<sithhell[m] "mdiers_: interesting. Doesn't ri"> ok, thanks. it's also an atypical context: an hpx application integrated via a shared library, loaded at runtime, only one function is called without hpx, and then it is unloaded again.
<heller1>
oh, interesting usecase...
<heller1>
if you have a stacktrace, I can have a look...
<heller1>
but it looks like this is related to our own plugin loading mechanism
<heller1>
mdiers_: does this also happen when you remove all files from lib/hpx/ in your build/install directory?
<zao>
Bleh, can't say -DHPX_WITH_CXX20=ON yet on MSVC it seems. Not quite sure what explodes yet, seems to be Boost.
<hkaiser>
ms[m]: I wanted to talk about sequencing the merges
<hkaiser>
what's your plan?
<ms[m]>
morning
<ms[m]>
no plan, but agree that planning it would be a good idea
<ms[m]>
if nothing else oldest first...
<hkaiser>
should we go ahead with the cmake formatting? if yes we should either wait until all planned modules are in or do it asap as each module creates conflicts there
<ms[m]>
I was thinking that it might be easier to wait with that one until it's quieter, since it's easy enough to reapply the cmake formatting, but if we merge it right away it'll be quite painless as well
<hkaiser>
either way is fine for me, just would like to avoid having to resolve conflicts each time something is merged
<ms[m]>
right now there aren't any massive cmake changes in other prs
<ms[m]>
yeah, understand completely
<hkaiser>
ok, I'll wait then - pls let me know when you think it's a good time
<ms[m]>
let's go and merge the cmake formatting then because that is always going to have conflicts
<ms[m]>
:P
<hkaiser>
ok, I need to resolve conflicts first, then ;-)
<ms[m]>
I don't mind, let's do it now if it's conflict free
<ms[m]>
ok, I won't merge anything before that one is in then
<hkaiser>
ok, thanks - will work on it today
<hkaiser>
other thing
<hkaiser>
I noticed you have manually edited libs/CMakeLists.txt
<hkaiser>
isn't that one generated by the module creation script?
<ms[m]>
mmh, true
<ms[m]>
yes
<ms[m]>
I figured we're at a point where we don't necessarily need to generate that, but then I should delete the script
<hkaiser>
hmm
<ms[m]>
all it does now is add a module in the correct place in a list
<hkaiser>
and it generates all the boilerplate files
<ms[m]>
and usually we forget to edit the script if we edit the actual cmakelists.txt
<ms[m]>
ah, I exaggerated, delete the part that generates libs/CMakeLists.txt
<ms[m]>
the rest is very useful
<hkaiser>
right
<ms[m]>
it does mean one has to remember to add the module, but that should be caught pretty easily
<ms[m]>
or we add a check for that as well
<hkaiser>
mdiers[m]: we could externalize the list of modules from the main CMakeLists file
<ms[m]>
hmm?
<hkaiser>
or we mark the modules that are meant to be distributed explicitly so we can collect that information
<hkaiser>
we will end up with several different configurations anyways - I think automating that might be a good idea
<hkaiser>
sorry mdiers[m], wrong auto-completion
<ms[m]>
true, I can add the checks inside the modules instead
<ms[m]>
we have the logic in place to exclude modules already
<hkaiser>
right
<ms[m]>
good idea, I'll do that
<hkaiser>
ms[m]: ok - but after the cmake formatting was merged ;-)
<ms[m]>
for the different configurations we'll need something more, but let's decide on that when it's relevant...
<ms[m]>
yes ;)
<hkaiser>
ms[m]: ok
<hkaiser>
thanks
<ms[m]>
thank you!
Yorlik has quit [Read error: Connection reset by peer]
Yorlik has joined #ste||ar
<hkaiser>
ms[m]: I have pushed the cmake-format with conflicts resolved
<ms[m]>
hkaiser: thanks! and no worries about conflicting with the other pr based on it, I was kind of expecting it ;)
<hkaiser>
ms[m]: thanks
akheir has joined #ste||ar
bita__ has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
kordejong has left #ste||ar ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]
nikunj has joined #ste||ar
nikunj97 has joined #ste||ar
Guest52957 has left #ste||ar ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]
Nikunj__ has quit [Ping timeout: 246 seconds]
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 256 seconds]
karame_ has joined #ste||ar
<gonidelis>
Why should there be a my_hpx_build directory under the main hpx directory? Could someone help me clarify the differences/usage?
<ms[m]>
btw, merge whenever it's clean, I suspect it's going to be later tonight...
weilewei has joined #ste||ar
<ms[m]>
gonidelis: there's no need for it to be under your main hpx directory, it's just a convention
<ms[m]>
the only requirement is that the build directory isn't the same as your source directory
<ms[m]>
makes it easier to wipe a build without wiping all the source files as well
<gonidelis>
Alright, and what is the difference in terms of purpose? I mean, are the changes to be made in the build or the src directory?
<hkaiser>
ms[m]: grrr
<ms[m]>
gonidelis: you make changes to the source directory
<ms[m]>
builds are derived from the source
<ms[m]>
I feel like we had this discussion once earlier... :) maybe it was with someone else
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 240 seconds]
rtohid has joined #ste||ar
pfluegdk[m] has left #ste||ar ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]
gonidelis has quit [Ping timeout: 245 seconds]
gonidelis has joined #ste||ar
<gonidelis>
ms[m] thnx! ahh now I get it, so the main purpose of build_dir is to be able to make a fresh install over and over again without having to download the source all the time?
<hkaiser>
gonidelis: right
<gonidelis>
perfect... thanks a lot!
<hkaiser>
gonidelis: only the generated files (binaries) end up in the build dir, the sources (cpp/hpp) stay in the original directory
<gonidelis>
yeah yeah... crystal clear
<weilewei>
Does hpx have functionality similar to MPI_Pack()? I am looking for a way to pack multiple arrays (from multiple threads) into a single buffer and send it to the next rank
<hkaiser>
weilewei: use HPX serialization ;-)
<weilewei>
hkaiser hpx component?
<hkaiser>
bita__: I might be a couple minutes late today (again)
<bita__>
no worries
<hkaiser>
weilewei: was merely kidding ;-)
<weilewei>
hkaiser ok...
<hkaiser>
weilewei: how large are the arrays you're trying to combine?
<hkaiser>
MPI_pack and friends will copy the data, I don't think that's what you want
<weilewei>
each array might be 30-100 MB at this point and each rank might have 7 of them
<weilewei>
I am not sure if MPI_pack is thread-aware? For example, if inside each thread, they all call MPI_pack
<hkaiser>
weilewei: I'd rather send those large arrays separately, the data copying involved in combining them would kill you
<weilewei>
hkaiser ok... I am thinking about that as well; the DCA++ mathematician is suggesting "It may depend on the quality of implementation whether the MPI library is internally just copying or packing, or can actually use the network hardware to transfer non-contiguous memory regions." If the situation is the latter, it might be an ideal case
<hkaiser>
weilewei: mpi_pack will copy things, I'm almost certain - but who knows
<hkaiser>
look at the mpi_pack api, there is no way it can't get away without copying
<weilewei>
hkaiser ok, then that's very bad
<weilewei>
hkaiser well, it does ask for outbuff pointer
<hkaiser>
I meant there is no way it _can_ get away with not copying
<weilewei>
hkaiser see PM, please
<hkaiser>
bita__: would 1.15 still be ok for you?
<jbjnr>
weilewei: I'd second hkaiser - do not use mpi_pack for large data. if the number of arrays is known, then just send them one at a time. Ideally, an RMA copy API like the one I'm working on for another project would be best, but MPI doesn't make that easy.
<weilewei>
jbjnr got it, it seems we might end up allocating a large array, G2_mem, of size N*N*num_G2 where N is col/row of G2, and then send G2_mem out
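(A sketch of the "send them one at a time" suggestion, with hypothetical names: one non-blocking send per array instead of MPI_Pack-ing them into a single buffer, which would copy every byte once more before it even reaches the network layer.)

```cpp
// Sketch only: send each array separately with non-blocking sends.
#include <mpi.h>
#include <vector>

void send_arrays(std::vector<std::vector<double>> const& arrays, int dest, MPI_Comm comm)
{
    std::vector<MPI_Request> requests(arrays.size());
    for (std::size_t i = 0; i != arrays.size(); ++i)
    {
        // tag = i so the receiver can match the individual arrays
        MPI_Isend(arrays[i].data(), static_cast<int>(arrays[i].size()), MPI_DOUBLE,
            dest, static_cast<int>(i), comm, &requests[i]);
    }
    // wait until all sends have completed before reusing/freeing the buffers
    MPI_Waitall(static_cast<int>(requests.size()), requests.data(), MPI_STATUSES_IGNORE);
}
```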
<ms[m]>
hkaiser: woop
<ms[m]>
merge it! before ci finds other problems ;)
<hkaiser>
ms[m]: done ;-)
<ms[m]>
thanks!
<hkaiser>
ms[m]: how do I link a shared library with hpx_init/hpx_wrap nowadays?
<ms[m]>
hkaiser: bleh, you don't... I might need to rethink this
<ms[m]>
do you have main or hpx_main in the shared library?
akheir has quit [Remote host closed the connection]
<hkaiser>
nope
<hkaiser>
I'm calling hpx::start
<ms[m]>
then just link to HPX::hpx? does it not work?
<hkaiser>
ms[m]: let me check, I might not need to link with hpx_init
<ms[m]>
if it doesn't open an issue and I'll have a look tomorrow
<hkaiser>
thanks!
<ms[m]>
I'm not 100% happy with the targets yet...
<hkaiser>
it complains about hpx::detail::init_winsocket being undefined
<hkaiser>
I guess we could move that into core HPX
<ms[m]>
hmm...
<ms[m]>
maybe we can actually link hpx_init to everything
<ms[m]>
it's probably what we did before
<ms[m]>
it's just hpx_wrap that's special
<hkaiser>
before we only linked executables, I think
<ms[m]>
I thought so too... but then you at least had the option of manually linking to hpx_init
<ms[m]>
(just thinking if this is a regression or not)
<hkaiser>
ms[m]: I had cases in the past where I had main() in a shared library
<ms[m]>
in principle you still have the option, I just hid it in a namespace because I thought one wouldn't need to link it to shared libraries
<hkaiser>
could be different now
<hkaiser>
well, let's cross that bridge when we're there
<ms[m]>
do you think that might've been with `hpx_main.hpp` (i.e. the macro trickery)?
<hkaiser>
we would have to move the winsocket initialization into the core library, though
<ms[m]>
you seem to be crossing the bridge now ;)
<hkaiser>
ms[m]: yah, could have been hpx_main
<hkaiser>
Phylanx doesn't really have main() in a shared library
<hkaiser>
Phylanx loads a Python extension module that initializes HPX
<hkaiser>
main() is in the Python interpreter
<hkaiser>
so it might not need hpx_init after all
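(The pattern being discussed, a shared library such as a Python extension module that starts and stops the HPX runtime itself so that main() can live in the interpreter, is sketched below with hypothetical function names. Header names and the exact hpx::start overloads vary between HPX versions, so treat this as a shape rather than a drop-in.)

```cpp
// Sketch only: starting/stopping HPX from inside a shared library.
// Exact headers/overloads depend on the HPX version in use.
#include <hpx/hpx.hpp>
#include <hpx/hpx_start.hpp>

extern "C" void bridge_init(int argc, char** argv)
{
    // start the runtime without an hpx_main entry point and return immediately
    hpx::start(nullptr, argc, argv);
}

extern "C" void bridge_shutdown()
{
    hpx::apply([]() { hpx::finalize(); });   // ask the runtime to wind down
    hpx::stop();                             // block until it has fully stopped
}
```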
<ms[m]>
I mean you can try linking the shared library to HPXInternal::hpx_init (I think that's what it's called) just to see if that actually works
<ms[m]>
hmm, right
<ms[m]>
something expects init_winsocket to be there though...
<hkaiser>
hpx::start
<hkaiser>
which is in core anyways
<hkaiser>
hpx_init has only the various main() overloads
<ms[m]>
yeah
<ms[m]>
and you're supplying the entrypoint manually?
<hkaiser>
ms[m]: I think we can safely move that into core (and I can certainly do that)
<hkaiser>
it's a windows hack after all anyways
<ms[m]>
ok, let's start with that then
<ms[m]>
thanks :)
<hkaiser>
thank you!
<ms[m]>
right, I'm off to bed... let me know tomorrow if it actually worked :)
<hkaiser>
ok, thanks
<weilewei>
shall I write my own vector_matrix class? In DCA, it has its own reshapable_matrix and vector (technically an array), but now I am looking for a container that can hold a series of reshapable matrices