hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
<bita_> hkaiser, are you still here?
<hkaiser> bita_: here now
<bita_> cout(slice([[1,2,3],[4,5,6]], nil, 1)) results in [[4, 5, 6]]
<bita_> cout(slice_column([[1,2,3],[4,5,6]], 1)) results in [2, 5]
<bita_> if instead of nil we write list(0,2) it works
<hkaiser> what about cout(slice([[1,2,3],[4,5,6]], 1, nil))
<bita_> cout(slice([[1,2,3],[4,5,6]], 1, nil)) is [[4, 5, 6]]
<hkaiser> the same as above hmmm
<bita_> while cout(slice_row([[1,2,3],[4,5,6]], 1)) is [4, 5, 6]
<hkaiser> something's fishy
<bita_> I mean even in slice_row the dimension is not the same
<hkaiser> bita_: if you look at https://github.com/STEllAR-GROUP/phylanx/blob/master/src/plugins/matrixops/slicing_operation.cpp#L200-L219, I'd expect slice_column to be representable using slice() somehow
<bita_> yeah, I understand
<hkaiser> could you step through the code for both versions and see what's different?
<bita_> of course, but I should go now
<bita_> I will work on it tonight
<hkaiser> nah, do it tomorrow
<bita_> see you :)
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 246 seconds]
nikunj97 has joined #ste||ar
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 272 seconds]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 264 seconds]
nikunj has joined #ste||ar
Nikunj__ has joined #ste||ar
nikunj has quit [Ping timeout: 256 seconds]
nikunj97 has quit [Ping timeout: 260 seconds]
hkaiser has quit [Quit: bye]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 246 seconds]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 258 seconds]
nikunj has joined #ste||ar
bita_ has quit [Quit: Leaving]
nikunj has quit [Ping timeout: 260 seconds]
nikunj has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
Nikunj__ has quit [Read error: Connection reset by peer]
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
<heller1> no
<heller1> sometimes, cmake gets upset though
<rori> yes actually the include_compatibility path is added whether it's existing or not..
<rori> I'll create a PR
<zao> Heh, local nordics seem to favor Kokkos :D
<zao> How bad is it? :P
<heller1> zao: it's OK, there is an HPX backend for it ;)
<zao> eep
<heller1> zao: more seriously: It's a nice library, it is more or less OpenMP done right for C++
<zao> Nice.
<jbjnr> rori: sorry I disappeared into email for a few minutes - do I just need to remove the include_compatibility from the cuda cmake? I didn't look at it yet, but if it's something as simple as that, I can do it locally and test it
<rori> yeah noticed that actually while just configuring without any changes
<rori> but which change did you do ? cause it's actually working without any, maybe your cmake cache was messed up
<jbjnr> I'm building DCA that uses HPX as an external lib. HPX itself builds fine
<jbjnr> (I have not made any changes yet)
<rori> but hpx shouldn't have compute_cuda/include_compatibility as an interface directory in the first place
<rori> when I grep for compute_cuda/include_compatibility in the build directory there is none
<rori> So you probably want to clean your cmake
<jbjnr> it gets written to /home/biddisco/build/hpx-apex/lib/cmake/HPX/HPXInternalTargets.cmake in my build tree
<ms[m]> jbjnr: is this on one of your own branches? you might have a `COMPATIBILITY_HEADERS ON` in your `libs/compute_cuda/CMakeLists.txt` without an actual `include_compatibility` directory
<ms[m]> or do you have that directory that it complains about?
<jbjnr> aha. let me disable that then
<jbjnr> I enabled them all last night
<jbjnr> if they don't exist, then that flag should probably not exist either
<ms[m]> yep...
<jbjnr> ok that clears it. Thanks. I've got some others that I must remove as well now. I had a memory last night of you telling me that cmake for some reason sets the default wrong, so I went through all of them enabling them! waste of time - grrrr.
<ms[m]> yes, but not the way you probably thought I meant it
<ms[m]> it means it'll default them to off if you already have a cache, but that means you should enable them through cmake or ccmake, not by enabling them in our cmakelists
<rori> we should add an `if(exists` in the cmake no ?
<ms[m]> yeah, that might work
<jbjnr> rori: I'll try that and do a PR for you :)
<rori> 👍️
<rori> jbjnr: sorry I totally misread your first message, looks like I needed my coffee.. ^^
<ms[m]> wait, exists is for files and directories? I'd rather have a hard error if one enables compatibility headers without an actual include_compatibility directory...
<ms[m]> rori: I thought you were talking about some check for making sure the cmake cache variables get initialized in a sane way
<rori> So don't know about the check but only add the compatibility directory if it exists and if not set the hpx_module_compatibility_headers to off with a warning
<rori> would work
<rori> because a hard error seems harsh for this no ?
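A minimal CMake sketch of the check rori proposes here: only use the compatibility directory if it exists, otherwise warn and switch the option off. The variable `HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS` is the one that appears in this log; the example module name, `compat_dir`, and the commented-out `target_include_directories` call are assumptions for illustration, not HPX's actual module code.

```cmake
# Sketch only, not HPX's real add_hpx_module implementation.
set(name_upper "COMPUTE_CUDA") # example module name, for illustration
set(compat_dir "${CMAKE_CURRENT_SOURCE_DIR}/include_compatibility")

if(HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS)
  if(IS_DIRECTORY "${compat_dir}")
    # The directory exists, so it is safe to expose it.
    # Hypothetical target name:
    # target_include_directories(hpx_compute_cuda PUBLIC $<BUILD_INTERFACE:${compat_dir}>)
    message(STATUS "Using compatibility headers from ${compat_dir}")
  else()
    # The option is ON (for example from a stale cache) but there is nothing to add.
    message(WARNING
      "HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS is ON but "
      "${compat_dir} does not exist; turning the option OFF")
    set(HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS OFF
        CACHE BOOL "Enable compatibility headers" FORCE)
  endif()
endif()
```

The hard error ms[m] prefers would amount to replacing the `else()` branch with a single `message(FATAL_ERROR ...)` call.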
<jbjnr> all my pain is gone
<jbjnr> adding the exists clears all the errors
<jbjnr> ```
<jbjnr> if(HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS)
<jbjnr> ...(truncated)
<rori> yeah maybe just add a warning that compat headers should be set to off as the directory doesn't exist
<rori> otherwise the user might be confused by the cmake cache variable on compat headers set to on
<jbjnr> the var shouldn't exist at all, if there are no compat headers
<jbjnr> I'll find it and remove it, then problem will be fixed properly
<ms[m]> hmm, if anything cmake should just tell the user the variable is unused if they for some reason would set `HPX_WITH_BLAH_COMPATIBILITY_HEADERS=ON` for a module that doesn't have compatibility headers
<ms[m]> it might already do that
<ms[m]> but possibly not...
<rori> I think it doesn't cause we check it with an if
<ms[m]> we might not be able to get that behaviour
<ms[m]> I wouldn't add any special handling for it
<jbjnr> I don't see why the variable should exist at all for a module with no compat headers. better to just never create it
<jbjnr> it serves no purpose
<ms[m]> we don't
<jbjnr> we don't what?
<rori> yes we do only if the option is specified in hpx_add_module
<ms[m]> we don't create the variables unless the option is passed to add module
<ms[m]> :P
<ms[m]> glass half full or empty...
<jbjnr> I will look again and fix it
<ms[m]> jbjnr: in your case you had added `COMPATIBILITY_HEADERS ON` to `add_hpx_module`, correct?
<jbjnr> clearly it exists - I have at least 4 libs with compat header errors and if they are not needed, then I'll get rid of them
<jbjnr> I did not change the cmakelists, I only enabled flags in `ccmake` -
<ms[m]> hrm...
<rori> it exists if you do `COMPATIBILITY_HEADERS OFF` but it doesn't if you completely omit it
<rori> iirc
<jbjnr> lets get rid of it completely. leave it to me
<ms[m]> in that case it's switching between branches, some branches might have it enabled and others not
<ms[m]> just unset the variables?
<jbjnr> unsetting takes time. I have to run cmake, see an error, disable the flag, run cmake again, disable the next one, run cmake. I wasted half an hour already doing that
<ms[m]> I meant unset in our cmake...
<jbjnr> they should not exist at all!!!!
<ms[m]> !
<ms[m]> they most likely exist because you reuse caches between branches, no?
<ms[m]> but I might be misunderstanding, so do your pr and we'll see :)
<ms[m]> I'll leave it to you...
<jbjnr> aha. if switching branches is causing the problem, then that's just really really annoying. In that case, adding the EXISTS check is probably the best plan
<ms[m]> let me do a quick pr, and you have a look to see if it does what you expect it to do
<jbjnr> you are right. when I delete all compat header stuff from cache and rerun, they stop existing
<jbjnr> so it must have been a branch change
<jbjnr> that's annoying. The exist check is simplest then
<jbjnr> we could go further and use `unset` to delete it from the cache if they have gone. I like that plan
<ms[m]> jbjnr: rori something like this: https://github.com/STEllAR-GROUP/hpx/pull/4617?
<ms[m]> I don't unset the variables because then you can still switch between branches and use the old values when you get back to a branch that actually uses the option
<ms[m]> but unsetting would be fine by me as well...
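For reference, removing a stale cache entry from within CMake, as jbjnr suggests, could look roughly like this. `unset(<variable> CACHE)` is a standard CMake command; `module_declares_compat_headers` is a hypothetical flag standing in for "the option was passed to add_hpx_module".

```cmake
# Sketch only; the flag below is hypothetical.
if(NOT module_declares_compat_headers)
  # Drop the entry left over from another branch so it no longer
  # shows up in ccmake or affects this configure run.
  unset(HPX_${name_upper}_WITH_COMPATIBILITY_HEADERS CACHE)
endif()
```

The trade-off ms[m] describes still applies: once the entry is unset, its old value is gone when switching back to a branch that actually uses the option.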
<rori> just commented cause I think it does not solve the problem when the option is on and the include_compat dir does not exist
<rori> cause the main problem is that the HPX_MODULE_COMPAT_HEADER option is ON in the cache if I understood correctly
<ms[m]> rori: it was meant to skip adding compatibility headers if the option is not passed to add_hpx_module
<ms[m]> if the option is actually passed to add_hpx_module and there's no include_compatibility directory we should fix that, not silently ignore it
<rori> so this is good but I thought john wanted to address the problem of when it is accidentally set to on and the include_compatibility dir doesn't exist
<rori> but I may have misunderstood
<jbjnr> Is this a new bug too ? ```/usr/bin/ld: cannot find -lhpx_wrap```
<jbjnr> for projects linking against hpx?
<rori> I think `hpx_wrap` is now automatically in `hpx_init` target but ms[m] knows probably better
<ms[m]> jbjnr: maybe, maybe not
<ms[m]> hpx_wrap was merged with hpx_init for a while but it's separate again
<ms[m]> do you have hpx_wrap in your lib directory?
<ms[m]> oh, this is from an installed hpx I guess?
<ms[m]> are you explicitly linking to hpx_wrap? if yes, don't do that (it was never supported)
<jbjnr> ok thanks. looks like something I added a while back to fix a build error. Now I can use HPX::hpx instead and it is behaving better
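A minimal sketch of an external project linking against the `HPX::hpx` imported target instead of naming `hpx_wrap` directly. The project name, target name, and source file are placeholders; it assumes an installed HPX that `find_package` can locate (for example via `CMAKE_PREFIX_PATH`).

```cmake
cmake_minimum_required(VERSION 3.13)
project(my_hpx_app CXX) # placeholder project name

# Locates the installed HPX package configuration and its imported targets.
find_package(HPX REQUIRED)

add_executable(my_app main.cpp) # placeholder source file

# Link the imported target rather than passing -lhpx_wrap by hand;
# the target carries its own include paths and link dependencies.
target_link_libraries(my_app PRIVATE HPX::hpx)
```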
Yorlik has quit [Read error: Connection reset by peer]
<ms[m]> boop
<rori> ...
nikunj has quit [Ping timeout: 256 seconds]
nikunj has joined #ste||ar
Yorlik has joined #ste||ar
<jbjnr> How do we feel about the intel compiler? Should I add it to my build matrix?
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
hkaiser has joined #ste||ar
<ms[m]> jbjnr: in principle it works (pycicle builds master with it, not every pr though), so if there's enough cpu time add it
<ms[m]> jbjnr: btw, is this going to be the build matrix for master or every pr? rostam is going to be swamped if that's for every pr...
<ms[m]> having two different matrices for prs and master would be nice
<ms[m]> and fricking huge matrix for release...
<hkaiser> we shouldn't run more than 10 different configurations at a time
<jbjnr> everything will be customizable
<hkaiser> rostam will die otherwise
<jbjnr> pycicle uses slurm
<hkaiser> sure
<jbjnr> so if we limit how many nodes are available to pycicle, then we can limit the load
<hkaiser> but it won't be able to keep up with commits and PRs
<jbjnr> everything will be customizable
<hkaiser> also, does pycicle cancel builds if branches get updated?
<jbjnr> yes
<jbjnr> though I'll need to make some changes for the newer build matrix stuff
<hkaiser> ok, does this already work on daint?
<jbjnr> should do
<hkaiser> I always had the impression that the builds that have started run to completion even if obsolete
<jbjnr> you frequently see unfinished build sets on the dashboard when a PR gets force pushed and the existing build for the old one gets cancelled
<hkaiser> ok
<jbjnr> I probably need to check that it always works, and with the new build stuff, I will for sure have to update it, the old pycicle only did 1 build per PR or per master (per configuration). Allowing it to do N builds for master and N per PR will mean more checking for obsolete jobs if something is updated
<jbjnr> so ... are we supporting intel compiler - it is installed on rostam
<hkaiser> jbjnr: not at this point, I'd say let's start with a single builder and watch it for a while
bita_ has joined #ste||ar
nikunj97 has joined #ste||ar
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 252 seconds]
weilewei has joined #ste||ar
weilewei has quit [Remote host closed the connection]
<hkaiser> ms[m]: I'll need a bit of time to properly review #4592, could you hold off on merging this, for now please?
<ms[m]> hkaiser: yeah, sure
<hkaiser> thanks
rtohid has joined #ste||ar
<diehlpk_work_> hkaiser, operation bell meeting?
<hkaiser> diehlpk_work_: is it now?
<heller1> hkaiser: how about today?
<hkaiser> heller1: sorry for bailing out on you yesterday
<heller1> No problem
<heller1> It was late anyways
<hkaiser> I'm in a meeting right now, will ping you
<heller1> Ok, I'll be available only later today...
<heller1> Around 8, I'd assume
<diehlpk_work_> hkaiser, Yes
<diehlpk_work_> we shifted it one hour
<hkaiser> heller1: ok
<hkaiser> diehlpk_work_: sorry, in a meeting
nan11 has joined #ste||ar
<diehlpk_work_> hkaiser, Ok, meeting has finished. We will not meet next week, since we have not much to discuss before we get the SC comments back
<hkaiser> diehlpk_work_: thanks
Yorlik has quit [Read error: Connection reset by peer]
Yorlik has joined #ste||ar
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 272 seconds]
karame_ has joined #ste||ar
Nikunj__ has joined #ste||ar
Yorlik has quit [Read error: Connection reset by peer]
nikunj97 has quit [Ping timeout: 264 seconds]
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 256 seconds]
nikunj has joined #ste||ar
weilewei has joined #ste||ar
gonidelis has joined #ste||ar
<heller1> hkaiser: ready whenever you are
<heller1> hkaiser: https://zoom.us/j/2810580903
<hkaiser> heller1: still busy, sorry
<hkaiser> I could join at around 2pm (21:00)
<heller1> Ok, just ping me then
<heller1> Fine as well
gonidelis has quit [Remote host closed the connection]
<weilewei> why does the hpx_wrap error come back again? I am using latest hpx branch: https://gist.github.com/weilewei/1ee21f125eb84a8dd21fdd31bc48d03a
<weilewei> and I link my target to HPX::hpx
gonidelis has joined #ste||ar
<ms[m]> weilewei: 1. what changed? 2. what's your linker line? 3. where is main defined? 4. which version of hpx?
<weilewei> 1. John pushed some changes, which I believe are correct; 2. https://github.com/STEllAR-GROUP/DCA/blob/533565eccc33a27dfbfeee5f7bb70d0b27638fd3/src/parallel/hpx/CMakeLists.txt#L18 3. the main is defined through google test, 4. latest master today
<weilewei> ms[m] ^^
<gonidelis> When `test_for_each()` is called here https://gist.github.com/gonidelis/7a5c99d53230723afc15b38ef76df056 , does it actually call the overload from here https://gist.github.com/gonidelis/83fbb4283dac8d6d5bb8bcc602d88f3a ? If yes, aren't the template `typename` parameters supposed to be declared on the call inside some `< >`? And finally, what is `IteratorTag()` supposed to mean? Is it something like a constructor?
<ms[m]> weilewei: change it to `HPX::hpx_no_wrap_main`
<ms[m]> or hold on...
<ms[m]> how do you start the runtime?
<weilewei> ms[m] what do you mean how to start the runtime?
<weilewei> still same error with HPX::hpx_no_wrap_main
<ms[m]> weilewei: ok, we might need to expose hpx_wrap after all...
<ms[m]> you can for now use HPX::hpx and make parallel_hpx an OBJECT library, or you start the runtime explicitly
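A rough sketch of the OBJECT library workaround ms[m] suggests. `parallel_hpx` and `HPX::hpx` come from this conversation; the source file names and the test executable are hypothetical, and linking an OBJECT library with `target_link_libraries` assumes CMake 3.12 or newer.

```cmake
cmake_minimum_required(VERSION 3.12)
project(parallel_hpx_sketch CXX) # placeholder project name

find_package(HPX REQUIRED)

# OBJECT instead of STATIC: the compiled object files are handed directly
# to whatever links this target, rather than being archived into a .a first.
add_library(parallel_hpx OBJECT hpx_threading.cpp) # hypothetical source file
target_link_libraries(parallel_hpx PUBLIC HPX::hpx)

# Hypothetical consumer; picks up the object files and the HPX usage requirements.
add_executable(parallel_hpx_test test_main.cpp)
target_link_libraries(parallel_hpx_test PRIVATE parallel_hpx)
```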
<hkaiser> heller1: yt
<heller1> hkaiser: yes
<hkaiser> still up to talking?
<heller1> sure
Nikunj__ has quit [Ping timeout: 240 seconds]
<weilewei> ms[m] so replace STATIC to OBJECT?
gonidelis has quit [Ping timeout: 245 seconds]
<weilewei> jbjnr yt?
<weilewei> But a few days ago, I was able to use HPX::hpx and STATIC in parallel_hpx with no problem...
<ms[m]> weilewei: it was using hpx_wrap before
<ms[m]> I'm surprised that worked on master though, that shouldn't even be an exported target
<ms[m]> weilewei: if you can wait until tomorrow you'll have another pr to try out
<ms[m]> the problem is that hpx_wrap is required for main to be started on the hpx runtime
<ms[m]> however, it needs to be linked in a very specific order for it to work, which is kind of error prone
Yorlik_ has joined #ste||ar
Yorlik_ is now known as Yorlik
nan11 has quit [Remote host closed the connection]
nan11 has joined #ste||ar
Yorlik has quit [Quit: Leaving]
Yorlik has joined #ste||ar
<weilewei> ms[m] sure, I will try the pr tmr, thanks!
Yorlik has quit [Read error: Connection reset by peer]
<weilewei> ok... what worked up to this point is linking parallel_hpx to hpx_wrap in hpx 1.4.1
<weilewei> however, I look forward to tomorrow's pr that enables me to link parallel_hpx to HPX::hpx ... thanks!
karame_ has quit [Remote host closed the connection]
<ms[m]> weilewei: I'll just make HPX::hpx_wrap an official target... there's no way around it if you don't start the runtime manually
<weilewei> ms[m] got it... actually can you please explain a bit what "start the runtime manually" means? like let hpx find the main function entry?
<ms[m]> manually means calling hpx::init or hpx::start
<ms[m]> with hpx_wrap you're relying on the linker replacing your main with a main defined in hpx_wrap
<ms[m]> it's magic when it works, not so much when it doesn't
<ms[m]> the requirement is that you have to link to the static library that defines main (gtest) before the library that uses main (hpx_wrap) which is why you have to explicitly link to hpx_wrap
<ms[m]> this is only for main defined in static libraries though...
<ms[m]> otherwise things are easier
<weilewei> right... I am using hpx_main in the code to replace main()... so hpx_wrap does this work
<ms[m]> hpx_main is yet another thing...
<ms[m]> oh right, hpx_main.hpp
<weilewei> right
<weilewei> thanks for the explanation! Understood now
karame_ has joined #ste||ar
rtohid has quit [Remote host closed the connection]
nan11 has quit [Remote host closed the connection]
weilewei has quit [Remote host closed the connection]
karame_ has quit [Remote host closed the connection]
Yorlik has joined #ste||ar
weilewei has joined #ste||ar
<Yorlik> dsdf
<Yorlik> woops
<zao> Well said.
<Yorlik> I'm just customizing my new machine - never thought it would be THAT much faster.
<zao> Hehe.
<zao> I bought some SSDs today, been busy migrating everything tonight.
<Yorlik> Seeing that many cores in resmon is frightening - lol
<Yorlik> Copying stuff like ssh keys right now, so I'll do my first test compiles soonish hopefully :)
<Yorlik> I'm also looking forward to converting my old machine into a Linux Box.
<zao> GCC 10.1 seems like a super fun compiler btw.
<Yorlik> I used to use Debian a lot in the past, but I'm wondering whats funnest for developing. Not going to use Ubuntu though.
<Yorlik> I'm tempted to use Fedora.
<Yorlik> My codev likes Linux Mint
<Yorlik> Any more educated suggestions?
<zao> I run Arch on my desktop but it breaks a bit too often.
<zao> Build machine at home runs Ubuntu of varying age.
<zao> Fedora is not bad polish-wise.
<Yorlik> We decided for CentOS for the server most likely - so staying with RedHat might make sense
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
nikunj has quit [Ping timeout: 244 seconds]
Yorlik has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
Yorlik has joined #ste||ar