hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
rtohid has left #ste||ar [#ste||ar]
Vir has joined #ste||ar
Vir is now known as Guest69601
Guest69601 has quit [Client Quit]
V|r has joined #ste||ar
V|r is now known as Vir
bita has quit [Ping timeout: 256 seconds]
hkaiser has quit [Quit: bye]
akheir1 has quit [Quit: Leaving]
bita has joined #ste||ar
bita has quit [Quit: Leaving]
nan11 has quit [Remote host closed the connection]
kale_ has joined #ste||ar
h41lhydr4 has joined #ste||ar
kale_ has quit [Ping timeout: 246 seconds]
Vir has quit [Quit: ZNC 1.7.5+deb4 - https://znc.in]
Vir has joined #ste||ar
Vir is now known as Guest60983
h41lhydr4 has quit [Quit: Konversation terminated!]
h41lhydr4 has joined #ste||ar
Guest60983 has quit [Quit: ZNC 1.7.5+deb4 - https://znc.in]
Vir has joined #ste||ar
Vir is now known as Guest1064
Guest1064 has quit [Remote host closed the connection]
V|r has joined #ste||ar
V|r is now known as Vir
weilewei has quit [Remote host closed the connection]
h41lhydr4 has quit [Remote host closed the connection]
Yorlik_ has joined #ste||ar
Yorlik_ has quit [Client Quit]
Yorlik_ has joined #ste||ar
Yorlik has quit [Disconnected by services]
Yorlik_ has quit [Client Quit]
Yorlik has joined #ste||ar
<Yorlik> After using only BOOST_ROOT on Linux - I am now trying to compile on Windows again and am getting this error:
<Yorlik> Could NOT find Boost (missing: Boost_INCLUDE_DIR)
<Yorlik> It seems the build is working differently on Linux and Windows. Could we get some consistency here?
<Yorlik> ms[m] ^^
<Yorlik> K-ballo ^^
<ms[m]> Yorlik: I wish... afaict this is a cmake/boost problem
<ms[m]> we do very little to customize finding boost
<Yorlik> Shouldn't we just use Boost_DIR then?
<ms[m]> pretty much the only thing is setting `Boost_NO_BOOST_CMAKE` to `ON`
<Yorlik> Or does FindBoost work substantially differently than find_package?
<ms[m]> you can try turning that off and it might make life better
<Yorlik> I have used that in the past
<ms[m]> I don't know about its internals, but it is a somewhat special case
<Yorlik> I just want a reliable, platform independent build interface for HPX
<Yorlik> So - a clear definition of which variables are needed and which values are expected.
<Yorlik> There should be no if(WIN32) or if(UNIX) for me to define
hkaiser has joined #ste||ar
<Yorlik> hkaiser: Hello and ^^ :)
<hkaiser> Yorlik: wishful thinking...
<hkaiser> boost cmake support is very brittle (as you have noticed)
<Yorlik> How many boost libraries do you use?
<Yorlik> It might be easier to just make a bunch of imported library definitions and ask the user to supply the specifics
<Yorlik> like library name and include location
<Yorlik> Because this state of things is a royal pain in the assets
<hkaiser> Yorlik: with c++17 we only use header only libraries (I think)
<Yorlik> The problem is the inconsistency between the Linux and the Windows build.
<hkaiser> tell us what you need to do on both platforms
<hkaiser> sure it's a problem, but what do you suggest?
<Yorlik> Well - you need to tell me what to do to build on Linux and Windows.
<hkaiser> boost cmake support is different for each boost version (almost)
<hkaiser> I'm using Boost_ROOT and that's it
<Yorlik> Either wrap it inside the HPX build system and present a consistent platform independent build interface or tell exactly what HPX needs to be set and how for each platform.
<Yorlik> I gave HPX just BOOST_ROOT and it was valid - I tested it, and suddenly it asked for the include dir
<hkaiser> Yorlik: nobody has that information
<Yorlik> After changing it on linux to only use boost root it was broken on Windows
<hkaiser> all of this depends on 100 factors
<Yorlik> Which information?
<hkaiser> well, how boost was installed, mostly
<Yorlik> You should be able to tell me as a user what you need from me.
<hkaiser> cmake tells you what it needs
<Yorlik> I can't configure anything properly if I don't know what exactly you need from me.
<hkaiser> sometimes it needs just Boost_ROOT, sometimes it wants to have more, depending on the situation
<hkaiser> Yorlik: pls understand - it's not HPX that has a problem here, go complaining on the Boost ML
<Yorlik> How many boost libraries are you actually using?
<hkaiser> but even if that will be fixed, there are at least 10 versions of boost in active usage out there
<hkaiser> Yorlik: we use no libraries to link on c++17
<hkaiser> and a whole bunch of header only ones
<Yorlik> The most reliable way would probably be to create imported targets.
<Yorlik> But that might require providing a bunch of settings from the user.
<Yorlik> Or - if it is C++17 - just the include directory
<Yorlik> It seems the main branch is the one between 17 and pre 17
<Yorlik> With C++17 and header only you should just require the include directory
<hkaiser> yes
<Yorlik> the pre c++17 situation, with its gazillions of naming variations: you could either wrap that internally or ask the user for precise names to build the imported targets
<K-ballo> Yorlik: BOOST_ROOT is all you need in a normal scenario, if you need something else then there's something odd or plain wrong with your boost install
<K-ballo> take it over to either #cmake or #boost and people will guide you through it
<Yorlik> It's a perfectly normal install in a custom Directory, just to be used for development.
<K-ballo> the first thing they'll ask for is the trace given Boost_DEBUG, on a clean build
<Yorlik> Boost Debug just finds everything
<Yorlik> I get a large list with all I compiled and installed
<K-ballo> take it to the appropriate place, this is not the appropriate channel to aid your superbuild pains
<K-ballo> I'm on both of those channels, so I can follow along
<Yorlik> I think it will just continue the blame game over whose error it is: Boost's CMake support, or CMake's FindBoost, or my install, or whatever. Something is just not right in this system.
<Yorlik> Using Boost is the single most painful library. Even hwloc is easier to compile and use.
<K-ballo> sure, just blame someone else, that works
<Yorlik> OK - the blame is on me, got it. Great.
<hkaiser> Yorlik: if in your quest you find out that HPX can do something to help, pls let us know
<K-ballo> put the blame on whoever you want, just on the appropriate channel
<K-ballo> else you won't be getting the necessary help to solve those problems, whoever's fault it is
<Yorlik> K-ballo: Essentially you're saying hpx can't do anything, it's just the way boost / cmake work.
<K-ballo> no, I'm saying that's not even the way boost/cmake works
<Yorlik> HPX requires different inputs on Windows and Linux to build. That's the surface of the problem.
<K-ballo> if it were indeed that way then we (hpx) could look into doing something
<K-ballo> BOOST_ROOT is all HPX takes on windows, linux, mac, as long as you have a normal boost deploy
<K-ballo> if you have a "weird" boost deploy, then you need extra variables in all of windows, linux, mac
<Yorlik> On Linux it seems to be happy with BOOST_ROOT, on Windows not.
<K-ballo> I've built all 6 kinds of hpx
<K-ballo> everyone else here has said as much too
<Yorlik> My Boost deploy is as weird on Linux as on Windows
<K-ballo> take it to the appropriate channel and share the relevant information that was requested
<ms[m]> diehlpk_work: I started updating this: https://github.com/STEllAR-GROUP/hpx/wiki/GSoD-2020-Project-Ideas
<ms[m]> are you going to register us? if yes, can you let me know when you've done that and have the questions that we need to answer?
<ms[m]> I guess they'll be mostly the same as last year, but I didn't want to start updating that page before I know
<ms[m]> K-ballo, hkaiser, heller, rori, jbjnr please also have a look at the link above and add/edit/remove things if you feel like it
<ms[m]> Yorlik: you too ;) there's a project for presenting our api documentation in a better way...
nikunj97 has joined #ste||ar
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 240 seconds]
weilewei has joined #ste||ar
<hkaiser> ms[m]: Patrick said he wanted to register us for GSoD
<ms[m]> hkaiser: 👍️ then I'll wait
<hkaiser> ms[m]: yt?
<hkaiser> care for a Kokkos question?
<ms[m]> hkaiser: try me, can't promise an answer ;(
<ms[m]> ;)
<hkaiser> trying to build vanilla kokkos
<hkaiser> cmake tells me that it can't find TPLHWLOC :/
<hkaiser> HWLOC is in place for sure
<K-ballo> what does TPL stand for?
<hkaiser> no idea
<hkaiser> some internal gibberish :/
rtohid has joined #ste||ar
<ms[m]> third party library, I think
<ms[m]> i.e. dependency...
<ms[m]> uh, I think I never enable hwloc with kokkos
<ms[m]> do you know you need it?
<hkaiser> even if I disable HWLOC
<hkaiser> ahh no, setting HWLOC to off solves it, thanks
<ms[m]> do you need maybe TPLHWLOC_ROOT? instead of HWLOC_ROOT?
<zao> My build seems to find HWLOC, but EasyBuild populates a lot of environment variables.
<zao> -- Found TPLHWLOC: /eb/software/hwloc/1.11.12-GCCcore-8.3.0/lib/libhwloc.so
<ms[m]> at least with the openmp and hpx backends there's no use for hwloc
<ms[m]> I haven't checked, but I would guess they may use it with their pthread backend (which is deprecated or at least something in that direction)
<hkaiser> ms[m]: I disabled HWLOC, fine now
<ms[m]> 👍️
kale_ has joined #ste||ar
nikunj has quit [Ping timeout: 240 seconds]
<zao> Would the resource partitioners we have allow me to do something like this:
<zao> * one main thread with things that have OS thread affinity like GPU devices and window loops.
<zao> * a bunch of workers dispatching generic work.
<zao> Where I'd be able to dispatch tasks that need GPU or window access to that main thread, while it also may be blocked on OS primitives like waiting for input or swapping backbuffers.
kale_ has quit [Ping timeout: 240 seconds]
<zao> So a typical main loop would be `while (1) { PumpEvents(); update(); draw(); swap(); }`, would it suffice to occasionally call into some HPX in `update()` to make progress?
<zao> Can I oversubscribe the main thread to have it also dispatch regular worker tasks in addition to the GPU-specific tasks?
nikunj has joined #ste||ar
nan11 has joined #ste||ar
karame_ has joined #ste||ar
bita_ has joined #ste||ar
<zao> I'm guessing Yorlik is doing something similar.
<zao> (I'm gonna head out shopping now, don't answer everyone at once :P )
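zao's question is not answered in this log. As a rough illustration of the pattern being asked about (workers handing GPU or window work to the one thread that owns those resources), here is a minimal sketch in plain standard C++. It does not use any HPX API and says nothing about how the HPX resource partitioner behaves; `MainThreadQueue`, `post` and `drain` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical helper, not an HPX API: a thread-safe queue of tasks that
// must run on the thread owning the window / GPU context.
class MainThreadQueue {
public:
    // May be called from any thread (e.g. a worker that needs GPU access).
    void post(std::function<void()> task) {
        assert(task && "task must be callable");
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }

    // Called from the main loop. Runs at most max_tasks queued tasks so a
    // burst of work cannot stall a frame; returns how many tasks ran.
    std::size_t drain(std::size_t max_tasks) {
        std::size_t ran = 0;
        while (ran < max_tasks) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (tasks_.empty()) break;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so a task may post further tasks
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```

In the loop from the question, `update()` would call something like `queue.drain(16)` once per frame. Whether the main thread can additionally pick up regular worker tasks is a scheduler question this sketch does not address.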
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
<bita_> hkaiser, do you have a minute?
<hkaiser> bita_: here now
<bita_> I implemented the 3d version of kmeans, https://gist.github.com/taless474/de32022edcf0e9da8aa9754fdcec6652, only the first function (closest centroids) changes. I am trying to simplify other functions like this: https://github.com/STEllAR-GROUP/phylanx/blob/master/src/plugins/algorithms/kmeans.cpp#L135 so we don't use fmap, lambda, vstack or apply
<hkaiser> nice!
<bita_> we need to use one of expand_dims or reshape anyway
<bita_> we certainly need statistics, argmin and slice. I am not sure about the others yet
<hkaiser> ok, statistics it is, then
<bita_> yeah :~|
<hkaiser> I need to do something for one of the other projects by tuesday (I've been delaying that for over a month now)
<hkaiser> but then I'll look into that
<bita_> thank you. I wanna look into argmin now
<hkaiser> should be very similar to statistics: local argmin + reduction
<bita_> yes, but it is a little bit simpler
<hkaiser> right - good example we can learn from
<bita_> nod
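The "local argmin + reduction" idea mentioned above can be sketched in plain C++ as follows. This is not the Phylanx implementation: the chunks stand in for per-locality data and are processed sequentially here, and `Candidate`, `local_argmin`, `reduce_argmin` and `chunked_argmin` are hypothetical names. Breaking ties toward the lower index is an assumption chosen so the result matches a single-pass argmin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (value, global index) candidate produced by one chunk.
using Candidate = std::pair<double, std::size_t>;

// Local step: argmin over data[begin, end), reported with a global index.
Candidate local_argmin(const std::vector<double>& data,
                       std::size_t begin, std::size_t end) {
    assert(begin < end && end <= data.size());
    std::size_t best = begin;
    for (std::size_t i = begin + 1; i < end; ++i)
        if (data[i] < data[best]) best = i;
    return {data[best], best};
}

// Reduction step: keep the smaller value; ties go to the lower index.
Candidate reduce_argmin(const Candidate& a, const Candidate& b) {
    if (b.first < a.first) return b;
    if (a.first < b.first) return a;
    return a.second <= b.second ? a : b;
}

// Splits data into chunks of the given size, computes each local argmin,
// and folds the candidates together with reduce_argmin.
std::size_t chunked_argmin(const std::vector<double>& data, std::size_t chunk) {
    assert(!data.empty() && chunk > 0);
    Candidate best = local_argmin(data, 0, std::min(chunk, data.size()));
    for (std::size_t begin = chunk; begin < data.size(); begin += chunk) {
        std::size_t end = std::min(begin + chunk, data.size());
        best = reduce_argmin(best, local_argmin(data, begin, end));
    }
    return best.second;
}
```

Because `reduce_argmin` is associative and commutative, the per-chunk candidates could be combined in any order, which is what makes the reduction suitable for a distributed setting.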
rtohid has quit [Ping timeout: 240 seconds]
rtohid has joined #ste||ar
bita_ has quit [Quit: Leaving]
rtohid has left #ste||ar [#ste||ar]
<Yorlik> hkaiser: deleaker (using a trial version currently) is telling me this message never gets destroyed:
<Yorlik> auto fut = hpx::async<gameobject::send_message_action<M>>( recipient, msg );
<Yorlik> That's over the entire runtime of the program, terminating regularly with exit code 0
K-ballo has quit [*.net *.split]
diehlpk_work_ has quit [*.net *.split]
jaafar has quit [*.net *.split]
diehlpk_work_ has joined #ste||ar
jaafar has joined #ste||ar
Nikunj__ has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
<hkaiser> Yorlik: do you think it's our problem?
<Yorlik> I don't know.
<hkaiser> so which object is not released? msg?
<Yorlik> Yes
<hkaiser> it's clearly yours, lives probably on the stack where that async is called
<Yorlik> But when the function exits it should be gone, right?
<hkaiser> right
<Yorlik> This is just a short send_message that exits after sending.
<hkaiser> if it's movable and you don't need it after the async, try moving it
<Yorlik> I'm actually moving it into another structure
<hkaiser> ok
<Yorlik> for checking if there was a remote exception
<Yorlik> But these structures get destroyed regularly
<hkaiser> Yorlik: then the async sees a copy of it anyways
<Yorlik> Exactly
<hkaiser> if it lives long enough you can do a std::ref(msg)
<hkaiser> but you said you were moving it, so this might not be a good idea
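The three options discussed here (let the async copy the argument, move it in, or pass `std::ref`) can be compared with a small sketch. It uses `std::async` from the standard library as a stand-in, on the assumption that `hpx::async` treats its arguments similarly for a local call; that is not verified here. `Msg`, `handle` and the `send_by_*` functions are hypothetical names for illustration.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <utility>

// Hypothetical message type that counts how often it is copied.
struct Msg {
    static int copies;
    int value;
    explicit Msg(int v) : value(v) {}
    Msg(const Msg& other) : value(other.value) { ++copies; }
    Msg(Msg&& other) noexcept : value(other.value) {}
};
int Msg::copies = 0;

// Stand-in for the action that receives the message.
int handle(const Msg& m) { return m.value; }

// Passing an lvalue: the async call stores its own copy of msg.
int send_by_copy(Msg& msg) {
    return std::async(std::launch::deferred, handle, msg).get();
}

// Moving: no copy is made, but msg must not be relied on afterwards.
int send_by_move(Msg& msg) {
    return std::async(std::launch::deferred, handle, std::move(msg)).get();
}

// std::ref: no copy, but msg must outlive the asynchronous call.
int send_by_ref(Msg& msg) {
    return std::async(std::launch::deferred, handle, std::ref(msg)).get();
}
```

The `std::ref` variant is only safe when the caller can guarantee the lifetime of `msg`, which is the concern raised in the surrounding messages.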
* zao shakes a fist at MSVC and references
<hkaiser> zao: sorry for not answering your questions today, didn't see them in time
<Yorlik> That would explode sooner or later I think.
<zao> hkaiser: No worries, it's mostly musing whether HPX might work this time :)
<hkaiser> Yorlik: we have async_cb<Action>(..., callback) where the callback is invoked when it's safe to destroy the arguments
<Yorlik> If it were only a few messages I'd believe it's some not yet cleaned up leftovers, but it's too many. I'll try to better understand this output and then check back with you.
<Yorlik> The problem is, the deleaker complains about the copy not being destroyed
<Yorlik> But I'll look closer - maybe I'm missing something
<hkaiser> Yorlik: hmm, unlikely
<Yorlik> hkaiser: how would you read this stacktrace of a leaking allocation? https://gist.github.com/McKillroy/4a2db9f447b95d46afce742ade198bda
<Yorlik> The offending line in my code is listed at the top
<hkaiser> Yorlik: looks like the future is never made ready/shared state not being freed
<Yorlik> Like get() never being called?
<hkaiser> well, the shared state has three refcnts
<hkaiser> the promise, the future, and agas
<hkaiser> promise and future will most probably go out of scope, we should investigate what happens to the refcnt from agas
<hkaiser> should be reproducible with a 10 liner ;-)
<Yorlik> What would I have to do to test this?
<hkaiser> async an arbitrary action
<Yorlik> I am moving the future into a pair together with the original message
<Yorlik> I call this an echo
<hkaiser> well, the future holds the shared state as well
<Yorlik> These echoes are checked regularly
<Yorlik> So I can trace remote exceptions
<Yorlik> And i still have the message for debugging / rollback
<Yorlik> If the future is ready the pair gets dumped
<Yorlik> Otherwise it is stored again and checked at the end of the next frame
<Yorlik> For testing I could just do an immediate future.get() to see if it's still leaking
<Yorlik> So I would shortcut away all my checking mechanics
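The "echo" bookkeeping described above (keep the future together with the original message, check it once per frame, drop it when ready, keep it otherwise) might look roughly like the sketch below. It is a reconstruction from the description only, not Yorlik's code: `std::future<void>` stands in for the HPX future, and `Message`, `Echo`, `EchoResult` and `check_echoes` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Placeholder message type; the real one is not shown in the log.
struct Message { std::string payload; };

// An echo pairs the pending future with the message that was sent.
using Echo = std::pair<std::future<void>, Message>;

struct EchoResult {
    std::size_t completed = 0;    // echoes that finished without error
    std::vector<Message> failed;  // messages whose send raised an exception
};

// Checks every echo once without blocking. Ready echoes are consumed
// (get() rethrows a stored exception); pending ones stay in `echoes`
// to be checked again at the end of the next frame.
EchoResult check_echoes(std::vector<Echo>& echoes) {
    EchoResult result;
    std::vector<Echo> pending;
    for (auto& echo : echoes) {
        assert(echo.first.valid());
        if (echo.first.wait_for(std::chrono::seconds(0)) !=
            std::future_status::ready) {
            pending.push_back(std::move(echo));
            continue;
        }
        try {
            echo.first.get();
            ++result.completed;
        } catch (...) {
            // Keep the original message for debugging / rollback.
            result.failed.push_back(std::move(echo.second));
        }
    }
    echoes = std::move(pending);
    return result;
}
```

The shortcut test mentioned above would correspond to calling `get()` immediately after the send instead of storing the echo, which takes this bookkeeping out of the picture when hunting the leak.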
<Yorlik> hkaiser: I was just running a test with only one worker thread ( .hpx.ini: [hpx] os_threads = 1 ) and the leaks went away.
<hkaiser> so it's a race?