#ste||ar on 2020-06-10 — irc logs at irclog.cct.lsu.edu

2020-02-24 20:46 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020

00:09 Yorlik has quit [Ping timeout: 246 seconds]

00:56 nikunj97 has joined #ste||ar

01:00 nikunj97 has quit [Read error: Connection reset by peer]

01:01 nikunj97 has joined #ste||ar

01:12 Nikunj__ has joined #ste||ar

01:15 nikunj97 has quit [Ping timeout: 260 seconds]

01:43 nan11 has quit [Remote host closed the connection]

02:20 hkaiser has quit [Quit: bye]

03:35 kale[m] has quit [Ping timeout: 240 seconds]

04:01 kale[m] has joined #ste||ar

05:29 bita_ has quit [Ping timeout: 260 seconds]

06:30 parsa has quit [Remote host closed the connection]

06:33 parsa has joined #ste||ar

06:48 LiliumAtratum has joined #ste||ar

06:51 <LiliumAtratum> When I have a piece of code that uses hpx synchronization primitives (e.g. mutex) invoked from a thread not managed by hpx, it crashes. What I currently do is to pack that code in `hpx::async(lambda).get()`. Is this the way to go, or is there a more idiomatic (and possibly faster) way?

07:24 <LiliumAtratum> is reading from std::fstream on an hpx thread somehow potentially dangerous?

08:07 Nikunj__ is now known as nikunj97

09:54 <jbjnr> LiliumAtratum: grep for `std::string temp = hpx::util::debug::suspended_task_backtraces();` in the source. It may help you

09:55 <jbjnr> you can get a list of the tasks that are suspended with their stacktraces - from there you might find who is (for example) holding a lock that someone else is waiting for (though we have lock checks for that)

09:56 <jbjnr> LiliumAtratum: yes. any std function that might block is dangerous as it stops the hpx worker thread

09:58 <jbjnr> PS. I did not read all this thread, only the first messages - I see ten pages of other stuff I missed. sorry

09:58 nikunj97 has quit [Ping timeout: 256 seconds]

10:01 <LiliumAtratum> Thanks for the hint about suspended_task_backtraces

10:01 Amy1 has quit [Ping timeout: 256 seconds]

10:03 <LiliumAtratum> My problem with `std::fstream` is that it outright crashes. I have moved the loading to a regular std thread and so far it is working as intended. Of course I am aware that "working" does not necessarily mean that there is no UB hidden somewhere deep inside ;)

10:04 Amy1 has joined #ste||ar

10:07 nikunj97 has joined #ste||ar

10:12 Nikunj__ has joined #ste||ar

10:14 nikunj97 has quit [Ping timeout: 260 seconds]

10:18 <jbjnr> we have a run_as_os_thread wrapper somewhere that can be used for that if it saves you any boilerplate std thread stuff

10:27 <LiliumAtratum> I am hoping to stick to hpx threads whenever possible to minimise the amount of switching between hpx and os threading. That's why I asked if there is any known issue with std::fstream being used in an hpx thread. I understand it may be needlessly blocking the hpx worker thread, but I didn't expect crashes from within the std::fstream implementation.

10:27 Nikunj__ has quit [Quit: Leaving]

10:30 Amy1 has quit [Ping timeout: 246 seconds]

10:32 Amy1 has joined #ste||ar

11:35 kale[m] has quit [Ping timeout: 246 seconds]

11:35 kale[m] has joined #ste||ar

11:37 Yorlik has joined #ste||ar

11:38 hkaiser has joined #ste||ar

11:40 kale[m] has quit [Ping timeout: 256 seconds]

11:41 kale[m] has joined #ste||ar

11:43 hkaiser has quit [Read error: Connection reset by peer]

11:45 hkaiser has joined #ste||ar

11:47 Amy1 has quit [Ping timeout: 246 seconds]

11:51 Amy1 has joined #ste||ar

12:03 Amy1 has quit [Ping timeout: 256 seconds]

12:05 Amy1 has joined #ste||ar

12:58 nikunj has quit [Ping timeout: 256 seconds]

13:01 carola[m] has quit [*.net *.split]

13:01 neill[m] has quit [*.net *.split]

13:01 diehlpk_mobile[m has quit [*.net *.split]

13:01 gdaiss[m] has quit [*.net *.split]

13:03 nikunj has joined #ste||ar

13:06 gdaiss[m] has joined #ste||ar

13:07 carola[m] has joined #ste||ar

13:07 neill[m] has joined #ste||ar

13:07 diehlpk_mobile[m has joined #ste||ar

13:10 carola[m]1 has joined #ste||ar

13:11 K-ballo has quit [Ping timeout: 265 seconds]

13:15 carola[m] has quit [Ping timeout: 256 seconds]

13:40 K-ballo has joined #ste||ar

13:52 nan11 has joined #ste||ar

13:52 kale[m] has quit [Ping timeout: 260 seconds]

13:53 kale[m] has joined #ste||ar

14:00 <jbjnr> LiliumAtratum: the short answer to your earlier question is that you should not use hpx mutexes etc in non hpx threads. if you have non hpx threads, use std::mutexes in them.

14:08 <ms[m]> I guess no one is going to be sad if we remove compatibility headers that are in detail directories?

14:16 <hkaiser> ms[m]: we'll see ;-)

14:17 <hkaiser> ms[m]: btw, thanks for looking into enabling unity builds!

14:17 <ms[m]> :P

14:18 <ms[m]> I say boohoo for them in that case... we haven't been very good with what guarantees we give, but I was hoping detail would clearly be off limits

14:18 <hkaiser> should those be enabled for the components as well?

14:18 <ms[m]> but who knows...

14:18 <ms[m]> we could probably do that as well, just didn't try it out

14:18 <hkaiser> sure, no worries

14:18 <ms[m]> we might save quite a bit of time there as well

14:18 <hkaiser> this pays off more than I thought it would, I like it

14:19 <hkaiser> on circleci I hope we don't run oom with this, however

14:20 <ms[m]> yes, that's a risk

14:20 <ms[m]> it's off everywhere though

14:20 <ms[m]> I just wanted to have it on to try locally for a while first

14:20 <ms[m]> *have it be... possible I guess

14:20 <hkaiser> right

14:21 <LiliumAtratum> @jbjnr That's why I pack the code that uses hpx mutexes into `hpx::async(lambda).get()`. Usually it is at the border between UI and the model logic / algorithms.

14:21 <ms[m]> we'll have to check that before we think about enabling it by default

14:21 <hkaiser> LiliumAtratum: be careful with this

14:21 <hkaiser> .get on a future might try to suspend the executing thread

14:22 <hkaiser> if you want to execute hpx functionality from a non-hpx thread use hpx::threads::run_as_hpx_thread(F, Args...)

14:23 <hkaiser> that is doing the 'right thing' to synchronize, etc.

14:23 <LiliumAtratum> will it work correctly if I am already in an hpx thread? Because at certain points, I may or may not be in one.

14:23 <hkaiser> that's not good either - but you could create your own wrapper that does either this or that

14:24 <hkaiser> you're on an hpx thread if hpx::threads::get_self_ptr() != nullptr

14:24 <LiliumAtratum> even if the thread is registered with hpx?

14:25 <hkaiser> yes, get_self_ptr() is specific to hpx threads

14:25 <LiliumAtratum> ok :)
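
A minimal sketch of the "either this or that" wrapper hkaiser suggests above. So that it builds without HPX, the runtime is replaced by a stand-in namespace `fake_runtime`, which is an assumption for illustration only and not HPX code. In a real program the check would presumably be `hpx::threads::get_self_ptr() != nullptr` and the hand-over would go through `hpx::threads::run_as_hpx_thread`, as named in the discussion. `call_in_runtime` is a hypothetical name.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Stand-in for the runtime (assumption: exists only so the sketch compiles
// without HPX). With HPX, on_runtime_thread() would correspond to
// `hpx::threads::get_self_ptr() != nullptr` and run_on_runtime(f) to
// `hpx::threads::run_as_hpx_thread(f)`.
namespace fake_runtime {
    thread_local bool inside = false;    // true only on "managed" threads

    inline bool on_runtime_thread() { return inside; }

    // Runs f on a managed thread and blocks the caller until it finishes.
    // Limitation of this sketch: f must return a default-constructible,
    // assignable, non-void value.
    template <typename F>
    auto run_on_runtime(F&& f) -> decltype(f())
    {
        // Mirrors the advice in the log: do not hand over from a thread
        // that is already managed.
        assert(!inside && "call only from an unmanaged thread");
        decltype(f()) result{};
        std::thread worker([&] {
            inside = true;               // mark the worker as managed
            result = f();
        });
        worker.join();
        return result;
    }
}

// Hypothetical wrapper: call f directly when already on a managed thread,
// otherwise hand it over to the runtime and wait for the result.
template <typename F>
auto call_in_runtime(F&& f) -> decltype(f())
{
    if (fake_runtime::on_runtime_thread())
        return f();
    return fake_runtime::run_on_runtime(std::forward<F>(f));
}
```

Code at the border between UI and model logic could then call `call_in_runtime([] { return 42; })` without knowing which kind of thread it is on. Nested calls take the direct branch, so the hand-over happens at most once.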

14:27 <LiliumAtratum> I must say, the integration between libs that are "thread-aware" and hpx is a pain. But that's mostly because of how those libs were created, probably with stuff like thread-local vars etc.

14:29 <hkaiser> LiliumAtratum: yes, that's one of the biggest shortcomings of hpx

14:29 <hkaiser> you have to be very careful with thread_local's etc

14:29 <LiliumAtratum> but you can do nothing about it I think

14:29 <hkaiser> not much

14:29 <LiliumAtratum> for me thread-local is an ugly hack

14:29 <hkaiser> except to rely on the user knowing what he's doing ;-)

14:31 <hkaiser> actually (as I realize now), we should have fixed this (partially) and calling .get should work properly on non-hpx threads...

14:32 <hkaiser> hmmm, might not work as expected - let's see if changing to run_as_hpx_thread fixes things for you - if not then we have to go back

14:33 <LiliumAtratum> for now it works. I had problems with `std::fstream` but that's another thing.

14:34 <LiliumAtratum> right now I just moved all fstream-related code to OS thread

14:38 <jbjnr> hkaiser:

14:38 <jbjnr> oops

14:38 <zao> :D

14:39 <jbjnr> was going to say that the hpxrun changes are nice and all you are doing is making a simple PR into a major work by asking for ssh wrappers.

14:49 <hkaiser> jbjnr: fair point

14:49 <hkaiser> I'll add a note

14:53 <hkaiser> LiliumAtratum: good thinking

14:54 <hkaiser> LiliumAtratum: there is a similar facility, hpx::threads::run_as_os_thread(), that lets you schedule IO on a non-hpx thread from inside hpx

14:54 <hkaiser> it will run on one of the special helper threads HPX creates for its own needs

14:57 <weilewei> jbjnr so if we want to have libcds thread data in HPX similar to what Apex does, I mostly need something similar to HPX_SetupApex.cmake

14:57 <jbjnr> ok

14:57 <hkaiser> weilewei: you'll need that in any case, I think

14:58 <LiliumAtratum> hkaiser Oh, thanks, didn't realize that there is an opposite function like this!

14:59 <weilewei> hkaiser right.. I will start exploring it!

15:05 <jbjnr> weilewei: be aware that our APEX support uses a git external script. This has been superseded by the cmake fetchcontent stuff that should probably be used instead

15:06 <jbjnr> it does the same thing, but in a more widely used format (meaning other people know it)

15:06 <jbjnr> and it has more options and stuff

15:07 <weilewei> jbjnr ok, I will learn some cmake fetchcontent then

15:10 <jbjnr> @weil

15:10 <jbjnr> weilewei:

15:10 <jbjnr> grrr

15:10 <jbjnr> https://gist.github.com/biddisco/fcd2afedb74fed170bea7f7076134cb6

15:10 <jbjnr> there's a simple example I use in one of my projects to build hpx in a subdirectory

15:11 <jbjnr> (this worked before modularization, might be broken now), but you can just change the hpx to libcds and get the right url etc. then check what the options and flags are.

15:12 <weilewei> jbjnr that's a nice starting point! thanks

15:13 <jbjnr> just before the add_subdirectory call is where you would add stuff like set(libCDS_WITH_HPX ON) and set(libCDS_WITH_XXX OFF)

15:13 <jbjnr> and so on

15:14 <K-ballo> note that for those sets to work, a fairly recent cmake version is needed (3.13+?)

15:14 <weilewei> I see

15:14 <hkaiser> kale[m]: yah, we require 3.13 nowadays

15:14 <hkaiser> K-ballo: ^^

15:15 <weilewei> yea, I found 3.13 in https://github.com/STEllAR-GROUP/hpx/blob/master/tests/unit/build/fetchcontent/CMakeLists.txt

15:17 <hkaiser> ms[m]: I'd like to use hpx-docs.stellar-group.org for our docs

15:17 <hkaiser> (or similar)

15:17 LiliumAtratum has quit [Remote host closed the connection]

15:17 <hkaiser> I think github allows using custom domains, so this should be easy enough

15:19 <ms[m]> hkaiser: yeah, I'm in favour of that

15:19 <ms[m]> there's this: https://help.github.com/en/github/working-with-github-pages/managing-a-custom-domain-for-your-github-pages-site

15:19 <hkaiser> what should we use as the subdomain? hpx-docs?

15:20 <ms[m]> yeah, I think that could be good

15:21 <ms[m]> what I'd really like is hpx.stellar-group.org/docs but that might be harder to achieve

15:21 bita_ has joined #ste||ar

15:21 <ms[m]> it needs to have hpx and something to do with documentation in the name, so hpx-docs is as good as it gets I guess :)

15:21 <hkaiser> http://hpx-docs.stellar-group.org/

15:22 <hkaiser> now you need to change the forwarding ;-)

15:35 <ms[m]> hkaiser: nice, that was fast!

15:36 <ms[m]> you mean the links on the webpage or am I forgetting about something else?

15:47 <hkaiser> well, this somehow forwards to the start page using the github.io link

15:47 <hkaiser> ms[m]: ^^

15:48 <hkaiser> also the readme links, yes

16:01 Amy1 has quit [Ping timeout: 240 seconds]

16:04 Amy1 has joined #ste||ar

16:07 <ms[m]> hkaiser: ah yes, readme as well

16:08 <ms[m]> thanks for setting that up!

16:08 <hkaiser> np

16:49 Amy1 has quit [Ping timeout: 264 seconds]

16:52 Amy1 has joined #ste||ar

17:15 Amy1 has quit [Ping timeout: 246 seconds]

17:19 Amy1 has joined #ste||ar

17:29 rtohid has joined #ste||ar

17:35 kale[m] has quit [Ping timeout: 256 seconds]

17:35 kale[m] has joined #ste||ar

17:40 kale[m] has quit [Ping timeout: 256 seconds]

17:40 kale[m] has joined #ste||ar

17:57 kale[m] has quit [Ping timeout: 246 seconds]

17:58 kale[m] has joined #ste||ar

17:59 mcopik has joined #ste||ar

17:59 mcopik has quit [Client Quit]

18:07 kale[m] has quit [Ping timeout: 260 seconds]

18:11 kale[m] has joined #ste||ar

18:21 kale[m] has quit [Ping timeout: 260 seconds]

18:21 kale[m] has joined #ste||ar

18:37 kale[m] has quit [Ping timeout: 256 seconds]

18:37 kale[m] has joined #ste||ar

18:41 <parsa> dar hamin

19:04 kale[m] has quit [Ping timeout: 256 seconds]

19:05 kale[m] has joined #ste||ar

20:06 <weilewei> could we run into a cyclic build issue here? we build hpx with the libcds option enabled, then hpx does FetchContent_Declare for libcds, which in turn needs hpx for hpx threading support

20:20 <jbjnr> the way to do it is to do the fetchcontent of libcds inside the hpx cmake, but before add_subdirectory(libcds...) set libCDS_WITH_HPX to ON, and inside the libcds CMakeLists do a check to see if you are part of the hpx build.

20:21 <jbjnr> I think I have an example somewhere - I will look later - however, if libcds is pulled in after other modules, you can always do if(TARGET <some hpx target that is always created first>) - it doesn't have to be sophisticated

20:21 <weilewei> I see, let me check

20:23 <jbjnr> actually, APEX uses a special CMakeLists.hpx instead of CMakeLists.txt - you could do that - and set a flag to say "I'm in an hpx build"

20:23 <jbjnr> there are quite a few possibilities- even just setting libCDS_INSIDE_HPX before add_subdirectory should be enough

20:23 <jbjnr> then libcds can skip the find_package(HPX) and just use the hpx targets it needs

20:24 <jbjnr> the nice part of fetchcontent and add_subdirectory is that essentially the two projects become almost like one project and you can access vars from the first in the sub project

20:25 <jbjnr> especially targets - which have global scope

20:26 <weilewei> nice, I will take a look at how apex checks whether it is inside an hpx build

20:26 <jbjnr> what would be nice is if libcds was compiled as an hpx module - so that the libcds library became hpx_libcds - but you can leave that to me if you like

20:28 kale[m] has quit [Ping timeout: 256 seconds]

20:29 kale[m] has joined #ste||ar

21:20 <hkaiser> jbjnr: do you plan to fork libcds into the HPX repo? not sure if that's what we should do...

21:21 <jbjnr> not fork. checkout libcds into a subdir like we do with apex

21:21 <jbjnr> (but on an hpx branch)

21:21 <hkaiser> ok

21:22 <hkaiser> sounds good

21:59 <jbjnr> hkaiser: just pushed patch to make executors copyable in cuda branch. please check. bed time now

21:59 <hkaiser> jbjnr: ok, cool - thanks!

22:00 <hkaiser> that's important

22:04 <hkaiser> very interesting read: https://www.stroustrup.com/hopl20main-p5-p-bfc9cd4--final.pdf

22:09 <zao> Ooh, I was pondering just the other day if there's a whirlwind tour of all the language versions I've missed and what I should learn properly.

22:26 rtohid has left #ste||ar [#ste||ar]

22:48 <K-ballo> I've updated the reports: master https://gist.github.com/K-ballo/876eb506de2508d6846ed094f739c419, invoke https://gist.github.com/K-ballo/9a5bee727b076f5b46eb49922aeec83a

22:48 <K-ballo> with bimap gone, invoke related machinery jumps to positions 2, 3, 4, 5, 10 and 13

22:48 <K-ballo> and 9

22:49 <K-ballo> and 27

23:07 <hkaiser> very nice!

23:08 <hkaiser> so the first 50 or so are caused by spirit...

23:09 <hkaiser> and that even if we have only 3 or 4 TUs using spirit to begin with

23:09 <hkaiser> not good

23:12 <K-ballo> speaking of parsing.. seen the parse times for program options?

23:20 karame_ has quit [Remote host closed the connection]

23:25 karame_ has joined #ste||ar

23:36 <hkaiser> K-ballo: no

23:51 <K-ballo> the first gist has a second file with header parsing times