#ste||ar on 2020-05-29 — irc logs at irclog.cct.lsu.edu

2020-02-24 20:46 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020

00:00 <weilewei> @hkaiser I think that's open set of keys, if I understand correctly. It binds a unique id to thread local data...

00:48 nikunj97 has joined #ste||ar

01:20 nan111 has quit [Remote host closed the connection]

01:26 bita_ has joined #ste||ar

01:42 Yorlik has quit [Ping timeout: 246 seconds]

02:02 diehlpk_work has quit [Remote host closed the connection]

02:27 nikunj97 has quit [Quit: Leaving]

05:17 bita_ has quit [Ping timeout: 260 seconds]

06:47 jaafar has quit [Quit: Konversation terminated!]

10:20 tiagofg[m] has left #ste||ar ["Kicked by @appservice-irc:matrix.org : Idle for 30+ days"]

10:20 freifrau_von_ble has quit [Quit: Idle for 30+ days]

10:39 mcopik has joined #ste||ar

10:40 mcopik has quit [Client Quit]

10:50 Yorlik has joined #ste||ar

12:34 <ms[m]> rori: jbjnr hkaiser heller could you have a look at this please: https://github.com/STEllAR-GROUP/hpx/issues/4690? do we want anything else? (just edit the issue if so)

12:34 <ms[m]> I'll send a similar email with a link to the issue once you're happy with it

12:36 <heller1> ms[m]: hmm, It could be that some institutions might not want to share the application and what the application does?

12:36 <ms[m]> heller: but then they don't comment...?

12:37 <ms[m]> or do you mean we might want to know that some institution is using hpx but doesn't want to tell for what yet?

12:37 <ms[m]> that's a good point... do we care more about applications or users, or do we just ask for either

12:40 <heller1> right

12:40 <heller1> I want a list of institutions and/or applications

12:42 <ms[m]> heller: yep, gotcha

12:48 hkaiser has joined #ste||ar

12:52 <ms[m]> heller: rearranged it a bit

13:00 <gonidelis[m]> hkaiser: I reckon that's an automatic message there on my commit :q

13:00 <hkaiser> gonidelis[m]: no, I put it there just now

13:00 <hkaiser> ;-)

13:02 <gonidelis[m]> ahh... thanks a lot. I think we have the whole summer ahead of us so let's put that T-shirt Goal by the end of the period? (just to keep the motive :p)

13:04 <gonidelis[m]> Nevertheless, the feeling that part of my code is up there (on master) is honestly the biggest satisfaction I could get!!!

13:17 <ms[m]> hkaiser always sounds like a robot...

13:18 <ms[m]> gonidelis: congrats on getting that pr in! you're making really nice progress :D

13:23 <hkaiser> gonidelis[m]: I need to be able to get back to my office first anyways, might take a while...

13:26 <gonidelis[m]> ms[m]: thanks a lot!!!!

13:26 <gonidelis[m]> hkaiser: np... back to work! ;)

13:29 <Yorlik> Are task IDs guaranteed to be unique over the runtime of a program?

13:30 <ms[m]> Yorlik: no

13:30 <Yorlik> Just at a given time, I assume then?

13:30 <ms[m]> they get reused

13:30 <ms[m]> exactly

13:30 <Yorlik> OK. Makes sense.

13:31 <ms[m]> having two different tasks exist at the same time means they're not different tasks ;)

13:31 <Yorlik> Could it be, that such a reuse happens really quickly?

13:31 <Yorlik> Especially if there are many tasks.

13:31 <ms[m]> * having two different tasks exist at the same time with the same id means they're not different tasks ;)

13:32 <ms[m]> yes

13:32 <ms[m]> it depends mostly on how often tasks yield

13:32 <Yorlik> How likely is that?

13:32 <ms[m]> if most tasks don't yield and just run to completion the next task can reuse the same id immediately

13:32 <Yorlik> because I am running two lambdas at start and end of a task

13:32 <Yorlik> And they store certain data keyed with the id

13:33 <Yorlik> But I guess the lambdas are outside the task.

13:33 <ms[m]> how likely is yielding? depends on what your tasks do

13:33 <Yorlik> I yield a lot

13:33 <ms[m]> from not following your discussions here earlier I get the impression that you keep your tasks around for quite a long time

13:33 <ms[m]> right

13:33 <Yorlik> Not really

13:34 <Yorlik> A task is just a chunk in a parallel loop

13:34 <Yorlik> But I am using a single lua state over the entire task

13:34 <ms[m]> mmh, those tasks can be long lived...

13:35 <ms[m]> but in any case, not sure what the question is anymore

13:35 <Yorlik> Probably it's not related to my current issue, but I have a weird crash currently which might be a race

13:35 <Yorlik> It seems not to happen if I have a really long break between my frame updates

13:36 <Yorlik> But if I run my frames immediately after the previous frame has stopped it crashes.

13:36 <Yorlik> And the errors seem to come deeply out of the Lua call stack.

13:36 <Yorlik> So I guess there might be something that makes a state being reused in another thread

13:37 <Yorlik> I have to dig deeper - probably I just stupidly shot myself in the foot somewhere by overlooking something.

13:38 diehlpk_work has joined #ste||ar

13:39 <hkaiser> Yorlik: the start and end lambdas are run by the same task, definitely

13:40 <hkaiser> that's the whole point of having them

13:41 <hkaiser> a task id may be reused only after the task ran to completion (was terminated), it will never be reused while a task is just suspended
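
A minimal sketch of what this reuse rule means for data keyed by task id, as in Yorlik's start/end lambdas. It does not use HPX: `task_id`, `task_data_registry`, `on_start` and `on_end` are hypothetical names made up for illustration. The assumption is that the end hook always runs before the id can be handed to another task, so erasing the entry there is enough to keep a later task from seeing stale data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a runtime task id; not an HPX type.
using task_id = std::uintptr_t;

// Per-task data keyed by id. An id may be given to a new task once the
// old task has terminated, so the end hook has to erase the entry.
class task_data_registry {
public:
    // Intended to be called from the hook that runs when a task starts.
    void on_start(task_id id, std::string data) {
        std::lock_guard<std::mutex> lock(mutex_);
        bool inserted = entries_.emplace(id, std::move(data)).second;
        // If this fires, an earlier task with the same id never cleaned up.
        assert(inserted && "stale entry for a reused task id");
        (void) inserted;
    }

    // Intended to be called from the hook that runs when a task ends.
    void on_end(task_id id) {
        std::lock_guard<std::mutex> lock(mutex_);
        entries_.erase(id);
    }

    bool contains(task_id id) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.count(id) != 0;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<task_id, std::string> entries_;
};
```

With this shape, `on_start(42, ...)`, `on_end(42)`, `on_start(42, ...)` is a valid sequence: the second task simply gets a fresh entry under the reused id.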

13:42 <Yorlik> That's good to know.

13:43 <Yorlik> I believe the Lua State crashes are just the symptom - I'm making an error somewhere. But I don't know yet.

13:44 <Yorlik> What's strange is, that a really long break between the frames makes it effectively go away (100ms)

13:44 <Yorlik> with 25 ms or 50 ms breaks it just happens more rarely

13:44 <Yorlik> Might be related to my timed messages

13:52 <hkaiser> gonidelis[m], heller1, rori: do we have the GSoC meeting now (in 10 mins)?

13:53 <gonidelis[m]> yes

13:56 <rori> 👍️

14:00 <hkaiser> rori, heller1: https://lsu.zoom.us/j/92781473639

14:27 K-ballo has quit [Quit: K-ballo]

14:27 K-ballo has joined #ste||ar

14:35 nan11 has joined #ste||ar

14:45 <diehlpk_work> mdiers[m], Do we have a page with the references in the documentation?

14:45 <diehlpk_work> ms[m], Not mdiers[m]

14:46 <diehlpk_work> I have seen it for some other projects and they had papers listed for all modules

14:46 <hkaiser> ms[m]: our web-pages are currently down

14:46 <ms[m]> diehlpk_work: which references?

14:46 <ms[m]> papers etc.?

14:46 <ms[m]> if yes, no

14:46 <hkaiser> lol

14:46 <hkaiser> if no, yes

14:47 <ms[m]> hpx.stellar-group.org actually goes to crochetcutedolls.com now :P

14:48 <gonidelis[m]> !!

14:51 <hkaiser> YES

14:51 <hkaiser> secret agenda

14:52 <hkaiser> ms[m]: bad news is that we might have lost all of the site

14:52 <hkaiser> they are trying to restore things, let's hope they have some newer backup

14:53 <ms[m]> hkaiser: uh-oh... fingers crossed that there's some trace of the new website there

14:53 <hkaiser> yes

14:53 <ms[m]> or it's a sign that it's time for all of us to change field

14:54 <hkaiser> going into the backup business?

14:55 <ms[m]> well, there are many possible interpretations...

15:11 akheir has joined #ste||ar

15:13 akheir has quit [Client Quit]

15:21 <K-ballo> trying the msvc build insights, results look interesting

15:22 <K-ballo> instantiating bimaps (69 times) takes over 26seconds, presumably

15:22 <K-ballo> time to experiment with our own bimap

15:40 <ms[m]> K-ballo: ouch... is that the worst one? do you have some sort of report you can share or is it all in msvc?

15:41 <K-ballo> it's a windows performance analyzer report

15:41 <K-ballo> and it's 1GB....

15:43 <ms[m]> so no :P

15:44 <K-ballo> bimap is the worst aggregated time, yes, if I'm reading this correctly

15:45 <K-ballo> the report compresses to ~70mb, the fact that it was 1gb to the byte was suspicious

16:01 <jbjnr> grrrrr. my executor works fine for async, but I get template deduction failed for apply

16:14 <hkaiser> K-ballo: bimap... hmm We use it in one spot, AGAS

16:14 <hkaiser> there might be a way to replace it

16:15 <hkaiser> it's ancient code, never has been touched for years

16:17 * jbjnr sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/vzqqkIWortHQOSdlKPchzeHf >

16:18 <hkaiser> jbjnr: need to see the actual compiler error...

16:18 * jbjnr sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/wTTYMEWTvIEsCHgOuWhLWcYN >

16:18 <jbjnr> plus the usual ten pages of other stuff.

16:19 <hkaiser> I need those too

16:19 <jbjnr> nah

16:19 <jbjnr> never mind

16:19 <jbjnr> I'll keep looking at it.

16:19 <jbjnr> what's odd is that hpx::async(...) works, but hpx::apply(...) doesn't

16:20 <hkaiser> sounds weird

16:20 <jbjnr> I'm converting the cuda helper code to an executor and adding a new cuda event polling function like the mpi one

16:23 <Yorlik> hkaiser: We solved the race - I did a very bad thing here, but not so bad as i thought..

16:24 <jbjnr> hkaiser:

16:24 <jbjnr> /home/biddisco/src/hpx-branches/cuda-futures/libs/executors/include/hpx/executors/apply.hpp:54: error: no type named ‘type’ in ‘struct std::enable_if<false, bool>’

16:24 <hkaiser> Yorlik: good

16:24 <jbjnr> deferred_invokable seems to think "no"

16:25 <hkaiser> jbjnr: could be

16:25 <hkaiser> async uses decltype(auto) nowadays, that could be the difference

16:25 <Yorlik> I had to introduce a lock at two places where i thought I'd not need one. But it's not performance critical - but it adds a lock at every task creation and release

16:26 <Yorlik> I might find a better solution later for it. It was related to the way how I allocate and release Lua states
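
One way the locking described here could look, as a sketch only. Yorlik's actual allocator is not shown in the log, so `script_state` and `state_pool` are hypothetical; `script_state` stands in for whatever wraps a Lua state. The lock is taken once when a task acquires a state and once when it releases it, which matches the cost mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for an interpreter state (e.g. a Lua state wrapper).
struct script_state {
    int uses = 0;
};

// Pool of idle states. Ownership moves out of the pool on acquire and back
// in on release, so two tasks cannot hold the same state at the same time.
class state_pool {
public:
    // Take a state for the lifetime of one task; create one if none is idle.
    std::unique_ptr<script_state> acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty())
            return std::make_unique<script_state>();
        std::unique_ptr<script_state> s = std::move(free_.back());
        free_.pop_back();
        return s;
    }

    // Hand the state back when the task ends.
    void release(std::unique_ptr<script_state> s) {
        assert(s != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(std::move(s));
    }

    // Number of states currently waiting in the pool.
    std::size_t idle() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<script_state>> free_;
};
```

A per-worker-thread pool would avoid the shared lock, but only if a task is guaranteed to release on the same thread it acquired on, which suspended tasks may not do.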

16:26 <jbjnr> I tried using decltype(auto) for the post, but since it returns void anyway ....

16:26 <hkaiser> yah, it's the facilities used by post

16:26 <jbjnr> aha. you mean inside the apply code

16:26 <hkaiser> right

16:26 <hkaiser> comment out the enable_if and see where it breaks, that will give you better understanding

16:26 <jbjnr> but other executors use the same signature without problem. it's not right

16:27 <jbjnr> k

16:27 <hkaiser> clang is usually good at telling you why things have been SFINAE'd out

16:30 bita_ has joined #ste||ar

16:31 <jbjnr> hmm. execution.hpp non void function not returning anything, changing it to return post.... seems to fix it. I will experiment some more

16:32 <K-ballo> that shouldn't have fixed it

16:35 <jbjnr> correct. the right fix is remove the return type deduction completely, it should just be a void function

16:36 <jbjnr> getting rid of the auto now and retesting

16:37 <jbjnr> grrrr...

16:39 <jbjnr> aha. I see the problem again - it's because it wants to call to the cublas function, but it is missing a parameter, because the executor fills one in for it. Same problem I had with the mpi executor originally

16:39 <jbjnr> what did you do to fix that hkaiser

16:39 <jbjnr> the deferred_invoke return type is invalid because not all the params are present.

16:48 * jbjnr sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/VZwbHjtEeQnTReZJhRmrkuqW >

16:48 <jbjnr> yup. async just uses decltype(auto)

16:49 <jbjnr> I'll change apply to do the same and get rid of the enable if

16:50 <K-ballo> that sounds not right.. is the enable_if sfinae not wanted?

16:50 <jbjnr> ok, that fixes the compilation problem, but is it the right thing to do?

16:50 <jbjnr> ^^^messages crossed

16:50 <jbjnr> I guess it just shifts the compilation error elsewhere if there is a real problem

16:51 <K-ballo> if we didn't want sfinae in it we shouldn't have used sfinae in it, the question is whether the sfinae was intended or an artifact

16:55 <jbjnr> indeed. since hkaiser has removed it from async, I guess we don't need it any more and I can remove it from apply now and rely on return type deduction of the final layer of the onion

16:56 <K-ballo> return type deduction doesn't sfinae

16:57 <hkaiser> K-ballo: I removed it as it was over-constraining things

16:57 <hkaiser> the underlying facilities constrain things sufficiently

16:57 <K-ballo> so I gather we never actually wanted sfinae in there

16:57 <hkaiser> it would have moved possible errors up the instantiation stack

16:58 <jbjnr> great. I'm happy then. that was what got in the way of my mpi first attempt. glad I understand it now

16:59 <K-ballo> those should have been static_asserts then

16:59 <K-ballo> sfinae is terrible for aiding diagnostics
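
A small C++14 sketch of the contrast K-ballo is drawing. None of this is HPX code: `is_callable_with`, `post_sfinae`, `post_checked` and `demo` are hypothetical names. With the `enable_if` variant a bad call removes the overload, so the error is a generic "no matching function"; with the `static_assert` variant the function is always selected and the failure carries a message written by the author.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace detail {
    // Chosen when f(ts...) is a well-formed expression.
    template <typename F, typename... Ts>
    auto try_call(int)
        -> decltype(void(std::declval<F>()(std::declval<Ts>()...)),
                    std::true_type{});

    // Fallback when the call expression is ill-formed.
    template <typename F, typename... Ts>
    std::false_type try_call(...);
}

// Hypothetical trait: true if an F can be called with arguments Ts...
template <typename F, typename... Ts>
struct is_callable_with : decltype(detail::try_call<F, Ts...>(0)) {};

// Variant A: constrained via SFINAE. A mismatching call silently drops
// this overload from the candidate set.
template <typename F, typename... Ts,
    typename = std::enable_if_t<is_callable_with<F, Ts...>::value>>
void post_sfinae(F&& f, Ts&&... ts) {
    std::forward<F>(f)(std::forward<Ts>(ts)...);
}

// Variant B: unconstrained, checked inside. A mismatching call is a hard
// error that names the actual problem.
template <typename F, typename... Ts>
void post_checked(F&& f, Ts&&... ts) {
    static_assert(is_callable_with<F, Ts...>::value,
        "post_checked: f is not callable with the given arguments");
    std::forward<F>(f)(std::forward<Ts>(ts)...);
}

// Usage: both variants accept a matching call.
inline int demo() {
    int sum = 0;
    auto add = [&sum](int a, int b) { sum += a + b; };
    post_sfinae(add, 1, 2);
    post_checked(add, 3, 4);
    assert(sum == 10);
    return sum;
}
```

SFINAE remains the right tool when overload selection is genuinely intended; the `static_assert` form fits the case discussed here, where the constraint existed only to reject bad calls.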

17:01 <jbjnr> :+1

17:27 <hkaiser> we know that today, but we (at least I) didn't know that when we implemented that

17:28 <Yorlik> Can't you replace a lot of sfinae with if constexpr ?

17:30 <heller1> Yes

17:30 <heller1> C++17

17:30 <Yorlik> Argh - forgot hpx is c++14 - right?

17:31 <Yorlik> Or is it 11?

17:34 <K-ballo> similarly, we shouldn't be replacing a lot of sfinae with if constexpr, we should be replacing tag dispatching with if constexpr

17:43 jaafar has joined #ste||ar

17:54 <jbjnr> Yorlik: 14 minimum now

17:55 <Yorlik> All right. Thanks.

18:02 sayefsakin has joined #ste||ar

19:30 kale_ has joined #ste||ar

19:30 kale_ has quit [Client Quit]

19:35 kale_ has joined #ste||ar

19:36 kale[m] has joined #ste||ar

19:36 kale_ has quit [Client Quit]

20:20 parsa has quit [Remote host closed the connection]

20:24 parsa has joined #ste||ar

20:25 parsa has quit [Remote host closed the connection]

20:29 parsa has joined #ste||ar

20:29 parsa has quit [Remote host closed the connection]

20:33 parsa has joined #ste||ar

21:02 hkaiser has quit [Quit: bye]

21:10 <bita_> K-ballo, can I ask a question?

21:10 <K-ballo> bita_: you may try

21:10 <bita_> I am trying to write a distributed version of csv_read. If a locality wants to start reading lines from a specific line, is there an option better than std::getline? I am looking into seekg, but haven't found the best option

21:13 <K-ballo> you don't know where the line ends and the next one starts without scanning the entire line

21:13 <K-ballo> you could seek, then scan the remainder of the line for the end, then start right after that?

21:13 <K-ballo> so seekg + getline and ignore

21:14 <K-ballo> but you can't seek to a specific line, unless you've scanned the entire file and already know where each one starts
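
A sketch of the "seek, skip the partial line, then getline" idea for splitting a file by byte ranges rather than line numbers. The function name `read_lines_in_range` and the convention that a chunk owns every line *starting* inside its byte range are assumptions for illustration, not something from the csv_read code. It also assumes no quoted CSV fields contain newlines, and that a real file would be opened in binary mode so byte offsets are reliable.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <limits>
#include <sstream>    // handy for trying the function on in-memory data
#include <string>
#include <vector>

// Return every line whose first byte lies in [begin, end).
// A line that starts before `end` is read in full, even if it runs past it.
std::vector<std::string> read_lines_in_range(
    std::istream& in, std::streamoff begin, std::streamoff end)
{
    assert(begin >= 0 && begin <= end);
    std::vector<std::string> lines;

    in.clear();    // allow reuse of a stream that previously hit EOF
    if (begin > 0) {
        // Step back one byte. If that byte is '\n', `begin` is a line start
        // and ignore() consumes just the newline. Otherwise ignore() drops
        // the rest of a line that belongs to the previous chunk.
        in.seekg(begin - 1);
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    } else {
        in.seekg(0);
    }

    std::string line;
    while (true) {
        std::streamoff pos = in.tellg();    // -1 once the stream has failed
        if (pos < 0 || pos >= end)
            break;
        if (!std::getline(in, line))
            break;
        lines.push_back(line);
    }
    return lines;
}
```

Each locality could call this with its own `[begin, end)` slice of the file size; because ownership is decided by where a line starts, adjacent slices neither drop nor duplicate lines.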

21:14 <bita_> got it, thanks

21:14 <bita_> uhum

21:28 bita_ has quit [Quit: Leaving]

22:14 hkaiser has joined #ste||ar

23:09 kale[m] has quit [Ping timeout: 264 seconds]

23:10 kale[m] has joined #ste||ar