aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<parsa[w]>
?
<hkaiser>
parsa[w]: why don't you just rebase your branch on top of the node_data_refactoring one?
<hkaiser>
that way you can avoid conflicts once it's merged
<hkaiser>
parsa[w]: I'll commit what I have, but csv is still broken - needs more work
<hkaiser>
have to go now for a while
<hkaiser>
parsa: file_write_csv is fixed, file_read_csv needs changes...
<hkaiser>
parsa[w]: everything works now
<jbjnr>
who is responsible for "background_thread = create_background_thread" - is it heller?
<heller>
jbjnr: yes
<jbjnr>
a "num_thread" param gets passed in. Is that critical?
<heller>
jbjnr: that's used within the parcelport to schedule new work
<jbjnr>
I mean - does the background thread task have to get assigned to a particular queue? Do you create one per queue, and is that fixed in stone?
<jbjnr>
or can I modify this a bit?
<jbjnr>
I do not like it Sam I am.
<heller>
ok
<heller>
what do you have in mind?
<jbjnr>
are these background tasks the ones you added to fix direct actions etc?
<heller>
currently, a background thread is running as part of the scheduling of a specific core (that's num_thread)
<heller>
yes
<jbjnr>
I do not like them because they use the interface I'm hijacking for the numa scheduling
<jbjnr>
how does a direct action get transferred to the background thread? only if it suspends? or something of that kind?
<heller>
err
<heller>
the background thread receives the parcel
<heller>
if it contains a direct action, it is directly called within the background thread
<heller>
if not, a new one gets scheduled
<jbjnr>
in the scheduling loop - we used to call the parcelport directly from the loop. Does that not happen the same way now?
<heller>
more or less
<heller>
the difference now is that the background thread is running in its own context
<jbjnr>
the scheduling loop calls background work, which checks the network; a parcel is decoded; if it is direct, how does it get run on the background thread?
<heller>
and whenever it suspends, we let it loose to participate in the regular scheduling business
<heller>
it calls the thread function directly
<jbjnr>
which thread function
<heller>
the one of the action
<heller>
if it is not a direct action, the thread function is being scheduled
<jbjnr>
grrr. I mean that the parcel is decoded on the scheduling loop thread, then run on it (still an OS thread) - how and when does it get transferred to the background thread?
<heller>
it is not
<jbjnr>
ok, so that was changed
<heller>
the parcel decoding is already happening in the background thread
<heller>
yes
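(A rough sketch of the dispatch heller describes - decode in the background thread, call direct actions inline, schedule everything else - using plain C++ and hypothetical names (parcel, handle_parcel, a single queue); this is an illustration, not HPX's actual internals:)

```cpp
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>

// Hypothetical stand-ins for HPX internals, for illustration only.
struct parcel {
    bool is_direct_action;           // direct actions run inline
    std::function<void()> thread_fn; // the action's thread function
};

std::queue<std::function<void()>> scheduler_queue; // per-core queue, simplified
std::mutex queue_mtx;

// What the background thread does with each decoded parcel:
void handle_parcel(parcel p)
{
    if (p.is_direct_action) {
        // direct action: call the thread function right here,
        // inside the background thread's own context
        p.thread_fn();
    } else {
        // otherwise: hand the thread function to the scheduler
        std::lock_guard<std::mutex> lk(queue_mtx);
        scheduler_queue.push(std::move(p.thread_fn));
    }
}

int main()
{
    handle_parcel({true,  [] { std::cout << "direct action, run inline\n"; }});
    handle_parcel({false, [] { std::cout << "scheduled action\n"; }});

    // drain the queue, standing in for the scheduling loop
    while (!scheduler_queue.empty()) {
        scheduler_queue.front()();
        scheduler_queue.pop();
    }
}
```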
<jbjnr>
hmm.
<jbjnr>
ok. I will have to make some changes to my scheduler handling and break some stuff.
<heller>
how so?
<heller>
the background thread doesn't expose any public API, does it?
<jbjnr>
the only place in the code that uses the thread_num when scheduling tasks is that background thread stuff.
<jbjnr>
all the rest of the time, the thread num is just -1
<heller>
I'm missing the piece connecting the NUMA-sensitive stuff to the scheduling loop
<jbjnr>
so I was using it for my numa hints
<heller>
ahhh
<heller>
ok
<jbjnr>
but if it is being legitimately used, then I will have to do some extra specialized call
<heller>
now i get it
<jbjnr>
not a big deal, but I didn't want to change all 6 schedulers
<heller>
no, please knock yourself out
<heller>
well
<heller>
you don't have to
<heller>
depends on how intrusive your change will be though
<heller>
the thread num used to schedule a task is set via the thread_init_data
<heller>
but that's really only relevant when you actually schedule threads
<heller>
the background thread isn't really scheduled in that sense
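(A sketch of the disambiguation jbjnr ends up needing: rather than overloading one integer - a literal thread index for the background thread, -1 for "don't care", and now a NUMA hint - a small tagged hint type could say which meaning is intended. All names below are hypothetical, not HPX's thread_init_data API:)

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical tagged hint: "pin to thread N" vs. "prefer NUMA domain D"
// vs. "scheduler's free choice", instead of one overloaded integer.
enum class hint_mode : std::uint8_t { none, thread, numa };

struct schedule_hint {
    hint_mode mode = hint_mode::none;
    std::int16_t value = -1;   // thread index or NUMA domain, per mode
};

// Stand-in for the scheduler's queue-selection decision.
int pick_queue(schedule_hint h, int num_queues, int queues_per_domain)
{
    switch (h.mode) {
    case hint_mode::thread:
        return h.value % num_queues;                        // pin to that queue
    case hint_mode::numa:
        return (h.value * queues_per_domain) % num_queues;  // first queue in domain
    default:
        return 0;                                           // free choice
    }
}

int main()
{
    std::cout << pick_queue({hint_mode::thread, 5}, 8, 4) << "\n"; // 5
    std::cout << pick_queue({hint_mode::numa, 1}, 8, 4) << "\n";   // 4
    std::cout << pick_queue({}, 8, 4) << "\n";                     // 0
}
```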
<jbjnr>
K-ballo: yt? I have a dataflow question you might be able to answer - or heller?
<jbjnr>
ISO meetings, no doubt ... I will wait and experiment
<aserio>
hkaiser: yt?
<hkaiser>
aserio: here
<aserio>
see pm
<hkaiser>
jbjnr: there is the issue of that libfabrics BoF next Tuesday... what would you like me to talk about there?
<jbjnr>
I'll do slides tomorrow and send them
<jbjnr>
then you can skype me and I'll talk you through the contents
<hkaiser>
jbjnr: thanks - 3-5 minutes, so no more than 3-4 slides
<jbjnr>
not 10 mins?
<jbjnr>
that's what you said before?
<hkaiser>
well, I talk slowly
<hkaiser>
;)
<hkaiser>
3-4 slides should be ok
<hkaiser>
jbjnr: ok, looking at your code
<hkaiser>
jbjnr: what's your question?
<jbjnr>
for each of these dataflow operations, I see what looks like two tasks.
<jbjnr>
the inner apply creates a task + future, which is returned by the inner callable object and then passed back to the dataflow wrapper
<jbjnr>
does the dataflow wrapper create another task?
<hkaiser>
jaafar: you could pass the executor directly to dataflow
<hkaiser>
jbjnr: ^^
<hkaiser>
dataflow(exec, f, args...)
<jbjnr>
what does that change?
<jbjnr>
hmmm
<hkaiser>
jbjnr: it means that no additional task is created; f is directly executed by exec
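(In code, the form hkaiser suggests might look like this - a sketch assuming a recent HPX; executor namespaces and headers have moved between versions:)

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/execution.hpp>
#include <hpx/future.hpp>

#include <iostream>
#include <utility>

int main()
{
    hpx::execution::parallel_executor exec;

    hpx::future<int> a = hpx::async([] { return 40; });
    hpx::future<int> b = hpx::async([] { return 2; });

    // dataflow(exec, f, args...): once a and b are ready, f is handed
    // straight to exec - no extra intermediate task is created.
    hpx::future<int> r = hpx::dataflow(
        exec,
        [](hpx::future<int> x, hpx::future<int> y) { return x.get() + y.get(); },
        std::move(a), std::move(b));

    std::cout << r.get() << "\n";  // prints 42
}
```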
<jbjnr>
ok, thanks, that's what I will try then
<hkaiser>
jbjnr: and it simplifies your code
<jbjnr>
good
<hkaiser>
you don't need an additional wrapper anymore
<jbjnr>
thank you. I don't see where the dataflow will forward to, but I'll try it and see what it does
<hkaiser>
ahh, no - the wrapper is still needed
<hkaiser>
forget what I said
<jbjnr>
ok
<hkaiser>
then do dataflow(launch::sync, ...)
<jbjnr>
I need the wrapper, to get the hint function called after the futures are unwrapped
<hkaiser>
yes, understand
<jbjnr>
interesting, I was thinking about sync. I'll experiment there then
<jbjnr>
that might be the wrong way around though
<hkaiser>
sync just means that the function will not be launched on a new thread
<jbjnr>
I need the NUMA hint function to be evaluated and then the task scheduled on that core/NUMA node - if I put sync in the wrapper - instead of the contents ...
<hkaiser>
it does not mean that the dataflow operation is synchronous
<hkaiser>
try it, I think it will do the right thing
<jbjnr>
yes. you may be right, the unwrapping happens first, before sync is in play
<hkaiser>
yes
<jbjnr>
no
<hkaiser>
lol
<jbjnr>
it will launch the wrapped function on the thread of the predecessor becoming ready
<jbjnr>
that's bad
<hkaiser>
why?
<hkaiser>
does it matter which thread schedules things?
<jbjnr>
because the predecessor holds the args that we use to call the NUMA function - and then launch the REAL task on that NUMA node
<hkaiser>
sure
<jbjnr>
aha
<jbjnr>
you're right
<hkaiser>
but sync will only influence the thread that decides where to run things eventually
<jbjnr>
the subtask will run on the predecessor task, then the real task runs on the new apply
<hkaiser>
nod
<jbjnr>
great.
<jbjnr>
I'll try it
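(A sketch of the pattern they converge on: a launch::sync dataflow wrapper runs inline on the thread whose predecessor became ready, reads the unwrapped arguments to compute the placement hint, and only then spawns the REAL task. numa_domain_for is a hypothetical stand-in for jbjnr's hint function, and plain async stands in for the NUMA-targeted executor; headers assume a recent HPX:)

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/future.hpp>

#include <iostream>
#include <utility>
#include <vector>

// Hypothetical hint function: map the data a task touches to a NUMA
// domain. A real version would consult allocator/topology info.
int numa_domain_for(std::vector<double> const& block)
{
    return block.empty() ? 0 : static_cast<int>(block.front()) % 2;
}

int main()
{
    hpx::future<std::vector<double>> pred =
        hpx::async([] { return std::vector<double>{3.0, 1.0, 2.0}; });

    // launch::sync: the wrapper spawns no new HPX thread - it runs on
    // whichever thread made `pred` ready, after unwrapping. Only the
    // REAL task below gets scheduled. The returned inner future is
    // collapsed by hpx::future's unwrapping constructor.
    hpx::future<double> result = hpx::dataflow(
        hpx::launch::sync,
        [](hpx::future<std::vector<double>> f) {
            std::vector<double> data = f.get();
            int domain = numa_domain_for(data);  // hint from the ready args
            (void) domain;  // would select the NUMA-targeted executor
            return hpx::async([data = std::move(data)] {
                double sum = 0.0;
                for (double x : data) sum += x;
                return sum;
            });
        },
        std::move(pred));

    std::cout << result.get() << "\n";  // prints 6
}
```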
<jbjnr>
hkaiser: it works :)
<jbjnr>
thanks a bunch. now all is clean and I am happy that the bonus task has gone