aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<parsa[w]>
?
<hkaiser>
parsa[w]: why don't you just rebase your branch on top of the node_data_refactoring one?
<hkaiser>
that way you can avoid conflicts once it's merged
<hkaiser>
parsa[w]: I'll commit what I have, but csv is still broken - needs more work
<hkaiser>
have to go now for a while
<hkaiser>
parsa: file_write_csv is fixed, file_read_csv needs changes...
<hkaiser>
parsa[w]: everything works now
<jbjnr>
who is responsible for "background_thread = create_background_thread" - is it heller?
<heller>
jbjnr: yes
<jbjnr>
a "num_thread" param gets passed in. Is that critical?
<heller>
jbjnr: that's used within the parcelport to schedule new work
<jbjnr>
I mean - does the background thread task have to get assigned to a particular queue? Do you create one per queue, and is that fixed in stone?
<jbjnr>
or can I modify this a bit?
<jbjnr>
I do not like it Sam I am.
<heller>
ok
<heller>
what do you have in mind?
<jbjnr>
are these background tasks the ones you added to fix direct actions etc?
<heller>
currently, a background thread is running as part of the scheduling of a specific core (that's num_thread)
<heller>
yes
<jbjnr>
I do not like them because they use the interface I'm hijacking for the numa scheduling
<jbjnr>
how does a direct action get transferred to the background thread? only if it suspends? or something of that kind?
<heller>
err
<heller>
the background thread receives the parcel
<heller>
if it contains a direct action, it is directly called within the background thread
<heller>
if not, a new one gets scheduled
<jbjnr>
in the scheduling loop - we used to call the parcelport directly from the loop. Does that not happen the same way now?
<heller>
more or less
<heller>
the difference now is that the background thread is running in its own context
<jbjnr>
the scheduling loop calls background work, which checks the network; a parcel is decoded; if it is direct, how does it get run on the background thread?
<heller>
and whenever it suspends, we let it loose to participate in the regular scheduling business
<heller>
it calls the thread function directly
<jbjnr>
which thread function
<heller>
the one of the action
<heller>
if it is not a direct action, the thread function is being scheduled
<jbjnr>
grrr. I mean that the parcel is decoded on the scheduling loop thread, then run on it (still an OS thread) - how and when does it get transferred to the background thread?
<heller>
it is not
<jbjnr>
ok, so that was changed
<heller>
the parcel decoding is already happening in the background thread
<heller>
yes
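(A rough sketch of the dispatch heller describes - decode in the background thread, call direct actions inline, schedule everything else - using plain C++ and hypothetical names (parcel, handle_parcel, a single queue); this is an illustration, not HPX's actual internals:)

```cpp
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>

// Hypothetical stand-ins for HPX internals, for illustration only.
struct parcel {
    bool is_direct_action;           // direct actions run inline
    std::function<void()> thread_fn; // the action's thread function
};

std::queue<std::function<void()>> scheduler_queue; // per-core queue, simplified
std::mutex queue_mtx;

// What the background thread does with each decoded parcel:
void handle_parcel(parcel p)
{
    if (p.is_direct_action) {
        // direct action: call the thread function right here,
        // inside the background thread's own context
        p.thread_fn();
    } else {
        // otherwise: hand the thread function to the scheduler
        std::lock_guard<std::mutex> lk(queue_mtx);
        scheduler_queue.push(std::move(p.thread_fn));
    }
}

int main()
{
    handle_parcel({true,  [] { std::cout << "direct action, run inline\n"; }});
    handle_parcel({false, [] { std::cout << "scheduled action\n"; }});

    // drain the queue, standing in for the scheduling loop
    while (!scheduler_queue.empty()) {
        scheduler_queue.front()();
        scheduler_queue.pop();
    }
}
```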
<jbjnr>
hmm.
<jbjnr>
ok. I will have to make some changes to my scheduler handling and break some stuff.
<heller>
how so?
<heller>
the background thread doesn't expose any public API, does it?
<jbjnr>
the only place in the code that uses the thread_num when scheduling tasks is that background thread stuff.
<jbjnr>
all the rest of the time, the thread num is just -1
<heller>
I'm missing the piece connecting the NUMA-sensitive stuff to the scheduling loop
<jbjnr>
so I was using it for my numa hints
<heller>
ahhh
<heller>
ok
<jbjnr>
but if it is being legitimately used, then I will have to do some extra specialized call
<heller>
now i get it
<jbjnr>
not a big deal, but I didn't want to change all 6 schedulers
<heller>
no, please knock yourself out
<heller>
well
<heller>
you don't have to
<heller>
depends on how intrusive your change will be though
<heller>
the thread num used to schedule a task is set via the thread_init_data
<heller>
but that's really only relevant when you actually schedule threads
<heller>
the background thread isn't really scheduled in that sense
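(A sketch of the disambiguation jbjnr ends up needing: rather than overloading one integer - a literal thread index for the background thread, -1 for "don't care", and now a NUMA hint - a small tagged hint type could say which meaning is intended. All names below are hypothetical, not HPX's thread_init_data API:)

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical tagged hint: "pin to thread N" vs. "prefer NUMA domain D"
// vs. "scheduler's free choice", instead of one overloaded integer.
enum class hint_mode : std::uint8_t { none, thread, numa };

struct schedule_hint {
    hint_mode mode = hint_mode::none;
    std::int16_t value = -1;   // thread index or NUMA domain, per mode
};

// Stand-in for the scheduler's queue-selection decision.
int pick_queue(schedule_hint h, int num_queues, int queues_per_domain)
{
    switch (h.mode) {
    case hint_mode::thread:
        return h.value % num_queues;                        // pin to that queue
    case hint_mode::numa:
        return (h.value * queues_per_domain) % num_queues;  // first queue in domain
    default:
        return 0;                                           // free choice
    }
}

int main()
{
    std::cout << pick_queue({hint_mode::thread, 5}, 8, 4) << "\n"; // 5
    std::cout << pick_queue({hint_mode::numa, 1}, 8, 4) << "\n";   // 4
    std::cout << pick_queue({}, 8, 4) << "\n";                     // 0
}
```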
<jbjnr>
K-ballo: yt? I have a dataflow question you might be able to answer - or heller?
<jbjnr>
ISO meetings, no doubt ... I will wait and experiment
<aserio>
hkaiser: yt?
<hkaiser>
aserio: here
<aserio>
see pm
<hkaiser>
jbjnr: there is the issue of that libfabrics BoF next Tuesday... what would you like me to talk about there?
<jbjnr>
I'll do slides tomorrow and send them
<jbjnr>
then you can skype me and I'll talk you through the contents
<hkaiser>
jbjnr: thanks - 3-5 minutes, so no more than 3-4 slides
<jbjnr>
not 10 mins?
<jbjnr>
that's what you said before?
<hkaiser>
well, I talk slowly
<hkaiser>
;)
<hkaiser>
3-4 slides should be ok
<hkaiser>
jbjnr: ok, looking at your code
<hkaiser>
jbjnr: what's your question?
<jbjnr>
for each of these dataflow operations, I see what looks like two tasks.
<jbjnr>
the inner apply creates a task + future, which is returned by the inner callable object and then passed back to the dataflow wrapper
<jbjnr>
does the dataflow wrapper create another task?
<hkaiser>
jaafar: you could pass the executor directly to dataflow
<hkaiser>
jbjnr: ^^
<hkaiser>
dataflow(exec, f, args...)
<jbjnr>
what does that change?
<jbjnr>
hmmm
<hkaiser>
jbjnr: it means that no additional task is created; f is directly executed by exec
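(In code, the form hkaiser suggests might look like this - a sketch assuming a recent HPX; executor namespaces and headers have moved between versions:)

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/execution.hpp>
#include <hpx/future.hpp>

#include <iostream>
#include <utility>

int main()
{
    hpx::execution::parallel_executor exec;

    hpx::future<int> a = hpx::async([] { return 40; });
    hpx::future<int> b = hpx::async([] { return 2; });

    // dataflow(exec, f, args...): once a and b are ready, f is handed
    // straight to exec - no extra intermediate task is created.
    hpx::future<int> r = hpx::dataflow(
        exec,
        [](hpx::future<int> x, hpx::future<int> y) { return x.get() + y.get(); },
        std::move(a), std::move(b));

    std::cout << r.get() << "\n";  // prints 42
}
```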
<jbjnr>
ok, thanks, that's what I will try then
<hkaiser>
jbjnr: and it simplifies your code
<jbjnr>
good
<hkaiser>
you don't need an additional wrapper anymore
<jbjnr>
thank you. I don't see where the dataflow will forward to, but I'll try it and see what it does
<hkaiser>
ahh, no - the wrapper is still needed
<hkaiser>
forget what I said
<jbjnr>
ok
<hkaiser>
then do dataflow(launch::sync, ...)
<jbjnr>
I need the wrapper, to get the hint function called after the futures are unwrapped
<hkaiser>
yes, understand
<jbjnr>
interesting, I was thinking about sync. I'll experiment there then
<jbjnr>
that might be the wrong way around though
<hkaiser>
sync just means that the function will not be launched on a new thread
<jbjnr>
I need the NUMA hint function to be evaluated and then the task scheduled on that core/NUMA node - if I put sync in the wrapper - instead of the contents ...
<hkaiser>
it does not mean that the dataflow operation is synchronous
<hkaiser>
try it, I think it will do the right thing
<jbjnr>
yes. you may be right, the unwrapping happens first, before sync is in play
<hkaiser>
yes
<jbjnr>
no
<hkaiser>
lol
<jbjnr>
it will launch the wrapped function on the thread of the predecessor becoming ready
<jbjnr>
that's bad
<hkaiser>
why?
<hkaiser>
does it matter which thread schedules things?
<jbjnr>
because the predecessor holds the args that we use to call the NUMA function - and then launch the REAL task on that NUMA node
<hkaiser>
sure
<jbjnr>
aha
<jbjnr>
you're right
<hkaiser>
but sync will only influence the thread that decides where to run things eventually
<jbjnr>
the subtask will run on the predecessor task, then the real task runs on the new apply
<hkaiser>
nod
<jbjnr>
great.
<jbjnr>
I'll try it
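(A sketch of the pattern they converge on: a launch::sync dataflow wrapper runs inline on the thread whose predecessor became ready, reads the unwrapped arguments to compute the placement hint, and only then spawns the REAL task. numa_domain_for is a hypothetical stand-in for jbjnr's hint function, and plain async stands in for the NUMA-targeted executor; headers assume a recent HPX:)

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/future.hpp>

#include <iostream>
#include <utility>
#include <vector>

// Hypothetical hint function: map the data a task touches to a NUMA
// domain. A real version would consult allocator/topology info.
int numa_domain_for(std::vector<double> const& block)
{
    return block.empty() ? 0 : static_cast<int>(block.front()) % 2;
}

int main()
{
    hpx::future<std::vector<double>> pred =
        hpx::async([] { return std::vector<double>{3.0, 1.0, 2.0}; });

    // launch::sync: the wrapper spawns no new HPX thread - it runs on
    // whichever thread made `pred` ready, after unwrapping. Only the
    // REAL task below gets scheduled. The returned inner future is
    // collapsed by hpx::future's unwrapping constructor.
    hpx::future<double> result = hpx::dataflow(
        hpx::launch::sync,
        [](hpx::future<std::vector<double>> f) {
            std::vector<double> data = f.get();
            int domain = numa_domain_for(data);  // hint from the ready args
            (void) domain;  // would select the NUMA-targeted executor
            return hpx::async([data = std::move(data)] {
                double sum = 0.0;
                for (double x : data) sum += x;
                return sum;
            });
        },
        std::move(pred));

    std::cout << result.get() << "\n";  // prints 6
}
```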
<jbjnr>
hkaiser: it works :)
<jbjnr>
thanks a bunch. now all is clean and I am happy that the bonus task has gone