hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<simbergm>
K-ballo: ok, thanks
<simbergm>
are you sure it's not 1y? I thought gcc 4.9 didn't support all of 14... in any case I'll try to look next week unless you beat me to it, it'll most likely be something simple
eschnett has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
Vir has quit [Ping timeout: 246 seconds]
Vir has joined #ste||ar
Vir has quit [Ping timeout: 268 seconds]
Vir has joined #ste||ar
david_pfander has joined #ste||ar
<jbjnr__>
heller: I had some questions about background work
<jbjnr__>
I seem to recall that you modified background work so that it runs as an hpx thread rather than an os thread
<jbjnr__>
and I wonder what it means when something is called on the background thread and it yields/suspends, and whether those background tasks can stop themselves being stolen by other threads in the scheduler
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 244 seconds]
<heller>
jbjnr__: yes, background work is run in an hpx thread
<heller>
jbjnr__: if the thread suspends or yields with "pending", not pending_boost, it will be given to the regular scheduler. Once the current operation completes, it will just terminate
<heller>
jbjnr__: in this case, it might get stolen
<jbjnr__>
so a new one is created each time background work happens?
<heller>
jbjnr__: if it keeps spinning with pending_boost, it will stay on the same core
<heller>
no, only when the current one yields or suspends
<jbjnr__>
what I mean is, if background work is called, and there's nothing to be done, it will exit - then it destroys itself and a new one is created for the next attempt?
<jbjnr__>
and why does pending_boost stop it being stolen?
<jbjnr__>
bbiab
<heller>
no, it won't exit itself
<heller>
it keeps spinning; only when it tries to suspend, or yield with pending, a new one gets created, and this old one will finish its current operation
K-ballo has joined #ste||ar
<jbjnr__>
heller: confused. if the background work keeps spinning, then it will stop real work being done.
<heller>
Yes
<heller>
Well
<heller>
If it yields with pending_boost it will first execute other work
<jbjnr__>
how does pending_boost stop it being stolen?
<jbjnr__>
because most schedulers don't steal HP tasks I suppose
eschnett has quit [Quit: eschnett]
<heller>
jbjnr__: it's the algorithm
<heller>
jbjnr__: the background thread itself doesn't take part in the normal scheduling algorithm at first
<heller>
jbjnr__: it will get passed to regular scheduling only if it needs to be suspended
<jbjnr__>
you mean the background task doesn't go into the normal scheduling queue
<jbjnr__>
it is kept in the scheduling loop itself
<jbjnr__>
that makes more sense
<jbjnr__>
If that's right, then now I understand. thanks
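The behaviour described above can be sketched as a toy model. This is not HPX code: only the state names `pending` and `pending_boost` come from the discussion, and `run_worker`, the `bgN` labels and the event log are hypothetical names made up for illustration. It assumes a single worker whose loop holds the background task outside the stealable queue, and hands it to that queue only when it yields with `pending`.

```python
from collections import deque

PENDING = "pending"
PENDING_BOOST = "pending_boost"


def run_worker(yield_states, regular_work):
    """Toy model of one worker's scheduling loop (not HPX code).

    yield_states: the state the background task yields with on each turn.
    regular_work: names of ordinary tasks sitting in the stealable queue.
    Returns (event log, remaining stealable queue).
    """
    queue = deque(regular_work)  # regular queue: other workers could steal from it
    log = []
    bg_id = 0                    # background task currently held by the loop itself

    for state in yield_states:
        log.append(f"bg{bg_id} runs")
        if state == PENDING_BOOST:
            # The background task stays with this loop, so it cannot be stolen,
            # but other work gets a turn before it runs again.
            if queue:
                log.append(f"run {queue.popleft()}")
        elif state == PENDING:
            # The old task is handed to the regular queue, where it may be
            # stolen, and the loop creates a fresh background task.
            queue.append(f"bg{bg_id}")
            log.append(f"bg{bg_id} -> queue")
            bg_id += 1
            log.append(f"bg{bg_id} created")
        else:
            raise ValueError(f"unknown state: {state}")
    return log, list(queue)


if __name__ == "__main__":
    events, stealable = run_worker([PENDING_BOOST, PENDING, PENDING_BOOST], ["t1", "t2"])
    print(events)
    print(stealable)
```

In this model a task can only be taken by another worker once it sits in `queue`, which is why a background task that keeps yielding with `pending_boost` stays on the same core.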
<jbjnr__>
I looked at it and wondered how another worker thread can take a task that has already been taken, but I guess I misunderstood it
<hkaiser>
right, I don't remember the details :/
<hkaiser>
it was related to the inherent race condition that happens when a thread a) schedules a timer to awake itself and then b) suspends
<hkaiser>
the timer might fire before the thread gets suspended...
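The race mentioned here (the timer firing between step a and step b) can be illustrated with a generic sketch using Python's standard `threading` module. This is not how HPX handles it; `Sleeper` and `sleep_via_timer` are hypothetical names. The idea is that a wake-up which arrives early is recorded in a flag, so the later suspend sees it instead of waiting for a notification that has already happened.

```python
import threading


class Sleeper:
    """Hypothetical stand-in for a thread that arms a timer, then suspends."""

    def __init__(self):
        self._cond = threading.Condition()
        self._woken = False  # remembers a wake-up that arrives before suspend()

    def wake(self):
        with self._cond:
            self._woken = True   # recorded even if nobody is waiting yet
            self._cond.notify()

    def suspend(self, timeout=None):
        with self._cond:
            # wait_for checks the flag before blocking, so an early wake()
            # is not lost; a bare wait() here would block until the timeout.
            woken = self._cond.wait_for(lambda: self._woken, timeout)
            self._woken = False
            return woken


def sleep_via_timer(delay):
    sleeper = Sleeper()
    timer = threading.Timer(delay, sleeper.wake)  # a) schedule the timer
    timer.start()
    timer.join()  # force the bad ordering: the timer fires before we suspend
    return sleeper.suspend(timeout=1.0)           # b) suspend


if __name__ == "__main__":
    print(sleep_via_timer(0.01))
```

The `timer.join()` call exists only to make the problematic ordering deterministic for the demonstration.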
<jbjnr__>
let's remove the timers!
<hkaiser>
lol!
aserio has joined #ste||ar
jaafar_ has quit [Quit: Konversation terminated!]
hkaiser has quit [Quit: bye]
aserio has quit [Ping timeout: 250 seconds]
<diehlpk_work>
jbjnr__, have you already submitted jobs for octotiger?
<jbjnr__>
not yet. was looking at scripts now. Still have >100 jobs for my LF sitting in the queue all day
<jbjnr__>
does the sbcast of the restart file go inside the slurm job?
<diehlpk_work>
yes, we need to copy the file to improve IO
<diehlpk_work>
Please wait before starting the jobs, we have an improved version of octotiger and a different ini file
<jbjnr__>
I've generated the 1/2/4/8.../4096 folders with the ini and run script, but was looking for a script to generate sbatch submissions and do the restart file copy etc etc
<jbjnr__>
^^ok
<jbjnr__>
I will wait. Still playing with scripts anyway. But I've compiled octotiger with LF pp
<jbjnr__>
so am ready to test (I think)
<diehlpk_work>
Ok, let me finish the new config and you are good to go
<diehlpk_work>
jbjnr__, Can I add command line args to sbatch?
<jbjnr__>
I will make a script for launching lots of jobs. much easier
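A launch script of the kind discussed here could look roughly like the sketch below. It is not the script that was actually written: the job name, time limit, `/tmp` destination, `submit.sbatch` file name and the `./run.sh` invocation are all placeholders, and only the 1/2/4/.../4096 folder layout and running `sbcast` inside the job come from the conversation. It writes one batch script per node count and does not submit anything.

```python
from pathlib import Path


def node_counts(limit=4096):
    """Yield powers of two from 1 up to and including limit."""
    n = 1
    while n <= limit:
        yield n
        n *= 2


def batch_script(nodes, restart_file):
    """Return the text of one Slurm batch script (all paths are placeholders)."""
    local_copy = f"/tmp/{Path(restart_file).name}"
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name=octo-{nodes}",   # placeholder job name
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --time=00:30:00",            # placeholder time limit
        "",
        "# Inside the job: copy the restart file to every allocated node.",
        f"sbcast {restart_file} {local_copy}",
        "",
        "# Hypothetical run script that takes the local restart file path.",
        f"srun ./run.sh {local_copy}",
        "",
    ])


def write_scripts(base, restart_file, limit=4096):
    """Write base/<nodes>/submit.sbatch for each node count; return the paths."""
    paths = []
    for n in node_counts(limit):
        folder = Path(base) / str(n)
        folder.mkdir(parents=True, exist_ok=True)
        path = folder / "submit.sbatch"
        path.write_text(batch_script(n, restart_file))
        paths.append(path)
    return paths


if __name__ == "__main__":
    for script in write_scripts("runs", "restart.chk", limit=8):
        print(script)
```

Each generated file could then be submitted from its own folder with `sbatch submit.sbatch`.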
<diehlpk_work>
Ok, good
<diehlpk_work>
So I will use this script to start my runs
<diehlpk_work>
jbjnr__, You can use the latest master of octotiger for the collection of your data
mbremer has joined #ste||ar
<diehlpk_work>
daissgr, see pm
<diehlpk_work>
daissgr, yet?
david_pfander has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
Vir has quit [Ping timeout: 264 seconds]
jaafar has joined #ste||ar
aserio has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
<jbjnr__>
daissgr: yt?
<jbjnr__>
or diehlpk_work
<parsa_>
heller: is your dissertation public?
parsa_ is now known as parsa
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
aserio has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
<diehlpk_work>
jbjnr__, Please update the ini file, we disabled the output and let it run longer
eschnett has joined #ste||ar
<heller>
parsa: not yet
<parsa>
heller: can i have the pdf?
<heller>
Sure
<heller>
parsa: you should have it now
<jbjnr__>
diehlpk_work: ok
<parsa>
heller: grand! thank you!
nikunj has quit [Remote host closed the connection]
<parsa>
heller: is there a reason for not mentioning Charm++, Legion, etc in related works? i'm trying to write related works for my own stuff and need inspiration
<heller>
I think I mention them somewhere
<heller>
charm is mentioned
<heller>
legion: no idea why it is missing...
<parsa>
oh yeah, a charm++ paper's referenced in AGAS (didn't see it since the name is not mentioned)
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
akheir has joined #ste||ar
aserio has quit [Ping timeout: 250 seconds]
Yorlik has quit [Read error: Connection reset by peer]
aserio has joined #ste||ar
daissgr has quit [Quit: WeeChat 1.9.1]
eschnett has quit [Quit: eschnett]
aserio has quit [Quit: aserio]
nikunj has joined #ste||ar
akheir has quit [Remote host closed the connection]