aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
mcopik_ has quit [Ping timeout: 240 seconds]
eschnett has joined #ste||ar
Matombo444 has joined #ste||ar
pree has joined #ste||ar
Matombo has quit [Ping timeout: 248 seconds]
StefanLSU has joined #ste||ar
Guest31423 has quit [Quit: This computer has gone to sleep]
Matombo444 has quit [Remote host closed the connection]
Matombo has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
patg has joined #ste||ar
patg is now known as Guest45685
pree has quit [Quit: AaBbCc]
StefanLSU has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
K-ballo has quit [Quit: K-ballo]
Guest45685 has quit [Quit: See you later]
hkaiser has quit [Quit: bye]
vamatya has joined #ste||ar
<heller>
zbyerly: extension
<heller>
Yay
<heller>
I knew it ;)
<zbyerly_>
heller, that's a good sign, right?
<heller>
Yes
<heller>
We have till the 29th to finish that paper
parsa has joined #ste||ar
zbyerly_ has quit [Ping timeout: 240 seconds]
zbyerly_ has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
Matombo has quit [Remote host closed the connection]
<jbjnr>
heller: is your pool assignment for threads in master? that might have broken it
<jbjnr>
if it returns 1 instead of 0, then this error would appear
<jbjnr>
the default pool should always be pool 0
<heller>
jbjnr: pool assignment?
<heller>
jbjnr: I don't remember any pool assignments
<jbjnr>
the error is coming from get_pool_name. might be a perf counter cock up
<zbyerly_>
let me try it without the perf cntrs
<jbjnr>
zbyerly_: can you get more of a stack backtrace? - yes, or disabling perf counters might help for a quick test
<jbjnr>
heller: I meant that now a thread is always launched on the pool the parent came from - previously it was always pool#0 if the user didn't use a special executor
<jbjnr>
so if your change was in master, it might be a suspect
<jbjnr>
but I now suspect perf counters, cos they ask for pool names - nobody else usually needs them.
<jbjnr>
apart from custom_pool_executor
<zbyerly_>
works without the perf counters
<jbjnr>
zbyerly_: if a small test can produce the error, feel free to post one.
<jbjnr>
aha!
<jbjnr>
ok, then please file an issue with as much detail as possible. Hartmut knows what to do, as he fixed the perf counters for pools
<heller>
zbyerly_: what's the perf counter parameter you are using, btw?
<github>
hpx/master c50c064 Thomas Heller: One more attempt to fix the service_executor...
<heller>
next round
<heller>
counting the green: 17/40
<heller>
counting the green: 18/40
<zbyerly_>
I'll pray for 19, heller
<zbyerly_>
submitted an issue on that thing
<heller>
counting the green: 21/40
<zao>
What on earth are you fine people up to?
<heller>
zao: MBGA!
<zao>
Those are letters :)
<jbjnr>
heller: = thread pool terrorist
<jbjnr>
I'm going to admit to being slightly impressed by the green. Not bad at all. I'm almost ready to forgive you for breaking everything else
<heller>
zao: Make Buildbot Green Again!
<heller>
jbjnr: my pleasure ;)
<zao>
:D
<zao>
For the mem counters, resident and virtual map straight to RES and VIRT?
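(For reference, a minimal sketch of reading those two counters from inside an HPX application, assuming the usual /runtime{locality#0/total}/memory/resident and .../memory/virtual counter paths and the performance_counter client; whether they map one-to-one onto top's RES and VIRT columns is exactly zao's open question. Must be called from an HPX thread:)

```cpp
#include <hpx/hpx.hpp>
#include <hpx/include/performance_counters.hpp>

#include <cstdint>
#include <iostream>

// Read the resident and virtual memory counters for locality 0
// and print their raw values (bytes).
void print_memory_counters()
{
    using hpx::performance_counters::performance_counter;

    performance_counter res("/runtime{locality#0/total}/memory/resident");
    performance_counter virt("/runtime{locality#0/total}/memory/virtual");

    std::cout << "resident: " << res.get_value<std::int64_t>(hpx::launch::sync)
              << "\nvirtual:  " << virt.get_value<std::int64_t>(hpx::launch::sync)
              << "\n";
}
```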
<jbjnr>
the tasks that are triggered by my MPI messages are being directed to the MPI pool thanks to heller's changes, so that's why our distributed performance has dropped off a cliff.
<jbjnr>
In the long term I will come up with a better plan than just reverting it, though. I think that using the same thread pool is probably a good default choice, but maybe we can add more execution policies and suchlike, or add pool options to disable default threads.
<jbjnr>
there's another paper in this multi-thread pool material ...
<heller>
yup yup
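(To make the pool discussion concrete: a rough sketch of carving out a dedicated "mpi" pool with the resource partitioner, along the lines of HPX's resource_partitioner examples of the time; header and member names are from memory of the 2017 API and may have shifted since:)

```cpp
#include <hpx/hpx_init.hpp>
#include <hpx/include/resource_partitioner.hpp>

int hpx_main()
{
    // ... application work; tasks land on the default pool unless an
    // executor targeting the "mpi" pool is used ...
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    hpx::resource::partitioner rp(argc, argv);

    // create a dedicated pool and give it one PU from the first core
    // of the first NUMA domain; everything else stays in the default pool
    rp.create_thread_pool("mpi");
    rp.add_resource(rp.numa_domains()[0].cores()[0].pus()[0], "mpi");

    return hpx::init();
}
```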
<zao>
Oh dear... I'm going to have to link a library (-lkvm) to implement mem_counter_bsd.cpp
<zao>
This'll be fun.
Matombo has joined #ste||ar
bikineev has quit [Remote host closed the connection]
<jbjnr>
heller: do we have a function that converts a mask_type into an hwloc cpuset?
<jbjnr>
or: mask_type is a bitset<> or an int, and there is an hwloc bitmap, but I can't seem to locate a converter
<jbjnr>
nvm found a ton of hwloc bitmap functions
<jbjnr>
will iterate by hand
<heller>
ok
<heller>
please do
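(A minimal sketch of the by-hand iteration jbjnr settles on, assuming the mask fits in 64 bits and using only core hwloc bitmap calls; the function name is hypothetical:)

```cpp
#include <hwloc.h>

#include <cstdint>

// Convert a 64-bit affinity mask into an hwloc bitmap by setting each
// bit index that is set in the mask. The caller owns the returned
// bitmap and must release it with hwloc_bitmap_free().
hwloc_bitmap_t mask_to_hwloc_bitmap(std::uint64_t mask)
{
    hwloc_bitmap_t bmp = hwloc_bitmap_alloc();
    for (unsigned i = 0; i < 64; ++i)
    {
        if (mask & (std::uint64_t(1) << i))
            hwloc_bitmap_set(bmp, i);
    }
    return bmp;
}
```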
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 248 seconds]
bikineev has joined #ste||ar
K-ballo has joined #ste||ar
pree has joined #ste||ar
<zao>
heller: I need to add a linker flag to a target from add_hpx_component. Can I reliably do `target_link_libraries(foo_component PRIVATE "-lmeh")`?
<heller>
yes
<zao>
The library only applies on particular CMAKE_SYSTEM_NAMEs.
<zao>
heller: CMake got upset if I used the non-keyword form of target_link_libraries.
<zao>
Do we always use the PRIVATE/PUBLIC flavor?
<zao>
(I've gone and implemented hpx_memory for FreeBSD and DragonFlyBSD, and technically also NetBSD/OpenBSD if someone figures out what the kernel structure fields are named)
<heller>
cool!
<heller>
yes, you always have to use it
hkaiser has joined #ste||ar
<jbjnr>
grrrr: "the hpx runtime system has not been initialized yet"
<zao>
Heh, some example uses __argv... I could either try to slurp them out via KVM, or just disable the example.
<zao>
Can I disable particular examples?
<jbjnr>
not easily
<jbjnr>
in the CMakeLists, just # comment out the ones you dislike
<zao>
#if defined(silly_os) int main() {} #endif
<zao>
jbjnr: Thing is that it's conditional on the OS.
<zao>
So I guess I could if() that in CMake then.
<jbjnr>
yes
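(In CMake that would be a guard on CMAKE_SYSTEM_NAME; in the source itself, the #if joke above looks like this with real predefined macros - the stub body is a placeholder:)

```cpp
// Guard an example that depends on platform-specific globals (here: __argv).
// On the unsupported BSDs, compile a stub main so the build stays green.
#if defined(__FreeBSD__) || defined(__DragonFly__)
int main()
{
    return 0;   // example not supported on this OS
}
#else
// ... the real example, which uses __argv ...
#endif
```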
<zao>
Gonna see how much else is b0rken now that I build more of HPX.
<jbjnr>
anyone know if there is an errno-to-string helper in hpx anywhere?
<jbjnr>
reusable for exceptions etc.
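(If nothing HPX-specific turns up, the standard library already covers this; a minimal sketch, with a hypothetical wrapper name:)

```cpp
#include <string>
#include <system_error>

// Turn an errno value into a human-readable message, suitable for
// embedding in exception text; e.g. errno_message(ECONNREFUSED)
// yields something like "Connection refused".
std::string errno_message(int err)
{
    return std::system_category().message(err);
}
```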
<heller>
ok, I totally broke it, why is launch_process timing out now?
<jbjnr>
child task not killed
<heller>
this is soo frustrating
<heller>
it worked the instance before, didn't it?
<jbjnr>
connection not made, child hangs, test times out. After that, all subsequent tests usually fail due to a "network already in use" error
<hkaiser>
jbjnr: in the end you brought down Thomas' decision to pull the plug on GB onto yourself, right ? ;)
Matombo has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
<jbjnr>
hkaiser: yes, but that's not a bad thing. You do not need us and you can go on anyway. We have to work on a CSCS project, not an LSU one. If you carry on with it and need my help, then I'm here; I just can't spend N months on it without reason.
<hkaiser>
jbjnr: sure, I perfectly understand
<jbjnr>
(getting a GPU version by April is not going to be easy)
<zao>
jbjnr: We're troubleshooting RDMA at work. Is there anything in HPX one can use to put some load on it?
<jbjnr>
cray or non?
<jbjnr>
if Cray, then yes; if not, then I need to fix the ibverbs PP first
Matombo has joined #ste||ar
<hkaiser>
jbjnr: fixing it would be a Good Thing (tm) anyways
<zao>
Very non-Cray.
<jbjnr>
yes
<zao>
Alright, good to know.
<heller>
jbjnr: you should focus on the good news!
<jbjnr>
hkaiser: once we have the matrix work cleaned up, I'm hoping to be allowed to go back to networking and do more RMA stuff. Also fix the verbs PP.
<hkaiser>
great
<jbjnr>
good news?
<hkaiser>
yes, good news
<hkaiser>
for instance that the work we did in GB got awarded
<jbjnr>
yes
<hkaiser>
You should tell THAT to Thomas
<jbjnr>
indeed
mcopik_ has quit [Ping timeout: 240 seconds]
parsa has joined #ste||ar
<heller>
jbjnr: also that you match parsec in the single node case now
<jbjnr>
ok, hwloc set-mem-interleaved is now tested.
<jbjnr>
no need to use numactl now
<jbjnr>
954 GFlop/s. Thank you very much!
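(A minimal sketch of the hwloc equivalent of "numactl --interleave=all", assuming hwloc 1.x's nodeset-based membind API; the function name is hypothetical:)

```cpp
#include <hwloc.h>

// Interleave all future memory allocations across every NUMA node in
// the topology, as "numactl --interleave=all" would do externally.
void interleave_all_memory(hwloc_topology_t topo)
{
    hwloc_const_nodeset_t nodes = hwloc_topology_get_topology_nodeset(topo);
    hwloc_set_membind_nodeset(topo, nodes, HWLOC_MEMBIND_INTERLEAVE, 0);
}
```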
bikineev has joined #ste||ar
diehlpk_work has joined #ste||ar
<diehlpk_work>
hkaiser, heller Do you want to do any changes for the paper?
aserio has joined #ste||ar
<hkaiser>
diehlpk_work: I would like to give it a once-over today
<hkaiser>
when is the deadline?
<diehlpk_work>
Submission Deadline: 2017 September 15, 23:59 AOE
<hkaiser>
ok, so we still have today
StefanLSU has joined #ste||ar
<aserio>
hkaiser: will you be calling into the STORM meeting from home?