aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
RostamLog_ has joined #ste||ar
K-ballo1 has joined #ste||ar
<K-ballo1>
XML can be clunky, the rest is fun!
zao has quit [Quit: Up, up, and away!]
K-ballo has quit [Remote host closed the connection]
K-ballo1 is now known as K-ballo
ajaivgeorge has quit [Remote host closed the connection]
washplan1z has joined #ste||ar
zao has joined #ste||ar
washplanez has quit [Remote host closed the connection]
ajaivgeorge has joined #ste||ar
bikineev has quit [Remote host closed the connection]
RostamLog has quit [Ping timeout: 246 seconds]
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
<github>
[hpx] K-ballo force-pushed range from 819b5d4 to 64472f0: https://git.io/voasb
<github>
hpx/range be60f2b Agustin K-ballo Berge: Add C++11 range utilities
<github>
hpx/range f66ba2d Agustin K-ballo Berge: Cleanup acquire traits, replace boost::range with util/range in their implementation, patch broken whens
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
denis_blank has quit [Quit: denis_blank]
ajaivgeorge has quit [Read error: Connection reset by peer]
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
EverYoung has joined #ste||ar
jaafar has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
patg has joined #ste||ar
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
zbyerly__ has quit [Ping timeout: 246 seconds]
zbyerly_ has joined #ste||ar
pree has joined #ste||ar
jaafar has quit [Ping timeout: 258 seconds]
Matombo has joined #ste||ar
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 258 seconds]
patg is now known as patg_away
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
david_pfander has joined #ste||ar
david_pfander has quit [Ping timeout: 268 seconds]
shoshijak has joined #ste||ar
Matombo has quit [Remote host closed the connection]
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
<shoshijak>
Hey all. I'm inspecting the problem jbjnr and I had yesterday (running multiple pools on multiple nodes results in: {what}: parcel destination does not match locality which received the parcel). I'm suspecting there's something we do twice when setting up or running the pools which should really be done only once. Can someone tell me what init_tss() does and what it's all about?
pree_ has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
washplan1z has quit [Quit: Lost terminal]
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
EverYoung has joined #ste||ar
Matombo has quit [Ping timeout: 268 seconds]
shoshijak has quit [Read error: Connection reset by peer]
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
<heller_>
now she's gone
shoshijak has joined #ste||ar
<heller_>
shoshijak: hey
<jbjnr_>
she's sitting next to me and closing the lid on her laptop
<jbjnr_>
you can explain the tss if you like :)
<heller_>
shoshijak: init_tss initializes the thread specific storage
<heller_>
that is, pointers to the runtime object, parcelhandler etc.
<jbjnr_>
if we have two thread pools with thread nums 0-3 and 0-7 (for example) - we need two separate TSS storage locations - we think we might be reusing the same one twice ....
<heller_>
the error you get might be due to the runtime already shutting down and the locality has been unregistered already
<heller_>
well, if the tss is not initialized correctly, you will get seg faults or other errors about the runtime not being initialized correctly yet
<jbjnr_>
hmmm
<heller_>
each new OS thread that you launch that is supposed to be usable within HPX needs to properly initialize it's tss
<heller_>
its
EverYoung has quit [Ping timeout: 255 seconds]
<shoshijak>
but it seems like init_tss() is done once per pool, not once per OS-thread
Matombo has joined #ste||ar
<heller_>
init_tss is called once for the main thread, then each OS thread needs to do the same again
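A minimal sketch of the per-thread requirement heller_ describes, in plain C++ rather than HPX code. The names `runtime_stub`, `tss_runtime`, `init_tss_sketch` and `worker_sees_runtime` are hypothetical and invented for illustration; HPX's actual init_tss() may work differently. It only shows that a thread-local pointer set on one thread stays null on a newly started OS thread until that thread initializes it too.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the runtime object; not HPX's real type.
struct runtime_stub
{
    int locality_id;
};

// One pointer per OS thread; it starts out null on every new thread.
thread_local runtime_stub* tss_runtime = nullptr;

// Hypothetical analogue of init_tss(): publish the runtime to the calling
// thread. The assert guards against initializing the same thread twice.
void init_tss_sketch(runtime_stub& rt)
{
    assert(tss_runtime == nullptr);
    tss_runtime = &rt;
}

// Spawns a fresh OS thread and reports whether it sees the runtime pointer.
bool worker_sees_runtime(runtime_stub& rt, bool call_init)
{
    bool seen = false;
    std::thread t([&] {
        if (call_init)
            init_tss_sketch(rt);
        seen = (tss_runtime == &rt);
    });
    t.join();    // join before reading 'seen'
    return seen;
}
```

Calling `init_tss_sketch` on the main thread does not help the worker: `worker_sees_runtime(rt, false)` is false, and only becomes true when the worker runs the initialization itself.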
<jbjnr_>
what's the deal with this get_notification_policy stuff? what's it for?
<heller_>
to have various callbacks for different phases
<heller_>
one when the thread starts, one when it stops and one when an error occurred
<heller_>
decoupling, essentially
<jbjnr_>
what has that got to do with the tss stuff though?
<jbjnr_>
ah, I see
<jbjnr_>
each thread gets notified
<heller_>
yes
<jbjnr_>
and so it uses the tss for the notifier stuff
<heller_>
the notifier has an on_start_thread function
<heller_>
which is called inside the thread pool
<heller_>
which in turn registers the TSS
<jbjnr_>
if we have 2 pools - what needs to change?
<heller_>
nothing, really
<jbjnr_>
(or 4 pools etc)
<heller_>
just use this notification thingy
<heller_>
call the necessary functions and call it a day
<jbjnr_>
the problem we might have is pool A has thread nums 0-3 and pool B has the same thread nums - if we mix up the tss for both pools anywhere ....
<heller_>
the notification policy only sets up the runtime specific TSS variables
<heller_>
no idea how to solve the clash of different thread nums
<heller_>
maybe just count them in order or so?
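One way to read "count them in order" is to give each pool an offset equal to the total size of the pools before it, so that pool A's 0-3 and pool B's 0-7 map to 0-3 and 4-11. The sketch below assumes exactly that scheme; `global_thread_num` is a hypothetical helper, not an HPX function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Maps (pool index, pool-local thread number) to a globally unique thread
// number by counting threads in pool order.
// pool_sizes[i] is the number of OS threads in pool i (an assumption made
// for this sketch).
std::size_t global_thread_num(std::vector<std::size_t> const& pool_sizes,
    std::size_t pool, std::size_t local)
{
    assert(pool < pool_sizes.size());
    assert(local < pool_sizes[pool]);

    std::size_t offset = 0;
    for (std::size_t i = 0; i < pool; ++i)
        offset += pool_sizes[i];    // threads owned by earlier pools
    return offset + local;
}
```

With pools of 4 and 8 threads, local thread 0 of the second pool becomes global thread 4, so the two pools can no longer be confused through a shared thread number.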
<jbjnr_>
is the tss local to the thread pool, or global?
<heller_>
jbjnr_: when do we know about the hackathon?
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 240 seconds]
Matombo has quit [Ping timeout: 276 seconds]
hkaiser has joined #ste||ar
<jbjnr_>
welcome hkaiser you are up early
<hkaiser>
jbjnr_: g'morning
shoshijak has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
<jbjnr_>
in threadpool, there's a wait for all threads to start before we begin work - is there a wait anywhere during shutdown that makes sure we don't delete stuff until all os worker threads are terminated cleanly?
<hkaiser>
jbjnr_: yes
<hkaiser>
the scheduler loop is exited only once activity has ceased
<jbjnr_>
yes, but there is a scheduling loop on each thread - how does the runtime check that -all- threads are finished. We have multiple pools and want to be sure we don't try to enter shutdown until everything is quiet
<hkaiser>
jbjnr_: currently, once all threads of the (main) scheduling loop have exited, the runtime exits
<jbjnr_>
what do you mean by "main" scheduling loop in this context
<hkaiser>
jbjnr_: well, we have only one thread_pool, right?
shoshijak has joined #ste||ar
<hkaiser>
that's what I meant with 'main'
<jbjnr_>
we have lots of thread pools
<hkaiser>
:D
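The concern jbjnr_ raises can be sketched with plain std::thread: with several pools, shutdown should join the workers of every pool, not just those of one "main" pool, before anything is torn down. `pool_sketch` and `run_all_then_shutdown` are hypothetical names; this is not how the HPX runtime is implemented, only an illustration of the ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical pool: nothing more than a set of OS worker threads.
struct pool_sketch
{
    std::vector<std::thread> workers;
};

// Starts pool_sizes[i] workers in pool i, then joins every worker of every
// pool before returning. Returns how many workers ran to completion.
std::size_t run_all_then_shutdown(std::vector<std::size_t> const& pool_sizes)
{
    std::atomic<std::size_t> finished{0};
    std::vector<pool_sketch> pools(pool_sizes.size());
    std::size_t expected = 0;

    for (std::size_t p = 0; p < pools.size(); ++p)
    {
        for (std::size_t i = 0; i < pool_sizes[p]; ++i)
        {
            // Each worker stands in for a scheduling loop that has exited.
            pools[p].workers.emplace_back([&finished] { ++finished; });
            ++expected;
        }
    }

    // Shutdown: wait for all pools, not only the first one.
    for (auto& pool : pools)
        for (auto& w : pool.workers)
            w.join();

    // Only past this point is it safe to destroy shared state.
    assert(finished.load() == expected);
    return finished.load();
}
```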
<jbjnr_>
hkaiser: what do we do about thread_pool_os_executor?
<jbjnr_>
it seems to live outside the main ecosystem and does its own thing
<hkaiser>
jbjnr_: I think this is left to the user, currently
<jbjnr_>
we are worried that it calls run and calls init_tss etc etc, but that might be done at runtime - does calling init_tss from that thread bugger other stuff up?
<hkaiser>
jbjnr_: not sure
<hkaiser>
your concerns are well founded
Matombo has joined #ste||ar
bikineev has joined #ste||ar
K-ballo has joined #ste||ar
pree_ has quit [Ping timeout: 260 seconds]
eschnett has joined #ste||ar
bikineev has quit [Ping timeout: 258 seconds]
patg_away has quit [Read error: Connection reset by peer]
bikineev has joined #ste||ar
patg has joined #ste||ar
patg is now known as Guest83749
ajaivgeorge has joined #ste||ar
<ABresting>
hkaiser: yt?
<ABresting>
heller_: yt?
<ABresting>
I developing a wrapper for libsigsegv, which takes input from user if they want to detect stack overflow or not ?
<ABresting>
is it optimal? or should we make it by default?
<ABresting>
I am*
bikineev has quit [Ping timeout: 276 seconds]
<ABresting>
also with the new build there is an error
<zao>
And as we're fully on EasyBuild now, I have decent compilers all around.
<K-ballo>
the code was not updated when the requirements changed
<zao>
(silly Cuda toolchain still stuck on GCC 5.4 *grmbhl*)
<ABresting>
heller_: I am developing a wrapper for libsigsegv, which takes input from the user if they want to detect stack overflow or not. is it optimal? or should we make it by default?
<zao>
"input" as in CMake flag, runtime flag?
<heller_>
I like the discussion on boost-dev... Everybody is a cmake expert and they contradict all the time on what's standard cmake3 and what not and what a cmake2ism is now
<heller_>
ABresting: yes, make it configurable, that's a good idea
<K-ballo>
I was conveniently pushed out of the boost-dev mailing list just before the cmake discussion started
<K-ballo>
I read the original proposal, the first mail, op has no idea how cmake looks nowadays
aserio has joined #ste||ar
<ABresting>
heller_: I think I should give user a choice to set path in config file and if not then try and read it from a default location?
<heller_>
ABresting: look at the find modules we have
<zao>
I like the versions of CMake available on my site now. Anything between 3.5.2 and 3.8.1 - nothing less :D
<heller_>
There are tons of optional dependencies
<zao>
Like say, hwloc?
<zao>
(it's the one I keep installing out of habit)
<zao>
Oh wait, hwloc is "required"
<zao>
PAPI is probably a better one.
<ABresting>
also, when user is going to write the code, they have to specify in the main() to call our wrapper
<ABresting>
heller_: *
<hkaiser>
K-ballo: Niall is part of the cmake discussion which means it will go nowhere, so no worries...
<K-ballo>
lol
<zao>
Boost.Lite solves all the problems.
* zao
nods sagely
pree_ has quit [Ping timeout: 260 seconds]
pree__ has joined #ste||ar
Matombo has joined #ste||ar
<jbjnr_>
OMG the horror I looked at the boost list, but there's no way I can read all that cmake stuff. far too much of it
hkaiser has quit [Quit: bye]
david_pfander has joined #ste||ar
Matombo has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
eschnett has joined #ste||ar
ajaivgeorge has quit [Read error: Connection reset by peer]
akheir has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
david_pfander has quit [Ping timeout: 260 seconds]
<ABresting>
heller_: yt?
jakemp has joined #ste||ar
ajaivgeorge has joined #ste||ar
shoshijak has quit [Quit: Ex-Chat]
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
pree__ has quit [Ping timeout: 260 seconds]
zbyerly_ has joined #ste||ar
bibek_desktop has quit [Quit: Leaving]
<jbjnr_>
heller_: why is the runtime pointer stored in TSS - there is surely only one runtime?
<jbjnr_>
we don't keep one runtime instance per thread do we?
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
shoshijak has joined #ste||ar
bibek_desktop has joined #ste||ar
bikineev has joined #ste||ar
aserio1 has joined #ste||ar
<wash[m]>
Hkaiser: did you ping me earlier?
wash has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
aserio has quit [Ping timeout: 255 seconds]
aserio1 is now known as aserio
bikineev has quit [Ping timeout: 240 seconds]
shoshijak has joined #ste||ar
zbyerly_ has quit [Ping timeout: 255 seconds]
Matombo has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
zbyerly_ has joined #ste||ar
pree has joined #ste||ar
eschnett has quit [Quit: eschnett]
jgoncal has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
aserio has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
zbyerly has quit [Quit: Leafing]
Matombo has quit [Ping timeout: 260 seconds]
eschnett has joined #ste||ar
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
shoshijak has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
aserio has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
aserio has quit [Ping timeout: 260 seconds]
<heller_>
jbjnr_: I can't answer that question. It has been there since forever
<heller_>
One reason I can imagine is to differentiate between a non-HPX and a HPX context
aserio has joined #ste||ar
denis_blank has quit [Quit: denis_blank]
hkaiser has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
akheir has quit [Remote host closed the connection]
<heller_>
Still no news from ACM :/
<heller_>
Last year's finalists were announced late August :(
<heller_>
Shall I wait until Friday and then poke the head of the committee again?
bikineev has joined #ste||ar
<github>
[hpx] hkaiser created clang_format (+1 new commit): https://git.io/vQUWY