aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
RostamLog_ has joined #ste||ar
K-ballo1 has joined #ste||ar
<K-ballo1> XML can be clunky, the rest is fun!
zao has quit [Quit: Up, up, and away!]
K-ballo has quit [Remote host closed the connection]
K-ballo1 is now known as K-ballo
ajaivgeorge has quit [Remote host closed the connection]
washplan1z has joined #ste||ar
zao has joined #ste||ar
washplanez has quit [Remote host closed the connection]
ajaivgeorge has joined #ste||ar
bikineev has quit [Remote host closed the connection]
RostamLog has quit [Ping timeout: 246 seconds]
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
<github> [hpx] K-ballo force-pushed range from 819b5d4 to 64472f0: https://git.io/voasb
<github> hpx/range be60f2b Agustin K-ballo Berge: Add C++11 range utilities
<github> hpx/range f66ba2d Agustin K-ballo Berge: Cleanup acquire traits, replace boost::range with util/range in their implementation, patch broken whens
<github> hpx/range 7c4e3ad Agustin K-ballo Berge: Replace core boost::range with util/range, remove redundant parallel traits
<github> [hpx] K-ballo force-pushed throw_with_info from 2c91a80 to 5a53909: https://git.io/vHKTJ
<github> hpx/throw_with_info 5a53909 Agustin K-ballo Berge: (draft) exception_info implementation (P0640)
eschnett has joined #ste||ar
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
denis_blank has quit [Quit: denis_blank]
ajaivgeorge has quit [Read error: Connection reset by peer]
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
EverYoung has joined #ste||ar
jaafar has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
patg has joined #ste||ar
zbyerly__ has quit [Remote host closed the connection]
zbyerly__ has joined #ste||ar
zbyerly__ has quit [Ping timeout: 246 seconds]
zbyerly_ has joined #ste||ar
pree has joined #ste||ar
jaafar has quit [Ping timeout: 258 seconds]
Matombo has joined #ste||ar
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 258 seconds]
patg is now known as patg_away
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
david_pfander has joined #ste||ar
david_pfander has quit [Ping timeout: 268 seconds]
shoshijak has joined #ste||ar
Matombo has quit [Remote host closed the connection]
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
<shoshijak> Hey all. I'm inspecting the problem jbjnr and I had yesterday (running multiple pools on multiple nodes results in: {what}: parcel destination does not match locality which received the parcel). I'm suspecting there's something we do twice when setting up or running the pools which should really be done only once. Can someone tell me what init_tss() does and what it's all about?
pree_ has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
washplan1z has quit [Quit: Lost terminal]
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
EverYoung has joined #ste||ar
Matombo has quit [Ping timeout: 268 seconds]
shoshijak has quit [Read error: Connection reset by peer]
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
<heller_> now she's gone
shoshijak has joined #ste||ar
<heller_> shoshijak: hey
<jbjnr_> she's sitting next to me and closing the lid on her laptop
<jbjnr_> you can explain the tss if you like :)
<heller_> shoshijak: init_tss initializes the thread specific storage
<heller_> that is, pointers to the runtime object, parcelhandler etc.
<jbjnr_> if we have two thread pools with thread nums 0-3 and 0-7 (for example) - we need two separate TSS storage locations - we think we might be reusing the same one twice ....
<heller_> the error you get might be due to the runtime already shutting down and the locality has been unregistered already
<heller_> well, if the tss is not initialized correctly, you will get seg faults or other errors about the runtime not being initialized correctly yet
<jbjnr_> hmmm
<heller_> each new OS thread that you launch that is supposed to be usable within HPX needs to properly initialize its tss
EverYoung has quit [Ping timeout: 255 seconds]
<shoshijak> but it seems like init_tss() is done once per pool, not once per OS-thread
Matombo has joined #ste||ar
<heller_> init_tss is called once for the main thread, then each OS thread needs to do the same again
<heller_> yes
<shoshijak> how does each OS thread "use the same again" ?
<heller_> one sec
<heller_> there are multiple init_tss thingies going on
Matombo has quit [Ping timeout: 260 seconds]
<heller_> does this help?
<shoshijak> let me see ...
bikineev has joined #ste||ar
<jbjnr_> what's the deal with this get_notification_policy stuff? what's it for?
<heller_> to have various callbacks for different phases
<heller_> one when the thread starts, one when it stops and one when an error occurred
<heller_> decoupling, essentially
<jbjnr_> what has that got to do with the tss stuff though?
<jbjnr_> ah, I see
<jbjnr_> each thread gets notified
<heller_> yes
<jbjnr_> and so it uses the tss for the notifier stuff
<heller_> the notifier has an on_start_thread function
<heller_> which is called inside the thread pool
<heller_> which in turn registers the TSS
<jbjnr_> if we have 2 pools - what needs to change?
<heller_> nothing, really
<jbjnr_> (or 4 pools etc)
<heller_> just use this notification thingy
<heller_> call the necessary functions and call it a day
<jbjnr_> the problem we might have is pool A has thread nums 0-3 and pool B has the same thread nums - if we mix up the tss for both pools anywhere ....
<heller_> the notification policy only sets up the runtime specific TSS variables
<heller_> no idea how to solve the clash of different thread nums
<heller_> maybe just count them in order or so?
<jbjnr_> is the tss local to the thread pool, or global?
<jbjnr_> looks like^
<heller_> that's the one initialized in the thread pool
<jbjnr_> it's a global
<heller_> the TSS is global within a thread
<jbjnr_> and each thread pool inits it again
<heller_> yes, because it is thread local
<jbjnr_> so we are clobbering it
<heller_> it is a thread specific pointer
<jbjnr_> so where does init_tss(num) create the stuff
<jbjnr_> we call it from pool A with init_tss(6) and from pool B with init_tss(2)
<jbjnr_> and we need 8 slots for tss
<heller_> one sec
<jbjnr_> I'm worried we are trampling on some global memory
<heller_> you get a new TSS for each new OS thread you create
<jbjnr_> we understand that we get one per thread, what we are missing is where the space comes from
<jbjnr_> so when we init_tss twice, are we not overwriting things
<heller_> no
<heller_> well, if you call it twice in the same thread, yes
<jbjnr_> aha!
<jbjnr_> thread_pool init is only called on one thread
<heller_> the TSS is a special memory segment
<heller_> has its own register even
<heller_> so it points to a thread specific memory location where the stuff is put
<jbjnr_> thread pool calls init_tss(8)
<jbjnr_> but only from one thread
<heller_> right
<heller_> only from the thread which has the number 8
<jbjnr_> no
<heller_> why not?
<heller_> this is called in the run function of thread_pool
<heller_> which is sitting in its own OS thread
<jbjnr_> thread pool calls init_tss ONCE but passes in N threads
<jbjnr_> then we call it TWICE with another N
<jbjnr_> second time I mean
Matombo has joined #ste||ar
<jbjnr_> it's screwed
<heller_> yes, on a different OS thread
<jbjnr_> no
<heller_> sure
<jbjnr_> the thread pool is created on the main thread, the workers live elsewhere
<heller_> of course..
<jbjnr_> can we do a quick skype/hangout call?
<heller_> L445 is for the main thread
<heller_> give me 5 minutes please
<jbjnr_> ok
<shoshijak> heller_ : thread_pool::run() is called once per thread pool, but it's always called from the main thread
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
<heller_> jbjnr_: 5 more minutes ...
zbyerly_ has quit [Ping timeout: 255 seconds]
denis_blank has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vQfPD
<github> hpx/gh-pages 94e8e83 StellarBot: Updating docs
<heller_> jbjnr_: when do we know about the hackathon?
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 240 seconds]
Matombo has quit [Ping timeout: 276 seconds]
hkaiser has joined #ste||ar
<jbjnr_> welcome hkaiser you are up early
<hkaiser> jbjnr_: g'morning
shoshijak has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
<jbjnr_> in threadpool, there's a wait for all threads to start before we begin work - is there a wait anywhere during shutdown that makes sure we don't delete stuff until all os worker threads are terminated cleanly?
<hkaiser> jbjnr_: yes
<hkaiser> the scheduler loop is exited only once activity has ceased
<jbjnr_> yes, but there is a scheduling loop on each thread - how does the runtime check that -all- threads are finished. We have multiple pools and want to be sure we don't try to enter shutdown until everything is quiet
<hkaiser> jbjnr_: currently, once all threads of the (main) scheduling loop have exited, the runtime exits
<jbjnr_> what do you mean by "main" scheduling loop in this context
<hkaiser> jbjnr_: well, we have only one thread_pool, right?
shoshijak has joined #ste||ar
<hkaiser> that's what I meant with 'main'
<jbjnr_> we have lots of thread pools
<hkaiser> :D
<jbjnr_> hkaiser: what do we do about thread_pool_os_executor?
<jbjnr_> it seems to live outside the main ecosystem and does its own thing
<hkaiser> jbjnr_: I think this is left to the user, currently
<jbjnr_> we are worried that it calls run and calls init_tss etc etc, but that might be done at runtime - does calling init_tss from that thread bugger other stuff up?
<hkaiser> jbjnr_: not sure
<hkaiser> your concerns are well founded
Matombo has joined #ste||ar
bikineev has joined #ste||ar
K-ballo has joined #ste||ar
pree_ has quit [Ping timeout: 260 seconds]
eschnett has joined #ste||ar
bikineev has quit [Ping timeout: 258 seconds]
patg_away has quit [Read error: Connection reset by peer]
bikineev has joined #ste||ar
patg has joined #ste||ar
patg is now known as Guest83749
ajaivgeorge has joined #ste||ar
<ABresting> hkaiser: yt?
<ABresting> heller_: yt?
<ABresting> I am developing a wrapper for libsigsegv, which takes input from the user on whether they want to detect stack overflow or not
<ABresting> is it optional? or should we make it the default?
bikineev has quit [Ping timeout: 276 seconds]
<ABresting> also with the new build there is an error
<zao> Nifty, ought to use operator-bool or get(), I guess.
<K-ballo> or -> ?
<zao> Indeed.
<zao> (good catch :D)
<K-ballo> where does that come from? it is not visible in circle
<zao> Can't wait for operator-dot.
<K-ballo> oh boy, that's gonna be... fun
<K-ballo> I can't tell what the types of the things involved in all those expressions are :/
<K-ballo> some sort of tuple...
<K-ballo> subtle
<K-ballo> gcc 4.6 workarounds from 2 years ago
<zao> #define TEMPORARY_HACKFIX_FOR_REAL_THIS_TIME :P
pree_ has joined #ste||ar
jaafar has joined #ste||ar
<ABresting> I accidentally built it on an outdated gcc VM :P
eschnett has quit [Quit: eschnett]
<K-ballo> ah, that explains it
<K-ballo> it seems when the requirements for gcc were bumped to 4.9, the codebase wasn't properly updated
<ABresting> yes and many people use that by default in Debian at least, even I forgot about it since the last time I built it
Matombo has quit [Ping timeout: 255 seconds]
<zao> Hey, Debian released the other day, no excuse to run old crap anymore :)
<ABresting> yes but stable 14.04 is widely used
<ABresting> and it comes with the GCC 4.8
<ABresting> even my VM was centos 7 and the default was 4.8 :P
<K-ballo> there should be some error at the configure level when trying to use an unsupported compiler
<zao> I wonder where I got debian from...
<ABresting> it's written in the prerequisites, I had it last time as my personal laptop was updated but chose to ignore it this time
<K-ballo> written in prerequisites doesn't cut it, compilers don't read documentation :P
<zao> I had expected us to hard-bail in CMake indeed.
<ABresting> yes a config level check is necessary
<heller_> There should be an error when running cmake
<zao> Probably a sign that CMake is crap, eh? :P
<heller_> Based on missing features...
<zao> Sadly I don't have any legacy horror-cluster anymore.
<zao> And as we're fully on EasyBuild now, I have decent compilers all around.
<K-ballo> the code was not updated when the requirements changed
<zao> (silly Cuda toolchain still stuck on GCC 5.4 *grmbhl*)
<ABresting> heller_: I am developing a wrapper for libsigsegv, which takes input from the user on whether they want to detect stack overflow or not. is it optional? or should we make it the default?
<zao> "input" as in CMake flag, runtime flag?
<heller_> I like the discussion on boost-dev... Everybody is a cmake expert and they contradict all the time on what's standard cmake3 and what not and what a cmake2ism is now
<heller_> ABresting: yes, make it configurable, that's a good idea
<K-ballo> I was conveniently pushed out of the boost-dev mailing list just before the cmake discussion started
<K-ballo> I read the original proposal, the first mail, op has no idea how cmake looks nowadays
aserio has joined #ste||ar
<ABresting> heller_: I think I should give user a choice to set path in config file and if not then try and read it from a default location?
<heller_> ABresting: look at the find modules we have
<zao> I like the versions of CMake available on my site now. Anything between 3.5.2 and 3.8.1 - nothing less :D
<heller_> There are tons of optional dependencies
<zao> Like say, hwloc?
<zao> (it's the one I keep installing out of habit)
<zao> Oh wait, hwloc is "required"
<zao> PAPI is probably a better one.
<ABresting> also, when user is going to write the code, they have to specify in the main() to call our wrapper
<ABresting> heller_: *
<hkaiser> K-ballo: Niall is part of the cmake discussion which means it will go nowhere, so no worries...
<K-ballo> lol
<zao> Boost.Lite solves all the problems.
* zao nods sagely
pree_ has quit [Ping timeout: 260 seconds]
pree__ has joined #ste||ar
Matombo has joined #ste||ar
<jbjnr_> OMG the horror I looked at the boost list, but there's no way I can read all that cmake stuff. far too much of it
hkaiser has quit [Quit: bye]
david_pfander has joined #ste||ar
Matombo has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
eschnett has joined #ste||ar
ajaivgeorge has quit [Read error: Connection reset by peer]
akheir has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
david_pfander has quit [Ping timeout: 260 seconds]
<ABresting> heller_: yt?
jakemp has joined #ste||ar
ajaivgeorge has joined #ste||ar
shoshijak has quit [Quit: Ex-Chat]
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
pree__ has quit [Ping timeout: 260 seconds]
zbyerly_ has joined #ste||ar
bibek_desktop has quit [Quit: Leaving]
<jbjnr_> heller_: why is the runtime pointer stored in TSS - there is surely only one runtime?
<jbjnr_> we don't keep one runtime instance per thread do we?
shoshijak has joined #ste||ar
shoshijak has quit [Client Quit]
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
shoshijak has joined #ste||ar
bibek_desktop has joined #ste||ar
bikineev has joined #ste||ar
aserio1 has joined #ste||ar
<wash[m]> Hkaiser: did you ping me earlier?
wash has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
aserio has quit [Ping timeout: 255 seconds]
aserio1 is now known as aserio
bikineev has quit [Ping timeout: 240 seconds]
shoshijak has joined #ste||ar
zbyerly_ has quit [Ping timeout: 255 seconds]
Matombo has joined #ste||ar
shoshijak has quit [Ping timeout: 240 seconds]
zbyerly_ has joined #ste||ar
pree has joined #ste||ar
eschnett has quit [Quit: eschnett]
jgoncal has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
aserio has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
zbyerly has quit [Quit: Leafing]
Matombo has quit [Ping timeout: 260 seconds]
eschnett has joined #ste||ar
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
shoshijak has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
aserio has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
aserio has quit [Ping timeout: 260 seconds]
<heller_> jbjnr_: I can't answer that question. It has been there since forever
<heller_> One reason I can imagine is to differentiate between a non-HPX and an HPX context
aserio has joined #ste||ar
denis_blank has quit [Quit: denis_blank]
hkaiser has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
akheir has quit [Remote host closed the connection]
<heller_> Still no news from ACM :/
<heller_> Last year's finalists were announced late August :(
<heller_> Shall I wait until Friday and then poke the head of the committee again?
bikineev has joined #ste||ar
<github> [hpx] hkaiser created clang_format (+1 new commit): https://git.io/vQUWY
<github> hpx/clang_format 80fa653 Hartmut Kaiser: Adding .clang-format file
<github> [hpx] hkaiser opened pull request #2706: Adding .clang-format file (master...clang_format) https://git.io/vQUWE
<zao> hkaiser: SortIncludes? You brave soul.
<zao> Guess that you have faith in your categories...
<hkaiser> zao: worked well so far
<zao> Better hope you never touch any Windows header anywhere. Ever.
<hkaiser> heh
<zao> They're so order dependent it's not funny.
<hkaiser> we'll find out
<zao> winsock2, windows, atlbase, atl*
<zao> Kept breaking my build whenever things jiggled around in my code.
<hkaiser> right, good point
<zao> Of less interest to you, gl.h and all the extension wranglers.
<hkaiser> but this is HPX, we don't do windows ;)
<zao> Can you override for subtrees, if some fancy component happens upon something sensitive?
<zao> Assuming you want Good Stuff in your tree :)
<hkaiser> zao: also, this is more of a call for comments
<zao> Yeah, sure, just mentioning the immediate pain flashback :)
<github> [hpx] hkaiser force-pushed clang_format from 80fa653 to 805db36: https://git.io/vQUlN
<github> hpx/clang_format 805db36 Hartmut Kaiser: Adding .clang-format file
<K-ballo> some includes are defined as order dependent
<github> [hpx] hkaiser force-pushed clang_format from 805db36 to dd6901b: https://git.io/vQUlN
<github> hpx/clang_format dd6901b Hartmut Kaiser: Adding .clang-format file
<K-ballo> config, config/asio, warning prefix-suffix
shoshijak has quit [Ping timeout: 240 seconds]
<hkaiser> K-ballo: it does not sort headers which are separated by empty lines
<K-ballo> smart
Matombo has quit [Remote host closed the connection]
jakemp has quit [Ping timeout: 258 seconds]
zbyerly_ has quit [Ping timeout: 246 seconds]
aserio has quit [Quit: aserio]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
eschnett has quit [Quit: eschnett]
ajaivgeorge has quit [Remote host closed the connection]
ajaivgeorge has joined #ste||ar
K-ballo has quit [Remote host closed the connection]
titzi has quit [Write error: Broken pipe]
ajaivgeorge has quit [Remote host closed the connection]
titzi has joined #ste||ar
ajaivgeorge has joined #ste||ar
titzi has quit [Changing host]
titzi has joined #ste||ar
K-ballo has joined #ste||ar
bikineev has quit [Remote host closed the connection]
zbyerly has joined #ste||ar
zbyerly has quit [Ping timeout: 240 seconds]