aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 260 seconds]
thundergroudon[m has quit [Ping timeout: 240 seconds]
autrilla has quit [Ping timeout: 240 seconds]
diehlpk_work has quit [Ping timeout: 260 seconds]
EverYoung has joined #ste||ar
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
EverYoun_ has quit [Read error: Connection reset by peer]
EverYoun_ has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
EverYoun_ has quit [Read error: Connection reset by peer]
EverYoun_ has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
thundergroudon[m has joined #ste||ar
EverYoung has joined #ste||ar
EverYoun_ has quit [Read error: Connection reset by peer]
autrilla has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
gedaj_ is now known as gedaj
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
daissgr has quit [Ping timeout: 256 seconds]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoun_ has joined #ste||ar
EverYoun_ has quit [Remote host closed the connection]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
vamatya has quit [Ping timeout: 256 seconds]
EverYoun_ has quit [Ping timeout: 252 seconds]
EverYoung has joined #ste||ar
EverYoun_ has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
EverYoun_ has quit [Ping timeout: 265 seconds]
vamatya has joined #ste||ar
daissgr has joined #ste||ar
ct_ has joined #ste||ar
<ct_> hkaiser, bibek do y'all mind if i take a break from blaze things occasionally to write an algorithm for phylanx?
ct_ is now known as ct-clmsn
daissgr has quit [Ping timeout: 256 seconds]
diehlpk_work has joined #ste||ar
<github> [hpx] hkaiser pushed 1 new commit to refactor_base_action: https://git.io/vAv7P
<github> hpx/refactor_base_action 1e40175 Hartmut Kaiser: Making some traits constexpr instead of ALWAYS_EXPORT
galabc has joined #ste||ar
<hkaiser> ct-clmsn: not at all! ;)
<ct-clmsn> hkaiser, would you prefer the python codes "stick" to the existing primitives?
<hkaiser> what do you mean?
<hkaiser> ahh, you mean to use only existing primitives?
<hkaiser> nah, let's take this as an incentive to create the missing ones
<ct-clmsn> ok
<hkaiser> sorry, gtg
<ct-clmsn> np
<ct-clmsn> goodnight!
hkaiser has quit [Quit: bye]
galabc has quit []
ct-clmsn has quit [Quit: Leaving]
autrilla has quit [Ping timeout: 240 seconds]
thundergroudon[m has quit [Ping timeout: 255 seconds]
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
daissgr has joined #ste||ar
hkaiser has joined #ste||ar
vamatya has quit [Quit: Leaving]
<github> [hpx] hkaiser force-pushed refactor_base_action from 1e40175 to 145507c: https://git.io/vAvAI
<github> hpx/refactor_base_action 145507c Hartmut Kaiser: Making some traits constexpr instead of ALWAYS_EXPORT
daissgr has quit [Ping timeout: 256 seconds]
hkaiser has quit [Quit: bye]
nanashi55 has quit [Ping timeout: 248 seconds]
nanashi55 has joined #ste||ar
mcopik has quit [Ping timeout: 240 seconds]
zombieleet has joined #ste||ar
zombieleet has quit [Ping timeout: 260 seconds]
thundergroudon[m has joined #ste||ar
autrilla has joined #ste||ar
<github> [hpx] msimberg closed pull request #3137: Suspend speedup (master...suspend-speedup) https://git.io/vNNqi
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
jaafar has quit [Ping timeout: 240 seconds]
david_pfander has joined #ste||ar
<jbjnr> simbergm: just a heads up - I have merged my changes into pycicle, but I need to update the docs because the layout of the repos has changed - there's an extra subdir now so that multiple projects can be managed at the same time.
<jbjnr> (we can't actually do multiple projects simultaneously yet - but the repo dir can have different ones)
zombieleet has joined #ste||ar
Mridul has joined #ste||ar
hkaiser has joined #ste||ar
<github> [hpx] hkaiser force-pushed refactor_base_action from 145507c to 6498068: https://git.io/vAvAI
<heller_> hkaiser: good morning
<heller_> did you measure binary sizes etc with that patch?
<hkaiser> yes
<heller_> what's the gain?
<hkaiser> binary sizes go down by 10% for phylanx, main core shared library is minimally affected
<heller_> still
<heller_> nice
<hkaiser> nod
<heller_> this might help the partitioned_vector stuff as well
<hkaiser> it might - I have not checked
<hkaiser> although there it's mainly action related bloat, this patch does not address that
<hkaiser> well, a bit perhaps
<heller_> we'll see
<heller_> next stop, component migration fixes ;)?
<hkaiser> k
<heller_> this thread stack test is driving me nuts
<zao> arr
<heller_> hkaiser: /usr/bin/ld: error: undefined symbol: hpx::actions::base_action_data::load_base(hpx::serialization::input_archive&)
<hkaiser> ok, will investigate - works here
<heller_> hkaiser: and is this base_action_data not missing the bitwise serializable markup?
<hkaiser> base_action was never bitwise serializable, was it?
<heller_> base_action was not
<heller_> but the action_serialization_data which you removed in favor of base_action_data
<hkaiser> transfer_action_base was not either
<hkaiser> didn't remove it
<hkaiser> it's in the cpp file
<heller_> it looks like you're just missing a cpp file
<hkaiser> ahh!
<heller_> forgot to commit even
<hkaiser> now
<github> [hpx] hkaiser force-pushed refactor_base_action from 6498068 to 00cecc6: https://git.io/vAvAI
<github> hpx/refactor_base_action 00cecc6 Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols
gedaj has quit [Remote host closed the connection]
<heller_> while you are at it ... the parent_XXX stuff is always serialized, even if the feature is turned off
gedaj has joined #ste||ar
<hkaiser> intentionally
<hkaiser> there is even a comment in there
<heller_> it's about the only place where we care for binary compatibility ;)
<hkaiser> we have to start somewhere ;)
<hkaiser> it was that way forever
<heller_> sure
<heller_> 32 vs. 8 bytes ... not sure if that's measurable though
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
<heller_> hkaiser: component_base.cpp is missing as well
<hkaiser> sorry
<github> [hpx] hkaiser force-pushed refactor_base_action from 00cecc6 to 2e8b98d: https://git.io/vAvAI
<github> hpx/refactor_base_action 2e8b98d Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols
<hkaiser> should be fine now
<heller_> and I think I got the thread stacks under control as well now...
<hkaiser> nice
<heller_> can_invoke_locally might lead to stack overflows
<heller_> if there isn't enough space left on the current stack
<hkaiser> heller_: there always can be a stack-overflow
<heller_> yeah sure
<heller_> but that way, we are asking for it to happen..
<heller_> especially with this test
<hkaiser> I don't see how this is different from executing any other function in this regard
<heller_> the difference is that they might request to be executed with a given stack size
<github> [hpx] sithhell created fixing_thread_stacks_3 (+1 new commit): https://git.io/vAfPL
<github> hpx/fixing_thread_stacks_3 d073b08 Thomas Heller: Avoiding more stack overflows...
<hkaiser> I disagree, nothing special about this
<heller_> then the test doesn't make sense
<heller_> or is broken by design
mcopik has joined #ste||ar
<heller_> we assume that we have almost HPX_SMALL_STACK_SIZE (or equivalent) space available
<hkaiser> the test was written at a point where we didn't execute local action directly - so it might not reflect the current state anymore
<hkaiser> the test should really force everything onto a new thread
<hkaiser> otherwise it really does not make sense
<hkaiser> I thought the existing can_invoke_locally enforces that
<heller_> not for small stack sizes
<heller_> not sure what hpx_main runs in
<hkaiser> nod
<github> [hpx] sithhell opened pull request #3150: Avoiding more stack overflows (master...fixing_thread_stacks_3) https://git.io/vAfP6
<heller_> so actions are allowed to request that they run with a given stack size. The question is, do we guarantee that actions have at least as much stack available as they requested or not?
<hkaiser> right
<hkaiser> we need self-growing stacks, then all of this nonsense is not needed anymore
<heller_> sure, we can't change that right now :/
<hkaiser> indeed
<simbergm> heller_: nice job with the stacks
<simbergm> I was looking at the failures with boost 1.58 and gcc4.9 and it seems maybe small_vector was still buggy in that release
eschnett has quit [Quit: eschnett]
<simbergm> much better if I use the vector implementation
<heller_> which failures in particular?
<heller_> ah, ok
<simbergm> they all fail there
<hkaiser> simbergm: let's raise the bar for small_vector, then
<simbergm> couldn't find anything in the release notes but I would raise it to 1.59
<simbergm> yeah
<heller_> let's do it then
<simbergm> yep, will do it
<simbergm> jbjnr: thanks for the update on pycicle, will stick to the old version for now until there are significant changes
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
<github> [hpx] msimberg created msimberg-patch-1 (+1 new commit): https://git.io/vAf1T
<github> hpx/msimberg-patch-1 cbc7496 Mikael Simberg: Require boost version 1.59 to use small_vector...
<github> [hpx] msimberg opened pull request #3151: Use small_vector only from boost version 1.59 onwards (master...msimberg-patch-1) https://git.io/vAf1G
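A minimal sketch of the kind of version gate discussed above: use boost's small_vector only when Boost is at least 1.59, and fall back to std::vector otherwise. The names small_buffer, EXAMPLE_HAVE_SMALL_VECTOR and fill_and_count are hypothetical and are not taken from the HPX sources; the actual change in PR #3151 may look different. The __has_include probe is only there so the sketch also compiles on a machine without Boost.

```cpp
// Sketch only: the macro, alias and function names are hypothetical, not HPX's.
#include <cassert>
#include <cstddef>
#include <vector>

// Pull in BOOST_VERSION if Boost is installed; otherwise stay on the fallback.
#if defined(__has_include)
#  if __has_include(<boost/version.hpp>)
#    include <boost/version.hpp>
#  endif
#endif

// BOOST_VERSION is major * 100000 + minor * 100 + patch, so 1.59.0 is 105900.
#if defined(BOOST_VERSION) && BOOST_VERSION >= 105900
#  include <boost/container/small_vector.hpp>
#  define EXAMPLE_HAVE_SMALL_VECTOR 1
template <typename T, std::size_t N>
using small_buffer = boost::container::small_vector<T, N>;
#else
#  define EXAMPLE_HAVE_SMALL_VECTOR 0
// Fallback: plain std::vector, the inline capacity N is simply ignored.
template <typename T, std::size_t N>
using small_buffer = std::vector<T>;
#endif

// Push n integers into the buffer and report how many it holds.
inline std::size_t fill_and_count(std::size_t n) {
    small_buffer<int, 8> buf;
    for (std::size_t i = 0; i != n; ++i)
        buf.push_back(static_cast<int>(i));
    assert(buf.size() == n);
    return buf.size();
}
```

Calling code only sees small_buffer, so raising or lowering the version threshold stays a one-line change.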
<jbjnr> simbergm: what would be useful is this: create an issue in pycicle saying that you want to build with different options, and paste in an example of the flags/options/other that you would like to use. Then I will work on allowing us to do builds with N sets of different settings for builds (compiler/boost/cxxflags/cmake options/etc/etc)
<jbjnr> I would like to have options defined that we can choose from, with other dependent options that change depending on which of the other options are selected (so if you want this, you must have that, but cannot have those, etc)
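The "if you want this, you must have that, but cannot have those" idea can be sketched as a small rule check. This is an illustration only: pycicle itself is a Python project, and the rule struct, the check function and any option names passed to it are hypothetical, not pycicle's configuration format.

```cpp
// Hypothetical sketch: rule shape and option names are made up for illustration.
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct rule {
    std::string option;                  // if this option is selected...
    std::vector<std::string> needs;      // ...these must be selected as well
    std::vector<std::string> conflicts;  // ...and these must not be selected
};

// Return one message per violated rule; an empty result means the set is valid.
inline std::vector<std::string> check(std::set<std::string> const& selected,
                                      std::vector<rule> const& rules) {
    std::vector<std::string> errors;
    for (rule const& r : rules) {
        if (selected.count(r.option) == 0)
            continue;  // rule does not apply to this selection
        for (std::string const& need : r.needs)
            if (selected.count(need) == 0)
                errors.push_back(r.option + " requires " + need);
        for (std::string const& bad : r.conflicts)
            if (selected.count(bad) != 0)
                errors.push_back(r.option + " conflicts with " + bad);
    }
    return errors;
}
```

Each build configuration is then just a set of option names, and a configuration is only scheduled when check returns no errors.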
K-ballo has quit [Quit: K-ballo]
<simbergm> jbjnr: hmm, sure, I'll try to write up some minimal requirements
<github> [hpx] sithhell force-pushed fixing_thread_stacks_3 from d073b08 to 4f8418e: https://git.io/vAfMo
<github> hpx/fixing_thread_stacks_3 4f8418e Thomas Heller: Avoiding more stack overflows...
<jbjnr> oops. just killed some build jobs on daint by accident.
K-ballo has joined #ste||ar
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
hkaiser has quit [Quit: bye]
eschnett has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vAfSs
<github> hpx/gh-pages 2a7baa1 StellarBot: Updating docs
hkaiser has joined #ste||ar
hkaiser has quit [Client Quit]
hkaiser has joined #ste||ar
<github> [hpx] biddisco closed pull request #3151: Use small_vector only from boost version 1.59 onwards (master...msimberg-patch-1) https://git.io/vAf1G
<hkaiser> jbjnr: yt?
<github> [hpx] hkaiser deleted msimberg-patch-1 at cbc7496: https://git.io/vAfbB
<jbjnr> hkaiser: here
<jbjnr> did I do something bad?
<hkaiser> jbjnr: nah
<hkaiser> jbjnr: you played with hpx.compute last year and created a wrapper for directly executing existing cuda kernels
<hkaiser> do you still have that code?
<jbjnr> yes - sort of
<jbjnr> it's nearby - I was planning on committing an example, like the cublas one
<jbjnr> I can do it tomorrow
<hkaiser> no need for an example, just the existing code would be fine
<jbjnr> but we don't need hpx::compute for it anyway
<jbjnr> well, not really
zombieleet has quit [Quit: Leaving]
<hkaiser> Gregor wants to start integrating his cuda kernels into octotiger..
<jbjnr> let me dig it out
<hkaiser> just the lowest level kernel launching abstractions are needed, not really hpx.compute
<github> [hpx] hkaiser force-pushed refactor_base_action from 2e8b98d to 881ad18: https://git.io/vAvAI
<github> hpx/refactor_base_action 881ad18 Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols...
aserio has joined #ste||ar
<simbergm> jbjnr: sorry, missed your issue on pycicle (#12)... just opened #13
<simbergm> maybe something useful in there
<jbjnr> ta
<github> [hpx] msimberg opened pull request #3152: Documentation for runtime suspension (master...suspend-documentation) https://git.io/vAJeA
daissgr has joined #ste||ar
eschnett has quit [Quit: eschnett]
daissgr has quit [Ping timeout: 255 seconds]
eschnett has joined #ste||ar
daissgr has joined #ste||ar
<hkaiser> jbjnr: did you find the code?
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 265 seconds]
aserio1 has quit [Remote host closed the connection]
aserio has joined #ste||ar
Mridul has quit [Quit: Page closed]
<jbjnr> that's what I did in the hackathon
<hkaiser> jbjnr: thanks!
aserio has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
aserio has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
daissgr has quit [Ping timeout: 260 seconds]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<heller_> while ./bin/wait_for_1751_test --hpx:threads=4 --hpx:attach-debugger=exception; do date; done
<heller_> dumdidum
<Zwei> Err, sorry for afk, just very busy with job interviews atm
<Zwei> Will contribute after all these interviews settle down
jaafar has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
EverYoung has joined #ste||ar
vamatya has joined #ste||ar
<heller_> during shutdown...
<hkaiser> you tell me...
<zao> How nice.
<heller_> i have no idea
<heller_> hpx_main either was never put into thread_map_ or something attempts to delete it twice
<heller_> only happens once in a hundred times or so
<hkaiser> or it's the wrong map
<github> [hpx] hkaiser force-pushed refactor_base_action from 32c31c2 to 570f67b: https://git.io/vAvAI
<github> hpx/refactor_base_action 570f67b Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols...
<heller_> (gdb) print tid.thrd_->queue_ == this
<heller_> $6 = true
<heller_> sanitizer time?
aserio has joined #ste||ar
<heller_> or terminated_items_ is racy
<heller_> but rather unlikely, the count should reflect that
<hkaiser> it's one of the lockfree queues, yes?
<heller_> yes
<heller_> the thread that it attempts to delete certainly doesn't look corrupted
<heller_> or well, recycled
<heller_> so I turned off recycling, and started an address sanitizer build
<heller_> let's see how long I have to wait...
<heller_> lol, the first run
daissgr has joined #ste||ar
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
daissgr has quit [Quit: WeeChat 1.4]
daissgr has joined #ste||ar
<github> [hpx] hkaiser force-pushed refactor_base_action from 570f67b to dc1dc0d: https://git.io/vAvAI
<github> hpx/refactor_base_action dc1dc0d Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols...
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
aserio1 is now known as aserio
<heller_> hkaiser: heap use after free when setting timed threads...
<heller_> looks like this is the corner case for shared ownership...
david_pfander has quit [Ping timeout: 252 seconds]
daissgr has quit [Ping timeout: 276 seconds]
shahrzad has quit [Read error: Connection reset by peer]
daissgr has joined #ste||ar
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 264 seconds]
aserio1 is now known as aserio
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
<heller_> but even if the lifetime problem was fixed .... there's still an issue with setting the state of an already terminated thread, which might have been reused already
EverYoun_ has joined #ste||ar
hkaiser has quit [Quit: bye]
EverYoung has quit [Ping timeout: 265 seconds]
eschnett has quit [Quit: eschnett]
hkaiser has joined #ste||ar
<parsa[w]> hkaiser: see phylanx/#203
<parsa[w]> too late :)
<hkaiser> parsa[w]: thanks for investigating though - interesting effects ;)
<parsa[w]> hkaiser: do you think it is going to be a problem elsewhere that they're referring to the same variables?
<hkaiser> shrug
<hkaiser> let's find out what's actually going on
<parsa[w]> if not i'll ignore them for now in the test or maybe group them together
<parsa[w]> okay
<heller_> ha! got it
eschnett has joined #ste||ar
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
<hkaiser> heller_: this might be a result of removing the intrusive_ptr
<heller_> hkaiser: removing intrusive_ptr just made it more obvious
<heller_> hkaiser: the race has been in there since forever
<hkaiser> ok
<heller_> it's just affecting timed suspensions
<hkaiser> heller_: that could have been the use case that triggered using the intrusive_ptr's in the first place
<heller_> yeah, the problem is that the suspended thread might terminate too quickly for the timer thread to start
<heller_> so even with intrusive pointer, we might access a thread that has long been terminated and eventually been recycled
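One generic way to guard against waking a recycled object is a generation counter, sketched below. This is not what PR #3153 does (the log does not show the fix), and thread_slot, timer_ticket, arm_timer and fire_timer are hypothetical names, not HPX types. The sketch is single-threaded and assumes the slot's memory stays alive in a pool; a real scheduler would also need atomics or a lock around these fields.

```cpp
// Hypothetical sketch: none of these names are HPX types or functions.
#include <cassert>
#include <cstdint>

enum class state { suspended, pending, terminated };

struct thread_slot {
    state         current    = state::suspended;
    std::uint64_t generation = 0;  // bumped every time the slot is reused

    // Called when a terminated thread object is handed out again.
    void recycle() {
        assert(current == state::terminated);
        ++generation;
        current = state::suspended;
    }
};

// What a timer remembers when the timed suspension is set up.
struct timer_ticket {
    thread_slot*  slot;
    std::uint64_t generation;
};

inline timer_ticket arm_timer(thread_slot& s) {
    return timer_ticket{&s, s.generation};
}

// Resume only if the slot still belongs to the thread that armed the timer
// and that thread is still suspended. Returns true if a wakeup happened.
inline bool fire_timer(timer_ticket const& t) {
    thread_slot& s = *t.slot;
    if (s.generation != t.generation)
        return false;  // slot was recycled for another thread
    if (s.current != state::suspended)
        return false;  // already resumed or terminated
    s.current = state::pending;
    return true;
}
```

A stale ticket is then ignored instead of changing the state of whichever thread now occupies the slot.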
<github> [hpx] hkaiser created clang_tidy (+1 new commit): https://git.io/vAJha
<github> hpx/clang_tidy f80c205 Hartmut Kaiser: More refactorings based on clang-tidy reports
<heller_> are you running clang-tidy on windows now?
<github> [hpx] sithhell created fix_timed_suspension (+1 new commit): https://git.io/vAJjF
<github> hpx/fix_timed_suspension 3f00c42 Thomas Heller: Fixing a race with timed suspension...
<github> [hpx] sithhell opened pull request #3153: Fixing a race with timed suspension (master...fix_timed_suspension) https://git.io/vAJjp
<heller_> I'm letting the test run for the next 3.5 hours now
<heller_> given that I encountered failures with asan on every other run, it looks pretty good now
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
aserio has quit [Quit: aserio]
zombieleet has joined #ste||ar
galabc has joined #ste||ar
galabc has quit [Ping timeout: 260 seconds]
<github> [hpx] hkaiser force-pushed refactor_base_action from dc1dc0d to 8d692d5: https://git.io/vAvAI
<github> hpx/refactor_base_action 8d692d5 Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols...