aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
<github> [hpx] hkaiser force-pushed disable_executor_compatibility from 03bd6cc to 27a034c: https://git.io/vN2lz
<github> hpx/disable_executor_compatibility 27a034c Hartmut Kaiser: This patch disables default executor compatibility with V1 executors...
<github> [hpx] hkaiser force-pushed reinit_counters from ac55fa6 to 9fff8c6: https://git.io/vN2a0
<github> hpx/reinit_counters 9fff8c6 Hartmut Kaiser: Adding performance_counter::reinit to allow for dynamically changing counter sets...
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 265 seconds]
parsa has quit [Quit: Zzzzzzzzzzzz]
Smasher has quit [Remote host closed the connection]
daissgr has quit [Quit: WeeChat 1.4]
hkaiser has quit [Quit: bye]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
nanashi55 has quit [Ping timeout: 256 seconds]
nanashi55 has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
parsa has quit [Ping timeout: 256 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
parsa has joined #ste||ar
parsa has quit [Ping timeout: 260 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
<github> [hpx] sithhell force-pushed fix_thread_overheads from 1cdb0f6 to 8ecdf93: https://git.io/vNByp
<github> hpx/fix_thread_overheads 1b3591b Bruno Pitrus: Fix #3068
<github> hpx/fix_thread_overheads 4354c6d Hartmut Kaiser: Relax atomic operations on performance counter values
<github> hpx/fix_thread_overheads ffe8e6b Mikael Simberg: Add suspend function to scheduled thread pool
<heller_> jbjnr_: alright. I am happy with my changes so far. improved the scheduling overhead significantly
<heller_> eagerly awaiting your results now...
<jbjnr_> ok. I have actually started a run this morning using the current master, so I can do a comparison with your branch later. Seems like daint is giving me trouble for the moment, so watch this space.
<jbjnr_> is your banch up to date
<jbjnr_> ^branch
<heller_> the only thing left to do is improving the setting of the actual thread state, which I believe is total overkill at the moment
<heller_> I just pushed, yes
<heller_> I will gradually make PRs now to get the stuff in
<jbjnr_> I will review your PRs to make sure we make progress
<jbjnr_> thanks for doing this
<heller_> I have significant interest in this as well ;)
<jbjnr_> there's a lot riding on this
<heller_> overall, I am quite pleased with the low-hanging fruit so far
<heller_> and there's way more
<simbergm> heller_: do you think you'll have time to have a look at the cuda build for the release? or should we skip that one?
<heller_> simbergm: the one on rostam?
<simbergm> yep
<simbergm> you mean it's fine for you?
<simbergm> fine=building
<heller_> I can do that... the biggest issue there is that we pass strange strings to the builder
<heller_> I will have a look at it
<simbergm> heller_: thanks, if you're tight on time I'll try to find some time to look at it as well, just let me know
<heller_> yeah, you need to change the rostam build settings for it to be effective
<simbergm> heller_: okay, that I can't do but if that's enough to get it running great
<simbergm> or maybe I can...
david_pfander has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Ping timeout: 248 seconds]
<simbergm> jbjnr_: pycicle would not be unhappy if it launches multiple jobs in choose_and_launch, right? i.e. your bool(random.getrandbits(1)) is just to not have so many jobs running, and not because pycicle can't handle launching two jobs at the same time?
<simbergm> (trying to add a sanitizer build)
EverYoung has joined #ste||ar
<jbjnr_> simbergm: the random bool is just a kludgey way of selecting between a gcc and clang build since I have not added a format 'option matrix' that allows you to say gcc/with cuda/with hwloc/without X/with Y etc.
<jbjnr_> when I add a proper way of selecting multiple options, that will change.
<jbjnr_> and No, you can't have two builds using the same binary or src dir going at the same time (well, src dir, yes, but same branch only)
<jbjnr_> best not to
<jbjnr_> and if two jobs with the same name are launched, the first one is killed
<jbjnr_> so that if you force push your PR branch, then any existing build is killed and a new one started
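[Editor's sketch] The two behaviours described above (a coin flip between compilers standing in for a proper option matrix, and a newly launched job replacing any running job of the same name) might look roughly like the following. This is not pycicle's actual code: `OPTIONS`, `random_compiler`, `option_matrix` and `JobTable` are hypothetical names invented for illustration, and the "kill" is only recorded rather than sent to a real scheduler.

```python
import itertools
import random

# Hypothetical option axes; the chat only names gcc/clang, CUDA and hwloc as examples.
OPTIONS = {
    "compiler": ["gcc", "clang"],
    "cuda": [False, True],
}


def random_compiler():
    """The kludge described above: a coin flip between two compilers."""
    return "gcc" if bool(random.getrandbits(1)) else "clang"


def option_matrix(options):
    """Enumerate every combination of option values as a list of dicts."""
    keys = sorted(options)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(options[k] for k in keys))
    ]


class JobTable:
    """Tracks running jobs by name; launching a duplicate name replaces the old job."""

    def __init__(self):
        self.running = {}  # job name -> job id
        self.killed = []   # ids of jobs that were replaced

    def launch(self, name, job_id):
        """Register job_id under name and return the id it replaced, if any."""
        old = self.running.get(name)
        if old is not None:
            # Stand-in for cancelling the real job on the cluster.
            self.killed.append(old)
        self.running[name] = job_id
        return old
```

With a table like this, a force push to a PR branch would launch a job under the same name as the build already running, so the stale build is dropped and only the newest one remains.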
EverYoung has quit [Ping timeout: 276 seconds]
<simbergm> jbjnr_: right
<simbergm> PYCICLE_BUILD_STAMP seems to go into the build directory name, so it seems like if that one is unique a sanitizer build could run alongside the normal one
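[Editor's sketch] A minimal illustration of the point above: if the build stamp is part of the build directory name, two builds of the same PR with different stamps never share a binary directory. The directory layout and the function name `build_dir_for` are assumptions made for this example; only `PYCICLE_BUILD_STAMP` itself comes from the conversation.

```python
import os


def build_dir_for(root, pr_number, build_stamp):
    """Return a build directory path that embeds the build stamp.

    The layout root/build/<pr>-<stamp> is made up for illustration.
    """
    return os.path.join(root, "build", "{}-{}".format(pr_number, build_stamp))


# Same PR, different stamps: the two builds get separate binary directories.
normal = build_dir_for("/scratch/pycicle", 3120, "gcc-release")
sanitizer = build_dir_for("/scratch/pycicle", 3120, "gcc-sanitizer")
```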
<simbergm> jbjnr_: second question, I started making hwloc compulsory, would there be any reason to keep the topology interface separate from the hwloc implementation? afaict there are no alternatives to hwloc
<jbjnr_> no the abstract base class can be removed if hwloc is compulsory
<simbergm> ok, thanks
<jbjnr_> I can do that if you want
<jbjnr_> (I am not sure if we agreed to go ahead with it)
<jbjnr_> (so I held back)
<simbergm> hmm, I agree, it wasn't completely agreed upon
<simbergm> but I got the impression hkaiser wasn't against it either, more against hwloc 2.0 and integrating it completely with hpx
<simbergm> so I thought better make it compulsory, because hpx is broken anyway without it, so it's already compulsory in practice
<simbergm> but yes, please do it if you find the time
<jbjnr_> I played with hwloc 2 and it breaks everything, so I think we should not try hwloc 2 yet, but make hwloc compulsory
<jbjnr_> and remove the abstract iface
<jbjnr_> I will add it to my todo list
<simbergm> that and some kind of rp documentation would be my two wishes for the release from you :)
<simbergm> thank you
<jbjnr_> rp docs will be my next frog then.
<simbergm> it seems like more rp tests and the executor/scheduler cleanup will be done if there is nothing else to do at the end, but I think there won't be enough time to do it properly
<simbergm> looking at my version 1.1.0 must haves list on github
<simbergm> then we just have to wait for me to get my runtime suspension done!
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
carpediem has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vN2jz
<github> hpx/gh-pages ac8357a StellarBot: Updating docs
<github> [hpx] sithhell created thread_data_refcount (+1 new commit): https://git.io/vN2jQ
<github> hpx/thread_data_refcount 79eeeb6 Thomas Heller: Don't use boost::intrusive_ptr for thread_id_type...
<github> [hpx] sithhell opened pull request #3120: Don't use boost::intrusive_ptr for thread_id_type (master...thread_data_refcount) https://git.io/vN2jA
<github> [hpx] sithhell closed pull request #3115: Fixing race condition in channel test (master...fixing_post_3104) https://git.io/vNzk9
mcopik has quit [Ping timeout: 240 seconds]
carpediem has quit [Ping timeout: 260 seconds]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
<zao> Having tried to use HPX on DragonFly and their weird hwloc topologies, hwloc is indeed pretty much mandatory already, and has some rather strict assumptions about hwloc hierarchies.
nanashi55 has quit [Ping timeout: 268 seconds]
nanashi55 has joined #ste||ar
<K-ballo> heller_: the new thread_id_type has pretty weird const semantics, were you going for pointer like semantics?
<heller_> K-ballo: yes, more or less an intrusive_ptr drop-in replacement
carpediem has joined #ste||ar
<K-ballo> in that case you should only have const qualified observers
<K-ballo> op*, op->, get, bool, etc
EverYoung has joined #ste||ar
hkaiser has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
<heller_> K-ballo: yeah ... left over ... I wanted to have proper const semantics first...
carpediem has quit [Ping timeout: 268 seconds]
<github> [hpx] sithhell force-pushed thread_data_refcount from 79eeeb6 to 48e1453: https://git.io/vNacO
<github> hpx/thread_data_refcount 48e1453 Thomas Heller: Don't use boost::intrusive_ptr for thread_id_type...
<heller_> K-ballo: fixed
<heller_> thanks
<K-ballo> heller_: looks good
<K-ballo> constexpr assignment won't work on 11, needs CONSTEXPR14
<heller_> copy construction will though?
<K-ballo> should, yes
<heller_> K-ballo: are the defaulted constructors automagically constexpr?
<K-ballo> yes
<K-ballo> and noexcept and trivial and everything
<K-ballo> I'm thinking the assignment operators ought to be redundant, but I'd have to try it out
<heller_> due to the conversion operator?
<K-ballo> right
<K-ballo> no
<K-ballo> the converting constructor
<K-ballo> I had not noticed the operator
<heller_> right, the constructor
<heller_> let me check it out...
hkaiser has quit [Ping timeout: 268 seconds]
hkaiser has joined #ste||ar
carpediem has joined #ste||ar
<hkaiser> jbjnr_: may I ask you to change pycicle in a way such that it does not reuse an existing cmake cache? currently it is impossible to get a change to the build system tested if it causes things to be removed from the cache.
<zao> I've never trusted CMake enough to let anything of my build-dir survive.
<zao> (for any project, not just HPX)
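[Editor's sketch] One way to avoid reusing a stale cache is to delete CMake's `CMakeCache.txt` file and `CMakeFiles/` directory from the build directory before each configure step. The helper below is a minimal sketch under that assumption. It is not pycicle's implementation, and `wipe_cmake_cache` is a hypothetical name.

```python
import os
import shutil


def wipe_cmake_cache(build_dir):
    """Delete CMakeCache.txt and CMakeFiles/ so the next configure starts fresh.

    Returns the list of paths that were actually removed.
    """
    removed = []
    cache_file = os.path.join(build_dir, "CMakeCache.txt")
    if os.path.isfile(cache_file):
        os.remove(cache_file)
        removed.append(cache_file)
    files_dir = os.path.join(build_dir, "CMakeFiles")
    if os.path.isdir(files_dir):
        shutil.rmtree(files_dir)
        removed.append(files_dir)
    return removed
```

Removing only these two entries keeps already compiled objects elsewhere in the build tree. Deleting the whole build directory, as zao prefers, is the stricter alternative at the cost of a full rebuild.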
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
diehlpk_work has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
aserio has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Ping timeout: 265 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 265 seconds]
<github> [hpx] sithhell force-pushed thread_data_refcount from 48e1453 to 9559988: https://git.io/vNacO
<github> hpx/thread_data_refcount 9559988 Thomas Heller: Don't use boost::intrusive_ptr for thread_id_type...
parsa has joined #ste||ar
parsa has quit [Client Quit]
parsa has joined #ste||ar
parsa has quit [Client Quit]
rtohid has left #ste||ar [#ste||ar]
rtohid has joined #ste||ar
thundergroudon[m has quit [*.net *.split]
thundergroudon[m has joined #ste||ar
hkaiser has joined #ste||ar
<diehlpk_work> hkaiser, Klaus merged hpx thread support for blazemark to master
<hkaiser> nice!
eschnett has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 265 seconds]
hkaiser has quit [Quit: bye]
carpediem has quit [Remote host closed the connection]
daissgr has joined #ste||ar
hkaiser has joined #ste||ar
<K-ballo> heller_: HPX_CONSTEXPR (C++11), HPX_CXX14_CONSTEXPR (C++14)
<K-ballo> op-> and op* are no longer 11 constexpr-able
EverYoung has joined #ste||ar
EverYoung has quit [Client Quit]
<github> [hpx] hkaiser closed pull request #3103: Adding support for generic counter_raw_values performance counter type (master...fixing_3102) https://git.io/vNWCp
EverYoung has joined #ste||ar
EverYoun_ has joined #ste||ar
hkaiser has quit [Quit: bye]
EverYoung has quit [Ping timeout: 260 seconds]
EverYoun_ has quit [Read error: Connection reset by peer]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
simbergm1 has joined #ste||ar
Smasher has joined #ste||ar
hkaiser has joined #ste||ar
<hkaiser> jbjnr_: yt?
hkaiser_ has joined #ste||ar
<diehlpk_work> Hi, tomorrow is the deadline for the gsoc application.
hkaiser has quit [Ping timeout: 268 seconds]
<diehlpk_work> Please check the old proposals and edit/update/delete them
<github> [hpx] hkaiser force-pushed disable_executor_compatibility from 27a034c to 1edade6: https://git.io/vN2lz
<github> hpx/disable_executor_compatibility 1edade6 Hartmut Kaiser: This patch disables default executor compatibility with V1 executors...
david_pfander has quit [Ping timeout: 248 seconds]
<jbjnr_> merged into running pycicle so if it shows improvement, I'll merge it to master
<hkaiser_> jbjnr_: thanks!
ct-clmsn has joined #ste||ar
<hkaiser_> jbjnr_: this branch should pass now: https://github.com/STEllAR-GROUP/hpx/pull/3118
<jbjnr_> ok. I can force a rebuild of that and see if it improves if you want
<hkaiser_> yes, please
EverYoung has quit [Ping timeout: 265 seconds]
<simbergm1> jbjnr_: very nice
<K-ballo> is pycicle independent? can I use it on my own projects?
<simbergm1> hkaiser_: do you have a moment? I would have some questions for runtime suspension
<hkaiser_> sure
<simbergm1> so currently I'm passing an "exit mode", shutdown or suspend to hpx::stop (and by extension hpx::init)
<hkaiser_> ok
<simbergm1> my problem is that hpx::finalize and eventually shutdown_all do not know about this mode (the way I have it currently) and so it will go ahead and shut down all of AGAS (or at least do unregister locality)
<simbergm1> so, is it possible to either:
<simbergm1> reinitialize AGAS (ideally not, as I'd like to do less so that it can be as fast as possible), or
<simbergm1> disable AGAS completely, or
<hkaiser_> disable AGAS is probably not a good idea
<simbergm1> would I have to tell hpx::start already whether to shutdown or suspend so that hpx::finalize has the mode available
mbremer has joined #ste||ar
<simbergm1> because currently hpx::finalize could be called before hpx::stop
<hkaiser_> it most probably will be called before stop, iirc
<simbergm1> I think it would depend on how much hpx_main does, assuming hpx_main calls hpx::finalize
<hkaiser_> but finalize just tells the runtime to shut down whenever it's done doing things
<simbergm1> yep
<hkaiser_> so why does finalize need to know about the shutdown mode?
<simbergm1> but shutdown_all does a bit more
<simbergm1> well, in order for shutdown_all not to call unregister_locality
<simbergm1> or whatever it's called
<hkaiser_> but finalize should invoke shutdown_all only once the runtime is ready to exit
<simbergm1> yeah, exactly, but then shutdown_all (and finalize) should know whether it should exit or suspend
<simbergm1> so probably I need to tell hpx::init/start already whether to exit or suspend then
<hkaiser_> finalize does not need to know
<simbergm1> but shutdown_all, right?
EverYoung has joined #ste||ar
<hkaiser_> finalize's job is to tell the runtime to exit at its first convenient spot, if the runtime decides to suspend in between, so be it
<hkaiser_> shutdown_all should be invoked only in the case of a real shutdown
<hkaiser_> not for suspension
<simbergm1> mmh, I see now
<simbergm1> I was going with calling hpx::finalize at the end of any invocation of hpx::init/start so that I can say hpx::start(callback, suspend or shutdown) without changing callback
<simbergm1> or rather hpx::init
<simbergm1> but if I remove that assumption I get what you're saying
<hkaiser_> you have been thinking about this much more than I did
<hkaiser_> I'd try not to overload the semantics of init/start/stop, though
<simbergm1> yeah, I'd like to avoid it as much as possible
<hkaiser_> why not leave the job of suspension to the suspend/resume api?
<simbergm1> I think that's where this is heading
<hkaiser_> ok
<simbergm1> thanks, I'll have another look tomorrow, but this seems like a much more reasonable approach
<jbjnr_> K-ballo: yes, it is independent - but there are a couple of places in the config that still have hpx specific things - I can remove them and make them options if you want to try it out
<hkaiser_> thanks!
<hkaiser_> jbjnr_: I think we'd like to use pycicle for Phylanx as well
<hkaiser_> so this might be a good thing (tm) to do
<simbergm1> jbjnr_: you're getting users!
<hkaiser_> we won't let anything go to waste!
<jbjnr_> ok. I will create an issue to make the project fully configurable for anything.
<hkaiser_> cool
<jbjnr_> Do you have a rough deadline (days/weeks/longer)?
<simbergm1> hkaiser_: mind if I ask if you have a plan for current master with the broken tests? I'd like to avoid merging things in the current state... or is reverting an option?
<hkaiser_> jbjnr_: no rush
<hkaiser_> simbergm1: what tests?
<hkaiser_> didn't realize things are broken
<hkaiser_> uhh
<simbergm1> I think it started with 3104, things got a bit better with one of your later PRs but still a lot broken :(
<simbergm1> unsure why we're not seeing anything on pycicle...
<hkaiser_> I reverted the actual cause
<hkaiser_> not sure why all of the debug builds segfault
<jbjnr_> hkaiser_: http://cdash.cscs.ch/viewBuildError.php?buildid=74391 should this have gone away?
<simbergm1> me neither, was hoping you might know
<hkaiser_> jaafar: yes
<hkaiser_> jbjnr_: yes
<hkaiser_> simbergm1: give me some time, I'll investigate - I wouldn't even know what has caused this
<hkaiser_> simbergm1: debug builds are fine, release builds fail
<simbergm1> hkaiser_: no worries, I just wanted to make sure you're aware
<hkaiser_> ok, I am now - thanks
<mbremer> hkaiser_: please see pm
aserio has quit [Ping timeout: 265 seconds]
simbergm1 has quit [Quit: WeeChat 2.0.1]
<diehlpk_work> hkaiser_, did you have a look at the gsoc proposals?
<hkaiser_> diehlpk_work: not yet, sorry
<diehlpk_work> Deadline is tomorrow January 23, 2018 at 12:00 (EST)
<diehlpk_work> Can you do it this evening?
<hkaiser_> k
<hkaiser_> heller_: yt?
<jbjnr_> back again
<jbjnr_> I'll fix the cache wiping
<jbjnr_> and retry 3118
aserio has joined #ste||ar
ct-clmsn has quit [Quit: Leaving]
jaafar has quit [Ping timeout: 256 seconds]
mcopik has joined #ste||ar
parsa has joined #ste||ar
aserio has quit [Ping timeout: 276 seconds]
jaafar has joined #ste||ar
jaafar_ has joined #ste||ar
jaafar has quit [Ping timeout: 248 seconds]
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
hkaiser_ has quit [Quit: bye]
aserio has quit [Ping timeout: 256 seconds]
aserio has joined #ste||ar
eschnett has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
hkaiser has joined #ste||ar
diehlpk_work has quit [Read error: Connection reset by peer]
<github> [hpx] hkaiser force-pushed reinit_counters from 9fff8c6 to b33d059: https://git.io/vN2a0
<github> hpx/reinit_counters f44b519 Hartmut Kaiser: Adding performance_counter::reinit to allow for dynamically changing counter sets...
<github> hpx/reinit_counters b33d059 Hartmut Kaiser: Adding test
diehlpk_work has joined #ste||ar
aserio has joined #ste||ar
daissgr has quit [Ping timeout: 256 seconds]
Smasher has quit [Remote host closed the connection]
<github> [hpx] hkaiser force-pushed disable_executor_compatibility from 1edade6 to 5e6b810: https://git.io/vN2lz
<github> hpx/disable_executor_compatibility 5e6b810 Hartmut Kaiser: This patch disables default executor compatibility with V1 executors...
aserio has quit [Quit: aserio]
<github> [hpx] hkaiser force-pushed disable_executor_compatibility from 5e6b810 to 9838dca: https://git.io/vN2lz
<github> hpx/disable_executor_compatibility 9838dca Hartmut Kaiser: This patch disables default executor compatibility with V1 executors...
<diehlpk_work> Anyone seen this error in the current master
<hkaiser> we see segfaults on master for release builds only, but those are segfaults, not illegal instruction errors
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser> diehlpk_work: since the nodes on rostam have different architectures you should always generate the binaries on the nodes you're planning to run things on
<hkaiser> never build on the headnode
<diehlpk_work> ok, so I have to wait before running the hpx blazemark
<hkaiser> don't think so
<hkaiser> just build the binaries on the same node you'll run things on
<diehlpk_work> Both Ariel nodes are occupied, so I can not compile there
<hkaiser> not sure yet why we see segfaults, still investigating
<hkaiser> why do you need the ariel nodes?
<diehlpk_work> Because I am running the benchmarks there
<hkaiser> marvin nodes are just fine for the benchmarks
mbremer has quit [Quit: Page closed]