hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
nanashi55 has quit [Ping timeout: 240 seconds]
nanashi55 has joined #ste||ar
jaafar has quit [Ping timeout: 260 seconds]
david_pfander has joined #ste||ar
jgolinowski has quit [Quit: Leaving]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco closed pull request #3429: One thread per core (master...one_thread_per_core) https://github.com/STEllAR-GROUP/hpx/pull/3429
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco deleted one_thread_per_core at edbd829: https://github.com/STEllAR-GROUP/hpx/commit/edbd829
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://github.com/STEllAR-GROUP/hpx/commit/a6154f527d692da300f26aade964ca8f694a598a
<ste||ar-github> hpx/gh-pages a6154f5 StellarBot: Updating docs
ste||ar-github has left #ste||ar [#ste||ar]
<simbergm> jbjnr: one thread per core almost worked: http://cdash.cscs.ch/testDetails.php?test=2691433&build=17028
<simbergm> might be the option I removed from hpxrun, I'll have a look
<simbergm> (it's just one test failing)
<jbjnr> I ran that test 150 times on daint without a single fail before I merged
<jbjnr> I could not reproduce it
<jbjnr> I do not understand where it comes from
<jbjnr> but I'll be watching the dashboard to see if it appears on master
<jbjnr> oh shit - I didn't see that you added extra commits to the PR. Bad me.
<jbjnr> ok. I can reproduce the fail locally now. will fix
<simbergm> jbjnr: I'm on it
<simbergm> it's passing -1 to hpx:threads
<simbergm> PR coming up soon
<jbjnr> it's the change you made to hpxrun.py
<simbergm> yeah, exactly
<simbergm> I added back the -1 = all and a -2 = cores
<jbjnr> just push to master
<simbergm> no way!
K-ballo has joined #ste||ar
<simbergm> something's off though, I'm getting 2 threads with --hpx:threads=all (2 cores, 4 pus)
<simbergm> looks like I only checked that --hpx:threads=cores works...
<jbjnr> I'd better fix it.
<jbjnr> yup. you're right. when I changed the default, I overrode one of the paths to getting it right
<simbergm> do you know what's missing?
<simbergm> I'll push a branch with the hpxrun fix, you can continue from there if you know what's wrong
<jbjnr> do it
<jbjnr> I know what's wrong
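[editor's note] The behaviour discussed above (threads=all should give one thread per PU, threads=cores one thread per core) can be sketched as follows. This is a minimal illustration, not the code in HPX or hpxrun.py: resolve_thread_count and its parameters are hypothetical names, and the core and PU counts are passed in by hand rather than queried from the hardware topology.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, NOT HPX's actual implementation: turn the value given
// to --hpx:threads into a concrete number of worker threads.
//   "all"   -> one thread per processing unit (PU, i.e. hardware thread)
//   "cores" -> one thread per physical core
//   "<n>"   -> exactly n threads
std::size_t resolve_thread_count(
    std::string const& option, std::size_t cores, std::size_t pus)
{
    // Every core exposes at least one PU.
    assert(cores > 0 && pus >= cores);

    if (option == "all")
        return pus;
    if (option == "cores")
        return cores;

    // Accept plain decimal digits only. std::stoul would silently accept
    // "-1" and wrap it around to a huge value, so reject it up front.
    if (option.empty() ||
        option.find_first_not_of("0123456789") != std::string::npos)
    {
        throw std::invalid_argument("invalid thread count: " + option);
    }

    std::size_t const n = std::stoul(option);
    if (n == 0)
        throw std::invalid_argument("thread count must be positive");
    return n;
}
```

On the machine simbergm describes (2 cores, 4 PUs), resolve_thread_count("all", 2, 4) gives 4 and resolve_thread_count("cores", 2, 4) gives 2; getting 2 for "all" is the symptom reported above.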
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] msimberg created fix-threads-cores (+1 new commit): https://github.com/STEllAR-GROUP/hpx/commit/00a83ccd482b
<ste||ar-github> hpx/fix-threads-cores 00a83cc Mikael Simberg: Add back threads=all option to hpxrun, add threads=cores option
ste||ar-github has left #ste||ar [#ste||ar]
<simbergm> jbjnr: ^^
<heller> simbergm: what's the cause of action now?
<heller> er, way to move forward
<heller> or course of action, even
<simbergm> heller: with what?
<heller> the failure
<simbergm> jbjnr: is working on it (I think)
<heller> ok
<jbjnr> correct
<simbergm> well, the failing test should be ok with my commit, jbjnr is working on fixing hpx:threads=all
<heller> k
hkaiser has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco pushed 1 new commit to fix-threads-cores: https://github.com/STEllAR-GROUP/hpx/commit/db726740e605a38a4a330806c94be766e84e0089
<ste||ar-github> hpx/fix-threads-cores db72674 John Biddiscombe: Fix thread setting when --hpx:threads=all is used
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] biddisco opened pull request #3434: Fix threads cores (master...fix-threads-cores) https://github.com/STEllAR-GROUP/hpx/pull/3434
ste||ar-github has left #ste||ar [#ste||ar]
<jbjnr> simbergm: heller ^^^
<jbjnr> just opened PR with fix
eschnett has quit [Quit: eschnett]
<simbergm> jbjnr: thanks
<heller> jbjnr: thanks
<heller> Why wasn't the failure caught by pycicle/circle?
<simbergm> heller: it was... also by circleci
<simbergm> heller: did you notice that thread local storage doesn't seem to work correctly? do you know when it (thread local storage or the test for it) last worked correctly?
<jbjnr> heller: scroll up a bit. the error was caught by CI, but I could not reproduce it because I didn't see that there was an extra commit on the branch. I tested on daint with my copy of the PR and concluded it was just HPX giving random errors again.
<jbjnr> my mistake. won't happen again
<K-ballo> no more HPX giving random errors again?
<hkaiser> lol
eschnett has joined #ste||ar
<heller> simbergm: it never worked reliably. That's why we turned the feature off by default
<heller> The phylanx guys need that feature. I told hkaiser that the tss test is likely to fail on us
<heller> Also commented on the pr that enabled it on
<heller> Circle CI so people should know about it
<heller> The comment was "let's fix it when it turns up"
<heller> simbergm: hkaiser: ^^
<heller> The problem is 4 years old :p
<heller> Easy to forget, especially when it has been closed ;)
<hkaiser> ok
<simbergm> heller: thanks for the links, makes sense
aserio has joined #ste||ar
<heller> I think no one investigated so far. Will have a look
<heller> The unstable tests can be observed here as well:
<zao> I love our thread specific pointer. Inspecting it in GDB on most of the BSDs crashes GDB.
<zao> Very helpful when printing a topology...
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
stevenrbrandt has joined #ste||ar
<stevenrbrandt> I'm on an 80 core machine
<stevenrbrandt> I compiled HPX with HPX_MORE_THAN_64_THREADS=On
<stevenrbrandt> and I still get RuntimeError: partitioner::add_resource: Creation of 2 threads requested by the resource partitioner, but only 1 provided on the command-line.
<stevenrbrandt> what am I missing?
<stevenrbrandt> Sorry, I get that error message from Phylanx
<stevenrbrandt> python3 -c 'import phylanx'
<simbergm> stevenrbrandt: you need to set HPX_WITH_MAX_CPU_COUNT=80 or more (cpu count includes hyperthreads)
<zao> stevenrbrandt: The flags you should use are the ones with _WITH_ in the name, you might've set some result flag, that bypasses some checks.
<zao> And yes, you need the CPU_COUNT one.
<zao> Bah, simbergm beat me to it... I blame train wifi :)
aserio has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
<aserio> heller: yt?
<heller> aserio: hey
<aserio> I was wondering if you were following the IJHPC paper cleanup
<heller> I got sick right on tuesday, was essentially tied to my bed
<heller> I am still at home
<aserio> aw, that's no good
<heller> no :(
<aserio> I hope you are feeling a bit better
<heller> yes
<heller> I am in front of my computer right now ;)
<aserio> heller: well once you are up to it, hkaiser and I would really appreciate it if you would help get that paper out of the door
<heller> yes, that's on my high priority list
<heller> i want to get it out as well
<zao> Sorry for not getting the BSD stuff done by the way, ran into some weirdness where no threads were assigned properly, topology is a mess :)
<zao> I guess I should push the stuff I have to my repo.
<heller> aserio: i'll try to work on it tomorrow
<aserio> heller: Thanks!
stevenrbrandt has quit [Quit: Page closed]
david_pfander has quit [Ping timeout: 246 seconds]
aserio has quit [Ping timeout: 250 seconds]
* K-ballo is not having much luck with vcpkg
jaafar has joined #ste||ar
jaafar_ has joined #ste||ar
jaafar has quit [Ping timeout: 260 seconds]
jgolinowski has joined #ste||ar
<diehlpk_work> hkaiser, see pm
jaafar_ has quit [Remote host closed the connection]
jaafar has joined #ste||ar
aserio has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
<simbergm> jgolinowski: I guess you saw the message to your opencv pr? he seems happy ? I think you just need to clarify to him what's missing and if he's okay with the current state
<jgolinowski> simbergm, you mean that some tests are not passing?
<simbergm> yeah
<jgolinowski> I am currently rebuilding everything (starting from HPX) and will rerun some tests and start looking into why it breaks
<simbergm> ok, nice
<simbergm> note that it might be something we can't fix
<simbergm> or not directly in hpx at least
<simbergm> but I don't know
<jgolinowski> yes I am aware - so far do not have a good idea where to start even :P
<simbergm> hkaiser: I wrote to appveyor support about the failing downloads, they said they've updated their external IPs recently (https://www.appveyor.com/updates/2018/07/23/)
<simbergm> did you ever have to whitelist something to get that working? I assume the stellar site is hosted at lsu?
jaafar has quit [Ping timeout: 245 seconds]
jaafar has joined #ste||ar
<hkaiser> simbergm: yah
<hkaiser> send me the ip, I will make sure it's whitelisted
jaafar has quit [Ping timeout: 252 seconds]
jaafar has joined #ste||ar
<simbergm> hkaiser: that link above contains a bunch of them, I don't know if just one of them is enough
<hkaiser> simbergm: ok
aserio has quit [Ping timeout: 260 seconds]
aserio has joined #ste||ar
eschnett has quit [Quit: eschnett]
hkaiser has quit [Quit: bye]
nikunj has quit [Read error: Connection reset by peer]
nikunj97 has joined #ste||ar
eschnett has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
hkaiser has joined #ste||ar
<heller> hkaiser: I think I fixed the TSS problems
<hkaiser> heller: nice
<hkaiser> what was it?
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell created fixing_1205 (+3 new commits): https://github.com/STEllAR-GROUP/hpx/compare/a091d6063663^...9930af08b82f
<ste||ar-github> hpx/fixing_1205 a091d60 Thomas Heller: Fixing double delete problem...
<ste||ar-github> hpx/fixing_1205 97c8606 Thomas Heller: Unbreaking unit test...
<ste||ar-github> hpx/fixing_1205 9930af0 Thomas Heller: Properly reseting the coroutine context to avoid an assertion being triggered.
ste||ar-github has left #ste||ar [#ste||ar]
<heller> hkaiser: ^^
<heller> I can't believe it ever worked
<hkaiser> lol
<hkaiser> as always
<heller> without the move ctor and assignment, you should have gotten a double free *all the time*
<K-ballo> really?
<heller> yeah...
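[editor's note] heller's point about the missing move constructor and move assignment can be illustrated generically. The actual HPX change is in the fixing_1205 commits and is not reproduced here; owner and payload below are hypothetical types. In C++, a class that declares a destructor gets no implicit move operations, so a "move" falls back to the implicit copy, both objects end up holding the same raw pointer, and both delete it. A sketch of a wrapper that avoids this:

```cpp
#include <cassert>
#include <utility>

// Counts destructor calls so a double delete would show up as a count > 1.
struct payload
{
    static int destroyed;
    ~payload() { ++destroyed; }
};
int payload::destroyed = 0;

// Hypothetical owning wrapper (not the class from the HPX sources).
class owner
{
public:
    explicit owner(payload* p = nullptr) noexcept : p_(p) {}
    ~owner() { delete p_; }

    // Copying would leave two owners deleting the same pointer.
    owner(owner const&) = delete;
    owner& operator=(owner const&) = delete;

    // Moving transfers ownership and leaves the source empty, so only
    // one destructor ever deletes the payload.
    owner(owner&& rhs) noexcept : p_(std::exchange(rhs.p_, nullptr)) {}
    owner& operator=(owner&& rhs) noexcept
    {
        if (this != &rhs)
        {
            delete p_;
            p_ = std::exchange(rhs.p_, nullptr);
        }
        return *this;
    }

    bool empty() const noexcept { return p_ == nullptr; }

private:
    payload* p_;
};
```

Moving an owner into another (by construction or assignment) leaves the source empty, so the payload is destroyed exactly once when the last owner goes out of scope.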
hkaiser has quit [Quit: bye]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell opened pull request #3435: Fixing 1205 (master...fixing_1205) https://github.com/STEllAR-GROUP/hpx/pull/3435
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell pushed 1 new commit to master: https://github.com/STEllAR-GROUP/hpx/commit/afa690313a1c90ae966397bcb11088600e34aa6d
<ste||ar-github> hpx/master afa6903 Thomas Heller: Merge pull request #3434 from STEllAR-GROUP/fix-threads-cores...
ste||ar-github has left #ste||ar [#ste||ar]
aserio has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed fixing_1205 from 9930af0 to 0d64e9a: https://github.com/STEllAR-GROUP/hpx/commits/fixing_1205
<ste||ar-github> hpx/fixing_1205 7b18b24 Thomas Heller: Fixing double delete problem...
<ste||ar-github> hpx/fixing_1205 cacb5c8 Thomas Heller: Unbreaking unit test...
<ste||ar-github> hpx/fixing_1205 0d64e9a Thomas Heller: Properly reseting the coroutine context to avoid an assertion being triggered.
ste||ar-github has left #ste||ar [#ste||ar]
hkaiser has joined #ste||ar
aserio has quit [Quit: aserio]
nikunj97 has quit [Read error: Connection reset by peer]
eschnett has quit [Quit: eschnett]