aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<hkaiser> K-ballo: you tell me, I'm not aware of any problems with atomic on our platforms
<hkaiser> we've kept it for older versions of gcc, iirc
<K-ballo> up to 4.6, apparently
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser> K-ballo: yah
mcopik has quit [Ping timeout: 260 seconds]
eschnett has quit [Ping timeout: 240 seconds]
<K-ballo> now I remember why I didn't do this earlier, there's some bad atomic usages in the codebase
<hkaiser> K-ballo: are there?
<hkaiser> uhh
<K-ballo> one boost::atomic<scheduled_executor> default_executor_instance;
<hkaiser> ahh, non-trivial type
<K-ballo> and another one in external/lockfree that looks fixable
eschnett has joined #ste||ar
<hkaiser> that scheduler one can be replaced by a spinlock and a non-atomic
<K-ballo> what kind of spinlock? util::?
<hkaiser> lcos::local::spinlock, works for any type of thread
<hkaiser> I think we can remove the util::spinlock, it's not used anywhere anymore
<K-ballo> I never remember which is which
<hkaiser> right
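[editor's note] The replacement hkaiser suggests for the atomic over a non-trivial type (a spinlock guarding a plain, non-atomic object) might look roughly like the sketch below. It is not the HPX change itself: `executor_like`, `spinlock` and `default_instance` are hypothetical names, and the spinlock is a bare test-and-set loop rather than lcos::local::spinlock.

```cpp
#include <atomic>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a non-trivially-copyable type such as an executor.
struct executor_like
{
    std::string name;
};

// Minimal test-and-set spinlock (assumption: NOT lcos::local::spinlock).
// It provides lock()/unlock(), so std::lock_guard can manage it.
class spinlock
{
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;

public:
    void lock()
    {
        while (flag_.test_and_set(std::memory_order_acquire))
        {
            // busy-wait until the current holder clears the flag
        }
    }

    void unlock()
    {
        flag_.clear(std::memory_order_release);
    }
};

// Holds the non-atomic value; every access goes through the spinlock.
class default_instance
{
    spinlock mtx_;
    executor_like value_;

public:
    void set(executor_like v)
    {
        std::lock_guard<spinlock> l(mtx_);
        value_ = std::move(v);
    }

    executor_like get()
    {
        std::lock_guard<spinlock> l(mtx_);
        return value_;    // copy is made while the lock is held
    }
};
```

The value is copied out under the lock, so callers never hold a reference into the guarded object.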
<github> [hpx] K-ballo created std-atomic (+1 new commit): https://git.io/vQFbN
<github> hpx/std-atomic 8b20f5b Agustin K-ballo Berge: Replace boost::atomic with std::atomic
<hkaiser> K-ballo: can't we also remove external/lockfree altogether?
<K-ballo> no idea, can we? which boost version did ... it's not in any boost version, is it?
<hkaiser> I thought it was
<K-ballo> it's different than what ended up being accepted, or something like that?
<K-ballo> I'll check that out
<hkaiser> we've been using it for Boost < V1.53 iirc
<hkaiser> don't really remember, though
<K-ballo> we could bump the minimum to 1.55 I believe? assuming there's a point to it
<hkaiser> we did already, I think
<K-ballo> 1.51
<hkaiser> nod
<hkaiser> I thought we did :/
<K-ballo> IIRC there was no gain in bumping it to 54, so we didn't
<K-ballo> which would mean I completely overlooked external/boost
<K-ballo> not unless we bump the minimum to 55
<K-ballo> aaaah
<K-ballo> I'm not 100% positive, something else might be including boost/atomic indirectly without the proper include?
<K-ballo> s/proper/workaround
<hkaiser> K-ballo: but if we don't depend on boost atomic anymore this is not our problem, is it?
<hkaiser> well, this include doesn't hurt, I guess
<K-ballo> nod, unless boost is broken and we try to use it, then it becomes our problem
<K-ballo> let's jump to 55, drop the include and everything else
<hkaiser> ok, fine by me
<github> [hpx] K-ballo created range-fixes (+1 new commit): https://git.io/vQFNW
<github> hpx/range-fixes e2af7e5 Agustin K-ballo Berge: Fix some uses of begin/end, remove unnecessary includes
<hkaiser> supporting 10 versions is enough
<github> [hpx] K-ballo created boost-bump (+1 new commit): https://git.io/vQFNd
<github> hpx/boost-bump 28fc70f Agustin K-ballo Berge: Bump minimal Boost version to 1.55.0
vamatya has joined #ste||ar
<K-ballo> nope
<github> [hpx] K-ballo force-pushed boost-bump from 28fc70f to e252518: https://git.io/vwLkY
<github> hpx/boost-bump e252518 Agustin K-ballo Berge: Bump minimal Boost version to 1.55.0
<github> [hpx] K-ballo force-pushed std-atomic from 8b20f5b to 7870ddd: https://git.io/vQFAO
<github> hpx/std-atomic 7870ddd Agustin K-ballo Berge: Replace boost::atomic with std::atomic
vamatya has quit [Ping timeout: 248 seconds]
ajaivgeorge has quit [Ping timeout: 260 seconds]
ajaivgeorge has joined #ste||ar
ajaivgeorge has quit [Ping timeout: 248 seconds]
ajaivgeorge has joined #ste||ar
mars0000 has quit [Quit: mars0000]
mars0000 has joined #ste||ar
hkaiser has quit [Quit: bye]
zbyerly_ has quit [Ping timeout: 260 seconds]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
<github> [hpx] K-ballo force-pushed std-atomic from 7870ddd to 1abea7d: https://git.io/vQFAO
<github> hpx/std-atomic 1abea7d Agustin K-ballo Berge: Replace boost::atomic with std::atomic
pree has joined #ste||ar
vamatya has joined #ste||ar
pree has quit [Ping timeout: 268 seconds]
K-ballo has quit [Quit: K-ballo]
mars0000 has quit [Quit: mars0000]
patg has quit [Quit: This computer has gone to sleep]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
mars0000 has joined #ste||ar
mars0000 has quit [Client Quit]
pree has joined #ste||ar
pree has quit [Remote host closed the connection]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
mars0000 has joined #ste||ar
mars0000 has quit [Client Quit]
Matombo has joined #ste||ar
ajaivgeorge has quit [Ping timeout: 260 seconds]
ajaivgeorge has joined #ste||ar
Matombo has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
david_pfander has joined #ste||ar
bikineev has joined #ste||ar
david_pfander has quit [Quit: david_pfander]
david_pfander has joined #ste||ar
david_pfander has quit [Quit: david_pfander]
bikineev has quit [Ping timeout: 248 seconds]
david_pfander has joined #ste||ar
bikineev has joined #ste||ar
mcopik has joined #ste||ar
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
Matombo has quit [Ping timeout: 248 seconds]
david_pfander has quit [Quit: david_pfander]
david_pfander has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vQbcm
<github> hpx/gh-pages a191bc3 StellarBot: Updating docs
vamatya has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
ajaivgeorge has quit [Ping timeout: 260 seconds]
ajaivgeorge has joined #ste||ar
<mcopik> ajaivgeorge: have you verified that all tests are passing on multiple localities?
ajaivgeorge has quit [Ping timeout: 260 seconds]
ajaivgeorge has joined #ste||ar
hkaiser has joined #ste||ar
Matombo has quit [Ping timeout: 260 seconds]
<github> [hpx] ct-clmsn opened pull request #2762: partitioned_vector serializer (master...pv_serializer) https://git.io/vQbET
bikineev has quit [Ping timeout: 260 seconds]
bikineev has joined #ste||ar
K-ballo has joined #ste||ar
<ajaivgeorge> mcopik: yes
denis_blank has joined #ste||ar
bikineev has quit [Ping timeout: 246 seconds]
bikineev has joined #ste||ar
eschnett has quit [Quit: eschnett]
zbyerly_ has joined #ste||ar
<heller> hkaiser: the exception on buildbot should be fixed now
<hkaiser> heller: perfect!
<hkaiser> thanks a lot
<heller> stupid regex
<hkaiser> ;)
<hkaiser> where was the problem?
<hkaiser> ahh, makes sense
<heller> hkaiser: what's the bigger picture behind #2756?
<hkaiser> making channels migratable
<hkaiser> heller: see #2730
<heller> hmm
<heller> did you run the unit/regression tests for this PR?
<hkaiser> not all of it
<hkaiser> does it break?
<heller> would be interesting
<heller> didn't try
<hkaiser> the tests I tried running are fine
<hkaiser> why do you ask?
<heller> it's a substantial change though, testing it fully on at least one platform wouldn't hurt, I guess
<hkaiser> ok, will run it on buildbot
<heller> good idea
<heller> hkaiser: so the only reason to funnel through the comptype is for set_lco_value to do the right thing?
pree has joined #ste||ar
ajaivgeorge has quit [Quit: ajaivgeorge]
<hkaiser> well, I initially thought I needed to funnel it through, but it's not necessary, actually
<hkaiser> I left it in place anyways as it does not hurt
<heller> we need to serialize it, no?
<hkaiser> heller: it is being serialized
<hkaiser> it's a part of the address
<hkaiser> heller: but I can remove this
<heller> right, we always had it as a member
<hkaiser> nod
<heller> this additional change obfuscates the real change a little bit
<hkaiser> the decision what action type to use has to be made on the sending end, so funneling it through is not actually needed
<hkaiser> yes, sorry
<hkaiser> alternatively we would need to change the way base_lco_with_value::set_value is invoked
<hkaiser> deferring the decision of whether the target is (un)managed to the receiving locality
<hkaiser> but that would have been an even bigger fundamental change, I was not sure if that was worth it
<heller> i think, it would have been better to unify the advantages of managed_component and simple and make the result there migratable
<hkaiser> you can't migrate managed components
<heller> i understand that this approach is getting things done faster though ;)
<heller> i'd argue that you migrate components even if they were allocated from a pool. not with the current code, of course
<heller> and that would involve some significant investment
<heller> in terms of time and thinking
<heller> and of course adds lots of uncertainty if it'll work and be as performant as what we have ;)
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
heller has joined #ste||ar
<github> [hpx] K-ballo opened pull request #2763: Bump minimal Boost version to 1.55.0 (master...boost-bump) https://git.io/vQbiK
<hkaiser> heller: you'd have to either migrate the whole pool or add an additional mapping layer which defeats the whole purpose
eschnett has joined #ste||ar
<heller> I don't see that
<heller> for migratable objects, we already have the additional mapping layer, no?
<heller> hkaiser: the only real difference between managed and unmanaged components is the allocation, no?
<heller> where do you see the additional mapping layer?
zbyerly_ has quit [Ping timeout: 276 seconds]
pree has quit [Ping timeout: 268 seconds]
pree has joined #ste||ar
<heller> hkaiser: do I miss something?
<hkaiser> heller: we pool managed components for the gids to be guaranteed to be consecutive --> no AGAS interaction is needed for all except the first component instance in a pool
<heller> no AGAS interaction when those are constructed, right?
<hkaiser> yes
pree has quit [Ping timeout: 248 seconds]
<heller> I am thinking that we would never need network turnarounds for assigning GIDs
<hkaiser> that's true
<heller> wouldn't the pair of locality + locally unique ID be enough?
<hkaiser> for what?
<heller> creating GIDs
<heller> assigning GIDs to LVAs
<hkaiser> that's what we do for (simple) components
<hkaiser> we want to get rid of managed components anyways
<heller> yes
<hkaiser> just see this patch as an intermediate step ;)
patg has joined #ste||ar
<heller> so we're basically on the same page, good ;)
denis_blank has quit [Quit: denis_blank]
<hkaiser> heller: however, even if you generate the gid locally, it still needs to be registered with agas for others to be able to resolve it
<hkaiser> so agas interaction is needed for every instance of an unmanaged component
<heller> true
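[editor's note] The two steps discussed here (building an id locally from locality plus a locally unique counter, then registering it so others can resolve it) can be sketched as follows. Everything in the block is hypothetical: the `gid` layout, the `locality` class and the `registry` (a local map standing in for AGAS) are illustrations, not HPX's types or its actual bit layout.

```cpp
#include <atomic>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical id: locality number plus a per-locality counter value.
struct gid
{
    std::uint64_t locality;
    std::uint64_t local_id;
};

// Ordering so gid can be used as a std::map key.
inline bool operator<(gid const& a, gid const& b)
{
    return std::make_pair(a.locality, a.local_id) <
        std::make_pair(b.locality, b.local_id);
}

// Generates ids without any network round trip.
class locality
{
    std::uint64_t id_;
    std::atomic<std::uint64_t> next_{1};

public:
    explicit locality(std::uint64_t id)
      : id_(id)
    {
    }

    gid make_gid()
    {
        return gid{id_, next_.fetch_add(1)};
    }
};

// Stand-in for the resolution service: an id is only resolvable
// after it has been bound to a local address.
class registry
{
    std::map<gid, void*> table_;

public:
    void bind(gid g, void* lva)
    {
        table_[g] = lva;
    }

    void* resolve(gid g) const
    {
        auto it = table_.find(g);
        return it == table_.end() ? nullptr : it->second;
    }
};
```

Generating the id is purely local, but `resolve` returns nothing until `bind` has been called, which mirrors hkaiser's point that registration is still needed per instance.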
<heller> on a side note: did you notice how we are overloading the term "unmanaged"?
<hkaiser> what do you mean?
<hkaiser> components vs id_type?
<heller> yes
pree has joined #ste||ar
<hkaiser> well, formerly we called the 'unmanaged' components 'simple'
<hkaiser> you didn't like that ;)
<heller> I never said I didn't like it. What I do not like is that we have both, managed and (simple) components
<hkaiser> k
<hkaiser> anyways, gtg
<hkaiser> ttyl
hkaiser has quit [Quit: bye]
pree has quit [Read error: Connection reset by peer]
bikineev has quit [Ping timeout: 240 seconds]
bikineev has joined #ste||ar
akheir has joined #ste||ar
hkaiser has joined #ste||ar
patg has quit [Quit: This computer has gone to sleep]
mars0000 has joined #ste||ar
mars0000 has quit [Client Quit]
elfring has joined #ste||ar
eschnett has quit [Ping timeout: 240 seconds]
eschnett has joined #ste||ar
aserio has joined #ste||ar
<heller> aserio: 152!
bikineev has quit [Ping timeout: 260 seconds]
<aserio> heller: sounds like you've had a productive day!
<heller> almost ;)
mbremer has joined #ste||ar
EverYoung has joined #ste||ar
bikineev has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
vamatya has joined #ste||ar
mcopik has quit [Ping timeout: 255 seconds]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
Matombo has joined #ste||ar
david_pfander has quit [Ping timeout: 276 seconds]
aserio has quit [Ping timeout: 246 seconds]
denis_blank has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
thundergroudon[m has quit [*.net *.split]
denis_blank has quit [Quit: denis_blank]
taeguk[m] has quit [Ping timeout: 255 seconds]
bibek_desktop has quit [Ping timeout: 258 seconds]
bibek_desktop has joined #ste||ar
thundergroudon[m has joined #ste||ar
taeguk[m] has joined #ste||ar
elfring has quit [Quit: Konversation terminated!]
bibek_desktop has quit [Ping timeout: 246 seconds]
bibek_desktop has joined #ste||ar
vamatya has quit [Ping timeout: 240 seconds]
<mbremer> Hi, (to anyone :) ) is it possible to obtain a rolling average for the threads' idle-rate counter? I see the idle-rate counter and the statistics/rolling_average, but don't know how / if you can combine these.
mcopik has joined #ste||ar
<hkaiser> mbremer: yes
<hkaiser> well
<hkaiser> the idle-rate counter itself is an average
<mbremer> Oh sure, I'm more interested in the time aspect of it.
<hkaiser> what you could do is to reset the counter on each query and use that for a rolling average
<hkaiser> something like /statistics{/threads{locality#0/total}/idle-rate}/rolling_average@200,1000,1
ajaivgeorge has joined #ste||ar
<hkaiser> sample idle-rate every 200 ms, average over 1000 ms (5 samples), and reset the underlying counter on each sample
<hkaiser> errm
<hkaiser> this: /statistics{/threads{locality#0/total}/idle-rate}/rolling_average@200,5,1
<hkaiser> 5 samples
<hkaiser> mbremer: ^^
<mbremer> Ah great. Let me try it out.
<mbremer> Thanks.
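[editor's note] For readers unfamiliar with the windowing that hkaiser describes (one sample every 200 ms, averaged over the last 5 samples), here is a minimal sketch of a fixed-window rolling average. It only illustrates the arithmetic; the `rolling_average` class below is hypothetical and is not how the HPX statistics counter is implemented.

```cpp
#include <cstddef>
#include <deque>
#include <numeric>

// Keeps the most recent `window` samples and reports their mean.
class rolling_average
{
    std::size_t window_;
    std::deque<double> samples_;

public:
    explicit rolling_average(std::size_t window)
      : window_(window)
    {
    }

    // Called once per sampling interval (e.g. every 200 ms).
    void add(double sample)
    {
        samples_.push_back(sample);
        if (samples_.size() > window_)
            samples_.pop_front();    // drop the oldest sample
    }

    // Mean of the samples currently in the window; 0 if none yet.
    double value() const
    {
        if (samples_.empty())
            return 0.0;
        return std::accumulate(samples_.begin(), samples_.end(), 0.0) /
            static_cast<double>(samples_.size());
    }
};
```

With a window of 5 and samples 1, 2, 3, 4, 5, 6, the first sample has been dropped and the mean covers 2 through 6.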
<heller> hkaiser: grr, looks like my buildbot restart didn't pick up the changes...
<github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vQNzy
<github> hpx/master bcda395 Hartmut Kaiser: Merge pull request #2753 from STEllAR-GROUP/fixing_fallthrough...
<github> [hpx] hkaiser deleted fixing_fallthrough at d89ef9e: https://git.io/vQNz9
eschnett has quit [Ping timeout: 240 seconds]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
vamatya has joined #ste||ar
<thundergroudon[m> Is tycho on Rostam stuck?
<thundergroudon[m> some old commands of mine still not executed
<thundergroudon[m> mseshad+ 45848 1 0 Jul15 ? 00:00:00 python benchmark/StencilScript.py
<thundergroudon[m> mseshad+ 45895 45848 0 Jul15 ? 00:00:00 sh -c srun -p tycho -N 1 ./StencilHPX 11000000 >> StencilHPX.dat
<thundergroudon[m> mseshad+ 45896 45895 0 Jul15 ? 00:00:00 srun -p tycho -N 1 ./StencilHPX 11000000
<thundergroudon[m> this is running ps -ef from my user
<hkaiser> thundergroudon[m: please talk to akheir
pree has quit [Ping timeout: 276 seconds]
Matombo has quit [Remote host closed the connection]
aserio1 has joined #ste||ar
mars0000 has joined #ste||ar
aserio has quit [Ping timeout: 276 seconds]
aserio1 is now known as aserio
pree has joined #ste||ar
RostamLog has joined #ste||ar
akheir has joined #ste||ar
bikineev has joined #ste||ar
vamatya has quit [Ping timeout: 268 seconds]
akheir has quit [Remote host closed the connection]
akheir has joined #ste||ar
EverYoun_ has quit [Remote host closed the connection]
<zbyerly_> is anyone else getting these suddenly:
<zbyerly_> include/hpx/util/jenkins_hash.hpp:196:17: note: in expansion of macro ‘HPX_FALLTHROUGH’
<zbyerly_> warnings
EverYoung has joined #ste||ar
<zao> zbyerly_: Hi and welcome to GCC 7.1?
<zao> Well, what is the actual warning you see?
<zao> We've recently added support for [[fallthrough]] via a compile test, which ought to work on roughly the same compilers that start nagging about case fallthrough.
<hkaiser> zbyerly_: what do you mean by 'suddenly'?
<hkaiser> top of master from today?
Matombo has joined #ste||ar
<zbyerly_> hkaiser, i don't think i've pulled today but in the last 2 or 3 days, but it's also new code
<zbyerly_> I was just wondering if it was something I should worry about
<zao> Still not sure what you're actually seeing, and on what toolchain and TU.
<K-ballo> so how does one properly link atomic from cmake?
<mcopik> zbyerly_: which compiler, which version?
<zbyerly_> gcc 6.3.0
<zbyerly_> boost 1.63
<mcopik> zbyerly_: and what warning exactly?
testing123456322 has joined #ste||ar
<zao> Does that test get run properly and work properly if building explicitly with 14?
<zao> Hrm, stray semicolon at use site?
testing123456322 has left #ste||ar [#ste||ar]
<mcopik> zao: zbyerly_: nope
<mcopik> the feature detection is in fact incorrect
<zao> Nice!
<mcopik> the macro should not be activated on gcc 6.3, otherwise it wouldn't be complaining about an unused attribute
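[editor's note] One common way to guard a fallthrough macro is to ask the compiler directly via `__has_cpp_attribute`, so the attribute is only emitted where it is recognized. The sketch below uses a hypothetical macro name, `MY_FALLTHROUGH`; it is not the definition of HPX_FALLTHROUGH, which per zao is driven by a compile test.

```cpp
// MY_FALLTHROUGH is a hypothetical name, not HPX's macro.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(fallthrough)
#    define MY_FALLTHROUGH [[fallthrough]]
#  endif
#endif
#if !defined(MY_FALLTHROUGH)
// Compiler does not know the attribute: expand to a harmless no-op.
#  define MY_FALLTHROUGH ((void) 0)
#endif

// Counts how many case labels execute when entering the switch at `start`.
inline int cases_visited(int start)
{
    int visited = 0;
    switch (start)
    {
    case 2:
        ++visited;
        MY_FALLTHROUGH;
    case 1:
        ++visited;
        MY_FALLTHROUGH;
    case 0:
        ++visited;
        break;
    default:
        break;
    }
    return visited;
}
```

On a compiler that does not report the attribute, the macro expands to a no-op statement, so no "attribute ignored" style warning should be triggered at the use site.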
akheir has quit [Remote host closed the connection]
eschnett has quit [Quit: eschnett]
<hkaiser> zbyerly_: pull again, should be fixed now
zbyerly_ has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
<hkaiser> zbyerly_: pull again, should be fixed now
<zbyerly_> hkaiser, k
aserio has quit [Quit: aserio]
<K-ballo> hpx/util/lockfree/* declares all its stuff in namespace boost::, how did that happen?
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser> K-ballo: it was meant to be pushed upstream at some point ;)
eschnett has joined #ste||ar
<hkaiser> K-ballo: this can now be moved into some hpx namespace, really
<K-ballo> really... I almost filed a bug report with boost, I was surprised to not see one already
<K-ballo> should have been moved into hpx:: namespace when moved from external/boost into hpx/util
<K-ballo> I'll take care of that later
<hkaiser> K-ballo: ok, thanks
<K-ballo> how can a pair of two nodes, each with two pointers and some payload, be lockfree? that's a lot of bits
<K-ballo> on second thoughts, I don't think I want to touch any of that :P
eschnett has quit [Quit: eschnett]
<hkaiser> lol
<hkaiser> good thinking
<hkaiser> K-ballo: it's bryce's code
<mcopik> shouldn't it be return hpx::util::report_errors()?
<mcopik> now tests always pass, even if all test macros return a fail
zbyerly_ has quit [Ping timeout: 246 seconds]
<mcopik> there are many more tests which always return 0. what am I missing here? do we have multiple tests fooling ctest that everything is running fine even when it isn't?
<mcopik> hkaiser: ^
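[editor's note] mcopik's point is that a test whose entry point always returns 0 looks green to ctest even if its checks failed. The sketch below illustrates the pattern with a hypothetical, self-contained harness (`mini_test`, `MINI_TEST` and `run_tests` are invented for this example and are not HPX's lightweight test code): the function returns the error report instead of a literal 0.

```cpp
#include <atomic>
#include <iostream>

// Hypothetical stand-in for a lightweight test harness.
namespace mini_test {

    // Process-wide count of failed checks.
    inline std::atomic<int>& failures()
    {
        static std::atomic<int> n{0};
        return n;
    }

    // Records a failure and logs the expression; does not abort.
    inline void check(bool ok, char const* expr)
    {
        if (!ok)
        {
            ++failures();
            std::cerr << "test failed: " << expr << '\n';
        }
    }

    // Non-zero exit status if any check failed.
    inline int report_errors()
    {
        return failures().load() == 0 ? 0 : 1;
    }
}

#define MINI_TEST(expr) ::mini_test::check(static_cast<bool>(expr), #expr)

// Body of a test program; its result is meant to become the exit code.
inline int run_tests()
{
    MINI_TEST(1 + 1 == 2);
    MINI_TEST(2 + 2 == 5);    // deliberately failing check

    // Returning a literal 0 here would hide the failure above.
    return mini_test::report_errors();
}
```

A real test's `main` would return this value, so that the process exit code reflects the failed check.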
mars0000 has quit [Quit: mars0000]
Matombo has quit [Remote host closed the connection]
bikineev has quit [Remote host closed the connection]
vamatya has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
EverYoung has joined #ste||ar