aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
EverYoun_ has joined #ste||ar
EverYoun_ has quit [Remote host closed the connection]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 260 seconds]
vamatya has quit [Ping timeout: 240 seconds]
EverYoun_ has quit [Ping timeout: 260 seconds]
vamatya has joined #ste||ar
vamatya has quit [Ping timeout: 260 seconds]
quaz0r has quit [Quit: WeeChat 1.8-dev]
shoshijak has quit [Ping timeout: 255 seconds]
hkaiser has quit [Quit: bye]
* jaafar
just notices he was mentioned
<jaafar>
I gotta get my notifications fixed
<jaafar>
oh, I was the wrong person :)
<jaafar>
No wonder, my results are bad actually
K-ballo has quit [Quit: K-ballo]
quaz0r has joined #ste||ar
pree has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 246 seconds]
bikineev has joined #ste||ar
shoshijak has joined #ste||ar
shoshijak has quit [Ping timeout: 255 seconds]
Remko has joined #ste||ar
Remko has quit [Remote host closed the connection]
Remko has joined #ste||ar
shoshijak has joined #ste||ar
taeguk has quit [Quit: Page closed]
Remko has quit [Remote host closed the connection]
taeguk has joined #ste||ar
<taeguk>
jbjnr: Sorry, just got back from the toilet.. :(
Remko has joined #ste||ar
jaafar has quit [Ping timeout: 245 seconds]
<jbjnr>
heller_: I hope you saw the osu BW plot I posted last night. Results are starting to get pretty good really.
david_pfander has joined #ste||ar
<pree>
In component placement policies, is changing the return type of the "create" API allowed? Or does it have to be uniform with its counterparts, the default & binpacking policies?
<pree>
thank you
<pree>
i.e. from hpx::future<id_type> ----> std::vector<hpx::future<id_type>>
<pree>
heller_ && jbjnr ^^^
<jbjnr>
I'm not sure. I am not familiar with that code I'm afraid. Returning a vector of futures seems reasonable - why do the others not need to do so?
<jbjnr>
taeguk very sorry about this morning - I should have warned you that I was not available.
<jbjnr>
If you have things you need to ask about then we can schedule another call anytime.
<pree>
*friends
<pree>
But in cyclic_distribution_policy it seems better to return a vector of ids than a single id
<jbjnr>
pree: I guess my question then is - why do you need a different API in this case? If there's a good reason, then go for it
<jbjnr>
why do the other policies not return vectors of ids?
Remko has quit [Remote host closed the connection]
<pree>
in the cyclic case, it sounds good to me to create one component per cycle, choosing the component using counter data
<pree>
i.e. the component which has the least no:of:components of the given type will be chosen among the localities assigned to the policy
bikineev has quit [Remote host closed the connection]
mcopik has quit [Ping timeout: 246 seconds]
<jbjnr>
pree: sounds like returning a vector is fine. hartmut will be online soon I expect and you'd best discuss it with him.
<pree>
* thank you john
josef_k has joined #ste||ar
<jbjnr>
pree: how did you make your messages to me appear as from(pree) and be highlighted in green? some kind of DM but in the main channel?
<pree>
yes I use /notice
<pree>
But some weird thing happens
<pree>
Is there any problem?
<pree>
I use /notice jbjnr
<zao>
They appear off in the status window for me, which is very confusing :)
shoshijak has quit [Ping timeout: 255 seconds]
<pree>
zao : oh sorry for that : )
bikineev has joined #ste||ar
<jbjnr>
pree - there is no problem, I have never seen messages appear the way you did it
<jbjnr>
(just testing)
<pree>
okay ! :)
<jbjnr>
zao: you mean, you don't see the messages sent with /notice ?
<zao>
jbjnr: They often end up nested with server messages off in the combined server status window.
<zao>
Or in the channel looking nothing like messages do, depending on if they're in a PM or channel.
<jbjnr>
ok. your irc client must show things differently from mine I guess
<zao>
Behaviours vary among clients there, indeed.
bikineev has quit [Ping timeout: 245 seconds]
bikineev has joined #ste||ar
shoshijak has joined #ste||ar
shoshijak has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
hkaiser has joined #ste||ar
<pree>
hkaiser -> In component placement policies, is changing the return type of the "create" API allowed? Or does it have to be uniform with its counterparts, the default & binpacking policies?
<hkaiser>
pree: what would you like to use as the return type instead of id_type?
<pree>
But in cyclic_distribution_policy it seems better to return a vector of ids than a single id
<hkaiser>
you create one component instance, why return a vector?
<pree>
in the cyclic case, it sounds good to me to create one component per cycle, choosing the component using counter data, i.e. the component which has the least no:of:components of the given type will be chosen among the localities assigned to the policy
<hkaiser>
sure
<pree>
Can I go for it ?
<hkaiser>
please explain why?
<hkaiser>
I don't see a need for this at this point, except if you explain why you want to return a vector just to return a single value
<pree>
wait for a sec
shoshijak has joined #ste||ar
<pree>
Because in the cyclic property we will have a parameter "runs" which tells us how many times we have to cycle through the localities. If we create just one component on one of the localities, it seems to me we are not caring about the "runs" parameter.
<hkaiser>
why not?
<hkaiser>
create() is supposed to create _one_ instance
<hkaiser>
so whatever you do in create() it will return one instance of an id_type
<hkaiser>
do you agree?
<pree>
Please tell me how it differs from binpacking or default?
<hkaiser>
those create one instance as well, no?
<pree>
Then what is the gain from the create API of cyclic_distribution_policy?
<pree>
On what basis should the decision be taken on which locality the component is created?
<hkaiser>
pree: that's what I was asking you the other day
<hkaiser>
just use 'the next' locality, whatever that is
denis_blank has joined #ste||ar
<pree>
sorry i'm not convinced.
<hkaiser>
please first explain what you want to return from create() in that vector
<pree>
id_type of components
<pree>
created on one locality per cycle
<hkaiser>
so create() should return more than one component instance?
<pree>
Yes .
<hkaiser>
how many?
<pree>
By this we are taking some runtime info into account, for performance gain.
<hkaiser>
runtime info of what?
<pree>
no:of:components == no:of:runs
<hkaiser>
what's the difference between create() and bulk_create()?
<pree>
bulk_create creates the given "count" components on each locality for each cycle
<hkaiser>
no
<hkaiser>
it creates N components overall, not N components per locality
<pree>
Okay then I have to change accordingly
<hkaiser>
ok
<hkaiser>
create() is essentially the same as bulk_create() with N == 1
<pree>
okay ! Create N components on localities by using M runs --> bulk_create
<pree>
Create 1 component on localities by using M runs ---> create
<pree>
okay ! I implemented it as I described above, now I will have to change it :)
<pree>
Thank you @ hkaiser!
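A minimal sketch of the semantics hkaiser describes above: bulk_create() places N components overall (not N per locality), and create() is the N == 1 case. This is not HPX code. The `locality` alias, the `cyclic_placement` class and its round-robin cursor are hypothetical stand-ins, and the "runs" parameter is left out because its exact meaning was not settled in the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a locality id; this is not an HPX type.
using locality = std::string;

// Toy round-robin placement policy (illustrative names only).
class cyclic_placement
{
public:
    explicit cyclic_placement(std::vector<locality> localities)
      : localities_(std::move(localities))
    {
        assert(!localities_.empty());
    }

    // Place n components overall, one per step of the cycle.
    // Returns the locality chosen for each component.
    std::vector<locality> bulk_create(std::size_t n)
    {
        std::vector<locality> placed;
        placed.reserve(n);
        for (std::size_t i = 0; i != n; ++i)
        {
            placed.push_back(localities_[next_]);
            next_ = (next_ + 1) % localities_.size();    // 'the next' locality
        }
        return placed;
    }

    // create() is bulk_create() with N == 1: exactly one placement.
    locality create()
    {
        return bulk_create(1).front();
    }

private:
    std::vector<locality> localities_;
    std::size_t next_ = 0;    // cursor shared by create() and bulk_create()
};
```

With three localities, one create() followed by bulk_create(4) yields five placements in total, continuing the cycle where the previous call stopped.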
shoshijak has quit [Ping timeout: 258 seconds]
bikineev has quit [Read error: No route to host]
bikineev has joined #ste||ar
hkaiser has quit [Quit: bye]
diehlpk_work has joined #ste||ar
jakemp has joined #ste||ar
aserio has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
<david_pfander>
heller_, wash: have you ever seen a "<jemalloc>: Error in dlsym(RTLD_NEXT, "pthread_create")" when executing octotiger on knl (in this case on tave)?
pree has joined #ste||ar
<github>
[hpx] aserio closed pull request #2658: Unify access_data trait for use in both, serialization and de-serialization (master...serialization_access_data) https://git.io/vHCpv
eschnett has quit [Quit: eschnett]
hkaiser has joined #ste||ar
<github>
[hpx] hkaiser deleted serialization_access_data at 12f10ff: https://git.io/vH1NM
<github>
hpx/master 62cd27c Hartmut Kaiser: Making sure uninitialized_value_construct shows up in generated documentation index
<heller_>
david_pfander: no
<heller_>
Never saw that
<david_pfander>
heller_: could you tell me which modules you're using on tave? In the past, I only loaded craype-mic-knl and switched the prog. env. to PrgEnv-gnu, and that worked for me. I'm suspecting some cross-compilation issue.
<heller_>
david_pfander: there seems to be a problem with cmake. The newer versions want to do static linking
<heller_>
As a workaround for now, either do a full static build or switch to cmake 2.8.12 and remove the 3.xx requirements in the main hpx cmake file
<david_pfander>
heller_: ok, I'll try the older cmake version then (less work). Thanks!
jgoncal has joined #ste||ar
<zao>
...
<aserio>
^^lol
<zao>
Sounds like a horrible idea to intentionally ruin the CMake setup just because some silly cluster somewhere is dumb.
<zao>
Not bitter from having to maintain installations of overly ancient things for users.
<zao>
libstdc++5, just saying.
<david_pfander>
zao, aserio: Life is pain :)
akheir has quit [Remote host closed the connection]
bikineev has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
bikineev has quit [Read error: No route to host]
EverYoung has joined #ste||ar
bikineev has joined #ste||ar
bibek_desktop has quit [Ping timeout: 255 seconds]
akheir has joined #ste||ar
<heller_>
zao: still looking for the reason for the static link attempt.
josef_k has quit [Ping timeout: 255 seconds]
bibek_desktop has joined #ste||ar
vamatya has joined #ste||ar
jbjnr has quit [Read error: Connection reset by peer]
jbjnr has joined #ste||ar
<bikineev>
jbjnr: hi John
<bikineev>
yt?
aserio has quit [Ping timeout: 245 seconds]
pree has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
bikineev has quit [Ping timeout: 246 seconds]
mcopik has quit [Ping timeout: 240 seconds]
aserio has joined #ste||ar
<heller_>
aserio: fix it!
<aserio>
you'll have to be more specific
<heller_>
The serialization branch you merged is broken on gcc
<heller_>
gcc 4.9.4 only as it seems
<heller_>
Looks like some unit tests are failing as well
<heller_>
The refcount tests, not good
<heller_>
Which might be due to the lf branch merge...
<hkaiser>
heller_: could be because of the changes made by the serialization branch aserio merged
<heller_>
Looks like the tests failed before
bikineev has joined #ste||ar
<heller_>
I'm getting more and more frustrated with buildbot.. very tedious to figure out which commit led to which failure
jaafar has joined #ste||ar
<K-ballo>
poor buildbot, not its fault
<heller_>
K-ballo: depends, even with our wild merging, the UI could help us to identify bad commits better
Matombo has joined #ste||ar
<K-ballo>
you just find where the color switches from green to orange, and you get a list of all the commits included in that build
<zao>
Can you still run builds and tests on branches if you care enough?
<heller_>
Here is the problem, it assumes linear commits and sometimes adds unrelated commits to a build
david_pfander has quit [Ping timeout: 246 seconds]
<K-ballo>
zao: yes, but the nodes run out of .. memory or something, too often while building tests and examples
<pree>
hkaiser ??
<heller_>
Which is another problem, yes
<K-ballo>
"assumes linear commits" just sounds like bad merging to me
<heller_>
Git commits are just not linear
<K-ballo>
if you merge without rebasing first, then the commits in that line are all affected
<K-ballo>
those in fact were never tested, because the CI doesn't test merges
<K-ballo>
so it's effectively a new commit with new content
<K-ballo>
here's a concrete example from a couple weeks ago:
<K-ballo>
on one branch we removed support for vc2013, so I replaced HPX_NOEXCEPT with noexcept
<K-ballo>
on a separate branch I added new uses of HPX_NOEXCEPT
<K-ballo>
both branches worked fine separately, but after a "non-linear" merge the build failed because it was referencing an HPX_NOEXCEPT that no longer existed
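A toy model of the situation K-ballo describes, as a sketch only: it assumes a source tree can be reduced to "is the macro defined" plus the set of files that use it, and it uses a simplified three-way merge. All names (`tree`, `builds`, `remove_macro`, `add_use`, `merge`, the file names) are hypothetical and say nothing about how git or buildbot work internally.

```cpp
#include <set>
#include <string>

// Toy source tree: whether HPX_NOEXCEPT is defined, and which files use it.
struct tree
{
    bool macro_defined;
    std::set<std::string> use_sites;
};

// The tree "builds" if the macro is defined or nothing refers to it.
inline bool builds(tree const& t)
{
    return t.macro_defined || t.use_sites.empty();
}

// Branch A: drop the macro and rewrite every use this branch can see.
inline tree remove_macro(tree t)
{
    t.macro_defined = false;
    t.use_sites.clear();
    return t;
}

// Branch B: add a new use of the macro in another file.
inline tree add_use(tree t, std::string const& file)
{
    t.use_sites.insert(file);
    return t;
}

// Simplified three-way merge: take whichever side changed the definition,
// keep a use site if both sides have it or if one side newly added it.
inline tree merge(tree const& base, tree const& a, tree const& b)
{
    tree out;
    out.macro_defined = (a.macro_defined != base.macro_defined) ?
        a.macro_defined : b.macro_defined;
    for (auto const& s : a.use_sites)
        if (b.use_sites.count(s) != 0 || base.use_sites.count(s) == 0)
            out.use_sites.insert(s);
    for (auto const& s : b.use_sites)
        if (a.use_sites.count(s) != 0 || base.use_sites.count(s) == 0)
            out.use_sites.insert(s);
    return out;
}
```

Starting from a base where the macro is defined and used in one file, both branch results build on their own, but the merged tree keeps the new use while losing the definition, so it does not build even though the merge itself is conflict-free.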
<heller_>
You should not be required to rebase, a simple merge should be fine
david_pfander has joined #ste||ar
<K-ballo>
you are not required to rebase, you are just required to understand the effects of not doing so
<K-ballo>
the changes affect you whether you rebase or not, and if you don't rebase then those changes aren't tested until they hit master
<K-ballo>
in the end, the failures in my ranges branch were not caused by changes to the ranges branch at all, but by changes to the merged noexcept branch
<heller_>
And one test run should include a fixed set of commits, since merging happens linearly, you'd immediately see what led to a failure
<K-ballo>
you'll have to define "linearly"...
<heller_>
One after the other
<K-ballo>
only fast-forward merges are "linear" according to my understanding of "linearly"
<K-ballo>
fast-forward merges are achieved by rebasing
<heller_>
The actual merge commit is linear in time
<K-ballo>
no, it's not
<heller_>
As far as the build runner is concerned
<K-ballo>
I don't follow what you are saying
<K-ballo>
if you merge in a non fast-forward fashion then you are effectively changing the meaning of all the commits in between
<K-ballo>
by changing the meaning of commits already in master you are introducing untested commits, not even tested by circle ci
<heller_>
Once a pr is merged to master, we should test that set of commits. Each of those sets arrive linearly in time. I want to easily see which of those sets lead to which failure
<K-ballo>
I suppose we must be talking about different things, because while you get that set of commits linearly in time they do affect already existing commits, potentially changing their meaning
<K-ballo>
the effect is similar to rewinding the branch, and then cherry picking a few from this branch, a few from the other one, then a few more from the first one.. like shuffling a deck of cards
<heller_>
Yes, we're talking about two different things ;)
<K-ballo>
that makes sense, as otherwise it wouldn't
<K-ballo>
when you merge you are taking two sets of commits and generating a new third set after all
<heller_>
Let me formulate it differently: I'm unhappy about the way buildbot presents the failures
pree has quit [Quit: AaBbCc]
<heller_>
Which makes it very hard, at least for me, to figure out which change introduced given failures
<K-ballo>
I think I can imagine what you are saying
<K-ballo>
sometimes a change introduces a failure because another commit changes the things that first commit was relying on
<K-ballo>
and while the fault is in the first commit, you'd blame the second one?
zbyerly_ has quit [Ping timeout: 246 seconds]
<heller_>
K-ballo: more or less, but that's only half of the story, and buildbot of course is not to blame here. There is a reason why GitHub warns about the branch being not up to date
<K-ballo>
nod, it's equivalent to merging untested commits, under a false illusion that they were compiled
<heller_>
The problem really is that it's not easy to see which change/merge introduced the failures even though the merges were always against an up to date master.
<heller_>
Right. If PRs were tested more properly, and we only merged up to date branches, this would mitigate the problem in a way
<heller_>
But that assumes test runners which do not fail due to mysterious reasons and way more resources to test all the prs
<K-ballo>
make PRs be focused and short lived, and merge only one at a time
<K-ballo>
yes, of course that assumes a stable testsuite, which is tricky for something with a non deterministic nature
<heller_>
And a turnaround of 8 hours...
<heller_>
Of course, that's not something to blame buildbot for ;)