#ste||ar on 2017-06-09 — irc logs at irclog.cct.lsu.edu

2017-05-17 13:54 aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:00 EverYoun_ has joined #ste||ar

00:00 EverYoun_ has quit [Remote host closed the connection]

00:02 EverYoun_ has joined #ste||ar

00:03 EverYoung has quit [Ping timeout: 260 seconds]

00:54 vamatya has quit [Ping timeout: 240 seconds]

00:59 EverYoun_ has quit [Ping timeout: 260 seconds]

01:17 vamatya has joined #ste||ar

01:31 vamatya has quit [Ping timeout: 260 seconds]

01:36 quaz0r has quit [Quit: WeeChat 1.8-dev]

01:56 shoshijak has quit [Ping timeout: 255 seconds]

02:20 hkaiser has quit [Quit: bye]

02:24 * jaafar just notices he was mentioned

02:24 <jaafar> I gotta get my notifications fixed

02:25 <jaafar> oh, I was the wrong person :)

02:25 <jaafar> No wonder, my results are bad actually

02:33 K-ballo has quit [Quit: K-ballo]

03:39 quaz0r has joined #ste||ar

05:25 pree has joined #ste||ar

05:48 bikineev has joined #ste||ar

05:56 bikineev has quit [Ping timeout: 246 seconds]

05:58 bikineev has joined #ste||ar

06:37 shoshijak has joined #ste||ar

06:44 shoshijak has quit [Ping timeout: 255 seconds]

06:50 Remko has joined #ste||ar

06:55 Remko has quit [Remote host closed the connection]

06:56 Remko has joined #ste||ar

06:58 shoshijak has joined #ste||ar

07:02 taeguk has quit [Quit: Page closed]

07:02 Remko has quit [Remote host closed the connection]

07:02 taeguk has joined #ste||ar

07:02 <taeguk> jbjnr: Sorry, just got back from the toilet.. :(

07:03 Remko has joined #ste||ar

07:03 jaafar has quit [Ping timeout: 245 seconds]

07:27 <jbjnr> heller_: I hope you saw the osu BW plot I posted last night. Results are starting to get pretty good really.

07:41 david_pfander has joined #ste||ar

08:36 <pree> In component placement policies, is changing the return type of the "create" API allowed? Or does it have to be uniform with its counterparts, the default & binpacking policies?

08:36 <pree> thank you

08:41 <pree> i.e from hpx::future<id_type> ----> std::vector<hpx::future<id_type>>

08:42 <pree> heller_ && jbjnr ^^^

08:44 <jbjnr> I'm not sure. I am not familiar with that code I'm afraid. returning a vector of futures seems reasonable - why do the others not need to do so

08:45 <jbjnr> taeguk very sorry about this morning - I should have warned you that I was not available.

08:45 <jbjnr> If you have things you need to ask about then we can schedule another call anytime.

08:52 <pree> *friends

08:53 <pree> But in cyclic_distribution_policy it seems better to return a vector of ids than a single id

08:53 <jbjnr> pree: I guess my question then is - why do you need a different API in this case - if there's a good reason, then go for it

08:54 <jbjnr> why do the other policies not return vectors of ids?

08:55 Remko has quit [Remote host closed the connection]

09:24 <pree> in the cyclic case, it sounds good to me to create one component per cycle by choosing the locality using counter data

09:25 <pree> i.e. the locality which has the least no:of:components of the given type will be chosen among the localities assigned to the policy

09:27 bikineev has quit [Remote host closed the connection]

09:55 mcopik has quit [Ping timeout: 246 seconds]

10:03 <jbjnr> pree: sounds like returning a vector is fine. hartmut will be online soon I expect and you'd best discuss it with him.

10:26 <pree> * thank you john'

10:29 josef_k has joined #ste||ar

10:47 <jbjnr> pree: how did you make your messages to me appear as from(pree) and be highlighted in green? some kind of DM but in the main channel?

10:48 <pree> yes I use /notice

10:48 <pree> But some weird thing happens

10:49 <pree> Is there any problem ?

10:50 <pree> I use /notice jbjnr

11:04 <zao> They appear off in the status window for me, which is very confusing :)

11:08 shoshijak has quit [Ping timeout: 255 seconds]

11:10 <pree> zao : oh sorry for that : )

11:12 bikineev has joined #ste||ar

11:22 <jbjnr> pree - there is no problem, I have never seen messages appear the way you did it

11:22 <jbjnr> (just testing)

11:22 <pree> okay ! :)

11:22 <jbjnr> zao: you mean, you don't see the messages sent with /notice ?

11:23 <zao> jbjnr: They often end up nested with server messages off in the combined server status window.

11:23 <zao> Or in the channel looking nothing like messages do, depending on if they're in a PM or channel.

11:23 <jbjnr> ok. your irc client must show things differently from mine I guess

11:23 <zao> Behaviours vary among clients there, indeed.

11:33 bikineev has quit [Ping timeout: 245 seconds]

11:46 bikineev has joined #ste||ar

11:48 shoshijak has joined #ste||ar

11:49 shoshijak has quit [Read error: Connection reset by peer]

11:50 K-ballo has joined #ste||ar

12:04 hkaiser has joined #ste||ar

12:12 <pree> hkaiser -> In component placement policies, is changing the return type of the "create" API allowed? Or does it have to be uniform with its counterparts, the default & binpacking policies?

12:13 <hkaiser> pree: what would you like to use as the return type instead of id_type?

12:13 <pree> hpx::future<id_type> ----> hpx::future< std::vector<hpx::future<id_type>> >

12:13 <hkaiser> why's that"

12:13 <hkaiser> ?

12:13 <pree> But in cyclic_distribution_policy it seems better to return a vector of ids than a single id

12:13 <hkaiser> you create one component instance, why return a vector?

12:14 <pree> in the cyclic case, it sounds good to me to create one component per cycle by choosing the locality using counter data, i.e. the locality which has the least no:of:components of the given type will be chosen among the localities assigned to the policy

12:14 <hkaiser> sure

12:14 <pree> Can I go for it ?

12:15 <hkaiser> please explain why?

12:15 <hkaiser> I don't see a need for this at this point, except if you explain why you want to return a vector just to return a single value

12:16 <pree> wait for a sec

12:19 shoshijak has joined #ste||ar

12:19 <pree> Because in the cyclic policy we will have a parameter "runs" which tells us how many times we have to cycle through the localities. If we create just one component on one of the localities, it seems to me we are not caring about the "runs" parameter.

12:20 <hkaiser> why not?

12:20 <hkaiser> create() is supposed to create _one_ instance

12:21 <hkaiser> so what ever you do in create() it will return one instance of an id_type

12:21 <hkaiser> do you agree?

12:21 <pree> Please tell me how it differs from binpacking or default ?

12:21 <hkaiser> those create one instance as well, no?

12:22 <pree> Then what is the gain from the API create of cyclic_distribution_policy ?

12:23 <pree> On what basis should the decision be taken on which locality the component should be created ?

12:24 <hkaiser> pree: that's what I was asking you the other day

12:24 <hkaiser> just use 'the next' locality, what ever that is

12:25 denis_blank has joined #ste||ar

12:25 <pree> sorry i'm not convinced.

12:26 <hkaiser> please first explain what you want to return from create() in that vector

12:26 <pree> id_type of components

12:27 <pree> created on one locality per cycle

12:27 <hkaiser> so create() should return more than one component instance?

12:27 <pree> Yes .

12:27 <hkaiser> how many?

12:28 <pree> By this we are taking some runtime info.. For performance gain.

12:28 <hkaiser> runtime info of what?

12:28 <pree> no:of:components == no:of:runs

12:28 <hkaiser> what's the difference between create() and bulk_create()?

12:29 <pree> bulk_create creates the given "count" components on each locality for each cycle

12:29 <hkaiser> no

12:30 <hkaiser> it creates N components overall, not N components per locality

12:30 <pree> Okay then I have to change accordingly

12:31 <hkaiser> ok

12:31 <hkaiser> create() is essentially the same as bulk_create() with N == 1

12:32 <pree> okay ! Create N components on localities by using M runs --> bulk_create

12:33 <pree> Create 1 components on localities by using M runs ---> create

12:34 <pree> okay ! I implemented as the one described above, Now i will have to change it :)

12:34 <pree> Thank you @ hkaiser!

12:41 shoshijak has quit [Ping timeout: 258 seconds]

13:11 bikineev has quit [Read error: No route to host]

13:11 bikineev has joined #ste||ar

13:20 hkaiser has quit [Quit: bye]

13:20 diehlpk_work has joined #ste||ar

13:21 jakemp has joined #ste||ar

13:27 aserio has joined #ste||ar

13:28 pree has quit [Ping timeout: 240 seconds]

13:32 <david_pfander> heller_, wash: have you ever seen a "<jemalloc>: Error in dlsym(RTLD_NEXT, "pthread_create")" when executing octotiger on knl (in this case on tave)?

13:33 pree has joined #ste||ar

13:33 <github> [hpx] aserio closed pull request #2658: Unify access_data trait for use in both, serialization and de-serialization (master...serialization_access_data) https://git.io/vHCpv

13:45 eschnett has quit [Quit: eschnett]

13:49 hkaiser has joined #ste||ar

13:52 <github> [hpx] hkaiser deleted serialization_access_data at 12f10ff: https://git.io/vH1NM

13:52 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vH1Ny

13:52 <github> hpx/master b9574d6 Hartmut Kaiser: Merge pull request #2664 from STEllAR-GROUP/uninitialized_move...

13:53 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vH1NH

13:53 <github> hpx/master 179850d Hartmut Kaiser: Merge pull request #2676 from STEllAR-GROUP/parallel_destroy...

14:12 mcopik has joined #ste||ar

14:14 eschnett has joined #ste||ar

14:37 <diehlpk_work> hkaiser, Please see pm

14:40 <hkaiser> diehlpk_work: ok

14:53 bikineev has quit [Ping timeout: 246 seconds]

14:56 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vHMIT

14:56 <github> hpx/master 62cd27c Hartmut Kaiser: Making sure uninitialized_value_construct shows up in generated documentation index

15:00 <heller_> david_pfander: no

15:01 <heller_> Never saw that

15:03 <david_pfander> heller_: could you tell which modules you're using on tave? In the past, I only loaded the craype-mic-knl and switched the prog. env. with PrgEnv-gnu, and that worked for me. I'm suspecting some cross-compilation issue.

15:04 <heller_> david_pfander: there seems to be a problem with cmake. The newer versions want to do static linking

15:05 <heller_> As a workaround for now, either do a full static build or switch to cmake 2.8.12 and remove the 3.xx requirements in the main hpx cmake file

15:07 <david_pfander> heller_: ok, I'll try the older cmake version then (less work). Thanks!

15:09 jgoncal has joined #ste||ar

15:11 <zao> ...

15:11 <aserio> ^^lol

15:11 <zao> Sounds like a horrible idea to intentionally ruin the CMake setup just because some silly cluster somewhere is dumb.

15:13 <zao> Not bitter from having to maintain installations of overly ancient things for users.

15:13 <zao> libstdc++5, just saying.

15:14 <david_pfander> zao, aserio: Life is pain :)

15:15 akheir has quit [Remote host closed the connection]

15:24 bikineev has joined #ste||ar

15:28 EverYoung has joined #ste||ar

15:28 EverYoung has quit [Remote host closed the connection]

15:29 bikineev has quit [Read error: No route to host]

15:29 EverYoung has joined #ste||ar

15:29 bikineev has joined #ste||ar

15:45 bibek_desktop has quit [Ping timeout: 255 seconds]

15:46 akheir has joined #ste||ar

15:46 <heller_> zao: still looking for the reason for the static link attempt.

15:51 josef_k has quit [Ping timeout: 255 seconds]

15:58 bibek_desktop has joined #ste||ar

16:03 vamatya has joined #ste||ar

16:26 jbjnr has quit [Read error: Connection reset by peer]

16:26 jbjnr has joined #ste||ar

16:29 <bikineev> jbjnr: hi John

16:29 <bikineev> yt?

16:50 aserio has quit [Ping timeout: 245 seconds]

17:05 pree has quit [Ping timeout: 240 seconds]

17:08 pree has joined #ste||ar

17:10 bikineev has quit [Ping timeout: 246 seconds]

17:14 mcopik has quit [Ping timeout: 240 seconds]

17:33 aserio has joined #ste||ar

17:40 <heller_> aserio: fix it!

17:41 <aserio> you'll have to be more specific

17:43 <heller_> The serialization branch you merged is broken on gcc

17:44 <heller_> gcc 4.9.4 only as it seems

17:45 <heller_> Looks like some unit tests are failing as well

17:48 <heller_> The refcount tests, not good

17:50 <heller_> Which might be due to the lf branch merge...

17:54 <hkaiser> heller_: could be because of the changes made by the serialization branch aserio merged

17:55 <heller_> Looks like the tests failed before

17:57 bikineev has joined #ste||ar

17:57 <heller_> I'm getting more and more frustrated with buildbot.. very tedious to figure out which commit led to which failure

17:58 jaafar has joined #ste||ar

18:00 <K-ballo> poor buildbot, not its fault

18:02 <heller_> K-ballo: depends, even with our wild merging, the UI could help us to identify bad commits better

18:03 Matombo has joined #ste||ar

18:03 <K-ballo> you just find where the color switches from green to orange, and you get a list of all the commits included in that build

18:03 <zao> Can you still run builds and tests on branches if you care enough?

18:04 <heller_> Here is the problem, it assumes linear commits and sometimes adds unrelated commits to a build

18:04 david_pfander has quit [Ping timeout: 246 seconds]

18:04 <K-ballo> zao: yes, but the nodes run out of .. memory or something, too often while building tests and examples

18:04 <pree> hkaiser ??

18:04 <heller_> Which is another problem, yes

18:05 <K-ballo> "assumes linear commits" just sounds as bad merging to me

18:05 <heller_> Git commits are just not linear

18:06 <K-ballo> if you merge without rebasing first, then the commits in that line are all affected

18:06 <K-ballo> those in fact were never tested, because the CI doesn't test merges

18:06 <K-ballo> so it's effectively a new commit with new content

18:07 <K-ballo> here's a concrete example from a couple weeks ago:

18:08 <K-ballo> on one branch we removed support for vc2013, so I replaced HPX_NOEXCEPT with noexcept

18:08 <K-ballo> on a separate branch I added new uses of HPX_NOEXCEPT

18:08 <K-ballo> both branches worked fine separately, but after a "non-linear" merge the build failed because it was referencing an HPX_NOEXCEPT that no longer existed

18:08 <heller_> You should not be required to rebase, a simple merge should be fine

18:08 david_pfander has joined #ste||ar

18:08 <K-ballo> you are not required to rebase, you are just required to understand the effects of not doing so

18:08 <K-ballo> the changes affect you whether you rebase or not, and if you don't rebase then those changes aren't tested until they hit master

18:09 <K-ballo> in the end, the failures caused to my ranges branch were not caused by changes to the ranges branch at all, but to changes to the merged noexcept branch

18:10 <heller_> And one test run should include a fixed set of commits, since merging happens linearly, you'd immediately see what led to a failure

18:10 <K-ballo> you'll have to define "linearly"...

18:11 <heller_> One after the other

18:11 <K-ballo> only fast-forward merges are "linear" according to my understanding of "linearly"

18:11 <K-ballo> fast-forward merges are achieved by rebasing

18:11 <heller_> The actual merge commit is linear in time

18:11 <K-ballo> no, it's not

18:12 <heller_> As far as the build runner is concerned

18:12 <K-ballo> I don't follow what you are saying

18:13 <K-ballo> if you merge in a non fast-forward fashion then you are effectively changing the meaning of all the commits in between

18:14 <K-ballo> by changing the meaning of commits already in master you are introducing untested commits, not even tested by circle ci

18:14 <heller_> Once a pr is merged to master, we should test that set of commits. Each of those sets arrive linearly in time. I want to easily see which of those sets led to which failure

18:15 <K-ballo> I suppose we must be talking about different things, because while you get that set of commits linearly in time they do affect already existing commits, potentially changing their meaning

18:16 <K-ballo> the effect is similar to rewinding the branch, and then cherry picking a few from this branch, a few from the other one, then a few more from the first one.. like shuffling a deck of cards

18:16 <heller_> Yes, we're talking about two different things ;)

18:16 <K-ballo> that makes sense, as otherwise it wouldn't

18:17 <K-ballo> when you merge you are taking two sets of commits and generating a new third set after all

18:18 <heller_> Let me formulate it differently: I'm unhappy about the way buildbot presents the failures

18:19 pree has quit [Quit: AaBbCc]

18:19 <heller_> Which makes it very hard, at least for me, to figure out which change introduced given failures

18:21 <K-ballo> I think I can imagine what you are saying

18:21 <K-ballo> sometimes a change introduces a failure because another commit changes the things that first commit was relying on

18:21 <K-ballo> and while the fault is in the first commit, you'd blame the second one?

18:23 zbyerly_ has quit [Ping timeout: 246 seconds]

18:26 <heller_> K-ballo: more or less, but that's only half of the story, and buildbot of course is not to blame here. There is a reason why GitHub warns about the branch being not up to date

18:28 <K-ballo> nod, it's equivalent to merging untested commits, under a false illusion that they were compiled

18:28 <heller_> The problem really is that it's not easy to see which change/merge introduced the failures even though the merges were always against an up to date master.

18:29 <heller_> Right. If prs were tested more properly, and we only merged up to date branches, this would mitigate the problem in a way

18:30 <heller_> But that assumes test runners which do not fail due to mysterious reasons and way more resources to test all the prs

18:30 <K-ballo> make PRs be focused and short lived, and merge only one at a time

18:30 <K-ballo> yes, of course that assumes a stable testsuite, which is tricky for something with a non deterministic nature

18:32 <heller_> And a turnaround of 8 hours...

18:33 <heller_> Of course, that's not something to blame buildbot for ;)

18:59 <jbjnr> bikineev: here now

19:09 atrantan has joined #ste||ar

19:09 atrantan has quit [Client Quit]

19:24 jakemp has quit [Ping timeout: 260 seconds]

19:30 aserio has quit [Ping timeout: 260 seconds]

19:51 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vHMHN

19:51 <github> hpx/master 6ad6860 Hartmut Kaiser: Working around a compilation problem with gcc 4.9.x

19:56 aserio has joined #ste||ar

20:03 <bikineev> jbjnr: not sure what affiliation would be correct. Should it be my current working place or LSU where I did some work on serialization?

20:04 <jbjnr> bikineev: your choice, it's only what appears on the paper under your name and nobody really cares ... just for your own satisfaction really

20:05 <bikineev> hm, then "Kaspersky Lab" maybe..

20:07 hkaiser has quit [Quit: bye]

20:07 <jbjnr> ok

20:17 aserio has quit [Quit: aserio]

20:33 <heller_> jbjnr: good job!

20:33 <heller_> jbjnr: I'll read over it tomorrow

20:40 <jbjnr> heller_: I hope you like it. space was a problem

20:40 <jbjnr> IPDPS for the next one I think :)

20:50 bikineev has quit [Remote host closed the connection]

20:52 diehlpk_work has quit [Quit: Leaving]

20:54 <heller_> jbjnr: yup. Dissertation first...

20:58 eschnett has quit [Quit: eschnett]

21:05 jgoncal has quit [Quit: Leaving]

21:06 bikineev has joined #ste||ar

21:16 hkaiser has joined #ste||ar

21:35 hkaiser has quit [Read error: Connection reset by peer]

21:42 hkaiser has joined #ste||ar

22:00 eschnett has joined #ste||ar

22:25 mcopik has joined #ste||ar

23:04 bikineev has quit [Remote host closed the connection]

23:14 bikineev has joined #ste||ar

23:23 ABresting has quit [Ping timeout: 365 seconds]

23:24 ABresting has joined #ste||ar

23:28 Matombo has quit [Remote host closed the connection]

23:55 hkaiser has quit [Read error: Connection reset by peer]