#ste||ar on 2018-02-08 — irc logs at irclog.cct.lsu.edu

2017-05-17 13:54 aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:00 EverYoung has quit [Ping timeout: 252 seconds]

00:00 zombieleet has quit [Ping timeout: 248 seconds]

00:01 galabc has joined #ste||ar

00:05 <github> [hpx] hkaiser force-pushed refactor_base_action from 8d692d5 to c0673fc: https://git.io/vAvAI

00:05 <github> hpx/refactor_base_action c0673fc Hartmut Kaiser: Refactoring component_base and base_action/transfer_base_action to reduce number of instantiated functions and exported symbols...

00:10 EverYoung has joined #ste||ar

00:11 EverYoung has quit [Remote host closed the connection]

00:18 EverYoung has joined #ste||ar

00:18 EverYoung has quit [Remote host closed the connection]

00:19 EverYoung has joined #ste||ar

00:19 EverYoung has quit [Remote host closed the connection]

00:21 EverYoung has joined #ste||ar

00:39 EverYoung has quit [Remote host closed the connection]

00:40 EverYoung has joined #ste||ar

00:47 EverYoung has quit [Remote host closed the connection]

00:48 EverYoung has joined #ste||ar

00:51 EverYoung has quit [Remote host closed the connection]

00:51 EverYoun_ has joined #ste||ar

01:07 EverYoun_ has quit [Remote host closed the connection]

01:09 EverYoung has joined #ste||ar

01:17 EverYoung has quit [Ping timeout: 248 seconds]

01:18 vamatya has quit [Ping timeout: 260 seconds]

01:39 EverYoung has joined #ste||ar

01:46 <github> [hpx] hkaiser pushed 2 new commits to master: https://git.io/vAUsb

01:46 <github> hpx/master 8d02529 Mikael Simberg: Move yield_while out of detail namespace and into own file

01:46 <github> hpx/master c807a4e Hartmut Kaiser: Merge pull request #3147 from msimberg/refactor-yield_while...

01:46 <github> [hpx] hkaiser opened pull request #3154: More refactorings based on clang-tidy reports (master...clang_tidy) https://git.io/vAUsN

01:49 simbergm has quit [Ping timeout: 248 seconds]

01:56 galabc has quit []

01:56 EverYoung has quit [Ping timeout: 265 seconds]

02:09 EverYoung has joined #ste||ar

02:13 hkaiser has quit [Quit: bye]

02:13 EverYoung has quit [Ping timeout: 252 seconds]

02:27 hkaiser has joined #ste||ar

02:28 gedaj has quit [Remote host closed the connection]

02:28 gedaj has joined #ste||ar

02:32 simbergm has joined #ste||ar

02:52 daissgr has quit [Ping timeout: 260 seconds]

03:01 sam29 has joined #ste||ar

03:12 hkaiser has quit [Quit: bye]

03:20 sam29 has quit [Ping timeout: 260 seconds]

03:30 sam29 has joined #ste||ar

03:49 sam29 has quit [Ping timeout: 260 seconds]

03:59 gedaj has quit [Remote host closed the connection]

03:59 gedaj has joined #ste||ar

04:08 quaz0r has quit [Ping timeout: 256 seconds]

04:22 nanashi55 has quit [Ping timeout: 260 seconds]

04:22 nanashi64 has joined #ste||ar

04:23 nanashi64 is now known as nanashi55

05:15 EverYoung has joined #ste||ar

05:20 EverYoung has quit [Ping timeout: 252 seconds]

05:23 vamatya has joined #ste||ar

06:04 <github> [hpx] sithhell force-pushed fix_timed_suspension from 3f00c42 to 7ee57c7: https://git.io/vAURo

06:04 <github> hpx/fix_timed_suspension 7ee57c7 Thomas Heller: Fixing a race with timed suspension...

06:06 mcopik has quit [Ping timeout: 248 seconds]

07:24 heller_ has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]

07:24 heller_ has joined #ste||ar

07:25 <jbjnr> heller_: good morning - question - are you done with small task launching/destruction cleanups or are there more PRs to come?

07:52 <heller_> jbjnr: there are more, I'm currently waiting on the lazy init PR to get approved

07:53 <jbjnr> shall I take a look at that now?

07:53 <heller_> And fixing bugs in the meantime

07:53 <heller_> Sure

07:54 <heller_> The wait_or_add_new removal is next

07:54 <heller_> And then the thread map removal

07:54 <jbjnr> aha. that's a big one

07:54 <jbjnr> which #PR is lazy_init

07:54 <jbjnr> found it

07:54 <heller_> 3146

07:56 <heller_> I guess the bug fixing PRs have higher priority though

08:03 <jbjnr> do we still have a race condition somewhere deep inside hpx that causes random tests to fail from time to time

08:03 <heller_> I just fixed one last night which was there since forever

08:04 <jbjnr> great. PR?

08:04 <heller_> Those will always pop up

08:04 <jbjnr> s/always/should never/g

08:04 <heller_> 3153

08:04 <heller_> And the stack size test is fixed with my other pr

08:05 <jbjnr> you get a gold star

08:05 <heller_> #3150

08:05 <heller_> migrate test is next...

08:06 <heller_> I want a green release

08:06 <jbjnr> hmmm. I see that most tests are passing with normal pycicle build, but simbergm 's dodgy sanitizer build is flaggging everything as bad.

08:07 <heller_> Let's make master green, branch, create the RC, and call it a day

08:07 <heller_> Yes

08:07 <heller_> Right, I wanted to look into the leak sanitizer failures first

08:07 <jbjnr> has anyone looked at the sanitizer problems - do we know where they are coming from?

08:07 <jbjnr> ^^ :)

08:08 <heller_> Thread sanitizer is something we should turn off

08:08 <heller_> It's too buggy in itself

08:08 <jbjnr> fair enough

08:14 <jbjnr> I shall merge 3150

08:15 <jbjnr> simbergm: when you read this - please disable the thread sanitizer build - the leak sanitizer is finding enough for now and we have enough red. We can reenable it sometime if mecessary

08:16 <simbergm> jbjnr: yeah, no problem

08:16 <simbergm> stopped

08:16 <jbjnr> cool.

08:16 <jbjnr> How many pycicle instances have you got running?

08:16 <jbjnr> ^had

08:17 <heller_> another thing: would be cool if we turned on debugging symbols for the sanitizer builds

08:17 <simbergm> 1

08:17 EverYoung has joined #ste||ar

08:17 <simbergm> heller_: yep, can do that once we turn it on again

08:17 <jbjnr> adding a debug builds would be useful in general too

08:17 <heller_> simbergm: I mean for the leak sanitizer

08:18 <jbjnr> I will do some options for pycicle today to simplify that

08:18 <simbergm> heller_: yeah, that's what I understood

08:18 <heller_> and I found that the leak/address sanitizer reports a lot of false positives when you haven't build boost with the same sanitizer flags

08:18 <github> [hpx] biddisco closed pull request #3150: Avoiding more stack overflows (master...fixing_thread_stacks_3) https://git.io/vAfP6

08:19 <simbergm> ok, something to add as well

08:19 <simbergm> or fix

08:19 <heller_> which might be the issue at hand ... since I ran the same tests with a debug build and don't encounter those reports

08:19 <heller_> but will try with a relwithdebinfo build today

08:20 <jbjnr> simbergm: please add the compile flags / build settings that you are using for your sanitizer builds to the pycicle issue so I can add them to the 'options' branch I will start on

08:20 <simbergm> jbjnr: ok

08:21 EverYoung has quit [Ping timeout: 265 seconds]

08:22 <jbjnr> what are our rules for merging several PRs at once. Do we wait for a bvuild cycle or do we now trust pycicle enough since the PRs have been tested lots

08:23 vamatya has quit [Ping timeout: 256 seconds]

08:23 <jbjnr> and should I leave all merges to simbergm as release master, or can I do some?

08:23 <simbergm> jbjnr: you're more than welcome to merge

08:23 <jbjnr> multiple PRs at a time?

08:23 <simbergm> I usually wait, but it's quite slow

08:23 <heller_> I tend to wait as well

08:23 <simbergm> I count on hkaiser merging some in the night :)

08:24 <heller_> but it is upon your judgement, if you think they will cause no further damage ;)

08:24 <jbjnr> yup. I'm thinking that when PRs are not overlapping and since they are tested independently, it ought to be ok to merge several now

08:24 <heller_> right

08:24 <jbjnr> ok.

08:24 <simbergm> in normal circumstances I think waiting for them to finish is okay

08:24 <heller_> also, something like doc fixes can be easily interleaved

08:24 <simbergm> some PRs have been obviously okay so those can overlap

08:24 <github> [hpx] biddisco closed pull request #3153: Fixing a race with timed suspension (master...fix_timed_suspension) https://git.io/vAJjp

08:25 <simbergm> it's getting quite close

08:25 <jbjnr> what is?

08:25 <simbergm> the possibility of a release candidate

08:25 <jbjnr> k

08:25 <heller_> yeah

08:25 <simbergm> meaning rostam is quite clean

08:26 <heller_> let's try to get master as clean as possible on rostam

08:26 <simbergm> yep

08:26 <heller_> then branch the RC, let it settle and test it for a week and then release it

08:26 <jbjnr> rostam? I forgot it was still there!

08:26 <simbergm> :P

08:27 <simbergm> it's still very useful because it does so many builds

08:28 <heller_> yes

08:28 <heller_> eventually, we need to migrate most of them to pycicle ;)

08:30 david_pfander has joined #ste||ar

08:44 <jbjnr> heller_: feel free to comment on https://github.com/biddisco/pycicle/issues/13 with the kind of options for different builds you'd like to see in order to replace rostam/buildbot

08:56 <heller_> simbergm: so I can't reproduce the leak sanitizer failures locally, that is with a sanitizer enabled boost build

08:57 <heller_> so the pycicle errors seem to be one of those false positives I was talking about

08:58 <simbergm> heller_: thanks for checking, did not realize that it needs that as well

08:58 <simbergm> this is the split_gid error, right?

09:00 <heller_> yes

09:00 <heller_> hmmm

09:00 <heller_> I might be wrong after all ...

09:00 <heller_> the FAQ don't see a problem with only having parts of the program being instrumented

09:00 <simbergm> did you run it on multiple tests?

09:00 <heller_> yes

09:00 <simbergm> ok

09:00 <heller_> I might just upgrade my clang

09:01 <simbergm> I am running with gcc though

09:01 <simbergm> this might make a difference

09:01 <heller_> ahh!

09:01 <heller_> no clang?

09:01 <simbergm> well, not yet at least

09:01 <heller_> ok, I wasn't aware of that

09:01 <heller_> let me try with gcc as well

09:02 <simbergm> but I can't tell if it's gcc giving false positives or clang missing something

09:02 <simbergm> second seems more likely

09:02 <github> [hpx] biddisco created ctest_warnings (+1 new commit): https://git.io/vAUru

09:02 <github> hpx/ctest_warnings 1631a92 John Biddiscombe: Add CTest custom warning exceptions file

09:05 <heller_> yup

09:05 <heller_> I'll check with gcc

09:05 <heller_> thanks for pointing that out

09:07 <github> [hpx] biddisco opened pull request #3155: Add CTest custom warning exceptions file (master...ctest_warnings) https://git.io/vAUr5

09:08 <heller_> jbjnr: is there a possibility for me to get access to the cdash database?

09:08 <jbjnr> if you have a good reason for it, I can try to get a copy

09:08 <jbjnr> direct access, probably no.

09:09 <heller_> ok, copy is certainly not good enough ;)

09:09 <heller_> I want to create a better UI :P

09:09 gedaj has quit [Remote host closed the connection]

09:09 gedaj has joined #ste||ar

09:09 <heller_> and since cdash already has all the information needed in a database ... there is no need to replicate that effort

09:10 <jbjnr> heller_: don't create a better UI, just fork cdash and add improvements

09:10 <jbjnr> we will use git master for next upgrade

09:11 <jbjnr> I like the UI a lot, so I don't know what you want to change

09:11 <heller_> ok, tell me once you upgraded

09:11 <heller_> I will try to see what the cdash API can give me

09:11 <jbjnr> what don't you like?

09:11 <simbergm> the grid view on buildbot something that I wish cdash had (to view single branches)

09:12 <jbjnr> you mean, one 'track' per branch?

09:12 <jbjnr> like we have with 'master' at the moment

09:15 <simbergm> yeah, more or less, except that cdash puts different commits all in one column

09:16 <jbjnr> ?

09:17 <simbergm> in the grid view on buildbot columns are commits, rows are builds

09:18 <simbergm> on cdash the columns are different steps, but the commits and builds are mixed are each a separate row

09:18 <simbergm> i.e. buildbot displays the same thing more compactly

09:20 <heller_> I'd also like to have a overview over which tests failed for a specific branch

09:21 <heller_> that is, for all configurations, give me a list of tests that failed, for each test, the links to the failed tests

09:21 <jbjnr> simbergm: NB. on cdash click the "build name" title and then results are sorted by PR and not by time/other, then build for one PR are grouped

09:21 <heller_> I am not really interested in what succeeded, just what failed

09:22 <heller_> and I also don't want to see the failed builds of PRs which have been updated to fix the errors

09:23 <heller_> simbergm: good news, on gcc, I am able to reproduce the leak sanitizer error

09:23 <heller_> first, migration test though, which I have a handle on now

09:23 <simbergm> heller_: great, sanity restored

09:24 <simbergm> I was able to reproduce the migrate_component failure but got stuck after that, so thanks for looking at it

09:25 <github> [hpx] biddisco pushed 1 new commit to ctest_warnings: https://git.io/vAUKW

09:25 <github> hpx/ctest_warnings 239e2fe John Biddiscombe: Another try

09:31 simbergm has quit [Ping timeout: 256 seconds]

09:34 heller_ has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]

09:36 heller_ has joined #ste||ar

09:46 simbergm has joined #ste||ar

10:17 <heller_> simbergm: what about #2974?

10:18 EverYoung has joined #ste||ar

10:23 EverYoung has quit [Ping timeout: 276 seconds]

10:24 simbergm has quit [Ping timeout: 256 seconds]

10:25 <heller_> simbergm: jbjnr: since hwloc2 is now released, should we make everything working with it before the release?

10:28 <jbjnr> heller_: no

10:28 <jbjnr> hwloc 2 completely changes the memory heirarchy and everything breaks completely. We cannot use hwloc 2 without a rewrite of the topology class.

10:28 <jbjnr> unfortunately :(

10:29 <jbjnr> we will need to add in a check i cmake to give an error if hwloc 2 is used at the moment

10:29 <jbjnr> all the numa domain stuff needs to be redone.

10:32 simbergm has joined #ste||ar

10:35 <heller_> ok

10:35 <heller_> too bad

10:43 <heller_> alright, I think I fixed component migration

10:46 <heller_> nope.

10:46 <heller_> :/

11:09 <heller_> I think someone broke rostam

11:15 <simbergm> heller_: #3153? :/

11:16 <simbergm> pycicle seems to have timed out on that

11:16 <simbergm> jbjnr: two feature requests for pycicle

11:16 <simbergm> 1. set the PR status to pending when starting a build

11:16 <simbergm> 2. merge the config, build, test statuses into one

11:16 <jbjnr> already got an issue for that

11:16 <jbjnr> #1 I mean

11:17 <simbergm> very nice

11:17 <simbergm> 2 is not a must but I think they don't all need a separate status

11:17 <jbjnr> 2 - could do. Not a very big deal though, github already does and and/or operation on the results for us

11:17 <simbergm> I can try to hack it together in pycicle as well

11:18 <simbergm> yep

11:18 <simbergm> but 1 would be very nice

11:33 <github> [hpx] msimberg created revert-3153-fix_timed_suspension (+1 new commit): https://git.io/vAUHD

11:33 <github> hpx/revert-3153-fix_timed_suspension 30135c6 Mikael Simberg: Revert "Fixing a race with timed suspension"

11:33 <heller_> why did you revert it?

11:33 <jbjnr> why do you think 3153 is the problem?

11:33 <simbergm> not reverted yet but have a look on rostam

11:34 <heller_> rostam is completely fired

11:34 <heller_> as it seems

11:34 <simbergm> ah, you think it's unrelated?

11:34 <heller_> yes

11:34 <jbjnr> when the builds don't complete - it's a disk/memory error or something

11:34 <heller_> yeah

11:34 <heller_> there are three jobs on rostam than run for over 12 hours at the moment

11:35 <simbergm> sorry, missed that

11:35 <heller_> and the binaries running as long as the job

11:35 <simbergm> can you see that on the buildbot page? or is this insider knowledge?

11:35 <heller_> so I am guessing that they just hammer the FS right now...

11:35 <heller_> this is me logging onto rostam as root and looking at the jobs

11:35 <simbergm> ok, thanks for stopping me

11:36 <heller_> once the guys at LSU wake up, we should try to merge your doc PR

11:36 <github> [hpx] msimberg deleted revert-3153-fix_timed_suspension at 30135c6: https://git.io/vAUQm

11:36 <simbergm> yep

11:37 <heller_> and see how it goes

11:40 <heller_> https://gist.github.com/sithhell/328b2ac05ca55058760fdac6592fcd5f

11:40 <heller_> that's the queue on rostam right now

11:42 <heller_> http://rostam.cct.lsu.edu/builders/hpx_clang_3_8_boost_1_59_centos_x86_64_release/builds/210/steps/run_unit_tests/logs/stdio

11:42 <heller_> just take a look at this, doesn't get through the queue

11:45 <heller_> simbergm: the failures are still suspicous though

11:45 <heller_> I'll investigate

11:46 <simbergm> I hope it's rostam

11:46 <simbergm> but the builds didn't finish on pycicle either

11:48 <heller_> just noticed

11:48 <heller_> I'll give it a whirl

11:49 <heller_> at least migrate_component is running super smooth now :P

11:50 hkaiser has joined #ste||ar

11:50 <simbergm> that is good news!

11:51 <heller_> and just the second after I said it, it crashed again :/

12:30 <jbjnr> simbergm: the fact that rostam keeps dying on us is one of the reasons I wanted to do builds on daint. We have a whole team of people who keep daint running and maintain the filesyste etc. Rostam has one guy - and he's supposed to be doing a PhD not maintaining the system.

12:42 <heller_> K-ballo: yes

12:43 <heller_> jbjnr: simbergm: I guess it was the patch after all

12:43 <heller_> let me investigate though

12:43 <heller_> before reverting and then commiting again

12:44 <simbergm> jbjnr: yeah, that's a good thing, but daint is not exactly super stable either

12:44 <simbergm> heller_: yep, invoke_no_more_than hangs

12:44 <jbjnr> builds failed on daint - oh dear. I'll have a look too

12:46 <heller_> simbergm: I'm sure it's not much

12:47 <simbergm> I think so too, luckily it's a small patch

12:48 <jbjnr> srun: Job step creation temporarily disabled, retrying

12:48 <jbjnr> so daint couldn't launch any jobs for some reason.

12:49 <heller_> simbergm: works for me :/

12:50 <simbergm> huh, I can check again

12:50 <heller_> got it to hang now ... needs 4 cores

12:50 <simbergm> ok

12:50 <simbergm> "good"

12:52 <github> [hpx] AntonBikineev created fix_3134 (+1 new commit): https://git.io/vAUAf

12:52 <github> hpx/fix_3134 6b25fb4 AntonBikineev: Fixing serialization of classes with incompatible serialize signature...

12:53 <github> [hpx] AntonBikineev opened pull request #3156: Fixing serialization of classes with incompatible serialize signature (master...fix_3134) https://git.io/vAUAT

12:53 <jbjnr> action_invoke_no_more_than_test hung by the looks of it

13:09 <jbjnr> so ....

13:11 <jbjnr> action_invoke_no_more_than hung on daint. Ctest tried to kill it, but it didn't die. After that, all the next tests timed out and failed because slurm couldn't get the job step. Once I manualy killed the action_no_more_than test, then slurm continued to work and the remaining tests began passing.

13:11 <jbjnr> simbergm: ^ heller_ ^

13:11 <heller_> yeah

13:11 <heller_> looking into it right now

13:11 <simbergm> hum, this happens on rostam as well

13:11 <simbergm> seems like ctest is not able to properly kill tests sometimes

13:12 <jbjnr> yes, because they are inside a dodgy python wrapper

13:12 <jbjnr> note to self :get rid of hpxrun.py

13:13 <jbjnr> timed_this_thread_executors is also bad by the looks of things

13:17 <heller_> yeah, not sure what's going on

13:17 <jbjnr> just remember rule #1

13:18 <simbergm> it's heller's fault?

13:18 <jbjnr> you're a fast learner!

13:18 <simbergm> sorry heller, jbjnr has brainwashed me, I couldn't help it

13:19 <jbjnr> though it's probly your dodgy thread suspension stuff that broke everything :)

13:19 <simbergm> everything!

13:20 <simbergm> probably

13:28 <jbjnr> simbergm: if I had a pool called "stuff" and I wanted to put just that pool to sleep, could I launch a task with a pool_executor("stuff") on the stuff pool and then do put the pool to sleep by saying async(stuff_executor, task).then(put_pool_to_sleep("stuff");

13:29 <jbjnr> I want to create two pools, use them for something, and then put them to sleep until I need them again, and by attaching a continuation to the tasks on those pools, I could make sure my task runs, then make them sleep

13:30 <simbergm> jbjnr: you should *not* launch put_pool_to_sleep on the pool you want to suspend

13:30 <simbergm> i.e. it can't put itself to sleep

13:30 <jbjnr> ok, find, I do this async(stuff_executor, task).then(default_eecutor, put_pool_to_sleep("stuff");

13:31 <jbjnr> so the put pool to sleep task runs on another pool

13:31 <simbergm> yeah, something like that should work

13:31 <jbjnr> but I can do it with your changes yes?

13:31 <jbjnr> cool

13:31 <jbjnr> I will implement it then

13:31 <simbergm> assuming I didn't break everything, yes ;)

13:31 <simbergm> it's there at least

13:32 <jbjnr> to suspend one pool I call suspend("pool name") ?

13:32 <simbergm> get_thread_pool("stuff").suspend()

13:32 <jbjnr> lovely

13:32 <jbjnr> thanks

13:32 <simbergm> it's a member function of thread_pool_base

13:32 <jbjnr> to wake it up again can I just get_thread_pool("stuff").resume()

13:33 <simbergm> yep

13:33 <jbjnr> lovely

13:38 <jbjnr> tests.unit.threads.set_thread_state died too. must be a hanging on termination problem.

13:47 hkaiser has quit [Quit: bye]

14:12 mcopik has joined #ste||ar

14:15 <jbjnr> simbergm: it's may bad - I am looking at the dashboard and I see that the PR tests for 3153 all failed, but I merged it anyway. I must have looked at the wrong PR number when I decided it was safe to merge. Sorry.

14:15 <jbjnr> I guess it's a good idea to revert it.

14:21 <heller_> it's all broken!

14:21 <heller_> I hate race conditions

14:22 mcopik_ has joined #ste||ar

14:22 mcopik_ has quit [Client Quit]

14:36 <simbergm> jbjnr: I've done the same... that's why the pending status on the PR is useful

14:36 <jbjnr> ok. good point.

14:36 <simbergm> heller_: I'll go ahead and revert for now so we can merge other stuff in the meantime

14:39 <simbergm> so rostam seems to be slowly making progress, I guess it's okay to merge or do you think I should wait?

14:40 <heller_> shouldn't matter

14:40 <heller_> a fix will probably take until tomorrow, yeah

14:41 <github> [hpx] msimberg created revert-3153-fix_timed_suspension (+1 new commit): https://git.io/vATsm

14:41 <github> hpx/revert-3153-fix_timed_suspension 7e37600 Mikael Simberg: Revert "Fixing a race with timed suspension"

14:42 <github> [hpx] msimberg opened pull request #3157: Revert "Fixing a race with timed suspension" (master...revert-3153-fix_timed_suspension) https://git.io/vATsE

14:43 <Guest58625> [hpx] msimberg pushed 1 new commit to master: https://git.io/vATsa

14:43 <Guest58625> hpx/master 249696b Mikael Simberg: Merge pull request #3157 from STEllAR-GROUP/revert-3153-fix_timed_suspension...

14:50 hkaiser has joined #ste||ar

14:52 hkaiser has quit [Client Quit]

14:53 hkaiser has joined #ste||ar

14:54 <heller_> I'll cancel the hung tests

14:55 gedaj has quit [Remote host closed the connection]

14:55 gedaj has joined #ste||ar

14:58 <hkaiser> parsa[w]: the operators should be available now

14:58 <parsa[w]> excellent!

14:58 <parsa[w]> thanks

15:01 <heller_> just testing the fix to the fix ...

15:01 <heller_> action_invoke_more_than seems to be fine now, let's see the rest

15:04 aserio has joined #ste||ar

15:20 mcopik_ has joined #ste||ar

15:20 mcopik_ has quit [Client Quit]

15:25 <aserio> simbergm: yt?

15:26 <simbergm> aserio: yeah

15:29 <aserio> simbergm: please see pm

15:44 eschnett has quit [Quit: eschnett]

15:53 gedaj has quit [Remote host closed the connection]

15:53 gedaj has joined #ste||ar

16:02 eschnett has joined #ste||ar

16:06 rtohid has joined #ste||ar

16:09 kisaacs has joined #ste||ar

16:10 aserio1 has joined #ste||ar

16:14 aserio has quit [Ping timeout: 252 seconds]

16:14 aserio1 is now known as aserio

16:30 daissgr has joined #ste||ar

16:41 aserio1 has joined #ste||ar

16:42 mcopik has quit [Ping timeout: 255 seconds]

16:43 quaz0r has joined #ste||ar

16:45 aserio has quit [Ping timeout: 252 seconds]

16:45 aserio1 is now known as aserio

16:56 <diehlpk_work> Did something in the latest hpx master changed for hpx::parallel::static_chunk_size and co?

16:56 <diehlpk_work> I can not reproduce the previous results with the latest version of these.

16:57 EverYoung has joined #ste||ar

16:57 <diehlpk_work> 10000000 549.305 MFLOPS/s vs 3677.46 MFLOS/S with the older version of HPX

16:59 EverYoung has quit [Remote host closed the connection]

17:00 EverYoung has joined #ste||ar

17:05 daissgr has quit [Ping timeout: 256 seconds]

17:06 EverYoung has quit [Remote host closed the connection]

17:07 EverYoung has joined #ste||ar

17:07 <github> [hpx] AntonBikineev force-pushed fix_3134 from 6b25fb4 to 3cff7c8: https://git.io/vAToO

17:07 <github> hpx/fix_3134 3cff7c8 AntonBikineev: Fixing serialization of classes with incompatible serialize signature...

17:07 eschnett has quit [Remote host closed the connection]

17:08 eschnett has joined #ste||ar

17:09 <hkaiser> diehlpk_work: heller_ was fiddling with the thread manager

17:17 daissgr has joined #ste||ar

17:20 EverYoung has quit [Remote host closed the connection]

17:20 EverYoung has joined #ste||ar

17:23 david_pfander has quit [Ping timeout: 240 seconds]

17:24 mcopik has joined #ste||ar

17:24 <github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vAT6G

17:24 <github> hpx/gh-pages ce2da59 StellarBot: Updating docs

17:27 aserio1 has joined #ste||ar

17:30 aserio has quit [Ping timeout: 255 seconds]

17:30 aserio1 is now known as aserio

17:32 <diehlpk_work> hkaiser, Ok, so I will contact him

17:33 <diehlpk_work> I have both thresholds now and for smaller vectors and matrices we are slighlty better as omp

17:36 <hkaiser> nice

17:50 EverYoung has quit [Remote host closed the connection]

17:51 EverYoung has joined #ste||ar

17:51 EverYoung has quit [Remote host closed the connection]

17:52 EverYoung has joined #ste||ar

17:52 vamatya has joined #ste||ar

17:58 EverYoun_ has joined #ste||ar

17:59 EverYoun_ has quit [Remote host closed the connection]

18:01 EverYoun_ has joined #ste||ar

18:01 EverYoung has quit [Ping timeout: 265 seconds]

18:02 kisaacs has quit [Ping timeout: 240 seconds]

18:02 <heller_> diehlpk_work: release vs debug build?

18:16 vamatya has quit [Read error: Connection reset by peer]

18:17 vamatya has joined #ste||ar

18:31 EverYoung has joined #ste||ar

18:35 EverYoun_ has quit [Ping timeout: 276 seconds]

18:37 aserio has quit [Ping timeout: 252 seconds]

18:51 mbremer has joined #ste||ar

19:06 <heller_> round 2.

19:06 <github> [hpx] sithhell created fix_timed_suspension (+1 new commit): https://git.io/vAT5M

19:06 <github> hpx/fix_timed_suspension 81b2856 Thomas Heller: Fixing a race with timed suspension...

19:07 <github> [hpx] sithhell opened pull request #3158: Fixing a race with timed suspension (second attempt) (master...fix_timed_suspension) https://git.io/vAT55

19:19 aserio has joined #ste||ar

19:23 <diehlpk_work> heller_, Was in a meeting and have a look soon

19:28 kisaacs has joined #ste||ar

19:29 aserio1 has joined #ste||ar

19:30 EverYoung has quit [Remote host closed the connection]

19:32 aserio has quit [Ping timeout: 276 seconds]

19:32 aserio1 is now known as aserio

19:33 parsa[w] has quit [Read error: Connection reset by peer]

19:33 kisaacs has quit [Ping timeout: 260 seconds]

19:36 parsa[w] has joined #ste||ar

19:37 sam29 has joined #ste||ar

20:07 EverYoung has joined #ste||ar

20:10 kisaacs has joined #ste||ar

20:21 autrilla has quit [Disconnected by services]

20:21 autrilla1 has joined #ste||ar

20:22 autrilla1 has quit [Client Quit]

20:32 <diehlpk_work> hkaiser, heller_ The default behavior of hpx with parsing commandline args is not nice

20:32 <hkaiser> diehlpk_work: ok?

20:33 <hkaiser> what behavior is bothering you?

20:33 <diehlpk_work> Passing an unknown argument does do not result in an error

20:33 <hkaiser> that depends

20:33 <diehlpk_work> With the nee version -t is not supported anymore and I switched to --hpx:threads=

20:33 <hkaiser> that's only true if you use hpx_main.hpp - that is to allow for applications handling their own arguments without hpx complaining

20:34 <diehlpk_work> Forgot to append the s to threads and HPX was running with all coresd

20:34 <diehlpk_work> *cores

20:34 <hkaiser> -t is not supported only for hpx_main.hpp as well

20:34 <diehlpk_work> Therefore, I got different results for BLAZE

20:35 <hkaiser> diehlpk_work: how do you suggest we handle this?

20:35 <diehlpk_work> Is it possible to print an warning that the provided commandline option is not known or used by hpx

20:35 <diehlpk_work> Some tools say unrecognized option

20:36 <hkaiser> that would annoy people that use hpx_main.hpp and want to handle their own command line arguments

20:37 <diehlpk_work> Yes, you are right

20:39 <hkaiser> we could handle that using a application wide pp constance

20:39 <hkaiser> constant*

20:48 EverYoun_ has joined #ste||ar

20:51 <diehlpk_work> Yes, or just let hpx start with one core as default and not all

20:51 <diehlpk_work> I would consider default is one core and with --hpx:threads= one specifiy the amount of cores

20:52 EverYoung has quit [Ping timeout: 276 seconds]

20:52 <diehlpk_work> I just used htop and realized that hpx uses too many cores

20:52 <diehlpk_work> At least I know why my performance dropped

20:53 <diehlpk_work> And can start to work on tune blaze again

20:58 EverYoun_ has quit [Remote host closed the connection]

21:06 sam29 has quit [Ping timeout: 260 seconds]

21:08 EverYoung has joined #ste||ar

21:34 <hkaiser> diehlpk_work: fight this out with jbjnr, he changed it to use all cores

21:34 <diehlpk_work> Ok, I will do this

21:35 hkaiser has quit [Quit: bye]

21:59 cogle has joined #ste||ar

22:00 EverYoung has quit [Remote host closed the connection]

22:10 EverYoung has joined #ste||ar

22:21 kisaacs has quit [Ping timeout: 276 seconds]

22:22 hkaiser has joined #ste||ar

22:24 kisaacs has joined #ste||ar

22:28 twwright_ has joined #ste||ar

22:28 twwright has quit [Read error: Connection reset by peer]

22:28 twwright_ is now known as twwright

22:31 twwright has quit [Client Quit]

22:33 twwright has joined #ste||ar

22:47 daissgr has quit [Quit: WeeChat 1.4]

22:56 daissgr has joined #ste||ar

23:08 <github> [hpx] aserio opened pull request #3159: Support Checkpointing Components (master...checkpoint_component) https://git.io/vAkWZ

23:22 kektrain_ has joined #ste||ar

23:22 kektrain_ has left #ste||ar ["iяс.sцреяиетs.ояg сни sцреявоwl"]

23:28 aserio has quit [Quit: aserio]

23:30 daissgr has quit [Ping timeout: 255 seconds]

23:34 <jbjnr> hkaiser: if I want to create a task and it is already ready to run - all futures it depends on are ready - what is the most efficient way of creating it and inserting it directly onto the queues - is lcos::local::futures_factory inefficient?

23:39 <hkaiser> it needs an allocation, otherwise it should be fine

23:39 <hkaiser> creating a future always allocates

23:39 <hkaiser> otherwise there shouldn't be much overhead

23:41 <jbjnr> is there a flag I should pass to say - this task can be run - it is ready, not waiting for nything

23:42 <jbjnr> I guess async says that already

23:42 daissgr has joined #ste||ar

23:52 ct-clmsn has joined #ste||ar

23:53 <ct-clmsn> hkaiser, where in the source tree is the strassen test you put together?

23:54 <ct-clmsn> i've done a local update and fgrep and can't seem to find the right keyword to search