#ste||ar on 2019-02-18 — irc logs at irclog.cct.lsu.edu

2018-08-26 23:03 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:08 eschnett_ has joined #ste||ar

00:13 diehlpk_work has quit [Remote host closed the connection]

00:35 <Yorlik> Hmm ... with a lot of object migration for load balancing and other memory heavy operations I wonder if this could be interesting when running HPX apps: https://arxiv.org/pdf/1902.04738.pdf

01:23 <hkaiser> Yorlik: interesting

02:43 nikunj has quit [Ping timeout: 258 seconds]

02:47 nikunj has joined #ste||ar

02:56 hkaiser has quit [Quit: bye]

04:46 nikunj has quit [Remote host closed the connection]

04:46 nikunj has joined #ste||ar

05:57 nikunj has quit [Ping timeout: 245 seconds]

07:40 daissgr has joined #ste||ar

08:36 hello has joined #ste||ar

09:36 daissgr has quit [Ping timeout: 250 seconds]

09:58 daissgr has joined #ste||ar

10:03 hkaiser has joined #ste||ar

10:07 <hkaiser> heller_: isn't it your birthday today?

10:07 <heller_> hkaiser: it is

10:07 <hkaiser> Happy Birthday!

10:09 <heller_> thanks ;)

10:11 pmikolajczyk41 has joined #ste||ar

10:26 <heller_> hkaiser: does my cuda commit work on windows?

10:26 <hkaiser> heller_: have not tried yet

10:27 <heller_> that would be very interesting and was the main reason to commit it at this stage ;)

10:33 <pmikolajczyk41> Hello. While working on issue #3442 with a set_operation.hpp (hpx/parallel/algorithms/detail/set_operation.hpp) I got a little bit concerned with two function calls to parallel::util::partitioner<>::call(...) in lines 98 and 178.

10:33 <pmikolajczyk41> As I understand (looking at chunk_size.hpp or .*_partitioner.hpp), the 3rd argument should be the number of elements to be partitioned, but these two calls use the number of available cores instead.

10:34 <pmikolajczyk41> When I tried substituting *cores* with *len1*, I got some strange runtime errors.

10:34 <pmikolajczyk41> I think it only affects efficiency, because the code works well; I mean, it gives the correct answer for e.g. set difference or intersection.

10:34 <pmikolajczyk41> Of course, I could have misunderstood the code, so I decided to ask whether somebody could have a look at it.

10:36 <hkaiser> pmikolajczyk41: you mean this: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/algorithms/detail/set_operation.hpp#L98?

10:37 <pmikolajczyk41> exactly

10:38 <hkaiser> yes, we use whatever the executor returns as the number of processing units to use

10:38 <hkaiser> don't remember why *puzzled*

10:46 <hkaiser> pmikolajczyk41: could have been the result of some performance analysis...
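
The question above comes down to what the count argument means: the number of elements, or the number of work items. A minimal sketch of the second reading follows, assuming each core gets one contiguous chunk. `make_chunks` is a hypothetical helper written for illustration, not HPX's partitioner or its signature; under this reading, partitioning over `cores` items yields one work item per chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of HPX: split `count` items into at most
// `cores` contiguous chunks, returned as (begin index, size) pairs.
std::vector<std::pair<std::size_t, std::size_t>>
make_chunks(std::size_t count, std::size_t cores)
{
    assert(cores > 0);
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    std::size_t const base = count / cores;
    std::size_t const extra = count % cores;  // first `extra` chunks get one more item
    std::size_t begin = 0;
    // Stop early when there are fewer items than cores.
    for (std::size_t i = 0; i != cores && begin != count; ++i)
    {
        std::size_t const size = base + (i < extra ? 1 : 0);
        chunks.emplace_back(begin, size);
        begin += size;
    }
    return chunks;
}
```

With 10 elements and 4 cores this gives chunks of sizes 3, 3, 2 and 2; an outer loop over those 4 chunks is then a loop over `cores` items rather than over `len1` elements.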

11:31 adityaRakhecha has quit [Ping timeout: 256 seconds]

11:32 <Yorlik> o/

11:34 daissgr has quit [Ping timeout: 240 seconds]

11:53 pmikolajczyk41 has quit [Quit: Page closed]

12:04 <heller_> hkaiser: the problem with the cancelable action example is not that it's trying to set the state of an active thread

12:05 <heller_> but the state of a thread that doesn't exist anymore...

12:05 <hkaiser> ok

12:07 <heller_> so, in the case of an interruption, I don't think one needs to set the state again if the thread is active ... or we need to add reference counting back in

12:08 hello has quit [Ping timeout: 255 seconds]

12:08 <hkaiser> heller_: is the test broken?

12:08 <heller_> the test itself is fine, I think

12:09 <hkaiser> how do we know the thread was interrupted in order not to set the state?

12:09 <heller_> we set the flag, don't we?

12:09 <heller_> and since we can only handle interruptions on interruption points...

12:10 <hkaiser> we set the flag to cause interruption, but once interrupted (exception being thrown) - how do you know then?

12:10 <heller_> we just terminate the thread, no?

12:10 <heller_> there's no need to set the state from the outside

12:10 <hkaiser> and flag any waiting threads, yes

12:11 <mdiers_> heller_: Happy Birthday. Will you be in heidelberg on wednesday and thursday?

12:11 <heller_> mdiers_: yes

12:11 <heller_> mdiers_: and thanks ;)

12:11 <heller_> hkaiser: if the thread is suspended, we need to put it into pending, for sure

12:12 <heller_> mdiers_: you as well? I'll arrive tomorrow afternoon

12:12 <hkaiser> heller_: when is your talk?

12:13 <mdiers_> heller_: yes I will arrive later tomorrow too.

12:14 <heller_> 21st at 15:00

12:15 <heller_> mdiers_: excellent! btw, did you hear back from BMBF yet?

12:17 <heller_> mdiers_: after the speakers dinner on the 19th, I'd be up for a beer!

12:17 <hkaiser> heller_: so, do you know what to change in order to fix the cancellable action test?

12:17 <heller_> hkaiser: not yet

12:20 <mdiers_> heller_: yes, today. unfortunately a cancellation :-( . gerald is just trying to figure out by phone why.

12:22 <heller_> ok

12:22 hello has joined #ste||ar

12:28 <mdiers_> heller_: unfortunately i will arrive at 10:30pm.

12:28 <heller_> oh, ok

13:05 <hello> My machine has 8 DDR channels and every channel has 12.5 GB/s bandwidth. Why do I get 100 GB/s and 300 GB/s bandwidth with different gcc compile parameters for the STREAM benchmark?

13:14 <hkaiser> cache effects?

13:15 <hello> I am using stream benchmark

13:19 <Yorlik> Cache friendliness beats any sophisticated algorithms ;)

13:30 <heller_> Debug vs Release?

13:31 <heller_> 12.5 times 8 is 100, so the 300 get served from cache
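
The arithmetic behind that remark can be written down as a small check. This is only a sketch: the function names are made up for illustration, and `cache_bytes` is an assumed input, since the log does not give the machine's cache size. The last function encodes STREAM's run rule that each array should be at least four times the combined last-level cache size.

```cpp
#include <cassert>
#include <cstddef>

// Theoretical peak: number of memory channels times per-channel bandwidth.
double peak_bandwidth_gbs(unsigned channels, double per_channel_gbs)
{
    assert(per_channel_gbs >= 0.0);
    return channels * per_channel_gbs;
}

// A measured rate above the DRAM peak cannot come from DRAM alone.
bool exceeds_dram_peak(double measured_gbs, double peak_gbs)
{
    return measured_gbs > peak_gbs;
}

// STREAM's run rules ask for each array to be at least 4x the combined
// last-level cache size. `cache_bytes` is an assumed input here.
bool stream_array_large_enough(std::size_t elements, std::size_t cache_bytes)
{
    return elements * sizeof(double) >= 4 * cache_bytes;
}
```

For the numbers in this conversation, `peak_bandwidth_gbs(8, 12.5)` is 100, so a reported 398 GB/s exceeds the DRAM peak and points at caching or at an array that is too small.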

13:31 <heller_> The most interesting part: what are the different parameters and how did you run it?

13:32 <hello> icc T=56

13:32 <hello> 100GB/s

13:32 <hello> Icc T=56, -xHost opt=auto

13:33 <hello> 398GB/s

13:33 <heller_> That doesn't make lots of sense ;)

13:34 <heller_> What's T?

13:37 heller_ has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]

13:40 heller_ has joined #ste||ar

13:41 <Yorlik> -xHost implies -O2 and uses SIMD instructions ...

13:41 <heller_> So my guess is: you are running a small array size?

13:41 <Yorlik> That ~should have visible consequences.

13:42 <Yorlik> I think if you really want to understand what's going on here you'd have to look at the assembly and compare.

13:44 <heller_> Or performance counters

13:45 <heller_> Could you please share your full command line to launch the program?

13:45 <Yorlik> Ya. It comes down to some hardcore low-level stuff.

13:54 hkaiser has quit [Quit: bye]

13:59 <hello> Thread

14:12 daissgr has joined #ste||ar

14:18 hkaiser has joined #ste||ar

14:26 aserio has joined #ste||ar

14:39 daissgr has quit [Ping timeout: 272 seconds]

14:59 eschnett_ has quit [Quit: eschnett_]

15:18 daissgr has joined #ste||ar

15:25 hello has quit [Quit: Going offline, see ya! (www.adiirc.com)]

15:31 akheir has quit [Remote host closed the connection]

15:34 aserio has quit [Ping timeout: 264 seconds]

15:36 akheir has joined #ste||ar

15:39 aserio has joined #ste||ar

15:57 diehlpk_work has joined #ste||ar

15:57 <diehlpk_work> simbergm, Non-standard cmake paths work now for Fedora

16:00 aserio1 has joined #ste||ar

16:04 aserio has quit [Ping timeout: 268 seconds]

16:04 aserio1 is now known as aserio

16:12 detan has joined #ste||ar

16:17 <detan> Does anyone know why the examples/compute/cuda/partitioned_vector only compiles with clang?

16:21 <K-ballo> simbergm: when you have a chance could you tell me what the status of the function serialization PR is on pycicle?

16:34 <diehlpk_work> detan, Which version of HPX are you using?

16:34 mbremer has joined #ste||ar

16:38 <detan> diehlpk_work: 1.2.0

16:38 <diehlpk_work> Is the error message related to constexpr?

16:39 <detan> diehlpk_work: Sorry, I was just looking at the CMakeLists.txt file at the examples/compute/cuda/ folder.

16:40 <detan> diehlpk_work: I was just wondering if there was any feature that only works with clang. Like partitioned_vectors...

16:40 <diehlpk_work> detan, We had issues with nvcc there, but these should be addressed in current master

16:41 <diehlpk_work> I had some issues with constexpr when I switched from clang to nvcc

16:41 <detan> diehlpk_work: That is good to know... Thanks!

16:41 <diehlpk_work> You could try hpx master and see if it is working there

16:42 <detan> diehlpk_work: Will do!

16:42 <diehlpk_work> detan, cuda clang handles constexpr differenty from nvcc

16:42 <diehlpk_work> *different

16:43 <detan> diehlpk_work: How different?

16:44 <diehlpk_work> In my case clang was able to compile the constexpr statement and nvcc complained about it

16:44 <diehlpk_work> I had one function defined as constexpr and clang could compile it, but nvcc complained

16:48 <detan> I see... did you use -cuda-relaxed-constexpr?

16:49 <diehlpk_work> Yes

16:50 <simbergm> K-ballo: one unused variable warning, some capture/move warnings, and something broken in the guided pool executor (only with gcc...?)

16:50 <simbergm> and -Werror is on, hence errors

16:50 <simbergm> http://cdash.cscs.ch//viewBuildError.php?buildid=41380

16:50 <simbergm> http://cdash.cscs.ch//viewBuildError.php?buildid=41404

16:51 <simbergm> first link, page 1: unused variable, page 2: guided executor

16:52 <K-ballo> thanks

16:52 <detan> diehlpk_work: It's hard to work with nvcc sometimes...

16:52 <simbergm> second link: move warnings

16:52 <K-ballo> had not realized there were pages

16:53 <simbergm> K-ballo: last commit on that branch was a week ago, no? (if yes, results should be up to date)

16:53 <diehlpk_work> detan, Yes, it is. At least my use case compiled with the master of hpx

16:53 <K-ballo> simbergm: yeah, I waited a couple days

16:53 <simbergm> you can also set "items per page = all"

16:53 <K-ballo> I'll try fixing those issues later today, and ask back later this week

16:53 <simbergm> sounds good

16:54 <simbergm> K-ballo: scratch that, those snuck into master....

16:55 <K-ballo> which ones?

16:57 <detan> diehlpk_work: Ok, thanks... I saw that the condition in CMakeLists.txt checking for clang is still there for partitioned_vector on master. Let's see what happens when I remove it...

16:57 <simbergm> looks like 3704

16:57 <simbergm> http://cdash.cscs.ch//index.php?project=HPX&date=2019-02-17&filtercount=1&field1=buildname/string&compare1=63&value1=3704-debind-more

16:57 <simbergm> master was ok yesterday

17:00 <K-ballo> function serialization is not based on debind more

17:00 <K-ballo> ...is it? it should not be

17:00 <hkaiser> what's 'debind'?

17:01 <K-ballo> oh, does pycicle rebase/merge?

17:02 <K-ballo> hkaiser: the PR for removing/lowering bind and friends

17:02 <K-ballo> left some unused using placeholder::_1; in place

17:02 <hkaiser> k

17:10 <simbergm> K-ballo: yep, it merges master before testing

17:11 <simbergm> diehlpk_work: nice work on the cmake

17:11 <simbergm> what did you have to do in the end to convince cmake to look in the right places?

17:11 <diehlpk_work> Adding a new flag to the cmake and changing the installation path

17:13 <diehlpk_work> Fedora searches in /usr/lib*/*mpi*/lib/

17:13 <simbergm> hmm, ok

17:13 <diehlpk_work> simbergm, I just asked in their irc and did what they told me to do

17:13 <diehlpk_work> and it worked

17:14 <diehlpk_work> I could install the rpm packages and compile the hello world with gcc, mpich, and openmpi

17:15 aserio has quit [Ping timeout: 246 seconds]

17:43 <K-ballo> simbergm: are the capture errors in master too?

17:56 <K-ballo> looks like they are

18:02 <K-ballo> jbjnr_: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/traits/is_future_tuple.hpp

18:03 <hkaiser> K-ballo: I implemented then_execute and bulk_then_execute on your executor, should propagate the launch policy correctly now

18:04 <K-ballo> ok, let me know if I have to do anything else on that branch

18:04 <K-ballo> hkaiser: what should we do about the C++11 errors on master?

18:05 <hkaiser> will look

18:05 <hkaiser> wrt the executor - I'd appreciate it if you went over the changed test one more time

18:06 <hkaiser> what errors btw? circle is green?

18:06 <K-ballo> circle does 14 or 17, right?

18:06 <hkaiser> didn't think so, anyways - where can I see those errors?

18:07 <K-ballo> my last PR introduced some (bad in 11) uses of the capture macros along with [=] default capture

18:07 <K-ballo> http://cdash.cscs.ch//viewBuildError.php?buildid=41404

18:07 <hkaiser> darn

18:07 <hkaiser> #ifdef it?

18:08 <K-ballo> no, it has to be rewritten proper

18:08 <K-ballo> manually capturing entities is one possibility

18:08 <hkaiser> right

18:08 * K-ballo wonders if we could have an inspect check for bad uses of capture macros

18:09 <hkaiser> can you manually capture args... in 11?

18:09 <K-ballo> uhm, by value I think yes?

18:09 <hkaiser> the whole point of that dance you removed was to avoid capturing the template parameter pack

18:09 <K-ballo> which dance?

18:10 <hkaiser> the wrapping into a tuple which then was captured

18:10 <K-ballo> I think we are looking at different errors?

18:11 <K-ballo> I only see one and it comes from when_each, no variadics

18:11 <hkaiser> ahh, ok then

18:11 <hkaiser> I was thinking about a different capture

18:11 <K-ballo> hah, and the silliest thing is the [=] doesn't actually capture anything

18:11 <hkaiser> lol

18:11 <hkaiser> well, then the solution is easy

18:12 <K-ballo> trivial...
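
To illustrate the two options mentioned above (naming captures by hand, and dropping a `[=]` that captures nothing), here is a minimal C++11 sketch. It is not the actual when_each code or its fix; `make_greeter` and `add_one` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical example; names are illustrative, not taken from HPX.
std::function<std::string()> make_greeter(std::string name)
{
    assert(!name.empty());  // illustrative precondition only
    // C++11: list the captured entity explicitly (copied into the closure).
    // C++14 would also allow an init-capture: [name = std::move(name)].
    return [name]() { return "hello " + name; };
}

int add_one(int x)
{
    // The body uses only its parameter, so the capture list can be empty:
    // writing [=] here would not capture anything either.
    auto f = [](int v) { return v + 1; };
    return f(x);
}
```

An explicit capture list makes it visible at the call site what the closure copies, which is easier to check against a C++11-only compiler than a default capture.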

18:15 <K-ballo> and then there's this: https://github.com/STEllAR-GROUP/hpx/blob/master/src/runtime/agas/addressing_service.cpp#L1292

18:18 <hkaiser> heh, some leftover

18:23 <Yorlik> I think I'm going to love C++ templates (first time use, just now) as much as macros :)

18:23 <K-ballo> that's either wrong or wrong

18:24 <Yorlik> Something is always wrong somewhere.

18:26 <simbergm> K-ballo: you found all the error messages you need?

18:28 <K-ballo> simbergm: maybe... is it possible to stop cdash from reinterpreting `<` as html tag opening? half the templated output is missing

18:28 <K-ballo> I think it is all there, I'm just having a hard time deciphering it

18:29 <K-ballo> I'm trying to reconstitute it from the DOM tree

18:29 <simbergm> K-ballo: ugh, good question

18:29 <simbergm> jbjnr_: ?

18:30 <simbergm> I don't know if it can be changed (easily)

18:31 <K-ballo> additionally, that guided thread pool executor, is it not C++14 only? the code uses several C++14 features

18:32 <simbergm> K-ballo: yeah, it should be, might be missing some ifdefs

18:32 <simbergm> (also jbjnr_ ^)

18:32 <K-ballo> nevermind, I assumed I was still looking at C++11 failures, but I see the report mentions C++17

18:35 mdiers_ has quit [Remote host closed the connection]

18:56 <heller_> detan: there are a few hiccups with compiling in cuda mode right now...

18:56 <heller_> There's a PR which addresses this, but it's not quite ready yet...

19:02 aserio has joined #ste||ar

20:10 daissgr has quit [Ping timeout: 244 seconds]

20:18 hkaiser has quit [Quit: bye]

20:31 K-ballo has quit [Ping timeout: 272 seconds]

20:32 daissgr has joined #ste||ar

21:20 hkaiser has joined #ste||ar

22:32 aserio has quit [Quit: aserio]

22:36 daissgr has quit [Ping timeout: 255 seconds]

22:58 daissgr has joined #ste||ar

23:10 daissgr has quit [Ping timeout: 255 seconds]