#ste||ar on 2017-09-12 — irc logs at irclog.cct.lsu.edu

2017-05-17 13:54 aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:18 StefanLSU has joined #ste||ar

00:23 StefanLSU has quit [Quit: StefanLSU]

00:27 StefanLSU has joined #ste||ar

00:40 denis_blank has quit [Quit: denis_blank]

00:43 StefanLSU has quit [Quit: StefanLSU]

01:20 hkaiser has quit [Quit: bye]

02:12 jbjnr_ has joined #ste||ar

02:12 patg has joined #ste||ar

02:12 patg is now known as Guest44047

02:13 jbjnr has quit [Ping timeout: 246 seconds]

02:13 jbjnr_ is now known as jbjnr

02:38 Guest44047 has quit [Read error: Connection reset by peer]

02:50 patg_ has joined #ste||ar

03:00 K-ballo has quit [Quit: K-ballo]

03:32 StefanLSU has joined #ste||ar

03:34 StefanLSU has quit [Client Quit]

03:45 rod_t has joined #ste||ar

05:02 taeguk has joined #ste||ar

05:06 <taeguk> congraturations to 700 stars! :)

05:14 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

05:33 taeguk has quit [Quit: Page closed]

05:33 rod_t has joined #ste||ar

05:35 <jbjnr> yay \o\ 700!

05:35 <jbjnr> ops. that was a \o/

05:42 bikineev has joined #ste||ar

05:44 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

05:52 <github> [hpx] biddisco created numa_balanced (+1 new commit): https://git.io/v51dj

05:52 <github> hpx/numa_balanced 62f72e3 John Biddiscombe: Add numa-balanced mode to hpx::bind, spread cores over numa domains

05:52 <github> [hpx] biddisco merged numa_balanced into throttle_cores: https://git.io/v51Ff

06:05 <jbjnr> oops

06:06 <github> [hpx] biddisco force-pushed throttle_cores from 62f72e3 to bb9f490: https://git.io/v5wWj

06:06 <github> hpx/throttle_cores bb9f490 John Biddiscombe: Add numa-balanced mode to hpx::bind, spread cores over numa domains

06:06 <jbjnr> force pushed a fix to that last commit cos I renamed something without checking. sorry

06:08 <github> [hpx] biddisco deleted numa_balanced at 62f72e3: https://git.io/v51FN

06:08 <github> [hpx] biddisco force-pushed throttle_cores from bb9f490 to 2924fda: https://git.io/v5wWj

06:08 <github> hpx/throttle_cores 2924fda John Biddiscombe: Add numa-balanced mode to hpx::bind, spread cores over numa domains

06:25 AnujSharma has joined #ste||ar

06:29 <heller> jbjnr: hijacking my PR, hm?

06:30 <jbjnr> not hijacking, but enhancing. I will remove commits if you don't like it

06:30 <heller> np

06:30 <jbjnr> I will make a test to keep hartmut happy today

06:30 <heller> thanks

06:31 <heller> I think it would be wise to keep the balanced-numa distribution seperate to the other fixes

06:31 <jbjnr> in my calendar it says skye call this afternoon - did we agree to that or was I being presumptive?

06:31 <heller> i think we agreed

06:31 <jbjnr> I renamed it numa-balanced because otherwise the partit thingy chokes

06:32 <heller> ok

06:32 <heller> I am fine with the name

06:32 <jbjnr> ok. I will move the commit back to my branch and remove it from the other PR

06:32 <heller> thanks, smaller PRs are easier to handle ;)

06:33 <github> [hpx] biddisco force-pushed throttle_cores from 2924fda to de6c7d7: https://git.io/v5wWj

06:34 <github> [hpx] biddisco created numa_balanced (+1 new commit): https://git.io/v51NW

06:34 <github> hpx/numa_balanced a71bee0 John Biddiscombe: Add numa-balanced mode to hpx::bind, spread cores over numa domains

06:34 <heller> I want to have the SLURM PR in first though ...

06:34 <heller> MBGA

06:35 <jbjnr> ?

06:35 <heller> Make Buildbot Green Again

06:36 <github> [hpx] biddisco opened pull request #2900: Add numa-balanced mode to hpx::bind, spread cores over numa domains (master...numa_balanced) https://git.io/v51Nu

06:36 <heller> http://rostam.cct.lsu.edu/builders/hpx_clang_3_8_boost_1_59_centos_x86_64_debug/builds/17/steps/run_unit_tests/logs/tests.unit.parallel.countif%20%28Timeout%29

06:36 <heller> there are failures like this all over the place

06:36 <heller> #2895 fixes this

06:39 <jbjnr> not good.

06:39 <heller> the PR or the failure?

06:39 <jbjnr> the failure and state of buildbot

06:39 <heller> yes

06:40 <heller> that's why I want to prioritize the PRs that attempt to fix the state

07:32 <github> [hpx] sithhell closed pull request #2895: Fixing SLURM environment parsing (master...fix_slurm) https://git.io/v5X8u

07:32 <github> [hpx] sithhell deleted fix_slurm at 880aa6a: https://git.io/v51pj

07:47 david_pfander has joined #ste||ar

08:23 <heller> jbjnr: made two comments on #2900

09:16 <github> [hpx] biddisco pushed 1 new commit to numa_balanced: https://git.io/v5MtW

09:16 <github> hpx/numa_balanced 958f5dc John Biddiscombe: Fix numa-balanced when odd number of threads requested

09:31 <heller> jbjnr: look at buildbot now!

09:31 <heller> the return of the green coming soon!

09:32 <heller> I am seeing one genuine failing test, and one related to a missing PAPI counter on a specific node

10:02 bikineev has quit [Read error: Connection reset by peer]

10:11 mcopik has joined #ste||ar

10:15 <mcopik> has anyone seen this problem appearing during HPX shutdown? https://gist.github.com/mcopik/f40353877e8b68496d759c3dcf48195d

10:15 <mcopik> looks like coroutine passes a nullptr to set_self which does not make sense

10:16 <mcopik> I checked for memory problems but I have run out of ideas

10:16 <mcopik> any ideas what could cause such problems?

10:20 bikineev has joined #ste||ar

10:23 <heller> mcopik: never saw this before

10:23 <heller> mcopik: how would I reproduce it?

10:25 <mcopik> heller: it's happening in my sycl stuff and the primary cause seems to be OpenCL callbacks

10:25 <heller> hmm

10:25 <mcopik> it used to work fine but this problem started appearing after merging with HPX master and updating SYCL compiler

10:25 <heller> could be stack overflows

10:26 <heller> does it also happen in debug builds?

10:26 <mcopik> I was 100% sure that callbacks are working correctly and it looks that the future_data is updated and destroyed correctly

10:26 <mcopik> heller: I'm running with larger values for -hpx:ini=hpx.stacks.small_size, no help

10:26 <mcopik> I'll try debug build

10:27 <mcopik> and custom malloc

10:33 <mcopik> heller: funny thing, I also started to get random deadlocks on shutdown. for some reason, Intel OpenCL throws errors randomly, even if the only action is creating a context and device, and stacktrace obtained after interrupting suggests that SYCL issues termination, HPX termination_handler is called and then there is a deadlock in some function called by the termination_handler. but it should not call anything except std::abort

10:33 <mcopik> I wonder if these problems could be connected

10:38 <github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/v5MZD

10:38 <github> hpx/gh-pages d33c54e StellarBot: Updating docs

10:38 <mcopik> heller: error appearing in debug build. https://gist.github.com/mcopik/ba0dd862970f6543f7406f1dfadc10a5 assertion in set_self is catching the nullptr

10:43 <github> [hpx] sithhell created partition_vector_fix (+1 new commit): https://git.io/v5MnW

10:43 <github> hpx/partition_vector_fix a77d93f Thomas Heller: Fixing partitioned_vector creation...

10:44 <github> [hpx] sithhell opened pull request #2901: Fixing partitioned_vector creation (master...partition_vector_fix) https://git.io/v5MnE

10:44 <heller> mcopik: regarding the deadlock, is this in a distributed run or single locality?

10:45 <mcopik> heller: single locality

10:45 <heller> ok

10:45 <heller> now to the nullptr thingy

10:45 <heller> can you rerun with --hpx:attach-debugger=exception?

10:45 <mcopik> heller: yes, sure

10:46 <heller> then attach the debugger to the process, look for the thread that sits in nanosleep and show me the backtrace?

10:46 <heller> it looks like you are trying to call HPX thread functionality outside of an HPX thread

10:46 <mcopik> heller: it won't catch on exception, I have to attach to startup

10:47 <mcopik> heller: well, the OpenCL callback is using both hpx::register_thread and hpx::applier::register_thread_nullary to set future state

10:47 <heller> ok

10:48 <heller> ok, when it doesn't attach

10:48 <heller> run gdb your_app

10:48 <heller> inside the gdb shell, make sure to have catch throw

10:48 <heller> so that you get the exception when it is thrown and not the report

10:49 <mcopik> sure

10:49 <mcopik> heller: I caught an exception in finalize()

10:51 <heller> ok, what's the backtrace?

10:51 <mcopik> heller: threads https://gist.github.com/mcopik/c9aeb021370155f2115a5e88d0e06580

10:51 bikineev has quit [Remote host closed the connection]

10:52 <heller> that's not the backtrace

10:52 <heller> that's where all threads are at the moment

10:52 <heller> bt

10:52 <mcopik> I know

10:52 <mcopik> heller: https://gist.github.com/mcopik/78af2029eda657e5df0ccdf816ce5a31

10:53 <heller> #8 0x00007ffff5845c27 in hpx::detail::throws_if (ec=..., errcode=hpx::invalid_status, msg="this function can be called from an HPX thread only", func="hpx::finalize",

10:54 <heller> /home/mcopik/Projekty/stellar/sycl/hpx_compute/hpx_backup/examples/compute/sycl/vector_functionality.cpp:67

10:54 <heller> this seems to be the call site

10:54 <heller> funny, it is from outside of hpx_main

10:55 <mcopik> heller: but isn't finalize called inside hpx_main?

10:55 <mcopik> looking at the backtrace?

10:55 <heller> yes

10:56 <heller> and it looks like it is indeed inside an hpx thread

10:56 <mcopik> heller: !threads::get_self_ptr() is failing

10:56 <heller> something went horribly wrong

10:56 <mcopik> heller: I have to leave now, I'll be back in an hour

10:56 <heller> can you post the content of that file?

10:57 <mcopik> heller: which one?

10:57 <heller> the test that fails

10:57 <heller> compute/sycl/vector_functionality.cpp

10:58 <mcopik> heller: I can try reproduce this with pure OpenCL or make the smallest possible case running with SYCL

10:58 <mcopik> but then you have to use their compiler

10:58 <mcopik> heller: https://gist.github.com/mcopik/06220318f8d3b88ef732b74e14084565

10:59 <heller> mcopik: next thing you should try is to comment out the sycl pieces bit by bit

10:59 <mcopik> heller: yes, Thomas, I've done it

10:59 <mcopik> heller: it works without futures

10:59 <heller> so which one leads to the error?

11:00 <mcopik> heller: creating a future after enqueue of OpenCL kernel

11:00 <mcopik> obtaining a future before always works

11:00 <heller> ok

11:00 <mcopik> what I noticed is that when creating future after the kernel enqueue

11:00 <mcopik> the callback is always called immediately and it's called before creating the future

11:01 <mcopik> however, I'm 100% sure that the future data is not destroyed as long as it does not go out of scope

11:01 <heller> it looks like there is some strange buffer overflow or similar going on then

11:01 <heller> get_self_ptr() is stored inside a TLS segment

11:02 <heller> you said it worked before?

11:02 <heller> and using target.synchronize() is working as expected?

11:02 <mcopik> heller: yes to first question

11:04 <mcopik> heller: yes, pure synchronize works. I ran it hundreds times and no segfaults have appeared

11:04 <heller> hmmm

11:04 <heller> what do you do inside get_future?

11:04 <heller> can you upload the code somewhere?

11:04 <mcopik> heller: should I perhaps build without native TLS? I recall that Hartmut mentioned it might cause problems when we've been working with AMD stuff

11:05 <heller> you could try that

11:06 <mcopik> heller: https://gist.github.com/mcopik/3ebe75d185dff65275117cd5596d9dd1

11:08 <mcopik> heller: I'll be back soon

11:14 hkaiser has joined #ste||ar

11:15 <hkaiser> heller: congrats!

11:16 jaafar has joined #ste||ar

11:24 <heller> hkaiser: to you too!

11:24 <hkaiser> it

11:24 <hkaiser> s your award

11:24 <heller> see pm please

11:25 <hkaiser> well deserved

11:43 <jbjnr> Congratulations heller - you earned it

11:53 bikineev has joined #ste||ar

12:07 pree has joined #ste||ar

12:15 jaafar has quit [Read error: Connection reset by peer]

12:19 <heller> hkaiser: jbjnr: thanks!

12:19 patg_ has quit [Read error: Connection reset by peer]

12:19 <heller> hkaiser: look at the buildbot!

12:19 <heller> hkaiser: I fixed the partitioned_vector failure as well now

12:19 <heller> and have a handle onto the service executor fix

12:21 patg has joined #ste||ar

12:22 patg is now known as Guest31423

12:22 <hkaiser> heller: perfect! thanks a lot

12:23 jkleinh has joined #ste||ar

12:24 bikineev has quit [Ping timeout: 240 seconds]

12:28 <heller> hkaiser: there is also the failing papi test

12:28 <heller> http://rostam.cct.lsu.edu/builders/hpx_gcc_7_boost_1_65_centos_x86_64_release/builds/21/steps/run_regression_tests/logs/tests.regressions.performance_counters.papi_counters_active_interface%20%280.60%20sec%29

12:29 <hkaiser> yah, apparently the PAPI counter we use is not available everywhere

12:29 <heller> the node on which the test runs doesn't have the counter

12:29 <hkaiser> heller: is there a counter available always?

12:29 <heller> two options: 1) exclude the node as a test runner 2) Make the test pass if the counter is not available

12:29 <heller> there might be

12:29 <hkaiser> this test is not to verify the functioning of this particular PAPI counter

12:30 <heller> but then we'd have to come up with a meaningful test ;)

12:30 <hkaiser> right

12:31 <heller> currently, we count the store instructions and check if the counter returns a value which is close enough

12:31 <heller> checking different systems, it is almost impossible to come up with a counter that's available everywhere :/

12:33 <hkaiser> heller: we could have a fallback - if one is not available, use another

12:34 <heller> let me think about it

12:35 <hkaiser> heller: or list of counters out of which one has to pass - it's about verfying the PAPI extension is functional, nothing else

12:35 pree has quit [Read error: Connection reset by peer]

12:37 <heller> sure

12:50 <jkleinh> what is the status of the executor/executor traits interface?

12:50 <jkleinh> Can I write against https://github.com/executors/issaquah_2016/blob/master/explanatory.md and expect things to mostly work or should I stick to executor_traits for now?

12:51 pree has joined #ste||ar

12:51 K-ballo has joined #ste||ar

12:51 <hkaiser> jkleinh: this is in heavy flux right now

12:51 <hkaiser> we have not implemented any of this yet

12:52 bikineev has joined #ste||ar

12:53 <hkaiser> the traits are functional, but will go away - sorry, you caught us in the middle of a major change here

12:54 <jkleinh> ok, no problem. So you'd suggest writting against executor_traits and then adapting when the new interface is stabilized?

12:56 <jkleinh> Also is there an implementation of an executor that can distribute work over multiple localities somewhere in hpx or are all implemented executors strictly local?

12:59 <hkaiser> jkleinh: we have the distribution_policy_executor

13:00 <hkaiser> you give it a distribution policy and it will distribute the work using that

13:00 <hkaiser> not tested too well, though - so any feedback is appreciated

13:02 pree has quit [Read error: Connection reset by peer]

13:12 <hkaiser> jkleinh: this is where we could use input and real use cases. nothing is set in stone

13:13 <github> [hpx] sithhell created fix_service_executor (+1 new commit): https://git.io/v5MVZ

13:13 <github> hpx/fix_service_executor 2c6e61f Thomas Heller: Fixing service_executor...

13:13 <github> [hpx] sithhell opened pull request #2902: Fixing service_executor (master...fix_service_executor) https://git.io/v5MV8

13:17 <jkleinh> cool that looks like the right thing. We are currently porting a sort of large quantum monte carlo code to hpx. I'll definitely let you know any stumbling blocks we run into.

13:17 diehlpk has joined #ste||ar

13:18 <heller> hkaiser: service_executor under control!

13:18 bikineev has quit [Ping timeout: 240 seconds]

13:19 pree has joined #ste||ar

13:19 <hkaiser> heller: what was the problem?

13:19 <heller> https://github.com/STEllAR-GROUP/hpx/pull/2902

13:20 <heller> it's running smoothly with that change

13:22 <hkaiser> nice catch

13:22 <hkaiser> thanks!

13:31 pree has quit [Ping timeout: 240 seconds]

13:33 <heller> can't reproduce the ignore_while_locked_1485 hang yet :/

13:37 hkaiser has quit [Quit: bye]

13:40 <diehlpk> heller, when should we skype today?

13:40 <diehlpk> zack and I contributed to the paper.

13:41 aserio has joined #ste||ar

13:42 rod_t has joined #ste||ar

13:43 pree has joined #ste||ar

13:44 aserio has quit [Read error: Connection reset by peer]

13:44 rod_t has quit [Client Quit]

13:45 aserio has joined #ste||ar

13:48 <heller> diehlpk: can we move it to tomorrow please?

13:50 <diehlpk> Ok, remember that the deadline is this Friday

13:50 <diehlpk> And I do not have time to work on the paper tomorrow or Thursday.

13:54 <heller> ok

13:54 <heller> I'll polish it tomorrow

13:58 <jbjnr> diehlpk do you still need input/help?

14:00 <diehlpk> jbjnr, Yes, you could prrofread the introduction and edit it

14:00 <jbjnr> ok. I'll look at it this evening

14:00 <diehlpk> Section 4 is finished too.

14:00 <jbjnr> heller: sorry completely forgot other skype call. it can wait.

14:00 <diehlpk> zbyerly, Will finish section 3 today.

14:01 <diehlpk> I will finish and conclusion and outllok today.

14:02 <diehlpk> jbjnr, Would be great if you can read introduction and section 4

14:05 <jbjnr> no prpblem

14:05 pree has quit [Read error: Connection reset by peer]

14:06 <heller> jbjnr: no problem, I forgot as well

14:06 <heller> jbjnr: let's try tomorrow

14:07 hkaiser has joined #ste||ar

14:09 diehlpk has quit [Ping timeout: 264 seconds]

14:12 rod_t has joined #ste||ar

14:21 pree has joined #ste||ar

14:24 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

14:24 rod_t has joined #ste||ar

14:27 <jbjnr> can anyone think of a reason why boost log/locale/graph/regex are linking to cuda on our cray, but other libs do not (pulled in by cray wrappers, but I'm not sure why)

14:34 pree has quit [Read error: Connection reset by peer]

14:38 <mcopik> heller: same error with TLS disabled

14:38 <mcopik> but now I see something really, really strange. two target futures are created but the callback is executed three times

14:39 AnujSharma has quit [Ping timeout: 248 seconds]

14:48 pree has joined #ste||ar

14:49 <mcopik> heller: no, it's different. when the callback is executed between a creation of future_state and a corresponding future, the function passed to hpx::applier::register_thread_nullary is not executed at all?

14:55 aserio has quit [Quit: aserio]

14:55 aserio has joined #ste||ar

14:56 pree has quit [Read error: Connection reset by peer]

14:57 hkaiser has quit [Read error: Connection reset by peer]

15:00 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

15:01 zbyerly_ has joined #ste||ar

15:01 rod_t has joined #ste||ar

15:12 pree has joined #ste||ar

15:18 pree has quit [Read error: Connection reset by peer]

15:27 zbyerly_ has quit [Ping timeout: 240 seconds]

15:31 david_pfander has quit [Ping timeout: 248 seconds]

15:36 pree has joined #ste||ar

15:36 EverYoung has joined #ste||ar

15:36 pree has quit [Remote host closed the connection]

15:38 pree has joined #ste||ar

15:38 bibek_desktop has joined #ste||ar

15:41 <jkleinh> is the distribution_policy concept defined somewhere?

15:42 <jkleinh> default_distribution_policy has an async member function which is used by distribution_policy_executor but this method is missing from binpacking_distribution_policy

15:44 <jkleinh> also the get_next_target method used by default_distribution_policy always returns the first locality

15:44 pree has quit [Read error: Connection reset by peer]

15:46 <jkleinh> I'm not sure if this is desired behavior. Based on how the create method works I would expect the async method of default_distribution_policy to iterate over localities cycliclly

15:47 pree has joined #ste||ar

15:55 <zbyerly> diehlpk_work, i'm almost done

15:56 <zbyerly> diehlpk_work, do you mind if i proofread everything for grammar / spelling ?

16:02 mbremer has joined #ste||ar

16:09 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

16:09 pree has quit [Remote host closed the connection]

16:15 rod_t has joined #ste||ar

16:18 rod_t has quit [Client Quit]

16:22 pree has joined #ste||ar

16:25 hkaiser has joined #ste||ar

16:27 rod_t has joined #ste||ar

16:28 hkaiser has quit [Read error: Connection reset by peer]

16:44 <diehlpk_work> zbyerly, Thanks. Sure go for it

16:44 pree has quit [Read error: Connection reset by peer]

16:44 <diehlpk_work> I will extend the conclusion and outllok soon

16:56 Matombo has joined #ste||ar

16:58 pree has joined #ste||ar

16:59 EverYoung has quit [Ping timeout: 246 seconds]

17:00 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

17:00 <zbyerly> diehlpk_work, are we going to use any plots?

17:00 <diehlpk_work> I asusem that heller will provide some figures on section 2.

17:00 <diehlpk_work> *assume

17:01 <heller> You assume correctly

17:02 rod_t has joined #ste||ar

17:02 aserio has quit [Ping timeout: 264 seconds]

17:04 EverYoung has joined #ste||ar

17:05 rod_t has quit [Client Quit]

17:07 hkaiser has joined #ste||ar

17:10 EverYoun_ has joined #ste||ar

17:14 EverYoung has quit [Ping timeout: 260 seconds]

17:32 rod_t has joined #ste||ar

17:37 jkleinh has quit [Quit: Page closed]

17:39 eschnett has quit [Quit: eschnett]

17:49 StefanLSU has joined #ste||ar

17:49 EverYoun_ has quit [Remote host closed the connection]

17:50 Matombo has quit [Ping timeout: 240 seconds]

17:50 EverYoung has joined #ste||ar

17:52 <zbyerly> diehlpk_work, i added another 2 paragraphs, right now I am at about 1.25 pages, I have two more things to add

17:52 <diehlpk_work> Ok, we still can shorten things later

17:53 <zbyerly> diehlpk_work, yes, I was about to say I will go over that limit most likely, but we can trim down later

17:54 jkleinh has joined #ste||ar

18:05 aserio has joined #ste||ar

18:13 pree has quit [Quit: AaBbCc]

18:15 Matombo has joined #ste||ar

18:25 StefanLSU has quit [Quit: StefanLSU]

18:39 Matombo has quit [Remote host closed the connection]

18:40 Matombo has joined #ste||ar

18:42 <mcopik> heller: what has changed that it was no longer necessary to modify future states from an HPX thread? https://github.com/STEllAR-GROUP/hpx/commit/ef3ef37bb7f0d35ff9ecf93f3e6b35e4727e1cc0#diff-d92fed1e98aec89fb2ce4882f1a65c8a

18:56 <hkaiser> mcopik: the lock used can be safely acquired from both an hpx thread and a non-hpx-thread

18:58 Matombo has quit [Remote host closed the connection]

19:00 akheir has joined #ste||ar

19:00 Matombo has joined #ste||ar

19:01 <heller> mcopik: does this change anything?

19:01 <jbjnr> jesus christ - who wrote that introduction? it's very strange!

19:01 <heller> hkaiser: will you be in Berkeley next week?

19:02 <jbjnr> zbyerly: diehlpk_work what is the page limit?

19:03 <zbyerly> 10 i think

19:03 <zbyerly> jbjnr, jesus christ did not write any of the sections AFAIK

19:04 akheir has quit [Remote host closed the connection]

19:04 <jbjnr> zbyerly: indeed

19:05 <jbjnr> why you worried about space there's only 5 in there so far

19:05 eschnett has joined #ste||ar

19:06 Matombo has quit [Remote host closed the connection]

19:06 <jbjnr> can I edit it freely, or will I conflict with others?

19:07 Matombo has joined #ste||ar

19:07 <mcopik> heller: I'm asking because I'm not really knowledgeable about what HPX and non-HPX threads are allowed to do

19:07 Matombo has quit [Read error: Connection reset by peer]

19:09 <hkaiser> heller: no

19:10 <zbyerly> jbjnr, i think t. heller is going to bring the figures

19:10 Matombo has joined #ste||ar

19:10 <hkaiser> jbjnr: go ahead

19:10 <hkaiser> mcopik: sure - the old code was not really necessay, so we removed it

19:11 <hkaiser> heller: will you come by for a day or two afterwards?

19:11 <mcopik> heller: when running as an HPX thread, I can't confirm that the lambda passed to register_thread_nullary is ever executed in the case when callback is executed before creating hpx::future

19:11 <zao> « On the third day He returned from his \write18 and emitted a \section. »

19:12 <hkaiser> ROFL

19:12 <hkaiser> I hope that didn't violate anybodies religious feelings

19:16 <mcopik> heller: and I get the failed assertion when executing get() on future (non-HPX thread). and for the case, when the future state is modified from a non-HPX thread, f.get() somehow is successful but I get the failed assertion in coroutine self_set

19:17 <zbyerly> hkaiser, i don't think it's offensive to imply that Jesus would use LaTeX if the Bible were written today

19:17 <hkaiser> yah, you can't call get() on a non-hpx-thread

19:18 <hkaiser> get might suspend inside the future, set will not

19:18 <heller> mcopik: that makes perfect sense indeed

19:18 <heller> hkaiser: hmm, I could do that

19:19 <hkaiser> heller: I'd enjoy that!

19:19 <heller> hkaiser: talk is Thursday, I could fly to br, and we could talk Friday.

19:20 <hkaiser> nice

19:20 <heller> Let's do it then

19:20 <hkaiser> you won't get to br from SF in the afternoon/evening, though

19:20 <hkaiser> except over night - but I wouldn't suggest doing this

19:20 <heller> This will be a 3k trip for the days, nice.

19:20 <heller> Hmm

19:20 <zbyerly> heller, FYI there are non-stop flights from SF to NOLA

19:21 <hkaiser> ahh yes, I could pick you up ther

19:21 <hkaiser> e

19:21 <heller> I was going to be back on Saturday, since I need to be in Stockholm on Monday

19:21 <heller> zbyerly: good to know!

19:21 <hkaiser> so come before the talk?

19:21 <hkaiser> come monday, fly to SF Wed

19:24 <hkaiser> heller: otoh, don't sweat it - np if it doesn't work out

19:24 <mcopik> heller: I can't reproduce the issue when hpx::register_thread and unregister_thread is not performed

19:25 <mcopik> perhaps OpenCL setCallback fires the callback immediately, within the HPX thread which called setCallback?

19:25 <hkaiser> mcopik: those calls shouldn't hurt, they can only prevent problems, not add them

19:25 <mcopik> and then, when callback is finished, this HPX thread calls unregister_thread

19:26 <mcopik> couldn't that lead to my problems if an HPX thread tries to unregister himself?

19:26 <hkaiser> yah, hpx threads shouldn't call [un]register_thread, that will blow things up

19:26 <hkaiser> note to self - I should check that those are not called on hpx threads

19:26 <mcopik> hkaiser: and I always assumed that callback can be called only by a foreign thread coming from OpenCL library

19:26 <mcopik> which might not be true

19:27 <hkaiser> nod

19:27 <mcopik> shit

19:27 <mcopik> two days of debugging

19:27 <mcopik> I just hope that's the true cause

19:27 <hkaiser> mcopik: check the result of hpx::threads::get_self_ptr(), it will be nullptr only on a non-hpx thread

19:30 <heller> hkaiser: I'll sleep over it

19:30 <mcopik> hkaiser: yes, the ptr is not null

19:31 <mcopik> heller: many thanks for helping me today, I think I solved it

19:31 <heller> Great!

19:31 <heller> I didn't do anything ;)

19:31 <mcopik> now only I have to solve deadlocks on Intel OpenCL

19:32 <heller> hkaiser: I'll see what flight options I'll get

19:32 <heller> Your free either way?

19:32 <hkaiser> heller: cool

19:32 <hkaiser> yes, I'll make it happen

19:32 <heller> That is Wednesday or Friday?

19:32 <hkaiser> heller: yes

19:32 <heller> Great

19:32 <hkaiser> Tuesday or Friday, I guess

19:33 <hkaiser> not sure what flights there are over night

19:33 <heller> Yeah, let's see

19:50 bikineev has joined #ste||ar

20:13 <mbremer> @hkaiser: I finally have some profiling data. Would you have some time this week to sit down and talk me through it?

20:15 <hkaiser> mbremer: sure, absolutely

20:16 <mbremer> How would Wednesday or Thursday work?

20:18 <mbremer> Also, do you have a tacc account or can I tar up these results and send them to you?

20:18 <mbremer> The results were run using vtune17 update 4.

20:20 EverYoun_ has joined #ste||ar

20:21 <hkaiser> mbremer: I might have a tacc account, but sending the files should work too

20:21 <hkaiser> mbremer: Wed/Thu should work yah - gtg now, though

20:21 <hkaiser> ttyl

20:21 hkaiser has quit [Quit: bye]

20:23 EverYoun_ has quit [Remote host closed the connection]

20:23 EverYoung has quit [Ping timeout: 246 seconds]

20:23 EverYoung has joined #ste||ar

20:29 Matombo has quit [Remote host closed the connection]

20:30 Matombo has joined #ste||ar

20:32 jaafar has joined #ste||ar

20:42 eschnett has quit [Quit: eschnett]

20:50 rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

20:50 jaafar has quit [Ping timeout: 248 seconds]

21:07 hkaiser has joined #ste||ar

21:08 jfbastien_ has joined #ste||ar

21:19 <github> [hpx] aserio pushed 2 new commits to new_people: https://git.io/v5DPt

21:19 <github> hpx/new_people 847026d aserio: adding Denis Blank, Ajai V. George, and Taeguk Kwon to the People sec.

21:19 <github> hpx/new_people 7892289 aserio: Adding people to people.qbk

21:38 <github> [hpx] aserio pushed 1 new commit to new_people: https://git.io/v5D1O

21:38 <github> hpx/new_people 847a4a6 aserio: More updates... fixing Bryce's email

21:40 <github> [hpx] aserio opened pull request #2903: Documentation Updates-- Adding New People (master...new_people) https://git.io/v5D1u

21:43 EverYoun_ has joined #ste||ar

21:45 aserio has quit [Quit: aserio]

21:46 EverYoung has quit [Ping timeout: 240 seconds]

21:50 EverYoun_ has quit [Remote host closed the connection]

21:51 EverYoung has joined #ste||ar

22:13 jkleinh has quit [Ping timeout: 260 seconds]

22:26 diehlpk_work has quit [Quit: Leaving]

23:08 mcopik has quit [Remote host closed the connection]

23:12 jbjnr has quit [Remote host closed the connection]

23:30 Matombo has quit [Remote host closed the connection]

23:32 EverYoung has quit [Ping timeout: 246 seconds]

23:37 diehlpk has joined #ste||ar

23:48 EverYoung has joined #ste||ar

23:50 EverYoun_ has joined #ste||ar

23:53 EverYoung has quit [Ping timeout: 255 seconds]