aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 248 seconds]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 260 seconds]
StefanLSU has joined #ste||ar
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
Matombo has quit [Ping timeout: 248 seconds]
StefanLSU has quit [Quit: StefanLSU]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 255 seconds]
eschnett has joined #ste||ar
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
StefanLSU has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
eschnett has quit [Quit: eschnett]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
hkaiser has quit [Quit: bye]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
diehlpk has joined #ste||ar
zbyerly_ has quit [Ping timeout: 240 seconds]
diehlpk has quit [Remote host closed the connection]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 248 seconds]
AnujSharma has joined #ste||ar
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 240 seconds]
Rodario has joined #ste||ar
Rodario1 has quit [Ping timeout: 240 seconds]
Rodario1 has joined #ste||ar
Rodario has quit [Ping timeout: 248 seconds]
Rodario1 has quit [Quit: Leaving.]
jaafar has joined #ste||ar
jaafar has quit [Ping timeout: 240 seconds]
<jbjnr>
heller: yt? when you are available, can I have a quick Skype chat about the stream benchmark and making it work with the RP? I want to kill off some of the block_executor stuff and replace it with pool executors etc etc
<jbjnr>
it is an executor, but it uses a CPU mask to set itself up
<jbjnr>
and I'm not sure how it interacts with the schedulers etc and the thread pool
<jbjnr>
I want to get rid of it.
<hkaiser>
that's a left-over, most probably - I have not touched the executors at all after working on the RP stuff
<jbjnr>
it seems to be a 'special case' executor though
<hkaiser>
by all means if we can subsume its functionality.. go ahead
<hkaiser>
pls update all the code using it, though
<jbjnr>
ok. I'm puzzled by its operation though. I wanted to ask about its internals
<hkaiser>
ok
Matombo has quit [Ping timeout: 246 seconds]
<hkaiser>
it hosts several executors, one for each numa domain - that's an artifact
<hkaiser>
it was the only way before the rp to handle things
<jbjnr>
yeah. I want to remove these executors, but I'm not sure how they work internally - how they interact with the thread pool.
<jbjnr>
sorry. people coming in and out and distracting me ...
<hkaiser>
jbjnr: the executors are 'attached' to the pools
<hkaiser>
executors are very thin objects by design, they don't own anything, just provide access to an underlying pool
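A minimal sketch of the "thin executor" design hkaiser describes, assuming a hypothetical Pool type with a submit() member; this is illustrative, not the actual HPX executor class:

    #include <utility>

    // Illustrative only: a "thin" executor that owns nothing and merely
    // forwards work to the pool it was created against.
    template <typename Pool>
    struct thin_pool_executor
    {
        Pool* pool_;  // non-owning; the pool must outlive the executor

        explicit thin_pool_executor(Pool& pool) : pool_(&pool) {}

        template <typename F, typename... Ts>
        void post(F&& f, Ts&&... ts)
        {
            // all scheduling decisions live in the pool, not here
            pool_->submit(std::forward<F>(f), std::forward<Ts>(ts)...);
        }
    };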
taeguk has joined #ste||ar
<jbjnr>
hkaiser: I understand about the executors - what is troubling me is that the target is defined solely by a bitmap mask for PUs - the executor is then bound to the target and when it launches a task, it uses hpx::parallel::execution::blah-blah - but then I am not sure how it interacts with the schedulers and the normal executors etc etc
<jbjnr>
it seems like it could hijack a thread pool by putting tasks on it, bypassing the usual task creation mechanism
<hkaiser>
the pool is creating the tasks, no?
<jbjnr>
the executor just calls parallel::execution::async_execute(executors_[current], std::forward<F>(f), std::forward<Ts>(ts)...);
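For context, a hedged reconstruction of the block_executor behaviour under discussion: one inner executor per NUMA domain, with tasks handed out round-robin via the call quoted above. The names and the member-function form are assumptions, not HPX's exact interface:

    #include <cstddef>
    #include <utility>
    #include <vector>

    template <typename InnerExecutor>
    struct numa_block_executor
    {
        std::vector<InnerExecutor> executors_;  // one per NUMA domain
        std::size_t current_ = 0;

        template <typename F, typename... Ts>
        auto async_execute(F&& f, Ts&&... ts)
        {
            // rotate over the per-domain executors, as in the quoted line
            current_ = (current_ + 1) % executors_.size();
            return executors_[current_].async_execute(
                std::forward<F>(f), std::forward<Ts>(ts)...);
        }
    };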
<hkaiser>
jbjnr: I think all the executors need to be adapted to the rp one way or another - that's completely missing...
hkaiser has quit [Read error: Connection reset by peer]
mbremer has quit [Quit: Page closed]
hkaiser has joined #ste||ar
rod_t has joined #ste||ar
hkaiser has quit [Quit: bye]
pree has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
david_pfander has quit [Ping timeout: 252 seconds]
pree has quit [Read error: Connection reset by peer]
pree has joined #ste||ar
pree has quit [Ping timeout: 252 seconds]
AnujSharma has joined #ste||ar
EverYoung has joined #ste||ar
AnujSharma has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
pree has joined #ste||ar
mcopik has quit [Ping timeout: 252 seconds]
pree has quit [Ping timeout: 246 seconds]
mbremer has joined #ste||ar
<mbremer>
@heller: That direct channel implementation seems to be working pretty well. There is roughly a 6% decrease in idle rate (in absolute idle-rate units) and a 10% speed-up in performance across various oversubscription configurations.
<mbremer>
^should have read the irc log first
<heller>
mbremer: great
<heller>
mbremer: does it also improve distributed performance?
<mbremer>
Well those numbers are for a problem across 8 nodes.
pree has joined #ste||ar
<mbremer>
So right now the observed idle rate is 15% for 2X oversubscription.
<mbremer>
I would like to figure out whether I'm actually using the Intel Omni-Path network correctly. I saw a talk by some charm++/NAMD people saying that they need more MPI processes per node to saturate the network than on whatever Cray network was on Cori
<mbremer>
Are there counters for measuring network bandwidth in HPX? Or should I rely on something like I_MPI_STATS?
<heller>
there are ways to read the number of bytes
<heller>
and the time it took
<heller>
from which you could then compute the bandwidth
<mbremer>
Cool. I'll do that. Thanks @heller
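For reference, the counters heller refers to can be printed at a fixed interval and combined into bandwidth (bytes sent divided by time spent sending). A sketch of the invocation, assuming a hypothetical application my_app; the counter names are from memory, so verify them with --hpx:list-counters on your build:

    ./my_app --hpx:print-counter=/data/count/sent \
             --hpx:print-counter=/data/time/sent \
             --hpx:print-counter-interval=1000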
mcopik has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
EverYoun_ has joined #ste||ar
EverYoun_ has quit [Remote host closed the connection]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
pree has joined #ste||ar
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
aserio1 is now known as aserio
EverYoung has joined #ste||ar
jaafar has joined #ste||ar
EverYoun_ has quit [Ping timeout: 255 seconds]
pree has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
aserio has quit [Ping timeout: 255 seconds]
jaafar has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
Matombo has joined #ste||ar
pree has quit [Ping timeout: 264 seconds]
jaafar has joined #ste||ar
<github>
[hpx] hkaiser created fixing_2439_3 (+2 new commits): https://git.io/vdIJc
<github>
hpx/fixing_2439_3 485f64e Hartmut Kaiser: Replace executor_parameter_traits with separate customization points
<github>
hpx/fixing_2439_3 7c283f8 Hartmut Kaiser: Merge branch 'master' into fixing_2439_3...
<heller>
mbremer: another thing worth trying is to reduce the number of background threads
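A hedged example of that tuning, assuming the hpx.max_background_threads configuration key (verify the key name against your HPX version):

    ./my_app --hpx:ini=hpx.max_background_threads=1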
<github>
hpx/master f032046 Hartmut Kaiser: Adding parallel::unique to docs
<hkaiser>
jbjnr: NO
<hkaiser>
we have a bitmap type: mask_type and mask_cref_type
<hkaiser>
jbjnr: don't wrap HWLOC, wrap functionality we need in sensible ways, similar to what the topology class is doing
<hkaiser>
nobody wants to deal with bitmasks and similar low-level nonsense
<hkaiser>
jaafar: you introduced a nice abstraction of all of this in the RP, why not stick to it?
<hkaiser>
jbjnr: ^^
<jbjnr>
that's my point. the hwloc api does not accept our bitset types
<jbjnr>
so I have to duplicate everything
<jbjnr>
it's pathetic
<hkaiser>
you don't
<hkaiser>
what functionality do you miss from the topology class?
<jbjnr>
passing bitmaps around to use in our memory binding calls
<hkaiser>
either pass a mask_type or your class hierarchy
<jbjnr>
copying to and from mask_cref_type etc every time is pointless
<hkaiser>
why's that pointless?
<hkaiser>
hide it in a wrapper and forget about it
<jbjnr>
you have to do an hwloc_bitmap_alloc and free every time you convert from bitset<> to hwloc_bitmap_t
<hkaiser>
so what?
<jbjnr>
this is not what I joined hpx for
<hkaiser>
lol
<jbjnr>
I don't want shit code
<jbjnr>
if hwloc_bitmap_t is valid, then I want to use it natively
<hkaiser>
what did you join it for? to duplicate the hwloc nonsense on the user-api level?
<jbjnr>
no
<jbjnr>
you are missing the point
<hkaiser>
if you pass around hwloc_bitmap_t's then you have to make sure it gets deallocated properly...
<jbjnr>
I do not want to copy bitmaps from hwloc to hpx constantly - I want to just use the hwloc types directly
<hkaiser>
so you HAVE to wrap it
<jbjnr>
we have hpx::resource::numa_domain
<hkaiser>
yes
<jbjnr>
I've put it in there for now
<hkaiser>
shrug
<jbjnr>
but it means we have hwloc_ types exposed (typedef void* hpx_hwloc_bitmap_t)
<jbjnr>
<sigh>
<hkaiser>
do whatever you want as long as a) the hwloc resources are properly allocated, copied, moved, and deallocated, and b) the user does not see anything of this nonsense
<jbjnr>
fine
<jbjnr>
thank you
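One hedged way to satisfy hkaiser's two constraints above is an RAII owner for hwloc_bitmap_t that handles allocation, copy, move, and deallocation in one place and keeps the raw hwloc type out of user-facing interfaces. A sketch under those assumptions, not HPX code:

    #include <hwloc.h>
    #include <utility>

    class bitmap
    {
        hwloc_bitmap_t bmp_;

    public:
        bitmap() : bmp_(hwloc_bitmap_alloc()) {}
        bitmap(bitmap const& other)
          : bmp_(other.bmp_ ? hwloc_bitmap_dup(other.bmp_) : nullptr) {}
        bitmap(bitmap&& other) noexcept : bmp_(other.bmp_)
        {
            other.bmp_ = nullptr;
        }

        // copy-and-swap covers both copy- and move-assignment
        bitmap& operator=(bitmap other) noexcept
        {
            std::swap(bmp_, other.bmp_);
            return *this;
        }

        ~bitmap()
        {
            if (bmp_)
                hwloc_bitmap_free(bmp_);
        }

        // escape hatch for the few hwloc calls that need the native handle;
        // user code never touches this
        hwloc_bitmap_t get() const { return bmp_; }
    };

With this, the alloc/free pair jbjnr objects to happens once per conversion, in one place, which is the "hide it deeply and forget about it" approach hkaiser suggests below.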
<hkaiser>
jaafar: also, if you make hwloc mandatory now, please change all the related code, docs, etc.
<hkaiser>
jbjnr: darn, sorry
* jbjnr
still wants hwloc++
<hkaiser>
^^
<jbjnr>
yup
<hkaiser>
that means you can remove the none_topology and make all functions from the topology base class non-virtual, etc.
<jbjnr>
this is why I am unhappy
<jbjnr>
I have opened a huge can of worms
<jbjnr>
and you and heller will hate me
<hkaiser>
jbjnr: so close it before it gets out
<hkaiser>
jbjnr: I still think we shouldn't use hwloc types directly
<jbjnr>
this is where my conversation started.
<hkaiser>
that bit of back-and-forth between mask_type and the hwloc counterpart can be done in one place, then we can forget about it
<jbjnr>
I'm unhappy because I know we shouldn't, but wrapping is not good either
<hkaiser>
why's that so bad?
<jbjnr>
hwloc_bitmap_alloc and free
<jbjnr>
total waste of resources
<hkaiser>
blame hwloc
<jbjnr>
making me anxious and sweaty
<hkaiser>
so hide it deeply and forget about it
<jbjnr>
the god of programming will never let me into heaven
<jbjnr>
I'll be condemned to an eternity of fortran hell
<hkaiser>
why? because you make the bad stuff disappear?
<hkaiser>
you're trying to prematurely optimize
<hkaiser>
let's do it right first, let's make it fast later (if needed)