#ste||ar on 2017-06-21 — irc logs at irclog.cct.lsu.edu

2017-05-17 13:54 aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:10 Guest83749 has quit [Read error: Connection reset by peer]

00:45 patg has joined #ste||ar

00:46 patg is now known as Guest65618

01:39 eschnett has joined #ste||ar

01:48 hkaiser has quit [Quit: bye]

01:53 zbyerly has joined #ste||ar

01:54 EverYoung has joined #ste||ar

01:59 EverYoung has quit [Ping timeout: 246 seconds]

02:06 ajaivgeorge has quit [Read error: Connection reset by peer]

02:10 K-ballo has quit [Quit: K-ballo]

02:40 <Guest65618> heller_: on the SC site it says notifications in July

02:40 Guest65618 is now known as patg

02:41 <patg> heller_: sorry I was Guest...

02:49 eschnett has quit [Quit: eschnett]

02:53 eschnett has joined #ste||ar

02:54 zbyerly has quit [Remote host closed the connection]

02:54 zbyerly has joined #ste||ar

03:27 eschnett has quit [Quit: eschnett]

03:34 patg has quit [Quit: See you later]

03:54 zbyerly has quit [Remote host closed the connection]

03:54 zbyerly has joined #ste||ar

04:01 pree has joined #ste||ar

04:07 jaafar has quit [Quit: Konversation terminated!]

04:23 zbyerly has quit [Ping timeout: 240 seconds]

04:30 zbyerly has joined #ste||ar

04:45 zbyerly has quit [Ping timeout: 255 seconds]

05:40 shoshijak has joined #ste||ar

05:43 bikineev has joined #ste||ar

05:56 EverYoung has joined #ste||ar

05:58 david_pfander has joined #ste||ar

06:01 EverYoung has quit [Ping timeout: 255 seconds]

06:02 bikineev has quit [Remote host closed the connection]

06:09 david_pfander has quit [Ping timeout: 260 seconds]

06:12 Matombo has joined #ste||ar

06:14 david_pfander has joined #ste||ar

06:46 shoshijak has quit [Ping timeout: 240 seconds]

06:53 Matombo has quit [Remote host closed the connection]

07:03 shoshijak has joined #ste||ar

07:15 Matombo has joined #ste||ar

07:27 david_pfander has quit [Ping timeout: 268 seconds]

07:55 <jbjnr_> heller_: have you received a HiHat invitation?

07:55 <jbjnr_> I was supposed to forward something to you a few weeks ago. Forgot. Now I've got an invite - hoping you did too

07:56 david_pfander has joined #ste||ar

08:02 <heller_> HiHat?

08:02 <heller_> jbjnr_: which email address did you send it to?

08:02 <jbjnr_> https://xstackwiki.modelado.org/Hierarchical_Heterogeneous_Asynchronous_Tasking

08:03 <jbjnr_> I didn't send it, I forgot, but this morning I got an email from CJ and it said "you" plural, so I assumed you got it too

08:03 <jbjnr_> but now I can't find ir

08:03 <jbjnr_> ^it

08:04 <heller_> no, I didn't get one

08:04 <jbjnr_> forwarded

08:05 <heller_> ok, I'll try to join

08:06 Matombo has quit [Remote host closed the connection]

08:09 <heller_> jbjnr_: I want my notification now!

08:09 <jbjnr_> of what?

08:11 <heller_> GB

08:13 <jbjnr_> August

08:14 <jbjnr_> Patience, young padawan

08:17 <heller_> I still want it now!

08:27 david_pfander has quit [Ping timeout: 246 seconds]

08:46 <ABresting> heller_: find module is in util?

08:48 <heller_> ABresting: the cmake directory contains a ton of FindXXX.cmake files

08:49 <heller_> you can use that to set up the libsigsegv libraries and release

08:50 <heller_> look around in the CMakeLists.txt files, they do reference them with find_package from time to time

08:50 <heller_> based on a user settable cmake option

08:51 <heller_> ABresting: for example here: https://github.com/STEllAR-GROUP/hpx/blob/master/CMakeLists.txt#L257-L262

08:52 <jbjnr_> heller_: did you receive the hihat forwarded msg - the one I sent to harm ut keeps bouncing back

08:52 <heller_> jbjnr_: I got it, yes

08:52 <heller_> subscribed already

08:53 <jbjnr_> website or just "accepted" the invite

08:53 <heller_> both

08:54 Matombo has joined #ste||ar

08:56 <ABresting> heller_: I need libsigsegv by default, and for this the user needs to enter the path in a config file. Which config file should it be? At build time, when we use cmake?

08:57 <heller_> yes

08:57 <heller_> as said, it should be optional

08:57 <heller_> that is, you only need it when the user sets the option

08:57 <heller_> for example: HPX_WITH_STACKOVERFLOW_DETECTION

08:58 <heller_> then you search for libsigsegv and include all the rest

08:59 <ABresting> at cmake time user is gonna enter -HPX_WITH_STACKOVERFLOW_DETECTION="<path-to-libsigsegv>"?

08:59 <heller_> no.

08:59 <heller_> HPX_WITH_STACKOVERFLOW_DETECTION=On

08:59 <heller_> by default, it would be off

08:59 <ABresting> but I need path as well

09:00 <heller_> find_package(LibSigSegv) then

09:00 <heller_> that's handled by the find module

09:00 <zao> --with-foo=/bar/baz is a very autotools/configure thing.

09:00 <heller_> ABresting: https://cmake.org/cmake/help/v3.9/manual/cmake-packages.7.html

09:00 <heller_> wrong link

09:00 <heller_> sorry

09:00 <zao> CMake way is to have Find* modules that honor explicit BLARGH_DIR/BLARGH_ROOT or CMAKE_PREFIX_PATH.

09:00 <heller_> ABresting: https://cmake.org/cmake/help/v3.9/manual/cmake-developer.7.html#find-modules

09:03 <ABresting> I am a little confused: is "LibSigSegv" here a path given by the user, or is the find module going to find all occurrences of "LibSigSegv"?

09:03 <heller_> no

09:03 <heller_> read through the last link I presented

09:04 <heller_> and here: https://cmake.org/cmake/help/v3.9/command/find_package.html#command:find_package

09:06 <github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vQTnm

09:06 <github> hpx/gh-pages 1ca0161 StellarBot: Updating docs

09:07 <ABresting> by default can we make it ON?

09:07 <heller_> why should we?

09:07 <heller_> it is not necessary for every day usage, not everyone has it installed

09:10 <ABresting> the reason is it will be functional only if the user initiates it in the main() module. If they have initiated it then the system will try to include it at compile time, and if it is not found it's gonna throw an error

09:10 <zao> Do we have any other features that are off-by-default that turn on in the presence of a library on the system?

09:11 <zao> Would it be something that would be possible to obscure from the end-user in say the HPX init functions, or is it exactly-once-per-program?

09:16 <ABresting> it's more of a user-initiated thing, as the user should know that they need to trigger it through an API call, but if the user does not have it installed then it's gonna throw a warning telling them to install it if they want to use this feature

09:20 david_pfander has joined #ste||ar

09:21 <heller_> zao: no, there is nothing that gets magically turned on

09:21 <heller_> ABresting: why is it initiated by the user

09:22 <heller_> in the user program even?

09:22 <heller_> why is it not possible to activate the stack overflow handling like any other (possibly optional) component?

09:23 <heller_> you also have the command line and HPX configuration utilities

09:26 <ABresting> because it needs to be initiated from the main module, else it doesn't work in a multithreaded environment

09:27 <heller_> so?

09:27 <heller_> hpx::init is called from within the main module

09:27 <heller_> (usually)

09:27 <ABresting> so without it the signal handler is not installed for the entire process

09:27 <heller_> close enough for your project

09:28 <ABresting> fdf

09:28 <ABresting> ;sorry

09:28 <ABresting> typo

09:28 <ABresting> when I said from main module

09:28 pree has quit [Ping timeout: 240 seconds]

09:28 <ABresting> I meant hpx::init() :P

09:29 <heller_> ABresting: https://github.com/STEllAR-GROUP/hpx/blob/master/src/runtime.cpp#L153-L174

09:29 <heller_> ABresting: https://github.com/STEllAR-GROUP/hpx/blob/master/src/hpx_init.cpp#L1111

09:42 eschnett has joined #ste||ar

09:46 <jbjnr_> heller_: if an application allocates a thing using new blah(...) and then passes it to the runtime, where it is deleted by the runtime, are there any cases where we can get the wrong allocator (like jemalloc etc.) freeing the object, i.e. different from the one that allocated it?

09:46 <jbjnr_> I have a callback in the thread pool to return a user allocated scheduler - this bombs out in destruction

09:50 shoshijak has quit [Ping timeout: 240 seconds]

09:51 <jbjnr_> how can I force my code (that uses HPX) to use the jemalloc allocator too? I assumed it would be doing that already, but this looks suspicious

09:52 shoshijak has joined #ste||ar

09:53 pree has joined #ste||ar

09:53 <heller_> jbjnr_: I would assume the same

09:54 <heller_> jbjnr_: what does "bombing out" mean? do you destruct the scheduler twice, maybe?

09:54 <jbjnr_> destruct once, segfault once

09:54 <heller_> are you sure?

09:55 <heller_> where does the segfault happen?

09:55 <jbjnr_> yes. sure

09:55 <jbjnr_> in the destructor of my custom scheduler

09:55 <jbjnr_> when I delete stuff that gdb says is lovely

09:55 shoshijak has quit [Client Quit]

09:55 shoshijak has joined #ste||ar

09:55 <heller_> jbjnr_: if it would be jemalloc vs. some other malloc, you'd see a different segfault though

09:56 <jbjnr_> like what?

09:56 <jbjnr_> bad_alloc

09:56 <heller_> somewhere inside of jemalloc or system malloc

09:57 <heller_> if you get a segfault inside of your destructor, that could only mean that the object has been already destructed, the pointer doesn't point to the original object anymore, or a problem with the destructor

09:57 <heller_> show me

09:58 <jbjnr_> the segfault happens when deleting one of the lockfree queues inside the scheduler

09:58 EverYoung has joined #ste||ar

09:59 K-ballo has joined #ste||ar

09:59 <heller_> have they not been allocated correctly beforehand, maybe?

10:00 <jbjnr_> it's an effing queue that tasks have been running on all day, then dies in destructor, the queues are ok

10:00 <heller_> jbjnr_: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/runtime/threads/policies/local_priority_queue_scheduler.hpp#L983

10:00 <jbjnr_> I'll add a delete callback and see if the problem goes away

10:00 <heller_> there is some lazy initialization going on here

10:01 bikineev has joined #ste||ar

10:02 <jbjnr_> hmmmm

10:02 EverYoung has quit [Ping timeout: 246 seconds]

10:03 <jbjnr_> shoshijak: "it's not your creepy jemalloc thingy"

10:03 <jbjnr_> quote of the day :)

10:04 <heller_> jbjnr_: do you have a unit test, that's built within HPX that doesn't fail? but essentially does the same thing?

10:05 Matombo has quit [Remote host closed the connection]

10:05 <jbjnr_> of course not

10:05 <heller_> write it!

10:05 <heller_> ;)

10:05 <heller_> so you can rule out different global allocators for sure

10:07 shoshijak has quit [Ping timeout: 240 seconds]

10:07 <heller_> jbjnr_: tell shoshijak to not constantly close her lid, please :)

10:08 <jbjnr_> don't worry, she leaves in a few days

10:08 <heller_> oh noes

10:10 eschnett has quit [Quit: eschnett]

10:10 <jbjnr_> https://gist.github.com/biddisco/2ee3ef4369ba563797e81f79beadd7df

10:10 <jbjnr_> does anything there look interesting to you?

10:11 <jbjnr_> that's the same error when we do not use our custom scheduler

10:11 <heller_> hmmm

10:12 <heller_> delete nullptr should be fine

10:12 <heller_> scoped_ptr?

10:14 <heller_> doesn't look wrong

10:14 <heller_> jbjnr_: where can I look at your files?

10:15 <jbjnr_> you can't. but do not worry. I just wanted to ask if the malloc thing was a possibility. We clearly have a bug, and it may be connected to the tss stuff ...

10:15 <heller_> write an isolated testcase

10:15 <heller_> that would be my advice

10:30 denis_blank has joined #ste||ar

10:32 <zao> It's not occurring during process shutdown phase, I hope. There be destruction order dragons.

10:36 bikineev has quit [Remote host closed the connection]

10:42 <heller_> http://walac.github.io/java-faster-than-cpp/?utm_content=buffere2185&utm_medium=social&utm_source=facebook.com&utm_campaign=buffer

11:07 <pree> xD

11:10 parsa[[w]] has quit [Read error: Connection reset by peer]

11:24 heller_ has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]

11:26 heller has joined #ste||ar

11:48 shoshijak has joined #ste||ar

12:12 zbyerly has joined #ste||ar

12:16 ajaivgeorge has joined #ste||ar

12:20 bikineev has joined #ste||ar

12:28 hkaiser has joined #ste||ar

12:32 hkaiser has quit [Read error: Connection reset by peer]

12:32 hkaiser has joined #ste||ar

12:38 bikineev has quit [Ping timeout: 255 seconds]

12:53 bikineev has joined #ste||ar

12:59 eschnett has joined #ste||ar

13:04 jaafar has joined #ste||ar

13:09 denis_blank has quit [Quit: denis_blank]

13:13 zbyerly has quit [Remote host closed the connection]

13:13 zbyerly has joined #ste||ar

13:22 eschnett has quit [Quit: eschnett]

13:24 pree has quit [Ping timeout: 240 seconds]

13:24 bikineev has quit [Ping timeout: 260 seconds]

13:28 eschnett has joined #ste||ar

13:28 <jbjnr_> hkaiser: we are seeing the runtime TSS and thread_num TSS initialized and deinitialized multiple times throughout the code - is this normal? (or might we have broken something).

13:48 aserio has joined #ste||ar

13:55 bikineev has joined #ste||ar

13:56 eschnett has quit [Quit: eschnett]

14:00 EverYoung has joined #ste||ar

14:01 pree has joined #ste||ar

14:05 ajaivgeorge_ has joined #ste||ar

14:05 ajaivgeorge has quit [Read error: Connection reset by peer]

14:05 EverYoung has quit [Ping timeout: 276 seconds]

14:05 ajaivgeorge has joined #ste||ar

14:10 ajaivgeorge_ has quit [Ping timeout: 268 seconds]

14:18 <hkaiser> jbjnr_: hmmm

14:19 <hkaiser> for each os-thread this should happen once, I think

14:19 <jbjnr_> I mean that at start, the runtime tss is set 2 or 3 times (there is a null check), then on cleanup, it is deinit-ed multiple times

14:19 <jbjnr_> there are no segfaults - but it strikes me as sloppy

14:19 <jbjnr_> I'm testing master branch at the mo to see if it does the sae

14:19 <jbjnr_> ^same

14:19 <hkaiser> jbjnr_: even without your new code?

14:20 <hkaiser> that would be unexpected

14:20 <jbjnr_> I'm testing now

14:20 <hkaiser> but everything is possible, I wouldn't be surprised to have a bug there

14:21 <jbjnr_> it may not be a bug, it's just that flags are sometimes set from different places and done twice ...

14:27 <hkaiser> jbjnr_: note that the functions are (supposed to) being called once per os-thread

14:29 <jbjnr_> I see the runtime and applier ptrs TSS being set at least twice on the main thread (on master) - but as I say, this may not be a problem since it says 'if nullptr then ...'

14:30 <jbjnr_> gosh main thread calls deinit_tss 3 times :(

14:30 <jbjnr_> for the thread_num_tss that is

14:31 <hkaiser> even on master?

14:40 <jbjnr_> yes, on master

14:43 <hkaiser> uhh

14:43 <hkaiser> pls create a ticket, I'll look

14:45 <jbjnr_> if you're interested at the moment - look at this gist https://gist.githubusercontent.com/biddisco/cb65e01f08299ad37f883652af4bad11/raw/1028887745fcbe598ebc6c853e9931774ce1bae6/gistfile1.txt

14:45 <jbjnr_> and highlight 0x7f0e55b60800

14:45 <jbjnr_> you can see when that thread calls init/deinit (we are using just one worker thread to make the trace short)

14:46 <hkaiser> jbjnr_: is 0x7f0e55b60800 the thread id?

14:46 <jbjnr_> yes. the os thread id

14:46 <hkaiser> ok, will have a look

14:46 <hkaiser> that shouldn't happen, really

14:47 <jbjnr_> The only reason I'm worried, is that we have random memory corruption in our multi-pool hpx and we moved some TSS code around

14:47 <jbjnr_> so debugging it in case we screwed up

14:47 eschnett has joined #ste||ar

14:47 <hkaiser> nod

14:47 <jbjnr_> with multiple pools we might have messed up the thread_num_tss (hence my checking)

14:51 eschnett has quit [Client Quit]

14:52 zbyerly has quit [Remote host closed the connection]

14:52 zbyerly has joined #ste||ar

14:56 bikineev has quit [Ping timeout: 240 seconds]

14:59 <hkaiser> jbjnr_: yah, I can confirm that init_tss is called twice for the main-thread

14:59 ajaivgeorge has quit [Read error: Connection reset by peer]

14:59 <hkaiser> this is a bug

14:59 ajaivgeorge has joined #ste||ar

15:00 <jbjnr_> ok, but not a serious one (hopefully - no real side effects)

15:00 bikineev has joined #ste||ar

15:07 eschnett has joined #ste||ar

15:15 <hkaiser> jbjnr_: right

15:19 <hkaiser> jbjnr_: the functions are called more than once only for the main thread

15:19 <hkaiser> I can fix that
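
The repeated init/deinit calls discussed above can be modelled in a few lines. This is only a sketch under stated assumptions: `runtime_stub`, `init_tss`, `deinit_tss` and the call counters are hypothetical stand-ins, not the HPX functions of the same name. It shows why a null check makes a second init call harmless to the stored pointer, and how a per-thread counter still exposes the redundant calls.

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime object; not an HPX type.
struct runtime_stub
{
    int id;
};

// One slot and one pair of counters per OS thread.
thread_local runtime_stub* runtime_tss = nullptr;
thread_local int init_calls = 0;
thread_local int deinit_calls = 0;

void init_tss(runtime_stub* rt)
{
    ++init_calls;
    // The null check: a later call on the same thread does not overwrite
    // the pointer that the first call stored.
    if (runtime_tss == nullptr)
        runtime_tss = rt;
}

void deinit_tss()
{
    ++deinit_calls;
    // Resetting an already-null pointer has no visible effect, which is why
    // repeated calls go unnoticed unless they are counted or traced.
    runtime_tss = nullptr;
}

runtime_stub* get_runtime() { return runtime_tss; }
int init_call_count() { return init_calls; }
int deinit_call_count() { return deinit_calls; }
```

Calling `init_tss` twice and `deinit_tss` three times on one thread leaves the pointer in a consistent state, but `init_call_count()` and `deinit_call_count()` report 2 and 3, the kind of pattern the trace in the gist shows for the main thread.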

15:20 ajaivgeorge has quit [Quit: ajaivgeorge]

15:20 ajaivgeorge has joined #ste||ar

15:22 aserio has quit [Ping timeout: 246 seconds]

15:25 ajaivgeorge has quit [Ping timeout: 240 seconds]

15:26 aserio has joined #ste||ar

15:34 <aserio> wash, wash[m]: will you be joining us today

15:38 <wash[m]> Aserio hey

15:39 <wash[m]> Aserio I am in israel

15:39 <wash[m]> Cannot call in

15:39 <hkaiser> wash[m]: back to the homeland, huh?

15:40 david_pfander has quit [Ping timeout: 240 seconds]

15:50 ajaivgeorge has joined #ste||ar

15:51 <wash[m]> Yah :)

15:56 <aserio> wash[m]: enjoy your trip

16:01 aserio has quit [Ping timeout: 258 seconds]

16:11 eschnett has quit [Quit: eschnett]

16:13 aserio has joined #ste||ar

16:14 zbyerly has quit [Remote host closed the connection]

16:14 zbyerly has joined #ste||ar

16:32 zbyerly has quit [Ping timeout: 246 seconds]

16:48 bikineev has quit [Ping timeout: 268 seconds]

16:51 bikineev has joined #ste||ar

16:59 bikineev has quit [Ping timeout: 260 seconds]

17:03 aserio has quit [Ping timeout: 246 seconds]

17:09 shoshijak has quit [Ping timeout: 240 seconds]

17:20 hkaiser has quit [Read error: Connection reset by peer]

17:32 EverYoung has joined #ste||ar

17:36 EverYoung has quit [Ping timeout: 246 seconds]

17:40 jaafar has quit [Ping timeout: 240 seconds]

17:40 shoshijak has joined #ste||ar

17:44 zbyerly has joined #ste||ar

17:47 Matombo has joined #ste||ar

17:56 aserio has joined #ste||ar

18:01 Matombo has quit [Remote host closed the connection]

18:05 <heller> aserio: did you get a tutorial notification yet?

18:05 <aserio> heller: yes, let me tell you about it after this meeting

18:06 <heller> aserio: ok

18:06 <heller> aserio: care to share the reviews?

18:06 eschnett has joined #ste||ar

18:08 eschnett has quit [Client Quit]

18:20 aserio has quit [Ping timeout: 255 seconds]

18:20 ajaivgeorge has quit [Ping timeout: 260 seconds]

18:20 ajaivgeorge has joined #ste||ar

18:26 bikineev has joined #ste||ar

18:26 aserio has joined #ste||ar

18:33 hkaiser has joined #ste||ar

18:34 jaafar has joined #ste||ar

18:35 ajaivgeorge has quit [Ping timeout: 240 seconds]

18:35 ajaivgeorge has joined #ste||ar

18:39 hkaiser_ has joined #ste||ar

18:39 hkaiser has quit [Read error: Connection reset by peer]

18:43 jaafar has quit [Ping timeout: 260 seconds]

18:52 zbyerly has quit [Remote host closed the connection]

18:59 denis_blank has joined #ste||ar

19:23 jaafar has joined #ste||ar

19:24 ajaivgeorge has quit [Ping timeout: 240 seconds]

19:26 eschnett has joined #ste||ar

19:26 ajaivgeorge has joined #ste||ar

19:27 aserio has quit [Ping timeout: 240 seconds]

19:33 EverYoung has joined #ste||ar

19:41 EverYoung has quit [Ping timeout: 258 seconds]

19:42 shoshijak has quit [Ping timeout: 240 seconds]

19:45 aserio has joined #ste||ar

19:57 shoshijak has joined #ste||ar

20:07 jgoncal has quit [Ping timeout: 258 seconds]

20:18 bikineev has quit [Ping timeout: 240 seconds]

20:20 jgoncal has joined #ste||ar

20:25 bikineev has joined #ste||ar

20:25 Matombo has joined #ste||ar

20:31 eschnett has quit [Quit: eschnett]

20:32 jaafar has quit [Quit: Konversation terminated!]

20:33 bikineev has quit [Remote host closed the connection]

20:38 <jbjnr_> hkaiser_: is hpx::async(executor, &fun, ...); not a valid overload of async?

20:44 <pree> what do the continuations in parcels specify?

20:45 patg[w]_ has joined #ste||ar

20:45 <pree> I'm confused, continuation means tasks which should occur after an event

20:45 <pree> ??

20:47 <hkaiser_> jbjnr_: yah, sure

20:47 <pree> ^^

20:47 <hkaiser_> pree: yah, the continuation is usually a global id of a lco which receives the result

20:48 <pree> thanks but I have thought continuation means tasks which should continue after some event

20:49 <hkaiser_> nod

20:49 <hkaiser_> lcos usually trigger things

20:49 <patg[w]_> Got Irc working at work finally

20:49 <hkaiser_> patg[w]_: great

20:50 <pree> thank you hkaiser_

20:50 <pree> I got confused by its name
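
The two readings of "continuation" in this exchange can be reconciled with a toy model. The sketch below is a single-process illustration with hypothetical types (`lco`, `parcel`, `deliver`); real HPX parcels are serialized and name their continuation by global id, which is not modelled here. The parcel's continuation is the object that receives the result, and that object in turn triggers whatever work was waiting on it.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical, much simplified LCO: stores one value and triggers the
// callbacks that were waiting for it.
struct lco
{
    bool ready = false;
    int value = 0;
    std::vector<std::function<void(int)>> dependents;

    void set_value(int v)
    {
        ready = true;
        value = v;
        for (auto& d : dependents)
            d(v);    // "lcos usually trigger things"
    }
};

// Hypothetical parcel: an action, its argument, and where the result goes.
struct parcel
{
    std::function<int(int)> action;
    int argument;
    lco* continuation;    // stands in for the global id of the LCO
};

// What the receiving side does: run the action, then hand the result to
// the continuation named in the parcel.
void deliver(parcel const& p)
{
    int result = p.action(p.argument);
    if (p.continuation != nullptr)
        p.continuation->set_value(result);
}

int square_plus_one_via_parcel(int x)
{
    lco result;
    int downstream = 0;
    // Work that should continue once the result is available.
    result.dependents.push_back([&downstream](int v) { downstream = v + 1; });

    parcel p{[](int a) { return a * a; }, x, &result};
    deliver(p);
    assert(result.ready);
    return downstream;
}
```

In this model the parcel itself only says where the result goes; the "task that continues after an event" is whatever was attached to the receiving LCO.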

20:50 <patg[w]_> Now to find some cycles to do that install

21:00 <jbjnr_> soo... hkaiser_ I discovered by accident that hpx::async(executor, &func, . ..) works fine, but when I put that inside a lambda - it doesn't work if the executor is captured by value - only works when captured by reference.

21:01 <jbjnr_> is that expected?

21:01 <hkaiser_> jbjnr_: might just not work for const executors

21:01 <hkaiser_> make the lambda mutable

21:01 <jbjnr_> ok

21:03 <K-ballo> is executor copyable?

21:03 <jbjnr_> yup mutable works

21:03 <K-ballo> it is then

21:03 <jbjnr_> it is copyable normally I think
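
The `mutable` fix can be reproduced without HPX. The sketch assumes, as hkaiser_ suggests, that the call path needs a non-const executor; `counting_executor` and its `execute` member are hypothetical stand-ins, not HPX types. A by-value capture becomes a member of the closure object, and the closure's call operator is const unless the lambda is declared `mutable`.

```cpp
#include <cassert>
#include <utility>

// Hypothetical executor whose submission function is non-const.
struct counting_executor
{
    int submitted = 0;

    template <typename F>
    int execute(F&& f)    // non-const: it mutates the executor
    {
        ++submitted;
        return std::forward<F>(f)();
    }
};

int twice(int x) { return 2 * x; }

int run_with_copy(counting_executor exec)
{
    // Captured by value: the copy lives inside the closure, and operator()
    // is const by default, so exec.execute(...) would not compile here
    // without `mutable`.
    auto task = [exec](int x) mutable {
        return exec.execute([x] { return twice(x); });
    };
    return task(21);
}

int run_with_reference(counting_executor& exec)
{
    // Captured by reference: the constness of operator() does not extend to
    // the referenced object, so no `mutable` is needed.
    auto task = [&exec](int x) {
        return exec.execute([x] { return twice(x); });
    };
    return task(21);
}
```

Note that the `mutable` variant operates on a copy, so any state the executor accumulates stays in the closure and is not visible through the original object.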

21:03 eschnett has joined #ste||ar

21:06 <patg[w]_> hkaiser_, see private

21:08 bikineev has joined #ste||ar

21:10 patg[w]_ has quit [Quit: Leaving]

21:15 patg[w] has joined #ste||ar

21:19 <heller> we really need my thesis, the only parallex terms I use are AGAS and Parcel ;)

21:19 <patg[w]> heller, we really do!

21:20 <patg[w]> When do you expect to be done?

21:20 <heller> 33 days to go

21:21 <heller> it will be very C++ centric

21:24 <patg[w]> heller, hope it's human readable :)

21:24 <K-ballo> for some humans at least

21:24 <heller> patg[w]: yes. the PDF renders fine :P

21:25 <patg[w]> K-ballo, I have a feeling human readable means different things to you and me

21:25 pree has quit [Quit: AaBbCc]

21:27 <patg[w]> heller, I'm sure it will be great!

21:28 <heller> patg[w]: the only thing I care about right now is to have it submitted and that I'll pass ;)

21:28 <heller> no one will read it in the end anyway

21:28 <hkaiser_> heller: you should use 'split-phase transaction' as well ;)

21:28 <patg[w]> heller: I can empathize

21:29 <heller> hkaiser_: I am using the C++ Memory Model definitions instead ;)

21:29 <hkaiser_> doesn't sound as cool ;)

21:29 patg[w] has quit [Quit: Leaving]

21:29 <heller> makes it more approachable though

21:30 <heller> and underlines the story about the natural extension of the C++ Programming Language as of today ;)

21:32 <ABresting> any advantage of using alternate stack technique while detecting stack overflow ?

21:33 <heller> which come to your mind?

21:35 <ABresting> david_pfander wrote a technique using alternate stack

21:35 <ABresting> meanwhile it can be done without using the alternate stack

21:36 <ABresting> https://github.com/STEllAR-GROUP/hpx/issues/2408

21:36 <ABresting> here

21:36 Matombo has quit [Remote host closed the connection]

21:37 EverYoung has joined #ste||ar

21:39 <heller> ABresting: well, this issue pretty much is the outline of your project

21:39 <heller> ABresting: so what exactly are you asking for?

21:39 <ABresting> so basically alternate stack technique is used to cleanup inconsistent data from the affected thread

21:40 <ABresting> I am asking do we really need extra handler thread stack ?

21:40 <ABresting> as we just have to show if it was stack overflow or not, this can be achieved by general technique as well

21:41 <ABresting> unless there is a catch that this is why we need alternate stack !

21:42 <heller> well, we discussed libsigsegv so far. Why don't you get started with that as a first prototype?

21:42 EverYoung has quit [Ping timeout: 240 seconds]

21:43 <ABresting> yes but I wish to push the general pthread version wrapper first for review purposes, which can later be integrated in hpx_init()

21:44 <heller> ABresting: we don't use pthread. we use boost::thread

21:44 <ABresting> by later means tomorrow maybe

21:44 <ABresting> concept remains the same

21:44 <heller> ok

21:44 <heller> well, I suggest you get started with *something*

21:46 <ABresting> yes going to push it on my test repo for wash
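
On the alternate stack question: on POSIX systems, a handler for a genuine stack overflow generally cannot run on the overflowed stack, because there is no room left for its frame. That is the usual reason for `sigaltstack` together with `SA_ONSTACK`. The sketch below uses plain POSIX calls, not libsigsegv or HPX, and raises `SIGUSR1` instead of actually overflowing the stack so that it is safe to run. The helper names and the 64 KiB size are assumptions made for illustration.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>

#include <signal.h>

namespace {
    // Memory reserved for the handler's stack (size chosen arbitrarily).
    alignas(16) char alt_stack[64 * 1024];
    volatile std::sig_atomic_t ran_on_alt_stack = 0;

    void on_signal(int)
    {
        // The address of a local variable tells us which stack is in use.
        char marker = 0;
        auto p = reinterpret_cast<std::uintptr_t>(&marker);
        auto lo = reinterpret_cast<std::uintptr_t>(alt_stack);
        auto hi = lo + sizeof(alt_stack);
        ran_on_alt_stack = (p >= lo && p < hi) ? 1 : 0;
    }
}

// Registers the alternate stack for the calling thread and installs a
// handler that asks to be run on it.
bool install_alt_stack_handler(int signum)
{
    stack_t ss{};
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof(alt_stack);
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0)
        return false;

    struct sigaction sa{};
    sa.sa_handler = &on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;    // without this flag the normal stack is used
    return sigaction(signum, &sa, nullptr) == 0;
}

// Raises the signal on the calling thread and reports where the handler ran.
bool handler_ran_on_alt_stack(int signum)
{
    ran_on_alt_stack = 0;
    std::raise(signum);
    return ran_on_alt_stack == 1;
}
```

The alternate stack is a per-thread setting, so a multithreaded runtime would have to register one for every OS thread it wants covered. That is one reason the installation point (for example inside the runtime's own thread startup) matters.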

21:57 eschnett has quit [Quit: eschnett]

22:01 aserio has quit [Quit: aserio]

22:02 <heller> http://www.netlib.org/utk/people/JackDongarra/PAPERS/prt_qr.pdf <-- who wants to implement that in HPX?

22:24 <jbjnr_> hkaiser_: (or heller ) if this assertion is being triggered https://github.com/STEllAR-GROUP/hpx/blob/1649b5308125bdeb68fbfa1f9a3835986070bd71/hpx/runtime/threads/policies/thread_queue.hpp#L457

22:24 <jbjnr_> what would you suspect has gone wrong?

22:34 <heller> jbjnr_: the task to be deleted ended up in the wron thread pool

22:34 <heller> wrong*

22:35 <heller> https://github.com/STEllAR-GROUP/hpx/blob/1649b5308125bdeb68fbfa1f9a3835986070bd71/hpx/runtime/threads/policies/thread_queue.hpp#L878

22:36 <heller> this is the routine that decides on which terminated_tasks_ queue to push the terminated thread which will get cleaned up eventually

22:44 <jbjnr_> hmmm. confused

22:45 <heller> https://github.com/ParRes/Kernels/pull/188#issuecomment-310211302

22:47 <heller> now I don't remember when I gave this talk ... was it Mardi Gras?

22:49 ajaivgeorge has quit [Ping timeout: 240 seconds]

22:49 ajaivgeorge has joined #ste||ar

22:50 <jbjnr_> by wrong thread pool - how do you imagine this happening - we have two pools with N and M os threads - are you saying that a thread is being deleted by pool B when it is owned by Pool A ?

22:53 ajaivgeorge has quit [Client Quit]

22:53 ajaivgeorge has joined #ste||ar

22:54 <jbjnr_> question - pool A has 6 threads and they are numbered 0-5 - pool B has 2 threads numbered 6-7 - but inside their own pools they have indices 0-5 and 0-1: do any of the stealing/suspending/resuming functions use the thread numbers (tss) where an incorrect indexing might be an issue?

22:55 <jbjnr_> so threadmanager knows threads 0-7 but each pool uses different numbering and a thread offset for each pool is maintained. We get problems only when we use multiple pools - so we suspect issues here. Any suggestions are welcome for where to look for a bad index.

22:55 <jbjnr_> falling asleep now. will resume tomorrow.

23:01 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vQIRR

23:01 <github> hpx/master e97f06a Hartmut Kaiser: Merge pull request #2706 from STEllAR-GROUP/clang_format...

23:13 <github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vQI0s

23:13 <github> hpx/master 52dd1e1 Hartmut Kaiser: Fix comment typo

23:14 <heller> jbjnr_: resuming might be an issue

23:15 <heller> jbjnr_: when resuming a thread, it is usually put into any queue

23:15 <hkaiser_> heller: a thread 'remembers' the scheduler it was running on

23:15 <heller> are you sure?

23:15 <hkaiser_> absolutely

23:15 <heller> yeah, you are right ...

23:15 <heller> otherwise it wouldn't work at all

23:16 <heller> and the ownership is based on the memory pool, not the thread index
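
The numbering scheme jbjnr_ describes (global thread numbers 0-7, pool-local indices plus a per-pool offset) can be written down as a small model. This is a hypothetical sketch, not HPX code, and the chat does not establish that an indexing error is the actual cause. It only shows the conversion that every lookup into a pool's own queues would have to apply, and a check that catches a global number handed to the wrong pool.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical model of one thread pool inside a multi-pool runtime.
struct pool
{
    std::size_t offset;    // global number of this pool's first OS thread
    std::size_t size;      // number of OS threads owned by this pool

    pool(std::size_t first_global, std::size_t num_threads)
      : offset(first_global), size(num_threads)
    {
    }

    // True if the global thread number belongs to this pool.
    bool owns(std::size_t global) const
    {
        return global >= offset && global < offset + size;
    }

    // Converts a global thread number into an index that is valid for this
    // pool's own queues; refuses numbers that belong to another pool.
    std::size_t to_local(std::size_t global) const
    {
        if (!owns(global))
            throw std::out_of_range("thread number belongs to another pool");
        return global - offset;
    }
};
```

With pool A constructed as `pool(0, 6)` and pool B as `pool(6, 2)`, global thread 7 maps to local index 1 in B, while using 7 directly as an index into B's two queues would be out of range.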

23:38 EverYoung has joined #ste||ar

23:41 denis_blank has quit [Quit: denis_blank]

23:43 EverYoung has quit [Ping timeout: 255 seconds]

23:54 ajaivgeorge has quit [Ping timeout: 240 seconds]