#ste||ar on 2020-05-10 — irc logs at irclog.cct.lsu.edu

2020-02-24 20:46 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020

00:07 <weilewei> error: ‘threads’ in namespace ‘cds::threading::hpx’ does not name a type: static hpx::threads::thread_id_type id;

00:07 <weilewei> why can't it find hpx threads correctly? Did I not link the application to hpx correctly?

00:07 <weilewei> The app has no problem including hpx header files

00:11 <hkaiser> missing #include?

00:13 <weilewei> well i do have #include <hpx/include/threads.hpp> this one

00:19 <hkaiser> weilewei: we're restructuring headers, so things might not be pulled by the headers in hpx/include

00:27 <hkaiser> bita_: https://github.com/STEllAR-GROUP/phylanx/pull/1168 should take care of things

00:27 <hkaiser> will require top of HPX master, however

00:32 <weilewei> I include all direct thread related headers though...

00:33 <hkaiser> weilewei: ok, pls show me the full error listing and the code

00:36 <weilewei> https://github.com/weilewei/libcds/blob/8fb829632725ec0a408a4eec3aa7f29c0cabc8e7/cds/threading/details/hpx_manager.h#L54

00:37 <weilewei> and error: https://gist.github.com/weilewei/513d5c69e54dd4c4fefdc87968d3fa20

00:39 <weilewei> linking to hpx library is done here: https://github.com/weilewei/libcds/blob/hpx-thread/CMakeLists.txt#L196-L197

00:42 <hkaiser> weilewei: heh

00:42 <hkaiser> inside the namespace ...::hpx you ask for a variable of type hpx::threads::...

00:43 <hkaiser> but inside your namespace ...::hpx no such type is defined

00:43 <hkaiser> if you want this to work you need to write ::hpx::threads::thread_id_type id;

00:44 <weilewei> wow, the compilation moves on

00:45 <weilewei> why?

00:47 <weilewei> another error comes so soon: https://gist.github.com/weilewei/513d5c69e54dd4c4fefdc87968d3fa20#gistcomment-3297832

00:47 <weilewei> can't find headers...

00:48 <hkaiser> weilewei: namespace qualifications are resolved relative to the current namespace first, before going one level up, etc.

00:48 <hkaiser> so if you want to be sure start from the global namespace
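
The lookup rule hkaiser describes can be reproduced without HPX. The sketch below is only an illustration: the global `hpx::threads` namespace and its `thread_id_type` alias are stand-ins defined here (assumptions, not the real HPX declarations), and `id_is_default` is a hypothetical helper. Inside `cds::threading::hpx`, the name `hpx` resolves to the enclosing namespace first, so the qualified name has to start at the global scope.

```cpp
#include <cassert>

// Stand-in for the library namespace (hypothetical, not the real HPX types).
namespace hpx { namespace threads {
    using thread_id_type = unsigned long;
}}

namespace cds { namespace threading { namespace hpx {
    // Here `hpx` names cds::threading::hpx, which has no member `threads`,
    // so the following line would fail to compile:
    // static hpx::threads::thread_id_type broken_id;

    // A leading `::` starts the lookup at the global namespace instead.
    static ::hpx::threads::thread_id_type id = 0;
}}}

// Hypothetical helper: shows the variable is reachable under its full name.
bool id_is_default() { return cds::threading::hpx::id == 0; }
```

Renaming the inner namespace (for example to `hpx_threads`, as tried later in the log) removes the shadowing as well, at the cost of a different public name.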

00:49 <weilewei> I see

00:55 <weilewei> if I change ...::hpx to ...::hpx_threads, the conflict is resolved...

00:55 <weilewei> I understand this point now

01:05 <weilewei> ok, after changing the cmake keyword for HPX::hpx from PRIVATE to PUBLIC, it works, as this target will be used by testing targets

01:38 hkaiser has quit [Quit: bye]

01:40 bita_ has quit [Quit: Leaving]

01:47 weilewei has quit [Remote host closed the connection]

10:26 gonidelis has joined #ste||ar

10:29 gonidelis has quit [Remote host closed the connection]

10:49 Nikunj__ has joined #ste||ar

12:18 hkaiser has joined #ste||ar

13:08 Nikunj__ has quit [Ping timeout: 256 seconds]

14:36 gonidelis has joined #ste||ar

15:08 <gonidelis> In this https://gist.github.com/gonidelis/9a2157b380d8683afc957f007b52dcdb , why is it wrong to prefix `template` in front of `algorithm_result` ?

15:15 <zao> You typically only need `template ` in places where a disambiguation is required or where you're declaring a nested template.

15:16 <gonidelis> isn't `algorithm_result` a template instantiation, though?

15:16 <zao> Like in cases akin to `void D::f() { B::template get<9001>(); }` where you're accessing something in a dependent base.

15:17 <zao> If you're already in a context where a type is expected I don't know of any reason why you'd use the keyword.

15:18 <zao> (unless there's language parts I'm not familiar with, like half of what HPX uses :D )

15:18 <zao> Are you investigating some problem or just trying to understand some code?

15:18 <gonidelis> Just trying to understand the code

15:18 <gonidelis> I want to be sure that I can reproduce what I read and not just comprehend in an inactive way...

15:20 <gonidelis> U said "dependent base". Yesterday I was reading articles about typenames / templates for like 3-4 hours and I think I am still confused on when a name actually is a disambiguation....

15:23 <zao> Say that you've got `template <typename B> struct D : B {};` or `template <typename T> struct D : B<T> {};`

15:24 <zao> The base there is dependent (on a template argument) and the rules for implicit lookup of base members in a derived type do not apply and you need to more explicitly mention what the thing you're looking up is, and where it lives.

15:27 <gonidelis> So both commands produce warnings?

15:28 <zao> Those definitions are fine on their own.

15:29 <zao> The problem is if you're in a member function and try to refer to something from a dependent base. At the point of parsing the template it's rather impossible to know what a name refers to in the base, or what base it's referring to at all.

15:29 <gonidelis> Can u use a synonym for base plz?

15:29 <gonidelis> I think I am pretty close to getting what you are saying

15:29 <zao> I can say base class or interface if it helps :)

15:32 <gonidelis> So let me get this straight: the problem is that the compiler has literally no idea what `B` or `T` could be in context of the class???

15:32 <zao> https://gcc.godbolt.org/z/zJaC6e

15:32 <zao> See how this behaves if you remove the "template" on line 9, or if you remove "Base::template ".

15:35 <gonidelis> ok so Base in context of D must carry a function `get()` no matter what.

15:36 <gonidelis> I can see that it pops an error

15:36 <zao> If you omit the `Base::` part, it doesn't look into the dependent base class and as such, can't find anything.

15:37 <zao> If you omit the `template` part, it assumes that 'get' is a member and the following `<` is a less-than comparison.
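
The contents of zao's godbolt link are not reproduced in this log, so the following is a separate minimal sketch of the same situation. `Base`, `D`, `get` and `call_through_dependent_base` are illustrative names chosen here. It shows the two pieces zao mentions: `B::` tells the compiler where to look, and `template` tells it that `<` starts a template argument list.

```cpp
#include <cassert>

// Illustrative base class with a member function template.
struct Base {
    template <int N>
    int get() const { return N; }
};

template <typename B>
struct D : B {
    int f() const {
        // B depends on a template parameter, so while parsing D the compiler
        // cannot know that `get` is a template. Without `template`, the `<`
        // would be parsed as a less-than comparison.
        return B::template get<9001>();
        // An equivalent spelling is: return this->template get<9001>();
    }
};

// Instantiates D with a concrete base and calls through it.
inline int call_through_dependent_base() {
    D<Base> d;
    return d.f();
}
```

Calling `call_through_dependent_base()` returns 9001, because the lookup is deferred until `D<Base>` is instantiated and `Base::get` is known.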

15:39 Nikunj__ has joined #ste||ar

16:04 Nikunj__ has quit [Ping timeout: 256 seconds]

17:38 <Yorlik> zao: Following your example: Would this be valid too? https://gcc.godbolt.org/z/c_cX7E

17:41 <zao> Clang seems upset with that.

17:42 <zao> Personally, I have no idea :)

17:42 <zao> (about to head out for a walk right now too)

17:43 <Yorlik> Cheers! :)

18:03 <Yorlik> hkaiser: yt?

18:03 <Yorlik> Seems I'm getting HPX exceptions now (Debug mode): https://gist.github.com/McKillroy/a3bc0d9e3a5c9188b02ef9ee70cb83ce

18:03 <Yorlik> I'm using this code in several scopes : hpx::util::ignore_all_while_checking ignore_lock_checks;

18:04 <Yorlik> Gotta be off a while - BBL.

18:14 Nikunj__ has joined #ste||ar

18:23 gonidelis has quit [Remote host closed the connection]

21:07 nikunj has quit [Read error: Connection reset by peer]

21:08 nikunj has joined #ste||ar

21:16 nikunj97 has joined #ste||ar

21:19 Nikunj__ has quit [Ping timeout: 240 seconds]

21:22 <Yorlik> hkaiser: yt?

21:32 <hkaiser> Yorlik: here

21:32 <Yorlik> Hello !

21:32 <hkaiser> hey

21:32 <Yorlik> Did you see my question above?

21:33 <hkaiser> yah, but the stack-backtrace is useless ;-)

21:33 <Yorlik> Yup

21:33 <Yorlik> Any idea how I should debug / trace this?

21:33 <hkaiser> can you stop at that line and look at the backtrace yourself?

21:33 <Yorlik> You mean a breakpoint in HPX?

21:33 <hkaiser> set a break point on the HPX_THROW_EXCEPTION

21:34 <Yorlik> I'll do that !

21:34 <hkaiser> or let the debugger stop on any throw

21:35 <Yorlik> Alright - working on it.

21:35 <Yorlik> Next week I'm getting a threadripper :D

21:37 <Yorlik> There is none of my code in the call stack

21:40 <Yorlik> hkaiser: https://gist.github.com/McKillroy/3ff80387fd97e2fe91d172ab6ec3a6b2

21:41 <hkaiser> uhh, that does not make any sense :/

21:41 <hkaiser> perhaps something that was left locked on a previous thread exit (as you disabled all checks..)

21:41 <Yorlik> I'll restart and get the first exception

21:41 <hkaiser> I think we don't check for locked locks on thread exist (but we should)

21:42 <hkaiser> *thread exit*

21:42 <hkaiser> hold on, this _is_ on thread exit

21:42 <hkaiser> you left something locked and ignored it

21:43 <Yorlik> What do you mean with that?

21:43 <hkaiser> then re-enabled its tracking so it triggers at thread exit

21:43 <Yorlik> All my locks are implemented as lock_guard

21:43 <Yorlik> with a spinlock

21:43 <hkaiser> your thread exits with a locked lock hanging around

21:43 nikunj97 has quit [Ping timeout: 256 seconds]

21:43 <hkaiser> hmmm

21:44 <Yorlik> It's simply impossible that I am locking and exiting - or something crashes

21:44 <Yorlik> Every lock is strictly scoped

21:45 <hkaiser> Yorlik: the exception is triggered here: after the actual thread function returned

21:45 <hkaiser> well

21:46 <hkaiser> you could have ignored the lock, then suspended the hpx thread, it got resumed on a different core and the next hpx thread on the old core sees the locked lock and complains about it

21:46 <hkaiser> please reconsider leaving the lock locked while suspending, do you really need that?

21:47 <Yorlik> I could do several things

21:47 <hkaiser> I think the locks are ignored on a core-by-core basis

21:47 <hkaiser> (iirc)

21:47 <Yorlik> Maybe it's time for me to implement proper logging :)

21:48 <Yorlik> The locks helped me to get clear output while a lua state is being initialized

21:48 <Yorlik> So their output on init doesn't get intermingled

21:48 <Yorlik> I'll try to work around that.

21:49 <Yorlik> However: Should I make an issue out of this? I mean do you have a realistic chance to improve your checks to consider thread migration?

21:49 <hkaiser> not sure

21:49 <hkaiser> I've never seen this before

21:50 <Yorlik> I am pretty sure it is my init() function, since when I create many engines it happened faster and more often

21:50 <Yorlik> Let me remove that one lock and see what happens

21:50 <Yorlik> Actually I could enable it dynamically

21:51 <Yorlik> Because I do not need it always - only when debugging the engine initialization lua scripts

22:00 <hkaiser> Yorlik: yah, the lock registration is entirely thread_local

22:01 <hkaiser> I think I could work around that

22:01 <Yorlik> If you can do it with no or minimal overhead as an option for tricky cases I think that would be useful.

22:02 <Yorlik> After all it is meant for development, not production

22:03 <Yorlik> The good thing about this problem is, it forces me to reflect on every lock :D

22:06 <hkaiser> yes

22:15 <Yorlik> If a thread writes to a slot in a std::unordered_map and another reads, but it is guaranteed to be another slot - is that UB or safe?

22:36 <hkaiser> Yorlik: probably UB

22:37 <hkaiser> Yorlik: might be safe if the entries exist
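
To make the "might be safe if the entries exist" case concrete: the C++ standard's container data-race rules treat `find` and `at` as non-modifying for the container, and allow different elements to be modified concurrently, as long as nothing inserts, erases, or rehashes at the same time. For `std::unordered_map`, `operator[]` is not covered by that rule because it may insert. The sketch below uses plain `std::thread` rather than HPX tasks, and the function name and keys are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <unordered_map>

// Returns the value read from slot "b" while another thread writes slot "a".
inline int read_while_other_slot_is_written() {
    std::unordered_map<std::string, int> slots;
    // Create every entry before any thread starts, so no insertion or
    // rehash can happen while the threads are running.
    slots.emplace("a", 1);
    slots.emplace("b", 2);

    int seen = 0;
    // at() never inserts; operator[] could, which would modify the container.
    std::thread writer([&slots] { slots.at("a") = 10; });
    std::thread reader([&slots, &seen] { seen = slots.at("b"); });
    writer.join();
    reader.join();

    assert(slots.at("a") == 10);
    return seen;
}
```

If any thread may add or remove keys while others are reading, the map needs external synchronization or a different design.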

22:37 <Yorlik> Another portion to redesign.

22:37 <Yorlik> How can I store something on a task and query it inside ?

22:38 <Yorlik> Because that map is a crappy workaround

22:38 <Yorlik> I just need to store a reference to a lua engine

22:38 <Yorlik> So every update can get it from the chunks task

22:46 <hkaiser> Yorlik: you have a void* (size_t) at your disposal, I think we've talked about that

22:46 <hkaiser> I even created an example for you demonstrating how to delete things once the thread exited

22:47 <Yorlik> I think I didn't understand it yet.

22:47 <Yorlik> Dang - somehow I missed something.

22:47 <hkaiser> each thread can store a user defined size_t that it carries around for you

22:47 <hkaiser> you can set it and query it using the thread's id

22:48 <Yorlik> So basically I use that size_t as a raw pointer?

22:48 <hkaiser> yes

22:48 <Yorlik> Wow - you using a raw pointer ;)

22:48 <hkaiser> I'm using a size_t

22:49 <hkaiser> casting to/from void* is on you ;-)

22:49 <Yorlik> How else would it make any sense?

22:49 <hkaiser> sure, sure

22:49 <Yorlik> I think I'll create an object in on_start and delete it in on_exit

22:50 <hkaiser> right

22:50 <Yorlik> Is that mechanism already available in the lambdas?

22:50 <hkaiser> what lambdas?

22:50 <Yorlik> on_start and on_exit

22:50 <hkaiser> btw: I have a solution for the held locks problem, I think

22:50 <Yorlik> That custom executor I am using

22:50 <hkaiser> sure, on_start/on_end are executed by the hpx thread that will run the actual function

22:51 <Yorlik> I think the held locks exploding is somehow a good thing. It gives you a thorough warning on possible deadlocks

22:51 <Yorlik> Though - I guess if there's task switching involved they will eventually dissolve anyway

22:52 <Yorlik> Like the holding task will be rescheduled soonish

22:52 <hkaiser> Yorlik: yah, that's the purpose of the lock registration

22:53 <hkaiser> make you think twice

22:54 <Yorlik> Don't remove it - just make the switch to turn it off work. I think this is really dangerous territory, especially since it might just work for a while.

22:54 <Yorlik> So - that warning is really good, imo.

22:54 <hkaiser> Yorlik: you can already turn it off (through some hpx.register_locks=0 or somesuch)

22:55 <hkaiser> I will not remove it, I will fix the problem that the lock stays registered if the thread is resumed on a different core

22:55 <Yorlik> I won't do it - I'm too new to all this - being knocked here is a good thing - also I am rethinking suboptimal design choices..

23:05 <hkaiser> Yorlik: https://github.com/STEllAR-GROUP/hpx/pull/4610 should fix your issue

23:07 <Yorlik> Oh man - you are so insanely fast :D - I'll check it out after fixing my task local data issue

23:08 <Yorlik> So it's std::size_t set_thread_data(std::size_t data), I guess - it's returning the size_t it just received as a convenience, I assume?

23:33 <Yorlik> So essentially I'm doing this: hpx::this_thread::set_thread_data( reinterpret_cast<size_t>( task_data_p ) );

23:35 <hkaiser> yep

23:35 <hkaiser> set returns the previous value

23:36 <Yorlik> IC - makes sense

23:36 <Yorlik> This is going to simplify many things in a nice way by just using a tiny reinterpret cast ;)

23:37 <hkaiser> right

23:37 <Yorlik> on_exit:

23:37 <Yorlik> task_data* task_data_p = reinterpret_cast<task_data*>( hpx::this_thread::get_thread_data( ) );

23:37 <Yorlik> task_data_p->task_engine.reset(); // give back lua engine

23:37 <Yorlik> delete task_data_p;

23:37 <Yorlik> Controlled unsafety in a narrow space

23:38 <hkaiser> sure, wrap it in an object with RAII semantics
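
One way to read the RAII suggestion is a guard object that allocates the task data, publishes its address through the per-thread `size_t` slot, and undoes both in its destructor. The sketch below does not call HPX: the `sketch::set_thread_data` / `sketch::get_thread_data` functions are stand-ins written here, modelled only on what the log states (a `size_t` slot, and set returning the previous value). `task_data_guard`, `current_engine_id` and the `engine_id` field are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the per-thread slot (assumption: not the real HPX functions).
namespace sketch {
    inline std::size_t& slot() {
        thread_local std::size_t value = 0;
        return value;
    }
    // Mirrors the behaviour described in the log: returns the previous value.
    inline std::size_t set_thread_data(std::size_t data) {
        std::size_t previous = slot();
        slot() = data;
        return previous;
    }
    inline std::size_t get_thread_data() { return slot(); }
}

struct task_data {
    int engine_id;  // placeholder for the Lua engine handle
};

// Owns one task_data and exposes it through the slot while the guard lives.
class task_data_guard {
public:
    explicit task_data_guard(int engine_id)
      : data_(new task_data{engine_id})
      , previous_(sketch::set_thread_data(reinterpret_cast<std::size_t>(data_)))
    {}
    ~task_data_guard() {
        // The slot should still hold our pointer when the guard ends.
        assert(sketch::get_thread_data() == reinterpret_cast<std::size_t>(data_));
        sketch::set_thread_data(previous_);  // restore the earlier value
        delete data_;
    }
    task_data_guard(task_data_guard const&) = delete;
    task_data_guard& operator=(task_data_guard const&) = delete;

private:
    task_data* data_;       // declared first, so it is initialized first
    std::size_t previous_;
};

// What an update() step might do: read the slot and cast it back.
inline int current_engine_id() {
    auto* p = reinterpret_cast<task_data*>(sketch::get_thread_data());
    return p ? p->engine_id : -1;
}
```

With this shape, the `new`/`delete` pair and both casts live in one small class, and a missing guard shows up as a null slot (here reported as -1) rather than a dangling pointer. The pointer-to-`size_t` cast assumes `std::size_t` is wide enough to hold a pointer, which is true on common platforms; `std::uintptr_t` is the type the standard provides for that purpose.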

23:39 <Yorlik> How would I keep this object from on_start to on_exit? I think I really need new and delete here - don't I?

23:39 <hkaiser> have it in your wrapper class you use in the executor

23:39 <Yorlik> I was thinking about that, but it would make the executor less generic ofc.

23:40 <hkaiser> ok

23:40 <Yorlik> Though I could make the task_data type a template parameter

23:40 <hkaiser> give me a sec, need to look at the code

23:43 <hkaiser> you could make the hook_wrapper a template

23:43 <hkaiser> template parameter of the executor, that is

23:45 <Yorlik> Alright.

23:45 <Yorlik> I have the dangerous solution ready - testing now - then I'll work on making it safe

23:58 <Yorlik> Now that's interesting: It throws, because in update() I get a nullptr and on_start is not being called. I wonder if there could be a race