hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
jehelset has quit [Ping timeout: 250 seconds]
hkaiser has quit [Quit: Bye!]
jehelset has joined #ste||ar
jehelset has quit [Ping timeout: 250 seconds]
jehelset has joined #ste||ar
hkaiser has joined #ste||ar
nanmiao has quit [Quit: Client closed]
hkaiser_ has joined #ste||ar
hkaiser has quit [Ping timeout: 240 seconds]
<gonidelis[m]> "Under NUMA, a processor can access its own local memory faster than non-local memory..."
<gonidelis[m]> does local mean just cache or does it mean dram too?
<hkaiser_> gonidelis[m]: NUMA means 'Non-Uniform Memory Access', it's not related to caches
<hkaiser_> it's only referring to different access times to various parts of the main (dram) memory
<gonidelis[m]> ah great!
<gonidelis[m]> so my question was: how can access times differ, since dram is a concrete physical thing? i mean, it's not physically partitioned and allocated to processors, right?
<hkaiser_> gonidelis[m]: each socket usually has its own memory bus connected to its own dram memory
<hkaiser_> memory reads/writes to that part of the memory are faster than having to go through the socket interconnect and the other memory bus
<gonidelis[m]> yes
<hkaiser_> so memory is physically partitioned
<gonidelis[m]> how? different memory chips??
<gonidelis[m]> wow!
<hkaiser_> well, as said, each socket has its own
<gonidelis[m]> wow
<gonidelis[m]> ok amazing
<gonidelis[m]> thanks!
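The partitioning hkaiser_ describes can be inspected on Linux with numactl, assuming it is installed; a sketch (`./my_app` is a placeholder):

```shell
# Show the NUMA topology: one node per socket, each with its own local DRAM.
numactl --hardware

# Run a program with both its threads and its allocations pinned to node 0,
# so all memory traffic stays on that socket's local memory bus.
numactl --cpunodebind=0 --membind=0 ./my_app
```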
jehelset has quit [Ping timeout: 250 seconds]
<gnikunj[m]> hkaiser_: why would this code not work? https://godbolt.org/z/c8ba8bnaf (and what should I do to make it work)
<hkaiser_> well, print_first needs the Args... explicitly
<hkaiser_> also, Args... need to be last
<hkaiser_> is tuple fully constexpr nowadays? I guess it is
<gnikunj[m]> It is
<gnikunj[m]> if I move Args... to be last, how do I make the tuple a template argument?
<gnikunj[m]> do you mean specialize it with tuple?
jehelset has joined #ste||ar
<hkaiser_> gnikunj[m]: not sure if you can use std::tuple as an integral template argument
<gnikunj[m]> hkaiser_: what about something like this: https://godbolt.org/z/9Pr5xfW5j
<gnikunj[m]> (I want to achieve something like this^^)
<hkaiser_> do you need the tuple as an integral (non-type) template parameter
<gnikunj[m]> hkaiser_: not really. It was just for an example. But I need an argument pack and use that pack to instantiate another type (function pointer)
<hkaiser_> well, that's easy enough, I think
<gnikunj[m]> I could do it using the struct specialization trick you mentioned yesterday but I was wondering if this can be done with functions too
<hkaiser_> you can't specialize function templates
<gnikunj[m]> Yes, and that's bothering me
<hkaiser_> gnikunj[m]: what about: https://godbolt.org/z/1bWP719qs
<gnikunj[m]> <gnikunj[m]> "hkaiser_: what about something..." <- hkaiser_: see this.
<hkaiser_> see what?
<gnikunj[m]> The tuple part is simple if I pass it as an argument. I want a function pointer as a template parameter. https://godbolt.org/z/qoE63Mqqb
<gnikunj[m]> I can pass it as an argument but then I lose the serialization aspect of it.
<hkaiser_> you can't do that with functions as you can't specialize them
<hkaiser_> see how we have done it with types
<gnikunj[m]> Figured. Any workaround? I want to have it as a template value and not as a function argument.
<gnikunj[m]> hkaiser_: using the macros?
<hkaiser_> no macro
<gnikunj[m]> yeah, that's what I meant when I said struct specialization. Cool, I'll do something like that.
<hkaiser_> gnikunj[m]: with C++ this can be simplified by using template <auto F> struct;
<gnikunj[m]> Isn't that C++20?
<hkaiser_> yep, that's what I meant
<gnikunj[m]> I so wish I was using C++17 or C++20 ;_;
<gnikunj[m]> I had to write make_index_pack myself because I'm on C++11
<hkaiser_> hpx has it, just copy it from there
<hkaiser_> fairly efficient implementation, even
<gnikunj[m]> I should've known. I implemented it myself lol. It's O(N) but I think an O(log N) one can be implemented.
<hkaiser_> note, it's back in the main repo
<gnikunj[m]> It looks somewhat similar to what I've done
<gnikunj[m]> hkaiser_: Should I update the resiliency libraries with retry sender? Or should I add that to execution?
<gnikunj[m]> Also, what cmake 3.18 functions are we using?
<hkaiser_> gnikunj[m]: first add it to execution and then create a new version, allowing us to compare performance
<gnikunj[m]> Got it. I should be able to add one this week! I think I understand the code now too.
<gnikunj[m]> hkaiser_: Btw I leave tomorrow morning. Want to catch up again tonight?
<hkaiser_> gnikunj[m]: I think I can do that
<gnikunj[m]> hkaiser_: Perfect! Where do you want to go? Chimes again?
<hkaiser_> whatever you like
<gnikunj[m]> well you know better. Any recommendations?
<hkaiser_> none
<gnikunj[m]> lol. Let's go to chimes again then
<hkaiser_> ok
<gnikunj[m]> hkaiser_: I have a debug build and I'm trying to step into the code, but there still seem to be optimizations from the compiler's end (trying to debug one of the sender tests). Is there an additional flag I need to provide so that the compiler doesn't optimize anything and I can nicely view the call stack?
<hkaiser_> -Od should do the trick
<hkaiser_> sorry O0
<hkaiser_> -O0
<gnikunj[m]> So, I add that to CMAKE_CXX_FLAGS?
<hkaiser_> look at what command lines are being generated: make VERBOSE=1
<gnikunj[m]> I thought HPX was adding -O0 when building with debug flag
<hkaiser_> yah
<hkaiser_> it should
<gnikunj[m]> hkaiser_: there's no -O0 with CMAKE_BUILD_TYPE=Debug
<hkaiser_> uhh
<hkaiser_> the flags are generated by cmake, not us
<gnikunj[m]> Yeah, I thought setting debug on would ensure it also had -O0 but make VERBOSE=1 doesn't emit -O0 in the command
<gnikunj[m]> anyway, I've added that as an additional flag.
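For reference: with GCC/Clang, CMake's default CMAKE_CXX_FLAGS_DEBUG is just -g, and those compilers default to -O0 when no -O flag is given, so a Debug command line without an explicit -O0 is already unoptimized. Making it explicit anyway is harmless and documents the intent:

```shell
# Configure a Debug build with -O0 spelled out explicitly.
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS_DEBUG="-g -O0" ..

# Inspect the exact compiler command lines being generated.
make VERBOSE=1
```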
jehelset has quit [Ping timeout: 250 seconds]