hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
jehelset has quit [Ping timeout: 250 seconds]
hkaiser has quit [Quit: Bye!]
jehelset has joined #ste||ar
jehelset has quit [Ping timeout: 250 seconds]
jehelset has joined #ste||ar
hkaiser has joined #ste||ar
nanmiao has quit [Quit: Client closed]
hkaiser_ has joined #ste||ar
hkaiser has quit [Ping timeout: 240 seconds]
<gonidelis[m]> "Under NUMA, a processor can access its own local memory faster than non-local memory..."
<gonidelis[m]> does local mean just cache or does it mean dram too?
<hkaiser_> gonidelis[m]: NUMA means 'Non-Uniform Memory Access', it's not related to caches
<hkaiser_> it's only referring to different access times to various parts of the main (dram) memory
<gonidelis[m]> ah great!
<gonidelis[m]> so my question was: how can access times differ, since dram is a concrete physical thing? i mean, it's not physically partitioned and allocated to processors, right?
<hkaiser_> gonidelis[m]: each socket usually has its own memory bus connected to its own dram memory
<hkaiser_> memory reads/writes to that part of the memory are faster than having to go through the socket interconnect and the other memory bus
<gonidelis[m]> yes
<hkaiser_> so memory is physically partitioned
<gonidelis[m]> how? different memory chips??
<gonidelis[m]> wow!
<hkaiser_> well, as said, each socket has its own
<gonidelis[m]> wow
<gonidelis[m]> ok amazing
<gonidelis[m]> thanks!
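The partitioning hkaiser_ describes can be inspected on Linux with numactl, assuming it is installed; a sketch (`./my_app` is a placeholder):

```shell
# Show the NUMA topology: one node per socket, each with its own local DRAM.
numactl --hardware

# Run a program with both its threads and its allocations pinned to node 0,
# so all memory traffic stays on that socket's local memory bus.
numactl --cpunodebind=0 --membind=0 ./my_app
```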
jehelset has quit [Ping timeout: 250 seconds]
<gnikunj[m]> hkaiser_: why would this code not work? https://godbolt.org/z/c8ba8bnaf (and what should I do to make it work)
<hkaiser_> well, print_first needs the Args... explicitly
<hkaiser_> also, Args... need to be last
<hkaiser_> is tuple fully constexpr nowadays? I guess it is
<gnikunj[m]> It is
<gnikunj[m]> if I move Args... to be last, how do I make the tuple a template argument?
<gnikunj[m]> do you mean specialize it with tuple?
jehelset has joined #ste||ar
<hkaiser_> gnikunj[m]: not sure if you can use std::tuple as an integral template argument
<gnikunj[m]> hkaiser_: what about something like this: https://godbolt.org/z/9Pr5xfW5j
<gnikunj[m]> (I want to achieve something like this^^)
<hkaiser_> do you need the tuple as an integral (non-type) template parameter
<gnikunj[m]> hkaiser_: not really. It was just for an example. But I need an argument pack and use that pack to instantiate another type (function pointer)
<hkaiser_> well, that's easy enough, I think
<gnikunj[m]> I could do it using the struct specialization trick you mentioned yesterday but I was wondering if this can be done with functions too
<hkaiser_> you can't specialize function templates
<gnikunj[m]> Yes, and that's bothering me
<hkaiser_> gnikunj[m]: what about: https://godbolt.org/z/1bWP719qs
<gnikunj[m]> <gnikunj[m]> "hkaiser_: what about something..." <- hkaiser_: see this.
<hkaiser_> see what?
<gnikunj[m]> The tuple part is simple if I pass it as an argument. I want a function pointer as a template parameter. https://godbolt.org/z/qoE63Mqqb
<gnikunj[m]> I can pass it as an argument but then I lose the serialization aspect of it.
<hkaiser_> you can't do that with functions as you can't specialize them
<hkaiser_> see how we have done it with types
<gnikunj[m]> Figured. Any workaround? I want to have it as a template value and not as a function argument.
<gnikunj[m]> hkaiser_: using the macros?
<hkaiser_> no macro
<gnikunj[m]> yeah, that's what I meant when I said struct specialization. Cool, I'll do something like that.
<hkaiser_> gnikunj[m]: with C++ this can be simplified by using template <auto F> struct;
<gnikunj[m]> Isn't that C++20?
<hkaiser_> yep, that's what I meant
<gnikunj[m]> I so wish I was using C++17 or C++20 ;_;
<gnikunj[m]> I had to write make_index_pack myself because I'm on C++11
<hkaiser_> hpx has it, just copy it from there
<hkaiser_> fairly efficient implementation, even
<gnikunj[m]> I should've known. I implemented it myself lol. It's O(N) but I think an O(log N) one can be implemented.
<hkaiser_> note, it's back in the main repo
<gnikunj[m]> It looks somewhat similar to what I've done
<gnikunj[m]> hkaiser_: Should I update the resiliency libraries with retry sender? Or should I add that to execution?
<gnikunj[m]> Also, what cmake 3.18 functions are we using?
<hkaiser_> gnikunj[m]: first add it to execution and then create a new version, allowing us to compare performance
<gnikunj[m]> Got it. I should be able to add one this week! I think I understand the code now too.
<gnikunj[m]> hkaiser_: Btw I leave tomorrow morning. Want to catch up again tonight?
<hkaiser_> gnikunj[m]: I think I can do that
<gnikunj[m]> hkaiser_: Perfect! Where do you want to go? Chimes again?
<hkaiser_> whatever you like
<gnikunj[m]> well you know better. Any recommendations?
<hkaiser_> none
<gnikunj[m]> lol. Let's go to chimes again then
<hkaiser_> ok
<gnikunj[m]> hkaiser_: I have a debug build and I'm trying to step into the code, but there still seem to be optimizations from the compiler's end (trying to debug one of the sender tests). Is there an additional flag I need to provide so that the compiler doesn't optimize anything and I can nicely view the call stack?
<hkaiser_> -Od should do the trick
<hkaiser_> sorry O0
<hkaiser_> -O0
<gnikunj[m]> So, I add that to CMAKE_CXX_FLAGS?
<hkaiser_> look at what command lines are being generated: make VERBOSE=1
<gnikunj[m]> I thought HPX was adding -O0 when building with debug flag
<hkaiser_> yah
<hkaiser_> it should
<gnikunj[m]> hkaiser_: there's no -O0 with CMAKE_BUILD_TYPE=Debug
<hkaiser_> uhh
<hkaiser_> the flags are generated by cmake, not us
<gnikunj[m]> Yeah, I thought setting debug on would ensure it also had -O0 but make VERBOSE=1 doesn't emit -O0 in the command
<gnikunj[m]> anyway, I've added that as an additional flag.
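For reference: with GCC/Clang, CMake's default CMAKE_CXX_FLAGS_DEBUG is just -g, and those compilers default to -O0 when no -O flag is given, so a Debug command line without an explicit -O0 is already unoptimized. Making it explicit anyway is harmless and documents the intent:

```shell
# Configure a Debug build with -O0 spelled out explicitly.
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS_DEBUG="-g -O0" ..

# Inspect the exact compiler command lines being generated.
make VERBOSE=1
```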
jehelset has quit [Ping timeout: 250 seconds]