hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
<hkaiser>
john98zakaria[m]: using a requirements.txt file is an option, but we could also pin the versions in the dockerfile itself
<hkaiser>
john98zakaria[m]: but I agree, let's build the image locally and once that works, update the one in the repository
<john98zakaria[m]>
hkaiser: Yes but no
<john98zakaria[m]>
Because the pinned versions may themselves have something like numpy < 1.22
<john98zakaria[m]>
The safest way is to pin everything
<john98zakaria[m]>
So they may pull a non-deterministic version of numpy based on the interdependencies
<hkaiser>
ok, fair point
<john98zakaria[m]>
I work as a Python dev; I have been bitten numerous times
<hkaiser>
john98zakaria[m]: I appreciate your experience, for sure
<hkaiser>
any help you can give is highly appreciated as well
<srinivasyadav227>
hkaiser: I am working on integrating SVE into HPX. There is some problem with using (SVE) targets from cmake_fetch_content directly with INSTALL rules in HPX's CMake
<sestro[m]>
Should using `hpx::compute::host::block_executor<>` and `hpx::compute::host::block_allocator<value_type>` be enough to make algorithms such as `hpx::transform` NUMA-aware, or are any additional modifications required for that?
<srinivasyadav227>
yes, those are enough
<srinivasyadav227>
use the executor with the policy, like `policy.on(executor)`
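A minimal sketch of that combination (headers and constructor signatures assume a recent HPX and may differ between versions):

```cpp
// Sketch only: bind both the allocation and the work to the NUMA domains
// reported by the topology, then hand the executor to the algorithm.
#include <hpx/algorithm.hpp>
#include <hpx/hpx_init.hpp>
#include <hpx/include/compute.hpp>

int hpx_main()
{
    using value_type = double;
    using allocator_type = hpx::compute::host::block_allocator<value_type>;
    using executor_type = hpx::compute::host::block_executor<>;

    auto numa_targets = hpx::compute::host::numa_domains();
    allocator_type alloc(numa_targets);   // data distributed over the domains
    executor_type exec(numa_targets);     // work scheduled on the same domains

    hpx::compute::vector<value_type, allocator_type> v(1000000, 0.0, alloc);

    hpx::transform(hpx::execution::par.on(exec), v.begin(), v.end(), v.begin(),
        [](value_type x) { return x + 1.0; });

    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::init(argc, argv);
}
```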
<sestro[m]>
srinivasyadav227: Okay, thanks. I am under the impression that the tasks are not scheduled properly in many cases, but maybe something else is interfering.
<sestro[m]>
srinivasyadav227: Thanks for the link, so what I'm using right now should be fine.
<srinivasyadav227>
cool!
<srinivasyadav227>
hkaiser: A fresh configure didn't work; it gave me the same error. But after adding `-DHPX_WITH_PKGCONFIG=OFF`, everything worked perfectly
<srinivasyadav227>
hkaiser: could you tell me why it's happening this way? thanks!
<sestro[m]>
I'm seeing a large imbalance between the threads (almost a factor of 2 in some cases) and I'm running out of ideas what could cause this.
<srinivasyadav227>
Imbalance? You mean the time taken by each thread?
<srinivasyadav227>
sestro: I mean the time spent by each thread doing the work?
<sestro[m]>
srinivasyadav227: Yes. The chunks should be the same size, so I'd expect the runtime per task to be similar. But this is not the case.
<srinivasyadav227>
sestro: can you try passing `--hpx:numa-sensitive` to the application (on the command line)?
<sestro[m]>
srinivasyadav227: Tried that before, did not have any measurable impact.
<K-ballo>
kinda, member functions don't have addresses
<satacker[m]>
K-ballo: Ohh
<gonidelis[m]>
K-ballo: so how does it bind?
<satacker[m]>
K-ballo: Ohh, right, so it would have been `void (*)(int)` otherwise
<K-ballo>
what do you mean by that, how is it internally implemented by the usual compilers?
<K-ballo>
there's variability in implementation across compilers, plus it varies for class data member pointers and class function member pointers, and for the latter it also matters virtual vs non-virtual, and the kind of inheritance hierarchy (single, multiple, diamond)
<K-ballo>
the important thing is that when you bind it later on, the resulting execution is as if you had written `xobject.f(20)`
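A minimal sketch of the mechanics, using the names from the conversation (X, f, xobject, ptfptr):

```cpp
#include <iostream>

struct X
{
    void f(int i) { std::cout << "X::f(" << i << ")\n"; }
};

int main()
{
    // a pointer to a member function of class X; note it is not a void (*)(int)
    void (X::*ptfptr)(int) = &X::f;

    X xobject;
    (xobject.*ptfptr)(20);   // executes as if you had written xobject.f(20)
}
```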
<gonidelis[m]>
K-ballo: I am asking how xobject is aware of `*ptfptr`
<K-ballo>
what does "aware" mean??
<K-ballo>
xobject doesn't need to know anything about ptfptr
<K-ballo>
and it's `.*`, it's its own token, not separable
<gonidelis[m]>
lol true
<gonidelis[m]>
just seems weird that pointer to mem function was declared before the very object itself
<K-ballo>
it's a pointer to *class*, not a pointer to *object*
<gonidelis[m]>
sheesh!
<gonidelis[m]>
no i get it
<K-ballo>
it doesn't point into xobject, it points into X
<gonidelis[m]>
and what is the use case of having a pointer to a class? why would you want sth like that?
<gonidelis[m]>
now*
<satacker[m]>
Is it even a pointer if we cannot get its address?
<K-ballo>
pointer to a class member
<K-ballo>
satacker[m]: a pointer to member type is not a pointer type, C++ is fun like that
<gonidelis[m]>
ahhhh it's pointer to class member. not object member!!!!!!!!!
<K-ballo>
no
<K-ballo>
it's a pointer to class object or function member
<K-ballo>
sorry, data
<gonidelis[m]>
make a decision already
<K-ballo>
pointer to class data member or function member
<gonidelis[m]>
ok ok ok yes
<gonidelis[m]>
yes
<gonidelis[m]>
yes
<K-ballo>
a data member is a subobject, which is also an object
<gonidelis[m]>
yes
<gonidelis[m]>
so again: why would you need that ?
<K-ballo>
the data implementation is actually fairly simple, it's just an offset
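A small sketch of the data-member case (hypothetical struct): the pointer designates a member of the class, conceptually an offset, and is applied to whatever object you pair it with:

```cpp
#include <iostream>
#include <string>

struct X
{
    std::string name;
    int value = 0;
};

int main()
{
    // "points into X", not into any particular object; conceptually an offset
    std::string X::*pname = &X::name;

    X a{"first", 1};
    X b{"second", 2};

    std::cout << a.*pname << '\n';   // applies the offset to a: prints "first"
    std::cout << b.*pname << '\n';   // same pointer, different object: "second"
}
```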
<gonidelis[m]>
he just mentioned that like an hour ago
<gonidelis[m]>
i forgot immediately
<K-ballo>
that's the most mundane use case I can think of
<gonidelis[m]>
SAME EXAMPLE HE GAVE!
<K-ballo>
of course, that's the canonical projection example
<gonidelis[m]>
wait
<satacker[m]>
It's used in pybind11 based bindings pretty heavily
<gonidelis[m]>
?
<gonidelis[m]>
the projection needs to be a pointer to a class mem fun because it operates on the very object that the (external) algo (e.g. sort) is operating on
<K-ballo>
the projection needs to be a callable
<gonidelis[m]>
ok
<gonidelis[m]>
?
<K-ballo>
a pointer to class member is a callable
<K-ballo>
callable = function object or pointer to member
<gonidelis[m]>
so?
<gonidelis[m]>
oh ok!
<K-ballo>
it doesn't NEED to be a pointer to class mem fun
<K-ballo>
plus the example I just gave isn't, it's a pointer to class mem data
<K-ballo>
but it doesn't need to be a pointer to class at all
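A small sketch of that point: std::invoke (the INVOKE protocol) accepts a function object, a pointer to a member function, or a pointer to a data member interchangeably (Employee and get_name are hypothetical names here):

```cpp
#include <functional>
#include <string>

struct Employee
{
    std::string name;
    std::string const& get_name() const noexcept { return name; }
};

int main()
{
    Employee e{"Ada"};

    auto lam = [](Employee const& x) -> std::string const& { return x.name; };

    // all three are "callables" in the INVOKE sense
    std::invoke(lam, e);                  // lam(e)
    std::invoke(&Employee::get_name, e);  // e.get_name()
    std::invoke(&Employee::name, e);      // e.name (pointer to data member)
}
```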
<gonidelis[m]>
ok ok.... what's the benefit of pointer to memfun compared to FOs?
<gonidelis[m]>
(yes ^^)
<K-ballo>
those are different things, how would we compare them?
<K-ballo>
you mean by creating the equivalent function object for each member of a class?
<gonidelis[m]>
they are both callables. in what case would you use the memfun over an FO?
<K-ballo>
I can't make sense of the question
<gonidelis[m]>
why would you need a pointer to a mem fun
<gonidelis[m]>
that's the question
<gonidelis[m]>
you said projections
<gonidelis[m]>
the question remains the same
<gonidelis[m]>
why?
<K-ballo>
are you asking why I say `&Employee::name` instead of something like `[](Employee const& e) -> std::string const& { return e.name; }`?
<K-ballo>
or in a 98 style, `struct employee_name_fo { std::string const& operator()(Employee const& e) const noexcept { return e.name; } };`
<K-ballo>
and in the fo case I'd also have to add all the different cv-ref qualification overloads (or deduce-this)
<gonidelis[m]>
the employee example is not exactly what I want because it's a data member
<K-ballo>
fine, make it `&Employee::get_name`
<gonidelis[m]>
'ight
<gonidelis[m]>
could you implement it as a lambda instead?
<gonidelis[m]>
with captures and all?
<K-ballo>
sure, and as a function object too
<K-ballo>
there are subtle differences between the different implementations, but for this particular use case it doesn't matter
<K-ballo>
so I choose the pmf because it's short, way shorter
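The three spellings from the messages above, used as projections side by side; a sketch only, using C++20 std::ranges::sort for brevity:

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Employee
{
    std::string name;
    std::string const& get_name() const noexcept { return name; }
};

// the 98-style function object spelling of the same projection
struct employee_name_fo
{
    std::string const& operator()(Employee const& e) const noexcept
    {
        return e.name;
    }
};

int main()
{
    std::vector<Employee> staff{{"Carol"}, {"Ada"}, {"Bob"}};

    // all three projections do the same thing; the pointer-to-member spelling
    // is simply the shortest
    std::ranges::sort(staff, {}, &Employee::get_name);
    std::ranges::sort(staff, {},
        [](Employee const& e) -> std::string const& { return e.name; });
    std::ranges::sort(staff, {}, employee_name_fo{});
}
```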
<gonidelis[m]>
yes
<gonidelis[m]>
ok
<gonidelis[m]>
so pmf just points to sth that already exists and already has access to the data members
<K-ballo>
the only pmf case that can't be replaced by a lambda or function object is switching names
<gonidelis[m]>
so instead of declaring a whole FO from scratch you would use pmf
<gonidelis[m]>
because ... ?
<K-ballo>
ptrfpt could point to one function now, and a different function on the next line
<K-ballo>
the lambdas and function object equivalents have the name hard coded, different names require different types
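A sketch of the "switching names" point: one pointer-to-member variable can designate different members at runtime, while each lambda or function object hard-codes a single name in its type:

```cpp
#include <string>

struct Employee
{
    std::string name;
    std::string title;
    std::string const& get_name() const noexcept { return name; }
    std::string const& get_title() const noexcept { return title; }
};

int main()
{
    Employee e{"Ada", "Engineer"};

    // one variable, one type; which member it designates can change at runtime
    std::string const& (Employee::*pmf)() const noexcept = &Employee::get_name;
    (e.*pmf)();                   // calls get_name
    pmf = &Employee::get_title;   // same pointer type, different member
    (e.*pmf)();                   // now calls get_title

    // the equivalent lambdas would be distinct closure types, one per name,
    // so "switching" would need a branch or type erasure (e.g. std::function)
}
```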
<gonidelis[m]>
ultimately I want to know if `type(f)(args)` is a temporary or not and what is the benefit of invoking it that way instead of `type f; f(args);` given that `type` overloads the `operator()` of course
<hkaiser>
gonidelis[m]: type(f) creates a temporary instance of type
<gonidelis[m]>
But u can still use f autonomously afterwards ?
<hkaiser>
you could certainly do `type f1 = f; f1(args);` which would be equivalent (almost)
<K-ballo>
I don't think I've ever heard of Panos before, have I?
<gonidelis[m]>
i just pinged him... new guy on the block
<K-ballo>
haven't met him
<pansysk75[m]>
yes, I came here from GSoC and am working with gonidelis. I will be around and more active in this chat once I catch up with the introductory stuff
<pansysk75[m]>
nice to meet you
<K-ballo>
hi pansysk75[m], so you are Panos
<K-ballo>
if one assumes `type` to be a class type then yes, there's a difference in that `type(f)` is an rvalue and `f` is an lvalue
<K-ballo>
but then it would not be unrelated to the invoke macro implementation
<K-ballo>
dah, it would be unrelated :)
<gonidelis[m]>
the question is: can we initialize (or declare, I don't know the proper wording) f like `type(f)` and use the `operator()` immediately? hence, `type(f)()`?
<gonidelis[m]>
seems like no
<K-ballo>
a declaration is a statement, an invocation is an expression
<gonidelis[m]>
cannot mingle them, I guess?
<K-ballo>
you can, but not in that particular way
<K-ballo>
what would be the point anyway?
<gonidelis[m]>
ok ok yes
<gonidelis[m]>
the bottom question was: is ` (::hpx::util::detail::invoke<decltype((F))>(F)(__VA_ARGS__))` a temporary object here?
<K-ballo>
that entire expression is equivalent to the result of the invocation, so that would depend on the return type of the selected operator
<K-ballo>
you probably mean to ask something else
<gonidelis[m]>
and if yes: 1. what's the point of handling it as a temp instead of unrolling it and using it as an lvalue (primary question)?
<gonidelis[m]>
2. Could we do F(args) later on after that expression? (secondary question just for the sake of C++ knowledge)
<K-ballo>
unrolling and using it as an lvalue??
<gonidelis[m]>
Hartmut's suggestion
<K-ballo>
uh?
<gonidelis[m]>
<hkaiser> "you could certainly to type f1 =..." <- i meant this
<K-ballo>
that'd be almost equivalent, the almost would matter in a tool like invoke
<K-ballo>
you can't lvalue-ize the target in invoke, or you may end up selecting lvalue overloads for an rvalue
<K-ballo>
you must perfectly forward the target, which is what the cast is doing
<K-ballo>
you also wouldn't be able to have variables declared in a macro (not in the way you'd want them to work)
<gonidelis[m]>
yes yes of course
<gonidelis[m]>
ok perfect explanation
<gonidelis[m]>
it's all about std::invoke then
<K-ballo>
no, not really
<gonidelis[m]>
SORRY I MEANT `HPX_INVOKE` WHICH IS SUPER MORE EFFICIENT THAN std::invoke
<K-ballo>
no, not really either
<gonidelis[m]>
hahahahahahha
<K-ballo>
if you have a target callable which doesn't overload on value category, then it doesn't matter which value category you use, there's always one candidate possible
<K-ballo>
if you have a target callable which overloads on value category, invoking it on the wrong value category will give you a compilation error if you are lucky, or select a bad candidate if not
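A sketch of that value-category point: if the target overloads operator() on & vs &&, an invoke-like tool that turns the target into an lvalue picks the wrong overload (or fails to compile, if only the && overload exists):

```cpp
#include <utility>

struct callable
{
    void operator()() &  {}   // chosen when invoked on an lvalue
    void operator()() && {}   // chosen when invoked on an rvalue
};

template <typename F>
void lvalueizing_invoke(F&& f)
{
    f();                   // f is a named variable, i.e. an lvalue: & overload
}

template <typename F>
void forwarding_invoke(F&& f)
{
    std::forward<F>(f)();  // preserves the caller's value category
}

int main()
{
    callable c;
    forwarding_invoke(c);            // & overload
    forwarding_invoke(callable{});   // && overload
    lvalueizing_invoke(callable{});  // silently selects the & overload instead
}
```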
<gonidelis[m]>
rumor has it you provided us with HPX_INVOKE for better perf
<K-ballo>
no, just for less bloat
<K-ballo>
there's no performance difference whatsoever in an optimized build
<gonidelis[m]>
K-ballo: guh this would be awful
<K-ballo>
note: `(F)` is an expression, `decltype((F))` is thus always a reference type,
<K-ballo>
`static_cast<decltype((F))&&>(F)` is effectively std::forward,
<K-ballo>
`T(F)` is a C-style cast, which is equivalent to `static_cast<T>(F)` if that's well formed, so in this case it is also effectively forward
<K-ballo>
the macro basically says `std::forward<F>(f)(std::forward<Args>(args)...)`
<K-ballo>
I believe we have a forward macro now? so `HPX_FWD(F)(HPX_FWD(args)...)`
<gonidelis[m]>
yes we do
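A hedged sketch of the mechanics described above (not the actual HPX_INVOKE/HPX_FORWARD definitions): for an lvalue argument, decltype((F)) is T&, so T& && collapses back to T&, while an rvalue argument yields T&&, which is exactly what std::forward would produce:

```cpp
// Sketch only, not the actual HPX macros.
#define SKETCH_FWD(F) static_cast<decltype((F))&&>(F)
#define SKETCH_INVOKE(F, ...) (SKETCH_FWD(F)(__VA_ARGS__))

struct callable
{
    int operator()(int i) &  { return i; }    // lvalue target
    int operator()(int i) && { return -i; }   // rvalue target
};

int main()
{
    callable c;
    int a = SKETCH_INVOKE(c, 1);           // cast yields callable&:  returns 1
    int b = SKETCH_INVOKE(callable{}, 1);  // cast yields callable&&: returns -1
    return a + b;   // 0
}
```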
<sestro[m]>
<hkaiser> "the allocator makes sure that..." <- Yes, I'm using both components. The allocation seems to work just fine, at least according to some crude testing via `get_mempolicy()` on different segments of the allocated memory.
<gonidelis[m]>
hkaiser: K-ballo: is the exec_policy argument on C++17 parallel algorithms a restriction or a suggestion on the execution?
<gonidelis[m]>
does std::execution::par suggest parallel execution if possible or does it require it?
<hkaiser>
it's a restriction on the lambda's execution
<hkaiser>
it says that the lambda is allowed to be executed out of order
<hkaiser>
and possibly concurrently
<hkaiser>
also, it says that the lambda is allowed to synchronize between its invocations
<K-ballo>
std::execution::par tells the algorithm that it can call the predicate in parallel, it requires nothing
<hkaiser>
seq says that the lambda needs in-order, non-concurrent execution
<gonidelis[m]>
calling concurrent execution more restrictive is sth i wouldn't think of
<hkaiser>
it's less restrictive compared to seq
<gonidelis[m]>
ok got it
<gonidelis[m]>
but the algo may as well opt to execute the thing sequentially if, say, you provide fwd iterators
<gonidelis[m]>
even though you provided par, as the first arg
<hkaiser>
yep
<hkaiser>
par does not require parallel execution, it merely allows it
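A small std::for_each sketch of that last point (hpx::execution::par has the same semantics):

```cpp
#include <algorithm>
#include <execution>
#include <vector>

int main()
{
    std::vector<int> v(1000000, 1);

    // std::execution::par only *allows* the lambda to run out of order and
    // possibly concurrently; the implementation may still run it sequentially
    // (e.g. for unsuitable iterators or small inputs). The lambda must
    // therefore be safe to call concurrently, but nothing is guaranteed to
    // actually run in parallel.
    std::for_each(std::execution::par, v.begin(), v.end(),
        [](int& x) { x += 1; });
}
```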