hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 260 seconds]
hkaiser has quit [Quit: Bye!]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 276 seconds]
K-ballo1 is now known as K-ballo
HHN93 has joined #ste||ar
<HHN93> does the c++ std suggest against execution policies for inner product?
<HHN93> haven't found an overload which accepts execution policy for std and hpx versions
Yorlik_ is now known as Yorlik
HHN93 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
<HHN93> I have tried running copy and move algorithms on my machine for 1M and 1000M elements with seq, par execution policies and got the same execution time.
<HHN93> 16 core machine, other algorithms like for_each are being parallelised.
<HHN93> are the copy and move algorithms not parallelised?
HHN93 has quit [Quit: Client closed]
<pansysk75[m]> HHN93: You are referring to the algorithms under the hpx namespace (for example hpx::copy), correct? Those should run in parallel when passed a par execution policy, afaik
hkaiser has joined #ste||ar
<pansysk75[m]> In case you are referring to the std:: namespace, that will depend on the compiler, but implementations of parallel algorithms are generally lacking (passing std::execution::par and such will still compile, but it will nevertheless still run sequentially)
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 276 seconds]
K-ballo1 is now known as K-ballo
diehlpk_work has joined #ste||ar
HHN93 has joined #ste||ar
<HHN93> is hpx::copy parallelised by calling futures on chunks?
<hkaiser> HHN93: yes
<HHN93> ok
<HHN93> how do we plan to implement unseq on such algorithms
<hkaiser> HHN93: btw, the parallel version of inner_product is transform_reduce
<HHN93> such => those which use futures to be parallelised
<hkaiser> (similarly to reduce being the parallel version of accumulate)
<hkaiser> HHN93: using unseq(task) should do the trick
<HHN93> ok sounds simple, will try looking into it
<hkaiser> HHN93: well, sorry I might have misunderstood your question
<hkaiser> par_unseq does use tasks to run the loop functions on chunks
<HHN93> oh ok, loop body in the future call is going to be vectorized by pragmas?
<HHN93> cool, will look into it
<HHN93> thank you
<HHN93> hpx::generate is parallelised by calling std generate on chunks
<HHN93> is that the best way of doing it?
<HHN93> I am not sure how unseq can be implemented if we are calling std::generate
<HHN93> ok wait my bad, I think I confused something. it does not call std generate
HHN93 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
<hkaiser> HHN93: we'd have to reimplement sequential generate, I think it was already done for the simd policy
<HHN93> ok will have a look at it
<hkaiser> HHN93: not sure where you have seen the use of std::generate
<hkaiser> ok, that needs implementing by dispatching to the appropriate loop function
<srinivasyadav18[> yes, we could use util::loop_ind, the same way we used it here : https://github.com/STEllAR-GROUP/hpx/blob/master/libs/core/algorithms/include/hpx/parallel/algorithms/detail/generate.hpp#L24, which takes advantage of loop unrolling
<HHN93> can someone help me understand how the hpx copy algorithm works at this point https://github.com/STEllAR-GROUP/hpx/blob/master/libs/core/algorithms/include/hpx/parallel/algorithms/copy.hpp#L420
<HHN93> I am guessing we are trying to generate futures for each partition
<hkaiser> HHN93: it uses foreach_partitioner
<hkaiser> foreach_partitioner divides the input sequence into chunks and runs the lambda on each of those
<HHN93> so it generates a future to execute each partition, right?
<HHN93> ok got it
<HHN93> the get_in_out_result is responsible for synchronisation of the futures?
<hkaiser> whether it generates a future for each chunk (partition) or not depends on the used executor
<hkaiser> the default executor associated with par generates one HPX task per used core and synchronizes everything using a single future
<hkaiser> HHN93: no, it should stop here: https://github.com/STEllAR-GROUP/hpx/blob/master/libs/core/algorithms/include/hpx/parallel/util/foreach_partitioner.hpp#L40 (if used with par, par_simd, or par_unseq)
<HHN93> so one generates a future and the other one doesn't, depending on the executor
HHN93 has quit [Quit: Client closed]
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 246 seconds]
K-ballo1 is now known as K-ballo
tufei has joined #ste||ar
<Isidoros[m]> Hello, I have a question about relocation semantics:
<Isidoros[m]> When an object that manages some heap memory (e.g. `unique_ptr` or `vector`) is relocated, a bitwise copy of it is created with `memcpy` or `memmove`, and we end up with two objects that manage the same memory buffer. When the original (e.g.) `unique_ptr` leaves the scope, will it not call free() on the pointer that both `unique_ptr`s hold?
<Isidoros[m]> Looking at facebook folly's fbvector I cannot see how this is handled in case T is a `unique_ptr`. (Note that a `unique_ptr` is a "trivially relocatable" type)
<Isidoros[m]> Here is a snippet from facebook's "folly" implementation of reallocation for fbvector: https://pastebin.com/iQFR040j
<hkaiser> Isidoros[m]: no, that's not what happens
sivoais has quit [Ping timeout: 256 seconds]
<hkaiser> unique_ptr and vector behave differently when being assigned/moved
<hkaiser> copy-assigning a vector to another one will copy the data it holds; moving a vector will hand over the buffer to the vector it is assigned to and set its own internal pointer to nullptr
<hkaiser> copy-assigning a unique_ptr is not allowed; moving it will hand over the internal pointer to the object it is assigned to
<Isidoros[m]> I was referring to relocation, which as far as I understand does not alter the source object in any way.
<hkaiser> I don't know what 'relocation' is in terms of C++
<hkaiser> you can either copy or move an object
<hkaiser> ahh
<hkaiser> makes sense now
<hkaiser> relocation is equivalent to a bitcopy assuming the source object's destructor will not be called
<hkaiser> i.e. it's semantically equivalent to moving and immediately calling the source's destructor
<Isidoros[m]> I see, so how can we stop the destructor from being called?
<hkaiser> delete [] (char*)ptr; instead of delete[] ptr; ?
<Isidoros[m]> facebook's library doesn't even bother with the deallocation
<hkaiser> right
<Isidoros[m]> but it should, right?
<Aarya[m]> Hi, so I started writing the proposal for "hpxMP: HPX threading system for LLVM OpenMP". I had a question: does the project consist only of adding the hpxc calls (and other symbols) in llvm/openmp in place of all the pthread calls?
<Aarya[m]> Or are there some pthread symbols not implemented in hpxc?
sivoais has joined #ste||ar
<hkaiser> Aarya[m]: most likely not all of the APIs needed are implemented in hpxc yet
<Aarya[m]> Ah okay
<hkaiser> Aarya[m]: for instance, thread attributes are not in place, iirc
<Aarya[m]> So the hpx thread calls are to be implemented using the hpx library
<hkaiser> Aarya[m]: yes
diehlpk_work has quit [Ping timeout: 248 seconds]
<sarkar_t[m]> Hi hkaiser gonidelis ! I am Tanmay Sarkar, a 2022 graduate of the Electrical Engineering Department of IIT Roorkee, and I have been working as a backend developer for around 8 months now. Since the final year of my bachelor's degree I have really wanted to get involved with the HPX project, but couldn't do so because I was engaged in other projects.
<sarkar_t[m]> But with those things out of the way I want to start by contributing to the GSoC project "(Re-)Implement executor API on top of sender/receiver infrastructure"
<sarkar_t[m]> So, I have a few initial questions about this project. https://github.com/STEllAR-GROUP/hpx/pull/5758 this PR is linked to the project description.
<sarkar_t[m]> So, this PR is about adding `completion_signature`. Is `completion_signature` similar to `completion_handler`, which is kind of what receivers are in the S/R proposal (saying from what I understand about the S/R architecture so far)?
<hkaiser> sarkar_t[m]: welcome
<hkaiser> completion_signatures are a means for receivers to figure out what types connected senders will provide
<hkaiser> also, for the project you mentioned, some work has been done in the meantime, but I think we can extend it to using s/r for implementing algorithms
<sarkar_t[m]> hkaiser: If this is the case, does "adding facilities to support completion_signatures" mean adding support for methods like `set_value`, `set_error` and `set_done` which are basically the functions that the operation state calls to notify the receiver?
<hkaiser> these three functions are being invoked by senders on their connected receiver
<hkaiser> completion signatures are exposed by a sender encoding what types it will send through set_value/set_error, and whether it exposes set_stopped
<sarkar_t[m]> hkaiser: Is the work that has been done, related to https://github.com/STEllAR-GROUP/hpx/pull/5758 this PR, or there is more work as well?
<hkaiser> but what I meant would involve changing the existing algorithms to support s/r based executors
<sarkar_t[m]> hkaiser: Can you please explain a bit what you mean by "what types it will send" or maybe link something related to this that I can refer to for understanding the implication of the quoted part of your statement?
<hkaiser> like done in the same commit for for_each, for instance:
<hkaiser> set_value(...) takes some arguments representing the result of a sender's operation
<hkaiser> those arguments can be of arbitrary type, the completion signatures of a sender expose those types
<sarkar_t[m]> hkaiser: Okay
<sarkar_t[m]> hkaiser: Thank you for this. I will look into it, understand this, and get back with further queries.
<hkaiser> sarkar_t[m]: the whole of this document is the main source of information about s/r
<sarkar_t[m]> Yes, I did look at it from the top. Now will look into it in detail
<sarkar_t[m]> <hkaiser> "but what I meant would involve..." <- Okay, so basically the underlying support for S/R architecture is there in HPX, and in this project I need to do changes in the existing parallel algorithms in HPX so that they make use of the S/R architecture, am I right in saying this hkaiser ?
<hkaiser> yes