<hkaiser>
diehlpk_work: we probably should, this repo is automatically populated, however, so we might need to do some scripting
akheir has quit [Read error: Connection reset by peer]
akheir has joined #ste||ar
<hkaiser>
diehlpk_work: done, let's see if it gets overwritten
hkaiser has quit [Quit: bye]
weilewei has quit [Remote host closed the connection]
bita has joined #ste||ar
bita has quit [Quit: Leaving]
nan1 has quit [Remote host closed the connection]
akheir has quit [Quit: Leaving]
mdiers_ has quit [Remote host closed the connection]
mdiers_ has joined #ste||ar
<wash[m]>
Sorry folks, what did you need my consent for?
<wash[m]>
Ah for the JOSS submission? That's fine :)
<simbergm>
hkaiser: thanks for adding the license
<simbergm>
I think it should stay there, if I remember correctly what the scripts do
<heller1>
so I just noticed one thing ...
<heller1>
... and I think hpx::init needs to go. Here is the reason: each test that requires HPX threads to run needs to go through hpx_init ... creating nice cycles ...
<simbergm>
heller: if anything the tests need to be rewritten not to use hpx::init
<simbergm>
but it's only tests, the modules otherwise don't have that dependency
<simbergm>
and kicking out hpx::init just for that reason feels user-hostile
<simbergm>
but there might be good solutions
<simbergm>
do we have a way of making a future<void> from a future<tuple<future<void>, future<void>>> (returned from e.g. when_all) without spawning a task?
<simbergm>
unwrap will actually wait for the future, but I'd like to just collapse it into a future<void>
<heller1>
future<future<T>> -> future<void> works
<heller1>
but we don't have a way to further inspect that
<heller1>
ms[m]: there's split_future, if that helps
<simbergm>
yeah, I guess split future would actually do the right thing, even if it's semantically a bit iffy
<simbergm>
thanks!
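
A minimal sketch of the split_future route discussed above, assuming the HPX API of that era (hpx::when_all returning a future over an hpx::util::tuple accessed via hpx::util::get, hpx::split_future, and the unwrapping future constructor); f1 and f2 are placeholder tasks:

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/async.hpp>
    #include <hpx/include/lcos.hpp>

    #include <utility>

    int main()
    {
        hpx::future<void> f1 = hpx::async([] {});
        hpx::future<void> f2 = hpx::async([] {});

        // when_all yields a future<tuple<future<void>, future<void>>>
        auto all = hpx::when_all(std::move(f1), std::move(f2));

        // split_future splits the tuple without spawning a task
        auto parts = hpx::split_future(std::move(all));

        // the unwrapping constructor collapses future<future<void>> into
        // future<void>; when_all only becomes ready once *both* inputs are
        // ready, so waiting on either element suffices (the "semantically
        // a bit iffy" part mentioned above)
        hpx::future<void> done(std::move(hpx::util::get<0>(parts)));
        done.get();

        return 0;
    }
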
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
hkaiser has joined #ste||ar
<hkaiser>
simbergm: master has been broken since the latest merges
<simbergm>
hkaiser: right you are
<simbergm>
sorry, entirely my fault
<simbergm>
I'll fix it
<hkaiser>
thanks a lot
<hkaiser>
simbergm: thanks for your thorough review of #4540
<diehlpk_work>
hkaiser_, I went through all ste||ar group repos and added a ticket where a license is missing
<hkaiser_>
diehlpk_work: thanks
nan11 has joined #ste||ar
gonidelis has joined #ste||ar
akheir has joined #ste||ar
<hkaiser_>
diehlpk_work: do we have a meeting now?
<diehlpk_work>
Yes, we are already in
<diehlpk_work>
hkaiser_, I sent the Zoom link to the operation bell list
karame_ has joined #ste||ar
rtohid has left #ste||ar [#ste||ar]
rtohid has joined #ste||ar
akheir has quit [Read error: Connection reset by peer]
akheir1 has joined #ste||ar
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
bita has joined #ste||ar
bita_ has joined #ste||ar
bita_ has quit [Quit: Leaving]
gonidelis has quit [Ping timeout: 240 seconds]
rtohid has left #ste||ar [#ste||ar]
nan11 has quit [Remote host closed the connection]
nan11 has joined #ste||ar
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
<weilewei>
hkaiser_ how should I correctly insert a timer for the communication phase of the ringG algorithm? For the computation part, I can insert start and end timers around line 73, but the communication phase is an async operation and, more importantly, a loop (depending on how many ranks there are). Also, I do not want to count the memory copy phase.
<hkaiser_>
you can only measure the overall time reliably, I think
<weilewei>
I see
<hkaiser_>
or each timestep in the loop
<weilewei>
I see. Is it wise to time each function inside the loop and also each step, and then compute communication_time = total_time_per_step - compute_time - copy_time?
Amy1 has quit [Ping timeout: 256 seconds]
Amy1 has joined #ste||ar
karame_ has quit [Quit: Ping timeout (120 seconds)]
<hkaiser_>
weilewei: try it - you can't really measure communication time as it's overlapped
<weilewei>
hkaiser_ ah, I see now, even when receiving data, the program is doing copy and update...
<hkaiser_>
weilewei: the most you can do is to measure how long it sits in mpi_wait
<hkaiser_>
to allow assessing how much cpu time is wasted ;-)
Rory89 has joined #ste||ar
<weilewei>
hkaiser_ hmmm true... but still, if I measure mpi_wait, that does not reflect the complete picture of the communication time. Maybe just measure the whole for-loop.
<hkaiser_>
right
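
A rough sketch of the approach settled on here (time each phase per step, plus the whole loop), using std::chrono; compute_step and communicate_step are hypothetical stand-ins for the ringG phases:

    #include <chrono>
    #include <iostream>

    // Hypothetical stand-ins for the per-rank phases discussed above.
    void compute_step() { /* local update */ }
    void communicate_step() { /* async exchange + copies, overlapped */ }

    int main()
    {
        using clock = std::chrono::steady_clock;
        int const num_steps = 10;    // e.g. one step per rank in the ring

        auto loop_start = clock::now();
        for (int step = 0; step != num_steps; ++step)
        {
            auto t0 = clock::now();
            compute_step();
            auto t1 = clock::now();
            communicate_step();
            auto t2 = clock::now();

            // Per-step compute time is reliable; the remainder mixes
            // communication with whatever overlaps it, so it is only a
            // rough upper bound on communication cost.
            std::chrono::duration<double> compute = t1 - t0;
            std::chrono::duration<double> rest = t2 - t1;
            std::cout << "step " << step << ": compute " << compute.count()
                      << " s, comm+copy " << rest.count() << " s\n";
        }
        std::chrono::duration<double> total = clock::now() - loop_start;
        std::cout << "whole loop: " << total.count() << " s\n";
    }
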
<bita>
hkaiser_, Rory89 and I were talking about parallel inverse. What kind of algorithm should we work on? I was telling Rory that having a for loop parallelized with constraints is not what we do in Phylanx
<hkaiser_>
Rory89 and Avah have discussed what algorithm to use, no?
<bita>
I think Avah has an OpenMP approach in mind
<hkaiser_>
sure, that's what we could start with, no?
<Rory89>
Yeah, it's just Gauss Inverse with different localities owning different columns
<hkaiser_>
in the first step our implementation will suck perf-wise anyways ;-)
<bita>
I am not sure how it can be implemented. Rory, can you explain its details a bit more?
<hkaiser_>
Rory89: several columns per locality?
<Rory89>
Perhaps more than one column per locality. If you have an nxn matrix, it just splits those n columns up evenly, or approximately so, across the localities
<hkaiser_>
nod
<hkaiser_>
makes sense
<hkaiser_>
so you need to do different operations: a) find pivot, b) find coefficients, and c) apply coefficients
<hkaiser_>
is there more?
<bita>
I was telling Rory that he needs to make a distributed matrix (for the one that starts as the identity)
<Rory89>
The problem I was having was how to handle the result matrix.
<hkaiser_>
yes
<bita>
and he had questions about where we have annotations
<hkaiser_>
ok, what's the problem?
<hkaiser_>
the inverse returns a new (tiled) matrix with a corresponding annotation attached
<bita>
Rory, can you implement the 3 functions that hkaiser_ mentioned?
<hkaiser_>
it's very similar to what we have done in other places
<bita>
In other places we didn't have iterations
<Rory89>
The user sends off a matrix A to be inverted, and all of the localities need access to another matrix, call it B. So that's the only problem: creating a new matrix in the code that isn't sent in the test
<Rory89>
that all of the localities can read and write to their respective locations.
<bita>
in Rory's Gauss inverse everything happens in a for loop
<hkaiser_>
Rory89: same is done in dot product
<hkaiser_>
it creates a new matrix and fills it with the result of the operation
<hkaiser_>
the only difference is that you attach the distributed matrix to the result and not to the input
<hkaiser_>
or possibly to both, not sure
<hkaiser_>
you start off with an identity matrix, right?
<Rory89>
Ah, so B in this case is essentially like "result_matrix" in dist_dot?
<Rory89>
Yep, exactly
<hkaiser_>
yah
<hkaiser_>
we can even call inverse with both matrices, the one to invert and the identity matrix generated by Nan's identity_d()
<hkaiser_>
inverse_d(A, __arg(B, identity_d(shape(A)))) or somesuch
<hkaiser_>
if that helps, that is
<hkaiser_>
so you don't have to duplicate Nan's code
<Rory89>
Yep that makes sense, thanks!
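
For reference, a serial sketch of the three operations hkaiser_ listed, a) find pivot, b) find coefficients (scale the pivot row), and c) apply coefficients (eliminate the column elsewhere), applied to A and an identity matrix B; in the distributed version each locality would own a block of columns of both matrices. This is plain Gauss-Jordan, not the actual Phylanx primitive:

    #include <cmath>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    using matrix = std::vector<std::vector<double>>;

    // Invert A into B; B starts as the identity and ends as A's inverse.
    bool gauss_jordan_inverse(matrix a, matrix& b)
    {
        std::size_t const n = a.size();
        b.assign(n, std::vector<double>(n, 0.0));
        for (std::size_t i = 0; i != n; ++i)
            b[i][i] = 1.0;    // B starts as the identity

        for (std::size_t col = 0; col != n; ++col)
        {
            // a) find pivot: row with the largest entry in this column
            std::size_t pivot = col;
            for (std::size_t r = col + 1; r != n; ++r)
                if (std::abs(a[r][col]) > std::abs(a[pivot][col]))
                    pivot = r;
            if (a[pivot][col] == 0.0)
                return false;    // singular matrix
            std::swap(a[col], a[pivot]);
            std::swap(b[col], b[pivot]);

            // b) find coefficients: scale the pivot row
            double const inv = 1.0 / a[col][col];
            for (std::size_t c = 0; c != n; ++c)
            {
                a[col][c] *= inv;
                b[col][c] *= inv;
            }

            // c) apply coefficients: eliminate this column in all other rows
            for (std::size_t r = 0; r != n; ++r)
            {
                if (r == col)
                    continue;
                double const f = a[r][col];
                for (std::size_t c = 0; c != n; ++c)
                {
                    a[r][c] -= f * a[col][c];
                    b[r][c] -= f * b[col][c];
                }
            }
        }
        return true;
    }

    int main()
    {
        matrix a = {{4, 7}, {2, 6}};
        matrix b;
        if (gauss_jordan_inverse(a, b))
            for (auto const& row : b)
            {
                for (double v : row)
                    std::cout << v << ' ';
                std::cout << '\n';
            }
    }
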
nan11 has quit [Remote host closed the connection]
bita has quit [Quit: Leaving]
Rory89 has quit [Remote host closed the connection]