April 13 – May 4: Open source organizations apply to take part in Season of Docs
Should we apply again this year?
If so, who will help me prepare the application?
akheir has joined #ste||ar
akheir, could it be that Slurm does not send email yet?
diehlpk_work: I will look into it today
Ok, cool, will run some jobs this week
akheir has quit [Read error: Connection reset by peer]
akheir1 has joined #ste||ar
diehlpk_work do you run into any problems with running jobs on Summit?
weilewei, Had no time to run them yet
diehlpk_work ok, just checking
Anyways we will not run much on Summit right now. Only two jobs to test the code
I think we will run more jobs at the end of this year
diehlpk_work I am not sure whether Summit will still be available in 2021, because Frontier will be delivered in 2021. Not sure if I remember that correctly
weilewei, Ok, anyways we have to port more computation to CUDA before we can run on Summit
Currently, the most expensive part runs on the CPU
diehlpk_work nice. just saying
diehlpk_work: configured it. use --mail-type=ALL --mail-user=<your email> to submit jobs
akheir1, Cool, thanks
I will try it
diehlpk_work: you can use --mail-type=END to only get an email when the job is done.
I also like to get the failed ones to check and resubmit
hkaiser, yt?
As a general cluster sysadmin, we recommend not going overkill on job status mails, it's a great way to fill your mailbox and to get the HPC site blacklisted as "spam".
nikunj97: here
hkaiser, is there a proposal for STL to support __sizeless_struct?
STL containers ^^
what should it do to the STL?
so I'm working with SVE simd and its pack works with __sizeless_struct, i.e. the size of the struct is determined at runtime. Using std::vector<nsimd::pack<float> > will lead to a compiler error due to this
it will complain: error: arithmetic on a pointer to an incomplete type
not sure you can do anything about that, actually
the concept of __sizeless_struct is currently only supported by ARM compilers. GCC on the other hand wants you to provide the vector length at compile time, which makes the whole code non portable :/
and if you tell the vector length at compile time, it kind of shadows the use of SVE itself which is meant to be vector length agnostic
hkaiser, is there nothing I can do about it?
nikunj97: I wouldn't know how to have a vector<> for a type for which the size is unknown at compile-time
ughh, let me write an email to ARM guys then
they may have some useful info
lol arm hpc compiler isn't open sourced or have a community where I could ask
nikunj97: do they support such a vector<>?
not really, but since they've developed __sizeless_struct, they may have some idea of how to make wrappers around STL containers to support __sizeless_struct
nikunj97: well, actually - it might be possible - if you can detect the size of T at runtime, the vector<T> could be created, it internally is a pointer to T[] anyways
rtohid has joined #ste||ar
hkaiser, how do you propose I should handle it?
karame_ has joined #ste||ar
nikunj97: write your own vector<T>?
hkaiser, that part I understood ;)
I meant how do I work with type that's deduced at runtime?
nikunj97: you can take compute::vector as a starting point
nikunj97: well, their sizeless type has to somehow expose its size at runtime
it most likely has a size() member
so on the vector allocation function, instead of using sizeof(T) you use T::size()
aah makes sense!
got what you're saying, this looks doable
depending on whether scalar<T>::is_sized is true or false
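A minimal sketch of the idea above: a buffer that computes element addresses from a stride supplied at runtime instead of sizeof(T). The class name runtime_sized_buffer and its members are hypothetical, and where the byte count comes from (a size() member as guessed above, or something like svcntb() on SVE) is an assumption. Elements are treated as opaque byte blobs, so this is not an nsimd or compute::vector API, and it does not address any extra alignment the real type may need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical container for elements whose size is only known at runtime.
class runtime_sized_buffer {
public:
    // elem_bytes replaces sizeof(T); the caller queries it at runtime.
    runtime_sized_buffer(std::size_t count, std::size_t elem_bytes)
      : count_(count)
      , elem_bytes_(elem_bytes)
      , data_(new unsigned char[count * elem_bytes]())  // zero-initialised storage
    {}

    std::size_t size() const { return count_; }
    std::size_t element_bytes() const { return elem_bytes_; }

    // Address of element i, using the runtime stride.
    unsigned char* at(std::size_t i) {
        assert(i < count_);
        return data_.get() + i * elem_bytes_;
    }
    const unsigned char* at(std::size_t i) const {
        assert(i < count_);
        return data_.get() + i * elem_bytes_;
    }

    // Copy one element in or out as raw bytes.
    void store(std::size_t i, const void* src) { std::memcpy(at(i), src, elem_bytes_); }
    void load(std::size_t i, void* dst) const { std::memcpy(dst, at(i), elem_bytes_); }

private:
    std::size_t count_;
    std::size_t elem_bytes_;
    std::unique_ptr<unsigned char[]> data_;
};
```

A real wrapper would presumably dispatch between this and a plain std::vector<T> depending on whether the type is sized, as suggested above.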
Amy1 has quit [Quit: WeeChat 2.2]
Amy1 has joined #ste||ar
nan11 has joined #ste||ar
Amy1 has quit [Quit: WeeChat 2.2]
Amy1 has joined #ste||ar
Bleh. HPX doesn't spawn runtime threads on SMTs on my machine out of the box. Time to figure out how to configure the runtime.
I guess I'll just jam in hpx.os_threads on the command line somehow.
Hashmi has joined #ste||ar
nikunj97 has quit [Ping timeout: 260 seconds]
With great command line arguments comes great pessimisation... init time went from 86s to 134s when leveraging all threads :D
hkaiser, see pm, please
diehlpk_work: ok
hkaiser, please reply to diehlpk_work since I use a different device
nan11 has quit [Remote host closed the connection]
bita has joined #ste||ar
nan11 has joined #ste||ar
Hashmi has quit [Quit: Connection closed for inactivity]
rtohid has quit [Remote host closed the connection]
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
K-ballo has quit [Remote host closed the connection]
K-ballo has joined #ste||ar
hkaiser has quit [Ping timeout: 265 seconds]
bita_ has joined #ste||ar
bita has quit [Ping timeout: 260 seconds]
nan11 has quit [Remote host closed the connection]
nan11 has joined #ste||ar
rtohid has joined #ste||ar
I would like to have a reduce operation across ranks in the following manner: say, rank 0 allocates a vector of size 10 and computes values of index ranges from 0 to 4, and rank 1 allocates a vector of size 5 and computes values of index ranges from 5 to 9. Then rank 1's vector will be reduced into rank 0's vector at indices 5 to 9
I have an hpx question: for some part of my test, when I am using the verbose flag, I see that my "base command" is <unknown>. Any ideas?
is there any mpi operation that is available? the rank numbers could be large, not just 2 ranks.
hkaiser has joined #ste||ar
maybe, shall I use MPI_Reduce, and then offset recvbuf pointer in non-root rank?
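One possible reading of the question above, as a hedged sketch: since each rank contributes a disjoint index range, the pattern looks more like a gather than a reduce, and MPI_Gatherv takes per-rank receive counts and displacements on the root. The helper below only computes those two arrays for a contiguous block split and does not call MPI itself; the names gather_layout and make_gather_layout are hypothetical, and the even split with the remainder going to the lowest ranks is an assumption.

```cpp
#include <vector>

// Per-rank element counts and starting offsets in the root's buffer.
// MPI uses int for both, so int is used here as well.
struct gather_layout {
    std::vector<int> counts;
    std::vector<int> displs;
};

// Split `total` indices into `ranks` contiguous blocks; the first
// (total % ranks) ranks get one extra element.
inline gather_layout make_gather_layout(int total, int ranks) {
    gather_layout layout;
    layout.counts.resize(ranks);
    layout.displs.resize(ranks);
    const int base = total / ranks;
    const int extra = total % ranks;
    int offset = 0;
    for (int r = 0; r < ranks; ++r) {
        layout.counts[r] = base + (r < extra ? 1 : 0);
        layout.displs[r] = offset;
        offset += layout.counts[r];
    }
    return layout;
}
```

For the two-rank case described above (10 indices), this gives counts {5, 5} and displacements {0, 5}, so rank 1's five values would land at indices 5 to 9 of the root's vector.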
stmatengss has joined #ste||ar
weilewei has quit [Remote host closed the connection]
hkaiser, I have an hpx question: for some part of my test, when I am using the verbose flag, I see that my "base command" is <unknown>. Any advice?
weilewei has joined #ste||ar
stmatengss has left #ste||ar [#ste||ar]
bita_: not sure what you're referring to
and using ctest -V, I see the base command is: 177: Base command is "/phylanx/build/bin/retile_6_loc_test --hpx:ini=hpx.parcel.tcp.enable=1 --hpx:threads=1 --hpx:localities=6"
Adding test_retile_6loc_2d_0, the base command is becoming <unknown>
where do you see that? circleci?
no I build that on docker and msvc
in MSVC one of the cores has an <unknown> in it
bita_: how can I reproduce this?
Of course on docker it times out after 200 seconds, and I have seen hangs on windows 1 out of 5 times
weilewei has quit [Remote host closed the connection]
I think so, the branch is add_retiling
bita_: how?
weilewei has joined #ste||ar
On Phylanx's add_retiling branch, I use cmake --build /phylanx/build --target tests.unit.plugins.dist_matrixops.retile_6_loc, and then ctest -V -R tests.unit.plugins.dist_matrixops.distributed.tcp.retile_6_loc
how can I make an issue, when it needs the changes in a branch?