hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
diehlpk has joined #ste||ar
diehlpk has quit [Ping timeout: 240 seconds]
Yorlik_ has joined #ste||ar
Yorlik has quit [Ping timeout: 260 seconds]
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: Bye!]
tufei__ has joined #ste||ar
tufei_ has quit [Ping timeout: 276 seconds]
Yorlik__ has joined #ste||ar
Yorlik_ has quit [Ping timeout: 252 seconds]
diehlpk_work has quit [Remote host closed the connection]
<dkaratza[m]>
ms: good morning, can you send me the link?
<dkaratza[m]>
ms: so, the thing is that I can remove the pages by just removing `include`, but I would also like to prevent the files from being generated. For example, the files `cmake_toolchains.rst`, `cmake_variables.rst`
<dkaratza[m]>
sorry, only `cmake_toolchains.rst`, not the variables
<jedi18[m]>
These are the default chunk size measurements, I'll try varying the chunk size next
<hkaiser>
nice!
<jedi18[m]>
Ran these on a single node of buran btw
<hkaiser>
nod, good work!
<hkaiser>
how many cores did that use?
<jedi18[m]>
hkaiser: I didn't specify the number of cores (just ran srun bin/...)
<jedi18[m]>
Is there a way to check the number of cores?
<hkaiser>
ok, so all of the cores were used
<hkaiser>
you can do --hpx:print-bind to see how threads are being bound
<srinivasyadav227>
jedi18[m]: it uses 48 cores default on buran
<jedi18[m]>
Oh ok, thanks
<hkaiser>
jedi18[m]: what type of elements did you use? int? double?
hkaiser has quit [Quit: Bye!]
<jedi18[m]>
Int, and it's the average of 50 iterations for each input size
<pedro_barbosa[m]>
Hey guys, I was trying to correct a problem I have in an example in HPXCL. I have a program that runs on a GPU, and every iteration of the kernel I want to swap the input with the output, so I copy the output to the host, swap the values, and copy it back to the GPU. However, this has poor performance because I have to copy the arrays every iteration, so my idea was to swap the two buffers in the argument array that I pass to
<pedro_barbosa[m]>
the function, expecting the kernel to then use the output array as input and vice versa without having to copy everything back to the GPU. Can someone confirm if it is possible to do it like this?