<diehlpk_work>
jbjnr_, do you still intend to join today's meeting?
akheir has joined #ste||ar
hkaiser has quit [Quit: bye]
diehlpk has joined #ste||ar
<diehlpk>
jbjnr_, yet?
diehlpk has quit [Ping timeout: 264 seconds]
hkaiser has joined #ste||ar
<heller>
simbergm: what's the state of the installation?
<rori>
quit
rori has quit [Quit: WeeChat 1.9.1]
<simbergm>
heller: release and profiling-apex are there now
<simbergm>
but I'm not sure if I got the apex build right, vampir refuses to read the otf files
<heller>
Hmmm
<simbergm>
diehlpk_work: I would assume so (john is away now)
<heller>
Maybe wrong otf2 library?
<diehlpk_work>
simbergm, Cool, even better
<simbergm>
heller: maybe, I just used the one that was there
<heller>
That might have been too old
<heller>
That's 2 years old, IIRC
<simbergm>
ok, I'll try that tomorrow
<simbergm>
heller: did you use the vtune/itt build last time? do we need it?
<heller>
I don't think we need it
<heller>
We showed some itt results last time, but I think it's good enough to focus on vampir
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
<diehlpk_work>
Name: Modern CUDA and C++ by Bryce Adelstein Lelbach
<diehlpk_work>
hkaiser, Do you want to have a different title for Bryce's talk?
<diehlpk_work>
Playlist: Talks @ Ste||ar group
nikunj97 has joined #ste||ar
<simbergm>
tarzeau: the various dev packages are needed for libhpx-dev
<simbergm>
although I don't fully understand how the boost packages work
<simbergm>
and if you want to have google-perftools as a dependency you can use tcmalloc instead of jemalloc, although I'll admit I don't really know what one gets by linking hpx with google-perftools (not talking about tcmalloc)
<nikunj97>
hkaiser: yt?
<hkaiser>
nikunj97: here
<hkaiser>
diehlpk_work: nah, I think it's fine as it is
<nikunj97>
hkaiser: I calculated the amortized time for 1 tile/timestep. It is pretty low at about 5-7 us only
<hkaiser>
nikunj97: ok, that's not what I meant :/
<nikunj97>
:/
<hkaiser>
let's talk again tomorrow
<nikunj97>
I have all the graphs btw
<hkaiser>
ok, nice - let's look over them tomorrow as well
<nikunj97>
32000 points and 64 domains is the sweet spot
<hkaiser>
nod
<heller>
simbergm: did you get Apex working? Could you send me steps to reproduce?
<simbergm>
heller: haven't tried again
<heller>
ok
<simbergm>
you want to try the build or running with apex?
<heller>
both
<diehlpk_work>
hkaiser, Ok, so I will let the IT guys know and they can publish the talk
<diehlpk_work>
So we can release the talk and get rid of my stalkers :)
<simbergm>
I'll look at it again tomorrow, but you can try opening ~/OTF_archive/APEX.otf (or something like that), I think I left it there
<diehlpk_work>
people still ask me for his talk on Twitter
<hkaiser>
diehlpk_work: sure, and thanks!
<nikunj97>
hkaiser: to implement validate as they have done, I will need to know the sum of all points from the previous and next tiles. Also, I will need the first element from the next tile.
K-ballo1 has joined #ste||ar
K-ballo has quit [Read error: Connection reset by peer]
K-ballo1 has quit [Ping timeout: 258 seconds]
K-ballo has joined #ste||ar
<nikunj97>
I think I know how to implement checksums
<nikunj97>
hkaiser: I think I finally understood everything Jackob is trying in his code. I'll convert the 1d stencil to do exactly what his code does. As for tomorrow, let's just say we found scope for optimization so we're working on it.
<nikunj97>
I can port the 1d stencil to work the same as Jackob's, but much faster. It will take some time though.
<nikunj97>
hkaiser: just implemented the checksums and ported Jackson's code
<nikunj97>
am going home now. Will write benchmarking scripts tomorrow morning
<nikunj97>
just in case you want to know how it performs, it takes about 8s to run
<nikunj97>
I think I can optimize it further, but I'm too tired now