<hkaiser>
zao: async knows that the function is called only once, so it tries to move the arguments into the function
<zao>
I'm gonna hit the sack, but if I have several functions needing the result of an async call, can I pass a shared_future to them all, or should I try to get the data out at some point and hand them shared_ptrs?
<zao>
Also still not sure how to handle async functions returning several different things, is a tuple of futures my best bet, or should I bake some custom return struct for each one?
<zao>
(this codebase has lots of nice functions of arity >20 with a lot of big input and output arrays, scientific code at its best)
<hkaiser>
async returns a future which you can turn into a shared_future
<hkaiser>
f.share()
<hkaiser>
let the function return a tuple<> (i.e. async will give you a future<tuple<>>)
<hkaiser>
or use the returned future just as a flag that the ref'ed args are ready
<hkaiser>
or create a struct and return an instance of it which gives you a future<foo>, might require move operators and somesuch, though
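Putting hkaiser's options together: a minimal sketch, assuming HPX's hpx::async/shared_future API (header names vary between HPX versions, and the identifiers here are illustrative, not from the discussion):

    #include <hpx/hpx_main.hpp>  // lets plain main() run inside the HPX runtime
    #include <hpx/future.hpp>    // hpx::async, hpx::future, hpx::shared_future

    #include <string>
    #include <tuple>

    // several results baked into one tuple, so async yields a future<tuple<...>>
    std::tuple<int, std::string> produce() { return {42, "data"}; }

    int main() {
        // share() turns the one-shot future into a shared_future that
        // any number of consumers can copy and .get() from
        hpx::shared_future<std::tuple<int, std::string>> sf =
            hpx::async(produce).share();

        // several dependent functions can each hold the same shared_future
        auto a = hpx::async([sf] { return std::get<0>(sf.get()); });
        auto b = hpx::async([sf] { return std::get<1>(sf.get()).size(); });

        a.get();
        b.get();
        return 0;
    }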
<weilewei>
hkaiser and @everyone, if I want to understand more about operating systems and computer architecture, what books would you recommend?
<hkaiser>
the book with the dinosaurs on the cover
<weilewei>
feel like I need to understand more about the underlying hardware
<weilewei>
Operating System Concepts by Avi Silberschatz
<weilewei>
But we cannot enter the CCT building now
<hkaiser>
but you're not supposed to go there and my office is locked
<hkaiser>
weilewei: but you're lucky that I'm not supposed to go there anyways and will not get a book that I'm not supposed to give to Karame this week
<hkaiser>
;-)
<nan11>
lol
<weilewei>
lol, maybe I can find an e-book of it online
<hkaiser>
I'll let you know once I have not been there and where you can't find the book
<weilewei>
hkaiser thanks!! Let me know then
<hkaiser>
I won't
<nan11>
I hear nothing xD
<weilewei>
lol
<weilewei>
If I have a kernel thread only, once it finishes its work it will get destroyed (right?). Then, if I have an hpx user-level thread running on top of one kernel thread, and this user-level thread finishes its work while some other user-level threads come right after, does the kernel thread get destroyed as well?
<weilewei>
Will a new kernel thread get created to support the newly arriving user-level threads?
<weilewei>
Or will it just reuse the first kernel thread?
<zao>
HPX seems to have a whole lot of persistent OS threads to serve as workers and IO runners.
<zao>
There’s not much point in scaling them up and down if you can keep them around for cheap.
<weilewei>
IC, that's how we save overhead
<weilewei>
btw, does this explanation exist somewhere? I can barely find good online pages explaining this concept well. Some of them just say kernel threads are expensive to create and manage, period
<zao>
If you break a HPX program in a debugger you can see nicely named threads (if the OS supports names) and get a feeling for what their stacks look like.
<zao>
The concept of thread pools is reasonably common out there; the way work gets onto them tends to vary a bit.
<weilewei>
True, I will take a look
<weilewei>
Right, that's down to implementation level
<zao>
One of the niftier things of HPX is how work can yield for other work as needed, something that’s otherwise hard.
<weilewei>
Can't the OS scheduler do a similar thing?
<hkaiser>
zao: those threads are mostly dormant
<weilewei>
hkaiser what are "those threads"? do those keep the OS busy?
<hkaiser>
no, they simply sit and wait in the kernel, doing nothing - mostly
<hkaiser>
weilewei: we have 6 additional threads in HPX, 2 for IO, 2 for timers, and 2 for networking
<weilewei>
hkaiser these 6 additional threads are waiting for their corresponding tasks and will be used immediately when needed, right?
<hkaiser>
yes
<weilewei>
If there are not many tasks in user space, will hpx keep those free kernel threads from being destroyed?
<hkaiser>
those 6 threads are not the ones doing the hpx tasks
<weilewei>
I see, I guess my question is what happens to kernel threads that have hpx tasks to work on
<weilewei>
that have *no hpx tasks
<hkaiser>
they keep running the scheduling loop
<weilewei>
oh, I understand now, thanks. the hpx scheduler keeps them busy
<hkaiser>
the threads that don't run hpx threads sleep when there is no work
<zao>
If you’d write your own thread pool, you’d typically have some control function running on each OS thread waiting for work to appear.
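For illustration, that control function in a hand-rolled pool might look like this generic (non-HPX) sketch; the names and structure are made up here:

    #include <condition_variable>
    #include <deque>
    #include <functional>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    std::deque<std::function<void()>> work_queue;
    bool shutting_down = false;

    // runs on each OS thread of the pool
    void worker_loop() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            // block until work appears or the pool shuts down
            cv.wait(lock, [] { return !work_queue.empty() || shutting_down; });
            if (work_queue.empty())
                return;  // shutting down and drained
            auto task = std::move(work_queue.front());
            work_queue.pop_front();
            lock.unlock();
            task();  // run the task outside the lock
        }
    }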
<hkaiser>
zao: yah, we keep them running to be able to react faster
<hkaiser>
they however do an exponential backoff if there is no work for a longish time
<zao>
You _could_ make that self-terminate and have the issuer start new threads when need arises again, but in the HPC world you don’t have much need to.
<hkaiser>
right, you want to keep overheads down
nan11 has quit [Remote host closed the connection]
<zao>
So there are tiers of overhead. You can actively spin for work, consuming CPU resources and hoping work appears shortly; you can back off and wait on a heavier primitive; and you could, in theory, shut down the thread entirely and have someone wind up a new one later at great cost.
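Extending the toy worker_loop sketched above, those tiers might look like this (illustrative only; HPX's real backoff policy lives in its schedulers and differs in detail):

    #include <chrono>
    #include <thread>

    void worker_loop_tiered() {
        int spins = 0;
        auto delay = std::chrono::microseconds(1);
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(m);
                if (!work_queue.empty()) {
                    task = std::move(work_queue.front());
                    work_queue.pop_front();
                } else if (shutting_down) {
                    return;
                }
            }
            if (task) {
                task();
                spins = 0;
                delay = std::chrono::microseconds(1);  // reset the backoff
            } else if (++spins < 10000) {
                // tier 1: actively spin, burning CPU but reacting fastest
            } else if (delay < std::chrono::milliseconds(8)) {
                // tier 2: exponential backoff with growing sleeps
                std::this_thread::sleep_for(delay);
                delay *= 2;
            } else {
                // tier 3: park on the heavy primitive until explicitly woken
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [] { return !work_queue.empty() || shutting_down; });
            }
        }
    }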
<weilewei>
I see
<zao>
Yielding out to the OS puts you at the mercy of its scheduler, which tends to be rather coarse-grained.
<zao>
(please correct me if I’m off on something)
<weilewei>
hkaiser actually I'm still confused about the wording. You said "the threads that don't run hpx threads sleep when there is no work", and then you said "zao: yah, we keep them running to be able to react faster". In the latter sentence, what does "them" refer to?
<weilewei>
in my mind, I am thinking of "kernel threads that don't have any hpx tasks to do" as "them"
<hkaiser>
the threads that do hpx work run all the time, the others sleep
<hkaiser>
zao: exactly
<hkaiser>
except that we don't stop and restart the threads, but we do let them back off if there is no work
<weilewei>
hkaiser ah, I see now
<hkaiser>
putting them to sleep
akheir1 has quit [Quit: Leaving]
hkaiser has quit [Quit: bye]
bita has quit [Quit: Leaving]
nikunj_ has joined #ste||ar
weilewei has quit [Remote host closed the connection]
<Yorlik>
Sweet. Let's hope we will profit from this statement of the paper: "ORNL and Cray will partner with AMD to co-design and develop enhanced GPU programming tools designed for performance, productivity and portability"
<heller1>
nod
hkaiser has joined #ste||ar
<Yorlik>
Did any of you have this problem? :
<Yorlik>
Could not find a configuration file for package "boost_system" that exactly matches requested version "1.72.0".
<Yorlik>
It only comes up in debug build
<Yorlik>
The file is there, but it doesn't accept it
<Yorlik>
Like this: The following configuration files were considered but not accepted:
<simbergm>
don't know if that helps? (and we're obviously still missing 1.72...)
<Yorlik>
I just added it and 1.72 and .0, but the error persists.
<Yorlik>
Funnily enough HPX compiled without issues.
<Yorlik>
The debug version was missing boost_system-config-version.cmake
<Yorlik>
I just copied it over
<simbergm>
zao: since you spend all your days fighting bad build systems... are we being bad citizens by having SOVERSION set to the release version (as opposed to a counter that we increment every time we break the ABI, i.e. all the time) when we don't guarantee ABI compatibility between minor releases? or does no one care?
<zao>
In my particular EasyBuild world we don't care, as we have exact matches of versions. For packaging in general there might be some assumptions of what works with what, but that I don't really know about.
<simbergm>
thanks, I suppose it could be a problem in distros
<simbergm>
but maybe we can worry about that later
<zao>
Does HPX declare somewhere what kind of versioning scheme is in use, semver and other guarantees?
<hkaiser>
nope
<hkaiser>
nothing formal
<K-ballo>
we should formalize the no guarantees
<hkaiser>
K-ballo: I'll bring it up tomorrow during the PMC meeting
<simbergm>
I was going to write up a draft for that today or tomorrow... (but for 2.0, where I hope we'll start using semver; good idea to write down the no-guarantees given at the moment)
<hkaiser>
simbergm: we could start collecting something along the lines of Python PEP documents
<zao>
Adopt the motto of our computer club - "no-one ever promised that anything would work"
<hkaiser>
isn't that self-evident?
<zao>
hkaiser: I'm curious, who's this "aurianer" person working on the windows path issue I had?
<K-ballo>
2.what?
<hkaiser>
zao: she is working with Mikael in Zuerich
<hkaiser>
K-ballo: once the modularization is done, we plan to release it as HPX V2
nan11 has joined #ste||ar
<simbergm>
K-ballo, hkaiser: it could be more than just modularization but let's see how much energy we have for that
K-ballo has quit [Remote host closed the connection]
K-ballo has joined #ste||ar
<zao>
Hitting 2.0 before Boost :P
<simbergm>
hkaiser: I would just add PR/issue tag HEP :)
<hkaiser>
sure, whatever ;-)
<simbergm>
while I like the idea of a PEP in general, I find it funny that enhancement proposals stay as enhancement proposals once they've been accepted
<hkaiser>
no need to overdo things
<hkaiser>
zao: Boost will never hit 2.0
<simbergm>
the api guarantees belong in our documentation anyway
<zao>
Yeah :)
weilewei has joined #ste||ar
<hkaiser>
nikunj_: yt?
<nikunj_>
hkaiser, here
<hkaiser>
hey
<nikunj_>
hkaiser, hey! hope you're safe and well
<hkaiser>
where did we host the resiliency paper?
<hkaiser>
thanks, all is well
<nikunj_>
you mean the one we submitted at SC?
<hkaiser>
yah
<nikunj_>
it was FTXS
<hkaiser>
can't find it - we've got permission to publish it now
<simbergm>
(I think it's a bit too early to finalize anything since we're not even close to thinking about 2.0.0, but we can start thinking about this)
Yorlik has joined #ste||ar
<simbergm>
zao: you monster (jk, I used to love pizza hawaii as a kid, now I just accept it)
<simbergm>
nikunj_: I'm glad things are faster :D I probably broke the block_executor in the first place though, so I shouldn't get much credit for making it faster again
<simbergm>
also, just for the record, I don't like meaty drinks
<Yorlik>
No meatonade?
<nikunj_>
simbergm, my code is running faster due to your efforts ;)
bita has joined #ste||ar
kale_ has joined #ste||ar
kale_ has quit [Client Quit]
<nikunj_>
hkaiser, is there any book you'd recommend on metaprogramming other than C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond that you gave me last summer?
<nikunj_>
something more along the lines of modern C++
<nikunj_>
right, I can say that we're seeing a good amount of cache benefit. So the arithmetic intensity isn't 1/8 imo
<heller1>
Yes, which is nice
<nikunj_>
absolutely!
<heller1>
Did you implement any blocking?
<nikunj_>
heller1, blocking as in block iterators?
<heller1>
As in cache blocking, such that you iterate over your domain in a tiling fashion
<nikunj_>
I have not, how do I do that?
<nikunj_>
the current results are pure HPX results. The only thing I tried was to limit any complexity in the code so that the compiler can optimize it better.
<zao>
Yorlik: Ah, the thing I was thinking of was:
<zao>
1> [CMake] New Boost version may have incorrect or missing dependencies and imported
<zao>
1> [CMake] targets
<Yorlik>
Yes - that warning is common - but it doesn't break builds.
<nikunj_>
heller1, cache blocking is present inherently
<nikunj_>
since the stencil dimension is 8192*131072 and we're carrying out line-wise updates, we're already using cache blocking
<nikunj_>
heller1, currently my stencil fits in L2 cache perfectly so we're seeing cache benefits already
<heller1>
I see, there you go
<nikunj_>
when I say stencil, I mean the line updates fit into L2 cache nicely
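For contrast with the inherent line-wise blocking nikunj_ describes, explicit cache blocking (the tiling heller1 asked about) on a 2D stencil looks roughly like this; a plain C++ sketch with made-up tile sizes, not the actual application code:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // one Jacobi-style sweep over the interior of an nx-by-ny grid, tile by tile
    void blocked_sweep(std::vector<double>& dst, const std::vector<double>& src,
                       std::size_t nx, std::size_t ny) {
        const std::size_t bx = 256, by = 64;  // tune so a tile stays resident in L2
        for (std::size_t jj = 1; jj < ny - 1; jj += by)
            for (std::size_t ii = 1; ii < nx - 1; ii += bx)
                // finish a whole tile before moving on, so its lines stay
                // hot in cache instead of being evicted between sweeps
                for (std::size_t j = jj; j < std::min(jj + by, ny - 1); ++j)
                    for (std::size_t i = ii; i < std::min(ii + bx, nx - 1); ++i)
                        dst[j * nx + i] =
                            0.25 * (src[j * nx + i - 1] + src[j * nx + i + 1] +
                                    src[(j - 1) * nx + i] + src[(j + 1) * nx + i]);
    }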
<nikunj_>
heller1, with these numbers, I can write up my results for the lab-based project and gain university credits. Now I want to extend this work so that I can get a paper out of it. What would you suggest?
<heller1>
Read the prior art about the topic
<heller1>
And think about what you're doing differently, what your benefits are, where the drawbacks are, etc
<nikunj_>
I have done some literature review, and most of them usually end with either a new tiling solution or with making tiling easier to use for the application user
gonidelis has quit [Ping timeout: 240 seconds]
<heller1>
Where's your approach going in a different direction?
<nikunj_>
right now, my application is a very basic one. It's nothing different from what people have been using; it's just an optimized version that makes effective use of the cache
<heller1>
What architectures did you investigate?
<nikunj_>
aah, none of them were really done on ARM
<nikunj_>
they were all based on x86 architectures
<heller1>
Is it performing equally well everywhere?
<nikunj_>
well you have the results. It's performing nicely.
<nikunj_>
What I couldn't explain was SIMD floats on ThunderX2
<nikunj_>
it was performing way better than the available bandwidth should allow
<heller1>
I'm asking the questions for you to answer.
<nikunj_>
I investigated x86 and ARM, and I saw that ARM had irregular scaling while x86 had regular overall scaling
<heller1>
If you find an answer to those questions which have not been answered by previous papers, you have yours
<nikunj_>
so essentially start with a literature review on stencils and their performance