<mdiers[m]>
interesting, i'm also working on tensorflow right now. there is a tensorflow-rocm docker container. i got it running with singularity and made some first tests.
<tarzeau>
(built yourself using bazel, or pypi binaries)?
nikunj has quit [Ping timeout: 260 seconds]
nikunj has joined #ste||ar
<mdiers[m]>
i think it was still 1.15. my problem was to get rocm per rpm running on the system without affecting other things (vnc/mesa)
<mdiers[m]>
<tarzeau "(built yourself using bazel, or "> dockerhub
gonidelis has joined #ste||ar
<gonidelis>
jbjnr it's probably like 05:00 in the morning in Louisiana ;p . He will probably be logged in in about ~2 hours
<mdiers[m]>
I got it running, but I haven't gotten any further at the moment, because almost only nvidia is available and the priority is a c++/python interface.
<Yorlik>
The Data at smallish Object Counts is quite chaotic - not sure it's meaningful - I might have to improve the measurements here
<gonidelis>
As I am reading past PRs I can see that there is a directory called `hpx/parallel/segmented_algorithms`. What was that about? What is its present name?
<hkaiser>
Yorlik: here
<Yorlik>
Hello!
<Yorlik>
Did you see the image of the measurements I made yesterday?
<Yorlik>
I need to understand better what happened here and surely there might be errors
<hkaiser>
doesn't sound right
<Yorlik>
5 seconds single threaded for 200k Objects?
<gonidelis>
hkaiser is there a reason to have segmented_algorithms since ranges have been introduced?
<Yorlik>
That's 200K messages sent and processed and the corresponding calls into Lua
<hkaiser>
gonidelis: segmented algorithms operate on segmented (possibly distributed) data partitions, that's different
<hkaiser>
Yorlik: so this is ok?
<gonidelis>
gonidelis oh ok... thanks
<hkaiser>
Yorlik: so 25 us/object
<hkaiser>
not too bad, true
<Yorlik>
Yes
<Yorlik>
Including a call into a Lua State and running a script there.
<hkaiser>
nod
<Yorlik>
I was more interested about what it tells about our scalability
<hkaiser>
ms[m]: sorry for spamming you with review comments
<ms[m]>
hkaiser: no worries, sorry and thanks for looking through
<Yorlik>
And OFC the measurements have a lot of weaknesses - this is more a rough exploration of the situation than something compliant with scientific standards.
<ms[m]>
I didn't really test anything in the PR yet so that was expected...
<hkaiser>
Yorlik: if you want scaling plots, then plot something like objects/s or objects/frame
<Yorlik>
The numbers at the lower end for low object counts are bonkers
<hkaiser>
I'd plot objects/frame instead
<hkaiser>
because, that's what you're interested in, no?
<Yorlik>
Yes
<Yorlik>
There's a bunch of stuff I could do.
<Yorlik>
Maybe that graph, yes
<hkaiser>
the fps plot doesn't tell you anything as you might idle
<Yorlik>
And then fix some unhandled exceptions I encountered and improve the measurement
<Yorlik>
It's the unbounded updater - it never idles
<hkaiser>
then it doesn't make sense that you level off when going to higher core numbers
<Yorlik>
I think I have measurement errors when the frametime is too low
<hkaiser>
fps should theoretically go up linearly with number of cores
<Yorlik>
The curve for the higher object counts makes sense
<Yorlik>
And FPS is log scale on Y
<hkaiser>
doesn't make sense anyways
<hkaiser>
why is fps getting worse when running on more cores?
<Yorlik>
I think it's an artifact on the low object numbers
<Yorlik>
Might even be rounding errors
<hkaiser>
100 objects on 12 cores, that is ~8 objects per core
<Yorlik>
== a lot of overhead
<hkaiser>
that means the update should take about 200 us per cycle
<Yorlik>
I used the default chunker
<hkaiser>
so you should see scaling (not perfect scaling mind you)
<Yorlik>
I think I'll repeat the measurement with the autochunker
<hkaiser>
shrug
<hkaiser>
something is off with your measurements
<Yorlik>
The default chunker splits it up, even if it doesn't make sense at very low object counts
<Yorlik>
So it gets inefficient in this extreme edge case
<Yorlik>
OFC splitting up 8 objects across 8 cores with short update times doesn't make sense, right?
<Yorlik>
I think that is part of the artifact
<Yorlik>
I'll think of a way to automate the measurement, so I don't have to do it all manually (every data point is a manual run and processing of log data)
<gonidelis>
how can I find the target for compiling just `/algorithms` ?
<hkaiser>
gonidelis: make help | grep algorithms ?
<gonidelis>
hkaiser thank you!
<gonidelis>
hkaiser why do you use fwiterB and fwiterE on the iterators adaptation? I mean what do these letters stand for?
<hkaiser>
forward iterator begin/end
<gonidelis>
oh great! I was searching for sth like A,B or 1,2 but that makes more sense ;D =D
<gonidelis>
hkaiser I can see that in `for_each.hpp`, `HPX_CONCEPT_REQUIRES_` is used in the parameters of the template declaration. While in `reduce.hpp` (which is the newer + better version of iterator based algos) there is `std::enable_if` outside the template parameters. It is actually placed as the return type (??? correct me if I am wrong) of `reduce()`.
<gonidelis>
I remember you saying that we use the latter one in the MACROS to achieve the effect. So do we use `enable_if` vs `HPX_CONCEPT_REQUIRES_` according to the case, or do we just go with `enable_if` from now on as the more modern solution?
<hkaiser>
gonidelis: I don't remember why it's done one way here and another way there
<hkaiser>
the macro expands to enable_if anyways, so I think the reduce is older and has not been changed to use the macros
<gonidelis>
hkaiser ok i totally get it. I shall prefer going with the MACRO then... (do you think that we should gradually try to turn the `enable_if`s into MACROs?)
<hkaiser>
gonidelis: we can do that, the macros help especially if you have more than one condition
<gonidelis>
hkaiser ok I will keep it in mind as soon as I manage to adapt `for_each`. Just one last quite important question (sorry for the spam). We know that the `begin` should be different from the `end` iterator. What should be the type of the `algorithm_result<ExPolicy, Iter>` at the function's result type? I guess it's `IterB`, right?
<jbjnr>
hkaiser: I have a memory somewhere that you recently committed an executor wrapper of some kind. I'd like to see it, but I can't remember what it was called. Is it in master or a branch anywhere?
<hkaiser>
gonidelis: look at the spec (standard), I think it should be the begin iterator
<hkaiser>
ms[m], jbjnr, heller1: I sent a mail wrt sponsoring yesterday - care to respond?
<heller1>
Yorlik: so you're happy with the performance so far?
<Yorlik>
All in all yes - but I feel I need to understand more
<ms[m]>
hkaiser: where to? cscs.ch address?
<hkaiser>
hpx-pmc ml
<Yorlik>
The machine ofc: awesome. But I'd like to automate and improve the measurements
<heller1>
hkaiser: awesome, thanks!
<Yorlik>
heller: what's interesting is that on certain configurations of cores and workload I triggered exceptions - possibly races due to a lock I removed - I needed to reinstall it - will have to try to make it more fine grained.
<ms[m]>
hkaiser: thanks for pinging me, I found a bunch of pmc emails in my spam (sorry if there were some old ones you expected a reply on...)
<hkaiser>
Yorlik: that's expected - races tend to show up with higher core counts
<hkaiser>
mdiers[m]: any time
<Yorlik>
I'll have to investigate more - but first I want to fix some things and automate measuring. 98 manual datapoints tonight was a bit crazy
<Yorlik>
It also is error prone ofc.
<jbjnr>
all my pmc email goes to spam too ms[m]
<hkaiser>
darn, ms[m]: any time
<hkaiser>
jbjnr: that's where it belongs ;-)
<jbjnr>
and gsoc mostly :(
<jbjnr>
hkaiser: I will replace some of my limiting executor with cut'n'paste from your executor wrapper. I like yours better.
<jbjnr>
Mine was not forwarding properly
<hkaiser>
jbjnr: ok
<heller1>
hkaiser: i really like the idea of sponsorship and the general direction
<hkaiser>
heller1: great! just send a +1, then (if you don't mind)
<heller1>
Didn't I?
<hkaiser>
as an email?
<hkaiser>
haven't seen it (yet)
<hkaiser>
ahh got it now
<ms[m]>
hkaiser: just replied, very good initiative!
<hkaiser>
thanks!
<mdiers[m]>
hkaiser: short?
<hkaiser>
mdiers[m]: yah, sorry
<heller1>
How do I join the open collective?
<hkaiser>
register on their website and give me your nick, I'll add you to the hpx project
<hkaiser>
ms[m], jbjnr: same for you ^^
<mdiers[m]>
hkaiser: so go ahead
<hkaiser>
mdiers[m]: I mistyped your nick and accidentally highlighted your name, sorry
<bita_>
I think annotation-wise it is Okay, but I am not sure how to make an empty primitive
<hkaiser>
ok, what can I do?
<hkaiser>
you mean how to return an empty partition?
<bita_>
I get the error of {what}: Invalid array of elements: HPX(unhandled_exception) followed by invalid state: thread pool is not running: HPX(invalid_status)
<bita_>
yes
<hkaiser>
what do you return now?
<hkaiser>
a null-sized vector? nil?
<bita_>
On locality 1 I return annotate_d([], "array_1_sliced/1",
<bita_>
list("tile", list("columns", 0, 0)))
<hkaiser>
well, I'd need to run the code to see what's wrong
<ms[m]>
hkaiser: heller just fyi, daint is unlikely to come back up this week...
<hkaiser>
ms[m]: thanks for letting us know
<heller1>
hkaiser, gonidelis: FYI, 19:00 Fridays is very bad for me
<gonidelis>
heller1 we could change that then...
gonidelis63 has joined #ste||ar
<hkaiser>
heller1: what time would work for you?
gonidelis has quit [Ping timeout: 245 seconds]
gonidelis63 is now known as gonidelis
gonidelis has quit [Remote host closed the connection]
gonidelis has joined #ste||ar
<gonidelis>
oh i am so sorry. had a problem with my connection and missed your `base_iterator` messages =( =( if you could please repeat them i would appreciate it
<gonidelis>
rori
<rori>
sure
weilewei has quit [Remote host closed the connection]
diehlpk_work_ has quit [Remote host closed the connection]
weilewei has joined #ste||ar
diehlpk_work_ has joined #ste||ar
nan111 has quit [Remote host closed the connection]
nan111 has joined #ste||ar
karame_ has quit [Remote host closed the connection]
gonidelis has quit [Ping timeout: 245 seconds]
<heller1>
hkaiser, gonidelis: around 4 would be better
<hkaiser>
heller1: I can do Friday's 9am/4pm
<hkaiser>
rori: how about you?
<rori>
perfect for me
<hkaiser>
gonidelis: what time would that be for you? 6pm?
karame_ has joined #ste||ar
rtohid has quit [Remote host closed the connection]
<hkaiser>
heller1, rori: let's decide when he's back
<rori>
5pm for him I believe
<hkaiser>
k
rtohid has joined #ste||ar
akheir has joined #ste||ar
<bita_>
hkaiser, using primitive_argument_type(ast::nil{true}, attached_annotation) and representing the result with nil works for my problem. I will ask you if there is a better method in the personal meeting. So, debugging that is not a priority, thanks for the offer though
<hkaiser>
bita_: nod, thought so
nan111 has quit [Remote host closed the connection]
weilewei has quit [Remote host closed the connection]
nan111 has joined #ste||ar
rtohid has quit [Remote host closed the connection]
rtohid has joined #ste||ar
nan111 has quit [Remote host closed the connection]
rtohid has quit [Remote host closed the connection]
karame_ has quit [Remote host closed the connection]
akheir1 has joined #ste||ar
akheir has quit [Ping timeout: 240 seconds]
nikunj97 has quit [Quit: Leaving]
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
nan11 has joined #ste||ar
nikunj has quit [Ping timeout: 265 seconds]
nikunj has joined #ste||ar
nan11 has quit [Remote host closed the connection]
weilewei has joined #ste||ar
nan11 has joined #ste||ar
<K-ballo>
we are getting github sponsors now?
nikunj97 has joined #ste||ar
rtohid has joined #ste||ar
<weilewei>
K-ballo who?
<K-ballo>
STE||AR
<weilewei>
or you mean the Acknowledgements part in hpx github?