aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
EverYoung has joined #ste||ar
diehlpk has joined #ste||ar
EverYoung has quit [Ping timeout: 252 seconds]
vamatya has quit [Ping timeout: 240 seconds]
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
diehlpk has quit [Ping timeout: 240 seconds]
vamatya has joined #ste||ar
vamatya_ has joined #ste||ar
vamatya has quit [Ping timeout: 246 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
Matombo has joined #ste||ar
vamatya_ has quit [Ping timeout: 248 seconds]
Matombo has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
<heller> jbjnr: got a running stream benchmark on daint again
<heller> including GPU support
david_pfander has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Read error: Connection reset by peer]
bikineev has joined #ste||ar
<jbjnr> heller: what's your skype id?
<jbjnr> we are setting everything up here
<jbjnr> we have a laptop ready for your face
bikineev_ has joined #ste||ar
bikineev_ has quit [Remote host closed the connection]
bikineev has quit [Ping timeout: 240 seconds]
david_pfander has quit [Ping timeout: 240 seconds]
<heller> jbjnr: heller52
<jbjnr> ok, I'll send an invite from the cscs machine/account
<heller> thanks
david_pfander has joined #ste||ar
<jbjnr> Will is giving a quick intro etc
<jbjnr> On our team is Alan, from Nvidia
<jbjnr> he will help us port our kernels, if we can explain them well enough to him
<heller> great
<heller> I am only having problems with cuda + clang in debug mode
<jbjnr> ok, we decided that RelWithDebInfo would be our build mode of choice for the week
<jbjnr> I'll do an HPX install that we will use at this end; we hope all of us can use the same basic set of binaries if possible
<heller> ok
<heller> I have some adjustments
<heller> and we'll probably need to adjust the install over the course of the week
<jbjnr> well, we'll probably have our own octo builds as we tweak stuff
<jbjnr> yes^
<heller> MPI support is missing for example
<jbjnr> adjustments for sure
<jbjnr> We will concentrate on single node to begin with
<heller> sure
<heller> jbjnr: speak up that we intend to use clang
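[editor's note] A hedged sketch of the configure line being agreed on here: a RelWithDebInfo build of HPX with clang and CUDA, plus the MPI parcelport heller notes is still missing. The option names are HPX's standard CMake switches; all paths are placeholders, not the actual daint setup.
```
# Hypothetical configure for the shared install discussed above.
cmake \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DHPX_WITH_CUDA=ON \
  -DHPX_WITH_PARCELPORT_MPI=ON \
  -DCMAKE_INSTALL_PREFIX=/path/to/shared/hpx \
  /path/to/hpx/source
```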
<jbjnr> nb. there is a 128 node reservation on daint, but only eurohack accounts can access it :(
<jbjnr> I announced that at the mentors meeting this morning
<jbjnr> (cuda clang)
<heller> ok
<heller> np
<heller> I'm on the slack already
<jbjnr> great
<jbjnr> talk coming to an end, no idea if you can hear it
david_pfander has quit [Ping timeout: 255 seconds]
david_pfander has joined #ste||ar
pree has joined #ste||ar
Matombo has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev_ has joined #ste||ar
bikineev has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
bikineev_ has quit [Ping timeout: 248 seconds]
EverYoung has quit [Ping timeout: 255 seconds]
K-ballo has joined #ste||ar
diehlpk has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 248 seconds]
bikineev has joined #ste||ar
hkaiser has joined #ste||ar
<github> [hpx] sithhell force-pushed cuda_clang from 1b08a20 to f6b1a6a: https://git.io/v5ukg
<github> hpx/cuda_clang f6b1a6a Thomas Heller: Making Clang + CUDA work...
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
Matombo has quit [Ping timeout: 246 seconds]
heller has joined #ste||ar
mcopik has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
pree has quit [Remote host closed the connection]
pree has joined #ste||ar
Matombo has joined #ste||ar
pree has quit [Ping timeout: 240 seconds]
Matombo has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
diehlpk has quit [Ping timeout: 248 seconds]
bikineev_ has joined #ste||ar
bikineev has quit [Ping timeout: 255 seconds]
<github> [hpx] sithhell pushed 1 new commit to cuda_clang: https://git.io/v5uZL
<github> hpx/cuda_clang 56a0a30 Thomas Heller: Fixing ICE with nvcc
bikineev_ has quit [Ping timeout: 240 seconds]
<jbjnr> hkaiser: heller I have not been able to find a reasonable explanation for the double peaks in our task times https://pasteboard.co/GI0Dk5x.png - is there any conceivable way that when running hpx on many threads it could accidentally run the task twice, due to a race in the deep internals?
<heller> very unlikely
<jbjnr> indeed.
<hkaiser> unlikely indeed
<jbjnr> just posted on slack that stream is ok now, thanks for the boost patch
<heller> semi ok
<heller> performance sucks
<hkaiser> jbjnr: could be a matter of critical tasks being executed too late, holding back everything else
<heller> which worries me
<heller> i'd really check the hardware counters to see what kind of cache misses or other memory transfer we are dealing with here
<jbjnr> no.
<heller> likwid would be a perfect tool to check this
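[editor's note] A sketch of the counter check heller is proposing, assuming likwid-perfctr's usual invocation; the performance group names vary by CPU and the benchmark name is a placeholder.
```
# Pin to cores 0-11 and measure the memory-traffic group while the
# benchmark runs; swap MEM for L2CACHE or TLB_DATA to chase other misses.
likwid-perfctr -C 0-11 -g MEM ./stream_benchmark
```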
<jbjnr> the cache cannot explain it and the task execution cannot cause it - the time is started inside the lambda, and stopped at the end of the lambda
<jbjnr> memory bw calcs do not allow for the scale of the slowdown - the cache is not the cause
<hkaiser> no suspension?
<jbjnr> if we run a small tile, it takes 8ms and we see a peak at 8 and another at 16
<heller> please just check it
<jbjnr> if we run a big tile that takes 30, we see a peak at 30 and another at 60
<hkaiser> loading tlbs?
<jbjnr> only explanation is two threads bound to one core
<jbjnr> but diagnostics disprove this
<jbjnr> as I can dump out the core with each task
<jbjnr> and they are all different and correct
<hkaiser> TLBs?
<heller> translation lookaside buffers?
<hkaiser> yes
<heller> instruction cache misses?
<jbjnr> none of these would cause a 2x delay - they would add some overhead, but not scale with tile size
<heller> well
<jbjnr> bbiab
<hkaiser> jbjnr: TLBs would scale
<heller> we'll only know for sure once we actually look at the counters
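[editor's note] For reference, a minimal sketch of the per-task timing jbjnr describes (an assumed pattern, not the actual benchmark code): the clock starts and stops inside the task body, so a late start can delay the task but cannot inflate the measured duration.
```cpp
#include <chrono>
#include <cstdio>

int main()
{
    auto task = [] {
        // Measurement brackets only the kernel body: scheduling delay,
        // queueing, and suspension before entry are all excluded.
        auto t0 = std::chrono::steady_clock::now();
        // ... tile kernel would run here ...
        auto t1 = std::chrono::steady_clock::now();
        std::printf("task time: %.1f us\n",
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    };
    task();  // in the real code this lambda is scheduled as an HPX task
}
```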
<heller> hkaiser: btw, nvcc on daint works now. as well as cuda clang
<hkaiser> cool, what did you change?
<heller> hooray for spending hours and hours in front of compiler error messages ;)
<heller> really nothing
<hkaiser> heller: you're my hero
<heller> now, we need to bring back the performance :P
<hkaiser> uhh, so why did it start to work?
<heller> well, the strange segfaults last week were on my local test system
<heller> now I am running on daint
<hkaiser> and the compilation problems in unwrap?
<heller> they are only showing up with the binpacking distribution policy
<hkaiser> ahh, because that ties in the actions, right?
<heller> the assertion (in the EDG frontend) is coming out of a file named "scope_tks.c"
<hkaiser> lol
<hkaiser> very helpful
<heller> the policies are statically initialized, right?
<hkaiser> might be
<heller> with a global, that is, at namespace scope
<hkaiser> don't remember
<heller> yeah, they are
<hkaiser> ok
<heller> so this is my guess: it is ok when unwrap is called from within function scope
<hkaiser> heller: could you comment about your findings on the related tickets, pls?
<heller> but the assert fails once it is instantiated from file scope
<hkaiser> k
<heller> this is where the instantiation happens
<hkaiser> k, should we mask this out for cuda, then?
<heller> ^^ this is one of the problematic lines that lead to the ICE
<heller> yes
<heller> just came to my mind that the real solution to the problem is to move this into a separate TU
<hkaiser> heller: well, this unwrap is just a convenience, could be removed altogether
<heller> conclusion: explaining problems to other people helps in finding solutions ;)
<heller> well sure
<hkaiser> glad to serve as a rubber-duck
<zao> *quack*
<heller> but that would just solve the symptoms
<heller> we know the reason now
<hkaiser> indeed
<hkaiser> but we can't solve the cause anyways
<heller> no
<heller> there are plenty of other unwraps in that file though
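[editor's note] A hypothetical reduction of the pattern heller suspects (all names invented; a healthy compiler accepts this, it only marks where the problematic instantiation would sit): the same template call that is fine inside a function is also triggered by an initializer at namespace (file) scope, i.e. during static initialization of the TU.
```cpp
// Stand-in for the real unwrap; the point is only *where* it is
// instantiated, not what it does.
template <typename F>
auto unwrap_like(F f) { return f; }

int identity(int x) { return x; }

// Instantiated at file scope while initializing a global -- the case
// that, per heller's guess, trips the EDG frontend's assertion.
auto file_scope_policy = unwrap_like(&identity);

int from_function_scope()
{
    // The identical call from function scope reportedly compiles fine.
    auto f = unwrap_like(&identity);
    return f(42);
}
```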
<heller> and cuda clang is sooo much nicer
<heller> I'll properly test and file the PR tomorrow
<heller> gtg now
<hkaiser> ok, thanks
<heller> also, the thing with the static is just an assumption for now which needs to be verified
<hkaiser> heller: the important part is that cuda is unblocked now
<heller> yes
<heller> the guys in Lugano have had something to work with since the very first hour, so a close call again ;)
<github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/v5uC5
<github> hpx/master ee5303e Hartmut Kaiser: Merge pull request #2871 from STEllAR-GROUP/fix_throttle...
<heller> hkaiser: buildbot seems to work again, which is very nice
<heller> now showing a few failing tests
<hkaiser> heller: yah, akheir has invested a lot of time
<heller> good
<hkaiser> some things are still pending
<heller> ok
<heller> I couldn't spot the issues I reported anymore
<heller> so all failures should be considered as the real thing now
<heller> yeah, saw the message
<hkaiser> k
<heller> I hope I can get around to the throttle thingy this week
<heller> I get reports that the throttling scheduler the IBM guy wrote now hangs as well :P
bikineev has joined #ste||ar
parsa has joined #ste||ar
<jbjnr> heller: slack ping
diehlpk has joined #ste||ar
<diehlpk> hkaiser, heller jbjnr zbyerly When should we skype tomorrow? Deadline for the paper is in 10 days
<hkaiser> diehlpk: any time works for me
<diehlpk> heller, Would 6pm your time work for you again?
parsa has quit [Quit: Zzzzzzzzzzzz]
ajaivgeorge has joined #ste||ar
bikineev has quit [Read error: Connection timed out]
jaafar has joined #ste||ar
ajaivgeorge has quit [Quit: ajaivgeorge]
diehlpk has quit [Ping timeout: 260 seconds]
jaafar has quit [Ping timeout: 240 seconds]
diehlpk has joined #ste||ar
david_pfander has quit [Ping timeout: 246 seconds]
bikineev has joined #ste||ar
parsa has joined #ste||ar
Matombo has joined #ste||ar
Matombo has quit [Ping timeout: 248 seconds]
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
K-ballo1 has joined #ste||ar
Matombo has quit [Ping timeout: 255 seconds]
diehlpk has quit [Ping timeout: 246 seconds]
K-ballo has quit [*.net *.split]
K-ballo1 is now known as K-ballo
patg has joined #ste||ar
patg is now known as Guest67730
Guest67730 is now known as patg
patg is now known as Guest6416
Guest6416 is now known as patg
Matombo has joined #ste||ar
patg has quit [Quit: This computer has gone to sleep]
diehlpk has joined #ste||ar
<github> [hpx] K-ballo created format (+2 new commits): https://git.io/v5uSY
<github> hpx/format ee5da64 Agustin K-ballo Berge: Wrap boost::format uses in traditional (variadic) function call syntax
<github> hpx/format b1dab66 Agustin K-ballo Berge: Add inspect check for unguarded boost::format usage
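[editor's note] A hedged sketch of what a wrapper like the one in K-ballo's first commit could look like (name and placement invented here): a variadic function that feeds its arguments to boost::format, so call sites use plain function-call syntax instead of chained operator%.
```cpp
#include <boost/format.hpp>

#include <initializer_list>
#include <string>

template <typename... Args>
std::string format_str(std::string const& fmt, Args const&... args)
{
    boost::format f(fmt);
    // Apply operator% to each argument in order (pre-C++17 fold trick).
    (void)std::initializer_list<int>{ ((f % args), 0)... };
    return f.str();
}

// usage: format_str("%1% + %2% = %3%", 1, 2, 3) -> "1 + 2 = 3"
```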
Matombo has quit [Ping timeout: 240 seconds]
Matombo has joined #ste||ar
patg has joined #ste||ar
pree has joined #ste||ar
<heller> diehlpk: no
Matombo has quit [Ping timeout: 240 seconds]
diehlpk has quit [Ping timeout: 240 seconds]
Matombo has joined #ste||ar
<heller> 18:00 doesn't work. 16:00 would be fine
pree has quit [Ping timeout: 240 seconds]
david_pfander has joined #ste||ar
pree has joined #ste||ar
bikineev has joined #ste||ar
pree has quit [Quit: AaBbCc]
diehlpk has joined #ste||ar
<diehlpk> heller, Any other time?
<diehlpk> hkaiser and I are available the whole day
<hkaiser> diehlpk: he said 16:00 works for him
<heller> Yes
<diehlpk> Ok, so let us meet at this time. I will commit my changes before the meeting
<diehlpk> We have two pages now
<hkaiser> I might be 5-10 minutes late for this
<hkaiser> but will try to be on time
<heller> Sure, no problem
<heller> I'll work on it tomorrow
<jbjnr> heller: in general, which is better: generic context (boost) coroutines on, or off?
<jbjnr> or hkaiser ^
<hkaiser> jbjnr: depends on platform
<jbjnr> x86
<hkaiser> linux?
<jbjnr> mostly
<hkaiser> then leave it off
<jbjnr> cray linux anyway
<jbjnr> ok
<jbjnr> what's the main diff
<hkaiser> ours is a tick faster
<jbjnr> ok thanks
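[editor's note] The knob being discussed is, as far as I can tell, HPX's CMake option for switching between Boost.Context ("generic context") coroutines and HPX's own context-switching code; treat the exact option name as an assumption.
```
# OFF uses HPX's own context switching on x86 Linux, which hkaiser says
# is a tick faster; ON falls back to the generic Boost.Context code.
cmake -DHPX_WITH_GENERIC_CONTEXT_COROUTINES=OFF ...
```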
patg has quit [Read error: Connection reset by peer]
patg has joined #ste||ar
david_pfander has quit [Ping timeout: 240 seconds]
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
diehlpk has quit [Ping timeout: 240 seconds]
<jbjnr> hkaiser: fyi - --hpx:bind=scatter looks like it is broken. Not sure, but the pu mask does not seem right to me ...
<hkaiser> ok
<hkaiser> pls create a ticket
<jbjnr> ok, still doing tests.
<patg> use balanced
<jbjnr> balanced is wrong too
<jbjnr> this is not good
bikineev has quit [Remote host closed the connection]
Matombo has quit [Remote host closed the connection]
<hkaiser> jbjnr: --bind=balanced distributes threads across cores in a balanced way, not across sockets - at least that's the intent.
<hkaiser> from what I can see of the bitmasks you posted, all is correct
<hkaiser> jbjnr: so this is not a bug
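[editor's note] One way to check the PU masks directly, assuming HPX's standard command-line options (the application name is a placeholder): ask HPX to print the binding it actually applied.
```
# Print the computed affinity mask for each worker thread, then compare
# scatter/balanced against what the masks should look like.
./my_hpx_app --hpx:threads=8 --hpx:bind=balanced --hpx:print-bind
./my_hpx_app --hpx:threads=8 --hpx:bind=scatter  --hpx:print-bind
```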
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
mbremer has joined #ste||ar
mbremer has quit [Client Quit]
diehlpk has joined #ste||ar