aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
parsa has quit [Quit: Zzzzzzzzzzzz]
kisaacs has joined #ste||ar
kisaacs has quit [Ping timeout: 240 seconds]
parsa has joined #ste||ar
kisaacs has joined #ste||ar
kisaacs has quit [Ping timeout: 260 seconds]
jaafar has quit [Ping timeout: 258 seconds]
Matombo444 has joined #ste||ar
Matombo222 has joined #ste||ar
Matombo has quit [Ping timeout: 248 seconds]
Matombo444 has quit [Ping timeout: 258 seconds]
Matombo has joined #ste||ar
Matombo222 has quit [Quit: Leaving]
Matombo has quit [Quit: Leaving]
EverYoung has joined #ste||ar
Matombo has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
Matombo has quit [Remote host closed the connection]
Matombo has joined #ste||ar
Matombo has quit [Remote host closed the connection]
<jbjnr>
I just added a link to HPX_LIBRARIES to make it go away
<heller>
the dreaded HPX_LIBRARY_DIR
<jbjnr>
(btw I tried using gcc5.3 and getting cuda running - but the cmake wrapping is broken and I couldn't be arsed fixing it)
<heller>
fixed
<jbjnr>
that link_directories is already in the parent cmakelists
<heller>
which I don't use
<jbjnr>
aha. You probably only build one example, not all of them
<jbjnr>
loser
<jbjnr>
I cannot get papi counters to appear in apex/vampir
<jbjnr>
I want to display hpx perf counters in vampir :(
<jbjnr>
PS. Change of plan tonight. if you only have small rucksack -then I collect you from airport on scooter, we go straight to hotel or manor for dinner and save time.
<heller>
ok
<heller>
sounds good
<heller>
i'll only pack small
<jbjnr>
remind me of landing time
<jbjnr>
pls
<heller>
18:15
<jbjnr>
nice weather here - should be no probs
<jbjnr>
if I am late and bus is there, just take it and I find you at hotel.
<heller>
ok
<heller>
text me or so then
<heller>
if you don't make it
<heller>
why is it impossible for stuff to just work?
<heller>
jbjnr: stencil fixed
<jbjnr>
\o/
<jbjnr>
does it use unwrapper?
<jbjnr>
unwrapped
<jbjnr>
I wanted a slide on that
<heller>
it doesn't
<heller>
but it is easy to devise a slide or two on that
<jbjnr>
put it in your api session please :)
<heller>
ok
<jbjnr>
clang plus installed hpx ok for you, no issues?
<heller>
yes
<heller>
fixes pushed
<heller>
now, adding RP support to stream benchmark
<jbjnr>
that might save some time, because it creates one pool per numa domain for you
<jbjnr>
(NB. PR awaiting that allows default pool to be renamed so you can just loop over numa domains and create pools)
<heller>
jbjnr: great!
<heller>
I assume the patch is already in the cscs17 branch?
<jbjnr>
yes
<heller>
even greater
<heller>
so stencil next, I guess ;)
<jbjnr>
the RP_stream does not work, but the pools are set up. What I wanted was to create targets from the pools, but I do not like the block executor much and thought it might be an idea to just launch n threads per pool
<jbjnr>
the block executor needs work after recent change I think
<jbjnr>
I have a numa allocator not yet committed either
<heller>
yes
<heller>
the block allocator still has its use though
jaafar has joined #ste||ar
<heller>
it can be seen as a migration path
<jbjnr>
or membind - can use it with pool - pool->get_numa_bitmap - then numa allocator alloc using mask either bound, interleaved etc.
<heller>
yeah, sure
<jbjnr>
we need a "future directions" section to summarise the changes we are making for pools/targets/gpus etc
<jbjnr>
--hpx:attach-debugger=startup is broken so we can't do any debugging :(
jaafar has quit [Ping timeout: 246 seconds]
<heller>
er
<heller>
why is it broken?
kisaacs has joined #ste||ar
<jbjnr>
has been for weeks since some commit to disable it messed up, must be an easy fix. I didn't try yet
<heller>
ok
<heller>
yeah ... the block_executor requires some decent change to work with the RP
<heller>
we have a long consolidation phase ahead of us...
kisaacs has quit [Ping timeout: 258 seconds]
<jbjnr>
jesus. stencil_4 has some warnings!
<jbjnr>
still not compiling for me though
<heller>
not for me :/
<jbjnr>
oops. it's fibonacci that failed. sorry
<heller>
ok, let me check that
<jbjnr>
fixed
<heller>
fibonacci?
<heller>
that one is using unwrapped
<heller>
unwrapped->unwrapping
david_pfander has joined #ste||ar
<heller>
jbjnr: can you log on to daint right now?
<jbjnr>
yes, fine
<heller>
ok, small hangup
<heller>
/apps/daint/UES/6.0.UP04/HPX/clang/hpx/rdmaster/include/hpx/compute/cuda/detail/launch.hpp:97:13: error: static_assert failed "We currently require the closure to be less than 256 bytes"
<heller>
shite
<heller>
jbjnr: can you do me a favor?
<jbjnr>
whassup?
<jbjnr>
rewrite clang ptx generation?
<heller>
jbjnr: in /apps/daint/UES/6.0.UP04/HPX/clang/hpx/rdmaster/include/hpx/compute/cuda/detail/launch.hpp line 97
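The static_assert quoted above rejects, at compile time, any closure of 256 bytes or more. Below is a minimal standalone sketch of that kind of check in plain C++. It is not the contents of HPX's launch.hpp: the names `max_closure_size`, `closure_fits`, `launch`, `big_payload` and `example` are hypothetical, and the closure is simply called on the host where the real code would hand it to a CUDA kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical limit, taken from the wording of the error message above.
constexpr std::size_t max_closure_size = 256;

// True if the closure object (including everything it captured by value)
// is smaller than the limit.
template <typename Closure>
constexpr bool closure_fits()
{
    return sizeof(typename std::decay<Closure>::type) < max_closure_size;
}

// Stand-in for a kernel launch: refuses to compile for oversized closures,
// otherwise just invokes the closure on the host.
template <typename Closure>
auto launch(Closure&& closure) -> decltype(std::forward<Closure>(closure)())
{
    static_assert(closure_fits<Closure>(),
        "We currently require the closure to be less than 256 bytes");
    return std::forward<Closure>(closure)();
}

// A payload that would push a by-value capture over the limit.
struct big_payload
{
    char data[512];
};

// A closure capturing a single int is far below the limit.
inline int example()
{
    int x = 41;
    int result = launch([x] { return x + 1; });
    assert(result == 42);
    return result;
}
```

A lambda that captures a `big_payload` by value makes `closure_fits` return false, so passing it to `launch` would trigger the static_assert. The usual ways out are capturing a pointer or reference to the large object, or capturing fewer values.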
<jbjnr>
Day 1 intro 1 + building and hello world= JB | intro 2+stencil = TH
<heller>
jbjnr: when do you want to handle RP and CUDA now
<jbjnr>
Day 2 : resource JB, then ...
<heller>
instead of profiling and debugging?
<jbjnr>
can you quickly skype now?
<heller>
sure
<jbjnr>
f 5 mins
hkaiser has joined #ste||ar
Matombo has quit [Quit: Leaving]
Matombo has joined #ste||ar
K-ballo has joined #ste||ar
<jbjnr>
hkaiser: good day. Thanks for the fixes. Things much more stable today.
<jbjnr>
Got time for a quick question?
<hkaiser>
jbjnr: g'morning
<hkaiser>
sure
<jbjnr>
how do I see a performance counter in apex?
<hkaiser>
APEX intercepts the PC counters you specify in the usual way
<hkaiser>
do you need it periodically?
<jbjnr>
it's the "in the usual way" part that I think I'm missing.
<jbjnr>
I can add a --print-counter"blah"
<jbjnr>
and it is printed - but I do not know how to tell apex to sample this
<hkaiser>
whatever you specify on the command line will show up in APEX
<jbjnr>
hmm. ok, then something isn't doing what I expected
<hkaiser>
do you want to see periodic sampling?
<jbjnr>
yes
<hkaiser>
use --hpx:print-counter-interval
<jbjnr>
ok, I'll try again
<hkaiser>
counter interval takes time in milliseconds
<hkaiser>
--hpx:print-counter-interval=100 means evaluate all counters every 100ms
<hkaiser>
and APEX should intercept those
<jbjnr>
ah. it works now.
<jbjnr>
thanks
<jbjnr>
I thought I tried that, but must have messed up
<jbjnr>
cool - I can see papi counter.
<hkaiser>
cool
<jbjnr>
aha. new question.
<jbjnr>
Suppose I want to sample every 1ms, but not to display it on screen
<jbjnr>
TMI
Matombo has quit [Quit: Leaving]
<jbjnr>
I just had a nice idea.
<jbjnr>
when I increase the frequency of the counter sampling, it crashes - I suspect that this is because millions of tasks are being created to sample and building up in unlooked after queues if the cpu is busy ...
<jbjnr>
so we need a thread pool devoted to apex/sampling/profiling etc
<jbjnr>
(like the timer pool, but better)
<jbjnr>
and using RP
<hkaiser>
well, this sounds like a great idea, but I'd first find out why it crashes, which could have any cause
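The hypothesis above (unverified, as hkaiser points out) is that sample requests pile up faster than a busy CPU can service them. One simple way to picture the problem is a bounded queue that drops requests once it is full. This is a different and much smaller mitigation than the dedicated thread pool jbjnr proposes, and it is not how HPX or APEX schedule counter sampling. The names `sample_queue` and `simulate_busy_cpu` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bounded queue of pending sample requests.
// Single-threaded sketch: no locking, so it is not safe to share across threads.
class sample_queue
{
public:
    explicit sample_queue(std::size_t capacity) : capacity_(capacity) {}

    // Called on every timer tick. When the queue is full the request is
    // dropped and counted instead of being allowed to grow the backlog.
    bool request_sample(long timestamp_ms)
    {
        if (pending_.size() >= capacity_)
        {
            ++dropped_;
            return false;
        }
        pending_.push_back(timestamp_ms);
        return true;
    }

    // Called by whatever services the samples; returns how many were handled.
    std::size_t drain()
    {
        std::size_t handled = pending_.size();
        pending_.clear();
        return handled;
    }

    std::size_t pending() const { return pending_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    std::deque<long> pending_;
};

// Simulates `ticks` timer ticks arriving while nothing drains the queue,
// i.e. the "cpu is busy" case. Returns the number of dropped requests.
inline std::size_t simulate_busy_cpu(std::size_t ticks, std::size_t capacity)
{
    sample_queue queue(capacity);
    for (std::size_t t = 0; t < ticks; ++t)
        queue.request_sample(static_cast<long>(t));
    assert(queue.pending() <= capacity);
    return queue.dropped();
}
```

With an unbounded queue the backlog in the busy case would grow by one entry per tick. Here it stops at `capacity`, and the drop count shows how far sampling has fallen behind.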
<jbjnr>
indeed - if only I had some kind of debugging tool to help me ...
<jbjnr>
(ironic laughter)
<hkaiser>
visual studio is great for this ;)
<jbjnr>
qtcreator
<hkaiser>
crap
<hkaiser>
still relies on gdb
<jbjnr>
need to rebuild everything again though
<jbjnr>
gdb rocks! ask heller -he knows
<hkaiser>
it's not the visual studio ide, but the embedded debugger which is great
<jbjnr>
I know. I used to use it always.
<jbjnr>
it was infiniband that drove me off VS and onto linux
<hkaiser>
nowadays this can be done even for linux apps
<github>
hpx/cscs2017 70a7d5a John Biddiscombe: Add #include files needed to set _POSIX_VERSION for debug check...
parsa has joined #ste||ar
kisaacs has joined #ste||ar
<github>
[hpx] biddisco opened pull request #2930: Add #include files needed to set _POSIX_VERSION for debug check (master...fixing_2924) https://git.io/vd4cM
<heller>
hkaiser: hey, are your cppcon slides uploaded somewhere?
<pree>
It's really nice to see hpx slides in a different way :)
<heller>
jbjnr: no pdf available on the interwebs, it seems
rod_t has joined #ste||ar
parsa has joined #ste||ar
eschnett has quit [Quit: eschnett]
<jbjnr>
heller: I'll send you pdf, but not now, need time to do slides
<heller>
sure
eschnett has joined #ste||ar
zbyerly_ has quit [Ping timeout: 255 seconds]
zbyerly_ has joined #ste||ar
zbyerly_ has quit [Ping timeout: 240 seconds]
<hkaiser>
jbjnr: I'd like to see the pdf as well
<hkaiser>
heller, jbjnr: also, pls see hpx-users
<heller>
kk
<diehlpk_work>
hkaiser, error: no viable conversion from returned value of type 'future<boost::range_detail::integer_iterator<int>>' to function return type 'future<void>'
<diehlpk_work>
Is this from broken HPX master or other bug?
<hkaiser>
uhh, master should be fine again
<hkaiser>
that is a strange thing to happen - do you have a small test case for this?
<heller>
future<T> to future<void> should work
parsa[w] has joined #ste||ar
<diehlpk_work>
heller, Yes with gcc, but not with clang on circle-ci
<heller>
can you be more specific?
<diehlpk_work>
Compiling my code with gcc 6.2 on my local machine works. When I compile the same code with clang on circle-ci I get this error
<heller>
this has no more information ;)
<heller>
can you show the code or the specific error?
<hkaiser>
diehlpk_work: if it doesn't work we would need a small self-contained reproducing test case to fix it
<heller>
the problematic one is the future called target
<heller>
or am I being paranoid here?
<K-ballo>
I don't see concurrent access to shared_future instances
<heller>
free_resources.set might throw
<K-ballo>
channel is implemented on top of shared futures?
<heller>
no
kisaacs has quit [Ping timeout: 264 seconds]
<heller>
only this specific code
<heller>
not sure if I am seeing a red herring there
zbyerly_ has quit [Quit: Leaving]
<heller>
I found two workarounds to that bug: 1) waiting on the future returned by result.then 2) catching the exception inside the result.then continuation
<heller>
ahh, i see what's happening
<heller>
dangling reference. the result.then continuation might outlive free_resources, which is captured by reference
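The dangling reference diagnosed above comes from a continuation that captures `free_resources` by reference and then runs after that object has been destroyed. A common fix is to give the continuation shared ownership of whatever it touches. The sketch below is plain C++ and makes several assumptions: a `std::function` stands in for the `result.then` continuation, and `resource_pool` and `make_release` are hypothetical names. The real type of `free_resources` is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the object the continuation writes to.
struct resource_pool
{
    std::vector<int> free_ids;

    void set(int id) { free_ids.push_back(id); }
};

// Builds a continuation that hands `id` back to the pool when it runs.
// The shared_ptr is captured by value, so the pool stays alive for as long
// as the continuation exists, even if the caller drops its own handle first.
// Capturing the pool by reference here would reproduce the dangling
// reference described above.
inline std::function<std::size_t()> make_release(
    std::shared_ptr<resource_pool> pool, int id)
{
    assert(pool != nullptr);
    return [pool, id] {
        pool->set(id);
        return pool->free_ids.size();
    };
}
```

The two workarounds heller mentions take a different route. Waiting on the future returned by `result.then` keeps the captured object's scope alive until the continuation has finished, and catching the exception inside the continuation stops the failure from propagating.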
pree has quit [Ping timeout: 248 seconds]
jaafar_ has joined #ste||ar
<github>
[hpx] sithhell pushed 2 new commits to fixing_2916: https://git.io/vd49K
<github>
hpx/fixing_2916 3856c95 Thomas Heller: Merge branch 'fixing_2924' into fixing_2916
<github>
hpx/fixing_2916 90dc03d Thomas Heller: Fixing test for #2916...
kisaacs has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
parsa has quit [Ping timeout: 246 seconds]
kisaacs has quit [Ping timeout: 248 seconds]
<K-ballo>
maybe we just bite the bullet and add dtor/get sync? everyone seems to expect that to work
<K-ballo>
'// if one of the includes is <hpx/hpx.hpp> assume all is well
hkaiser has joined #ste||ar
kisaacs has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]