hkaiser changed the topic of #ste||ar to: The topic is 'STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/'
nikunj has joined #ste||ar
nikunj has quit [Client Quit]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 276 seconds]
weilewei has joined #ste||ar
<weilewei> simbergm yes, the build_type shows up
<weilewei> simbergm how should I enable running the tests for the hpx spack package?
K-ballo has quit [Quit: K-ballo]
diehlpk has joined #ste||ar
<diehlpk> hkaiser, What about having the AMT workshop in the week before Easter in March 2021?
<hkaiser> diehlpk: sure
<diehlpk> Ok, I will let Karen know
<hkaiser> diehlpk: I have nothing planned yet ;-)
<diehlpk> I will submit the proposal tomorrow after the vis meeting
<diehlpk> We have had a vis meeting for octotiger for a while now
<hkaiser> nod, cool
hkaiser has quit [Quit: bye]
weilewei has quit [Remote host closed the connection]
diehlpk has quit [Ping timeout: 250 seconds]
nikunj has joined #ste||ar
nikunj has quit [Quit: Leaving]
khuck has quit [*.net *.split]
jaafar has quit [*.net *.split]
heller has quit [*.net *.split]
simbergm has quit [*.net *.split]
jbjnr has quit [*.net *.split]
mdiers_ has quit [*.net *.split]
diehlpk_work has quit [*.net *.split]
simbergm has joined #ste||ar
jbjnr has joined #ste||ar
simbergm has quit [*.net *.split]
simbergm has joined #ste||ar
khuck has joined #ste||ar
jaafar has joined #ste||ar
mdiers_ has joined #ste||ar
diehlpk_work has joined #ste||ar
heller has joined #ste||ar
K-ballo has joined #ste||ar
K-ballo has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
K-ballo has quit [Ping timeout: 250 seconds]
K-ballo has joined #ste||ar
rori has joined #ste||ar
hkaiser has joined #ste||ar
<simbergm> K-ballo: yt?
<K-ballo> simbergm: partially
<simbergm> ok, then I'll just ask the short version: are there any known "difficulties" with invoke_fused?
<K-ballo> no
<simbergm> trying to get clang to compile some cuda code; vanilla clang complains about no type named 'type' in invoke_fused_result, Cray clang ICEs, and nvcc is happy
<simbergm> and I'm trying to figure out whose fault it is
<K-ballo> no type in invoke_fused_result means the call is not well formed
<simbergm> right
<simbergm> do you have any tricks for debugging these kinds of (lack of) template instantiations?
<K-ballo> you should get the types involved in the call in the error message
<K-ballo> you could try setting up the equivalent call manually, to see what causes it to be ill-formed
<simbergm> do you spot anything out of place here: https://gist.github.com/msimberg/18229154965747df8257082643859c8c
<K-ballo> that your lambda has an explicit return type?
<K-ballo> either that, or it doesn't take 3 arguments
<K-ballo> the arguments for your call are hpx::compute::cuda::target_ptr<int>&, hpx::compute::cuda::target_ptr<int>&, hpx::compute::cuda::target_ptr<int>&
<simbergm> it does... and no explicit return type (it's void)
<K-ballo> what are the 3 parameters of the lambda?
<simbergm> arguments are correct
<simbergm> I wonder if it has to be HPX_HOST_DEVICE for that to work correctly
<simbergm> not that nvcc cares...
<K-ballo> try calling your lambda directly, with 3 target_ptr<int> lvalues, that should show the concrete error
<simbergm> even a lambda with no arguments produces the same error
<simbergm> making it host device (kind of) gets rid of the error
<simbergm> I guess it wouldn't be too surprising if nvcc handles this differently
<simbergm> it's not the target_ptrs at least, but I'm not sure yet if host device is correct in this case...
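A minimal sketch of the debugging approach K-ballo suggests above: call the lambda directly with three lvalues, then set up the equivalent fused call by hand, so the compiler reports the concrete reason the call is ill-formed instead of just a missing invoke_fused_result::type. The header paths and hpx::util spellings follow the HPX version discussed here and may differ in other releases; plain ints stand in for the target_ptr<int> arguments from the gist.

    #include <hpx/util/invoke_fused.hpp>
    #include <hpx/util/tuple.hpp>

    int main()
    {
        // In the real code the lambda took three target_ptr<int>& parameters and,
        // under clang+CUDA, additionally needed to be marked HPX_HOST_DEVICE.
        auto f = [](int& x, int& y, int& z) { x = y + z; };

        int a = 0, b = 1, c = 2;

        // 1. Call the lambda directly with three lvalues: any mismatch in arity,
        //    value category, or host/device annotation is then reported concretely.
        f(a, b, c);

        // 2. Set up the equivalent fused call manually.
        auto args = hpx::util::forward_as_tuple(a, b, c);
        hpx::util::invoke_fused(f, args);

        return 0;
    }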
aserio has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
<simbergm> heller: are you joining the call?
weilewei has joined #ste||ar
nikunj has joined #ste||ar
<hkaiser> simbergm: thanks!
<hkaiser> yah, it's the same issue
<hkaiser> simbergm: we might want to force this compiler into using C++11
<hkaiser> or C++14
<hkaiser> this compiler definitely has no notion of C++17
<hkaiser> simbergm: what do you think?
<hkaiser> alternatively we could stop running the C++17 feature tests for C++1z mode compilation
<hkaiser> (which would be proper anyways)
<simbergm> hkaiser: the second sounds like a clean thing to do
<simbergm> the first is fine by me as well; I think we should drop that compiler pretty soon anyway
<simbergm> so it would not be a very long-lived hack
<hkaiser> I can send an email to Al asking him to apply the buildbot change for all clang4 builders
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
<simbergm> ah right, that's even easier (I was thinking of detecting it in cmake)
<simbergm> thanks!
rori has quit [Quit: bye]
nikunj has quit [Ping timeout: 265 seconds]
RostamLog has joined #ste||ar
<weilewei> Is it true that make tests in hpx takes a long time?
<K-ballo> possibly, since it will run them too
<weilewei> really? To run the tests I would pass make test, but I have not done so
<weilewei> Also, I am on a login node, so the tests are not supposed to run
<weilewei> It's been more than an hour, and only 20% of the tests are built...
<K-ballo> yeah, back in the day I would get called out for forgetting and doing `make tests` on the login node
<K-ballo> so my solution was to instead do make tests.unit and make tests.regression; those did not run the tests, just build them
<K-ballo> (one of those is plural, don't remember which one)
<K-ballo> maybe things have changed by now and `make tests` no longer runs tests?
<simbergm> make tests doesn't run the tests (make test or ctest does), and yes, building them takes a long(!) time
<simbergm> especially if you're literally doing just make tests (i.e. no -jN)
<weilewei> Ah, I can use -jN to build the tests in parallel
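For reference, a sketch of what that looks like from the build directory (the -j value is arbitrary; as K-ballo notes above, one of the per-category target names is plural, so check the exact spelling in your build tree):

    # build, but do not run, all tests with 16 parallel jobs
    make -j16 tests

    # or build only a subset, e.g. the unit or regression tests
    make -j16 tests.unit
    make -j16 tests.regressions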
aserio has quit [Ping timeout: 250 seconds]
<zao> Watch out for memory consumption, linking tests eats RAM like candy
<zao> Less of a problem on a cluster, but great fun on a small home machine.
<weilewei> simbergm tests are building faster than before. While they build, btw: is it possible to run make test -jN in parallel?
<zao> My build machine has 64G mostly thanks to HPX :)
<zao> Not sure if you can run tests in parallel, would be some ctest feature
<weilewei> zao I guess as long as Summit admin does not kick me out, I do not need to worry... lol
<simbergm> weilewei: it'll work with some tests but not all
<weilewei> simbergm ok thanks!
<simbergm> the multinode tests expect to be the only hpx instance on a machine
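A sketch of running the already-built tests, assuming a CTest-based HPX build tree; -j here controls how many tests run concurrently, which, as simbergm points out, is not safe for the multinode tests:

    # run everything serially, printing output for failing tests
    ctest --output-on-failure

    # run up to four tests at a time (only safe for tests that tolerate
    # other hpx instances on the same machine)
    ctest -j4 --output-on-failure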
<zao> Back when I worked on my soak tester I ran all tests in a networkless container :)
aserio has joined #ste||ar
nikunj has joined #ste||ar
<diehlpk_work> hkaiser, am I good to go to submit the proposal?
<aserio> hkaiser: see pm
<hkaiser> diehlpk_work: pls notify Ronnie asking him whether he has more comments
<diehlpk_work> hkaiser, see pm
nikunj has quit [Ping timeout: 265 seconds]
nikunj has joined #ste||ar
<aserio> hkaiser: yt?
<hkaiser> here
<hkaiser> aserio: ^^
<aserio> I have a question in the pm
<weilewei> diehlpk_work how do you run the MPI-related tests for hpx on a Power9 machine?
<diehlpk_work> weilewei, submit a job?
<weilewei> diehlpk_work all the MPI tests use mpiexec by default, but Summit uses jsrun, so these MPI tests all failed
<weilewei> diehlpk_work so you made no changes to the run scripts for the MPI-related tests?
<diehlpk_work> export mpiexec=jsrun
<weilewei> let me try
<diehlpk_work> So jsrun would be used for mpiexec
<weilewei> still failed
<diehlpk_work> Depending on your shell you have to use export or setenv
<weilewei> not sure if mpi related tests are really failing on Summit or not
<weilewei> export
<diehlpk_work> weilewei, Where did you add the export command?
<diehlpk_work> In your job script?
<diehlpk_work> Can you send me your job script?
<weilewei> I am on an interactive compute node; I just exported it
<weilewei> see PM
<weilewei> error log, in case anyone is interested
<diehlpk_work> weilewei, Add the export mpiexec=jsrun to line 36
<diehlpk_work> weilewei, The error message is not related to hpx
<diehlpk_work> It seems that something is wrong with the interconnect
<weilewei> Let me check
<diehlpk_work> see my previously posted link
aserio has quit [Ping timeout: 250 seconds]
<weilewei> I replaced mpiexec with jsrun in the bin/hpxrun.py command, and those tests are now passing :) diehlpk_work you gave me a good idea
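To summarize the workaround that worked here (a sketch only: whether hpxrun.py honours an mpiexec environment variable depends on the HPX version, the sed edit is just one way of doing the replacement weilewei describes, and the -R pattern is a placeholder):

    # first attempt suggested above: point the launcher at jsrun via the environment
    export mpiexec=jsrun

    # what ended up working: make bin/hpxrun.py launch with jsrun instead of mpiexec
    sed -i 's/mpiexec/jsrun/' bin/hpxrun.py

    # then re-run the MPI-related tests
    ctest --output-on-failure -R mpi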
<hkaiser> weilewei: nice
<hkaiser> diehlpk_work: do you plan to join the physics zoom meeting now?
nikunj has quit [Ping timeout: 265 seconds]
nikunj has joined #ste||ar