hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
hkaiser_ has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
hkaiser has quit [Quit: bye]
akheir has quit [Quit: Leaving]
nanmiao11 has quit [Remote host closed the connection]
K-ballo has quit [Quit: K-ballo]
bita has quit [Ping timeout: 244 seconds]
diehlpk_work_ has joined #ste||ar
diehlpk_work has quit [Ping timeout: 260 seconds]
parsa| has joined #ste||ar
parsa has quit [*.net *.split]
parsa[m] has quit [*.net *.split]
kordejong has quit [*.net *.split]
tarzeau has quit [*.net *.split]
parsa| is now known as parsa
kordejong has joined #ste||ar
parsa[m] has joined #ste||ar
tarzeau has joined #ste||ar
hkaiser has joined #ste||ar
K-ballo has joined #ste||ar
diehlpk_work has joined #ste||ar
diehlpk_work_ has quit [Ping timeout: 258 seconds]
diehlpk_work has quit [Ping timeout: 240 seconds]
nanmiao11 has joined #ste||ar
<ms[m]> hkaiser: should we go with the MPI branch as it is so that we get those configurations more or less running again? I think the rest of the failures that are left on that PR are old
<hkaiser> ms[m]: yes sure
<hkaiser> the only concern I have (and sorry for not being very responsive lately) is that initializing MPI after the command line handling might be too late
<hkaiser> I have not looked closely, just skimmed over it
<ms[m]> mmh, yes, that was my concern as well, but I'm hoping that we can trust the tests
<ms[m]> I put it as early as I could after command line arguments are parsed
<hkaiser> I don't think we have those configurations covered
<hkaiser> especially the use cases John has
<hkaiser> but sure, let's merge it and see what happens ;-)
<ms[m]> hmm, some of them are covered, but probably not that well...
<hkaiser> ms[m]: I know for sure that command line handling needs the mpi environment if run without a batch system (i.e. plain mpirun/mpiexec)
<ms[m]> yeah, I know, the mpi handling is now right in the middle of the command line handling, not right after it
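(A minimal sketch, not HPX's actual code, of the kind of idempotent init guard being discussed: whichever startup step touches MPI first can call it, so the exact placement relative to command line handling matters less. The helper name is made up for illustration.)

    #include <mpi.h>

    void ensure_mpi_initialized()
    {
        int initialized = 0;
        MPI_Initialized(&initialized);
        if (!initialized)
        {
            // A multi-threaded parcelport generally wants MPI_THREAD_MULTIPLE;
            // handling a lower 'provided' level is omitted in this sketch.
            int provided = 0;
            MPI_Init_thread(nullptr, nullptr, MPI_THREAD_MULTIPLE, &provided);
        }
    }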
<hkaiser> I was thinking of turning the mpi environment into yet another batch system (like we have for slurm, pbs, etc.)
<hkaiser> that might be the right spot to initialize things, at least look at the MPI environment
<ms[m]> yes, that might make things easier
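(A rough sketch of the "treat mpirun like another batch environment" idea: probe environment variables that common MPI launchers export for each rank, the way the slurm/pbs detection inspects their variables. The variable names below are illustrative and launcher-specific, not an exhaustive or authoritative list.)

    #include <cstdlib>

    bool launched_by_mpi()
    {
        char const* vars[] = {
            "OMPI_COMM_WORLD_SIZE",    // Open MPI
            "PMI_SIZE",                // MPICH / PMI-based launchers
            "MV2_COMM_WORLD_SIZE",     // MVAPICH2
        };
        for (char const* v : vars)
        {
            if (std::getenv(v) != nullptr)
                return true;    // looks like we were started by mpirun/mpiexec
        }
        return false;
    }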
<ms[m]> well, for a start we could check if jbjnr__ (you here?) could try out the branch
<hkaiser> right
<hkaiser> I'll try to make some time as well
<ms[m]> we will at least get daint running again now
<hkaiser> I'm very distracted by real work, currently - sorry
<ms[m]> I'd just like to avoid having ci down for too long again like earlier this year
<hkaiser> btw, what's up with jenkins on rostam?
<ms[m]> yeah, no problem, I understand
<ms[m]> well, that mpi init branch fixes it ;)
<hkaiser> ahh - good - then let's merge it
<hkaiser> we can always fix things afterwards
<ms[m]> we didn't have mpi in the slurm environment before so we just always tested the tcp parcelport
<ms[m]> I think...
<hkaiser> hmmm, did we? really?
<ms[m]> actually, we were using mpiexec directly
<ms[m]> it wasn't automatically detected at least
<hkaiser> ok
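(A hedged example of what explicitly enabling the MPI parcelport could look like, instead of relying on auto-detection of the launch environment. It assumes HPX was built with HPX_WITH_PARCELPORT_MPI=ON; header paths and init overloads may differ between HPX versions.)

    // Launch under plain mpiexec with the MPI parcelport forced on, e.g.:
    //
    //   mpiexec -n 2 ./hello_dist --hpx:ini=hpx.parcel.mpi.enable=1
    //
    #include <hpx/hpx_init.hpp>
    #include <hpx/include/runtime.hpp>

    #include <iostream>

    int hpx_main(hpx::program_options::variables_map&)
    {
        std::cout << "hello from locality " << hpx::get_locality_id()
                  << std::endl;
        return hpx::finalize();
    }

    int main(int argc, char* argv[])
    {
        return hpx::init(argc, argv);
    }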
<ms[m]> in any case, I know it's not optimal right now, but we'll have to try things in the wild
<hkaiser> yes, merge it!
<ms[m]> lol
<ms[m]> don't encourage me too much!
<ms[m]> thanks
<ms[m]> while I have you here, did you have anything against #4946 going in as it is now?
<hkaiser> have not looked, so the answer is: I don't object ;-) It's your code after all, you know best
<ms[m]> yeah, the caveats are as commented in the PR, and the guy who opened the PR seems ok with it as it is
<ms[m]> the distributed suspension case is not very important for me, but I don't want to block someone from trying it out as long as they know what they're doing
<hkaiser> ms[m]: ok
<hkaiser> ms[m]: there is probably no generic solution for this anyways
diehlpk_work has joined #ste||ar
bita has joined #ste||ar
akheir has joined #ste||ar
weilewei has joined #ste||ar
<weilewei> hkaiser My CppCon speaker Tsung-wei Huang, assistant professor from U of Utah working on CppTaskFlow, might want to chat with you sometime to discuss distributed models and look for some collaboration. His CppCon link: and
<hkaiser> YES!
<hkaiser> good
<hkaiser> I was hoping for them to get in contact at some point
<weilewei> Nice, shall I start an email chain between you two?
<hkaiser> yes, please
<hkaiser> add Katie, please
<weilewei> Ok, will write it now
<weilewei> ok, done.
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 256 seconds]
K-ballo1 is now known as K-ballo
<hkaiser> weilewei: thanks a lot!
<weilewei> :) welcome
nanmiao11 has quit [Remote host closed the connection]
nanmiao11 has joined #ste||ar
akheir has quit [Quit: Leaving]
shahrzad has joined #ste||ar
akheir has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
nanmiao11 has quit [Remote host closed the connection]
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
nanmiao11 has joined #ste||ar
nanmiao11 has quit [Remote host closed the connection]
nanmiao11 has joined #ste||ar
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar