hkaiser_ has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
hkaiser has quit [Quit: bye]
akheir has quit [Quit: Leaving]
nanmiao11 has quit [Remote host closed the connection]
K-ballo has quit [Quit: K-ballo]
bita has quit [Ping timeout: 244 seconds]
diehlpk_work_ has joined #ste||ar
diehlpk_work has quit [Ping timeout: 260 seconds]
parsa| has joined #ste||ar
parsa has quit [*.net *.split]
parsa[m] has quit [*.net *.split]
kordejong has quit [*.net *.split]
tarzeau has quit [*.net *.split]
parsa| is now known as parsa
kordejong has joined #ste||ar
parsa[m] has joined #ste||ar
tarzeau has joined #ste||ar
hkaiser has joined #ste||ar
K-ballo has joined #ste||ar
diehlpk_work has joined #ste||ar
diehlpk_work_ has quit [Ping timeout: 258 seconds]
diehlpk_work has quit [Ping timeout: 240 seconds]
nanmiao11 has joined #ste||ar
<ms[m]>
hkaiser: should we go with the MPI branch as it is, so that we get those configurations more or less running again? I think the remaining failures on that PR are old
<hkaiser>
ms[m]: yes sure
<hkaiser>
the only concern I have (and sorry for not being very responsive lately) is that initializing MPI after the command line handling might be too late
<hkaiser>
I have not looked closely, just skimmed over it
<ms[m]>
mmh, yes, that was my concern as well, but I'm hoping that we can trust the tests
<ms[m]>
I put it as early as I could after command line arguments are parsed
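(A minimal sketch of the ordering concern being discussed here: wherever startup ends up calling it, the MPI parcelport has to make sure MPI is initialized exactly once, with full thread support. This uses only standard MPI API and is not HPX's actual implementation:)

```cpp
// Sketch: lazy, idempotent MPI initialization, callable from wherever the
// startup sequence ends up placing it. Standard MPI-2 calls only.
#include <mpi.h>

void init_mpi_if_needed(int* argc, char*** argv)
{
    int already_initialized = 0;
    MPI_Initialized(&already_initialized);    // legal to call before MPI_Init
    if (!already_initialized)
    {
        // An MPI parcelport generally needs MPI_THREAD_MULTIPLE, since
        // several HPX threads may send/receive concurrently.
        int provided = MPI_THREAD_SINGLE;
        MPI_Init_thread(argc, argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
        {
            // fall back / warn: access to MPI would have to be serialized
        }
    }
}
```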
<hkaiser>
I don't think we have those configurations covered
<hkaiser>
especially the use cases John has
<hkaiser>
but sure, let's merge it and see what happens ;-)
<ms[m]>
hmm, some of them are covered, but probably not that well...
<hkaiser>
ms[m]: I know for sure that command line handling needs the mpi environment if run without a batch system (i.e. plain mpirun/mpiexec)
<ms[m]>
yeah, I know, the mpi handling is now right in the middle of the command line handling, not right after it
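(To illustrate hkaiser's point: when a job is started with plain mpirun/mpiexec rather than through a batch system, the only early hint about rank and world size lives in launcher-specific environment variables, which the command line handling has to consult. The variable names below are the ones set by Open MPI and MPICH/Hydra; other implementations use different names:)

```cpp
// Sketch: detecting a plain mpirun/mpiexec launch from the environment,
// before MPI_Init has run. Not HPX's actual detection code.
#include <cstdlib>

bool launched_by_mpirun()
{
    return std::getenv("OMPI_COMM_WORLD_SIZE") != nullptr    // Open MPI
        || std::getenv("PMI_SIZE") != nullptr;               // MPICH/Hydra
}

int world_rank_from_env()
{
    if (char const* r = std::getenv("OMPI_COMM_WORLD_RANK"))
        return std::atoi(r);
    if (char const* r = std::getenv("PMI_RANK"))
        return std::atoi(r);
    return -1;    // not launched through a recognized MPI launcher
}
```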
<hkaiser>
I was thinking of turning the mpi environment into yet another batch system (like we have for slurm, pbs, etc.)
<hkaiser>
that might be the right spot to initialize things, at least look at the MPI environment
<ms[m]>
yes, that might make things easier
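(A hypothetical sketch of the "yet another batch system" idea: each supported environment (slurm, pbs, mpi, ...) implements the same small probing interface, and startup asks each one in turn for the node count and this locality's rank. The interface and names below are invented for illustration and do not match HPX's real batch_environment:)

```cpp
// Sketch of a pluggable batch-environment detector; an mpi_environment
// would fit the same mold, probing the launcher's environment variables
// (and possibly initializing MPI) in valid().
#include <cstddef>
#include <cstdlib>

struct batch_environment_base
{
    virtual ~batch_environment_base() = default;
    virtual bool valid() const = 0;    // are we running under this system?
    virtual std::size_t num_localities() const = 0;
    virtual std::size_t this_locality() const = 0;
};

struct slurm_environment : batch_environment_base
{
    bool valid() const override
    {
        return std::getenv("SLURM_JOB_ID") != nullptr;
    }
    std::size_t num_localities() const override
    {
        char const* n = std::getenv("SLURM_NPROCS");
        return n ? std::atoi(n) : 1;
    }
    std::size_t this_locality() const override
    {
        char const* r = std::getenv("SLURM_PROCID");
        return r ? std::atoi(r) : 0;
    }
};
```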
<ms[m]>
well, for a start we could check if jbjnr__ (you here?) could try out the branch
<hkaiser>
right
<hkaiser>
I'll try to make some time as well
<ms[m]>
we will at least get daint running again now
<hkaiser>
I'm very distracted by real work, currently - sorry
<ms[m]>
I'd just like to avoid having CI down for too long again, like earlier this year
<hkaiser>
btw, what's up with jenkins on rostam?
<ms[m]>
yeah, no problem, I understand
<ms[m]>
well, that mpi init branch fixes it ;)
<hkaiser>
ahh - good - then let's merge it
<hkaiser>
we can always fix things afterwards
<ms[m]>
we didn't have mpi in the slurm environment before so we just always tested the tcp parcelport
<ms[m]>
I think...
<hkaiser>
hmmm, did we? really?
<ms[m]>
actually, we were using mpiexec directly
<ms[m]>
it wasn't automatically detected at least
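(Illustrative only, not HPX's actual selection logic: if no MPI launcher environment is detected and MPI is not explicitly enabled, parcelport selection quietly falls back to TCP, which is how CI could end up exercising only the tcp parcelport without anyone noticing:)

```cpp
// Sketch of the silent-fallback behavior discussed above.
#include <string>

std::string select_parcelport(bool mpi_env_detected, bool mpi_explicitly_enabled)
{
    if (mpi_explicitly_enabled || mpi_env_detected)
        return "mpi";
    return "tcp";    // silent fallback -- exactly what CI ended up testing
}
```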
<hkaiser>
ok
<ms[m]>
in any case, I know it's not optimal right now, but we'll have to try things in the wild
<hkaiser>
yes, merge it!
<ms[m]>
lol
<ms[m]>
don't encourage me too much!
<ms[m]>
thanks
<ms[m]>
while I have you here, did you have anything against #4946 going in as it is now?
<hkaiser>
have not looked, so the answer is: I don't object ;-) It's your code after all, you know best
<ms[m]>
yeah, the caveats are as commented in the PR, and the guy who opened the PR seems ok with it as it is
<ms[m]>
the distributed suspension case is not very important for me, but I don't want to block someone from trying it out as long as they know what they're doing
<hkaiser>
ms[m]: ok
<hkaiser>
ms[m]: there is probably no generic solution for this anyways
diehlpk_work has joined #ste||ar
bita has joined #ste||ar
akheir has joined #ste||ar
weilewei has joined #ste||ar
<weilewei>
hkaiser My CppCon speaker Tsung-Wei Huang, an assistant professor from U of Utah working on CppTaskFlow, might want to chat with you sometime to discuss the distributed model and look for some collaboration. His CppCon link: and