hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
jaafar has quit [Remote host closed the connection]
jaafar has joined #ste||ar
jaafar has quit [Remote host closed the connection]
jaafar has joined #ste||ar
diehlpk has joined #ste||ar
eschnett has joined #ste||ar
quaz0r has quit [Ping timeout: 272 seconds]
diehlpk has quit [Ping timeout: 250 seconds]
hkaiser has quit [Quit: bye]
eschnett has quit [Ping timeout: 255 seconds]
quaz0r has joined #ste||ar
nikunj97 has quit [Ping timeout: 246 seconds]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 246 seconds]
nikunj has joined #ste||ar
<simbergm>
what on earth did hkaiser do? our cuda builds are (almost) fixed now...
<simbergm>
heller: 3662 should be ready to go, no? you're not planning any further changes?
<heller>
simbergm: no, it should be ready
<simbergm>
heller: thanks, let's give it a try, it's had to wait long enough...
<jbjnr_>
hkaiser: you pinged me last night - need anything?
<jbjnr_>
I would like to chat to you at some point too
<hkaiser>
jbjnr_: could you run your benchmarks on top of #3745, please?
<hkaiser>
jbjnr_: sure, any time
<jbjnr_>
will do
<hkaiser>
thanks
<heller>
hkaiser: something seems to be wrong there
<heller>
hkaiser: doesn't build
<hkaiser>
stupid MS compiler again, most probably
<hkaiser>
builds for me
<heller>
no idea
<heller>
looks odd
<hkaiser>
I fixed some compilation errors this morning, however
<heller>
ah, ok
<jbjnr_>
hkaiser: heller Are there known problems with dataflow? Raffaele tells me that the GPU version of cholesky gives the wrong answers sometimes when they use dataflow, but the right answers when they use when_all().then()
<K-ballo>
hkaiser: I suspect it ought to be possible to combine those assert checks with the atomic modifications, otherwise looks good to me
<jbjnr_>
is dataflow known to be racy?
<hkaiser>
K-ballo: I didn't care for atomicity wrt the asserts
<hkaiser>
jbjnr_: not really
<jbjnr_>
well it is now then!
<hkaiser>
nothing would work if dataflow had a race
<jbjnr_>
no. I'm wrong. Raffaele just told me that they have a problem with when_all as well. So ignore what I just wrote.
<hkaiser>
it has no synchronization anyways
<jbjnr_>
it just appears more frequently with one than the other it seems. Sorry. This news was hot off the press ...
<heller>
jbjnr_: we fixed a problem with that for 1.2.1 - maybe he is running an old version?
<jbjnr_>
It was master from a few weeks back I think, but I will check
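For reference, a minimal sketch of the two composition styles being compared (this is not Raffaele's Cholesky code; compute() is a hypothetical stand-in for one tile task, and exact header paths may differ between HPX versions):

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/async.hpp>
    #include <hpx/include/lcos.hpp>
    #include <hpx/include/util.hpp>

    #include <iostream>
    #include <utility>
    #include <vector>

    int compute(int a, int b) { return a + b; }  // stand-in for one tile task

    int main()
    {
        // Variant 1: dataflow waits for its future arguments and passes the
        // unwrapped values to the callable.
        hpx::future<int> r1 = hpx::dataflow(hpx::util::unwrapping(&compute),
            hpx::async([] { return 1; }), hpx::async([] { return 2; }));

        // Variant 2: when_all(...).then(...) receives the ready futures and
        // unpacks them by hand.
        std::vector<hpx::future<int>> fs;
        fs.push_back(hpx::async([] { return 3; }));
        fs.push_back(hpx::async([] { return 4; }));

        hpx::future<int> r2 = hpx::when_all(std::move(fs))
            .then([](hpx::future<std::vector<hpx::future<int>>> f) {
                std::vector<hpx::future<int>> ready = f.get();
                return compute(ready[0].get(), ready[1].get());
            });

        std::cout << r1.get() << " " << r2.get() << "\n";  // prints: 3 7
        return 0;
    }

Both variants express the same dependency; neither adds synchronization beyond waiting for the input futures.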
aserio has joined #ste||ar
hkaiser has quit [Quit: bye]
<mdiers_>
hkaiser: yt? (for hpx in combination with slurm)
<jbjnr_>
mdiers_: can I help?
<mdiers_>
jbjnr_: perhaps, i have now been able to eliminate my performance problem under slurm by configuring slurm with pmix.
<jbjnr_>
what is the nature of your problem?
<mdiers_>
for pmix an adjustment to the MPI detection is necessary: hpx.parcel.mpi.env=MV2_COMM_WORLD_RANK,PMI_RANK,OMPI_COMM_WORLD_SIZE,ALPS_APP_PE,PMIX_RANK
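One way to pass such a setting (a sketch; the application name is a placeholder, and the same entry can also be placed in an ini file) is via --hpx:ini on the command line:

    ./my_hpx_app --hpx:ini=hpx.parcel.mpi.env=MV2_COMM_WORLD_RANK,PMI_RANK,OMPI_COMM_WORLD_SIZE,ALPS_APP_PE,PMIX_RANK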
<mdiers_>
now i'm stuck with a problem with hyperthreading
hkaiser has joined #ste||ar
<hkaiser>
K-ballo: better now?
<K-ballo>
sure
<mdiers_>
normally the number of physical cores is detected, but under slurm it uses the number of logical cores
<jbjnr_>
nothing changes under slurm unless you are using some kind of numactl instructions in your srun command. make sure you run one process per node. Is the machine you are running on unusual in any respect?
<mdiers_>
jbjnr_: and the difference comes from handle_num_threads() in command_line_handling line 328
<mdiers_>
jbjnr_: have the same behavior on different nodes
<jbjnr_>
does it ignore your --hpx:threads=xxx from the command line?
<mdiers_>
the problem is that the slurm environment provides the logical cores (variable batch_threads in command_line_handling:319), and this resets the default_threads in command_line_handling:339
<mdiers_>
jbjnr_: --hpx:threads=xxx works if the configuration of slurm is identical to the threads::topology
<jbjnr_>
(don't use slurm --threads-per-task and that kind of thing)
<mdiers_>
jbjnr_: but we want to use a heterogeneous cluster, which makes it difficult with --hpx:threads=.
<jbjnr_>
--hpx:threads=cores, each node will choose the amount for that node. Is slurm breaking that?
<mdiers_>
when launching directly with mpirun this behavior does not occur
<jbjnr_>
isn't there an ignore-batch environment flag?
<hkaiser>
mdiers_: all depends on the environment
<hkaiser>
slurm sets some env variables we try to interpret
<jbjnr_>
mdiers_: --hpx:ignore-batch-env
<mdiers_>
hkaiser: Yes, I've noticed that too.
<mdiers_>
jbjnr_: sounds good, and it works so far. now we have a special case where we have to start one process per socket, but unfortunately that doesn't work
<mdiers_>
i am just a bit confused by the different behavior of hpx under mpirun and slurm: in both cases the number of logical cores is passed via the environment variables, yet hpx uses the physical cores under mpirun and the logical cores under slurm.
<hkaiser>
mdiers_: this could be a bug
<jbjnr_>
if you use "--hpx:threads=cores --hpx:ignore-batch-env" then on every node it will launch one thread per core. If it isn't doing that, then please explain better what it is doing, because I don't quite follow what you mean by "logical cores under slurm"
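A sketch of the launch being suggested here (node count and application name are placeholders): one process per node, with HPX sizing its thread pool from the hardware instead of the batch environment:

    srun -N 4 --ntasks-per-node=1 ./my_hpx_app --hpx:threads=cores --hpx:ignore-batch-env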
<hkaiser>
mdiers_: also, you can specify a node-specific command line option even in batch mode, if that helps
<hkaiser>
--hpx:N:option will be applied to node 'N' only
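For example (a sketch based on the description above; the locality numbers, thread counts, and application name are placeholders), a heterogeneous job could give each node its own thread count:

    ./my_hpx_app --hpx:0:threads=36 --hpx:1:threads=68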
<mdiers_>
jbjnr_: yes "--hpx:threads=cores --hpx:ignore-batch-env" works for homogeneous nodes, but unfortunately for our heterogeneous clusters it's a bit expensive
<mdiers_>
hkaiser: yes, I already tried that and it works too, but unfortunately it is also somewhat expensive in our case
<mdiers_>
Actually, I'm only slightly confused by that difference:
<mdiers_>
jbjnr_: yes, that works, but is unfortunately somewhat difficult to parameterize in heterogeneous clusters
<jbjnr_>
hkaiser: if I have an action that returns a future and I attach a continuation to it, but I don't need a future from the continuation - is there a .then version of apply?
<jbjnr_>
or must I return a future and discard it?
<mdiers_>
have to break off now, I'll get back to you tomorrow. thanks a lot
<jbjnr_>
mdiers_: it's not hard at all. It's the same for every node!
akheir has quit [Quit: Konversation terminated!]
akheir has joined #ste||ar
bibek has joined #ste||ar
<hkaiser>
jbjnr_: no, we don't have a fire&forget continuation
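So the usual workaround is to attach the continuation with .then() and simply drop the future it returns; a minimal sketch (the values and the lambda are placeholders):

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/async.hpp>

    #include <iostream>
    #include <utility>

    int main()
    {
        hpx::future<int> f = hpx::async([] { return 42; });

        // No fire-and-forget .then(): the continuation always returns a future.
        // Attaching it and letting that future go out of scope is fine; the
        // continuation still runs and an HPX future's destructor does not block.
        auto ignored = std::move(f).then(
            [](hpx::future<int> r) { std::cout << r.get() << "\n"; });
        (void) ignored;

        return 0;
    }

Keeping the returned future instead would allow errors thrown by the continuation to be observed.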
<diehlpk_work>
We would need to start with our application next month
<simbergm>
diehlpk_work: yeah, thanks for reminding me!
aserio has quit [Ping timeout: 252 seconds]
<Yorlik>
Newbie question: Could I do Line 36 (depending on line 17) in a simpler way than just using a templated initializer function? https://wandbox.org/permlink/fhPgrEss0iZNxyA5
david_pfander has quit [Ping timeout: 250 seconds]