aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
mcopik has quit [Ping timeout: 240 seconds]
Matombo has quit [Remote host closed the connection]
StefanLSU has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
parsa has joined #ste||ar
hkaiser has quit [Quit: bye]
Aalice has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
parsa has quit [Quit: Zzzzzzzzzzzz]
david_pfander has joined #ste||ar
david_pfander has quit [Ping timeout: 255 seconds]
patg has quit [Quit: This computer has gone to sleep]
Aalice has quit [Quit: Leaving.]
jaafar has joined #ste||ar
<github> [hpx] sithhell pushed 1 new commit to throttle_cores: https://git.io/v5PDQ
<github> hpx/throttle_cores de6c7d7 Thomas Heller: Making inspect happy...
jaafar has quit [Ping timeout: 246 seconds]
bikineev has quit [Remote host closed the connection]
david_pfander has joined #ste||ar
david_pfander has quit [Client Quit]
Matombo has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
<github> [hpx] sithhell pushed 1 new commit to master: https://git.io/v5PF4
<github> hpx/master 7bc54fe Thomas Heller: Fixing floating point comparisons...
mcopik has joined #ste||ar
david_pfander has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/v5PbD
<github> hpx/gh-pages f6730ca StellarBot: Updating docs
<jbjnr> heller: yt?
<heller> jbjnr: hey
<jbjnr> I need to add a new binding mode for balanced-numa
<heller> ok
<heller> sounds like a good idea indeed
<jbjnr> what goes on in this code? Are you familiar with it?
<heller> yeah
<heller> one sec
<heller> jbjnr: you just need to add a "| partlit("balanced-numa") >> qi::attr(balanced_numa)"
<heller> at the end
<jbjnr> yes. I did that, but then I need to actually add the code to make it happen, I'm not sure where to put it
<heller> add the new type here
<jbjnr> I mean - do the allocation of pus according to the balanced-numa mode
<jbjnr> yup. added it there too
<jbjnr> aha!
<jbjnr> that's the place. Thanks
<heller> yw
<jbjnr> had a good chat with a bloke at ETHZ this morning about implementing his collective library in hpx
<jbjnr> I will start on it, once I fix the matrix nonsense.
<jbjnr> new HPX guy starting tomorrow at CSCS too.
bikineev has joined #ste||ar
<heller> jbjnr: exciting!
<github> [hpx] sithhell pushed 1 new commit to master: https://git.io/v5Pjc
<github> hpx/master ab51af0 Thomas Heller: Fixing problems in MPI parcelport....
<heller> what's left for the matrix nonsense?
<heller> I am making buildbot greener again ;)
<heller> MBGA!
<heller> I want to have our buildbot starting with an A...
<jbjnr> the matrix code still runs shit slow compared to parsec.. The strange distribution of task times is causing slowdown and we do not know why
<heller> ok, the same old problem then
<heller> did you look at the hardware counters regarding cache misses etc.?
<heller> you should really do that ... since we have no other clue
<jbjnr> still not yet. Hackathon + other issues got in the way. vtune not working either
<jbjnr> it's my task for this week
<jbjnr> but I want to fix thread binding first
<heller> sure
<heller> jbjnr: what does your code look like?
Matombo has quit [Remote host closed the connection]
Matombo has joined #ste||ar
<jbjnr> I added 2 lines - balanced_numa type, added a string to the parse description (same two links you showed above), but it never gets to the actual decode distribution where the work is done cos the qi stuff throws that the parameter is bad. Seems like it does not use the "| partlit("balanced-numa") >> qi::attr(balanced_numa)" part
<jbjnr> but I'm not familiar with boost spirit so not sure how to debug inside it
bikineev has quit [Ping timeout: 248 seconds]
<jbjnr> discovery
<jbjnr> "balanced-numa" doesn't work because it is similar to "balanced" and the parser is choking
<jbjnr> if I use "numabalance" it works
<heller> I was about to say...
<jbjnr> bloody spirit. shite!
<heller> it's not spirit
<heller> if you put balanced-numa at the top, it should work
<jbjnr> what's to blame then?
<jbjnr> ok
<heller> it's the partlit parser, written by hartmut :P
<jbjnr> pfff
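The failure discussed above can be reproduced without Spirit. The following is a minimal sketch, assuming that the partial-literal parser accepts any non-empty prefix of its literal and that alternatives are tried in order, with no retry once one of them has matched. `partial_literal` and `parse_mode` are hypothetical names used for illustration, not HPX functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a partial-literal parser: counts how many leading
// characters of `input` agree with `literal`. Zero means "no match".
std::size_t partial_literal(std::string const& literal, std::string const& input)
{
    std::size_t n = 0;
    while (n < literal.size() && n < input.size() && literal[n] == input[n])
        ++n;
    return n;
}

// Ordered choice over the alternatives: the first alternative that matches at
// least one character wins. If it leaves input unconsumed, the whole parse
// fails (empty string); later alternatives are not tried.
std::string parse_mode(std::vector<std::string> const& alternatives,
    std::string const& input)
{
    for (std::string const& alt : alternatives)
    {
        std::size_t const consumed = partial_literal(alt, input);
        if (consumed == 0)
            continue;    // this alternative did not match, try the next one

        // Committed to `alt`: succeed only if the whole input was consumed.
        return consumed == input.size() ? alt : std::string();
    }
    return std::string();
}
```

Under these assumptions, with the alternatives {"balanced", "balanced-numa"} the input "balanced-numa" is rejected, because "balanced" matches first and leaves "-numa" unconsumed. A literal that shares no prefix with the earlier alternatives, such as "numabalance", is accepted.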
Matombo has quit [Remote host closed the connection]
Matombo has joined #ste||ar
Matombo has quit [Remote host closed the connection]
<github> [hpx] mcopik opened pull request #2894: Fix incorrect handling of compile definition with value 0 (master...cmake_fix) https://git.io/v5XfZ
bikineev has joined #ste||ar
<heller> jbjnr: what(): partitioner::add_resource: Creation of 5 threads requested by the resource partitioner, but only 4 provided on the command-line.
<heller> jbjnr: ever seen this error?
<jbjnr> if you say --hpx:threads=N but then in int main say add_pu 5 times, or add core etc
<jbjnr> N=4
<jbjnr> if you want to bind more threads than there are available you have to --allow-oversubscription
<jbjnr> (might not have implemented that yet)
<heller> ok
<heller> it seems to only occur on a specific system
<heller> I don't have more details there yet ...
<heller> ahh, missing HPX_WITH_MAX_CPU_COUNT...
bikineev has quit [Ping timeout: 240 seconds]
bikineev has joined #ste||ar
hkaiser has joined #ste||ar
K-ballo has joined #ste||ar
denis_blank has joined #ste||ar
pree has joined #ste||ar
<heller> {what}: assertion 'pp ? pp->here() == pp->agas_locality(cfg) : true' failed: HPX(assertion_failure)
<heller> this one is really annoying :/
<heller> doesn't seem to have any effect for release builds
<hkaiser> heller: looks like a problem in the command line handling code
<heller> yes
<heller> it's been there for a while now
<hkaiser> was not aware of that :/
<heller> I only ever saw it on buildbot
<heller> looks like some sort of strange race condition, since not all test runs are affected
<heller> apart from some spuriously failing tests and this assert, it doesn't look too bad at the moment
<hkaiser> heller: just don't remove the assert, pls
<heller> that was not my plan, just said it is annoying ;)
<diehlpk_work> mcopik, Most people who were interested in HPXCL asked for CUDA
<diehlpk_work> mcopik, If you think we still need OpenCL, I can keep it and disable it for circle-ci
<heller> hkaiser: ha! figured it out :D
<hkaiser> heller: what is it?
<heller> hkaiser: we look for SLURM_NODELIST, which includes the complete nodelist we asked for during salloc, however, when we do an srun -n1 -N1, we might get a different nodelist, which is in the environment variable SLURM_STEP_NODELIST
<hkaiser> stupid slurm
<heller> I'd commit it to master directly ...
<heller> any objections?
<hkaiser> depends on how big the change is
<hkaiser> if it's one line, sure
<heller> see the gist
<heller> essentially just a line, yes
pree has quit [Ping timeout: 252 seconds]
<hkaiser> pls go ahead, yes
<hkaiser> so those env settings are mutually exclusive?
<hkaiser> heller: ^^ ?
<heller> no
<heller> both variables seem to be always set
<hkaiser> but if the step_nodelist is given it takes precedence?
<heller> yes
<hkaiser> well, then your 'fix' is not correct
<heller> right
<heller> we should only ask for the _STEP_ variables
<hkaiser> are there other places this has to be changed?
aserio has joined #ste||ar
<heller> good point
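The conclusion reached above (ask only for the step variable) could look roughly like this. It is a minimal sketch: `step_nodelist` is a hypothetical helper and not the code from the actual fix; only the environment variable name SLURM_STEP_NODELIST comes from the discussion.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the node list of the current job step as
// reported by SLURM_STEP_NODELIST, or an empty string if it is not set.
// SLURM_NODELIST (the full allocation) is deliberately not consulted.
std::string step_nodelist()
{
    char const* value = std::getenv("SLURM_STEP_NODELIST");
    return value ? std::string(value) : std::string();
}
```

The caller has to decide what an empty result means, for example that the process was not started through a SLURM job step at all.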
Matombo has joined #ste||ar
hkaiser has quit [Quit: bye]
Matombo has quit [Ping timeout: 260 seconds]
<github> [hpx] sithhell created fix_slurm (+1 new commit): https://git.io/v5X8G
<github> hpx/fix_slurm 1981b99 Thomas Heller: Fixing SLURM environment parsing...
pree has joined #ste||ar
<github> [hpx] sithhell force-pushed fix_slurm from 1981b99 to 880aa6a: https://git.io/v5X8R
<github> hpx/fix_slurm 880aa6a Thomas Heller: Fixing SLURM environment parsing...
<github> [hpx] sithhell opened pull request #2895: Fixing SLURM environment parsing (master...fix_slurm) https://git.io/v5X8u
pree has quit [Ping timeout: 255 seconds]
Matombo has joined #ste||ar
pree has joined #ste||ar
<diehlpk_work> Should we put this paper into the OpenSuCo paper too? Frank Löffler, Zhoujian Cao, Steven R. Brandt, Zhihui Du. “A new parallelization scheme for adaptive mesh refinement.” Journal of Computational Science, 16 (2016) 79–88.
bikineev has quit [Ping timeout: 246 seconds]
hkaiser has joined #ste||ar
<diehlpk_work> hkaiser, Frank Löffler, Zhoujian Cao, Steven R. Brandt, Zhihui Du. “A new parallelization scheme for adaptive mesh refinement.” Journal of Computational Science, 16 (2016) 79–88. Should we mention this paper here too?
<hkaiser> diehlpk_work: that has no relation to hpx
<diehlpk_work> Ok
pree has quit [Ping timeout: 260 seconds]
pree has joined #ste||ar
<diehlpk_work> hkaiser, Can you read the introduction of the paper?
<diehlpk_work> I shortened heller's introduction from the thesis
david_pfander has quit [Ping timeout: 248 seconds]
<hkaiser> diehlpk_work: will do
<diehlpk_work> Ok, I will finish the conclusion soon.
pree has quit [Ping timeout: 260 seconds]
pree has joined #ste||ar
pree has quit [Ping timeout: 246 seconds]
bikineev has joined #ste||ar
rod_t has joined #ste||ar
pree has joined #ste||ar
rod_t has quit [Client Quit]
rod_t has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
<diehlpk_work> hkaiser, Can we use this one here as a reference for HPX.Compute? Copik, M., and Kaiser, H. Using SYCL as an implementation framework for HPX.Compute. In Proceedings of the 5th International Workshop on OpenCL (New York, NY, USA, 2017), IWOCL 2017, ACM, pp. 30:1–30:7.
<hkaiser> diehlpk_work: absolutely
<hkaiser> I missed that one, thanks
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
jaafar has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
<heller> hkaiser: only checking for the STEP variables is the right thing according to the docs
<diehlpk_work> hkaiser, How should we cite HPXCL? Chapter of my thesis or just link to github repo?
<heller> There might be situations where you only have one job step (for example when running srun on its own, that is without salloc or sbatch beforehand)
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
jaafar has quit [Ping timeout: 252 seconds]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rod_t has joined #ste||ar
pree has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
pree has quit [Read error: Connection reset by peer]
rod_t has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
jkleinh has joined #ste||ar
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
jbjnr has quit [Read error: Connection reset by peer]
jaafar has joined #ste||ar
jbjnr has joined #ste||ar
rod_t has joined #ste||ar
pree has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
aserio has joined #ste||ar
Matombo has quit [Ping timeout: 252 seconds]
bikineev has quit [Ping timeout: 240 seconds]
Matombo has joined #ste||ar
StefanLSU has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
bikineev has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
StefanLSU has joined #ste||ar
StefanLSU has quit [Quit: StefanLSU]
jbjnr_ has joined #ste||ar
aserio has joined #ste||ar
jbjnr has quit [Ping timeout: 255 seconds]
jbjnr_ is now known as jbjnr
jfbastien has quit [Ping timeout: 255 seconds]
Matombo has quit [Read error: Connection reset by peer]
bikineev has quit [Remote host closed the connection]
aserio1 has joined #ste||ar
bikineev has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
aserio1 is now known as aserio
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
bikineev has joined #ste||ar
<github> [hpx] aserio created new_people from master (+0 new commits): https://git.io/v51fZ
jkleinh has quit [Quit: Page closed]
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
hkaiser has quit [Quit: bye]
aserio has quit [Ping timeout: 246 seconds]
jkleinh has joined #ste||ar
aserio has joined #ste||ar
pree has quit [Quit: AaBbCc]
bikineev has quit [Remote host closed the connection]
aserio1 has joined #ste||ar
bikineev has joined #ste||ar
aserio has quit [*.net *.split]
zbyerly has quit [*.net *.split]
ABresting has quit [*.net *.split]
aserio1 has quit [Ping timeout: 264 seconds]
denis_blank has quit [Ping timeout: 240 seconds]
denis_blank has joined #ste||ar
zbyerly has joined #ste||ar
jaafar has joined #ste||ar
aserio has joined #ste||ar
aserio has quit [Client Quit]
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
wash is now known as washcuda
bibek_desktop has quit [Quit: Leaving]
jaafar has quit [Ping timeout: 260 seconds]
hkaiser has joined #ste||ar
Matombo has quit [Quit: Leaving]
ABresting has joined #ste||ar
mcopik has quit [Ping timeout: 248 seconds]
jkleinh has quit [Quit: Page closed]
<github> [hpx] hkaiser created fixing_2896 (+1 new commit): https://git.io/v51VI
<github> hpx/fixing_2896 26be28a Hartmut Kaiser: Removing dependency on Boost.ICL
<hkaiser> zao: this should take care of the odd gcc issue ^^
<zao> Yay!
<github> [hpx] hkaiser opened pull request #2897: Removing dependency on Boost.ICL (master...fixing_2896) https://git.io/v51Vm
bikineev has quit [Remote host closed the connection]
rod_t has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 240 seconds]
<github> [hpx] hkaiser pushed 1 new commit to partitioned_vector: https://git.io/v51Vy
<github> hpx/partitioned_vector b411685 Hartmut Kaiser: Addressing review comment
<github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/v51Vj
<github> hpx/master 8e9926a Hartmut Kaiser: Update natvis files
<github> [hpx] hkaiser created fixing_2893 (+1 new commit): https://git.io/v51w8
<github> hpx/fixing_2893 1ddd79d Hartmut Kaiser: Adding missing #include and missing guard for optional code section...
<github> [hpx] hkaiser opened pull request #2898: Adding missing #include and missing guard for optional code section (master...fixing_2893) https://git.io/v51wz
<github> [hpx] K-ballo force-pushed format from 47db254 to 691115c: https://git.io/v5zUg
<github> hpx/format def228f Agustin K-ballo Berge: Wrap boost::format uses in traditional (variadic) function call syntax
<github> hpx/format 691115c Agustin K-ballo Berge: Add inspect check for unguarded boost::format usage
EverYoung has quit [Ping timeout: 264 seconds]