hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
hkaiser has quit [Ping timeout: 260 seconds]
bita has joined #ste||ar
bita has quit [Quit: Leaving]
hkaiser has joined #ste||ar
<hkaiser> Yorlik: depends on how many iterations you have, perhaps parallel::for_loop?
<Yorlik> Yes - that was the only additional idea I had
<Yorlik> I will need this and I also need to generalize this to 2d and 3d matrices
<Yorlik> It is used in an iterative form of fractal noise
<Yorlik> We will use this for all sorts of generative algorithms
<Yorlik> From heightfields for landscape to feature distribution
<hkaiser> ok
<hkaiser> how large are those arrays?
<Yorlik> So - efficient calculation of this noise is important - but we can precalculate a lot which mitigates the pressure
<hkaiser> is constexpr an option?
<Yorlik> Absolutely
<Yorlik> Actually a lot
<hkaiser> well, there you go
<hkaiser> let the compiler do the work
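A minimal sketch of the constexpr idea above, assuming C++17; `value()` is a placeholder for whatever per-element computation the real noise code would do and is not taken from the discussion:

```cpp
#include <array>
#include <cstddef>
#include <iostream>

// Placeholder per-element computation; must be constexpr-evaluable.
constexpr double value(std::size_t i)
{
    return static_cast<double>((i * i) % 256) / 255.0;
}

// Build a lookup table entirely at compile time.
template <std::size_t N>
constexpr std::array<double, N> make_table()
{
    std::array<double, N> table{};
    for (std::size_t i = 0; i < N; ++i)
        table[i] = value(i);
    return table;
}

constexpr auto table = make_table<256>();    // evaluated by the compiler

int main()
{
    std::cout << table[42] << '\n';
}
```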
<Yorlik> I want to use fixed, repeatable random number generators to have deterministic results on server and client
<Yorlik> OFC we could simply pregenerate and transmit
<Yorlik> But I like the idea of a potentially unlimited world, where we only transmit random seeds
<Yorlik> So client and server generate deterministically
<Yorlik> And same for noise
<Yorlik> Be it landscape features, distribution of plants and animals, and other things that make up the world
<Yorlik> Coordinates would generate local seeds.
<hkaiser> nod
<Yorlik> We would only send manual edits of the pregenerated world
<Yorlik> But sending a "Put a mountain with these parameters here at this location" is much faster than sending a full mesh
<Yorlik> A little bit like No Man's Sky and other games using generative techniques.
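A sketch of the "coordinates generate local seeds" idea: client and server derive the same seed from a shared world seed plus tile coordinates, so the generated content matches without being transmitted. The mixing function and names below are illustrative, not from the log:

```cpp
#include <cstdint>
#include <iostream>
#include <random>

// Derive a deterministic per-tile seed from a world seed and coordinates
// (splitmix64-style bit mixing; any good integer hash would do).
std::uint64_t local_seed(std::uint64_t world_seed, std::int64_t x, std::int64_t y)
{
    std::uint64_t h = world_seed;
    h ^= static_cast<std::uint64_t>(x) * 0x9E3779B97F4A7C15ull;
    h ^= static_cast<std::uint64_t>(y) * 0xC2B2AE3D27D4EB4Full;
    h ^= h >> 30; h *= 0xBF58476D1CE4E5B9ull;
    h ^= h >> 27; h *= 0x94D049BB133111EBull;
    h ^= h >> 31;
    return h;
}

int main()
{
    // Same world seed + same coordinates -> same engine state on both sides.
    std::mt19937_64 rng(local_seed(/*world_seed=*/42, /*x=*/128, /*y=*/-64));
    std::cout << rng() << '\n';
}
```

Note that `std::mt19937_64` output is fully specified by the standard, but the standard distributions (`std::uniform_real_distribution` and friends) are not, so for bit-identical results on client and server the raw engine output would need to be mapped to values manually.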
<Yorlik> hkaiser: What do you think would be a good time to chat in voice?
<Yorlik> Just roughly, so I can adapt to your schedule.
<hkaiser> Yorlik: I still have that proposal in the works, Sunday, perhaps?
<Yorlik> Sunday sounds good - I'll be there
wate123_Jun has quit [Remote host closed the connection]
wate123_Jun has joined #ste||ar
wate123_Jun has quit [Ping timeout: 265 seconds]
diehlpk_work has quit [Remote host closed the connection]
wate123_Jun has joined #ste||ar
hkaiser has quit [Quit: bye]
wate123_Jun has quit [Remote host closed the connection]
wate123_Jun has joined #ste||ar
wate123_Jun has quit [Ping timeout: 260 seconds]
weilewei has quit [Ping timeout: 240 seconds]
nikunj97 has joined #ste||ar
<simbergm> nikunj97: really do give `--hpx:queuing=shared-priority` a try, no recompilation needed
<simbergm> it's meant to handle NUMA better than the default scheduler
<simbergm> not sure you'll get better results, but they'll most likely at least be different ;)
<nikunj97> simbergm, yes, I was trying with shared-priority
<simbergm> did it make any difference?
<nikunj97> not really
<simbergm> :/
<simbergm> all right
<simbergm> thanks for trying in any case!
<nikunj97> yes!
<nikunj97> is there any other hpx related functionality that can boost my performance?
<simbergm> rewrite the schedulers and executors? 🙊
<simbergm> more seriously, make sure your tasks are at least 100 µs each, if not 1 ms, when running on machines with that many cores
<simbergm> we have quite high overheads on high core count machines
<nikunj97> how many tasks would I need to completely fill all cores up with work?
* zao eyes 272 core KNLs
<simbergm> well, at least as many as there are cores
<simbergm> but ideally at least 2-4 times the number of cores
<nikunj97> in that case I have way more
<simbergm> much higher multiples than that just add overheads for little gain
<nikunj97> about 2-3k tasks per core
<simbergm> you don't necessarily need that much parallelism
<simbergm> you're using the static chunker now?
<nikunj97> I tried it and was getting performance losses
<nikunj97> so I got rid of it
<simbergm> you used the default values for it?
<nikunj97> yes
<simbergm> and now you're using what?
<nikunj97> maybe static chunk size of 256 will do better
<nikunj97> for my 64 core processor
<nikunj97> my testing is on basic futures where I control the grain-size
<simbergm> remember chunk size of 256 does not mean 256 tasks
<nikunj97> ohh so that's what I'm getting wrong
<nikunj97> how is chunk size related to tasks then?
<simbergm> it's num_items / chunk_size tasks (sorry if this was clear to you, just don't want you to get that wrong)
<simbergm> oh...
<simbergm> chunk size is the number of items processed by one task
<nikunj97> number of iterations you mean?
<simbergm> I don't know how much work 256 items is, but if it's just one for loop it's most likely not enough work
<simbergm> yeah, iterations
<simbergm> or items, whatever the parallel for loop is over
<nikunj97> I have about 131072 iterations in total
<nikunj97> for 64 cores, I should have chunk size of 1024 then
<simbergm> so the default static chunk size would be roughly that divided by (4 * num_cores)
<nikunj97> or 512
<nikunj97> 512 it is then
<simbergm> yep
<simbergm> again, not sure if that's enough work for a task because it depends on what those 512 iterations do
<simbergm> but you might want to do a plot over different chunk sizes
<nikunj97> it's a basic Jacobi operation on a stencil
<nikunj97> and it's not dynamic either
<simbergm> starting with one task per core up to maybe 100 tasks per core (even that might be overkill)
<nikunj97> alright. let me play around a bit
<simbergm> with one task per core you don't get benefits of work stealing for work imbalance, and with too many tasks you have too high overheads
<simbergm> somewhere in between is a sweet spot
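A minimal sketch of the chunk-size tuning being discussed, using the HPX 1.4-era spellings (`hpx::parallel::for_loop`, `hpx::parallel::execution::static_chunk_size`; newer releases also expose these as `hpx::for_loop` and `hpx::execution::static_chunk_size`). With 131072 iterations, a chunk size of 512 yields 131072 / 512 = 256 tasks, i.e. 4 per core on 64 cores; the loop body is a placeholder, not the actual Jacobi kernel:

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/include/parallel_for_loop.hpp>

#include <cstddef>
#include <vector>

int main()
{
    std::size_t const n = 131072;
    std::vector<double> in(n, 1.0), out(n, 0.0);

    // chunk size 512 -> n / 512 = 256 tasks, roughly 4 tasks per core on 64 cores
    hpx::parallel::execution::static_chunk_size cs(512);

    hpx::parallel::for_loop(hpx::parallel::execution::par.with(cs),
        std::size_t(1), n - 1,
        [&](std::size_t i) {
            // placeholder stencil update over the interior points
            out[i] = 0.5 * (in[i - 1] + in[i + 1]);
        });

    return 0;
}
```

Sweeping the chunk size from `n / num_cores` down to `n / (100 * num_cores)` and plotting the runtime, as suggested above, is the easiest way to find the sweet spot.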
<simbergm> another thing you can do is use apex+otf2+vampir to get some task plots
<simbergm> you'll easily be able to see work imbalance between cores, and see how big your tasks are
<simbergm> that'll give you much more information than trying different input parameters
<nikunj97> I don't have apex on my cluster
<nikunj97> I'll have to build it
<simbergm> get it ;)
<simbergm> if you're building hpx yourself it's just a cmake option
<simbergm> it's integrated into hpx
<simbergm> or hpx clones it for you
<nikunj97> aah, that's even better
<simbergm> you also need otf2
<nikunj97> -DHPX_WITH_APEX=ON?
<simbergm> for the task plots
<simbergm> yep
<nikunj97> I don't have otf2 either
<nikunj97> I do have vampir though. it's a common profiler
<simbergm> `-DAPEX_WITH_OTF2=ON` as well
<simbergm> and probably `-DOTF2_ROOT=...`
<simbergm> otf2 is good to have around
<simbergm> it's easy enough to compile yourself
<simbergm> just make sure you get the latest version, I think we had some problems with earlier versions at some point
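Putting the options mentioned above together, the configure step would look roughly like this (simbergm notes below that he does not remember the exact options by heart, so treat this as a sketch; the paths are placeholders):

```
cmake <hpx-source-dir> \
    -DHPX_WITH_APEX=ON \
    -DAPEX_WITH_OTF2=ON \
    -DOTF2_ROOT=/path/to/otf2
```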
<nikunj97> alright, will do
<nikunj97> is there any documentation/presentation I can look into to get used to using it?
<simbergm> (I never remember the exact options by heart, so let me know if those don't work)
<simbergm> I always cheat by looking here: https://github.com/STEllAR-GROUP/tutorials/tree/master/hlrs2019
<simbergm> you might want to skim through the slides
<simbergm> to get apex to produce otf2 files you have to set some environment variables, and they're in one of those sessions (3 or 4 I think)
<simbergm> apex also has some documentation on that, but it's not always so clear which options you need
<nikunj97> thanks for the links!
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
gonidelis has joined #ste||ar
<gonidelis> I have encountered this error while trying to run some examples from my build directory like `./bin/fibonacci` https://github.com/STEllAR-GROUP/hpx/issues/2357. When I try to run the example directly from my /usr/local/bin directory though everything works fine.
<gonidelis> 1. How and when did these executables get placed under my `/usr/local` dir?
<gonidelis> 2. Any idea how could the problem be fixed?
nikunj97 has quit [Ping timeout: 260 seconds]
hkaiser has joined #ste||ar
K-ballo has quit [Ping timeout: 256 seconds]
hkaiser has quit [Ping timeout: 260 seconds]
K-ballo has joined #ste||ar
hkaiser has joined #ste||ar
gonidelis36 has joined #ste||ar
gonidelis36 is now known as gonidelis_
gonidelis has quit [Ping timeout: 240 seconds]
gonidelis_ has quit [Remote host closed the connection]
gonidelis has joined #ste||ar
wate123_Jun has joined #ste||ar
<heller1> 1. After you ran `make install`
<heller1> 2. Read the ticket, I think the problem was resolved there
<gonidelis> Thank you Mr. heller.
<gonidelis> Do you ste||ar people prefer to use certain `using namespace` directives at the start of your HPX code, or do you rather use the namespace resolution operator `::` every time? I think the first one might lead to conflicts between namespaces, while the second one makes the code less readable...
<zao> `using namespace` tends to be a smell and is a great way to get things you don't want into your lookup.
<simbergm> gonidelis: people have different preferences, and we're not very consistent in our tests and examples, but I like to use local using statements if something is going to be used many times
<simbergm> it's rarely necessary to do a top-level using
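A tiny illustration of the local-using preference simbergm describes: the using-declaration lives inside the function that needs it instead of at file scope, so nothing leaks into the global lookup. Names here are generic, not from HPX:

```cpp
#include <algorithm>
#include <vector>

void sort_both(std::vector<int>& a, std::vector<int>& b)
{
    using std::sort;    // visible only inside this function
    sort(a.begin(), a.end());
    sort(b.begin(), b.end());
}
```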
hkaiser_ has joined #ste||ar
<simbergm> hkaiser: you should also relax on the weekends... or perhaps reviewing hpx prs is relaxing to you ;)
<hkaiser_> heller1, K-ballo, nikunj: would you mind responding tp the #pragma once VOTE request?
<simbergm> thank you in any case for looking through them!
<hkaiser_> simbergm: :D
<hkaiser_> need to catch up on things
<simbergm> I really appreciate it
hkaiser has quit [Ping timeout: 260 seconds]
<gonidelis> In order to make a fresh new installation of hpx from source, should I first clean `/usr/local` of anything related to hpx?
<gonidelis> Aren't these files going to be overwritten after compiling the whole library again?
hkaiser_ has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser> gonidelis: files that are part of the old and new installation will be overwritten, however files that are old will not be removed
<hkaiser> that can turn out to be a problem, especially for old binaries
<gonidelis> yeah I get it. Files that are just part of the old installation won't interfere with my new compiles though, will they?
<zao> gonidelis: It's often a good idea to not install things to system locations, instead opting to install somewhere specific with -DCMAKE_INSTALL_PREFIX so you have control over the artifacts and can wipe them or have several installed sets at the same time.
<zao> It has happened in the past that installed things in system locations have affected builds or test runs by being accidentally pulled in.
<gonidelis> Should that be like an `/opt` location or do you people just prefer some home folder?
<gonidelis> zao wow... that seems bad. Maybe reinstalling Ubuntu is an option after all ;q
<zao> It's quite convenient to install somewhere you don't need superuser rights to touch.
<zao> gonidelis: If you're unsure of what might be installed, you could make an installation somewhere private and see what an installation deploys.
<zao> You've got four kinds of artifacts pretty much. Headers, libraries, binaries, and docs/metadata.
<gonidelis> Ok I get it thus far. So while I reinstall HPX, where do you think cmake and boost should be placed? (`apt` unfortunately provides obsolete versions that do not work with HPX)
<zao> I tend to have a "stellar" tree somewhere where I keep sources, builds, and installs.
<zao> I tend to go for separate directories for each dependency install there.
<gonidelis> So you might just end up with many different versions of cmake, one for every time cmake is needed as a dependency...
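Following zao's suggestion, a possible layout is one install prefix per package under a private tree, e.g. (paths are illustrative only):

```
cmake <hpx-source-dir> -DCMAKE_INSTALL_PREFIX=$HOME/stellar/install/hpx
cmake --build . --target install
```

Wiping `$HOME/stellar/install/hpx` then removes that installation completely, without touching `/usr/local` or other packages.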
nikunj97 has joined #ste||ar
weilewei has joined #ste||ar
<weilewei> hkaiser I am not understanding this lambda function here: https://github.com/CompFUSE/DCA/blob/8473fa9c3d71b1678ac8555bcac1bd5ccd0415fd/include/dca/phys/dca_step/cluster_solver/stdthread_qmci/stdthread_qmci_cluster_solver.hpp#L416 I don't see where these variables are defined, like meas_id, n_meas
<zao> Those are the formal parameters of the lambda function.
<zao> iterateOverLocalMeasurements invokes your callable object with actual parameters on lines 341 and 346 in your second link.
<weilewei> zao I see the connection now
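A generic illustration of what zao describes: `meas_id` and `n_meas` are the lambda's formal parameters, and the caller supplies the actual arguments when it invokes the callable. The names below are placeholders, not the DCA API:

```cpp
#include <iostream>

template <typename F>
void iterate_over_measurements(int n_meas, F&& f)
{
    for (int meas_id = 0; meas_id < n_meas; ++meas_id)
        f(meas_id, n_meas);    // actual arguments supplied by the caller
}

int main()
{
    // meas_id and n_meas are defined as parameters of the lambda itself
    iterate_over_measurements(3, [](int meas_id, int n_meas) {
        std::cout << meas_id << " of " << n_meas << '\n';
    });
}
```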
wate123_Jun has quit [Remote host closed the connection]
wate123_Jun has joined #ste||ar
wate123_Jun has quit [Remote host closed the connection]
<gonidelis> When did we start using `gmake` instead of `make` in the building instructions? And why? I am just curious :)
<zao> Some platforms have GNU Make installed as "gmake", while "make" is some other Make, like BSD's.
<zao> The author of the instructions may have been exposed to macOS :)
<gonidelis> :D :D 'exposed'
weilewei has quit [Remote host closed the connection]
Hashmi has joined #ste||ar
nikunj97 has quit [Ping timeout: 260 seconds]
nikunj97 has joined #ste||ar
bita has joined #ste||ar
nikunj97 has quit [Remote host closed the connection]
Hashmi has quit [Quit: Connection closed for inactivity]
wate123_Jun has joined #ste||ar
hkaiser has quit [Ping timeout: 260 seconds]
wate123_Jun has quit [Ping timeout: 256 seconds]
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
gonidelis has quit [Ping timeout: 240 seconds]
Abhishek09 has joined #ste||ar
<Abhishek09> X)
hkaiser has joined #ste||ar
<Abhishek09> hkaiser: Hi, any update from your side?
bita has quit [Quit: Leaving]
<hkaiser> Abhishek09: about what?
<Abhishek09> hkaiser: GSoC
<hkaiser> Abhishek09: the decisions will be made by April 20th or so, see the GSoC timeline (I don't remember the actual date, sorry)
<diehlpk_mobile[m> May 4 18:00 UTC Accepted student projects announced
<diehlpk_mobile[m> Abhishek, we are not allowed to announce anything before the official date
<Abhishek09> diehlpk_mobile[m: Do you agree with my implementation? I have tried it the way you told me to
<diehlpk_mobile[m> Abhishek, we are not allowed to let you know such information
<diehlpk_mobile[m> All ratings of the proposals are internal and we are not allowed to let students know before the official deadline
<Abhishek09> diehlpk_mobile[m: no, I'm not talking about GSoC. This is a personal question of mine, based on what you have told me
<diehlpk_mobile[m> Keep in mind that your proposal competes with all we have received
<diehlpk_mobile[m> Abhishek, the only thing you can do is wait for the official deadline