aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
zbyerly_ has joined #ste||ar
zbyerly_ has quit [Ping timeout: 268 seconds]
hkaiser has quit [Quit: bye]
K-ballo has quit [Quit: K-ballo]
<taeguk> My PR failed on CircleCI. I don't know what the problem is.
<taeguk> Can anyone tell me what that CI error means?
<heller_> taeguk: our inspect tool doesn't pass
<heller_> you need to be logged in to circle-ci to get the link
<heller_> taeguk:
denis_blank has quit [Quit: denis_blank]
<heller_> taeguk: log in and click on artifacts
<taeguk> heller_: good. thank you. Is there a way to use the hpx inspection tool before making a PR?
<heller_> yes
<heller_> you can build it locally
<heller_> run cmake with -DHPX_WITH_TOOLS=On
<taeguk> heller_: Thank you. I'll try.
<jbjnr> taeguk - just a helpful tip - if you look at your circle ci page for the build of your PR https://circleci.com/gh/STEllAR-GROUP/hpx/6849 and scroll down, you will see the red bit near the bottom which is the failed inspect check on the code. If you look at the command line used to launch the inspect check, it is "./build/bin/inspect --all --output=./build/hpx_inspect_report.html /hpx" - so in...
<jbjnr> ...your build dir, you need to do as heller_ says and set HPX_WITH_TOOLS=ON in your cmake, then do a "make inspect_exe" - this will build the inspect tool. Then from your build dir you can use "./bin/inspect --all --output=inspect.html /path/to/hpx/src"
<jbjnr> then the inspect report is generated in the inspect.html file in your build dir
<jbjnr> and you can make sure everything passes before doing the next PR :)
<heller_> jbjnr: could you look into https://github.com/STEllAR-GROUP/hpx/pull/2619 please?
<github> [hpx] sithhell pushed 1 new commit to lf_multiple_parcels: https://git.io/vHCTm
<github> hpx/lf_multiple_parcels b870220 Thomas Heller: Fixing line lengths
<taeguk> jbjnr: Thank you for your tip!
<jbjnr> heller_: yes. I'll have a look
<jbjnr> taeguk we should set up a video call sometime soon. You seem to be making good progress, and coding starts in a few days!
<taeguk> jbjnr: yes! I got it!
shoshijak has joined #ste||ar
<jbjnr> taeguk how about we arrange a chat on Friday 2nd June?
<jbjnr> 9am my time would be 4pm your time, I think. Are you free then, on Fri 2nd at 4pm?
<taeguk> jbjnr: yes, I'm free then.
<taeguk> It's 3:48pm here now.
<jbjnr> ok. Fri 4pm then. See you then. Google hangouts - my gmail address is biddiscoj@gmail.com
<taeguk> jbjnr: Okay, got it. I'll send you a message to confirm.
<jbjnr> ok
bikineev has joined #ste||ar
<jbjnr> bikineev: yt?
<jbjnr> heller_: some comments. Is the PR working ok? Any issues with lockups on tave etc. still?
<heller_> jbjnr: still lockups
<heller_> jbjnr: but more lockups without that PR :/
<jbjnr> heller_: if you send me your tave command line to reproduce lockups, I will have a play this week
<heller_> jbjnr: I am pretty sure right now that there is something fishy with all the memory management
<heller_> jbjnr: srun -ul ./octotiger -Ihpx.parcel.mpi.enable=0 -Ihpx.parcel.libfabric.enable=1 --hpx:threads=64 -Problem=dwd -Max_level=9 -Xscale=10.0 -Eos=wd -Disableoutput -Ihpx.stacks.use_guard_pages=0 -Ihpx.max_busy_loop_count=500 -Ihpx.parcel.message_handlers=1
<heller_> jbjnr: on 128 localities
<jbjnr> heller_: why do you suspect memory management?
<heller_> jbjnr: because I am getting memory registration errors
<jbjnr> at startup? or later on
<jbjnr> we found a massive bug with raffaele's code caused by the stack cleanup. Reducing that stack cleanup flag to a smaller number fixed all our munmap/out-of-memory problems
<jbjnr> not connected to memory registration - but if memory is getting tight - then it might exacerbate the problem
david_pfander has joined #ste||ar
<heller_> jbjnr: during running
<heller_> jbjnr: ok, maybe we are indeed just running out of pages. what did you change?
<jbjnr> HPX_MAX_TERMINATED_THREADS needs to be reduced to a much smaller number. On KNL, with 64 cores (threads=?), there will be 64K undeleted stacks if normal queues alone are used. We set ours to 100 instead of 1000 and the problems go away. My suspicion is that a value of 10 would still be fine.
<jbjnr> I can't see any great advantage to having a large number of unclaimed deleted tasks - a smaller number is advantageous from a memory/cache/everything point of view. The only gain is a reduction in overheads from cleaning up from time to time, but the amount of work to do is still the same.
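For scale, a back-of-the-envelope version of jbjnr's point (the 32 kB stack size below is an assumption for illustration; HPX stack sizes are configurable):

    #include <cstddef>

    // 64 worker threads, each queue caching 1000 terminated tasks
    // (the old HPX_MAX_TERMINATED_THREADS default), gives 64,000
    // undeleted stacks on a KNL node.
    constexpr std::size_t workers          = 64;
    constexpr std::size_t cached_per_queue = 1000;
    constexpr std::size_t stack_bytes      = 32 * 1024;   // assumed stack size

    constexpr std::size_t dead_stacks = workers * cached_per_queue;  // 64,000
    constexpr std::size_t dead_bytes  = dead_stacks * stack_bytes;   // ~2 GiB

    // Dropping the limit to 100 shrinks this to ~200 MiB; to 10, ~20 MiB.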
<jbjnr> I will submit a PR where the TERMINATED_THREADS limit is made into a command line param
<jbjnr> (actually, thinking about it, that undeleted-tasks flag might be the cause of all the knl trouble.)
<jbjnr> try 10
<jbjnr> heller_: ^^
<heller_> i will, thanks
<heller_> jbjnr: oh fuck ... code crept in ... thanks for spotting...
<heller_> jbjnr: I need to take care of something else first though
<github> [hpx] biddisco created terminated_threads (+1 new commit): https://git.io/vHCmt
<github> hpx/terminated_threads 150c584 John Biddiscombe: Reduce MAX_TERMINATED_THREADS default, improve memory use on manycore cpus
bikineev has quit [Remote host closed the connection]
<github> [hpx] biddisco opened pull request #2656: Reduce MAX_TERMINATED_THREADS default, improve memory use on manycore… (master...terminated_threads) https://git.io/vHCmV
<zao> jbjnr: I wonder how many days it'd take to build HPX on my KNL.
<zao> Cross-compiling is for scrubs.
<jbjnr> zao: we don't have problems compiling on the login nodes
<jbjnr> and I don't even set cross compilation flags explicitly. Might be set in the toolchain file for me though
<jbjnr> cmake/toolchains/knlxxx
<zao> jbjnr: "it's complicated" in our build system, as dependencies are in modules in weirdo places on filesystem :)
<jbjnr> wash - what does ABP mean in abp_priority scheduler?
bikineev has joined #ste||ar
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vHC3X
<github> hpx/gh-pages 44d94aa StellarBot: Updating docs
<taeguk> In cmake, can HPX_WITH_TOOLS not be used on windows?
zbyerly_ has joined #ste||ar
<heller_> taeguk: should
zbyerly_ has quit [Ping timeout: 260 seconds]
bikineev has quit [Ping timeout: 240 seconds]
bikineev has joined #ste||ar
bikineev has quit [Read error: No route to host]
K-ballo has joined #ste||ar
bikineev has joined #ste||ar
hkaiser has joined #ste||ar
pree has joined #ste||ar
<wash> jbjnr: uh
<wash> jbjnr: It may refer to the type of queue
<jbjnr> wash nobody knows what it stands for, and apparently you created it :)
<jbjnr> I wonder if anyone uses it...
<wash> jbjnr: nope, not really used
<wash> jbjnr: we experimented with it
<wash> but the atomics needed to implement the deque (dcas) are quite slow on x86
<wash> which makes queuing operations more expensive
<wash> jbjnr: it does refer to the queuing strategy
<wash> one sec
<wash> let me see if I can find the ref
<jbjnr> what does it do differently from the other schedulers - I've forgotten
<hkaiser> jbjnr: push/pop on both ends of the queue
<jbjnr> ok
<hkaiser> i.e. stealing happens from the back end
<wash> jbjnr: like a lot of code I have written
<wash> it is only good in theory :p
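For context: ABP refers to the Arora-Blumofe-Plaxton work-stealing deque (the commit further down adds a comment to that effect). A minimal, purely illustrative locked sketch of the access pattern hkaiser describes - the real ABP structure is a lock-free deque whose bounded implementation needs the double-width CAS wash mentions:

    #include <deque>
    #include <mutex>
    #include <optional>

    // Illustrative only: owner pushes/pops at the front, thieves
    // steal from the back. Not the lock-free ABP algorithm itself.
    template <typename Task>
    class abp_style_queue
    {
        std::deque<Task> tasks_;
        std::mutex mtx_;

    public:
        void push(Task t)   // owner end
        {
            std::lock_guard<std::mutex> l(mtx_);
            tasks_.push_front(std::move(t));
        }

        std::optional<Task> pop()   // owner end: newest task, cache-warm
        {
            std::lock_guard<std::mutex> l(mtx_);
            if (tasks_.empty())
                return std::nullopt;
            Task t = std::move(tasks_.front());
            tasks_.pop_front();
            return t;
        }

        std::optional<Task> steal()   // thief end: oldest task
        {
            std::lock_guard<std::mutex> l(mtx_);
            if (tasks_.empty())
                return std::nullopt;
            Task t = std::move(tasks_.back());
            tasks_.pop_back();
            return t;
        }
    };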
<jbjnr> should we get rid of it?
<hkaiser> should we?
<wash> jbjnr: I'd prefer to leave it, I'm actually playing around with it recently a little
<jbjnr> ok
<wash> been updating some thread overhead results.
<jbjnr> just doing some cleaning up in the scheduler area after shoshijak 's changes to the thread pools etc
<wash> hkaiser: btw, clang coroutines are now in trunk
<hkaiser> wash: nod, heard about it
<wash> jbjnr: what was the thread pool change, btw?
<hkaiser> requires build-system changes on our end
<wash> hkaiser: yah
<hkaiser> wash: and I think we can remove the library emulation helpers you added
<jbjnr> wash the changes are that instead of a single thread pool with a single scheduler, you can now have multiple thread pools, running a different scheduler in each
<jbjnr> it's a fairly major change to everything, unfortunately.
<wash> hkaiser: yes we can
<wash> jbjnr: ah, so moving more towards a true executor design, nice
<jbjnr> yes, you will be able to (can) schedule tasks on a particular pool by getting an executor for the pool in question.
<wash> nice
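A purely hypothetical usage sketch of the design jbjnr describes - every name below is made up for illustration and is not the actual HPX interface:

    // Hypothetical names throughout; only the shape of the design is real.
    my_scheduler net_sched;            // user-provided scheduler
    my_other_scheduler work_sched;

    resource_partitioner rp;
    rp.create_pool("network", &net_sched);   // one pool per scheduler
    rp.create_pool("compute", &work_sched);

    pool_executor net_exec("network");       // executor bound to a named pool
    hpx::async(net_exec, [] { /* runs on the "network" pool */ });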
<wash> one sec
<wash> I may be able to find the reference
<wash> still looking
<jbjnr> wash don't waste time on it. I have been curious for some time as to what abp stands for, but it's not important
<wash> Found it, adding a comment.
<github> [hpx] brycelelbach pushed 1 new commit to master: https://git.io/vHCw8
<github> hpx/master 4062ca2 Bryce Adelstein Lelbach aka wash: Add a comment explaining what "ABP" in ABP queuing refers to.
<wash> hkaiser: ^ didn't PR for that, just adds a short comment
<wash> gotta run, I'll be around on wash[m] though.
<hkaiser> tks
<heller_> hkaiser: nice trolling ;)
<hkaiser> heller_: shrug - that guy is nuts - suffering from amnesia
<heller_> ;)
<jbjnr> wash thanks
diehlpk_work has joined #ste||ar
eschnett has quit [Quit: eschnett]
hkaiser has quit [Quit: bye]
aserio has joined #ste||ar
<jbjnr> K-ballo: yt? If you have a moment ... if I have a custom scheduler and instantiate a thread_pool_impl<> templated on my new scheduler, I get this link error https://gist.github.com/biddisco/83bb7a315487def321a09fc3db5fbdc8#file-gistfile1-txt-L23 - what do I need to add to the code to make it create the constructor I need and instantiate it?
<K-ballo> a linker error on a lambda (or other unnamed type)? that doesn't sound possible, it would have internal linkage
<K-ballo> where is this coming from?
<jbjnr> hold on, let me upload the test
<diehlpk_work> K-ballo, I started to use hpx::compat in HPXCL and replaced boost::mutex with hpx::compat::mutex, and I got an error for boost::mutex::scoped_lock lock(this->m);
<diehlpk_work> It is not declared in hpx::compat::mutex
<K-ballo> diehlpk_work: nested scoped locks were deprecated about a decade ago
<diehlpk_work> Ok, I will fix it for opencl
<diehlpk_work> Can I replace it with boost::scoped_lock lock(m); // m is a mutex
<diehlpk_work> Or hpx::compat::scoped_lock?
<K-ballo> jbjnr: I don't understand the error..
<K-ballo> diehlpk_work: I'll get back to you later
<jbjnr> the constructor for my templated thread_pool_impl is not instantiated, but I'm not sure how to make it happen
<K-ballo> did you define it? is it in a header? or is it extern? or..?
<jbjnr> I guess because the thread pool stuff is inside hpx code, and the test uses it outside (I added HPX_EXPORT to all relevant classes, but that was not the problem)
<K-ballo> what's callback_notifier?
<jbjnr> something from inside the main thread_manager stuff in hpx code
<jbjnr> I'm trying to add a callback function to create a custom scheduler
<jbjnr> aha. I think I'm missing an explicit instantiation
<K-ballo> I can't find thread_pool_impl
<jbjnr> sorry, this is a heavily modified branch
<jbjnr> let me play a bit longer, I think I see the problem
<K-ballo> bah, I was never going to find it then? :/
<jbjnr> thanks for looking
<jbjnr> - well, more of a concept problem really
<jbjnr> I mean you know the general idea behind solving these things, even if you can't see every class involved
<wash> /join #debian
<wash> whoops
<zao> taeguk: What kind of problems do you experience with that define, if any?
<K-ballo> diehlpk_work: I think boost locks are flagged by inspect, what kind of lock do you want?
<K-ballo> boost's old nested scoped locks were sometimes lock guards and sometimes unique locks, rather bad design choice
<taeguk> zao: what do you mean?
<taeguk> Now I have no problem.
<zao> 11:27:42 taeguk | In cmake, HPX_WITH_TOOLS cannot be used in windows?
<diehlpk_work> K-ballo, I have to think what kind of lock I need. I am fixing someone else code
<zao> Regarding that question.
<K-ballo> diehlpk_work: then I suggest you go with `std::lock_guard`, and defer thinking about it until it fails to compile
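The mechanical form of that migration, as a sketch (the compat header path is an assumption, not verified against HPXCL):

    #include <hpx/compat/mutex.hpp>   // header path is an assumption
    #include <mutex>

    hpx::compat::mutex m;

    void critical_section()
    {
        // before: boost::mutex::scoped_lock lock(m);
        std::lock_guard<hpx::compat::mutex> lock(m);
        // ... protected work; unlocks automatically at scope exit ...
    }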
<taeguk> zao: I resolved the problem. It occurred because I'm not familiar with CMake.
<taeguk> On windows, I can use the inspect tool, too.
<zao> Great.
<diehlpk_work> K-ballo, Thanks, I will give it a try
<taeguk> But, there is a problem.
<taeguk> the inspection results are different on windows and linux.
<K-ballo> yes, they are, we need to fix that :/ recently it's become more of a problem, as more lines hit the 90 cols mark
<taeguk> On linux, there are no inspection errors. But on windows, there are some inspection errors about "files with lines exceeding the character limit".
<K-ballo> windows line endings are two characters, inspect interprets the line as 91 cols long
<taeguk> K-ballo: aha, I got it.
<K-ballo> now that I think about it, it should just be a matter of opening the file stream as text
<K-ballo> they are explicitly opened as binary for some odd reason
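A sketch of the fix K-ballo suggests: opened in the default text mode, the Windows CRT translates "\r\n" to "\n", so getline no longer leaves a trailing '\r' that makes a 90-column line count as 91:

    #include <fstream>
    #include <string>

    void check_line_lengths(char const* path)
    {
        // text mode (the suggested fix); inspect currently opens files
        // with std::ios::binary, which keeps the '\r' on Windows.
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line))
        {
            if (line.size() > 90)
            {
                // flag the over-long line
            }
        }
    }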
<jbjnr> K-ballo: so I added an explicit instantiation of the thread_pool_impl<blah> to my test.cpp file and now it gives me link errors for each of the member functions. Normally these would be (I presume) generated at hpx compile time and exist and be exported from the hpx lib. Any idea how I make them appear for me - this is what I added https://gist.github.com/biddisco/455714ae59876d564d7f81f1ed6d4377#file-gistfile1-txt-L26-L32
eschnett has joined #ste||ar
<K-ballo> jbjnr: it *seems* you are explicitly preventing instantiation, so it's up to you to export them
<K-ballo> check the code, see if you are marking something as extern, or undefined
<jbjnr> hmm. ok
bikineev has quit [Ping timeout: 268 seconds]
shoshijak has quit [Ping timeout: 240 seconds]
<diehlpk_work> K-ballo, Thanks, it is compiling now
bikineev has joined #ste||ar
shoshijak has joined #ste||ar
<taeguk> I finally implemented parallel is_heap and is_heap_until.
<taeguk> Here are my PR and performance measurements.
<jbjnr> taeguk I'll test it sometime in the next couple of days
<taeguk> I haven't included my benchmark code in HPX yet.
<taeguk> jbjnr: Okay, thank you.
<jbjnr> seems to scale ok up to 8 cores, then starts choking according to your numbers. Is the task size too small beyond that number?
<jbjnr> aha. 8 physical cores.... 16 logical - why test on 32 if there are only 16 available?
<jbjnr> K-ballo: I cannot make it link. I am a loser.
<taeguk> jbjnr: what do you mean? I don't understand.
<K-ballo> jbjnr: where can I find the corresponding definition?
<jbjnr> rostam has 8 cores and 16 PUs, but you tested with 32 as well.
<taeguk> No specific reason for that.
<taeguk> just to test thoroughly.
<jbjnr> taeguk ok. when I looked at the numbers I was initially surprised, then I saw that there are only 8 real cores anyway and the algorithm is probably limited by memory access
<K-ballo> and the definition of the constructor?
<jbjnr> I'll test on some nodes here at CSCS, ranging from 8 cores up to 72.
<jbjnr> K-ballo: very good question :)
<taeguk> jbjnr: Do you need my benchmark codes?
<jbjnr> K-ballo: and all the implementation. I didn't realize it was in the cpp instead of the header. my bad. You see, you only have to sniff the code to find the bugs!
<jbjnr> taeguk might be good if I can get them; if they are in a branch in your hpx clone, just send me a link (or another repo)
<taeguk> jbjnr: here you are:
<jbjnr> thanks
hkaiser has joined #ste||ar
<wash[m]> I don't know why I go to runtime system/async tasking/threading talks, they just make me aggravated
<taeguk> What differences are there between hpx::async and hpx::dataflow? They seem to be similar.
<taeguk> ah, sorry. I'm a fool.
<taeguk> the problem is just my tiredness.
<wash[m]> They are similar
<jbjnr> taeguk dataflow is a kind of optimized when_all().then()
<taeguk> jbjnr: wash: thank you :)
<hkaiser> taeguk: the only difference between async and dataflow is that any arguments to those functions that are futures are handled differently
<hkaiser> async passes futures through to the invoked function verbatim; dataflow makes sure those are ready before the function is called
<wash[m]> ^
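A minimal sketch of that difference (the header choice is a best guess for this era of HPX):

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/lcos.hpp>   // header choice is an assumption
    #include <utility>

    int main()
    {
        hpx::future<int> f = hpx::async([] { return 41; });

        // async: the lambda runs immediately and receives the future
        // verbatim, so f2.get() may block inside the lambda.
        hpx::future<int> r1 = hpx::async(
            [](hpx::future<int> f2) { return f2.get() + 1; }, std::move(f));

        hpx::future<int> g = hpx::async([] { return 41; });

        // dataflow: invocation is deferred until g is ready, so
        // g2.get() never blocks inside the lambda.
        hpx::future<int> r2 = hpx::dataflow(
            [](hpx::future<int> g2) { return g2.get() + 1; }, std::move(g));

        return (r1.get() + r2.get() == 84) ? 0 : 1;
    }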
jbjnr has quit [Remote host closed the connection]
jbjnr has joined #ste||ar
pree has quit [Quit: AaBbCc]
pree has joined #ste||ar
<jbjnr> hkaiser: is it the general rule that classes in hpx::xxx::detail are not supposed to be used by users?
bikineev has quit [Read error: No route to host]
bikineev has joined #ste||ar
<K-ballo> yes
<K-ballo> and it should even be avoided by things outside of xxx, too
<jbjnr> ok thanks. I will have to rearrange some of the thread pool stuff, I guess, then. Since it is now going to be possible to have user-provided schedulers, it makes some of the previously hidden API visible.
<jbjnr> K-ballo: compiled!
<K-ballo> "detail" comes from "implementation detail", meaning it's not part of the interface and we'll change it or remove it as we see fit
<jbjnr> thanks very much - just finished building after I moved everything from the template _impl class into the header
<jbjnr> now the specializations are picked up outside of hpx.
<jbjnr> I should have spotted that. thanks a bundle.
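For the record, the generic shape of the problem and its two fixes (names hypothetical; this is not the actual HPX code):

    // pool.hpp
    template <typename Scheduler>
    struct thread_pool_impl
    {
        explicit thread_pool_impl(Scheduler* sched);   // declaration only
    };

    // pool.cpp -- definitions hidden inside the library
    template <typename Scheduler>
    thread_pool_impl<Scheduler>::thread_pool_impl(Scheduler* sched) { /* ... */ }

    // the library only instantiates its own schedulers:
    template class thread_pool_impl<builtin_scheduler>;

    // user.cpp
    // thread_pool_impl<my_scheduler> pool(&sched);   // link error: no
    // instantiation of this constructor exists in any translation unit.
    //
    // Fix 1 (what jbjnr did): move the definitions into pool.hpp so the
    // user's translation unit can instantiate them itself.
    // Fix 2: add 'template class thread_pool_impl<my_scheduler>;' in a
    // translation unit that sees both the definitions and my_scheduler.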
bikineev has quit [Ping timeout: 240 seconds]
bikineev_ has joined #ste||ar
<hkaiser> jbjnr: if the impl is not a template anymore, most of its functions can go into the hpx module; no reason to keep the implementation in a header file
shoshijak has quit [Ping timeout: 240 seconds]
<diehlpk_work> The police here are very ambitious about finding the owner of a stolen bicycle. They rang at 50 apartments at 3:30 in the morning to find the owner of the stolen bike.
<zao> Sounds legit.
<zao> diehlpk_work: Probable cause to do all sorts of other stuff, or is such evidence inadmissible?
<diehlpk_work> zao, I do not know and will do some research later
<zao> I could tip the cops off that the bike is in your apartment and they can explain it to you :{
<zao> :P
<diehlpk_work> My colleagues told me that it is normal here, because they need to find the owner of the bike to arrest the thief.
<github> [hpx] hkaiser deleted emulate_deleted at 32bac50: https://git.io/vHChJ
<zao> Ah, didn't even grok that they were looking for the owner, thought they were just looking for the bike.
<zao> "fun"
bikineev_ has quit [Remote host closed the connection]
bikineev has joined #ste||ar
pree has quit [Quit: AaBbCc]
aserio has quit [Ping timeout: 258 seconds]
arashamini has joined #ste||ar
arashamini has quit [Client Quit]
david_pfander has quit [Ping timeout: 240 seconds]
Matombo has joined #ste||ar
aserio has joined #ste||ar
shoshijak has joined #ste||ar
bikineev has quit [Remote host closed the connection]
Matombo has quit [Remote host closed the connection]
bikineev has joined #ste||ar
<aserio> patg: yt?
<github> [hpx] sithhell force-pushed lf_multiple_parcels from b870220 to 38605df: https://git.io/vHnkr
<github> hpx/lf_multiple_parcels 38605df Thomas Heller: Fixing line length
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
denis_blank has joined #ste||ar
hkaiser has quit [Quit: bye]
jesseg_ has joined #ste||ar
jesseg_ has quit [Client Quit]
hkaiser has joined #ste||ar
bikineev has joined #ste||ar
eschnett has quit [Quit: eschnett]
denis_blank has quit [Quit: denis_blank]
denis_blank has joined #ste||ar
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
<github> [hpx] hkaiser pushed 1 new commit to master: https://git.io/vHWZr
<github> hpx/master 47e122e Hartmut Kaiser: Adding links to cppreference.com to algorithm docs...
<github> [hpx] hkaiser created fixing_vcpkg (+1 new commit): https://git.io/vHWZ6
<github> hpx/fixing_vcpkg 6ae2418 Hartmut Kaiser: Improve integration with vcpkg
<github> [hpx] hkaiser opened pull request #2659: Improve integration with vcpkg (master...fixing_vcpkg) https://git.io/vHWZi
aserio has quit [Quit: aserio]
bikineev has quit [Remote host closed the connection]
denis_blank has quit [Quit: denis_blank]