#ste||ar on 2018-07-16 — irc logs at irclog.cct.lsu.edu

2018-04-23 16:40 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC2018: https://wp.me/p4pxJf-k1

01:48 hkaiser has quit [Quit: bye]

02:14 diehlpk has quit [Ping timeout: 244 seconds]

02:27 K-ballo has quit [Quit: K-ballo]

03:19 nanashi55 has quit [Ping timeout: 264 seconds]

03:19 nanashi55 has joined #ste||ar

06:09 jaafar has quit [Ping timeout: 240 seconds]

06:32 jbjnr has joined #ste||ar

06:35 biddisco has joined #ste||ar

06:36 jbjnr_ has joined #ste||ar

06:39 jbjnr has quit [Ping timeout: 240 seconds]

06:54 <M-ms> jbjnr_: how about trying gcc to rule out libc++ or clang? I think mathieu has a full gcc stack that you could use

06:59 <jbjnr_> M-ms: would be a good plan. I am still setting up stuff in my horrid new office. Not going to get much done for a bit

07:12 <biddisco> \help

07:13 biddisco has left #ste||ar ["WeeChat 2.1"]

07:45 <heller_> jbjnr_: sounds a bit like a stack overflow?

07:46 <jbjnr_> could be

07:52 <heller_> did you try?

07:52 <heller_> --hpx:ini=hpx.stacks.small_size=0x0200000 --hpx:ini=hpx.stacks.medium_size=0x0200000

07:52 <heller_> this is usually what I use to rule them out
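A note on the values heller_ quotes: 0x0200000 bytes is 2 MiB. The sketch below only formats the two options from his message for a chosen size. `stack_size_args` is a hypothetical helper, not part of HPX, and the option names are copied from the chat rather than checked against the HPX documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// 0x0200000 bytes is exactly 2 MiB.
static_assert(0x0200000 == 2 * 1024 * 1024, "0x0200000 is 2 MiB");

// Hypothetical helper (not an HPX API): build the two --hpx:ini
// arguments quoted in the chat for a given stack size in bytes.
inline std::vector<std::string> stack_size_args(std::size_t bytes)
{
    assert(bytes > 0);
    std::ostringstream hex;
    hex << std::showbase << std::hex << bytes;  // 0x0200000 prints as 0x200000
    return {
        "--hpx:ini=hpx.stacks.small_size=" + hex.str(),
        "--hpx:ini=hpx.stacks.medium_size=" + hex.str(),
    };
}
```

Calling `stack_size_args(0x0200000)` yields the same two settings as in the message above, apart from the dropped leading zero in the hex literal.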

09:06 <github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/fNYbD

09:06 <github> hpx/gh-pages c08b709 StellarBot: Updating docs

09:29 jakub_golinowski has quit [Quit: Ex-Chat]

09:59 jakub_golinowski has joined #ste||ar

10:02 <jakub_golinowski> M-ms, I am finished with rebuilding of hpx and opencv with the networking off

10:02 <jakub_golinowski> now I will write the script that runs the dnn tests with different options and let it run when I am out 1pm-4pm

10:34 quaz0r has quit [Ping timeout: 268 seconds]

10:41 quaz0r has joined #ste||ar

11:03 jbjnr_ has quit [Ping timeout: 240 seconds]

11:15 jbjnr has joined #ste||ar

11:45 hkaiser has joined #ste||ar

12:23 <jbjnr> hkaiser: yt?

12:23 <hkaiser> here

12:24 <jbjnr> I finally got around to installing VS2017 on my windows machine and built HPX, but I get link errors for boost::program_options

12:24 <jbjnr> are there any special settings you use

12:24 <hkaiser> none

12:25 <zao> Do we turn off Boost autolinking?

12:25 <hkaiser> jbjnr: on windows I'd suggest using vcpkg

12:25 <jbjnr> what does that mean

12:25 <zao> If not, Boost might quite happily try to link library variants you don't have present on the system.

12:25 <hkaiser> zao: only when using vcpkg

12:25 <hkaiser> https://github.com/Microsoft/vcpkg

12:25 <zao> Sounds silly, considering that the set of libraries installed may not be at all what autolinking expects.

12:25 <jbjnr> zao: no. I have just compiled boost myself and the cmake stuff is finding it fine

12:26 <zao> jbjnr: The problem is that Boost.Config has an "autolink" feature when using MSVC, where it "intelligently" decides what libraries to link with #pragma comment(lib,"libboost-blargh...")

12:26 <hkaiser> jbjnr: release/debug mismatch?

12:26 <zao> (the generic problem, not necessarily _your_ problem, that is)

12:27 <zao> It can be disabled and have the user link everything themselves, but apparently we don't do that for regular builds.

12:27 <hkaiser> jbjnr: release/debug will cause linker errors with boost

12:27 <zao> The autolinker might be looking for the wrong variant or dynamic/static libraries you may not have built.

12:27 <zao> What kind of errors do you get?

12:27 <zao> Libraries not existing at all, or some particular symbols?

12:30 <jbjnr> just a couple of symbols

12:30 <zao> Personally, I tend to use the binary downloads from boost.org, Good Enough for me.

12:30 <hkaiser> jbjnr: I'd bet it's a release/debug mismatch

12:30 <jbjnr> lost them now as I started a rebuild. I am familiar with the boost autolinking.

12:30 * zao nods

12:30 <zao> hkaiser: As in debug Boost w/ Release HPX?

12:31 <zao> Shouldn't that be handled properly by FindBoost and friends?

12:31 <hkaiser> or vv.

12:31 <hkaiser> cmake sometimes fails to handle that correctly

12:31 <zao> Joy.

12:31 <hkaiser> yah

12:32 <jbjnr> it's not a debug/release problem

12:32 <hkaiser> I tend to explicitly specify CMAKE_BUILD_TYPE=Debug, in which case cmake finds both types properly

12:32 <hkaiser> the generated VS project then properly handles both configs

12:32 <zao> What Boost version, any particular flags, what --layout?

12:48 K-ballo has joined #ste||ar

13:22 hkaiser has quit [Quit: bye]

13:44 aserio has joined #ste||ar

13:49 <zao> Dependencies on MSVC seem to be a bit botched...

13:49 <zao> »50>LINK : fatal error LNK1104: cannot open file '..\..\..\Debug\lib\hpxd.lib'«

13:50 <zao> https://gist.github.com/zao/b69d347bb2298fccf4630ae939437d8a

13:52 <zao> Someone's doing something naughty with std::streampos in slurm_environment.cpp :D

13:54 <zao> Oh gods, Boost.Iostreams is assuming things about the implementation.

13:54 <zao> 2>c:\local\boost_1_67_0\boost\iostreams\positioning.hpp(96): error C2039: 'seekpos': is not a member of 'std::fpos<_Mbstatet>' (compiling source file F:\Stellar\hpx\src\util\batch_environments\slurm_environment.cpp)

13:54 <zao> Didn't Billy rewrite MSVC's impl lately?

13:56 <K-ballo> why didn't I run into those?

13:56 <zao> I'm on the Preview compiler.

13:57 <K-ballo> courageous

13:57 <zao> Comments in headers make it sound like they're transitioning away from _Fpos completely, setting it to 0 for new code.

13:58 <zao> K-ballo: I'd build with the regular compiler if I could be arsed finding the gosh-darn powershell scripts to set the correct vcvars environment.

13:59 <zao> https://github.com/boostorg/iostreams/blob/develop/include/boost/iostreams/positioning.hpp#L92-L100

13:59 <zao> Not fixed in develop either :(

14:01 <zao> Let's see if I managed to hit the old compiler now.

14:02 <K-ballo> so boost.iostreams broken for msvc, lovely
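For context on the `seekpos` error: `seekpos` is not a member the C++ standard requires of `std::fpos`, so code calling it depends on one particular library implementation. The sketch below shows a standard-only way to get a byte offset out of a stream position, assuming the multibyte conversion state can be ignored. It is not what Boost.Iostreams does and not a proposed patch; `offset_of` and `current_read_offset` are hypothetical names.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Standard-only conversion: std::fpos converts to std::streamoff,
// so no implementation-specific member such as seekpos() is needed.
inline std::streamoff offset_of(std::streampos pos)
{
    return static_cast<std::streamoff>(pos);
}

// Hypothetical convenience wrapper: current read offset of a stream.
inline std::streamoff current_read_offset(std::istream& in)
{
    std::streampos const pos = in.tellg();
    assert(pos != std::streampos(-1));  // tellg() reports failure as pos_type(-1)
    return offset_of(pos);
}
```

The conversion drops any state carried inside the `fpos` object, which is acceptable for plain byte streams but may not be for stateful encodings.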

14:31 <jakub_golinowski> M-ms, yt?

14:31 <M-ms> jakub_golinowski: I will have a short meeting soonish, can let you know when I'm done and we can have the call whenever after that

14:32 <jakub_golinowski> M-ms, OK I just got back home and am available the whole evening

14:32 <zao> There's plenty of errors in tests with regular 2017 too, fwiw.

14:37 <zao> K-ballo: std::uniform_int_distribution<uint8_t>, didn't you people mention this the other day?

14:37 <K-ballo> yeah, PR ready and waiting

14:37 <zao> Ah.
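Background on the `std::uniform_int_distribution<uint8_t>` remark: the C++ standard only specifies that template for `short`, `int`, `long`, `long long` and their unsigned counterparts, so character-sized types such as `uint8_t` are not covered and MSVC rejects them. A common workaround, sketched here under the assumption that drawing in a wider type and narrowing is acceptable, looks like this. The contents of the pending PR are not shown in the log, so this is not necessarily what it does, and `random_byte` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: draw from [lo, hi] using a permitted wider type,
// then narrow the result back to std::uint8_t.
inline std::uint8_t random_byte(std::mt19937& gen, std::uint8_t lo, std::uint8_t hi)
{
    assert(lo <= hi);
    std::uniform_int_distribution<unsigned int> dist(lo, hi);
    return static_cast<std::uint8_t>(dist(gen));  // always fits: hi <= 255
}
```

Usage is the same as with any distribution: construct a `std::mt19937` once and pass it by reference on each call.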

14:38 <zao> 28>f:\stellar\hpx\hpx\parallel\algorithms\find.hpp(582): error C3489: '&op' is required when the default capture mode is by copy (=)

14:38 <zao> Ooh, an ICE.

14:39 <zao> I should probably upgrade to 15.7.5

14:41 <K-ballo> PR ready and waiting for the captures too
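On the C3489 diagnostic: C++ does not allow naming a variable by copy in a capture list whose default is already `=`, so an explicit capture next to `=` has to be by reference (or `this`). The log does not show what line 582 of find.hpp looked like, so the sketch below only reproduces the general shape of the problem with a hypothetical `count_matching` function and one possible fix, listing each capture explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an algorithm that captures a predicate `op`.
template <typename Pred>
int count_matching(std::vector<int> const& values, Pred op)
{
    int hits = 0;

    // Ill-formed: with a `=` default, `op` may not also be listed by copy.
    //   auto visit = [=, op](int v) { /* ... */ };
    //
    // Well-formed alternative: spell out every capture.
    auto visit = [op, &hits](int v) {
        if (op(v))
            ++hits;
    };

    for (int v : values)
        visit(v);

    assert(hits >= 0 && static_cast<std::size_t>(hits) <= values.size());
    return hits;
}
```

Other spellings compile as well, for example `[=, &hits]` with `op` picked up by the default; which one fits depends on whether the lambda outlives the enclosing scope.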

14:41 _bibek_ is now known as bibek

14:42 <jbjnr> zao: i forgot to say. I fixed the problem by recompiling boost with -DBOOST_PROGRAM_OPTIONS_DYN_LINK=1 as instructed here https://svn.boost.org/trac10/ticket/13326

14:43 <jbjnr> known bug it seems. (feature). That's why I asked if hk used any special settings

14:43 <jbjnr> last time I used boost on windows was 1.59

14:43 <jbjnr> now 1.67

14:43 <zao> Eew, you can tune that at Boost build time?

14:44 <zao> I wonder how my binary Boost is built, as that "works".

14:44 <jbjnr> some #define means the symbols are not there if you don't

14:44 <jbjnr> (buggy feature)
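For readers unfamiliar with the macros in this exchange: `BOOST_ALL_NO_LIB` switches off the `#pragma comment(lib, ...)` auto-linking zao describes, and `BOOST_PROGRAM_OPTIONS_DYN_LINK` tells the Boost.ProgramOptions headers to expect the shared-library variant. The sketch below shows them on the consumer side, assuming they are wanted together and that libraries are then linked by hand (for instance by CMake). jbjnr's actual fix was applied when building Boost, which this does not reproduce. The two `constexpr` helpers are hypothetical and only make the chosen configuration visible.

```cpp
// These must be visible before the first Boost header is included.
// In practice they are usually passed as compiler flags (/D or -D).
#ifndef BOOST_ALL_NO_LIB
#  define BOOST_ALL_NO_LIB  // no automatic #pragma comment(lib, ...)
#endif
#ifndef BOOST_PROGRAM_OPTIONS_DYN_LINK
#  define BOOST_PROGRAM_OPTIONS_DYN_LINK  // use the shared library variant
#endif

// Include the header only if Boost is available on this system.
#if defined(__has_include)
#  if __has_include(<boost/program_options.hpp>)
#    include <boost/program_options.hpp>
#  endif
#endif

// Hypothetical helpers that report the configuration selected above.
constexpr bool autolink_disabled()
{
#ifdef BOOST_ALL_NO_LIB
    return true;
#else
    return false;
#endif
}

constexpr bool program_options_shared()
{
#ifdef BOOST_PROGRAM_OPTIONS_DYN_LINK
    return true;
#else
    return false;
#endif
}
```

With auto-linking off, a mismatch between the requested and the installed library variant shows up as an explicit missing library in the link line instead of a name chosen by the headers.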

14:47 <M-ms> jakub_golinowski: now?

15:05 diehlpk has joined #ste||ar

15:08 <zao> cmake_build_dir_test fails to compile, linker errors in schedule_last_test_exe, those bloody HPX_COMPILER_FENCE things in spinlock_overhead*.

15:08 <zao> MSVC error list doesn't get easier to navigate with all these fail_* tests polluting it :)

15:10 <K-ballo> pr ready and waiting for those bloody fences too

15:10 <zao> Figured as much :D

15:16 galabc has joined #ste||ar

15:20 hkaiser has joined #ste||ar

15:41 <zao> K-ballo: Is foreach_prefetching_test_exe ICE:ing for you?

15:42 <K-ballo> I don't know, I didn't keep the logs

15:42 <zao> Compiler points to here: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/prefetching.hpp?utf8=%E2%9C%93#L259

15:42 <K-ballo> something was ICEing, but I don't remember what

15:42 <zao> Lovely :)

15:45 <hkaiser> zao: yah, I have seen this before

15:59 aserio1 has joined #ste||ar

16:01 <K-ballo> hkaiser: this is your reminder to speak to aserio about my visit

16:01 aserio has quit [Ping timeout: 260 seconds]

16:02 aserio1 has quit [Client Quit]

16:02 aserio has joined #ste||ar

16:02 <hkaiser> K-ballo: will do

16:03 diehlpk has quit [Ping timeout: 240 seconds]

16:38 nikunj has joined #ste||ar

16:41 aserio has quit [Ping timeout: 264 seconds]

16:43 <M-ms> jbjnr: do you want to reply to the opencv guy? jakub_golinowski or I can do it if you're busy...

16:46 <diehlpk_work> M-ms, What did he ask?

17:02 <M-ms> diehlpk_work: just some basic stuff about hpx_main etc., follow up from a previous discussion

17:02 <M-ms> https://github.com/opencv/opencv/pull/11897#issuecomment-405284625

17:03 <diehlpk_work> Ok, thanks

17:04 <M-ms> jakub will hopefully have some good benchmark results to show tonight, but this vpisarev guy already seems a bit more positive about the backend than in the beginning

17:05 <diehlpk_work> Yes, I think we should mention that this is more a proof of concept and we have to investigate for performance improvements

17:10 jakub_golinowski has quit [Quit: Ex-Chat]

17:12 jakub_golinowski has joined #ste||ar

17:13 <jakub_golinowski> M-ms, I just realized there are binaries of opencv_perf_dnn

17:14 <jakub_golinowski> I think this incorrect closing of the test application is the reason for the performance dependency on the order of launches

17:16 <M-ms> jakub_golinowski: I don't follow

17:17 <jakub_golinowski> So I noticed that I have the 20% utilization of all cores

17:17 <jakub_golinowski> Then I looked into processes in the System Monitor

17:17 <jakub_golinowski> and found 7 or so instances of opencv_perf_test running and they were at the top of the cpu consumption list

17:19 <jakub_golinowski> Therefore I am now suspecting that the test apps are not closed properly? Maybe it has something to do with them being run from a bash script?

17:19 <jakub_golinowski> not sure - exploring that now

17:21 <nikunj> hkaiser, yt?

17:21 <M-ms> That would be bad... I'll check in a bit if I have some leftovers

17:21 <jakub_golinowski> but I am not 100% sure as I was developing the script and the leftovers might be from that

17:22 <jakub_golinowski> (some wrong launches etc...)

17:26 <M-ms> jakub_golinowski: all clean here, but I was launching mine directly from the command line, not through a script

17:27 <M-ms> also cancelled quite a few runs and it seems to have worked fine

17:27 <jakub_golinowski> M-ms, ok will investigate that

17:27 <jakub_golinowski> but I had some bad launches that hung

17:27 <jakub_golinowski> maybe did not kill them properly

17:27 <jakub_golinowski> this was caused by bad args

17:45 jaafar has joined #ste||ar

17:48 nikunj has quit [Quit: Leaving]

18:19 galabc has quit [Quit: Leaving]

18:29 aserio has joined #ste||ar

18:54 jakub_golinowski has quit [Remote host closed the connection]

18:57 jakub_golinowski has joined #ste||ar

19:09 <jakub_golinowski> M-ms, I had some issues and the cpufrequtils settings were ignored by the intel_pstate driver

19:11 <jakub_golinowski> but now I succeeded in upper-bounding the cpu at a lower freq

19:12 nikunj has joined #ste||ar

19:34 nikunj has quit [Quit: goodnight]

20:00 hkaiser has quit [Quit: bye]

20:32 <jbjnr> M-ms: you were telling me that i can't use command line params if I include hpx_main - I can't write hpx apps any more

20:34 <jbjnr> terminate called after throwing an instance of 'std::invalid_argument'

20:34 <jbjnr> what(): hpx::resource::get_partitioner() can be called only after the resource partitioner has been allowed to parse the command line options.

20:39 <jbjnr> nevermind. seems to be working again

21:07 hkaiser has joined #ste||ar

21:16 <jbjnr> hkaiser: how do I use --hpx:ini hpx.parcel.boot=tcp it says "what(): Attempt to initialize unknown entry: ..."

21:16 <jbjnr> what's the right syntax

21:17 <hkaiser> sec

21:18 <hkaiser> it's hpx.parcel.bootstrap=tcp

21:18 <jakub_golinowski> --hpx:ini=hpx.parcel.bootstrap=tcp

21:18 <jakub_golinowski> so like this in full

21:20 <jakub_golinowski> jbjnr, ^

21:20 <jakub_golinowski> or does it work without the first '=' as well?

21:21 <jbjnr> hkaiser: thanks. Silly me

21:21 <jbjnr> I used boot instead of bootstrap

21:37 jakub_golinowski has quit [Quit: Ex-Chat]

21:40 aserio has quit [Quit: aserio]

21:49 <jbjnr> hkaiser: or heller_ if I did not compile with MPI parcelport - is there any way I can mpirun -n 4 blah blah (boot = tcp) on a single node and get the localities to correctly initialize without manually having to do agas and hpx for every instance. I can't seem to find a way of getting the mpi env to do the setup of the ports and nodelist

21:49 <hkaiser> jbjnr: I don't think so

21:50 <hkaiser> but on a single node setting up multiple localities should be trivial

21:50 <hkaiser> app --hpx:localities=N -0 &

21:50 <hkaiser> app -1 &

21:50 <hkaiser> ...

21:50 <hkaiser> app -N

21:50 <hkaiser> -(N-1) that is

21:52 <hkaiser> jbjnr: or use hpx_run.py --localities=N app
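To spell out hkaiser's pattern, including his correction that the last instance is `-(N-1)`: the sketch below generates one command line per locality for a given N. It is a hypothetical helper that only builds strings and launches nothing. The flags are copied from his messages as written, with `--hpx:localities=N` on the first instance only, and the log does not settle whether the other instances need it too. The trailing `&` used for backgrounding is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: command lines for N localities on one node,
// following the pattern in the chat (instance indices 0 .. N-1).
inline std::vector<std::string> locality_commands(std::string const& app, unsigned n)
{
    assert(n >= 1);
    std::vector<std::string> cmds;
    // First instance: carries the locality count and index 0.
    cmds.push_back(app + " --hpx:localities=" + std::to_string(n) + " -0");
    // Remaining instances: index 1 up to and including N-1.
    for (unsigned i = 1; i < n; ++i)
        cmds.push_back(app + " -" + std::to_string(i));
    return cmds;
}
```

For N = 3 this gives `app --hpx:localities=3 -0`, `app -1` and `app -2`, matching the corrected upper bound.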

23:08 jakub_golinowski has joined #ste||ar

23:51 jakub_golinowski has quit [Quit: Ex-Chat]