hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
adityaRakhecha has quit [Ping timeout: 256 seconds]
dexhunter has quit [Quit: Ping timeout (120 seconds)]
dexhunter has joined #ste||ar
hkaiser has quit [Quit: bye]
zao has quit [*.net *.split]
zao has joined #ste||ar
jaafar_ has quit [Quit: Konversation terminated!]
<jbjnr_> zao - thanks! Does that mean you actually watched it all the way through?
david_pfander has joined #ste||ar
<zao> 2/3 through thus far
<jbjnr_> don't punish yourself!
nikunj has quit [Ping timeout: 250 seconds]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 268 seconds]
nikunj97 has joined #ste||ar
hkaiser has joined #ste||ar
K-ballo has quit [Ping timeout: 246 seconds]
K-ballo has joined #ste||ar
hello has joined #ste||ar
<hello> hello, everybody
time_ has joined #ste||ar
<K-ballo> hello hello, hello
<time_> why does the official graph500 code segfault when the scale is >= 29?
<time_> ===================================================================================
<time_> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
<time_> = PID 120884 RUNNING AT server950-4
<time_> = EXIT CODE: 11
<time_> = CLEANING UP REMAINING PROCESSES
<time_> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
<time_> ===================================================================================
<time_> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
<time_> This typically refers to a problem with your application.
<time_> Please see the FAQ page for debugging suggestions
hello has quit [Remote host closed the connection]
time_ is now known as hello
<zao> See if you got a core dump, look at it with a debugger?
<hello> yes, I have done it.
<zao> Ah, it's a GSoC task to port some sort of benchmark to HPX?
<hello> warning: core file may not match specified executable file.
<hello> [New LWP 122104]
<hello> Failed to read a valid object file image from memory.
<hello> Cannot access memory at address 0x3d7ce21160
<hello> Cannot access memory at address 0x3d7ce21168
<hello> Core was generated by `./graph500_reference_bfs 30'.
<hello> Program terminated with signal SIGSEGV, Segmentation fault.
<hello> Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
<hello> #0 0x000000000040239a in fulledgehndl ()
<hello> (gdb) bt
<hello> Python Exception <type 'exceptions.ImportError'> No module named gdb.frames:
<hello> #0 0x000000000040239a in fulledgehndl ()
<hello> Backtrace stopped: Cannot access memory at address 0x7fff9401e2c8
<zao> Please use a gist or a pastesite for bulk output :)
<hello> ok, thanks for the reminder.
<hello> zao: have you met this bug?
<zao> Never touched the software.
<hello> Emmm
<zao> I wonder if it's memory-heavy, this benchmark.
<hello> do you know of any other IRC channel that might be using this software?
<zao> Don't know, but the mentors may have some clue about the software.
<zao> heller_: boop ^
<zao> What kind of compute node are you running on?
<zao> And how do you run/submit?
<heller_> hello: hmm, isn't 29 quite a big input?
<hello> yeah, 2683
<hkaiser> hello: this is not an HPX application, is it?
<hello> I am a student, just interested in it.
<heller_> hello: this is the MPI reference implementation?
<hello> yes, you are right.
<heller_> I think you are just running out of memory
<hello> no
<zao> You'll also need 128G of disk space, according to the README.
<hello> My memory is 2T
<hello> both conditions are satisfied
<zao> I should try building this.
<heller_> hello: so, scale == level, right?
<hello> no
<heller_> sorry, yes
<heller_> it's been a while ;)
<zao> Could you please list the command line you used?
<heller_> so you want to run the mini problem
<hello> make clean;make
<hello> mpirun -n 64 ./graph500_reference_bfs 30
<heller_> what's the default edge factor?
<hello> thank you all very much, lol
<hello> 16
<hkaiser> how many cores do you have on your machine?
<hello> 128
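The memory question above can be checked with a quick back-of-the-envelope calculation. This is a minimal sketch, not taken from the reference implementation: it assumes the usual Graph500 sizing of 2^scale vertices and edge_factor * 2^scale edges, and it assumes 16 bytes per stored edge (two 64-bit vertex ids). The function names are hypothetical, and the real code may pack edges differently or keep additional data structures.

```cpp
#include <cstdint>

// Number of vertices for a given scale: 2^scale.
constexpr std::uint64_t vertex_count(unsigned scale) {
    return std::uint64_t{1} << scale;
}

// Number of generated edges: edge_factor * 2^scale.
constexpr std::uint64_t edge_count(unsigned scale, std::uint64_t edge_factor) {
    return edge_factor * vertex_count(scale);
}

// Size of the raw edge list in bytes.
// ASSUMPTION: 16 bytes per edge (two 64-bit vertex ids); adjust if the
// implementation under test uses a packed representation.
constexpr std::uint64_t edge_list_bytes(unsigned scale,
                                        std::uint64_t edge_factor,
                                        std::uint64_t bytes_per_edge = 16) {
    return edge_count(scale, edge_factor) * bytes_per_edge;
}

constexpr std::uint64_t gib = std::uint64_t{1} << 30;

// Scale 30 with edge factor 16: 2^34 edges, 256 GiB of raw edge list.
static_assert(edge_count(30, 16) == (std::uint64_t{1} << 34), "2^34 edges");
static_assert(edge_list_bytes(30, 16) / gib == 256, "256 GiB in total");
// Spread evenly over the 64 ranks of the mpirun line: 4 GiB per rank.
static_assert(edge_list_bytes(30, 16) / 64 / gib == 4, "4 GiB per rank");
```

Under these assumptions the raw edge list alone is well below 2T, so this estimate by itself does not explain the crash; it only bounds one component of the memory footprint.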
<heller_> hello: I have no idea why it segfaults, to be honest. We are not the maintainer of the reference implementation ;)
<zao> You might get slightly less bad debugging output if you add `-g` to the build flags I guess.
<hello> yeah, you are right
<zao> Hrm, maybe not, that talks about per-process.
<hello> could you run this program? then I would know whether my machine's physical configuration is causing it.
<hello> yeah, I have read that website, but not solved.
<zao> Segfault.
<hello> your run?
<zao> Aye
<hello> lol
<hello> thank you
<hello> I love you.
<zao> I don't have much memory per core on my cluster, so might not be the same kind of failure as yours.
<hello> could you compile this using -g -O0?
<hello> which gcc version do you use?
<zao> 7.3.0
<zao> Oddly enough, -O0 -g fails to link.
<hello> do you know why it fails to link?
<zao> Not yet.
<hello> ???
<hello> there are some warnings...
<zao> Ah yes, all the "inline" functions need to be "static inline", probably a C standards thing.
<hello> yeah, you are right
<hello> lol
<zao> graph_generation: 506.287539 s
<zao> AML: Fatal: non power2 groupsize unsupported. Define macro PROCS_PER_NODE_NOT_POWER_OF_TWO to override
<zao> Silly me, only asked SLURM for -n64 :)
<zao> Anyway, the reference code seems a bit brittle, I hope you manage to sort it out.
<hello> I know this.
<hello> I have studied the code.
<hello> I almost understand it, and I have extended some of it.
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] K-ballo created function_ref (+1 new commit): https://github.com/STEllAR-GROUP/hpx/commit/16090779045d
<ste||ar-github> hpx/function_ref 1609077 Agustin K-ballo Berge: Implement util::function_ref
ste||ar-github has left #ste||ar [#ste||ar]
<hello> zao: how do I launch gdb on multiple MPI processes?
<zao> No idea, we tend to recommend Allinea for our users when they need to debug MPI jobs.
<hello> I use attach
<hello> CFLAGS = -Drestrict=__restrict__ -O0 -g -gdwarf-2 -g3
<hello> zao: do you know why this does not produce a core file when it crashes?
<hello> my ulimit -c is unlimited
<zao> `/proc/sys/kernel/core_pattern` might be sending the dumps somewhere else?
<hello> |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
<hello> that is the content of core_pattern
<zao> A leading pipe means that core dumps are handled by the following command, so in your case the 'abrt' tool.
<zao> Which may ignore or stow them away somewhere else on the machine.
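The check zao describes can be sketched in a few lines. This is an illustrative helper only, with hypothetical function names; the one fact it relies on is that on Linux a `core_pattern` whose first character is `|` hands the dump to the program named after the pipe instead of writing a core file directly.

```cpp
#include <fstream>
#include <string>

// Read the kernel's core_pattern setting.
// Returns an empty string if the file cannot be read (for example on a
// non-Linux system).
inline std::string read_core_pattern(
    const std::string& path = "/proc/sys/kernel/core_pattern") {
    std::ifstream in(path);
    std::string line;
    if (in) {
        std::getline(in, line);
    }
    return line;
}

// True if the pattern starts with '|', i.e. core dumps are piped to a
// helper program (such as abrt) rather than written as plain files.
inline bool core_pattern_is_piped(const std::string& pattern) {
    return !pattern.empty() && pattern.front() == '|';
}
```

Calling `core_pattern_is_piped(read_core_pattern())` on the machine in question would return true for the abrt pattern quoted above, which matches the observation that no core file appears in the working directory.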
<zao> Backtraces on my other cluster, had more bigmem nodes free there.
<hello> I can not modify the core_pattern file.
<hello> So, what should I do to generate core files?
<hello> it seems I can't modify the path
<zao> Built with GCC 6.3.0 and OpenMPI 2.0.2, crashes at the same point as GCC 7.3.0 and OpenMPI 3.1.1.
<zao> No idea what to do from here. As for your core dumps, you probably have to involve a cluster admin.
<hello> ok
<hello> thank you
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] msimberg opened pull request #3626: Use atomic in cancelable_action_client example to avoid exceptions in… (master...fix-cancelable_action_client) https://github.com/STEllAR-GROUP/hpx/pull/3626
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] K-ballo force-pushed function_ref from 1609077 to 167d29c: https://github.com/STEllAR-GROUP/hpx/commits/function_ref
<ste||ar-github> hpx/function_ref 167d29c Agustin K-ballo Berge: Implement util::function_ref
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> err, the expressions in the assertions are always evaluated?
adityaRakhecha has joined #ste||ar
ct-clmsn has joined #ste||ar
<ct-clmsn> hkaiser, did some more work on opencv_imread this weekend, the data loads but still getting blocking behavior
<adityaRakhecha> while executing one of the examples I am getting this, make: *** No rule to make target 'fibonacci_local'. Stop.
<ct-clmsn> hkaiser, the dynamictensor<std::uint8_t> is working great for storing pixel data
<adityaRakhecha> anyone?
<ct-clmsn> adityaRakhecha, not sure about that issue - you get a clean build from cmake?
<adityaRakhecha> yes. The hello world example is working but not this one explained here https://stellar-group.github.io/hpx/docs/sphinx/latest/html/examples/fibonacci_local.html
<ct-clmsn> hmm ok
<ct-clmsn> will think about it, not sure i have a fix in mind
<simbergm> adityaRakhecha: are you on 1.2.0 or master? you might need `make fibonacci_local_exe`
<adityaRakhecha> 1.2.0
<adityaRakhecha> Still the same error message
hello has quit [Read error: Connection timed out]
<simbergm> adityaRakhecha: looks like a bug... could you check `make help | grep fibonacci`?
<simbergm> if there are other fibonacci examples there but not `fibonacci_local` I'd be very grateful if you could open an issue
hello has joined #ste||ar
<heller_> K-ballo: regarding your template instantiation count ... I guess most of the enable_if instantiations could be eliminated with if constexpr
<heller_> and then the question is which one is faster ;)
<K-ballo> we already shouldn't be doing enable_if for something tag dispatching could handle
jaafar has joined #ste||ar
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] K-ballo force-pushed function_ref from 167d29c to 6a4e7be: https://github.com/STEllAR-GROUP/hpx/commits/function_ref
<ste||ar-github> hpx/function_ref 6a4e7be Agustin K-ballo Berge: Implement util::function_ref
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] K-ballo force-pushed function_ref from 6a4e7be to 6539b09: https://github.com/STEllAR-GROUP/hpx/commits/function_ref
<ste||ar-github> hpx/function_ref 6539b09 Agustin K-ballo Berge: Implement util::function_ref
ste||ar-github has left #ste||ar [#ste||ar]
<heller_> K-ballo: I don't think so. IIRC we don't use tag dispatching that often
<heller_> or we have a different definition for it ;)
<K-ballo> we might not be doing it, I haven't looked for it, but we should
<K-ballo> there's a greater gain in going from sfinae to tag dispatching than from tag dispatching to constexpr if
<heller_> ok
<K-ballo> and we actually do do it, maybe not everywhere but I have seen plenty of them
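For readers unfamiliar with the three techniques being compared, here is a minimal side-by-side sketch. It is not HPX code: the `describe_*` functions are hypothetical, the `if constexpr` variant requires C++17, and the example makes no claim about which form compiles faster, which is the point under discussion above.

```cpp
#include <string>
#include <type_traits>

// 1) SFINAE: two overloads, each enabled or disabled through enable_if.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
std::string describe_sfinae(T) { return "integral"; }

template <typename T,
          typename std::enable_if<!std::is_integral<T>::value, int>::type = 0>
std::string describe_sfinae(T) { return "other"; }

// 2) Tag dispatching: one public entry point forwards to an overload that
//    is selected by a true_type/false_type tag argument.
namespace detail {
template <typename T>
std::string describe_impl(T, std::true_type) { return "integral"; }

template <typename T>
std::string describe_impl(T, std::false_type) { return "other"; }
}  // namespace detail

template <typename T>
std::string describe_tag(T value) {
    return detail::describe_impl(value, std::is_integral<T>{});
}

// 3) if constexpr (C++17): a single function, with the untaken branch
//    discarded at compile time.
template <typename T>
std::string describe_constexpr(T) {
    if constexpr (std::is_integral<T>::value) {
        return "integral";
    } else {
        return "other";
    }
}
```

All three variants return the same result for a given argument type; they differ only in how the compiler selects the branch.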
<adityaRakhecha> simbergm, yes, there are other fibonacci examples but `fibonacci_local` is not among them, although https://stellar-group.github.io/hpx/docs/sphinx/latest/html/examples/fibonacci_local.html provides a link to download it.
<simbergm> adityaRakhecha: I know, I must've messed up the cmake config somehow
<simbergm> feel free to see if you can fix it, but I'll try to fix it tomorrow otherwise
<adityaRakhecha> I would love to work on it :D
<adityaRakhecha> Could you please suggest how I should proceed?
adityaRakhecha_ has joined #ste||ar
adityaRakhecha has quit [Quit: Page closed]
<adityaRakhecha_> Tomorrow after my college I will open the issue and work on it. Please let me do it.
<simbergm> adityaRakhecha_: yeah, no problem, I'm very happy if you look into it
<simbergm> I don't know of a debugger for cmake, so your best bet is sprinkling `message()` calls here and there
<heller_> it's a little beyond me why fibonacci_local shouldn't be there but the others are
<zao> Which one is it that has a name collision between examples and tests?
<zao> Also, do we build tests/examples by default now?
<heller_> since ever, yes
<zao> Heh.
<zao> I never remember and thus explicitly specify them.
<simbergm> adityaRakhecha_, heller_: hrm, I had an old checkout without fibonacci_local and thought it didn't work because of that
<simbergm> adityaRakhecha_: could you double-check with latest master and/or a fresh build directory
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] K-ballo opened pull request #3627: Implement util::function_ref (master...function_ref) https://github.com/STEllAR-GROUP/hpx/pull/3627
ste||ar-github has left #ste||ar [#ste||ar]
ct-clmsn has quit [Quit: Leaving]
K-ballo has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
nikunj97 has quit [Ping timeout: 246 seconds]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell created modular_assert (+3 new commits): https://github.com/STEllAR-GROUP/hpx/compare/2bc06e1ddaea^...2291aceb46a7
<ste||ar-github> hpx/modular_assert 2bc06e1 Thomas Heller: Adding HPX library "PP"...
<ste||ar-github> hpx/modular_assert 34113aa Thomas Heller: Adding HPX Library "Config"...
<ste||ar-github> hpx/modular_assert 2291ace Thomas Heller: Adding Assert module...
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell opened pull request #3628: Modular Assert (master...modular_assert) https://github.com/STEllAR-GROUP/hpx/pull/3628
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed modular_assert from 2291ace to 26f3c38: https://github.com/STEllAR-GROUP/hpx/commits/modular_assert
<ste||ar-github> hpx/modular_assert 26f3c38 Thomas Heller: Adding Assert module...
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell force-pushed modular_assert from 26f3c38 to 404e9d4: https://github.com/STEllAR-GROUP/hpx/commits/modular_assert
<ste||ar-github> hpx/modular_assert 404e9d4 Thomas Heller: Adding Assert module...
ste||ar-github has left #ste||ar [#ste||ar]
ste||ar-github has joined #ste||ar
<ste||ar-github> [hpx] sithhell pushed 1 new commit to modular_assert: https://github.com/STEllAR-GROUP/hpx/commit/c1a8db4d021d1f926258ad24088bae6c88f5a22c
<ste||ar-github> hpx/modular_assert c1a8db4 Thomas Heller: Making inspect happy...
ste||ar-github has left #ste||ar [#ste||ar]