<jbjnr_>
heller: Yorlik - the thing about docs is that if I can write docs/comments in the file where I am writing the code, I'm happy to do it. If I have to write docs in a separate file in some other docs location, then it never gets done and it rapidly goes out of date. I'm even happy to put a doc file in the same location as the header/class/other if I must. My ideal solution is something like doxygen where you document the code 'in place' but get nice output
<jbjnr_>
like sphinx.
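For illustration only (this is not taken from the HPX sources): a minimal sketch of what documenting code "in place" with Doxygen can look like. The function `clamp_to_range` and its parameters are hypothetical, chosen just to show the brief line, the extended description, and the `\param`/`\return` commands sitting right next to the code they describe.

```cpp
#include <cassert>

/// \file
/// \brief Hypothetical header, used only to illustrate in-place documentation.

/// \brief Clamp a value to the closed interval [lo, hi].
///
/// Longer, "extended" help can follow the brief line directly in the comment,
/// so it lives in the same file as the code it documents.
///
/// \param value  The value to clamp.
/// \param lo     Lower bound (inclusive).
/// \param hi     Upper bound (inclusive); must satisfy lo <= hi.
/// \return       lo if value < lo, hi if value > hi, otherwise value.
inline int clamp_to_range(int value, int lo, int hi)
{
    assert(lo <= hi);    // precondition stated in the comment above
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```

Calling `clamp_to_range(5, 0, 3)` returns the upper bound. Running Doxygen over a file like this would be expected to pick up the comment block and render it as the function's API entry.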
<Yorlik>
Yup. I tend to use doxygen more and more these days, as much as I like sphinx, especially with sphinx-autobuild. Since I learned how to use Doxygen with markdown files I'm using it more now.
<jbjnr_>
can you do extended help/docs "in place"?
K-ballo has joined #ste||ar
<zao>
Too bad you can’t do rustdoc for C++ :)
<Yorlik>
jbjnr_: It's possible - I just tried and it worked. Essentially .md files are found like any source file and put into the docs.
<jbjnr_>
good to know, thanks
<jbjnr_>
after simbergm spent ages redoing the docs, I doubt he'd want to go back to doxygen...
<Yorlik>
Sphinx clearly looks better for normal text, but imo the automated API docs are not good - from a functionality and from a rendering point of view. Browsing and searching with Doxygen is much better.
<Yorlik>
I think there are arguments for each of Sphinx and Doxygen. But having two tools instead of just one is kinda cumbersome - for writers and users too.
<Yorlik>
When I use sphinx, I do it in a separate IDE - VS Code - and run sphinx-autobuild in a terminal, so every edit gets detected, the files re-rendered and the browser auto-updated. I'm not sure there is something comparable on the Doxygen side.
<Yorlik>
For me that would actually be a big plus for writing more longish docs.
<simbergm>
jbjnr_ and the rest: I don't mind better solutions for the api docs at all
<simbergm>
there's just so much non-api documentation that we shouldn't lose and I don't know how easy that is to integrate with a pure doxygen approach
aserio has joined #ste||ar
nikunj has quit [Ping timeout: 276 seconds]
rori has joined #ste||ar
hkaiser has joined #ste||ar
aserio has quit [Ping timeout: 264 seconds]
aserio has joined #ste||ar
K-ballo1 has joined #ste||ar
K-ballo has quit [Ping timeout: 245 seconds]
K-ballo1 is now known as K-ballo
<Yorlik>
simbergm: yt?
<heller>
hkaiser: Holding lock while suspending
<heller>
hkaiser: why do you ask?
<heller>
Don't remember if it was the allocator or the thread data ctor
nikunj has joined #ste||ar
<simbergm>
Yorlik: on and off, but ask away, I'll reply later
<Yorlik>
I am working on a PR for some docs, just wondering where I should put stuff that is not in-source.
<heller>
Yorlik: the docs dir is a good place to start
<hkaiser>
heller: just wondering
<Yorlik>
I have a writeup which explains the general mechanics of component and action declaration, definition/registration and what they do.
<hkaiser>
we see a strange segfault on power in the thread_queue while creating new threads, so I was trying to understand things
<heller>
Hmmm
<heller>
Well, remove it, ignore locks and try again
<heller>
Stack overflow maybe?
<Yorlik>
simbergm: I want to put macro specific explanation into the source files as Doxygen comments, but an introductory writeup to get rid of this "black magic" feeling in the general docs
<Yorlik>
We could also make it so that I first finish it and then ask you for review before creating a PR.
<heller>
Yorlik: which macros?
<Yorlik>
The general Action/Component registration/declaration/definition thingies
<Yorlik>
There are gaps in the doxy comments
<heller>
Ok, we already have a section for it
<heller>
For the general action component foo
<simbergm>
Yorlik: it sounds like that file I linked to above could be a good place, but it depends on what specifically you have
<simbergm>
also, we'd love to know what you think... if you had looked for that information in the documentation, where would you have expected to find it?
<Yorlik>
I'd suggest we maybe do a voice session about the docs once you have time - there's so much to say and discuss.
<heller>
weilewei: so, please go to frame 9 and print thrd
<weilewei>
how should I go to frame 9?
<heller>
up 9, iirc
<heller>
Or hit up 9 times
<weilewei>
you mean get call stacks? I am not familiar with what you are suggesting
<heller>
You should end up in thread_queue.hpp line 228
<heller>
In your gdb prompt
<hkaiser>
weilewei: run in gdb until it stops at the segfault
<hkaiser>
issue commands: up 9 and p trhd
<heller>
Type up and hit enter until you get to the mentioned line
<weilewei>
ok, let me try
weilewei64 has joined #ste||ar
<weilewei64>
[Inferior 1 (process 108443) exited with code 01]
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.6.ppc64le libxml2-2.9.1-6.el7_2.3.ppc64le xz-libs-5.2.2-1.el7.ppc64le
(gdb) up 9
No stack.
(gdb) p trhd
No symbol "trhd" in current context.
<heller>
Did you hit the segfault?
<weilewei64>
so after i type up 9, it says no stack
<heller>
It says the process exited...
<heller>
What was printed before?
<weilewei64>
let me try it again, I forgot to source things
<weilewei64>
(gdb) up 9
#9 0x000020000110632c in hpx::threads::policies::thread_queue<std::mutex, hpx::threads::policies::lockfree_fifo, hpx::threads::policies::lockfree_fifo, hpx::threads::policies::lockfree_lifo>::add_new (this=0x20001a0c0000, add_count=997, addfrom=0x20001a0c0000, lk=..., steal=false) at
<weilewei64>
(gdb) p &thrd
$2 = (hpx::threads::thread_id_type *) 0x200019bfc1a8
<heller>
Hmmm
<heller>
Does this only happen with your application?
<weilewei64>
I am not sure about other applications, as I am only running this application
<weilewei64>
the DCA++ project on Summit
<heller>
How certain are you that you built hpx and your application against the same version of gcc as the one you have sourced right now?
<heller>
Do you know in which phase of your program this segfault happens?
<weilewei64>
They all use gcc/8.1.1 when I am building my application and hpx, otherwise, hpx will report that the compilers are not consistent
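One way to cross-check the "same gcc/stdlib" question is to have each build report the toolchain it was compiled with. The sketch below is illustrative only and not part of HPX or DCA++: `toolchain_description` is a hypothetical helper, and it relies on the predefined macros `__VERSION__` (GCC and Clang), `__GLIBCXX__` (libstdc++) and `_LIBCPP_VERSION` (libc++), which are not available on every compiler.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of HPX): describe the compiler and the
// standard library that compiled this translation unit.
inline std::string toolchain_description()
{
    std::ostringstream os;
#if defined(__VERSION__)
    os << "compiler: " << __VERSION__;    // version string on GCC/Clang
#else
    os << "compiler: unknown";
#endif
#if defined(__GLIBCXX__)
    os << ", libstdc++ date stamp: " << __GLIBCXX__;
#elif defined(_LIBCPP_VERSION)
    os << ", libc++ version: " << _LIBCPP_VERSION;
#endif
    return os.str();
}
```

Compiling this helper into both the library and the application, then printing the two strings at startup, would show whether they differ. Note that it only reports what was used at compile time; it says nothing about which shared libraries are actually loaded at run time.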
<weilewei64>
how many phases do I have in a program? My understanding is that the segfault happens somewhere in the middle of the program
<heller>
Ok
<heller>
Does it also happen when you say --hpx:threads=1 when starting the program?
<weilewei64>
Yes, that's my command --hpx:threads=1
<weilewei64>
when getting this segfault
<heller>
Ok, so it can't be a race
<Yorlik>
I didn't find any "doc" targets in the HPX CMake- or VS- targets after building the CMake cache (using Ninja and the VS Generator). Do they just not exist? OFC - I can just build the docs manually - just wanna know.
<weilewei64>
my whole command is jsrun -a 1 -n 1 -c 8 --smpiargs=none gdb --args ./dca_sp_DCA+_thread_test --hpx:threads=1
<heller>
Interesting
<heller>
Gotta run soon...
<weilewei64>
Or if you have any other small applications that I can try, I can run them to see if I have the same segfault
<heller>
weilewei64: can you run with a larger stack size?
<hkaiser>
heller: this is on the system stack
<weilewei64>
what command and number should I use?
<weilewei64>
hpx.stacks.small_size=?
<hkaiser>
weilewei64: I don't think this would change the stack in this context
<heller>
hkaiser: what if a hpx task with a corrupted stack corrupted the thread_map?
<hkaiser>
ahh, ok - could be
<heller>
Anyways, the only explanation I have is that hpx and the application were built against a different gcc/stdlib than what you currently have in your environment
<heller>
I'm on the run now, will check back later
<weilewei64>
thanks! I am doing double check
aserio has joined #ste||ar
<heller>
weilewei64: can you show the entire output of the segfault when but running in gdb?
<weilewei64>
Ok, I have to hit Enter lots of times, let me try
<weilewei64>
So, when I am building hpx, I turn off networking and build without mpi, the compiler is gcc/8.1.1
<weilewei64>
when I am building DCA++, it involves spectrum-mpi
<weilewei64>
//CXX compiler
CMAKE_CXX_COMPILER:FILEPATH=/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-8.1.1/spectrum-mpi-10.3.0.1-20190611-55ygkz53evhcwy3txeis32gc3kzu7wy6/bin/mpicxx
//A wrapper around 'ar' adding the appropriate '--plugin' option
// for the GCC
<Yorlik>
Does HPX_REGISTER_ACTION_DECLARATION have to use the same action name as HPX_DEFINE_PLAIN_ACTION? Or can I assign an arbitrary new action name for remote calls, as long as it is compatible with the serialization identifiers?
<Yorlik>
Especially: Would this example be correct?:
<Yorlik>
namespace app
<Yorlik>
{
<Yorlik>
void some_global_function(double d)
<Yorlik>
{
<Yorlik>
cout << d;
<Yorlik>
}
<Yorlik>
// This will define the action type 'app::some_global_action' which
<Yorlik>
// represents the function 'app::some_global_function'.
<weilewei12>
hkaiser so, I am testing dca_sp_DCA+_thread_test, version 1: with essl, no cuda, with netlib-lapack, result: the program hangs forever if I run it under gdb, or segfaults if I do not use gdb. Any suggestions for how to get call stacks when it hangs?
<weilewei12>
hkaiser version 2: without essl, no cuda, with netlib-lapack, result: passed
<weilewei12>
hkaiser for hpx stream_test, when I use --hpx:threads>=17, the test fails only on the computed result, and the whole program finishes its run (no segfault)