aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<github>
hpx/fixing_2699 1616e86 Hartmut Kaiser: Fixing mismatch and reduce algorithms
hkaiser has quit [Ping timeout: 240 seconds]
hkaiser has joined #ste||ar
parsa has quit [Read error: Connection reset by peer]
parsa| has joined #ste||ar
hkaiser has quit [Quit: bye]
ajaivgeorge has joined #ste||ar
ajaivgeorge_ has quit [Read error: Connection reset by peer]
ajaivgeorge has quit [Ping timeout: 268 seconds]
K-ballo has quit [Quit: K-ballo]
parsa| has quit [Quit: Zzzzzzzzzzzz]
EverYoung has joined #ste||ar
parsa has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
vamatya has joined #ste||ar
pree has joined #ste||ar
patg has joined #ste||ar
<ABresting>
heller : I tried to add my wrapper to the cmake file but I am facing some problems with it. Firstly, find_package can only search for libraries which use cmake, so it does not detect libsigsegv. And secondly, the path cannot be used in the wrapper because the path enters the macros at runtime while the path is required by the preprocessor, so it gives a compilation error.
<heller>
what wrapper are you adding to cmake?
<heller>
I can not follow your first and second statements. those make no sense. that's not how it works
<heller>
ABresting: so what is it you are having problems with?
<ABresting>
In order to integrate stack overflow detection into HPX I am using a flag which the HPX init file will read. If it is on, a header file will be added which uses libsigsegv to detect stack overflow. Then it will use find_package to find libsigsegv (I am having problems with it). Overall, I am trying to integrate the header file into HPX so that it uses find_package to find libsigsegv and use it to detect stack overflow.
<ABresting>
I am trying to use macros to include libsigsegv in the header file as a variable path, but since it is interpreted by the preprocessor it gives a compilation error.
<heller>
you seem to have a misunderstanding of how the C++ compilation process works...
parsa has joined #ste||ar
<heller>
you can't just dynamically add headers based on a dynamically set flag
<heller>
you pass the path where to find the header to the compiler, you include the header in your source code
<heller>
so you add a flag to the cmake scripts to optionally enable stack overflow detection
<heller>
(similar to how the HWLOC stuff I just linked works)
<heller>
then, if the flag is enabled, you use find_package to get the include directories and libraries to link against, and invoke the cmake commands such that they are passed to the compiler/linker
<heller>
now, in the code, you use a preprocessor define to check whether the feature is enabled or not; if it is, you include the necessary stuff and get cracking.
<heller>
add a command line parameter to dynamically enable/disable it for example
<heller>
and implement the stack overflow detection.
<heller>
ABresting: just out of curiosity, is this the first non-trivial C++ program you have encountered? Non-trivial in the sense of being a larger system using non-stdlib headers and such
<heller>
and multiple TUs
<heller>
ABresting: the hwloc support is a good example for your use case ... just grep the source for HPX_HAVE_HWLOC to see how it is done for that case
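For illustration, here is a minimal sketch of the pattern heller describes, loosely modeled on how the HWLOC support is guarded. HPX_HAVE_STACKOVERFLOW_DETECTION, install_stack_overflow_detection and the handler are hypothetical names, and the libsigsegv calls are from memory and should be checked against its documentation:

```cpp
// Hypothetical sketch: HPX_HAVE_STACKOVERFLOW_DETECTION would be defined by the
// build system when the corresponding cmake option is enabled (analogous to
// HPX_HAVE_HWLOC); the include paths/libraries come from find_package.
#if defined(HPX_HAVE_STACKOVERFLOW_DETECTION)
#include <sigsegv.h>   // GNU libsigsegv

#include <cstdlib>
#include <iostream>

namespace {
    // invoked by libsigsegv when a stack overflow is detected
    void stack_overflow_handler(int emergency, stackoverflow_context_t)
    {
        std::cerr << "stack overflow detected (emergency=" << emergency << ")\n";
        std::abort();
    }

    char extra_stack[16384];   // alternate stack for the handler to run on
}

inline void install_stack_overflow_detection()
{
    stackoverflow_install_handler(
        &stack_overflow_handler, extra_stack, sizeof(extra_stack));
}
#else
// feature disabled at configure time: compile to a no-op
inline void install_stack_overflow_detection() {}
#endif
```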
patg has quit [Quit: See you later]
zbyerly_ has quit [Ping timeout: 260 seconds]
EverYoung has joined #ste||ar
bikineev has quit [Remote host closed the connection]
EverYoung has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
Matombo has quit [Remote host closed the connection]
EverYoung has quit [Ping timeout: 276 seconds]
bikineev has joined #ste||ar
Matombo has joined #ste||ar
Matombo has quit [Ping timeout: 240 seconds]
<github>
[hpx] gentryx opened pull request #2720: reorder forward declarations to get rid of C++14-only auto return types (master...master) https://git.io/vQCfS
denis_blank has joined #ste||ar
hkaiser has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 246 seconds]
<hkaiser>
jbjnr: see hpx-users, pls
<jbjnr>
got it
parsa has joined #ste||ar
K-ballo has joined #ste||ar
parsa has quit [Ping timeout: 246 seconds]
<jbjnr>
hkaiser: if we have a parcelport that handles booting itself, do we need the service threads? (Are they used only for initial tcp stuff - or are they used for other things?)
<hkaiser>
the 2 tcp threads are used just for that
<hkaiser>
we need the timer threads, though
<jbjnr>
when are the timer threads used?
<parsa[w]>
heller: *beeep*
<hkaiser>
for anything requiring a timer
<jbjnr>
great answer!
<hkaiser>
timed suspension etc.
<jbjnr>
who uses that?
<hkaiser>
hpx
<jbjnr>
it's like getting blood out of a stone ...
<heller>
parsa[w]: and there is nothing wrong with implementing it in terms of distribution policies.
<hkaiser>
jbjnr: do you need a test to verify things?
<parsa[w]>
heller: okay so the idea is to provide something that hands granular control over data distribution to the application?
<heller>
parsa[w]: yup.
<jbjnr>
hkaiser: no. I have been going through the thread pool setup looking at everything and wondered about what some of these things are used for.
<jbjnr>
There's an opportunity to add a PARCELPORT-specific cmake setting and allow, for example, the service threads to be skipped when building on daint
<jbjnr>
if the user wants to compile with certain features disabled and others optimized for their configuration
<heller>
those threads don't hurt, do they?
<jbjnr>
they don't hurt, they just use bits of API that we are cleaning up with the resource partitioner changes
eschnett has quit [Quit: eschnett]
<heller>
one step at a time?
<heller>
jbjnr: do you have your results of your latest and greatest network storage benchmarks handy?
<jbjnr>
that is daint - going up to about 4TB/s on 2K nodes (I think)
<heller>
gracias amigo
jakemp has joined #ste||ar
<Vir>
david_pfander: I was trying whatever it takes to get GCC 6.3 to inline the sqr function in octotiger's defs.hpp. The only thing that works is [[gnu::always_inline]] on the sqr function itself. frustrating...
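For reference, the attribute Vir is referring to can be spelled like this in GCC; the sqr helper below is only illustrative and not octotiger's actual defs.hpp code:

```cpp
// [[gnu::always_inline]] is the C++11 attribute spelling of
// __attribute__((always_inline)); it forces GCC to inline the function
template <typename T>
[[gnu::always_inline]] inline T sqr(T x)
{
    return x * x;
}
```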
<david_pfander>
Vir: thanks for trying, I'll try to simply remove it entirely
<david_pfander>
Vir: BTW I noticed that performance improved somewhat with your branch
<Vir>
david_pfander: why not use force/always_inline on it?
<Vir>
david_pfander: did you use the latest one?
<Vir>
i.e. mkretz/improve_codegen
<david_pfander>
Vir: I added the mask_store stuff and implemented the missing overload for bit_shift_right with a second parameter of type int
<david_pfander>
yes
<Vir>
good
<david_pfander>
merged that a few minutes ago
<Vir>
ah, to the stellar fork?
<david_pfander>
yeah, I'm pushing to the branch pfandedd_inlining_AVX512 of the stellar fork
<Vir>
nice, I'm taking a quick look atm
<hkaiser>
jbjnr: the timed suspension is also used for periodic performance counters and things like parcel-coalescing
<hkaiser>
so it's all over the place, essentially
<Vir>
david_pfander: so you took the shortcut of making masked_store work only for double?
<Vir>
david_pfander: right, srai needs a constant expression for the shift. There's an interesting optimization opportunity using __builtin_constant_p here
bikineev has quit [Ping timeout: 240 seconds]
hkaiser_ has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser_ has quit [Client Quit]
<david_pfander>
Vir: could you give me a hint where I can find the static_datapar_cast<double> (from int32_t vector) overload?
<Vir>
depends on the ABI tag it belongs to, so detail/<abi>.h
<david_pfander>
Vir: are there fundamental overloads missing, because I can only find static_datapar_cast in synopsis.h, and that looks like a scalar cast
<Vir>
david_pfander: yes, it's just a stub implementation
<Vir>
david_pfander: I have not considered whether I want a dispatch via the impl classes or just plain overloading of the function
<david_pfander>
Vir: BTW, an option to warn whenever a stub is used would be very helpful
<Vir>
david_pfander: true, also for me to find them :-)
<david_pfander>
yeah :)
ajaivgeorge has joined #ste||ar
<jbjnr>
heller: affinity domains - does anyone use them or need them? It seems we support binding a thread to a domain rather than a PU, but I can't imagine anyone actually doing that.
<ABresting>
heller: Sorry heller, having a rough day, in the middle of a relocation :/
<heller>
jbjnr: the parcelport threads do that ;)
<ABresting>
yes, I got what you said, very helpful :)
EverYoung has joined #ste||ar
<jbjnr>
balls
aserio has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
<ABresting>
heller: I have learned to use find_package(), find_path() with HINTS and PATH_SUFFIXES, and include() to find and include libsigsegv from a cmake file, just like FindHwloc.cmake does. Now I am stuck with the HINTS in find_path, like https://github.com/STEllAR-GROUP/hpx/blob/master/cmake/FindHwloc.cmake#L15 - this is defined in CMakeCache.txt when we compile it. So how do I get such HINTS for libsigsegv?
<github>
[hpx] shoshijak opened pull request #2721: Mind map (master...mind-map) https://git.io/vQClj
<heller>
they are passed in from the user
<ABresting>
heller: this is the first time I am using cmake for some external library
<ABresting>
so the HINTS are passed in from the user? how?
<zao>
find_blargh looks in the CMAKE_PREFIX_PATH list automatically. You can also instruct it via HINTS to look in additional places, like our convention ${THING_ROOT}.
<zao>
So cmake .. -DHWLOC_ROOT=/opt/bleh
<pree>
oh sorry It's my keyboard mistake
<pree>
Extremely sorry
<zao>
:D
<pree>
zao : )
<aserio>
mcopik: yt?
<aserio>
diehlpk_work: yt?
<diehlpk_work>
Yes
Matombo has joined #ste||ar
hkaiser has joined #ste||ar
bikineev has joined #ste||ar
<diehlpk_work>
ajaivgeorge, Could you maybe do the evaluation today?
<ajaivgeorge>
Yep, doing it now.
<diehlpk_work>
taeguk[m], ABresting
<ABresting>
sorry was busy with relocation, will do it today :)
Matombo has quit [Remote host closed the connection]
<jbjnr>
heller: why were you asking about the network storage test? writing proposals?
<heller>
jbjnr: nope, someone asked. he is interested in comparing it with grpc
<jbjnr>
I think it'll be safe to assume we will leave them in the water.
<heller>
yes, currently, it is 0.3M vs 50M (messages per second)
<heller>
whereas the 0.3M is on an unspecified network and probably another benchmark ;)
<heller>
so apples and oranges
<jbjnr>
they use an IDL too, so the message types must be restricted slightly - no general purpose async type calls
<heller>
yes
<pree>
Hi .. I need help deciding between the "generic domains" vs "mdspan" approaches; parsa[w] told me it only makes sense to use domains when we have an application that uses non-integer indexes
<pree>
^^ heller
<pree>
what to do ?
<heller>
convince parsa he is wrong?
<parsa[w]>
heller: what do you want to use domains with?
<heller>
as said: to decide how to partition the data
<pree>
heller : " ? " makes no sense to me sorry
<heller>
you asked me what to do, I made a suggestion using a rhetorical question
<pree>
Oh ! okay that is new to me
jbjnr has quit [Quit: ChatZilla 0.9.93 [Firefox 54.0/20170608105825]]
<heller>
parsa[w]: imagine a simple array: it does not always make sense to have a block-cyclic distribution, you sometimes want something else. You still need a way to map from your index to the partition where your data is located.
jbjnr has joined #ste||ar
<heller>
parsa[w]: and you need to map your chunks to localities
<heller>
Chapel uses the concepts of domains and domain maps; I think it is a very powerful abstraction, which is why I put up the idea
<heller>
however, it makes sense to make baby steps
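As a rough illustration of the domain-map idea discussed above (not an existing HPX or Chapel interface; all names here are hypothetical): a 1D block distribution that maps a global index to the partition holding it, and partitions to locality ids:

```cpp
#include <cstddef>
#include <vector>

// hypothetical sketch of a "domain map" for a 1D block-distributed array
struct block_domain_map
{
    std::size_t size;                              // number of global indices
    std::size_t num_partitions;                    // chunks the data is split into
    std::vector<std::size_t> partition_locality;   // partition id -> locality id

    // index -> partition where the element lives
    std::size_t partition_of(std::size_t index) const
    {
        std::size_t block = (size + num_partitions - 1) / num_partitions;
        return index / block;
    }

    // partition -> locality holding it
    std::size_t locality_of(std::size_t partition) const
    {
        return partition_locality[partition];
    }
};
```

A different distribution (cyclic, irregular, ...) would swap out just these two mappings, which is the flexibility being argued for here.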
<pree>
heller : yes ! Absolutely
<hkaiser>
heller, pree: we agreed many times over that pree should start with domain maps
<hkaiser>
which are equivalent to distribution policies in hpx
<hkaiser>
once this is done we can start thinking on how to integrate (index) domains into the picture
<heller>
yes
<hkaiser>
pree: so what's your problem?
<hkaiser>
also, distribution policies in hpx are used in two semantically distinct scenarios
<hkaiser>
a) create N component instances and somehow distribute those over localities
<pree>
hkaiser : Nothing, just to confirm - parsa[w] told me to go with integer indices first, that is "mdspan"
<heller>
hkaiser: parsa[w] asked me about the rationale for domains and domain maps, I complied.
<hkaiser>
b) use distribution policies to make partitioned_vector place its partitions
<hkaiser>
pree: we talked about this more than once
<hkaiser>
please start with domain maps
<heller>
i agree
<heller>
but yes, this has been discussed multiple times now
ajaivgeorge has quit [Ping timeout: 260 seconds]
<pree>
hkaiser heller : Done
ajaivgeorge has joined #ste||ar
<heller>
hkaiser: so yes, I cannot reproduce Dominic's error at all. I asked for his exact environment/build scripts. No idea what's wrong
<hkaiser>
heller: ok, good - so all works for you?
<heller>
well, there is one bad commit which leads to a hang. I told Dominic
<hkaiser>
can you provide Dominic with the build instructions?
<heller>
I am more interested in his... I want to know what's wrong as well
<heller>
david_pfander: btw, I saw a pretty significant improvement of octotigers performance
Matombo has joined #ste||ar
<hkaiser>
heller: sure, I however doubt he can give you a decent description of what he's doing
<heller>
without this, we can't avoid this happening again
<hkaiser>
I know
<heller>
I have a suspicion, but need some information from him ...
<hkaiser>
heller: he's been trying to get things working for over a month now...
<hkaiser>
well, please talk to him directly, cc me if needed
<heller>
david_pfander: 47s vs 34s (the first number is the result from operation bell so far, the second one is from yesterday)
<david_pfander>
heller: what did you change/improve?
<heller>
38% improvement ... did you change something already?
<heller>
nothing, checked out master
mbremer has joined #ste||ar
Matombo has quit [Ping timeout: 260 seconds]
<heller>
hkaiser: sure, I spent quite some time trying to reproduce his errors, and bisecting whatever led to the hang
<david_pfander>
heller: Vir and I discovered quite a few codegen problems in Vc last week; Vir made some changes in his branch, that might be a reason
<david_pfander>
heller: my stuff is still in a branch
<david_pfander>
heller: Which Vc variant did you use?
<heller>
I didn't update Vc at all
<heller>
I was using ours
<heller>
master
<heller>
b1c442df6e36958cd0f155a16ca61fce8ea8a44d
<david_pfander>
heller: In that case, I don't have a clue. Our Vc isn't even using AVX512...
<heller>
he ;)
<heller>
might be something dominic changed then
<heller>
david_pfander: just looks like all in all, we are starting to get to something ;)
<heller>
distributed performance didn't suffer from those speedups either
<heller>
instead of 38%, we see a 17% improvement from the sample I ran
aserio has quit [Ping timeout: 246 seconds]
mbremer has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
aserio has joined #ste||ar
<david_pfander>
Vir: can I create an int32_t 4-wide variable with the datapar avx abi? I'm getting an 8-wide variable with datapar<int64_t, Vc::datapar_abi::avx>
<david_pfander>
(without mixing sse and avx instructions)
<david_pfander>
should have been datapar<int32_t, Vc::datapar_abi::avx>
hkaiser has quit [Quit: bye]
denis_blank has quit [Quit: denis_blank]
zbyerly_ has joined #ste||ar
hkaiser has joined #ste||ar
zbyerly_ has quit [Client Quit]
Matombo has quit [Ping timeout: 260 seconds]
zbyerly_ has joined #ste||ar
EverYoung has joined #ste||ar
parsa has joined #ste||ar
<mcopik>
aserio: yes?
EverYoung has quit [Ping timeout: 276 seconds]
akheir has joined #ste||ar
Matombo has joined #ste||ar
<aserio>
mcopik: GSoC evals :p
<mcopik>
aserio: I know, thanks for the reminder though
<aserio>
lol
akheir has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
parsa has quit [Quit: Zzzzzzzzzzzz]
aserio has quit [Ping timeout: 276 seconds]
hkaiser has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
ajaivgeorge_ has joined #ste||ar
<K-ballo>
hkaiser: are you going to the toronto meeting?
akheir has joined #ste||ar
<hkaiser>
K-ballo: yes
ajaivgeorge has quit [Ping timeout: 255 seconds]
david_pfander has quit [Ping timeout: 276 seconds]
vamatya has joined #ste||ar
pree has joined #ste||ar
Matombo has quit [Ping timeout: 276 seconds]
zbyerly_ has quit [Ping timeout: 240 seconds]
bikineev has quit [Remote host closed the connection]
aserio has joined #ste||ar
denis_blank has joined #ste||ar
bikineev has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
aserio has quit [Ping timeout: 255 seconds]
aserio has joined #ste||ar
<zao>
Watching wash[m]'s presentation on C++17 from C++Now... I didn't know half of those features existed :)
<zao>
Isn't it nice that you'll be able to use them in HPX in as little as a decade :)
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
Matombo has quit [Remote host closed the connection]
<K-ballo>
test '1.0 <= dif.count()' failed in function 'int hpx_main()': '1' > '6.1319e-05'
<K-ballo>
seems like it does
<ajaivgeorge_>
ok thank you
denis_blank has quit [Ping timeout: 276 seconds]
denis_blank2 has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
aserio has quit [Ping timeout: 255 seconds]
denis_blank2 has quit [Quit: denis_blank2]
denis_blank has joined #ste||ar
pree has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
ArashA has joined #ste||ar
ArashA has quit [Client Quit]
parsa has joined #ste||ar
ArashA has joined #ste||ar
ArashA has quit [Client Quit]
ArashA has joined #ste||ar
parsa has quit [Client Quit]
ArashA has quit [Client Quit]
parsa has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
aserio has joined #ste||ar
pree has quit [Quit: AaBbCc]
hkaiser has joined #ste||ar
<hkaiser>
zbyerly: looks like a genuine bug
<hkaiser>
pls create a ticket
<zbyerly>
hkaiser, okay will do
<heller>
what are you running into?
<heller>
hkaiser: btw, got a reply from Andrew Sutton, waiting on Herb Sutter to green-light the opening of the repository, the whole work is funded by Microsoft
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
Matombo has quit [Remote host closed the connection]
zbyerly_ has joined #ste||ar
bikineev has quit [Read error: Connection reset by peer]
bikineev has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
<mbremer>
hkaiser: Just saw this. That's great news. I think circleCI just timed out, but we'll probably just work with the pull request directly then
<hkaiser>
mbremer: I've restarted circleci
<hkaiser>
mbremer: and yes, the branch should work for you
<hkaiser>
mbremer: you stumbled over a genuine bug, nobody had used register_as with channels yet
<hkaiser>
you might want to look at the test to see how to use it properly
<mbremer>
I saw that. That's a big help. It's also nice that connect_to has been tested as well now!
<hkaiser>
the only (invisible) caveat is that the channel instance which you used to call register_as on will unregister the object during destruction
<hkaiser>
this might be surprising if the other channels have not called connect_to yet
<hkaiser>
also, if nobody has called connect_to yet, this would also free the actual channel
<mbremer>
Does that then impact migration?
<hkaiser>
mbremer: the other thing to note (not visible in the test either) is that register_as returns a future which becomes ready once the registration was successful
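A rough usage sketch of the semantics hkaiser describes, loosely based on the HPX channel API of that time; the channel name is made up and the exact signatures should be checked against the test he mentions:

```cpp
#include <hpx/hpx_main.hpp>
#include <hpx/include/lcos.hpp>

#include <iostream>

int main()
{
    // producer side: create the channel here and register it under a symbolic
    // name; per the discussion above, register_as yields a future that becomes
    // ready once the registration succeeded
    hpx::lcos::channel<int> producer(hpx::find_here());
    producer.register_as("my_channel");        // name is illustrative

    // consumer side (normally running on another locality): attach to the name
    hpx::lcos::channel<int> consumer;
    consumer.connect_to("my_channel");

    producer.set(42);
    std::cout << consumer.get(hpx::launch::sync) << "\n";

    // caveat from the discussion: when 'producer' (the instance register_as was
    // called on) is destroyed it unregisters the name, and if nobody has called
    // connect_to yet this also frees the underlying channel
    return 0;
}
```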
zbyerly_ has quit [Remote host closed the connection]
<hkaiser>
mbremer: I don't think this will impact migration
zbyerly_ has joined #ste||ar
<mbremer>
I would hope so :) So does migration just "move" the channel instance then?
<hkaiser>
mbremer: depends on what you migrate ;)
<hkaiser>
if you migrate the mesh partition, the channels would need to be moved explicitly as well
<hkaiser>
otherwise both ends of the channel object would have to be accessed remotely
<mbremer>
sure, I guess I'm specifically curious whether, when you migrate something, you call the destructor of that special instance and unregister the channel