weilewei has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
Yorlik has quit [Ping timeout: 260 seconds]
Yorlik has joined #ste||ar
Yorlik has quit [Ping timeout: 272 seconds]
nan111 has quit [Remote host closed the connection]
bita_ has quit [Ping timeout: 260 seconds]
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
<kordejong>
Does anybody know what could be wrong in my code when this assertion fires: `Assertion 'split_gids_.empty()' failed: HPX(assertion_failure)` The stack trace refers to `hpx/runtime/serialization/detail/preprocess_gid_types.hpp`, line 47. I am using HPX 1.4.1.
mcopik has joined #ste||ar
mcopik has quit [Quit: Leaving]
<ms[m]>
heller: any idea? ^
<heller1>
Kor de Jong: ui, that shouldn't happen
<heller1>
Kor de Jong: do you happen to have a backtrace?
<kordejong>
This is with a run on a single cluster node, using a locality per numa node (8). When using a single locality on one numa node it works fine. I have tried lots of things, but could use some direction as to what to try next.
<heller1>
yeah, this has to be a multi locality run
<heller1>
so it is related to some call to `hpx::async`, where one of the arguments is either a client, or a id_type directly
<heller1>
can you run with `--hpx:attach-debugger=exception`?
<kordejong>
Indeed, I send component clients to other localities
<heller1>
then you can attach to the exception once this is happening
<kordejong>
OK, I will try that. I will be in a meeting in a few minutes. Will pick up after that and report the results. Thanks so far.
<kordejong>
<heller1 "then you can attach to the excep"> I see this message on the compute node: `PID: 3009 on gpu007.cluster ready for attaching debugger. Once attached set i = 1 and continue` Attaching to the process using `gdb -p 3009` works. What does `set i = 1` mean? gdb does not understand that syntax. Continuing anyway seems to hang. The cores on one of the numa nodes seem to idle. The other ones are busy. I am missing something.
<heller1>
yes
<heller1>
once you attached
<heller1>
gtg
<ms[m]>
Kor de Jong: you might have to change to the thread that threw the exception first
<ms[m]>
basically `--hpx:attach-debugger=exception` starts a busy loop when an exception is thrown that can be exited by changing the value of that i variable
<ms[m]>
once you're on the right thread a `bt` might be useful since the stacktrace above doesn't show anything above the preprocess_gid_types destructor
<ms[m]>
I most likely won't be able to help you with the actual problem though since I know little about that part of the codebase...
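The busy-loop mechanism described above can be sketched roughly as follows. This is not HPX's actual implementation: `wait_for_debugger` and its `max_polls` bound are hypothetical names, added so that the sketch terminates on its own. In gdb, the usual way to assign to a program variable is `set var i = 1`, after switching to the spinning thread and frame with `thread <n>` and `up`; a bare `set i = 1` may be parsed as a gdb setting rather than as a variable assignment.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper, not HPX code: spin until a debugger (or anything else)
// changes `flag` to a non-zero value. `volatile` keeps the compiler from
// caching the value, so a write made from gdb is seen by the loop.
// Returns true if the flag was set, false if `max_polls` ran out first.
inline bool wait_for_debugger(volatile int& flag, int max_polls,
    std::chrono::milliseconds interval = std::chrono::milliseconds(10))
{
    assert(max_polls >= 0);
    for (int n = 0; n < max_polls; ++n)
    {
        if (flag != 0)
            return true;    // released, e.g. via `set var i = 1` in gdb
        std::this_thread::sleep_for(interval);
    }
    return flag != 0;
}
```

In a real attach scenario the loop would presumably have no bound, which is why continuing in gdb without changing the variable looks like a hang.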
<gonidelis[m]>
`iter_sent.hpp` ? Any help on how could I test the trait with a vector for example ?
<gonidelis[m]>
K-ballo: hkaiser
neill[m] has joined #ste||ar
<gonidelis[m]>
well... just removed the initializations (`{0}` `{100}`) and the test both compiles and passes :shrug:
<gonidelis[m]>
Do you think that's correct?
<gonidelis[m]>
+ i still would like to put some other test cases
hkaiser has joined #ste||ar
<ms[m]>
hkaiser: heller jbjnr rori_[m] sorry, pycicle is still not quite happy... my folder cleanup isn't working and I keep going over quota on scratch
<ms[m]>
I just need some time to monitor it and figure out where it's failing
<hkaiser>
ms[m]: take your time
<ms[m]>
it's just ci... ;)
<hkaiser>
ms[m]: you're trying to go too fast - I know you're impatient, but hey - nothing is urging you to go at THAT speed
<ms[m]>
just myself...
<hkaiser>
yah
<hkaiser>
understood
rtohid has joined #ste||ar
<hkaiser>
ms[m]: I'll fix #4678 today, could we merge that soon? playing catch-up there for a while now
<hkaiser>
it's fairly small, so shouldn't break any of your ongoing things
<hkaiser>
same for #4693
<ms[m]>
hkaiser: yep, no problem
<ms[m]>
I just wanted to get the renamings in, things should be easier for a while again now
<hkaiser>
good, thanks for working on this!
<ms[m]>
thanks for not chasing us out of the room!
<rori>
yes thanks for merging the renaming things !
<hkaiser>
also, could we merge the naming fixes asap?
<hkaiser>
#4710
<hkaiser>
I'd need this to get Phylanx in order - it's broken right now
<hkaiser>
gonidelis[m]: well, HPX_TEST_MSG is a runtime check, static_assert() is compile-time, but sure, both would work
<gonidelis[m]>
hkaiser: Some things are so simple and trivial and I just keep asking questions about them... sorry
<hkaiser>
no worries
<ms[m]>
hkaiser: just pushed to 4710, will merge it now
<hkaiser>
gonidelis[m]: just be careful to a) either use other names for the equality_result etc. or b) to make sure to use the existing ones
<ms[m]>
if I've missed something I'll fix separately (i.e. inspect or something like that, but I hope I got it all)
<hkaiser>
existing templates, I meant
<hkaiser>
gonidelis[m]: one minor thing
<gonidelis[m]>
Is defining `equality_result` under `detail` extra unnecessary work? I mean I copy-pasted them from the `is_iterator` trait....
<gonidelis[m]>
?
<hkaiser>
gonidelis[m]: we converge on using `east const` notation, i.e. we write `T const&` instead of `const T&` - purely cosmetic change but more consistent
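A small sketch of the two spellings, to show that the change really is cosmetic: both name the same type. `length_of` is a hypothetical function used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// "West const" and "east const" are two spellings of the same type.
static_assert(
    std::is_same<const std::string&, std::string const&>::value,
    "both spellings denote the same type");

// East const style: the qualifier follows the thing it qualifies.
inline std::size_t length_of(std::string const& s)
{
    return s.size();
}
```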
<hkaiser>
gonidelis[m]: copying them will create duplicate definitions if both headers are included, not?
<gonidelis[m]>
hkaiser: yup... actually I just encountered that problem like 5 minutes ago
<hkaiser>
gonidelis[m]: best might be to move the definitions to a separate header and #include that from both, is_iterator.hpp and is_sentinel_for.hpp
<gonidelis[m]>
hkaiser: oh ok... I will. Sth like `results.hpp` would be ok?? (sorry I am not good with namings yet...)
<hkaiser>
gonidelis[m]: hmmm
<hkaiser>
let's call it hpx/iterator_support/traits/detail/concept_helpers.hpp or something like that
<hkaiser>
there will be more, I'm sure
<hkaiser>
gonidelis[m]: alternatively just #include is_iterator.hpp in your is_sentinel_for.hpp
<hkaiser>
might be the easiest solution
<hkaiser>
anybody #including is_sentinel_for will need is_iterator anyways
<gonidelis[m]>
hkaiser: yeah that's what I thought... might be a little bad for the includes/file formatting. Anyways... It's true those two are closely bound and I reckon `#pragma once` will do the magic and the compiler won't notice...
nan111 has joined #ste||ar
<hkaiser>
yes
<hkaiser>
ms[m]: thanks
<hkaiser>
I'll add more if needed there
<hkaiser>
gonidelis[m]: do we meet today?
<hkaiser>
same time?
<gonidelis[m]>
hkaiser: of course
<hkaiser>
k
<gonidelis[m]>
hkaiser: as for the trait test cases you can see I just used the pair defined in `iter_sent.hpp`
<gonidelis[m]>
Or better: how could I find what other pairs need to be tested? On the TS maybe??
<K-ballo>
as I suggested the other day, test iterator pairs
<K-ballo>
also, make at least one test that it correctly rejects non-sentinels
<gonidelis[m]>
K-ballo: ahh you mean `begin()` and `end()` right? On the rejection part: If I plug a non-sentinel in `is_sentinel_for` shouldn't the test fail then???
<K-ballo>
yes, a begin/end pair.. and no, is_sentinel_for should return false and the test should check that it is so
<K-ballo>
let me leave no room for confusion there.. is_sentinel_for for begin()/end() pair should return true, and is_sentinel_for say... a string and an iterator should return false
<gonidelis[m]>
ahh ok ... sorry... I just need to test `::value == false` for the second case... you are right. Thanks a lot
<gonidelis[m]>
So no other pairs?
<K-ballo>
those three cases should cover all the "interesting cases", add any other you think is useful
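The three cases discussed above could look roughly like this. The trait below is a simplified stand-in written for this sketch, not HPX's `is_sentinel_for`; everything in namespace `sketch` (including `counter` and `sentinel`, which stand in for the pair from `iter_sent.hpp`) is hypothetical. The argument order follows `std::sentinel_for<S, I>`.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {
    template <typename...> struct make_void { using type = void; };
    template <typename... Ts> using void_t = typename make_void<Ts...>::type;

    // Simplified: "iterator" here only means `*it` and `++it` are well-formed.
    template <typename T, typename = void>
    struct is_iterator : std::false_type {};
    template <typename T>
    struct is_iterator<T, void_t<decltype(*std::declval<T&>()),
                              decltype(++std::declval<T&>())>>
      : std::true_type {};

    // True if `iter == sent` is well-formed and convertible to bool.
    template <typename Iter, typename Sent, typename = void>
    struct equality_comparable_with : std::false_type {};
    template <typename Iter, typename Sent>
    struct equality_comparable_with<Iter, Sent,
        typename std::enable_if<std::is_convertible<
            decltype(std::declval<Iter const&>() == std::declval<Sent const&>()),
            bool>::value>::type>
      : std::true_type {};

    template <typename Sent, typename Iter>
    struct is_sentinel_for
      : std::integral_constant<bool,
            is_iterator<Iter>::value &&
            equality_comparable_with<Iter, Sent>::value> {};

    // A custom iterator/sentinel pair of distinct types.
    struct sentinel { int stop; };
    struct counter {
        int value;
        int operator*() const { return value; }
        counter& operator++() { ++value; return *this; }
    };
    inline bool operator==(counter const& c, sentinel const& s)
    {
        return c.value == s.stop;
    }
}

using vec_iter = std::vector<int>::iterator;

// 1. custom iterator/sentinel pair, 2. begin()/end() pair, 3. rejection
static_assert(sketch::is_sentinel_for<sketch::sentinel, sketch::counter>::value, "");
static_assert(sketch::is_sentinel_for<vec_iter, vec_iter>::value, "");
static_assert(!sketch::is_sentinel_for<std::string, vec_iter>::value, "");
```

Since the trait is a compile-time constant, `static_assert` catches a wrong answer at build time; a runtime check such as `HPX_TEST_MSG` on `::value` would work as well.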
weilewei has joined #ste||ar
<nan111>
STEllAR-GROUP/hpx#4710 is merged, so I updated the latest hpx. The old error gone but I got a new error, which says "/home/nanmiao/Documents/project/dev/src/phylanx/src/execution_tree/meta_annotation.cpp:16:10: fatal error: hpx/collectives.hpp: No such file or directory #include <hpx/collectives.hpp>"
<ms[m]>
hkaiser: nan111 that one was added after 1.4.1
<hkaiser>
nan111: we still need to fix phylanx for the recent changes, I'll do that asap
<ms[m]>
I wouldn't want to add a compatibility header for that...
<hkaiser>
no need
<nan111>
Got it. Thanks!
<hkaiser>
in general for the missing headers, change hpx/foo.hpp to hpx/modules/foo.hpp for now
<hkaiser>
I have a meeting now, will look into fixing it afterwards
<ms[m]>
this makes no sense... I find no traces of hpx/collectives.hpp either in 1.4.1 or on master (before the renaming)
<hkaiser>
ms[m]: it's auto-generated
<ms[m]>
even then
<ms[m]>
oh... it's on by default
<ms[m]>
then we did have that in 1.4.1
<ms[m]>
we'll need a compatibility header for that as well then
<ms[m]>
however, assuming you're going to do some renaming anyway I won't add it right now
<weilewei>
hkaiser ms[m] jbjnr btw, Hazard pointer related unit and stress tests passed with HPX support... next step is making sure dynamic hazard pointer related tests pass as well...
<ms[m]>
weilewei: very nice!
<weilewei>
and then other lock-free stuff...
<weilewei>
:)
<ms[m]>
so this is with hpx threads, right, not just hpx os threads?
<weilewei>
it uses hpx::thread, are hpx threads and hpx os threads different things?
<ms[m]>
yep, different things, and if it works with hpx::thread that's very good (that would've been the trickier one, but it looks like it's not a problem)
<ms[m]>
hpx os thread = hpx worker thread by another name
<weilewei>
ok, then I should be good, it uses the hpx::this_thread::get_id() thing. It was tricky because my implementation failed at a thread counter variable, which caused a deadlock. The thread counter was not protected by an atomic at all, so I made a PR to the libcds team. Let's see what they say
<weilewei>
but now it is fixed after making the counter atomic
<diehlpk_work>
hkaiser, SC paper meeting?
<hkaiser>
diehlpk_work: sorry, I'm in a gsoc meeting
bita_ has joined #ste||ar
<hkaiser>
weilewei: I'll be a bit late for our meeting
<weilewei>
hkaiser np
<hkaiser>
weilewei: now?
rtohid has left #ste||ar [#ste||ar]
<ms[m]>
woop, cmake configuration in over an hour! scratch is setting new records...
<zao>
Intel's great at that for me otherwise.
<K-ballo>
does that mean >1h to run some project's cmake configuration(+generation?) step?
<zao>
Sounds like you've got highly performant filesystems there.
<hkaiser>
nan111: #1189 should be fine on top of HPX master
<nan111>
hkaiser Thanks!
<weilewei>
is it correct to create a thread pool simply using std::vector< hpx::thread > threads; ?
<weilewei>
I am seeing this error: /home/weile/project/dev/src/hpx/libs/synchronization/src/mutex.cpp:39: void hpx::lcos::local::mutex::lock(const char*, hpx::error_code&): Assertion 'threads::get_self_ptr() != nullptr' failed
<hkaiser>
weilewei: well a vector<thread> doesn't do anything on its own
<hkaiser>
just an empty vector, same as vector<int>
<weilewei>
but when I run each thread of threads, and it tries to grab a lock, it returns this error
<hkaiser>
sure
<hkaiser>
you can't just run hpx code on any std thread
<weilewei>
hmmm, what should I do then?
<hkaiser>
what are you trying to achieve with this?
<hkaiser>
hpx already has thread pools, why create another one?
<hkaiser>
the hpx scheduler _is_ a thread pool of sorts
<weilewei>
Right, so essentially, there is a vector of works to do, and then each work needs to be run on one of hpx threads
<weilewei>
Just trying to maintain same syntax... because libcds uses vector <std::thread> threads;
<hkaiser>
weilewei: well, then use a std::vector<hpx::thread>
<hkaiser>
or simply async each work item
<weilewei>
right, but then I get this Assertion 'threads::get_self_ptr() != nullptr' failed error...
<hkaiser>
when do you get that?
<hkaiser>
if you use a vector of HPX threads?
<weilewei>
yes, a vector of hpx threads
<hkaiser>
did you initialize the HPX runtime?
<hkaiser>
using hpx::init or similar?
<weilewei>
ah!! I forgot this! I thought I have it already... apparently not
<weilewei>
Thanks
<ms[m]>
K-ballo: zao exactly... it's hpx on daint's scratch filesystem, which has been hammered by some user all day
<zao>
ms[m]: is that node-local? Maybe you can build in /dev/shm if there’s enough mem?
<ms[m]>
zao: yeah, I think I could do that
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
rtohid has joined #ste||ar
kale[m] has joined #ste||ar
rtohid has quit [Remote host closed the connection]
kale[m] has quit [Ping timeout: 272 seconds]
kale[m] has joined #ste||ar
rtohid has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 272 seconds]
kale[m] has joined #ste||ar
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
kale[m] has quit [Ping timeout: 256 seconds]
kale[m] has joined #ste||ar
karame_ has quit [Remote host closed the connection]
kale[m] has quit [Ping timeout: 260 seconds]
<Yorlik>
Is it guaranteed, that the address of a component as long as it exists on a locality and doesn't migrate will never change?
<hkaiser>
yes
<Yorlik>
Thanks
<hkaiser>
Yorlik: like any c++ object
<Yorlik>
Well ...
<Yorlik>
There might be situations where you want to move an object around. BTW: Are the Objects Movable?
<hkaiser>
what objects?
<Yorlik>
Components
<hkaiser>
no they are non-copyable and non-movable
<hkaiser>
why would you ever want to move one?
<Yorlik>
OK. Makes sense. I could work with pointers for what I have in mind.
<Yorlik>
I am thinking a lot about what is happening to my cache at the moment.
<hkaiser>
if you have a shared_ptr<Foo> p = make_shared<Foo>(), you never even think of moving the allocated Foo
<Yorlik>
I just changed my shared ptrs into raw pointers again.
<Yorlik>
It's the right use case - they are non owning and just observers
<Yorlik>
It's complicated - moving the components would most likely be overkill if it were possible.
<Yorlik>
I'm just thinking in several directions.
<Yorlik>
The most important feature I already have: I can move the data which belongs to the components
<Yorlik>
So I'm free to optimize if I see an issue.
<Yorlik>
But the biggest problem I see is the cache thrashing by the many Lua states we use.
<Yorlik>
And also the migration of a task using a lua state is an issue - it will leave unused cache entries behind and start thrashing the cache on arrival at the new core.
<Yorlik>
A lua state uses ~1 MB of memory after all.
<Yorlik>
the state itself is cheap - 56 bytes only
<Yorlik>
But it uses a bunch of heap
<Yorlik>
So if it migrates that puts a lot of stress on the caches
<Yorlik>
hkaiser: when a parallel loop creates all these tasks to run the chunks: Are these tasks all created on the local thread and just stolen from others or are they distributed in a round robin manner?
<hkaiser>
round robin, I think - at least the hint is set that way
<Yorlik>
OK. That's good - since otherwise that would put a lot of stress on the local cache.
bita__ has joined #ste||ar
<Yorlik>
I'm seeing more and more the limitations of Lua.
<Yorlik>
But there isn't really an alternative that is better.
<Yorlik>
hpx::get_ptr returns a shared_ptr/future - is there a function that directly gives me a raw pointer, when I need just an observing ptr?
bita_ has quit [Ping timeout: 246 seconds]
karame_ has joined #ste||ar
nan111 has quit [Remote host closed the connection]
nan111 has joined #ste||ar
<hkaiser>
Yorlik: it returns a shared_ptr for good measure
<Yorlik>
I understand you're holding my hand here. I can live with that ;)
<hkaiser>
do you have proof that using a raw pointer instead of a shared_ptr gives you significant improvements?
<Yorlik>
Since the shared_ptr is already created I will keep using it instead of just calling .get(), but I will have to destroy it unnecessarily, since it is used for the backlink from the data which is already owned by the component.
<Yorlik>
The shared_ptr in this case is just in the way.
<Yorlik>
It's not a performance problem. It's the wrong use case
<Yorlik>
I need a link back to the component from the data which it owns
<hkaiser>
who is creating the data?
<hkaiser>
the component?
<Yorlik>
So - when migrating the data I will have to explicitly reset that pointer, migrate and set it again.
<Yorlik>
Yes
<Yorlik>
The component creates the data
<hkaiser>
why don't you pass 'this' to the data when it is being created, then?
<Yorlik>
Woops? :D
* Yorlik
bangs head on table
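A minimal sketch of the "pass `this`" idea, with no HPX involved: `game_object` stands in for the component and `entity` for the data it owns (both hypothetical, simplified types). Because the owner is non-copyable and non-movable, its address stays valid for the lifetime of the data.

```cpp
#include <cassert>
#include <memory>

class game_object;    // stands in for the component

// Data owned by the component; it keeps a non-owning link back to its owner.
class entity {
public:
    explicit entity(game_object& owner) : owner_(owner) {}
    game_object& owner() const { return owner_; }
private:
    game_object& owner_;
};

class game_object {
public:
    // The owner hands itself to the data at creation time, so no lookup
    // (and no shared_ptr) is needed to establish the back link.
    game_object() : data_(std::make_unique<entity>(*this)) {}

    // Non-copyable; deleting the copy operations also suppresses moves.
    game_object(game_object const&) = delete;
    game_object& operator=(game_object const&) = delete;

    entity& data() { return *data_; }
private:
    std::unique_ptr<entity> data_;
};
```

A reference member makes `entity` non-default-constructible and non-assignable, which is the trade-off that comes up later in the discussion.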
<Yorlik>
I wish I could teach a raw pointer to generate a static_assert if someone tries to use delete on it.
<hkaiser>
Yorlik: easy: don't use raw pointers
<Yorlik>
Nope
<hkaiser>
use references
<Yorlik>
Actually in this specific use case a reference would do the job
<Yorlik>
But I have some other situations where I really need to change a pointee a lot and reference_wrapper is not as lightweight
<hkaiser>
nonsense
<hkaiser>
reference_wrapper is just a pointer internally
<Yorlik>
If that is the case there is bunk "knowledge" out there.
<hkaiser>
Yorlik: absolutely
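A quick way to check the claim on a given toolchain. The size equality below holds on the mainstream standard library implementations, but it is not something the C++ standard guarantees, so treat the `static_assert` as an assumption about the platform. `node` and `rebind_demo` are hypothetical names for this sketch.

```cpp
#include <functional>

struct node { int id; };

// Assumption: the wrapper stores exactly one pointer (true on common
// implementations, not mandated by the standard).
static_assert(sizeof(std::reference_wrapper<node>) == sizeof(node*),
    "reference_wrapper is expected to be pointer-sized here");

// Unlike a plain reference, a reference_wrapper can be re-pointed.
inline int rebind_demo()
{
    node a{1};
    node b{2};
    std::reference_wrapper<node> ref = a;
    ref = b;              // rebinds the wrapper, does not assign a = b
    ref.get().id = 42;    // modifies b only
    return a.id * 100 + b.id;
}
```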
<Yorlik>
Might be worth a blog post of someone capable and blogging. "Why we totally do not need raw pointers anymore"
<Yorlik>
I'm definitely not the authority to do that.
<Yorlik>
I have to think if references would work as good as pointers in the Lua interop though - probably yes.
<hkaiser>
Yorlik: Sean Parent runs around for years talking about that in his presentations
<hkaiser>
what else do you need?
<Yorlik>
Do you have a specific talk in mind I could look up ?
<hkaiser>
for interop you can also take the address of the reference if you need a pointer
<Yorlik>
True
<hkaiser>
Yorlik: C++ Seasoning, Microsoft's GoingNative conference 2013
<jbjnr>
oh dear. an old one that needs to be removed by the looks of it
<hkaiser>
jbjnr: it's async_local now
<Yorlik>
hkaiser: When a component gets serialized for migration. What happens to references?
<Yorlik>
like my entity holding a reference to the managing component gameobject
<Yorlik>
Ofc the entity gets serialized alongside the component, since they have to travel together.
<hkaiser>
Yorlik: what happens to them?
<hkaiser>
the serialization does not change anything, but once the migration is done the original object is deleted
<Yorlik>
The current layout is such, that the gameobject is the hpx component and it has a type erased pointer to the entity and the entity has a raw pointer backlink to the component
<hkaiser>
you will have to re-set the reference on the receiving end
<hkaiser>
during de-serialization
<Yorlik>
So - just assigning the gameobject component after arrival to the reference member would be sufficient?
<Yorlik>
Like repairing the link, because the addresses changed
<Yorlik>
BTW: I'm running into quite some trouble using a reference: Suddenly my component holding the ref is no longer default constructible and that bubbles all the way through my structure
<Yorlik>
So - all the boilerplate I carefully tried to avoid suddenly seems mandatory.
<hkaiser>
Yorlik: build your own reference_wrapper or similar
<hkaiser>
man, it's c++, stop complaining - everything is possible
<Yorlik>
Like something that would temporarily allow a nullptr?
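One possible shape for such a home-grown wrapper, as a sketch only: `observer` and `game_object` are hypothetical names, not HPX or standard types. It is default-constructible (empty), can be re-pointed after de-serialization, and has no conversion to a raw pointer, so `delete` applied to it should not compile.

```cpp
#include <cassert>

// Hypothetical non-owning, rebindable, nullable link to a T.
template <typename T>
class observer {
public:
    observer() noexcept = default;                        // starts out empty
    explicit observer(T& target) noexcept : ptr_(&target) {}

    void rebind(T& target) noexcept { ptr_ = &target; }  // e.g. after migration
    void reset() noexcept { ptr_ = nullptr; }

    explicit operator bool() const noexcept { return ptr_ != nullptr; }

    T& get() const noexcept
    {
        assert(ptr_ != nullptr);    // using an empty link is a logic error
        return *ptr_;
    }
    T* operator->() const noexcept { return &get(); }

private:
    T* ptr_ = nullptr;    // never owned, never deleted by this class
};

struct game_object { int hp = 10; };
```

A member of type `observer<game_object>` keeps the enclosing class default-constructible, and the receiving side of a migration can call `rebind` once the new owner exists.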