hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
jaafar has quit [Ping timeout: 240 seconds]
K-ballo has quit [Quit: K-ballo]
jaafar has joined #ste||ar
aserio has joined #ste||ar
hkaiser has quit [Quit: bye]
jaafar has quit [Ping timeout: 255 seconds]
aserio has quit [Quit: aserio]
nikunj has quit [Ping timeout: 245 seconds]
nikunj has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
nikunj97 has joined #ste||ar
david_pfander has joined #ste||ar
<Yorlik>
What happens if you make singletons components and try to move them to a place where there already is a Highlander?
<Yorlik>
Just curious - not that I have it planned. Though there might be an application if you want to migrate an entire locality to shut down / reboot.
<Yorlik>
Could you even create a singleton HPX component at all, with the private constructor? That probably answers the question.
<Yorlik>
And what about statics? Are they automagically cluster-wide and synchronized?
<zao>
Regular language level statics?
<Yorlik>
class statics
<Yorlik>
And then you move
<Yorlik>
err migrate
<Yorlik>
Or change at locality A and query at locality B
<Yorlik>
And yes - regular language statics too ofc
jaafar has joined #ste||ar
<zao>
HPX cannot influence how language features work, unfortunately.
<Yorlik>
So - you could have 2 classes with different statics and migrate?
<Yorlik>
My guess is that after migration the recreated object would just adopt the local class statics
<Yorlik>
But - that's an un-educated guess ... ;)
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
nikunj97 has quit [Quit: Leaving]
jbjnr__ has joined #ste||ar
jbjnr_ has quit [Ping timeout: 268 seconds]
K-ballo has joined #ste||ar
<heller_>
Yorlik: it all depends on how you write your serialization functions and move/copy constructors
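A minimal sketch of what heller_ means here, following the Boost.Serialization-style member function that HPX's serialization uses; the entity_state type and its members are made up for illustration:

    #include <cstdint>

    // Hypothetical component data: what migration preserves is exactly what
    // serialize() writes - statics and anything not serialized stay local.
    struct entity_state
    {
        std::uint64_t id = 0;
        double hp = 0.0;

        template <typename Archive>
        void serialize(Archive& ar, unsigned /*version*/)
        {
            ar & id & hp;   // only these members travel with the object
        }
    };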
<zao>
heller_: What on earth should I do with these migrate test failures? Wait for master to be compilable and see if it still manifests? Try some older builds and see if it's been around forever?
<heller_>
I personally would refrain from making classic singletons. In a way, components are global objects already. This becomes evident when using symbolic names
<heller_>
zao: it's been around for quite a while
<zao>
Got a metric duckton of ctest logs from failed runs, should look into those some day and categorize the faults.
<zao>
heller_: Yeah, I had a vague feeling it's not new.
<heller_>
Ok, I'll be on a computer tonight
<heller_>
What you could do, however, is try my sanitizers branch
<zao>
I'm in no hurry, just wanted to check in a bit.
<zao>
Ooh.
<heller_>
And see if that changes anything
<heller_>
I'll prepare PRs for the different commits tonight
<heller_>
And fix the mpi tester...
<zao>
Speaking of MPI, we found a memory leak in UCX, some underlying component of OpenMPI.
<zao>
Lots of researchers had code that previously ran fine get OOM-killed on compute nodes as OpenMPI leaked its type cache :D
<heller_>
Hihi
<heller_>
Isn't UCX this libfabric-style messaging middleware?
david_pfander has quit [Remote host closed the connection]
hkaiser has joined #ste||ar
Yorlik has quit [Read error: Connection reset by peer]
aserio has joined #ste||ar
bibek has joined #ste||ar
hkaiser has quit [Quit: bye]
eschnett has joined #ste||ar
david_pfander has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
bita has joined #ste||ar
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
eschnett has joined #ste||ar
eschnett has quit [Client Quit]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
hkaiser has joined #ste||ar
eschnett has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
aserio has quit [Ping timeout: 252 seconds]
jaafar has joined #ste||ar
jaafar_ has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
eschnett has joined #ste||ar
Yorlik has joined #ste||ar
diehlpk_work has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
aserio has quit [Client Quit]
jaafar_ has quit [Quit: Konversation terminated!]
hkaiser has quit [Quit: bye]
jaafar has joined #ste||ar
bibek has quit [Quit: Konversation terminated!]
hkaiser has joined #ste||ar
<Yorlik>
If I have an ordered set (std::set<uint64_t>) (*myset.end ( ) - *myset.begin ( )) should give me the difference between the largest and the smallest element, or am I wrong here?
<hkaiser>
Yorlik: you should never dereference the iterator returned by end()
<Yorlik>
Argh -- lol
<Yorlik>
WTF
* Yorlik
bangs head on table
<Yorlik>
Oh man - this is funny
<Yorlik>
I knew it but didn't see it ...
bibek has joined #ste||ar
<hkaiser>
Yorlik: btw, if myset.empty(), then you shouldn't dereference the iterator returned from begin() either
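A minimal sketch of the corrected computation (the range_of name is illustrative):

    #include <cassert>
    #include <cstdint>
    #include <set>

    std::uint64_t range_of(std::set<std::uint64_t> const& s)
    {
        assert(!s.empty());              // begin()/rbegin() may only be dereferenced if non-empty
        return *s.rbegin() - *s.begin(); // largest minus smallest; *s.end() is undefined behavior
    }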
<Yorlik>
Sure
<Yorlik>
I am just checking out ordered sets as a container for free slots in a pool
<Yorlik>
Because I want to reuse the bottom-most elements
<Yorlik>
To keep my loop tight
<Yorlik>
Playing with ways to create my custom allocator
<Yorlik>
set looks like a nice way to keep the free list
<Yorlik>
Even with 100k entries, accesses and writes are below 1us
<hkaiser>
Yorlik: a heap might be more appropriate, I'd suggest doing measurements
<Yorlik>
The main problem is, I want to keep my (typed) elements in a straight line
<hkaiser>
sure
<Yorlik>
So - when looping I might have to occasionally skip
<Yorlik>
and in a long-running process I might move elements from time to time
<Yorlik>
I really want to be able to loop over an array-like arrangement of my elements
<Yorlik>
But I might just ditch that for a while and do it as a later optimization
<Yorlik>
That idea has more issues and is much more problematic than I thought
<Yorlik>
But the ideal would be to have a self sorting pool
<Yorlik>
But then I'm running into issues with rearranging elements
<hkaiser>
Yorlik: it's called a 'heap' ;-)
<Yorlik>
how would you keep elements packed in a row?
<hkaiser>
the idea to have a separate datastructure holding the (sorted) list of indices of free elements is a good one
<hkaiser>
use the heap instead of your set to maintain that list
<Yorlik>
my issue is keeping the update loop cache-friendly
<Yorlik>
IC
<hkaiser>
a set causes you to do pointer chasing to find the next entry, while a heap is built in consecutive memory
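A minimal sketch of hkaiser's suggestion, using std::priority_queue as a min-heap free list kept in a contiguous std::vector (the free_list name is illustrative):

    #include <cstdint>
    #include <functional>
    #include <queue>
    #include <vector>

    // Free slots kept as a min-heap: acquire() always returns the lowest free index,
    // and the underlying std::vector stores the heap contiguously (no pointer chasing).
    class free_list
    {
        std::priority_queue<std::uint64_t, std::vector<std::uint64_t>,
            std::greater<std::uint64_t>> heap_;

    public:
        void release(std::uint64_t slot) { heap_.push(slot); }

        bool empty() const { return heap_.empty(); }

        std::uint64_t acquire()          // precondition: !empty()
        {
            std::uint64_t slot = heap_.top();
            heap_.pop();
            return slot;
        }
    };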
<Yorlik>
That makes sense
<Yorlik>
Oh - I would not use the set to loop
<Yorlik>
My idea is darker
<Yorlik>
keep an index to the highest element in the pool
<Yorlik>
and iterate over the pool from 0 to maxelement
<Yorlik>
the empty slots would get skipped
<Yorlik>
There is an issue if there are lots of deletions, leaving a situation with a lot of element skipping
<Yorlik>
The set would be just used for the allocations and free operations
<Yorlik>
it would always give back the lowest element
<Yorlik>
since it holds the address of the lowest free slot
<hkaiser>
Yorlik: sure
<Yorlik>
the problem could come up if I have many elements, like an allocation burst followed by a lot of deletions, with elements hanging at the top of the array forcing the loop to run all the way up there
<Yorlik>
But the real world behavior would have to be measured ofc
<Yorlik>
But the object count fluctuates
<Yorlik>
sometimes a lot
<Yorlik>
Reordering could be a later optimization
<Yorlik>
HPX could actually help me here by using migration
<Yorlik>
it would be horribly slow and only be used in desperate situations
<Yorlik>
I'm just thinking ahead here to avoid certain nasty surprises I can anticipate now.
<heller_>
I still think iterating over anything id_type-based in your tight loops will hurt you badly
<Yorlik>
That's not the plan
<Yorlik>
The plan is to store everything in an array
<Yorlik>
That's why I'm so interested in a custom allocator for HPX Components
<Yorlik>
I want to be able to use placement new semantics for them
<Yorlik>
All that fuss is about packing and direct access, by having typed, array-like collections of objects
<Yorlik>
For me the learning task now is to make a thread-safe allocator which gives me that. It's a specialized application and I didn't find anything like that on the net.
<Yorlik>
If the objects are components I no longer need to do the bookkeeping for their IDs, since I can use get_ptr to find them
<Yorlik>
but for looping I'd just zip over the storage directly
<Yorlik>
My entities are mostly pods and will rarely use pointers to other objects
<Yorlik>
containers are a special problem for example
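A minimal, single-threaded sketch of the packed, typed pool described above - placement new into contiguous storage, a lowest-first free set, and a loop that skips holes up to the highest used slot (all names are illustrative; the thread safety mentioned above is deliberately left out):

    #include <cstddef>
    #include <new>
    #include <set>
    #include <utility>
    #include <vector>

    // Contiguous typed storage: create() placement-news into the lowest free slot,
    // for_each() iterates from 0 to the highest slot ever used and skips holes.
    template <typename T, std::size_t Capacity>
    class packed_pool
    {
        alignas(T) unsigned char storage_[Capacity * sizeof(T)];
        std::vector<bool> live_ = std::vector<bool>(Capacity, false);
        std::set<std::size_t> free_;        // ordered: *begin() is the lowest free slot
        std::size_t high_ = 0;              // one past the highest slot ever used

        T* slot(std::size_t i)
        {
            return std::launder(reinterpret_cast<T*>(storage_ + i * sizeof(T)));
        }

    public:
        packed_pool()
        {
            for (std::size_t i = 0; i < Capacity; ++i)
                free_.insert(i);
        }

        ~packed_pool()
        {
            for (std::size_t i = 0; i < high_; ++i)
                if (live_[i]) slot(i)->~T();
        }

        template <typename... Args>
        T* create(Args&&... args)           // precondition: !free_.empty()
        {
            std::size_t i = *free_.begin(); // reuse the bottom-most free slot
            free_.erase(free_.begin());
            live_[i] = true;
            if (i + 1 > high_) high_ = i + 1;
            return ::new (storage_ + i * sizeof(T)) T(std::forward<Args>(args)...);
        }

        void destroy(std::size_t i)
        {
            slot(i)->~T();
            live_[i] = false;
            free_.insert(i);
        }

        template <typename F>
        void for_each(F&& f)                // tight update loop, skipping empty slots
        {
            for (std::size_t i = 0; i < high_; ++i)
                if (live_[i]) f(*slot(i));
        }
    };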
<heller_>
ok, as said, good luck with non-intrusive migration then
<heller_>
still have to review that PR
<Yorlik>
That would be great - but there's no rush - I'm still on a huge learning task
<Yorlik>
It's fun actually - especially the atomic acquire/release semantics.
<Yorlik>
Saw 2 great talks by Herb Sutter explaining it very well.
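A minimal sketch of the acquire/release pattern those talks cover - a release store publishes a plain write, and the paired acquire load makes it visible (illustrative only):

    #include <atomic>
    #include <cassert>
    #include <thread>

    int payload = 0;
    std::atomic<bool> ready{false};

    void producer()
    {
        payload = 42;                                  // plain write...
        ready.store(true, std::memory_order_release);  // ...published by the release store
    }

    void consumer()
    {
        while (!ready.load(std::memory_order_acquire)) // acquire pairs with the release
            std::this_thread::yield();
        assert(payload == 42);                         // guaranteed visible, no data race
    }

    int main()
    {
        std::thread t1(producer), t2(consumer);
        t1.join();
        t2.join();
    }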
<heller_>
look for talks from Tony Van Eerd on that topic
<Yorlik>
I'll do
<Yorlik>
Thanks!
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo1 has joined #ste||ar
<heller_>
the problem is: you think you've got it now, then you try to code such a data structure and fail miserably
K-ballo1 is now known as K-ballo
<Yorlik>
I think it comes down to that proverb: "Sharing is the source of all contention"
<Yorlik>
These lock-free techniques are optimizations - but you can't get rid of the fundamental problem if you don't design for it
<heller_>
oh, performance is only secondary
<Yorlik>
Reducing shared stuff as much as possible is the first thing to do, I believe.
<heller_>
actually making them work correctly is the hardest part
<Yorlik>
memory explosions? :)
<Yorlik>
UB all over the place?
<heller_>
yes, UB in the form of data races
<Yorlik>
I guess I'm going to see my share of pink elephants ...
<heller_>
you will
<heller_>
best is to avoid writing your own concurrent data structures for now
<heller_>
you have to unlearn what you learned in kindergarten: caring is not sharing
<Yorlik>
I'll probably just cobble existing stuff together, indeed.
<Yorlik>
I'll need a concurrent set, a concurrent vector and a concurrent map - however, it's not enough to just use them - integrating thread-safe pieces into my use case can still give me races all over the place.
<Yorlik>
I really have to do a ton of learning anyways
<heller_>
just something upfront: lock-free/wait-free is not about performance!
<heller_>
you can often get away with just using plain old mutices
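A minimal sketch of the "plain old mutices" approach - a std::mutex around an ordinary std::set, before reaching for anything lock-free (the type name is illustrative):

    #include <cstdint>
    #include <mutex>
    #include <set>

    // Coarse-grained but correct: every access takes the same lock.
    class guarded_free_set
    {
        std::mutex mtx_;
        std::set<std::uint64_t> free_;

    public:
        void release(std::uint64_t slot)
        {
            std::lock_guard<std::mutex> lock(mtx_);
            free_.insert(slot);
        }

        bool try_acquire(std::uint64_t& slot)
        {
            std::lock_guard<std::mutex> lock(mtx_);
            if (free_.empty())
                return false;
            slot = *free_.begin();          // lowest free slot, as in the pool discussion
            free_.erase(free_.begin());
            return true;
        }
    };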
<Yorlik>
I mean correctness comes first, right?
<Yorlik>
And then ... ?
<heller_>
well, correctness is a precondition
<Yorlik>
What is it about then from your view?
<heller_>
execution constraints. They mostly come out of real-time systems, where you need to have an upper bound on execution time
<Yorlik>
Raw throughput is not all, yes.
<heller_>
depending on the actual use case, the performance can be way better than the traditional mutex approach, of course
<heller_>
so, a wait-free algorithm is essential in safety-critical real-time systems. Most get away with lock-free
<Yorlik>
My plan is to get together a first implementation which has internal interfaces in a way, that I can swap out elements easily later if I need optimizations
<heller_>
sure
<Yorlik>
We won't be able to have wait free
<Yorlik>
Too much contention sooner or later
<Yorlik>
contention in an MMO really changes a lot
<heller_>
fun fact: if you read papers on lock-free and wait-free algorithms, they mostly use plain math to prove the properties
<heller_>
you have a soft real-time system
<Yorlik>
I'm not sure you can make statements about wait-freedom without contention metrics
<heller_>
but I'd guess that you read concurrently way more often than you write
<Yorlik>
yep
<heller_>
so the first step here is a reader/writer (shared) mutex
<heller_>
with which you can get very far, I think
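A minimal sketch of the reader/writer mutex suggestion, using C++17's std::shared_mutex (the entity_registry type and its members are made up for illustration):

    #include <cstdint>
    #include <mutex>
    #include <shared_mutex>
    #include <unordered_map>

    // Many concurrent readers, exclusive writers - a good first step when
    // reads dominate writes, as expected for the entity state.
    class entity_registry
    {
        mutable std::shared_mutex mtx_;
        std::unordered_map<std::uint64_t, double> hp_;

    public:
        double read_hp(std::uint64_t id) const
        {
            std::shared_lock<std::shared_mutex> lock(mtx_);   // shared: readers don't block each other
            auto it = hp_.find(id);
            return it == hp_.end() ? 0.0 : it->second;
        }

        void write_hp(std::uint64_t id, double hp)
        {
            std::unique_lock<std::shared_mutex> lock(mtx_);   // exclusive: blocks readers and writers
            hp_[id] = hp;
        }
    };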
<Yorlik>
Would that work with non-concurrent queues?