hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
jaafar has quit [Ping timeout: 240 seconds]
K-ballo has quit [Quit: K-ballo]
jaafar has joined #ste||ar
aserio has joined #ste||ar
hkaiser has quit [Quit: bye]
jaafar has quit [Ping timeout: 255 seconds]
aserio has quit [Quit: aserio]
nikunj has quit [Ping timeout: 245 seconds]
nikunj has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
nikunj97 has joined #ste||ar
david_pfander has joined #ste||ar
<Yorlik> What happens if you make Singletons Components and try to move them to a place where there already is a Highlander ?
<Yorlik> Just curious - not that I have it planned. Though there might be an application if you want to migrate an entire locality to shut it down / reboot.
<Yorlik> Could you even create a singleton hpx component at all with the private constructor? Probably that answers the question.
<Yorlik> And what about statics? are they automagically cluster-wide and synchronized?
<zao> Regular language level statics?
<Yorlik> class statics
<Yorlik> And then you move
<Yorlik> err migrate
<Yorlik> Or change at locality A and query at locality B
<Yorlik> And yes - regular language statics too ofc
jaafar has joined #ste||ar
<zao> HPX cannot influence how language features work, unfortunately.
<Yorlik> So - you could have 2 classes with different statics and migrate?
<Yorlik> My guess is that after migration the recreation of the object would just adopt the local class statics
<Yorlik> But - that's an un-educated guess ... ;)
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
nikunj97 has quit [Quit: Leaving]
jbjnr__ has joined #ste||ar
jbjnr_ has quit [Ping timeout: 268 seconds]
K-ballo has joined #ste||ar
<heller_> Yorlik: it all depends on how you write your serialization functions and move/copy constructors
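A minimal sketch of what heller_ means by serialization functions on a component, assuming HPX's component_base and the intrusive serialize() member pattern; the component name and header path are illustrative and may differ by HPX version:

    #include <hpx/include/components.hpp>

    // Illustrative component: the state that survives serialization (and hence
    // migration) is exactly what serialize() writes and reads back.
    struct game_cell : hpx::components::component_base<game_cell>
    {
        int value = 0;

        template <typename Archive>
        void serialize(Archive& ar, unsigned /*version*/)
        {
            ar & value;   // anything not serialized here does not travel along
        }
        // For actual migration the component would additionally derive from the
        // hpx::components::migration_support mixin (omitted in this sketch).
    };

    HPX_REGISTER_COMPONENT(hpx::components::component<game_cell>, game_cell);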
<zao> heller_: What on earth should I do with these migrate test failures? Wait for master to be compilable and see if it still manifests? Try some older builds and see if it's been around forever?
<heller_> I personally would refrain from making classic singletons. In a way, components are global objects already. This becomes evident when using symbolic names
<heller_> zao: it's been around for quite a while
<zao> Got a metric duckton of ctest logs from failed runs, should look into those some day and categorize the faults.
<zao> heller_: Yeah, I had a vague feeling it's not new.
<heller_> Ok, I'll be on a computer tonight
<heller_> What you could do, however, is to try my sanitizers branch
<zao> I'm in no hurry, just wanted to check in a bit.
<zao> Ooh.
<heller_> And see if that changes anything
<heller_> I'll prepare PRs for the different commits tonight
<heller_> And fix the mpi tester...
<zao> Speaking of MPI, we found a memory leak in UCX, some underlying component of OpenMPI.
<zao> Lots of researchers having code that previously ran fine be OOM-killed on compute nodes as OpenMPI leaked typecache :D
<heller_> Hihi
<heller_> Isn't UCX this libfabric-style messaging middleware?
<heller_> Just depending on the fabric in use
<zao> Heaven knows, but considering the reporter is @mellanox, sounds right.
<zao> *committer
<heller_> Yup
<heller_> That's the one
david_pfander has quit [Remote host closed the connection]
hkaiser has joined #ste||ar
Yorlik has quit [Read error: Connection reset by peer]
aserio has joined #ste||ar
bibek has joined #ste||ar
hkaiser has quit [Quit: bye]
eschnett has joined #ste||ar
david_pfander has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
bita has joined #ste||ar
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
eschnett has joined #ste||ar
eschnett has quit [Client Quit]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
hkaiser has joined #ste||ar
eschnett has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
aserio has quit [Ping timeout: 252 seconds]
jaafar has joined #ste||ar
jaafar_ has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
eschnett has joined #ste||ar
Yorlik has joined #ste||ar
diehlpk_work has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
aserio has quit [Client Quit]
jaafar_ has quit [Quit: Konversation terminated!]
hkaiser has quit [Quit: bye]
jaafar has joined #ste||ar
bibek has quit [Quit: Konversation terminated!]
hkaiser has joined #ste||ar
<Yorlik> If I have an ordered set (std::set<uint64_t>) (*myset.end ( ) - *myset.begin ( )) should give me the difference between the largest and the smallest element, or am I wrong here?
<hkaiser> Yorlik: you should never dereference the iterator returned by end()
<Yorlik> Argh -- lol
<Yorlik> WTF
* Yorlik bangs head on table
<Yorlik> Oh man - this is funny
<Yorlik> I knew it but didn't see it ...
bibek has joined #ste||ar
<hkaiser> Yorlik: btw, if myset.empty(), then you shouldn't dereference the iterator returned from begin() either
<Yorlik> Sure
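A minimal sketch of the corrected version, assuming the set is non-empty (the function name is made up):

    #include <cassert>
    #include <cstdint>
    #include <set>

    std::uint64_t spread(std::set<std::uint64_t> const& s)
    {
        assert(!s.empty());               // begin()/rbegin() are only dereferenceable then
        return *s.rbegin() - *s.begin();  // largest minus smallest - never *s.end()
    }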
<Yorlik> I am just checking out ordered sets as a container for free slots in a pool
<Yorlik> Because I want to reuse the bottom most elements
<Yorlik> To keep my loop tight
<Yorlik> Playing with ways to create my custom allocator
<Yorlik> set looks like a nice way to keep the free list
<Yorlik> Even with 100k entries access and writes are below 1us
<hkaiser> Yorlik: a heap might be more appropriate, I'd suggest to do measurements
<Yorlik> The main problem is, I want to keep my (typed) elements in a straight line
<hkaiser> sure
<Yorlik> So - when looping I might have to occasionally skip
<hkaiser> Yorlik: I was referring to make_heap and friends (https://en.cppreference.com/w/cpp/algorithm/make_heap)
<Yorlik> and on a long running process I might move elements from time to time
<Yorlik> I really want to be able to loop over an array-like arrangement of my elements
<Yorlik> But I might just ditch that for a while and do it as a later optimization
<Yorlik> That idea has more issues and is much more problematic than I thought
<Yorlik> But the ideal would be to have a self sorting pool
<Yorlik> But then I'm running into issues with rearranging elements
<hkaiser> Yorlik: it's called a 'heap' ;-)
<Yorlik> how would you keep elements packed in a row?
<hkaiser> the idea to have a separate datastructure holding the (sorted) list of indices of free elements is a good one
<hkaiser> use the heap instead of your set to maintain that list
<Yorlik> my issue is to have the update loop cache friendly
<Yorlik> IC
<hkaiser> a set causes you to do pointer chasing to find the next entry, a heap is built in consecutive memory
<Yorlik> That makes sense
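A rough sketch of the heap-based free list hkaiser is suggesting - std::greater<> turns the usual max-heap into a min-heap, so the lowest free index always comes out first (container choice and names are illustrative):

    #include <algorithm>
    #include <cstdint>
    #include <functional>
    #include <vector>

    std::vector<std::uint64_t> free_slots;   // contiguous storage, no pointer chasing

    void release(std::uint64_t slot)
    {
        free_slots.push_back(slot);
        std::push_heap(free_slots.begin(), free_slots.end(), std::greater<>{});
    }

    std::uint64_t acquire()                  // precondition: !free_slots.empty()
    {
        std::pop_heap(free_slots.begin(), free_slots.end(), std::greater<>{});
        std::uint64_t slot = free_slots.back();
        free_slots.pop_back();
        return slot;                         // always the lowest free index
    }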
<Yorlik> Oh - I would not use the set to loop
<Yorlik> My idea is darker
<Yorlik> keep an index to the highest element in the pool
<Yorlik> and iterate over the pool from 0 to maxelement
<Yorlik> the empty slots would get skipped
<Yorlik> There is an issue with having lots of deletions and ending up with a lot of element skipping
<Yorlik> The set would be just used for the allocations and free operations
<Yorlik> it would always give back the lowest element
<Yorlik> since it holds the address of the lowest free slot
<hkaiser> Yorlik: sure
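A sketch of the loop Yorlik is describing - a contiguous slot array plus an occupancy flag, iterating from 0 up to the highest used index and skipping the holes (all names hypothetical, single-threaded for clarity):

    #include <cstddef>
    #include <vector>

    struct entity { /* mostly POD state */ };

    struct pool
    {
        std::vector<entity> slots;        // element storage, kept contiguous
        std::vector<char>   used;         // same size; marks occupied slots
        std::size_t         max_used = 0; // one past the highest occupied index

        template <typename F>
        void for_each(F&& f)
        {
            for (std::size_t i = 0; i != max_used; ++i)
                if (used[i])              // empty slots simply get skipped
                    f(slots[i]);
        }
    };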
<Yorlik> the problem could come up if I have an allocation burst and then a lot of deletions, with elements hanging at the top of the array forcing the loop to run all the way up there
<Yorlik> But the real world behavior would have to be measured ofc
<Yorlik> But the object count fluctuates
<Yorlik> sometimes a lot
<Yorlik> Reordering could be a later optimization
<Yorlik> HPX could actually help me here by using migration
<Yorlik> it would be horribly slow and only be used in desperate situations
<Yorlik> I'm just thinking ahead here to avoid certain nasty surprises I can anticipate now.
<heller_> i still think iterating over anything id_type in your tight loops will hurt you badly
<Yorlik> That's not the plan
<Yorlik> The plan is to store everything in an array
<Yorlik> That's why I'm so interested in a custom allocator for HPX Components
<Yorlik> I want to be able to use placement new semantics for them
<Yorlik> All that fuss is about packing and direct access by having typed, array-like collections of objects
<Yorlik> For me the learning task now is to make a thread-safe allocator which gives me that. It's a specialized application and I didn't find anything like that on the net.
<Yorlik> If the objects are components I no longer need to do the bookkeeping for their IDs, since I can use get_ptr to find them
<Yorlik> but for looping I'd just zip over the storage directly
<Yorlik> My entities are mostly pods and will rarely use pointers to other objects
<Yorlik> containers are a special problem for example
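A bare-bones illustration of the placement-new idea: raw, suitably aligned storage reserved up front, with objects constructed and destroyed in place. This is a single-threaded sketch with made-up names, not the thread-safe allocator itself:

    #include <cstddef>
    #include <new>

    template <typename T, std::size_t N>
    struct fixed_storage
    {
        alignas(T) unsigned char raw[N * sizeof(T)];

        template <typename... Args>
        T* construct_at(std::size_t slot, Args&&... args)
        {
            // placement new: build the object inside pre-reserved storage
            return ::new (raw + slot * sizeof(T)) T(static_cast<Args&&>(args)...);
        }

        void destroy_at(std::size_t slot)
        {
            std::launder(reinterpret_cast<T*>(raw + slot * sizeof(T)))->~T();
        }
    };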
<heller_> ok, as said, good luck with non-intrusive migration then
<heller_> still have to review that PR
<Yorlik> That would be great - but it has time - I'm still on a huge learning task
<heller_> ok
<heller_> sorry about the delay
<Yorlik> Memory management how-tos, concurrency, ... hardcore stuff for me.
<heller_> ;)
<Yorlik> It's fun actually - especially the atomic specials, acquire-release semantics.
<Yorlik> Saw 2 great talks by Herb Sutter explaining it very well.
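The acquire/release pairing from those talks in its smallest form - a release store publishes earlier plain writes, and the matching acquire load makes them visible (a sketch, spinning only for illustration):

    #include <atomic>

    int payload = 0;
    std::atomic<bool> ready{false};

    void producer()
    {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // publishes the write above
    }

    void consumer()
    {
        while (!ready.load(std::memory_order_acquire))  // pairs with the release store
            ;                                           // spin - illustration only
        int value = payload;                            // guaranteed to observe 42
        (void) value;
    }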
<heller_> look for talks from tony van eerd on that topic
<Yorlik> I'll do
<Yorlik> Thanks!
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo1 has joined #ste||ar
<heller_> the problem is: you think you got it now, then try to code such a data structure and fail miserably
K-ballo1 is now known as K-ballo
<Yorlik> I think it comes down to that proverb: "Sharing is the source of all contention"
<Yorlik> These lock free techniques are optimizations - but you can't get rid of the fundamental problem if you don't design for it
<heller_> oh, performance is only secondary
<Yorlik> Reducing shared stuff as much as possible is the first thing to do, I believe.
<heller_> actually making them work correctly is the hardest part
<Yorlik> memory explosions? :)
<Yorlik> UB all over the place?
<heller_> yes, UB in the form of data races
<Yorlik> I guess I'm going to see my share of pink elephants ...
<heller_> you will
<heller_> best is to try to avoid writing your own concurrent data structures for now
<heller_> you have to unlearn what you learned in kindergarten: caring is not sharing
<Yorlik> I'll probably just cobble existing stuff together, indeed.
<Yorlik> I'll need a concurrent set, concurrent vector and concurrent map - however, it's not enough to just use that stuff - integrating thread-safe pieces into my use case can still give me races all over the place.
<Yorlik> I really have to do a ton of learning anyways
<heller_> just something upfront: lockfree/waitfree is not about performance!
<heller_> you can often get away with something using plain old mutexes
<Yorlik> I mean correctness comes first, right?
<Yorlik> And then ... ?
<heller_> well, correctness is a precondition
<Yorlik> What is it about then from your view?
<heller_> execution constraints. They mostly come out of real time systems, where you need to have an upper bound
<Yorlik> Raw throughput is not all, yes.
<heller_> depending on the actual use case, the performance can be way better than the traditional mutex approach, of course
<heller_> so, a wait free algorithm is essential in safety critical real time systems. Most get away with lock free
<Yorlik> My plan is to get together a first implementation which has internal interfaces in a way, that I can swap out elements easily later if I need optimizations
<heller_> sure
<Yorlik> We won't be able to have wait free
<Yorlik> Too much contention sooner or later
<Yorlik> contention in an MMO is really changing a lot
<heller_> fun fact: if you read papers on lock free and wait free algorithms, they are mostly using plain math to prove the properties
<heller_> you have a soft real time system
<Yorlik> I'm not sure you can make statements about wait free without contention metrics
<heller_> but I'd guess that you read concurrently way more often than you write
<Yorlik> yep
<heller_> so the first step here is a reader/writer (shared) mutex
<heller_> with which you can get very far, I think
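The reader/writer mutex heller_ mentions maps onto std::shared_mutex: many concurrent readers, one exclusive writer at a time (a minimal sketch with made-up names):

    #include <cstdint>
    #include <map>
    #include <shared_mutex>

    std::shared_mutex guard;
    std::map<std::uint64_t, int> state;

    int read_value(std::uint64_t key)            // readers share the lock
    {
        std::shared_lock lock(guard);
        auto it = state.find(key);
        return it != state.end() ? it->second : 0;
    }

    void write_value(std::uint64_t key, int v)   // writers take it exclusively
    {
        std::unique_lock lock(guard);
        state[key] = v;
    }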
<Yorlik> Would that work with non-concurrent queues?