hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<K-ballo> simbergm: clang-3.8-boost-1.58.0-c++11-Release (cray(daint)) complains about having a misconfigured cpu count
<heller_> K-ballo: 7
<K-ballo> 8, I win!
<K-ballo> oh, the gsoc question
<heller_> ;)
<K-ballo> simbergm: if my reading of the cdash is correct, it seems we are continuously rebuilding all open PRs, some of which have been abandoned for months
<K-ballo> should we prune PRs? and branches?
<heller_> hkaiser: Yorlik: We'd have to solve the problem of potential clashes of GIDs in AGAS when restoring *any* component. One possible way of solving this, I think, is to persist not the GID but an application-specific symbolic name, uniquely defined by the application. That would also solve the problem of having different versions of the objects
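(A minimal sketch of the symbolic-name idea, assuming HPX's hpx::agas::register_name / resolve_name interface; the component type game::object is taken from later in the log, and the restore/lookup helpers are hypothetical:)

    #include <hpx/hpx.hpp>
    #include <string>

    // On restore, create a fresh component and re-register it under the stable,
    // application-chosen name instead of persisting the old GID.
    hpx::id_type restore_object(std::string const& symbolic_name)
    {
        hpx::id_type id = hpx::new_<game::object>(hpx::find_here()).get();
        hpx::agas::register_name(hpx::launch::sync, symbolic_name, id);
        return id;
    }

    // Clients resolve by name, so a changed GID never leaks into persisted state.
    hpx::id_type lookup_object(std::string const& symbolic_name)
    {
        return hpx::agas::resolve_name(hpx::launch::sync, symbolic_name);
    }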
<hkaiser> heller_: or ensure uniqueness at runtime
<Yorlik> Hashing my creation time ns+locality
<heller_> what if you restore a component with a GID that is currently in use?
<Yorlik> Hashing ofc always has a certain risk of clashing
<heller_> i like the idea of the segments
<Yorlik> Why does the locality have to be a part of the ID anyways? Couldn't we just have a globally unique ID in the first place and make the currently responsible AGAS locality a field in the id_type, since it's a struct anyways? If an AGAS query failed, a fallback to a global instance could force a live update of the AGAS field.
<Yorlik> It would be like another lazy cache level
<Yorlik> allowing objects to migrate their AGAS administrative locality would then be a matter of object policy
<Yorlik> For efficiency you would simply allow/disallow it on a per-object basis
<Yorlik> That would be application specific
<Yorlik> After a reload the possibly persistent AGAS fields would have cache misses ofc, but quickly be updated
<Yorlik> That's close to what I wrote in the CRUD section of suggestion A
<hkaiser> Yorlik: it's part of the id as an optimization to quickly find the locality that knows how to resolve the id to a local address
<Yorlik> After all we want AGAS responsibility and object locality to be the same where possible for most cases.
<heller_> our experiments show that cache misses are quite expensive operations. This might be a problem at scale. The current system allows to directly determine the service locality and serve the request without cache lookup
<Yorlik> Generally objects should not migrate their AGAS host ofc.
<Yorlik> Only in the special and slow case of store/reload cycles or for general optimization on long running simulations with object proximity constraints
<Yorlik> Generally objects shouldn't move their AGAS host.
<Yorlik> or you find a really good trick to do it fast.
<Yorlik> The problem I see here is to update the references
<Yorlik> And to not force every object to have its AGAS host random because of hashing
<Yorlik> It's really a tradeoff which should be configurable by the application I think.
<hkaiser> nah
<hkaiser> nobody will get this right
<Yorlik> You have the silver bullet ? ;)
<hkaiser> HPX ;-)
<Yorlik> lol - could you be more specific ?
<hkaiser> lol
<hkaiser> let's cross the bridge once we're there
* Yorlik pokes into the sore spots at times
<Yorlik> Lets have all id types listen to a messaging service
* Yorlik hides under a c++ manual
<hkaiser> you won't be able to pay for the overheads introduced by that
* Yorlik puts a linux manual on top of his cover
<Yorlik> We have a local cache for object-locality pairs, right ?
<hkaiser> Yorlik: no
<hkaiser> we have a local cache id->lva (local virtual address)
<Yorlik> If an objectID were a 64-bit int and a locality ID a 16-bit int, a local 1:1 cache for a million objects would be just 80 MB
<hkaiser> what are you trying to say?
<Yorlik> A local 1:1 object-locality cache could be cheap
<hkaiser> lol
<Yorlik> No more triangulating to AGAS - only after a miss
<hkaiser> define 'cheap'
<Yorlik> I mean whats 80 bits for a local cache entry?
<hkaiser> how long does it take to lookup things in such a cache?
<Yorlik> one memory read?
<hkaiser> and how do you keep those caches consistent?
<Yorlik> You don't
<Yorlik> You update lazily after a miss and use AGAS for that
<Yorlik> So - if a request fails you ask agas
<hkaiser> nod, I think that's how we use our local agas caches, they tend to be much smaller, though
<Yorlik> You probably just update as needed
<Yorlik> An unordered map can do the trick
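(A rough sketch of the lazily updated local cache being discussed, using a plain std::unordered_map; resolve_via_agas stands in for the slow, authoritative lookup and is purely illustrative:)

    #include <cstdint>
    #include <unordered_map>

    using object_id = std::uint64_t;
    using locality_id = std::uint16_t;

    locality_id resolve_via_agas(object_id id);   // hypothetical: the slow, authoritative path

    class locality_cache
    {
        std::unordered_map<object_id, locality_id> cache_;

    public:
        locality_id resolve(object_id id)
        {
            auto it = cache_.find(id);
            if (it != cache_.end())
                return it->second;                    // fast path: one local lookup
            locality_id loc = resolve_via_agas(id);   // miss: ask AGAS ...
            cache_[id] = loc;                         // ... and update lazily
            return loc;
        }

        void invalidate(object_id id) { cache_.erase(id); }   // e.g. after a failed request
    };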
<Yorlik> So - if you already have a cache. Is it really an issue to have AGAS not at the objects location?
<Yorlik> The problem I see here is, that the requirements for address resolution and caching might vary greatly between applications
<Yorlik> And it depends a lot on the frequency and amount of object migrations
<hkaiser> sure
<hkaiser> btw, your math above is wrong
<Yorlik> Woops - 80 megaBITS - right
<hkaiser> you end up with 80Mbytes if you store one byte of information per id
<Yorlik> True - 10 MB
<hkaiser> usually you want to store more than that
<Yorlik> An unordered map might be more efficient for a really huge address space.
<Yorlik> It was a bit of a shot from the hip. ;)
jaafar has quit [Ping timeout: 244 seconds]
eschnett has joined #ste||ar
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
eschnett has quit [Client Quit]
<Yorlik> Is there a way to create an object locally in an efficient way in a container, e.g. vector.emplace_back(my_hpx_component) ?
<Yorlik> that was pseudo-code ofc
<hkaiser> Yorlik: you want to create more than one object at a time?
<Yorlik> Many - and I want to have them in various containers - in this case it's actually an unordered map
<Yorlik> So - just wondering if i can avoid copying.
<Yorlik> Something like an HPX managed vector
<Yorlik> The objects are mostly used locally
<Yorlik> But they will get remote calls too
<Yorlik> That's the default new_ right?
<hkaiser> yes, just slightly different syntax
<hkaiser> new_<Component[]>(id, count, ...)
<hkaiser> instead of new_<Component>(id, ...)
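(A small usage sketch of the two forms, assuming game::object is a registered HPX component as used later in the log; the constructor argument and count are made up:)

    #include <hpx/hpx.hpp>
    #include <vector>

    void create_components()
    {
        // one instance, created on this locality
        hpx::id_type one = hpx::new_<game::object>(hpx::find_here(), 42).get();

        // 'count' instances created in one go; the result refers to all of them
        std::vector<hpx::id_type> many =
            hpx::new_<game::object[]>(hpx::find_here(), 1000, 42).get();
    }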
<Yorlik> Oh - is that a custom vector you created?
<hkaiser> no, a std::vector<id_type>
<Yorlik> I mean - they are not created at once
<hkaiser> they are created at the same time
<Yorlik> Could I emplace_back?
<hkaiser> why?
<Yorlik> new objects
<Yorlik> NPCs have babies .. ;)
<Yorlik> Gameobjects are created and destroyed all the time
<hkaiser> sorry, I don't understand
<Yorlik> It's a dynamic allocation - objects come and go - so the vector needs to be able to grow
<hkaiser> you create one instance of a component using new_<Component>(id, ...)
<hkaiser> you create more than one using new_<Component[]>(id, count, ...)
<hkaiser> why do you need emplace?
<Yorlik> For new objects later on
<hkaiser> Yorlik: the new_<> returns a vector<id_type> referring to the new objects, do with it whatever you want
<Yorlik> How would I grow it and add components to it?
<hkaiser> you have your own container and add the new ones to it
<Yorlik> emplace allows me to construct a new object in place in the vector - could I do that?
<Yorlik> I would like to construct in place to avoid copying
<hkaiser> so you want to control _where_ the new objects are created?
<Yorlik> Yes
<hkaiser> objects wouldn't be copied
<Yorlik> So the vector would always just store the id_types?
<hkaiser> new_ returns id_types, not the objects themselves
<Yorlik> That is a problem, if i need the objects locally and want to directly access them by pointer
<hkaiser> use get_ptr<Component>(id)
<Yorlik> I'd have to bookkeep a second vector with pointers
<hkaiser> if all you need is the pointers you can let go of the id_type
<Yorlik> Locally i would reference objects always by pointer
<hkaiser> no problem
<Yorlik> I was thinking about using an unordered map with id_types as keys.
<Yorlik> and having the objects as values
<hkaiser> you just told me that you don't want to have double-bookkeeping, i.e. ids and ptrs
<Yorlik> I want to do two things: Create objects efficiently at their final storage place and access them quickly.
<Yorlik> My thought was to use an unordered map <id_type, object> for it
<hkaiser> in principle you can design a different scheme of creating component instances, but currently components are created on the heap of the target locality and new_<> returns the id_type referring to it
<hkaiser> why would you need the id_type in that case?
<Yorlik> E.g. if I want to directly access methods or fields I currently need a pointer, right ?
<hkaiser> yes
<Yorlik> The current scheme is to have the id_type be the id of the object
<Yorlik> Accessing the object in an unordered map by hashed id_type seemed fast
<Yorlik> or am I thinking wrong here?
<hkaiser> Yorlik: the current scheme is equivalent to shared_ptr<foo> p = new foo; or shared_ptr<foo[]> p = new foo[count];
<Yorlik> If i want to pass around objects to a remote location I ofc want the id_type
<hkaiser> so ask the object for its id, if needed
<Yorlik> I need the ability to find them by id
<Yorlik> thats why a map
<Yorlik> But maybe you are right and I could just ditch that
<hkaiser> why not use the ids to access the components
<Yorlik> and consequently just use pointers
<Yorlik> I need to access the component members
<Yorlik> locally and fast
<Yorlik> no futures
<Yorlik> just object-> member
<hkaiser> well, 'fast' is relative
<Yorlik> futures are just for remote operations
<hkaiser> you will not know without measurement whether something is 'fast'
<Yorlik> true
<hkaiser> why are futures just for remote operations?
<Yorlik> But in a tight loop creating a ton of futures will have a cost, won't it?
<hkaiser> let the object do things instead of accessing its members
<Yorlik> It's an ecs
<Yorlik> the system responsible for the component of the object (ecs component, not hpx component) is zipping over an array and doing its thing
<hkaiser> instead of having foo(obj->data), have obj->foo() or async(foo_action, objid) and make sure foo has enough work
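(A minimal sketch of the obj->foo() / async(foo_action, objid) pattern, assuming the usual HPX component and action macros; all names are illustrative:)

    #include <hpx/hpx.hpp>

    struct object_server
      : hpx::components::component_base<object_server>
    {
        int data = 0;

        // do the work inside the component rather than reaching into its members
        void foo() { data += 1; /* ... enough work to amortize the call ... */ }

        HPX_DEFINE_COMPONENT_ACTION(object_server, foo, foo_action);
    };

    using object_server_type = hpx::components::component<object_server>;
    HPX_REGISTER_COMPONENT(object_server_type, object_server);
    HPX_REGISTER_ACTION(object_server::foo_action);

    // local or remote invocation goes through the action
    hpx::future<void> update(hpx::id_type const& obj_id)
    {
        return hpx::async<object_server::foo_action>(obj_id);
    }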
<Yorlik> That's killing the cache
<hkaiser> shrug, premature optimization
<Yorlik> if you have thousands of objects and want to update them in a frame ... every bit counts
<Yorlik> and zipping through an array gives additional hardware prefetch
<hkaiser> then having single objects as components is not the way to go
<Yorlik> the hardware understands I'm looping over an array and prefetches members
<Yorlik> ?
<hkaiser> wrapping each of your game objects into its own component is not the right way to go
<Yorlik> But they need to be accessible remotely
<hkaiser> *sigh*
<hkaiser> you want to have a free lunch and eat it too
<Yorlik> OFC I could write some messaging on top of it all.
<hkaiser> make the tiles into components, and let them locally dispatch work
<Yorlik> Gameobjects move out of tiles all the time
<hkaiser> shrug
<Yorlik> and into new ones
<hkaiser> need to think about this
<Yorlik> I'll think of sth. maybe emplacing into a map is indeed premature opt.
<Yorlik> I'll meditate on it. :)
<hkaiser> Yorlik: if you create a map<id_type, object> you duplicate AGAS as it is already responsible to do that mapping
<Yorlik> map(id_type)->member= 42
<Yorlik> actually .member, not ->
<hkaiser> agas::resolve(id)->member=42
<Yorlik> How fast is that compared to the map approach?
<hkaiser> it is a map ;-)
<Yorlik> Again - we're in the tightest and crammedest loop of the entire system here
<Yorlik> lol
<Yorlik> OK - you win -. thats funny
<Yorlik> why a map and not an unordered map?
<Yorlik> for plain resolving you don't need ordering and it's faster
<Yorlik> Just use an unordered map
<hkaiser> our measurements have shown so far that unordered_map does not make a big difference
<Yorlik> unordered map = O(1)
<hkaiser> unordered_map may give you O(1) access, but in reality it's k*O(1) with a big 'k'
<Yorlik> more or less
<Yorlik> Where does the k come from?
<hkaiser> also, unordered_map requires a good hash and you have to make sure it has the right number of buckets
<hkaiser> something I don't know how to generically solve
<Yorlik> true
<hkaiser> the k comes from hash collisions
<Yorlik> just take the least significant gid bits?
<Yorlik> OK
<hkaiser> and hash collisions turn the O(1) into O(N)
<Yorlik> Full O(N) or O(N)/somefactor
<hkaiser> I'm not saying its impossible, just its not 'just' a switch map --> unordered_map
<Yorlik> true
<hkaiser> O(N)/somefactor, but still O(N)
<Yorlik> if you sequentially create gids ...
<Yorlik> and you use the lowest gid bits to determine the bucket
<hkaiser> if, if, if
K-ballo has quit [Quit: K-ballo]
<hkaiser> that's why we have the GSoC project I linked above
<Yorlik> you'd get a pretty even distribution probably
<Yorlik> IC
<hkaiser> Yorlik: sure, and as soon as you have more ids than buckets you're back in O(N) land
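(A toy illustration of the bucket/hash point, assuming sequentially assigned 64-bit gids and using the low bits as the hash; purely illustrative:)

    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>

    struct low_bits_hash
    {
        std::size_t operator()(std::uint64_t gid) const noexcept
        {
            return static_cast<std::size_t>(gid);   // sequential gids spread evenly over buckets
        }
    };

    std::unordered_map<std::uint64_t, void*, low_bits_hash> index;

    void presize()
    {
        // pre-sizing keeps the load factor down; once there are many more ids
        // than buckets, chains grow and lookups degrade towards O(N)
        index.reserve(1'000'000);
    }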
<Yorlik> This summer the silver bullet of AGAS resolving will be forged :D
<hkaiser> no, we're not part of GSoC this year :(
<Yorlik> 15 Sorting Algorithms in 6 Minutes
<Yorlik> Including bogosort
<hkaiser> nod
<Yorlik> Audifying the algorithm .. :)
<Yorlik> I think I'll just use agas resolve - just because I'm lazy and I need to move on
<Yorlik> Argh - my typing is rotten since I'm using Discord...
<hkaiser> Yorlik: if needed we can optimize things once we know what's really slow
<hkaiser> including object placement and such
<Yorlik> I probably will be able to live with many little imperfections for a while - after what we had before it can only get better by orders of magnitude - no joke ! :)
<Yorlik> 2GB of Garbage within 2 seconds by some C# networking code using a ton of dynamic allocations ...
<Yorlik> I'm happy we are writing our own middleware now
<hkaiser> Yorlik: you can make any correct applications run fast, you can't make a fast application correct, though
<Yorlik> I'm a friend of the many little things that matter. Means - there is no such thing as premature optimizations. There are wrong prioritizations though, sure. ...
<hkaiser> I'm usually against optimization without measurement
<Yorlik> And don't forget - I'm still a C++ newbie .. you'll have to live with me thinking in weird ways for a while ;)
<Yorlik> Oh - we did measure in the past
<Yorlik> I built a cache inside lua to avoid engine calls, because access was 20 times faster
<Yorlik> for certain functions 100 times faster
<Yorlik> Now I think I should just dump my id_types in a set
<Yorlik> And if I ever need some optimized search just create indices
<Yorlik> hkaiser: hpx::agas::resolve ( hpx::launch::sync, obj_id_type ) doesn't allow me to reference members of the object like hpx::agas::resolve ( hpx::launch::sync, obj_id_type )->something. It returns an address type which is a struct again. How would I use this?
<Yorlik> the id type is created with: auto obj_id_type = hpx::local_new<game::object> ( ).get ( );
<hkaiser> yah, it's not that easy ;-)
<hkaiser> Yorlik: your best bet is still get_ptr<> I think, at least for now
<hkaiser> that uses resolve under covers
<Yorlik> I shall dig into that code .. thanks !
<hkaiser> we should optimize get_ptr<>(sync, id) to actually give you the pointer without ever creating a future
<hkaiser> I'll add that to the list
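(A sketch of the get_ptr<> route, reusing the game::object / ->something names from above; the pointer is only valid while the object lives on the calling locality:)

    #include <hpx/hpx.hpp>
    #include <memory>

    void poke(hpx::id_type const& obj_id)
    {
        // resolves the id to the local instance; the returned shared_ptr pins the object
        std::shared_ptr<game::object> p =
            hpx::get_ptr<game::object>(hpx::launch::sync, obj_id);

        p->something = 42;   // plain member access, no future at the call site
    }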
hkaiser has quit [Quit: bye]
<Yorlik> Do objects exist as long as I hold a valid id_type of them somewhere? Like a shared pointer?
<Yorlik> Unordered map, that is now: https://gitlab.com/snippets/1830425
<heller_> Yorlik: yes
<heller_> I second hkaiser's suggestion to only have the tiles as components
<heller_> Keep the entities as normal objects and have them move from tile to tile
<heller_> That makes it easy enough to also implement "remote" vision
<heller_> Meister Yuke used that technique
<heller_> You can even use techniques like adaptive refinement to further split up busy tiles in a quadtree/octree fashion
<heller_> And have the tiles migrate around to implement load balancing
<heller_> Like a game of life type of thing
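(A rough structural sketch of the tile-as-component idea; all names are invented, and actual tile migration would additionally need HPX's migration-enabled component base:)

    #include <hpx/hpx.hpp>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // plain (non-component) game entity, owned by exactly one tile at a time;
    // it would need HPX serialization support to be handed between localities
    struct entity
    {
        std::uint64_t id;
        float x, y;
    };

    // only the tile is an HPX component; it dispatches work over its entities locally
    struct tile_server
      : hpx::components::component_base<tile_server>
    {
        std::vector<entity> entities;

        void update(float dt)
        {
            for (entity& e : entities)
            {
                e.x += dt;   // tight, cache-friendly loop over plain data
            }
        }

        // an entity crossing a tile border is handed over to the neighboring tile by value
        void accept(entity e) { entities.push_back(std::move(e)); }

        HPX_DEFINE_COMPONENT_ACTION(tile_server, update, update_action);
        HPX_DEFINE_COMPONENT_ACTION(tile_server, accept, accept_action);
    };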
<simbergm> K-ballo: 1. yes, and 2. kind of?
<simbergm> I've just messed up the clang 3 builder for my bitset branch
<simbergm> we rebuild PRs on merges to master, and I might've triggered some extra builds by mistake
<simbergm> but PRs that have conflicts don't get built
<simbergm> but in general yes, fewer PRs would be nice for pycicle
david_pfander has joined #ste||ar
nikunj97 has joined #ste||ar
daissgr has quit [Ping timeout: 252 seconds]
daissgr has joined #ste||ar
K-ballo has joined #ste||ar
daissgr has quit [Ping timeout: 264 seconds]
daissgr has joined #ste||ar
nikunj97 has quit [Ping timeout: 240 seconds]
nikunj97 has joined #ste||ar
hkaiser has joined #ste||ar
<heller_> hkaiser: so my plan for tomorrow is to set up appear.in to show both my slides and my video, so whoever wants to listen can tune in ;)
<hkaiser> heller_: nice
<hkaiser> thanks
nikunj has joined #ste||ar
nikunj97 has quit [Ping timeout: 268 seconds]
nikunj has quit [Ping timeout: 250 seconds]
<Yorlik> heller_: which time?
<heller_> Yorlik: 14:00 tomorrow
<heller_> CET
<Yorlik> I'll try to watch - whats the topic?
<heller_> HPX ;)
<Yorlik> Heneral HPXness? ;)
<Yorlik> Woops - just had a race condition duplicate the HPX H and race to the beginning ... ;)
<heller_> my dissertation is about the HPX programming model, futurization and getting it to scale essentially
<Yorlik> hkaiser, heller_: You essentially think having the objects/entities be components might just be overkill and ineffective?
<Yorlik> heller_: So there will also be HPX internals and performance tidbits and thingamabobs?
<Yorlik> Or just the general model of chopping code into pieces and connect them with futures?
<heller_> the talk will only scratch the surface
<Yorlik> OK. I'll still watch. Can't learn enough :)
<heller_> it's only 30 minutes ... so no real chance, to get into detail too much
<Yorlik> Thats short, indeed.
<zao> heller_: CUDA 10.1 out, claims support for GCC 8.x and icc 19.0
<heller_> wee
diehlpk_mobile has joined #ste||ar
diehlpk_mobile has quit [Ping timeout: 250 seconds]
<diehlpk_work> jbjnr__, See pm
aserio has joined #ste||ar
bibek has joined #ste||ar
eschnett has joined #ste||ar
<aserio> heller_, hkaiser, jbjnr__, simbergm: Meeting?
<hkaiser> aserio: I'm in
<simbergm> just joined
<hkaiser> aserio: I used the link you sent around yesterday
<aserio> oops
<aserio> wrong link
<simbergm> e&compare3=61&value3=Pull_Requests&field4=buildstarttime&compare4=83&value4=0
nikunj has joined #ste||ar
<parsa> K-ballo: i want to call a global function from a function in a class and i need to pass some constexpr values declared in the class. is there anything else i can do beside using templates for this?
<K-ballo> why would you need templates, parsa?
<parsa> i don't know how else i can keep them static
<hkaiser> templates wouldn't keep the vars constexpr either
<parsa> i gave up on their constexpr-ness and tried to keep them static at least
<parsa> i can't have constexpr arguments, in a non-constexpr function, can i?
<K-ballo> needs more context
hkaiser has quit [Quit: bye]
<parsa> K-ballo: something like this: https://wandbox.org/permlink/0LDJkqwFwT9zTFgr
<K-ballo> parsa: and what's the problem? the missing definitions?
<K-ballo> you want a template<> to emulate inline variables?
<parsa> the problem is i need to pass er_i and fx_* to the global function and i want to know if there's anything else i can do besides having them as template args to keep them static
<parsa> at least, if i can't keep them constexpr
<K-ballo> I don't understand why you wouldn't be able to keep them static
<K-ballo> is it because of the missing definitions?
<K-ballo> or you don't want to have a template in the first place?
<K-ballo> I still can't spot what the underlying problem is
<parsa> sorry i'm lost... what i'm asking is that if i can't have a constexpr argument like `void radiation_cpu_kernel(/*stuff*/, constexpr integer d)` then what do i do?
<K-ballo> yeah, template params
<parsa> okay
<K-ballo> the part that threw me off was the "declared in the class"
<parsa> sorry, still can't articulate well
<parsa> K-ballo: thank you
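(A small sketch of the template-parameter route K-ballo suggests; the names loosely follow parsa's snippet but the types and values are assumptions:)

    #include <cstdint>

    using integer = std::int64_t;

    // the values arrive as non-type template parameters, so they stay
    // compile-time constants inside the function body
    template <integer er_i, integer fx_i>
    void radiation_cpu_kernel(/* runtime args */)
    {
        static_assert(er_i >= 0, "er_i is a compile-time constant here");
        // ...
    }

    struct radiation
    {
        static constexpr integer er_i = 0;
        static constexpr integer fx_i = 1;

        void run()
        {
            // pass the class's constexpr members as template arguments
            radiation_cpu_kernel<er_i, fx_i>();
        }
    };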
hkaiser has joined #ste||ar
<parsa> daissgr: ping
aserio has quit [Ping timeout: 264 seconds]
<Yorlik> just realized I have become a #pragma region junkie : https://imgur.com/a/W9gf4JB :)
aserio has joined #ste||ar
<Yorlik> Oh - EA made a high performance partial STL replacement with a BSD license: https://github.com/electronicarts/EASTL
david_pfander has quit [Quit: david_pfander]
david_pfander has joined #ste||ar
eschnett has quit [Quit: eschnett]
<K-ballo> yeah, that old thing
<Yorlik> Sounds like you think it's outdated?
<K-ballo> no, just not news
<Yorlik> For me it's all still new and thrilling.
* Yorlik has a newbie bonus on being thrilled.
<zao> EASTL has been around for ages, not sure how well it has evolved.
<Yorlik> ;)
<zao> It's not as much a "SC++L replacement" as their own data structures and junk, the same stuff that any game studio reinvents.
<zao> Important aspects tend to be different ideas about allocators and PODness.
<Yorlik> I was thinking about writing a custom allocator as a future optimization for the map holding our gameobjects. Then a friend pointed me there.
<Yorlik> It's all not short term important
<Yorlik> Just my general search for fast solutions.
<Yorlik> We are not yet in a stage to optimize these details.
<zao> The reason there's separate libraries is because the constraints of the standard C++ library are not always adequate for their niche.
<Yorlik> Yep.
<Yorlik> I had a longish discussion with said friend about the STL and gaming yesterday
<Yorlik> And the result essentially was, that gaming is a niche application, though it's mainstream now
<Yorlik> Considering the size of the game industry I think it makes perfect sense to have a specialized library for their needs, though the C++ std:: should also be as performance-optimal as possible (without sacrificing a great deal of maintainability and such)
<zao> If you're looking for inspirational/saddening reading, see what SG14 has achieved.
<Yorlik> SG14?
<zao> The committee's study group for games and low-latency.
<Yorlik> IC - got a link?
<zao> C++ is so extremely fractured and pulled in different directions that it's literally impossible to use :(
<zao> Yorlik: Not really, used to follow the mailing list for a while.
<K-ballo> every C++ domain is a niche application
<Yorlik> Makes sense.
aserio has quit [Ping timeout: 255 seconds]
akheir has quit [Quit: Konversation terminated!]
akheir has joined #ste||ar
<diehlpk_work> hkaiser, https://pastebin.com/Kv2rmPvf Can you please look into the phylanx build error
<diehlpk_work> Felix tries to compile phylanx on the RPI
<zao> 32-bit platform?
<diehlpk_work> yes, I think so
<diehlpk_work> Linux localhost 4.20.10-200.fc29.armv7hl #1 SMP Fri Feb 15 21:04:49 UTC 2019 armv7l armv7l armv7l GNU/Linux
<diehlpk_work> It seems to be 32bit, because it is a PI2
nikunj has quit [Remote host closed the connection]
jaafar has joined #ste||ar
aserio has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
<hkaiser> diehlpk_work: will do
<diehlpk_work> hkaiser, Felix added a ticket and it seems that the issues is that he uses 32bit
<zao> Tried on actual iron, system OpenMPI are indeed hecked.
<heller_> the question still is what's actually wrong with it
hkaiser has quit [Quit: bye]
<Yorlik> How would you iterate over objects created with CRTP and different types or hold them together in a container?
<heller_> you don't
<Yorlik> I'd rather like not to fall back to raw pointers or any types if humanly possible
<Yorlik> So it's as I guessed - die fast or die slow, but die :(
<heller_> "slow" is always an approach
<heller_> you won't die
<heller_> I promise
<Yorlik> It was more metaphoric, not really thinking of an implementation
<heller_> what's the usecase, how many types are we talking about?
<Yorlik> My systems implement an ISystem abstract interface i use to put them in a map with enums as keys.
<Yorlik> The number of systems is always limited
<Yorlik> But I need to iterate over them on various occasions
<Yorlik> Like when composing an entity, doing general stuff to it, and ofc in the central loop
<Yorlik> There are so few that I could just manually adjust the code as I create systems
<Yorlik> But its violating DRY
<Yorlik> I'd like to find an elegant programmatic way
<Yorlik> What i am doing now is, to add the typed data in the children of the ISystem
<Yorlik> The use of virtual functions is acceptable here
<Yorlik> Though I don't really like it.
<Yorlik> So i am postponing the typing to the children
<Yorlik> But actually the Systems are also typed
<Yorlik> It would be more concise somehow to use CRTP for Systems
<Yorlik> And feed the custom Component Type as Template Parameter
<Yorlik> I have a feeling I am thinking wrong here
<Yorlik> Certain methods, like creating a new component are identical, except for the typed data they use
<Yorlik> So it ~should be moved to the base class, which then would be a CRTP object
<Yorlik> And I could no longer hold it in a trivial container
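(A sketch of one common way to combine the two: CRTP for the shared typed code, with a plain ISystem base so everything still fits in one container; all names are illustrative:)

    #include <cstdint>
    #include <map>
    #include <memory>
    #include <vector>

    enum class system_id { movement, health };

    struct ISystem
    {
        virtual ~ISystem() = default;
        virtual void update(float dt) = 0;                        // type-erased entry points
        virtual void create_component(std::uint64_t entity) = 0;
    };

    // the CRTP layer holds the code that is identical except for the component data type
    template <typename Derived, typename ComponentData>
    struct system_base : ISystem
    {
        std::vector<ComponentData> data;

        void create_component(std::uint64_t /*entity*/) override
        {
            data.push_back(ComponentData{});   // shared "new component" logic; bookkeeping omitted
        }
    };

    struct movement_data { float x = 0, y = 0, vx = 0, vy = 0; };

    struct movement_system : system_base<movement_system, movement_data>
    {
        void update(float dt) override
        {
            for (auto& c : data) { c.x += c.vx * dt; c.y += c.vy * dt; }
        }
    };

    // iteration stays trivial because the container only sees the ISystem interface
    std::map<system_id, std::unique_ptr<ISystem>> systems;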
<diehlpk_work> Could be interesting for Phylanx
<diehlpk_work> wash[m], see pm
aserio has quit [Quit: aserio]
<zao> heller_: Good thing is, if I build OpenMPI with EasyBuild using all the options from Ubuntu, it fails in the same way.
hkaiser has joined #ste||ar
<zao> I'm not done testing, but it seems like the configuration flag that does this is "--enable-heterogeneous".
<zao> Considering that it's responsible for doing all sorts of struct layout and endianess stuff, it could very well be buggy.
<zao> heller_: In short, Ubuntu builds OpenMPI with a KNOWN BROKEN flag, @#$@#$$@#$@
<zao> Someone[TM] needs to punch them or Debian to fix this, as it's in universe.
<zao> I don't have high hopes for anything to get fixed in the distro, the Lua needed for Lmod has been broken for a good year and the fix is trivial.
<zao> I guess we need to pay for professional Canonical support :P