aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
eschnett has quit [Quit: eschnett]
vamatya has quit [Ping timeout: 268 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
jaafar_ has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
jaafar has quit [Read error: Connection reset by peer]
jaafar has joined #ste||ar
jaafar has quit [Quit: Konversation terminated!]
jaafar has joined #ste||ar
jaafar has quit [Remote host closed the connection]
jaafar has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
eschnett has joined #ste||ar
EverYoung has joined #ste||ar
daissgr has quit [Ping timeout: 268 seconds]
EverYoung has quit [Read error: Connection reset by peer]
daissgr has joined #ste||ar
<hkaiser> heller_, simbergm: if you read this: the coroutine patch makes things hang for me at startup when the application is run with --hpx:threads=1
hkaiser has quit [Quit: bye]
jaafar has quit [Ping timeout: 240 seconds]
jaafar has joined #ste||ar
daissgr has quit [Ping timeout: 240 seconds]
gedaj has joined #ste||ar
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
EverYoung has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
jaafar has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
EverYoun_ has joined #ste||ar
EverYoung has quit [Read error: Connection reset by peer]
EverYoun_ has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
<github> [hpx] msimberg created revert-3131-fixing_2325 (+1 new commit): https://git.io/vNbbt
<github> hpx/revert-3131-fixing_2325 58f7716 Mikael Simberg: Revert "Fixing #2325"
<github> [hpx] msimberg opened pull request #3136: Revert "Fixing #2325" (master...revert-3131-fixing_2325) https://git.io/vNbb3
<github> [hpx] msimberg closed pull request #3136: Revert "Fixing #2325" (master...revert-3131-fixing_2325) https://git.io/vNbb3
gentryx has quit [Ping timeout: 246 seconds]
gentryx has joined #ste||ar
gentryx has quit [Ping timeout: 252 seconds]
gentryx has joined #ste||ar
david_pfander has joined #ste||ar
EverYoung has joined #ste||ar
marco has quit [Ping timeout: 260 seconds]
<github> [hpx] StellarBot pushed 1 new commit to gh-pages: https://git.io/vNNty
<github> hpx/gh-pages ebc9875 StellarBot: Updating docs
<github> [hpx] msimberg opened pull request #3137: Suspend speedup (master...suspend-speedup) https://git.io/vNNqi
hkaiser has joined #ste||ar
<hkaiser> heller_: #3126 makes apps hang at startup when run with one thread
<simbergm> hkaiser: release/debug? does it hang with any example or test, or something of your own?
<hkaiser> debug, one app of my own, but I can try with hello_world later today
<hkaiser> I don't think my app is doing anything unusual, even at it works after I'm rolling back to before #3126
<simbergm> okay, just tried hello_world in release and it was find, but compiling debug now
<hkaiser> as*
<simbergm> it was fine
<hkaiser> could be windows specific - shrug
<simbergm> okay, so it's definitely that PR at least
<hkaiser> yes
<K-ballo> activity on #hpx@cpplang slack
<hkaiser> tks
zombieleet has joined #ste||ar
<heller_> hkaiser: ugh, has to be some windows thing
<heller_> hkaiser: buildbot looks fine in that regard
<hkaiser> could be, during startup one of the scheduled threads never gets executed
<hkaiser> buildbot does not run using one thread
<heller_> hkaiser: did you verify the same happens with hello world?
<heller_> it does
<hkaiser> not yet, but I will
<heller_> the only thing I can think of is the state_ex change that might have caused this
<hkaiser> just rolling back fixes things, though
<heller_> let me prepare a patch to check that
<hkaiser> ok
<heller_> hkaiser: fix_3126
<github> [hpx] sithhell created fix_3126 (+1 new commit): https://git.io/vNNnf
<github> hpx/fix_3126 ee35a95 Thomas Heller: Partially reverting #3126...
<hkaiser> heller_: thanks, will try
<hkaiser> heller_: update: hello_world hangs using master with any number of threads
<hkaiser> now I will try your patch
<heller_> debug build, right?
<hkaiser> yes
<heller_> release works fine?
<hkaiser> have not tried yet
<hkaiser> compiling
<hkaiser> the patch is not compiling, btw
<heller_> yup
<heller_> give me another second
<heller_> hkaiser: should work now ...
<github> [hpx] sithhell force-pushed fix_3126 from ee35a95 to f5b9b9c: https://git.io/vNNc6
<github> hpx/fix_3126 f5b9b9c Thomas Heller: Partially reverting #3126...
<heller_> that's what you get when doing things in a hurry
<heller_> hello world runs for me
<hkaiser> release hangs as well
<hkaiser> but at different spots
<hkaiser> debug always hangs in the same place, I'll investigate later today
hkaiser has quit [Quit: bye]
eschnett has quit [Quit: eschnett]
hkaiser has joined #ste||ar
<hkaiser> heller_: update: fix_3126 hangs, still
<hkaiser> heller_: my guess would be that this is caused by an uninitialized variable
zombieleet has quit [Ping timeout: 252 seconds]
mbremer has quit [Quit: Page closed]
<hkaiser> heller_: something is screwed up with the context switch
daissgr has joined #ste||ar
eschnett has joined #ste||ar
ct-clmsn has joined #ste||ar
aserio has joined #ste||ar
daissgr has quit [Ping timeout: 256 seconds]
<jbjnr> simbergm: you are running your own pycicle?
<heller_> hkaiser: no idea what's wrong then ... coroutine_impl::operator() is then the only culprit I could think of ...
<heller_> but I miss what's leading to this then...
daissgr has joined #ste||ar
daissgr has quit [Ping timeout: 240 seconds]
mbremer has joined #ste||ar
<mbremer> Hi all, I had a design question about serializing classes. Right now I have a class with a reference as a member variable. Presumably this won't serialize or even really allow the default constructor to work. Should I just switch these references to raw pointers, intended as observers?
<K-ballo> what does it represent?
<simbergm> jbjnr: yes! sorry for the noise on cdash, I thought I'd give it a try
daissgr has joined #ste||ar
<simbergm> I did not hack into your account...
<zao> That's what one that hacked _would_ say.
<simbergm> I added a bunch of sanitizer builds for master now, might look a bit ugly right now
<mbremer> Well so it represents data that's shared between a bunch of the same objects.
<mbremer> In this case it represents an integration quadrature rule, which is the same across all elements, so replicating the data would be tedious
<mbremer> -- or expensive -- or lead to bad cache behavior
<K-ballo> and these objects are not assignable
<mbremer> The one I'm trying to serialize?
<K-ballo> so, when serializing a bunch of them at once, you'd expect each shared instance to be instantiated just once?
<K-ballo> yeah, that wasn't a question
<mbremer> Exactly
<K-ballo> you want what in serialization is known as pointer semantics, but IIRC we departed from that model when we moved away from boost.serialization
<K-ballo> I don't think we do any tracking anymore, but I'm nowhere near sure
<mbremer> I'm also shipping the container class that contains all of the small objects that are being serialized and the single-instance objects
<K-ballo> in any case, you wouldn't need to switch references to pointers, just serialize them as pointers
<mbremer> What will they then point to, when I unserialize them?
<mbremer> Will it just be undefined until I assign the reference?
daissgr has quit [Read error: Connection reset by peer]
<K-ballo> IIRC we have some support for non-default constructible serializable stuff, somewhat recent
<K-ballo> I'm missing a lot of the important details, maybe heller_ can help
<mbremer> Is it clear what I'm trying to do? I can write up a gist real quick
<K-ballo> it is to me, I'm just not very knowledgeable of the current serialization implementation
<mbremer> Thanks @K-ballo!
<K-ballo> load_construct_data, I think, is what would let you serialize without default construction
<K-ballo> with traditional semantics, serializing via pointer would serialize the first time an address is seen, then back reference to it.. on the other side, identity would be preserved
<K-ballo> so you'd simply serialize/deserialize the shared stuff via pointer
<K-ballo> but I remember being on the losing side of that argument, so I don't think we have those semantics anymore
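The "traditional" tracked-pointer semantics K-ballo describes (serialize the pointee the first time an address is seen, then emit a back reference) can be sketched roughly as below. This is a toy illustration only: `rule`, `record`, `tracking_writer` and `tracking_reader` are hypothetical names invented for this sketch and are not part of HPX or Boost.Serialization.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical shared data, e.g. a quadrature rule used by many elements.
struct rule
{
    double weight;
};

// One entry in a toy "archive": a full payload or a back reference.
struct record
{
    bool is_backref;
    std::size_t id;
    double payload;    // only meaningful when is_backref is false
};

// Writer side: remembers every address it has already written.
struct tracking_writer
{
    std::map<rule const*, std::size_t> seen;
    std::vector<record> out;

    void save(rule const* p)
    {
        auto it = seen.find(p);
        if (it != seen.end())
        {
            // Address seen before: emit only a back reference.
            out.push_back(record{true, it->second, 0.0});
            return;
        }
        // First sighting: assign the next id and emit the payload.
        std::size_t id = seen.size();
        seen.emplace(p, id);
        out.push_back(record{false, id, p->weight});
    }
};

// Reader side: ids arrive in order of first appearance, so a vector
// indexed by id is enough to restore object identity.
struct tracking_reader
{
    std::vector<std::shared_ptr<rule>> by_id;

    std::shared_ptr<rule> load(record const& r)
    {
        if (!r.is_backref)
            by_id.push_back(std::make_shared<rule>(rule{r.payload}));
        return by_id.at(r.id);
    }
};
```

With this scheme, saving the same `rule` address twice and loading both records yields two pointers to one object on the receiving side, which is the identity-preserving behaviour discussed above.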
EverYoung has quit [Ping timeout: 276 seconds]
<mbremer> Where is load_construct_data?
aserio has quit [Ping timeout: 246 seconds]
<K-ballo> IIUC it would be an ADL customization point, just like serialize
<K-ballo> uhm... that example ... looks bogus
<K-ballo> hkaiser: is serialize called after load_construct_data, rather than instead of?
<mbremer> Looking through the source code, it seems like there are two forms of pointer serialization, tracked and untracked.
<mbremer> I wonder if that relates to those traditional vs simplified semantics
<K-ballo> tracked sounds like what you want
<mbremer> It should be easy enough to write up a unit test and see if it's working.
<K-ballo> there should already be a unit test if the semantics are supported
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser> K-ballo: I don't remember :/
<hkaiser> what does the test do?
<K-ballo> I have not run the test, but it constructs the thing with a dummy value rather than one from the archive
<K-ballo> presumably one could still do the serialization in load_construct_data, and then let serialize be a no-op
<github> [hpx] hkaiser deleted completion_handlers at ba143e4: https://git.io/vNN6V
<hkaiser> K-ballo: I think that was the idea
<hkaiser> load_construct was meant to do the work related to serializing things externally to the object, while serialize does the 'internal' things
<hkaiser> it's a weak distinction, but there you have it
<hkaiser> heller_: so #3126 is completely borked for me :/ I'd have to step through assembly to follow what's wrong with the context switches
<hkaiser> could be some stack overwrite or something
<heller_> hkaiser: very strange ... I don't get it at all :/
<hkaiser> context switching is different on window
<hkaiser> s
<heller_> sure, but I didn't touch the actual switching code
<hkaiser> you might see the same when using boost context, I think that uses fibers on windows as well
<hkaiser> not sure
<heller_> rostam uses boost.context for some runs
<hkaiser> nod
<heller_> and it doesn't show problems
<hkaiser> nod
<hkaiser> something is off, though
<heller_> ok
<heller_> is it how the result is bound, maybe?
<heller_> or the yield value, more specifically?
<hkaiser> shrug, one of the scheduled threads (not the first one, strangely enough) simply comes out of the context switching at the wrong spot
<hkaiser> while switching away from the scheduler
<hkaiser> so no yielding is involved
<hkaiser> it's a new thread
<heller_> so it's a new thread that is being yielded to?
<hkaiser> it's a new thread that is being executed
<heller_> or more precisely, being switched to
<hkaiser> not sure if it's some leftover from a previous use of the stack?
<heller_> sounds so
<hkaiser> would explain why it's not the first thread failing
<heller_> yes
<heller_> so thread rebinding is broken on windows?
<hkaiser> heller_: can we disable stack reuse for a second?
<heller_> sure
EverYoung has joined #ste||ar
<heller_> if you comment that out, it should be disabled
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<heller_> hkaiser: so yes, rebind_stack is more or less empty for the windows fiber implementation
<heller_> which would explain what you see ... no idea why it shows up only now though
<hkaiser> heller_: let me try that
<hkaiser> heller_: yep, that makes it run
jaafar has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
daissgr has joined #ste||ar
EverYoung has joined #ste||ar
aserio has joined #ste||ar
<hkaiser> heller_, simbergm: btw, appveyor never passed the tests for #3126 because of this - it shouldn't ever have been merged
patg[[w]] has joined #ste||ar
<patg[[w]]> hkaiser: yt?
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<heller_> hkaiser: interesting.
<heller_> So it uncovered an unrelated bug
akheir has joined #ste||ar
david_pfander has quit [Ping timeout: 255 seconds]
<akheir> hkaiser: the examples are here: https://github.com/hapoo/blaze-hdf5
zombieleet has joined #ste||ar
<heller_> hkaiser: aha, I know what's going on...
<hkaiser> patg[[w]]: here
<heller_> hkaiser: the loop I removed was what made the windows context work without resetting the stack...
<hkaiser> heller_: ok, glad you found it
<heller_> The loop in coroutine_impl::operator()
<hkaiser> nod makes sense now
<patg[[w]]> hkaiser: see pm
<heller_> hkaiser: so, two options: 1) reenable that loop, this would also mean that we don't need the reset_stack for the other context switch implementations, I guess
<heller_> 2) implement stack resetting for windows fibers
<heller_> I am not sure if 2) is even possible
<hkaiser> yah, 2 is not possible as we don't know anything about the internals
<hkaiser> we could leave the loop in there conditionally for the fiber implementation only
<heller_> removing the loop was more a cosmetic change
<hkaiser> k
<heller_> hkaiser: (force) pushed the path
<heller_> patch
<github> [hpx] sithhell force-pushed fix_3126 from f5b9b9c to 66f0493: https://git.io/vNNc6
<github> hpx/fix_3126 66f0493 Thomas Heller: Partially reverting #3126...
<heller_> should work now
<K-ballo> going back to bind
aserio has quit [Ping timeout: 265 seconds]
<heller_> K-ballo: yeah ... the thread function signature change didn't work out as intended
<K-ballo> pity
<K-ballo> let's see if the mbs come back along
<hkaiser> heller_: the first partial roll-back was not necessary, was it?
<heller_> hkaiser: I think it is
<heller_> hkaiser: this failure was probably due to the change
<hkaiser> k
jaafar has quit [Remote host closed the connection]
jaafar has joined #ste||ar
<hkaiser> heller_: yep, that fixed it - thanks!
<heller_> hkaiser: great!
<github> [hpx] sithhell opened pull request #3138: Partially reverting #3126 (master...fix_3126) https://git.io/vNNAG
<heller_> glad we found it...
<heller_> brain debugging is the best...
daissgr has quit [Ping timeout: 256 seconds]
vamatya has joined #ste||ar
aserio has joined #ste||ar
<zao> braaaains
<diehlpk_work> zooombies?
<mbremer> Hi all, I'm looking more into the non_default_constructible serialization, how would I instantiate a non_constructible object from the input_archive?
<mbremer> Is that even possible? It seems like the non-default stuff is not really meant for lvalues...
<mbremer> Also, I suppose I could just add a default constructor, although I'd prefer not to at the moment
<heller_> the non default ctor serialization is currently only enabled if you serialize your stuff through a pointer, IIRC
<heller_> serializing anything that has a reference or pointer inside is always tricky
<K-ballo> really? what's the relation between non-default and by-pointers?
<heller_> that you can always default construct a pointer, but not an object without default constructor
<heller_> more or less
<mbremer> Ah, ok. That's what it kind of smelled like to me.
<heller_> the problem is, that you need your object alive once you load your members
zombieleet has quit [Ping timeout: 248 seconds]
<heller_> at least that's how it is right now, when having pointers, you do a placement new inside the load_construct functions
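The idea of loading a non-default-constructible object through a pointer via placement new can be sketched generically as follows. This is only an illustration of the technique under stated assumptions: `no_default`, `toy_input`, `toy_load_construct` and `load_through_pointer` are hypothetical names made up here, and the sketch does not reproduce the actual HPX `load_construct_data` signature.

```cpp
#include <cassert>
#include <memory>
#include <new>

// A type without a default constructor.
struct no_default
{
    explicit no_default(int v) : value(v) {}
    int value;
};

// Hypothetical toy input "archive" that holds a single int.
struct toy_input
{
    int stored;
};

// Hypothetical hook: constructs the object in storage that the caller
// has already allocated, using data read from the archive.
void toy_load_construct(toy_input& ar, no_default* where)
{
    ::new (static_cast<void*>(where)) no_default(ar.stored);
}

// Loading "through a pointer": raw storage can always be obtained,
// even though the object itself cannot be default constructed.
std::unique_ptr<no_default> load_through_pointer(toy_input& ar)
{
    void* raw = ::operator new(sizeof(no_default));
    no_default* p = static_cast<no_default*>(raw);
    try
    {
        toy_load_construct(ar, p);
    }
    catch (...)
    {
        // Construction failed: release the raw storage and rethrow.
        ::operator delete(raw);
        throw;
    }
    // The unique_ptr takes over destruction and deallocation.
    return std::unique_ptr<no_default>(p);
}
```

The object is alive before any further members would be loaded into it, which is the constraint mentioned above.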
<heller_> so, you have two options: turn the reference inside your object into a pointer, then think about how you could transmit the information you need to reconstruct that pointer over the wire
<heller_> as I understand it, it's not something with dynamic memory management or similar
<mbremer> yeah yeah! ok, cool that's what I was thinking at first.
<heller_> K-ballo: pointer tracking is still there
<mbremer> I guess I'll just add the pointers and default constructor, and then just pepper the code with asserts :)
<heller_> so you can send the thing you are pointing to over the wire, but then, when it is being loaded on the receiving end, it will be 'new'-ed into existence, and you have to delete it eventually, which probably complicates your class destructor logic
<heller_> yeah, that'll work
<heller_> so how do you assign the reference normally?
<mbremer> So I have a mesh container, and each vertex gets a reference to its edges
<mbremer> so the mesh currently manages the lifetime of the edges, and then connects the vertices during construction
<heller_> and for that you need this reference?
<mbremer> Well the mesh is essentially a vector<vertices>
<mbremer> so the vertices have the references,
<mbremer> so either I transmit the references (or pointer if changed), or I just connect the vertices during the load
<mbremer> of the mesh
<heller_> that'll be up to you then
<heller_> and you have to decide what's more expensive
<mbremer> kk, yeah, my gut is telling me that shipping through the wire would be worse than a little extra compute...
<mbremer> But I'll also do that because it seems easier at the moment
<mbremer> Thanks!!
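The second option mbremer settles on (do not ship the references, reconnect the vertices while loading the mesh) might look roughly like this. It is a simplified sketch with one edge per vertex; `edge`, `vertex`, `mesh` and `load_copy` are hypothetical names, and `load_copy` merely stands in for a real deserialization step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct edge
{
    double length;
};

struct vertex
{
    std::size_t edge_index = 0;        // plain data that would be serialized
    edge const* edge_ptr = nullptr;    // observer, rebuilt after every load
};

struct mesh
{
    std::vector<edge> edges;           // the mesh owns the edges
    std::vector<vertex> vertices;

    // Re-establish the observer pointers from the stored indices.
    // Must be called again whenever `edges` is copied or reallocated.
    void connect()
    {
        for (vertex& v : vertices)
        {
            assert(v.edge_index < edges.size());
            v.edge_ptr = &edges[v.edge_index];
        }
    }
};

// Stand-in for deserialization: only indices and edge data "arrive",
// pointers are never transmitted and are rebuilt locally.
mesh load_copy(mesh const& wire)
{
    mesh m;
    m.edges = wire.edges;
    m.vertices = wire.vertices;
    for (vertex& v : m.vertices)
        v.edge_ptr = nullptr;
    m.connect();
    return m;
}
```

The cost is one pass over the vertices on the receiving side, in exchange for sending indices instead of tracked pointers.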
<diehlpk_work> hkaiser, Where is the hpx implementation located in blaze?
zombieleet has joined #ste||ar
<diehlpk_work> Ok, I will start to make the execution policy configurable in a HPX config header
bibek has joined #ste||ar
bibek has quit [Client Quit]
bibek has joined #ste||ar
parsa[w] has quit [Remote host closed the connection]
parsa[w] has joined #ste||ar
zombieleet has quit [Ping timeout: 256 seconds]
patg[[w]] has quit [Quit: Leaving]
<diehlpk_work> error: ‘dynamic_chunk_size’ is not a member of ‘hpx::parallel’
<diehlpk_work> Are there any recent changes in hpx?
<hkaiser> hpx::parallel::execution::...
<zao> Missing headers perchance?
<zao> Or rather, changed include nesting.
<diehlpk_work> static_assert(
<diehlpk_work> Yeah it is working with hpx::parallel::execution:: but this is not a proper executor type
<K-ballo> dynamic_chunk_size appears to be a function?
aserio has quit [Ping timeout: 276 seconds]
<diehlpk_work> No, it is not
<diehlpk_work> Not working for auto, dynamic, and static
<K-ballo> is it being interpreted as a function declaration perhaps?
<zao> More {} for the {} gods?
<K-ballo> hard to tell without context, but something in that output screams function
<diehlpk_work> Context
<diehlpk_work> hpx::parallel::execution::auto_chunk_size cs();
<diehlpk_work> for_loop( par.with(cs), size_t(0), threads, [&](int i)
<K-ballo> yeah, that's a function declaration
<K-ballo> a function called `cs` that takes no arguments and returns an `auto_chunk_size`
<K-ballo> drop the () or replace with {}
<diehlpk_work> Yes, my fault, thanks
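The pitfall K-ballo points out (empty parentheses after a name declare a function, not an object) can be reproduced without HPX. In the sketch below, `chunk_size_like` is a hypothetical stand-in type, not the real `auto_chunk_size`; the `static_assert`s check at compile time which declarations are functions and which are objects.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a chunk-size parameter type (not an HPX type).
struct chunk_size_like
{
    int value = 4;
};

// Declares a FUNCTION named cs_fn that takes no arguments and
// returns a chunk_size_like. No object is created here.
chunk_size_like cs_fn();

// Both of these define OBJECTS.
chunk_size_like cs_plain;
chunk_size_like cs_braced{};

static_assert(std::is_function<decltype(cs_fn)>::value,
    "T name(); is a function declaration");
static_assert(!std::is_function<decltype(cs_braced)>::value,
    "T name{}; defines an object");
static_assert(std::is_same<decltype(cs_plain), chunk_size_like>::value,
    "T name; defines an object");
```

Applied to the line from the log, dropping the `()` or replacing it with `{}` makes `cs` an object that can be passed to `par.with(cs)`.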
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
aserio has joined #ste||ar
aserio has quit [Ping timeout: 276 seconds]
eschnett has quit [Quit: eschnett]
hkaiser has quit [Quit: bye]
bibek has quit [Quit: Leaving]
ct-clmsn has quit [Quit: Leaving]
bibek has joined #ste||ar
hkaiser has joined #ste||ar
mbremer has quit [Quit: Page closed]
EverYoung has quit [Remote host closed the connection]