<hkaiser>
you should be able to work around the problem for now
<hkaiser>
give me a sec
<hkaiser>
try using -DHPX_WITH_DYNAMIC_HPX_MAIN=OFF with cmake
<hkaiser>
weilewei: ^^
<weilewei>
hkaiser will try later, thanks
<zao>
weilewei: /usr/bin/ld is the linker.
nikunj97 has quit [Ping timeout: 260 seconds]
hkaiser has quit [Quit: bye]
<weilewei>
zao I see
bita has quit [Ping timeout: 256 seconds]
Pranavug has joined #ste||ar
Pranavug has quit [Client Quit]
karame_ has quit [Quit: Ping timeout (120 seconds)]
weilewei has quit [Ping timeout: 240 seconds]
<Yorlik>
I'm getting HPX exceptions when trying to acquire a lock with an HPX spinlock, like: "lock_guard lock( lua_engine_lockable_ );" where lua_engine_lockable_ is an hpx lcos spinlock. The exception info says: "vector deleting destructor". I'm a bit clueless where to go from here.
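A minimal sketch of the locking pattern quoted above, assuming a plain `std::atomic_flag` spinlock as a stand-in for the HPX one. `demo_spinlock`, `demo_engine`, and `run_spinlock_demo` are hypothetical names, not HPX or project code. It only illustrates the `lock_guard` usage, with the owning object kept alive until every thread that locks it has been joined.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a spinlock; NOT the HPX implementation.
class demo_spinlock {
public:
    void lock() noexcept {
        // Spin until the flag was previously clear.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() noexcept { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical owner of the lockable member named in the chat.
struct demo_engine {
    demo_spinlock lua_engine_lockable_;
    long counter = 0;

    void touch() {
        // Same shape as the quoted line: the guard unlocks on scope exit.
        std::lock_guard<demo_spinlock> lock(lua_engine_lockable_);
        ++counter;
    }
};

// The engine is declared before the threads and all threads are joined
// before it is destroyed, so the lock never outlives its owner.
long run_spinlock_demo(int threads, int iterations) {
    demo_engine engine;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&engine, iterations] {
            for (int i = 0; i < iterations; ++i) engine.touch();
        });
    }
    for (auto& th : pool) th.join();
    return engine.counter;
}
```

If the owner were destroyed while a task still held or requested the lock, the behavior would be undefined; whether that is what happened here is not known from the log.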
mdiers_ has quit [Ping timeout: 264 seconds]
nikunj97 has joined #ste||ar
nikunj97 has quit [Ping timeout: 250 seconds]
hkaiser has joined #ste||ar
<Yorlik>
hkaiser: yt?
<hkaiser>
here
<Yorlik>
I'm hitting a wall with a very strange lockup.
<Yorlik>
I know sort of what is triggering it, but I have no clue why.
<Yorlik>
I tried to find if there is any sort of shared resource, but couldn't find one
<Yorlik>
The situation is like this:
<Yorlik>
I have a central run() function, which contains a while loop inside of which there is an update() function.
<Yorlik>
That's my central object update loop
<Yorlik>
It runs updates, and if there is time left at the end of the frame it does a busy wait
<Yorlik>
It works nicely and without problems so far.
<Yorlik>
Then I recently introduced a second path of updates.
<Yorlik>
These updates do not have a wait - so the framerate is unbounded.
<Yorlik>
I call them in run(), before entering the while loop
<Yorlik>
I receive a future from these, since they are started async.
<Yorlik>
They keep running until I give a signal through an atomic to halt the simulation
<Yorlik>
When this happens, this type of update exits, as well as the while loop which does the bounded framerate updates.
<Yorlik>
after the while loop, still within run() I collect the future of the unbounded updates.
<Yorlik>
The futures from the bounded updates are collected after each frame inside the while loop.
<Yorlik>
I can run either of these two paths
<Yorlik>
The decision which path is used for which type of object depends on the components.
<Yorlik>
Basically I am looping through all templates, but the updaters which do not meet the respective conditional are empty using an if constexpr()
<Yorlik>
This way I can decide which objects are updated where.
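A minimal sketch of the compile-time dispatch described above, assuming each entity type carries a boolean trait that selects its update path. The trait name `frameless` and the types `demo_npc` and `demo_particle` are hypothetical; the real component-based condition is not shown in the log.

```cpp
#include <cassert>

// Hypothetical entity types; the `frameless` trait is an assumption
// standing in for the component-based condition mentioned in the chat.
struct demo_npc {
    static constexpr bool frameless = false;
    int updates = 0;
};

struct demo_particle {
    static constexpr bool frameless = true;
    int updates = 0;
};

// Bounded path: does work only for types that are not frameless.
// Returns true if this path actually updated the entity.
template <typename Entity>
bool update_bounded(Entity& e) {
    if constexpr (!Entity::frameless) {
        ++e.updates;
        return true;
    } else {
        (void) e;  // empty updater for this type
        return false;
    }
}

// Unbounded path: the mirror image of the bounded one.
template <typename Entity>
bool update_frameless(Entity& e) {
    if constexpr (Entity::frameless) {
        ++e.updates;
        return true;
    } else {
        (void) e;  // empty updater for this type
        return false;
    }
}
```

Both functions can be instantiated for every type; the branch that does not apply compiles to an empty body, so each entity is updated by exactly one path.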
<Yorlik>
I tested it and it seems to work nicely.
<Yorlik>
Now, when running both paths - even if one is never really doing updates, because I do not create objects - the unbounded path locks up.
<Yorlik>
Looking at the worker threads it seems only the path running the bounded updates is active and mostly sitting in its busy wait
<Yorlik>
The other worker threads seem to idle and sit in their schedulers waiting for work.
<Yorlik>
The lockup appears to happen inside the parallel loop after finishing a chunk of work.
<Yorlik>
My updater functions have quit, the executor has been destroyed, but nothing is happening.
<Yorlik>
It's like stuck in the for loop not scheduling any more chunks.
<Yorlik>
It also is not my on_enter or on_exit lambdas
<Yorlik>
They have quit and are not running when it hangs
<Yorlik>
I also had another bug - which is probably unrelated and I could not reproduce it anymore - an exception when trying to acquire a lock with a spinlock
<Yorlik>
From my perspective it looks like some weirdness deep inside hpx, hidden from me - but maybe I'm doing something obviously wrong and just don't see it.
<Yorlik>
So - that's pretty much it.
<Yorlik>
So hkaiser: that's the wall of text ^^ :)
<Yorlik>
I wish I better understood what these seemingly idling threads are actually doing, and how to interpret the state they are in when it hangs.
<hkaiser>
do you see any problems in debug? any memory issues? objects going out of scope prematurely?
<Yorlik>
I don't really know what to look for further.
<hkaiser>
the seemingly idling threads wait for new tasks to be created, nothing else
<hkaiser>
the 'lock-up' could be caused by a future you're waiting on, but that never gets ready
<Yorlik>
The two code branches work nicely independently from each other. That's the only hint I currently have
<Yorlik>
I let the server run for two hours without any problem
<hkaiser>
hmm
<Yorlik>
But as soon as I activate the bounded updates again it stops. It might be that, for some reason, it doesn't schedule the unbounded updates after some point, because no one asks for the future at the entry point
<hkaiser>
difficult to tell from here
<Yorlik>
Between the creation of the future and the checking of it lies the while loop for the bounded updates.
<Yorlik>
But it starts and then stops
<Yorlik>
As if the system wanted to tell me: If you don't ask for the future I'm not going to do anything anymore.
<Yorlik>
It runs briefly and stops after running some chunks from the parallel loop.
<Yorlik>
BTW: The unbounded callsite is like this: auto frameless_fut = hpx::async( &controller::update_frameless, this );
<hkaiser>
nah
<hkaiser>
the problem can happen if call get on a future that never becomes ready
<hkaiser>
*if you call get()*
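A minimal sketch of the failure mode described above, using `std::future` as a stand-in for the HPX future. The helper names are hypothetical. Calling `get()` on a future whose value is never set blocks forever; a timed wait shows the same condition without hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Returns true if the future became ready within the timeout.
template <typename T>
bool becomes_ready(std::future<T>& fut, std::chrono::milliseconds timeout) {
    return fut.wait_for(timeout) == std::future_status::ready;
}

// Nobody ever sets the value, so get() on this future would never
// return while the promise is alive; the timed wait reports "not ready".
bool demo_never_ready() {
    std::promise<int> p;
    std::future<int> f = p.get_future();
    return becomes_ready(f, std::chrono::milliseconds(50));
}

// The value is set before waiting, so the future is ready immediately.
bool demo_ready() {
    std::promise<int> p;
    std::future<int> f = p.get_future();
    p.set_value(42);
    return becomes_ready(f, std::chrono::milliseconds(50));
}
```

Replacing a blocking `get()` with a timed wait during debugging is one way to tell "still running" apart from "never completes"; whether that applies to the hang discussed here is an assumption.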
<Yorlik>
The while loop never stops running - just the loop inside the unbounded call.
<Yorlik>
Again - if I switch off the while loop it works
<Yorlik>
So it reaches the future.get and waits for it
<Yorlik>
After the bounded update while loop: frameless_fut.get( );
<hkaiser>
the unbounded thread runs all the time ? or is it created for each frame?
<Yorlik>
It runs the updaters for each type async, collects the futures and starts over
<hkaiser>
what does that mean 'it starts over'?
<Yorlik>
so the while loop keeps spinning - I checked it
<hkaiser>
ok
<Yorlik>
The while loop runs one frame after the other
<hkaiser>
how does it make the future ready?
<Yorlik>
the while loop has its own set of futures
<Yorlik>
the bounded path inside that while loop and the unbounded path are independently synchronized
<Yorlik>
They have their own futures to manage what's async
<hkaiser>
you said the unbounded runs until an atomic is set
<Yorlik>
Fundamentally both paths are exactly the same - just different functions
<Yorlik>
The unbounded one has no busy loop after a frame - that's all
<hkaiser>
where is that atomic set?
<Yorlik>
inside the unbounded one and at the top of the while loop for the bounded one are checks for the atomic
<Yorlik>
The atomic is set from the outside - but no one writes to it
<Yorlik>
Only if I use the admin client to pause the simulation
<Yorlik>
Or if I shut down the server
<hkaiser>
ok, let's recap
<hkaiser>
you launch the unbounded one, get back a future, then do the bounded work, and then wait for the future returned from the unbounded one
<Yorlik>
The wait for the unbounded future is only ever reached, if the bounded work has finished.
<hkaiser>
is that correct?
<Yorlik>
Basically yes
<hkaiser>
so you relaunch unbounded work on each frame?
<Yorlik>
Just the while loop runs continuously
<Yorlik>
Yes - it's done inside the while loop
<Yorlik>
The while loop never exits as long as the simulation is running
<hkaiser>
you get a new unbounded future for each frame?
<Yorlik>
I only ever ask for the unbounded future when the simulation is halted
<Yorlik>
it has its own while loop inside
<hkaiser>
you lost me
<Yorlik>
So there are two while loops
<hkaiser>
on two tasks?
<Yorlik>
Yes
<hkaiser>
ok
<Yorlik>
One is inside the async I posted
<Yorlik>
the other is in run()
<hkaiser>
run does the bounded work?
<Yorlik>
So run spawns the unbounded task before it enters the bounded while
<hkaiser>
ok
<Yorlik>
after the bounded while, on pause state the unbounded future is collected
<Yorlik>
and then run() exits
<hkaiser>
what does that mean?
<hkaiser>
'future is collected'?
<Yorlik>
I am not exiting run() before all work is finished
<Yorlik>
It's just for synchronization
<hkaiser>
so you call get() on the unbounded future
<Yorlik>
exactly
<hkaiser>
ok
<Yorlik>
its value is discarded. I think it's a char right now
<hkaiser>
sure, np
<Yorlik>
Could also be void - no one ever cares
<hkaiser>
so where does it lock up?
<Yorlik>
It's like a sandwich: start unbounded - start bounded - collect unbounded
<hkaiser>
ok
<Yorlik>
It locks in the middle of running
<Yorlik>
So - no server pause is triggered
<hkaiser>
'middle of running' ?
<Yorlik>
In theory the center while should run doing bounded updates and the task for unbounded updates should do its thing
<hkaiser>
which task locks up? the bounded or the unbounded one?
<Yorlik>
The unbounded one
<Yorlik>
The bounded one keeps ticking nicely
<hkaiser>
how can it lock up if it doesn't do any synchronization?
<Yorlik>
It hangs inside the parallel loop after exiting a chunk
<Yorlik>
It looks as if no more chunks are launched
<hkaiser>
so the unbounded task runs a parallel for?
<Yorlik>
Despite not being finished
<Yorlik>
Yes - both do
<hkaiser>
and it never exits the loop?
<Yorlik>
Not until the server is halted
<hkaiser>
how many chunks does it run?
<Yorlik>
Again - just by switching off the middle while loop of the bounded updates it starts working correctly
<Yorlik>
2-3 chunks or so
<hkaiser>
ok
<hkaiser>
let's go back again
<Yorlik>
both paths use different specializations of the par loop
<hkaiser>
you launch unbounded work for each frame?
<Yorlik>
I was thinking if something is shared, but there isn't
<Yorlik>
essentially the unbounded updates run in frames too, but without a frequency limit
<hkaiser>
so you launch unbounded work for each frame?
<Yorlik>
after each frame it starts over instead of trying to sync with a framerate
<hkaiser>
a new async for each frame?
<Yorlik>
no
<Yorlik>
I did both
<hkaiser>
you lost me again
<Yorlik>
async and not async
<Yorlik>
it doesn't make a difference
<Yorlik>
I do not spawn additional tasks inside the unbounded updates, except what the parloop does.
<hkaiser>
ok
<hkaiser>
one bounded async per frame?
<Yorlik>
One per entity type
<hkaiser>
I'm lost, sorry
<Yorlik>
The entities are different types - they are distributed between the two update methods.
<Yorlik>
Each entity type has its own parloop
<Yorlik>
So inside the managing loop which checks if the server is running, one function specialization for each entity type is started to do the updates
<hkaiser>
that does not matter at all
<Yorlik>
Indeed.
<Yorlik>
Just saying - that's the structure
<Yorlik>
In the bounded updates these functions are asyncs
<hkaiser>
pls create a 10 liner that reproduces the execution structure and relation between tasks
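A rough sketch of the execution structure described in this conversation ("start unbounded - start bounded - collect unbounded"), written with `std::async` and `std::thread` as stand-ins for the HPX calls. It is not the requested reproducer and does not use the HPX scheduler, so it is not expected to show the hang. `demo_controller`, `run_sandwich_demo`, and the `max_frames` cut-off (standing in for the admin pause) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

struct demo_controller {
    std::atomic<bool> running{true};         // halt signal checked by both loops
    std::atomic<long> frameless_frames{0};
    long bounded_frames = 0;

    // Unbounded path: its own while loop, no frame limit.
    char update_frameless() {
        while (running.load(std::memory_order_acquire)) {
            ++frameless_frames;              // placeholder for per-type updates
            std::this_thread::yield();
        }
        return 0;                            // value is discarded by the caller
    }

    void run(int max_frames, std::chrono::milliseconds frame_time) {
        using steady = std::chrono::steady_clock;

        // 1. Start the unbounded task before entering the bounded loop.
        auto frameless_fut = std::async(std::launch::async,
            &demo_controller::update_frameless, this);

        // 2. Bounded loop: one frame of work, then busy wait to frame end.
        int frame = 0;
        while (running.load(std::memory_order_acquire)) {
            auto frame_end = steady::now() + frame_time;
            ++bounded_frames;                // placeholder for per-type updates
            while (steady::now() < frame_end) { /* busy wait */ }
            if (++frame >= max_frames) {
                running.store(false, std::memory_order_release);
            }
        }

        // 3. Collect the unbounded future only after the bounded loop exits.
        frameless_fut.get();
    }
};

long run_sandwich_demo(int frames) {
    demo_controller c;
    c.run(frames, std::chrono::milliseconds(2));
    return c.bounded_frames;
}
```

One difference worth noting: here the busy wait occupies a dedicated OS thread, whereas in the setup described above it runs on a scheduler worker thread, which may matter for the behavior being debugged.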