<heller1>
Yorlik: so you eventually throw in one of your element functions?
<heller1>
Did you try catching the exception yourself and see if that lockup persists?
hkaiser has joined #ste||ar
<ms[m]>
Yorlik: what heller said + the question is how do you launch your parallel for loop? with par(task)? do you ever get() the value from the future?
<Yorlik>
Alright - back. Had to take a nap for beauty and sanity :)
<Yorlik>
I can add the callsite to the report ofc - I'll do that right away.
<Yorlik>
And yes - there are occasions where I throw (standard exceptions like std::runtime_error) and catch the exception through the future. All futures are put into a list and checked for exceptions.
<Yorlik>
Unless I introduced a bug, there shouldn't be a single unchecked future. It has worked in the past.
<Yorlik>
OFC I can't catch anything at the loop callsite, since nothing ever arrives there.
<Yorlik>
heller1, ms[m], hkaiser ^^
<hkaiser>
Yorlik: we'll need a small reproducing case
<hkaiser>
Yorlik: our parallel algorithms all have tests for exception handling, so you must be doing something differently
<Yorlik>
What could I have done wrong? Forget to check a future and never call future.get()? Or not catch when calling .get()?
<hkaiser>
I'm not saying you did something wrong
<hkaiser>
I said you do something in a way we do not test
<Yorlik>
Probably. What I'm doing is std::move the futures between lists for delayed checking at the end of a frame.
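A minimal sketch of the pattern Yorlik describes: futures are moved into a pending list instead of being checked immediately, and only inspected at the end of the frame. The container and function names (pending_futures, end_of_frame) are made up for illustration, and HPX header paths vary between releases.

```cpp
#include <hpx/include/async.hpp>   // header path may differ across HPX versions

#include <exception>
#include <utility>
#include <vector>

std::vector<hpx::future<void>> pending_futures;   // hypothetical per-frame list

void schedule_work()
{
    // instead of calling get() right away, move the future into the list
    pending_futures.push_back(hpx::async([] { /* entity update */ }));
}

void end_of_frame()
{
    std::vector<hpx::future<void>> to_check = std::move(pending_futures);
    pending_futures.clear();

    for (auto& f : to_check)
    {
        try
        {
            f.get();   // rethrows any exception stored in the future
        }
        catch (std::exception const& e)
        {
            // log e.what() and decide how to recover
        }
    }
}
```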
<hkaiser>
Yorlik: why did you close the ticket?
<Yorlik>
Did I? Then it was an accident
<Yorlik>
Sorry for that.
<hkaiser>
ok, I'll reopen it
<Yorlik>
Probably when trying to get that code sample in
<hkaiser>
np
<hkaiser>
now please tell me, where is that exception thrown? in update_entity<>?
<Yorlik>
Must be
<Yorlik>
But not in there probably
<Yorlik>
Could also be inside the Lua call stack
<hkaiser>
so it's thrown in the loop body?
<hkaiser>
what exception is it?
<Yorlik>
I don't know - the endpoint on the HPX side has an error list and the debugger crashed each time I tried to open and inspect it
<Yorlik>
I'll retry and see what's actually there
<Yorlik>
But I don't have many sites where I throw
<hkaiser>
Yorlik: I looked now - we don't test exception handling for for_loop :/
<Yorlik>
Sorry for that ;)
<hkaiser>
that could be causing your problem
<Yorlik>
It wasn't my intention .... :o
<Yorlik>
So - do you think there's an obvious fix?
<hkaiser>
let's see
<Yorlik>
Debugger crashing again - it doesn't want me to see the errors
<Yorlik>
errors length was 1
<Yorlik>
hkaiser: Updated - I commented with the e.what() output
<Yorlik>
I gave up on asking the debugger and just rethrew and printed. Take that debugger !!! :)
<Yorlik>
If I read this correctly it didn't like me holding a spinlock while creating a new lua state
<Yorlik>
The first thing I do in "agns::luaengine::lua_engine::init" is to acquire a lock_guard with a spinlock as lockable.
<Yorlik>
And it seems I yielded while holding that lock, which apparently is not allowed.
<hkaiser>
Yorlik: now it makes sense why it fails in debug only
<Yorlik>
You solved the riddle?
<hkaiser>
we don't check for held locks in release
<Yorlik>
IC. I guess it's a protection mechanism to avoid deadlocks?
<hkaiser>
yes
<hkaiser>
you call yield while holding a lock
<Yorlik>
And init() is a pretty large function with a ton of possible output
<hkaiser>
can you unlock the lock while yielding?
<hkaiser>
you could use util::scoped_unlock<> to handle that
<Yorlik>
The main reason I protected it was that, when creating a lua_engine, I needed readable, ungarbled output from the Lua side, because part of the initialization requires me to run Lua scripts
<hkaiser>
no, it's called unlock_guard
<Yorlik>
I could try to just remove it entirely
<hkaiser>
hpx::cout should ungarble output
<Yorlik>
Not sure if it was really required
<Yorlik>
The output comes from Lua print
<hkaiser>
you can have the lock, just unlock it while yielding
<hkaiser>
(if possible)
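A hedged sketch of what hkaiser suggests: keep the spinlock for the serialized part of init(), but release it around the code that can suspend, using hpx::util::unlock_guard (it unlocks in its constructor and relocks in its destructor). Type and header names here are assumptions and differ between HPX versions (e.g. hpx::lcos::local::spinlock vs. hpx::spinlock).

```cpp
#include <hpx/lcos/local/spinlock.hpp>   // header path is version-dependent
#include <hpx/util/unlock_guard.hpp>     // header path is version-dependent

#include <mutex>

hpx::lcos::local::spinlock init_mutex;   // hypothetical lock protecting init()

void init()
{
    std::unique_lock<hpx::lcos::local::spinlock> l(init_mutex);

    // ... setup that must stay serialized ...

    {
        // unlocks in its ctor, relocks in its dtor
        hpx::util::unlock_guard<std::unique_lock<hpx::lcos::local::spinlock>> ul(l);

        // run the Lua scripts / anything that may yield or suspend the task
    }

    // the lock is held again from here on
}
```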
<Yorlik>
I think I need to find a way to get synchronized output from Lua
<hkaiser>
if you have to hold on to the lock while yielding, use ignore_lock to tell HPX not to bother checking
<Yorlik>
The problem is, even synchronized output would still end up garbled between the lua engines being created
<hkaiser>
Yorlik: whatever, that's not the problem we're trying to solve here
<Yorlik>
Could i use ignore lock locally for a single case?
<hkaiser>
yes
<Yorlik>
So it's hpx::ignore_lock(true/false) ??
<hkaiser>
no
<hkaiser>
it's an object that disables checking in its ctor and re-enables it in its dtor
<Yorlik>
Oh that's nice
<Yorlik>
So I just create one and it auto re-enables when it goes out of scope?
<hkaiser>
simply create a scope before calling yield with this variable inside: util::ignore_all_while_checking ignore_lock;
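Roughly what that looks like in context - a hedged sketch, since the exact header for ignore_all_while_checking depends on the HPX version (it used to live with the lock registration utilities):

```cpp
#include <hpx/util/register_locks.hpp>   // header path is version-dependent
#include <hpx/include/threads.hpp>

void yields_while_holding_a_lock()
{
    // ... spinlock is held here ...

    {
        // disables the debug-mode held-lock check for this scope only
        hpx::util::ignore_all_while_checking ignore_lock;
        hpx::this_thread::yield();
    }

    // checking is re-enabled once ignore_lock is destroyed
}
```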
<hkaiser>
I'll look into for_loop exception handling
<Yorlik>
Alright
<Yorlik>
Seems the last days are finally steering towards a result :)
<Yorlik>
It's kinda nice hpx has so much header-only code; it was easy for me to poke into it and generate all the output I needed without recompiling everything.
<hkaiser>
Yorlik: btw, exception handling for for_loop seems to be ok - would be interesting to see why it failed for you
<Yorlik>
So you'd have expected me to get an exception at the loop callsite?
<Yorlik>
hkaiser ^^
<hkaiser>
yes
<hkaiser>
IFF you call .get() on the returned future - that will rethrow the exception
<hkaiser>
but I'll add the test as it is missing
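For reference, a minimal sketch of the behaviour hkaiser describes, written with HPX 1.4-era names (hpx::parallel::for_loop and hpx::parallel::execution::par(task)); newer releases expose these under different namespaces. An exception thrown in the loop body ends up in the returned future and resurfaces when .get() is called:

```cpp
#include <hpx/include/parallel_for_loop.hpp>   // header path is version-dependent

#include <cstddef>
#include <exception>
#include <stdexcept>

void run_update(std::size_t count)
{
    auto policy =
        hpx::parallel::execution::par(hpx::parallel::execution::task);

    hpx::future<void> f = hpx::parallel::for_loop(policy,
        std::size_t(0), count,
        [](std::size_t /*i*/)
        {
            // stand-in for update_entity<> throwing
            throw std::runtime_error("update failed");
        });

    try
    {
        f.get();   // rethrows the exception(s) from the loop body
    }
    catch (std::exception const& e)
    {
        // handle / log e.what()
    }
}
```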
<Yorlik>
I have wrapped that in try catch ofc
<hkaiser>
Yorlik: so let me ask again
<hkaiser>
is the exception thrown in the loop body?
<Yorlik>
Yes
<Yorlik>
On occasion I have to create a new Lua_engine
<hkaiser>
k
<Yorlik>
So it gets initialized and there is that lock
<Yorlik>
Sometimes a task spawns new lua engines, when I'm going out of lua and back into lua
<Yorlik>
So it mostly happens either in update() or in the Lua scripts called by update
<hkaiser>
Yorlik: do you launch hpx tasks using apply()?
<hkaiser>
or always using async?
<Yorlik>
I don't think I have any apply left - I'd have to scan my code, but I doubt it
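Why the question matters, sketched briefly with hedged names (hpx::apply was later renamed hpx::post): a task launched with apply() is fire-and-forget, so there is no future in which an escaping exception could be stored and no call site where it could be caught, whereas async() hands back a future that carries it.

```cpp
#include <hpx/include/apply.hpp>   // header paths are version-dependent
#include <hpx/include/async.hpp>

#include <exception>
#include <stdexcept>

void fire_and_forget()
{
    // no future is returned, so the caller can never observe this exception
    hpx::apply([] { throw std::runtime_error("lost"); });
}

void observable()
{
    auto f = hpx::async([] { throw std::runtime_error("visible"); });
    try
    {
        f.get();   // the task's exception is rethrown here
    }
    catch (std::exception const& e)
    {
        // handled at the call site
    }
}
```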
<hkaiser>
please have a look
<Yorlik>
After you explained the exception mechanics I decided to check all responses
<Yorlik>
OK
<Yorlik>
There are three left, but not in the simulator - they live in the code launching commands on the server controller - only used when issuing administrative commands coming from the admin client.