diehlpk has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
K-ballo has quit [Quit: K-ballo]
Coldblackice has quit [Ping timeout: 268 seconds]
Coldblackice has joined #ste||ar
Coldblackice has quit [Ping timeout: 276 seconds]
Coldblackice has joined #ste||ar
Coldblackice_ has joined #ste||ar
Coldblackice has quit [Ping timeout: 240 seconds]
jaafar_ has joined #ste||ar
jbjnr_ has joined #ste||ar
jaafar_ has quit [Ping timeout: 268 seconds]
Coldblackice_ has quit [Ping timeout: 265 seconds]
Coldblackice has joined #ste||ar
Coldblackice has quit [Ping timeout: 264 seconds]
Coldblackice has joined #ste||ar
K-ballo has joined #ste||ar
hkaiser has joined #ste||ar
<heller>
Oh no
<heller>
hkaiser: saw it. Accidentally pushed to master, sorry
<heller>
You were right, the tss reset needed to be before the result binding
<hkaiser>
heller: let's see if this fixes things now
<hkaiser>
and thanks for looking into it
<heller>
I tested it locally at least
<heller>
Should have been more careful
<hkaiser>
heller: how did the talk go?
<hkaiser>
is it available somewhere?
<heller>
I think it went well
<heller>
It will be uploaded next week, I think
<heller>
It was recorded
<heller>
We might have hit a nerve there
<heller>
Adapting everything to p0443 will be time consuming though
<hkaiser>
absolutely
<hkaiser>
this involves redoing all of our executor stuff
<heller>
Yes
<hkaiser>
possibly even the future implementation
<heller>
Yeah
<hkaiser>
that's for after the refactoring
<heller>
Having something like that out there quickly is important
<heller>
There are quite a few use cases which could benefit from such an infrastructure
<heller>
And people would like to use it if it were there
<heller>
So we need to find a middle ground between getting something out, keeping the API stable, and starting the migration to P0443
<hkaiser>
sure, needs to be in a coordinated way and step by step
<heller>
We can probably ignore the sender/receiver part and start with executor
<hkaiser>
hmmm
<heller>
And keep our executors implementation
<heller>
At first
<hkaiser>
I think the sender/receiver part is the lowest level everything else is built upon
<heller>
Right, which also means that our stuff needs to be converted as well
<hkaiser>
indeed
<heller>
The biggest appeal is to just use our future-based infrastructure
<heller>
Having to reimplement that based on the sender/receiver stuff will take some time
<hkaiser>
it definitely will take time
<heller>
Getting the execution context stuff to run with our current executors interface is less intrusive
<heller>
And I hope that this transition will make it easier to port to the new design
<hkaiser>
ok
<heller>
The risk is that we might need to break the API
<hkaiser>
I'd like to keep the variadic API
<heller>
For the executor dispatch?
<hkaiser>
I think they're making a mistake without it
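For readers following along: the "variadic API" presumably refers to executor entry points that take a callable together with its arguments, whereas P0443's `execute` accepts only a nullary callable. A minimal sketch of the difference follows. The `inline_executor` type and the `call_nullary`/`call_variadic` helpers are hypothetical names for illustration, not HPX's or P0443's actual interface.

```cpp
#include <functional>
#include <utility>

// Hypothetical executor for illustration only; it runs work inline.
struct inline_executor
{
    // Nullary style: the executor only ever sees a callable with no arguments.
    template <typename F>
    void execute(F&& f) const
    {
        std::forward<F>(f)();
    }

    // Variadic style: arguments are forwarded alongside the callable.
    template <typename F, typename... Ts>
    decltype(auto) sync_execute(F&& f, Ts&&... ts) const
    {
        return std::invoke(std::forward<F>(f), std::forward<Ts>(ts)...);
    }
};

inline int add(int a, int b)
{
    return a + b;
}

// With the nullary form the caller must bind arguments and the result itself.
inline int call_nullary(inline_executor const& exec, int a, int b)
{
    int result = 0;
    exec.execute([&result, a, b] { result = add(a, b); });
    return result;
}

// With the variadic form the executor sees the callable and arguments directly.
inline int call_variadic(inline_executor const& exec, int a, int b)
{
    return exec.sync_execute(add, a, b);
}
```

Both helpers compute the same value; the variadic form just avoids the wrapper lambda at the call site and lets the executor decide how to package the arguments.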
<heller>
Ok, let's get some implementation experience there...
<heller>
And usage experience
<hkaiser>
right
<hkaiser>
heller: the stackful threads on the stackless branch are now on par with master, btw
<hkaiser>
stackless is still ~7% faster
<heller>
Cool, what did you do?
<hkaiser>
remove the virtual function for the operator()()
<heller>
7% is still not a lot. Where did it come from?
<hkaiser>
less code executed, this corresponds to about 100ns per thread
<heller>
So it's just the context switch
<hkaiser>
and the stack allocation
<hkaiser>
and related costs
<heller>
The stack allocation costs should have been mitigated
<hkaiser>
it's a start, even if it's not much
<heller>
As we reuse them
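The idea of reusing stacks so that allocation cost is paid only once can be sketched with a simple free list. This is an illustrative assumption about the general technique, not HPX's actual stack management; `stack_pool` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pool that hands out fixed-size stacks and recycles them.
class stack_pool
{
public:
    explicit stack_pool(std::size_t stack_size)
      : stack_size_(stack_size)
    {
    }

    // Reuse a previously released stack if one is available,
    // otherwise allocate a fresh one.
    std::unique_ptr<char[]> acquire()
    {
        if (!free_.empty())
        {
            auto stack = std::move(free_.back());
            free_.pop_back();
            return stack;
        }
        ++allocations_;
        return std::make_unique<char[]>(stack_size_);
    }

    // Return a stack to the pool instead of freeing it.
    void release(std::unique_ptr<char[]> stack)
    {
        free_.push_back(std::move(stack));
    }

    // Number of real allocations performed so far.
    std::size_t allocations() const
    {
        return allocations_;
    }

private:
    std::size_t stack_size_;
    std::size_t allocations_ = 0;
    std::vector<std::unique_ptr<char[]>> free_;
};
```

A thread that terminates would call `release`, and the next thread created would get the same memory back from `acquire` without touching the system allocator.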
<hkaiser>
sure
<heller>
Get rid of the atomic state change
<heller>
That should do wonders
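To illustrate what is being weighed here, the sketch below contrasts a state transition done with an atomic compare-and-swap against the same transition on a plain field. The types `atomic_task` and `plain_task` and the `thread_state` values are hypothetical and only stand in for whatever the scheduler really stores; the plain version is correct only if a single OS thread touches the state at a time.

```cpp
#include <atomic>

enum class thread_state
{
    pending,
    active,
    suspended,
    terminated
};

// Hypothetical task whose state may be changed concurrently.
struct atomic_task
{
    std::atomic<thread_state> state{thread_state::pending};

    // Returns true only if the state was `expected` and is now `desired`.
    bool set_state(thread_state expected, thread_state desired)
    {
        return state.compare_exchange_strong(
            expected, desired, std::memory_order_acq_rel);
    }
};

// Hypothetical task whose state is only touched by one thread at a time,
// so no atomic read-modify-write is needed.
struct plain_task
{
    thread_state state = thread_state::pending;

    bool set_state(thread_state expected, thread_state desired)
    {
        if (state != expected)
            return false;
        state = desired;
        return true;
    }
};
```

Both variants expose the same interface, so the saving would come purely from avoiding the atomic read-modify-write on every transition.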
<hkaiser>
will introduce virtual functions again
<hkaiser>
but it's worth a try
<heller>
Yes
<heller>
But those should be insignificant
<hkaiser>
the virtual functions themselves are insignificant, it's the optimization barrier introduced by using them that causes slowdown
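The "optimization barrier" point can be sketched as follows: a call through a base-class reference usually cannot be inlined unless the compiler manages to devirtualize it, while a call on a statically known type can. `task_base`, `increment_task`, `run_virtual`, and `run_static` are hypothetical names for illustration, not the actual thread function types discussed here.

```cpp
// Hypothetical polymorphic task interface.
struct task_base
{
    virtual ~task_base() = default;
    virtual int operator()() = 0;
};

struct increment_task final : task_base
{
    int value = 0;

    int operator()() override
    {
        return ++value;
    }
};

// The callee is only known at run time: each iteration is an indirect call,
// and the compiler generally cannot inline or optimize across it.
inline int run_virtual(task_base& task, int n)
{
    int last = 0;
    for (int i = 0; i != n; ++i)
        last = task();
    return last;
}

// The callee's type is known at compile time: the call can be inlined
// and the loop optimized as a whole.
template <typename Task>
int run_static(Task& task, int n)
{
    int last = 0;
    for (int i = 0; i != n; ++i)
        last = task();
    return last;
}
```

The two functions return the same result; any difference shows up only in the generated code and would need to be confirmed with a profiler or by inspecting the assembly.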
<heller>
What does the profiler say where the costs currently are?
<hkaiser>
no real hotspot
<heller>
Can you show the top 10?
<hkaiser>
don't have anything right now I could show
<hkaiser>
I can reproduce it for you - it's mostly the thread stealing, as always
<hkaiser>
I'll try the non-atomic state, but not today
<heller>
Sure
<heller>
I'll enjoy the train ride home as well
<heller>
We should talk some time next week
<hkaiser>
ok
<zao>
A small heads up btw, the stellar-group website is mixed-content thanks to the group logo being fetched over HTTP regardless of the protocol used to access the page.
<hkaiser>
zao: ok, I'll have a look
<hkaiser>
easy enough to fix
<zao>
I've never used any other allocator than `system`. Which one ought I pick for packaging if I can get both jemalloc and tcmalloc?
<hkaiser>
zao: both are fine
<zao>
Alright. Making an EasyBuild config and had to flip a coin to pick a malloc impl.