hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: Bye!]
diehlpk_work_ has quit [Remote host closed the connection]
Yorlik has joined #ste||ar
<ms[m]> FunMiles: ignoring bugs, the only way you should get that flag is if you link to HPX::wrap_main in your cmake configuration
<ms[m]> just linking to HPX::hpx should not give you that
<ms[m]> are you linking to HPX::wrap_main?
<ms[m]> and can you think of anything relevant that has changed between "before your recent update" and "after the recent update"? did you update HPX? something else?
<ms[m]> and re: the idling, if you really have only one thread with not-close-to-zero cpu utilization (and you're using more than one HPX worker thread) then is it possible that you have one HPX thread that keeps suspending itself and is waiting for something? how do you wait for input and where?
rachitt_shah[m] has quit [Quit: You have been kicked for being idle]
<Yorlik> If I have a component with actions that return futures: Is there a way to defer all computation until I tell the component to do so and deliver the futures? Like an on/off switch?
K-ballo has joined #ste||ar
<zao> Yorlik: You wanted something the other day?
<Yorlik> Hello zao!
<Yorlik> I thought I remembered you knew things about intrinsics and bit fiddling - or was I wrong?
<zao> Not too much, I zone out when I see instructions with too many letters :D
<Yorlik> All right - never mind then. I was working on a Morton code function - but I finally made one which interleaves eight 32-bit values into a 256-bit Morton code. On my machine it takes ~7 ns and I was happy. Before I was at ~400 using pdep, but I turned to AVX2 shift/permute/shuffle and ended up much better.
<Yorlik> afk a moment - craftsman coming in ...
<zao> PDEP is one of those infamous ones, tends to have horrible perf cliffs on AMD.
hkaiser has joined #ste||ar
Yorlik has quit [Ping timeout: 264 seconds]
Yorlik has joined #ste||ar
<Yorlik> And back. zao: That's what I had to learn. Using AVX2 256 bit shuffles/permutes/shifts brought an incredible performance boost together with dropping the bit width from 64 to 32 bit coordinates.
<Yorlik> But I guess dropping pdep helped a lot.
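[Editor's sketch] The Morton code discussed above interleaves the bits of eight 32-bit coordinates into one 256-bit value. The following is a portable scalar reference, not the AVX2 shuffle/permute version Yorlik describes, and it is far slower. The function name `morton_interleave8` and the bit order (coordinate 0 occupies the least significant slot of each 8-bit group) are assumptions made for illustration. A reference like this is mainly useful for checking a vectorized implementation against known outputs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference: bit `bit` of coords[i] lands at output
// bit position 8 * bit + i. The 256-bit result is stored as four 64-bit
// words, with word 0 holding the least significant bits.
inline std::array<std::uint64_t, 4> morton_interleave8(
    std::array<std::uint32_t, 8> const& coords)
{
    std::array<std::uint64_t, 4> code{};    // all bits start at zero
    for (std::size_t bit = 0; bit != 32; ++bit)
    {
        for (std::size_t i = 0; i != 8; ++i)
        {
            // Extract one bit of coordinate i and place it in the output.
            std::uint64_t const b = (coords[i] >> bit) & 1u;
            std::size_t const pos = 8 * bit + i;
            code[pos / 64] |= b << (pos % 64);
        }
    }
    return code;
}
```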
<Yorlik> hkaiser: YT for a question? If I have a component with actions that return futures: Is there a way to defer all computation until I tell the component to do so and deliver the futures? Like an on/off switch?
<hkaiser> Yorlik: have a queue of tasks for each component?
<hkaiser> should the execution of those tasks be delayed or just the returning of the results?
nanmiao has quit [Quit: Client closed]
Yorlik has quit [Ping timeout: 260 seconds]
FunMiles has quit [Remote host closed the connection]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 264 seconds]
nanmiao has joined #ste||ar
Yorlik has joined #ste||ar
<Yorlik> hkaiser: It's the execution which needs to be delayed. There are some operations, like the operation on sets, which should not be performed while the actors are updated, because it would cause concurrency issues / races. These tasks need to be delayed to the end of each frame, where there is a synchronization point.
<Yorlik> So I'm going to use pipelining to keep everything in order.
<Yorlik> I simply prefer to put these tasks together to the end of a frame, instead of locking a bunch of stuff for each single task when running them during the frame processing.
<hkaiser> Yorlik: you would have to somehow implement that yourself, atm
<Yorlik> IC - I'll use some sort of messages+queue then.
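[Editor's sketch] The "messages+queue" idea could look roughly like the following: tasks are recorded when requested, each caller receives a future immediately, and nothing runs until a flush at the end-of-frame synchronization point. This is only a sketch under assumptions: it uses `std::future` and `std::packaged_task` as stand-ins for the HPX equivalents, and the class name `deferred_queue` with its `defer`/`flush` members is hypothetical, not an HPX facility.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: collects tasks and runs them only when flush() is
// called, for example at the end of a frame.
class deferred_queue
{
public:
    // Record a task and hand back a future for its eventual result.
    template <typename F>
    auto defer(F&& f) -> std::future<decltype(f())>
    {
        using result_type = decltype(f());
        auto task = std::make_shared<std::packaged_task<result_type()>>(
            std::forward<F>(f));
        std::future<result_type> result = task->get_future();

        std::lock_guard<std::mutex> lock(mtx_);
        tasks_.emplace_back([task] { (*task)(); });
        return result;
    }

    // Run all recorded tasks in submission order; this makes the
    // corresponding futures ready.
    void flush()
    {
        std::vector<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mtx_);
            pending.swap(tasks_);
        }
        for (auto& t : pending)
            t();
    }

private:
    std::mutex mtx_;
    std::vector<std::function<void()>> tasks_;
};
```

Tasks run on the thread that calls `flush()`, outside the lock, so a task may itself call `defer` to schedule work for the next flush.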
hkaiser has quit [Quit: Bye!]
nanmiao has quit [Quit: Client closed]
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 260 seconds]
Yorlik has quit [Remote host closed the connection]
Yorlik has joined #ste||ar
FunMiles has joined #ste||ar
FunMiles has quit [Ping timeout: 260 seconds]
diehlpk_work has joined #ste||ar
hkaiser has joined #ste||ar
<gonidelis[m]> hkaiser: do we always need to .join a thread before destroying it?
<hkaiser> gonidelis[m]: either .join() or .detach()
<gonidelis[m]> hm got it
<gonidelis[m]> what's detach doing differently?
<hkaiser> or use jthread, which joins automatically
<hkaiser> detach leaves the underlying kernel thread alone, detaching it from the c++ object
<gonidelis[m]> we use hpx_threads nevertheless so it does not matter right?
<hkaiser> why not, why not use hpx::thread?
<gonidelis[m]> ahhh yeah
<gonidelis[m]> that's what i am saying
<gonidelis[m]> use hpx::thread
<hkaiser> or hpx::jthread for that matter
<gonidelis[m]> ah, does hpx::thread also need join/detach?
<hkaiser> same rules apply
<gonidelis[m]> ah right ok
<gonidelis[m]> yy right i thought the machinery was different
<hkaiser> c++ standards conformance, remember?
<hkaiser> hpx::thread (std::thread) will throw if not joined/detached
<gonidelis[m]> yy
<hkaiser> gonidelis[m]: see pm, pls
<K-ballo> it will terminate(), though being a destructor throwing would be almost effectively the same
<gonidelis[m]> K-ballo: you mean if i create a destructor it's the same?
<gonidelis[m]> i was lost on your syntax
<K-ballo> throwing in a destructor leads to terminate under normal conditions
<gonidelis[m]> throwing in ? you mean, if I create a destructor by myself
<gonidelis[m]> ?
<gonidelis[m]> implement*
<K-ballo> "hpx::thread (std::thread) will throw if not joined/detached"
<gonidelis[m]> ok
<K-ballo> from C++'s `throw Exception();`
<gonidelis[m]> aha
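[Editor's sketch] The join/detach rules from this exchange, shown with `std::thread`: destroying a thread object that is still joinable calls `std::terminate()`, so it must be joined or detached first. hkaiser states that the same rules apply to `hpx::thread`, so this sketch assumes the same pattern carries over. The `joining_thread` wrapper is a hypothetical helper that joins in its destructor, similar in spirit to the jthread mentioned above. The function names are illustrative.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical RAII wrapper: joins in the destructor so the underlying
// std::thread is never destroyed while still joinable.
class joining_thread
{
public:
    template <typename F>
    explicit joining_thread(F&& f) : t_(std::forward<F>(f)) {}

    ~joining_thread()
    {
        if (t_.joinable())
            t_.join();
    }

    joining_thread(joining_thread const&) = delete;
    joining_thread& operator=(joining_thread const&) = delete;

private:
    std::thread t_;
};

// join(): wait for the thread to finish; afterwards it is not joinable
// and can be destroyed safely.
inline int join_example()
{
    int value = 0;
    std::thread t([&value] { value = 42; });
    t.join();
    return value;
}

// detach(): the thread keeps running on its own; the C++ object no longer
// refers to it, so joinable() is false and destruction is safe.
inline bool detach_example()
{
    std::thread t([] {});
    t.detach();
    return t.joinable();
}
```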
<gonidelis[m]> now i am having trouble distinguishing std::promise from std::future
<gonidelis[m]> do we have an hpx promise?
<gonidelis[m]> K-ballo: ah ok you use the throw std::runtime_error explicitly
<hkaiser> gonidelis[m]: we do
<hkaiser> hpx::lcos::local::promise, I believe
<gonidelis[m]> thanks
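[Editor's sketch] On the promise/future distinction: a promise is the producer end that sets a value, and the future obtained from it is the consumer end that waits for and reads that value. The sketch below uses `std::promise` and `std::future`. Whether the HPX promise named above has exactly the same interface is an assumption here. The function name `promise_example` is illustrative.

```cpp
#include <cassert>
#include <future>
#include <thread>

inline int promise_example()
{
    std::promise<int> p;                    // producer side
    std::future<int> f = p.get_future();    // consumer side, tied to p

    // The producer fulfils the promise on another thread.
    std::thread producer([&p] { p.set_value(42); });

    int const result = f.get();    // blocks until set_value() has run
    producer.join();               // join before the thread is destroyed
    return result;
}
```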
Yorlik has quit [Read error: Connection reset by peer]
<gonidelis[m]> what do we include under libs/execution?
<gonidelis[m]> because i can see things defined under libs/executors in `namespace hpx { namespace execution`
Yorlik has joined #ste||ar
Yorlik has quit [Ping timeout: 260 seconds]
FunMiles has joined #ste||ar
<FunMiles> ms[m]: You are correct, I had hpx::wrap_main. I removed it and I don't have the -e anymore. However I still have the main thread being registered to HPX. But at this point I need to investigate more how it happens. I don't think you can figure it out without more info.
FunMiles has quit [Remote host closed the connection]
FunMiles has joined #ste||ar
<hkaiser> FunMiles: the main thread is registered by hpx itself
FunMiles has quit [Ping timeout: 260 seconds]
diehlpk_work has quit [Ping timeout: 260 seconds]
FunMiles has joined #ste||ar