hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoD: https://developers.google.com/season-of-docs/
<jaafar> Is it possible to choose the number of threads used on a more granular basis, i.e. before a call to an algorithm?
<jaafar> Would be nice to have the same control as chunk size
<hkaiser> jaafar: yah, that's possible
<hkaiser> there is a parameters hook to return the number of cores to use
<jaafar> oh cool thanks hkaiser
nikunj has joined #ste||ar
hkaiser has quit [Ping timeout: 276 seconds]
nikunj has quit [Remote host closed the connection]
quaz0r has quit [Ping timeout: 240 seconds]
quaz0r has joined #ste||ar
rori has joined #ste||ar
simbergm1 has left #ste||ar ["WeeChat 1.9.1"]
hkaiser has joined #ste||ar
<hkaiser> heller: thanks for merging #4114!
<heller> hkaiser: working on getting my stuff in now ;)
<hkaiser> good, good
<heller> hkaiser: see #4125
<heller> hkaiser: another thing I'd like to do is remove the unique_ptr<erased_output_container>
<heller> not sure how to achieve that though ...
<hkaiser> well, you could try turning the output container into another extra data item
<heller> hmmmm
<simbergm> gcc 7 and up are not happy with that pr: http://rostam.cct.lsu.edu/grid
<simbergm> and I don't know why pycicle doesn't fail with that one :/
<hkaiser> simbergm: that should have been fixed by merging #4114 :/
<heller> c++14 vs. c++17
<hkaiser> heller: no, it's the basic_any constructor
<simbergm> damn :(
<hkaiser> simbergm: is that current master?
<zao> Isn't that the same one that I was muttering about?
<simbergm> according to rostam it is
<hkaiser> simbergm: ok
<zao> It's been screwed since the 1.3.0 branch, fwiw.
<zao> (at least)
<zao> Oh, wait, nevermind.
<zao> git describe tricked me again.
<simbergm> since 1.3.0?
<zao> simbergm: Didn't look too closely at my log name, it was 1.3.0 + 679 commits :P
<simbergm> pycicle's gcc-newest is gcc 9
<simbergm> maybe that changes things?
<hkaiser> zao: no, this is brand new stuff
<zao> Mine was master a few days ago, sorry for the confusion.
<hkaiser> simbergm: the is_move_constructible shouldn't even be instantiated if the first enable_if fails
<hkaiser> we could try doing the enable_if in the template<typename T, ...> instead
<hkaiser> simbergm: I'll put something together today
<simbergm> hkaiser: I definitely don't have other ideas, sorry :/ thanks for dealing with it
<simbergm> so for the coroutines module are you suggesting that thread_id holds a void* to thread data...?
<heller> hkaiser: I know it is the basic_any constructor, it works however if you enforce C++14
<heller> hkaiser: no idea why
<heller> #4125 should make master compile again. The move any issue will still remain, if someone uses it
<heller> ... with that compiler
<hkaiser> heller: yah, I'll try to move the enable_if around, let's see
<hkaiser> simbergm: the fixing_basic_any branch seems to fix the issue: http://rostam.cct.lsu.edu/builders/hpx_gcc_7_boost_1_65_centos_x86_64_debug/builds/428/steps/build_core/logs/stdio, I'll create a PR
<hkaiser> see #4126
<simbergm> hkaiser: woop, thanks!
<simbergm> do you happen to know what's going on there? is gcc doing the wrong thing?
<hkaiser> it shouldn't instantiate the second enable_if if the first fails
<hkaiser> having those in different places seems to help
<simbergm> "shouldn't instantiate" = "standard says so"?
<hkaiser> ask K-ballo ;-)
<simbergm> K-ballo: you know what to do...
<simbergm> :P
<hkaiser> no other compiler does it - FWIW
<K-ballo> yeah, sfinae happens in lexical order
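The lexical-order point can be illustrated with a minimal sketch. This is not HPX's basic_any; the names probe, pick, and with_type are hypothetical and chosen only for illustration. The first default template argument fails for int, so the second one, which would be a hard error if instantiated, is never reached:

```cpp
#include <type_traits>

// Instantiating probe<T> for a T without a nested ::type is a hard error,
// not a substitution failure, because the error occurs inside the class body.
template <typename T>
struct probe
{
    using type = typename T::type;
};

// Default template arguments are substituted in lexical order. For T = int the
// first enable_if fails, the overload is discarded, and probe<int> is never
// instantiated.
template <typename T,
    typename = std::enable_if_t<std::is_class<T>::value>,
    typename = typename probe<T>::type>
constexpr bool pick(T const&)
{
    return true;
}

// Fallback that is chosen whenever the template above is discarded.
constexpr bool pick(...)
{
    return false;
}

// A class type that satisfies both conditions.
struct with_type
{
    using type = int;
};

static_assert(!pick(42), "int must select the fallback overload");
static_assert(pick(with_type{}), "with_type must select the template");
```

Swapping the two default arguments would make pick(42) a hard compile error instead of a quiet fallback, which is why the placement of each enable_if matters.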
hkaiser has quit [Ping timeout: 240 seconds]
<simbergm> hkaiser: sorry for bothering you again... if thread_id stores a void* we lose all the nice dereferencing operators from thread_id... do you feel like keeping them around?
<simbergm> on the other hand being explicit doesn't hurt
<simbergm> alternatively I could stop being stubborn and put thread_data and friends into the coroutines module... :/
jbjnr has joined #ste||ar
<zao> hkaiser, simbergm: pr4125 seems to address the above build problem on my GCC 8.3.0
hkaiser has joined #ste||ar
<simbergm> zao: thanks!
aserio has joined #ste||ar
* heller wants virtual template functions :/
<hkaiser> simbergm: yah, I think those dereferencing operators are another way of reconstructing the thread_data from the internal id, those could be templated - but I think global functions might be more appropriate
<hkaiser> simbergm: look at https://en.cppreference.com/w/cpp/thread/thread/id for an example
<hkaiser> the std::thread::id is completely agnostic of the platform handle and can only be used to identify the thread
<simbergm> hkaiser: yeah, that's what they're there for but we'd lose them if I want to not have any thread_data in the coroutines module
<hkaiser> I don't think the coroutines layer cares about the actual thread_data type
<simbergm> no, it doesn't
<simbergm> I think we might agree... :/
<hkaiser> the thread_data type itself is the only place where an id could be securely mapped to a pointer to the data
<simbergm> I'm just trying to say that not having the dereferencing operators makes the code a bit more verbose, that's all
<hkaiser> initially we had a std::map<id, thread_data*> to achieve this, but this got optimized away as it deteriorated into a std::map<thread_data*, thread_data*>
<simbergm> I'm going for the free function to get a thread_data from a thread_id at the moment
<simbergm> which lives outside the coroutines module
<hkaiser> so using thread_id as a pointer is just an optimization (a very useful one), but I'd like to avoid making that explicit and letting code depend on it
<hkaiser> yah, having 'global functions' that are part of the thread_data API doing the conversion id<==>thread_data is the way to go
<hkaiser> even if those are trivially reinterpret_cast<> under the hood
<simbergm> yep, it's a static cast from a void* now
<hkaiser> sure
<simbergm> but "using thread_id as a pointer is just an optimization"
<simbergm> what's the optimization?
<hkaiser> simbergm: removing the need to have a map<id, thread_data*>
<simbergm> ah, I see
<hkaiser> also, a pointer nicely guarantees the uniqueness requirement of thread ids
<simbergm> that wasn't wrt the dereferencing operators then (which make thread_id behave like a thread_data pointer)
<hkaiser> right
<hkaiser> I don't like what we have today
<simbergm> :/
<hkaiser> no code should know that the thread_id is just the pointer to the data
<simbergm> what's the dream?
<hkaiser> well, hide this information and don't let anybody depend on this
<simbergm> right, you answered my question before I asked it...
<hkaiser> having a thread_data::get_id() and a get_thread_data(thread_id) is the way to go, I believe
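A minimal sketch of that shape, assuming thread_id stores a void* as discussed above. Only thread_data::get_id() and get_thread_data(thread_id) come from the conversation; the get() accessor, the priority member, and the class layouts are hypothetical and do not reflect HPX's actual types:

```cpp
// Opaque identifier: comparable, but with no dereferencing operators.
// In the layering discussed, this is the part that could live in the
// coroutines module, since it knows nothing about thread_data.
class thread_id
{
public:
    thread_id() noexcept = default;
    explicit thread_id(void* handle) noexcept : handle_(handle) {}

    // Hypothetical accessor, intended for use by the thread_data API only.
    void* get() const noexcept { return handle_; }

    friend bool operator==(thread_id lhs, thread_id rhs) noexcept
    {
        return lhs.handle_ == rhs.handle_;
    }
    friend bool operator!=(thread_id lhs, thread_id rhs) noexcept
    {
        return !(lhs == rhs);
    }

private:
    void* handle_ = nullptr;
};

// Stand-in for the real thread_data; priority is an illustrative member.
class thread_data
{
public:
    explicit thread_data(int priority) noexcept : priority_(priority) {}

    // The object's address doubles as a unique id, so no map is needed.
    thread_id get_id() noexcept { return thread_id(this); }

    int priority() const noexcept { return priority_; }

private:
    int priority_;
};

// Free function that is part of the thread_data API: the only place that
// knows an id is really a pointer to the data.
inline thread_data* get_thread_data(thread_id id) noexcept
{
    return static_cast<thread_data*>(id.get());
}
```

Callers would write get_thread_data(id)->priority() instead of id->priority(), which is slightly more verbose but keeps the pointer representation an implementation detail.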
bibek has quit [Remote host closed the connection]
<simbergm> all right, let's see what I put together and then you can rewrite it ;)
<hkaiser> nah, I trust you're doing the right thing! sorry for getting in between all the time
bibek has joined #ste||ar
<simbergm> well... I don't know what I'm doing
<simbergm> so it's good to have someone to tell me how to do it correctly!
aserio has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
hkaiser has quit [Ping timeout: 264 seconds]
hkaiser has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
aserio has joined #ste||ar
rori has quit [Quit: WeeChat 1.9.1]
aserio has quit [Ping timeout: 250 seconds]
aserio has joined #ste||ar
diehlpk_work has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
aserio has quit [Read error: Connection reset by peer]
aserio has joined #ste||ar
aserio has quit [Ping timeout: 240 seconds]
<diehlpk_work> hkaiser, https://pastebin.com/mcRhwqfE
<diehlpk_work> VC error in ppc
<diehlpk_work> Disabling vc in hpx makes the compiler error in octotiger vanish too
aserio has joined #ste||ar
<hkaiser> diehlpk_work: ok, thanks - I'm trying to fix it currently
nikunj has joined #ste||ar
<hkaiser> diehlpk_work: should be fixed now: #4129
<diehlpk_work> Ok, hkaiser thanks
<diehlpk_work> Can you have a look into the errors in my pm as well?
nikunj has quit [Quit: Bye]
Vir has quit [Ping timeout: 264 seconds]
hkaiser has quit [Ping timeout: 276 seconds]
hkaiser has joined #ste||ar
aserio has quit [Quit: aserio]
<jaafar> hkaiser: thank you for the customization point tip yesterday. I used it in a similar way to the chunk size and it seems to get called, but not actually obeyed... I wonder if you have any suggestions for debugging
<hkaiser> heh
<hkaiser> what algorithm are you looking at?
<jaafar> still exclusive_scan
<jaafar> I can make a gist of what I did if you like
<hkaiser> sec
<jaafar> I do see that it gets called... whether I return 3 or 2 or 5 it makes no difference, the primary thread count mechanism appears to override (i.e. --hpx:threads= etc.)
<hkaiser> sorry, 3 lines up
<jaafar> I'll track it down from there, thanks!
<hkaiser> jaafar: thanks!
<hkaiser> jaafar: I think this is a gap in our implementation, you can specify the number of cores to use, this might influence the calculation of how many chunks are being used etc. but it will not change the number of cores used by the underlying executor :/
<hkaiser> this was left off in an unfinished state and everybody forgot about it ...
<jaafar> ah...
<jaafar> that makes sense. It seems to change something, but not the actual threads used
<jaafar> I can make do with the command line for this
<hkaiser> jaafar: might be the easiest for now - sorry
<jaafar> No problem