K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
nanmiao has joined #ste||ar
K-ballo has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
nanmiao has quit [Quit: Connection closed]
K-ballo has quit [Quit: K-ballo]
shahrzad has quit [Quit: Leaving]
hkaiser has quit [Quit: bye]
lst_phnx has joined #ste||ar
lst_phnx has quit [Quit: Ping timeout (120 seconds)]
diehlpk_work has quit [Remote host closed the connection]
<gnikunj[m]>
for device execution space, it will be allocated on the device
<hkaiser>
no, that's a view
<gnikunj[m]>
sure, the view gets allocated on the host but it points to atomic<bool> in the device. Do you mean initialize it within the for loop itself?
<hkaiser>
that will not work as you may have more than one for_loop running concurrently
<gnikunj[m]>
exactly why I have it outside as a view
<hkaiser>
an atomic_view would help, not sure if we have that
<gnikunj[m]>
let me think of something. I'll get it working. The bigger problem is the performance tests, btw. They use hpx timers and distribution policies. No way that's going to work in a device kernel.
<gnikunj[m]>
<hkaiser "an atomic_view would help, not s"> they don't :/
<gonidelis[m]>
hkaiser: `reverse` is complete. There was much tweaking needed but now all tests pass. I will take care of the multiple `iter_sent` headers along with providing a uniform `advance` facility in my next PR. Once CI is ok, I think #5225 is ready to go ;))
<hkaiser>
gonidelis[m]: great, thanks!
<diehlpk_work>
Yeah, we got the Piz Daint proposal accepted
<hkaiser>
wow!
<hkaiser>
great news
<diehlpk_work>
But only 50% of the requested time
<hkaiser>
doesn't matter
<diehlpk_work>
and the benchmark paper was accepted as well
<hkaiser>
small steps at a time
<diehlpk_work>
A good day for Octo-Tiger
<hkaiser>
diehlpk_work: your hard work starts to pay off!
<diehlpk_work>
Yes, I could run an advertisement agency to push open source codes and stress collaborators
<gonidelis[m]>
diehlpk_work: congrats!!
<gonidelis[m]>
diehlpk_work: what's the proposal again?
<diehlpk_work>
We got plenty of compute time to run some study of stellar merger
<gnikunj[m]>
hkaiser: ported for cuda as well. Things seem to work on my laptop. Let me see if I can get things running on rostam.
<diehlpk_work>
hkaiser, see pm about the press release
<weilewei>
K-ballo how about this code, does that make sense?
<K-ballo>
no
<K-ballo>
you want an atomic int, not an atomic int pointer
<K-ballo>
dereferencing an atomic int pointer yields a plain old non-atomic int
<weilewei>
I see... is there any way to fix it if I can't change the do_count API, which requires passing a pointer but needs an atomic add on the input argument?
weilewei has quit [Quit: Ping timeout (120 seconds)]
weilewei has joined #ste||ar
weilewei has quit [Client Quit]
weilewei has joined #ste||ar
nanmiao has quit [Quit: Connection closed]
<tiagofg[m]>
hello everyone, a few months ago I asked here about running an hpx program that loads shared libraries I made, and hkaiser said that it works with --hpx:ini=hpx.component_paths=
<tiagofg[m]>
it worked on Linux but it didn't on macOS
<tiagofg[m]>
does this functionality work on macOS yet?
<tiagofg[m]>
my thesis work is built upon dynamic libraries and I would like it to work on macOS as well
<tiagofg[m]>
do you think it is possible to use --hpx:ini=hpx.component_paths= on macOS? thanks
weilewei has quit [Quit: Ping timeout (120 seconds)]
<bita>
hkaiser, does decllow() in blaze make a lower triangular matrix? if not, how can tril make use of it?
<hkaiser>
it marks the argument as triangular, so if you assign the result it will retrieve the triangle data only
<bita>
Actually I get an exception when I declare something that is not triangular as triangular. Have you tested it?
<hkaiser>
the docs says it should work
<bita>
Okay, thanks, we will test it further
weilewei has joined #ste||ar
weilewei has quit [Quit: Ping timeout (120 seconds)]
<hkaiser>
hmmm, further down it says it's undefined behavior :/
<hkaiser>
bita: LowerMatrix<M>/UpperMatrix<M> are adaptors that can be used
<hkaiser>
LowerMatrix<DynamicMatrix<double>> d(dest); d = rhs;
<hkaiser>
dest is a preallocated DynamicMatrix and rhs is the source matrix; after the assignment, dest will have the lower half filled
weilewei has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
weilewei has quit [Quit: Ping timeout (120 seconds)]
<bita>
hkaiser, got it, thanks
<gonidelis[m]>
hkaiser: what's the difference in replacing `constexpr` with `HPX_HOST_DEVICE` in tag_fallback code?
<gonidelis[m]>
ah you don't
<gonidelis[m]>
hkaiser: why did you add the HPX_HOST_DEVICE macro then ?
<gonidelis[m]>
i guess there is something going on with HOST_DEVICE and constexpr
<gonidelis[m]>
we tend to couple them?
<hkaiser>
gonidelis[m]: it didn't like constexpr variables in device code
<hkaiser>
so I had to separate this
nanmiao has joined #ste||ar
<K-ballo>
both device and constant evaluation have restrictions on the kind of code they can run, there's no other relation
<gonidelis[m]>
hkaiser: K-ballo but all the HOST_DEVICE functions are also constexpr
<gonidelis[m]>
in this PR at least
<gonidelis[m]>
it's like they go together
<hkaiser>
functions yes, but not variables
<K-ballo>
the simplest of functions will likely be able to run on both scenarios
<K-ballo>
in fact most constexpr code should be able to run on device, but plenty of device code won't run in constexpr
<K-ballo>
any relation is incidental
<gonidelis[m]>
hkaiser: so you are optimizing these functions by using both HOST_DEVICE and constexpr
<K-ballo>
neither of those qualifiers optimize them
<hkaiser>
for functions HPX_HOST_DEVICE marks them to be available on host and device
<hkaiser>
for variables, one can mark them as __device__, but they are not allowed to be constexpr
<gonidelis[m]>
didn't know that. K-ballo: constexpr does optimize the code though, doesn't it? in the sense that it works at compile time rather than runtime
<K-ballo>
no
<K-ballo>
it marks the code as available in constant expressions, assuming all the constraints are met
<gonidelis[m]>
and if all constraints are met, then what I mentioned above happens
<K-ballo>
a function being available in constant expressions doesn't mean that whenever you call it is evaluated at compile time
<gonidelis[m]>
i said: if all constraints are met
<K-ballo>
still
<K-ballo>
a function will be called at compile time if it's called from a context in which a constant expression is required
<gonidelis[m]>
but if it's called on compile time then the runtime is reduced
<K-ballo>
any other constant folding optimization can take place with or without `constexpr` (and optimizers don't actually look at constexpr for doing that)
<K-ballo>
compile time and run time are disjoint
<K-ballo>
if it's called at compile time then it was required to be called at compile time, so it could never have had any effect on run time
<gonidelis[m]>
what
<K-ballo>
constexpr int fun() { return 4; }
<K-ballo>
std::cout << fun(); // this is a run time call
<gonidelis[m]>
because of std::cout
<K-ballo>
because the context in which the call happens does not require a constant expression
<gonidelis[m]>
<K-ballo "because the context in which the"> because of the std::cout
<gonidelis[m]>
because outputting is a runtime thing
<K-ballo>
int arr[fun()]; // this is a compilation time call, which could have never been run time
<K-ballo>
enum X { enumerator = fun() }; // this is another compilation time call, never could have been run time
<K-ballo>
fun(); // now this is a run time call again
<gonidelis[m]>
what are you anyway?
<gonidelis[m]>
do you have an example that the constexpr does actually matter?
<K-ballo>
there's two up here
<K-ballo>
`int arr[fun()];` and `enum X { enumerator = fun() };` would be compilation errors without it
<gonidelis[m]>
0.0
<K-ballo>
that's what constexpr means: can be used in a constant expression
<K-ballo>
oh, you mean, if the given arguments are constant expressions and such?
<gonidelis[m]>
arguments?
<K-ballo>
function arguments
<gonidelis[m]>
what i am trying to figure out is whether constexpr optimizes when it's enabled, and if not, then surely consteval does
<K-ballo>
the compile time optimization can be applied regardless of whether the function is constexpr or not
<K-ballo>
it's a regular as-if optimization
<K-ballo>
there are side effects in attempting to call the function at compile time that make it impossible to just "try" and see whether a compile-time call is possible
<K-ballo>
so aside from gcc's broken constant propagation implementation, which mixes those levels together (and gets some things wrong as a result), there's no difference
<K-ballo>
just turn on optimizations and the call will go away regardless
<gonidelis[m]>
if i turn optimizations on then i might as well not need constexprs? (i lost you somewhere between the lines above)
<K-ballo>
the optimization that you have in mind can be performed (and has for decades) for regular functions, as long as their definition is visible to the compiler
<gonidelis[m]>
it seems to me like you are rendering `constexpr` useless
<K-ballo>
constexpr is not about optimizations, it's about being able to call those functions in contexts in which you couldn't before (at compile time)
<K-ballo>
you have the wrong model of constexpr
<K-ballo>
you must have thought it was something different than what it actually is
<K-ballo>
the examples I gave of array bounds and enumerators are among the main reasons for having `constexpr`
<gonidelis[m]>
i know that it is "evaluate a function at compile time if possible"
<K-ballo>
others being NTTPs and switch cases
<K-ballo>
it is not
<gonidelis[m]>
i am happy to hear that
<K-ballo>
it was always possible to evaluate a function at compile time when possible
<gonidelis[m]>
lol
<gonidelis[m]>
recursive argument
<K-ballo>
well yeah :)
<K-ballo>
constexpr is about those contexts that the optimization doesn't reach
<gonidelis[m]>
I THOUGHT YOU WERE ADVOCATING THIS WHOLE TIME THAT CONSTEXPR HAS NOTHING TO DO WITH OPTIMIZATIONS
<K-ballo>
it isn't
<K-ballo>
those contexts aren't even optimizable
<gonidelis[m]>
ah ah
<gonidelis[m]>
ok!!
<gonidelis[m]>
right
<gonidelis[m]>
so it's not an optimization, it's just a facilitation
<K-ballo>
you can't optimize `case fun():` into working, that's a semantic change
<gonidelis[m]>
yy i get it now
<gonidelis[m]>
so constexpr is for 1. using functions as rvalues (cases you mentioned) 2. NTTPs and 3. switches
<K-ballo>
rvalues?
<gonidelis[m]>
yeah in both cases you used fun() in the right side of the = operator
<gonidelis[m]>
`int arr[fun()];` fun() is rvalue
<K-ballo>
where's the =?
<gonidelis[m]>
yeah sorry
<K-ballo>
it's about constants, not rvalues
<gonidelis[m]>
`enum X { enumerator = fun() }; ` that's the one i was talking about
<K-ballo>
contexts that expect a constant
<gonidelis[m]>
for a function call to be used as an rvalue it has to be constant
<K-ballo>
no, not at all
<K-ballo>
function calls have been used as rvalues since the beginning of time
<gonidelis[m]>
but you said that this `int arr[fun()];` was not allowed before `constexpr`s
<gonidelis[m]>
or at least that's what i got
<K-ballo>
that's right
<K-ballo>
but if you have `i = fun();` that's a function call being used as an rvalue
<K-ballo>
constexpr has nothing to do with rvalues