hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
K-ballo has joined #ste||ar
Amy1 has joined #ste||ar
Amy2 has quit [Ping timeout: 264 seconds]
Yorlik has quit [Ping timeout: 260 seconds]
sayef_ has quit [Read error: Connection reset by peer]
sayef_ has joined #ste||ar
weilewei has quit [Remote host closed the connection]
nan111 has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
karame_ has quit [Remote host closed the connection]
nikunj97 has joined #ste||ar
<nikunj97> if actions are invokable, why does the compiler throw me this error: no type named ‘type’ in ‘struct hpx::util::detail::invoke_deferred_result<foo_action&>’
<nikunj97> it works fine with functions/functors but throws this error while using actions
bita_ has quit [Ping timeout: 260 seconds]
nikunj97 has quit [Read error: Connection reset by peer]
nikunj has quit [Ping timeout: 256 seconds]
nikunj has joined #ste||ar
nikunj has quit [Read error: Connection reset by peer]
nikunj has joined #ste||ar
kale[m] has joined #ste||ar
Yorlik has joined #ste||ar
<jbjnr> "no type named type" error nearly always means that the arguments you are passing to the function are wrong, so the function isn't invokable, and that error comes out of the return type deduction stuff
<jbjnr> (at least 75% of the time anyway)
<jbjnr> check very carefully what is being passed to the function triggering your error
<jbjnr> ^^nikunj97
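A minimal sketch of the failure mode jbjnr describes, using the standard `std::invoke_result` as a stand-in for `hpx::util::detail::invoke_deferred_result` (an assumption: the HPX trait is not reproduced here). The names `add` and `has_result_type_v` are hypothetical and exist only for illustration. When the callable cannot be invoked with the given argument types, the trait simply has no nested `type`, and naming it anyway produces the "no type named 'type'" diagnostic.

```cpp
#include <type_traits>

// Hypothetical function used only for illustration.
inline int add(int a, int b) { return a + b; }

// Primary template: selected when std::invoke_result<F, Args...> has no
// nested `type`, i.e. F cannot be called with Args...
template <typename Enable, typename F, typename... Args>
struct has_result_type_impl : std::false_type {};

// Specialization: selected only when the nested `type` exists.
template <typename F, typename... Args>
struct has_result_type_impl<
    std::void_t<typename std::invoke_result<F, Args...>::type>, F, Args...>
  : std::true_type {};

template <typename F, typename... Args>
inline constexpr bool has_result_type_v =
    has_result_type_impl<void, F, Args...>::value;

// Correct argument list: the return type can be deduced.
static_assert(has_result_type_v<decltype(&add), int, int>,
    "add is callable with (int, int)");

// One argument missing: there is no `type` member to name.
static_assert(!has_result_type_v<decltype(&add), int>,
    "add is not callable with (int)");

// Uncommenting the next line asks for the missing member directly and
// yields a "no type named 'type'" style error:
// using broken = std::invoke_result<decltype(&add), int>::type;
```

So the first thing to check is whether the argument list at the failing call site matches what the callable expects, including any leading argument the callable needs.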
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
nikunj97 has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
sayef_ has quit [Ping timeout: 260 seconds]
nikunj97 has joined #ste||ar
<kordejong> heller: I have been able to reproduce the failing assertion issue I mentioned last friday with the stencil example released with HPX. See https://github.com/STEllAR-GROUP/hpx/issues/4732
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 260 seconds]
Nikunj__ has quit [Read error: Connection reset by peer]
Nikunj__ has joined #ste||ar
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 260 seconds]
<K-ballo> nikunj97: you need an id
<nikunj97> K-ballo, ya I figured. It works now.
<K-ballo> also function objects, not functors
<nikunj97> what's the difference between functors and function objects?
<nikunj97> I used to think function objects are objects that are callable like a function with ()
<nikunj97> and functors were short for function objects
<K-ballo> a functor is a mapping between categories, a math theory concept, not all function objects are functors
<K-ballo> some people use "funcjet" as short for function object, but meh
<nikunj97> can you give an example where a function object is not a functor?
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 246 seconds]
nikunj has quit [Ping timeout: 256 seconds]
nikunj has joined #ste||ar
Nikunj__ is now known as nikunj97
kale[m] has quit [Ping timeout: 240 seconds]
kale[m] has joined #ste||ar
kale_ has joined #ste||ar
<kale_> I'm having a problem with the static build of HPX: https://gist.github.com/git-kale/3882388619f82ae8281806fe7e51d9ca. How can I resolve this issue?
kale_ has quit [Quit: Leaving]
tiagofg[m] has joined #ste||ar
hkaiser has joined #ste||ar
<ms[m]> kale_: static linking not working is a known issue: https://github.com/STEllAR-GROUP/hpx/issues/3970 (the error messages have changed a bit, but the problem is mostly the same)
<ms[m]> we haven't had a need for fixing it for ourselves or had requests from the outside so it's been low priority
<ms[m]> if you're very stuck because of this please comment on the issue and we'll see what we can do about it
<ms[m]> (you can in any case comment with your error message)
<nikunj97> K-ballo, is there a way to implement function overloading with variadic templated functions?
<K-ballo> sure, why not
<nikunj97> any example code I can look into?
<K-ballo> I think I may be missing the underlying question... you just add the overloads, there's nothing special to it
<K-ballo> what are you trying, and what problems are you running into?
<nikunj97> all calls will go to https://gist.github.com/NK-Nikunj/46090bb572aea66fc3d446ac1b94435b#file-variadic-cpp-L13 even when I do not want it to be instantiated
<K-ballo> sure, it is more specialized
<K-ballo> the example looks too artificial, when would you expect each overload to be called?
<K-ballo> what are you actually trying to do?
<nikunj97> we have something similar in resiliency module in hpx
<nikunj97> I couldn't use function overloading last yr when i worked so I added different function names
<K-ballo> fwiw you've implemented function overloading with variadic "templated" functions just fine
<K-ballo> does your real use case have some concrete types? or concrete concepts? or anything that would make the problem decidable?
<nikunj97> this is where I wanted to make use of function overloading. Having different names for api that does similar work doesn't sound too fascinating
<K-ballo> you place no constraints on Pred whatsoever, right?
<nikunj97> yup
<nikunj97> well there's a constraint
<K-ballo> am I looking at the right function?
<nikunj97> that it has to be a function or a function object
<K-ballo> that constraint seems to be only in your mind
<nikunj97> yea, I wanted to let the compiler know that it is a function/function object
<nikunj97> so that we can specialize the use case
<K-ballo> however you do constrain on `F`... the cases in which both overloads would be viable would be rather thin
<K-ballo> are you sure that example is representative?
<nikunj97> I believe so. The example I shared has multiple function arguments and takes all the arguments of one of the functions.
<K-ballo> for that code to match your earlier example, then the predicate would have to be callable with the rest of the stuff as argument
<nikunj97> Predicate has to be callable, yes. But the arguments are for f
<K-ballo> be callable isn't really meaningful, you can't really ask whether something is callable, you need to specify the arguments with which you'd call it
<K-ballo> and the invoke_deferred_results in there constrain `F` to be callable with all the following arguments
<K-ballo> so for the second overload to be viable when you specify a predicate, it would have to be callable with the function and all the function arguments
<K-ballo> could you possibly rename those functions and make the code actually fail?
<nikunj97> yes, wait.
<K-ballo> it seems the error you were facing is a different one than what I'd expect from the examples
<nikunj97> ok, for some reason it is compiling now and executing as well
<nikunj97> I wonder why it wasn't working last year
<nikunj97> last year, it was always utilizing the specialized function. Later it would complain saying "too few arguments" for function call
<nikunj97> that may have come from using auto previously
<K-ballo> did you have the invoke_deferred_result before?
<nikunj97> no
<nikunj97> I used auto for type deductions
<K-ballo> so they were totally unconstrained before, as in your example snippet
<K-ballo> that explains things
<nikunj97> how does invoke_deferred_result change the dynamics exactly?
<nikunj97> it does put a constraint that F has to be callable
<K-ballo> it has to be deferred-callable with the given arguments
<K-ballo> if you take the second overload and try to put a predicate there, `F` will deduce to the type of the predicate
<nikunj97> so the compiler knows which specialization to use?
<K-ballo> the predicate is unlikely to be deferred-callable with (f, args...)
<nikunj97> that makes sense. I think I got it.
<nikunj97> hkaiser, yt?
<K-ballo> what you'd need to do to express what I understand you actually want is:
<K-ballo> - add a constraint to the first overload that the predicate is callable with.. whatever the argument of the predicate is, and returns bool
<K-ballo> - add a constraint to the second overload, that F is not a predicate, so the inverse of the above constraint (which puts you in trouble with nullary functions)
<K-ballo> so you end up needing three overloads
<nikunj97> what's the 3rd one?
<K-ballo> async_replay(std::size_t n, F&& f)
<nikunj97> ohh, yeah. A function that takes no arguments.
<nikunj97> wait I don't think we need a 3rd argument.
<K-ballo> what are the requirements of your predicate?
<nikunj97> predicate does not take any argument in the API
<nikunj97> you just need to provide the function name as predicate and the API calls it with the result it gets after calling f with args
<nikunj97> so predicate has only one condition - be a callable
<K-ballo> "be a callable" is not meaningful
<nikunj97> callable with 1 argument, where that argument is the return type of f
<K-ballo> and the return value, is it ignored?
<nikunj97> no, the return value is passed as the argument. argument type is return type of f.
<hkaiser> nikunj97: wazz'up?
<K-ballo> what is the predicate expected to return?
<K-ballo> I expect the answer to be bool, given the name
<nikunj97> predicate is expected to return a bool, always
<K-ballo> and is the predicate copied? probably right, since async?
<nikunj97> predicate decides whether f computed the correct result
<nikunj97> hkaiser, nothing much. Refactoring old hpxr code. Plus I got things working on distributed mode for actions for async. Still have to decide on node affinity and stuff. But a basic infra is ready and working ;)
<hkaiser> :D
<nikunj97> K-ballo, what do you mean?
<hkaiser> look into distribution policies
<nikunj97> where will I find those distribution policies?
<K-ballo> nikunj97: do you make a copy of the given predicate, or do you keep a reference to it? (keeping a reference would be dangerous given it may be dead by the time you call it asynchronously)
<nikunj97> K-ballo, I do send in a reference
<K-ballo> I just checked against the code, because keeping a reference would be wrong.. you don't, you make a decayed-copy
<hkaiser> async<Action>() accepts one of those as its first argument
<K-ballo> so the predicate needs to be decay-copyable, invokable with the result of the function invocation, and return something compatible with bool
<nikunj97> K-ballo, sounds about right
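A minimal sketch of the decay-copy point, using only the standard library. The helper name `make_deferred_check` is hypothetical and this is not the HPX resiliency code: it only illustrates why work that runs later should own copies of the predicate and the function instead of references to the caller's objects.

```cpp
#include <type_traits>
#include <utility>

// Returns a closure that can be run later. The closure stores decayed
// copies of `pred` and `f`, so it stays valid after the caller's originals
// have gone out of scope. Capturing by reference instead ([&pred, &f])
// would leave dangling references in that situation.
template <typename Pred, typename F>
auto make_deferred_check(Pred&& pred, F&& f)
{
    return [pred = std::decay_t<Pred>(std::forward<Pred>(pred)),
               f = std::decay_t<F>(std::forward<F>(f))]() mutable -> bool {
        // The predicate receives the result of the function invocation
        // and must return something convertible to bool.
        return pred(f());
    };
}
```

The three requirements listed above map onto this directly: the predicate must be decay-copyable (the init-capture), invocable with the function's result (`pred(f())`), and its return value must convert to `bool` (the closure's return type).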
<nikunj97> hkaiser, I make use of async(action_object, locality, args)
<hkaiser> ok, the dist policies can be used instead of the locality argument
<nikunj97> so the distribution policies take care of locality affinity?
<hkaiser> they were an early attempt towards distributed executors
<hkaiser> yes
<K-ballo> so the constraint would be something like `is_invocable_r<bool, decay<Pred>::type&, invoke_deferred_result<F, Ts...>::type>`
<hkaiser> it's an object that decides where to execute things
<nikunj97> well that's great! It lifts off a major headache about deciding on affinity
<nikunj97> K-ballo, alright. I'll try it out! Thanks for helping.
<K-ballo> keep in mind you need to negate it for the second overload
<nikunj97> gotcha!
<K-ballo> you need `F` to *not* be a predicate
<nikunj97> hkaiser, can you create another repo on STE||AR where I can dump in refactor codes and new set of example/benchmarks?
<hkaiser> nikunj97: sure, what name would you like to use?
<nikunj97> K-ballo, hold on, how do I set it to not be predicate?
<K-ballo> what do you mean by set?
<nikunj97> hkaiser, resilience sounds nice
<nikunj97> K-ballo, I meant how do I write the negation such that F is not the predicate?
<hkaiser> nikunj97: should it be a public or a private repo?
<K-ballo> !is_invoc...?
<nikunj97> hkaiser, I'd like it to be private for now. We'll make it public when we have the implementation in place.
<nikunj97> K-ballo, ohh! my bad. Got it
<nikunj97> hkaiser, thanks! I'll start working on the project. I'll try to implement something cool ;)
<hkaiser> sure you do!
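A sketch of the constrained overload set discussed above, under several stated assumptions. It is synchronous, uses `std::invoke_result_t` in place of `invoke_deferred_result`, skips the decay-copy step, and the names `replay` and `is_predicate_call_v` are hypothetical, so it is not the HPX `async_replay` implementation. It also departs from the three-overload suggestion: the nullary case is absorbed by the primary template of the trait, which reports "not a predicate call" whenever there is no function left to split off.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <type_traits>

// Primary template: anything that does not have the shape
// (predicate, function, arguments...) is not a predicate call. This also
// covers a lone nullary function.
template <typename Enable, typename... Ts>
struct is_predicate_call_impl : std::false_type {};

// Selected only if F is callable with Ts... and Pred is callable with that
// result, returning something convertible to bool.
template <typename Pred, typename F, typename... Ts>
struct is_predicate_call_impl<
    std::enable_if_t<std::is_invocable_r_v<bool, Pred&,
        std::invoke_result_t<F&, Ts&...>&>>,
    Pred, F, Ts...> : std::true_type {};

template <typename... Ts>
inline constexpr bool is_predicate_call_v =
    is_predicate_call_impl<void, Ts...>::value;

// Overload 1: a predicate validates each result; up to n attempts.
template <typename Pred, typename F, typename... Ts,
    std::enable_if_t<is_predicate_call_v<Pred, F, Ts...>, int> = 0>
auto replay(std::size_t n, Pred&& pred, F&& f, Ts&&... ts)
{
    for (std::size_t attempt = 0; attempt != n; ++attempt)
    {
        auto result = std::invoke(f, ts...);
        if (std::invoke(pred, result))
            return result;
    }
    throw std::runtime_error("replay: no attempt passed validation");
}

// Overload 2: no predicate; the first argument must *not* be a predicate
// for the rest. Retries when f throws, max(n, 1) attempts in total.
template <typename F, typename... Ts,
    std::enable_if_t<std::is_invocable_v<F&, Ts&...> &&
            !is_predicate_call_v<F, Ts...>, int> = 0>
auto replay(std::size_t n, F&& f, Ts&&... ts)
{
    for (std::size_t attempt = 1; attempt < n; ++attempt)
    {
        try { return std::invoke(f, ts...); }
        catch (...) { /* swallow the failure and retry */ }
    }
    return std::invoke(f, ts...);    // last attempt: exceptions propagate
}
```

With both constraints in place, `replay(5, pred, f, 10)` can only select the first overload and `replay(2, f, 7)` can only select the second. A call such as `replay(n, f, g)`, where `g` happens to be nullary and `f(g())` converts to `bool`, still reads as a predicate call, which is the nullary trouble mentioned above.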
nan11 has joined #ste||ar
weilewei has joined #ste||ar
rtohid has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 256 seconds]
kale[m] has joined #ste||ar
karame_ has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
bita_ has joined #ste||ar
weilewei has quit [Remote host closed the connection]
weilewei has joined #ste||ar
rtohid has quit [Remote host closed the connection]
nan11 has quit [Remote host closed the connection]
nan11 has joined #ste||ar
rtohid has joined #ste||ar
weilewei has quit [Remote host closed the connection]
rtohid has quit [Remote host closed the connection]
karame_ has quit [Remote host closed the connection]
nan11 has quit [Remote host closed the connection]
rtohid has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 246 seconds]
kale[m] has joined #ste||ar
nan11 has joined #ste||ar
weilewei has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
K-ballo has quit [Ping timeout: 264 seconds]
K-ballo has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
LiliumAtratum has joined #ste||ar
<LiliumAtratum> Hello! It seems my hpx program has deadlocked. Is there an easy way to debug it? All my worker threads are in an `idle_callback`, so the stacktrace is not helpful.
<hkaiser> uhh
<hkaiser> difficult
<hkaiser> is that reproducible?
<hkaiser> does it happen when running on one core (--hpx:threads=1)?
<LiliumAtratum> randomly yes. Meaning - I start the program, wait for a few minutes and it occurs.
<LiliumAtratum> didn't try that one
<LiliumAtratum> but, is there a way to get to the currently scheduled task's stacks?
<hkaiser> what platform are you on?
<LiliumAtratum> windows x64
<LiliumAtratum> with MSVC 2019
<hkaiser> with Visual Studio you can get to see the pending tasks, etc.
<hkaiser> you need to dig into the internal datastructures
<hkaiser> LiliumAtratum: break hanging app while running in debugger, switch to main thread, go up a couple of stack frames until you see runtime in the function name, look at *this
<hkaiser> that will show you all of HPX's internal data structures
<LiliumAtratum> yeah. `this` of scheduled_thread_pool?
<hkaiser> yes, that works as well
<hkaiser> but first find out if it hangs with --hpx:threads=1 as well
<LiliumAtratum> yep, I will try that one as you suggest
<hkaiser> if no, then you have a race in your code
<LiliumAtratum> I like racing... but not that kind of race ;)
<hkaiser> yah... fun
<ms[m]> `--hpx:debug-hpx-log` might help as well (verbose, but will tell you if the same tasks are being rescheduled or if tasks are suspended)
<hkaiser> or if an exception is thrown but swallowed
<LiliumAtratum> ok, this will take a while... thank you for the hints!
<LiliumAtratum> ok, it seems it hung too :(
<hkaiser> LiliumAtratum: ok
<hkaiser> now use the logging option ms[m] mentioned: --hpx:debug-hpx-log=<file>
<hkaiser> this will slow down things and create a large amount of data, but might help
<LiliumAtratum> I have some logs of my own, so I have put more of them to narrow it down. But I will try the hpx logs next
<LiliumAtratum> it is already slow on a single thread though :(
<hkaiser> sure - see it from the bright side - running on more than one core speeds things up
<hkaiser> ;-)
<LiliumAtratum> yeah, definitely. There are pieces in my code that scale pretty well with the core count. Some others not as much, but there is room for improvement.
<LiliumAtratum> ok, got it narrowed down to a function that has no hpx calls inside :/
<LiliumAtratum> it does use CUDA+thrust though
<hkaiser> LiliumAtratum: such hangs usually happen if a) you drop a future on the floor (don't call get() anywhere), or b) you ignore exceptions (usually caused by the same, not calling .get on every future)
<LiliumAtratum> Hm... I call .wait() I don't call .get() because it is a future<void>
<hkaiser> you should always call .get on the future that is passed to the continuation attached with .then
<hkaiser> wait is fine, but does not rethrow if the future is exceptional
<LiliumAtratum> and there is no .then
<hkaiser> ok
karame_ has joined #ste||ar
<hkaiser> do you use dataflow?
<LiliumAtratum> in some places, but not in this particular place
<LiliumAtratum> it is just async+wait
<hkaiser> ok, anyways always call .get() on the futures inside the function executed by dataflow
<hkaiser> the hpx logs will have entries if exceptions are being thrown, you should look for 'exception' in the generated logs
<hkaiser> gtg
hkaiser has quit [Quit: bye]
<LiliumAtratum> thank you for the hints!
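A small sketch of the wait-versus-get difference mentioned above, written against `std::future` as a stand-in for the HPX future (an assumption: the discussion is about `hpx::future`, and only the standard-library behaviour is shown here). The function name is hypothetical. Even for a `future<void>`, `wait()` only blocks until the result is ready, while `get()` is what rethrows a stored exception.

```cpp
#include <future>
#include <stdexcept>

// Returns true if wait() completed silently and get() then rethrew the
// exception stored in the future.
inline bool wait_hides_but_get_rethrows()
{
    // Deferred launch: the task runs on the first wait()/get() call, so
    // the sketch does not need a second thread.
    std::future<void> f = std::async(std::launch::deferred,
        [] { throw std::runtime_error("task failed"); });

    f.wait();    // runs the task and stores the exception; does not throw

    try
    {
        f.get();    // rethrows the stored exception
    }
    catch (std::runtime_error const&)
    {
        return true;
    }
    return false;
}
```

If the code only ever calls `wait()`, a failure inside the task stays inside the future and the program carries on as if nothing happened, which is one way an error can be swallowed.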
rtohid has left #ste||ar [#ste||ar]
LiliumAtratum has quit [Remote host closed the connection]
ralph[m] has joined #ste||ar
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
hkaiser has joined #ste||ar