hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC2018: https://wp.me/p4pxJf-k1
diehlpk_mobile has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
hkaiser has quit [Quit: bye]
K-ballo has quit [Quit: K-ballo]
nikunj1997 has joined #ste||ar
nikunj97 has quit [Ping timeout: 240 seconds]
nikunj1997 has quit [Quit: Leaving]
nikunj has joined #ste||ar
anushi has quit [Ping timeout: 248 seconds]
jaafar has quit [Ping timeout: 264 seconds]
Anushi1998 has joined #ste||ar
jakub_golinowski has quit [Ping timeout: 256 seconds]
anushi has joined #ste||ar
david_pfander has joined #ste||ar
anushi has quit [Ping timeout: 255 seconds]
anushi has joined #ste||ar
anushi has quit [Remote host closed the connection]
anushi has joined #ste||ar
david_pfander has quit [Ping timeout: 255 seconds]
anushi has quit [Ping timeout: 276 seconds]
anushi has joined #ste||ar
anushi has quit [Remote host closed the connection]
anushi has joined #ste||ar
jakub_golinowski has joined #ste||ar
david_pfander has joined #ste||ar
david_pfander1 has joined #ste||ar
mcopik has joined #ste||ar
Anushi1998 has quit [Ping timeout: 245 seconds]
david_pfander has quit [Ping timeout: 256 seconds]
jakub_golinowski has quit [Ping timeout: 256 seconds]
jakub_golinowski has joined #ste||ar
<marco_>
Hello, I am back. Sorry, but unfortunately I cannot see the forest for the trees. I need a parallel::for_each in which the elements are started in sequence, and I cannot find the appropriate execution policy, ...
<zao>
Still a bit unsure about what the requirements are there.
<zao>
"starting" an action doesn't mean much, even if you could control it.
<heller>
marco_: can you give your specific usecase?
<zao>
If you've got four OS threads servicing HPX, would you kick off the first four items and start additional ones as the first ones start to complete?
<zao>
Unless you never enter a point which can switch contexts, having started before something else doesn't mean much in terms of completion order.
<zao>
(I forget what the HPX term for a possible thread switch is)
<zao>
Some sort of dependency where a task won't start until the N tasks before it have completed, where you've determined N to be the number of threads servicing HPX?
<zao>
(oh wait, that'd be serial, blargh)
<zao>
I'm with heller here :)
<zao>
(and gonna leave it to him)
<jakub_golinowski>
M-ms, yt?
<marco_>
zao: I've a list of independent jobs, and they should start in order, not in segmented chunks per thread.
<jakub_golinowski>
marco_, why should they start in order?
<M-ms>
jakub_golinowski: here
<M-ms>
good that the build works now! could you make a PR to change HPX_LIBRARIES?
<jakub_golinowski>
M-ms, I was going to ask whether I should do it
<jakub_golinowski>
M-ms, did you look at the face-recognition app?
<M-ms>
as for the tests, could you try disabling the tests that fail and collect a list of them, maybe there's something in common to them
<M-ms>
then we can see how many fail across all of opencv
<jakub_golinowski>
M-ms, this is a good idea, I will look into that
<M-ms>
I haven't had time to look at the app yet
<M-ms>
I'll try to do so tonight or tomorrow
<M-ms>
if I understood correctly it's working pretty well for you already, no?
<M-ms>
and for the opencv PR, you can open a new one and just reference the old one in the description
<M-ms>
heller: yt?
<marco_>
jakub_golinowski: *g*, it is not a technical or algorithmic requirement, it is more a business requirement.
<heller>
M-ms: hey
<heller>
marco_: what is meant by "in order"? the start time should be in order? One after the other?
<heller>
marco_: are the different tasks allowed to overlap execution? Do you just need a specific sequence number or something for your task?
<M-ms>
hey, how far did you get with your kokkos explorations? any conclusions yet about how feasible it would be to have an HPX backend for kokkos?
<heller>
M-ms: IMHO, it doesn't make a lot of sense to have a HPX backend for kokkos
<M-ms>
heller: I'm asking because we've been discussing how to make some progress on this cuda business
<heller>
i think this is orthogonal
<heller>
it should just work to use the kokkos CUDA backend inside a HPX application
<M-ms>
yeah, that makes sense
<M-ms>
we would have support to go work with the kokkos people on something, but it's not clear yet if "something" is useful and what exactly it would be
anushi has quit [Read error: Connection reset by peer]
<M-ms>
cuda graphs doesn't look like it's happening soon enough (for us at least)
<heller>
yes
<jbjnr>
marco_: I believe that if you want a parallel::for_each that starts each element in sequence, then what you want is a serial for_each. You can use the sequenced/serial execution policy for that.
<heller>
well, the idea of implementing a kokkos backend is tempting
<M-ms>
it'd also be a shame to have to reimplement all the data layout management that kokkos is doing on top of their backends
<heller>
yes
<heller>
but that's orthogonal as well
<heller>
what would be great is, if we could factor out those data structures
<M-ms>
yes, it is
<M-ms>
and not first priority either, cuda is
<M-ms>
jbjnr: I didn't get so far with the kokkos task pdf, is there something about dags on gpus there?
<heller>
well, the nice thing about the Kokkos CUDA thingy is that they directly embedded the hierarchical memory model into their programming model
<heller>
I am just not yet sure how well that maps to tasking ;)
<jbjnr>
M-ms: heller 1) We have a problem with very small tasks, and implementing a back end that supports the kokkos model would get us closer. This might mean having a special kokkos_executor and implementing their scheduling loop in some form on top of our threads. I do believe this would be a lot of work, but mapping their thread teams onto our schedulers/executors might help us to redesign our own internals in such a way that we improve our small task
<jbjnr>
performance.
<jbjnr>
2) The data layout (array views) + cuda reordering of loops to make thread blocks work better is something we need in HPX regardless of whether we integrate with kokkos, so we might as well use theirs. This can be as simple as just having a cuda-only kokkos that we forward stuff to.
<jbjnr>
3) Ideally we work WITH the kokkos guys to say, you do this well, and we do this well, can we make an API that both of us can use so that we get the nice task API of HPX, alongside the hierarchical mode they use.
<jbjnr>
futures on GPU is the problem we don't have a real solution for. They have a 'hack' that kind of works, but it introduces a shite task methodology we don't want.
<jbjnr>
working with them (for me) is just a way of trying to leverage the best of both worlds to move forward with use and adoption and performance.
<M-ms>
is it as much of a hack as the hpx cuda support? it's obviously not great for lots of small kernel launches but until cuda graphs is here and executors/futures are changed (again) there's not much we can do there
<jbjnr>
If I could find the pdf of the cuda graphs stuff I'd send it to you
<jbjnr>
which doc are you reading?
<M-ms>
did you mean cuda graphs is a hack or what kokkos is doing to have dags on the gpu?
<M-ms>
kokkos task dag capabilities
<jbjnr>
the kokkos dags on GPU is a 'hack' a good one, but it doesn't map well to our task model
<M-ms>
I'll read for a while, let's discuss later again
<marco_>
heller: yes, the start times should be in order, one after the other. The tasks are allowed to overlap execution; there are no dependencies between them. I don't need a sequence number or anything else.
K-ballo has joined #ste||ar
<heller>
marco_: ok, you can achieve that with the parallel execution policy and a chunk size of 1
<jbjnr>
that won't guarantee the ordering though
<heller>
that still doesn't guarantee that the start times will be in order (our execution model doesn't account for that), only the task creation times
<jbjnr>
the tasks will be added round robin to thread queues
<heller>
right
<jbjnr>
and might be pulled off in the wrong order
<jbjnr>
marco_: we can't do what you ask for without changes to hpx. We can impose a dependency on the 'ending' of a task, via a future, but not on the 'starting' of a task. If the tasks were added in order to a queue and taken off the queue in order in the scheduler, it would work, but we could not guarantee it without changes to the scheduler currently.
<jbjnr>
unless an idea like heller's could be tweaked somehow
<heller>
the real question would be, why you need to order the start time?
<heller>
that would impose quite a ton of synchronization between the task queues
hkaiser has joined #ste||ar
<heller>
if they run in parallel, there's nothing wrong with them starting execution at the same time
<jbjnr>
he probably has some dodgy counter access at the start of each task and wants to make sure they happen 'in order'
<jbjnr>
but you are right. Address the problem of why, and then the real answer will become clear
<jbjnr>
does make parallel::for_each a bit useless; however, we could 'wrap' it in some template magic to create one
<hkaiser>
could end up being too fine-grained
<jbjnr>
indeed
<heller>
that's what he asked for ...
<jbjnr>
correct.
<jbjnr>
back to work now ...
<github>
[hpx] biddisco force-pushed guided_pool_executor from 349ca7b to 31faced: https://git.io/vxkTv
<github>
hpx/guided_pool_executor b085914 Thomas Heller: Changing the coroutine implementations to do a lazy init...
<github>
hpx/guided_pool_executor 24ec144 John Biddiscombe: Remove staged queue from thread map and run_now param from create_thread api...
<github>
hpx/guided_pool_executor fdc1a6c John Biddiscombe: Remove wait_or_add_new from scheduling loop, thread_queue and schedulers
jaafar has joined #ste||ar
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
<marco_>
Ok, Thank you very much for the explanation, I will then create my own loop with async.
<Guest71870>
[hpx] Jakub-Golinowski opened pull request #3365: Fix order of hpx libs in HPX_CONF_LIBRARIES. (master...fix_lib_order) https://git.io/fbhmA
<jbjnr>
marco_: you should consider very carefully whether you really _need_ to do what you have asked for. If your for_each is large, then the cost of an async for_each will become an issue that is probably bigger than the problem you are trying to solve by launching tasks in order. If you can tell us why you need to start them in order, then we might be able to suggest an alternative strategy.
anushi has joined #ste||ar
nikunj97 has joined #ste||ar
nikunj has quit [Remote host closed the connection]
<nikunj97>
hkaiser: yt?
<marco_>
jbjnr: Ok, I will write a brief review of my application. I will contact you tomorrow or later.
diehlpk_mobile has joined #ste||ar
<nikunj97>
diehlpk_mobile: is the deadline for submitting our blog links and PR links the 6th, or by the 6th?
diehlpk_mobile has quit [Read error: Connection reset by peer]
diehlpk_mobile2 has joined #ste||ar
diehlpk_mobile has joined #ste||ar
diehlpk_mobile3 has joined #ste||ar
anushi has quit [Remote host closed the connection]
<nikunj97>
diehlpk_work: yt?
diehlpk_mobile has quit [Ping timeout: 240 seconds]
diehlpk_mobile2 has quit [Ping timeout: 260 seconds]
anushi has joined #ste||ar
diehlpk_mobile3 has quit [Read error: Connection reset by peer]
<hkaiser>
nikunj97: here
aserio has joined #ste||ar
<hkaiser>
aserio: see pm, pls
<nikunj97>
hkaiser: I had an idea to resolve the global object situation.
<nikunj97>
Since _init() has the responsibility of initializing all the global objects, why don't we have our own _init() as well?
<hkaiser>
ok
<nikunj97>
so in case a user wishes to use HPX functionality he could do something like hpx.add_object("struct/class_name object_name")
<nikunj97>
or something similar. This way a user can create initialization routine specific to HPX
<hkaiser>
how would you prevent those constructors from being called twice? what about destruction?
<nikunj97>
I'm currently trying to create a model to handle it
<nikunj97>
hkaiser: Do you think something like this is feasible?
<hkaiser>
worth a try, definitely
<nikunj97>
then I'll investigate further
<nikunj97>
Actually, I have been trying to implement it without getting into initialization sequencing issues. I found the exact function that initializes the global objects, but it is not exported from libc.so, so I can't wrap it in any way
<nikunj97>
To implement it I would have to implement the _init function myself, but that itself contains symbols that are not exported, so I would be forced to implement those as well, and so on recursively. This was creating portability issues.
<nikunj97>
I tried implementing it myself and, on failing to do so, came up with the idea of HPX's own init function to initialize HPX-related global objects
<hkaiser>
nod, figures
<nikunj97>
hkaiser: did you review my pr?
<hkaiser>
nikunj97: not yet, sorry
<hkaiser>
working towards it ;-)
<nikunj97>
hkaiser: ok, actually I wanted to add the link to my macOS PR to the mid-term evaluation as well.
<hkaiser>
nikunj97: pls go ahead and create that pr
<nikunj97>
but there will be an overlap of code between these 2 PRs. I'll be adding code to hpx_wrap.cpp, so I thought that I should wait for this PR to get merged.
<hkaiser>
nikunj97: ok, I'll try to get it done today
<nikunj97>
hkaiser: thanks, once it's merged I will then add the PR and its link to my mid-term evaluation as well.
anushi has quit [Remote host closed the connection]
anushi has joined #ste||ar
<diehlpk_work>
nikunj97, yes
<nikunj97>
diehlpk_work: is the deadline for submitting our blog links and PR links the 6th, or by the 6th?
<diehlpk_work>
nikunj97, I'd like to have them by the 6th, so I can prepare a blog post over the weekend
<nikunj97>
diehlpk_work: will it be fine if I send them over on 6th?
<diehlpk_work>
Sure, I'd like to have them on Saturday, my local time zone
<nikunj97>
ok, thanks :)
akheir has joined #ste||ar
jakub_golinowski has quit [Quit: Ex-Chat]
jakub_golinowski has joined #ste||ar
hkaiser has quit [Quit: bye]
david_pfander has quit [Remote host closed the connection]
david_pfander has joined #ste||ar
galabc has joined #ste||ar
hkaiser has joined #ste||ar
Anushi1998 has quit [Quit: Bye]
anushi has quit [Ping timeout: 248 seconds]
anushi has joined #ste||ar
_bibek_ has joined #ste||ar
hkaiser_ has joined #ste||ar
anushi has quit [Remote host closed the connection]
anushi has joined #ste||ar
hkaiser has quit [Ping timeout: 248 seconds]
bibek has quit [Ping timeout: 276 seconds]
hkaiser_ has quit [Client Quit]
aserio has quit [Ping timeout: 260 seconds]
anushi has quit [Ping timeout: 260 seconds]
<jakub_golinowski>
M-ms, yt?
Anushi1998 has joined #ste||ar
nikunj97 has quit [Quit: bye]
nikunj has joined #ste||ar
hkaiser has joined #ste||ar
<M-ms>
jakub_golinowski: uhm, half here
galabc has quit [Read error: Connection reset by peer]
galabc has joined #ste||ar
gabriel_ has joined #ste||ar
galabc has quit [Ping timeout: 256 seconds]
K-ballo has quit [Quit: K-ballo]
anushi has joined #ste||ar
mcopik has quit [Ping timeout: 265 seconds]
aserio has joined #ste||ar
aserio1 has joined #ste||ar
nikunj has quit [Quit: Leaving]
aserio has quit [Ping timeout: 265 seconds]
aserio1 is now known as aserio
gabriel_ has quit [Ping timeout: 256 seconds]
mcopik has joined #ste||ar
anushi has quit [Read error: Connection reset by peer]
anushi has joined #ste||ar
galabc has joined #ste||ar
mcopik has quit [Ping timeout: 240 seconds]
K-ballo has joined #ste||ar
anushi has quit [Ping timeout: 256 seconds]
parsa[[w]] has joined #ste||ar
parsa[w] has quit [Ping timeout: 260 seconds]
<aserio>
_bibek_: yt?
<hkaiser>
diehlpk_work: yt?
<diehlpk_work>
yes
<hkaiser>
see pm, pls
galabc has quit [Ping timeout: 268 seconds]
eschnett has joined #ste||ar
aserio has quit [Remote host closed the connection]
aserio has joined #ste||ar
<heller>
aserio: Hey, what's the password?
<aserio>
stellargroup
<heller>
Thanks
katywilliams has joined #ste||ar
khuck has joined #ste||ar
<khuck>
aserio: phylanx meeting?
katywilliams has quit [Client Quit]
anushi has joined #ste||ar
khuck has quit []
<M-ms>
jakub_golinowski: trying the face detection now, getting cascadedetect.cpp:1694: error: (-215:Assertion failed) !empty() in function 'detectMultiScale'
<M-ms>
have you had something similar? at least it doesn't seem like hpx messing up
<jakub_golinowski>
this is the ML classifier
eschnett has quit [Quit: eschnett]
<jakub_golinowski>
M-ms, check if the .xml files are correctly pointed to -> it depends on where exactly your build dir is
<jakub_golinowski>
the app assumes you have a binary in the /hpx_opencv_webcam/build directory
anushi has quit [Remote host closed the connection]
anushi has joined #ste||ar
RostamLog has joined #ste||ar
jakub_golinowski has quit [Remote host closed the connection]
jakub_golinowski has joined #ste||ar
anushi has quit [Ping timeout: 260 seconds]
hkaiser has joined #ste||ar
anushi has joined #ste||ar
akheir has quit [Quit: Leaving]
K-ballo has quit [Quit: K-ballo]
<jakub_golinowski>
M-ms, I am working on the tests in opencv and all the failing ones seem to be somehow connected with OCL
anushi has quit [Ping timeout: 268 seconds]
anushi has joined #ste||ar
aserio has quit [Quit: aserio]
anushi has quit [Ping timeout: 264 seconds]
anushi has joined #ste||ar
quaz0r has quit [Quit: reboot]
anushi has quit [Ping timeout: 240 seconds]
anushi has joined #ste||ar
K-ballo has joined #ste||ar
jakub_golinowski has quit [Ping timeout: 245 seconds]