K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
K-ballo has quit [Quit: K-ballo]
diehlpk_work has quit [Remote host closed the connection]
hkaiser has quit [Quit: bye]
hkaiser has joined #ste||ar
K-ballo has joined #ste||ar
<srinivasyadav227>
hkaiser: hi, i rebased the seperate_datapar branch against master a few days back, but it also added a few commits that i did not make. i rebased again today, but it still has the commits i did not make. is there anything wrong i am doing here? i just did `git rebase upstream/master`
<K-ballo>
that's because your PR doesn't target master
<K-ballo>
and those commits from master aren't present in the branch you target
<K-ballo>
usually you'd rebase against the branch you are targeting, though targeting a non-master branch is rare
<hkaiser>
srinivasyadav227: you need to rebase onto the branch you're targeting, sorry for pointing you in the wrong direction
<hkaiser>
or let me rebase the target branch onto master first
<srinivasyadav227>
okay 🙂
<hkaiser>
I have rebased it now
<srinivasyadav227>
hkaiser: thanks, i will apply the changes soon ;)
<gonidelis[m]>
K-ballo: hkaiser where could i copy this range in order for the copy to be parallelizable?
<gonidelis[m]>
but if i try with the `hpx::ranges::for_each` it does not
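For reference, a minimal sketch of the call shape under discussion, assuming HPX's `<hpx/algorithm.hpp>` and `<hpx/execution.hpp>` headers and using a C++20 `std::views::filter` view as a stand-in for the range from the pm; the data and predicate are illustrative:

```cpp
#include <hpx/algorithm.hpp>
#include <hpx/execution.hpp>

#include <ranges>
#include <vector>

// to be called from within the HPX runtime (e.g. from hpx_main)
void demo()
{
    std::vector<int> v = {1, 2, 3, 4, 5, 6};

    // a lazy view: its iterators are bidirectional at best, never random access
    auto odd = v | std::views::filter([](int x) { return x % 2 != 0; });

    // sequential overload: works for any forward range
    hpx::ranges::for_each(odd, [](int x) { (void) x; });

    // parallel overload: the call reported not to work here
    hpx::ranges::for_each(hpx::execution::par, odd, [](int x) { (void) x; });
}
```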
<gonidelis[m]>
sad
<gonidelis[m]>
it would be the perfect outro
<gonidelis[m]>
K-ballo: can you see pm please?
<K-ballo>
why wouldn't for_each work? it requires nothing extra
<gonidelis[m]>
hpx for_each won't
<gonidelis[m]>
i don't know why
<K-ballo>
are you triple sure? sounds like an hpx bug
<gonidelis[m]>
yes
<gonidelis[m]>
it is an hpx bug
<gonidelis[m]>
it should be running
<K-ballo>
make sure to report it
<gonidelis[m]>
i will
<gonidelis[m]>
i will probably fix it too
<gonidelis[m]>
K-ballo: pm...in case you have a spare minute.
<K-ballo>
gonidelis[m]: consider how a parallel for each would work
<K-ballo>
the simplest approach is to split the range into as many subranges as you have threads of execution
<gonidelis[m]>
yes
<K-ballo>
so if you have 2 threads, you'd split the range in the middle
<gonidelis[m]>
make the transformation on an element-by-element basis and decide on removal on the spot
<K-ballo>
so, how do you get to the middle?
<gonidelis[m]>
everything sounds great? ;p
<gonidelis[m]>
to the middle?
<gonidelis[m]>
say 2 threads... i split the range in half
<gonidelis[m]>
1 threads does the one half and the other does the other one
<K-ballo>
consider a good old iterator pair range: [first, last), how do you find the middle?
<gonidelis[m]>
ahh you need `remove_if` to provide a sequence of elements. so the non-removed ones would add overhead in finding where they should be put
<K-ballo>
no, forget remove_if
<gonidelis[m]>
first + last /2
<gonidelis[m]>
( ) *
<K-ballo>
first + last makes no sense, what does it mean to add iterators?
<gonidelis[m]>
last - first / 2
<K-ballo>
ok, better, bad precedence
<gonidelis[m]>
(last - first) / 2
<K-ballo>
there you go
<K-ballo>
and that requires random access iterators
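As a sketch in code (`middle` is an illustrative name, not an HPX or standard facility):

```cpp
// O(1) midpoint of an iterator pair; operator- and operator+ are only
// provided by random access iterators
template <typename RandomIt>
RandomIt middle(RandomIt first, RandomIt last)
{
    return first + (last - first) / 2;
}
```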
<gonidelis[m]>
yy
<gonidelis[m]>
actually
<gonidelis[m]>
why
<gonidelis[m]>
?
<gonidelis[m]>
it requires random access for that to be O(1)
<gonidelis[m]>
it could be done in O(N) with bidir
<K-ballo>
iterators only provide functionality that is (amortized) O(1)
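For contrast, a sketch of the only way to reach the middle with bidirectional iterators (`middle_bidir` is illustrative):

```cpp
#include <iterator>

// no operator+/operator- here: std::distance and std::next walk the range
// step by step, so this is O(N); the iterator interface deliberately refuses
// to dress that up as an O(1)-looking operation
template <typename BidirIt>
BidirIt middle_bidir(BidirIt first, BidirIt last)
{
    return std::next(first, std::distance(first, last) / 2);
}
```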
<gonidelis[m]>
ahhh ok
<gonidelis[m]>
yeah
<gonidelis[m]>
sounds right
<K-ballo>
and you are not allowed by the algorithm complexity guarantees to do that in O(N) anyhow
<gonidelis[m]>
we have algos for the O(N)'s
<gonidelis[m]>
ok great
<gonidelis[m]>
so that requires random access but i have bidir
<K-ballo>
so, in the traditional world, no random access -> no parallelizable
<gonidelis[m]>
actually why do you ask about the middle
<gonidelis[m]>
?
<gonidelis[m]>
yyyyes
<gonidelis[m]>
i wish that was the case
<K-ballo>
in the lazy ranges world it can be
<gonidelis[m]>
0.0
<K-ballo>
the middle comes from having 2 threads
<gonidelis[m]>
ahh yes
<gonidelis[m]>
you need to split it in half
<K-ballo>
if I had 8 threads I'd ask for the 7 positions that split that into 8 subranges
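A sketch of that generalization (`split_points` is an illustrative name):

```cpp
#include <vector>

// the k-1 cut positions that divide [first, last) into k subranges;
// each cut is computed in O(1), hence the random access requirement
template <typename RandomIt>
std::vector<RandomIt> split_points(RandomIt first, RandomIt last, int k)
{
    std::vector<RandomIt> cuts;
    auto const n = last - first;
    for (int i = 1; i < k; ++i)
        cuts.push_back(first + n * i / k);
    return cuts;
}
```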
<gonidelis[m]>
ok
<gonidelis[m]>
how? what happened such that all of a sudden a lazy view gives me the middle just that fast?
<K-ballo>
you can't get the middle in O(1)
<K-ballo>
but you don't necessarily need to
<gonidelis[m]>
ahhh why?
<K-ballo>
because your range has bidirectional iterators
<K-ballo>
but if you can get the middle of the underlying range in O(1) you are good to go, or the middle of the underlying range of that one, or ... and so on
<gonidelis[m]>
why is it that i don't need to get the middle in O(1)?
<gonidelis[m]>
i don't understand your argument
<K-ballo>
you do still need to get to the middle in O(1), but it doesn't need to be the middle of the outermost range, you can pick any one of the lazy chain of operations
<K-ballo>
if you have remove_if(vector) you can't jump to the middle of it in O(1), but you can jump to the middle of the vector in O(1), and apply remove_if to different halves
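A sketch of that idea, with C++20's `std::views::filter` standing in for the lazy `remove_if` and plain `std::thread` providing the two threads of execution; all names are illustrative:

```cpp
#include <ranges>
#include <thread>
#include <vector>

void process(int x);   // whatever the for_each body would do

void for_each_filtered_in_halves(std::vector<int>& v)
{
    auto pred = [](int x) { return x % 2 != 0; };

    // O(1): the middle of the *underlying* vector, not of the filtered view
    auto mid = v.begin() + (v.end() - v.begin()) / 2;

    // apply the lazy filter to each half independently
    std::thread t([&] {
        for (int x : std::ranges::subrange(v.begin(), mid) | std::views::filter(pred))
            process(x);
    });
    for (int x : std::ranges::subrange(mid, v.end()) | std::views::filter(pred))
        process(x);
    t.join();
}
```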
<gonidelis[m]>
ahhhhhhhhhhhhhhh
<gonidelis[m]>
yes
<gonidelis[m]>
yeeeeeees
<gonidelis[m]>
ok
<gonidelis[m]>
see pm again then K-ballo ;p
<K-ballo>
no, keep algorithmic discussion public
<gonidelis[m]>
it's not an algorithmic discussion
<gonidelis[m]>
K-ballo: what you propose is fork-join parallelism
<gonidelis[m]>
and actually that's the go-to solution in my talk if fused parallelism does not work
<gonidelis[m]>
and yes it runs faster
<gonidelis[m]>
(well of course it would)
<gonidelis[m]>
so it's all a matter of what iterators would be exposed
<gonidelis[m]>
by your view
<gonidelis[m]>
great
<K-ballo>
I don't necessarily propose a fork-join parallelism, no
<gonidelis[m]>
not necessarily
<gonidelis[m]>
rather as a back-up solution
<K-ballo>
I don't know what "fused parallelism" is
<gonidelis[m]>
parallelize the two chained operations simultaneously
<hkaiser>
srinivasyadav227: why did you close #5298?
<srinivasyadav227>
hkaiser: sorry, that branch was completely messed up, so i deleted it (so the PR automatically got closed). i have a backup branch, a clean branch with the changes rebased onto fixing_datapar
<hkaiser>
srinivasyadav227: there is no need to close PRs, you can always force-push
<hkaiser>
to the branch, I mean
<srinivasyadav227>
nono.. i deleted the branch, so it got closed automatically, i did not do it manually
<hkaiser>
ok
<srinivasyadav227>
hkaiser: is there any way now i can push a new branch to #5298 again?
<hkaiser>
if it's the same name, then you should be able to reopen it and push to it
<K-ballo>
gonidelis[m]: the operations are lazy, there's no choice but to parallelize them simultaneously.. what am I missing?
<srinivasyadav227>
yes, it's the same name ;)
<K-ballo>
I just remembered a conversation we had in Issaquah when we first discussed adopting parallel algorithms: the complexity guarantee requirements do allow for advancing the bidirectional iterator, so it would actually be viable for a *sized* bidirectional range (just undesirable)
<K-ballo>
it wasn't an option pre ranges, because asking the size added an N
<hkaiser>
what is a 'sized' bidir iterator?
<K-ballo>
no, nevermind, that's still another N
<K-ballo>
sized range
<srinivasyadav227>
hkaiser: both branches have the same name but it's not allowing me to reopen, sorry, can i create another PR?
<K-ballo>
sized range knows its size in constant time, but it's not necessarily random access
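In C++20 concepts terms, `std::list` is the classic example: `size()` is O(1) since C++11, but its iterators are only bidirectional:

```cpp
#include <list>
#include <ranges>

static_assert(std::ranges::sized_range<std::list<int>>);
static_assert(std::ranges::bidirectional_range<std::list<int>>);
static_assert(!std::ranges::random_access_range<std::list<int>>);
```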
<hkaiser>
srinivasyadav227: no other way, then
<gonidelis[m]>
<K-ballo "gonidelis: the operations are la"> hm yeah
<gonidelis[m]>
K-ballo: there is no straightforward way of parallelizing them in a fork-join fashion
<hkaiser>
K-ballo: how would that work?
<srinivasyadav227>
hkaiser: ok, thanks, i will do it ;)
<K-ballo>
hkaiser: list | take_exactly(7)
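`take_exactly` is range-v3's; a sketch of the same shape with the std analogue `std::views::take` (which, unlike `take_exactly`, also handles shorter inputs): the result knows its size in O(1) yet its iterators stay bidirectional:

```cpp
#include <list>
#include <ranges>

void sketch()
{
    std::list<int> l(100);
    auto r = l | std::views::take(7);   // size() == min(7, l.size()), O(1)

    static_assert(std::ranges::sized_range<decltype(r)>);
    static_assert(!std::ranges::random_access_range<decltype(r)>);
}
```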
<gonidelis[m]>
actually doing it in a fork-join way would destroy the concept of laziness