K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
hkaiser has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
jaafar has quit [Remote host closed the connection]
jaafar has joined #ste||ar
hkaiser has quit [Quit: bye]
bita has joined #ste||ar
jejune has joined #ste||ar
shubham has joined #ste||ar
jejune has quit [Quit: "What are you trying to say? That I can dodge bullets?" "No Neo, what I'm trying to say, is that when you are ready.....you won't have to"]
bita has quit [Ping timeout: 260 seconds]
shubham has quit [Quit: Connection closed for inactivity]
linus2 has joined #ste||ar
diehlpk_work has quit [Ping timeout: 264 seconds]
linus2 has quit [Client Quit]
linus2 has joined #ste||ar
tid_the_harveste has quit [Quit: Idle for 30+ days]
K-ballo has joined #ste||ar
K-ballo has quit [Ping timeout: 245 seconds]
hkaiser has joined #ste||ar
<gonidelis[m]> heyy why do i get that?
<hkaiser> gonidelis[m]: this tries to compare a std::vector<short>::iterator with your sentinel
<gonidelis[m]> hkaiser: which is what I want, isn't it? Or is it that our custom sentinel is based on the custom iterator implemented below?
<hkaiser> I always argued that the operator!=()/operator==() should be defined on the sentinel, not the iterator
<gonidelis[m]> But then again we do use this facility for our algorithms tests
<hkaiser> sure
<gonidelis[m]> give me a sec.... there is a serious possibility that the millions of `iter_sent.hpp`s have created a confusion and I'm the one to blame
<hkaiser> not sure why that happened
<gonidelis[m]> sec
<gonidelis[m]> i remember k-ballo suggesting that I should tweak iter_sent.hpp in order to work for the tests but I just can't find my changes (cause of the million instances of iter_sent.hpp, probably)
<hkaiser> gonidelis[m]: as said, the current implementation of the iter_sent in master can be improved
<gonidelis[m]> i do get your point
<gonidelis[m]> and I agree
<hkaiser> the comparison operators should be defined either on the sentinel or as global functions that are enable'd for iterators for which is_sentinel_for<> is true
<hkaiser> for reasons unknown (most likely I simply didn't pay attention to this), currently the operators are defined for the specific Iterator type only - so the sentinel can't be used for arbitrary iterators
<gonidelis[m]> ahhh yeah now i recall!!!
<hkaiser> actually the operators should be sfinae'd on is_input_iterator<>, that should be sufficient
<hkaiser> yes
<hkaiser> not sure why the other file is on master, then
<hkaiser> those operators are correct
<gonidelis[m]> because we have multiple `iter_sent.hpp`s
<gonidelis[m]> and the problem stems from my newbie PRs when I didn't know where to place files in order to make them globally accessible
<gonidelis[m]> on it
<hkaiser> could you please clean that up?
<gonidelis[m]> don't even ask
<gonidelis[m]> just one thing
<gonidelis[m]> How could I make iter_sent sth like a global facility that we could access like #include <hpx/.....>
<gonidelis[m]> instead of stupid `include " "`s
<hkaiser> we need the file for the tests only, no?
<hkaiser> if the file is needed for the tests only, we should leave it as #include "", just try to have it at a spot where it can be #included from all tests
peltonp1 has quit [Read error: Connection reset by peer]
<k-ballo[m]> why would we have more than one iter_sent.hpp?
<gonidelis[m]> yes it's only for the tests
<gonidelis[m]> k-ballo: yeah we don't need them
<gonidelis[m]> I will remove them
<gonidelis[m]> hkaiser: the problem is that it's been used from tests both under `/parallelism` and `/core`
<gonidelis[m]> could I put them maybe under `hpx/tests` ?
<hkaiser> gonidelis[m]: let's have it duplicated, we don't have a central spot for test-related headers
<gonidelis[m]> ok
<gonidelis[m]> duplicated is the maximum for me. I won't allow triplicate or more ;p
<hkaiser> +1
<jedi18[m]> hkaiser: Can I include algorithms that use partitioner_with_cleanup? (uninitialized_copy, uninitialized_fill.. etc)
<hkaiser> jedi18[m]: you 'can include' whatever you like ;-)
<hkaiser> we're not in scholl here - you are an adult, I presume
<hkaiser> school, even
<jedi18[m]> Yeah no I mean it should be doable right? Or is there any reason these haven't been attempted yet?
<hkaiser> no particular reason except that nobody had time/resources to look into it
<hkaiser> partitioner_with_cleanup is very similar to the plain partitioner
<hkaiser> it enables some rollback in case of errors
<jedi18[m]> Oh ok, so for adapting algorithms to C++20, I'm thinking of working on unique, adjacent_difference, lexicographical_compare, swap_ranges, uninitialized_copy, uninitialized_fill and uninitialized_move
<jedi18[m]> Since those are the partitioner/foreach_partitioner ones
<jedi18[m]> Given how long it took me to do min, max, minmax, and assuming I can do it quicker once I do the starting few, that should comprise 1/3rd to 1/2 of my proposed work, is that okay?
Ri2Raj has joined #ste||ar
<Ri2Raj> Hello everyone, so I have built hpx from source as mentioned in the docs but currently I'm unable to compile the hello world program. It says **fatal error: hpx/hpx_main.hpp: No such file or directory**
<Ri2Raj> Currently I'm using Ubuntu 20.04 and have installed all the dependencies
<gonidelis[m]> do you link your program correctly to HPX?
<hkaiser> jedi18[m]: sounds good
<zao> Ri2Raj: How are you building your software and what instructions do you follow, if any?
<Ri2Raj> I have used this command ** g++ -o hello_world hello_world.cpp **
<hkaiser> Ri2Raj: have you installed HPX?
<Ri2Raj> Yess obviously I've first built hpx from source and then installed it using ** make install ** as in the docs
<hkaiser> Ri2Raj: where did you install it to?
<hkaiser> what's your CMAKE_INSTALL_PREFIX
<Ri2Raj> Okay So my CMAKE_INSTALL_PREFIX is my system's root directory
<hkaiser> ok, what did you specify when invoking cmake?
<Ri2Raj> ** cmake ../ **
<zao> (unless your dependencies are installed in a global location (which you really shouldn't), you need compiler flags to find the headers and linker flags to find the libraries)
<hkaiser> Ri2Raj: apparently he did install to system
<hkaiser> zao: ^^
<Ri2Raj> zao which compiler flags should I use
<zao> hkaiser: If it's truly the root, it may not be considered.
<zao> (i.e. -DCMAKE_INSTALL_PREFIX=/)
<hkaiser> nod
<zao> I was curious which paths _are_ considered :D https://gist.github.com/zao/85a46073b5a6b33c0517f116bbbe0828
<zao> Ri2Raj: -I (capital i) governs where headers are searched. -L governs where libraries are searched.
<zao> Typically one would use the CMake or pkg-config files from a HPX installation to get the correct flags to use.
<zao> I would strongly recommend installing software like this into a specific separate location, as it makes it easier to rebuild and wipe it.
<zao> Regardless, if it's not in a standard location, you tend to specify something like `-I$HOME/opt/hpx-master/include -L$HOME/opt/hpx-master/lib` to augment the search paths.
<zao> (when having installed it with `-DCMAKE_INSTALL_PREFIX=$HOME/opt/hpx-master`)
<Ri2Raj> Thanks ! I'll try
<zao> HPX may have additional directories you need to specify, I've never really looked :)
<Ri2Raj> So now I have to remove hpx
K-ballo has joined #ste||ar
<zao> Ri2Raj: The flags you get if you ask pkg-config are something like these: https://gist.github.com/zao/5a8bcfc03e7c024eb4879b88b5986bfa
<zao> I'm sure it's documented somewhere too how to use HPX, but it may assume some fundamental compiling knowledge
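Put together, the pkg-config route zao mentions looks roughly like this. The install prefix `$HOME/opt/hpx-master` is only the example prefix from above (adjust to your own `CMAKE_INSTALL_PREFIX`), and `hpx_application` is the pkg-config module name HPX's documentation uses:

```shell
# Make the HPX .pc files findable (prefix is an assumption, taken from the
# -DCMAKE_INSTALL_PREFIX=$HOME/opt/hpx-master example above).
export PKG_CONFIG_PATH=$HOME/opt/hpx-master/lib/pkgconfig:$PKG_CONFIG_PATH

# Let pkg-config emit the -I/-L/-l flags instead of hand-writing them.
g++ -o hello_world hello_world.cpp $(pkg-config --cflags --libs hpx_application)
```

This is a build-configuration sketch, not a verified recipe; the exact library directory (`lib` vs `lib64`) and required C++ standard flag depend on your platform and HPX version.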
diehlpk_work has joined #ste||ar
<Ri2Raj> I observed a strange thing: in my **/usr/local/lib** and **/usr/local/include** there are no hpx files
Ri2Raj has quit [Quit: Connection closed]
bita has joined #ste||ar
<srinivasyadav227> hkaiser: I have done a small performance test by directly using vc
shubh_ has joined #ste||ar
<hkaiser> srinivasyadav227: still much worse than the omp loop
<srinivasyadav227> yes
<srinivasyadav227> the performance improvement from using (dataseq hpx::for_each) instead of (plain hpx::for_each)
<srinivasyadav227> is almost the same as from using (vc vectorized for loop) instead of (normal for loop)
<hkaiser> nod, the compiler might do some vectorization on its own for the normal loop
<srinivasyadav227> so i don't think anything is wrong with the hpx implementation, but the vc vectorized version needs to be optimized
<hkaiser> I'm concerned that our for_each is so much slower
<srinivasyadav227> yea.. i think for_each is creating some significant overhead compared to a normal for loop
<srinivasyadav227> but hpx::for_each is slower than std::for_each
<hkaiser> could you try hpx::for_loop instead? that would allow to use 'normal' integer index bounds
<srinivasyadav227> sure, hpx::for_loop returns iterator right?
<hkaiser> hpx::for_loop(0, nums.size(), [&](auto& i) { nums[i] += 5; });
<srinivasyadav227> ok one min
shubh_ has quit [Quit: Leaving]
<ms[m]> hkaiser: freenode_srinivasyadav227[m]: I'd check that in the hpx non-parallel unseq case we're not doing any chunking (i.e. that we're just using one big chunk)
<hkaiser> ok, I'll have a look
<hkaiser> ms[m]: he is just doing seq for the hpx algorithms
<ms[m]> in parallel you obviously still need it, but it would set a baseline to see how much of the overhead comes from various function wrappers and how much from the chunking (it might not be much but worth looking into)
<ms[m]> I think we might do the chunking even in the sequential case, but I'm not sure
<hkaiser> don't think so - but you might be right
<ms[m]> I may also be wrong... just something I thought might be worth checking
<srinivasyadav227> ok, shall i test with chunking?.. what would be an optimal chunk size? i am testing for 2^10, 2^20, 2^30
<srinivasyadav227> i mean, would you suggest an optimal chunk size, with my array sizes being 2^10, 2^20, 2^30
<hkaiser> srinivasyadav227: give me a day or so to reproduce your issues
<srinivasyadav227> ok sure, ;-)
<srinivasyadav227> hkaiser: finally my exams are done, i could start working on separating datapar, i have some doubts related to it
<hkaiser> cool
<srinivasyadav227> in datapar/transform_loop.hpp and util/transform_loop.hpp, inside namespace hpx::parallel::util in both headers
<srinivasyadav227> we defined transform_loop and transform_loop_n functions
<srinivasyadav227> and which internally calls structs (call method) defined in detail right?
<srinivasyadav227> so should we define a CPO for the datapar transform_loop overloads? for separating
<hkaiser> srinivasyadav227: right
<hkaiser> hmmm, the idea was to have separate implementations of the datapar algorithms, but that might lead to a lot of code duplication
<hkaiser> the specialization of the inner loops doesn't sound too bad, does it?
<srinivasyadav227> hkaiser: inner loops?
<srinivasyadav227> or transform_loop, transform_loop_n, binary_loop etc?
<hkaiser> yes
<hkaiser> srinivasyadav227: I'm not so sure anymore what can be done to separate the datapar stuff
<hkaiser> would looking into performance improvements be an option for you as well?
<diehlpk_work> We will give a short course at the 16th U.S. National Congress on Computational Mechanics
<diehlpk_work> Please spread the word
<srinivasyadav227> hkaiser: yes, i am very much interested in performance improvements
<hkaiser> srinivasyadav227: ok - might be a better project
<srinivasyadav227> hkaiser: i felt the same regarding separating datapar, i think it would involve a lot of code duplication. honestly i could not understand why we want to separate them; they will only be enabled if HPX_HAVE_DATAPAR is defined, right? so they are already separated, right?
<hkaiser> well, yes
<hkaiser> I was hoping we could simplify the algorithm implementation if we didn't have to adapt to datapar
<srinivasyadav227> hkaiser: i think is_vectorpack trait is used in many places, i think it was mainly defined for datapar
<srinivasyadav227> hpx::is_vectorpack_execution_policy*
<srinivasyadav227> it is defined even if HPX_HAVE_DATAPAR is disabled, it just derives from std::false_type then
<srinivasyadav227> but we are mostly using it to check for datapar policy right?
shubhu_ has joined #ste||ar
shubham has joined #ste||ar
nanmiao has quit [Quit: Connection closed]
shubhu_ has left #ste||ar ["Leaving"]
shubhu_ has joined #ste||ar
shubham has left #ste||ar [#ste||ar]
shubham has joined #ste||ar
shubhu_ has quit [Quit: Leaving]
<hkaiser> srinivasyadav227: yes
bita_ has joined #ste||ar
K-ballo1 has joined #ste||ar
bita has quit [*.net *.split]
K-ballo has quit [*.net *.split]
K-ballo1 is now known as K-ballo
<hkaiser> srinivasyadav227: I have fixed the sequential for_loop to be faster now than the plain for(), trying to find a way to do the same for for_each....
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
shubham has quit [Quit: Connection closed for inactivity]
nanmiao has joined #ste||ar
jejune has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]
<gonidelis[m]> hkaiser: how about converting the hpx::ranges::next facility impl from straightforward function templates to func. objects?
<hkaiser> gonidelis[m]: what would be the rationale for this?
<gonidelis[m]> 1. we are all-in for func. objects as far as I understand
<gonidelis[m]> 2. recommended by cppref
<gonidelis[m]> 3(and most important). we provide the facility in a more abstracted way???
<hkaiser> more abstract way?
<hkaiser> show me the cppref link, please?
<gonidelis[m]> aren't FOs a more abstract representation?
<hkaiser> not necessarily
<gonidelis[m]> hm...
<gonidelis[m]> i admit that the core impl is identical
<gonidelis[m]> i just put a couple of `op()`s there
<hkaiser> not sure why we should use a FO here
<gonidelis[m]> ok
<gonidelis[m]> np
<hkaiser> but nicely noticed
<hkaiser> just make sure all of our implementations are constexpr
<gonidelis[m]> ok
<gonidelis[m]> I already have, but why do we need that again?
<hkaiser> so can be evaluated at compile time, if possible
<gonidelis[m]> every time
parsa has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
parsa has joined #ste||ar
parsa has quit [Client Quit]
parsa has joined #ste||ar
parsa has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
<gonidelis[m]> hkaiser: is this supposed to work if last is an integer ^^
<gonidelis[m]> no
<K-ballo> only if the integer is a sentinel for the given iterator type
<gonidelis[m]> it's the other overload that should work
<gonidelis[m]> is an iterator equivalent to iterator_traits<InputIter>::difference_type ?? K-ballo
<K-ballo> no
<gonidelis[m]> why
<K-ballo> why isn't it?
<K-ballo> what do you think a difference type is?
<gonidelis[m]> shouldn't the difference type of an iterator be "kind of" an int?
<gonidelis[m]> the type that we get when we subtract two iterators
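A quick static check of that claim: for `std::vector`, the difference type is `std::ptrdiff_t`, a signed integral type produced by subtracting two iterators, and it is not itself the iterator type — which is why an iterator is not "equivalent to" its `difference_type`:

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

using vec_iter = std::vector<int>::iterator;
using diff_t = std::iterator_traits<vec_iter>::difference_type;

// The difference type is the signed integral result of iterator subtraction...
static_assert(std::is_same_v<diff_t, std::ptrdiff_t>);

// ...and it is not the iterator type itself.
static_assert(!std::is_same_v<diff_t, vec_iter>);
```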
<K-ballo> double check your question, you possibly mean something else
<K-ballo> if not, then expand on your underlying reasoning
<gonidelis[m]> what question?
<gonidelis[m]> the rationale is based on the fact that the impl here
<gonidelis[m]> on the second overload
<gonidelis[m]> is being tested with an int
<K-ballo> the advance_to_sentinel overloads are not identical to the range::next overloads
<K-ballo> range::next operates on both sentinels and differences
<K-ballo> advance_to_sentinel is range::next (3)
<gonidelis[m]> that's why I am using advance_to_sentinel for our ranges::next impl
<K-ballo> no clue, why would you?
<K-ballo> sounds backwards, having a ranges::next we wouldn't need an advance_to_sentinel at all
<gonidelis[m]> because cppref uses advance
<K-ballo> yeah, next is advance by ref
<K-ballo> no, other way around
<gonidelis[m]> it's just api stuff in order for us to be standards conforming
<K-ballo> advance mutates the argument, next returns a mutated copy
<K-ballo> then why would you use advance_to_sentinel?
<gonidelis[m]> i thought they exposed the same underlying machinery
<K-ballo> they have common machinery
<K-ballo> advance_to_sentinel is range::next (3)
bita_ has quit [Ping timeout: 246 seconds]
parsa has joined #ste||ar
<gonidelis[m]> what about range::next (2) then?
<K-ballo> not
<K-ballo> that'd be advance_by_difference
<gonidelis[m]> it wasn't a closed (yes/no) question
<K-ballo> ?
<gonidelis[m]> we do not have advance_by_difference
<gonidelis[m]> hkaiser: do we have anything like that implemented?
<K-ballo> unlikely, it's just std::advance
<gonidelis[m]> so you suggest i use std advance on (2)
<gonidelis[m]> actually that makes sense
<gonidelis[m]> ahh... yeah that makes a lot of sense!!!
<K-ballo> not really, I wouldn't suggest implementing ranges::next in terms of other stuff, but sure if it does the job go for it
<gonidelis[m]> why?
<gonidelis[m]> we always implement facilities in terms of c++11 and c++17
<K-ballo> will you also do range::advance, or just range::next?
<gonidelis[m]> just range::next
<gonidelis[m]> idk haven't checked on range::advance
<gonidelis[m]> if it's that trivial i may give it a try
<K-ballo> odd choice
<K-ballo> range::next is range::advance on a copy, just like std::next is to std::advance
<K-ballo> it would make sense to make std::next on top of std::advance, and ranges::next on top of ranges::advance
<hkaiser> we have all of this, just using a different name
<K-ballo> making ranges::next on top of std::advance is... awkward at best
<hkaiser> needs consolidation
<gonidelis[m]> hkaiser: how may i search what augustin suggests?
<K-ballo> what am I suggesting?
<hkaiser> we have parallel::detail::v1::next or somesuch
<gonidelis[m]> K-ballo: implementing on top of ranges::advance which hkaiser suggests we might have
<K-ballo> if we had ranges::advance we wouldn't need advance_to_sentinel
<gonidelis[m]> <K-ballo "range::next is range::advance on"> so the advance variants are mutating?
<K-ballo> advance(it) advances the iterator
<K-ballo> next(it) gives you the iterator next to it
<K-ballo> and while we are at it, advance_to_sentinel isn't advancing, just to make things extra fun :)
<gonidelis[m]> so after next(it) i still expect it to point to its old value
<gonidelis[m]> <K-ballo "and while we are at it, advance_"> because it returns a copy?
<K-ballo> because it doesn't mutate the argument
<gonidelis[m]> ...by returning a copy
<K-ballo> it could mutate the argument and still return a copy, then it would be advancing
<gonidelis[m]> lol
<gonidelis[m]> that would be a case
<K-ballo> it wouldn't be an unreasonable case, advance returning nothing makes it unusable in expressions
<K-ballo> chances are after advancing the iterator you want to use it
<gonidelis[m]> let's propose it then
<K-ballo> it would probably lead to confusion though
<gonidelis[m]> ok that's one more reason in order for us to propose it
<K-ballo> ?
<gonidelis[m]> as a Standards solution
<K-ballo> sounds backwards
<gonidelis[m]> ?
<gonidelis[m]> K-ballo: practically (1) is std::advance
<gonidelis[m]> ?
<gonidelis[m]> right?
<K-ballo> yes
<K-ballo> and in implementing (2) you will be reimplementing it
<gonidelis[m]> then for my case using std::advance would be perfect
<gonidelis[m]> what (2) ?
<K-ballo> the other overloads
<K-ballo> sorry (3)
<gonidelis[m]> yeah sure
<K-ballo> consider the whole set as one, not only the multiple overloads, but the entire advance/next/prev family
<gonidelis[m]> so I use advance_to_sentinel for (2) and a mix of advance_to_sent + std::advance for (3)
<K-ballo> probably distance too, not sure what the status of distance is
<gonidelis[m]> I have implemented that
<gonidelis[m]> distance**
<gonidelis[m]> ^^
<K-ballo> (1) and (2) are special cases of (3)
<gonidelis[m]> right
nanmiao has quit [Quit: Connection closed]
parsa has quit [Ping timeout: 264 seconds]
parsa has joined #ste||ar
<gonidelis[m]> but works fine if i have it commented out