hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
<gonidelis[m]> * gdaiss: :)
<gonidelis[m]> is memmove just a better memcpy ?
<zao> It has guarantees about handling overlapping ranges and such, but it performs the same kind of operation.
<gonidelis[m]> figures
<gonidelis[m]> shouldn't we get rid of memcpy then
<gonidelis[m]> "we" 😅
<zao> It does more work to guarantee this, including doing fun stuff like copying backwards.
<gonidelis[m]> copying backwards uses memmove?
<gonidelis[m]> anyways, the other thing is, what's the difference between std::memcpy and memcpy (same for memmove)? I know one is C and the other is the C++ standard library, but... any actual differences?
<zao> If I saw a memmove in a codebase I'd expect it to be there for a reason. For the usual cases where a C++ codebase needs to copy bytes, it's typically for copying storage between distinct objects of unrelated types.
<zao> For those cases you'd use memcpy as that's the simpler and more prescribed one.
<zao> Fun notes section for std::memcpy btw:
<zao> > std::memcpy is meant to be the fastest library routine for memory-to-memory copy. It is usually more efficient than std::strcpy, which must scan the data it copies or std::memmove, which must take precautions to handle overlapping inputs.
<gonidelis[m]> i read on SO that std::memcpy is just an alias for std::memmove nowadays
<zao> That sounds like a very implementation-specific claim.
<gonidelis[m]> alright fair
<gonidelis[m]> what about std::memcpy vs memcpy ?
<zao> Even if they share the same logic on some particular implementation, you still can't use them interchangeably. If you require memmove semantics you need to call memmove, both because other implementations exist and because compilers could very well rewrite and optimize around the calls.
<gonidelis[m]> wait.... i am asking why have both memcpy in the std library and C
<gonidelis[m]> nothing related to memmove
<zao> The above was in continued response to the statement about SO.
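To make the overlap point concrete, here is a minimal sketch. The helper names `shift_right` and `copy_distinct` are made up for illustration and do not come from the discussion. The first helper copies within a single buffer, so source and destination overlap and `std::memmove` is the call to use. The second copies between two separate buffers, where `std::memcpy` is enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy the first `count` bytes of `text` to the position
// `shift` bytes further right, inside the same buffer. Source and destination
// overlap whenever shift < count, so std::memmove is required here; calling
// std::memcpy on overlapping ranges is undefined behaviour.
std::string shift_right(std::string text, std::size_t count, std::size_t shift)
{
    assert(count + shift <= text.size());
    std::memmove(&text[0] + shift, &text[0], count);
    return text;
}

// Hypothetical helper: copy between two distinct buffers. The ranges cannot
// overlap, so the simpler std::memcpy is sufficient.
std::string copy_distinct(const std::string& src)
{
    std::string dst(src.size(), '\0');
    std::memcpy(&dst[0], src.data(), src.size());
    return dst;
}
```

Calling `shift_right("abcdef", 4, 2)` copies "abcd" over positions 2 to 5 and yields "ababcd", which is the result you get when the copy behaves as if it went through a temporary buffer.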
<zao> I'm not sure what the general wording is about <cblorp> vs. <blorp.h>, but I think there's some overarching text on the standard libraries in the standard document.
<gonidelis[m]> oh sorry... and thanks
<zao> I've grown old, I used to know just where to look in the PDFs and cite chapter and verse :D
<gonidelis[m]> lol
<gonidelis[m]> yes i guess that's a general question
<zao> In general, there's some wording about C functions additionally being available in namespace std (and vice versa) when including <cx> and <x.h>, and apart from some overloading rather than suffixing they're similar in semantics.
<gonidelis[m]> feels uncomfortable
<gonidelis[m]> ok
<gonidelis[m]> alrighty....
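A small sketch of the header question, under the usual reading of the standard: `<cstring>` guarantees the name in namespace `std` and may also declare it globally, while `<string.h>` guarantees the global name. Both spellings refer to the same function. The wrapper names below are hypothetical and exist only to exercise each spelling. The last function shows the "overloading rather than suffixing" remark: C++ overloads `std::abs` for `double`, where C uses the separately named `fabs`.

```cpp
#include <cassert>
#include <cmath>     // std::abs(double) overload and std::fabs
#include <cstring>   // guarantees std::memcpy; may also declare ::memcpy
#include <string.h>  // guarantees ::memcpy; may also declare std::memcpy

// Hypothetical wrapper using the namespace-std spelling.
int copy_with_std(int value)
{
    int out = 0;
    std::memcpy(&out, &value, sizeof out);
    return out;
}

// Hypothetical wrapper using the global (C-style) spelling.
int copy_with_global(int value)
{
    int out = 0;
    ::memcpy(&out, &value, sizeof out);
    return out;
}

// C++ adds overloads under one name (std::abs for double),
// where C relies on suffixed or prefixed names such as fabs and labs.
double abs_overloaded(double x)
{
    return std::abs(x);
}
```

Within C++ code, including `<cstring>` and writing `std::memcpy` is the spelling that is guaranteed to compile regardless of what the implementation also puts in the global namespace.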
<Isidoros[m]> Hello everyone! I'm Isidoros, an ECE student from Greece with a keen interest in HPC systems. I'm thrilled to be part of this group and connect with all of you. I'm looking forward to learning from your experiences and expertise in the field and contributing to the community in any way possible.
<Isidoros[m]> Also, while I'm here, I came across a possible bug that I wanted to share with everyone. Can I bring it up for discussion here?
<Isidoros[m]> Thanks, and looking forward to chatting with all of you!
<satacker[m]> > <@seiras:matrix.org> Hello everyone! I'm Isidoros, an ECE student from Greece with a keen interest in HPC systems. I'm thrilled to be part of this group and connect with all of you. I'm looking forward to learning from your experiences and expertise in the field and contributing to the community in any way... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/6f4b2a33fc660c0b5f4bc994485d1c8a866b0792>)
<Isidoros[m]> Great! So, there is a correctness bug in the set_intersection algorithm. This function behaves very differently depending on the number of threads it is called with. Some thread counts cause it to hang, while others just cause the result to be wrong.
<Isidoros[m]> This definitely happens when giving as input two equal vectors
<Isidoros[m]> And right now I am trying to narrow down the conditions for this to occur
<satacker[m]> The standard from here https://en.cppreference.com/w/cpp/algorithm/set_intersection says:
<satacker[m]> If either of the input ranges is not sorted (using operator< or comp, respectively) or overlaps with the output range, the behavior is undefined.
<Isidoros[m]> Yes, all the cases I have tried are sorted.
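For reference, a minimal sketch of the sequential standard-library call with the quoted preconditions made explicit. The helper name `intersect_sorted` is an assumption for illustration. This is not the HPX code path, only the baseline a parallel result would be compared against.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper around std::set_intersection.
std::vector<int> intersect_sorted(const std::vector<int>& a,
                                  const std::vector<int>& b)
{
    // Precondition from the standard: both inputs are sorted with operator<.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    // The output lives in its own vector, so it cannot overlap either input.
    std::vector<int> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}
```

With sorted inputs and a separate output container, both conditions for defined behaviour are met, so any deviation from this result points at the implementation rather than the input.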
hkaiser has quit [Quit: Bye!]
<Isidoros[m]> Is it okay if I post code to reproduce it here?
<satacker[m]> Yes, you can use gist/pastebin links
<satacker[m]> There are tests under hpx/libs/core/algorithms/tests/unit/algorithms/set_intersection.cpp; you can add test(s) and make a PR as well.
<Isidoros[m]> https://pastebin.com/Br1W6yqf
<Isidoros[m]> Running this snippet with different thread counts gives me different results
<Isidoros[m]> <satacker[m]> "There are tests under hpx/libs..." <- I am not quite sure what the root cause is
<Isidoros[m]> This is supposed to return "2" right?
<satacker[m]> Yes
<satacker[m]> Something's off
<Isidoros[m]> I think it has to do with how the chunks are split between threads
<Isidoros[m]> It appears when `threads >= n` and then persists.
<Isidoros[m]> But it also appears with larger sets. In some cases it hangs and in others it segfaults
<satacker[m]> Did you check issues on hpx github?
<Isidoros[m]> yes
<satacker[m]> If there isn't one, raise one
<satacker[m]> and you should add a test case
<Isidoros[m]> will do, Thanks for the help!
<satacker[m]> Isidoros[m]: I agree with your observation. Great
<gonidelis[m]> Isidoros: wow
hkaiser has joined #ste||ar
<Isidoros[m]> This does fix some of the cases with small input, but still bugs out in some others. It broke when calling it with two vectors of equal length filled with ones
<satacker[m]> Was supposed to take care of cores
<satacker[m]> So that we don't have to limit it manually
<satacker[m]> Isidoros[m]: +1
<satacker[m]> On the other hand this case is undefined
<Isidoros[m]> Maybe this is too much of an edge case, but according to the docs I think it should be defined to output a third vector equal to the other two.
<Isidoros[m]> "If some element is found m times in [first1, last1) and n times in [first2, last2), the first std::min(m, n) elements will be copied from the first range to the destination range."
<Isidoros[m]> I may be missing something tho
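The quoted rule can be checked against the sequential standard algorithm with a short sketch. The function names `predicted_copies` and `emitted_copies` are hypothetical. The first computes `std::min(m, n)` for one value, the second counts how many copies `std::set_intersection` actually writes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Copies of `value` the quoted rule predicts: min(m, n), where m and n are
// the number of occurrences in the first and second range.
std::ptrdiff_t predicted_copies(const std::vector<int>& a,
                                const std::vector<int>& b, int value)
{
    return std::min(std::count(a.begin(), a.end(), value),
                    std::count(b.begin(), b.end(), value));
}

// Copies of `value` that the sequential std::set_intersection emits.
std::ptrdiff_t emitted_copies(const std::vector<int>& a,
                              const std::vector<int>& b, int value)
{
    assert(std::is_sorted(a.begin(), a.end()) &&
           std::is_sorted(b.begin(), b.end()));
    std::vector<int> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return std::count(out.begin(), out.end(), value);
}
```

For two vectors of 100 ones, both functions give 100, which matches the reading that the output should equal the inputs in that case.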
<gonidelis[m]> If it's sometimes correct, then it's never correct
<gonidelis[m]> that's one
<gonidelis[m]> second, Isidoros i just reproduced the bug thanks. that's an excellent catch
<Isidoros[m]> Thank you!
<satacker[m]> If either of the input ranges is not sorted (using operator< or comp, respectively) or **overlaps with the output range, the behavior is undefined.**
<satacker[m]> Not sure what overlap is but if we assume that it's just what it says then that case is not supposed to be covered.
<Isidoros[m]> I think it means the output range must not share memory addresses with the input ranges
<satacker[m]> Regardless of that case, the thread count is causing a problem
<Isidoros[m]> I think it is an off-by-one error somewhere
<Isidoros[m]> I started with two 100 element vectors filled with ones.... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/7f90eb8c8ca20ef914cd07f764f314eb07a0223c>)
<Isidoros[m]> There must be something getting lost on each split
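One way to narrow down the failing sizes is a differential check against the sequential standard algorithm. The sketch below is an assumption about how such a search could look, not the reproducer from the pastebin. `Intersect` is a hypothetical signature for whatever implementation is under test, and `first_failing_size` is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

using Ints = std::vector<int>;
// Hypothetical signature for the implementation under test.
using Intersect = std::function<Ints(const Ints&, const Ints&)>;

// Sequential reference result from the standard library.
Ints reference_intersection(const Ints& a, const Ints& b)
{
    Ints out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}

// Returns the smallest n in [0, max_n] for which `candidate` disagrees with
// the reference on two n-element vectors of ones, or max_n + 1 if none does.
std::size_t first_failing_size(const Intersect& candidate, std::size_t max_n)
{
    for (std::size_t n = 0; n <= max_n; ++n) {
        const Ints ones(n, 1);
        if (candidate(ones, ones) != reference_intersection(ones, ones))
            return n;
    }
    return max_n + 1;
}
```

In practice the candidate would wrap the parallel call being investigated, and the search would be repeated for each thread count. A candidate that hangs needs a timeout around the call, which this sketch does not provide.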
<satacker[m]> Ref https://eel.is/c++draft/algorithms#alg.set.operations
<satacker[m]> Just noting it for reference.
hkaiser has quit [Quit: Bye!]