hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar-group.org | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | This channel is logged: irclog.cct.lsu.edu
Yorlik__ has joined #ste||ar
Yorlik_ has quit [Ping timeout: 240 seconds]
hkaiser has quit [Ping timeout: 252 seconds]
sivoais has quit [Server closed connection]
sivoais has joined #ste||ar
hkaiser has joined #ste||ar
hkaiser[m] has joined #ste||ar
hkaiser has quit [Quit: Bye!]
hkaiser has joined #ste||ar
hkaiser has quit [Quit: Bye!]
tufei has quit [Remote host closed the connection]
tufei has joined #ste||ar
diehlpk_work has joined #ste||ar
hkaiser has joined #ste||ar
Johan511 has joined #ste||ar
<Johan511> hello
<Johan511> the following is the vectorized implementation of simd_first (it finds the first element for which the given projection returns true).
<Johan511> I have implemented it in HPX
<Johan511> and checked vectorization (and performance reports) and it doesn't seem to vectorize
<Johan511> I have tried just copying this implementation and it didn't seem to vectorize with clang++ or g++,
<Johan511> should I use the same implementation as gcc?
<hkaiser> I think you should try to find out whether it vectorizes outside of HPX first, then we can decide what to do with it
<Johan511> no it doesn't
<hkaiser> why do you expect it to vectorize in the context of HPX, then?
<Johan511> I checked it in context of HPX first, it didn't, right now I checked it outside HPX, it didn't
<hkaiser> Johan511: so you're saying that the gcc implementation of unseq doesn't have any effect?
<Johan511> clang++ lc.cpp -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize -fopenmp -o lc
<Johan511> this is the command I am compiling with
<Johan511> am I missing something?
<hkaiser> I don't know
<Johan511> ok
<hkaiser> I assume you're the one that knows most about vectorization
<hkaiser> at least your gsoc application said as much ;-)
<Johan511> I will see what I can do
<hkaiser> thank you!
<Johan511> just wanted to know, if I can't get the function to vectorize, do I simply implement it the standard lib way?
<hkaiser> I think we have to find a solution that actually improves performance, otherwise there is no point in having a specific implementation - we could simply fall back to no-unseq
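[Note] For context on the exchange above: a plain find-first loop has a data-dependent early exit, which loop vectorizers often decline to handle. The code below is a minimal sketch of one common workaround, assuming a fixed block size and a side-effect-free predicate: evaluate a whole block without branching, then scan the per-lane results for the first hit. The name blocked_find_first and the block size of 8 are hypothetical; this is not the HPX or GCC code, and whether it vectorizes or runs faster has to be checked with the compiler's vectorization remarks and a benchmark.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not the HPX or GCC code).
// Returns the index of the first element for which pred is true, or n if
// there is none. pred is assumed to be side-effect free, because it may be
// evaluated on elements located after the first match within a block.
template <typename T, typename Pred>
std::size_t blocked_find_first(const T* data, std::size_t n, Pred pred)
{
    constexpr std::size_t block = 8;    // assumed block size, tune per target
    std::size_t i = 0;

    for (; i + block <= n; i += block)
    {
        unsigned char hit[block];

        // No early exit in this loop: every lane of the block is evaluated,
        // which gives the vectorizer a fixed trip count to work with.
        // The pragma is only honored with -fopenmp or -fopenmp-simd.
#pragma omp simd
        for (std::size_t j = 0; j < block; ++j)
            hit[j] = pred(data[i + j]) ? 1 : 0;

        // Scalar scan over the small flag array to locate the first match.
        for (std::size_t j = 0; j < block; ++j)
            if (hit[j])
                return i + j;
    }

    // Scalar tail for the remaining n % block elements.
    for (; i < n; ++i)
        if (pred(data[i]))
            return i;

    return n;
}
```

A call would look like blocked_find_first(v.data(), v.size(), [](int x) { return x % 2 == 0; }) for a std::vector<int> v. The trade-off is that up to block - 1 extra predicate calls are made past the match, so the approach only pays off when the predicate is cheap and matches are rare or late in the range.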
Johan511 has quit [Quit: Client closed]
HHN93 has joined #ste||ar
HHN93 has quit [Quit: Client closed]
diehlpk_work_ has joined #ste||ar
diehlpk_work has quit [Ping timeout: 240 seconds]
diehlpk_work_ has quit [Remote host closed the connection]
HHN93 has joined #ste||ar
HHN93 has quit [Quit: Client closed]