K-ballo changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<jaafar>
I was browsing the code and noticed that there is a transform_loop_n that is not implemented in terms of loop_n - and the latter seems to have optimizations for e.g. SIMD
<jaafar>
loop unrolling, which isn't in transform_loop_n
<jaafar>
it seemed like rewriting transform_loop_n to use loop_n ought in theory to not cost anything, and possibly give gains for users
<gonidelis[m]>
hmmm
<gonidelis[m]>
jaafar: By taking this into consideration, we should optimize `transform_loop_n` accordingly.
<jaafar>
It seemed to me that just using loop_n might be enough
<gonidelis[m]>
jaafar: I happen to be working on the `transform` C++20 adaptation right now and I have come across `transform_loop_n`, but only its interface
<gonidelis[m]>
jaafar: shouldn't we look into the performance of `transform_loop_n` though? Just to see where it stinks?
surbhi has quit [Ping timeout: 260 seconds]
<jaafar>
oh sure, whatever you think :)
<jaafar>
I was just surprised to see that one was not implemented in terms of the other. transform_loop_n has a raw C++ loop instead.
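[Note: a minimal sketch of the idea jaafar describes, i.e. expressing a transform-style loop through a shared, unrolled loop_n-style helper instead of a separate raw loop. Everything here is illustrative: the `sketch` namespace, the signatures, the unroll factor of four, and the `squares` helper are assumptions made for this example and are not HPX's actual implementation or API.]

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a loop_n-style primitive: calls f(it) for
// `count` consecutive iterators, with the main loop manually unrolled by 4.
template <typename Iter, typename F>
Iter loop_n(Iter it, std::size_t count, F&& f)
{
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4)
    {
        f(it); ++it;
        f(it); ++it;
        f(it); ++it;
        f(it); ++it;
    }
    // Handle the remaining 0..3 elements.
    for (; i < count; ++i)
    {
        f(it); ++it;
    }
    return it;
}

// A transform loop written in terms of loop_n rather than its own raw loop,
// so any optimization in loop_n is picked up here as well.
template <typename InIter, typename OutIter, typename F>
std::pair<InIter, OutIter> transform_loop_n(
    InIter first, std::size_t count, OutIter dest, F&& f)
{
    InIter last = loop_n(first, count, [&](InIter it) {
        *dest = f(*it);
        ++dest;
    });
    return {last, dest};
}

// Small usage helper (hypothetical): squares every element of `in`.
inline std::vector<int> squares(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    auto result = transform_loop_n(
        in.begin(), in.size(), out.begin(), [](int x) { return x * x; });
    assert(result.first == in.end());
    assert(result.second == out.end());
    return out;
}

}    // namespace sketch
```

[Whether this actually costs nothing compared to the raw loop would still need to be measured, as gonidelis[m] suggests above.]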
<gonidelis[m]>
jaafar: cool, thanks for letting me know... I might come across this issue in the next few weeks