hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
<hkaiser> weilewei: yah, that's one way of doing it
kale[m] has quit [Ping timeout: 246 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 246 seconds]
kale[m] has joined #ste||ar
Yorlik has quit [Ping timeout: 240 seconds]
<weilewei> hkaiser thanks
bita__ has joined #ste||ar
hkaiser has quit [Quit: bye]
RostamLog has joined #ste||ar
Yorlik has joined #ste||ar
akheir has quit [Quit: Leaving]
kale[m] has quit [Ping timeout: 246 seconds]
kale[m] has joined #ste||ar
Yorlik has quit [Ping timeout: 246 seconds]
bita__ has quit [Ping timeout: 260 seconds]
nikunj97 has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
Nikunj__ has joined #ste||ar
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 260 seconds]
Nikunj__ has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
nikunj97 has joined #ste||ar
Nikunj__ has quit [Ping timeout: 260 seconds]
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 240 seconds]
kale[m] has joined #ste||ar
nikunj97 has quit [Read error: Connection reset by peer]
nikunj97 has joined #ste||ar
Nikunj__ has joined #ste||ar
nikunj97 has quit [Ping timeout: 240 seconds]
Yorlik has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
Nikunj__ is now known as nikunj97
<nikunj97> ms[m], Have you ever noticed the following error: https://gist.github.com/NK-Nikunj/d3889e4533fc062a870ed4f32a9dd8a7
<nikunj97> I'm getting bus errors while trying to allocate nsimd::pack<float> (a data structure) but it runs just fine using normal floats
<ms[m]> nikunj97: nope, sorry
<ms[m]> alignment issues with your SIMD types?
<nikunj97> I use a library for my SIMD types, I do believe they should be aligned properly
<nikunj97> how do I check if the simd types are aligned properly?
<nikunj97> btw the code runs perfectly fine for smaller grid sizes
<nikunj97> something like 8192x8192 runs smoothly. It is when I get to larger allocations that I face memory bus issues.
hkaiser has joined #ste||ar
<ms[m]> print the addresses ;) anyway, check if changing the allocator makes a difference
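[editor's note] A minimal sketch of what "print the addresses" could look like in practice. It is not code from the gist and does not use nsimd. The helper names (is_aligned, report_alignment, allocate_aligned_floats) are hypothetical. The alignment to test for is an assumption that depends on the SIMD extension in use (for example 32 bytes for 256-bit registers), so check what the SIMD library actually requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: true if p lies on an `alignment`-byte boundary.
// `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: print the address and whether it satisfies the
// requested alignment, then return the result of the check.
inline bool report_alignment(
    const char* label, const void* p, std::size_t alignment)
{
    bool const ok = is_aligned(p, alignment);
    std::printf("%s: %p %s %zu-byte aligned\n", label,
        const_cast<void*>(p), ok ? "is" : "is NOT", alignment);
    return ok;
}

// Hypothetical helper: allocate `count` floats on an `alignment`-byte
// boundary. std::aligned_alloc (C++17) expects the size to be a multiple
// of the alignment, so the byte count is rounded up first.
// Release the result with std::free.
inline float* allocate_aligned_floats(std::size_t count, std::size_t alignment)
{
    std::size_t const bytes = count * sizeof(float);
    std::size_t const rounded = (bytes + alignment - 1) / alignment * alignment;
    return static_cast<float*>(std::aligned_alloc(alignment, rounded));
}
```

Calling report_alignment("grid", ptr, 32) on the pointer returned by whichever allocator the application uses shows whether the buffer really starts on the boundary the SIMD loads expect. Comparing that output between the failing node and a working one would help separate an alignment problem from a hardware one.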
<nikunj97> ms[m], tried both tcmalloc and jemalloc. It seems that the problem was with the processor itself. Running it on another node resolves the error.
<nikunj97> weird stuff
<nikunj97> btw it can't be an alignment issue with nsimd, as I ran into the same error with large grid sizes of float
<nikunj97> ms[m], it seems that the alignment issues are arising from boost::coroutines
<nikunj97> or maybe it is just the processor, idk
K-ballo has quit [Quit: K-ballo]
K-ballo has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
kale[m] has quit [Ping timeout: 246 seconds]
kale[m] has joined #ste||ar
hkaiser has quit [Quit: bye]
K-ballo has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 256 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 258 seconds]
kale[m] has joined #ste||ar
<nikunj97> hpx::init: hpx::exception caught: failed to initialize machine affinity mask: HPX(kernel_error)
<nikunj97> what is this error supposed to mean ^^
<nikunj97> Is there a problem with my hpx installation?
bita__ has joined #ste||ar
<zao> Does hwloc report a sane topology? Is this in a job or otherwise constrained by cgroups?
<nikunj97> zao, lstopo shows the right topology
<nikunj97> I'm running it on a single node
hkaiser has joined #ste||ar
kale[m] has quit [Ping timeout: 260 seconds]
kale[m] has joined #ste||ar
<nikunj97> hkaiser, yt?
<hkaiser> nikunj97: here
<nikunj97> hkaiser, I'm getting this error: hpx::init: hpx::exception caught: failed to initialize machine affinity mask: HPX(kernel_error)
<hkaiser> uhh
<hkaiser> that means hwloc returned an error
<nikunj97> the code runs perfectly fine on other processors. And ik hpx is set up properly coz the examples and the other benchmark run just fine
<nikunj97> it is just this code that returns the error
<hkaiser> can't think of what's happening, sorry
<nikunj97> any way to debug the code?
<hkaiser> use a debugger?
<nikunj97> ugg.. sure
<nikunj97> should I try rebuilding hwloc and build hpx again?
<hkaiser> shrug, not sure what's wrong
<hkaiser> I'd try to look at the arguments of the failing call
<nikunj97> k
<nikunj97> hkaiser, using pkg-config made it work for some reason
<hkaiser> lol
<nikunj97> this is weird behavior ;-)
<nikunj97> btw, looks like pkg-config does not add optimization flags
nikunj97 has quit [Read error: Connection reset by peer]
bita__ has quit [Ping timeout: 260 seconds]
kale[m] has quit [Ping timeout: 256 seconds]
kale[m] has joined #ste||ar
kale[m] has quit [Ping timeout: 264 seconds]
kale[m] has joined #ste||ar