hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
eschnett_ has joined #ste||ar
diehlpk_work has quit [Remote host closed the connection]
nikunj has quit [Remote host closed the connection]
nikunj has joined #ste||ar
nikunj has quit [Ping timeout: 245 seconds]
daissgr has joined #ste||ar
hello has joined #ste||ar
daissgr has quit [Ping timeout: 250 seconds]
daissgr has joined #ste||ar
hkaiser has joined #ste||ar
<hkaiser>
heller_: isn't it your birthday today?
<heller_>
hkaiser: it is
<hkaiser>
Happy Birthday!
<heller_>
thanks ;)
pmikolajczyk41 has joined #ste||ar
<heller_>
hkaiser: does my cuda commit work on windows?
<hkaiser>
heller_: have not tried yet
<heller_>
that would be very interesting and was the main reason to commit it at this stage ;)
<pmikolajczyk41>
Hello. While working on issue #3442 with a set_operation.hpp (hpx/parallel/algorithms/detail/set_operation.hpp) I got a little bit concerned with two function calls to parallel::util::partitioner<>::call(...) in lines 98 and 178.
<pmikolajczyk41>
As I understand (looking at chunk_size.hpp or .*_partitioner.hpp), the 3rd argument should be the number of elements to be partitioned, but these two calls use the number of available cores instead.
<pmikolajczyk41>
When I tried substituting *cores* with *len1*, I got some strange runtime errors.
<pmikolajczyk41>
I think this only affects efficiency, because the code works correctly; it gives the right answer for e.g. set difference or intersection.
<pmikolajczyk41>
Of course, I could have misunderstood the code, so I decided to ask whether somebody could have a look at it.
<hkaiser>
yes, we use whatever the executor returns as the number of processing units to use
<hkaiser>
don't remember why *puzzled*
<hkaiser>
pmikolajczyk41: could have been the result of some performance analysis...
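[Editor's sketch] One way a partitioner call over the number of cores, rather than the number of elements, can still produce correct results is if each "item" handed out is a core index and the callback derives its own element range from that index. The C++ below illustrates only that general idea. The helper ranges_per_core is hypothetical and not part of HPX, and it is an assumption, not a confirmed fact, that set_operation.hpp works this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not an HPX API): split the element range [0, len)
// into at most `cores` contiguous half-open ranges, one per core index.
std::vector<std::pair<std::size_t, std::size_t>> ranges_per_core(
    std::size_t len, std::size_t cores)
{
    assert(cores > 0);

    std::vector<std::pair<std::size_t, std::size_t>> ranges;

    // Ceiling division: every core gets at most `step` elements.
    std::size_t const step = (len + cores - 1) / cores;

    for (std::size_t core = 0; core != cores; ++core)
    {
        // The "item" being partitioned is the core index; the element
        // range is computed from it and clamped to `len`.
        std::size_t const first = std::min(len, core * step);
        std::size_t const last = std::min(len, first + step);

        // Cores that would receive no elements are skipped.
        if (first != last)
            ranges.emplace_back(first, last);
    }
    return ranges;
}
```

With len = 10 and cores = 4 this yields the ranges [0,3), [3,6), [6,9) and [9,10). If the element count were passed where such a callback expects a core count, the computed ranges would no longer match the data, which could be one explanation for runtime errors after swapping the argument.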
adityaRakhecha has quit [Ping timeout: 256 seconds]
<Yorlik>
o/
daissgr has quit [Ping timeout: 240 seconds]
pmikolajczyk41 has quit [Quit: Page closed]
<heller_>
hkaiser: the problem with the cancelable action example is not that it's trying to set the state of an active thread
<heller_>
but the state of a thread that doesn't exist anymore...
<hkaiser>
ok
<heller_>
so, in the case of an interruption, I don't think one needs to set the state again if the thread is active ... or we need to add reference counting back in
hello has quit [Ping timeout: 255 seconds]
<hkaiser>
heller_: is the test broken?
<heller_>
the test itself is fine, I think
<hkaiser>
how do we know the thread was interrupted in order not to set the state?
<heller_>
we set the flag, don't we?
<heller_>
and since we can only handle interruptions on interruption points...
<hkaiser>
we set the flag to cause interruption, but once interrupted (exception being thrown) - how do you know then?
<heller_>
we just terminate the thread, no?
<heller_>
there's no need to set the state from the outside
<hkaiser>
and flag any waiting threads, yes
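[Editor's sketch] The scheme discussed above (set a flag from outside, act on it only at interruption points, let the thread terminate itself) can be illustrated in plain C++. All names here (task_control, thread_interrupted, run_task) are hypothetical stand-ins and not the HPX implementation. The interrupt request is simulated inside the loop so that the example stays single-threaded.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Hypothetical exception type thrown at an interruption point.
struct thread_interrupted : std::runtime_error
{
    thread_interrupted() : std::runtime_error("interrupted") {}
};

// Hypothetical per-task control block holding only the request flag.
struct task_control
{
    std::atomic<bool> interrupt_requested{false};

    // Called from outside: only sets the flag, never touches task state.
    void request_interrupt() { interrupt_requested.store(true); }

    // Called by the task itself: the only place an interrupt takes effect.
    void interruption_point() const
    {
        if (interrupt_requested.load())
            throw thread_interrupted{};
    }
};

enum class task_state { finished, interrupted };

struct task_result
{
    task_state state;
    int completed_steps;
};

// Runs `steps` units of work. Before step `request_at` an interrupt is
// requested (simulating another thread); pass -1 to never request one.
task_result run_task(task_control& ctl, int steps, int request_at)
{
    int completed = 0;
    try
    {
        for (int i = 0; i != steps; ++i)
        {
            if (i == request_at)
                ctl.request_interrupt();

            ctl.interruption_point();    // may throw thread_interrupted
            ++completed;
        }
    }
    catch (thread_interrupted const&)
    {
        // The task terminates itself; nobody sets its state from outside.
        return {task_state::interrupted, completed};
    }
    return {task_state::finished, completed};
}
```

The requester in this sketch never needs a handle to a live thread beyond the flag, so it cannot end up setting the state of a thread that has already gone away. Waking a suspended task so that it reaches its next interruption point is not modeled here.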
<mdiers_>
heller_: Happy Birthday. Will you be in heidelberg on wednesday and thursday?
<heller_>
mdiers_: yes
<heller_>
mdiers_: and thanks ;)
<heller_>
hkaiser: if the thread is suspended, we need to put it into pending, for sure
<heller_>
mdiers_: you as well? I'll arrive tomorrow afternoon
<hkaiser>
heller_: when is your talk?
<mdiers_>
heller_: yes I will arrive later tomorrow too.
<heller_>
21st at 15:00
<heller_>
mdiers_: excellent! btw, did you hear back from BMBF yet?
<heller_>
mdiers_: after the speakers dinner on the 19th, I'd be up for a beer!
<hkaiser>
heller_: so, do you know what to change in order to fix the cancellable action test?
<heller_>
hkaiser: not yet
<mdiers_>
heller_: yes, today. unfortunately a cancellation :-( . gerald is just trying to figure out by phone why.
<heller_>
ok
hello has joined #ste||ar
<mdiers_>
heller_: unfortunately i will arrive at 10:30pm.
<heller_>
oh, ok
<hello>
My machine has 8 DDR channels and every channel has 12.5 GB/s of bandwidth. Why do I get 100 GB/s and 300 GB/s with different gcc compile parameters for the stream benchmark?
<hkaiser>
cache effects?
<hello>
I am using stream benchmark
<Yorlik>
Cache friendliness beats any sophisticated algorithms ;)
<heller_>
Debug vs Release?
<heller_>
12.5 times 8 is 100, so the 300 get served from cache
<heller_>
The most interesting part: what are the different parameters and how did you run it?
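[Editor's sketch] The arithmetic above can be written out as a small check in C++. The function names are hypothetical. The array-size helper follows the STREAM run rule that each array should be at least four times the size of the available cache, and the cache size passed to it is an assumed input, not a value from this conversation.

```cpp
#include <cassert>
#include <cstddef>

// Theoretical DRAM peak: channels times per-channel bandwidth (GB/s).
double peak_bandwidth_gbs(int channels, double per_channel_gbs)
{
    assert(channels > 0);
    return channels * per_channel_gbs;
}

// A measured figure above the DRAM peak cannot come from main memory
// alone, so at least part of the traffic was served from cache.
bool exceeds_dram_peak(
    double measured_gbs, int channels, double per_channel_gbs)
{
    return measured_gbs > peak_bandwidth_gbs(channels, per_channel_gbs);
}

// Minimum number of double elements per STREAM array so that one array
// is at least four times the given cache size (in bytes).
std::size_t min_stream_array_elements(std::size_t cache_bytes)
{
    return 4 * cache_bytes / sizeof(double);
}
```

For 8 channels at 12.5 GB/s the peak is 100 GB/s, so a 300 GB/s result exceeds it while a 100 GB/s result does not. If the arrays are smaller than the bound returned by min_stream_array_elements for the machine's cache, cache hits are a likely explanation for the higher number.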
<detan>
diehlpk_work: It's hard to work with nvcc sometimes...
<simbergm>
second link: move warnings
<K-ballo>
had not realized there were pages
<simbergm>
K-ballo: last commit on that branch was a week ago, no? (if yes, results should be up to date)
<diehlpk_work>
detan, Yes, it is. At least my use case compiled with the master of hpx
<K-ballo>
simbergm: yeah, I waited a couple days
<simbergm>
you can also set "items per page = all"
<K-ballo>
I'll try fixing those issues later today, and ask back later this week
<simbergm>
sounds good
<simbergm>
K-ballo: scratch that, those snuck into master....
<K-ballo>
which ones?
<detan>
diehlpk_work: Ok, thanks... I saw that the condition in CMakeLists.txt checking for clang is still there for partitioned_vector on master. Let's see what happens when I remove it...