00:13 
StefanLSU has joined #ste||ar
 
00:16 
bikineev has quit [Remote host closed the connection]
 
00:17 
StefanLSU has quit [Client Quit]
 
00:41 
mcopik_ has quit [Ping timeout: 264 seconds]
 
01:15 
jaafar has quit [Ping timeout: 252 seconds]
 
01:22 
jaafar has joined #ste||ar
 
01:31 
jaafar has quit [Ping timeout: 246 seconds]
 
01:37 
Matombo444 has joined #ste||ar
 
01:40 
Matombo has quit [Ping timeout: 240 seconds]
 
02:04 
Matombo444 has quit [Remote host closed the connection]
 
02:16 
K-ballo has quit [Quit: K-ballo]
 
03:08 
eschnett has quit [Quit: eschnett]
 
03:24 
hkaiser has quit [Quit: bye]
 
03:42 
jaafar has joined #ste||ar
 
04:20 
jaafar has quit [Ping timeout: 255 seconds]
 
05:42 
AnujSharma has joined #ste||ar
 
05:47 
bikineev has joined #ste||ar
 
05:54 
parsa has joined #ste||ar
 
06:27 
<
github >
hpx/master a9fda22 Thomas Heller: One more fix for service_executor...
 
06:27 
<
github >
hpx/master 1f803a7 Thomas Heller: Removing superfluous ')'
 
06:32 
<
github >
hpx/master 1120afc Thomas Heller: Fixing more typos within the PAPI perf counters
 
06:35 
parsa has quit [Quit: Zzzzzzzzzzzz]
 
06:55 
parsa has joined #ste||ar
 
07:30 
parsa has quit [Quit: Zzzzzzzzzzzz]
 
07:56 
Smasher has quit [Changing host]
 
07:56 
Smasher has joined #ste||ar
 
08:06 
david_pfander has joined #ste||ar
 
08:15 
bikineev has quit [Remote host closed the connection]
 
08:47 
Matombo has joined #ste||ar
 
08:52 
mcopik_ has joined #ste||ar
 
09:28 
Matombo has quit [Remote host closed the connection]
 
09:29 
mcopik_ has quit [Ping timeout: 240 seconds]
 
09:41 
jaafar has joined #ste||ar
 
09:51 
<
github >
hpx/gh-pages d4c7836 StellarBot: Updating docs
 
10:09 
bikineev has joined #ste||ar
 
10:22 
hkaiser has joined #ste||ar
 
10:35 
<
github >
[hpx] hkaiser closed pull request #2907: Optionaly force-delete remaining channel items on close  (master...fixing_2890) 
https://git.io/v5dO2 
 
10:37 
jaafar has quit [Ping timeout: 255 seconds]
 
10:39 
<
heller >
hkaiser: good morning
 
10:39 
<
heller >
hkaiser: I hate the service_pool executor
 
11:01 
<
hkaiser >
heller: g'morning
 
11:25 
<
zao >
I wonder if it's HPX or my platform that's painfully broken.
 
11:25 
<
zao >
All tests wedge on DragonFlyBSD.
 
11:26 
<
hkaiser >
probably some problem in hpx for that platform
 
11:26 
<
zao >
On the thread I broke into, it was waiting for hpx::resource::get_partitioner
 
11:26 
<
zao >
Should've looked at all threads, I guess.
 
11:27 
<
hkaiser >
that's relying on magic statics, but we do that in several spots...
 
11:27 
<
hkaiser >
or hangs in the constructor of the rp
 
11:30 
<
zao >
Is it trying to get the partitioner while throwing while constructing the partitioner?
 
11:30 
bikineev has quit [Ping timeout: 246 seconds]
 
11:33 
<
hkaiser >
uhh, the constructor of the rp tries to recursively call get_partitioner
 
11:33 
<
hkaiser >
that hangs because of the 'magic statics' lock
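
The hang described here comes from re-entering a function-local static ("magic static") while its initializer is still running. The C++ standard leaves that kind of recursive re-entry undefined, and implementations that guard the initialization may deadlock, abort, or throw. A minimal sketch of one way to avoid it, assuming hypothetical names (`partitioner` and `query_core_count` are made up for illustration, only `get_partitioner` echoes the chat, and none of this is HPX's actual code): the constructor receives what it needs as an argument and never calls the accessor itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a resource partitioner; not HPX's real class.
class partitioner
{
public:
    // The constructor gets its input as a parameter. It must not call
    // get_partitioner(), because that would re-enter the initialization
    // of the function-local static below while it is still in progress.
    explicit partitioner(std::size_t cores)
      : cores_(cores)
    {
        assert(cores > 0 && "a partitioner needs at least one core");
    }

    std::size_t cores() const { return cores_; }

private:
    std::size_t cores_;
};

// Hypothetical topology query, kept independent of the partitioner so it
// can run (and fail) before the static is touched.
inline std::size_t query_core_count()
{
    return 4;
}

// Function-local static: initialized once, thread-safely, on first call
// (guaranteed since C++11).
inline partitioner& get_partitioner()
{
    static partitioner instance(query_core_count());
    return instance;
}
```

Every call to `get_partitioner()` returns the same instance, and nothing reachable from the constructor calls back into the accessor.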
 
11:35 
<
heller >
zao: hwloc problems?
 
11:36 
<
zao >
Release build, so very little debug info :(
 
11:41 
<
hkaiser >
hwloc_topology_info::get_number_of_cores throws for some reason and throwing exceptions apparently calls get_partitioner
 
11:41 
<
hkaiser >
no idea why (for both)
 
11:41 
<
zao >
hwloc_get_nbobjs_by_type yields 0 in a test program, but I'm not sure if I'm setting the topology up right.
 
11:41 
<
zao >
(for HWLOC_OBJ_CORE)
 
11:41 
<
hkaiser >
zao: can you do a debug build?
 
11:42 
<
zao >
It'll take a good while, but sure.
 
11:42 
<
hkaiser >
hold on for a sec, let me have a look
 
11:44 
<
zao >
Output from hwloc-ls seems rather sparse. Are we making any assumptions about the shape of the node somehow?
 
11:44 
<
hkaiser >
zao: shrug, I didn't think so
 
11:46 
<
hkaiser >
zao: I think I can fix that particular problem, let's see - at least I can fix the hang
 
11:47 
<
zao >
Building a single test, [115/182] targets atm.
 
11:52 
<
zao >
get_number_of_cores indeed returns 0.
 
11:53 
<
zao >
Note that lstopo only has PUs, no cores or cache info at all.
 
11:56 
<
hkaiser >
so hwloc is broken for you :/
 
11:57 
<
zao >
Broken and broken... less capable :D
 
12:01 
K-ballo has joined #ste||ar
 
12:09 
<
hkaiser >
zao: any idea what we could do for a workaround?
 
12:14 
<
zao >
Not sure what the HPX code does with these facts and if we can make some assumption that there's 1 core per PU or something.
 
12:14 
<
zao >
Feels like it's a legitimate hwloc structure, just strangely empty of a lot of the common components.
 
12:14 
<
zao >
Simple solution would be to refuse to use hwloc on the platform, I guess.
 
12:15 
<
hkaiser >
zao: well, if there is no core information then we have to assume one pu per core
 
12:15 
<
zao >
Not sure how much we lose then, or is it required?
 
12:15 
<
zao >
I always forget which libs are optional and not.
 
12:15 
<
hkaiser >
we move more and more to hwloc being mandatory
 
12:15 
<
hkaiser >
especially the rp code assumes that we have it
 
12:16 
<
hkaiser >
zao: at least I will change the rp initialization avoiding the hang
 
12:33 
<
hkaiser >
zao: I'll commit a fix for the hang in a sec, then we can start looking into the hwloc issue
 
12:35 
<
github >
hpx/fix_rp_hang 1c37b94 Hartmut Kaiser: Avoiding hang during creation of the resource partitioner
 
12:43 
<
hkaiser >
zao: how many PUs are reported for you?
 
13:11 
pree has joined #ste||ar
 
13:16 
<
zao >
Four PUs, the machine has 1 socket with 4 cores.
 
13:17 
<
zao >
CPU: Intel(R) Xeon(R) CPU E3-1225 v3 @ 3.20GHz (3192.63-MHz K8-class CPU)
 
13:17 
<
zao >
I've checked in the DragonFly IRC channel, seems like their exposure of topology to userspace is a bit lacking.
 
13:21 
<
zao >
Let's see if I can rebase and build on the fix_rp_hang branch.
 
13:22 
<
hkaiser >
ok, cool - thanks - I think we can add some workaround code if numcpus is reported as zero
 
13:23 
<
zao >
HWLOC_OBJ_PU indeed says 4.
 
13:23 
<
hkaiser >
hwloc_get_nbobjs_by_type( HWLOC_OBJ_PU) ?
 
13:23 
<
zao >
Yeah, in my standalone test.
 
13:24 
<
hkaiser >
ok, I'll add this as a fallback
 
13:26 
<
zao >
Branch now crashes properly.
 
13:26 
<
zao >
 $ bin/reduce_test
 
13:26 
<
zao >
terminate called after throwing an instance of 'hpx::detail::exception_with_info<hpx::exception>'
 
13:26 
<
zao >
  what():  hwloc_get_nbobjs_by_type failed: HPX(kernel_error)
 
13:26 
<
zao >
[1]    64673 abort (core dumped)  bin/reduce_test
 
13:28 
<
hkaiser >
zao: good, so the hang is fixed
 
13:28 
<
github >
hpx/fix_rp_hang 00eb16c Hartmut Kaiser: Add fallback to topology::get_number_of_cores to looks at number of PUs reported
 
13:28 
<
hkaiser >
here is the workaround ^^
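
The fallback discussed above (use the PU count when no cores are reported, i.e. assume one PU per core) can be sketched roughly as follows. This is not the code from the fix_rp_hang branch; `effective_core_count` is a hypothetical helper, and it takes plain integers so that it stays independent of hwloc. The two arguments stand for what a caller would get from `hwloc_get_nbobjs_by_type` with `HWLOC_OBJ_CORE` and `HWLOC_OBJ_PU`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not HPX's implementation.
// core_count and pu_count are the results of hwloc_get_nbobjs_by_type for
// HWLOC_OBJ_CORE and HWLOC_OBJ_PU. hwloc documents 0 as "no objects of
// that type" and -1 as "objects of that type exist on several levels".
inline std::size_t effective_core_count(int core_count, int pu_count)
{
    assert(core_count >= -1 && pu_count >= -1);

    // Normal case: the topology exposes core objects.
    if (core_count > 0)
        return static_cast<std::size_t>(core_count);

    // Fallback: no usable core information, so assume one PU per core.
    if (pu_count > 0)
        return static_cast<std::size_t>(pu_count);

    // Neither cores nor PUs are reported; there is nothing sensible to return.
    throw std::runtime_error("topology reports neither cores nor PUs");
}
```

With the values zao reported (0 cores, 4 PUs), `effective_core_count(0, 4)` yields 4.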
 
13:32 
hkaiser has quit [Quit: bye]
 
13:37 
<
zao >
Further, but still hosed :)
 
13:38 
<
zao >
/home/zao/stellar/hpx/src/runtime/threads/policies/hwloc_topology_info.cpp:235
 
13:38 
<
zao >
I'll hack away on it eventually.
 
13:38 
<
zao >
(got other stuff on the plate)
 
13:40 
diehlpk_work has joined #ste||ar
 
13:52 
aserio has joined #ste||ar
 
13:58 
eschnett has joined #ste||ar
 
14:12 
<
zao >
Seems like things like hwloc_topology_info::get_pu_number assume that cores exist and always have PUs as children.
 
14:15 
hkaiser has joined #ste||ar
 
14:28 
pree has quit [Read error: Connection reset by peer]
 
14:44 
pree has joined #ste||ar
 
14:46 
aserio has quit [Read error: Connection reset by peer]
 
14:48 
aserio has joined #ste||ar
 
15:06 
rod_t has joined #ste||ar
 
15:20 
parsa has joined #ste||ar
 
15:26 
<
hkaiser >
heller: may I ask you to stop pushing directly to master
 
15:29 
<
zao >
hkaiser: Much more is hosed. hwloc_topology_info has some rather deep assumptions that cores are parents to PUs.
 
15:29 
<
zao >
Proper approach here may be to fix the OS.
 
15:29 
<
hkaiser >
zao: yah, that's true
 
15:30 
aserio has quit [Ping timeout: 264 seconds]
 
15:39 
aserio has joined #ste||ar
 
15:48 
<
heller >
hkaiser: sure. I thought those changes were rather trivial
 
15:48 
<
heller >
hkaiser: but yes, the service_executor is more painful than I thought...
 
15:49 
<
heller >
The other changes were fixing typos that sneaked in during the format merge
 
15:55 
<
heller >
hkaiser: we haven't had this much green in quite some time...
 
16:01 
<
K-ballo >
I'm amazed by that #include typo that I introduced about 4 years ago
 
16:02 
<
heller >
We should remove that example altogether...
 
16:03 
AnujSharma has quit [Ping timeout: 264 seconds]
 
16:04 
EverYoung has joined #ste||ar
 
16:07 
mcopik_ has joined #ste||ar
 
16:12 
pree has quit [Read error: Connection reset by peer]
 
16:14 
mcopik_ has quit [Ping timeout: 240 seconds]
 
16:27 
pree has joined #ste||ar
 
16:36 
david_pfander has quit [Ping timeout: 255 seconds]
 
16:39 
mbremer has joined #ste||ar
 
16:55 
Matombo has joined #ste||ar
 
16:59 
hkaiser has quit [Read error: Connection reset by peer]
 
17:04 
<
zbyerly_ >
is anyone else working on KNLs?
 
17:16 
StefanLSU has joined #ste||ar
 
17:16 
pree has quit [Read error: Connection reset by peer]
 
17:17 
hkaiser has joined #ste||ar
 
17:18 
jaafar has joined #ste||ar
 
17:18 
<
zbyerly_ >
I'm having trouble with avx512 stuff
 
17:21 
EverYoun_ has joined #ste||ar
 
17:24 
EverYoung has quit [Ping timeout: 246 seconds]
 
17:26 
EverYoun_ has quit [Ping timeout: 240 seconds]
 
17:29 
pree has joined #ste||ar
 
17:38 
hkaiser has quit [Ping timeout: 248 seconds]
 
17:42 
akheir has joined #ste||ar
 
17:42 
pree has quit [Ping timeout: 264 seconds]
 
17:43 
aserio has quit [Ping timeout: 246 seconds]
 
17:46 
hkaiser has joined #ste||ar
 
17:55 
<
hkaiser >
heller: why should we remove that example?
 
17:55 
EverYoung has joined #ste||ar
 
17:55 
pree has joined #ste||ar
 
18:02 
StefanLSU has quit [Quit: StefanLSU]
 
18:06 
StefanLSU has joined #ste||ar
 
18:10 
StefanLSU has quit [Client Quit]
 
18:10 
aserio has joined #ste||ar
 
18:11 
StefanLSU has joined #ste||ar
 
18:26 
<
K-ballo >
because it did not compile for 4 years and nobody noticed?
 
18:36 
StefanLSU has quit [Quit: StefanLSU]
 
18:39 
pree has quit [Ping timeout: 240 seconds]
 
18:45 
mcopik_ has joined #ste||ar
 
18:52 
pree has joined #ste||ar
 
19:04 
<
heller >
hkaiser: what exactly does it demonstrate?
 
19:04 
<
heller >
Also the fact that nobody noticed that it wasn't working for 4 years ;)
 
19:15 
<
zao >
I'd like to shake a fist at the owner of the quickstart example on how HPX can be used in a library.
 
19:16 
<
zao >
It's not exactly easy to pull out argc/argv from a library on cool OSes :D
 
19:17 
<
zao >
Also abusing it for the memory counters. I hope it's not expensive to kvm_open/kvm_get*
 
19:19 
<
K-ballo >
what's with `__argc/v` on linux? is that a windows detail leaked?
 
19:24 
pree has quit [Remote host closed the connection]
 
19:32 
<
zao >
It's presumably an MSVCRT thing, which the example has hacked into working by defining globals by those names on macOS and Linux.
 
19:32 
<
zao >
(and now FreeBSD/DragonFly in my branch)
 
19:32 
aserio has quit [Read error: Connection reset by peer]
 
19:34 
aserio has joined #ste||ar
 
20:03 
hkaiser has quit [Quit: bye]
 
20:26 
bikineev has joined #ste||ar
 
20:37 
hkaiser has joined #ste||ar
 
20:48 
eschnett has quit [Quit: eschnett]
 
20:52 
bikineev has quit [Remote host closed the connection]
 
20:53 
Shahrzad has joined #ste||ar
 
20:53 
bikineev has joined #ste||ar
 
20:55 
<
zbyerly_ >
does hpx use any automatically generated source code?
 
20:56 
<
K-ballo >
how is that defined? as in a separate pre-compilation step?
 
20:57 
<
K-ballo >
there's a tinsy bit of generated config headers, generated by cmake
 
21:06 
Shahrzad has quit [Quit: Leaving]
 
21:07 
Shahrzad has joined #ste||ar
 
21:07 
Shahrzad has quit [Client Quit]
 
21:08 
Shahrzad has joined #ste||ar
 
21:08 
Shahrzad has quit [Client Quit]
 
21:08 
Shahrzad has joined #ste||ar
 
21:09 
Shahrzad has quit [Client Quit]
 
21:17 
akheir has quit [Remote host closed the connection]
 
21:22 
<
diehlpk_work >
hkaiser, see pm
 
21:30 
EverYoung has quit [Remote host closed the connection]
 
21:30 
EverYoung has joined #ste||ar
 
21:43 
<
aserio >
hkaiser: If you do not specify a policy, does dataflow use async?
 
21:43 
EverYoun_ has joined #ste||ar
 
21:46 
EverYoung has quit [Ping timeout: 246 seconds]
 
21:49 
<
hkaiser >
aserio: uhh
 
21:49 
<
hkaiser >
I think so, yes
 
21:50 
<
hkaiser >
same as async()
 
21:50 
<
zbyerly_ >
K-ballo, libgeodecomp uses ruby to generate cpp sourcecode
 
21:50 
<
aserio >
hkaiser: thanks!
 
21:51 
<
K-ballo >
hpx used to generate preprocessed cpp source with wave, but we don't need to anymore; as far as I know, all that's left is those few config headers generated from cmake
 
22:14 
<
github >
hpx/add_checkpoint 3c2c29b aserio: Adding checkpoint.hpp
 
22:14 
<
github >
hpx/add_checkpoint fc3bebf aserio: Adding testing for checkpoint.hpp
 
22:14 
<
github >
hpx/add_checkpoint 07ccafa aserio: Preparing the test : checkpoint.cpp...
 
22:18 
aserio has quit [Quit: aserio]
 
22:30 
rod_t has left #ste||ar [#ste||ar]
 
22:33 
StefanLSU has joined #ste||ar
 
22:36 
<
K-ballo >
dataflow does not modify the types of the arguments nor the target callable, does it?
 
22:39 
StefanLSU has quit [Quit: StefanLSU]
 
22:48 
<
K-ballo >
I think I found the poisonous dataflow overload
 
22:49 
<
K-ballo >
all these deduced return types everywhere make reading the code next to impossible
 
22:59 
parsa has quit [Quit: Zzzzzzzzzzzz]
 
23:15 
EverYoun_ has quit [Remote host closed the connection]
 
23:36 
bikineev has quit [Remote host closed the connection]
 
23:53 
Matombo has quit [Remote host closed the connection]
 
23:55 
EverYoung has joined #ste||ar
 
23:56 
EverYoung has quit [Remote host closed the connection]
 
23:56 
EverYoung has joined #ste||ar