aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irc.cct.lsu.edu/
vamatya_ has quit [Ping timeout: 268 seconds]
bikineev has quit [Remote host closed the connection]
Matombo has quit [Remote host closed the connection]
<github>
[hpx] hkaiser force-pushed fixing_await from 10e32c4 to 4bbe3a0: https://git.io/v9Alv
<github>
hpx/fixing_await 4bbe3a0 Hartmut Kaiser: Adapting to latest changes of TS...
hkaiser_ has quit [Quit: bye]
K-ballo has quit [Quit: K-ballo]
vamatya_ has joined #ste||ar
EverYoung has quit [Ping timeout: 258 seconds]
pree has joined #ste||ar
vamatya_ has quit [Ping timeout: 260 seconds]
thundergroudon has joined #ste||ar
thundergroudon has quit [Ping timeout: 246 seconds]
thundergroudon has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
thundergroudon has quit [Read error: Connection reset by peer]
thundergroudon has joined #ste||ar
Matombo has quit [Remote host closed the connection]
david_pfander has joined #ste||ar
jakemp has quit [Ping timeout: 260 seconds]
jakemp has joined #ste||ar
ABresting[m] has quit [Ping timeout: 240 seconds]
pree has joined #ste||ar
jakemp has quit [Ping timeout: 240 seconds]
jakemp has joined #ste||ar
thundergroudon has quit [Read error: Connection reset by peer]
thundergroudon has joined #ste||ar
pree has quit []
taeguk has joined #ste||ar
bikineev has joined #ste||ar
bikineev has quit [Ping timeout: 246 seconds]
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
hkaiser has joined #ste||ar
<Smasher>
hi hkaiser
<hkaiser>
hey
<Smasher>
it seems your patch fixed my serialization problem
<hkaiser>
\o/
<Smasher>
:)
<Smasher>
but i bumped into the next error now
<hkaiser>
hah, surprise, surprise
<Smasher>
and luckily i know now how not to blame hpx on that :))
<thunderGroudon|2>
coz the document is way too long
<thunderGroudon|2>
hkaiser: please take a look
bikineev has quit [Remote host closed the connection]
mcopik has joined #ste||ar
bikineev has joined #ste||ar
<hkaiser>
thunderGroudon|2: what compiler is that?
<thunderGroudon|2>
The CXX compiler identification is MSVC 19.0.24215.1
<hkaiser>
is that a 32bit build?
<thunderGroudon|2>
it is msvc
<thunderGroudon|2>
yes 32 bit build
<hkaiser>
ok, I'd strongly suggest not to do a 32 bit build, but build for 64bits
<hkaiser>
I have not looked into building for 32 bits in a while
<thunderGroudon|2>
is that what's causing the error?
<hkaiser>
yes
<thunderGroudon|2>
I shall try switching to 64 bit build
<zao>
I don't remember if I had any problems last I tried building HPX for Win32, as it didn't fit into my existing application in other ways that made it unusable.
<hkaiser>
also, 32bit builds of hpx are severely hampered by a windows system limitation
<hkaiser>
thunderGroudon|2: but I will go back and try to make it work for 32bits again
<thunderGroudon|2>
Sure! I shall also let you know if 64 bit build is successful
<hkaiser>
thanks
<diehlpk_work>
hkaiser, Thanks. I only test HPXCL on linux
<hkaiser>
diehlpk_work: what are you talking about?
<diehlpk_work>
On the problem of thunderGroudon|2
<hkaiser>
ahh
<hkaiser>
sure, any time
<diehlpk_work>
Oh, sorry just realized that it is about hpx
<hkaiser>
thunderGroudon|2: btw, the cmake scripts should have given you a warning about not using 32bit builds on windows
<diehlpk_work>
I assumed it is about hpxcl
<thunderGroudon|2>
hkaiser: It did!
<hkaiser>
diehlpk_work: that will come next ;)
<hkaiser>
diehlpk_work: I have not tried building hpxcl in a while either
<thunderGroudon|2>
last time I did a 64 bit build on another machine, both hpx and hpxcl were successful
<diehlpk_work>
on linux it compiles, because we have the circle-ci test
<hkaiser>
K-ballo: they have changed the domain one more time, I think
<hkaiser>
need to find out what the final name will be today
hkaiser has quit [Quit: bye]
aserio has joined #ste||ar
bikineev has quit [Ping timeout: 260 seconds]
shoshijak has joined #ste||ar
aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<Smasher>
every rebuild - new errors to see every time :)
<Smasher>
what does that mean? {what}: assertion 'pp ? pp->here() == pp->agas_locality(cfg) : true' failed: HPX(assertion_failure)
<Smasher>
hkaiser: I re-pasted the messages I just posted here in a PM to you
pree has quit []
akheir has joined #ste||ar
<hkaiser>
akheir: what is the new url for the irc logger?
<hkaiser>
akheir: nvm, got it from the topic ;)
<akheir>
hkaiser: it's irclog.cct.lsu.edu
<hkaiser>
thanks
<akheir>
hkaiser: ;)
shoshijak has quit [Remote host closed the connection]
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 245 seconds]
aserio1 is now known as aserio
bikineev has joined #ste||ar
hkaiser has quit [Quit: bye]
EverYoung has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
hkaiser has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
<Smasher>
sigh
<Smasher>
never ending exceptions
<Smasher>
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<hpx::exception> >' what(): HPX(unknown_error):
aserio has joined #ste||ar
<Smasher>
my protocol seems to start working but then.. blam!
<Smasher>
Thread 1 received signal SIGABRT, Aborted.
<Smasher>
[Switching to Thread 8451.8451]
<Smasher>
killpg (pgrp=0, sig=-1090526248) at ../sysdeps/posix/killpg.c:30
<Smasher>
no idea what's wrong :(
akheir has quit [Remote host closed the connection]
<Smasher>
backtrace looks odd
<Smasher>
#0 killpg (pgrp=0, sig=-1090526248) at ../sysdeps/posix/killpg.c:30
<Smasher>
#1 0xfffffffe in ?? ()
<hkaiser>
Smasher: use gdb, catch throw, look at stack backtrace
<Smasher>
that's what i did hkaiser
<Smasher>
i catch only that SIGABRT
<Smasher>
mhmhm, maybe that exception is thrown on the other node?
<Smasher>
lets start both in gdb (oh dear )
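When the debugger only shows the final SIGABRT, another way to learn what escaped is to catch the exception before it reaches std::terminate and print its message. The following is a minimal sketch using only the C++ standard library; `describe_current_exception` and `run_and_report` are hypothetical names for illustration, not HPX functions, and nothing here is specific to the error Smasher is seeing.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of HPX): turns the exception that is
// currently being handled into a printable string.
// Precondition: must be called from inside a catch block.
std::string describe_current_exception()
{
    assert(std::current_exception() != nullptr);
    try {
        throw;  // rethrow the active exception so its type can be inspected
    } catch (std::exception const& e) {
        return std::string("std::exception: ") + e.what();
    } catch (...) {
        return "unknown exception type";
    }
}

// Hypothetical helper: runs f and reports what escaped from it, instead of
// letting the exception reach std::terminate (which ends in SIGABRT).
template <typename F>
std::string run_and_report(F&& f)
{
    try {
        f();
        return "no exception";
    } catch (...) {
        return describe_current_exception();
    }
}
```

Wrapping the body of the failing entry point in something like `run_and_report` on each node would at least show on which side the exception originates.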
ABresting[m] has quit [Ping timeout: 272 seconds]
bikineev has quit [Ping timeout: 260 seconds]
bikineev has joined #ste||ar
<aserio>
wash: will you be joining us today?
<Smasher>
do you have something like daily meeting? :)
<aserio>
Smasher: no, we have a weekly meeting
<aserio>
hkaiser: ^^
<Smasher>
aserio i c
hkaiser has quit [Quit: bye]
justwagle has joined #ste||ar
<zao>
I strongly recommend digging into the current SLURM codebase. So many off-by-ones, pointer corruption, and more.
<zao>
No pointer should ever end with 5.
<zao>
(to non-char)
akheir has joined #ste||ar
hkaiser has joined #ste||ar
<K-ballo>
packed bits?
EverYoung has joined #ste||ar
<zao>
Boring old corrupt state.
<zao>
Users get super happy when SLURM blows up, btw.
<zao>
I'm sure you people are familiar with failing batch systems and clusters.
<taeguk>
for fast construction of dependency tree?
<K-ballo>
the intent is to execute those in parallel
<hkaiser>
taeguk: it's an experiment, essentially
EverYoung has quit []
<taeguk>
It is curious that hpx::async for the fibonacci function that returns hpx::future<std::uint64_t> still returns hpx::future<std::uint64_t>, not hpx::future<hpx::future<std::uint64_t>>.
<K-ballo>
IIRC it doesn't, hpx::async still returns future<future<T>> and it is implicitly unwrapped when assigned to future<T>
vamatya_ has joined #ste||ar
<K-ballo>
(unless it was changed recently to do the unwrapping at the `async` level)
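The nesting K-ballo describes can be reproduced with the standard library alone, which never unwraps implicitly. This is only an analogy sketched with std::future and std::async, not HPX code: `fib_async`, `fib_nested` and `unwrap` are hypothetical names, and std::launch::deferred is used purely to keep the sketch free of threading requirements (the types nest the same way with std::launch::async).

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <type_traits>

// Plain recursive Fibonacci, used only as a workload.
std::uint64_t fib(std::uint64_t n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// A function that itself returns a future, like the example under discussion.
std::future<std::uint64_t> fib_async(std::uint64_t n)
{
    return std::async(std::launch::deferred, fib, n);
}

// Launching that function asynchronously wraps its result once more:
// the return type is future<future<T>>.
std::future<std::future<std::uint64_t>> fib_nested(std::uint64_t n)
{
    return std::async(std::launch::deferred, fib_async, n);
}

static_assert(
    std::is_same<decltype(fib_nested(1)),
                 std::future<std::future<std::uint64_t>>>::value,
    "std::async does not collapse nested futures");

// Hypothetical helper: collapse the two levels by hand. Unlike an implicit
// unwrapping conversion, this waits for the outer future right here.
std::future<std::uint64_t> unwrap(std::future<std::future<std::uint64_t>> outer)
{
    assert(outer.valid());
    return outer.get();
}
```

With these definitions, `unwrap(fib_nested(10)).get()` yields the tenth Fibonacci number through a single-level future.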
<hkaiser>
thundergroudon: I created a patch fixing the 32bit compilation problems you were seeing
<hkaiser>
pls see #2635
<thundergroudon>
hkaiser: Sure! I have since then shifted the build to 64-bit
<thundergroudon>
everything compiles and installs properly
<thundergroudon>
;) haha
mcopik has quit [Ping timeout: 240 seconds]
<thundergroudon>
I shall test the 32-bit build and report whether this fixes it
<hkaiser>
ok, good
<hkaiser>
thanks
<thundergroudon>
Thanks for making the changes :)
<hkaiser>
:D
<hkaiser>
that's what we're here for
mcopik has joined #ste||ar
david_pfander has joined #ste||ar
justwagle has joined #ste||ar
justwagle has quit [Client Quit]
aserio has joined #ste||ar
<Smasher>
is locality = node in hpx termination?
<Smasher>
terminology*
<pree>
Is Mr. Parsa Amini here?
<K-ballo>
parsa[w]: ^
<K-ballo>
Smasher: uh, somewhat.. you can and sometimes want to have more than one locality within a single node
<pree>
Thanks @k-ballo
<zao>
K-ballo: So approximately a MPI task?
<zao>
But less MPI? :)
<zao>
Err, rank?
<zao>
Whatever they call them.
<K-ballo>
...maybe?
<zao>
An addressable instance of the HPX borg collective.
<Smasher>
K-ballo so a node is virtually a single system
<Smasher>
and a locality is a hpx subsystem
<Smasher>
comprendre
akheir has joined #ste||ar
aserio has quit [Ping timeout: 246 seconds]
mcopik has quit [Ping timeout: 240 seconds]
<pree>
Is it necessary to tell actions which locality to execute on? If we don't specify it, will it take hpx::find_here as the default?
bikineev has joined #ste||ar
<Smasher>
pree you can use new_(find_here()) however it would require that your class is copyable
<Smasher>
in case you really really want to use your component only locally, there is local_new<>()
ABresting has quit [Ping timeout: 260 seconds]
Matombo has joined #ste||ar
<pree>
For hpx_plain_actions instances, is this sufficient to call the action: action_instance(hpx::find_here(), parameters)? (*the parameters of the global function)
david_pf_ has quit [Quit: david_pf_]
aserio has joined #ste||ar
mcopik has joined #ste||ar
akheir has quit [Remote host closed the connection]
akheir has joined #ste||ar
<pree>
Thank You @Smasher :)
<Smasher>
pree what have you used now?
<K-ballo>
pree: you can call actions directly or asynchronously, locally or remotely
<pree>
new_(find_here()) @Smasher
Matombo has quit [Remote host closed the connection]
<Smasher>
pree good ;)
Matombo has joined #ste||ar
<pree>
thanks@K-ballo
hkaiser has quit [Quit: bye]
<pree>
I have connected to rostam but it is very slow. Is this due to poor connectivity or something else?
<pree>
thank you
bikineev has quit [Read error: No route to host]
bikineev has joined #ste||ar
<K-ballo>
pree: looks responsive enough to me
<pree>
But it is very slow for me. Is it because of poor internet connectivity?
<K-ballo>
I can only guess
<zao>
Does it have mosh?
<zao>
Might behave a bit better over high-latency connections with dropouts.
<pree>
mosh??
<zao>
Like SSH, but sets up the shell over an encrypted tunnel on top of UDP with predictive text on the client end to make it feel smoother.
<zao>
Probably won't work with all the firewalling around the clusters.
<pree>
No don't have mosh
Matombo has quit [Remote host closed the connection]
bikineev has quit [Remote host closed the connection]
Matombo has joined #ste||ar
Matombo has quit [Remote host closed the connection]
Matombo has joined #ste||ar
pree has quit []
hkaiser has joined #ste||ar
pree has joined #ste||ar
<pree>
@zao /usr/bin/mosh: Could not connect to rostam.cct.lsu.edu: No route to host ssh_exchange_identification: Connection closed by remote host /usr/bin/mosh: Did not find remote IP address (is SSH ProxyCommand disabled?)
<pree>
when using mosh it gives the above message, but I am able to connect to rostam with ssh
<zao>
I've never touched the LSU machines, so no idea :)
<pree>
okay, no problem, I will fix it if I can :)
<zao>
Protip - don't upgrade cmake in the middle of a build.
<zao>
It calls itself a lot :)
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
eschnett has quit [Quit: eschnett]
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
pree has quit [Ping timeout: 260 seconds]
thundergroudon has quit [Ping timeout: 246 seconds]
Matombo has quit [Remote host closed the connection]
Matombo has joined #ste||ar
<Smasher>
would actually hpx::components::client<bz_server> make sense?
<Smasher>
i mean...client<Foo> is something like a global shared_ptr<Foo>
<hkaiser>
yes
<Smasher>
so if i want to pass a reference of a server to a client, could i pass it like this? Client c(hpx::components::client<bz_server>());
Matombo has quit [Remote host closed the connection]
<hkaiser>
well, a default constructed client<Foo> does not represent a valid component instance
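Taking the shared_ptr comparison from above literally, the same two pitfalls can be shown with the standard library. This is an analogy only, assuming nothing about the real client<Foo> interface: `Server` stands in for bz_server, and `Client`, `make_empty_client` and `make_bound_client` are hypothetical names. Note also that a declaration of the form `Client c(std::shared_ptr<Server>());` does not create an object at all; C++ parses it as the declaration of a function named `c` (the "most vexing parse"), which is why braces are used below.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the server component.
struct Server
{
    int id = 7;
};

// Hypothetical client that holds a (possibly empty) handle to a server.
class Client
{
public:
    explicit Client(std::shared_ptr<Server> server)
      : server_(std::move(server))
    {
    }

    // True only if the handle actually refers to a server instance.
    bool valid() const { return static_cast<bool>(server_); }

    int server_id() const
    {
        assert(valid());  // dereferencing an empty handle would be a bug
        return server_->id;
    }

private:
    std::shared_ptr<Server> server_;
};

// A default-constructed handle refers to nothing, so this client is unusable.
inline Client make_empty_client()
{
    return Client{std::shared_ptr<Server>{}};
}

// The handle has to be bound to a created instance first.
inline Client make_bound_client()
{
    return Client{std::make_shared<Server>()};
}
```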
<zao>
469>k:\stellar\hpx\hpx\parallel\algorithms\reduce_by_key.hpp(351): warning C4244: 'argument': conversion from 'const uint64_t' to 'int', possible loss of data
<zao>
VC++ is so loaded down by the build that it took a few seconds to copy a line from the output window :)
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<zao>
(and yes, that diagnostic is largely useless without context)
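For context on what C4244 is complaining about in general: a 64-bit unsigned value is being passed where an int is expected, so large values would be silently truncated. One common way to make such a conversion explicit is a range-checked cast. This is a generic sketch, not a statement about what line 351 of reduce_by_key.hpp does or how it should be fixed; `checked_to_int` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: converts a 64-bit unsigned value to int explicitly
// and throws instead of silently truncating when the value does not fit.
int checked_to_int(std::uint64_t value)
{
    if (value > static_cast<std::uint64_t>(std::numeric_limits<int>::max()))
        throw std::out_of_range("value does not fit into int");
    return static_cast<int>(value);
}

// Small usage example: values within range convert unchanged.
inline void example_usage()
{
    assert(checked_to_int(42) == 42);
}
```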
Matombo has joined #ste||ar
EverYoung has quit [Ping timeout: 260 seconds]
bikineev has quit [Remote host closed the connection]