aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
<parsa>
can you post the cmake error log?
<parsa>
oh that is the error log
<hkaiser>
that is the error log
parsa has quit [Quit: Zzzzzzzzzzzz]
<hkaiser>
parsa[w]: if I set the blaze_DIR it does work now
<K-ballo>
all actions are derived from hpx::actions::basic_action, correct?
EverYoung has quit [Ping timeout: 252 seconds]
<hkaiser>
K-ballo: think so, yes
gedaj_ has quit [Read error: Connection reset by peer]
gedaj__ has joined #ste||ar
hkaiser has quit [Quit: bye]
ct-clmsn has joined #ste||ar
parsa has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
ct-clmsn has quit [Ping timeout: 240 seconds]
jaafar has joined #ste||ar
patg has quit [Quit: See you later]
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
jaafar has quit [Ping timeout: 246 seconds]
parsa has quit [Quit: Zzzzzzzzzzzz]
msimberg has quit [Ping timeout: 248 seconds]
gedaj__ has quit [Remote host closed the connection]
gedaj__ has joined #ste||ar
david_pfander has joined #ste||ar
kxkamil has joined #ste||ar
gedaj__ has quit [Remote host closed the connection]
<msimberg>
hkaiser: is there a github issue or test case for the hangup problem we discussed yesterday? #2955 seems related but is short on details...
pree has joined #ste||ar
<msimberg>
also, still wanted to ask about the idle backoff: the wait_count_ is only incremented by one each time, and is used directly for the wait time, so it's not really exponential back-off is it? or am I still misunderstanding how/where that is happening?
<hkaiser>
msimberg: I don't think there is a ticket, but there is a test case, hold on
<heller>
yes, instead of removing the PUs there, you should really just suspend them
<hkaiser>
or fix the shutdown
<msimberg>
or both?
<hkaiser>
or both
<msimberg>
I suppose this os_threads_count + 1 would also be wrong when there's oversubscription or if the scheduler does not have do_background_work set...
<msimberg>
anyway, thanks for the test case!
<msimberg>
but any comment on the idle back-off?
<heller>
if there is no background thread, it is fine
<heller>
because we have the greater than condition
<heller>
the deadlock termination doesn't even kick in when you observe the problem
<heller>
msimberg: idle back off: yes, you might be right
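The point msimberg raises can be sketched as follows. This is not the HPX scheduler code: `linear_wait`, `exponential_wait` and the `max_wait` cap are hypothetical names used only for illustration. A counter that grows by one per idle iteration and is used directly as the wait time gives linear growth; exponential back-off would double the wait instead.

```cpp
#include <algorithm>
#include <cstdint>

// Linear growth: the wait is the counter itself (0, 1, 2, 3, ...).
inline std::uint64_t linear_wait(std::uint64_t wait_count)
{
    return wait_count;
}

// Exponential growth: the wait doubles per idle iteration (1, 2, 4, 8, ...).
// The shift is clamped so it cannot overflow, and the result is capped at
// max_wait (a made-up parameter) so the wait stays bounded.
inline std::uint64_t exponential_wait(
    std::uint64_t wait_count, std::uint64_t max_wait)
{
    std::uint64_t const shift = std::min<std::uint64_t>(wait_count, 62);
    return std::min<std::uint64_t>(std::uint64_t(1) << shift, max_wait);
}
```

With these definitions, five idle iterations give a wait of 5 units in the linear scheme and 32 units in the exponential one.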
<msimberg>
heller: let's say you have no background threads, then you could exit the loop while you still have "real" hpx threads running? that's not a problem?
<heller>
errm
<heller>
yes
<heller>
but it doesn't lead to a hang :P
<msimberg>
right :)
pree has quit [Read error: Connection reset by peer]
<msimberg>
the check needs to be changed anyway, I suppose you're looking into it as well?
<heller>
not right now
<msimberg>
good, then I'll see how far I get
<heller>
it's amazing how many "move after use" clang-tidy is able to find
<hkaiser>
heller: in hpx?
<heller>
hkaiser: yes
<hkaiser>
will you fix those?
pree has joined #ste||ar
<heller>
hkaiser: on it, yes
<hkaiser>
cool
<hkaiser>
heller: we should add this test to the clang-tidy pass on circleci
<heller>
hkaiser: that's the plan, yes
<hkaiser>
nod
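For readers unfamiliar with the kind of finding heller mentions, here is a minimal sketch of the pattern, assuming the clang-tidy check involved is `bugprone-use-after-move` (the log does not name the check). `store_name` is a hypothetical function, not HPX code.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Problematic pattern (shown commented out): `name` is read after it has
// been moved from, so its value is valid but unspecified.
//
//   names.push_back(std::move(name));
//   return name.size();
//
// Fixed version: read what is needed before the move.
inline std::size_t store_name(
    std::vector<std::string>& names, std::string name)
{
    std::size_t const length = name.size();    // use first
    names.push_back(std::move(name));          // move last
    return length;
}
```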
<diehlpk_work>
hkaiser, What would be a good cs journal for a paper about the peridynamicHPX code?
<pree>
Is there any plan to add vectorization support in HPX in the future?
<heller>
pree: no. because that's already done ;)
<pree>
heller : Oh Sorry I didn't notice that : )
<pree>
Thanks
<heller>
pree: it's done with the help of the Vc library
<pree>
nice
<pree>
heller : Thanks for the hint
<hkaiser>
diehlpk_work: uhh
<hkaiser>
I'm usually not up to date with conferences
aserio has joined #ste||ar
<parsa[w]>
hkaiser: i bypassed that circleci error. can we merge now?
<hkaiser>
parsa[w]: let's get circleci fixed as well
<parsa[w]>
i did
<parsa[w]>
i told it to use the source directly
<aserio>
hkaiser: Good Morning
<hkaiser>
hey aserio
<hkaiser>
aserio will you join?
<aserio>
hkaiser: I am joining right now
pree has quit [Read error: Connection reset by peer]
jaafar has joined #ste||ar
pree has joined #ste||ar
jaafar has quit [Ping timeout: 252 seconds]
pree has quit [Read error: Connection reset by peer]
zbyerly_ has joined #ste||ar
<zbyerly_>
aserio, do you need me for the skype with jason?
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
<aserio>
zbyerly_: Well at the moment he isn't even online.... :p
akheir has joined #ste||ar
<aserio>
hkaiser: are you still looking at parsa[w]'s pull request?
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 252 seconds]
aserio1 is now known as aserio
<zbyerly_>
aserio, so what's the skype call with Jason about? should I be on it?
<aserio>
zbyerly_: So I just called Jason about getting HPX into ADCIRC
<zbyerly_>
what did he say?
<aserio>
and we were wondering if you would be willing to work on getting HPX working with the ADCIRC CMake build system
<aserio>
zbyerly_: He was just saying the same stuff as on the call
<aserio>
So I asked if it were possible to break this down in smaller steps
<aserio>
and I asked about the build system
<aserio>
and then we came up with the request
<aserio>
zbyerly_: so what do you think?
<zbyerly_>
idk, why wasn't I on the call?
<aserio>
Because I wanted to have a one on one talk with Jason, I figured he would feel less pressured to commit to what was being said
<zbyerly_>
aserio, i don't understand what you mean at all
<aserio>
zbyerly_: which part?
<zbyerly_>
why wouldn't jason and I just talk directly and decide what to do?
<zbyerly_>
and what is Jason committing to?
<aserio>
Because it is less about the engineering and more about planting the seed
<aserio>
I wasn't sure what he would be committing to
<aserio>
that's why I asked to call him just to chat
<zbyerly_>
aserio, you didn't want to talk to me first about it?
<zbyerly_>
like "hey, what have you and Jason talked about doing wrt HPX and ADCIRC?"
<aserio>
All you asked was did you need to be on it
<aserio>
I can try to be more direct next time
<zbyerly_>
aserio, let's talk about this in person on monday
<aserio>
zbyerly_: Can do
<aserio>
parsa[w]: have you built Blaze on Rostam?
<parsa[w]>
no, why? i don't even need to build it btw
<parsa[w]>
it's a header file library
<aserio>
I am trying to figure out where to point cmake
<zao>
Build as in run tests and stuff?
<zao>
(and install)
EverYoung has joined #ste||ar
pree has joined #ste||ar
pree has quit [Read error: Connection reset by peer]
<aserio>
zao: yea, I mistyped. I meant did he build his branch of Phylanx on Rostam
<aserio>
I needed to know where to point to Blaze
<aserio>
and he kindly pointed me to the module
<zao>
Ah.
pree has joined #ste||ar
<zao>
Makes more sense then.
<aserio>
Yea I have heard that it is a header only library
david_pfander has quit [Ping timeout: 240 seconds]
gedaj_ has quit [Remote host closed the connection]
gedaj_ has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
aserio1 is now known as aserio
<jbjnr>
hkaiser: K-ballo any advice on where I should look to see if I can make progress on my strange compiler error - I am using dataflow - so I'm not convinced there's anything 'pulling it in by accident'.
<K-ballo>
jbjnr: no, it's not any one isolated issue, it's a chain of implicit instantiations caused by faux deduced return types
<K-ballo>
that chain is recursive, so it ends up hitting an "incomplete" type (depending on where the instantiation chain starts)
<K-ballo>
we need to clean up all of dataflow, unwrapped, the pack traversal stuff, and one or more of the executors
<K-ballo>
I'm currently fighting dataflow
<jbjnr>
the main symptom I'm seeing though is that my continuation is called with 2 args, when I only expect one.
<jbjnr>
and the types are rather strange - but the continuations themselves actually compile and run fine
<K-ballo>
it's not actually called with 2 args
<K-ballo>
the code is just asking whether the thing may be called with 2 args, somewhere in all the deduced return types metaprogramming
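The difference between calling something with two arguments and merely asking whether it could be called with two arguments can be sketched like this. `can_call`, `takes_one` and `continuation` are hypothetical names, not HPX code, and the trait is hand-written so the sketch stays within C++11.

```cpp
#include <type_traits>
#include <utility>

// Stand-in for a function that only accepts one argument.
inline double takes_one(double x) { return 2.0 * x; }

// The return type is spelled out with decltype, so an invalid argument list
// removes this overload quietly (SFINAE). With a body-deduced return type
// (plain `auto` in C++14) the compiler would have to instantiate the body
// to answer the same question, and an error there is a hard error.
struct continuation
{
    template <typename... Ts>
    auto operator()(Ts&&... ts) const
        -> decltype(takes_one(std::forward<Ts>(ts)...))
    {
        return takes_one(std::forward<Ts>(ts)...);
    }
};

// Asks "could F be called with Ts...?" without ever calling it.
template <typename F, typename... Ts>
struct can_call
{
    template <typename G>
    static auto test(int)
        -> decltype(std::declval<G>()(std::declval<Ts>()...), std::true_type());
    template <typename G>
    static std::false_type test(...);

    using type = decltype(test<F>(0));
};

static_assert(can_call<continuation, double>::type::value,
    "callable with one argument");
static_assert(!can_call<continuation, double, double>::type::value,
    "the two-argument question is asked, but no call ever happens");
```

The second assertion is the situation K-ballo describes: metaprogramming probes the two-argument form, yet nothing is executed with two arguments.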
<jbjnr>
I print out the Ts... inside the continuation and there are two of them!
<K-ballo>
(or maybe it is, I haven't looked in detail at your particular case, but the other two like it aren't)
<K-ballo>
you print out? so you are actually called?
<K-ballo>
as in, executed at runtime
<K-ballo>
that's not what I see locally, but I did have to fix some errors in your branch first
<jbjnr>
K-ballo: at the end of this gist are the types my continuation is called with: https://gist.github.com/biddisco/e65f669d735f308251c826f4eebf21cb the first two calls are fine and show the args as expected, the third is with a continuation and I expect a double or a future/double, but I get two args where the first is most unusual
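Printing the contents of a parameter pack, as jbjnr describes, could look roughly like the sketch below. `describe_args` is a hypothetical helper and is not taken from the gist; the type names it prints come from `typeid(...).name()`, which is implementation-defined (often mangled) and drops references and top-level cv-qualifiers.

```cpp
#include <sstream>
#include <string>
#include <typeinfo>

// Reports how many arguments were passed and the typeid name of each.
template <typename... Ts>
std::string describe_args(Ts const&...)
{
    std::ostringstream out;
    out << sizeof...(Ts) << " argument(s):";
    // Expand the pack inside an array initializer to visit each type in order.
    int const expand[] = {0, ((out << ' ' << typeid(Ts).name()), 0)...};
    (void) expand;
    return out.str();
}
```

Calling `describe_args(ts...)` at the top of a continuation and printing the result shows at runtime which arguments that particular instantiation actually received.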
<K-ballo>
no idea, that's not what I see locally, here it doesn't run, it doesn't even compile
<jbjnr>
when I comment out one of my template calls, the code runs and I get the output above. when I leave it in, I get silly compiler errors, but the
<K-ballo>
make the instantiation with more than one argument fail so that you get an instantiation trace?
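K-ballo's suggestion can be sketched with a `static_assert` on the pack size. `checked_continuation` and `as_double` are hypothetical names. The assertion only fires where the function body is actually instantiated, and the compiler's "in instantiation of" / "required from" notes then show which template requested the unexpected argument count.

```cpp
template <typename T>
double as_double(T const& t)
{
    return static_cast<double>(t);
}

template <typename... Ts>
double checked_continuation(Ts const&... ts)
{
    // Instantiating this body with two or more arguments stops compilation
    // here, and the diagnostic lists the chain of instantiations behind it.
    static_assert(sizeof...(Ts) == 1,
        "continuation instantiated with an unexpected number of arguments");
    return as_double(ts...);
}
```

With exactly one argument, for example `checked_continuation(21)`, the assertion passes and the call compiles as usual.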
<jbjnr>
yeah - it does not compile unless the call to numa_function(ts...) is commented out
<jbjnr>
but that should not affect the 'other' function ;)
<K-ballo>
ah, yeah, that's the one that fails currently
<K-ballo>
I thought *that* was the issue
<jbjnr>
yes - the fact that it does not compile is my problem, but trying to find out why it doesn't compile, I comment it out, but print the ts... args in the other part of the function that is actually executed
<jbjnr>
and that shows the args are silly, and that's why it doesn't compile - and yet it does compile - and run - for the continuation itself.
<jbjnr>
it's weird to me.
<jbjnr>
what bugs did you fix in my branch?
<K-ballo>
two or three extra `template <>`, and an issue with hpx::get_config_entry in some convoluted defaulted arguments that might well have been an msvc bug
<K-ballo>
msvc kept saying get_config_entry could not be found in the global namespace, despite the hpx::
<jbjnr>
ok, the template<> I knew about - but I thought they were allowed
parsa has quit [Quit: Zzzzzzzzzzzz]
<jbjnr>
is the get_config_entry patch in any branch?
<parsa[w]>
hkaiser: should i put a temporary implementation in its place or should i disable it?
<hkaiser>
parsa[w]: det()?
jaafar has quit [Ping timeout: 240 seconds]
<parsa[w]>
yeah
<hkaiser>
it should work for doubles
<parsa[w]>
it doesn't
gedaj_ has quit [Remote host closed the connection]
<hkaiser>
your ticket is for a Vector<int>
gedaj_ has joined #ste||ar
<hkaiser>
aserio: yt?
<parsa[w]>
hkaiser: it's for a Matrix<int>. Same thing happens with doubles.
<hkaiser>
doesn't make sense as he's listing the missing functions for double* as being available
<parsa[w]>
hkaiser: i'm lost. i need det in phylanx::execution_tree::primitives::determinant and it doesn't work
<aserio>
hkaiser: yes
<hkaiser>
parsa[w]: the compiler error you added to the blaze ticket says that it's missing getrf() for int* but it has getrf() defined for complex<double>, complex<float>, double, and float
<hkaiser>
parsa[w]: so I conclude that if matrix<double>.det() doesn't compile - it can't be the same problem
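hkaiser's observation is that `getrf()` exists for floating-point and complex element types but not for `int`. One way around that for integer matrices is to convert the elements to `double` first. The sketch below assumes nothing about Blaze's API: `determinant` and the vector-of-vectors matrix are illustrative, and it swaps in plain Gaussian elimination with partial pivoting rather than calling LAPACK.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Determinant of a square matrix with arbitrary arithmetic element type.
template <typename T>
double determinant(std::vector<std::vector<T>> const& m)
{
    std::size_t const n = m.size();

    // Copy into double storage; this is the step an int matrix needs.
    std::vector<std::vector<double>> a(n, std::vector<double>(n));
    for (std::size_t i = 0; i < n; ++i)
    {
        assert(m[i].size() == n);    // the sketch assumes a square matrix
        for (std::size_t j = 0; j < n; ++j)
            a[i][j] = static_cast<double>(m[i][j]);
    }

    double det = 1.0;
    for (std::size_t k = 0; k < n; ++k)
    {
        // Partial pivoting: pick the largest entry in column k.
        std::size_t pivot = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i][k]) > std::fabs(a[pivot][k]))
                pivot = i;
        if (a[pivot][k] == 0.0)
            return 0.0;    // singular matrix
        if (pivot != k)
        {
            std::swap(a[pivot], a[k]);
            det = -det;    // a row swap flips the sign
        }
        det *= a[k][k];
        // Eliminate the entries below the pivot.
        for (std::size_t i = k + 1; i < n; ++i)
        {
            double const factor = a[i][k] / a[k][k];
            for (std::size_t j = k + 1; j < n; ++j)
                a[i][j] -= factor * a[k][j];
        }
    }
    return det;
}
```

Because the result is computed in floating point, comparisons against an expected value should allow a small tolerance.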
<hkaiser>
aserio: thanks for the latest fix to if_conditional - one question, though
parsa has joined #ste||ar
<hkaiser>
aserio: why did you name the implementation class 'loop'?
<aserio>
hkaiser: because it is the if loop
<aserio>
and the loop contains the body
<hkaiser>
what is an "if loop"?
<aserio>
idk I was making crap up
<hkaiser>
lol
<aserio>
I just didn't want to use iteration
<hkaiser>
it isn't an iteration either
<hkaiser>
isn't it an if_conditional?
<hkaiser>
or just if_?
eschnett has quit [Quit: eschnett]
<aserio>
I can change it to either
<aserio>
hkaiser: what would you prefer
<hkaiser>
shrug, just wondering
<hkaiser>
if somebody will read the code 'loop' might cause confusion - that's all
<aserio>
no, it's not a problem
<aserio>
what makes the most sense
parsa has quit [Quit: Zzzzzzzzzzzz]
<aserio>
hkaiser: was the shared pointer stuff right?
<hkaiser>
parsa: you need to have a BLAS library installed (MKL or similar)
<hkaiser>
do you have that?
<hkaiser>
openblas (in vcpkg) should do the trick as well
<parsa[w]>
one of the three machines (machine C in that issue) did have OpenBLAS. yes it failed
parsa has quit [Client Quit]
<hkaiser>
dgetrf is a BLAS function - so something is not quite right with the blaze installation
<parsa[w]>
also why should it offer determinant calculation for BLAS only if i can choose the C++11 backend instead of BLAS?
<hkaiser>
that's unrelated
<hkaiser>
I think this tells us that our cmake script is by far not done yet ;)
<aserio>
hkaiser: when you are building in Windows do you set the architecture to x64 if you have a 64 bit version of Windows?
<hkaiser>
yah, you should always build x64 only
<zao>
hkaiser doesn't care for us 32-bit suckers :P
<K-ballo>
what do you mean "*if* you have a 64 bit version of Windows" ?!
<zao>
ARM64, IA64 or AMD64? :)
<hkaiser>
aserio: see pm, pls
<aserio>
zao: Intel i7
<hkaiser>
aserio: nowadays nobody uses 32bit windows anymore
<zao>
The default should be x64, if you're unsure. There are legitimate reasons to run 32-bit code even today, but HPX won't work well there.
<hkaiser>
zao: it's a limitation in windows, not in hpx
<zao>
Limitations there may be, but HPX kind of rots easily on platforms where integer sizes change.
<hkaiser>
right
<hkaiser>
all software rots... eventually
<K-ballo>
integer sizes change?
<zao>
size_t in particular.
eschnett has joined #ste||ar
<zao>
Reasonably common for code that's written 64-first to interchangeably use it around 64-bit integers.
<zao>
Infinite number of size warnings in 32-bit builds.
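The size warnings zao mentions come from code that treats `std::size_t` and 64-bit integers as interchangeable. A minimal sketch of an explicit, checked conversion follows; `to_size` is a hypothetical helper, not an HPX function.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Converts a 64-bit value to std::size_t without silent truncation.
inline std::size_t to_size(std::uint64_t value)
{
    // Where std::size_t is 64 bits wide this comparison is never true;
    // where it is 32 bits wide it catches values that would be cut off.
    if (value > std::numeric_limits<std::size_t>::max())
        throw std::out_of_range("value does not fit into std::size_t");
    return static_cast<std::size_t>(value);
}
```

Routing conversions through one helper like this keeps the narrowing in a single, visible place instead of spreading implicit conversions across the code.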
<K-ballo>
ah yes, indeed
<K-ballo>
my very first experience with HPX was seeing thousands of 32bit warnings
<K-ballo>
gave me pause
<hkaiser>
I thought we fixed those
<aserio>
I was just confused about the x86 vs x64 thing
<hkaiser>
have not compiled for 32 bits in a while...
<aserio>
hkaiser: bike shedding issue complete
<hkaiser>
aserio: thanks ;)
<K-ballo>
32bit windows configuration should raise an error and ask the user to opt-in to continue
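K-ballo is talking about the configuration step. The same opt-in idea can also be expressed at compile time, which is a different technique from a configure-time check. In this sketch `EXAMPLE_ALLOW_32BIT_BUILD` is a hypothetical macro, not an existing HPX option, and pointer width is used as the test for a 64-bit target.

```cpp
// EXAMPLE_ALLOW_32BIT_BUILD is a made-up opt-in macro for this sketch.
#if !defined(EXAMPLE_ALLOW_32BIT_BUILD)
static_assert(sizeof(void*) >= 8,
    "32-bit build detected; define EXAMPLE_ALLOW_32BIT_BUILD to continue");
#endif

// Lets other code query the result without repeating the test.
constexpr bool is_64bit_build()
{
    return sizeof(void*) >= 8;
}
```

A 32-bit build then fails with a clear message unless the user explicitly defines the macro, rather than producing a long stream of size warnings.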
<aserio>
hkaiser: I still can't get the last generate tree test to work
<hkaiser>
k
<aserio>
hkaiser: I am either missing something very simple or there is something wrong
twwright has quit [Read error: Connection reset by peer]
twwright has joined #ste||ar
<parsa[w]>
hkaiser: i don't see how installing OpenBLAS with Vcpkg helped. i'm obviously still getting linking errors because how is MSVC supposed to know it should look for BLAS
<hkaiser>
parsa[w]: that's what I meant by ' I think this tells us that our cmake script is by far not done yet'
<parsa[w]>
what does this have to do with "our" cmake script?
<parsa[w]>
i'm not supposed to install Blaze for people
<parsa[w]>
hkaiser: also this means you're officially adding lapack and blas as dependencies to the project
<K-ballo>
darn! why doesn't C2 work with HPX?
aserio has quit [Quit: aserio]
<zao>
vsclang thing?
<K-ballo>
yeah.. wanted a second opinion
<github>
[hpx] khuck pushed 1 new commit to parcel_coalescing_apex_policy: https://git.io/vdbcI
<github>
hpx/parcel_coalescing_apex_policy 5a8d902 Kevin Huck: Changing parcel coalescing policy from periodic to a custom event.
<hkaiser>
parsa[w]: we need to make it work, no?
<hkaiser>
parsa[w]: the blaze cmake script is supposed to find a blas library and if none is found it should fall back to some internal implementation
<parsa[w]>
hkaiser: yes, but is it really a minor change?
<hkaiser>
shrug, it has to work in the end
<parsa[w]>
hkaiser: if it doesn't find some BLAS implementation phylanx won't build
<github>
[hpx] khuck pushed 1 new commit to parcel_coalescing_apex_policy: https://git.io/vdbc7
<github>
hpx/parcel_coalescing_apex_policy cb1c01a Kevin Huck: Changing from v1.2 to develop of apex.
eschnett has quit [Quit: eschnett]
<hkaiser>
parsa[w]: blaze is supposed to work without any external blas implementation
<parsa[w]>
what do you mean? that's why i created that issue. that's what we've been talking about this whole time
<hkaiser>
ok
<hkaiser>
it was not quite clear from the ticket
gedaj_ has quit [Remote host closed the connection]