aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
david_pfander has joined #ste||ar
david_pfander has quit [Ping timeout: 268 seconds]
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
EverYoun_ has quit [Ping timeout: 258 seconds]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
parsa has quit [Client Quit]
EverYoung has joined #ste||ar
parsa has joined #ste||ar
parsa has quit [Client Quit]
eschnett has joined #ste||ar
EverYoung has quit [Ping timeout: 240 seconds]
hkaiser has quit [Quit: bye]
twwright has quit [*.net *.split]
Vir has quit [*.net *.split]
twwright has joined #ste||ar
Vir has joined #ste||ar
jaafar has quit [Ping timeout: 240 seconds]
jaafar has joined #ste||ar
jaafar has quit [Ping timeout: 250 seconds]
hkaiser has joined #ste||ar
<hkaiser>
denisblank: I'm using openblas or mkl (use vcpkg)
hkaiser has quit [Quit: bye]
jaafar has joined #ste||ar
jaafar has quit [Ping timeout: 240 seconds]
K-ballo has quit [Quit: K-ballo]
jaafar has joined #ste||ar
denisblank has quit [Quit: denisblank]
jaafar has quit [Ping timeout: 248 seconds]
jaafar_ has joined #ste||ar
jaafar_ has quit [Ping timeout: 260 seconds]
david_pfander has joined #ste||ar
david_pfander has quit [Ping timeout: 248 seconds]
pree has joined #ste||ar
jbjnr has joined #ste||ar
<jbjnr>
msg nickserv identify webn0d3fr33
<jbjnr>
<sigh>
<jbjnr>
time for another password
<jbjnr>
I blame firefox anyway
<github>
[hpx] biddisco force-pushed fixing_2996 from 39d236d to f49573f: https://git.io/vF1uq
<msimberg>
how much of that is actually in progress, and of the stuff that's in progress how much needs to be done for the release?
<heller>
msimberg: I think we should shift it to the next release
<heller>
we don't have free cycles to work on it atm
<msimberg>
okay
<msimberg>
so what I'd like to do is clean up the 1.1 milestone of things like this, I don't quite like that everything is there
<msimberg>
do you think it would be possible for me to get access to do that?
<heller>
I'd really like to redesign our whole parcelport and serialization stuff
<msimberg>
I can't judge for everything, but with enough asking around...
<heller>
i've been thinking a lot about it lately
<heller>
msimberg: yes, I can grant you access
<heller>
msimberg: I suggest going through the tickets, and either adding a comment or judging for yourself
<msimberg>
thank you, I promise to be responsible
<heller>
just sent you an invite
<msimberg>
yeah, that was my plan, please tell me if I misjudge something badly
<heller>
sure
<heller>
i'll keep an eye on it ;)
<heller>
revoking access is easy and nothing gets lost ;)
<jbjnr>
heller: "I'd really like to redesign our whole parcelport and serialization stuff" please be aware that I have a large body of work on the rma_objects that I want to merge in. it includes new serialization stuff and rma allocators and everything. don't make changes without warning me so I can merge first.
<heller>
jbjnr: I won't
<jbjnr>
ta
<heller>
jbjnr: I plan to do the whole redesign in a separate project at first to see how it goes
<jbjnr>
what would you like to change?
<heller>
the serialization process and how remote futures are triggered
<heller>
I think there is a great opportunity there to have some speedup
<heller>
in a nutshell
<heller>
I think there is a lot that gets lost in the whole setup
<heller>
So I want to have a parcelport/parcelhandler setup that is completely independent of the whole HPX tasking framework at first
<jbjnr>
(btw - My next parcelport work will focus on getting the rma stuff into channels and adding the collectives)
<heller>
yeah, collectives and point-to-point communication are another thing
<heller>
so one thing to have first, I guess, is to deal with the message passing in a more explicit way, allowing for HPX (and possibly other task based systems) to reuse it efficiently
<heller>
and then we can evaluate performance of the whole networking layer more reliably without anything else getting in the way
<heller>
for example, design it with communicators in mind from the ground up
<heller>
no GIDs, just endpoints to which you can dispatch functions or send messages
<heller>
if that makes sense
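[editor's note] To make the "no GIDs, just endpoints" idea concrete, here is a minimal in-process sketch. It is not HPX code and not heller's design: the names `endpoint`, `communicator`, `send`, `dispatch` and `progress` are all hypothetical, and a plain in-memory queue stands in for the network. It only illustrates addressing a peer by rank within a communicator and either sending it a message or dispatching a function to it, with no tasking framework involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a network endpoint (an assumption for illustration).
class endpoint
{
public:
    // Point-to-point message: here it simply lands in the inbox.
    void send(std::string msg) { inbox_.push_back(std::move(msg)); }

    // Queue a function to be run "at" this endpoint later.
    void dispatch(std::function<void(endpoint&)> f)
    {
        pending_.push_back(std::move(f));
    }

    // Explicit progress call: runs every queued function, including
    // functions queued by the functions being run.
    void progress()
    {
        while (!pending_.empty())
        {
            auto f = std::move(pending_.front());
            pending_.pop_front();
            f(*this);
        }
    }

    std::vector<std::string> const& inbox() const { return inbox_; }

private:
    std::vector<std::string> inbox_;
    std::deque<std::function<void(endpoint&)>> pending_;
};

// A communicator here is just a fixed group of endpoints addressed by rank.
class communicator
{
public:
    explicit communicator(std::size_t n) : endpoints_(n) {}

    endpoint& operator[](std::size_t rank) { return endpoints_.at(rank); }
    std::size_t size() const { return endpoints_.size(); }

    void progress()
    {
        for (auto& e : endpoints_)
            e.progress();
    }

private:
    std::vector<endpoint> endpoints_;
};
```

With this sketch, `comm[1].send("hello")` delivers a message to rank 1, and `comm[1].dispatch(...)` followed by `comm.progress()` runs a function there. Because progress is explicit, a layer like this could be benchmarked without a scheduler getting in the way.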
<heller>
msimberg: did you run the whole testsuite with your changes in the throttle fix branch?
<msimberg>
heller: no, I did not
<msimberg>
I can still do that
<heller>
I am on it atm
<msimberg>
ah, okay, thanks
<heller>
getting lots of failures
<msimberg>
:(
<msimberg>
okay, let me know which ones
<heller>
not sure if it is related to all the other failures we have right now :/
<heller>
especially #2982, #2998 and #3007
<heller>
merging it to my branch atm...
<heller>
and try again...
<github>
[hpx] msimberg opened pull request #3011: Fix cpuset leak in hwloc_topology_info.cpp (master...fix-hwloc-leak) https://git.io/vFMY7
<msimberg>
ah, I see now, 2 revisions... thanks again
<heller>
ahh, almost ...
<heller>
the timed version is still failing :/
K-ballo has joined #ste||ar
<heller>
msimberg: did you touch the state_suspended vs. state_stopped thing?
<msimberg>
heller: not on this branch, I have in my experiments with suspending though
hkaiser has joined #ste||ar
hkaiser has quit [Quit: bye]
<heller>
msimberg: ok
mcopik has joined #ste||ar
mcopik has quit [Client Quit]
Hodor12345678 has joined #ste||ar
<Hodor12345678>
Hello Everyone
<K-ballo>
hi there
Hodor12345678 has quit [Remote host closed the connection]
<heller>
that was quick ;)
<jbjnr>
Hold the Door!
<heller>
msimberg: closing in now...
<heller>
msimberg: I have another patch that needs to be integrated into the fix-throttle-test
K-ballo has quit [Read error: Connection reset by peer]
K-ballo has joined #ste||ar
<msimberg>
heller: can I see? I'm just trying to understand what the thread pool executor is actually doing, so I'm afraid I'm not of much help (yet)
<heller>
msimberg: still running into hangs at shutdown ...
<heller>
msimberg: in essence, the thread pool executors run an embedded scheduling loop
<msimberg>
heller: do you have a concise explanation of how it stops? is the destructor supposed to block until all the work on the thread pool executor is done?
<heller>
msimberg: yes. the problem is, that it stops too early
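[editor's note] The behaviour described here (an executor running its own scheduling loop, with a destructor that blocks until all posted work is done) can be sketched in standard C++ as follows. This is a simplified illustration with one worker thread, not the HPX thread pool executor; `blocking_executor` and `post` are hypothetical names. The point to notice is the exit condition: the loop leaves only when stop was requested and the queue is empty, which is the invariant that fails when an executor "stops too early".

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-worker executor (an assumption for illustration).
class blocking_executor
{
public:
    blocking_executor() : worker_([this] { run(); }) {}

    // Blocks until every task posted before destruction has run.
    ~blocking_executor()
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    void post(std::function<void()> f)
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push_back(std::move(f));
        }
        cv_.notify_one();
    }

private:
    // The embedded scheduling loop.
    void run()
    {
        std::unique_lock<std::mutex> lock(mtx_);
        for (;;)
        {
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty())
                return;    // reached only if stop_ is set and no work is left

            auto f = std::move(queue_.front());
            queue_.pop_front();

            lock.unlock();    // run the task without holding the lock
            f();
            lock.lock();
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> queue_;
    bool stop_ = false;
    std::thread worker_;    // declared last so it starts after the other members exist
};
```

Checking only `stop_` in the loop would let the worker exit with tasks still queued, which is the kind of early stop being discussed.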
<msimberg>
yeah, the cleanup_terminated functions are most likely too relaxed now
<msimberg>
first try with adding back checks for thread_map_.empty() at least hangs
<msimberg>
and btw, thread_map_count_ should always be the same as thread_map_.size(), no?
<heller>
yes
<heller>
well, there is a race, but that should be accounted for, that is, thread_map_count_ should only be increased once the item has been inserted, and decreased once it has been removed
<msimberg>
sure, but besides that
<msimberg>
so what is the purpose of having a separate count variable then?
<heller>
optimization
<heller>
checking an atomic is cheaper than acquiring a contended lock
<msimberg>
mmh, I see
<heller>
OTOH, we could turn it around and do a try lock instead... if that doesn't succeed, we assume there is enough work or equivalent for other functions
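[editor's note] The two options discussed here can be sketched side by side. This is an illustration in standard C++, not the HPX scheduler: the class name and `empty_fast` / `empty_try_lock` are hypothetical, and a set of integer ids stands in for the real `thread_map_`. The counter is increased only after an item has been inserted and decreased only after it has been removed, as described above; the try-lock variant treats a contended lock as "assume there is work".

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical stand-in for a scheduler's thread map (an assumption).
class thread_map
{
public:
    void insert(int id)
    {
        std::lock_guard<std::mutex> lock(mtx_);
        if (map_.insert(id).second)
            ++count_;    // increased only once the item is in the map
    }

    void erase(int id)
    {
        std::lock_guard<std::mutex> lock(mtx_);
        if (map_.erase(id) != 0)
            --count_;    // decreased only once the item has been removed
    }

    // Option 1: a single atomic load, no lock taken.
    bool empty_fast() const
    {
        return count_.load(std::memory_order_acquire) == 0;
    }

    // Option 2: try the lock; if it is not acquired, assume there is work.
    bool empty_try_lock()
    {
        std::unique_lock<std::mutex> lock(mtx_, std::try_to_lock);
        if (!lock.owns_lock())
            return false;
        return map_.empty();
    }

private:
    std::mutex mtx_;
    std::unordered_set<int> map_;
    std::atomic<std::size_t> count_{0};
};
```

Option 1 never touches the mutex on the read path. Option 2 removes the second variable, but it can answer "not empty" when the map actually is empty, so callers have to tolerate that conservative result.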
<K-ballo>
heller: what's holding the component factory removals?
david_pfander has joined #ste||ar
<msimberg>
heller: going to stop for today and continue tomorrow, but I found at least that it's only the test_timed_apply part that fails
<msimberg>
and at the time of the assert it's 4 completed, 6 scheduled, so those might be the two missing
<msimberg>
do post_after/at do something differently than async/sync wrt the thread map or something?
<msimberg>
anyway, will continue tomorrow, thanks for finding the problems
<heller>
msimberg: the patch I sent in is needed. I was looking at the wrong spot though. I know what to do now...
david_pfander has quit [Ping timeout: 240 seconds]
EverYoung has joined #ste||ar
jaafar_ has joined #ste||ar
gedaj has joined #ste||ar
david_pfander has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
gedaj has quit [Quit: leaving]
david_pfander has quit [Ping timeout: 240 seconds]