hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC2018: https://wp.me/p4pxJf-k1
diehlpk has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has quit [Ping timeout: 265 seconds]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
parsa has quit [Ping timeout: 240 seconds]
EverYoung has quit [Remote host closed the connection]
eschnett has joined #ste||ar
diehlpk has quit [Remote host closed the connection]
diehlpk has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
ABresting has joined #ste||ar
diehlpk has quit [Ping timeout: 240 seconds]
parsa has joined #ste||ar
K-ballo has quit [Quit: K-ballo]
nanashi55 has quit [Ping timeout: 248 seconds]
nanashi55 has joined #ste||ar
Anushi1998 has quit [Ping timeout: 276 seconds]
Anushi1998 has joined #ste||ar
apsknight has quit [Ping timeout: 265 seconds]
Anushi1998 has quit [Ping timeout: 255 seconds]
ABresting has quit [Quit: Connection closed for inactivity]
Anushi1998 has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
parsa has joined #ste||ar
apsknight has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
Anushi1998 has quit [Ping timeout: 265 seconds]
gedaj has quit [Read error: Connection reset by peer]
_gedaj_ has joined #ste||ar
hkaiser has quit [Ping timeout: 255 seconds]
_gedaj_ has quit [Remote host closed the connection]
_gedaj_ has joined #ste||ar
<heller> zao: so you would say singularity is docker done right?
_gedaj_ has quit [Read error: Connection reset by peer]
gedaj has joined #ste||ar
Anushi1998 has joined #ste||ar
<zao> A bit different niches, I’d say. Singularity is more about dipping into an image with most of the host mapped in (networking, home dir, devices, etc.) to run something in a somewhat different way.
<zao> While Docker and LXC are more about persistent lightweight VM-like envs, with NAT-by-default and whatnot.
<zao> It is definitely a project that is needed out there, let’s just hope they learn how to responsibly write suid binaries and handle incidents.
<zao> Because holy heck, they did approximately everything wrong handling my vulnerability :D
<heller> ;)
simbergm has joined #ste||ar
ABresting has joined #ste||ar
jaafar has quit [Ping timeout: 264 seconds]
apsknight has quit [Read error: Connection reset by peer]
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
<zao> Still allergic to issuing CVEs, heh.
gedaj has quit [Remote host closed the connection]
gedaj has joined #ste||ar
EverYoung has joined #ste||ar
nanashi55 has quit [Ping timeout: 263 seconds]
nanashi55 has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
recolic has joined #ste||ar
recolic has quit [Client Quit]
nikunj_ has joined #ste||ar
Zwei has quit [Ping timeout: 268 seconds]
ABresting has quit [Quit: Connection closed for inactivity]
nikunj_ has quit [Ping timeout: 260 seconds]
nikunj_ has joined #ste||ar
anushi has joined #ste||ar
verganz has joined #ste||ar
verganz has quit [Ping timeout: 260 seconds]
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
heller has joined #ste||ar
hkaiser has joined #ste||ar
nikunj_ has quit [Quit: Page closed]
nanashi64 has joined #ste||ar
nanashi55 has quit [Ping timeout: 248 seconds]
nanashi55 has joined #ste||ar
EverYoung has joined #ste||ar
nanashi64 has quit [Ping timeout: 256 seconds]
<zao> Bah, vectorization talk is all OpenMP hogwash.
<zao> (got Intel people here running a KNL workshop)
EverYoung has quit [Ping timeout: 255 seconds]
nanashi55 has quit [Ping timeout: 240 seconds]
nanashi55 has joined #ste||ar
K-ballo has joined #ste||ar
anushi has quit [Read error: Connection reset by peer]
anushi has joined #ste||ar
<hkaiser> zao: forget this, KNL has been declared to be dead by Intel itself
<hkaiser> also omp does not scale on knl
<zao> Hehe.
<heller> no one does :P
<zao> This is PRACE, maybe they're slow on the uptake :P
<hkaiser> heller: some do better than others ;)
<heller> the machines that have been provisioned are better off being used
<zao> Also, we have actual KNLs in production, so it helps to teach our users to get some use out of them.
anushi has quit [Quit: bye]
parsa has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
nanashi55 has quit [Ping timeout: 255 seconds]
nanashi55 has joined #ste||ar
nikunj has joined #ste||ar
hkaiser has quit [Read error: Connection reset by peer]
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
nikunj has quit [Ping timeout: 264 seconds]
nikunj has joined #ste||ar
aserio has joined #ste||ar
eschnett has quit [Quit: eschnett]
eschnett has joined #ste||ar
hkaiser has joined #ste||ar
<zao> Have you fine people had anything to do with Mikko Byckling and Asma Farjallah from Intel?
zao has quit []
zao has joined #ste||ar
anushi has joined #ste||ar
anushi has quit [Client Quit]
Anushi1998 is now known as anushi
Anushi1998 has joined #ste||ar
<hkaiser> heller: is the make_ready_future PR ok with you now?
diehlpk has joined #ste||ar
parsa has joined #ste||ar
nanashi55 has quit [Ping timeout: 255 seconds]
nanashi55 has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
nikunj has quit [Ping timeout: 264 seconds]
nikunj has joined #ste||ar
khuck has joined #ste||ar
<khuck> @hkaiser - you there? I think I found a problem with the parent_id stuff in HPX. I am getting an hpx_init_data with a parent_id pointer, but its data is garbage.
<khuck> I am wondering if I can just check the parent_id.thrd_->current_state_.state_ value - it is 0
<khuck> but other members, like the parent_thread_id_ and the parent_locality_id_ are obviously bogus.
<khuck> i.e. parent_locality_id_ = 395919568 on a 1 locality run
diehlpk has quit [Ping timeout: 248 seconds]
<hkaiser> khuck: interesting
<hkaiser> how can I reproduce this?
<hkaiser> ahh, I think I know what's happening
<hkaiser> doh
<hkaiser> let me think
elfring has joined #ste||ar
gedaj has quit [Ping timeout: 264 seconds]
gedaj has joined #ste||ar
EverYoung has quit [Ping timeout: 265 seconds]
EverYoung has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
jaafar has joined #ste||ar
<khuck> hkaiser: one way to expose it is configure with APEX, then build Phylanx and run lra_csv with the breast cancer data set. I haven't found a smaller test case that replicates it.
<hkaiser> khuck: I think what happens is that a thread is created as a staged thread first and by the time it actually gets instantiated the parent thread is already gone
EverYoung has quit [Ping timeout: 255 seconds]
<khuck> but parent_thread_id_ is not null, so I think it's still there
<hkaiser> no
<hkaiser> heller changed thread-ids from actually keeping the object alive to plain pointers :/
<hkaiser> so it's his fault
<khuck> ah
<khuck> so a staged thread shouldn't "retain" its parent_thread_id_ at all, because it won't be around later
<hkaiser> I knew there was a reason, but when he changed it I couldn't remember why we were keeping the thread object alive
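[editor's note] The difference between an id that keeps the thread object alive and one that is only a pointer can be sketched with standard smart pointers. This is an illustration only: `thread_data_sketch`, `owning_id` and `observing_id` are hypothetical names, and `std::shared_ptr`/`std::weak_ptr` are stand-ins chosen here, not the types HPX uses. A `std::weak_ptr` is used for the non-owning case so that the expiry can be observed safely; a plain pointer would simply dangle.

```cpp
#include <memory>

// Hypothetical stand-in for a thread object; not the real HPX type.
struct thread_data_sketch {
    int apex_data = 0;
};

// An id that shares ownership keeps the thread object alive.
using owning_id = std::shared_ptr<thread_data_sketch>;
// A non-owning id; weak_ptr is used so expiry is detectable.
using observing_id = std::weak_ptr<thread_data_sketch>;

// Returns true if the parent object is still alive after its own
// handle went out of scope, when the child holds an owning id.
inline bool parent_alive_with_owning_id() {
    owning_id parent_ref;
    {
        auto parent = std::make_shared<thread_data_sketch>();
        parent_ref = parent;  // child shares ownership
    }  // the parent's own handle is released here
    return parent_ref != nullptr && parent_ref.use_count() == 1;
}

// Same scenario with a non-owning id: the object is gone.
inline bool parent_alive_with_observing_id() {
    observing_id parent_ref;
    {
        auto parent = std::make_shared<thread_data_sketch>();
        parent_ref = parent;  // child only observes
    }  // last owner released, object destroyed
    return !parent_ref.expired();
}
```

With the owning id the parent survives until the child lets go; with the non-owning id the child is left with a handle to nothing, which matches the bogus parent data reported above.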
<hkaiser> khuck: what information do you need from a parent thread?
<khuck> parent_thread_id_.get()->get_apex_data()
<khuck> but the thread_data is gone
<khuck> only for dependency tracking. So if the parent isn't going to wait for the child, there's no reason to track it.
<hkaiser> K-ballo: I understand
<hkaiser> khuck: ^^
<hkaiser> sorry K-ballo
<khuck> :)
<K-ballo> ...
<khuck> I haven't been on IRC for a few months, so you'll have to get used to tabbing twice. :)
<hkaiser> khuck: yah
<hkaiser> khuck: I need to think about how this can be solved
<khuck> np
<khuck> hkaiser: I was wondering if storing the m_apex_data in the thread_data is the right thing to do
<hkaiser> khuck: I think it does make sense to track the parent even if it's not around anymore once the child is run
<hkaiser> khuck: what alternative do you see?
<hkaiser> (without having a separate map holding that)
<khuck> hkaiser: store it somewhere else, where it doesn't need to be updated at the rebind process
<hkaiser> khuck: what we could do is to call into apex the moment the staged thread is created at which point the apex data is still accessible
<khuck> somewhere more persistent to the task at hand
<khuck> yes
<khuck> hkaiser: in the thread_init_data?
<hkaiser> khuck: sec
<hkaiser> khuck: where is apex currently called when a thread is created?
<khuck> hkaiser: thread_data constructor, and rebind_base
<hkaiser> can you give me a link to those, pls
<hkaiser> ok
<hkaiser> sec
<khuck> hkaiser: no rush - i have a phone call in 2 minutes, could be an hour but I hope not
<hkaiser> that might be a better place for you to call into APEX than in the thread constructor
<khuck> (thumbs up)
<hkaiser> if run_now == true a new thread object is created (the constructor is called), otherwise a staged thread is created
<khuck> regardless, the parent is alive then?
<hkaiser> yes
<khuck> coolio
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
aserio has joined #ste||ar
nanashi55 has quit [Ping timeout: 260 seconds]
nanashi55 has joined #ste||ar
_anushi has joined #ste||ar
Anushi1998 has quit [Ping timeout: 255 seconds]
_anushi is now known as Anushi1998
EverYoung has joined #ste||ar
quaz0r has quit [Ping timeout: 265 seconds]
EverYoung has quit [Ping timeout: 255 seconds]
quaz0r has joined #ste||ar
_anushi has joined #ste||ar
Anushi1998 has quit [Ping timeout: 260 seconds]
_anushi is now known as Anushi1998
verganz has joined #ste||ar
verganz has quit [Client Quit]
verganz has joined #ste||ar
verganz has quit [Ping timeout: 260 seconds]
diehlpk has joined #ste||ar
diehlpk has quit [Ping timeout: 240 seconds]
hkaiser has quit [Quit: bye]
nikunj has quit [Ping timeout: 248 seconds]
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
aserio1 has joined #ste||ar
aserio has quit [Ping timeout: 255 seconds]
aserio1 is now known as aserio
diehlpk has joined #ste||ar
elfring has quit [Quit: Konversation terminated!]
eschnett has quit [Quit: eschnett]
nikunj has joined #ste||ar
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 265 seconds]
hkaiser has joined #ste||ar
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
<khuck> hkaiser: I decided create_thread_object might be a better spot, since it is called from create_thread() in multiple places: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/runtime/threads/policies/thread_queue.hpp#L201
<khuck> but you can veto that
EverYoung has joined #ste||ar
<K-ballo> will there be any HPX people at cppnow?
<hkaiser> khuck: at that point the parent might have gone
<khuck> hkaiser: yeah, I have already tried it and crashed it.
<khuck> hkaiser: what does the parent_phase represent?
<hkaiser> khuck: doing it here might be a good spot
<hkaiser> the phase counter counts how often a thread is re-activated (after suspension)
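[editor's note] A minimal sketch of a phase counter as described in the line above: it goes up each time a suspended thread is re-activated. All names here (`thread_sketch`, `state_sketch`, `phase`) are hypothetical, and whether the very first activation counts as a phase in HPX is not stated in the log; this sketch assumes it does not.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical thread states; not the real HPX enumeration.
enum class state_sketch { pending, active, suspended };

struct thread_sketch {
    state_sketch state = state_sketch::pending;
    std::size_t phase = 0;  // number of re-activations after suspension

    // First activation; assumed here not to count as a phase.
    void run() { state = state_sketch::active; }

    void suspend() {
        assert(state == state_sketch::active);
        state = state_sketch::suspended;
    }

    // Re-activation after a suspension bumps the phase counter.
    void resume() {
        assert(state == state_sketch::suspended);
        state = state_sketch::active;
        ++phase;
    }
};

// Runs a thread through two suspend/resume cycles and reports the phase.
inline std::size_t phase_after_two_suspensions() {
    thread_sketch t;
    t.run();
    t.suspend();
    t.resume();
    t.suspend();
    t.resume();
    return t.phase;
}
```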
nikunj has quit [Ping timeout: 248 seconds]
eschnett has joined #ste||ar
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
parsa has quit [Quit: Zzzzzzzzzzzz]
aserio has quit [Quit: aserio]
eschnett has quit [Quit: eschnett]
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar
<khuck> hkaiser: for the case at line 779, how do I get access to the created thread?
<khuck> hkaiser: (or more specifically, something that has access to the m_apex_data)
<khuck> hkaiser: I think I have it figured out...
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
<hkaiser> khuck: I don't think you have access yet, but you should be able to access the parent
<hkaiser> the thread object does not exist at that point yet
<khuck> hkaiser: right! I've got it, I think
<hkaiser> what we might need is to associate a 64-bit word with staged threads as well for you to keep track of things
<khuck> I am capturing the parent in the thread_init_data object, then assigning it to the thread in rebind/constructor.
<khuck> you can review my work when I am done. :)
<hkaiser> just be careful as the parent might not be alive anymore in rebind
<khuck> it's ok, because I already have what I need from it by then
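[editor's note] The approach khuck describes (copy what is needed from the parent into the init data while the parent is still alive, then hand it to the thread in the constructor or rebind) can be sketched as follows. Everything here is a simplified stand-in: `thread_data_sketch`, `thread_init_data_sketch`, `stage_thread` and `instantiate` are hypothetical names, and the profiler data is reduced to a 64-bit value as an assumption. It is not the actual HPX or APEX code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for illustration; not the real HPX types.
struct thread_data_sketch {
    std::uint64_t apex_data = 0;         // profiler handle for this thread
    std::uint64_t parent_apex_data = 0;  // copy of the parent's handle
};

struct thread_init_data_sketch {
    std::uint64_t parent_apex_data = 0;  // captured while the parent is alive
};

// Called when the staged thread description is created. The parent is
// still running at this point, so reading from it is safe.
inline thread_init_data_sketch stage_thread(thread_data_sketch const& parent) {
    thread_init_data_sketch init;
    init.parent_apex_data = parent.apex_data;
    return init;
}

// Called later (constructor or rebind). Only the init data is read;
// the parent object is never touched again.
inline thread_data_sketch instantiate(thread_init_data_sketch const& init,
                                      std::uint64_t own_apex_data) {
    thread_data_sketch child;
    child.apex_data = own_apex_data;
    child.parent_apex_data = init.parent_apex_data;
    return child;
}

// The parent is destroyed before the child is instantiated, yet the
// child still ends up with the parent's profiler handle.
inline std::uint64_t demo_parent_outlived() {
    thread_init_data_sketch init;
    {
        auto parent = std::make_unique<thread_data_sketch>();
        parent->apex_data = 42;
        init = stage_thread(*parent);
    }  // the parent object is destroyed here
    thread_data_sketch child = instantiate(init, 7);
    assert(child.apex_data == 7);
    return child.parent_apex_data;
}
```

Because the value is copied at staging time, nothing in `instantiate` depends on the parent's lifetime, which is the property hkaiser's warning about rebind is asking for.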
<hkaiser> k
Anushi1998 has quit [Remote host closed the connection]
Anushi1998 has joined #ste||ar
EverYoun_ has joined #ste||ar
EverYoung has quit [Ping timeout: 276 seconds]
parsa has joined #ste||ar
EverYoun_ has quit [Ping timeout: 265 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
parsa has quit [Quit: Zzzzzzzzzzzz]
EverYoung has joined #ste||ar
EverYoung has quit [Ping timeout: 255 seconds]
Anushi1998 has quit [Ping timeout: 256 seconds]
anushi has quit [Ping timeout: 255 seconds]
EverYoung has joined #ste||ar
EverYoung has quit [Remote host closed the connection]
EverYoung has joined #ste||ar