hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/ | GSoC: https://github.com/STEllAR-GROUP/hpx/wiki/Google-Summer-of-Code-%28GSoC%29-2020
jaafar has joined #ste||ar
hkaiser has joined #ste||ar
akheir has joined #ste||ar
<hkaiser> gnikunj[m]: nod, thanks
<hkaiser> does it work if the tcp pp is enabled?
<gnikunj[m]> yes
<gnikunj[m]> how do I know if it is using tcp or mpi in that case? iirc tcp is the default case
<hkaiser> if you use mpirun it will use the mpi pp
bita has joined #ste||ar
hkaiser has quit [Quit: bye]
shahrzad_ has joined #ste||ar
shahrzad has quit [Ping timeout: 256 seconds]
shahrzad_ has quit [Quit: Leaving]
bita has quit [Ping timeout: 260 seconds]
bita has joined #ste||ar
akheir has quit [Quit: Leaving]
Yorlik has quit [Ping timeout: 240 seconds]
nanmiao11 has quit [Remote host closed the connection]
bita has quit [Ping timeout: 260 seconds]
Pranavug has joined #ste||ar
<Pranavug> gnikunj: Are you currently using rostam cluster? What time will you be done?
<gnikunj[m]> Pranavug yes I have a script running on 15 nodes right now. It should be done in a couple of hours at best.
<gnikunj[m]> I'll notify you when I'm done
<Pranavug> gnikunj: Ok thanks. Please
<ms[m]> K-ballo: re std::result_of would you have time for a PR for that? we're not in a huge rush with the release
<gnikunj[m]> Pranavug: you can use the nodes now
Pranavug has quit [Ping timeout: 256 seconds]
<gnikunj[m]> Does anyone have experience working with recent PAPI versions? I can't find functions like PAPI_start_counters and PAPI_read_counters
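To the PAPI question above: in PAPI 6.0 the old high-level calls (`PAPI_start_counters`, `PAPI_read_counters`) were removed, and the low-level event-set API covers the same use case. A hedged sketch (the event name is just an example; requires linking against libpapi):

```cpp
#include <cstdio>
#include <cstdlib>
#include <papi.h>

int main()
{
    // Initialize the library; must be called before any other PAPI function.
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
    {
        std::fprintf(stderr, "PAPI init failed\n");
        return EXIT_FAILURE;
    }

    // The event-set API replaces PAPI_start_counters/PAPI_read_counters.
    int evset = PAPI_NULL;
    PAPI_create_eventset(&evset);
    PAPI_add_named_event(evset, "PAPI_TOT_INS");    // total instructions

    long long value = 0;
    PAPI_start(evset);
    // ... region of interest ...
    PAPI_stop(evset, &value);

    std::printf("instructions: %lld\n", value);
    return EXIT_SUCCESS;
}
```

PAPI 6.0 also ships a new high-level API (`PAPI_hl_region_begin`/`PAPI_hl_region_end`), but the event-set calls above give the closest drop-in for the removed functions.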
Pranavug has joined #ste||ar
<Pranavug> gnikunj: Thanks for informing
weilewei has quit [Ping timeout: 245 seconds]
<Pranavug> Hey, why are there so many Jenkins processes on rostam cluster?
<gonidelis[m]> When I create a ranges-algo overload the last argument is ` Proj&& proj = Proj()`
<gonidelis[m]> But when I call this ranges-algo in a test, this argument might be missing
<gonidelis[m]> Does the compiler automatically pass `projection_identity` for this argument, or is it a no-match case so the compiler fails?
Pranavug has quit [Quit: Leaving]
<rori> it will use whatever `Proj`'s default constructor produces, I think
<gonidelis[m]> So I need to specify projection_identity by myself as a default arg
<gonidelis[m]> so, it has a def-arg
<gonidelis[m]> and I don't need to pass it
<rori> 👍️
<gonidelis[m]> rori_[m]: thanks a lot!
<rori> it depends on how your Proj is created
<gonidelis[m]> ??
<rori> I'm not sure where `Proj` is defined but you should have your answer there ;)
diehlpk_work has quit [Ping timeout: 256 seconds]
hkaiser has joined #ste||ar
<hkaiser> hey ms[m]
<gonidelis[m]> hkaiser: I pushed the ranges::transform CPO ;)
<hkaiser> gonidelis[m]: \o/
<gonidelis[m]> hkaiser: I think it is safe to proceed on the BinaryOperation overloads :)
<gonidelis[m]> hkaiser: reminder: meeting in 1 hour ;)
<hkaiser> gonidelis[m]: if you feel comfortable to do that - sure!
<hkaiser> yes, I will be there
<hkaiser> need coffee first, though
<gonidelis[m]> hkaiser: Member's must indicate their personal coffee mug in order to be accepted in the meeting anyways
<gonidelis[m]> Memebers^^
<hkaiser> ok, deal ;-)
<ms[m]> hkaiser: hey! sorry didn't see your message yesterday in time
<hkaiser> np, it was late
<hkaiser> ms[m]: would you have time for a short(-ish) chat about hpx-kokkos later today or tomorrow?
<ms[m]> yeah, sure
<ms[m]> either is fine
<ms[m]> including right now
<hkaiser> right now doesn't work, sorry - need coffee
<ms[m]> np :P
akheir has joined #ste||ar
<hkaiser> would 10am/17.00 work? or tomorrow 9am/16.00?
<ms[m]> let's do it tomorrow morning (for you) then
<hkaiser> ok
<hkaiser> thanks a lot
<ms[m]> 👍️
<hkaiser> Katie will send a zoom link
<ms[m]> all right, thanks
<rori> may I join the hpx-kokkos meeting ? :D
diehlpk_work has joined #ste||ar
<hkaiser> rori: sure
<rori> gonidelis: meeting?
<gonidelis[m]> I am logging in right now
<rori> 👍️
Yorlik has joined #ste||ar
<hkaiser> hey Yorlik, welcome back!
<Yorlik> Heyo!
<Yorlik> Never been away - just lurking.
weilewei has joined #ste||ar
nanmiao11 has joined #ste||ar
bita has joined #ste||ar
<diehlpk_work> ms[m], Can we change rostam to a working cluster again?
<diehlpk_work> Currently, it is a build cluster and I am not sure this is what we want
<diehlpk_work> One thing we could do is make sure jenkins can not use all of our GPU nodes
<hkaiser> diehlpk_work: not sure what you mean by 'working cluster'? isn't it 'working'?
<diehlpk_work> hkaiser, It is working, but only for jenkins
<diehlpk_work> not for me because jenkins uses all nodes
<diehlpk_work> Just wanted to debug octotiger on rostam, but jenkins uses all cuda nodes
<hkaiser> diehlpk_work: akheir is working on making the jenkins jobs low priority, so that everybody should be able to quickly get access to nodes
<diehlpk_work> We might keep geev for us solely and jenkins can use bahram
<diehlpk_work> and we might keep at least one marvin and two medusa nodes for us that jenkins can not use
<diehlpk_work> So we could use these nodes for debugging and would not have to wait until jenkins is finished
<hkaiser> diehlpk_work: sure, we can do that - we talked about this with akheir last meeting, I believe
<diehlpk_work> I can mention it again tomorrow
<hkaiser> pls do
<diehlpk_work> At least having some nodes available that jenkins can not allocate would be a first step
<ms[m]> diehlpk_work: yes, it's not meant to make life horrible for interactive users, this is just the initial configuration
<ms[m]> let's try what akheir has in mind first, and if that isn't enough we can try to change it further
<diehlpk_work> Ok, I hope it will become better
<diehlpk_work> At least I can apply for QB and run my code there
<diehlpk_work> hkaiser, How do I apply for QB?
<ms[m]> yeah, indeed, please remind us if things don't improve
<hkaiser> diehlpk_work: apply for a loni account, put me in as the sponsor
<diehlpk_work> Ok, I will do that
<hkaiser> then either use Dominic's allocation or apply for one yourself
<hkaiser> startup allocations are easy to apply for and will be approved immediately, I think
<diehlpk_work> I think I will apply without Dominic, since I'd like to have some time to run the peridynamic code at a large scale
<diehlpk_work> So we have time for octotiger and my code
<akheir> ms[m]: There is a problem with some Jenkins runs which I haven't figured out yet. The jobs hang and slurm cannot release the node.
<diehlpk_work> akheir, Can we exclude geev from the jenkins runs?
<diehlpk_work> So we have at least one cuda node available?
<akheir> yes, I will do that today
<diehlpk_work> Same for one marvin node and two or three medusa nodes?
<hkaiser> let's create special partitions on rostam to be used by jenkins
<ms[m]> is it possible to have separate partitions for jenkins (cpu-only and gpu) that are a subset of the other partitions?
<hkaiser> nod
<hkaiser> akheir: ^^
<diehlpk_work> Can we have a debug queue as well? So we can get a small allocation (15 minutes) with a higher priority?
<akheir> diehlpk_work: That would be too much fragmentation of the partitions. Jenkins jobs won't take that long; lower priority should solve the problem
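The overlapping lower-priority partition discussed above could look roughly like this in `slurm.conf` (a sketch only: node lists are made up from the machine names mentioned in the channel, and the exact priority scheme is up to the admin):

```
# Interactive partitions keep the higher PriorityTier.
PartitionName=cuda    Nodes=geev,bahram[01-04]  PriorityTier=10
PartitionName=medusa  Nodes=medusa[01-16]       PriorityTier=10

# Jenkins partitions reuse a subset of the same nodes at lower priority,
# so CI jobs yield to interactive allocations.
PartitionName=jenkins-gpu  Nodes=bahram[01-04]   PriorityTier=1
PartitionName=jenkins-cpu  Nodes=medusa[03-16]   PriorityTier=1
```

With preemption disabled this only affects scheduling order, not running jobs, which matches the "low priority, no fragmentation" approach akheir describes.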
<hkaiser> ok, cool
<ms[m]> 👍️
<diehlpk_work> akheir, Ok, anything that improves the current situation will be appreciated
<ms[m]> akheir: not sure what I can do about the hung jobs, the jenkins interface doesn't show anything until jobs are completed
<akheir> we don't have enough nodes for increasing the number of queues to make an impact.
<ms[m]> if you have any ideas on what I could look at I'm all ears (or if you have access to logs for those jobs that you'd like me to have a look at)
<akheir> ms[m]: I have to investigate; slurm complains about open I/O files and cannot release the node, and the only way to cancel the job is to reboot the node. I think this is the main reason for the complaints
<akheir> I have to find a fix for this problem first
<ms[m]> hmm, ok
<ms[m]> I'll try to dig around and see if there's anything that looks like it could cause that
<tiagofg[m]> hkaiser Hello! Regarding the inheritance issue, I would like to know if the problem has a solution or not, I'm finishing my master's thesis and I really needed that information to know if I have to rewrite the code in another way or not
<tiagofg[m]> For you guys it must be a simple thing, I guess.
karame_ has joined #ste||ar
<hkaiser> tiagofg[m]: you wanted to create a small example I could look at
<karame_> hkaiser Could you please send me the zoom ID.
<akheir> ms[m]: I didn't see your comment about the special partition for Jenkins. Yes, that's the way I have to configure the nodes. In slurm the queue and partition are the same; in order to have a lower priority queue we have to create new partitions
shahrzad has joined #ste||ar
<tiagofg[m]> hkaiser: sure
<hkaiser> tiagofg[m]: will look
<ms[m]> akheir: 👍️
<tiagofg[m]> hkaiser: the main function is on the same repository. Thanks!
shahrzad has left #ste||ar [#ste||ar]
akheir has quit [Quit: Leaving]
akheir has joined #ste||ar
shahrzad has joined #ste||ar
shahrzad has quit [Client Quit]
shahrzad has joined #ste||ar
<weilewei> hkaiser see DM, please
<hkaiser> weilewei: sec
<weilewei> ok
shahrzad has quit [Remote host closed the connection]
shahrzad has joined #ste||ar
<hkaiser> tiagofg[m]: yt?
<K-ballo> is the inheritance puzzle solved?
bita has quit [Read error: Connection reset by peer]
<hkaiser> K-ballo: not yet, I need to find out what the puzzle was, first