aserio changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
Smasher has quit [Quit: 2ProShells BNC Powering down]
Smasher has joined #ste||ar
<jbjnr>
ouch - during preprocess chunking is disabled,
<jbjnr>
so the size is computed for an unoptimized serialization
<hkaiser>
jbjnr: there you go
<jbjnr>
but why?
<jbjnr>
can there really be so many bugs?
<jbjnr>
Anton said the serialization was worse ...
<hkaiser>
worse in what sense?
<hkaiser>
jbjnr: so far we didn't care what we calculated for the required size of the serialization buffer, as long as it was large enough
<hkaiser>
so it's not really a bug, I guess
<jbjnr>
hkaiser: worse in the sense that tests he ran in the past performed better than recent ones
<hkaiser>
jbjnr: I fixed that
<jbjnr>
I am dumping out stuff and the flags are set correctly, so there's a bug somewhere in the checking ....
<jbjnr>
I'd better sleep and fix this tomorrow. thanks for the help.
denis_blank has quit [Quit: denis_blank]
<diehlpk>
hkaiser, Is there any attempt to write a book about hpx?
<diehlpk>
I was thinking about the handbook of hpx, where different people contribute with chapters.
<hkaiser>
diehlpk: heh
<hkaiser>
diehlpk: heller_ talked about this - half jokingly
<hkaiser>
other than that, no attempts have been made
<diehlpk>
Ok, if we are interested we could start to collect ideas for chapters.
<hkaiser>
absolutely!
<diehlpk>
should I send a mail to our mailing list?
<hkaiser>
:D
<hkaiser>
stellar-internal?
<hkaiser>
might be even better to make it a personal email to people you think would be interested
<diehlpk>
Ok, so I think you, heller, adrian?
<hkaiser>
John? Bryce? Pat? Kevin?
<diehlpk>
Ok, I will ask them
<hkaiser>
ok
<hkaiser>
thanks!
<diehlpk>
You are welcome.
<diehlpk>
Mail was sent just now
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
ajaivgeorge has quit [Ping timeout: 255 seconds]
diehlpk has quit [Ping timeout: 240 seconds]
pree_ has joined #ste||ar
pree_ has quit [Remote host closed the connection]
<taeguk>
I don't know the vectorpack execution policy exactly, but I have found the above.
<taeguk>
Is there anyone who can explain this to me?
jaafar has quit [Ping timeout: 245 seconds]
Smasher has quit [*.net *.split]
wash[m] has quit [*.net *.split]
Smasher has joined #ste||ar
wash[m] has joined #ste||ar
shoshijak has joined #ste||ar
bikineev has joined #ste||ar
shoshijak has quit [Ping timeout: 255 seconds]
david_pf_ has joined #ste||ar
shoshijak has joined #ste||ar
bikineev has quit [Remote host closed the connection]
bikineev has joined #ste||ar
bikineev has quit [Remote host closed the connection]
shoshijak has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
<jbjnr>
taeguk: I think you'll find that support for vectorized types is incomplete, so if you find support in one place but not in another, it's just because it hasn't been implemented or tested yet
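For reference, a minimal sketch of what the vectorpack ("datapar") policy looks like on an algorithm that does already support it - assuming HPX is built with HPX_WITH_DATAPAR (Vc backend); the header and policy names here follow the HPX 1.0-era API and may be spelled differently in other versions:

    // Sketch only: datapar support is incomplete, so not every algorithm
    // accepts this policy.
    #include <hpx/hpx_main.hpp>
    #include <hpx/include/datapar.hpp>
    #include <hpx/include/parallel_for_each.hpp>

    #include <vector>

    int main()
    {
        std::vector<float> v(1000, 1.0f);

        // With the datapar policy the generic lambda is instantiated with a
        // vector-pack type (e.g. a Vc SIMD vector) instead of plain float.
        hpx::parallel::for_each(
            hpx::parallel::execution::datapar, v.begin(), v.end(),
            [](auto& x) { x = x * 2.0f + 1.0f; });

        return 0;
    }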
<jbjnr>
bikineev: yt?
shoshijak has joined #ste||ar
bikineev has quit [Ping timeout: 268 seconds]
<heller_>
jbjnr: hmm, the chunks should get forwarded properly
<heller_>
The flags are taken from the parcelport
<heller_>
And 'utterly broken' is a slight overstatement...
<jbjnr>
(and saves a lot of unnecessary allocation when doing osu latency with large sizes)
<heller_>
The first line should check if we have chunking enabled
<jbjnr>
etc etc
<heller_>
Sure
<heller_>
It used to work :/
<jbjnr>
I'm going over a lot of the code now anyway
<jbjnr>
I have had a disaster
<heller_>
How so?
<jbjnr>
My thesis extension request was denied by the faculty, so I'm now 3 months past my deadline and have to submit immediately - I need to get this code written up and submit a paper on Friday
<jbjnr>
The dept approved it in January, but the faculty have only now rejected it - 3 months after the deadline passed
<heller_>
[10:53:46] <jbjnr> during preprocess, the chunker is null and so all the optimizations are disabled
<heller_>
[10:54:12] <jbjnr> leaving that check out doesn't seem to change anything, but now the size is correct
<hkaiser>
why is that not correct?
<heller_>
gtg, will be back in about an hour
<josef__k>
OK, my test program compiles, runs, computes the correct result, but is not parallelized. :\ I will pastebin some code shortly, but when adapting from std::inner_product to hpx::parallel::transform_reduce, do I need to return futures from my binary operators?
<hkaiser>
heller_: shrug, do we have a test enforcing this behaviour you claim I have broken?
<hkaiser>
josef__k: have you passed any command line options?
<josef__k>
hkaiser: No.
<hkaiser>
nod
<heller_>
hkaiser: it's not my claim
<hkaiser>
josef__k: HPX runs on one core if not instructed otherwise
<josef__k>
hkaiser: Ohhh. :)
<hkaiser>
heller_: shrug
<hkaiser>
josef__k: use the command line option --hpx:threads=N (-tN)
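To make the inner_product -> transform_reduce adaptation concrete, a minimal sketch (the header name and the C++17-style argument order init/reduce/transform are assumptions - early HPX followed the Parallelism TS, which put the transform op before init, so check your version). Note the operators return plain values, not futures; only a task policy makes the algorithm itself return a future:

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/parallel_transform_reduce.hpp>

    #include <vector>

    int main()
    {
        std::vector<double> v(1000000, 1.0);

        // Sum of squares; runs on the worker threads given via
        // --hpx:threads=N (one core if not instructed otherwise).
        double sum = hpx::parallel::transform_reduce(
            hpx::parallel::execution::par, v.begin(), v.end(),
            0.0,                                          // initial value
            [](double a, double b) { return a + b; },     // reduction op
            [](double x) { return x * x; });              // transform op

        return sum == 1000000.0 ? 0 : 1;
    }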
shoshijak has quit [Ping timeout: 255 seconds]
<heller_>
hkaiser: let's just create a test case, tbh, I have no idea what's broken
<hkaiser>
me neither
<K-ballo>
heller_: I talked to EricWF and he confirmed recentish versions of libc++ support exception_ptr on windows too
<K-ballo>
we could switch to <exception> directly, and only introduce a compat layer for it if we hit a platform without the required compiler support/library integration
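For context, the facility in question is just standard C++11 <exception>; the compat layer only mattered on platforms whose standard library lacked it (e.g. older libc++ on Windows):

    #include <exception>
    #include <stdexcept>

    int main()
    {
        std::exception_ptr ep;
        try { throw std::runtime_error("boom"); }
        catch (...) { ep = std::current_exception(); }   // capture

        // The captured exception can be transported (e.g. into a future)
        // and rethrown later, possibly on another thread:
        try { if (ep) std::rethrow_exception(ep); }
        catch (std::exception const&) { return 0; }      // what() == "boom"
        return 1;
    }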
<josef__k>
Ah, the mysteries of parallelization; it's slower when threads=#CPUs vs single-threaded.
<hkaiser>
josef__k: lol
<hkaiser>
josef__k: happens to all of us
<hkaiser>
try using a smaller number of cores
<josef__k>
Granted, the benchmark library may be interfering.
<hkaiser>
also, recompile HPX using cmake -DHPX_WITH_THREAD_IDLE_RATES=On ...
<hkaiser>
josef__k: that will enable a performance counter which is very useful in analysing parallelization
<josef__k>
It's only somewhat slower, about 20%. When I tried OpenMP on similar code, it was 100% slower.
<hkaiser>
josef__k: but it shouldn't be _slower_
<josef__k>
Mmmm, yes.
<josef__k>
The code repository is here if you're interested: https://github.com/jeremy-murphy/programming in the statistics.hpp file. It's just a whole lot of variations on the same algorithm.
<hkaiser>
josef__k: try recompiling hpx with that flag ^^
<josef__k>
hkaiser: Is that recompiling the HPX library or just my code?
<hkaiser>
that will enable a performance counter we call idle-rate - a very nice overarching measure of how well the application is parallelized
<hkaiser>
hpx library
<hkaiser>
this counter adds some overhead so we disable it by default
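Once HPX is rebuilt with -DHPX_WITH_THREAD_IDLE_RATES=On, the counter can be printed at shutdown via --hpx:print-counter=/threads{locality#0/total}/idle-rate (counter path as in the HPX counter documentation of that era); the value is reported in hundredths of a percent, so e.g. 2500 means the worker threads sat idle 25% of the time.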
<josef__k>
OK.
<josef__k>
I will add that flag to the HPX library another time, just not right now.
<hkaiser>
as you wish
<josef__k>
It's bed time here. :)
<josef__k>
Thanks for your help again though.
shoshijak has joined #ste||ar
ajaivgeorge has quit [Ping timeout: 246 seconds]
bikineev has quit [Read error: No route to host]
bikineev has joined #ste||ar
<jbjnr>
hkaiser: (+heller fyi) it's not so much that the serialization is broken, but during the preprocess pass, the size is calculated - with the chunker set to null, the optimizations are discarded, so the preprocess pass returns a size that assumes all vector/rma chunks are added to the buffer directly and not chunked, so the size is larger than it needs to be. for the vector version, it's not a...
<jbjnr>
...big deal (a slight waste of a malloc), but for the rma version the memory must be taken from the pinned pool or registered on the fly, and that's where it costs.
<jbjnr>
I've made a better fix anyway, which I'll do a PR for on its own.
<jbjnr>
sorry - two fixes: one is just commenting out the chunker = nullptr check, the other is the size calculation of the buffer, which I've improved
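A hypothetical illustration of the accounting problem described above (all names invented for illustration; this is not HPX's serialization code): with chunking enabled, a large array contributes only a small descriptor to the main buffer, so a preprocess pass that runs with the chunker nulled out over-counts the required size:

    // Hypothetical sketch only -- not HPX's actual serialization code.
    #include <cstddef>

    struct chunk_descriptor { void const* data; std::size_t size; };

    // Size the preprocess pass would report for one array of `bytes` bytes.
    std::size_t preprocessed_size(std::size_t bytes, bool chunking_enabled)
    {
        std::size_t const zero_copy_threshold = 128;  // illustrative cutoff

        if (chunking_enabled && bytes >= zero_copy_threshold)
            // Only a small descriptor goes into the main buffer; the
            // payload travels out-of-band as a zero-copy (or RMA) chunk.
            return sizeof(chunk_descriptor);

        // With the chunker set to nullptr the optimization is discarded and
        // the whole payload is counted inline -- the mismatch seen above.
        return bytes;
    }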
ajaivgeorge has joined #ste||ar
<heller_>
jbjnr: a test for that would be great
<jbjnr>
heller_: I'll see what I can do
<hkaiser>
the chunker == nullptr check makes Anton's serialization tests run properly, otherwise they just segfault
<jbjnr>
ooh
<jbjnr>
that's bad
<jbjnr>
was it added recently?
<jbjnr>
hkaiser: ^
josef__k has quit [Ping timeout: 240 seconds]
<jbjnr>
I'll see if I can find a less destructive fix anyway. I was going to test Anton's serialization stuff in any case
<hkaiser>
yes, I added that change to fix Anton's tests
<hkaiser>
we need to agree on how we want the archive to be used and adapt all the code using it appropriately
<jbjnr>
yes, I suspect Anton's test is doing something slightly unexpected ...
<jbjnr>
one nice thing would be to add an assert that the container size matches the archive size - but during preprocess a dummy container is used, so it is tricky
<jbjnr>
I'm just trying to track down an unrelated rma bug, then I'll come back to this.
diehlpk_work has joined #ste||ar
<hkaiser>
jbjnr: ok, sounds good - let's talk this out though before you change things
<heller_>
during preprocessing, the chunking flag should have been set explicitly
<heller_>
if enabled in the PP that is
<jbjnr>
the flags are set correctly, but the null chunker check causes the optimizations to be disabled
<jbjnr>
hence the mismatch
<heller_>
yeah
<jbjnr>
I only noticed it because my rma PP logging flagged allocations that were 'unexpected'
<heller_>
good catch
<jbjnr>
(and because my new rma serialization test was slower than expected)
<hkaiser>
jbjnr: doesn't it make sense?
<hkaiser>
if there is no chunker, then the optimizations have to be disabled
eschnett has quit [Quit: eschnett]
<hkaiser>
instead of allowing for the parameters to be mismatched I'd rather we supplied a dummy chunker
<heller_>
or chunk already during preprocessing
<heller_>
IIRC, the preprocessing already calculates the number of chunks needed
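One way to read the dummy-chunker suggestion (a hypothetical null-object sketch; HPX's real chunker interface differs): instead of a nullptr check that silently disables the optimizations, the preprocess pass gets a chunker that counts chunks exactly like the real one but stores nothing, so both passes agree on the size:

    // Hypothetical null-object sketch -- interface names invented for
    // illustration.
    #include <cstddef>

    struct chunker
    {
        virtual ~chunker() = default;
        // Record a zero-copy chunk; returns bytes added to the main buffer.
        virtual std::size_t add_chunk(void const* data, std::size_t size) = 0;
    };

    // Used during the preprocess (sizing) pass: takes the same code path as
    // the real chunker, but stores nothing.
    struct counting_chunker : chunker
    {
        std::size_t num_chunks = 0;
        std::size_t add_chunk(void const*, std::size_t) override
        {
            ++num_chunks;
            return sizeof(void*) + sizeof(std::size_t);  // descriptor only
        }
    };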
<hkaiser>
sorry gtg now
hkaiser has quit [Quit: bye]
aserio has joined #ste||ar
<github>
[hpx] sithhell pushed 1 new commit to lf_multiple_parcels: https://git.io/vHVYx
<github>
hpx/lf_multiple_parcels 5ac0d87 Thomas Heller: Merge branch 'master' into lf_multiple_parcels...
<heller_>
jbjnr: ^^ this should be ready now
ajaivgeorge has quit [Read error: Connection reset by peer]
<aserio>
heller_: are you interested in doing the Journal paper?
<aserio>
heller_: also, how is writing going?
<heller_>
aserio: is the journal publishing my thesis?
<heller_>
aserio: I have an EU review on Wednesday, no time to write till then :(
<aserio>
heller_: that is for you to determine, I am simply trying to put a meeting together :)
ajaivgeorge has joined #ste||ar
<heller_>
aserio: tell me more
<aserio>
I don't know what an EU review is, but it doesn't sound good
<aserio>
heller_: This is the journal paper that comes from Patrick's publication in ParNum
<heller_>
ahh, that one
<aserio>
heller_: if this is in German I will come through your machine to pop you
<heller_>
aserio: the European Commission is doing reviews of the projects they fund after 18 months to see if their money is well spent
<heller_>
aserio: learn another language you ignorant fool ;)
<aserio>
ah, is it just a report or do you have a physical reviewer
<aserio>
:p make me
<heller_>
physical reviews
<heller_>
we have a whole-day meeting on Wednesday where we present the project's progress
<aserio>
ewww
<heller_>
in brussels
<heller_>
exactly
<aserio>
I liked the city well enough though
<heller_>
i like the city as well
<heller_>
and the review makes sense
<heller_>
it's just a lot of effort
<github>
[hpx] hkaiser pushed 1 new commit to serialization_access_data: https://git.io/vHVGA
<github>
hpx/compat-exception b574c11 Agustin K-ballo Berge: Remove compatibility layer for std::exception_ptr, mark support as required
<K-ballo>
heller_: I'll leave the compat layer as part of the history ^ so that it's there if we ever happen to need it
<heller_>
K-ballo: thanks
bikineev has quit [Remote host closed the connection]
mcopik has joined #ste||ar
pree has quit [Quit: AaBbCc]
aserio has quit [Ping timeout: 255 seconds]
bikineev has joined #ste||ar
mcopik has quit [Ping timeout: 255 seconds]
aserio has joined #ste||ar
hkaiser has quit [Quit: bye]
denis_blank has quit [Quit: denis_blank]
mcopik has joined #ste||ar
hkaiser has joined #ste||ar
<aserio>
hkaiser: yt?
<hkaiser>
aserio: here
<aserio>
hkaiser: So Dominic and I talked and it looks like he has a race condition
<hkaiser>
did he say whether my fixes solved his problems?
<aserio>
The future<future<void>> work is on a branch, right?
<aserio>
It looks like the changes reduce the frequency of the issue
<hkaiser>
yes
<aserio>
I told him that we should brainstorm some ways of searching for the error over the next few days
<hkaiser>
ok
aserio has quit [Ping timeout: 246 seconds]
eschnett has quit [Quit: eschnett]
aserio has joined #ste||ar
aserio has quit [Quit: aserio]
aserio has joined #ste||ar
eschnett has joined #ste||ar
aserio has quit [Quit: aserio]
<jbjnr>
hkaiser: yt?
hkaiser has quit [Read error: Connection reset by peer]
hkaiser has joined #ste||ar
<hkaiser>
jbjnr: here
<jbjnr>
hkaiser: when we serialize our args into a parcel, the rma_vector can create a chunk of type 2 (an rma chunk) and store the pinned memory info, which is picked up by the parcelport and used for rma operations. all is well, but ...
<jbjnr>
when the data is received into a new rma chunk, and handed over to the archive to be deserialized - I don't have access to the chunk info directly
<jbjnr>
how can I access the chunk structure when I am reading my rma object out of the archive?
<hkaiser>
shouldn't the chunker do that?
<jbjnr>
if I can do that, then I can access the memory region handle I put in there
<jbjnr>
how can I get the chunker then?
<hkaiser>
let me look
<hkaiser>
how do you create the special chunk during serialization?
<jbjnr>
I only see functions like "void load_binary_chunk(void* address, std::size_t count) // override" in input_container - but no chunk info is there
<jbjnr>
when storing, save_binary_chunk actually stores the chunk directly, but not when loading
<hkaiser>
could you point me to the code you're looking at, pls?
<jbjnr>
so I create a new kind of chunk with the rma info in it, and the parcelport bypasses memory registration. works great
<jbjnr>
but in the inverse operation I do not see the chunks
<hkaiser>
ok, so you changed the output archive to call a new function which is creating the rma chunk
<jbjnr>
yes
<jbjnr>
for rma types
<jbjnr>
it's a new specialization/overload set
<hkaiser>
nod
<hkaiser>
for special types you call save_rma_chunk instead of save_binary_chunk
<jbjnr>
yes
<hkaiser>
that means you should do the same on receive
<hkaiser>
for the same special types you call load_rma_chunk instead of load_binary_chunk
<hkaiser>
shouldn't that do the trick?
<jbjnr>
yes, but I only get the pointer and size, and not the extra stuff I stored in the chunk
<jbjnr>
the rma handles etc
<jbjnr>
I need these for memory management
<hkaiser>
why do you see only the pointer and the size?
<hkaiser>
shouldn't the sending/receiving of the chunks pass your additional information along?
<jbjnr>
the chunker has the received rma data (and a receive handle, which is not the same as the one sent from the other end), but I don't quite know how to get the chunk handle stuff from inside load_rma_chunk
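A hypothetical shape for what is being asked for here (modelled on the save_binary_chunk/load_binary_chunk pair named above; the rma variant and the chunk fields are sketched, not HPX API): let the input container hand back the whole chunk record rather than just pointer and size, so the RMA handle survives deserialization:

    // Hypothetical sketch -- field and function names invented to mirror
    // the discussion, not HPX's actual input_container interface.
    #include <cstddef>
    #include <cstdint>

    // chunk record extended with the RMA metadata stored on the send side
    struct rma_serialization_chunk
    {
        void*         address;   // where the received data lives
        std::size_t   size;      // payload size
        std::uint64_t rma_key;   // remote-access key / region handle
        void*         region;    // pinned-memory region, for deallocation
    };

    struct input_container
    {
        // existing interface: only pointer + size reach the caller
        void load_binary_chunk(void* address, std::size_t count);

        // sketched extension: expose the whole chunk so load_rma_chunk can
        // recover the region handle instead of copying out of pinned memory
        rma_serialization_chunk const& load_rma_chunk(std::size_t count);
    };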