#ste||ar on 2019-05-11 — irc logs at irclog.cct.lsu.edu

2018-08-26 23:03 hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/

00:23 nikunj has joined #ste||ar

00:23 <nikunj> hkaiser, yt?

00:24 <hkaiser> nikunj: here

00:24 <nikunj> hkaiser, I was thinking to implement the vote function that you were talking of

00:25 <nikunj> would you want me to continue with that or should I run some benchmarks?

00:25 <nikunj> coz I think we have expected graphs as of now

00:26 <nikunj> so we should move forward with implementing the vote function

00:28 <hkaiser> nikunj: that would be a second set of replicate APIs, right?

00:28 <nikunj> so you want me to make 2 sets of replicate API?

00:29 <nikunj> one that returns the first ans and one that votes?

00:29 <hkaiser> well, one as it is and one with an arbiter function

00:29 <hkaiser> both with async and dataflow variations

00:30 <nikunj> we can do that

00:30 <hkaiser> nikunj: if you do that you will have to run yet another set of benchmarks, though ;-)

00:30 <nikunj> we will then have 2 sets of replicates with 2 variations each (normal and validate)

00:31 <nikunj> hkaiser, I'm used to running benchmarks now ;-)

00:31 <hkaiser> ok

00:31 <nikunj> but I think the voting function is important

00:31 <hkaiser> feel free to do that, then

00:31 <nikunj> also, in the meantime should I run the current benchmarks that I have in the PR?

00:31 <hkaiser> let's discuss the API before you start implementing things

00:32 <nikunj> ok, then I'll wait till next thursday

00:32 <hkaiser> I'd suggest you write the test first demonstrating how things should look and only then start implementing things

00:32 <hkaiser> we can discuss things over email or here before that
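The two API sets discussed above (replicate as it is, and replicate with an arbiter function) could look roughly like the following. This is a minimal Python sketch, not the HPX API: the names `replicate`, `replicate_vote` and `majority` are hypothetical, a thread pool stands in for async execution, and the dataflow and validate variations are not shown.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def replicate(n, f, *args):
    """Launch n replicas of f and return the first result (in submission
    order) that did not raise. Raises RuntimeError if every replica failed."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(f, *args) for _ in range(n)]
        last_error = None
        for fut in futures:
            try:
                return fut.result()
            except Exception as exc:  # a failed replica; try the next one
                last_error = exc
        raise RuntimeError("all replicas failed") from last_error


def replicate_vote(n, vote, f, *args):
    """Launch n replicas of f, collect every result that did not raise,
    and let the arbiter function `vote` pick the answer."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(f, *args) for _ in range(n)]
    results = []
    for fut in futures:
        try:
            results.append(fut.result())
        except Exception:
            pass  # failed replicas simply do not get a vote
    if not results:
        raise RuntimeError("all replicas failed")
    return vote(results)


def majority(results):
    """Example arbiter: return the most common result."""
    return Counter(results).most_common(1)[0][0]
```

A test written first, as suggested, would then call `replicate(3, task)` and `replicate_vote(3, majority, task)` with a task that is known to fail or return a wrong value on some calls.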

00:33 <nikunj> btw did you understand the bumps we were seeing for replicate?

00:33 <nikunj> I was expecting almost similar times for all the error rates

00:33 <nikunj> with a bit of overhead with ones throwing exceptions

00:33 <hkaiser> nikunj: how many runs have you averaged there?

00:34 <nikunj> it's just one

00:34 <nikunj> and that too when I was about to leave

00:34 <hkaiser> nod, that explains it

00:34 <nikunj> I just wanted to get an idea of how things looked after I changed the code

00:34 <hkaiser> but I would expect constant execution times, indeed

00:34 <nikunj> yes, that is what I expected too

00:35 <nikunj> I guess we'll have to play with execution times to get a better picture

00:35 <nikunj> besides I was reading about the exponential distribution and it works on a probability distribution

00:36 <nikunj> P(x|y) = y*exp(-x*y)

00:36 <nikunj> so what we do there is set y for the probability distribution, where y can be a set time after which a certain event is meant to occur

00:37 <nikunj> and then we get a function where we can identify the probability of occurrence of x given the value of y

00:38 <nikunj> so basically if we plug in y = 3 and x = 3 (that's what my benchmark is doing), the probability of occurrence of a number greater than 3 in a run with about 10000 threads is about 4 only

00:39 <nikunj> and the reason why we get the exponential curve for the graph is because there is an exponential relation between x and the probability function, so if we decide the error generation as 2 and put x = 2 we will have exponentially more

00:40 <nikunj> since it will then be 2*exp(-4) compared to 3*exp(-9)

00:40 <nikunj> that's why we see an exponential time difference in executions

00:41 <hkaiser> nod
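A note on the arithmetic above: y*exp(-x*y) is the density of an exponential distribution with rate y, while the probability of drawing a value greater than x is the tail exp(-x*y). For y = 3 and x = 3 the density times 10000 is about 3.7 (the "about 4" quoted above), and the expected number of draws above 3 out of 10000 is about 1.2. Both quantities grow exponentially as the rate and threshold drop, which is the trend described. A minimal Python sketch with hypothetical function names, assuming the benchmark draws its values from an exponential distribution as discussed:

```python
import math
import random


def exp_pdf(x, rate):
    """Density of the exponential distribution: rate * exp(-rate * x)."""
    return rate * math.exp(-rate * x)


def exp_tail(x, rate):
    """Probability that a draw exceeds x: exp(-rate * x)."""
    return math.exp(-rate * x)


def count_exceeding(threshold, rate, samples, seed=0):
    """Empirically count draws above the threshold."""
    rng = random.Random(seed)
    return sum(rng.expovariate(rate) > threshold for _ in range(samples))


if __name__ == "__main__":
    for rate in (3, 2):
        x = rate  # threshold equal to the rate, as in the discussion
        print(rate, exp_pdf(x, rate) * 10000, exp_tail(x, rate) * 10000,
              count_exceeding(x, rate, 10000))
```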

00:41 <nikunj> now what we could do is that we could allow the user to add an average rate of occurrence of some event

00:41 <nikunj> and then control the x ourselves

00:42 <nikunj> or keep the x same as the one that user enters i.e. the average rate of occurrence of some event

00:42 <nikunj> if we control x ourselves then we can decide on the number of failing threads, else it will be proportional to error_rate*exp(-error_rate^2)

00:43 <nikunj> we could do it either way you want

00:44 <nikunj> the math behind it is pretty simple and I ran an alternative program to test if my thinking was correct

00:50 <nikunj> so what should I go ahead with?

00:51 <hkaiser> we need to control the average time between errors

00:52 <nikunj> then what we could do is to keep x = 1 so that the user will know that the probability of failure will be equal to exp(-error_rate)

00:53 <nikunj> this way for an error_rate specified as y, the probability of a number generated being greater than 1 will be exp(-y)

00:53 <hkaiser> right

00:54 <nikunj> so we will have a very clear function translation

00:54 <nikunj> and the user should not have any issues identifying with exponential probabilities
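The mapping agreed on above (threshold fixed at x = 1, so a task fails with probability exp(-error_rate)) can be sketched as follows. This is a minimal Python illustration with hypothetical names, assuming a failure is injected whenever an exponentially distributed draw exceeds the threshold. Note that a smaller error_rate means more failures.

```python
import math
import random

THRESHOLD = 1.0  # the fixed x from the discussion


def failure_probability(error_rate):
    """Probability that a single task fails: exp(-error_rate)."""
    return math.exp(-error_rate * THRESHOLD)


def should_fail(error_rate, rng):
    """Draw from an exponential distribution with the given rate and
    report a failure if the draw exceeds the threshold."""
    return rng.expovariate(error_rate) > THRESHOLD


if __name__ == "__main__":
    rng = random.Random(0)
    for rate in (0.5, 1.0, 2.0, 5.0):
        observed = sum(should_fail(rate, rng) for _ in range(10000)) / 10000
        print(rate, failure_probability(rate), observed)
```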

00:55 <nikunj> alright then, I will make this change

00:55 <nikunj> would you want me to benchmark once I've changed that?

00:55 <nikunj> or is there something else you should look into first?

00:57 <hkaiser> well, we talked about the parameters to change for the benchmarks

00:57 <hkaiser> let's see what we get

00:57 <nikunj> alright, I'll run for the parameters we discussed and generate some graphs

00:58 <nikunj> I'll mail you those graphs I generate and we can pick from there

01:10 rishabh_bansal11 has quit [Quit: Connection closed for inactivity]

01:25 <nikunj> hkaiser, would you want execution times in microseconds?

01:25 <nikunj> or should I keep them in milliseconds?

01:25 <hkaiser> you mean the artificial delay?

01:26 <nikunj> yes the artificial time for any thread execution

01:26 <hkaiser> I think we should run with 5us, 50us, 500us, and 5ms

01:26 <nikunj> alright, I'll run it with those, I'll add another 2ms in between just for a better comparison

01:27 <hkaiser> sure, feel free to add that
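The delay sweep above, combined with averaging over repeated runs rather than a single one, could be driven by something like the following. This is a minimal Python sketch under stated assumptions: the busy-wait task is a hypothetical stand-in for the benchmark's artificial delay, the repeat count of 20 is an illustrative default, and the other benchmark parameters are not modeled here.

```python
import statistics
import time

# Artificial delays from the discussion, in microseconds:
# 5us, 50us, 500us, 2ms, 5ms.
DELAYS_US = [5, 50, 500, 2000, 5000]
REPEATS = 20  # assumed repeat count; averaging smooths single-run noise


def busy_wait(delay_us):
    """Spin until at least delay_us microseconds have elapsed."""
    end = time.perf_counter() + delay_us / 1e6
    while time.perf_counter() < end:
        pass


def measure(task, repeats=REPEATS):
    """Run task `repeats` times; return (mean, stdev) of the wall time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    for delay in DELAYS_US:
        mean, spread = measure(lambda: busy_wait(delay))
        print(f"{delay:>5} us: mean {mean * 1e6:.1f} us, stdev {spread * 1e6:.1f} us")
```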

01:30 <nikunj> done

01:30 <nikunj> I'll start the benchmarks now

01:32 <nikunj> one final thing, would you want me to keep the n-value high so that all tests pass? or keep it low and let some tests fail

01:33 <nikunj> coz we will see tests failing if n is low and error-rate is set to pretty low (which translates to high amount of errors)

01:33 <nikunj> on the downside, keeping n-value high will make replicate run for longer too

01:34 <nikunj> but I encountered multiple failing instances when I kept n-value low for replicate
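The trade-off described above can be put in numbers. Assuming replicas fail independently with probability exp(-error_rate), and that replicate only fails when all n replicas fail, the chance of a failed replicate call is exp(-error_rate * n). A minimal Python sketch with hypothetical helper names:

```python
import math


def replicate_failure_probability(error_rate, n):
    """Probability that all n independent replicas fail."""
    p_single = math.exp(-error_rate)
    return p_single ** n


def min_replicas(error_rate, target):
    """Smallest n for which the all-replicas-fail probability is at most
    `target` (0 < target < 1), i.e. n >= -ln(target) / error_rate."""
    return max(1, math.ceil(-math.log(target) / error_rate))


if __name__ == "__main__":
    for rate in (0.5, 1.0, 2.0, 5.0):
        print(rate, replicate_failure_probability(rate, 3), min_replicas(rate, 1e-3))
```

With a low error_rate the required n grows quickly, which matches the failing instances seen with a low n-value and the longer runs seen with a high one.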

02:00 <nikunj> hkaiser, I've started a background script to run with a variety of parameters, repeating 20 times for every parameter there is. So we should now have a comprehensive view. Also, marvin will be blocked for this weekend most likely coz there are a total of 20k+ runs ;)

02:02 <nikunj> till it completes running I'll quickly go through the reports you sent over, and start going through phylanx seminars

02:14 <nikunj> total runs equals 61440

02:14 <nikunj> that will easily take this weekend

02:20 nikunj has quit [Ping timeout: 246 seconds]

02:26 K-ballo has quit [Quit: K-ballo]

02:36 hkaiser has quit [Quit: bye]

04:01 jaafar has joined #ste||ar

04:18 nikunj97 has joined #ste||ar

04:18 nikunj97 has quit [Remote host closed the connection]

04:24 nikunj97 has joined #ste||ar

05:17 RostamLog has quit [Ping timeout: 258 seconds]

08:25 simbergm has quit [Remote host closed the connection]

08:47 heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]