hkaiser changed the topic of #ste||ar to: STE||AR: Systems Technology, Emergent Parallelism, and Algorithm Research | stellar.cct.lsu.edu | HPX: A cure for performance impaired parallel applications | github.com/STEllAR-GROUP/hpx | Buildbot: http://rostam.cct.lsu.edu/ | Log: http://irclog.cct.lsu.edu/
nikunj has joined #ste||ar
<nikunj> hkaiser, yt?
<hkaiser> nikunj: here
<nikunj> hkaiser, I was thinking of implementing the vote function that you were talking about
<nikunj> would you want me to continue with that or should I run some benchmarks?
<nikunj> coz I think we have expected graphs as of now
<nikunj> so we should move forward with implementing the vote function
<hkaiser> nikunj: that would be a second set of replicate APIs, right?
<nikunj> so you want me to make 2 sets of replicate API?
<nikunj> one that returns the first ans and one that votes?
<hkaiser> well, one as it is and one with an arbiter function
<hkaiser> both with async and dataflow variations
<nikunj> we can do that
<hkaiser> nikunj: if you do that you will have to run yet another set of benchmarks, though ;-)
<nikunj> we will then have 2 sets of replicates with 2 variations each (normal and validate)
<nikunj> hkaiser, I'm used to running benchmarks now ;-)
<hkaiser> ok
<nikunj> but I think the voting function is important
<hkaiser> feel free to do that, then
<nikunj> also, in the meantime, should I run the current benchmarks that I have in the PR?
<hkaiser> let's discuss the API before you start implementing things
<nikunj> ok, then I'll wait till next thursday
<hkaiser> I'd suggest you write the test first, demonstrating what things should look like, and only then start implementing things
<hkaiser> we can discuss things over email or here before that
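The two replicate flavors discussed above (return the first valid answer, or collect all answers and let an arbiter decide) can be sketched roughly as follows. This is a minimal sequential illustration in Python, not the HPX API: the names `replicate`, `validate`, `vote`, and `majority_vote` are hypothetical, and the async and dataflow variations are left out.

```python
from collections import Counter
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def majority_vote(results: List[T]) -> T:
    """Hypothetical arbiter: pick the most common result (results must be hashable)."""
    return Counter(results).most_common(1)[0][0]


def replicate(
    n: int,
    task: Callable[[], T],
    validate: Optional[Callable[[T], bool]] = None,
    vote: Optional[Callable[[List[T]], T]] = None,
) -> T:
    """Run `task` up to `n` times (sequentially, for illustration only).

    Without `vote`, return the first result that did not raise and that
    passed `validate` (if given). With `vote`, run all `n` replicas,
    collect every valid result, and let the arbiter choose among them.
    """
    valid: List[T] = []
    for _ in range(n):
        try:
            result = task()
        except Exception:
            continue  # a replica that throws is skipped
        if validate is not None and not validate(result):
            continue  # a replica with an invalid result is skipped
        if vote is None:
            return result  # "first answer" flavor
        valid.append(result)
    if not valid:
        raise RuntimeError("all replicas failed")
    return vote(valid)
```

Writing the tests first, as suggested, would then amount to calls such as `replicate(3, task, vote=majority_vote)` and `replicate(3, task, validate=check)` with the expected results asserted.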
<nikunj> btw did you understand the bumps we were seeing for replicate?
<nikunj> I was expecting almost similar times for all the error rates
<nikunj> with a bit of overhead with ones throwing exceptions
<hkaiser> nikunj: how many runs have you averaged there?
<nikunj> it's just one
<nikunj> and that too when I was about to leave
<hkaiser> nod, that explains it
<nikunj> I just wanted to get an idea of how things looked after I changed the code
<hkaiser> but I would expect constant execution times, indeed
<nikunj> yes, that is what I expected too
<nikunj> I guess we'll have to play with execution times to get a better picture
<nikunj> besides I was reading about the exponential distribution and it works on a probability distribution
<nikunj> P(x|y) = y*exp(-x*y)
<nikunj> so what we do there is set y for the probability distribution, where y can be a set time after which a certain event is meant to occur
<nikunj> and then we get a function where we can identify the probability of occurrence of x given the value of y
<nikunj> so basically if we plug in y = 3 and x = 3 (that's what my benchmark is doing), the probability of occurrence of a number greater than 3 in a run with about 10000 threads is about 4 only
<nikunj> and the reason why we get the exponential curve for the graph is because there is an exponential relation between x and the probability function, so if we decide the error generation as 2 and put x = 2 we will have exponentially more
<nikunj> since it will then be 2*exp(-4) compared to 3*exp(-9)
<nikunj> that's why we see an exponential time difference in executions
<hkaiser> nod
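The numbers in this exchange can be checked with a short sketch. For an exponential distribution with rate y, the expression y*exp(-x*y) is the probability density at x, while the probability that a sample exceeds x is exp(-x*y). The "about 4" per 10000 quoted above appears to match 10000 times the density value 3*exp(-9); the tail probability gives roughly 1.2 per 10000 instead. The function names below are illustrative, and Python's `random.expovariate` stands in for whatever generator the benchmark actually uses.

```python
import math
import random


def density(x: float, rate: float) -> float:
    """Exponential probability density: rate * exp(-rate * x)."""
    return rate * math.exp(-rate * x)


def tail_probability(x: float, rate: float) -> float:
    """Probability that an exponential sample exceeds x: exp(-rate * x)."""
    return math.exp(-rate * x)


def count_exceeding(threshold: float, rate: float, samples: int, seed: int = 0) -> int:
    """Draw `samples` exponential variates and count those above `threshold`."""
    rng = random.Random(seed)
    return sum(rng.expovariate(rate) > threshold for _ in range(samples))


if __name__ == "__main__":
    threads = 10_000
    print("density-based figure:", threads * density(3.0, 3.0))
    print("expected samples > 3:", threads * tail_probability(3.0, 3.0))
    # Moving from rate = x = 3 to rate = x = 2 raises the tail probability
    # by a factor of exp(9) / exp(4) = exp(5).
    print("tail ratio (2 vs 3):", tail_probability(2.0, 2.0) / tail_probability(3.0, 3.0))
    print("empirical count in 1e6 draws:", count_exceeding(3.0, 3.0, 1_000_000))
```

Either way the dependence on the parameter is exponential, which is consistent with the exponential shape of the timing curves described above.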
<nikunj> now what we could do is that we could allow the user to add an average rate of occurrence of some event
<nikunj> and then control the x ourselves
<nikunj> or keep the x same as the one that user enters i.e. the average rate of occurrence of some event
<nikunj> if we control x ourselves then we can decide on the number of failing threads, else the probability will be error_rate*exp(-error_rate^2)
<nikunj> we could do it either way you want
<nikunj> the math behind it is pretty simple and I ran an alternative program to test if my thinking was correct
<nikunj> so what should I go ahead with?
<hkaiser> we need to control the average time between errors
<nikunj> then what we could do is to keep x = 1 so that the user will know that the probability of failure will be equal to exp(-error_rate)
<nikunj> this way for an error_rate specified as y, the probability of a number generated being greater than 1 will be exp(-y)
<hkaiser> right
<nikunj> so we will have a very clear function translation
<nikunj> and the user should not have any issues identifying with exponential probabilities
<nikunj> alright then, I will make this change
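With the threshold fixed at x = 1, the mapping between the user-supplied error rate and the failure probability is a one-liner in each direction. The following is a minimal sketch under that assumption; `failure_probability`, `error_rate_for`, and `replica_fails` are hypothetical names, not functions from the benchmark.

```python
import math
import random

THRESHOLD = 1.0  # the fixed x from the discussion above


def failure_probability(error_rate: float) -> float:
    """Chance that an exponential sample with this rate exceeds the threshold."""
    return math.exp(-error_rate * THRESHOLD)


def error_rate_for(target_probability: float) -> float:
    """Inverse mapping: the rate that yields the desired failure probability."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must lie strictly between 0 and 1")
    return -math.log(target_probability) / THRESHOLD


def replica_fails(error_rate: float, rng: random.Random) -> bool:
    """Simulate one task: it 'fails' when the drawn value exceeds the threshold."""
    return rng.expovariate(error_rate) > THRESHOLD
```

For example, `error_rate_for(0.05)` gives the rate at which about 5% of tasks fail. Note that a lower error rate means more failures, since exp(-error_rate) grows as error_rate shrinks.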
<nikunj> would you want me to benchmark once I've changed that?
<nikunj> or is there something else you should look into first?
<hkaiser> well, we talked about the parameters to change for the benchmarks
<hkaiser> let's see what we get
<nikunj> alright, I'll run for the parameters we discussed and generate some graphs
<nikunj> I'll mail you those graphs I generate and we can pick from there
rishabh_bansal11 has quit [Quit: Connection closed for inactivity]
<nikunj> hkaiser, would you want execution times in microseconds?
<nikunj> or should I keep them in milliseconds?
<hkaiser> you mean the artificial delay?
<nikunj> yes the artificial time for any thread execution
<hkaiser> I think we should run with 5us, 50us, 500us, and 5ms
<nikunj> alright, I'll run it with those, I'll add another 2ms in between just for a better comparison
<hkaiser> sure, feel free to add that
<nikunj> done
<nikunj> I'll start the benchmarks now
<nikunj> one final thing, would you want me to keep the n-value high so that all tests pass? or keep it low and let some tests fail
<nikunj> coz we will see tests failing if n is low and error-rate is set to pretty low (which translates to high amount of errors)
<nikunj> on the downside, keeping n-value high will make replicate run for longer too
<nikunj> but I encountered multiple failing instances when I kept n-value low for replicate
<nikunj> hkaiser, I've started a background script to run with a variety of parameters, repeating 20 times for every parameter there is. So we should now have a comprehensive view. Also, marvin will be blocked for this weekend most likely coz there are a total of 20k+ runs ;)
<nikunj> till it completes running I'll quickly go through the reports you sent over, and start going through phylanx seminars
<nikunj> total runs equals 61440
<nikunj> that will easily take this weekend
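A parameter sweep of the kind described above, with 20 repetitions per configuration so that single-run noise averages out, might look like the sketch below. Only the delay values (5us, 50us, 500us, 2ms, 5ms) and the repetition count come from the discussion; the `run_once` callback and the error-rate axis are hypothetical placeholders, and the sketch does not attempt to reproduce the actual parameter grid or its run count.

```python
import itertools
import statistics
from typing import Callable, Dict, Iterable, Tuple

DELAYS_US = [5, 50, 500, 2000, 5000]  # artificial task delays from the discussion
REPETITIONS = 20                      # runs averaged per configuration


def sweep(
    run_once: Callable[[int, float], float],
    delays_us: Iterable[int],
    error_rates: Iterable[float],
    repetitions: int = REPETITIONS,
) -> Dict[Tuple[int, float], Tuple[float, float]]:
    """Time every (delay, error_rate) pair `repetitions` times.

    `run_once(delay_us, error_rate)` is a hypothetical callback that runs one
    benchmark and returns its execution time. The result maps each
    configuration to (mean, sample standard deviation); `repetitions` must be
    at least 2 for the standard deviation to be defined.
    """
    results: Dict[Tuple[int, float], Tuple[float, float]] = {}
    for delay_us, error_rate in itertools.product(delays_us, error_rates):
        timings = [run_once(delay_us, error_rate) for _ in range(repetitions)]
        results[(delay_us, error_rate)] = (
            statistics.mean(timings),
            statistics.stdev(timings),
        )
    return results
```

Reporting the standard deviation next to the mean makes it easier to tell whether bumps like the ones seen in the single-run graphs are real effects or just noise.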
nikunj has quit [Ping timeout: 246 seconds]
K-ballo has quit [Quit: K-ballo]
hkaiser has quit [Quit: bye]
jaafar has joined #ste||ar
nikunj97 has joined #ste||ar
nikunj97 has quit [Remote host closed the connection]
nikunj97 has joined #ste||ar
RostamLog has quit [Ping timeout: 258 seconds]
simbergm has quit [Remote host closed the connection]
heller has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]