I've confused "
expected length of a run" and "
expected longest run", and probably confused you too.
I can derive the "expected length of a run":
Expected length of a runFor a series of Bernoulli trials where:
P(win) = p
P(loss) = 1 - p
the probability of k of losses before a win is the probability of k - 1 losses multiplied by the probability of 1 win:
P(X = k) = (1 - p)^(k-1) * p
From this the expected number of losses before a win can be derived.
E(X) = sum(n = 1 to n = k) [ n * (1 - p)^(n - 1) * p ]
= p * sum(n=1 to n=k) [ n * (1 - p)^(n - 1)]
The derivative of a geometric progression sum(n=1 to k) n * x^ (n - 1) = 1/(1 - x)^2, so substituting (1 - p) for x:
p * sum(n=1 to n=k) [ n * (1 - p)^(n - 1)] = p * 1 (1 - ( 1 - p))^2
= p p^2 = 1/p
So the expected number of losses before a win is 1 pThis holds true for the number of runs in a win, by redefining
P(loss) = p
P(win) = 1 - p
Expected longest runI'm pretty sure I didn't derive this one. Maybe it was dooglus.
* organofcorti attempts to shift the blame
Edit: I believe I see where the 1 comes from... and yet it doesn't seem to work.
-log(1*1/2 + 1)/log(1/2) is about 0.5849625007211561814537389439478165087598144076924810, not 0.5 as intuition would suggest. Does the formula converge towards "correctness?"
As far as I can tell, neither converge to the correct answer although they clearly converge on each other. I'll see if I can figure it out.
Edit: I found a better estimate here:
http://gato-docs.its.txstate.edu/mathworks/DistributionOfLongestRun.pdfExpected longest run = -log(n*p)/log(q) + 0.57722.../log(1/q) -1/2
Note: 0.57722... is the Euler-Mascheroni constant, which is the limit(n -> infinity) { sum(k = 1 to k = n)[1/k] - log(n) }