How many samples?
July 7, 2010 at 8:30 AM by Dr. Drang
In my Monte Carlo solution of the Two Child Problem, I did 10,000 simulations using Python’s random
module. Was that enough? If I didn’t already know the answers (the usual case when doing Monte Carlo simulation), how could I estimate the accuracy of the results? Not surprisingly, the answers to these results can be found in the field of statistics.
When the process being simulated consists of a series of independent trials, each trial having one of two results (which we’ll call passes and fails, although any two mutually exclusive results could be used), it’s said to be a Bernoulli sequence, and the count of passes follows a binomial distribution
where is the probability of a passing a single trial.1 You may recognize the term with the factorials as the binomial coefficient.
An estimate of is easily calculated:
where we put the hat over the to distinguish the estimate from the true (albeit typically unknown) value.
Because we could get a different value of every time we run a set of trials, is a random variable. If the number of trials is large enough, will have a normal distribution with a mean of
and a standard deviation of
That the mean of is equal to signifies that is an unbiased estimate of : as we increase , will tend toward .
Because we know the distribution of and its parameters, we can calculate the probability that will fall within some neighborhood, , around ,
where is the standard normal cumulative distribution function,
The function can’t be integrated analytically, but many numerical analysis programs include a function for it. Octave has a function called normcdf
that does the trick. Back in the not-so-good old days, you’d have to look up it up in a table; every statistics textbook had a table of in its appendix.
This is all very nice, but you can only do these calculations if you know , and the typical reason you’re doing the Monte Carlo simulation is that you don’t know . What you have is a particular value of and you want to know how good it is.
Enter the concept of the confidence interval. By flipping the terms around and using to estimate the standard deviation,
we can say with a confidence2 of
that the true value of is in the interval
A more convenient way to work is to introduce a new variable,
and choose a confidence level of . Then
where is the inverse of the standard normal CDF (Octave function norminv
).
The confidence interval at this level is
or, expanding out the ,
That was a longish derivation, but the result is simple to use. Let’s say we’ve done a Monte Carlo simulation and we want to get the 95% confidence interval for . Here’s what we do:
- Note that for 95% confidence, .
- Use whatever tools available to determine that .
Take the values of and from the simulation and calculate the interval
Now let’s say we’re planning a simulation and want to choose a value of to get a decent estimate of .
- For 95% confidence, use , as before.
- Get a preliminary value of , perhaps by doing a small Monte Carlo run.
- Decide how tight we want the confidence interval, , to be.
Calculate via
Note that if we want a very tight confidence interval, will be small and will have to be large. This makes sense: to get a close estimate, we need a lot of samples.
Although it’s a common confidence level, there’s nothing magical about 95%. We could just as easily use another value. Here are the values for a few others.
Confidence | ||
---|---|---|
99% | 0.005 | 2.576 |
95% | 0.025 | 1.960 |
90% | 0.050 | 1.645 |
75% | 0.125 | 1.150 |
50% | 0.250 | 0.675 |
As expected, to get higher confidence, we need more samples.
Let’s calculate the confidence interval for the Monte Carlo simulation of the first example of the Two Child Problem. The result was
If we restrict ourselves to families that have at least one son,
the probability of having two sons is 2520/7535 = 0.334
So is 75353, and is 0.334. The 99% confidence interval is
So if we didn’t already know from earlier work that the true probability was , we could use this result to say with 99% confidence that the true probability was between 0.320 and 0.348.
I should mention that the formulas in this post aren’t restricted to Monte Carlo simulation. They’re valid any time you’re sampling from a pass/fail type of population.
-
The count of fails also follows a binomial distribution; just substitute for and for . ↩
-
We use the word confidence instead of probability because we’re really not calculating a probability here. The true value of isn’t a random variable, it’s a number that we just don’t happen to know. ↩
-
Why isn’t it 10,000? In this simulation we filtered out about a quarter of the sample families because they didn’t have sons, leaving 7535. ↩