Introduction
I come from a family with a fair number of gamers. Board games, role-playing games, video games: someone in my family plays just about every kind of game. A few years ago, I took a course on machine learning, where we spent a while reviewing all the probability rules that get covered in a Statistics 101 class, and I found myself pondering the types of questions that DON’T get put into the 101 curriculum. This is about one of them.
Other than a coin flip, the most common probability example in introductory classes is based on a fair 6-sided die, where we assume that all sides are equally probable. For example:
The probability of rolling a “n” is 1/6, or 16.7%, for any “n” in [1, 2, 3, 4, 5, 6]. (1)
In these introductory classes, this uniform assumption is extended to example after example.
The probability of rolling “6” every time on k dice rolls is (1/6)^k. (2)
To roll a total of “4” in two die rolls, the first and second rolls must be 1 and 3, or 2 and 2, or 3 and 1. These are the only 3 combinations that produce a “4.” Therefore, the probability of rolling a total of 4 on two dice rolls is the sum of the probabilities of rolling 1+3, 2+2, and 3+1.
There are 36 combinations of rolling “a”+“b” on two rolls, and each is equally probable.
Therefore, the probability of rolling “4” is 3 out of 36 = 8.3%. (3)
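The counting argument above can be checked by brute force. The following is a minimal sketch in Python, assuming a fair die; the function name `fair_sum_probability` is made up for this illustration and is not part of the original write-up.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # faces of a standard six-sided die


def fair_sum_probability(target, rolls):
    """Probability that `rolls` fair dice add up to `target`."""
    # Every ordered outcome is equally likely, so counting is enough.
    outcomes = list(product(FACES, repeat=rolls))
    hits = sum(1 for outcome in outcomes if sum(outcome) == target)
    return Fraction(hits, len(outcomes))


p4 = fair_sum_probability(4, 2)
print(p4, float(p4))  # 3 of 36 outcomes, i.e. 1/12, about 0.083
```

Counting only works because every one of the 36 outcomes carries the same weight, which is exactly the assumption the rest of this article drops.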
Probabilities for a skewed die
The fundamentals of probability still hold if the assumption of uniformity is violated. Skewed dice were designed (by the author) to illustrate how pervasive the uniformity assumption is. These dice have been rolled a few hundred times to determine experimentally the probabilities of each number coming up.
Figure 1: skewed dice, as saved on Thingiverse. Thanks to Josh Silverstein for saving, printing, sanding, and painting a set of model dice for me.
For Table 1 below, these probabilities have been rounded to a convenient number that is within the confidence interval of estimates of the true probability. For example: the best estimate of the probability of rolling “2” is 19.67%, but I’m reporting it as 1 in 5, or 20%. These rounded-off probabilities total 101%. Deal with it. More precise estimates are included in Appendix 1.
NOTE: for this paper, it is assumed that the numbers on an n-sided die are 1, 2, 3, …, n. For mathematicians and computer scientists who like to start counting at zero, you have my apologies.
Table 1:
Digit | Odds | Probability |
1 | 1 in 3 | 33% |
2 | 1 in 5 | 20% |
3 | 1 in 10 | 10% |
4 | 1 in 5 | 20% |
5 | 1 in 9 | 11% |
6 | 1 in 15 | 7% |
For Equation (2), calculating the probability works the same way: the probability of rolling “1” three times on three rolls is
(1/3)^3, or 3.7% (2a)
while the probability of rolling “6” three times is much lower:
(1/15)^3, or 0.03% (2b)
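As a quick check of (2a) and (2b), here is a small Python sketch that uses the rounded odds from Table 1. The names `TABLE_1` and `repeat_probability` are invented for this example.

```python
from fractions import Fraction

# Rounded single-roll odds from Table 1 (they total slightly more than 1, as noted above).
TABLE_1 = {
    1: Fraction(1, 3),
    2: Fraction(1, 5),
    3: Fraction(1, 10),
    4: Fraction(1, 5),
    5: Fraction(1, 9),
    6: Fraction(1, 15),
}


def repeat_probability(face, rolls):
    """Probability of rolling the same `face` on each of `rolls` independent rolls."""
    return TABLE_1[face] ** rolls


print(float(repeat_probability(1, 3)))  # 1/27, about 0.037
print(float(repeat_probability(6, 3)))  # 1/3375, about 0.0003
```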
Probabilities for multiple rolls
The probability of rolling a 4 on two dice rolls is STILL the sum of the probabilities of rolling 1+3, 2+2, and 3+1, but if the uniformity assumption is not valid, these probabilities have to be added up separately. For example:
Let Pr(j, k) be the probability of rolling “j” as the sum of “k” die rolls. The probability of rolling a 3 on one roll is therefore denoted Pr(3, 1), which from Table 1 is 10%. The probability of rolling a 1 and a 3 is the joint probability of the two rolls:
The probability of rolling 1 and then 3 is the same as the probability of rolling 3 and then 1:
= Pr(1, 1) * Pr(3, 1) = (1/3)*(1/10) = .033 (3a)
Probability of rolling a pair of 2s on 2 rolls is the square of the probability of rolling a 2 on one roll:
= Pr(2, 1) * Pr(2, 1) = (1/5) * (1/5) = .04 (3b)
So the probability of rolling “4”:
= P(1 and 3) + P(2 and 2) + P(3 and 1) = .033 + .04 + .033 = 0.106, or 10.6% (3c)
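The same sum can be computed by enumerating every ordered pair of rolls and weighting each by its own probability. This is a sketch only, using the rounded Table 1 odds; `skewed_sum_probability` is a hypothetical helper name. With exact fractions the result is 8/75, about 10.7%, slightly above the 10.6% obtained from terms rounded to three decimals.

```python
from fractions import Fraction
from itertools import product

# Rounded single-roll odds from Table 1.
TABLE_1 = {
    1: Fraction(1, 3),
    2: Fraction(1, 5),
    3: Fraction(1, 10),
    4: Fraction(1, 5),
    5: Fraction(1, 9),
    6: Fraction(1, 15),
}


def skewed_sum_probability(target, rolls):
    """Add up the joint probability of every ordered outcome that totals `target`."""
    total = Fraction(0)
    for outcome in product(TABLE_1, repeat=rolls):
        if sum(outcome) == target:
            joint = Fraction(1)
            for face in outcome:
                joint *= TABLE_1[face]  # rolls are independent, so probabilities multiply
            total += joint
    return total


p4 = skewed_sum_probability(4, 2)
print(p4, float(p4))  # 8/75, about 0.107
```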
This is further illustrated in Figure 2. For any given number of rolls, the probability of rolling a certain total is an extension of the probability from each side.
Figure 2: Probability of rolling a total number j with k rolls of the skewed die.
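As a further extension of this, there is a simple way to calculate the probabilities for an arbitrary number k of die rolls: iteratively. If we have calculated all the probabilities for k-1 rolls, then:
Pr(j, k) = Pr(1, 1) * Pr(j-1, k-1) + Pr(2, 1) * Pr(j-2, k-1) + … + Pr(6, 1) * Pr(j-6, k-1) (4)
where Pr(m, k-1) is taken as zero for any total m that cannot be reached in k-1 rolls.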
A table of these probabilities is shown in Appendix 2.
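The iterative calculation can be sketched in a few lines of Python. This is one possible implementation, not necessarily the one used to build Appendix 2; the single-roll values are copied from the "1 roll" column of that appendix, and the names `ONE_ROLL`, `next_roll`, and `sum_distribution` are made up for the example.

```python
# Single-roll probabilities, copied from the "1 roll" column of Appendix 2.
ONE_ROLL = {1: 0.33, 2: 0.19667, 3: 0.09667, 4: 0.19667, 5: 0.11333, 6: 0.06667}


def next_roll(prev):
    """Given {total: probability} for k-1 rolls, return the same mapping for k rolls."""
    result = {}
    for total, p_total in prev.items():
        for face, p_face in ONE_ROLL.items():
            # Reaching total + face requires `total` after k-1 rolls and `face` on roll k.
            result[total + face] = result.get(total + face, 0.0) + p_total * p_face
    return result


def sum_distribution(rolls):
    """Distribution of the total of `rolls` rolls, built up one roll at a time."""
    dist = dict(ONE_ROLL)
    for _ in range(rolls - 1):
        dist = next_roll(dist)
    return dist


two = sum_distribution(2)
print(round(two[4], 5))  # 0.10248, the Appendix 2 entry for a sum of 4 on 2 rolls
```

Each step only needs the previous column, so extending the table to more rolls costs one more pass over the totals.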
Application of Central Limit Theorem
For a “fair” six-sided die, the average score over a large number of rolls is 3.5, which is the average of all sides: (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5. For a skewed die, that formula is more properly expanded into a weighted average, by multiplying the value of each side by its probability. Thus, for our 6-sided die, the average of many rolls is:
Pr(1, 1) * 1 + Pr(2, 1) * 2 + Pr(3, 1) * 3 + Pr(4, 1) * 4 + Pr(5, 1) * 5 + Pr(6, 1) * 6
= (.33 * 1) + (.2 * 2) + (.1 * 3) + (.2 * 4) + (.11 * 5) + (.07 * 6) = 2.8 (5)
So, for example, if you rolled the biased die 5 times, you can expect a total of around 2.8 * 5 = 14.
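The weighted average is easy to reproduce. Below is a short Python sketch using the rounded Table 1 probabilities; `expected_value` is a hypothetical helper name.

```python
# Rounded single-roll probabilities from Table 1.
TABLE_1 = {1: 0.33, 2: 0.20, 3: 0.10, 4: 0.20, 5: 0.11, 6: 0.07}


def expected_value(probabilities):
    """Weighted average: each face value multiplied by its probability, then summed."""
    return sum(face * p for face, p in probabilities.items())


mean = expected_value(TABLE_1)
print(round(mean, 2))      # 2.8
print(round(5 * mean, 1))  # 14.0, the expected total of 5 rolls
```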
A total of 14 on 5 rolls can come from rolling 4 three times and 1 twice, or rolling 1 three times, 5 once, and 6 once, or any number of other ways. However, as we roll the die more times, it becomes more and more likely that the average of the numbers rolled will be close to the “true” average (this is the law of large numbers). For a more rigorous explanation of this, go somewhere else.
A consequence of the Central Limit Theorem is that for a high enough number of rolls, the distribution of the sum of rolls looks more and more like a normal (Gaussian) distribution. In Figure 2, this is visible in the curves for k=4 and k=5 rolls, which look more and more like a bell-shaped curve.
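One informal way to see the bell shape emerge is to simulate the die. The sketch below is an illustration only: it uses the rounded Table 1 probabilities as weights, a fixed seed, and an arbitrary 20,000 trials, and the function name `simulate_sums` is invented here.

```python
import random
from collections import Counter

FACES = [1, 2, 3, 4, 5, 6]
# Table 1 probabilities; random.choices only needs relative weights,
# so the fact that they total 1.01 does not matter here.
WEIGHTS = [0.33, 0.20, 0.10, 0.20, 0.11, 0.07]


def simulate_sums(rolls, trials, seed=0):
    """Roll the skewed die `rolls` times per trial and return the list of totals."""
    rng = random.Random(seed)
    return [sum(rng.choices(FACES, weights=WEIGHTS, k=rolls)) for _ in range(trials)]


sums = simulate_sums(rolls=5, trials=20000)
sample_mean = sum(sums) / len(sums)
print(f"mean of 5-roll totals: {sample_mean:.2f}")

# Crude text histogram: one '#' per 100 trials that produced each total.
counts = Counter(sums)
for total in range(5, 31):
    print(f"{total:2d} {'#' * (counts[total] // 100)}")
```

Raising `rolls` should make the histogram look more symmetric, which is the behavior described above.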
This brings up an amusing application. In this case, the standard deviation is around 3.22, which is around 83% of the standard deviation for the fair die. Also, the mean value from each roll of the die is around 79% of the mean for a fair die. Within a margin of error, the sum of 6 rolls of this biased die will give around the same distribution of outcomes as 5 rolls of a fair die.
Calculations are left as an exercise for the reader (or see how it’s done here.)
The Jim’s Birthday problem
In Equation (4), we presented an iterative formula for Pr(j, k). Theoretically, a closed form solution for calculating this should exist. In honor of my brother’s 50th birthday, I’m calling this the Jim’s Birthday Problem.
APPENDIX 1: actual dice rolls for probability estimates
Based on 3 sets of 100 rolls.
APPENDIX 2: Pr(j, k), the probability of rolling a total of “j” on “k” rolls
Given the single-roll probabilities estimated in Appendix 1, the probability of rolling a given sum j on k rolls is calculated from Equation (4).
Probabilities:
Sum | 1 roll | 2 rolls | 3 rolls | 4 rolls | 5 rolls |
1 | 33% | | | | |
2 | 19.667% | 10.890% | | | |
3 | 9.667% | 12.980% | 3.594% | | |
4 | 19.667% | 10.248% | 6.425% | 1.186% | |
5 | 11.333% | 16.782% | 6.987% | 2.827% | 0.391% |
6 | 6.667% | 16.150% | 10.950% | 3.917% | 1.166% |
7 | | 12.660% | 13.408% | 6.315% | 1.963% |
8 | | 8.681% | 13.189% | 8.924% | 3.361% |
9 | | 5.747% | 12.243% | 10.389% | 5.256% |
10 | | 3.907% | 10.589% | 11.304% | 6.964% |
11 | | 1.511% | 8.697% | 11.521% | 8.511% |
12 | | 0.444% | 6.041% | 10.979% | 9.761% |
13 | | | 3.780% | 9.524% | 10.457% |
14 | | | 2.232% | 7.625% | 10.411% |
15 | | | 1.166% | 5.791% | 9.690% |
16 | | | 0.519% | 4.069% | 8.550% |
17 | | | 0.151% | 2.624% | 7.104% |
18 | | | 0.030% | 1.535% | 5.537% |
19 | | | | 0.824% | 4.054% |
20 | | | | 0.403% | 2.792% |
21 | | | | 0.169% | 1.807% |
22 | | | | 0.058% | 1.085% |
23 | | | | 0.013% | 0.602% |
24 | | | | 0.002% | 0.307% |
25 | | | | | 0.143% |
26 | | | | | 0.059% |
27 | | | | | 0.021% |
28 | | | | | 0.006% |
29 | | | | | 0.001% |
30 | | | | | 0.000% |