
## 3.5 Rolling a dice – Discrete Uniform distribution

Another simple distribution, arising from classical examples in probability theory, is the Uniform distribution. For now, we focus on its discrete version, keeping in mind that a continuous one also exists (see Section 4.2).

The classical example of an application of this distribution is dice rolling. A conventional dice has 6 sides and, when rolled, gives a value from 1 to 6. If the dice is fair, then every side has the same probability of coming up. This means that the PMF of the distribution can be written as: $\begin{equation} f(y, k) = \frac{1}{k}, \tag{3.14} \end{equation}$ where $$k$$ is the number of outcomes (sides of the dice). The more outcomes there are, the lower the probability of any specific outcome. For example, on a dice with 10 sides, the probability of getting the score 5 is $$\frac{1}{10}$$, while on the 6-sided version it is $$\frac{1}{6}$$.
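The PMF (3.14) can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_pmf` and its parameters `a` and `b` (the lowest and highest outcomes) are assumptions made for this example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def uniform_pmf(y: int, a: int, b: int) -> Fraction:
    """PMF of a discrete Uniform distribution on the integers a..b."""
    k = b - a + 1  # number of outcomes, e.g. 6 for a six-sided dice
    # Every outcome inside the range is equally likely; anything else is impossible
    return Fraction(1, k) if a <= y <= b else Fraction(0)


# Probability of scoring 5 on a 10-sided and on a 6-sided dice
print(uniform_pmf(5, 1, 10))
print(uniform_pmf(5, 1, 6))
```

Summing `uniform_pmf` over all outcomes from `a` to `b` gives 1, as it should for any PMF.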

Figure 3.9 shows the PMF of the Uniform distribution for the example of 1d6 (one six-sided dice).

Figure 3.9: Probability Mass Function of Uniform distribution for 1d6.

The mean of this distribution is calculated as $$\frac{a+b}{2}$$, where $$a$$ is the lowest and $$b$$ is the highest possible value. So, for the 1d6, the mean is $$\frac{1+6}{2}=3.5$$. This means that if we roll the dice many times, the average score will approach 3.5, even though 3.5 itself can never be rolled.
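A small simulation can illustrate how the average score settles around 3.5 as the number of rolls grows. This is a sketch under assumptions: the function `average_roll`, the chosen sample sizes and the seed are arbitrary choices made for this example.

```python
import random


def average_roll(n_rolls: int, sides: int = 6, seed: int = 42) -> float:
    """Average score over n_rolls simulated rolls of a fair dice."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    total = sum(rng.randint(1, sides) for _ in range(n_rolls))
    return total / n_rolls


# The more rolls we make, the closer the average tends to be to (1 + 6) / 2
for n in (10, 1_000, 100_000):
    print(n, average_roll(n))
```

With only a handful of rolls the average can be far from 3.5; with many rolls it typically lands very close to it.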

The variance of the Uniform distribution depends on the number of outcomes and is calculated as: $\begin{equation} \sigma^2(y, k) = \frac{k^2-1}{12} . \tag{3.15} \end{equation}$ As can be seen from the formula, the variance of the Uniform distribution increases with the number of outcomes, growing approximately with the square of $$k$$. For the 1d6, it equals $$\frac{6^2-1}{12}=\frac{35}{12} \approx 2.92$$.
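The formulae for the mean and the variance can be checked against their definitions by summing over all outcomes. The sketch below does this with exact fractions; the helper `uniform_moments` is a hypothetical name introduced for this example.

```python
from fractions import Fraction


def uniform_moments(a: int, b: int) -> tuple[Fraction, Fraction]:
    """Mean and variance of a discrete Uniform distribution on a..b, by definition."""
    k = b - a + 1  # number of outcomes
    outcomes = range(a, b + 1)
    # Mean: sum of y * P(y), where P(y) = 1/k for every outcome
    mean = sum(Fraction(y, k) for y in outcomes)
    # Variance: sum of (y - mean)^2 * P(y)
    variance = sum(Fraction(1, k) * (y - mean) ** 2 for y in outcomes)
    return mean, variance


mean, variance = uniform_moments(1, 6)
print(mean, variance)

# The same values from the closed-form expressions (a + b) / 2 and (k^2 - 1) / 12
print(Fraction(1 + 6, 2), Fraction(6**2 - 1, 12))
```

Calling `uniform_moments(1, 12)` for a 12-sided dice gives a variance roughly four times larger than for the 1d6, which shows how quickly the variance increases with $$k$$.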

The CDF of the Uniform distribution is calculated as: $\begin{equation} F(y, k) = \frac{y-a+1}{k}, \tag{3.16} \end{equation}$ where $$a$$ is the lowest possible value, $$k$$ is the number of outcomes, and $$y$$ is an integer between $$a$$ and the highest possible value $$b$$. This CDF can be visualised as shown in Figure 3.10.

Figure 3.10: Cumulative Distribution Function of Uniform distribution for 1d6.

Given that the probability of each separate outcome in the Uniform distribution is always $$\frac{1}{k}$$, the CDF grows in equal steps (linearly in $$y$$), reaching 1 at the highest value. For the 1d6, this means that a roll will always give a value less than or equal to 6. The CDF can be used to get the probability of several outcomes at once. For example, when rolling 1d6, the probability of getting 1 or 2 is $$\frac{2-1+1}{6}=\frac{1}{3}$$.
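The CDF (3.16) can be sketched in the same way. In this illustration, the function name `uniform_cdf` is an assumption, and the handling of values below $$a$$ and above $$b$$ (returning 0 and 1 respectively) is added for completeness rather than taken from the formula itself.

```python
from fractions import Fraction


def uniform_cdf(y: int, a: int, b: int) -> Fraction:
    """CDF of a discrete Uniform distribution on the integers a..b."""
    k = b - a + 1  # number of outcomes
    if y < a:
        return Fraction(0)  # nothing can fall below the lowest value
    if y >= b:
        return Fraction(1)  # every outcome is less than or equal to b
    return Fraction(y - a + 1, k)


# Probability of rolling 1 or 2 on 1d6
print(uniform_cdf(2, 1, 6))

# The CDF rises by 1/6 with every additional score
print([str(uniform_cdf(y, 1, 6)) for y in range(1, 7)])
```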

The Bernoulli distribution (Section 3.2) with $$p=0.5$$ can be considered a special case of the Uniform distribution with only two outcomes.

A company produces headphones, putting serial numbers on them. So far, it has produced 9,990 of them. If a customer buys headphones, what is the probability that they will get a serial number with three digits?

Solution. This is a task for the Uniform distribution, because serial numbers do not repeat and we can assume that the probability of getting any of them is the same. In terms of parameters, $$a=1$$ and $$b=9990$$, so the number of outcomes is $$k=9990$$. To get a serial number with three digits, a customer needs to get a number between 100 and 999. This can be formulated as: $\begin{equation*} \mathrm{P}(100 \leq y \leq 999) = \mathrm{P}(y \leq 999) - \mathrm{P}(y \leq 99). \end{equation*}$ Inserting the values in the CDF of the Uniform distribution (3.16) we get: $\begin{equation*} \mathrm{P}(100 \leq y \leq 999) = \frac{999}{9990} - \frac{99}{9990} = \frac{900}{9990} \approx 0.09. \end{equation*}$
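The result can be double-checked numerically. The sketch below computes the probability once via the CDF and once by simply counting the three-digit serial numbers; the variable and function names are chosen for this illustration only.

```python
from fractions import Fraction

a, b = 1, 9990  # lowest and highest serial numbers
k = b - a + 1   # number of equally likely serial numbers


def cdf(y: int) -> Fraction:
    """P(serial number <= y), clipped to the range [0, 1]."""
    return Fraction(min(max(y - a + 1, 0), k), k)


# Approach 1: difference of two CDF values
p_cdf = cdf(999) - cdf(99)

# Approach 2: count the serial numbers that have exactly three digits
three_digit = sum(1 for s in range(a, b + 1) if len(str(s)) == 3)
p_count = Fraction(three_digit, k)

print(p_cdf, p_count, round(float(p_cdf), 4))
```

Both approaches should agree, because the CDF difference is just a compact way of counting the favourable outcomes and dividing by $$k$$.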

Remark. Similarly to how the Binomial distribution is a generalisation of the Bernoulli, there is a distribution describing multiple dice rolls. It is called the Multinomial distribution. While we do not discuss it here, we note that it is used, for example, to model respondents' choices in surveys, when the variable of interest is measured on a categorical scale and the probabilities of the different options are not equal.