18 Binomial and Multinomial Distribution
18.1 Bernoulli and Categorical Distribution
Suppose that a trial, or an experiment, whose outcome can be classified as either a “success” or a “failure” is performed. If we let \(X = 1\) if the outcome is a success and \(X = 0\) if it is a failure, then the probability mass function of \(X\) is given by \[\begin{align}p(0) = P\{X=0\} &= 1-p \\ p(1) = P\{X=1\} &= p\end{align}\] where \(p\in [0,1]\) is the probability that the trial is a “success”. The random variable \(X\) is called a Bernoulli random variable.
Properties:

* \(\text E[X] = p\)
* \(\text{Var}(X) = p(1-p)\)
* \(G_X(z) = (1-p)+pz\)
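A minimal simulation sketch checking the mean and variance formulas; the success probability, seed, and trial count are illustrative choices, not from the text:

```python
import random

# Estimate E[X] and Var(X) for a Bernoulli(p) variable by simulation
# and compare with the closed forms p and p(1-p).
p = 0.3                 # illustrative choice
random.seed(0)
n_trials = 100_000      # illustrative choice

samples = [1 if random.random() < p else 0 for _ in range(n_trials)]

mean = sum(samples) / n_trials
var = sum((x - mean) ** 2 for x in samples) / n_trials

print(mean)  # close to p = 0.3
print(var)   # close to p*(1-p) = 0.21
```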
A generalized Bernoulli random variable, usually called a multinoulli or categorical random variable, has a discrete probability distribution in which the result of the trial takes on one of several categories, with the probability of each category specified separately.
We assume that the support of the random variable is \(\{1,2,...,k\}\), and corresponding to those \(k\) categories, we assign a probability \(p_i\) such that \[p(i) = P\{X = i\} = p_i \quad \text{and} \quad \sum_{i=1}^k p_i = 1\] We see that the categorical distribution has pgf \[G_X(z) = p_1z + p_2z^2 + ... + p_kz^k\]
A special case of the categorical distribution is the discrete uniform distribution, where \(p_i = \frac{1}{k}\) for all \(i\).
Properties (discrete uniform):

* \(\text E[X] = \frac{k+1}{2}\)
* \(\text{Var}(X) = \frac{k^2-1}{12}\)
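These two formulas can be verified by direct summation over the support; the choice \(k = 6\) (a fair die) is illustrative:

```python
# Verify E[X] = (k+1)/2 and Var(X) = (k^2-1)/12 for the discrete
# uniform distribution on {1, ..., k} by summing over the support.
k = 6  # illustrative choice: a fair die
support = range(1, k + 1)

mean = sum(i / k for i in support)
var = sum((i - mean) ** 2 / k for i in support)

print(mean)  # (k+1)/2 = 3.5
print(var)   # (k^2-1)/12 = 35/12 ≈ 2.9167
```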
To shift the support to the range \(a,a+1,...,b\), one can add or subtract a constant from \(X\) to get the desired distribution.

18.2 Binomial Distribution

Suppose that \(n\) independent trials, each of which results in a “success” with probability \(p\) and in a “failure” with probability \(1 − p\), are to be performed. If \(X\) represents the number of successes that occur in the \(n\) trials, then \(X\) is said to be a binomial random variable with parameters \((n, p)\).
We usually denote \(X \sim B(n,p)\) if \(X\) is a binomial random variable with parameters \((n, p)\). Its pmf is: \[p(k) = {n \choose k}p^k (1-p)^{n-k}, \quad k = 0, 1, ..., n\]
Properties:

* \(\text E[X] = np\)
* \(\text{Var}(X) = np(1-p)\)
* \(\varphi_X(t) = ((1-p) + pe^{it})^n\)
* \(G_X(z) = ((1-p)+pz)^n\)
Proposition: Let \(X_i \sim B(n_i, p)\) be independent binomial random variables. Then \[X_1 + ... + X_m \sim B\bigg(\sum_{i=1}^m n_i, p\bigg)\] The Bernoulli random variable is the special case of the binomial random variable where \(n=1\).
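The proposition can be checked numerically for fixed parameters by convolving two binomial pmfs and comparing against \(B(n_1+n_2, p)\); the parameter values below are illustrative:

```python
from math import comb

# Check that the convolution of the B(n1, p) and B(n2, p) pmfs
# equals the B(n1 + n2, p) pmf, term by term.
def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n1, n2, p = 3, 5, 0.4  # illustrative choices

for k in range(n1 + n2 + 1):
    # P{X1 + X2 = k} = sum over j of P{X1 = j} P{X2 = k - j}
    conv = sum(binom_pmf(n1, p, j) * binom_pmf(n2, p, k - j)
               for j in range(max(0, k - n2), min(n1, k) + 1))
    assert abs(conv - binom_pmf(n1 + n2, p, k)) < 1e-12

print("convolution matches B(n1 + n2, p)")
```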
Example: Four fair coins are flipped. If the outcomes are assumed independent, what is the probability that two heads and two tails are obtained?
Solution: Letting \(X\) equal the number of heads that appear, then \(X \sim B(n=4, p=0.5)\). Hence, \[P\{X = 2\} = {4 \choose 2}(0.5)^2 (0.5)^2 = \frac{3}{8}\]
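The calculation above can be reproduced directly from the binomial pmf with Python's standard library:

```python
from math import comb

# P{X = 2} for X ~ B(4, 0.5): choose which 2 of the 4 flips are heads.
p_two_heads = comb(4, 2) * 0.5**2 * 0.5**2
print(p_two_heads)  # 0.375 = 3/8
```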
Example: Suppose that an airplane engine will fail, when in flight, with probability \(1 − p\) independently from engine to engine; suppose that the airplane will make a successful flight if at least \(50\) percent of its engines remain operative. For what values of \(p\) is a four-engine plane preferable to a two-engine plane?
Solution: Because each engine is assumed to fail or function independently of what happens with the other engines, it follows that the number of engines remaining operative is a binomial random variable.
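The comparison worked out algebraically below can also be checked numerically; this sketch computes both success probabilities from the binomial pmf (the sample values of \(p\) are illustrative):

```python
from math import comb

# Success probability of a plane with n_engines engines, each staying
# operative with probability p: flight succeeds when at least half
# of the engines remain operative.
def p_success(n_engines, p):
    need = -(-n_engines // 2)  # ceil(n_engines / 2)
    return sum(comb(n_engines, k) * p**k * (1 - p) ** (n_engines - k)
               for k in range(need, n_engines + 1))

for p in (0.5, 0.75, 0.9):
    print(p, p_success(4, p) >= p_success(2, p))
# prints False for p = 0.5 and True for p = 0.75 and 0.9,
# consistent with a crossover at p = 2/3
```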
The probability that a four-engine plane makes a successful flight is: \[P\{X \ge 2\} = \sum_{k=2}^4 {4\choose k} p^k (1-p)^{4-k} = 6p^2(1 − p)^2 + 4p^3(1 − p) + p^4\] The probability that a two-engine plane makes a successful flight is: \[P\{X \ge 1\} = \sum_{k=1}^2 {2\choose k} p^k (1-p)^{2-k} = 2p(1−p)+p^2\] Hence the four-engine plane is safer if \[\begin{align} 6p^2(1 − p)^2 + 4p^3(1 − p) + p^4 &\ge 2p(1−p)+p^2 \\ \implies (p−1)^2(3p−2)& \ge 0 \implies p\ge 2/3\end{align}\] where the second line follows by dividing both sides by \(p > 0\), rearranging, and factoring the resulting cubic \(3p^3 - 8p^2 + 7p - 2\).

18.3 Multinomial Distribution

A generalization of the binomial distribution is the multinomial distribution. For \(n\) independent trials, each of which leads to a success for exactly one of \(k\) categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.
We usually denote \(X \sim \text{Multinomial}(n,p_1, ..., p_k)\) if \(X\) is a multinomial random variable. Its pmf is: \[p(x_1,..., x_k) = P\{X_1=x_1, ..., X_k=x_k\} = \frac{n!}{x_1! \cdots x_k!}p_1^{x_1} \cdots p_k^{x_k}\] where \(X_i\) models the number of times that category \(i\) occurs in the \(n\) trials. The categorical distribution is the special case of the multinomial distribution for \(n = 1\).
Properties:

* \(\text E[X_i] = np_i\)
* \(\text{Var}(X_i) = np_i(1-p_i)\)
* \(\text{Cov}(X_i, X_j) = -np_ip_j\) for \(i \ne j\)
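The multinomial pmf can be evaluated directly from the formula; the fair-die example below (each face appearing exactly once in six rolls) is an illustrative choice:

```python
from math import factorial

# Multinomial pmf: n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k
def multinomial_pmf(counts, probs):
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)  # each partial quotient is an integer
    prod = 1.0
    for x, p_i in zip(counts, probs):
        prod *= p_i**x
    return coef * prod

# Probability of seeing each face of a fair die exactly once in 6 rolls.
p = multinomial_pmf([1, 1, 1, 1, 1, 1], [1/6] * 6)
print(p)  # 6!/6^6 ≈ 0.01543
```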