24  Normal and Lognormal Distribution

24.1 Normal Distribution

A normal random variable \(X\) with parameters \(\mu\) and \(\sigma^2\) has a pdf \[f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}\qquad -\infty < x < \infty\] We denote \(X \sim N(\mu, \sigma^2)\). The normal distribution is probably the most important distribution because of a result known as the central limit theorem.

To prove that the given \(f(x)\) is indeed a pdf we must show that the area under the normal curve is 1. That is, \[\int_{-\infty}^\infty\frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)} dx= 1\] This is a well-known result from the Gaussian integral.
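The normalization can also be checked numerically. The following is a minimal sketch, not a proof: it applies the trapezoidal rule to the density over a window of ten standard deviations on either side of the mean. The parameter values and the helper names `normal_pdf` and `trapezoid_area` are illustrative assumptions, not part of the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def trapezoid_area(mu, sigma, width=10.0, steps=20_000):
    """Trapezoidal estimate of the area over [mu - width*sigma, mu + width*sigma]."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h

# Illustrative parameters; any mu and any sigma > 0 should give an area close to 1.
area = trapezoid_area(mu=1.5, sigma=2.0)
print(area)
```

The mass outside the ten-sigma window is negligible at double precision, so truncating the integral does not visibly affect the result.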

The cdf of the normal distribution is defined as \[\Phi(x) = \int_{-\infty}^x\frac{1}{\sigma\sqrt{2\pi}} e^{-(t-\mu)^2/(2\sigma^2)} dt\]

Properties:

* \(\text E[X] = \mu\)
* \(\text{Var}(X) = \sigma^2\)
* \(\Phi(\mu + x) = 1-\Phi(\mu - x)\), which reduces to \(\Phi(x) = 1-\Phi(-x)\) when \(\mu = 0\)
* \(\varphi_X(t) = \exp(i\mu t - \sigma^2 t^2 / 2)\)
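The cdf has no elementary closed form, but it can be expressed through the error function as \(\frac{1}{2}\big[1 + \text{erf}\big(\frac{x-\mu}{\sigma\sqrt 2}\big)\big]\). The sketch below evaluates it this way and checks the symmetry property in the zero-mean case; the helper name `normal_cdf` and the numeric values are illustrative assumptions, and `statistics.NormalDist` from the Python standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_cdf(x, mu=0.0, sigma=1.0):
    """cdf of N(mu, sigma^2) written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Symmetry about the mean, checked here for mu = 0 and sigma = 1.
z = 1.3
symmetry_gap = normal_cdf(z) - (1.0 - normal_cdf(-z))

# Cross-check against the standard library for a non-standard normal.
library_gap = normal_cdf(2.0, mu=1.0, sigma=3.0) - NormalDist(mu=1.0, sigma=3.0).cdf(2.0)

print(symmetry_gap, library_gap)
```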

The standard normal distribution is defined as \(N(0,1)\), the normal distribution with mean 0 and variance 1.

Proposition:
If \(X\) follows a normal distribution with parameters \((\mu, \sigma^2)\) then \(Y = aX + b\), with \(a \neq 0\), follows a normal distribution with parameters \((a\mu + b, a^2\sigma^2)\).

With this, we can standardize every normal random variable \(X \sim N(\mu, \sigma^2)\) through the change of variable \(Z = \frac{X-\mu}{\sigma}\), which corresponds to \(a = 1/\sigma\) and \(b = -\mu/\sigma\), so that \(Z \sim N(0,1)\).
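As a worked illustration of this change of variable, the sketch below computes an interval probability for a normal variable twice: once directly, and once by standardizing the endpoints and using the standard normal cdf. The parameters \(\mu = 100\), \(\sigma = 15\) and the interval are made-up values for illustration.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0      # illustrative parameters, not from the text
X = NormalDist(mu, sigma)    # X ~ N(mu, sigma^2)
Z = NormalDist()             # standard normal N(0, 1)

def standardize(x):
    """Map a value of X to the corresponding value of Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# P(85 <= X <= 130) computed directly and through the standardized endpoints.
direct = X.cdf(130.0) - X.cdf(85.0)
via_z = Z.cdf(standardize(130.0)) - Z.cdf(standardize(85.0))

print(direct, via_z)
```

The standardized endpoints are \(-1\) and \(2\), so both computations give the standard normal probability of the interval \([-1, 2]\), roughly 0.8186.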

Proposition: Let \(X_i \sim N(\mu_i, \sigma_i^2)\), \(i = 1, \ldots, m\), be independent normal random variables. Then \[X_1 + \cdots + X_m \sim N\bigg(\sum_{i=1}^m \mu_i, \sum_{i=1}^m \sigma_i^2\bigg)\]

### Normal Approximation to the Binomial Distribution

If \(n\) is large enough, then the skew of the binomial distribution \(B(n,p)\) is not too great. In this case a reasonable approximation of \(B(n,p)\) is given by the normal distribution \(N(\mu = np, \sigma^2 = np(1-p))\). A commonly used rule is to apply the approximation when \(np > 5\) and \(n(1-p) > 5\). This basic approximation can be improved in a simple way by using a suitable continuity correction.
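To give a feel for the quality of the approximation, the sketch below compares an exact binomial interval probability with the normal approximation, both without and with a continuity correction that widens the interval by 0.5 on each side. The values \(n = 40\), \(p = 0.3\) and the interval \([10, 15]\) are illustrative choices that satisfy the rule of thumb.

```python
import math
from statistics import NormalDist

n, p = 40, 0.3          # illustrative values; np = 12 and n(1-p) = 28 both exceed 5
a, b = 10, 15           # integer interval of interest

def binomial_pmf(k):
    """Exact P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

exact = sum(binomial_pmf(k) for k in range(a, b + 1))

# Normal approximation with matching mean and variance.
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1.0 - p)))
plain = approx.cdf(b) - approx.cdf(a)                  # no correction
corrected = approx.cdf(b + 0.5) - approx.cdf(a - 0.5)  # continuity correction

print(exact, plain, corrected)
```

For these values the corrected approximation lands much closer to the exact probability than the uncorrected one, because the uncorrected interval drops half a unit of probability mass at each endpoint.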

If \(X \sim B(n,p)\) and \(Y\) has the distribution given by the normal approximation, then for integers \(a \le b\), \(P(a \le X \le b)\) is approximated by \(P(a-0.5 \le Y \le b+0.5)\).

### Normal Approximation to the Poisson Distribution

For sufficiently large values of \(\lambda\) (say \(\lambda > 1000\)), the normal distribution \(N(\mu=\lambda, \sigma^2=\lambda)\) is an excellent approximation to the Poisson distribution. If \(\lambda > 10\), then the normal distribution is a good approximation if an appropriate continuity correction is performed.
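The same comparison can be made for the Poisson case. The sketch below, under the assumed illustrative values \(\lambda = 25\) and threshold 30, computes the exact Poisson cdf by summing the pmf and compares it with the normal approximation with and without a continuity correction of 0.5. The helper name `poisson_cdf` is hypothetical.

```python
import math
from statistics import NormalDist

lam = 25.0     # illustrative rate, above the lambda > 10 guideline
b = 30         # we approximate P(X <= b)

def poisson_cdf(k_max, rate):
    """Exact P(X <= k_max) for X ~ Poisson(rate), summing the pmf term by term."""
    term = math.exp(-rate)        # P(X = 0)
    total = term
    for k in range(1, k_max + 1):
        term *= rate / k          # P(X = k) from P(X = k - 1)
        total += term
    return total

exact = poisson_cdf(b, lam)

approx = NormalDist(mu=lam, sigma=math.sqrt(lam))
plain = approx.cdf(b)             # no correction
corrected = approx.cdf(b + 0.5)   # continuity correction

print(exact, plain, corrected)
```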

24.2 Multivariate Normal Distribution

One definition is that a random vector is said to be \(k\)-variate normally distributed if every linear combination of its \(k\) components has a univariate normal distribution.

If a \(k\)-dimensional random vector \(\mathbf X = (X_1, \ldots, X_k)^T\) follows a multivariate normal distribution, then we denote \(\mathbf X\sim N_k(\boldsymbol \mu,\boldsymbol \Sigma)\), where:

* \(\boldsymbol \mu = \text E[\mathbf X] = (\text E[X_1], \ldots, \text E[X_k])^T\)
* \(\boldsymbol \Sigma_{ij} = \text{Cov}(X_i, X_j)\)
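To make the mean vector, the covariance matrix, and the linear-combination definition concrete, the sketch below builds a bivariate normal vector as a linear map of independent standard normals, \(\mathbf X = \mathbf{AZ} + \boldsymbol\mu\), for which \(\boldsymbol\Sigma = \mathbf{AA}^T\). The matrix `A`, the vector `mu`, the weights `c`, the seed, and the sample size are all illustrative assumptions. A linear combination \(\mathbf c^T \mathbf X\) should then have mean \(\mathbf c^T\boldsymbol\mu\) and variance \(\mathbf c^T \boldsymbol\Sigma \mathbf c\), which the simulation checks approximately.

```python
import random

A = [[2.0, 0.0],
     [1.0, 1.5]]          # illustrative mixing matrix
mu = [1.0, -2.0]          # illustrative mean vector
c = [3.0, -1.0]           # weights of the linear combination c^T X

def covariance_from_map(a):
    """Return A A^T, the covariance matrix of A Z for independent standard normals Z."""
    rows, cols = len(a), len(a[0])
    return [[sum(a[i][k] * a[j][k] for k in range(cols)) for j in range(rows)]
            for i in range(rows)]

sigma = covariance_from_map(A)
combo_mean = sum(ci * mi for ci, mi in zip(c, mu))
combo_var = sum(c[i] * sigma[i][j] * c[j] for i in range(2) for j in range(2))

# Monte Carlo check with a fixed seed so the run is reproducible.
rng = random.Random(0)
samples = []
for _ in range(100_000):
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    x = [sum(A[i][k] * z[k] for k in range(2)) + mu[i] for i in range(2)]
    samples.append(sum(ci * xi for ci, xi in zip(c, x)))

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)

print(sigma, combo_mean, combo_var, sample_mean, sample_var)
```

The simulated mean and variance only agree with the theoretical values up to sampling error, so the comparison is a sanity check rather than a verification.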

Proposition: Let \(Z_1, \ldots, Z_n\) be a set of \(n\) independent standard normal random variables. The random variables \(X_1, \ldots, X_m\) have a multivariate normal distribution if and only if, for some constants \(a_{ij}\), \(1\le i \le m\), \(1\le j \le n\), and \(\mu_i\), \(1 \le i \le m\), \[\begin{aligned}X_1 &= a_{11}Z_1 + a_{12} Z_2 + \cdots + a_{1n} Z_n + \mu_1 \\ X_2 &= a_{21}Z_1 + a_{22} Z_2 + \cdots + a_{2n} Z_n + \mu_2 \\ &\hskip0.5em \vdots \\ X_m &= a_{m1}Z_1 + a_{m2} Z_2 + \cdots + a_{mn} Z_n + \mu_m\end{aligned}\]

In other words, \(\mathbf X = \mathbf{AZ} +\boldsymbol \mu\) with \(\mathbf A = (a_{ij})\). The covariance matrix then equals \(\boldsymbol \Sigma = \mathbf{AA}^T\). We can prove this by showing that the characteristic function of the multivariate normal vector is \[\varphi_{\mathbf{X}}(\mathbf{t}) = \exp\bigg(i\boldsymbol{\mu}^T\mathbf{t} - \frac{1}{2}\mathbf t^T \boldsymbol\Sigma \mathbf t \bigg)\]

When \(\boldsymbol\Sigma\) is positive definite, the pdf of the \(k\)-variate normal distribution \(N_k(\boldsymbol\mu, \boldsymbol\Sigma)\) is \[f_{\mathbf X}(x_1, \ldots, x_k) = \frac{\exp\big(-\frac{1}{2}(\mathbf x-\boldsymbol\mu)^T \boldsymbol\Sigma^{-1} (\mathbf x-\boldsymbol\mu) \big)}{\sqrt{(2\pi)^k|\boldsymbol\Sigma|}}\]

## Log-normal Distribution

A random variable \(X\) is said to follow a log-normal distribution with parameters \(\mu\) and \(\sigma^2\) if \(\ln X \sim N(\mu,\sigma^2)\). We denote \(X\sim \text{Lognormal}(\mu, \sigma^2)\). The parameters are the expected value and variance of the natural log of the random variable, not of the random variable itself.

Its cumulative distribution function is, for \(x > 0\), \[F_X(x) = \Phi\bigg(\frac{\ln x - \mu}{\sigma}\bigg)\] where \(\Phi\) here denotes the cdf of the standard normal distribution, and its pdf is \[f_X(x) = \frac{1}{x\sigma \sqrt{2\pi}} \exp \bigg(-\frac{(\ln x -\mu)^2}{2\sigma^2}\bigg)\]

Properties:

* \(\text E[X] = \exp\big(\mu + \frac{\sigma^2}{2}\big)\)
* \(\text{Var}(X) = (e^{\sigma^2} - 1)e^{2\mu+\sigma^2}\)
* \(\text E[X^k] = \exp(k\mu+ k^2\sigma^2/2)\)
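The moment formulas can be checked numerically by writing \(\text E[X^k] = \text E[e^{kY}]\) with \(Y = \ln X \sim N(\mu, \sigma^2)\) and integrating against the normal density. The sketch below does this with the trapezoidal rule; the parameters \(\mu = 0.5\), \(\sigma = 0.4\) and all function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

mu, sigma = 0.5, 0.4     # illustrative parameters of ln X

def lognormal_cdf(x):
    """F_X(x) for x > 0, using the standard normal cdf of the standardized log."""
    return NormalDist().cdf((math.log(x) - mu) / sigma)

def raw_moment_closed(k):
    """Closed-form E[X^k] = exp(k*mu + k^2*sigma^2/2)."""
    return math.exp(k * mu + 0.5 * k * k * sigma * sigma)

def raw_moment_numeric(k, width=12.0, steps=40_000):
    """Trapezoidal estimate of E[exp(k*Y)] for Y ~ N(mu, sigma^2)."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / steps

    def integrand(y):
        z = (y - mu) / sigma
        return math.exp(k * y - 0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h

mean_closed = raw_moment_closed(1)
var_closed = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
var_from_moments = raw_moment_closed(2) - mean_closed ** 2

print(mean_closed, raw_moment_numeric(1), var_closed, var_from_moments)
```

Evaluating `lognormal_cdf(math.exp(mu))` gives one half, which reflects that \(e^\mu\) is the median of \(X\), whereas the mean \(e^{\mu + \sigma^2/2}\) is strictly larger.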