# 15 Independence, Covariance and Order Statistics

## 15.1 Independent Random Variables
The random variables \(X\) and \(Y\) are said to be independent if, for all sets \(A\) and \(B\), \[P\{X\in A, Y \in B\} = P\{X\in A\}P\{Y\in B\}\] Equivalently, \(X\) and \(Y\) are independent if, for all \(a\) and \(b\), the events \(E_a = \{X \le a\}\) and \(F_b = \{Y \le b\}\) are independent.
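This factorization is easy to check by simulation. Below is a minimal sketch, assuming NumPy is available; the two fair dice and the particular \((a, b)\) pairs are illustrative choices, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independent fair dice: X is the first roll, Y is the second.
X = rng.integers(1, 7, size=n)
Y = rng.integers(1, 7, size=n)

# Check P{X <= a, Y <= b} = P{X <= a} P{Y <= b} for a few (a, b) pairs.
for a, b in [(2, 5), (3, 3), (6, 1)]:
    joint = np.mean((X <= a) & (Y <= b))
    product = np.mean(X <= a) * np.mean(Y <= b)
    print(f"a={a}, b={b}: joint={joint:.4f}, product={product:.4f}")
```

The two columns agree up to sampling error, as independence requires.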
In terms of the joint distribution function \(F(x,y)\), \(X\) and \(Y\) are independent if, for all \(x\) and \(y\), \[F(x,y) = F_X(x)F_Y(y)\] When \(X\) and \(Y\) are discrete, the condition of independence reduces to \[p(x,y) = p_X(x)p_Y(y)\] while if \(X\) and \(Y\) are jointly continuous, independence reduces to \[f(x,y) = f_X(x)f_Y(y)\]

Proposition: If \(X\) and \(Y\) are independent, then for any functions \(g\) and \(h\): \[\text E[g(X)\times h(Y)] = \text E[g(X)]\times \text E[h(Y)]\]

## 15.2 Covariance

The covariance of any two random variables \(X\) and \(Y\), denoted by \(\text{Cov}(X,Y)\), is \[\begin{align} \text{Cov}(X,Y) &= \text E[(X-\text E[X])(Y-\text E[Y])]\\ &=\text E[XY] - \text E[X]\text E[Y]\end{align}\] Note that if \(X\) and \(Y\) are independent, then taking \(g\) and \(h\) to be the identity in the preceding proposition gives \(\text E[XY] = \text E[X]\text E[Y]\), so \(\text{Cov}(X,Y) = 0\). The converse does not hold in general.
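Both the product rule and the vanishing covariance of independent variables can be illustrated numerically. The sketch below assumes NumPy; the choices \(X \sim N(0,1)\), \(Y \sim \text{Exp}(1)\), \(g(x) = x^2\), and \(h(y) = \cos y\) are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Independent X and Y, so the product rule and Cov(X, Y) = 0 should hold.
X = rng.normal(size=n)
Y = rng.exponential(size=n)

g = np.square  # g(x) = x^2
h = np.cos     # h(y) = cos(y)

lhs = np.mean(g(X) * h(Y))           # E[g(X) h(Y)]
rhs = np.mean(g(X)) * np.mean(h(Y))  # E[g(X)] E[h(Y)]
cov = np.mean(X * Y) - np.mean(X) * np.mean(Y)  # E[XY] - E[X] E[Y]

print(f"E[g(X)h(Y)] = {lhs:.4f}, E[g(X)]E[h(Y)] = {rhs:.4f}")
print(f"Cov(X, Y) = {cov:.4f}  (near 0, as expected)")
```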
We also define the Pearson correlation coefficient (provided \(\sigma_X\) and \(\sigma_Y\) are positive) \[\rho_{X,Y} = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}\]
Properties:

* \(|\text{Cov}(X, Y)| \le \sigma_X\sigma_Y\), hence \(\rho_{X,Y} \in [-1,1]\)
* \(\text{Cov}(X,Y) = \text{Cov}(Y,X)\)
* \(\text{Cov}(X,X) = \text{Var}(X)\)
* \(\text{Cov}(cX,Y)=c\times \text{Cov}(X,Y)\)
* \(\text{Cov}\big(\sum_i X_i, \sum_j Y_j\big) = \sum_i \sum_j \text{Cov}(X_i,Y_j)\)
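The bound \(|\text{Cov}(X,Y)| \le \sigma_X \sigma_Y\), and hence \(\rho_{X,Y} \in [-1,1]\), can be checked on a dependent pair. Here is a small sketch, again assuming NumPy, with \(Y = 2X + \text{noise}\) as an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

# A dependent pair: Y = 2X + noise, so X and Y are strongly correlated.
X = rng.normal(size=n)
Y = 2.0 * X + rng.normal(scale=0.5, size=n)

cov = np.mean(X * Y) - np.mean(X) * np.mean(Y)
rho = cov / (np.std(X) * np.std(Y))

print(f"Cov(X, Y)         = {cov:.4f}")
print(f"sigma_X * sigma_Y = {np.std(X) * np.std(Y):.4f}  (>= |Cov|)")
print(f"rho               = {rho:.4f}  (must lie in [-1, 1])")
```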
Proposition: The variance of a sum of random variables is \[\text{Var}\bigg(\sum_{i=1}^n X_i\bigg) = \sum_{i=1}^n \text{Var}(X_i) + 2 \sum_{i=1}^n\sum_{j < i} \text{Cov}(X_i,X_j)\]
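The decomposition can be verified against sample covariances. A minimal sketch, NumPy assumed; the shared-component construction is just one way to make the \(X_i\) correlated:

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_vars = 500_000, 4

# Make the X_i correlated by mixing a shared component into independent noise.
shared = rng.normal(size=(n_samples, 1))
X = rng.normal(size=(n_samples, n_vars)) + shared  # columns are X_1, ..., X_4

total = X.sum(axis=1)
C = np.cov(X, rowvar=False)  # sample covariance matrix of the X_i

lhs = total.var(ddof=1)                      # Var(sum_i X_i)
rhs = C.trace() + 2 * np.triu(C, k=1).sum()  # sum_i Var(X_i) + 2 sum_{j<i} Cov(X_i, X_j)
print(f"Var(sum) = {lhs:.4f}, decomposition = {rhs:.4f}")
```

With sample moments the identity holds exactly (both sides are the same bilinear form), so the two printed numbers agree to floating-point precision.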
If the \(X_i\) are independent random variables, then the equation reduces to \[\text{Var}\bigg(\sum_{i=1}^n X_i\bigg) = \sum_{i=1}^n \text{Var}(X_i)\]

## 15.3 Order Statistics

Let \(X_1, \ldots, X_n\) be independent and identically distributed continuous random variables with distribution function \(F\) and density function \(F' = f\). If we let \(X_{(i)}\) denote the \(i\)-th smallest of these random variables, then \(X_{(1)}, \ldots, X_{(n)}\) are called the order statistics.
To obtain the distribution of \(X_{(i)}\), note that \(X_{(i)}\) is less than or equal to \(x\) if and only if at least \(i\) of the \(n\) random variables \(X_1, \ldots, X_n\) are less than or equal to \(x\). Hence, writing \(S(x) = 1 - F(x)\) for the survival function, \[P\{X_{(i)} \le x\} = \sum_{k=i}^n {n\choose k} (F(x))^k (S(x))^{n-k}\] Differentiation yields the density function of \(X_{(i)}\): \[f_{X_{(i)}}(x) = \frac{n!}{(n-i)!(i-1)!}f(x)(F(x))^{i-1}(S(x))^{n-i}\]
Alternatively, the probability density that every member of a specified set of \(i-1\) of the \(X_j\) is less than \(x\), every member of another specified set of \(n-i\) is greater than \(x\), and the remaining one is equal to \(x\) is \((F(x))^{i-1}(1 - F(x))^{n-i} f(x)\).
Therefore, since there are \(n!/[(i - 1)!(n - i)!]\) different partitions of the \(n\) random variables into the three groups, we obtain the preceding density function.
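For \(F\) uniform on \((0,1)\) these formulas are easy to test. The sketch below, NumPy assumed, compares the empirical distribution of \(X_{(i)}\) with the binomial sum above; \(n = 7\) and \(i = 3\) are arbitrary:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(4)
n, i = 7, 3        # sample size and which order statistic
reps = 200_000

# Empirical draws of the i-th smallest of n Uniform(0, 1) variables.
samples = np.sort(rng.uniform(size=(reps, n)), axis=1)[:, i - 1]

def order_stat_cdf(x, n, i, F):
    """P{X_(i) <= x} = sum_{k=i}^{n} C(n, k) F(x)^k (1 - F(x))^(n - k)."""
    Fx = F(x)
    return sum(comb(n, k) * Fx**k * (1 - Fx) ** (n - k) for k in range(i, n + 1))

for x in (0.2, 0.5, 0.8):
    emp = np.mean(samples <= x)
    thy = order_stat_cdf(x, n, i, F=lambda t: t)  # F(x) = x on (0, 1)
    print(f"x={x}: empirical={emp:.4f}, formula={thy:.4f}")
```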
Proposition: For independent and identically distributed random variables \(X_1, \ldots, X_n\),

* \(S_{X_{(1)}}(x) = [S(x)]^n\), since the minimum exceeds \(x\) if and only if every \(X_j\) does;
* \(F_{X_{(n)}}(x) = [F(x)]^n\), since the maximum is at most \(x\) if and only if every \(X_j\) is.
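A last simulation sketch, assuming NumPy; the choice of \(n = 5\) i.i.d. \(\text{Exp}(1)\) variables and the test point \(x = 0.7\) are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
reps, n, lam = 200_000, 5, 1.0

# n i.i.d. Exponential(lam) draws per row; take row-wise min and max.
X = rng.exponential(scale=1 / lam, size=(reps, n))
mins, maxs = X.min(axis=1), X.max(axis=1)

x = 0.7
S = np.exp(-lam * x)  # survival function S(x) = e^(-lam x)
F = 1 - S             # distribution function F(x)

print(f"P(X_(1) > x)  = {np.mean(mins > x):.4f}, S(x)^n = {S**n:.4f}")
print(f"P(X_(n) <= x) = {np.mean(maxs <= x):.4f}, F(x)^n = {F**n:.4f}")
```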