6  Introduction to Measures

The theory of measure, developed by Émile Borel and Henri Lebesgue, starts by extending the notion of “length” to general subsets of the real line, beyond the naive definition of the length of an interval: \(\lambda((a, b]) := b-a\).

We also consider the extended real number system \(\overline{\mathbb R} = \mathbb R \cup \{\pm \infty\}\). A few extensions follow:

6.1 Rings, algebras and \(\sigma\)-algebras

Definition 6.1 Let \(\mathcal C\) be a collection of subsets of a set \(X\) with \(\varnothing \in \mathcal C\). A function \(\mu: \mathcal C \to \overline{\mathbb R}\) is said to be finitely additive iff:

  • \(\mu(\varnothing) = 0\)
  • If \(A_i\) are disjoint, \(A_i \in \mathcal C\) for \(i = 1,...,n\), and \(A := \bigcup_{i=1}^n A_i \in \mathcal C\), we have \(\mu(A) = \sum_{i=1}^n \mu(A_i)\)

If, in addition, \(\mu(B) = \sum_{n \ge 1} \mu(A_n)\) whenever the sets \(A_n \in \mathcal C\), \(n = 1, 2, \dots\), are disjoint and \(B := \bigcup_{n \ge 1}A_n \in \mathcal C\), then \(\mu\) is called countably additive.

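Example 6.1 Let \(p \ne q\) in a set \(X\), and let \[ m(A)= \begin{cases} 1 & \text{if } p,q \in A \\ 0 & \text{otherwise} \end{cases} \] for \(A \subset X\). Then \(m\) is not additive on \(2^X\): the singletons \(\{p\}\) and \(\{q\}\) are disjoint with \(m(\{p\}) + m(\{q\}) = 0\), but \(m(\{p, q\}) = 1\).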
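The failure of additivity in Example 6.1 can be checked directly on two singletons. The following is a minimal Python sketch; the concrete set `X` and the labels `"p"`, `"q"`, `"r"` are illustrative assumptions, not part of the example.

```python
# Illustrative choice of X and of two distinct points p, q (assumptions).
X = frozenset({"p", "q", "r"})
p, q = "p", "q"

def m(A):
    """m(A) = 1 if A contains both p and q, and 0 otherwise."""
    return 1 if p in A and q in A else 0

A1, A2 = frozenset({p}), frozenset({q})  # two disjoint sets
union = A1 | A2

# Finite additivity would require m(A1 ∪ A2) == m(A1) + m(A2).
lhs = m(union)
rhs = m(A1) + m(A2)
print(lhs, rhs)
```

The two printed values differ, which is exactly the obstruction to additivity.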

Definition 6.2 Let \(X\) be a set.

  • A collection \(\mathcal A \subset 2^X\) is called a ring iff
    • \(\varnothing \in \mathcal A\), and
    • \(A, B \in \mathcal A \implies A \cup B \in \mathcal A\) and \(B \setminus A \in \mathcal A\)
  • A ring \(\mathcal A\) is called an algebra iff
    • \(X \in \mathcal A\)
  • An algebra \(\mathcal A\) is called a \(\sigma\)-algebra iff
    • For every sequence \(\{A_n\} \subset \mathcal A\), \(\bigcup_{n\ge 1}A_n \in \mathcal A\).
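For a finite collection of subsets of a finite set, the axioms of Definition 6.2 can be checked exhaustively. The sketch below is one way to do this in Python; the helper names `is_ring` and `is_algebra` and the sample collection are hypothetical, chosen only for illustration.

```python
def is_ring(collection):
    """Check the ring axioms for a finite collection of frozensets."""
    sets = set(collection)
    if frozenset() not in sets:
        return False
    # Closure under union A ∪ B and difference B \ A for every pair.
    return all((A | B) in sets and (B - A) in sets
               for A in sets for B in sets)

def is_algebra(collection, X):
    """A ring is an algebra iff it also contains the whole set X."""
    return is_ring(collection) and frozenset(X) in set(collection)

X = {1, 2, 3}
ring = [frozenset(s) for s in (set(), {1}, {2}, {1, 2})]
print(is_ring(ring), is_algebra(ring, X))
```

In a finite collection, any countable union reduces to a finite union of distinct members, so in this finite setting every algebra is automatically a \(\sigma\)-algebra; the distinction only appears for infinite collections.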

Example 6.2  

  • In any set \(X\), the collection of all finite sets is a ring, but it is not an algebra unless \(X\) is finite.
  • The collection of all finite sets and their complements is an algebra but not a \(\sigma\)-algebra, unless, again, \(X\) is finite.

Note that for \(A, B\) in a ring \(\mathcal R\), \(A \cap B = A \setminus (A \setminus B) \in \mathcal R\). For any set \(X\), \(2^X\) is a \(\sigma\)-algebra of subsets of \(X\). For any collection \(\mathcal C \subset 2^X\), there is a smallest (\(\sigma\)-)algebra including \(\mathcal C\), namely the intersection of all (\(\sigma\)-)algebras including \(\mathcal C\). This (\(\sigma\)-)algebra is said to be generated by \(\mathcal C\).
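When \(X\) is finite, the generated algebra can also be computed from below, by repeatedly adding unions and complements until nothing new appears. This gives the same collection as the intersection described above, because every set added must belong to any algebra including the generators. The following Python sketch assumes a finite `X`; the function name `generated_algebra` is hypothetical.

```python
def generated_algebra(generators, X):
    """Smallest algebra of subsets of the finite set X including the generators."""
    X = frozenset(X)
    algebra = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for A in list(algebra):
            for B in list(algebra):
                # Closing under unions and complements also gives
                # differences: B \ A = X \ ((X \ B) ∪ A).
                for C in (A | B, X - A):
                    if C not in algebra:
                        algebra.add(C)
                        changed = True
    return algebra

X = {1, 2, 3, 4}
alg = generated_algebra([{1}, {2}], X)
print(sorted(sorted(s) for s in alg))
```

Here the generators \(\{1\}\) and \(\{2\}\) cannot separate \(3\) from \(4\), so \(\{3, 4\}\) behaves as a single block and the result has \(2^3 = 8\) members.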

Example 6.3 If \(\mathcal A\) is the collection of all singletons \(\{x\}\) in \(X\),

  • The algebra generated by \(\mathcal A\) is the collection of all subsets \(A\) of \(X\) which are finite or have finite complement.
  • The \(\sigma\)-algebra generated by \(\mathcal A\) is the collection of sets which are countable or have countable complement.

For a sequence of sets \(A_1, A_2, \dots\), the notation \(A_n \downarrow \varnothing\) means that \(A_n \supset A_{n+1}\) for all \(n\) and \(\bigcap_n A_n = \varnothing\). For an infinite interval, such as \([c, \infty)\) with finite \(c\), we set \(\lambda([c, \infty)) := \infty\). Then for \(A_n := [n, \infty)\), we have \(A_n \downarrow \varnothing\) but \(\lambda(A_n) = +\infty\) for all \(n\), so \(\lambda(A_n)\) does not converge to \(0\). The following theorem gives a criterion for a real-valued function to be countably additive.

Theorem 6.1 Let \(\mu\) be a finitely additive, real-valued function on an algebra \(\mathcal A\). Then \(\mu\) is countably additive iff \(\mu(A_n) \to 0\) whenever \(A_n \downarrow \varnothing\) and \(A_n \in \mathcal A\) (\(\mu\) is then said to be “continuous at \(\varnothing\)”).

Suppose \(\mu\) is countably additive and \(A_n \downarrow \varnothing\) with \(A_n \in \mathcal A\). Then the sets \(A_n \setminus A_{n+1}\) are disjoint, and their union is \(A_1\). More generally, their union over \(n \ge m\) is \(A_m\) for each \(m\).

It follows that \(\sum_{n \ge m} \mu(A_n \setminus A_{n+1}) = \mu(A_m)\) for each \(m\). Since the series \(\sum_{n \ge 1} \mu(A_n \setminus A_{n+1})\) converges (to the real number \(\mu(A_1)\)), the tail sums over \(n \ge m\) must approach \(0\) as \(m \to \infty\). Hence \(\mu(A_m) \to 0\), so \(\mu\) is continuous at \(\varnothing\).

Conversely, suppose \(\mu\) is continuous at \(\varnothing\), and the sets \(B_j\) are disjoint and in \(\mathcal A\) with \(B := \bigcup_j B_j \in \mathcal A\). Let \(A_n := B \setminus \bigcup_{j < n} B_j\). Then \(A_n \in \mathcal A\) and \(A_n \downarrow \varnothing\), so \(\mu(A_n) \to 0\). By finite additivity, \[ \mu(B) = \mu(A_n) + \sum_{j < n}\mu(B_j) \quad \text{ for each } n \] Letting \(n\to \infty\) gives \(\mu(B) = \sum_j \mu(B_j)\).

6.2 Measures

Definition 6.3 A countably additive function \(\mu\) from a \(\sigma\)-algebra \(\mathcal S\) of subsets of \(X\) into \([0, \infty]\) is called a measure. Then \((X, \mathcal S, \mu)\) is called a measure space.

Example 6.4 For \(A \subset X\), let \(\mu(A)\) be the cardinality of \(A\) for \(A\) finite, \(+\infty\) for \(A\) infinite. Then \(\mu\) is always a measure, called the counting measure on \(X\).

While showing that length is countably additive, and can be extended to be countably additive on a large collection of sets (a \(\sigma\)-algebra), it will be useful to show at the same time that countable additivity also holds if the length \(b-a\) of the interval \((a, b]\) is replaced by \(G(b) - G(a)\) for a suitable function \(G\).

A function \(G: \mathbb R\to\mathbb R\) is called

  • Nondecreasing iff \(G(x) \le G(y)\) whenever \(x \le y\). Then for any \(x\), the limit \(G(x^+) := \lim_{y\downarrow x}G(y)\) exists.
  • Continuous from the right iff \(G(x^+) = G(x)\) for all \(x\)

Let \(G\) be a nondecreasing function which is continuous from the right. Define \(G(+\infty) := \lim_{x\uparrow +\infty} G(x)\), so that \(G(x) \uparrow G(+\infty)\) as \(x \uparrow +\infty\). Likewise define \(G(-\infty)\) so that \(G(x) \downarrow G(-\infty)\) as \(x \downarrow -\infty\).

Example 6.5 Let \[ G(x) = \begin{cases} 1 & \text{if } x \in (1,\infty) \\ 0 & \text{otherwise} \end{cases} \] Then \(G\) is nondecreasing, with \(G(1) = 0\) and \(G(1^+) = 1\). So \(G\) is not continuous from the right at \(1\), although it is continuous from the left.
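The one-sided behaviour of the step function in Example 6.5 can be probed numerically by sampling points that approach \(1\) from each side. This is only a sanity check, not a proof of the limits; the sampling scheme \(h = 2^{-k}\) is an arbitrary choice made for this sketch.

```python
def G(x):
    """Step function of Example 6.5: 1 on (1, ∞), 0 elsewhere."""
    return 1 if x > 1 else 0

# Approach x = 1 from the right and from the left along h = 2^-k.
right_values = [G(1 + 2.0**-k) for k in range(1, 40)]
left_values = [G(1 - 2.0**-k) for k in range(1, 40)]

# Value at 1, then the values seen from the right and from the left.
print(G(1), set(right_values), set(left_values))
```

The values from the right disagree with \(G(1)\), while the values from the left agree with it.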

Theorem 6.2 Let \(\mathcal C := \{(a, b]: -\infty < a \le b < +\infty\}\) and let \(G: \mathbb R \to \mathbb R\). Define \(\mu := \mu_G\) on \(\mathcal C\) by \(\mu((a, b]) := G(b) - G(a)\). If \(G\) is nondecreasing and continuous from the right, then \(\mu\) is countably additive on \(\mathcal C\).

Let us first show that \(\mu\) is finitely additive. Suppose \((a, b] = \bigcup_{1 \le i \le n}(a_i, b_i]\), where the \((a_i, b_i]\) are disjoint; we may assume the intervals are non-empty. Without changing the sum of the \(\mu((a_i, b_i])\), the intervals can be relabeled so that \(a_1 = a\), \(b_n = b\), and \(b_{j-1}=a_j\) for \(j = 2,...,n\). Then

\[ G(b) - G(a) = \sum_{1 \le j \le n} [G(b_j) - G(a_j)], \] so \(\mu\) is finitely additive on \(\mathcal C\).
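The telescoping argument can be checked exactly with rational arithmetic. In the sketch below, the particular \(G\) (the identity plus a unit jump at \(0\), which is nondecreasing and right-continuous) and the cut points are illustrative assumptions.

```python
from fractions import Fraction

def G(x):
    """Illustrative G (an assumption): x plus a unit jump at 0, right-continuous."""
    return x + (1 if x >= 0 else 0)

def mu(a, b):
    """mu((a, b]) := G(b) - G(a)."""
    return G(b) - G(a)

# Partition (-1, 2] into adjacent pieces (a_j, b_j] with b_{j-1} = a_j.
cuts = [Fraction(-1), Fraction(-1, 3), Fraction(0), Fraction(1, 2), Fraction(2)]
pieces = list(zip(cuts[:-1], cuts[1:]))

total = sum(mu(a, b) for a, b in pieces)
print(total, mu(cuts[0], cuts[-1]))
```

Each interior value \(G(a_j)\) appears once with each sign, so the sum over the pieces collapses to \(G(b) - G(a)\) regardless of where the jump of \(G\) sits.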

Next we show that \(\mu\) is finitely subadditive, i.e. if \(a \le b\) and \((a, b]\subset \bigcup_{1 \le j \le n} (c_j, d_j]\) with \(c_j < d_j\) for each \(j\), then \(G(b) - G(a) \le \sum_{1 \le j \le n} [G(d_j) - G(c_j)]\). We prove this by induction on \(n\). For \(n=1\) it is obvious.

In general, \(c_j < b \leq d_j\) for some \(j\) (if \(a = b\) there is no problem). By renumbering, let us take \(j = n\). If \(c_n \leq a\), we are done, since then \(G(b) - G(a) \le G(d_n) - G(c_n)\). If \(c_n > a\), then \((a, c_n]\) is covered by the remaining \(n - 1\) intervals, so by the induction hypothesis

\[ \begin{align*} G(b) - G(a) &= G(b) - G(c_n) + G(c_n) - G(a) \\ &\leq G(b) - G(c_n) + \sum_{j=1}^{n-1} [G(d_j) - G(c_j)] \\ &\leq \sum_{j=1}^{n} [G(d_j) - G(c_j)], \end{align*} \] as desired, where the last step uses \(G(b) \le G(d_n)\).
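Finite subadditivity can be illustrated on an overlapping cover. The sketch below takes \(G(x) = x\) and a specific cover of \((0, 1]\); both are assumptions made for illustration, and the helper `covered` is a hypothetical left-to-right sweep that checks the covering hypothesis.

```python
from fractions import Fraction as F

def G(x):
    """Illustrative nondecreasing, right-continuous G (an assumption): G(x) = x."""
    return x

def mu(a, b):
    """mu((a, b]) := G(b) - G(a)."""
    return G(b) - G(a)

def covered(a, b, intervals):
    """Check (a, b] ⊂ ⋃ (c_j, d_j] by sweeping from a towards b."""
    reach = a  # (a, reach] is known to be covered
    progressed = True
    while reach < b and progressed:
        progressed = False
        for c, d in intervals:
            if c <= reach < d:  # (c, d] extends the covered part up to d
                reach, progressed = d, True
    return reach >= b

a, b = F(0), F(1)
cover = [(F(-1, 4), F(1, 2)), (F(1, 3), F(3, 4)), (F(2, 3), F(5, 4))]

total = sum(mu(c, d) for c, d in cover)
print(covered(a, b, cover), mu(a, b), total)
```

The overlaps and the overhang beyond \((0, 1]\) are counted in the sum on the right, which is why the inequality is generally strict for such covers.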

Now for countable additivity, suppose an interval \(J := (c, d]\) is a union of countably many disjoint intervals \(J_i := (c_i, d_i]\). For each finite \(n\), \(J\) is a union of the \(J_i\) for \(i = 1, \dots, n\), and finitely many other left open, right closed intervals, disjoint from each other and from the \(J_i\). Specifically, relabel the intervals \((c_i, d_i], i = 1, \dots, n\), as \((a_i, b_i], i = 1, \dots, n\), where \(c \leq a_1 < b_1 \leq a_2 < \dots \leq a_n < b_n \leq d\). Then the “other” intervals are \((c, a_1], (b_j, a_{j+1}]\) for \(j = 1, \dots, n - 1\), and \((b_n, d]\). By finite additivity, for all \(n\),

\[ \mu(J) \geq \sum_{i=1}^{n} \mu(J_i), \quad \text{so} \quad \mu(J) \geq \sum_{i=1}^{\infty} \mu(J_i). \]

For the converse inequality, let \(\varepsilon > 0\); we may assume \(c < d\). For each \(n\), using right continuity of \(G\), there is a \(\delta_n > 0\) such that \(G(d_n + \delta_n) < G(d_n) + \varepsilon / 2^n\), and there is a \(\delta > 0\) with \(c + \delta < d\) such that \(G(c + \delta) \leq G(c) + \varepsilon\). Now the compact interval \([c + \delta, d]\) is included in the union of the countably many open intervals \(I_n := (c_n, d_n + \delta_n)\), so there is a finite subcover. In particular, \((c + \delta, d]\) is covered by finitely many of the intervals \((c_n, d_n + \delta_n]\). Hence by finite subadditivity,

\[ G(d) - G(c) - \varepsilon \leq G(d) - G(c + \delta) \leq \sum_{n} [G(d_n) - G(c_n) + \varepsilon / 2^n], \]

and hence \(\mu(J) \leq 2\varepsilon + \sum_n \mu(J_n)\). Letting \(\varepsilon \downarrow 0\) completes the proof.
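To see where right continuity enters, consider the countable decomposition \((0, 1] = \bigcup_{n \ge 1} (2^{-n}, 2^{-(n-1)}]\). The sketch below compares partial sums for two illustrative choices of \(G\), both assumptions made for this example: the identity, and a step function that is not continuous from the right at \(0\).

```python
from fractions import Fraction

def G_id(x):
    """G(x) = x: nondecreasing and continuous, so mu_G is ordinary length."""
    return x

def G_bad(x):
    """Nondecreasing but NOT right-continuous at 0: G(0) = 0 while G(0+) = 1."""
    return 1 if x > 0 else 0

def mu(G, a, b):
    """mu_G((a, b]) := G(b) - G(a)."""
    return G(b) - G(a)

def piece(n):
    """Endpoints of the n-th piece (2^-n, 2^-(n-1)] of (0, 1]."""
    return Fraction(1, 2**n), Fraction(1, 2**(n - 1))

N = 10
partial_id = sum(mu(G_id, *piece(n)) for n in range(1, N + 1))
partial_bad = sum(mu(G_bad, *piece(n)) for n in range(1, N + 1))

print(mu(G_id, Fraction(0), Fraction(1)) - partial_id)   # remaining gap for G_id
print(mu(G_bad, Fraction(0), Fraction(1)), partial_bad)  # total vs. partial sum for G_bad
```

For the identity the gap shrinks to \(0\) as \(N\) grows, consistent with countable additivity. For the step function every piece has measure \(0\) while \((0, 1]\) has measure \(1\), so countable additivity fails: the jump sits exactly at the left endpoint, where right continuity would be needed.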

6.3 Outer measures

Returning to the general case, we have the following extension theorem. Our main example will be where \(\mathcal A\) is the ring of all finite unions of left-open, right-closed intervals in \(\mathbb R\).

Theorem 6.3 For any set \(X\) and ring \(\mathcal A\) of subsets of \(X\), any countably additive function \(\mu: \mathcal A \to [0, \infty]\) extends to a measure on the \(\sigma\)-algebra \(\mathcal S\) generated by \(\mathcal A\).

For any set \(E\subset X\), let \[ \mu^*(E) := \inf \bigg\{\sum_{1 \le n < \infty}\mu(A_n) : A_n \in \mathcal A,\ E \subset \bigcup_n A_n \bigg\}, \] or \(\mu^*(E) := \infty\) if no such \(A_n\) exist. Then \(\mu^*\) is called the outer measure defined by \(\mu\). Note that \(\mu^*(\varnothing) = 0\), by letting \(A_n = \varnothing\) for all \(n\).
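On a finite set with a finite ring, the outer measure can be computed by brute force. The sketch below is a toy illustration under stated assumptions: the set `X`, the ring \(\{\varnothing, \{0,1\}, \{2,3\}, X\}\) and the values of `mu` are invented for the example, and the infimum over countable covers is replaced by a minimum over finite subfamilies, which is legitimate here because the ring is finite and \(\mu \ge 0\).

```python
from itertools import combinations

# Illustrative finite setting (assumptions): a ring with two blocks and a
# nonnegative additive mu on it.
X = frozenset({0, 1, 2, 3})
mu = {
    frozenset(): 0,
    frozenset({0, 1}): 1,
    frozenset({2, 3}): 2,
    X: 3,
}

def outer_measure(E, mu):
    """mu*(E): least sum of mu(A_n) over covers of E by sets of the ring.

    Repeating a set never lowers the sum, so it is enough to minimise
    over finite subfamilies of the (finite) ring.
    """
    E = frozenset(E)
    ring = list(mu)
    best = float("inf")  # value used when no cover exists
    for r in range(1, len(ring) + 1):
        for family in combinations(ring, r):
            if E <= frozenset().union(*family):
                best = min(best, sum(mu[A] for A in family))
    return best

for E in ({0}, {2}, {0, 2}, set()):
    print(sorted(E), outer_measure(E, mu))
```

A set such as \(\{0\}\) is not in the ring, so its outer measure is the cost of the cheapest ring set containing it.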

The proof will include four lemmas.

Lemma 6.1 For any sets \(E\) and \(E_n \subset X\), if \(E \subset \bigcup_n E_n\), then \(\mu^*(E) \le \sum_n \mu^*(E_n)\).

If the latter sum is \(\infty\), there is no problem.

Otherwise, fix \(\varepsilon > 0\). For each \(n\), by definition of \(\mu^*(E_n)\) as an infimum, we can take a sequence \(A_{nm} \in \mathcal A\) such that \(E_n \subset \bigcup_m A_{nm}\) and \(\sum_m \mu(A_{nm}) < \mu^*(E_n) + \varepsilon / 2^n\).

Then \(E \subset \bigcup_n \bigcup_m A_{nm}\). Since \(\mu(A_{nm})\ge 0\) for all \(n, m\), the double series \(\sum_{n,m} \mu(A_{nm})\) may be summed in any order, in particular as the iterated sum \(\sum_n [\sum_m\mu(A_{nm})]\). Thus we have

\[ \mu^*(E) \le \sum_{1 \le n < \infty} \bigg[\sum_{1\le m < \infty} \mu(A_{nm})\bigg] \le \sum_{1 \le n < \infty}\bigg[\mu^*(E_n) + \varepsilon/ 2^n \bigg] \le \varepsilon + \sum_{1 \le n < \infty} \mu^*(E_n) \] Letting \(\varepsilon \downarrow 0\) proves this lemma.

Lemma 6.2 For any \(A \in \mathcal A\), \(\mu^*(A) = \mu(A)\).

If \(A \subset \bigcup_n A_n\) with \(A_n \in \mathcal A\), let \(B_n := A\cap A_n \setminus \bigcup_{j < n} A_j\). Then the \(B_n\) are disjoint and in \(\mathcal A\), with union \(A\), so by countable additivity of \(\mu\) on \(\mathcal A\), \(\mu(A) = \sum_n \mu(B_n) \le \sum_n \mu(A_n)\). Taking the infimum over all such covers gives \(\mu(A) \le \mu^*(A)\).

Conversely, taking \(A_1 = A\) and \(A_n = \varnothing\) for \(n > 1\) shows that \(\mu^*(A) \le \mu(A)\), so \(\mu^*(A) = \mu(A)\).

Definition 6.4 A set \(F \subset X\) is called \(\mu^*\)-measurable, i.e. \(F \in \mathcal M(\mu^*)\), iff for every set \(E \subset X\), the following Carathéodory condition is satisfied: \[ \mu^*(E) = \mu^*(E\cap F) + \mu^*(E \setminus F) \]

In other words, \(F\) splits all sets additively for \(\mu^*\).

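By Lemma 6.1, the inequality \(\mu^*(E) \le \mu^*(E\cap F) + \mu^*(E \setminus F)\) always holds. So the Carathéodory condition is equivalent to \[ \mu^*(E)\ge \mu^*(E\cap F) + \mu^*(E \setminus F) \text{ whenever } \mu^*(E) < \infty, \] since this inequality holds trivially when \(\mu^*(E) = \infty\).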
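In a finite toy setting the Carathéodory condition can be tested against every \(E \subset X\). The following sketch assumes an invented ring \(\{\varnothing, \{0,1\}, \{2,3\}, X\}\) on \(X = \{0,1,2,3\}\) with made-up values of `mu`; the helper names `subsets`, `outer` and `is_measurable` are hypothetical, and `outer` computes \(\mu^*\) by minimising over finite subfamilies of the ring.

```python
from itertools import combinations

X = frozenset({0, 1, 2, 3})
mu = {frozenset(): 0, frozenset({0, 1}): 1, frozenset({2, 3}): 2, X: 3}

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def outer(E):
    """Brute-force mu*(E) over finite subfamilies of the (finite) ring."""
    covers = (fam for r in range(1, len(mu) + 1) for fam in combinations(mu, r)
              if E <= frozenset().union(*fam))
    return min(sum(mu[A] for A in fam) for fam in covers)

def is_measurable(F):
    """Caratheodory condition: mu*(E) = mu*(E & F) + mu*(E - F) for every E."""
    return all(outer(E) == outer(E & F) + outer(E - F) for E in subsets(X))

measurable = {F for F in subsets(X) if is_measurable(F)}
print(sorted(sorted(F) for F in measurable))
```

In this example a set such as \(\{0\}\) fails the condition with \(E = \{0, 1\}\): both halves of \(E\) have outer measure \(1\), so they sum to \(2\) rather than \(\mu^*(E) = 1\).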

Lemma 6.3 \(\mathcal A \subset \mathcal M(\mu^*)\) (all sets in \(\mathcal A\) are \(\mu^*\)-measurable).

Let \(A \in \mathcal A\) and \(E \subset X\) with \(\mu^*(E) < \infty\).

Given \(\varepsilon > 0\), take \(A_n \in \mathcal A\) with \(E \subset \bigcup_n A_n\) and \(\sum_n \mu(A_n) \le \mu^*(E) + \varepsilon\). Then \(E \cap A \subset \bigcup_n (A\cap A_n)\), \(E\setminus A \subset \bigcup_n (A_n \setminus A)\), and so \(\mu^*(E\cap A) + \mu^*(E \setminus A) \le \sum_n [\mu(A \cap A_n) + \mu(A_n\setminus A)] = \sum_n \mu(A_n) \le \mu^*(E) + \varepsilon\).

Letting \(\varepsilon \downarrow 0\) proves this lemma.

Lemma 6.4 \(\mathcal M(\mu^*)\) is a \(\sigma\)-algebra and \(\mu^*\) is a measure on it.

Clearly \(F \in \mathcal M(\mu^*)\) if and only if \(X \setminus F \in \mathcal M(\mu^*)\). Let \(A, B \in \mathcal M(\mu^*)\). Since \(A \cup B = X \setminus [(X \setminus A) \cap (X \setminus B)]\), it suffices to show that \(A \cap B \in \mathcal M(\mu^*)\). For any \(E \subset X\),

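\[ \begin{align*} \mu^*(E) &= \mu^*(E\cap A) + \mu^*(E \setminus A) & \text{since } A \in \mathcal M(\mu^*) \\ &= \mu^*(E\cap A \cap B) + \mu^*(E \cap A \setminus B) + \mu^*(E \setminus A) & \text{since } B \in \mathcal M(\mu^*) \\ &= \mu^*(E\cap (A\cap B)) + \mu^*(E \setminus (A\cap B)) & \text{since } A \in \mathcal M(\mu^*) \end{align*} \] where the last step applies the Carathéodory condition for \(A\) to the set \(E \setminus (A \cap B)\), whose parts inside and outside \(A\) are \(E \cap A \setminus B\) and \(E \setminus A\). Thus \(A \cap B \in \mathcal M(\mu^*)\), and \(\mathcal M(\mu^*)\) is an algebra.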

Now let \(E_n \in \mathcal M(\mu^*)\) for \(n = 1,2,...\), \(F := \bigcup_{1\le j < \infty} E_j\), and \(F_n := \bigcup_{1 \le j \le n} E_j \in\mathcal M(\mu^*)\).

Since \(E_n \setminus \bigcup_{j < n} E_j \in \mathcal M(\mu^*)\) for all \(n\), we may assume \(E_n\) disjoint in proving \(F\) measurable.

For any \(E \subset X\), we have:

\[ \begin{align*} \mu^*(E) &= \mu^*(E\setminus F_n) + \mu^*(E \cap F_n) & \text{since } F_n \in \mathcal M(\mu^*) \\ &= \mu^*(E\setminus F_n) + \mu^*(E \cap E_n) + \mu^*\bigg(E \cap \bigcup_{j < n} E_j\bigg) & \text{since } E_n \in \mathcal M(\mu^*) \end{align*} \] Thus by induction on \(n\), \[ \mu^*(E) = \mu^*(E \setminus F_n) + \sum_{j=1}^n \mu^*(E \cap E_j) \ge \mu^*(E \setminus F) + \sum_{j=1}^n \mu^*(E\cap E_j) \]

Letting \(n \to \infty\) gives \[ \mu^*(E) \ge \mu^*(E \setminus F) + \sum_{j=1}^\infty \mu^*(E \cap E_j) \ge \mu^*(E \setminus F) + \mu^*(E\cap F) \] by Lemma 6.1. Thus \(F\in \mathcal M(\mu^*)\) and by Lemma 6.1 again,

\[ \mu^*(E) = \mu^*(E \setminus F) + \sum_{j \ge 1} \mu^*(E \cap E_j) \] Letting \(E = F\) shows \(\mu^*\) is countably additive on \(\mathcal M(\mu^*)\), proving Lemma 6.4 and Theorem 6.3.

Proposition 6.1 If \(\mu^*(E) = 0\), then \(E \in \mathcal M(\mu^*)\).

For any \(A \subset X\), \[ \mu^*(A) \ge \mu^*(A \setminus E) = \mu^*(A\setminus E) + \mu^*(E) \ge \mu^*(A \setminus E) + \mu^*(A \cap E) \]
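Proposition 6.1 can be observed in a small finite example in which one block of the ring has measure zero. As before, the set `X`, the ring and the values of `mu` are illustrative assumptions, the helper names are hypothetical, and `outer` computes \(\mu^*\) by brute force over finite subfamilies of the finite ring.

```python
from itertools import combinations

X = frozenset({0, 1, 2, 3})
# Same two-block ring, but now the block {2, 3} is a null set (an assumption).
mu = {frozenset(): 0, frozenset({0, 1}): 1, frozenset({2, 3}): 0, X: 1}

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def outer(E):
    """Brute-force mu*(E) over finite subfamilies of the (finite) ring."""
    covers = (fam for r in range(1, len(mu) + 1) for fam in combinations(mu, r)
              if E <= frozenset().union(*fam))
    return min(sum(mu[A] for A in fam) for fam in covers)

def is_measurable(F):
    """Caratheodory condition checked against every E ⊂ X."""
    return all(outer(E) == outer(E & F) + outer(E - F) for E in subsets(X))

null_sets = [E for E in subsets(X) if outer(E) == 0]
print([sorted(E) for E in null_sets])
print(all(is_measurable(E) for E in null_sets))
```

Sets such as \(\{2\}\) and \(\{3\}\) are not in the ring, yet they have outer measure \(0\) and pass the Carathéodory test, as the proposition predicts.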

6.4 Uniqueness of Extension of Measure

Definition 6.5 A function \(\mu\) on a collection \(\mathcal A\) of subsets of \(X\) is called \(\sigma\)-finite iff there is a sequence \(\{A_n\}\subset \mathcal A\) with \(|\mu(A_n)| < \infty\) for all \(n\), and \(X = \bigcup_n A_n\).

Example 6.6  

  • If \(\mathcal A\) is the collection of all intervals \((a, b]\) for \(a < b\) in \(\mathbb R\) with \(\lambda((a,b]) := b-a\), then \(\lambda\) is \(\sigma\)-finite.
  • Counting measure is \(\sigma\)-finite iff \(X\) is countable.
  • If \(\mu(A) = \infty\) for all non-empty sets \(A\) in some collection, and \(\mu(\varnothing) = 0\), then \(\mu\) is not \(\sigma\)-finite (unless \(X\) is empty).
  • Let \(\mathcal A\) be the algebra generated by all intervals \((a, b]\), and let \(\mu(A) = \infty\) for all non-empty sets \(A\) in \(\mathcal A\). Let \(\mathcal B\) be the \(\sigma\)-algebra generated by \(\mathcal A\). One measure defined on \(\mathcal B\) which equals \(\mu\) on \(\mathcal A\) is counting measure; another one assigns measure \(\infty\) to all non-empty sets in \(\mathcal B\). This illustrates how \(\sigma\)-finiteness is useful in the following uniqueness theorem.

Theorem 6.4 Let \(\mu\) be countably additive and nonnegative on an algebra \(\mathcal A\). Let \(\alpha\) be a measure on the \(\sigma\)-algebra \(\mathcal S\) generated by \(\mathcal A\) with \(\alpha = \mu\) on \(\mathcal A\).

Then for any \(A \in \mathcal S\) with \(\mu^*(A)< \infty\), \(\alpha(A) = \mu^*(A)\). If \(\mu\) is \(\sigma\)-finite, then the extension of \(\mu\) from the algebra \(\mathcal A\) to the \(\sigma\)-algebra \(\mathcal S\) it generates, given by Theorem 6.3, is unique: \(\alpha = \mu^*\) on \(\mathcal S\).

For any \(A \in \mathcal S\) and \(A_n \in \mathcal A\) with \(A \subset \bigcup_n A_n\), we have by the proof of Lemma 6.2, \(\alpha(A) \le \sum_n \alpha(A_n) = \sum_n \mu(A_n)\). Taking an infimum gives \(\alpha(A) \le \mu^*(A)\).

If \(\mu^*(A) < \infty\), then given \(\varepsilon > 0\), choose \(A_n \in \mathcal A\) with \(A \subset \bigcup_n A_n\) and \(\sum_n \mu(A_n) < \mu^*(A) + \varepsilon / 2\). Let \(B_k := \bigcup_{1\le m < k} A_m\) for \(k \le \infty\). Then \(B_k \in \mathcal A\) for \(k\) finite, and \(B_\infty \in \mathcal S\). As \(A \subset B_\infty\), \[ \mu^*(A) \le \mu^*(B_\infty) \le \sum_n \mu(A_n) < \mu^*(A) + \varepsilon / 2 \]

By Lemma 6.3 and Lemma 6.4, which imply \(\mathcal S \subset \mathcal M(\mu^*)\), and Theorem 6.1, for \(k\) large enough, \(\mu^*(B_\infty \setminus B_k) < \varepsilon / 2\), and \(\mu^*(B_\infty \setminus A) = \mu^*(B_\infty) - \mu^*(A) < \varepsilon / 2\). Now \(\alpha(B_\infty) \le \mu^*(B_\infty) < \infty\), and since \(\alpha \le \mu^*\) on \(\mathcal S\),

\[ \begin{align*} \alpha(A) &= \alpha(B_\infty) - \alpha(B_\infty \setminus A) \ge \alpha(B_k) - \mu^*(B_\infty \setminus A) \\ &\ge \alpha(B_k) - \varepsilon / 2 = \mu^*(B_k) - \varepsilon / 2 \\ &\ge \mu^*(B_\infty) - \varepsilon \ge \mu^*(A) - \varepsilon \end{align*} \] where the second line follows from \(B_k \in \mathcal A\) and Lemma 6.2. Letting \(\varepsilon \downarrow 0\) gives \(\alpha(A) \ge \mu^*(A)\), so \(\alpha(A) = \mu^*(A)\). Finally, if \(\mu\) is \(\sigma\)-finite, take disjoint \(A_n \in \mathcal A\) with \(\mu(A_n) < \infty\) and \(X = \bigcup_n A_n\). Then for any \(A \in \mathcal S\), \[ \alpha(A) = \sum_n \alpha(A\cap A_n) = \sum_n \mu^*(A\cap A_n) = \mu^*(A) \] by the first part of the proof and Lemma 6.4. Thus \(\alpha\) is unique if \(\mu\) is \(\sigma\)-finite.