12 Product Measures

For any \(a \le b\) and \(c \le d\), the rectangle \([a, b]\times [c, d]\subset \mathbb R^2\) has area \((b-a)(d-c)\). Area is, of course, defined for much more general sets. In this section it will be defined even more generally, as a measure, and the product will be defined for any two \(\sigma\)-finite measures in place of length on the two real axes. Then the product will be extended to more than two factors, giving, for example, “volume” as a measure on \(\mathbb R^3\).

Let \((X, \mathcal B, \mu)\) and \((Y, \mathcal C, \nu)\) be any two measure spaces. In \(X\times Y\) let \(\mathcal R\) be the collection of all “rectangles” \(B \times C\) with \(B \in \mathcal B\) and \(C \in \mathcal C\). For such sets let \(\rho(B\times C) := \mu(B)\nu(C)\), where (in this case) we set \(0 \cdot \infty := \infty \cdot 0 := 0\). \(\mathcal R\) is a semiring by Proposition 7.2.
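The convention \(0 \cdot \infty := 0\) matters as soon as one computes with extended real values. A minimal Python sketch follows; the function name `rect_measure` is hypothetical, and floating-point `inf` stands in for \(\infty\). It shows why the convention has to be imposed by hand: ordinary floating-point arithmetic returns `nan` for `0.0 * inf`.

```python
import math

def rect_measure(mu_b: float, nu_c: float) -> float:
    """rho(B x C) = mu(B) * nu(C), with the convention 0 * inf = inf * 0 = 0."""
    if mu_b == 0 or nu_c == 0:
        # Convention from the text; plain float arithmetic would give nan here.
        return 0.0
    return mu_b * nu_c

# A vertical line {0} x R in the plane: length 0 times length inf gives area 0.
print(rect_measure(0.0, math.inf))
print(rect_measure(2.0, 3.0))
print(rect_measure(2.0, math.inf))
```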

Theorem 12.1 \(\rho\) is countably additive on \(\mathcal R\).

Proof

Suppose \(B \times C = \bigcup_n B(n)\times C(n)\) in \(\mathcal R\) where the sets \(B(n) \times C(n)\) are disjoint, \(B(n)\in \mathcal B\) and \(C(n)\in \mathcal C\) for all \(n\).

So for each \(x\in X\) and \(y\in Y\), \(1_B(x) 1_C(y) = \sum_n 1_{B(n)}(x)1_{C(n)}(y)\). Then integrating \(d\nu(y)\) gives for each \(x\), by countable additivity, \(1_B (x) \nu(C) = \sum_n 1_{B(n)}(x) \nu(C(n))\). Now integrating \(d\mu(x)\) gives, by additivity in Proposition 10.6 and MCT, \(\mu(B)\nu(C) = \sum_n \mu(B(n))\nu(C(n))\).
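The additivity statement can be checked by hand on a small example. The sketch below is an illustration only: it assumes finite sets with hypothetical point masses (`mu_w`, `nu_w`) and a finite partition, so it exercises finite rather than countable additivity. It splits a rectangle \(B \times C\) into three disjoint rectangles and compares \(\rho(B\times C)\) with the sum over the pieces, using exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical point masses defining finite measures mu on X and nu on Y.
mu_w = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(2)}
nu_w = {"p": Fraction(3), "q": Fraction(1, 4)}

def measure(weights, subset):
    """Measure of a subset as the sum of its point masses."""
    return sum((weights[x] for x in subset), Fraction(0))

def rho(B, C):
    """rho(B x C) = mu(B) * nu(C)."""
    return measure(mu_w, B) * measure(nu_w, C)

# B x C written as a disjoint union of three rectangles B(n) x C(n).
B, C = {"a", "b", "c"}, {"p", "q"}
pieces = [({"a"}, {"p", "q"}), ({"b", "c"}, {"p"}), ({"b", "c"}, {"q"})]

# Sanity check: the pieces are pairwise disjoint and cover B x C.
cells = [(x, y) for Bn, Cn in pieces for x in Bn for y in Cn]
assert len(cells) == len(set(cells))
assert set(cells) == {(x, y) for x in B for y in C}

total = sum((rho(Bn, Cn) for Bn, Cn in pieces), Fraction(0))
print(rho(B, C) == total)
```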

Let \(\mathcal A\) be the ring generated by \(\mathcal R\). Then \(\mathcal A\) consists of all unions of finitely many disjoint elements of \(\mathcal R\) by Proposition 7.3. Since \(X \times Y \in \mathcal R\), \(\mathcal A\) is an algebra.

For any disjoint \(C_j\in \mathcal R\) and finite \(n\), let \(\rho(\bigcup_{1\le j\le n} C_j) := \sum_{1\le j\le n} \rho(C_j)\). Here \(\rho\) is well-defined and countably additive on \(\mathcal A\) by Proposition 7.4 and Theorem 12.1. Then \(\rho\) can be extended to a countably additive measure on the product \(\sigma\)-algebra \(\mathcal B \otimes \mathcal C\) generated by \(\mathcal R\) or \(\mathcal A\) (Theorem 6.4), but in general the extension is not unique.

We will give conditions under which the extension is unique and can be written in terms of iterated integrals. The following notion will be helpful.

Definition 12.1 A collection \(\mathcal M\) of sets is called a monotone class iff whenever \(M_n \in \mathcal M\) and \(M_n \downarrow M\) or \(M_n\uparrow M\), then \(M \in \mathcal M\).

Example 12.1

Any \(\sigma\)-algebra is a monotone class.
A topology, in general, is not a monotone class. For instance, the open intervals \((0, 1+1/n)\) decrease to \((0, 1]\), which is not open.

The intersection of any set of monotone classes is a monotone class. Thus, for any collection \(\mathcal D\) of sets, there is a smallest monotone class including \(\mathcal D\).

Theorem 12.2 If \(\mathcal A\) is an algebra of subsets of a set \(X\), then the smallest monotone class \(\mathcal M\) including \(\mathcal A\) is a \(\sigma\)-algebra.

Proof

Let \(\mathcal N := \{E \in \mathcal M: X\setminus E \in \mathcal M\}\). Then \(\mathcal A \subset \mathcal N\) and \(\mathcal N\) is a monotone class, so \(\mathcal N = \mathcal M\).

For each set \(A \subset X\), let \(\mathcal M_A := \{E: E\cap A \in \mathcal M\}\). Then for each \(A \in \mathcal A\), \(\mathcal A \subset \mathcal M_A\) and \(\mathcal M_A\) is a monotone class, so \(\mathcal M \subset \mathcal M_A\). Then for each \(E\in\mathcal M\), \(\mathcal M_E\) is a monotone class including \(\mathcal A\), so \(\mathcal M \subset \mathcal M_E\). Thus \(\mathcal M\) is an algebra. Being a monotone class, it is a \(\sigma\)-algebra.

The next fact says that the order of integration can be interchanged for indicator functions of measurable sets in the product \(\sigma\)-algebra. This will be the main step toward the construction of product measures and toward the interchange of integrals for more general functions.

Theorem 12.3 Suppose \(\mu(X)<\infty\) and \(\nu(Y)<\infty\). Let

\[ \begin{align*} \mathcal F := \bigg\{E &\subset X \times Y : \int \bigg[\int 1_{E}(x,y)\, d\mu(x)\bigg]d\nu(y) \\ &= \int \bigg[\int 1_{E}(x,y)\, d\nu(y)\bigg]d\mu(x) \bigg\} \end{align*} \]

Then \(\mathcal B \otimes \mathcal C\subset \mathcal F\).

Proof

The definition of \(\mathcal F\) implies that all of the integrals appearing in it are defined, so that each function being integrated is measurable. It should be noted that this measurability holds at each step of the proof to follow.

If \(E = B \times C\) for some \(B \in \mathcal B\) and \(C \in \mathcal C\), then

\[ \iint 1_E\, d\mu\, d\nu = \mu(B) \int 1_C\, d\nu = \mu(B)\nu(C) = \iint 1_E\, d\nu\, d\mu \]

Thus \(\mathcal R \subset \mathcal F\). If \(E_n \in \mathcal F\), and \(E_n \downarrow E\) or \(E_n \uparrow E\), then \(E \in \mathcal F\) by MCT, using finiteness. Thus \(\mathcal F\) is a monotone class. Also, any finite disjoint union of sets in \(\mathcal F\) is in \(\mathcal F\). Thus \(\mathcal A \subset \mathcal F\). Hence by Theorem 12.2, \(\mathcal B\otimes \mathcal C \subset \mathcal F\).
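On finite spaces the statement of Theorem 12.3 can be verified exhaustively. The following sketch assumes hypothetical point masses `mu_w` and `nu_w` on a three-point \(X\) and a two-point \(Y\). It enumerates every subset \(E \subset X \times Y\) and compares the two iterated integrals of \(1_E\). In this finite setting the equality is just commutativity of finite sums, so the code illustrates the statement rather than the monotone class argument.

```python
from fractions import Fraction
from itertools import product

# Hypothetical point masses defining finite measures mu on X and nu on Y.
mu_w = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2)}
nu_w = {0: Fraction(3), 1: Fraction(1, 4)}

def mu_then_nu(E):
    """Integrate 1_E over x against mu first, then over y against nu."""
    return sum(
        (nu_w[y] * sum((mu_w[x] for x in mu_w if (x, y) in E), Fraction(0))
         for y in nu_w),
        Fraction(0),
    )

def nu_then_mu(E):
    """Integrate 1_E over y against nu first, then over x against mu."""
    return sum(
        (mu_w[x] * sum((nu_w[y] for y in nu_w if (x, y) in E), Fraction(0))
         for x in mu_w),
        Fraction(0),
    )

points = list(product(mu_w, nu_w))

# Enumerate every subset E of X x Y by bitmask and compare the two orders.
all_agree = all(
    mu_then_nu(E) == nu_then_mu(E)
    for mask in range(2 ** len(points))
    for E in [{p for i, p in enumerate(points) if (mask >> i) & 1}]
)
print(all_agree)
```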

Theorem 12.4 (Product Measure Existence Theorem) Let \((X, \mathcal B, \mu)\) and \((Y, \mathcal C, \nu)\) be two \(\sigma\)-finite measure spaces. Then \(\rho\) extends uniquely to a measure on \(\mathcal B\otimes \mathcal C\) such that for all \(E \in \mathcal B\otimes \mathcal C\),

\[ \rho(E) = \iint 1_E(x,y)\, d\mu(x)\, d\nu(y) = \iint 1_E(x,y)\, d\nu(y)\, d\mu(x) \]

Proof

First suppose \(\mu\) and \(\nu\) are finite. Let \[ \alpha(E) := \iint 1_E(x, y) \, d\mu(x) \, d\nu(y), \quad E \in \mathcal{B} \otimes \mathcal{C}. \]

Then by Theorem 12.3, \(\alpha\) is defined and the order of integration can be reversed. Now \(\alpha\) is finitely additive (for any finitely many disjoint sets in \(\mathcal{B} \otimes \mathcal{C}\)) by Proposition 10.6. Again, all functions being integrated will be measurable. Then, \(\alpha\) is countably additive by MCT. For any other extension \(\beta\) of \(\rho\) to \(\mathcal{B} \otimes \mathcal{C}\), the collection of sets on which \(\alpha = \beta\) is a monotone class including \(\mathcal{A}\), thus including \(\mathcal{B} \otimes \mathcal{C}\). So the theorem holds for finite measures.

In general, let \(X = \bigcup_m B_m\), \(Y = \bigcup_n C_n\), where the \(B_m\) are disjoint in \(X\) and the \(C_n\) in \(Y\), with \(\mu(B_m) < \infty\) and \(\nu(C_n) < \infty\) for all \(m\) and \(n\). Let \(E \in \mathcal{B} \otimes \mathcal{C}\) and \(E(m, n) := E \cap (B_m \times C_n)\). Then for each \(m\) and \(n\), by the finite case,

\[ \iint 1_{E(m,n)} \, d\mu \, d\nu = \iint 1_{E(m,n)} \, d\nu \, d\mu \]

This equation can be summed over all \(m\) and \(n\) in any order. By countable additivity and MCT, we get

\[ \alpha(E) := \iint 1_E \, d\mu \, d\nu = \iint 1_E \, d\nu \, d\mu, \quad \text{for any } E \in \mathcal{B} \otimes \mathcal{C} \]

Then \(\alpha\) is finitely additive, countably additive by monotone convergence, and thus a measure, which equals \(\rho\) on \(\mathcal{A}\). If \(\beta\) is any other extension of \(\rho\) to a measure on \(\mathcal{B} \otimes \mathcal{C}\), then for any \(E \in \mathcal{B} \otimes \mathcal{C}\),

\[ \beta(E) = \sum_{m,n} \beta(E(m,n)) = \sum_{m,n} \alpha(E(m,n)) = \alpha(E) \]

so the extension is unique.

Example 12.2 Let \(c\) be counting measure and \(\lambda\) Lebesgue measure on \(I := [0,1]\). In \(I \times I\) let \(D := \{(x,x): x \in I\}\). Then \(D\) is measurable (it is closed and \(I\) is second-countable, so Proposition 10.5 applies), but \(\iint 1_D \, d\lambda \, dc = 0 \ne 1 = \iint 1_D \, dc \, d\lambda\), as \(c\) is not \(\sigma\)-finite.

This shows that the \(\sigma\)-finiteness hypothesis in Theorem 12.4 cannot simply be dropped.

The measure \(\rho\) on \(\mathcal B \otimes \mathcal C\) is called the product measure and is written \(\mu \times \nu\). Now here is the main theorem on integrals for product measures:

Theorem 12.5 Let \((X, \mathcal B, \mu)\) and \((Y, \mathcal C, \nu)\) be \(\sigma\)-finite, and let \(f: X\times Y \to [0, \infty]\) be measurable for \(\mathcal B \otimes \mathcal C\), or let \(f \in \mathcal L^1(X\times Y, \mathcal B \otimes \mathcal C, \mu\times\nu)\). Then

\[ \int f\, d(\mu\times\nu) = \iint f(x,y)\, d\mu(x)\, d\nu(y) = \iint f(x,y)\, d\nu(y)\, d\mu(x) \] Here \(\int f(x,y)\, d\mu(x)\) is defined for \(\nu\)-almost all \(y\) and \(\int f(x,y) \, d\nu(y)\) for \(\mu\)-almost all \(x\).

Proof

Recall that integrals are defined for functions only defined almost everywhere. For nonnegative simple \(f\), the theorem follows from Theorem 12.4 and Proposition 10.6. Then for nonnegative measurable \(f\) it follows from Proposition 10.3 and MCT.

Then, for \(f \in \mathcal L^1(X\times Y, \mathcal B \otimes \mathcal C, \mu\times\nu)\), the theorem holds for \(f^+\) and \(f^-\); thus \(\int f^+(x,y)\, d\mu(x) < \infty\) for \(\nu\)-almost all \(y\), and likewise for \(f^-\) and with \(\mu\) and \(\nu\) interchanged. For \(\nu\)-almost all \(y\), \(\int |f(x,y)|\, d\mu(x) < \infty\), and then by Theorem 10.2,

\[ \int f(x,y)\, d\mu(x) = \int f^+(x,y)\, d\mu(x) - \int f^-(x,y)\, d\mu(x) \] with all three integrals being finite.

Next, in integrating with respect to \(\nu\), the set of \(\nu\)-measure \(0\) where an integral of \(f^+\) or \(f^-\) is infinite does not matter, as in Proposition 11.1. So, again by Theorem 10.2, \(\iint f(x,y) \, d\mu(x)\, d\nu(y)\) is defined and equals

\[ \iint f^+(x,y) \, d\mu(x)\, d\nu(y) - \iint f^-(x,y) \, d\mu(x)\, d\nu(y) \]

Then the theorem for \(f^+\) and \(f^-\) implies it for \(f\).

Remark 12.1. To prove that \(f \in \mathcal L^1(X\times Y, \mathcal B \otimes \mathcal C, \mu\times\nu)\), one can prove that \(f\) is \(\mathcal B \otimes \mathcal C\)-measurable and then that \(\iint |f|\, d\mu \, d\nu < \infty\) or \(\iint |f|\, d\nu \, d\mu < \infty\).

Example 12.3 Let \(X=Y=\mathbb N\) and \(\mu=\nu=\text{counting measure}\).

For \(f:\mathbb N \to \mathbb R\), we have \(\int f\, d\mu = \int f(n) \, d\mu(n) = \sum_n f(n)\), where \(f\in \mathcal L^1(\mu)\) if and only if \(\sum_n |f(n)| < \infty\). Note that for counting measure, the \(\sigma\)-algebra is \(2^{\mathbb N}\), so all functions are measurable.

On \(\mathbb N \times \mathbb N\), let \[ g(m, n) := \begin{cases} 1 & \text{if } m = n \\ -1 & \text{if } m = n+1 \\ 0 & \text{otherwise} \end{cases} \]

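The values \(g(m,n)\), with rows indexed by \(n\) and columns by \(m\), together with the row sums (over all \(m\)) and column sums (over all \(n\)):

\[ \begin{array}{c|cccccc|c} n \backslash m & 0 & 1 & 2 & 3 & 4 & 5 & \text{Sums} \\ \hline 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 1 & -1 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\ 4 & 0 & 0 & 0 & 0 & 1 & -1 & 0 \\ \vdots & & & & & & & \\ \hline \text{Sums} & 1 & 0 & 0 & 0 & 0 & 0 & \end{array} \]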

Then \(g\) is bounded and measurable on \(X \times Y\). We have

\[ \begin{align*} \iint g(m, n)\, d\mu(m)\, d\nu(n) &= (1-1)+(1-1)+\cdots = 0 \text{, but} \\ \iint g(m, n)\, d\nu(n)\, d\mu(m) &= 1 + (1-1)+(1-1)+\cdots = 1 \end{align*} \] Thus the integrals cannot be interchanged. This does not contradict Theorem 12.5, since both \(g^+\) and \(g^-\) have infinite integrals, so \(g\) is neither nonnegative nor integrable.
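The two iterated sums in Example 12.3 can be reproduced exactly, because for each fixed index only finitely many terms of the inner sum are nonzero. The Python sketch below uses hypothetical helper names (`inner_over_m`, `inner_over_n`, `abs_partial`) and takes \(\mathbb N\) to start at \(0\), as the definition of \(g\) and the sums above suggest. It evaluates the inner sums exactly and then takes partial sums of the outer series.

```python
def g(m: int, n: int) -> int:
    """The function from Example 12.3 on N x N, with N = {0, 1, 2, ...}."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0

def inner_over_m(n: int) -> int:
    """Exact value of sum_m g(m, n): only m = n and m = n + 1 contribute."""
    return sum(g(m, n) for m in range(n + 2))

def inner_over_n(m: int) -> int:
    """Exact value of sum_n g(m, n): only n = m and n = m - 1 contribute."""
    return sum(g(m, n) for n in range(m + 1))

def abs_partial(rows: int) -> int:
    """Sum of |g| over the first `rows` values of n; each row contributes 2."""
    return sum(abs(g(m, n)) for n in range(rows) for m in range(n + 2))

N = 1000
m_first = sum(inner_over_m(n) for n in range(N))  # integrate in m, then n
n_first = sum(inner_over_n(m) for m in range(N))  # integrate in n, then m
print(m_first, n_first, abs_partial(50))
```

The partial sums of \(|g|\) grow linearly in the number of rows, which is the failure of integrability that allows the two orders to disagree.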

Example 12.4 For \(x \in \mathbb R\) and \(t > 0\), let \[ f(x,t) := \frac{\exp(-x^2/(2t))}{\sqrt{2\pi t}} \]

Let \(g(x,t) := \partial f/\partial t\). Then \(\partial f/\partial x = -xf/t\) and \(\partial^2 f/\partial x^2 = (x^2t^{-2}-t^{-1})f = 2g\).

Thus \(f\) satisfies the PDE \(2\partial f/\partial t = \partial^2 f/\partial x^2\), the heat equation.
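As a numerical sanity check of the identity \(2\,\partial f/\partial t = \partial^2 f/\partial x^2\), the sketch below approximates both derivatives by central finite differences at a few sample points. The step size `h`, the sample points, and the helper names are arbitrary choices for illustration, so the residuals are small but not exactly zero.

```python
import math

def f(x: float, t: float) -> float:
    """The Gaussian kernel f(x, t) = exp(-x^2 / (2t)) / sqrt(2 pi t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def df_dt(x: float, t: float, h: float = 1e-3) -> float:
    """Central difference approximation of the t-derivative of f."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)

def d2f_dx2(x: float, t: float, h: float = 1e-3) -> float:
    """Central second difference approximation of the second x-derivative."""
    return (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / (h * h)

# Arbitrary sample points (x, t) with t > 0.
sample_points = [(0.0, 1.0), (0.7, 0.5), (-1.3, 2.0), (2.0, 1.5)]
residuals = [abs(2.0 * df_dt(x, t) - d2f_dx2(x, t)) for x, t in sample_points]
print(max(residuals))
```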

For every \(t > 0\) we have \(\int_{-\infty}^\infty f(x,t)\, dx = 1\) (note that \(dx := d\lambda(x)\)). To see this, consider the polar coordinates transformation \(T: X := [0, \infty)\times [0, 2\pi) \to \mathbb R^2\) defined by \(T(r, \theta) := (r\cdot\cos\theta, r\cdot\sin\theta)\).

Let \(\sigma\) be the measure on \([0, \infty)\) defined by \(\sigma(A) := \int_A r\,d\lambda(r)\). Let \(\mu := \sigma \times \lambda\) on \(X\). It can be proven that the image measure \(\mu\circ T^{-1}\) is the Lebesgue measure \(\lambda^2 := \lambda \times \lambda\).

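Applying this transformation to the first quadrant, \[ \bigg(\int_0^\infty \exp(-x^2/2)\, dx\bigg)^2 = \frac{\pi}{2} \int_0^\infty r\cdot \exp(-r^2/2)\, dr = \frac{\pi}{2} \] Hence \(\int_{-\infty}^\infty \exp(-x^2/2)\, dx = 2\sqrt{\pi/2} = \sqrt{2\pi}\), and the substitution \(x = u\sqrt t\) gives \(\int_{-\infty}^\infty f(x,t)\, dx = 1\) for every \(t > 0\).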
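The value \(\pi/2\) can be confirmed numerically. The sketch below uses a simple midpoint rule, with the hypothetical helper `midpoint_integral`, and truncates \([0,\infty)\) at \(12\), where the integrands are already negligibly small. Both the cutoff and the number of subintervals are arbitrary choices.

```python
import math

def midpoint_integral(func, a: float, b: float, n: int) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Left side: the one-dimensional integral over [0, 12], then squared.
half_line = midpoint_integral(lambda x: math.exp(-x * x / 2.0), 0.0, 12.0, 20000)

# Right side: the radial integral of r * exp(-r^2 / 2), whose exact value is 1.
radial = midpoint_integral(lambda r: r * math.exp(-r * r / 2.0), 0.0, 12.0, 20000)

print(half_line ** 2, (math.pi / 2.0) * radial, math.pi / 2.0)
```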

Now for any \(s > 0\), \[ \begin{align*} \int_{-\infty}^\infty \int_s^\infty g(x,t)\, dt\, dx &= \int_{-\infty}^\infty -f(x,s)\, dx = -1 \text{, but} \\ \int_s^\infty \int_{-\infty}^\infty g(x,t)\, dx\, dt &= \int_s^\infty \frac{\partial f}{\partial x}\bigg|_{x=0}^{x=\infty}\, dt = 0 \end{align*} \] The first line uses \(f(x,t)\to 0\) as \(t\to\infty\). The second uses that \(g = \tfrac12\, \partial^2 f/\partial x^2\) is even in \(x\), that \(\partial f/\partial x = 0\) at \(x = 0\), and that \(\partial f/\partial x\to 0\) as \(|x|\to\infty\) by L’Hôpital’s rule. Thus the order of integration cannot be interchanged here, and neither \(g^+\) nor \(g^-\) is integrable.
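Both orders of integration in Example 12.4 can be checked numerically. The sketch below makes several choices that are assumptions for illustration only: \(s = 1\), a midpoint rule via the hypothetical helper `midpoint_integral`, truncation of the \(x\)-range at \(\pm 40\sqrt t\), and a handful of sample values of \(t\). The inner \(t\)-integral is taken in closed form as \(-f(x,s)\), as in the text, and the inner \(x\)-integral of \(g\) is approximated directly.

```python
import math

def f(x: float, t: float) -> float:
    """The Gaussian kernel f(x, t) = exp(-x^2 / (2t)) / sqrt(2 pi t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def g(x: float, t: float) -> float:
    """g = df/dt = (1/2) (x^2 / t^2 - 1 / t) f, as computed in the text."""
    return 0.5 * (x * x / (t * t) - 1.0 / t) * f(x, t)

def midpoint_integral(func, a: float, b: float, n: int) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

s = 1.0  # assumed sample value of s > 0

# t first: the inner integral over [s, inf) is -f(x, s); then integrate in x.
t_first = midpoint_integral(lambda x: -f(x, s), -40.0, 40.0, 40000)

# x first: the inner integral of g over x, for a few sample t >= s.
x_inner = [
    midpoint_integral(lambda x, t=t: g(x, t),
                      -40.0 * math.sqrt(t), 40.0 * math.sqrt(t), 40000)
    for t in (1.0, 2.0, 5.0, 10.0)
]
print(t_first, x_inner)
```

The first value should be close to \(-1\), while each inner \(x\)-integral should be close to \(0\), so the outer \(t\)-integral of the second order vanishes.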

Definition 12.2 Let \(\mathcal S_j\) be a \(\sigma\)-algebra of subsets of \(X_j\) for each \(j = 1,...,n\), and let \(X := X_1 \times \dots \times X_n\). Then the product \(\sigma\)-algebra \(\mathcal S_1 \otimes ...\otimes \mathcal S_n\) is defined as the smallest \(\sigma\)-algebra of subsets of \(X\) for which each coordinate function \(x \mapsto x_j\) is measurable.

It is easily seen that this agrees with the previous definition for \(n=2\) and that for each \(n \ge 2\), by induction on \(n\), \(\mathcal S_1 \otimes ...\otimes \mathcal S_n\) is the smallest \(\sigma\)-algebra of subsets of \(X\) containing all sets \(A_1\times ...\times A_n\) with \(A_j \in \mathcal S_j\) for each \(j = 1,...,n\).

Just as a function continuous for a product topology is called jointly continuous, a function measurable for a product \(\sigma\)-algebra will be called jointly measurable.

Theorem 12.6 Let \((X_j, \mathcal S_j, \mu_j)\) be \(\sigma\)-finite measure spaces for \(j=1,...,n\). Then there is a unique measure \(\mu\) on the product \(\sigma\)-algebra \(\mathcal S\) in \(X = X_1 \times \dots \times X_n\) such that for any \(A_j\in \mathcal S_j\) for \(j=1,...,n\), \(\mu(A_1\times \dots\times A_n) = \mu_1(A_1)\dots\mu_n (A_n)\) or 0 if any \(\mu_j(A_j)=0\), even if another is \(\infty\).

If \(f\) is nonnegative and jointly measurable on \(X\), or if \(f \in \mathcal L^1(X, \mathcal S, \mu)\), then \[ \int f\, d\mu = \idotsint f(x_1,\dots, x_n)\, d\mu_1(x_1)\dots d\mu_n(x_n) \] where for \(f \in \mathcal L^1(X, \mathcal S, \mu)\), the iterated integral is defined recursively “from the outside” in the sense that for \(\mu_n\)-almost all \(x_n\), the iterated integral with respect to the other variables is defined and finite, so that except on a set of \(\mu_{n-1}\)-measure 0 (possibly depending on \(x_n\)) the iterated integral for the first \(n-2\) variables is defined and finite, and so on. The same holds if the integrations are done in any order.

Proof

The statement follows from Theorem 12.4 and Theorem 12.5 and induction on \(n\).
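For finite spaces the conclusion of Theorem 12.6 can be checked directly. The sketch below assumes three finite factors with hypothetical point masses (`weights`) and an arbitrary signed test function `f`. It compares the integral against the product measure with the iterated integral taken in each of the \(3! = 6\) possible orders, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical point masses for mu_1, mu_2, mu_3 on three finite sets.
weights = [
    {0: Fraction(1, 2), 1: Fraction(3)},
    {0: Fraction(1, 3), 1: Fraction(1), 2: Fraction(2)},
    {0: Fraction(5), 1: Fraction(1, 7)},
]

def f(point):
    """An arbitrary signed test function on X_1 x X_2 x X_3."""
    x1, x2, x3 = point
    return Fraction(x1 + 2 * x2 * x3 - 1, 1 + x2)

def product_integral():
    """Integral of f against the product measure: a sum over all points."""
    total = Fraction(0)
    for point in product(*weights):
        mass = Fraction(1)
        for w, x in zip(weights, point):
            mass *= w[x]
        total += f(point) * mass
    return total

def iterated_integral(order, fixed=None):
    """Iterated integral; order[0] is the innermost variable, order[-1] the outermost."""
    fixed = dict(fixed or {})
    if not order:
        return f(tuple(fixed[j] for j in range(len(weights))))
    *inner, outer = order
    return sum(
        (weights[outer][x] * iterated_integral(inner, {**fixed, outer: x})
         for x in weights[outer]),
        Fraction(0),
    )

orders_agree = all(
    iterated_integral(order) == product_integral()
    for order in permutations(range(len(weights)))
)
print(orders_agree)
```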

Example 12.5 The best-known example is the Lebesgue measure \(\lambda^n\) on \(\mathbb R^n\), which is a product with \(\mu_j = \text{Lebesgue measure } \lambda\) on \(\mathbb R\) for each \(j\).

Then \(\lambda\) is length, \(\lambda^2\) is area, \(\lambda^3\) is volume, and so forth.