The Sample Variance Is an Unbiased Estimator of the Population Variance

Published: 2025-06-22

Under the Gauss-Markov assumptions, the linear regression model is $\mathbf{Y} = \mathbf{Z} \boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where:

  • $\mathbf{Y}$ is the $n \times 1$ response vector,
  • $\mathbf{Z}$ is the $n \times (r+1)$ design matrix (full column rank, i.e. $\text{rank}(\mathbf{Z}) = r + 1$),
  • $\boldsymbol{\beta}$ is the $(r+1) \times 1$ parameter vector,
  • $\boldsymbol{\varepsilon}$ is the $n \times 1$ error vector.

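The Gauss-Markov assumptions are:

  1. Linearity: the model is correctly specified as linear in $\boldsymbol{\beta}$.
  2. Zero mean: $E(\boldsymbol{\varepsilon} \mid \mathbf{Z}) = \mathbf{0}$, i.e. the errors have zero conditional expectation.
  3. Homoscedasticity and no autocorrelation: $\text{Var}(\boldsymbol{\varepsilon} \mid \mathbf{Z}) = \sigma^2 \mathbf{I}_n$, where $\sigma^2$ is the unknown population variance.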

The OLS residual vector is $\widehat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \mathbf{Z} \widehat{\boldsymbol{\beta}}$, where $\widehat{\boldsymbol{\beta}} = (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}' \mathbf{Y}$. The residual sum of squares is $\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}$, and the sample variance (defined for $n > r + 1$) is:
$$ s^2 = \frac{\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}}{n - r - 1} = \frac{\mathbf{Y}' (\mathbf{I} - \mathbf{H}) \mathbf{Y}}{n - r - 1}, $$
where $\mathbf{H} = \mathbf{Z} (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}'$ is the hat matrix (symmetric and idempotent), so $\mathbf{I} - \mathbf{H}$ is also symmetric and idempotent. The second equality holds because $\widehat{\boldsymbol{\varepsilon}} = (\mathbf{I} - \mathbf{H}) \mathbf{Y}$, and therefore $\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}} = \mathbf{Y}' (\mathbf{I} - \mathbf{H})' (\mathbf{I} - \mathbf{H}) \mathbf{Y} = \mathbf{Y}' (\mathbf{I} - \mathbf{H}) \mathbf{Y}$.
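The definitions above can be checked numerically. The following is a minimal sketch using NumPy; the sample size, coefficients, error scale, and random seed are all hypothetical values chosen for illustration and do not come from the original article.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, only for reproducibility
n, r = 50, 3                            # hypothetical sample size and number of predictors
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])  # n x (r+1), intercept column first
beta = np.array([1.0, 2.0, -1.0, 0.5])  # hypothetical true coefficients
sigma = 2.0                             # hypothetical error standard deviation
Y = Z @ beta + rng.normal(scale=sigma, size=n)

# OLS estimate and residuals (solve the normal equations instead of forming an inverse)
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
resid = Y - Z @ beta_hat

# Hat matrix H = Z (Z'Z)^{-1} Z'
H = Z @ np.linalg.solve(Z.T @ Z, Z.T)
I = np.eye(n)

rss_residuals = resid @ resid           # residual sum of squares from the residuals
rss_quadratic = Y @ (I - H) @ Y         # the same quantity as a quadratic form in Y
s2 = rss_residuals / (n - r - 1)        # sample variance with divisor n - r - 1

print("H symmetric: ", np.allclose(H, H.T))
print("H idempotent:", np.allclose(H @ H, H))
print("RSS forms agree:", np.isclose(rss_residuals, rss_quadratic))
print("s^2 =", s2)
```

A single draw of $s^2$ will generally differ from $\sigma^2$; unbiasedness is a statement about its average over repeated samples.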

We want to show that $E(s^2) = \sigma^2$, i.e. that $s^2$ is an unbiased estimator of $\sigma^2$.

Proof:

Consider the expectation of the residual sum of squares:
$$ E(\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}) = E\left[ \mathbf{Y}' (\mathbf{I} - \mathbf{H}) \mathbf{Y} \right]. $$
Under the Gauss-Markov assumptions, the conditional mean and variance of $\mathbf{Y}$ are:
$$ E(\mathbf{Y} \mid \mathbf{Z}) = \mathbf{Z} \boldsymbol{\beta}, \quad \text{Var}(\mathbf{Y} \mid \mathbf{Z}) = \sigma^2 \mathbf{I}_n. $$
For the quadratic form $\mathbf{Y}' \mathbf{A} \mathbf{Y}$ (where $\mathbf{A} = \mathbf{I} - \mathbf{H}$ is symmetric), the conditional expectation is:
$$ E(\mathbf{Y}' \mathbf{A} \mathbf{Y} \mid \mathbf{Z}) = \text{trace}(\mathbf{A} \cdot \text{Var}(\mathbf{Y} \mid \mathbf{Z})) + [E(\mathbf{Y} \mid \mathbf{Z})]' \mathbf{A} [E(\mathbf{Y} \mid \mathbf{Z})]. $$
This identity follows from $\mathbf{Y}' \mathbf{A} \mathbf{Y} = \text{trace}(\mathbf{A} \mathbf{Y} \mathbf{Y}')$ together with $E(\mathbf{Y} \mathbf{Y}' \mid \mathbf{Z}) = \text{Var}(\mathbf{Y} \mid \mathbf{Z}) + E(\mathbf{Y} \mid \mathbf{Z}) [E(\mathbf{Y} \mid \mathbf{Z})]'$; it uses only the first two moments and does not require normality.
Substituting the known quantities:
$$ E(\mathbf{Y}' \mathbf{A} \mathbf{Y} \mid \mathbf{Z}) = \text{trace}(\mathbf{A} \cdot \sigma^2 \mathbf{I}_n) + (\mathbf{Z} \boldsymbol{\beta})' \mathbf{A} (\mathbf{Z} \boldsymbol{\beta}). $$
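The quadratic-form identity used above holds for any distribution with finite second moments. As a rough illustration, the sketch below treats the empirical distribution of a finite set of vectors as the distribution of $\mathbf{Y}$, so that both sides of the identity can be computed exactly rather than approximated. The matrix `A`, the data `ys`, and the seed are arbitrary assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 1000, 4                               # hypothetical number of points and dimension

A = rng.normal(size=(n, n))
A = (A + A.T) / 2                            # symmetrize, matching the setting in the text

# Deliberately non-normal, correlated vectors: one row per realization of Y
ys = rng.exponential(size=(N, n)) @ rng.normal(size=(n, n))

mu = ys.mean(axis=0)                         # E(Y) under the empirical distribution
Sigma = np.cov(ys, rowvar=False, bias=True)  # Var(Y) with divisor N (population form)

# Left side: E(Y'AY), the average of y_i' A y_i over all rows
lhs = np.mean(np.einsum("ij,jk,ik->i", ys, A, ys))
# Right side: trace(A Var(Y)) + E(Y)' A E(Y)
rhs = np.trace(A @ Sigma) + mu @ A @ mu

print("identity holds:", np.isclose(lhs, rhs))
```

Using `bias=True` matters here: the identity is stated in terms of the population covariance, so the divisor must be `N` rather than `N - 1`.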
Evaluating each term:

  1. First term: $\text{trace}(\mathbf{A} \cdot \sigma^2 \mathbf{I}_n) = \sigma^2 \cdot \text{trace}(\mathbf{A})$

    • Since $\mathbf{A} = \mathbf{I} - \mathbf{H}$, we have $\text{trace}(\mathbf{A}) = \text{trace}(\mathbf{I}_n) - \text{trace}(\mathbf{H})$
    • $\text{trace}(\mathbf{I}_n) = n$
    • $\text{trace}(\mathbf{H}) = \text{trace}\left( \mathbf{Z} (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}' \right) = \text{trace}\left( (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}' \mathbf{Z} \right) = \text{trace}(\mathbf{I}_{r+1}) = r + 1$, using the cyclic property of the trace; the inverse $(\mathbf{Z}' \mathbf{Z})^{-1}$ exists because $\mathbf{Z}$ has full column rank.
    • Therefore, $\text{trace}(\mathbf{A}) = n - (r + 1)$
    • $\text{trace}(\mathbf{A} \cdot \sigma^2 \mathbf{I}_n) = \sigma^2 (n - r - 1)$.
  2. Second term: $(\mathbf{Z} \boldsymbol{\beta})' \mathbf{A} (\mathbf{Z} \boldsymbol{\beta}) = \boldsymbol{\beta}' \mathbf{Z}' \mathbf{A} \mathbf{Z} \boldsymbol{\beta}$

    • Substituting $\mathbf{A} = \mathbf{I} - \mathbf{H}$ gives $\mathbf{A} \mathbf{Z} = (\mathbf{I} - \mathbf{H}) \mathbf{Z} = \mathbf{Z} - \mathbf{H} \mathbf{Z}$
    • Since $\mathbf{H} \mathbf{Z} = \mathbf{Z} (\mathbf{Z}' \mathbf{Z})^{-1} \mathbf{Z}' \mathbf{Z} = \mathbf{Z}$ (again relying on $\mathbf{Z}$ having full column rank, so that $\mathbf{Z}' \mathbf{Z}$ is invertible),
    • it follows that $\mathbf{A} \mathbf{Z} = \mathbf{Z} - \mathbf{Z} = \mathbf{0}$
    • $\boldsymbol{\beta}' \mathbf{Z}' \mathbf{A} \mathbf{Z} \boldsymbol{\beta} = \boldsymbol{\beta}' \mathbf{Z}' \mathbf{0} \boldsymbol{\beta} = 0$.
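Both computations above are easy to confirm numerically. The sketch below builds an arbitrary full-column-rank design matrix (the dimensions, coefficients, and seed are hypothetical) and checks that $\text{trace}(\mathbf{H}) = r + 1$, $\text{trace}(\mathbf{I} - \mathbf{H}) = n - r - 1$, and $(\mathbf{I} - \mathbf{H}) \mathbf{Z} = \mathbf{0}$ up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 30, 4                                  # hypothetical dimensions
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])  # full column rank with probability 1
beta = rng.normal(size=r + 1)                 # hypothetical coefficient vector

H = Z @ np.linalg.solve(Z.T @ Z, Z.T)         # hat matrix
A = np.eye(n) - H                             # residual-maker matrix I - H

trace_H = np.trace(H)                         # first term: should equal r + 1
trace_A = np.trace(A)                         # should equal n - r - 1
max_abs_AZ = np.abs(A @ Z).max()              # second term: A Z should vanish
second_term = beta @ Z.T @ A @ Z @ beta       # beta' Z' A Z beta

print("trace(H) == r + 1:    ", np.isclose(trace_H, r + 1))
print("trace(A) == n - r - 1:", np.isclose(trace_A, n - r - 1))
print("A Z is numerically 0: ", max_abs_AZ < 1e-10)
print("second term is ~0:    ", abs(second_term) < 1e-8)
```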

Combining the two terms:
$$ E(\mathbf{Y}' \mathbf{A} \mathbf{Y} \mid \mathbf{Z}) = \sigma^2 (n - r - 1) + 0 = \sigma^2 (n - r - 1). $$
Taking the unconditional expectation via the law of iterated expectations (the conditional expectation does not depend on $\mathbf{Z}$, because $\sigma^2$, $n$, and $r$ are constants):
$$ E(\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}) = E\left[ \mathbf{Y}' (\mathbf{I} - \mathbf{H}) \mathbf{Y} \right] = E\left[ E\left( \mathbf{Y}' (\mathbf{I} - \mathbf{H}) \mathbf{Y} \mid \mathbf{Z} \right) \right] = E\left[ \sigma^2 (n - r - 1) \right] = \sigma^2 (n - r - 1). $$
Recall that the sample variance is:
$$ s^2 = \frac{\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}}{n - r - 1}. $$
Therefore:
$$ E(s^2) = E\left[ \frac{\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}}}{n - r - 1} \right] = \frac{E(\widehat{\boldsymbol{\varepsilon}}' \widehat{\boldsymbol{\varepsilon}})}{n - r - 1} = \frac{\sigma^2 (n - r - 1)}{n - r - 1} = \sigma^2. $$
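The result can also be illustrated by simulation. The sketch below is only a sanity check, not part of the proof: it holds a hypothetical design matrix fixed, draws many error vectors, and averages $s^2$ across replications. The errors are drawn from a centered uniform distribution (an assumption made here to emphasize that normality is not required); all numeric settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, sigma2, reps = 20, 2, 4.0, 20000        # hypothetical simulation settings
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])  # held fixed across replications
beta = np.array([1.0, -2.0, 0.5])             # hypothetical true coefficients
A = np.eye(n) - Z @ np.linalg.solve(Z.T @ Z, Z.T)           # I - H

# Uniform(-a, a) has variance a^2 / 3, so a = sqrt(3 * sigma2) gives variance sigma2
half_width = np.sqrt(3.0 * sigma2)
eps = rng.uniform(-half_width, half_width, size=(reps, n))

Y = Z @ beta + eps                            # one simulated response vector per row
resid = Y @ A                                 # row-wise (I - H) y, valid because A is symmetric
rss = np.sum(resid**2, axis=1)                # residual sum of squares per replication

mean_s2 = rss.mean() / (n - r - 1)            # average of the unbiased estimator
mean_rss_over_n = rss.mean() / n              # average when dividing by n instead

print("average s^2:       ", mean_s2)
print("average RSS / n:   ", mean_rss_over_n)
print("sigma^2:           ", sigma2)
print("sigma^2 (n-r-1)/n: ", sigma2 * (n - r - 1) / n)
```

Dividing by $n$ instead of $n - r - 1$ gives an estimator whose expectation is $\sigma^2 (n - r - 1) / n$, which is why the degrees-of-freedom correction is needed for unbiasedness.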

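Conclusion:

Under the Gauss-Markov assumptions, $E(s^2) = \sigma^2$, i.e. the sample variance $s^2$ is an unbiased estimator of the population variance $\sigma^2$. The proof relies on linearity of the model, zero-mean errors, homoscedasticity and no autocorrelation, and a design matrix of full column rank.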

