Optimal Estimation Criteria and Methods (2): Linear Minimum Mean Squared Error Estimation (LMMSE) - Study Notes


Preface

Linear minimum variance estimation (Linear Minimum Mean Squared Error, LMMSE) is a special case of minimum mean squared error (MMSE) estimation. As the foundation of Kalman filtering, it occupies a very important position in optimal estimation theory. This article introduces this optimal estimation criterion and method in detail.

Optimal Estimation Problem

Most estimation problems in practical engineering involve multi-dimensional random vectors, set up as follows:
Let $X$ be an $n$-dimensional random vector, $Z$ an $m$-dimensional random vector that observes $X$, $\hat{X}$ the estimate of $X$ (a function of $Z$), and $\tilde{X}$ the estimation error of $\hat{X}$; the goal is an optimal estimate of $X$. Let $A$ be an $n \times m$ matrix and $B$ an $n$-dimensional vector, so that $\hat{X}$ and $Z$ satisfy a linear relationship; the estimator $\hat{X}$ is then determined by extremizing the cost given by the optimal estimation criterion.
$$
\begin{align*}
\hat{X}&=AZ+B \tag{1}\\
\tilde{X}&=X-\hat{X}=X-(AZ+B) \tag{2}
\end{align*}
$$

Linear Minimum Mean Squared Error Estimation (LMMSE)

The linear minimum variance criterion is the same as the minimum variance criterion: it seeks the estimate that minimizes the ensemble mean of the squared estimation error [1]. It is commonly abbreviated as linear minimum mean squared error (LMMSE) [2].
The cost function is:
$$
J = E[\tilde{X}^{T}\tilde{X}] = \mathrm{Tr}\!\left(E[\tilde{X}\tilde{X}^{T}]\right) = E[(X-(AZ+B))^{T}(X-(AZ+B))] = \min \tag{3}
$$

Minimization

Minimum mean squared error (MMSE) estimation requires knowledge of the conditional probability density function $P(X|Z)$, the joint probability density function $P(X,Z)$, and the conditional expectation $E[X|Z]$ during minimization. Note that these demanding prior conditions severely limit the engineering application of that method [1]. Linear minimum mean squared error (LMMSE) estimation, by contrast, only requires the first and second moments of the random vectors $X$ and $Z$, namely $E[X]$, $E[Z]$, $Var(X)$, $Var(Z)$, and $Cov(X,Z)$, which makes engineering application considerably easier. The minimization proceeds as follows:
$$
J = E[(X-(AZ+B))^{T}(X-(AZ+B))] \tag{4}
$$
First, take the partial derivative of $J$ in Eq. (4) with respect to $B$ and set it to zero:
$$
\begin{align*}
\frac{\partial J}{\partial B}&=0\\
\frac{\partial J}{\partial (X-(AZ+B))}\frac{\partial (X-(AZ+B))}{\partial B}&=0\\
-2E[X-(AZ+B)] &= 0\\
E[X-AZ-B] &= 0\\
E[X]-AE[Z]-B &= 0\\
B &= E[X]-AE[Z] \tag{5}
\end{align*}
$$
Next, take the partial derivative of $J$ in Eq. (4) with respect to $A$, substitute $B$ from Eq. (5), and set it to zero:
$$
\begin{align*}
\frac{\partial J}{\partial A}&=0\\
\frac{\partial J}{\partial (X-(AZ+B))}\frac{\partial (X-(AZ+B))}{\partial A}&=0 \\
-2E[(X-(AZ+B))Z^{T}] &= 0 \tag{6} \\
E[(X-B)Z^{T}] - AE[ZZ^{T}]&= 0\\
E[XZ^{T}]-BE[Z^{T}] - AE[ZZ^{T}]&= 0\\
E[XZ^{T}]-(E[X]-AE[Z])E[Z^{T}] - AE[ZZ^{T}]&= 0\\
E[XZ^{T}]-E[X]E[Z^{T}]+AE[Z]E[Z^{T}]-AE[ZZ^{T}]&= 0\\
(E[XZ^{T}]-E[X]E[Z^{T}])-A(E[ZZ^{T}]-E[Z]E[Z^{T}])&= 0\\
E[(X-E[X])(Z-E[Z])^{T}]-AE[(Z-E[Z])(Z-E[Z])^{T}]&= 0\\
Cov(X,Z)-A\,Var(Z) &= 0\\
A &= Cov(X,Z)\,Var(Z)^{-1} \tag{7}
\end{align*}
$$
A A A代入式(5),并求出 B B B
$$
\begin{align*}
B &= E[X]-AE[Z] \\
&= E[X]-Cov(X,Z)\,Var(Z)^{-1}E[Z] \tag{8}
\end{align*}
$$
A A A B B B代入式(1),并求出 X ^ \hat{X} X^
$$
\begin{align*}
\hat{X}&=AZ+B\\
&=Cov(X,Z)\,Var(Z)^{-1}Z+E[X]-Cov(X,Z)\,Var(Z)^{-1}E[Z]\\
&=E[X]+Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z]) \tag{9}
\end{align*}
$$
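To make Eq. (9) concrete, here is a minimal numerical sketch (an added example, not from the original text) that applies the LMMSE formula to a hypothetical linear-Gaussian model $Z = HX + v$; the dimensions, matrices, and noise level are arbitrary assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear observation model Z = H X + v (illustrative only).
n, m, N = 2, 3, 100_000                 # state dim, observation dim, Monte Carlo samples
H = rng.normal(size=(m, n))             # arbitrary observation matrix

mean_X = np.array([1.0, -2.0])          # E[X]
var_X = np.array([[2.0, 0.3],
                  [0.3, 1.0]])          # Var(X)
var_v = 0.5 * np.eye(m)                 # measurement-noise covariance

# First and second moments implied by Z = H X + v (v zero mean, independent of X).
mean_Z = H @ mean_X                     # E[Z]
cov_XZ = var_X @ H.T                    # Cov(X, Z)
var_Z = H @ var_X @ H.T + var_v         # Var(Z)

# Eq. (7) and Eq. (8): the optimal A and B.
A = cov_XZ @ np.linalg.inv(var_Z)
B = mean_X - A @ mean_Z

# Draw samples and apply Eq. (1) / Eq. (9) row-wise.
X = rng.multivariate_normal(mean_X, var_X, size=N)
V = rng.multivariate_normal(np.zeros(m), var_v, size=N)
Z = X @ H.T + V
X_hat = Z @ A.T + B

print("empirical mean squared error:", np.mean(np.sum((X - X_hat) ** 2, axis=1)))
```

The printed mean squared error should approach the trace of the error covariance matrix in Eq. (16) below; for jointly Gaussian $X$ and $Z$, as in this sketch, the LMMSE estimate also coincides with the conditional mean $E[X|Z]$.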

Unbiasedness

The expectation of the LMMSE estimate $\hat{X}$ is:
$$
\begin{align*}
E[\hat{X}] &= E[E[X]+Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z])] \\
&= E[X]+Cov(X,Z)\,Var(Z)^{-1}E[Z-E[Z]] \\
&= E[X] \tag{10}
\end{align*}
$$
Clearly, the LMMSE estimate is unbiased, that is:
$$
E[\tilde{X}]=E[X-\hat{X}]=0 \tag{11}
$$

Covariance Matrix

The covariance matrix of the LMMSE estimation error $\tilde{X}$ is [1]:
$$
\begin{align*}
E[\tilde{X}\tilde{X}^{T}] &= E[(X-\hat{X})(X-\hat{X})^{T}] \\
&= E[(X-(E[X]+Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z])))(X-(E[X]+Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z])))^{T}] \\
&= E[((X-E[X])-Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z]))((X-E[X])-Cov(X,Z)\,Var(Z)^{-1}(Z-E[Z]))^{T}] \\
&= C_{1}+C_{2}+C_{3} \tag{12}
\end{align*}
$$
where
$$
\begin{align*}
C_{1} &= E[(X-E[X])(X-E[X])^{T}] \\
&= Var(X) \tag{13}
\end{align*}
$$
$$
\begin{align*}
C_{2} &= -E[(X-E[X])(Z-E[Z])^{T}][Var(Z)^{-1}]^{T}Cov(X,Z)^{T} \\
&\quad - Cov(X,Z)\,Var(Z)^{-1}E[(Z-E[Z])(X-E[X])^{T}] \\
&= -Cov(X,Z)\,Var(Z)^{-1}Cov(Z,X) - Cov(X,Z)\,Var(Z)^{-1}Cov(Z,X) \\
&= -2\,Cov(X,Z)\,Var(Z)^{-1}Cov(Z,X) \tag{14}
\end{align*}
$$
$$
\begin{align*}
C_{3} &= Cov(X,Z)\,Var(Z)^{-1}E[(Z-E[Z])(Z-E[Z])^{T}][Var(Z)^{-1}]^{T}Cov(X,Z)^{T} \\
&= Cov(X,Z)\,Var(Z)^{-1}E[(Z-E[Z])(Z-E[Z])^{T}]Var(Z)^{-1}Cov(Z,X) \\
&= Cov(X,Z)\,Var(Z)^{-1}Var(Z)\,Var(Z)^{-1}Cov(Z,X) \\
&= Cov(X,Z)\,Var(Z)^{-1}Cov(Z,X) \tag{15}
\end{align*}
$$
Substituting Eqs. (13), (14), and (15) into Eq. (12) gives:
$$
E[\tilde{X}\tilde{X}^{T}] = Var(X) - Cov(X,Z)\,Var(Z)^{-1}Cov(Z,X) \tag{16}
$$
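As a concrete scalar illustration (an added example, not from the original), let $X$ and $Z$ be scalars with variances $\sigma_X^2$ and $\sigma_Z^2$ and correlation coefficient $\rho$, so that $Cov(X,Z)=\rho\,\sigma_X\sigma_Z$. Eq. (16) then reduces to:

$$
E[\tilde{X}^{2}] = \sigma_X^2 - \frac{(\rho\,\sigma_X\sigma_Z)^2}{\sigma_Z^2} = (1-\rho^2)\,\sigma_X^2
$$

The error variance never exceeds the prior variance $\sigma_X^2$, and the reduction grows with the correlation between $X$ and $Z$; when $\rho = 0$ the observation carries no second-order information and the estimate degenerates to $\hat{X}=E[X]$.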

Linear Transformation

From Eqs. (2), (6), and (11), we have
$$
\begin{align*}
E[(X-(AZ+B))Z^{T}] &= E[\tilde{X}Z^{T}] = 0 \tag{17}\\
E[X-(AZ+B)] &= E[\tilde{X}] = 0 \tag{18}
\end{align*}
$$
Figure 1. Geometric illustration of the orthogonal projection theorem [3]

According to the orthogonal projection theorem [2], geometrically, the estimate $\hat{X}$ is the projection of $X$ onto the linear space of $Z$ determined by $A$ and $B$ (the plane containing $Z$ and $B$). Since the mean of the estimation error is zero, the perpendicular from $X$ to that plane has zero length on average, i.e. $\hat{X}$ and $X$ coincide in the mean, so the LMMSE estimate is unbiased. The linear transformation can be understood as follows: the observation $Z$ itself cannot be adjusted, but $B$ can be chosen so that the plane containing $Z$ and $B$ passes through $X$ in the mean, and $A$, $B$, and $Z$ then determine the projection vector within that plane, which is the optimal unbiased estimate. A numerical check of these orthogonality and unbiasedness conditions follows below.
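The unbiasedness result (11), the error covariance (16), and the orthogonality condition (17) can all be checked numerically. Below is a minimal Monte Carlo sketch reusing the same hypothetical model $Z = HX + v$ as in the earlier example (all parameters are illustrative assumptions); each sample quantity should approach its theoretical counterpart as the number of samples grows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same hypothetical linear-Gaussian model as before: Z = H X + v.
n, m, N = 2, 3, 200_000
H = rng.normal(size=(m, n))
mean_X = np.array([1.0, -2.0])
var_X = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
var_v = 0.5 * np.eye(m)

mean_Z = H @ mean_X
cov_XZ = var_X @ H.T
var_Z = H @ var_X @ H.T + var_v

A = cov_XZ @ np.linalg.inv(var_Z)        # Eq. (7)
B = mean_X - A @ mean_Z                  # Eq. (8)

X = rng.multivariate_normal(mean_X, var_X, size=N)
V = rng.multivariate_normal(np.zeros(m), var_v, size=N)
Z = X @ H.T + V
X_tilde = X - (Z @ A.T + B)              # estimation error, Eq. (2)

# Eq. (11): the error has zero mean (unbiasedness).
print("sample E[X_tilde]      ~", X_tilde.mean(axis=0))
# Eq. (17): the error is orthogonal to the observation, E[X_tilde Z^T] = 0.
print("sample E[X_tilde Z^T]  ~\n", X_tilde.T @ Z / N)
# Eq. (16): the error covariance matches Var(X) - Cov(X,Z) Var(Z)^{-1} Cov(Z,X).
print("sample error covariance ~\n", X_tilde.T @ X_tilde / N)
print("Eq. (16)                =\n", var_X - cov_XZ @ np.linalg.inv(var_Z) @ cov_XZ.T)
```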

References

[1] Optimal Estimation Criteria and Methods (1): Minimum Mean Squared Error Estimation (MMSE) - Study Notes. https://blog.csdn.net/jimmychao1982/article/details/149478176
[2] Liu Sheng, Zhang Hongmei. Optimal Estimation Theory (最优估计理论). Higher Education Press, 2011.
[3] Yandld. 3-2 The Orthogonality Theorem (正交定理). https://www.bilibili.com/video/BV1wj411H7j7?t=2.7

