Similarity Diagonalization of Matrices

Published: 2025-04-10

1-10 Similarity Diagonalization of Matrices

A central question in the theory of linear transformations is the following: given a linear transformation $\mathscr{A}$ on an $n$-dimensional linear space $V$, does $V$ have a basis in which the matrix of $\mathscr{A}$ is diagonal?

Definition 1.10.1 A linear transformation $\mathscr{A}$ on an $n$-dimensional linear space $V$ over a number field $F$ is called diagonalizable if $V$ has a basis in which the matrix of $\mathscr{A}$ is diagonal.

Definition 1.10.2 If an $n \times n$ matrix $\boldsymbol{A}$ is similar to a diagonal matrix, then $\boldsymbol{A}$ is called diagonalizable; such an $\boldsymbol{A}$ is also called a simple matrix.

Let $\mathscr{A}$ be a linear transformation on the $n$-dimensional linear space $V$, and let $\boldsymbol{A}$ be the matrix of $\mathscr{A}$ with respect to the basis $\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n$, that is,

$$
\mathscr{A}\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right)=\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right) \boldsymbol{A}
$$

It is not hard to prove the following.

Theorem 1.10.1 The linear transformation $\mathscr{A}$ is diagonalizable if and only if $\boldsymbol{A}$ is diagonalizable. (Proof omitted.)

Hence it suffices to study the diagonalizability of matrices.

I. Conditions for a matrix $\boldsymbol{A}$ to be diagonalizable

Theorem 1.10.2 An $n \times n$ matrix $\boldsymbol{A}$ is diagonalizable if and only if $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

Proof. Necessity: suppose a full-rank matrix $\boldsymbol{P}$ satisfies

$$
\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right) \tag{1.10.1}
$$

Partition $\boldsymbol{P}$ by columns:

$$
\boldsymbol{P}=\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right) \tag{1.10.2}
$$

Substituting (1.10.2) into (1.10.1) gives

$$
\boldsymbol{A}\left(\boldsymbol{\alpha}_1, \cdots, \boldsymbol{\alpha}_n\right)=\left(\boldsymbol{\alpha}_1, \cdots, \boldsymbol{\alpha}_n\right) \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right)
$$

and therefore

$$
\boldsymbol{A} \boldsymbol{\alpha}_i=\lambda_i \boldsymbol{\alpha}_i \quad(i=1,2, \cdots, n)
$$

Since $\boldsymbol{P}$ has full rank, $\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n$ are linearly independent (in particular, nonzero). Together with the relations $\boldsymbol{A} \boldsymbol{\alpha}_i=\lambda_i \boldsymbol{\alpha}_i$ obtained from (1.10.1), this shows that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

Sufficiency: suppose $\boldsymbol{A}$ has $n$ linearly independent eigenvectors $\boldsymbol{\alpha}_1, \cdots, \boldsymbol{\alpha}_n$, that is, $\boldsymbol{A} \boldsymbol{\alpha}_i=\lambda_i \boldsymbol{\alpha}_i$ $(i=1,2, \cdots, n)$. Set

$$
\boldsymbol{P}=\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right)
$$

Clearly $\boldsymbol{P}$ has full rank, and

$$
\begin{aligned}
\boldsymbol{A} \boldsymbol{P} &= \boldsymbol{A}\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right) \\
&= \left(\boldsymbol{A} \boldsymbol{\alpha}_1, \boldsymbol{A} \boldsymbol{\alpha}_2, \cdots, \boldsymbol{A} \boldsymbol{\alpha}_n\right) \\
&= \left(\lambda_1 \boldsymbol{\alpha}_1, \lambda_2 \boldsymbol{\alpha}_2, \cdots, \lambda_n \boldsymbol{\alpha}_n\right) \\
&= \left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n\right) \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right) \\
&= \boldsymbol{P} \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right)
\end{aligned}
$$

so that

$$
\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right)
$$

Corollary Suppose $\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right)$. Then $\lambda_1, \lambda_2, \cdots, \lambda_n$ are the $n$ eigenvalues of $\boldsymbol{A}$, and the $i$-th column of $\boldsymbol{P}$ is an eigenvector of $\boldsymbol{A}$ belonging to $\lambda_i$.

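Theorem 1.10.2 shows that not every linear transformation admits a basis in which its matrix is diagonal. A linear transformation whose matrix is diagonal in some basis is called a diagonalizable transformation.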
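The relation $\boldsymbol{P}^{-1}\boldsymbol{A}\boldsymbol{P}=\operatorname{diag}(\lambda_1,\cdots,\lambda_n)$ can be checked numerically on a small case. The following is a minimal C++ sketch restricted to $2 \times 2$ matrices; the alias `Mat2` and the helper names `multiply`, `inverse`, `similarity`, and `isDiagonal` are illustrative choices made here, not part of the original text. It forms $\boldsymbol{P}^{-1}\boldsymbol{A}\boldsymbol{P}$ and tests whether the result is diagonal.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat2 = std::array<std::array<double, 2>, 2>;

// Product of two 2x2 matrices.
Mat2 multiply(const Mat2& X, const Mat2& Y) {
    Mat2 Z{};  // value-initialised, so every entry starts at zero
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                Z[i][j] += X[i][k] * Y[k][j];
    return Z;
}

// Inverse of a 2x2 matrix via the adjugate formula; P must have full rank.
Mat2 inverse(const Mat2& X) {
    const double det = X[0][0] * X[1][1] - X[0][1] * X[1][0];
    assert(std::abs(det) > 1e-12 && "matrix must have full rank");
    Mat2 inv{};
    inv[0][0] =  X[1][1] / det;
    inv[0][1] = -X[0][1] / det;
    inv[1][0] = -X[1][0] / det;
    inv[1][1] =  X[0][0] / det;
    return inv;
}

// Returns P^{-1} A P.
Mat2 similarity(const Mat2& A, const Mat2& P) {
    return multiply(inverse(P), multiply(A, P));
}

// True when both off-diagonal entries vanish up to the tolerance.
bool isDiagonal(const Mat2& X, double tol = 1e-12) {
    return std::abs(X[0][1]) < tol && std::abs(X[1][0]) < tol;
}
```

For instance, with $\boldsymbol{A}=\begin{bmatrix}4&1\\2&3\end{bmatrix}$ and $\boldsymbol{P}=\begin{bmatrix}1&1\\1&-2\end{bmatrix}$, whose columns are eigenvectors for the eigenvalues 5 and 2, `similarity(A, P)` returns $\operatorname{diag}(5,2)$ up to rounding, whereas taking $\boldsymbol{P}$ to be the identity leaves the off-diagonal entries of $\boldsymbol{A}$ in place.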

Example 1.10.1 Consider the system of linear differential equations

$$
\left\{\begin{array}{l}
\dfrac{\mathrm{d} x_1}{\mathrm{d} t}=a_{11} x_1+a_{12} x_2+\cdots+a_{1 n} x_n \\
\dfrac{\mathrm{d} x_2}{\mathrm{d} t}=a_{21} x_1+a_{22} x_2+\cdots+a_{2 n} x_n \\
\qquad\vdots \\
\dfrac{\mathrm{d} x_n}{\mathrm{d} t}=a_{n 1} x_1+a_{n 2} x_2+\cdots+a_{n n} x_n
\end{array}\right. \tag{1}
$$

Let

$$
\boldsymbol{X}=\left(\begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right), \quad
\frac{\mathrm{d} \boldsymbol{X}}{\mathrm{d} t}=\left(\begin{array}{c} \frac{\mathrm{d} x_1}{\mathrm{d} t} \\ \vdots \\ \frac{\mathrm{d} x_n}{\mathrm{d} t} \end{array}\right), \quad
\boldsymbol{A}=\left[\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1 n} \\
a_{21} & a_{22} & \cdots & a_{2 n} \\
\vdots & \vdots & & \vdots \\
a_{n 1} & a_{n 2} & \cdots & a_{n n}
\end{array}\right]
$$

Then system (1) can be written in matrix form as

$$
\frac{\mathrm{d} \boldsymbol{X}}{\mathrm{d} t}=\boldsymbol{A} \boldsymbol{X} \tag{2}
$$

Suppose $\boldsymbol{A}$ is diagonalizable, that is, there exists $\boldsymbol{P} \in C_n^{n \times n}$ (an $n \times n$ complex matrix of rank $n$) such that

$$
\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}=\boldsymbol{\Lambda}=\operatorname{diag}\left(\lambda_1, \cdots, \lambda_n\right)
$$

Set

$$
\boldsymbol{X}=\boldsymbol{P} \boldsymbol{Y} \tag{3}
$$

where $\boldsymbol{Y}=\left(y_1, \cdots, y_n\right)^{\mathrm{T}}$. Substituting (3) into (2) gives

$$
\frac{\mathrm{d}(\boldsymbol{P} \boldsymbol{Y})}{\mathrm{d} t}=\boldsymbol{A} \boldsymbol{P} \boldsymbol{Y}
$$

and, since $\boldsymbol{P}$ is constant,

$$
\boldsymbol{P} \frac{\mathrm{d} \boldsymbol{Y}}{\mathrm{d} t}=\boldsymbol{A} \boldsymbol{P} \boldsymbol{Y}
$$

Multiplying both sides on the left by $\boldsymbol{P}^{-1}$ gives

$$
\frac{\mathrm{d} \boldsymbol{Y}}{\mathrm{d} t}=\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P} \boldsymbol{Y}=\boldsymbol{\Lambda} \boldsymbol{Y}
$$

that is,

$$
\left\{\begin{array}{l}
\dfrac{\mathrm{d} y_1}{\mathrm{d} t}=\lambda_1 y_1 \\
\dfrac{\mathrm{d} y_2}{\mathrm{d} t}=\lambda_2 y_2 \\
\qquad\vdots \\
\dfrac{\mathrm{d} y_n}{\mathrm{d} t}=\lambda_n y_n
\end{array}\right.
$$

Therefore

$$
y_1=c_1 \mathrm{e}^{\lambda_1 t}, \quad y_2=c_2 \mathrm{e}^{\lambda_2 t}, \quad \cdots, \quad y_n=c_n \mathrm{e}^{\lambda_n t}
$$

where $c_1, c_2, \cdots, c_n$ are arbitrary constants. Substituting these into (3) yields the solution $x_1, x_2, \cdots, x_n$ of the differential system.
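The decoupling argument of Example 1.10.1 can be turned into a short computation. Below is a hedged C++ sketch for the $2 \times 2$ case, assuming the eigenvalues and an eigenvector matrix $\boldsymbol{P}$ are already known; the aliases `Vec2`, `Mat2` and the function names `constantsFromInitialValue` and `solutionAt` are hypothetical names introduced for this illustration. The constants $c_i$ are fixed by an initial value through $\boldsymbol{P}\boldsymbol{c}=\boldsymbol{X}(0)$, and the solution is $\boldsymbol{X}(t)=\boldsymbol{P}\,(c_1\mathrm{e}^{\lambda_1 t}, c_2\mathrm{e}^{\lambda_2 t})^{\mathrm{T}}$.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Solves P c = x0 by Cramer's rule, i.e. c = P^{-1} X(0).
// P must have full rank (its columns are eigenvectors of A).
Vec2 constantsFromInitialValue(const Mat2& P, const Vec2& x0) {
    const double det = P[0][0] * P[1][1] - P[0][1] * P[1][0];
    assert(std::abs(det) > 1e-12 && "P must have full rank");
    return {(x0[0] * P[1][1] - P[0][1] * x0[1]) / det,
            (P[0][0] * x0[1] - x0[0] * P[1][0]) / det};
}

// Evaluates X(t) = P Y(t), where y_i(t) = c_i * exp(lambda_i * t).
Vec2 solutionAt(const Mat2& P, const Vec2& lambda, const Vec2& c, double t) {
    const Vec2 y = {c[0] * std::exp(lambda[0] * t),
                    c[1] * std::exp(lambda[1] * t)};
    return {P[0][0] * y[0] + P[0][1] * y[1],
            P[1][0] * y[0] + P[1][1] * y[1]};
}
```

As an example chosen here, $\boldsymbol{A}=\begin{bmatrix}0&1\\-2&-3\end{bmatrix}$ has eigenvalues $-1$ and $-2$ with eigenvectors $(1,-1)^{\mathrm{T}}$ and $(1,-2)^{\mathrm{T}}$. For $\boldsymbol{X}(0)=(1,0)^{\mathrm{T}}$ the constants are $c_1=2$, $c_2=-1$, giving $x_1(t)=2\mathrm{e}^{-t}-\mathrm{e}^{-2t}$ and $x_2(t)=-2\mathrm{e}^{-t}+2\mathrm{e}^{-2t}$.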
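Theorem 1.10.3 A matrix $\boldsymbol{A}$ is diagonalizable if and only if the geometric multiplicity of every eigenvalue of $\boldsymbol{A}$ equals its algebraic multiplicity.

Proof. Let the spectrum of the $n \times n$ matrix $\boldsymbol{A}$ be $\left\{\lambda_1, \lambda_2, \cdots, \lambda_r\right\}$, and let $\lambda_i$ have algebraic multiplicity $p_i$ and geometric multiplicity $q_i$ $(i=1,2, \cdots, r)$. Then

$$
p_1+p_2+\cdots+p_r=n
$$

By Theorem 1.8.5,

$$
q_1+q_2+\cdots+q_r \leqslant p_1+p_2+\cdots+p_r=n
$$

If $\boldsymbol{A}$ is diagonalizable, then by Theorem 1.10.2 it has $n$ linearly independent eigenvectors, so

$$
q_1+q_2+\cdots+q_r=n
$$

Since the geometric multiplicity of an eigenvalue never exceeds its algebraic multiplicity ($q_i \leqslant p_i$ for each $i$), equality of the two sums forces

$$
q_1=p_1, \quad q_2=p_2, \quad \cdots, \quad q_r=p_r
$$

Conversely, if $q_i=p_i$ for every $i$, then $q_1+q_2+\cdots+q_r=n$. Because eigenvectors belonging to distinct eigenvalues are linearly independent, the union of bases of the eigenspaces consists of $n$ linearly independent eigenvectors, and $\boldsymbol{A}$ is diagonalizable by Theorem 1.10.2.

Corollary If all eigenvalues of $\boldsymbol{A}$ are simple roots of the characteristic polynomial, then $\boldsymbol{A}$ is diagonalizable.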
Theorem 1.10.4 Let the spectrum of the $n \times n$ matrix $\boldsymbol{A}$ be $\left\{\lambda_1, \lambda_2, \cdots, \lambda_r\right\}$, where the eigenvalue $\lambda_i$ has algebraic multiplicity $p_i$ $(i=1,2, \cdots, r)$. Then $\boldsymbol{A}$ is similar to a diagonal matrix if and only if

$$
p_i=n-\operatorname{rank}\left(\lambda_i \boldsymbol{E}-\boldsymbol{A}\right) \quad(i=1,2, \cdots, r)
$$

Proof. By Theorem 1.10.3, $\boldsymbol{A}$ is diagonalizable if and only if the algebraic multiplicity $p_i$ of each $\lambda_i$ equals its geometric multiplicity $q_i$. The geometric multiplicity of $\lambda_i$ is the number of vectors in a fundamental system of solutions of the homogeneous linear system $\left(\lambda_i \boldsymbol{E}-\boldsymbol{A}\right) \boldsymbol{x}=\boldsymbol{0}$, that is, $q_i=n-\operatorname{rank}\left(\lambda_i \boldsymbol{E}-\boldsymbol{A}\right)$.
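The rank criterion of Theorem 1.10.4 translates directly into a test. The sketch below is one possible C++ rendering under two assumptions that are not in the original text: the spectrum with its algebraic multiplicities is supplied by the caller, and rank is decided by Gaussian elimination with a fixed tolerance (a heuristic in floating point). The names `Matrix`, `rankOf`, `geometricMultiplicity`, and `isDiagonalizable` are illustrative.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Rank by Gaussian elimination with partial pivoting.
// Entries smaller than tol in absolute value are treated as zero.
std::size_t rankOf(Matrix M, double tol = 1e-9) {
    const std::size_t rows = M.size();
    const std::size_t cols = rows ? M[0].size() : 0;
    std::size_t rank = 0;
    for (std::size_t col = 0; col < cols && rank < rows; ++col) {
        // Choose the largest candidate pivot in this column.
        std::size_t pivot = rank;
        for (std::size_t r = rank + 1; r < rows; ++r)
            if (std::abs(M[r][col]) > std::abs(M[pivot][col])) pivot = r;
        if (std::abs(M[pivot][col]) < tol) continue;  // no pivot here
        std::swap(M[rank], M[pivot]);
        // Eliminate the entries below the pivot.
        for (std::size_t r = rank + 1; r < rows; ++r) {
            const double factor = M[r][col] / M[rank][col];
            for (std::size_t c = col; c < cols; ++c)
                M[r][c] -= factor * M[rank][c];
        }
        ++rank;
    }
    return rank;
}

// Geometric multiplicity q = n - rank(lambda * E - A).
std::size_t geometricMultiplicity(const Matrix& A, double lambda) {
    const std::size_t n = A.size();
    Matrix M = A;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            M[i][j] = (i == j ? lambda : 0.0) - A[i][j];
    return n - rankOf(M);
}

// spectrum holds pairs (lambda_i, p_i); A is diagonalizable exactly when
// every geometric multiplicity equals the given algebraic multiplicity.
bool isDiagonalizable(
    const Matrix& A,
    const std::vector<std::pair<double, std::size_t>>& spectrum) {
    for (const auto& entry : spectrum)
        if (geometricMultiplicity(A, entry.first) != entry.second)
            return false;
    return true;
}
```

For the matrix $\begin{bmatrix}1&2&0\\0&2&0\\-1&-2&1\end{bmatrix}$, whose characteristic polynomial is $(\lambda-1)^2(\lambda-2)$, the eigenvalue 1 has $p=2$ but $\operatorname{rank}(\boldsymbol{E}-\boldsymbol{A})=2$, so $n-\operatorname{rank}=1 \neq 2$ and the test reports that the matrix is not diagonalizable.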

II. The commuting case $\boldsymbol{A B}=\boldsymbol{B A}$

In general, for $\boldsymbol{A}, \boldsymbol{B} \in C^{n \times n}$, the equality

$$
\boldsymbol{A} \boldsymbol{B}=\boldsymbol{B} \boldsymbol{A}
$$

need not hold. If $\boldsymbol{A B}=\boldsymbol{B A}$, then $\boldsymbol{A}$ and $\boldsymbol{B}$ are said to commute (under multiplication).

Theorem 1.10.5 If $\boldsymbol{A}$ and $\boldsymbol{B}$ commute, then every eigenspace of $\boldsymbol{A}$ is an invariant subspace of $\boldsymbol{B}$.

Note: Theorem 1.10.5 is a restatement of Theorem 1.9.2. By symmetry, every eigenspace of $\boldsymbol{B}$ is also an invariant subspace of $\boldsymbol{A}$.

Theorem 1.10.6 If $\boldsymbol{A}$ and $\boldsymbol{B}$ commute, then every eigenspace of $\boldsymbol{A}$ contains an eigenvector of $\boldsymbol{B}$.

Proof. Let $V_{\lambda_0}$ be the eigenspace of $\boldsymbol{A}$ belonging to the eigenvalue $\lambda_0$, and let $\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_s$ be a basis of $V_{\lambda_0}$. By Theorem 1.10.5, $V_{\lambda_0}$ is an invariant subspace of $\boldsymbol{B}$, so

$$
\boldsymbol{B} \boldsymbol{\alpha}_i=c_{1 i} \boldsymbol{\alpha}_1+c_{2 i} \boldsymbol{\alpha}_2+\cdots+c_{s i} \boldsymbol{\alpha}_s \quad(i=1,2, \cdots, s) \tag{1.10.5}
$$

Set

$$
\boldsymbol{M}=\left[\begin{array}{cccc}
c_{11} & c_{12} & \cdots & c_{1 s} \\
c_{21} & c_{22} & \cdots & c_{2 s} \\
\vdots & \vdots & & \vdots \\
c_{s 1} & c_{s 2} & \cdots & c_{s s}
\end{array}\right]
$$

Let $\boldsymbol{X} \in V_{\lambda_0}$; then

$$
\boldsymbol{X}=l_1 \boldsymbol{\alpha}_1+l_2 \boldsymbol{\alpha}_2+\cdots+l_s \boldsymbol{\alpha}_s
$$

For $\boldsymbol{X}$ to be an eigenvector of $\boldsymbol{B}$ lying in $V_{\lambda_0}$, it suffices that $\boldsymbol{X} \neq \boldsymbol{0}$ and $\boldsymbol{B} \boldsymbol{X}=\mu \boldsymbol{X}$ for some scalar $\mu$. Using (1.10.5),

$$
\begin{aligned}
\boldsymbol{B} \boldsymbol{X} &= l_1 \boldsymbol{B} \boldsymbol{\alpha}_1+l_2 \boldsymbol{B} \boldsymbol{\alpha}_2+\cdots+l_s \boldsymbol{B} \boldsymbol{\alpha}_s \\
&= \left(l_1 c_{11}+l_2 c_{12}+\cdots+l_s c_{1 s}\right) \boldsymbol{\alpha}_1+\left(l_1 c_{21}+l_2 c_{22}+\cdots+l_s c_{2 s}\right) \boldsymbol{\alpha}_2+\cdots \\
&\quad +\left(l_1 c_{s 1}+l_2 c_{s 2}+\cdots+l_s c_{s s}\right) \boldsymbol{\alpha}_s \\
\mu \boldsymbol{X} &= \mu l_1 \boldsymbol{\alpha}_1+\mu l_2 \boldsymbol{\alpha}_2+\cdots+\mu l_s \boldsymbol{\alpha}_s
\end{aligned}
$$

Substituting these expressions for $\boldsymbol{B X}$ and $\mu \boldsymbol{X}$ into

$$
\boldsymbol{B} \boldsymbol{X}=\mu \boldsymbol{X} \tag{1.10.8}
$$

and using the linear independence of $\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_s$, we find that $l_1, l_2, \cdots, l_s$ satisfy the system

$$
\left\{\begin{array}{l}
l_1\left(c_{11}-\mu\right)+l_2 c_{12}+\cdots+l_s c_{1 s}=0 \\
l_1 c_{21}+l_2\left(c_{22}-\mu\right)+\cdots+l_s c_{2 s}=0 \\
\qquad\vdots \\
l_1 c_{s 1}+l_2 c_{s 2}+\cdots+l_s\left(c_{s s}-\mu\right)=0
\end{array}\right.
$$

In other words, $\left(l_1, l_2, \cdots, l_s\right)^{\mathrm{T}}$ is an eigenvector of the $s \times s$ matrix $\boldsymbol{M}$, and over the complex numbers such an eigenvector always exists. Hence there is at least one set of numbers $l_1, l_2, \cdots, l_s$, not all zero, for which $\boldsymbol{X}=l_1 \boldsymbol{\alpha}_1+l_2 \boldsymbol{\alpha}_2+\cdots+l_s \boldsymbol{\alpha}_s \in V_{\lambda_0}$ satisfies (1.10.8); that is, $\boldsymbol{X}$ is an eigenvector of $\boldsymbol{B}$.

Corollary 1.10.1 If $\boldsymbol{A}$ and $\boldsymbol{B}$ commute, then $\boldsymbol{A}$ and $\boldsymbol{B}$ have a common eigenvector.
Corollary 1.10.2 If $\boldsymbol{A}$ and $\boldsymbol{B}$ commute and $\lambda_1, \lambda_2, \cdots, \lambda_k$ are $k$ distinct eigenvalues of $\boldsymbol{A}$, then $\boldsymbol{A}$ and $\boldsymbol{B}$ have at least $k$ linearly independent common eigenvectors.
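A small worked illustration of Corollary 1.10.1 may help; the matrices below are chosen here for the example and do not appear in the original text. Take

$$
\boldsymbol{A}=\begin{bmatrix}2 & 1 \\ 0 & 2\end{bmatrix}, \quad
\boldsymbol{B}=\begin{bmatrix}3 & 5 \\ 0 & 3\end{bmatrix}, \quad
\boldsymbol{A}\boldsymbol{B}=\boldsymbol{B}\boldsymbol{A}=\begin{bmatrix}6 & 13 \\ 0 & 6\end{bmatrix}.
$$

The only eigenspace of $\boldsymbol{A}$ is $V_2=\operatorname{span}\{\boldsymbol{e}_1\}$ with $\boldsymbol{e}_1=(1,0)^{\mathrm{T}}$. It is invariant under $\boldsymbol{B}$ because $\boldsymbol{B}\boldsymbol{e}_1=3\boldsymbol{e}_1$, so with the basis $\boldsymbol{\alpha}_1=\boldsymbol{e}_1$ the matrix $\boldsymbol{M}$ from the proof of Theorem 1.10.6 is the $1 \times 1$ matrix $(3)$, and $\boldsymbol{e}_1$ is a common eigenvector of $\boldsymbol{A}$ and $\boldsymbol{B}$. Neither matrix is diagonalizable in this example, which illustrates that the corollary relies on commutativity alone.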

III. Simultaneous diagonalization

Lemma 1.10.1 Let $\boldsymbol{A} \in C^{n \times n}$, $\boldsymbol{B} \in C^{m \times m}$, and $\boldsymbol{D}=\left[\begin{array}{ll}\boldsymbol{A} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{B}\end{array}\right]$. Then $\boldsymbol{D}$ is diagonalizable if and only if both $\boldsymbol{A}$ and $\boldsymbol{B}$ are diagonalizable.

Proof. Sufficiency: if $\boldsymbol{A}$ and $\boldsymbol{B}$ are both diagonalizable, there exist full-rank matrices $\boldsymbol{S}_1 \in C_n^{n \times n}$ and $\boldsymbol{S}_2 \in C_m^{m \times m}$ such that

$$
\boldsymbol{S}_1^{-1} \boldsymbol{A} \boldsymbol{S}_1=\boldsymbol{\Lambda}_1, \qquad \boldsymbol{S}_2^{-1} \boldsymbol{B} \boldsymbol{S}_2=\boldsymbol{\Lambda}_2
$$

with $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ diagonal. Set

$$
\boldsymbol{S}=\left[\begin{array}{cc} \boldsymbol{S}_1 & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{S}_2 \end{array}\right]
$$

Then

$$
\begin{aligned}
\boldsymbol{S}^{-1} \boldsymbol{D} \boldsymbol{S} &= \left[\begin{array}{cc} \boldsymbol{S}_1^{-1} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{S}_2^{-1} \end{array}\right]\left[\begin{array}{cc} \boldsymbol{A} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{B} \end{array}\right]\left[\begin{array}{cc} \boldsymbol{S}_1 & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{S}_2 \end{array}\right] \\
&= \left[\begin{array}{cc} \boldsymbol{S}_1^{-1} \boldsymbol{A} \boldsymbol{S}_1 & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{S}_2^{-1} \boldsymbol{B} \boldsymbol{S}_2 \end{array}\right]=\left[\begin{array}{cc} \boldsymbol{\Lambda}_1 & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{\Lambda}_2 \end{array}\right]=\boldsymbol{\Lambda}
\end{aligned}
$$

which is diagonal.

Necessity: if $\boldsymbol{D}$ is diagonalizable, there exists a full-rank matrix $\boldsymbol{S} \in C_{n+m}^{(n+m) \times(n+m)}$ such that

$$
\boldsymbol{S}^{-1} \boldsymbol{D} \boldsymbol{S}=\boldsymbol{\Lambda}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n, \lambda_{n+1}, \cdots, \lambda_{n+m}\right)
$$

Write $\boldsymbol{S}=\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_n, \boldsymbol{\alpha}_{n+1}, \cdots, \boldsymbol{\alpha}_{n+m}\right)$, where

$$
\boldsymbol{\alpha}_i=\left[\begin{array}{c} \boldsymbol{\xi}_i \\ \boldsymbol{\eta}_i \end{array}\right] \in C^{n+m}, \quad \boldsymbol{\xi}_i \in C^n, \quad \boldsymbol{\eta}_i \in C^m \quad(i=1,2, \cdots, n+m)
$$

Since $\boldsymbol{D} \boldsymbol{S}=\boldsymbol{S} \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_{n+m}\right)$, we have

$$
\begin{aligned}
\boldsymbol{D}\left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_{n+m}\right)
&= \left(\boldsymbol{\alpha}_1, \boldsymbol{\alpha}_2, \cdots, \boldsymbol{\alpha}_{n+m}\right) \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_{n+m}\right) \\
&= \left(\lambda_1 \boldsymbol{\alpha}_1, \lambda_2 \boldsymbol{\alpha}_2, \cdots, \lambda_{n+m} \boldsymbol{\alpha}_{n+m}\right)
\end{aligned}
$$

Comparing columns on both sides gives

$$
\boldsymbol{D} \boldsymbol{\alpha}_i=\lambda_i \boldsymbol{\alpha}_i \quad(i=1,2, \cdots, n+m)
$$

that is,

$$
\left[\begin{array}{cc} \boldsymbol{A} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{B} \end{array}\right]\left[\begin{array}{c} \boldsymbol{\xi}_i \\ \boldsymbol{\eta}_i \end{array}\right]=\lambda_i\left[\begin{array}{c} \boldsymbol{\xi}_i \\ \boldsymbol{\eta}_i \end{array}\right] \quad(i=1,2, \cdots, n+m)
$$

Comparing the blocks on both sides gives

$$
\boldsymbol{A} \boldsymbol{\xi}_i=\lambda_i \boldsymbol{\xi}_i, \quad \boldsymbol{B} \boldsymbol{\eta}_i=\lambda_i \boldsymbol{\eta}_i \quad(i=1,2, \cdots, n+m)
$$

So every nonzero $\boldsymbol{\xi}_i$ is an eigenvector of $\boldsymbol{A}$ and every nonzero $\boldsymbol{\eta}_i$ is an eigenvector of $\boldsymbol{B}$. We now show that a maximal linearly independent subset of the $n+m$ vectors $\boldsymbol{\xi}_i$ has exactly $n$ elements, and a maximal linearly independent subset of the $n+m$ vectors $\boldsymbol{\eta}_i$ has exactly $m$ elements.

Because

$$
\boldsymbol{S}=\left[\begin{array}{cccccc}
\boldsymbol{\xi}_1 & \boldsymbol{\xi}_2 & \cdots & \boldsymbol{\xi}_n & \cdots & \boldsymbol{\xi}_{n+m} \\
\boldsymbol{\eta}_1 & \boldsymbol{\eta}_2 & \cdots & \boldsymbol{\eta}_n & \cdots & \boldsymbol{\eta}_{n+m}
\end{array}\right] \in C_{n+m}^{(n+m) \times(n+m)}
$$

the $n+m$ rows of $\boldsymbol{S}$ are linearly independent. Consequently the $n$ rows of the matrix $\left(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \cdots, \boldsymbol{\xi}_{n+m}\right) \in C^{n \times(n+m)}$ are linearly independent, and so are the $m$ rows of $\left(\boldsymbol{\eta}_1, \boldsymbol{\eta}_2, \cdots, \boldsymbol{\eta}_{n+m}\right) \in C^{m \times(n+m)}$. Therefore

$$
\begin{aligned}
& \operatorname{rank}\left(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \cdots, \boldsymbol{\xi}_n, \cdots, \boldsymbol{\xi}_{n+m}\right)=n \\
& \operatorname{rank}\left(\boldsymbol{\eta}_1, \boldsymbol{\eta}_2, \cdots, \boldsymbol{\eta}_n, \cdots, \boldsymbol{\eta}_{n+m}\right)=m
\end{aligned}
$$

That is, exactly $n$ of the vectors $\boldsymbol{\xi}_i$ can be chosen linearly independent, and exactly $m$ of the vectors $\boldsymbol{\eta}_i$. These give $n$ linearly independent eigenvectors of $\boldsymbol{A}$ and $m$ linearly independent eigenvectors of $\boldsymbol{B}$, so by Theorem 1.10.2 both $\boldsymbol{A}$ and $\boldsymbol{B}$ are diagonalizable.

Theorem 1.10.7 Let $\boldsymbol{A}, \boldsymbol{B} \in C^{n \times n}$ both be diagonalizable. Then $\boldsymbol{A}$ and $\boldsymbol{B}$ are simultaneously diagonalizable if and only if $\boldsymbol{A B}=\boldsymbol{B A}$.

Proof. Necessity: suppose there exists $\boldsymbol{P} \in C_n^{n \times n}$ such that

$$
\begin{aligned}
& \boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right) \\
& \boldsymbol{P}^{-1} \boldsymbol{B} \boldsymbol{P}=\operatorname{diag}\left(\mu_1, \mu_2, \cdots, \mu_n\right)
\end{aligned}
$$

Then, since diagonal matrices commute,

$$
\begin{aligned}
\left(\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}\right)\left(\boldsymbol{P}^{-1} \boldsymbol{B} \boldsymbol{P}\right) &= \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right) \operatorname{diag}\left(\mu_1, \mu_2, \cdots, \mu_n\right) \\
&= \operatorname{diag}\left(\mu_1, \mu_2, \cdots, \mu_n\right) \operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right) \\
&= \left(\boldsymbol{P}^{-1} \boldsymbol{B} \boldsymbol{P}\right)\left(\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}\right)
\end{aligned}
$$

that is, $\boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{B} \boldsymbol{P}=\boldsymbol{P}^{-1} \boldsymbol{B} \boldsymbol{A} \boldsymbol{P}$, and hence

$$
\boldsymbol{A} \boldsymbol{B}=\boldsymbol{B} \boldsymbol{A}
$$

Sufficiency: the argument has two steps. First assume that $\boldsymbol{A}$ is a diagonal matrix of the form

$$
\boldsymbol{A}=\left[\begin{array}{llll}
\lambda_1 \boldsymbol{E}_1 & & & \\
& \lambda_2 \boldsymbol{E}_2 & & \\
& & \ddots & \\
& & & \lambda_h \boldsymbol{E}_h
\end{array}\right]
$$

where $\lambda_1, \lambda_2, \cdots, \lambda_h$ are the distinct eigenvalues of $\boldsymbol{A}$ and $\boldsymbol{E}_i$ is the identity matrix whose order equals the multiplicity of $\lambda_i$. Partition $\boldsymbol{B}$ conformally with $\boldsymbol{A}$, so that the block products are defined:

$$
\boldsymbol{B}=\left[\begin{array}{cccc}
\boldsymbol{B}_{11} & \boldsymbol{B}_{12} & \cdots & \boldsymbol{B}_{1 h} \\
\boldsymbol{B}_{21} & \boldsymbol{B}_{22} & \cdots & \boldsymbol{B}_{2 h} \\
\vdots & \vdots & & \vdots \\
\boldsymbol{B}_{h 1} & \boldsymbol{B}_{h 2} & \cdots & \boldsymbol{B}_{h h}
\end{array}\right]
$$

where $\boldsymbol{B}_{i j}$ has as many rows as the order of $\boldsymbol{E}_i$ and as many columns as the order of $\boldsymbol{E}_j$. Comparing the $(i, j)$ blocks of $\boldsymbol{A B}=\boldsymbol{B A}$ gives $\lambda_i \boldsymbol{B}_{i j}=\lambda_j \boldsymbol{B}_{i j}$, so $\boldsymbol{B}_{i j}=\boldsymbol{0}$ for $i \neq j$, that is,

$$
\boldsymbol{B}=\left[\begin{array}{llll}
\boldsymbol{B}_{11} & & & \\
& \boldsymbol{B}_{22} & & \\
& & \ddots & \\
& & & \boldsymbol{B}_{h h}
\end{array}\right]
$$

where each $\boldsymbol{B}_{i i}$ is square. Since $\boldsymbol{B}$ is diagonalizable, repeated application of Lemma 1.10.1 shows that $\boldsymbol{B}_{11}, \boldsymbol{B}_{22}, \cdots, \boldsymbol{B}_{h h}$ are all diagonalizable; that is, there exist full-rank square matrices $\boldsymbol{T}_i$ such that $\boldsymbol{T}_i^{-1} \boldsymbol{B}_{i i} \boldsymbol{T}_i$ is diagonal $(i=1,2, \cdots, h)$. Set

$$
\boldsymbol{T}=\left[\begin{array}{llll}
\boldsymbol{T}_1 & & & \\
& \boldsymbol{T}_2 & & \\
& & \ddots & \\
& & & \boldsymbol{T}_h
\end{array}\right]
$$

Then $\boldsymbol{T}^{-1} \boldsymbol{A} \boldsymbol{T}=\boldsymbol{A}$ (because $\boldsymbol{T}_i^{-1}\left(\lambda_i \boldsymbol{E}_i\right) \boldsymbol{T}_i=\lambda_i \boldsymbol{E}_i$) and $\boldsymbol{T}^{-1} \boldsymbol{B} \boldsymbol{T}$ are both diagonal.
Now let $\boldsymbol{A}$ be an arbitrary diagonalizable matrix. Then there exists $\boldsymbol{S} \in C_n^{n \times n}$, which may be chosen so that equal eigenvalues are adjacent on the diagonal, such that

$$
\boldsymbol{S}^{-1} \boldsymbol{A} \boldsymbol{S}=\left[\begin{array}{llll}
\lambda_1 & & & \\
& \lambda_2 & & \\
& & \ddots & \\
& & & \lambda_n
\end{array}\right]=\widetilde{\boldsymbol{A}}
$$

The matrix $\boldsymbol{S}^{-1} \boldsymbol{B} \boldsymbol{S}=\widetilde{\boldsymbol{B}}$ is also diagonalizable, being similar to $\boldsymbol{B}$. From $\boldsymbol{A B}=\boldsymbol{B A}$ it follows that $\widetilde{\boldsymbol{A}} \widetilde{\boldsymbol{B}}=\widetilde{\boldsymbol{B}} \widetilde{\boldsymbol{A}}$. By the first step, $\widetilde{\boldsymbol{A}}$ and $\widetilde{\boldsymbol{B}}$ can be diagonalized simultaneously: there exists $\boldsymbol{T} \in C_n^{n \times n}$ such that

$$
\boldsymbol{T}^{-1} \widetilde{\boldsymbol{A}} \boldsymbol{T}=\boldsymbol{\Lambda}_1 \quad \text{and} \quad \boldsymbol{T}^{-1} \widetilde{\boldsymbol{B}} \boldsymbol{T}=\boldsymbol{\Lambda}_2
$$

Hence

$$
(\boldsymbol{S} \boldsymbol{T})^{-1} \boldsymbol{A}(\boldsymbol{S} \boldsymbol{T})=\boldsymbol{\Lambda}_1 \quad \text{and} \quad(\boldsymbol{S} \boldsymbol{T})^{-1} \boldsymbol{B}(\boldsymbol{S} \boldsymbol{T})=\boldsymbol{\Lambda}_2
$$

so $\boldsymbol{A}$ and $\boldsymbol{B}$ are simultaneously diagonalizable.
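A small worked example of Theorem 1.10.7 follows; the matrices are chosen here for illustration and are not from the original text. Take

$$
\boldsymbol{A}=\begin{bmatrix}2 & 1 \\ 1 & 2\end{bmatrix}, \quad
\boldsymbol{B}=\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}, \quad
\boldsymbol{A}\boldsymbol{B}=\boldsymbol{B}\boldsymbol{A}=\begin{bmatrix}1 & 2 \\ 2 & 1\end{bmatrix}.
$$

With $\boldsymbol{P}=\begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}$ and $\boldsymbol{P}^{-1}=\frac{1}{2}\begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}$ one finds

$$
\boldsymbol{P}^{-1}\boldsymbol{A}\boldsymbol{P}=\operatorname{diag}(3,\,1), \qquad
\boldsymbol{P}^{-1}\boldsymbol{B}\boldsymbol{P}=\operatorname{diag}(1,\,-1),
$$

so the same $\boldsymbol{P}$ diagonalizes both matrices. By contrast, $\boldsymbol{C}=\operatorname{diag}(1,2)$ is diagonalizable but

$$
\boldsymbol{A}\boldsymbol{C}=\begin{bmatrix}2 & 2 \\ 1 & 4\end{bmatrix} \neq \begin{bmatrix}2 & 1 \\ 2 & 4\end{bmatrix}=\boldsymbol{C}\boldsymbol{A},
$$

so by the theorem no single matrix diagonalizes $\boldsymbol{A}$ and $\boldsymbol{C}$ at the same time.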

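Implementing Similarity Diagonalization (MATLAB and C++)

MATLAB implementation

1. Basic diagonalization

% Input matrix (its eigenvalues 1, 2, 3 are distinct, so A is diagonalizable)
A = [1 2 0; 0 2 0; -1 -2 3];

% Compute eigenvalues and eigenvectors, so that A*P = P*D
[P, D] = eig(A);

% Verify the diagonalization
disp('Original matrix A:');
disp(A);
disp('Eigenvector matrix P:');
disp(P);
disp('Diagonal matrix D:');
disp(D);
disp('Check P*D*inv(P):');
disp(P*D/P);  % should equal A up to rounding error

% Check the condition number of P to judge numerical stability
cond_P = cond(P);
disp(['Condition number of P: ', num2str(cond_P)]);
if cond_P > 1e10
    warning('P is nearly singular; the diagonalization may be numerically unstable');
end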

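2. Handling non-diagonalizable matrices

A = [2 1; 0 2];  % a Jordan block, which is not diagonalizable

[P, D] = eig(A);
% Heuristic: a numerically rank-deficient P signals a defective matrix
if rank(P) < size(A,1)
    disp('Matrix is not diagonalizable; trying the Jordan decomposition');
    [V, J] = jordan(A);  % requires the Symbolic Math Toolbox
    disp('Jordan canonical form J:');
    disp(J);
    disp('Transformation matrix V:');
    disp(V);
end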

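3. Diagonalizing a symmetric matrix (orthogonal diagonalization)

A = [1 2; 2 1];  % symmetric matrix

% A real symmetric matrix is always diagonalizable, and P can be taken orthogonal
[P, D] = eig(A);

% Verify orthogonality
disp('P''*P:');
disp(P'*P);  % should be close to the identity matrix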

C++ implementation (using the Eigen library)

1. Basic diagonalization

#include <complex>
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

void matrixDiagonalization(const MatrixXd& A) {
    // Compute the eigendecomposition, so that A * P = P * D
    EigenSolver<MatrixXd> es(A);
    MatrixXcd P = es.eigenvectors();
    MatrixXcd D = es.eigenvalues().asDiagonal();
    
    std::cout << "Original matrix A:\n" << A << "\n\n";
    std::cout << "Eigenvector matrix P:\n" << P << "\n\n";
    std::cout << "Diagonal matrix D:\n" << D << "\n\n";
    
    // Verify the diagonalization
    MatrixXcd reconstruction = P * D * P.inverse();
    std::cout << "Verification P*D*P^(-1):\n" << reconstruction << "\n";
    std::cout << "Reconstruction error norm: " 
              << (reconstruction - A.cast<std::complex<double>>()).norm() 
              << "\n";
    
    // Heuristic check: a numerically rank-deficient P indicates a defective matrix
    FullPivLU<MatrixXcd> lu(P);
    if(lu.rank() < A.rows()) {
        std::cout << "Matrix is not diagonalizable (defective)\n";
    }
}

int main() {
    Matrix3d A;
    // Eigenvalues 1, 2, 3 are distinct, so this matrix is diagonalizable
    A << 1, 2, 0,
         0, 2, 0,
         -1, -2, 3;
         
    matrixDiagonalization(A);
    return 0;
}

2. Diagonalizing a real symmetric matrix

#include <iostream>
#include <Eigen/Eigenvalues>

using namespace Eigen;

void symmetricDiagonalization(const MatrixXd& A) {
    // Make sure the input is symmetric
    if(!A.isApprox(A.transpose())) {
        std::cerr << "Matrix is not symmetric!\n";
        return;
    }
    
    // SelfAdjointEigenSolver is more efficient for symmetric input
    SelfAdjointEigenSolver<MatrixXd> es(A);
    MatrixXd P = es.eigenvectors();
    MatrixXd D = es.eigenvalues().asDiagonal();
    
    std::cout << "Original matrix A:\n" << A << "\n\n";
    std::cout << "Orthogonal matrix P:\n" << P << "\n\n";
    std::cout << "Diagonal matrix D:\n" << D << "\n\n";
    
    // Verify orthogonality
    std::cout << "P' * P:\n" << P.transpose() * P << "\n";
    
    // Verify the diagonalization
    MatrixXd reconstruction = P * D * P.transpose();
    std::cout << "Verification P*D*P':\n" << reconstruction << "\n";
    std::cout << "Reconstruction error norm: " 
              << (reconstruction - A).norm() << "\n";
}

int main() {
    Matrix2d A;
    A << 1, 2,
         2, 1;
         
    symmetricDiagonalization(A);
    return 0;
}

3. Handling non-diagonalizable matrices

#include <iostream>
#include <Eigen/Eigenvalues>

using namespace Eigen;

void checkDiagonalizability(const MatrixXd& A) {
    EigenSolver<MatrixXd> es(A);
    MatrixXcd P = es.eigenvectors();
    
    FullPivLU<MatrixXcd> lu(P);
    if(lu.rank() < A.rows()) {
        std::cout << "Matrix is not diagonalizable.\n";
        std::cout << "Attempting real Schur decomposition instead.\n";
        
        RealSchur<MatrixXd> schur(A);
        MatrixXd T = schur.matrixT();
        MatrixXd U = schur.matrixU();
        
        std::cout << "Quasi-triangular matrix T:\n" << T << "\n";
        std::cout << "Orthogonal matrix U:\n" << U << "\n";
    } else {
        std::cout << "Matrix is diagonalizable.\n";
    }
}

int main() {
    Matrix2d A;
    A << 2, 1,
         0, 2;  // a Jordan block, which is not diagonalizable
         
    checkDiagonalizability(A);
    return 0;
}

Applications

MATLAB application: computing matrix powers

A = [1 2; 3 4];
[P, D] = eig(A);

% Compute A^5 = P * D^5 * P^(-1)
A_pow_5 = P * (D^5) / P;
disp('A^5:');
disp(A_pow_5);

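C++ application: the matrix exponential

#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

int main() {
    MatrixXd A(2, 2);
    A << 1, 2,
         3, 4;

    EigenSolver<MatrixXd> es(A);
    MatrixXcd P = es.eigenvectors();

    // exp(A) = P * exp(D) * P^(-1). Only the eigenvalues are exponentiated;
    // applying exp() to the full matrix D would turn its zeros into ones.
    MatrixXcd expD = es.eigenvalues().array().exp().matrix().asDiagonal();
    MatrixXcd expA = P * expD * P.inverse();

    std::cout << "Matrix exponential exp(A):\n" << expA << "\n";
    return 0;
}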
