1. Background for principal component analysis
1.1 Sample mean
Given a data set \(D=\{x_1, x_2, \ldots, x_n\}\) in which each sample \(x_i\) is a \(d\)-dimensional vector, the sample mean is
\[\overline{x}=\frac{x_1+x_2+\cdots+x_n}{n} \tag{1}
\]
Example 1. Given the data matrix
\[D_{3\times2}=
\begin{bmatrix}
4 & 2\\
-1 & 2\\
3 & 2
\end{bmatrix}
\]
compute the sample mean. Each row of \(D\) is one sample, so
\[x_1 = (4, 2)^T\\
x_2 = (-1, 2)^T\\
x_3 = (3, 2)^T
\]
\[\overline{x}=\frac{x_1+x_2+x_3}{3}=(2, 2)^T
\]
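The arithmetic in Example 1 can be reproduced with a few lines of plain Python. This is only a minimal sketch: the helper name `sample_mean` is an illustrative choice, and the rows of `D` are assumed to be the samples, as in the example.

```python
# Rows of D are the samples x_1, x_2, x_3 from Example 1.
D = [
    [4, 2],
    [-1, 2],
    [3, 2],
]

def sample_mean(samples):
    """Return the component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    # zip(*samples) groups the j-th component of every sample together.
    return [sum(column) / n for column in zip(*samples)]

x_bar = sample_mean(D)
print(x_bar)  # [2.0, 2.0]
```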
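1.2 Vector projection
1.2.1 Projection in two dimensions
Find the projection of the vector \(\vec{a}\) onto the vector \(\vec{b}\), that is, the length of the red line segment. Let \(\theta\) be the angle between \(\vec{a}\) and \(\vec{b}\), and let \(\vec{e}=\vec{b}/\lVert{\vec{b}}\rVert\) be the unit vector along \(\vec{b}\). Then
\[\lVert{\vec{a}}\rVert{\cos{\theta}}=\lVert{\vec{a}}\rVert{\frac{\vec{b}^T\vec{a}}{\lVert{\vec{a}}\rVert\lVert{\vec{b}}\rVert}}\\
=\vec{e}^T\vec{a} \tag{2}
\]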
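To make formula (2) concrete, the sketch below evaluates both sides for one pair of vectors. The vectors `a` and `b` are hypothetical values chosen for illustration (they do not come from the text), and the helper names `dot`, `norm` and `scalar_projection` are likewise assumptions.

```python
import math

def dot(u, v):
    """Inner product u^T v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Euclidean length ||u||."""
    return math.sqrt(dot(u, u))

def scalar_projection(a, b):
    """Signed length of the projection of a onto b: e^T a with e = b / ||b||."""
    length_b = norm(b)
    e = [bi / length_b for bi in b]
    return dot(e, a)

# Hypothetical example vectors (not from the text).
a = [3.0, 4.0]
b = [2.0, 0.0]

# Left-hand side of (2): ||a|| cos(theta), with theta the angle from b to a.
theta = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])
lhs = norm(a) * math.cos(theta)

# Right-hand side of (2): e^T a.
rhs = scalar_projection(a, b)
print(lhs, rhs)  # both are 3 up to floating-point rounding
```

Since \(\vec{b}\) lies along the first axis here, the projection is simply the first coordinate of \(\vec{a}\), which makes the result easy to check by eye.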
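1.2.2 Projection in three dimensions
Take \(\vec{x}=(1,0,2)^T\) and the two unit vectors \(\vec{e_1}=(\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}},0)^T\) and \(\vec{e_2}=(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}},0)^T\). Projecting \(\vec{x}\) onto each of them gives
\[\vec{e_1}^T\vec{x}=\left(\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}},0\right)\begin{pmatrix}1\\0\\2\end{pmatrix}=\frac{1}{\sqrt{2}}\\
\vec{e_2}^T\vec{x}=\left(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}},0\right)\begin{pmatrix}1\\0\\2\end{pmatrix}=\frac{1}{\sqrt{2}}
\]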
The coordinates of the projected vector are therefore \((\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})^T\). In matrix form:
\[\begin{bmatrix}
\vec{e_1}^T\\
\vec{e_2}^T
\end{bmatrix}\vec{x}
=
\begin{bmatrix}
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0\\
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0
\end{bmatrix}
\begin{bmatrix}
1\\
0\\
2
\end{bmatrix}
=
\begin{bmatrix}
\frac{1}{\sqrt{2}}\\
\frac{1}{\sqrt{2}}
\end{bmatrix}
\]
This is a linear transformation that maps a three-dimensional vector to a two-dimensional one.
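The matrix form of the projection can be checked numerically. The following is a minimal sketch in plain Python; `matvec` is a hypothetical helper written for this example, and the matrix rows are the vectors \(\vec{e_1}^T\) and \(\vec{e_2}^T\) used above.

```python
import math

s = 1 / math.sqrt(2)

# Rows are e_1^T and e_2^T from the example above.
E = [
    [s, -s, 0.0],
    [s, s, 0.0],
]
x = [1.0, 0.0, 2.0]

def matvec(M, v):
    """Multiply matrix M (given as a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

y = matvec(E, x)
print(y)  # two entries, each equal to 1/sqrt(2), about 0.7071
```

The third coordinate of `x` has no effect on the result, because both projection directions have a zero third component.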
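1.3 Matrix differentiation
Let \(f\) be a function defined on a vector space, \(f:R^d\rightarrow{R}\). The derivative of \(f\) with respect to the vector \(\vec{x}\) is the column vector of its partial derivatives:
\[\frac{\partial f}{\partial \vec{x}}=
\begin{bmatrix}
\frac{\partial f}{\partial x_1}\\
\frac{\partial f}{\partial x_2}\\
\vdots\\
\frac{\partial f}{\partial x_d}
\end{bmatrix} \tag{3}
\]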
Example 2. Let \(\vec{w}=(w_1,w_2,w_3)^T\) and \(g(\vec{w})=2w_1+5w_2+12w_3=(2,5,12)\vec{w}\). Then
\[\frac{\partial g}{\partial \vec{w}}=
\begin{bmatrix}
\frac{\partial g}{\partial w_1}\\
\frac{\partial g}{\partial w_2}\\
\frac{\partial g}{\partial w_3}
\end{bmatrix}
=
\begin{bmatrix}
2\\
5\\
12
\end{bmatrix}
\]
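Definition (3) can be sanity-checked with finite differences. The sketch below approximates each partial derivative of the function from Example 2 by a central difference; `numerical_gradient`, the step size `h` and the evaluation point `w` are assumptions made for illustration, not part of the text.

```python
def g(w):
    """g(w) = 2*w1 + 5*w2 + 12*w3, as in Example 2."""
    return 2 * w[0] + 5 * w[1] + 12 * w[2]

def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h   # step forward along coordinate i only
        x_minus[i] -= h  # step backward along coordinate i only
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

w = [0.5, -1.0, 2.0]  # arbitrary evaluation point (an assumption)
grad_g = numerical_gradient(g, w)
print(grad_g)  # approximately [2, 5, 12]
```

Because \(g\) is linear, the gradient is the same at every point, so any other choice of `w` should give the same values up to rounding.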
Example 3. Differentiate the following function:
\[f(\vec{e})=e_1^2+e_2^2+\cdots+e_d^2=\vec{e}^T\vec{e} \tag{4}
\]
Solution:
\[\frac{\partial \vec{e}^T\vec{e}}{\partial \vec{e}}
=
\begin{bmatrix}
\frac{\partial \vec{e}^T\vec{e}}{\partial e_1}\\
\frac{\partial \vec{e}^T\vec{e}}{\partial e_2}\\
\vdots\\
\frac{\partial \vec{e}^T\vec{e}}{\partial e_d}
\end{bmatrix}
=
2\begin{bmatrix}
e_1\\
e_2\\
\vdots\\
e_d
\end{bmatrix}
=2\vec{e}
\]
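As a brief side note on where the factor of 2 comes from: only the \(e_k^2\) term of (4) depends on \(e_k\), so for each \(k=1,\ldots,d\)
\[\frac{\partial}{\partial e_k}\left(e_1^2+e_2^2+\cdots+e_d^2\right)=2e_k
\]
and stacking these \(d\) components as in (3) gives \(2\vec{e}\).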
Example 4. Given
\[A=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1d}\\
a_{21} & a_{22} & \cdots & a_{2d}\\
\vdots & \vdots & \ddots & \vdots\\
a_{d1} & a_{d2} & \cdots & a_{dd}
\end{bmatrix}
\]
find \(\frac{\partial \vec{e}^TA\vec{e}}{\partial \vec{e}}\).
Solution: start with the \(2\times2\) case,
\[A=
\begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{bmatrix}
\]
for which
\[\vec{e}^TA\vec{e}=
\begin{bmatrix}
e_1 & e_2
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{bmatrix}
\begin{bmatrix}
e_1 \\ e_2
\end{bmatrix}\\
=
\begin{bmatrix}
e_1a_{11}+e_2a_{21} & e_1a_{12}+e_2a_{22}
\end{bmatrix}
\begin{bmatrix}
e_1 \\ e_2
\end{bmatrix}\\
=
e_1^2a_{11}+e_2e_1a_{21}+e_1e_2a_{12}+e_2^2a_{22}
\]
Differentiating with respect to \(e_1\) and \(e_2\) gives
\[\frac{\partial \vec{e}^TA\vec{e}}{\partial \vec{e}}=
\begin{bmatrix}
2a_{11}e_1 + (a_{12}+a_{21})e_2\\
(a_{21}+a_{12})e_1 + 2a_{22}e_2
\end{bmatrix}\\
=
(A+A^T)
\begin{bmatrix}
e_1 \\ e_2
\end{bmatrix}
\]
The same pattern holds for a general \(d\times{d}\) matrix \(A\):
\[\frac{\partial \vec{e}^TA\vec{e}}{\partial \vec{e}}=(A+A^T)\vec{e} \tag{5}
\]
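The \(2\times2\) computation suggests (5) but does not by itself establish it for larger matrices. A short index-notation sketch, using only definition (3), covers the general case. Writing the quadratic form as a double sum,
\[\vec{e}^TA\vec{e}=\sum_{i=1}^{d}\sum_{j=1}^{d}a_{ij}e_ie_j
\]
the variable \(e_k\) appears in the terms with \(i=k\) and in the terms with \(j=k\), so the product rule gives
\[\frac{\partial \vec{e}^TA\vec{e}}{\partial e_k}=\sum_{j=1}^{d}a_{kj}e_j+\sum_{i=1}^{d}a_{ik}e_i=(A\vec{e})_k+(A^T\vec{e})_k
\]
which is the \(k\)-th component of \((A+A^T)\vec{e}\).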
In the special case where \(A\) is symmetric, that is, \(A=A^T\), this simplifies to
\[\frac{\partial \vec{e}^TA\vec{e}}{\partial \vec{e}}=2A\vec{e} \tag{6}
\]
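Formulas (5) and (6) can be checked numerically. The sketch below compares the closed form \((A+A^T)\vec{e}\) with a central-difference gradient of \(\vec{e}^TA\vec{e}\). The matrices `A` and `S`, the point `e`, and all helper names are hypothetical choices for illustration.

```python
def quad_form(A, e):
    """Return e^T A e for a square matrix A (list of rows) and vector e."""
    d = len(e)
    return sum(e[i] * A[i][j] * e[j] for i in range(d) for j in range(d))

def grad_formula(A, e):
    """Closed form (5): the k-th entry is sum_j (a_kj + a_jk) * e_j."""
    d = len(e)
    return [sum((A[k][j] + A[j][k]) * e[j] for j in range(d)) for k in range(d)]

def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

# Hypothetical non-symmetric matrix and evaluation point.
A = [[1.0, 2.0],
     [3.0, 4.0]]
e = [0.5, -1.5]

analytic = grad_formula(A, e)
numeric = numerical_gradient(lambda v: quad_form(A, v), e)
print(analytic)  # [-6.5, -9.5]
print(numeric)   # agrees with the line above up to rounding

# Hypothetical symmetric matrix: (5) should reduce to 2 * S * e, as in (6).
S = [[2.0, 1.0],
     [1.0, 3.0]]
two_S_e = [2 * sum(S[k][j] * e[j] for j in range(2)) for k in range(2)]
print(grad_formula(S, e), two_S_e)  # [-1.0, -8.0] [-1.0, -8.0]
```

A finite-difference comparison like this does not prove the identity, but it is a quick way to catch index or sign mistakes such as swapping \(a_{11}\) and \(a_{22}\).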