• Divergences


    Forgive me, writing in Chinese is too tiring, and I trust that everyone here has a reasonable command of English.

    KL divergence

      Consider an unknown distribution $p(x)$, and suppose we have modelled it with an approximating distribution $q(x)$. If we use $q(x)$ to build a coding scheme for transmitting values of $x$ to a receiver, then, because we used $q(x)$ rather than the true distribution $p(x)$, some additional information is needed to specify the value of $x$. The average additional amount of information required (in nats) is:

        $\begin{aligned}D_{K L}(p \| q) &=-\int p(x) \log q(x)\,dx-\left(-\int p(x) \log p(x)\,dx\right) \\&=-\int p(x) \log \frac{q(x)}{p(x)}\,dx\end{aligned}$

      Hence:

        $D_{K L}(p \| q)=-\int p(x) \log \frac{q(x)}{p(x)}\,dx$

        $D_{K L}(q \| p)=-\int q(x) \log \frac{p(x)}{q(x)}\,dx$

      The KL divergence measures, to some extent, the discrepancy between two distributions, and has the following properties:

      • $D_{K L}(p \| q) \geq 0$, with equality if and only if $p = q$.
      • It is not symmetric, i.e. $D_{K L}(p \| q) \neq D_{K L}(q \| p)$ in general, so the direction must be chosen with care when using it to measure the gap between two distributions.
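
      A minimal numerical sketch of the two properties above, assuming discrete distributions given as normalized numpy arrays (the helper name `kl_divergence` is mine, not from the source):

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(p || q) in nats.

    p and q are 1-D probability vectors over the same support,
    assumed strictly positive and summing to 1.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

p = np.array([0.4, 0.4, 0.2])
q = np.array([0.6, 0.3, 0.1])

print(kl_divergence(p, q))  # non-negative, zero only when p == q
print(kl_divergence(q, p))  # generally a different value: KL is not symmetric
```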

    $\alpha$-divergence

      Given $\alpha \in \mathbb{R}$ with $\alpha \neq 0, 1$, the $\alpha$-divergence can be defined as

        $\frac{1}{\alpha(1-\alpha)}\left(1-\sum\limits _{x} p_{2}(x)\left(\frac{p_{1}(x)}{p_{2}(x)}\right)^{\alpha}\right)$

      The KL divergence is a limiting case of the $\alpha$-divergence: $K L\left(P_{1}, P_{2}\right)$ and $K L\left(P_{2}, P_{1}\right)$ are recovered in the limits $\alpha \rightarrow 1$ and $\alpha \rightarrow 0$, respectively.

      The Amari divergence is obtained from the above by the substitution $\alpha=\frac{1+t}{2}$.
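
      As a quick check of the limiting behaviour, a small sketch with discrete numpy distributions (the function names are mine):

```python
import numpy as np

def alpha_divergence(p1, p2, alpha):
    """Discrete alpha-divergence, valid for alpha not in {0, 1}."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return (1.0 - np.sum(p2 * (p1 / p2) ** alpha)) / (alpha * (1.0 - alpha))

def kl(p, q):
    return np.sum(p * np.log(p / q))

p1 = np.array([0.4, 0.4, 0.2])
p2 = np.array([0.6, 0.3, 0.1])

# alpha near 1 approaches KL(P1, P2); alpha near 0 approaches KL(P2, P1)
print(alpha_divergence(p1, p2, 0.999), kl(p1, p2))
print(alpha_divergence(p1, p2, 0.001), kl(p2, p1))
```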

    JS divergence

      To obtain a symmetric measure, the two directions of the KL divergence can be combined; the result is the JS divergence (Jensen-Shannon divergence), with the following expression:

        $D_{J S}(p \| q)=\frac{1}{2} D_{K L}\left(p \| \frac{p+q}{2}\right)+\frac{1}{2} D_{K L}\left(q \| \frac{p+q}{2}\right)$

      Properties:

    • The JS divergence is symmetric.
    • The JS divergence is bounded: it lies in $[0, \log 2]$ when measured in nats, i.e. in $[0, 1]$ when the logarithm is taken base 2.
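
      A small sketch checking both properties, using base-2 logarithms so the values fall in $[0, 1]$ (the helper names are mine):

```python
import numpy as np

def kl_bits(p, q):
    """Discrete KL divergence in bits (base-2 log)."""
    return np.sum(p * np.log2(p / q))

def js_divergence(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)

p = np.array([0.4, 0.4, 0.2])
q = np.array([0.6, 0.3, 0.1])

print(js_divergence(p, q))  # same value in both directions ...
print(js_divergence(q, p))  # ... and always between 0 and 1
```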

    f-divergence

      Given a convex function $f(t): \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}$ with $f(1)=0$, $f^{\prime}(1)=0$, $f^{\prime \prime}(1)=1$, the $f$-divergence on $\mathcal{P}$ is defined by

        $\sum \limits _{x} p_{2}(x) f\left(\frac{p_{1}(x)}{p_{2}(x)}\right)$

    • The case $f(t)=t \ln t$ corresponds to the Kullback-Leibler distance.
    • The case $f(t)=(t-1)^{2}$ corresponds to the $\chi^{2}$-distance.
    • The case $f(t)=|t-1|$ corresponds to the variational distance.
    • The case $f(t)=4(1-\sqrt{t})$ (as well as $f(t)=2(t+1)-4 \sqrt{t}$) corresponds to the squared Hellinger metric.
    • The case $f(t)=(t-1)^{2} /(t+1)$ corresponds to the Vajda–Kus semimetric.
    • The case $f(t)=\left|t^{a}-1\right|^{1 / a}$ with $0<a \leq 1$ corresponds to the generalized Matusita distance.
    • The case $f(t)=\frac{\left(t^{a}+1\right)^{1 / a}-2^{(1-a) / a}(t+1)}{1-1 / a}$ corresponds to the Osterreicher semimetric.
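
      A generic sketch of the definition, checking two of the special cases listed above against the divergences defined earlier (the helper names are mine):

```python
import numpy as np

def f_divergence(p1, p2, f):
    """Discrete f-divergence: sum_x p2(x) * f(p1(x) / p2(x))."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return np.sum(p2 * f(p1 / p2))

p1 = np.array([0.4, 0.4, 0.2])
p2 = np.array([0.6, 0.3, 0.1])

# f(t) = t ln t recovers the KL divergence KL(P1, P2)
print(f_divergence(p1, p2, lambda t: t * np.log(t)))
# f(t) = (t - 1)^2 recovers the Pearson chi^2-distance
print(f_divergence(p1, p2, lambda t: (t - 1) ** 2))
```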

    Harmonic mean similarity

      The harmonic mean similarity is a similarity on $\mathcal{P}$ defined by

        $2 \sum \limits _{x} \frac{p_{1}(x) p_{2}(x)}{p_{1}(x)+p_{2}(x)} .$
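
      A tiny sketch of this formula (the helper name is mine); the value equals 1 exactly when the two distributions coincide:

```python
import numpy as np

def harmonic_mean_similarity(p1, p2):
    """2 * sum_x p1(x) * p2(x) / (p1(x) + p2(x))."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return 2.0 * np.sum(p1 * p2 / (p1 + p2))

p1 = np.array([0.4, 0.4, 0.2])
p2 = np.array([0.6, 0.3, 0.1])
print(harmonic_mean_similarity(p1, p2))   # strictly less than 1
print(harmonic_mean_similarity(p1, p1))   # equals 1 when the distributions are equal
```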

    Fidelity similarity

      The fidelity similarity (or Bhattacharya coefficient, Hellinger affinity) on $\mathcal{P}$ is

        $\rho\left(P_{1}, P_{2}\right)=\sum_{x} \sqrt{p_{1}(x) p_{2}(x)} .$

    Hellinger metric

      In terms of the fidelity similarity $\rho$ , the Hellinger metric (or Matusita distance, Hellinger-Kakutani metric) on $\mathcal{P}$ is defined by

        $\left(\sum\limits_{x}\left(\sqrt{p_{1}(x)}-\sqrt{p_{2}(x)}\right)^{2}\right)^{\frac{1}{2}}=\sqrt{2\left(1-\rho\left(P_{1}, P_{2}\right)\right)}$

    Bhattacharya distance 1

      In terms of the fidelity similarity $\rho$ , the Bhattacharya distance 1 (1946) is

        $\left(\arccos \rho\left(P_{1}, P_{2}\right)\right)^{2} $
      for $P_{1}, P_{2} \in \mathcal{P}$. Twice this distance is the Rao distance. It is also used in statistics and machine learning, where it is called the Fisher distance.

    Bhattacharya distance 2

      The Bhattacharya distance 2 (1943) on $\mathcal{P}$ is defined by

        $-\ln \rho\left(P_{1}, P_{2}\right)$
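
      A combined sketch of the fidelity similarity and the three quantities derived from it above (Hellinger metric, Bhattacharya distance 1, Bhattacharya distance 2); the helper names are mine:

```python
import numpy as np

def fidelity(p1, p2):
    """Bhattacharya coefficient rho(P1, P2) = sum_x sqrt(p1(x) * p2(x))."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return np.sum(np.sqrt(p1 * p2))

def hellinger(p1, p2):
    """Hellinger metric computed directly from its definition."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return np.sqrt(np.sum((np.sqrt(p1) - np.sqrt(p2)) ** 2))

p1 = np.array([0.4, 0.4, 0.2])
p2 = np.array([0.6, 0.3, 0.1])

rho = fidelity(p1, p2)
print(hellinger(p1, p2), np.sqrt(2.0 * (1.0 - rho)))  # the two expressions agree
print(np.arccos(rho) ** 2)                            # Bhattacharya distance 1
print(-np.log(rho))                                   # Bhattacharya distance 2
```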

    $\chi^{2}$ -distance 

      The $\chi^{2}$ -distance (or Pearson $\chi^{2} $-distance) is a quasi-distance on $\mathcal{P}$ , defined by

        $\sum_{x} \frac{\left(p_{1}(x)-p_{2}(x)\right)^{2}}{p_{2}(x)}$

      The Neyman $\chi^{2}$ -distance is a quasi-distance on $\mathcal{P} $, defined by

        $\sum_{x} \frac{\left(p_{1}(x)-p_{2}(x)\right)^{2}}{p_{1}(x)} .$

      Half of the $\chi^{2}$-distance is also called Kagan's divergence.

      The probabilistic symmetric $\chi^{2}$ -measure is a distance on $\mathcal{P} $, defined by

        $2 \sum_{x} \frac{\left(p_{1}(x)-p_{2}(x)\right)^{2}}{p_{1}(x)+p_{2}(x)} .$
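
      A short sketch of the three $\chi^{2}$-type quantities above (the helper names are mine):

```python
import numpy as np

def pearson_chi2(p1, p2):
    """Pearson chi^2-distance: sum_x (p1 - p2)^2 / p2."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return np.sum((p1 - p2) ** 2 / p2)

def neyman_chi2(p1, p2):
    """Neyman chi^2-distance: sum_x (p1 - p2)^2 / p1."""
    return pearson_chi2(p2, p1)

def symmetric_chi2(p1, p2):
    """Probabilistic symmetric chi^2-measure: 2 * sum_x (p1 - p2)^2 / (p1 + p2)."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return 2.0 * np.sum((p1 - p2) ** 2 / (p1 + p2))

p1 = np.array([0.4, 0.4, 0.2])
p2 = np.array([0.6, 0.3, 0.1])
print(pearson_chi2(p1, p2), neyman_chi2(p1, p2), symmetric_chi2(p1, p2))
```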

      Since I have no immediate use for the remaining entries, I have not written them up.

      This article is based on the Encyclopedia of Distances (chapter "Distances on Distribution Laws", p. 261); contact the blogger if you need the e-book.

      I also consulted another "borrowing" blogger's post,《机器学习中的数学
