Adversarially Robust Generalization Requires More Data

Contents
- Overview
- Main content
  Gaussian model
  upper bound
  lower bound
  Bernoulli model
  upper bound
  lower bound
Schmidt L, Santurkar S, Tsipras D, et al. Adversarially Robust Generalization Requires More Data[C]. Neural Information Processing Systems, 2018: 5014-5026.

@article{schmidt2018adversarially,
title={Adversarially Robust Generalization Requires More Data},
author={Schmidt, Ludwig and Santurkar, Shibani and Tsipras, Dimitris and Talwar, Kunal and Madry, Aleksander},
pages={5014--5026},
year={2018}}

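Overview

This paper analyzes adversarial robustness for binary classification under a Gaussian model and a Bernoulli model, and shows that an adversarially robust model needs more training data than a model that only generalizes in the standard sense.

Main content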

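Gaussian model: Let \(\theta^* \in \mathbb{R}^d\) be the mean vector and let \(\sigma > 0\). The \((\theta^*, \sigma)\)-Gaussian model is defined as follows: first draw a label \(y \in \{\pm 1\}\) uniformly at random, then sample \(x \in \mathbb{R}^d\) from \(\mathcal{N}(y \cdot \theta^*, \sigma^2 I)\).

Bernoulli model: Let \(\theta^* \in \{\pm 1\}^d\) be the mean vector and let \(\tau > 0\). The \((\theta^*, \tau)\)-Bernoulli model is defined as follows: first draw a label \(y \in \{\pm 1\}\) uniformly at random, then sample each coordinate of \(x \in \{\pm 1\}^d\) from the following distribution:

\[x_i = \left\{ \begin{array}{rl} y \cdot \theta_i^* & \mathrm{with} \: \mathrm{probability} \: 1/2+\tau \\ -y \cdot \theta_i^* & \mathrm{with} \: \mathrm{probability} \: 1/2-\tau \end{array} \right. \]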
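The two data models can be sketched as samplers. This is a minimal illustration using only the Python standard library; the function names `sample_gaussian` and `sample_bernoulli` and the concrete parameter values are assumptions made for this example, not taken from the paper.

```python
import random

def sample_gaussian(theta, sigma, rng):
    """Draw one (x, y) pair from the (theta, sigma)-Gaussian model."""
    y = rng.choice([-1, 1])                        # uniform label
    x = [rng.gauss(y * t, sigma) for t in theta]   # x ~ N(y * theta, sigma^2 I)
    return x, y

def sample_bernoulli(theta, tau, rng):
    """Draw one (x, y) pair from the (theta, tau)-Bernoulli model."""
    y = rng.choice([-1, 1])                        # uniform label
    # Each coordinate equals y * theta_i with probability 1/2 + tau,
    # and -y * theta_i otherwise.
    x = [y * t if rng.random() < 0.5 + tau else -y * t for t in theta]
    return x, y

rng = random.Random(0)   # fixed seed so the sketch is reproducible
d = 8                    # illustrative dimension
x_g, y_g = sample_gaussian([1.0] * d, sigma=0.5, rng=rng)
x_b, y_b = sample_bernoulli([1] * d, tau=0.1, rng=rng)
```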
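
Classification error: Let \(\mathcal{P}: \mathbb{R}^d \times \{\pm 1\} \rightarrow \mathbb{R}\) be a distribution. The classification error \(\beta\) of a classifier \(f:\mathbb{R}^d \rightarrow \{\pm 1\}\) is defined as \(\beta=\mathbb{P}_{(x, y) \sim \mathcal{P}} [f(x) \not= y]\).

Robust classification error: Let \(\mathcal{P}: \mathbb{R}^d \times \{\pm 1\} \rightarrow \mathbb{R}\) be a distribution and let \(\mathcal{B}: \mathbb{R}^d \rightarrow \mathscr{P}(\mathbb{R}^d)\) be a perturbation set. The \(\mathcal{B}\)-robust classification error \(\beta\) of a classifier \(f:\mathbb{R}^d \rightarrow \{\pm 1\}\) is defined as \(\beta=\mathbb{P}_{(x, y) \sim \mathcal{P}} [\exists x' \in \mathcal{B}(x): f(x') \not= y]\).

Note: \(\mathcal{B}_p^{\epsilon}(x)\) denotes \(\{x' \in \mathbb{R}^d \mid \|x'-x\|_p \le \epsilon\}\).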
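
For a linear classifier \(f_w(x) = \mathrm{sgn}(\langle w, x \rangle)\) and the perturbation set \(\mathcal{B}_\infty^{\epsilon}\), the robust error can be checked in closed form: the worst-case perturbation lowers the margin \(y \langle w, x \rangle\) by exactly \(\epsilon \|w\|_1\). The sketch below estimates the empirical robust error this way. The function name and the toy data are assumptions for illustration, and a margin of exactly zero is counted as an error by convention.

```python
def robust_error_linear(w, samples, eps):
    """Empirical l_inf^eps-robust error of x -> sgn(<w, x>).

    A point (x, y) is robustly correct iff y * <w, x> - eps * ||w||_1 > 0,
    since an l_inf perturbation of size eps can reduce the margin by at
    most eps * ||w||_1 (and this worst case is attained).
    """
    l1_norm = sum(abs(wi) for wi in w)
    errors = 0
    for x, y in samples:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin - eps * l1_norm <= 0:   # zero margin counted as an error
            errors += 1
    return errors / len(samples)

# Toy data (hypothetical): margins are 4, 1 and 2 for w = [1, 1].
toy_w = [1.0, 1.0]
toy_samples = [([2.0, 2.0], 1), ([0.5, 0.5], 1), ([-1.0, -1.0], -1)]
standard_err = robust_error_linear(toy_w, toy_samples, eps=0.0)
robust_err = robust_error_linear(toy_w, toy_samples, eps=0.75)
```

Setting `eps=0.0` recovers the standard classification error, so the same function covers both definitions.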

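Gaussian model

upper bound

Theorem 18: Let \((x_1,y_1),\ldots, (x_n,y_n) \in \mathbb{R}^d \times \{\pm 1\}\) be drawn i.i.d. from a \((\theta^*, \sigma)\)-Gaussian model with \(\|\theta^*\|_2=\sqrt{d}\). Let \(\hat{w}:=\bar{z}/\|\bar{z}\|_2 \in \mathbb{R}^d\), where \(\bar{z}=\frac{1}{n} \sum_{i=1}^n y_ix_i\). Then with probability at least \(1-2\exp(-\frac{d}{8(\sigma^2+1)})\), the linear classifier \(f_{\hat{w}}(x) = \mathrm{sgn}(\langle \hat{w}, x \rangle)\) has classification error at most

\[\exp \left(-\frac{(2\sqrt{n}-1)^2d}{2(2\sqrt{n}+4\sigma)^2\sigma^2}\right). \]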
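
To get a feel for Theorem 18, the estimator \(\hat{w}\) and the two quantities in the statement can be evaluated numerically. This is a sketch only: the function names and the parameter values \(n=1\), \(d=1000\), \(\sigma=1\) are assumptions chosen for illustration.

```python
import math

def mean_estimator(samples):
    """Return w_hat = z_bar / ||z_bar||_2 with z_bar = (1/n) * sum_i y_i * x_i."""
    n = len(samples)
    d = len(samples[0][0])
    z_bar = [sum(y * x[j] for x, y in samples) / n for j in range(d)]
    norm = math.sqrt(sum(z * z for z in z_bar))
    return [z / norm for z in z_bar]

def theorem18_error_bound(n, d, sigma):
    """Upper bound on the classification error stated in Theorem 18."""
    numerator = (2 * math.sqrt(n) - 1) ** 2 * d
    denominator = 2 * (2 * math.sqrt(n) + 4 * sigma) ** 2 * sigma ** 2
    return math.exp(-numerator / denominator)

def theorem18_success_prob(d, sigma):
    """Probability with which the bound of Theorem 18 holds."""
    return 1 - 2 * math.exp(-d / (8 * (sigma ** 2 + 1)))

# Two hand-made samples whose signed mean is (3, 4).
w_hat = mean_estimator([([3.0, 4.0], 1), ([-3.0, -4.0], -1)])
bound_one_sample = theorem18_error_bound(n=1, d=1000, sigma=1.0)
prob = theorem18_success_prob(d=1000, sigma=1.0)
```

With these illustrative values the bound is already very small for a single sample, which matches the paper's point that standard generalization in this model is cheap.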
Theorem 21: Let \((x_1,y_1),\ldots, (x_n,y_n) \in \mathbb{R}^d \times \{\pm 1\}\) be drawn i.i.d. from a \((\theta^*, \sigma)\)-Gaussian model with \(\|\theta^*\|_2=\sqrt{d}\). Let \(\hat{w}:=\bar{z}/\|\bar{z}\|_2 \in \mathbb{R}^d\), where \(\bar{z}=\frac{1}{n} \sum_{i=1}^n y_ix_i\). If

\[\epsilon \le \frac{2\sqrt{n}-1}{2\sqrt{n}+4\sigma} - \frac{\sigma\sqrt{2\log 1/\beta}}{\sqrt{d}}, \]
then with probability at least \(1-2\exp(-\frac{d}{8(\sigma^2+1)})\), the linear classifier \(f_{\hat{w}}\) has \(\ell_{\infty}^{\epsilon}\)-robust classification error at most \(\beta\).
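
The condition on \(\epsilon\) in Theorem 21 is easy to tabulate. The sketch below computes the largest perturbation radius the theorem covers for given \(n\), \(d\), \(\sigma\), \(\beta\); the function name and the example values are assumptions made for illustration.

```python
import math

def theorem21_max_eps(n, d, sigma, beta):
    """Largest eps for which Theorem 21 guarantees robust error at most beta."""
    sample_term = (2 * math.sqrt(n) - 1) / (2 * math.sqrt(n) + 4 * sigma)
    noise_term = sigma * math.sqrt(2 * math.log(1 / beta)) / math.sqrt(d)
    return sample_term - noise_term

# Illustrative setting: d = 1000, sigma = 1, target robust error beta = 0.01.
eps_n1 = theorem21_max_eps(n=1, d=1000, sigma=1.0, beta=0.01)
eps_n100 = theorem21_max_eps(n=100, d=1000, sigma=1.0, beta=0.01)
```

The first term grows toward 1 as \(n\) increases, so more samples allow a larger guaranteed radius, while the second term is independent of \(n\).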

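lower bound

Theorem 11: Let \(g_n\) be any learning algorithm, and let \(\sigma > 0\), \(\epsilon \ge 0\). Let \(\theta \in \mathbb{R}^d\) be drawn from \(\mathcal{N}(0,I)\), draw \(n\) samples from the \((\theta,\sigma)\)-Gaussian model, and let \(f_n: \mathbb{R}^d \rightarrow \{\pm 1\}\) be the classifier that \(g_n\) produces from them. Then the expected \(\ell_{\infty}^{\epsilon}\)-robust classification error of \(f_n\), taken over \(\theta, (y_1,\ldots, y_n), (x_1,\ldots, x_n)\), is at least

\[\frac{1}{2} \mathbb{P}_{v\sim \mathcal{N}(0, I)} \left[\sqrt{\frac{n}{\sigma^2+n}} \|v\|_{\infty} \le \epsilon \right]. \]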
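
The probability in Theorem 11 has a closed form, because the coordinates of \(v\) are independent standard normals: \(\mathbb{P}[\|v\|_\infty \le t] = \mathrm{erf}(t/\sqrt{2})^d\). The sketch below evaluates the lower bound this way; the function name and the parameter values are assumptions chosen for illustration.

```python
import math

def theorem11_lower_bound(n, d, sigma, eps):
    """Evaluate (1/2) * P[ sqrt(n / (sigma^2 + n)) * ||v||_inf <= eps ], v ~ N(0, I_d).

    The event is ||v||_inf <= t with t = eps * sqrt((sigma^2 + n) / n), and
    P[|v_i| <= t] = erf(t / sqrt(2)) for each of the d independent coordinates.
    """
    t = eps * math.sqrt((sigma ** 2 + n) / n)
    return 0.5 * math.erf(t / math.sqrt(2)) ** d

# Illustrative setting: d = 100, sigma = 10, eps = 0.5.
lb_few = theorem11_lower_bound(n=1, d=100, sigma=10.0, eps=0.5)
lb_many = theorem11_lower_bound(n=100, d=100, sigma=10.0, eps=0.5)
```

In this setting the bound is close to 1/2 for a single sample and becomes vacuous once \(n\) is comparable to \(\sigma^2\), which is the gap to the single-sample guarantee of Theorem 18 that the paper highlights.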
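Bernoulli model

upper bound

Let \((x, y) \in \mathbb{R}^d \times \{\pm 1\}\) be drawn from a \((\theta^*, \tau)\)-Bernoulli model. Let \(\hat{w}=z / \|z\|_2\), where \(z=yx\). Then with probability at least \(1- \exp (-\frac{\tau^2d}{2})\), the linear classifier \(f_{\hat{w}}\) has classification error at most \(\exp (-2\tau^4d)\).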
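
The single-sample classifier above can be simulated directly. The sketch below draws one training point, builds \(\hat{w}\), and measures the empirical error on fresh samples. All names, the seed, and the values \(d=1000\), \(\tau=0.25\) are assumptions for illustration, and ties in the sign are broken toward \(+1\).

```python
import math
import random

def sample_bernoulli(theta, tau, rng):
    """Draw one (x, y) pair from the (theta, tau)-Bernoulli model."""
    y = rng.choice([-1, 1])
    x = [y * t if rng.random() < 0.5 + tau else -y * t for t in theta]
    return x, y

def single_sample_classifier(x, y):
    """Return w_hat = z / ||z||_2 with z = y * x."""
    z = [y * xi for xi in x]
    norm = math.sqrt(sum(zi * zi for zi in z))
    return [zi / norm for zi in z]

def predict(w, x):
    """Linear classifier sgn(<w, x>), with ties broken toward +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

rng = random.Random(0)
d, tau = 1000, 0.25
theta = [rng.choice([-1, 1]) for _ in range(d)]

# Train on a single sample, then evaluate on 200 fresh samples.
w_hat = single_sample_classifier(*sample_bernoulli(theta, tau, rng))
test_set = [sample_bernoulli(theta, tau, rng) for _ in range(200)]
empirical_error = sum(predict(w_hat, x) != y for x, y in test_set) / len(test_set)
error_bound = math.exp(-2 * tau ** 4 * d)   # bound from the statement above
```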

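lower bound

Lemma 30: Let \(\theta^* \in \{\pm 1\}^d\) and consider the linear classifier \(f_{\theta^*}\) for the \((\theta^*, \tau)\)-Bernoulli model. Then:
- \(\ell_{\infty}^{\tau}\)-robustness: the \(\ell_{\infty}^{\tau}\)-robust classification error of \(f_{\theta^*}\) is at most \(2\exp (-\tau^2d/2)\).
- \(\ell_{\infty}^{3\tau}\)-nonrobustness: the \(\ell_{\infty}^{3\tau}\)-robust classification error of \(f_{\theta^*}\) is at least \(1-2\exp (-\tau^2d/2)\).
- Near-optimality of \(\theta^*\): for any linear classifier, the \(\ell_{\infty}^{3\tau}\)-robust classification error is at least \(\frac{1}{6}\).

Theorem 31: Let \(g_n\) be any learning algorithm for linear classifiers. Suppose \(\theta^*\) is drawn uniformly from \(\{\pm 1\}^d\), and \(n\) samples are drawn from the \((\theta^*, \tau)\)-Bernoulli model with \(\tau \le 1/4\), from which \(g_n\) produces a linear classifier \(f_{w}\). Moreover, let \(\epsilon < 3\tau\) and \(0 < \gamma < 1/2\). Then if

\[n \le \frac{\epsilon^2\gamma^2}{5000 \cdot \tau^4 \log (4d/\gamma)}, \]
the expected \(\ell_{\infty}^{\epsilon}\)-robust classification error of \(f_w\), taken over \(\theta^*, (y_1,\ldots, y_n), (x_1,\ldots, x_n)\), is at least \(\frac{1}{2}-\gamma\).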
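
The sample-size condition of Theorem 31 can be evaluated as follows. This is a sketch under the theorem's stated constraints; the function name and the example values are assumptions for illustration.

```python
import math

def theorem31_max_n(d, tau, eps, gamma):
    """Largest n for which Theorem 31 forces robust error >= 1/2 - gamma."""
    if not (0 < tau <= 0.25):
        raise ValueError("Theorem 31 assumes 0 < tau <= 1/4")
    if not (0 <= eps < 3 * tau):
        raise ValueError("Theorem 31 assumes eps < 3 * tau")
    if not (0 < gamma < 0.5):
        raise ValueError("Theorem 31 assumes 0 < gamma < 1/2")
    return eps ** 2 * gamma ** 2 / (5000 * tau ** 4 * math.log(4 * d / gamma))

# Illustrative setting: for fixed eps, halving tau raises the threshold 16-fold.
n_a = theorem31_max_n(d=10000, tau=2e-4, eps=2e-4, gamma=0.1)
n_b = theorem31_max_n(d=10000, tau=1e-4, eps=2e-4, gamma=0.1)
```

For fixed \(\epsilon\) the threshold scales as \(\tau^{-4}\), so a weaker per-coordinate signal makes robust learning with linear classifiers require many more samples.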
Original post: https://www.cnblogs.com/MTandHJ/p/13033613.html
