Multiclass Classification
One-Versus-All (OVA) Decomposition
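Taking logistic regression as an example, the idea is to separate one class from all the remaining classes as a binary classification problem, and to repeat this for every class. This yields K logistic regression classifiers; the prediction is the class whose hypothesis outputs the highest probability.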
- for \(k \in \mathcal{Y}\), obtain \(\mathbf{w}_{[k]}\) by running logistic regression on
\[\mathcal{D}_{[k]} = \left\{ \left( \mathbf{x}_n, y_n^{\prime} = 2 \left[\kern-0.15em\left[ y_n = k \right]\kern-0.15em\right] - 1 \right) \right\}_{n=1}^{N}\]
- return \(g(\mathbf{x}) = \operatorname{argmax}_{k \in \mathcal{Y}} \left( \mathbf{w}_{[k]}^{T} \mathbf{x} \right)\)
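The two OVA steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the helper `train_logreg`, its step size and iteration count, the constant feature \(x_0 = 1\), and the toy clusters are all assumptions made for the example.

```python
import numpy as np

def train_logreg(X, y, lr=0.1, n_iters=500):
    """Plain gradient descent on the logistic loss, labels y in {-1, +1}.
    (Hypothetical helper; lr and n_iters are arbitrary choices.)"""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        margins = y * (X @ w)
        # gradient of mean(log(1 + exp(-y * w^T x)))
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def ova_fit(X, y, classes):
    """Train one w_[k] per class k on D_[k], where y'_n = 2[[y_n = k]] - 1."""
    return {k: train_logreg(X, np.where(y == k, 1.0, -1.0)) for k in classes}

def ova_predict(W, X):
    """g(x) = argmax_k w_[k]^T x."""
    classes = list(W)
    scores = np.column_stack([X @ W[k] for k in classes])
    return np.array(classes)[scores.argmax(axis=1)]

# Toy data (assumed): three well-separated 2-D clusters plus a bias feature.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 4.0], [-4.0, -3.0], [4.0, -3.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + 0.5 * rng.standard_normal((90, 2))
X = np.hstack([np.ones((90, 1)), X])

W = ova_fit(X, y, classes=[0, 1, 2])
pred = ova_predict(W, X)
accuracy = (pred == y).mean()
print(f"OVA training accuracy: {accuracy:.2f}")
```

Because the sigmoid is monotonic, taking the argmax over the raw scores \(\mathbf{w}_{[k]}^{T}\mathbf{x}\) selects the same class as taking the argmax over the estimated probabilities.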
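Its pros and cons are:
- pros: efficient; can be coupled with any logistic regression-like approach (any algorithm that outputs a probability-like score)
- cons: \(\mathcal{D}_{[k]}\) is often unbalanced when \(K\) is large, since each class is pitted against all the others combined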
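To make the imbalance concrete, the snippet below counts the share of positive labels in one OVA subproblem. It assumes perfectly balanced classes with 50 examples each, which is an assumption of this illustration only; `positive_fraction` is a hypothetical helper.

```python
import numpy as np

def positive_fraction(y, k):
    """Fraction of examples in D_[k] that get label +1, i.e. have y_n = k."""
    return float(np.mean(y == k))

for K in (2, 10, 100):
    y = np.repeat(np.arange(K), 50)      # K balanced classes, 50 examples each
    print(K, positive_fraction(y, 0))    # equals 1/K under this assumption
```

Under this assumption only a 1/K share of each subproblem is positive, so for K = 100 a classifier that always answers "not class k" is already right on 99% of the training examples.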
One-Versus-One (OVO) Decomposition
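The basic idea is to run binary classification between one class and a single one of the other classes, and to repeat this for every pair of classes, so the number of classifiers equals the number of pairs, K(K-1)/2: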
- for \((k, \ell) \in \mathcal{Y} \times \mathcal{Y}\), obtain \(\mathbf{w}_{[k,\ell]}\) by running logistic regression on
\[\mathcal{D}_{[k,\ell]} = \left\{ \left( \mathbf{x}_n, y_n^{\prime} = 2 \left[\kern-0.15em\left[ y_n = k \right]\kern-0.15em\right] - 1 \right) : y_n = k \text{ or } y_n = \ell \right\}\]
- return \(g(\mathbf{x}) = \text{tournament champion} \left\{ \mathbf{w}_{[k,\ell]}^{T} \mathbf{x} \right\}\)
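A minimal sketch of the OVO procedure above, reading "tournament champion" as a majority vote over all pairwise classifiers (one common interpretation, assumed here). The helper `train_logreg`, its hyperparameters, the bias feature, and the four toy clusters are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

def train_logreg(X, y, lr=0.1, n_iters=500):
    """Plain gradient descent on the logistic loss, labels y in {-1, +1}.
    (Hypothetical helper; lr and n_iters are arbitrary choices.)"""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        margins = y * (X @ w)
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def ovo_fit(X, y, classes):
    """Train w_[k,l] on D_[k,l]: only examples with y_n = k or y_n = l."""
    W = {}
    for k, l in combinations(classes, 2):
        mask = (y == k) | (y == l)
        W[(k, l)] = train_logreg(X[mask], np.where(y[mask] == k, 1.0, -1.0))
    return W

def ovo_predict(W, X, classes):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    index = {k: i for i, k in enumerate(classes)}
    votes = np.zeros((X.shape[0], len(classes)), dtype=int)
    for (k, l), w in W.items():
        k_wins = X @ w > 0
        votes[k_wins, index[k]] += 1
        votes[~k_wins, index[l]] += 1
    return np.array(classes)[votes.argmax(axis=1)]

# Toy data (assumed): four well-separated 2-D clusters plus a bias feature.
rng = np.random.default_rng(0)
centers = np.array([[4.0, 4.0], [-4.0, 4.0], [-4.0, -4.0], [4.0, -4.0]])
y = np.repeat(np.arange(4), 25)
X = centers[y] + 0.5 * rng.standard_normal((100, 2))
X = np.hstack([np.ones((100, 1)), X])

classes = [0, 1, 2, 3]
W = ovo_fit(X, y, classes)
pred = ovo_predict(W, X, classes)
accuracy = (pred == y).mean()
print(len(W), "pairwise classifiers, training accuracy", accuracy)
```

The sketch enumerates unordered pairs, so K = 4 classes give 6 classifiers. Ties in the vote are broken by `argmax`, which picks the first class in `classes`; other tie-breaking rules are equally valid.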
Its pros and cons are:
- pros: efficient ('smaller' training problems); stable; can be coupled with any binary classification approach
- cons: uses \(O(K^2)\) weight vectors \(\mathbf{w}_{[k,\ell]}\), which means more space, slower prediction, and more training runs
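As a rough worked comparison, assume the \(N\) examples are split evenly over the \(K\) classes (an assumption made only for this estimate). OVO then trains one classifier per unordered pair, each on the examples of just two classes, while OVA trains \(K\) classifiers on all \(N\) examples:
\[\underbrace{\frac{K(K-1)}{2}}_{\text{OVO classifiers}} \times \underbrace{\frac{2N}{K}}_{\text{examples each}} = (K-1)N \qquad \text{versus} \qquad \underbrace{K}_{\text{OVA classifiers}} \times \underbrace{N}_{\text{examples each}} = KN\]
For \(K = 10\), this means 45 weight vectors trained on \(N/5\) examples each for OVO, against 10 weight vectors trained on \(N\) examples each for OVA. Each OVO problem is small, but the number of weight vectors to store and evaluate at prediction time grows quadratically in \(K\).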