• scikit-learn:4.5. Random Projection


    Reference: http://scikit-learn.org/stable/modules/random_projection.html


    The sklearn.random_projection module reduces the dimensionality of data by trading a controlled amount of accuracy for better efficiency. It implements two types of unstructured random matrix: the Gaussian random matrix and the sparse random matrix.
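    As a minimal sketch of the two matrix types, the snippet below projects synthetic data with GaussianRandomProjection and SparseRandomProjection. The data shape, n_components=200 and the random seed are assumptions chosen only for illustration; they do not come from the original note.

```python
import numpy as np
from sklearn.random_projection import (
    GaussianRandomProjection,
    SparseRandomProjection,
)

# Synthetic data (illustrative assumption): 100 samples, 2000 features.
rng = np.random.RandomState(0)
X = rng.rand(100, 2000)

# Dense Gaussian random matrix.
gaussian = GaussianRandomProjection(n_components=200, random_state=0)
X_gauss = gaussian.fit_transform(X)

# Sparse random matrix: cheaper to store and to apply.
sparse = SparseRandomProjection(n_components=200, random_state=0)
X_sparse = sparse.fit_transform(X)

print(X_gauss.shape, X_sparse.shape)
```

    Both transformers share the same fit/transform interface, so one can be swapped for the other when memory or speed matters.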


    Theoretical basis: the Johnson-Lindenstrauss lemma. Quoting Wikipedia, the lemma says roughly the following:

    In mathematics, the Johnson-Lindenstrauss lemma is a result concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz, and can even be taken to be an orthogonal projection.
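    The "distances are nearly preserved" claim can be checked empirically. The sketch below is only an illustration under assumed settings (50 synthetic points, 5000 features, 1000 components): it compares pairwise Euclidean distances before and after a Gaussian random projection and reports the smallest and largest ratio.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
from sklearn.random_projection import GaussianRandomProjection

# Synthetic points (illustrative assumption): 50 samples in 5000 dimensions.
rng = np.random.RandomState(42)
X = rng.rand(50, 5000)

# Project down to 1000 dimensions with a Gaussian random matrix.
projector = GaussianRandomProjection(n_components=1000, random_state=42)
X_proj = projector.fit_transform(X)

# Keep each pair once (upper triangle, diagonal excluded).
iu = np.triu_indices(X.shape[0], k=1)
d_orig = euclidean_distances(X)[iu]
d_proj = euclidean_distances(X_proj)[iu]

# Ratios close to 1 mean the pairwise distances survived the projection.
ratios = d_proj / d_orig
print(ratios.min(), ratios.max())
```

    With these settings the ratios should stay close to 1; fewer components give a wider spread.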


    Knowing only the number of samples (and a chosen distortion tolerance eps), sklearn.random_projection.johnson_lindenstrauss_min_dim conservatively estimates the minimal dimension of the random subspace that guarantees a bounded distortion introduced by the random projection:

    >>> from sklearn.random_projection import johnson_lindenstrauss_min_dim
    >>> johnson_lindenstrauss_min_dim(n_samples=1e6, eps=0.5)
    663
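    To see where 663 comes from, the sketch below recomputes the bound by hand. It assumes the formula given in the scikit-learn documentation, n_components >= 4 log(n_samples) / (eps^2 / 2 - eps^3 / 3), truncated to an integer; the helper jl_min_dim_manual is a hypothetical name used only for this illustration.

```python
import numpy as np
from sklearn.random_projection import johnson_lindenstrauss_min_dim


def jl_min_dim_manual(n_samples, eps):
    """Hand-written version of the bound (hypothetical helper, assumed formula)."""
    denominator = eps ** 2 / 2 - eps ** 3 / 3
    return int(4 * np.log(n_samples) / denominator)


manual = jl_min_dim_manual(1e6, 0.5)
library = int(johnson_lindenstrauss_min_dim(n_samples=1e6, eps=0.5))
print(manual, library)

# A tighter distortion tolerance requires more dimensions.
tight = int(johnson_lindenstrauss_min_dim(n_samples=1e6, eps=0.1))
print(tight > library)
```

    Note that the bound depends on the number of samples and on eps, not on the original number of features.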
    
              
  • Original article: https://www.cnblogs.com/slgkaifa/p/7306432.html