• Machine Learning with sklearn (12): Feature Engineering (3), Feature Combination and Crossing (1): Polynomial Features


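    In machine learning, it is often useful to add complexity to a model by considering nonlinear features of the input data. A simple and common approach is to use polynomial features, which capture higher-order terms and interaction terms of the features. This is implemented in PolynomialFeatures: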

    >>> import numpy as np
    >>> from sklearn.preprocessing import PolynomialFeatures
    >>> X = np.arange(6).reshape(3, 2)
    >>> X
    array([[0, 1],
           [2, 3],
           [4, 5]])
    >>> poly = PolynomialFeatures(2)
    >>> poly.fit_transform(X)
    array([[  1.,   0.,   1.,   0.,   0.,   1.],
           [  1.,   2.,   3.,   4.,   6.,   9.],
           [  1.,   4.,   5.,  16.,  20.,  25.]])
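    The output columns correspond to mapping (X1, X2) to (1, X1, X2, X1^2, X1*X2, X2^2). As a rough check, the sketch below rebuilds those columns by hand with NumPy and compares them with the transformer output. The variable names (x1, x2, manual, expanded) are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)
x1, x2 = X[:, 0], X[:, 1]

# Hand-built degree-2 expansion: 1, x1, x2, x1^2, x1*x2, x2^2
manual = np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

expanded = PolynomialFeatures(2).fit_transform(X)
print(np.allclose(manual, expanded))
```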

    >>> X = np.arange(9).reshape(3, 3)
    >>> X
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> poly = PolynomialFeatures(degree=3, interaction_only=True)
    >>> poly.fit_transform(X)
    array([[   1.,    0.,    1.,    2.,    0.,    0.,    2.,    0.],
           [   1.,    3.,    4.,    5.,   12.,   15.,   20.,   60.],
           [   1.,    6.,    7.,    8.,   42.,   48.,   56.,  336.]])
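    With interaction_only=True, the columns correspond to (1, X1, X2, X3, X1*X2, X1*X3, X2*X3, X1*X2*X3): every product of up to three distinct input features, with no squared or cubed terms. The sketch below reproduces this with itertools.combinations. It is an illustration of the idea under that assumption, not the library's internal implementation.

```python
from itertools import combinations

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(9).reshape(3, 3)
n_features, degree = X.shape[1], 3

columns = []
for k in range(degree + 1):
    # Each subset of k distinct feature indices yields one output column.
    for idx in combinations(range(n_features), k):
        col = np.ones(X.shape[0])  # the empty subset gives the bias column
        for j in idx:
            col = col * X[:, j]
        columns.append(col)
manual = np.column_stack(columns)

expanded = PolynomialFeatures(degree=3, interaction_only=True).fit_transform(X)
print(np.allclose(manual, expanded))
```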

    Note that polynomial features are used implicitly by kernel methods (for example, sklearn.svm.SVC and sklearn.decomposition.KernelPCA) when a polynomial kernel function is used.
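    To make the "implicit" part concrete, the sketch below checks numerically that a degree-2 polynomial kernel, (<x, y> + 1)^2, equals an ordinary inner product between explicitly expanded feature vectors. The sqrt(2) rescaling of some columns is needed for the two to match exactly. It follows from expanding the square and is not something PolynomialFeatures does itself. The choice of gamma=1 and coef0=1 is an assumption made for the illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.rand(5, 2)

# Kernel matrix with k(x, y) = (gamma * <x, y> + coef0) ** degree
K = polynomial_kernel(X, degree=2, gamma=1.0, coef0=1.0)

# Explicit features [1, a, b, a^2, ab, b^2], rescaled so that
# plain inner products reproduce the kernel values.
phi = PolynomialFeatures(2).fit_transform(X)
scale = np.sqrt([1.0, 2.0, 2.0, 1.0, 2.0, 1.0])
phi_scaled = phi * scale

print(np.allclose(K, phi_scaled @ phi_scaled.T))
```

    A kernel method never builds phi_scaled; it only evaluates K, which is why it stays cheap even when the explicit expansion would be very wide.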

    For an example of ridge regression using generated polynomial features, see Polynomial interpolation.
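    As a minimal sketch of that pattern (not the referenced example itself), the code below chains PolynomialFeatures with Ridge in a pipeline and fits it to synthetic data. The cubic target function, the noise level, and alpha=1e-3 are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic 1-D data: a cubic curve plus Gaussian noise (illustrative only)
rng = np.random.RandomState(0)
x = np.sort(rng.uniform(-3, 3, size=40))
y = 0.5 * x**3 - x + rng.normal(scale=0.5, size=x.shape)

X = x[:, np.newaxis]  # estimators expect a 2-D feature matrix

# Expand to [1, x, x^2, x^3], then fit a regularized linear model on top
model = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=1e-3))
model.fit(X, y)

r2 = model.score(X, y)
print(f"R^2 on the training data: {r2:.3f}")
```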

    class sklearn.preprocessing.PolynomialFeatures(degree=2, *, interaction_only=False, include_bias=True, order='C')

    Generate polynomial and interaction features.

    Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

    Parameters
    degree : int, default=2

    The degree of the polynomial features.

    interaction_only : bool, default=False

    If True, only interaction features are produced: features that are products of at most degree distinct input features (so not x[1] ** 2, x[0] * x[2] ** 3, etc.).

    include_bias : bool, default=True

    If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).

    order : {'C', 'F'}, default='C'

    Order of output array in the dense case. ‘F’ order is faster to compute, but may slow down subsequent estimators.

    New in version 0.21.
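    The effect of include_bias and order can be inspected directly. The sketch below is a small illustration with made-up variable names: it shows that include_bias=False drops the leading column of ones, and that order changes only the memory layout of the dense output, not its values.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)

# include_bias: with and without the leading column of ones
with_bias = PolynomialFeatures(2, include_bias=True).fit_transform(X)
without_bias = PolynomialFeatures(2, include_bias=False).fit_transform(X)
print(with_bias.shape, without_bias.shape)
print(np.allclose(with_bias[:, 1:], without_bias))

# order: same values, different memory layout for the dense output
out_c = PolynomialFeatures(2, order="C").fit_transform(X)
out_f = PolynomialFeatures(2, order="F").fit_transform(X)
print(out_c.flags["C_CONTIGUOUS"], out_f.flags["F_CONTIGUOUS"])
print(np.array_equal(out_c, out_f))
```

    Dropping the bias column is a common choice when the downstream linear model already fits its own intercept.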

    Attributes
    powers_ : ndarray of shape (n_output_features, n_input_features)

    powers_[i, j] is the exponent of the jth input in the ith output.

    n_input_features_ : int

    The total number of input features.

    n_output_features_ : int

    The total number of polynomial output features. The number of output features is computed by iterating over all suitably sized combinations of input features.
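    The sketch below illustrates powers_ and n_output_features_ on a two-feature input. It rebuilds the transformed matrix from powers_ alone, then compares n_output_features_ with closed-form counts. Those counts, C(n + d, d) for the full expansion and the sum of C(n, k) for k = 0..d with interaction_only=True, are standard combinatorics and are not stated in the text above, so treat them as a cross-check rather than documented behaviour.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)
n, d = X.shape[1], 2

poly = PolynomialFeatures(d).fit(X)
print(poly.powers_)  # one row of exponents per output feature

# Output column i is the product over inputs j of X[:, j] ** powers_[i, j]
rebuilt = np.prod(X[:, np.newaxis, :] ** poly.powers_[np.newaxis, :, :], axis=2)
print(np.allclose(rebuilt, poly.transform(X)))

# Compare the fitted counts with the closed-form counts
inter = PolynomialFeatures(d, interaction_only=True).fit(X)
expected_full = comb(n + d, d)
expected_inter = sum(comb(n, k) for k in range(d + 1))
print(poly.n_output_features_, expected_full)
print(inter.n_output_features_, expected_inter)
```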

    Methods

    fit(X[, y])

    Compute number of output features.

    fit_transform(X[, y])

    Fit to data, then transform it.

    get_feature_names([input_features])

    Return feature names for output features.

    get_params([deep])

    Get parameters for this estimator.

    set_params(**params)

    Set the parameters of this estimator.

    transform(X)

    Transform data to polynomial features.
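    The methods above are typically used together: fit on one array, transform another array with the same number of columns, and look up the output feature names. The sketch below assumes that newer scikit-learn releases expose get_feature_names_out in place of the get_feature_names method listed here, and falls back to the older name when it is missing. The input names "a" and "b" are placeholders.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_train = np.arange(6).reshape(3, 2)
X_new = np.array([[1, 2]])

poly = PolynomialFeatures(2).fit(X_train)  # learns the number of input features
X_new_poly = poly.transform(X_new)         # reuses the fitted expansion
print(X_new_poly)

# Assumption: get_feature_names_out exists on newer releases,
# while older releases only provide get_feature_names.
if hasattr(poly, "get_feature_names_out"):
    names = list(poly.get_feature_names_out(["a", "b"]))
else:
    names = list(poly.get_feature_names(["a", "b"]))
print(names)
```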

    Examples

    >>> import numpy as np
    >>> from sklearn.preprocessing import PolynomialFeatures
    >>> X = np.arange(6).reshape(3, 2)
    >>> X
    array([[0, 1],
           [2, 3],
           [4, 5]])
    >>> poly = PolynomialFeatures(2)
    >>> poly.fit_transform(X)
    array([[ 1.,  0.,  1.,  0.,  0.,  1.],
           [ 1.,  2.,  3.,  4.,  6.,  9.],
           [ 1.,  4.,  5., 16., 20., 25.]])
    >>> poly = PolynomialFeatures(interaction_only=True)
    >>> poly.fit_transform(X)
    array([[ 1.,  0.,  1.,  0.],
           [ 1.,  2.,  3.,  6.],
           [ 1.,  4.,  5., 20.]])
  • Original article: https://www.cnblogs.com/qiu-hua/p/14903580.html