• sklearn Linear Regression


    1. Polynomial Regression

    "一元多项式回归": 自变量只有一个 "多元多项式回归": 自变量有多个。 

    Univariate polynomial of degree n: $\hat{y}=w_{0}+w_{1} x+ w_{2} x^{2}+\cdots+w_{n} x^{n}$

    Multivariate polynomial of higher degree (bivariate quadratic as an example): $\hat{y}=w_{0}+w_{1} x_{1}+ w_{2} x_{2}+w_{3}x_{1}^{2}+w_{4}x_{2}^{2}+w_{5} x_{1}x_{2}$
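
    As a quick illustration, here is a minimal sketch showing that sklearn's PolynomialFeatures expands two variables at degree 2 into exactly the six terms of the bivariate quadratic above (get_feature_names_out() requires scikit-learn >= 1.0; older versions use get_feature_names()):

    from sklearn.preprocessing import PolynomialFeatures

    poly = PolynomialFeatures(degree=2)
    expanded = poly.fit_transform([[2, 3]])  # one sample with x1=2, x2=3
    print(poly.get_feature_names_out())      # ['1' 'x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
    print(expanded)                          # [[1. 2. 3. 4. 6. 9.]]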

    2. Fitting a linear (degree-1) polynomial in n variables with sklearn

    $\hat{y}=w_{0}+\sum_{i=1}^{n}w_{i} x_{i}$

    from sklearn.linear_model import LinearRegression

    datasets_X = []  # e.g. [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    datasets_Y = []  # e.g. [6, 15, 24]
    with open('train.dat', 'r') as fr:
        for line in fr:
            items = line.strip().split('\t')
            datasets_X.append([float(ele) for ele in items[:-1]])
            datasets_Y.append(float(items[-1]))
    model = LinearRegression()
    model.fit(datasets_X, datasets_Y)
    # Load test data; here we simply reuse the training data
    X_test = datasets_X
    y_test = datasets_Y
    predictions = model.predict(X_test)
    for i, prediction in enumerate(predictions):
        print('Predict_value: %s, True_value: %s' % (prediction, y_test[i]))
    print('R-squared: %.2f' % model.score(X_test, y_test))
    print(model.coef_)       # coefficients of the regression equation
    print(model.intercept_)  # bias, the intercept of the equation

    model.coef_ is a list of parameters corresponding to the $n$ regression coefficients $w_{1}, w_{2}, \cdots, w_{n}$, and model.intercept_ is the bias (the intercept of the equation) $w_{0}$.
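
    As a quick sanity check (a minimal sketch continuing from the snippet above), you can rebuild one prediction by hand from the fitted parameters, confirming that model.coef_ holds $w_{1},\cdots,w_{n}$ and model.intercept_ holds $w_{0}$:

    import numpy as np

    x = np.array(datasets_X[0])
    manual = model.intercept_ + np.dot(model.coef_, x)  # w0 + sum(wi * xi)
    print(manual, model.predict([datasets_X[0]])[0])    # the two values should match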

    3. Fitting a degree-n polynomial in n variables with sklearn

    Taking a trivariate cubic polynomial as an example: there are 20 feature terms in total (one of them is the constant 1), plus one bias term $w_{0}$.

    The data looks like:

    datasets_X = [[1,2,3],[4,5,6],[7,8,9]]
    datasets_Y = [6,15,24]

    $\textbf{f}=[1, x_{1}, x_{2}, x_{3}, x_{1}x_{1}, x_{2}x_{2}, x_{3}x_{3}, x_{1}x_{2}, x_{2}x_{3}, x_{1}x_{3}, x_{1}x_{1}x_{2}, x_{1}x_{1}x_{3}, x_{2}x_{2}x_{1}, x_{2}x_{2}x_{3}, x_{3}x_{3}x_{1}, x_{3}x_{3}x_{2}, x_{1}x_{1}x_{1}, x_{2}x_{2}x_{2}, x_{3}x_{3}x_{3}, x_{1}x_{2}x_{3}]$

    $\hat{y}=w_{0}+\sum_{i=1}^{20}w_{i} f_{i}$

     (PS: in fact, just treat each element of $\textbf{f}$ as a new feature, so that each sample has 20 features; then it can be fit the same way as the n-variable linear case above, yielding 20 coefficients and one intercept.)

    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    datasets_X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    datasets_Y = [6, 15, 24]
    poly_feat = PolynomialFeatures(degree=3)
    datasets_X_poly = poly_feat.fit_transform(datasets_X)  # expand each sample into the 20 terms of f
    model = LinearRegression()
    model.fit(datasets_X_poly, datasets_Y)
    print(model.coef_)       # coefficients
    print(model.intercept_)  # bias
    
    # model.coef_ = [-5.86336535e-16  4.23053648e-03  4.23053648e-03  4.23053648e-03
    #   9.72135088e-03  1.39518874e-02  1.81824238e-02  1.81824238e-02
    #   2.24129603e-02  2.66434968e-02 -4.83347121e-02 -3.86133612e-02
    #  -2.88920103e-02 -2.46614738e-02 -1.07095864e-02  7.47283740e-03
    #  -6.47904996e-03  1.17033739e-02  3.41163342e-02  6.07598310e-02]
    # model.intercept_ = 3.400113258456912
    You can see that model.coef_ contains 20 parameters, one weight for each feature in $\textbf{f}$. The coefficients follow the column order produced by PolynomialFeatures, so instead of guessing which cross term each one belongs to, you can print the term names, as shown below. Note that the first coefficient is essentially 0 ($-5.86\times10^{-16}$): it weights the constant column 1, and LinearRegression already fits a separate intercept.
    That intercept is model.intercept_ = 3.400113258456912.
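
    A minimal sketch of how to inspect that mapping, continuing from the snippet above (get_feature_names_out() exists in scikit-learn >= 1.0; older versions use get_feature_names()):

    for name, w in zip(poly_feat.get_feature_names_out(), model.coef_):
        print('%-10s % .6g' % (name, w))
    # Prints one line per term ('1', 'x0', 'x1', ..., 'x0 x1 x2') with its
    # fitted weight; the weight of the constant column '1' is ~0 because
    # LinearRegression fits its own intercept.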