• Random Forest And Extra Trees


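    Random Forest

    An ensemble built from decision trees, each trained on a random sample of the data, has a vivid name: random forest.

    The random forest in scikit-learn adds a second source of randomness: when splitting a decision tree node, it searches for the best split feature within a random subset of the features rather than among all of them.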
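
    The per-split feature subset can be observed on a fitted forest. The following is a minimal sketch, not part of the original post: the names `X_demo`, `y_demo`, `rf_demo`, and `features_per_split` are illustrative, and `max_features="sqrt"` is passed explicitly so that the subset size does not depend on the library version's default.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

# Toy data with only two features (same generator as used in this post).
X_demo, y_demo = datasets.make_moons(n_samples=500, noise=0.3, random_state=666)

# max_features="sqrt" asks every split to consider about sqrt(n_features) candidates.
rf_demo = RandomForestClassifier(n_estimators=10, max_features="sqrt", random_state=666)
rf_demo.fit(X_demo, y_demo)

# Each fitted tree records how many features it examines per split.
features_per_split = {tree.max_features_ for tree in rf_demo.estimators_}
print(features_per_split)
```

    With two input features, each split should end up examining a single randomly chosen feature, so the printed set is expected to contain only 1.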

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import datasets
    
    X, y = datasets.make_moons(n_samples=500, noise=0.3, random_state=666)
    
    plt.scatter(X[y==0, 0], X[y==0, 1])
    plt.scatter(X[y==1, 0], X[y==1, 1])
    plt.show()
    

    [Figure: scatter plot of the two classes in the make_moons dataset]

    from sklearn.ensemble import RandomForestClassifier
    
    rf_clf = RandomForestClassifier(n_estimators=500, random_state=666, oob_score=True)
    rf_clf.fit(X, y)
    

    RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=1, oob_score=True, random_state=666, verbose=0, warm_start=False)

    rf_clf.oob_score_
    

    0.892

    Customizing some of the decision tree parameters

    rf_clf2 = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16,
                                     random_state=666, oob_score=True)
    rf_clf2.fit(X, y)
    rf_clf2.oob_score_
    

    0.906
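
    Tree-level parameters such as `max_leaf_nodes` are handed down to every tree in the forest. A small sketch of how this could be checked, using a smaller forest than above to keep it quick (`capped` and `leaf_counts` are illustrative names, not from the original post):

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

X_demo, y_demo = datasets.make_moons(n_samples=500, noise=0.3, random_state=666)

# The forest passes max_leaf_nodes on to each of its decision trees.
capped = RandomForestClassifier(n_estimators=20, max_leaf_nodes=16, random_state=666)
capped.fit(X_demo, y_demo)

# Count the leaves of every fitted tree; none should exceed the cap of 16.
leaf_counts = [tree.get_n_leaves() for tree in capped.estimators_]
print(max(leaf_counts))
```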

    Extra-Trees

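    When splitting a decision tree node, Extra-Trees uses a random feature and a random threshold.

    The randomness is more extreme than in a random forest.

    The extra randomness suppresses overfitting, at the cost of increased bias.

    Training is faster, since no search for the optimal threshold is needed.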
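
    The difference in split strategy is visible on the fitted base trees. The sketch below is an illustration under the assumption that inspecting the `splitter` parameter of the individual trees is a fair proxy for the behavior described above; `rf_demo`, `et_demo`, `rf_splitter`, and `et_splitter` are illustrative names.

```python
from sklearn import datasets
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

X_demo, y_demo = datasets.make_moons(n_samples=500, noise=0.3, random_state=666)

rf_demo = RandomForestClassifier(n_estimators=10, random_state=666).fit(X_demo, y_demo)
et_demo = ExtraTreesClassifier(n_estimators=10, random_state=666).fit(X_demo, y_demo)

# Random forest trees search for the best threshold on each candidate feature;
# extra-trees draw the threshold at random instead.
rf_splitter = rf_demo.estimators_[0].splitter
et_splitter = et_demo.estimators_[0].splitter
print(rf_splitter, et_splitter)
```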

    from sklearn.ensemble import ExtraTreesClassifier
    
    et_clf = ExtraTreesClassifier(n_estimators=500, bootstrap=True,
                                  random_state=666, oob_score=True)
    et_clf.fit(X, y)
    

    ExtraTreesClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=1, oob_score=True, random_state=666, verbose=0, warm_start=False)

    et_clf.oob_score_
    

    0.892

    Ensemble learning for regression problems

    from sklearn.ensemble import BaggingRegressor
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.ensemble import ExtraTreesRegressor
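
    The regressors are used the same way as the classifiers. The post does not include a regression dataset, so the sketch below assumes a synthetic noisy sine curve; `X_reg`, `y_reg`, `regressors`, and `oob_r2` are illustrative names. For regressors, `oob_score_` is an R² value rather than an accuracy.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor, RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data: one feature, noisy sine target.
rng = np.random.RandomState(666)
X_reg = rng.uniform(-3.0, 3.0, size=(500, 1))
y_reg = np.sin(X_reg[:, 0]) + rng.normal(0.0, 0.1, size=500)

regressors = {
    "bagging": BaggingRegressor(DecisionTreeRegressor(), n_estimators=100,
                                bootstrap=True, oob_score=True, random_state=666),
    "random_forest": RandomForestRegressor(n_estimators=100, oob_score=True,
                                           random_state=666),
    # Extra-trees do not bootstrap by default, so it is enabled here for the OOB score.
    "extra_trees": ExtraTreesRegressor(n_estimators=100, bootstrap=True,
                                       oob_score=True, random_state=666),
}

oob_r2 = {}
for name, reg in regressors.items():
    reg.fit(X_reg, y_reg)
    oob_r2[name] = reg.oob_score_  # out-of-bag R^2
    print(name, round(oob_r2[name], 3))
```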
    
  • Original post: https://www.cnblogs.com/lijianming180/p/12275801.html