https://blog.csdn.net/xiezhen_zheng/article/details/82011908
Reference: feature selection methods
https://blog.csdn.net/m0_37316673/article/details/107524247
# Load the data and inspect it
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv('D:/Users/FengZH2/Desktop/test/testdata.csv', encoding='gbk')
df.info()

# The first column is the label, the remaining columns are the features
x, y = df.iloc[:, 1:].values, df.iloc[:, 0].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)
feat_labels = df.columns[1:]

# Fit a random forest and read its impurity-based feature importances
forest = RandomForestClassifier(n_estimators=10000, random_state=0, n_jobs=-1)
forest.fit(x_train, y_train.astype('int'))
importances = forest.feature_importances_

# Rank the features from most to least important and print them
indices = np.argsort(importances)[::-1]
for f in range(x_train.shape[1]):
    print("%2d) %-*s %f" % (f + 1, 30, feat_labels[indices[f]], importances[indices[f]]))

# Keep only the columns whose importance exceeds the threshold
threshold = 0.15
x_selected = x_train[:, importances > threshold]
print(x_selected.shape)

# Plot the importances as a horizontal bar chart
plt.figure(1)
plt.title('Feature Importances')
plt.barh(range(len(indices)), importances[indices], color='b', align='center')
plt.yticks(range(len(indices)), feat_labels[indices])
plt.xlabel('Relative Importance')
plt.show()
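The thresholding step can also be delegated to scikit-learn's SelectFromModel, which keeps every feature whose importance is greater than or equal to the given threshold and can report which columns survived. The following is a minimal sketch, assuming the fitted forest, x_train and feat_labels from the snippet above are already in scope.

from sklearn.feature_selection import SelectFromModel

# Sketch only: reuses the fitted `forest`, `x_train` and `feat_labels` defined above
selector = SelectFromModel(forest, threshold=0.15, prefit=True)
x_selected = selector.transform(x_train)        # keep columns with importance >= 0.15
print(x_selected.shape)
print(feat_labels[selector.get_support()])      # names of the retained features

Note that SelectFromModel uses ">= threshold" rather than the strict ">" comparison in the manual version above, so results can differ for features whose importance equals the threshold exactly.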