• Classifying iris flowers with a sklearn random forest


    #!/usr/bin/env python
    # coding: utf-8
    
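    # ### Import the random forest classifier and supporting libraries.
    from sklearn.ensemble import RandomForestClassifier          # random forest classifier
    # from sklearn.model_selection import train_test_split         # for splitting the data into training and test sets later
    from sklearn.preprocessing import StandardScaler             # feature standardization
    import numpy as np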
    
    # Load the iris data. Fields:
    # * Sepal.Length, in cm;
    # * Sepal.Width, in cm;
    # * Petal.Length, in cm;
    # * Petal.Width, in cm;
    # * Species: three in total, Iris Setosa, Iris Versicolour, and Iris Virginica
    
    from sklearn import datasets                     # sklearn's bundled datasets
    iris_data = datasets.load_iris()
    iris_feature = iris_data.data[::2]               # keep every second sample (75 of the 150)
    iris_target = iris_data.target[::2]
    
    # Standardize the features
    scaler = StandardScaler()  # standardization transformer
    # Compute the mean and std to be used for later scaling.
    scaler.fit(iris_feature)  # fit the scaler on the training features
    print(type(iris_target))  # <class 'numpy.ndarray'>
    iris_feature = scaler.transform(iris_feature)  # transform the dataset
    # feature_train, feature_test, target_train, target_test = train_test_split(iris_feature, iris_target, test_size=0.3, random_state=0)
    
    # Train the model
    clf = RandomForestClassifier()
    clf.fit(iris_feature, iris_target)
    # predict_results = clf.predict(feature_test)
    
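    # This sample is a class 0 flower (Iris Setosa)
    test_feature = np.array([5.5, 3.5, 1.3, 0.2]).reshape(1, -1)  # reshape to a 1 x n matrix; -1 lets NumPy infer n from the number of values
    print(test_feature)
    # scaler.fit(test_feature)  # do not refit the scaler on the test sample
    target_feature = scaler.transform(test_feature)  # transform with the scaler fitted on the training data
    print(clf.predict(target_feature))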
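
The script above trains on every second sample and checks a single hand-written flower, so it never measures accuracy. The commented-out `train_test_split` line hints at a hold-out evaluation; a minimal sketch of that idea follows. The 30% test size and `random_state=0` come from that commented line, while `stratify`, the `make_pipeline` wrapper, and the variable names are assumptions added for illustration, not part of the original post.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Hold out 30% of the samples; stratify keeps the three species balanced.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# The pipeline fits the scaler on the training data only, then fits the forest.
model = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))
model.fit(X_train, y_train)

# Score the model on samples it has not seen during training.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"hold-out accuracy: {accuracy:.3f}")

# Classify the same hand-written sample and map the class index to a species name.
sample = [[5.5, 3.5, 1.3, 0.2]]
predicted = model.predict(sample)[0]
print(iris.target_names[predicted])
```

Wrapping the scaler and the classifier in one pipeline means raw measurements can be passed straight to `predict`, and the scaling statistics are always the ones learned from the training split.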
    
    
    
  • Original post: https://www.cnblogs.com/laohaozi/p/12537662.html