• Iris Dataset Analysis


    The Iris dataset contains 150 rows in total, covering three species.

    Species      Numeric label
    setosa       0
    versicolor   1
    virginica    2
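
The mapping above can be checked against the dataset itself. This is a minimal sketch, assuming the copy of the data bundled with scikit-learn (`load_iris`), which the plotting code in this post also uses; the variable names `iris`, `label_to_name`, and `counts` are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# target_names is ordered by numeric label, so index i is the name for label i
label_to_name = {i: str(name) for i, name in enumerate(iris.target_names)}
print(label_to_name)

# Number of rows carrying each numeric label
counts = np.bincount(iris.target)
print(counts)
```

`np.bincount` works here because the labels are small non-negative integers.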

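    The four features

    Feature              Meaning
    sepal length (cm)    length of the sepal, in centimeters
    sepal width (cm)     width of the sepal, in centimeters
    petal length (cm)    length of the petal, in centimeters
    petal width (cm)     width of the petal, in centimeters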
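
To get a feel for the range of each feature before plotting, a short summary can help. The following is a sketch under the assumption that the data comes from scikit-learn's `load_iris`; the names `iris` and `X` are illustrative and not part of the original code.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data  # shape (150, 4): one row per flower, one column per feature

# X.T iterates over columns, so each name is paired with its own feature column
for name, column in zip(iris.feature_names, X.T):
    print(f"{name}: min={column.min():.1f}, max={column.max():.1f}, mean={column.mean():.2f}")
```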

    Scatter plots for each pair of features

    Histograms of each feature

    Radar charts of the features
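
A radar chart is more commonly drawn with one spoke per feature rather than one per sample. As an alternative sketch (this is not the layout used by the code in this post), the following plots the per-species mean of each feature. The helper names `species_means` and `plot_radar` are hypothetical, and the data is assumed to come from scikit-learn's `load_iris`.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris


def species_means(data, target):
    """Return an array of shape (n_species, n_features) holding per-species means."""
    return np.array([data[target == k].mean(axis=0) for k in np.unique(target)])


def plot_radar(means, feature_names, species_names):
    """Draw one closed polygon per species on a polar axis, one spoke per feature."""
    n_features = means.shape[1]
    angles = np.linspace(0, 2 * np.pi, n_features, endpoint=False)
    closed_angles = np.append(angles, angles[0])  # repeat the first angle to close the polygon

    ax = plt.subplot(111, projection='polar')
    for row, name in zip(means, species_names):
        values = np.append(row, row[0])  # repeat the first value to close the polygon
        ax.plot(closed_angles, values, 'o-', linewidth=1, label=name)
        ax.fill(closed_angles, values, alpha=0.25)
    ax.set_thetagrids(np.degrees(angles), feature_names)  # one tick label per feature
    ax.legend(loc='upper right')
    plt.show()


iris = load_iris()
means = species_means(iris.data, iris.target)
plot_radar(means, iris.feature_names, iris.target_names)
```

With only four spokes, each polygon summarizes one species, which makes the three classes directly comparable on a single chart.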

    The classification code is in a separate post.

    The plotting code is as follows:

    '''
    datetime: 2020/6/14
    author: wuxiong
    description: Iris dataset classification
    '''
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.datasets import load_iris

    # Load the Iris dataset
    data = load_iris()

    # Keys available in the dataset object
    print(data.keys())
    # Uncomment to inspect the individual fields
    # print(data['data'])
    # print(data['DESCR'])
    # print(data['target_names'])
    # print(data['feature_names'])

    # Convert to NumPy arrays
    data_numpy = np.array(data['data'])
    target = np.array(data['target'])
    # Column 0: sepal length
    sepal_lenth = data_numpy[:, 0]
    # Column 1: sepal width
    sepal_width = data_numpy[:, 1]
    # Column 2: petal length
    petal_length = data_numpy[:, 2]
    # Column 3: petal width
    petal_width = data_numpy[:, 3]
    
    # Pair each feature column with its axis label
    sepal_lenth_feature =[sepal_lenth,'sepal length']
    sepal_width_feature =[sepal_width,'sepal width']
    petal_length_feature =[petal_length,'petal length']
    petal_width_feature =[petal_width,'petal width']
    
    features=[sepal_lenth_feature,sepal_width_feature,petal_length_feature,petal_width_feature]
    
    colors1 = '#00CED1'  # point color for setosa
    colors2 = '#DC143C'  # point color for versicolor
    clores3 = '#4fd424'  # point color for virginica

    area = np.pi * 4**2  # marker area
    # Scatter plot of one feature pair, colored by species (12 plots in total)
    def drawScatter(target,x,y,xlabel,ylabel):
        species = [(0, colors1, 'setosa'), (1, colors2, 'versicolor'), (2, clores3, 'virginica')]
        for value, color, name in species:
            # Select all samples of this species at once with a boolean mask
            mask = (target == value)
            plt.scatter(x[mask], y[mask], s=area, c=color, alpha=0.4, label=name)
        plt.xlabel(xlabel)
        plt.ylabel(ylabel)
        plt.legend()
        plt.show()
    
    # Histogram of one feature (4 plots in total)
    def drawHistogram(target,x_feature):
        values = x_feature[0]
        xlabel = x_feature[1]
        # density=False plots raw counts (the older normed argument is no longer accepted)
        plt.hist(values, bins=50, density=False, facecolor="blue", edgecolor="black", alpha=0.7)
        plt.xlabel(xlabel)
        plt.ylabel("frequency")
        plt.title("{} histogram".format(xlabel))
        plt.show()
    
    # Radar chart 1: every sample is one angle on a polar axis (1 plot)
    def drawRader1(target,sepal_lenth,sepal_width,petal_length,petal_width):
        # Polar line plot with filled area, drawn with ax.plot() and ax.fill()
        plt.figure(figsize=(16,8))
        ax1= plt.subplot(111, projection='polar')
        ax1.set_title('features radar map\n')  # chart title
        ax1.set_rlim(0,12)
        # One evenly spaced angle per sample
        theta=np.linspace(0, 2*np.pi, len(sepal_lenth), endpoint=False)
        series = [(sepal_lenth,'sepal length'),(sepal_width,'sepal width'),
                  (petal_length,'petal length'),(petal_width,'petal width')]
        for values,label in series:
            ax1.plot(theta,values,'.--',label=label)
            ax1.fill(theta,values,alpha=0.2)
        ax1.legend()
        # Show this figure now so later plots do not draw onto its polar axes
        plt.show()
        
    # Radar chart 2: the same data drawn with plt.polar(), each series closed into a loop
    def drawRader2(target,sepal_lenth,sepal_width,petal_length,petal_width):
        labels = ['sepal length','sepal width','petal length','petal width']  # one label per series
        series = [sepal_lenth,sepal_width,petal_length,petal_width]
        dataLenth = len(sepal_lenth)  # number of samples

        angles = np.linspace(0, 2*np.pi, dataLenth, endpoint=False)  # split the circle evenly
        angles = np.concatenate((angles, [angles[0]]))  # close the loop

        plt.figure(figsize=(16,8))
        for values,label in zip(series,labels):
            values = np.concatenate((values, [values[0]]))  # close the loop
            plt.polar(angles, values, 'o-', linewidth=1, label=label)  # plot in polar coordinates
            plt.fill(angles, values, alpha=0.25)  # fill the enclosed area

        # The angular axis indexes samples, so the series are identified in a legend
        plt.legend()
        plt.ylim(0,10)  # radial limit of the polar axis
        # Show this figure now so later plots do not draw onto its polar axes
        plt.show()
    
    drawRader1(target,sepal_lenth,sepal_width,petal_length,petal_width)
    drawRader2(target,sepal_lenth,sepal_width,petal_length,petal_width)

    for i,x_feature in enumerate(features):
        drawHistogram(target,x_feature)
        # Pair this feature with each of the other three for the scatter plots
        others = features.copy()
        others.pop(i)
        for y_feature in others:
            drawScatter(target,x_feature[0],y_feature[0],x_feature[1],y_feature[1])
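
The loop above draws 12 scatter plots because it visits every ordered pair of distinct features, so each pair appears twice with the axes swapped. If one plot per pair is enough, the pairs can be enumerated as in the following sketch; the list `feature_names` is an illustrative stand-in for the `features` list in the code above.

```python
from itertools import combinations, permutations

feature_names = ['sepal length', 'sepal width', 'petal length', 'petal width']

ordered_pairs = list(permutations(feature_names, 2))    # (x, y) and (y, x) both included
unordered_pairs = list(combinations(feature_names, 2))  # each pair exactly once

print(len(ordered_pairs))
print(len(unordered_pairs))
for x_name, y_name in unordered_pairs:
    print(x_name, 'vs', y_name)
```

Iterating over `combinations` halves the number of figures while still covering every feature pair.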
    
  • Original post: https://www.cnblogs.com/realwuxiong/p/13126881.html