• Machine Learning (Zhou Zhihua): Solution to Exercise 5.5


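    The exercise asks for a BP (backpropagation) neural network fitted to the watermelon dataset. I have already encoded the dataset numerically, as follows:

    id,color,root,knock,texture,navel,touch,density,sugar_content,good_melon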
    1,1,1,3,1,1,1,0.697,0.46,1
    2,2,1,2,1,1,1,0.774,0.376,1
    3,2,1,3,1,1,1,0.634,0.264,1
    4,1,1,2,1,1,1,0.608,0.318,1
    5,3,1,3,1,1,1,0.556,0.215,1
    6,1,2,3,1,2,2,0.403,0.237,1
    7,2,2,3,2,2,2,0.481,0.149,1
    8,2,2,3,1,2,1,0.437,0.211,1
    9,2,2,2,2,2,1,0.666,0.091,0
    10,1,3,1,1,3,2,0.243,0.267,0
    11,3,3,1,3,3,1,0.245,0.057,0
    12,3,1,3,3,3,2,0.343,0.099,0
    13,1,2,3,2,1,1,0.639,0.161,0
    14,3,2,2,2,1,1,0.657,0.198,0
    15,2,2,3,1,2,2,0.36,0.37,0
    16,3,1,3,3,3,1,0.593,0.042,0
    17,1,1,2,2,2,1,0.719,0.103,0
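
    Before training anything, the table needs to be split into a feature matrix and a label vector. Here is a small sketch of that step using only NumPy. The function name load_watermelon and the English column names in the header are my own choices for illustration; the rows are copied from the table above.

```python
import io
import numpy as np

# The encoded table from this post; the English header names are illustrative.
CSV_TEXT = """id,color,root,knock,texture,navel,touch,density,sugar_content,good_melon
1,1,1,3,1,1,1,0.697,0.46,1
2,2,1,2,1,1,1,0.774,0.376,1
3,2,1,3,1,1,1,0.634,0.264,1
4,1,1,2,1,1,1,0.608,0.318,1
5,3,1,3,1,1,1,0.556,0.215,1
6,1,2,3,1,2,2,0.403,0.237,1
7,2,2,3,2,2,2,0.481,0.149,1
8,2,2,3,1,2,1,0.437,0.211,1
9,2,2,2,2,2,1,0.666,0.091,0
10,1,3,1,1,3,2,0.243,0.267,0
11,3,3,1,3,3,1,0.245,0.057,0
12,3,1,3,3,3,2,0.343,0.099,0
13,1,2,3,2,1,1,0.639,0.161,0
14,3,2,2,2,1,1,0.657,0.198,0
15,2,2,3,1,2,2,0.36,0.37,0
16,3,1,3,3,3,1,0.593,0.042,0
17,1,1,2,2,2,1,0.719,0.103,0
"""


def load_watermelon(text):
    """Return (X, y): X holds the 8 feature columns as floats, y the 0/1 labels."""
    table = np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1)
    X = table[:, 1:-1]            # drop the ID column and the label column
    y = table[:, -1].astype(int)  # the last column is the good-melon label
    return X, y


X, y = load_watermelon(CSV_TEXT)
print(X.shape, y.shape, int(y.sum()))
```

    The ID column is dropped because it only numbers the rows and carries no information about the melon.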

    I then use the pybrain library to build a neural network with a single hidden layer of 50 units:

    
    
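    #!/usr/bin/python
    # -*- coding:utf-8 -*-
    from __future__ import print_function
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import colors
    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.supervised.trainers import BackpropTrainer
    from pybrain.datasets import SupervisedDataSet

    # strip() also removes the trailing newline, so the label column compares cleanly with '1'
    with open('c:quantwatermelon.csv','r') as file1:
        data = [line.strip().split(',') for line in file1 if line.strip()]
    data = np.array(data)
    # Skip the header row, the ID column and the label column; convert the features to float
    X = data[1:,1:-1].astype(float)
    y = np.array([1 if raw[-1]=='1' else 0 for raw in data[1:]])
    print(X, y)
    ####################################################################### The watermelon data is loaded above

    # 8 inputs, 50 hidden units, 1 output
    fnn = buildNetwork(8,50,1)
    DS = SupervisedDataSet(8,1)
    for a,b in zip(X,y):
        DS.addSample(a,b)
    # The trainer uses the BP algorithm.
    # verbose=True prints the total error during training. trainUntilConvergence holds out part of
    # the data for validation (validationProportion, 0.25 by default); pass that argument to change it.
    trainer = BackpropTrainer(fnn, DS, verbose = True, learningrate=0.01)

    # maxEpochs is the upper limit on training epochs; training runs until convergence or until
    # this limit is reached. I usually set it to 1000; this run uses 10000.
    trainer.trainUntilConvergence(maxEpochs=10000)

    # activate returns the trained network's output for one input sample
    for a,b in zip(X,y):
        prediction = fnn.activate(a)
        print(prediction, b)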
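
    Since pybrain hides the actual weight updates, here is a rough NumPy-only sketch of per-sample backpropagation for a network of the same 8-50-1 shape. The sigmoid hidden layer, the linear output, the zero-initialised output weights, the squared-error loss and the names train_bp and predict are all my own assumptions for illustration, not pybrain's implementation. It also trains on all 17 samples with no validation split, so its numbers should not be expected to match the outputs in this post.

```python
import numpy as np

# The 17 encoded samples from the table above, without the ID column:
# 8 features followed by the 0/1 label.
ROWS = """
1,1,3,1,1,1,0.697,0.46,1
2,1,2,1,1,1,0.774,0.376,1
2,1,3,1,1,1,0.634,0.264,1
1,1,2,1,1,1,0.608,0.318,1
3,1,3,1,1,1,0.556,0.215,1
1,2,3,1,2,2,0.403,0.237,1
2,2,3,2,2,2,0.481,0.149,1
2,2,3,1,2,1,0.437,0.211,1
2,2,2,2,2,1,0.666,0.091,0
1,3,1,1,3,2,0.243,0.267,0
3,3,1,3,3,1,0.245,0.057,0
3,1,3,3,3,2,0.343,0.099,0
1,2,3,2,1,1,0.639,0.161,0
3,2,2,2,1,1,0.657,0.198,0
2,2,3,1,2,2,0.36,0.37,0
3,1,3,3,3,1,0.593,0.042,0
1,1,2,2,2,1,0.719,0.103,0
"""
data = np.array([[float(v) for v in row.split(",")] for row in ROWS.split()])
X, y = data[:, :-1], data[:, -1]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict(X, params):
    """Forward pass: sigmoid hidden layer followed by a linear output unit."""
    W1, b1, W2, b2 = params
    return sigmoid(X @ W1 + b1) @ W2 + b2


def train_bp(X, y, hidden=50, lr=0.01, epochs=500, seed=0):
    """Per-sample BP on squared error; returns the weights and the MSE history."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))  # small random input weights
    b1 = np.zeros(hidden)
    W2 = np.zeros(hidden)                            # output weights start at zero
    b2 = 0.0
    # history[0] is the MSE before any update, then one entry per epoch
    history = [float(np.mean((predict(X, (W1, b1, W2, b2)) - y) ** 2))]
    for _ in range(epochs):
        for i in rng.permutation(len(X)):            # visit samples in random order
            h = sigmoid(X[i] @ W1 + b1)              # hidden activations
            err = h @ W2 + b2 - y[i]                 # derivative of 0.5 * err**2 w.r.t. the output
            grad_h = err * W2 * h * (1.0 - h)        # error propagated back through the sigmoid
            W2 -= lr * err * h
            b2 -= lr * err
            W1 -= lr * np.outer(X[i], grad_h)
            b1 -= lr * grad_h
        history.append(float(np.mean((predict(X, (W1, b1, W2, b2)) - y) ** 2)))
    return (W1, b1, W2, b2), history


params, history = train_bp(X, y)
print("MSE before training: %.4f, after: %.4f" % (history[0], history[-1]))
for output, label in zip(predict(X, params), y):
    print("%.4f %d" % (output, label))
```

    Note that grad_h is computed before W2 is updated, so the hidden-layer gradient uses the same weights that produced the output. Raising epochs should drive the training error down further, which is the same trade-off as the maxEpochs setting in the pybrain script.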
     

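    Here is how the results compare after training for 10000 and for 1000 epochs:

    Trained for 10000 epochs; the left column is the network output, the right column is the target label: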
    [ 0.99417443] 1
    [ 0.99774329] 1
    [ 1.00390992] 1
    [ 0.99456691] 1
    [ 0.99167349] 1
    [ 0.99627566] 1
    [-0.16419402] 1
    [ 0.99678622] 1
    [-0.00259512] 0
    [ 1.46741515] 0
    [ 0.57305884] 0
    [-0.00284737] 0
    [-0.0029103] 0
    [-0.00400758] 0
    [ 1.19899233] 0
    [-0.00333452] 0
    [-0.26766382] 0
    Trained for 1000 epochs; the left column is the network output, the right column is the target label:
    [ 0.3439171] 1
    [ 0.71063964] 1
    [ 0.86324691] 1
    [ 0.39205173] 1
    [ 0.97416348] 1
    [ 0.55886924] 1
    [ 0.1247508] 1
    [ 1.6945434] 1
    [ 0.38352444] 0
    [ 0.7585709] 0
    [-0.23212559] 0
    [ 0.03274158] 0
    [-0.33641601] 0
    [-0.65105817] 0
    [ 1.22768539] 0
    [ 0.11638493] 0
    [-0.13244805] 0

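    The 10000-epoch run matches the targets far more closely on most samples (12 of the 17 outputs are within 0.01 of the label), but the remaining five are far off, so it may well be overfitting.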
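
    To put numbers on this comparison, the two output columns above can be scored directly. This is only a sketch: the 0.5 decision threshold and the function name summarize are my own assumptions, and the outputs are copied from the two listings above.

```python
import numpy as np

# Target labels for samples 1..17: the first 8 are good melons, the rest are not.
labels = np.array([1] * 8 + [0] * 9)

# Network outputs copied from the two listings above.
out_10000 = np.array([
    0.99417443, 0.99774329, 1.00390992, 0.99456691, 0.99167349, 0.99627566,
    -0.16419402, 0.99678622, -0.00259512, 1.46741515, 0.57305884, -0.00284737,
    -0.0029103, -0.00400758, 1.19899233, -0.00333452, -0.26766382,
])
out_1000 = np.array([
    0.3439171, 0.71063964, 0.86324691, 0.39205173, 0.97416348, 0.55886924,
    0.1247508, 1.6945434, 0.38352444, 0.7585709, -0.23212559, 0.03274158,
    -0.33641601, -0.65105817, 1.22768539, 0.11638493, -0.13244805,
])


def summarize(outputs, labels, threshold=0.5):
    """Return (number classified correctly at the threshold, mean squared error)."""
    predicted = (outputs >= threshold).astype(int)
    correct = int((predicted == labels).sum())
    mse = float(np.mean((outputs - labels) ** 2))
    return correct, mse


for name, outputs in [("10000 epochs", out_10000), ("1000 epochs", out_1000)]:
    correct, mse = summarize(outputs, labels)
    print("%s: %d/%d correct, MSE %.4f" % (name, correct, len(labels), mse))
```

    On these numbers I get 13 of 17 correct for the 10000-epoch run and 12 of 17 for the 1000-epoch run, with an overall MSE of about 0.31 in both cases. The longer run is much closer on most samples, but a handful of large misses pull its average error back up.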

    Reference: http://www.zengmingxia.com/use-pybrain-to-fit-neural-networks/

  • Original post: https://www.cnblogs.com/zhusleep/p/5634655.html