• R_Studio(神经网络)BP神经网络算法预测销量的高低


      BP神经网络  百度百科:传送门

      BP(back propagation)神经网络:一种按照误差逆向传播算法训练的多层前馈神经网络,是目前应用最广泛的神经网络

      

    #设置文件工作区间
    setwd('D:\dat') 
    #读入数据
    Gary=read.csv("sales_data.csv")[,2:5]
    #数据命名
    library(nnet)
    colnames(Gary)<-c("x1","x2","x3","y")
    
    ###最终模型
    model1=nnet(y~.,data=Gary,size=6,decay=5e-4,maxit=1000)  
    
    pred=predict(model1,Gary[,1:3],type="class")    
    (P=sum(as.numeric(pred==Gary$y))/nrow(Gary))
    table(Gary$y,pred)
    prop.table(table(Gary$y,pred),1)
    Gary.Script

    实现过程

      目的:通过BP神经网络预测销量的高低

      数据预处理,对数据进行重命名并去除无关项

    > #设置文件工作区间
    > setwd('D:\dat') 
    > #读入数据
    > Gary=read.csv("sales_data.csv")[,2:5]
    > #数据命名
    > library(nnet)
    > colnames(Gary)<-c("x1","x2","x3","y")
    > Gary
         x1  x2  x3    y
    1   bad yes yes high
    2   bad yes yes high
    3   bad yes yes high
    4   bad  no yes high
    5   bad yes yes high
    6   bad  no yes high
    7   bad yes  no high
    8  good yes yes high
    9  good yes  no high
    10 good yes yes high
    11 good yes yes high
    12 good yes yes high
    13 good yes yes high
    14  bad yes yes  low
    15 good  no yes high
    16 good  no yes high
    17 good  no yes high
    18 good  no yes high
    19 good  no  no high
    20  bad  no  no  low
    21  bad  no yes  low
    22  bad  no yes  low
    23  bad  no yes  low
    24  bad  no  no  low
    25  bad yes  no  low
    26 good  no yes  low
    27 good  no yes  low
    28  bad  no  no  low
    29  bad  no  no  low
    30 good  no  no  low
    31  bad yes  no  low
    32 good  no yes  low
    33 good  no  no  low
    34 good  no  no  low

      nnet:包实现了前馈神经网络和多项对数线性模型。前馈神经网络是一种常用的神经网络结构,如下图所示

      

      前馈网络中各个神经元按接受信息的先后分为不同的组。每一组可以看作一个神经层。每一层中的神经元接受前一层神经元的输出,并输出到下一层神经元。整个网络中的信息是朝一个方向传播,没有反向的信息传播。前馈网络可以用一个有向无环路图表示。前馈网络可以看作一个函数,通过简单非线性函数的多次复合,实现输入空间到输出空间的复杂映射。这种网络结构简单,易于实现。前馈网络包括全连接前馈网络和卷积神经网络等

    使用neet()方法创建模型

      neet()方法:

      decay: 权重衰减parameter for weight decay. Default 0.

      maxit: 最大迭代次数

    x: 训练样本数据集的输入集合
    y: x对应的训练样本数据集的标签(类)集合
    weights: 
    size: 隐层节点数, Can be zero if there are skip-layer units.
    data:训练数据集.
    subset: An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
    na.action: A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
    contrasts:a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.
    Wts: 边的权重. If missing chosen at random.
    mask: logical vector indicating which parameters should be optimized (default all).
    linout: 是否为逻辑输出单元,若为FALSE,则为线性输出单元
    entropy: switch for entropy (= maximum conditional likelihood) fitting. Default by least-squares.
    softmax: switch for softmax (log-linear model) and maximum conditional likelihood fitting. linout, entropy, softmax and censored are mutually exclusive.
    censored: A variant on softmax, in which non-zero targets mean possible classes. Thus for softmax a row of (0, 1, 1) means one example each of classes 2 and 3, but for censored it means one example whose class is only known to be 2 or 3.
    skip: switch to add skip-layer connections from input to output.
    rang: Initial random weights on [-rang, rang]. Value about 0.5 unless the inputs are large, in which case it should be chosen so that rang * max(|x|) is about 1.
    decay: 权重衰减parameter for weight decay. Default 0.
    maxit: 最大迭代次数
    Hess: If true, the Hessian of the measure of fit at the best set of weights found is returned as component Hessian.
    trace: switch for tracing optimization. Default TRUE.
    MaxNWts: The maximum allowable number of weights. There is no intrinsic limit in the code, but increasing MaxNWts will probably allow fits that are very slow and time-consuming.
    abstol: Stop if the fit criterion falls below abstol, indicating an essentially perfect fit.
    reltol: Stop if the optimizer is unable to reduce the fit criterion by a factor of at least 1 - reltol.
    neet()方法
    model1=nnet(y~.,data=Gary,size=6,decay=5e-4,maxit=1000)
    # weights:  31
    initial  value 27.073547 
    iter  10 value 16.080731
    iter  20 value 15.038060
    iter  30 value 14.937127
    iter  40 value 14.917485
    iter  50 value 14.911531
    iter  60 value 14.908678
    iter  70 value 14.907836
    iter  80 value 14.905234
    iter  90 value 14.904499
    iter 100 value 14.904028
    iter 110 value 14.903688
    iter 120 value 14.903480
    iter 130 value 14.903450
    iter 130 value 14.903450
    iter 130 value 14.903450
    final  value 14.903450 
    converged

    评估模型

    > pred=predict(model1,Gary[,1:3],type="class")
    > (P=sum(as.numeric(pred==Gary$y))/nrow(Gary))
    [1] 0.7647059
    > table(Gary$y,pred)
          pred
           high low
      high   14   4
      low     4  12
    > prop.table(table(Gary$y,pred),1)
          pred
                high       low
      high 0.7777778 0.2222222
      low  0.2500000 0.7500000

      得到混淆矩阵图后可以看出

      检测样本为34个,预测正确的个数为26个,预测准确率为76.5%,预测准确率较低

      原因:由于神经网络训练时需要较多的样本,而这里训练集比较少 

    (如需转载学习,请标明出处)
  • 相关阅读:
    linux centos6.4 php连接sql server2008
    Windows下连接php5.3+sql server2008
    解决:安装SQl 2008为SQL Server代理服务提供的凭据无效
    Linux下查看文件夹或目录大小
    Sql 子查询
    Linux 删除空行
    shell中的IFS详解
    Linux 文件名匹配
    Linux Shell逻辑运算符和表达式详解
    转:shell 经典, shell 十三问
  • 原文地址:https://www.cnblogs.com/1138720556Gary/p/10040710.html
Copyright © 2020-2023  润新知