• Deep Learning: UFLDL New Tutorial Study Notes 1: Linear Regression


    1 Preface

    Andrew Ng's UFLDL tutorial was updated at the end of September 2014.

    For anyone just starting out in Deep Learning, this is very good news!


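    Compared with the old tutorial, the new one adds material on Convolutional Neural Networks. Anyone familiar with the field knows how large an impact CNNs have had on Computer Vision.

    The content and exercises have also been reorganized.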


    The new UFLDL tutorial is at:

    http://ufldl.stanford.edu/tutorial/


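    2 Linear Regression: A Brief Look at the Theory

    Most readers are probably already familiar with Linear Regression. In short:

    In a linear regression problem, a target value y depends on a set of input values x. We look for the best-fitting hypothesis h to describe the relationship between y and x, and then use that hypothesis to predict the y that corresponds to a new input x.


    This is a simple optimization problem. We need a cost function that measures the gap between the y values in the training set and the y values predicted by h. Minimizing this cost function with Gradient Descent gives the optimal parameters of h, and hence the optimal h.

    Because the computer "learns" suitable parameters theta from the samples, this is one of the most basic machine learning algorithms.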
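The loop described above can be sketched in a few lines of MATLAB. This is only a minimal illustration under stated assumptions: the toy data, the step size `alpha`, and the iteration count are made up for the example, and the exercise itself hands the optimization to minFunc rather than to a hand-written loop.

```matlab
% Toy data (assumed for illustration): examples in columns,
% first row is the intercept feature, targets follow y = 1 + 2*x.
X = [ones(1,5); 1 2 3 4 5];      % 2x5
y = [3 5 7 9 11];                % 1x5

theta = zeros(2,1);              % initial parameters
alpha = 0.01;                    % step size, picked by hand (assumption)

for iter = 1:5000
    residual = theta' * X - y;   % 1x5, h(x) - y for every example
    grad = X * residual';        % 2x1, gradient of the cost function
    theta = theta - alpha * grad;
end

% On this noise-free data theta approaches [1; 2].
fprintf('theta = [%f %f]\n', theta);
```

A fixed step size only works when it is small enough for the data at hand, which is one reason the exercise uses a ready-made optimizer instead.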


    cost function:

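    $$J(\theta) = \frac{1}{2} \sum_i \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{1}{2} \sum_i \left( \theta^\top x^{(i)} - y^{(i)} \right)^2$$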

    Taking the partial derivative with respect to theta, that is, differentiating the cost function J(θ) as given above with respect to a particular parameter θj, gives us:

    $$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_i x_j^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$
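For readers who want the intermediate step, the gradient follows from the chain rule applied to one term of the sum at a time. The derivation below is a sketch that uses the same symbols as the cost function above. The matrix form on the last line assumes the convention used in the exercise code, where X stores one example per column and y is a row vector.

```latex
\begin{aligned}
\frac{\partial}{\partial \theta_j} \, \frac{1}{2} \left( \theta^\top x^{(i)} - y^{(i)} \right)^2
  &= \left( \theta^\top x^{(i)} - y^{(i)} \right)
     \frac{\partial}{\partial \theta_j} \Big( \sum_k \theta_k x_k^{(i)} - y^{(i)} \Big) \\
  &= x_j^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\[6pt]
\frac{\partial J(\theta)}{\partial \theta_j}
  &= \sum_i x_j^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\[6pt]
\nabla_\theta J(\theta)
  &= \sum_i x^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
   = X \left( \theta^\top X - y \right)^\top
\end{aligned}
```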

    3 Linear Regression Exercise

    3.1 Analysis of ex1a_linreg.m

    %
    %This exercise uses data from the UCI repository:
    % Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
    % http://archive.ics.uci.edu/ml
    % Irvine, CA: University of California, School of Information and Computer Science.
    %
    %Data created by:
    % Harrison, D. and Rubinfeld, D.L.
    % ''Hedonic prices and the demand for clean air''
    % J. Environ. Economics & Management, vol.5, 81-102, 1978.
    %
    addpath ../common
    addpath ../common/minFunc_2012/minFunc
    addpath ../common/minFunc_2012/minFunc/compiled
    
    % Load housing data from file.
    data = load('housing.data');  % housing data, 506x14
    data=data'; % put examples in columns, 14x506 (by convention, one example per column)
    
    % Include a row of 1s as an additional intercept feature.
    data = [ ones(1,size(data,2)); data ];  % 15x506, adds the intercept term
    
    % Shuffle examples so that the training and test sets are chosen at random.
    data = data(:, randperm(size(data,2))); % randperm(n) returns a random permutation of 1..n
    
    % Split into train and test sets
    % The last row of 'data' is the median home price.
    train.X = data(1:end-1,1:400);   % train on the first 400 examples, test on the rest
    train.y = data(end,1:400);
    
    test.X = data(1:end-1,401:end);
    test.y = data(end,401:end);
    
    m=size(train.X,2);  % number of training examples
    n=size(train.X,1);  % number of features per example
    
    % Initialize the coefficient vector theta to random values.
    theta = rand(n,1); % random initial theta, each entry in (0,1)
    
    % Run the minFunc optimizer with linear_regression.m as the objective.
    %
    % TODO:  Implement the linear regression objective and gradient computations
    % in linear_regression.m
    %
    tic; %Start a stopwatch timer.
    options = struct('MaxIter', 200);
    theta = minFunc(@linear_regression, theta, options, train.X, train.y);
    fprintf('Optimization took %f seconds.\n', toc); %toc Read the stopwatch timer
    
    % Run minFunc with linear_regression_vec.m as the objective.
    %
    % TODO:  Implement linear regression in linear_regression_vec.m
    % using MATLAB's vectorization features to speed up your code.
    % Compare the running time for your linear_regression.m and
    % linear_regression_vec.m implementations.
    %
    % Uncomment the lines below to run your vectorized code.
    %Re-initialize parameters
    %theta = rand(n,1);
    %tic;
    %theta = minFunc(@linear_regression_vec, theta, options, train.X, train.y);
    %fprintf('Optimization took %f seconds.\n', toc);
    
    % Plot predicted prices and actual prices from training set.
    actual_prices = train.y;
    predicted_prices = theta'*train.X;
    
    % Print out root-mean-squared (RMS) training error.
    train_rms=sqrt(mean((predicted_prices - actual_prices).^2));
    fprintf('RMS training error: %f\n', train_rms);
    
    % Print out test RMS error
    actual_prices = test.y;
    predicted_prices = theta'*test.X;
    test_rms=sqrt(mean((predicted_prices - actual_prices).^2));
    fprintf('RMS testing error: %f\n', test_rms);
    
    
    % Plot predictions on test data.
    plot_prices=true;
    if (plot_prices)
      [actual_prices,I] = sort(actual_prices); % sort prices in ascending order; I holds the sort indices
      predicted_prices=predicted_prices(I);
      plot(actual_prices, 'rx');
      hold on;
      plot(predicted_prices,'bx');
      legend('Actual Price', 'Predicted Price');
      xlabel('House #');
      ylabel('House price ($1000s)');
    end


    3.2 linear_regression.m code

    function [f,g] = linear_regression(theta, X,y)
      %
      % Arguments:
      %   theta - A vector containing the parameter values to optimize.
      %   X - The examples stored in a matrix.
      %       X(i,j) is the i'th coordinate of the j'th example.
      %   y - The target value for each example.  y(j) is the target for example j.
      %
      
      m=size(X,2);
      n=size(X,1);
    
      f=0;
      g=zeros(size(theta));
    
      %
      % TODO:  Compute the linear regression objective by looping over the examples in X.
      %        Store the objective function value in 'f'.
      %
      % TODO:  Compute the gradient of the objective with respect to theta by looping over
      %        the examples in X and adding up the gradient for each example.  Store the
      %        computed gradient in 'g'.
      
    %%% YOUR CODE HERE %%%
    
    % Step 1 : Compute f cost function
    for i = 1:m
        f = f + (theta' * X(:,i) - y(i))^2;
    end
    
    f = 1/2*f;
    
    % Step 2: Compute gradient 
    
    for j = 1:n
        for i = 1:m
            g(j) = g(j) + X(j,i)*(theta' * X(:,i) - y(i));
        end
        
    end
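The driver script also asks for a vectorized linear_regression_vec.m, which these notes do not show. The following is one possible sketch, not the tutorial's reference solution. It computes the same objective and gradient without loops and compares them with the loop-based formulas on small random data. The matrix sizes and the variable names `f_vec`, `g_vec`, `f_loop`, and `g_loop` are assumptions made for the example.

```matlab
% Small random problem (assumed sizes): 3 features in rows, 6 examples in columns.
X = rand(3, 6);
y = rand(1, 6);
theta = rand(3, 1);

% Vectorized objective and gradient.
residual = theta' * X - y;           % 1x6, prediction error per example
f_vec = 0.5 * sum(residual .^ 2);    % scalar cost
g_vec = X * residual';               % 3x1 gradient

% Loop-based reference, following the per-example formulas.
f_loop = 0;
g_loop = zeros(size(theta));
for i = 1:size(X, 2)
    r = theta' * X(:, i) - y(i);
    f_loop = f_loop + 0.5 * r^2;
    g_loop = g_loop + X(:, i) * r;
end

% The two versions should agree up to floating-point rounding.
fprintf('max difference: %g\n', max(abs([f_vec - f_loop; g_vec - g_loop])));
```

Wrapping the three vectorized lines in `function [f,g] = linear_regression_vec(theta, X, y)` and returning `f_vec` and `g_vec` as `f` and `g` would give a file with the signature the driver script expects.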
    

    3.3 Result

    Optimization took 3.374166 seconds.
    RMS training error: 4.679871
    RMS testing error: 4.865463



    [This is an original article. If you repost it, please credit the source: blog.csdn.net/songrotek]

  • Original article: https://www.cnblogs.com/lcchuguo/p/5180555.html