• Deep Learning: UFLDL New Tutorial Study Notes 2: Logistic Regression


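    1 Overview of Logistic Regression

    Linear Regression models continuous quantities, whereas Logistic Regression handles discrete ones. Put simply, the task is to decide whether a sample belongs to class 1 or class 0. The natural tool is probability: we compute the probability that the sample belongs to class 1 and the probability that it belongs to class 0, then predict from those values. Working with probabilities also turns the discrete problem into a continuous one.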


    Specifically, we will try to learn a function of the form:

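    $$P(y=1|x) = h_\theta(x) = \frac{1}{1+\exp(-\theta^\top x)} \equiv \sigma(\theta^\top x),$$
    $$P(y=0|x) = 1 - P(y=1|x) = 1 - h_\theta(x).$$

    The function $\sigma(z) \equiv \frac{1}{1+\exp(-z)}$ is often called the “sigmoid” or “logistic” function.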
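
    As a minimal sketch of the function above, an element-wise MATLAB/Octave version could look like the following. The name `sigmoid_sketch` is ours, chosen so that it does not clash with whatever `sigmoid` helper the exercise code relies on.

```matlab
% sigmoid_sketch: element-wise logistic function 1/(1+exp(-z)).
% Hypothetical helper for illustration; not part of the exercise files.
sigmoid_sketch = @(z) 1 ./ (1 + exp(-z));

p = sigmoid_sketch([-10 0 10]);   % close to 0, exactly 0.5, close to 1
disp(p);
```

    Because the operation is element-wise, the same handle works on a whole row of scores such as theta' * X.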

    We only need to compute the probability that y=1, since the probability that y=0 follows from it. The cost function is:

    $$J(\theta) = -\sum_i \left( y^{(i)} \log(h_\theta(x^{(i)})) + (1-y^{(i)}) \log(1-h_\theta(x^{(i)})) \right).$$
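
    To make the cost concrete, here is a small sketch on made-up data (the matrix, labels, and theta below are our own toy values, not MNIST). With theta = 0 every prediction is 0.5, so each example contributes log(2) to J.

```matlab
% Toy data: 2 parameters (first row is the intercept term), 3 examples as columns.
X = [1 1 1; -2 0 3];
y = [0 1 1];
theta = [0; 0];                      % with theta = 0, h = 0.5 for every example

h = 1 ./ (1 + exp(-(theta' * X)));   % 1x3 row of P(y=1|x)
J = -sum(y .* log(h) + (1 - y) .* log(1 - h));
fprintf('J = %f (3*log(2) = %f)\n', J, 3*log(2));
```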

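    Apart from the different hypothesis and cost function, the rest of the procedure (compute the cost and its gradient, then hand both to the optimizer) is exactly the same as for Linear Regression.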
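
    For reference, the gradient that the solution code accumulates can be written out as follows. This is a sketch of the standard derivation, using $\sigma'(z) = \sigma(z)(1-\sigma(z))$ and the same symbols as the cost function above; it has the same form as the Linear Regression gradient with $h_\theta$ swapped in.

```latex
\frac{\partial J(\theta)}{\partial \theta_j}
  = \sum_i x_j^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right),
\qquad
\nabla_\theta J(\theta)
  = \sum_i x^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right).
```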

    Next, let's look at how to do the exercise.


    2 Solution to exercise 1B

    This exercise uses the MNIST data to classify handwritten digits as 0 or 1.

    Here is the code:
    ex1b_regression.m (no changes needed)
    addpath ../common
    addpath ../common/minFunc_2012/minFunc
    addpath ../common/minFunc_2012/minFunc/compiled
    
    % Load the MNIST data for this exercise.
    % train.X and test.X will contain the training and testing images.
    %   Each matrix has size [n,m] where:
    %      m is the number of examples.
    %      n is the number of pixels in each image.
    % train.y and test.y will contain the corresponding labels (0 or 1).
    binary_digits = true;
    [train,test] = ex1_load_mnist(binary_digits);
    
    % Add row of 1s to the dataset to act as an intercept term.
    train.X = [ones(1,size(train.X,2)); train.X]; 
    test.X = [ones(1,size(test.X,2)); test.X];
    
    % Training set dimensions
    m=size(train.X,2);
    n=size(train.X,1);
    
    % Train logistic regression classifier using minFunc
    options = struct('MaxIter', 100);
    
    % First, we initialize theta to some small random values.
    theta = rand(n,1)*0.001;
    
    % Call minFunc with the logistic_regression.m file as the objective function.
    %
    % TODO:  Implement batch logistic regression in the logistic_regression.m file!
    %
    %tic;
    %theta=minFunc(@logistic_regression, theta, options, train.X, train.y);
    %fprintf('Optimization took %f seconds.\n', toc);
    
    % Now, call minFunc again with logistic_regression_vec.m as objective.
    %
    % TODO:  Implement batch logistic regression in logistic_regression_vec.m using
    % MATLAB's vectorization features to speed up your code.  Compare the running
    % time for your logistic_regression.m and logistic_regression_vec.m implementations.
    %
    % Uncomment the lines below to run your vectorized code.
    %theta = rand(n,1)*0.001;
    tic;
    theta=minFunc(@logistic_regression_vec, theta, options, train.X, train.y);
    fprintf('Optimization took %f seconds.\n', toc);
    
    % Print out training accuracy.
    tic;
    accuracy = binary_classifier_accuracy(theta,train.X,train.y);
    fprintf('Training accuracy: %2.1f%%\n', 100*accuracy);
    
    % Print out accuracy on the test set.
    accuracy = binary_classifier_accuracy(theta,test.X,test.y);
    fprintf('Test accuracy: %2.1f%%\n', 100*accuracy);
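
    The script relies on `binary_classifier_accuracy` from the common directory, which is not listed in this post. A plausible sketch of such a helper is shown below, assuming it thresholds the predicted probability at 0.5. The name `accuracy_sketch`, the threshold, and the toy data are our assumptions, not the shipped code.

```matlab
% Hypothetical stand-in for an accuracy helper; assumes a 0.5 decision threshold.
accuracy_sketch = @(theta, X, y) mean((1 ./ (1 + exp(-(theta' * X))) > 0.5) == y);

% Toy check: 3 examples as columns, first row is the intercept term.
X = [1 1 1; -2 0.5 3];
y = [0 1 1];
theta = [0; 1];                      % predicts 1 whenever the second feature is positive
acc = accuracy_sketch(theta, X, y);
fprintf('accuracy: %2.1f%%\n', 100*acc);
```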

    logistic_regression.m
    function [f,g] = logistic_regression(theta, X,y)
      %
      % Arguments:
      %   theta - A column vector containing the parameter values to optimize.
      %   X - The examples stored in a matrix.  
      %       X(i,j) is the i'th coordinate of the j'th example.
      %   y - The label for each example.  y(j) is the j'th example's label.
      %
    
      m=size(X,2);
      n=size(X,1);
      
      % initialize objective value and gradient.
      f = 0;
      g = zeros(size(theta));
    
    
      %
      % TODO:  Compute the objective function by looping over the dataset and summing
      %        up the objective values for each example.  Store the result in 'f'.
      %
      % TODO:  Compute the gradient of the objective by looping over the dataset and summing
      %        up the gradients (df/dtheta) for each example. Store the result in 'g'.
      %
    %%% YOUR CODE HERE %%%
    
    % Step 1: Compute the cost function
    
    for i = 1:m
        f = f - (y(i)*log(sigmoid(theta' * X(:,i))) + (1-y(i))*log(1-...
            sigmoid(theta' * X(:,i))));
    end
    
    % Step 2: Compute the gradient, one component g(j) at a time
    
    for j = 1:n
        for i = 1:m
            g(j) = g(j) + X(j,i)*(sigmoid(theta' * X(:,i)) - y(i));
        end
    end
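
    The driver script also calls `logistic_regression_vec`, which this post does not list. The sketch below shows one way the same cost and gradient might be vectorized, assuming the same layout as `logistic_regression` (examples as columns of X, y a row vector) and writing the sigmoid inline. The toy data is ours, and the loop at the end only cross-checks the vectorized result.

```matlab
% Toy problem: n = 2 parameters, m = 3 examples stored as columns.
X = [1 1 1; -2 0.5 3];
y = [0 1 1];
theta = [0.1; -0.2];

% Vectorized cost and gradient (sketch; assumes y is a 1 x m row vector).
h = 1 ./ (1 + exp(-(theta' * X)));              % 1 x m predicted probabilities
f = -sum(y .* log(h) + (1 - y) .* log(1 - h));  % scalar cost
g = X * (h - y)';                               % n x 1 gradient

% Cross-check against an explicit loop over the examples.
f_loop = 0;
g_loop = zeros(size(theta));
for i = 1:size(X,2)
    hi = 1 / (1 + exp(-(theta' * X(:,i))));
    f_loop = f_loop - (y(i)*log(hi) + (1-y(i))*log(1-hi));
    g_loop = g_loop + X(:,i) * (hi - y(i));
end
fprintf('max difference: %g\n', max(abs([f - f_loop; g - g_loop])));
```

    Wrapped in a file starting with `function [f,g] = logistic_regression_vec(theta, X, y)`, the three vectorized lines would form the body that minFunc evaluates.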
    
    
    


    ex1_load_mnist.m (no changes needed)
    function [train, test] = ex1_load_mnist(binary_digits)
    
      % Load the training data
      X=loadMNISTImages('train-images-idx3-ubyte');  % 784x60000: 60000 images of 28x28 pixels
      y=loadMNISTLabels('train-labels-idx1-ubyte')'; % 1x60000
    
      if (binary_digits)
        % Take only the 0 and 1 digits
        X = [ X(:,y==0), X(:,y==1) ];  % logical indexing with y==0 and y==1 picks the columns for digits 0 and 1
        y = [ y(y==0), y(y==1) ];
      end
    
      % Randomly shuffle the data
      I = randperm(length(y));
      y=y(I); % labels in range 1 to 10
      X=X(:,I);
    
      % We standardize the data so that each pixel will have roughly zero mean and unit variance.
      s=std(X,[],2);  % standard deviation of each pixel (row) across all examples
      m=mean(X,2);
      X=bsxfun(@minus, X, m);
      X=bsxfun(@rdivide, X, s+.1); % i.e. (x-m)/(s+0.1); the 0.1 keeps the denominator away from zero
    
      % Place these in the training set
      train.X = X;
      train.y = y;
    
      % Load the testing data
      X=loadMNISTImages('t10k-images-idx3-ubyte');
      y=loadMNISTLabels('t10k-labels-idx1-ubyte')';
    
      if (binary_digits)
        % Take only the 0 and 1 digits
        X = [ X(:,y==0), X(:,y==1) ];
        y = [ y(y==0), y(y==1) ];
      end
    
      % Randomly shuffle the data
      I = randperm(length(y));
      y=y(I); % labels in range 1 to 10
      X=X(:,I);
    
      % Standardize using the same mean and scale as the training data.
      X=bsxfun(@minus, X, m);
      X=bsxfun(@rdivide, X, s+.1);
    
      % Place these in the testing set
      test.X=X;
      test.y=y;
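
    The standardization step can be tried in isolation. The sketch below uses a made-up 3x4 matrix (rows as pixels, columns as examples) whose last row is constant, to show why 0.1 is added to the standard deviation before dividing.

```matlab
% Toy matrix: 3 pixels (rows) x 4 examples (columns); the last pixel is constant.
X = [0 2 4 6; 1 1 3 3; 5 5 5 5];

s = std(X, [], 2);                   % per-row standard deviation
m = mean(X, 2);                      % per-row mean
Xs = bsxfun(@minus, X, m);           % center each row
Xs = bsxfun(@rdivide, Xs, s + .1);   % scale; the constant row has s = 0, so it divides by 0.1

disp(mean(Xs, 2)');                  % row means are (numerically) zero after centering
disp(Xs(3,:));                       % the constant pixel maps to zeros instead of NaN
```

    Without the added 0.1, the constant row would be 0/0 and produce NaN values, which is common for MNIST border pixels that are always blank.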


    [Note: This is an original article. Please credit the source when reposting: blog.csdn.net/songrotek. Discussion is welcome, QQ: 363523441]






  • Original article: https://www.cnblogs.com/jzssuanfa/p/7043491.html