These are my study notes for Stanford's online machine learning course. Course page: https://class.coursera.org/ml-2012-002/lecture/index
This week's programming assignment covers:
1. the gradient of the sigmoid function;
2. the regularized cost function (forward propagation);
3. the regularized gradient (backpropagation).
The exercise also provides gradient-checking code; a minimal sketch of that idea appears at the end of these notes.
Implementation:
1. Gradient of the sigmoid function
function g = sigmoidGradient(z)
g = zeros(size(z));
% g'(z) = sigmoid(z) .* (1 - sigmoid(z)), applied element-wise
g = sigmoid(z) .* (1 - sigmoid(z));
end
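Because the derivative is applied element-wise, the same function works on scalars, vectors, and matrices. A quick sanity check (assuming the sigmoid.m supplied with the exercise is on the path):

sigmoidGradient(0)          % ans = 0.2500, the maximum of the derivative
sigmoidGradient([-1 0 1])   % ans = 0.1966  0.2500  0.1966, symmetric about 0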
2. Regularized cost function (forward propagation) and 3. regularized gradient (backpropagation), both implemented in a single function:
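For reference, the cost that the forward-propagation half of this function computes is

$$J(\Theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[-y_k^{(i)}\log\big(h_\Theta(x^{(i)})\big)_k - \big(1-y_k^{(i)}\big)\log\big(1-h_\Theta(x^{(i)})\big)_k\Big] + \frac{\lambda}{2m}\sum_{l=1}^{2}\sum_{j}\sum_{k\ge 2}\big(\Theta^{(l)}_{j,k}\big)^2$$

with K = num_labels, and the regularization sum running over every entry of Theta1 and Theta2 except the first (bias) column.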
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)

Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);

% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));

% Forward propagation
a1 = [ones(m,1) X];                     % 5000*401
z2 = a1 * Theta1';                      % 5000*25
a2 = [ones(size(z2,1),1) sigmoid(z2)];  % 5000*(25+1)
z3 = a2 * Theta2';                      % 5000*10
a3 = sigmoid(z3);
h = a3;

%% Cost, loop version (over the K labels)
J = 0;
for k = 1:num_labels
    y1 = (y == k);
    J = J + 1/m * sum( -y1.*log(h(:,k)) - (1-y1).*log(1-h(:,k)) );
end

%% Cost, fully vectorized version (equivalent; overwrites the loop result)
J = 0;
Y = zeros(m, num_labels);
for i = 1:num_labels
    Y(:,i) = (y == i);
end
J = 1/m * sum(sum( -Y.*log(h) - (1-Y).*log(1-h) ));

%% Add the regularization term (bias columns excluded)
J = J + lambda/2/m * ( sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)) );

% Backpropagation: compute the error terms delta
delta3 = zeros(m, num_labels);
for k = 1:num_labels
    delta3(:,k) = a3(:,k) - (y == k);   % 5000*10
end
delta2 = delta3 * Theta2 .* [ones(size(z2,1),1) sigmoidGradient(z2)];  % 5000*26

% Accumulate Delta (the bias column of delta2 is dropped)
Delta1 = delta2(:,2:end)' * a1;         % 25*401
Delta2 = delta3' * a2;                  % 10*26

% Compute Theta_grad
Theta1_grad = 1/m * Delta1;
Theta2_grad = 1/m * Delta2;

% Regularize the gradients (no penalty on the bias column)
reg1 = lambda/m * Theta1;
reg2 = lambda/m * Theta2;
reg1(:,1) = 0;
reg2(:,1) = 0;
Theta1_grad = Theta1_grad + reg1;
Theta2_grad = Theta2_grad + reg2;

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];

end
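As for the gradient-checking code mentioned at the top, the course supplies its own files for this; what follows is only a minimal from-scratch sketch of the same idea, not the course's implementation. The helper name numericalGradient and the choice of epsilon = 1e-4 are my own. The idea: perturb each parameter by a small epsilon and compare the centered finite difference of the cost against the analytic gradient.

% A from-scratch sketch of numerical gradient checking (saved as numericalGradient.m).
% costFunc(theta) must return the scalar cost J as its first output.
function numgrad = numericalGradient(costFunc, theta)
numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;                                % perturbation size
for p = 1:numel(theta)
    perturb(p) = e;
    loss1 = costFunc(theta - perturb);   % J with theta(p) decreased by e
    loss2 = costFunc(theta + perturb);   % J with theta(p) increased by e
    numgrad(p) = (loss2 - loss1) / (2*e);  % centered difference
    perturb(p) = 0;
end
end

Used against nnCostFunction it would look like this:

costFunc = @(t) nnCostFunction(t, input_layer_size, hidden_layer_size, ...
                               num_labels, X, y, lambda);
[J, grad] = costFunc(nn_params);
numgrad   = numericalGradient(costFunc, nn_params);
disp(norm(numgrad - grad) / norm(numgrad + grad));  % should be very small, around 1e-9

Run this on a small debugging network rather than the full 400-25-10 one, since every parameter costs two full cost evaluations.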