Exercise: Learning color features with Sparse Autoencoders
Exercise link: Exercise:Learning color features with Sparse Autoencoders
sparseAutoencoderLinearCost.m
function [cost,grad,features] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
                                                            lambda, sparsityParam, beta, data)

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Copy sparseAutoencoderCost in sparseAutoencoderCost.m from your
%   earlier exercise onto this file, renaming the function to
%   sparseAutoencoderLinearCost, and changing the autoencoder to use a
%   linear decoder.
% -------------------- YOUR CODE HERE --------------------

% W1 is a hiddenSize * visibleSize matrix
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
% W2 is a visibleSize * hiddenSize matrix
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
% b1 is a hiddenSize * 1 vector
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
% b2 is a visibleSize * 1 vector
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

numCases = size(data, 2);

% forward propagation
z2 = W1 * data + repmat(b1, 1, numCases);
a2 = sigmoid(z2);
z3 = W2 * a2 + repmat(b2, 1, numCases);
a3 = z3;    % linear decoder: the output layer applies no sigmoid

% squared-error term
sqrerror = (data - a3) .* (data - a3);
error = sum(sum(sqrerror)) / (2 * numCases);

% weight decay term
wtdecay = (sum(sum(W1 .* W1)) + sum(sum(W2 .* W2))) / 2;

% sparsity penalty: KL divergence between sparsityParam and the
% mean hidden activation rho
rho = sum(a2, 2) ./ numCases;
divergence = sparsityParam .* log(sparsityParam ./ rho) + ...
             (1 - sparsityParam) .* log((1 - sparsityParam) ./ (1 - rho));
sparsity = sum(divergence);

cost = error + lambda * wtdecay + beta * sparsity;

% delta3 is a visibleSize * numCases matrix
% (linear decoder: f'(z3) = 1, so there is no sigmoid-derivative factor)
delta3 = -(data - a3);

% delta2 is a hiddenSize * numCases matrix
sparsityterm = beta * (-sparsityParam ./ rho + (1 - sparsityParam) ./ (1 - rho));
delta2 = (W2' * delta3 + repmat(sparsityterm, 1, numCases)) .* sigmoiddiff(z2);

W1grad = delta2 * data' ./ numCases + lambda * W1;
b1grad = sum(delta2, 2) ./ numCases;
W2grad = delta3 * a2' ./ numCases + lambda * W2;
b2grad = sum(delta3, 2) ./ numCases;

%-------------------------------------------------------------------
% After computing the cost and gradient, we will convert the gradients back
% to a vector format (suitable for minFunc).  Specifically, we will unroll
% your gradient matrices into a vector.

grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];

end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end

function sigmdiff = sigmoiddiff(x)
    sigmdiff = sigmoid(x) .* (1 - sigmoid(x));
end
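Before training on the real data, it is worth checking the analytic gradient numerically. Below is a minimal sketch, assuming initializeParameters.m and computeNumericalGradient.m from the earlier sparse-autoencoder exercise are on the path; the small sizes and hyperparameter values here are purely illustrative, not the exercise's settings.

% Quick gradient check on tiny random data (illustrative sizes only).
% Assumes initializeParameters and computeNumericalGradient from the
% earlier exercise are available on the MATLAB path.
debugVisibleSize = 8;
debugHiddenSize  = 5;
lambda        = 3e-3;
sparsityParam = 0.035;
beta          = 5;

patches = rand(debugVisibleSize, 10);   % 10 random fake "patches"
theta   = initializeParameters(debugHiddenSize, debugVisibleSize);

[~, grad] = sparseAutoencoderLinearCost(theta, debugVisibleSize, ...
    debugHiddenSize, lambda, sparsityParam, beta, patches);

numGrad = computeNumericalGradient(@(x) sparseAutoencoderLinearCost(x, ...
    debugVisibleSize, debugHiddenSize, lambda, sparsityParam, beta, ...
    patches), theta);

% The relative difference should be tiny (on the order of 1e-9)
% if the analytic gradient matches the numerical one.
disp(norm(numGrad - grad) / norm(numGrad + grad));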
If the learned features come out looking wrong, a likely cause is writing a3 = sigmoid(z3) where a3 = z3 should be.
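For reference, here is why the output-layer delta in the code carries no sigmoid-derivative factor (standard backpropagation algebra, written in the notation of the UFLDL notes). In general,
\[
\delta^{(3)} = -\bigl(y - a^{(3)}\bigr) \odot f'\!\bigl(z^{(3)}\bigr),
\]
and with a linear decoder \(a^{(3)} = z^{(3)}\), so \(f'\!\bigl(z^{(3)}\bigr) = 1\) and
\[
\delta^{(3)} = -\bigl(y - a^{(3)}\bigr),
\]
which is exactly delta3 = -(data - a3) in the code. The hidden layer still uses a sigmoid, which is why delta2 keeps the sigmoiddiff(z2) factor.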