Cost function
- Formula:

  J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

  where the parameter vector \theta \in \mathbb{R}^{n+1} (equivalently an (n+1) \times 1 column vector) and the hypothesis is h_\theta(x) = \theta^T x with x_0 = 1.

- Vectorized form:

  J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)
Octave implementation:
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

prediction = X * theta;           % hypothesis h_theta(x^(i)) for every example, an m x 1 vector
sqerror = (prediction - y) .^ 2;  % element-wise squared errors
J = 1 / (2 * m) * sum(sqerror);   % halved mean of the squared errors

% =========================================================================

end
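As a quick sanity check, here is a minimal usage sketch of computeCost; the toy numbers are hypothetical, not the course dataset, and the last two lines show the fully vectorized form of the same cost.

% Hypothetical toy data: m = 3 examples, intercept column of ones plus one feature.
X = [1 1; 1 2; 1 3];
y = [1; 2; 3];
theta = [0; 0];

J = computeCost(X, y, theta)   % (1 + 4 + 9) / (2*3) = 2.3333

% The same cost in fully vectorized form, (X*theta - y)'*(X*theta - y) / (2m):
err = X * theta - y;
J_vec = (err' * err) / (2 * length(y))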
Gradient descent for multiple variables
- Formula: repeat until convergence,

  \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \qquad (\text{simultaneously for } j = 0, 1, \dots, n)

  that is,

  \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

- Matrix form:
  Gradient descent can be written as

  \theta := \theta - \alpha \nabla J(\theta)

  where \nabla J(\theta) is the vector of partial derivatives \frac{\partial J(\theta)}{\partial \theta_j}, and each partial derivative works out to

  \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

  Vectorizing this over all j gives

  \nabla J(\theta) = \frac{1}{m} X^T (X\theta - y)

  so the final matrix version of gradient descent is

  \theta := \theta - \frac{\alpha}{m} X^T (X\theta - y)

  (Dimension check: X is m \times (n+1), so X^T (X\theta - y) is (n+1) \times 1, the same shape as \theta.)
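To see that the element-wise and matrix forms agree, here is a small Octave sketch (the toy numbers are hypothetical) comparing one vectorized step against the per-component update:

X = [1 1; 1 2; 1 3];   % hypothetical toy design matrix, m = 3
y = [2; 3; 4];
theta = [0; 0];
alpha = 0.1;
m = length(y);

% One step in matrix form: theta := theta - (alpha/m) * X' * (X*theta - y)
theta_vec = theta - alpha * (1/m) * X' * (X * theta - y);

% One step with the per-component formula (simultaneous update)
theta_loop = theta;
for j = 1:length(theta)
    grad_j = (1/m) * sum((X * theta - y) .* X(:, j));
    theta_loop(j) = theta(j) - alpha * grad_j;
end

disp(max(abs(theta_vec - theta_loop)))   % should print (numerically) zero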
Octave version:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    predictions = X * theta;                  % hypothesis for all m examples (m x 1)
    updates = X' * (predictions - y);         % X^T (X*theta - y), an (n+1) x 1 vector
    theta = theta - alpha * (1/m) * updates;  % simultaneous update of every theta_j

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
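A minimal calling sketch, assuming a single-feature dataset (the data values and learning rate below are made up for illustration): prepend the intercept column of ones, run gradientDescent, and confirm that J_history is decreasing.

data_x = [1; 2; 3; 4];          % hypothetical feature values
data_y = [2.1; 3.9; 6.2; 8.1];  % hypothetical targets
m = length(data_y);

X = [ones(m, 1) data_x];        % add the intercept column, so X is m x (n+1)
theta = zeros(2, 1);            % start from theta = 0
alpha = 0.01;                   % learning rate (would normally be tuned)
num_iters = 1500;

[theta, J_history] = gradientDescent(X, data_y, theta, alpha, num_iters);

theta                            % learned parameters
plot(1:num_iters, J_history);    % cost curve: should decrease if alpha is small enough
xlabel('iteration'); ylabel('cost J');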