• Linear Algebra


    Scalars, Vectors, Matrices and Tensors

    • Scalars: A scalar is just a single number. We usually give scalars lower-case variable names.

    • Vectors: A vector is an array of numbers. We give vectors lower-case names written in bold typeface, such as \(\boldsymbol{x}\).

      \[\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}\]

      To access \(x_1, x_3, x_6\), we define the set \(S = \{1, 3, 6\}\) and write \(\boldsymbol{x}_S\). We use the \(-\) sign to index the complement of a set: \(\boldsymbol{x}_{-1}\) is the vector containing all elements of \(\boldsymbol{x}\) except \(x_1\).

    • Matrices: A matrix is a 2-D array of numbers. We usually give matrices upper-case variable names with bold typeface, such as A.

      \[\mathbf{A} = \begin{bmatrix} A_{1, 1} & A_{1, 2} \\ A_{2, 1} & A_{2, 2} \end{bmatrix}\]

      We can identify all of the numbers with vertical coordinate \(i\) by writing a ":" for the horizontal coordinate: \(\boldsymbol{A}_{i, :}\) is known as the \(i\)-th row of \(\boldsymbol{A}\). Sometimes we may need to index matrix-valued expressions that are not just a single letter. In this case, we use subscripts after the expression: \(f(\boldsymbol{A})_{i, j}\) gives element \((i, j)\) of the matrix computed by applying the function \(f\) to \(\boldsymbol{A}\).

    • Tensors: In some cases we will need an array with more than two axes. An array of numbers arranged on a regular grid with a variable number of axes is known as a tensor. We denote a tensor named "A" with a special typeface: \(\mathsf{A}\).
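    The four objects above, and the indexing conventions, map directly onto multi-dimensional arrays. A minimal sketch using NumPy (the note names no library, so NumPy is an assumption; indices are 0-based in code):

```python
import numpy as np

# A scalar, a vector, a matrix, and a 3-axis tensor.
s = 3.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
T = np.zeros((2, 3, 4))      # tensor: three axes

# Index the set S = {1, 3, 6} (0-based: {0, 2, 5}).
S = [0, 2, 5]
x_S = x[S]                   # elements x_1, x_3, x_6

# Complement indexing: all elements of x except x_1.
x_minus_1 = np.delete(x, 0)

# The i-th row of A, i.e. A_{i,:} with ":" for the horizontal coordinate.
row_0 = A[0, :]
```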

    Multiplying Matrices and Vectors

    The matrix product of matrices \(\mathbf{A}\) and \(\mathbf{B}\) is a third matrix \(\mathbf{C} = \mathbf{AB}\), with entries

    \[C_{i, j} = \sum_k A_{i, k} B_{k, j}\]

    Matrix product operations have many properties:

    • distributive: A(B + C) = AB + AC
    • associative: A(BC) = (AB)C
    • not commutative: sometimes \(\mathbf{AB} \neq \mathbf{BA}\)

    Note that the standard product of two matrices is not just a matrix containing the products of the individual elements. Such an operation is called the element-wise product (or Hadamard product), and is denoted \(\mathbf{A} \odot \mathbf{B}\).

    The dot product between two vectors \(\mathbf{x}\) and \(\mathbf{y}\) of the same dimensionality is the matrix product \(\mathbf{x}^T\mathbf{y}\).
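    These products and properties can be checked numerically. A small sketch, assuming NumPy (where `@` is the matrix product and `*` is element-wise):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
C = np.array([[2.0, 0.0],
              [0.0, 2.0]])

# Standard matrix product: C_{i,j} = sum_k A_{i,k} B_{k,j}
AB = A @ B

# Distributive and associative properties hold exactly.
left = A @ (B + C)
right = A @ B + A @ C

# Not commutative in general.
BA = B @ A

# Element-wise (Hadamard) product is a different operation.
hadamard = A * B

# Dot product of two vectors as the matrix product x^T y.
x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
dot = x @ y
```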

    Identity and Inverse Matrices

    An identity matrix that preserves n-dimensional vectors is denoted \(\boldsymbol{I}_n\). Formally, \(\mathbf{I}_n \in \mathbb{R}^{n \times n}\), and

    \[\forall \mathbf{x} \in \mathbb{R}^n,\ \mathbf{I}_n\mathbf{x} = \mathbf{x}.\]

    The matrix inverse of \(\mathbf{A}\) is denoted \(\mathbf{A}^{-1}\), and it is defined as the matrix such that

    \[\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}_n\]
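    Both definitions are easy to verify numerically; a sketch assuming NumPy:

```python
import numpy as np

# Identity: I_n x = x for every x in R^n.
I3 = np.eye(3)
x = np.array([1.0, 2.0, 3.0])

# Inverse: A^{-1} A = I_n (A must be square and non-singular).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_inv = np.linalg.inv(A)
product = A_inv @ A
```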

    Linear Dependence and Span

    A linear combination of some set of vectors \(\{\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(n)}\}\) is given by multiplying each vector \(\mathbf{v}^{(i)}\) by a corresponding scalar coefficient and adding the results:

    \[\sum_i c_i \mathbf{v}^{(i)}\]

    The span of a set of vectors is the set of all points obtainable by linear combination of the original vectors.

    Determining whether \(\mathbf{Ax} = \mathbf{b}\) has a solution thus amounts to testing whether \(\mathbf{b}\) is in the span of the columns of \(\mathbf{A}\). This particular span is known as the column space or the range of \(\mathbf{A}\).

    A set of vectors is linearly independent if no vector in the set is a linear combination of the other vectors. A square matrix with linearly dependent columns is known as singular.
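    One way to test linear independence of the columns is the matrix rank: the columns are independent exactly when the rank equals the number of columns, and a singular square matrix has deficient rank and zero determinant. A sketch assuming NumPy:

```python
import numpy as np

# Columns of A are linearly independent: full column rank.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
rank_A = np.linalg.matrix_rank(A)

# Singular square matrix: second column is 2 * first column.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
rank_B = np.linalg.matrix_rank(B)
det_B = np.linalg.det(B)
```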

    Norms

    Sometimes we need to measure the size of a vector. In machine learning, we usually measure the size of vectors using a function called a norm. Formally, the \(L^p\) norm is given by

    \[\|\boldsymbol{x}\|_p = \Big(\sum_i |x_i|^p\Big)^{\frac{1}{p}}\]

    for \(p \in \mathbb{R},\ p \geq 1\).

    A norm is any function \(f\) that satisfies the following properties:

    • \(f(\mathbf{x}) = 0 \Rightarrow \mathbf{x} = \mathbf{0}\)
    • \(f(\mathbf{x} + \mathbf{y}) \leq f(\mathbf{x}) + f(\mathbf{y})\)
    • \(\forall \alpha \in \mathbb{R},\ f(\alpha\mathbf{x}) = |\alpha| f(\mathbf{x})\)
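    The common cases are \(p = 1\), \(p = 2\) (the Euclidean norm), and the \(p \to \infty\) limit (the max norm). A sketch assuming NumPy:

```python
import numpy as np

x = np.array([3.0, -4.0])

l1 = np.linalg.norm(x, 1)         # |3| + |-4| = 7
l2 = np.linalg.norm(x, 2)         # sqrt(9 + 16) = 5
linf = np.linalg.norm(x, np.inf)  # max_i |x_i| = 4

# Triangle inequality: f(x + y) <= f(x) + f(y)
y = np.array([1.0, 2.0])
triangle_holds = np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)
```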

    Special Kinds of Matrices and Vectors

    • Diagonal matrices consist mostly of zeros and have non-zero entries only along the main diagonal. We write \(\text{diag}(\boldsymbol{v})\) to denote a square diagonal matrix whose diagonal entries are given by the entries of the vector \(\boldsymbol{v}\). Then we have

      \[\text{diag}(\mathbf{v})\mathbf{x} = \mathbf{v} \odot \mathbf{x}\]

    • A symmetric matrix is any matrix that is equal to its own transpose:

      \[\mathbf{A} = \mathbf{A}^T\]

    • A unit vector is a vector with unit norm:

      \[\|\mathbf{x}\|_2 = 1\]

    • A vector \(\mathbf{x}\) and a vector \(\mathbf{y}\) are orthogonal to each other if \(\mathbf{x}^T\mathbf{y} = 0\). If the vectors are not only orthogonal but also have unit norm, we call them orthonormal.

      An orthogonal matrix is a square matrix whose rows are mutually orthonormal and whose columns are mutually orthonormal:

      \[\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A}^T = \mathbf{I}\]
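    These special matrices can all be constructed and checked in a few lines. A sketch assuming NumPy, using a 2-D rotation as a concrete orthogonal matrix (the rotation example is an illustration, not from the notes):

```python
import numpy as np

# Diagonal matrix from a vector, and the identity diag(v) x = v ⊙ x.
v = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])
lhs = np.diag(v) @ x
rhs = v * x

# A symmetric matrix equals its own transpose.
S = np.array([[1.0, 2.0],
              [2.0, 5.0]])

# A rotation matrix is orthogonal: Q^T Q = Q Q^T = I.
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
```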

    Eigendecomposition

    An eigenvector of a square matrix A is a non-zero vector v such that multiplication by A alters only the scale of v:

    \[\mathbf{A}\mathbf{v} = \lambda\mathbf{v}\]

    Suppose that a matrix \(\mathbf{A}\) has \(n\) linearly independent eigenvectors \(\{\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(n)}\}\), with corresponding eigenvalues \(\{\lambda_1, \dots, \lambda_n\}\). We may concatenate all of the eigenvectors to form a matrix \(\mathbf{V}\) with one eigenvector per column: \(\mathbf{V} = [\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(n)}]\). Likewise, we can concatenate the eigenvalues to form a vector \(\boldsymbol{\lambda} = [\lambda_1, \dots, \lambda_n]\). The eigendecomposition of \(\mathbf{A}\) is then given by

    \[\mathbf{A} = \mathbf{V}\,\text{diag}(\boldsymbol{\lambda})\,\mathbf{V}^{-1}\]

    Every real symmetric matrix can be decomposed into an expression using only real-valued eigenvectors and eigenvalues:

    \[\boldsymbol{A} = \boldsymbol{Q}\boldsymbol{\Lambda}\boldsymbol{Q}^T\]

    where \(\boldsymbol{Q}\) is an orthogonal matrix composed of eigenvectors of \(\boldsymbol{A}\), and \(\boldsymbol{\Lambda}\) is a diagonal matrix of the corresponding eigenvalues.

    A matrix whose eigenvalues are all positive is called positive definite. A matrix whose eigenvalues are all positive or zero-valued is called positive semidefinite. If all eigenvalues are negative, the matrix is negative definite.
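    For a real symmetric matrix, the decomposition \(\boldsymbol{A} = \boldsymbol{Q}\boldsymbol{\Lambda}\boldsymbol{Q}^T\) can be computed and verified directly. A sketch assuming NumPy, whose `eigh` routine handles the symmetric case:

```python
import numpy as np

# A real symmetric (and, here, positive definite) matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns real eigenvalues (ascending) and orthonormal eigenvectors.
eigvals, Q = np.linalg.eigh(A)

# Each column of Q satisfies A v = lambda v.
v = Q[:, 0]
lam = eigvals[0]

# Reconstruct A = Q diag(lambda) Q^T.
A_rebuilt = Q @ np.diag(eigvals) @ Q.T
```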

    Singular Value Decomposition

    In the last section, we saw how to decompose a matrix into eigenvectors and eigenvalues. The singular value decomposition (SVD) provides another way to factorize a matrix, into singular vectors and singular values. However, the SVD is more generally applicable. Every real matrix has a singular value decomposition, but the same is not true of the eigenvalue decomposition (the matrix may not be square).

    In singular value decomposition, we can rewrite A as

    \[\boldsymbol{A} = \boldsymbol{U}\boldsymbol{D}\boldsymbol{V}^T\]

    Suppose that \(\boldsymbol{A}\) is an \(m \times n\) matrix. Then \(\boldsymbol{U}\) is defined to be an \(m \times m\) matrix, \(\boldsymbol{D}\) an \(m \times n\) matrix, and \(\boldsymbol{V}\) an \(n \times n\) matrix. Each of these matrices is defined to have a special structure. The matrices \(\boldsymbol{U}\) and \(\boldsymbol{V}\) are both defined to be orthogonal matrices. The matrix \(\boldsymbol{D}\) is defined to be a diagonal matrix. Note that \(\boldsymbol{D}\) is not necessarily square.

    The elements along the diagonal of \(\boldsymbol{D}\) are known as the singular values of the matrix \(\boldsymbol{A}\). The columns of \(\boldsymbol{U}\) are known as the left-singular vectors. The columns of \(\boldsymbol{V}\) are known as the right-singular vectors.

    We can actually interpret the singular value decomposition of \(\boldsymbol{A}\) in terms of the eigendecomposition of functions of \(\boldsymbol{A}\). The left-singular vectors of \(\boldsymbol{A}\) are the eigenvectors of \(\boldsymbol{A}\boldsymbol{A}^T\). The right-singular vectors of \(\boldsymbol{A}\) are the eigenvectors of \(\boldsymbol{A}^T\boldsymbol{A}\). The non-zero singular values of \(\boldsymbol{A}\) are the square roots of the eigenvalues of \(\boldsymbol{A}^T\boldsymbol{A}\). The same is true for \(\boldsymbol{A}\boldsymbol{A}^T\).
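    A sketch assuming NumPy, factoring a non-square matrix (for which no eigendecomposition exists) and checking the connection to the eigenvalues of \(\boldsymbol{A}^T\boldsymbol{A}\):

```python
import numpy as np

# A 2 x 3 matrix: it has an SVD but no eigendecomposition.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# svd returns U (m x m), the singular values s (descending), and V^T (n x n).
U, s, Vt = np.linalg.svd(A)

# Rebuild the m x n diagonal matrix D from the singular values.
D = np.zeros(A.shape)
D[:len(s), :len(s)] = np.diag(s)
A_rebuilt = U @ D @ Vt

# Eigenvalues of A^T A, sorted descending to line up with s.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]
```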

    The Moore-Penrose Pseudoinverse

    Matrix inversion is not defined for matrices that are not square, but the Moore-Penrose pseudoinverse allows us to define \(\boldsymbol{A}^+\) as

    \[\boldsymbol{A}^+ = \lim_{\alpha \to 0}(\boldsymbol{A}^T\boldsymbol{A} + \alpha\boldsymbol{I})^{-1}\boldsymbol{A}^T\]

    In practice, the pseudoinverse is computed via the SVD:

    \[\boldsymbol{A}^+ = \boldsymbol{V}\boldsymbol{D}^+\boldsymbol{U}^T\]
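    When \(\boldsymbol{A}\) has more rows than columns, applying the pseudoinverse gives the least-squares solution of \(\boldsymbol{Ax} = \boldsymbol{b}\). A sketch assuming NumPy, also checking the limit formula with a small \(\alpha\):

```python
import numpy as np

# Overdetermined system: A is 3 x 2, so no exact inverse exists.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

A_pinv = np.linalg.pinv(A)   # Moore-Penrose pseudoinverse (via SVD)
x = A_pinv @ b               # least-squares solution of Ax = b

# The limit formula, approximated with a small alpha.
alpha = 1e-9
approx = np.linalg.inv(A.T @ A + alpha * np.eye(2)) @ A.T
```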

    The Trace Operator

    The trace operator gives the sum of all of the diagonal entries of a matrix:

    \[\text{Tr}(\boldsymbol{A}) = \sum_i \boldsymbol{A}_{i, i}\]

    The Determinant

    The determinant of a square matrix, denoted det(A), is a function mapping matrices to real scalars. The determinant is equal to the product of all the eigenvalues of the matrix.
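    Both scalar functions of a matrix are easy to check against the eigenvalues (the trace equals their sum, the determinant their product). A sketch assuming NumPy:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals = np.linalg.eigvalsh(A)   # eigenvalues of this symmetric A: 1 and 3

trace = np.trace(A)               # sum of diagonal entries = 4
det = np.linalg.det(A)            # product of eigenvalues = 3
```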

  • Original article: https://www.cnblogs.com/wang-haoran/p/13254538.html