C2-Linear Algebra

concepts

  • linear algebra is a form of continuous rather than discrete mathematics
  • detailed information --> Matrix Cookbook (Petersen and Pedersen, 2006)
  • Scalars: –> (point)
    • A scalar is just a single number
    • A scalar equals its own transpose: $a = a^T$
  • Vectors:–>(line)
    • A vector is an array of numbers arranged in order; by default we treat it as a column vector.
  • Matrices: –> (plane)
    • A matrix is a 2-D array of numbers, so each element is identified by two indices.
    • Note: transpose of a matrix: $(\mathbf{A}^T)_{i,j}=A_{j,i}$
    • $\mathbf{C} = \mathbf{A}+\mathbf{B}$, where $C_{i,j}=A_{i,j}+B_{i,j}$.
  • Tensors:–>(volume)
    • A tensor is an array of numbers arranged on a regular grid with a variable number of axes.

NOTE:
$\mathbf{D}=a\cdot \mathbf{B}+c$, where $D_{i,j} = a\cdot B_{i,j}+c$
Broadcasting (sketched in NumPy below):
$\mathbf{C} = \mathbf{A}+\mathbf{b}$, where $C_{i,j} = A_{i,j}+b_{j}$.
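A minimal NumPy sketch of this broadcasting convention (the matrices and values are illustrative, not from the notes):

```python
import numpy as np

# Broadcasting: the vector b is implicitly copied to each row of A,
# so that C[i, j] = A[i, j] + b[j].
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([10.0, 20.0, 30.0])

C = A + b
print(C)  # [[11. 22. 33.]
          #  [14. 25. 36.]]
```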

  • matrix product:
    • $\mathbf{C} = \mathbf{AB}$, where $C_{i,j}=\sum\limits_k A_{i,k}B_{k,j}$
    • distinct from the element-wise product (or Hadamard product), denoted $\mathbf{A}\odot \mathbf{B}$, where $(\mathbf{A}\odot\mathbf{B})_{i,j}=A_{i,j}B_{i,j}$
    • distributivity: $\mathbf{A(B+C)=AB+AC}$
    • associativity: $\mathbf{A(BC)=(AB)C}$
    • not commutative in general: $\mathbf{AB\neq BA}$
    • transpose: $(\mathbf{AB})^T=\mathbf{B}^T\mathbf{A}^T$
  • dot product (for two vectors $\mathbf{x}$ and $\mathbf{y}$):
    • can be written as the matrix product $\mathbf{x}^T\mathbf{y}$
    • commutative: $\mathbf{x}^T\mathbf{y}=\mathbf{y}^T\mathbf{x}$
  • linear equations:
    • $\mathbf{Ax=b}$, where $\mathbf{A}\in\mathbb{R}^{m\times n}$ is a known matrix, $\mathbf{b}\in\mathbb{R}^m$ is a known vector, and $\mathbf{x}\in\mathbb{R}^n$ is a vector of unknown variables to be solved (see the NumPy sketch after this list).
  • matrix inversion
    • denoted by $\mathbf{A}^{-1}$ and defined by $\mathbf{A}^{-1}\mathbf{A}=\mathbf{I}_n$
  • identity matrix
    • $\mathbf{I}_n\in\mathbb{R}^{n\times n}$, and $\forall \mathbf{x}\in\mathbb{R}^n,\ \mathbf{I}_n\mathbf{x}=\mathbf{x}$
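As referenced above, a minimal NumPy sketch of the matrix product, the Hadamard product, and solving a linear system (the example matrices are illustrative assumptions):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

C = A @ B   # matrix product: C[i, j] = sum_k A[i, k] * B[k, j]
H = A * B   # element-wise (Hadamard) product: a different operation

# Transpose of a product: (AB)^T = B^T A^T
assert np.allclose(C.T, B.T @ A.T)
# Matrix multiplication is not commutative in general
print(np.allclose(A @ B, B @ A))  # False for this A, B

# Solving Ax = b via the inverse; np.linalg.solve is preferred numerically
b_vec = np.array([1.0, 2.0])
x = np.linalg.inv(A) @ b_vec
assert np.allclose(A @ x, b_vec)
assert np.allclose(np.eye(2) @ x, x)  # the identity matrix leaves x unchanged
```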

Linear Dependence

  • The geometric meaning of linear equations: writing $\mathbf{A}$ in terms of its columns,
    $$\mathbf{A}=\begin{bmatrix} a_{1,1}&a_{1,2}&\cdots&a_{1,N}\\ a_{2,1}&a_{2,2}&\cdots&a_{2,N}\\ \vdots&\vdots&\ddots&\vdots\\ a_{N,1}&a_{N,2}&\cdots&a_{N,N} \end{bmatrix} = \begin{bmatrix} \mathbf{A}_{:,1}&\mathbf{A}_{:,2}&\cdots&\mathbf{A}_{:,N} \end{bmatrix}$$
    so $\mathbf{Ax}=\sum\limits_i x_i\mathbf{A}_{:,i}$ is a linear combination of the columns of $\mathbf{A}$.
  • linear combination
    • a linear combination of $\{\mathbf{v}^{(1)},\cdots,\mathbf{v}^{(n)}\}$ is given by $\sum\limits_i c_i\mathbf{v}^{(i)}$
  • Span
    • the span of a set of vectors is the set of all points obtainable by linear combination of the original vectors.
    • if $\mathbf{Ax=b}$ has a solution, $\mathbf{b}$ is in the span of the columns of $\mathbf{A}$ –> the column space or the range of $\mathbf{A}$.
    • for $\mathbf{Ax=b}$ to have a solution for every $\mathbf{b}\in\mathbb{R}^m$, the column space must be all of $\mathbb{R}^m$, so $n\geq m$ is a necessary (but not sufficient) condition
    • A set of vectors is linearly independent if no vector in the set is a linear combination of the other vectors.
    • A square matrix with linearly dependent columns is known as singular.
    • For square matrices, the left inverse and the right inverse are equal.
  • Norms
    • measure the size of vectors
    • the $L^p$ norm is given by $\|\mathbf{x}\|_p=(\sum\limits_i|x_i|^p)^{\frac{1}{p}}$ for $p\in\mathbb{R},p\geq1$
    • norms are functions mapping vectors to non-negative values, satisfying the properties:
      • $f(\mathbf{x})=0 \Rightarrow \mathbf{x}=\mathbf{0}$
      • $f(\mathbf{x+y})\leq f(\mathbf{x})+f(\mathbf{y})$ (the triangle inequality)
      • $\forall\alpha\in\mathbb{R},\ f(\alpha\mathbf{x})=|\alpha|f(\mathbf{x})$
    • $L^2$ is the Euclidean norm: the Euclidean distance from the origin to the point $\mathbf{x}$.
      • often denoted simply $\|\mathbf{x}\|$; the squared $L^2$ norm is computed as $\mathbf{x}^T\mathbf{x}$.
      • the squared $L^2$ norm increases very slowly near the origin.
    • $L^1$ grows at the same rate in all locations.
      • $\|\mathbf{x}\|_1=\sum\limits_i|x_i|$
    • the so-called $L^0$ "norm" measures the size of a vector by counting its number of nonzero elements.
      • not actually a norm (scaling the vector does not change the count); $L^1$ is often used as a substitute.
    • $L^{\infty}$, known as the max norm
      • $\|\mathbf{x}\|_{\infty}=\max\limits_i|x_i|$
    • the analogue of $L^2$ for matrices is the Frobenius norm
      • used to measure the size of a matrix
      • $\|\mathbf{A}\|_F=\sqrt{\sum\limits_{i,j}A^2_{i,j}}$
    • dot product rewritten with norms: $\mathbf{x}^T\mathbf{y}=\|\mathbf{x}\|_2\|\mathbf{y}\|_2\cos\theta$, where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$ (checked numerically below).
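A small NumPy check of these norms (the vectors and matrices are illustrative assumptions):

```python
import numpy as np

x = np.array([3.0, -4.0])

l1   = np.linalg.norm(x, ord=1)       # L^1: |3| + |-4| = 7
l2   = np.linalg.norm(x)              # L^2: sqrt(9 + 16) = 5
linf = np.linalg.norm(x, ord=np.inf)  # max norm: max(|3|, |4|) = 4

# Squared L^2 norm equals x^T x
assert np.isclose(l2**2, x @ x)

# Frobenius norm of a matrix
A = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt((A**2).sum()))

# Dot product via norms and the angle between the vectors
y = np.array([1.0, 0.0])
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
assert np.isclose(x @ y,
                  np.linalg.norm(x) * np.linalg.norm(y) * cos_theta)
```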
  • Special Kinds of Matrices and Vectors
    • Diagonal matrices:
      • $\mathbf{D}$ is diagonal if and only if $D_{i,j}=0$ for all $i \neq j$.
      • diag($\mathbf{v}$) denotes the square diagonal matrix whose diagonal entries are given by the vector $\mathbf{v}$.
      • diag($\mathbf{v}$)$\mathbf{x}=\mathbf{v}\odot\mathbf{x}$
      • if every diagonal entry is nonzero, diag($\mathbf{v}$)$^{-1}$ = diag($[1/v_1,\cdots,1/v_n]^T$).
      • a non-square diagonal matrix $\mathbf{D}$ has no inverse, but multiplying by it is still cheap: if $\mathbf{D}$ is taller than it is wide, concatenate zeros to the result; if $\mathbf{D}$ is wider than it is tall, discard the last elements of the vector.
    • symmetric matrix:
      • A=AT\mathbf{A}=\mathbf{A}^T
    • unit vector:
      • has unit norm: $\|\mathbf{x}\|_2=1$
      • orthogonal vectors: $\mathbf{x}^T\mathbf{y}=0$
      • orthonormal: orthogonal and of unit norm
    • orthogonal matrix:
      • $\mathbf{A}^T\mathbf{A}=\mathbf{A}\mathbf{A}^T=\mathbf{I}$
      • implies $\mathbf{A}^{-1}=\mathbf{A}^T$ (see the NumPy sketch below)
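A minimal NumPy sketch of the diagonal and orthogonal properties above (the rotation matrix is an illustrative assumption):

```python
import numpy as np

# Diagonal matrix: diag(v) x equals the element-wise product v ⊙ x
v = np.array([2.0, 3.0, 4.0])
x = np.array([1.0, 1.0, 2.0])
assert np.allclose(np.diag(v) @ x, v * x)

# Inverse of a diagonal matrix with nonzero diagonal entries
assert np.allclose(np.linalg.inv(np.diag(v)), np.diag(1.0 / v))

# A 2-D rotation is an orthogonal matrix: Q^T Q = I, hence Q^{-1} = Q^T
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.allclose(np.linalg.inv(Q), Q.T)
```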
  • Eigendecomposition
    • decompose a matrix into a set of eigenvectors and eigenvalues.
    • eigenvectors and eigenvalues
      • for a square matrix $\mathbf{A}$, an eigenvector is a nonzero vector $\mathbf{v}$ satisfying $\mathbf{Av}=\lambda\mathbf{v}$.
      • $\lambda$ is the eigenvalue corresponding to the eigenvector $\mathbf{v}$. There are also left eigenvectors: $\mathbf{v}^T\mathbf{A}=\lambda\mathbf{v}^T$.
      • for $s\in\mathbb{R},s\neq0$: if $\mathbf{v}$ is an eigenvector of $\mathbf{A}$, then $s\mathbf{v}$ is also an eigenvector with the same eigenvalue.
      • let $\mathbf{V}=[\mathbf{v}^{(1)},\cdots,\mathbf{v}^{(n)}]$ and $\mathbf{\lambda}=[\lambda_1,\cdots,\lambda_n]^T$; then $\mathbf{V}\,\text{diag}(\mathbf{\lambda})=\mathbf{AV}$, i.e. $\mathbf{A}=\mathbf{V}\,\text{diag}(\mathbf{\lambda})\mathbf{V}^{-1}$.
    • every real symmetric matrix can be decomposed
      • $\mathbf{A=Q\Lambda Q}^T$, where $\mathbf{Q}$ is an orthogonal matrix composed of eigenvectors of $\mathbf{A}$ and $\mathbf{\Lambda}$ is a diagonal matrix of eigenvalues.
      • the eigenvalue $\Lambda_{i,i}$ is associated with the eigenvector $\mathbf{Q}_{:,i}$
      • the decomposition is not unique (e.g., when eigenvalues repeat).
      • if any of the eigenvalues are zero, the matrix is singular.
      • optimizing a quadratic form:
        • $f(\mathbf{x})=\mathbf{x}^T\mathbf{Ax}$, subject to $\|\mathbf{x}\|_2=1$.
        • if $\mathbf{x}$ is an eigenvector of $\mathbf{A}$, then $f$ equals the corresponding eigenvalue.
        • the maximum/minimum value of $f$ over the constraint set is the maximum/minimum eigenvalue.
        • positive definite: all eigenvalues are positive; guarantees $\mathbf{x}^T\mathbf{Ax}=0 \Rightarrow \mathbf{x}=\mathbf{0}$
        • positive semidefinite: all eigenvalues are positive or zero; guarantees $\forall \mathbf{x},\ \mathbf{x}^T\mathbf{Ax}\geq 0$
        • negative definite: all eigenvalues are negative
        • negative semidefinite: all eigenvalues are negative or zero (the symmetric eigendecomposition is sketched in NumPy below)
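As referenced above, a minimal NumPy sketch of the eigendecomposition of a real symmetric matrix (the matrix is an illustrative assumption):

```python
import numpy as np

# A real symmetric matrix: A = Q Λ Q^T
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, Q = np.linalg.eigh(A)  # eigh is specialized for symmetric matrices

Lam = np.diag(eigvals)
assert np.allclose(A, Q @ Lam @ Q.T)  # reconstruct A from the decomposition

# Av = λv for each eigenpair (the columns of Q are the eigenvectors)
for lam, vec in zip(eigvals, Q.T):
    assert np.allclose(A @ vec, lam * vec)

# All eigenvalues are positive here, so this A is positive definite
print(eigvals)  # [1. 3.]
```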
  • Singular Value Decomposition:
    • every real matrix has a singular value decomposition
    • $\mathbf{A=UDV}^T$, where $\mathbf{A}\in\mathbb{R}^{m\times n}$, $\mathbf{U}\in\mathbb{R}^{m\times m}$, $\mathbf{D}\in\mathbb{R}^{m\times n}$, $\mathbf{V}\in\mathbb{R}^{n\times n}$.
    • $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices.
    • $\mathbf{D}$ is diagonal (not necessarily square); its diagonal entries are the singular values of $\mathbf{A}$.
    • the columns of U\mathbf{U} are the left-singular vectors.
    • the columns of $\mathbf{V}$ are the right-singular vectors (see the NumPy sketch below).
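A minimal NumPy sketch of the SVD of a non-square matrix (the matrix is an illustrative assumption):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])   # shape (2, 3)
U, s, Vt = np.linalg.svd(A)       # s holds the singular values

# Rebuild the (2, 3) diagonal matrix D from the singular values
D = np.zeros(A.shape)
D[:len(s), :len(s)] = np.diag(s)
assert np.allclose(A, U @ D @ Vt)

# U and V are orthogonal
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(Vt @ Vt.T, np.eye(3))
```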
  • The Moore-Penrose Pseudoinverse
    • define: $\mathbf{A}^+=\lim\limits_{\alpha\searrow 0}(\mathbf{A}^T\mathbf{A}+\alpha\mathbf{I})^{-1}\mathbf{A}^T$
    • $\mathbf{AA}^+\mathbf{A}=\mathbf{A}$
    • $\mathbf{A}^+\mathbf{AA}^+=\mathbf{A}^+$
    • $\mathbf{AA}^+$ and $\mathbf{A}^+\mathbf{A}$ are symmetric
    • Compute: $\mathbf{A}^+=\mathbf{VD}^+\mathbf{U}^T$, where $\mathbf{U},\mathbf{D},\mathbf{V}$ come from the SVD of $\mathbf{A}$, and $\mathbf{D}^+$ is obtained by taking the reciprocal of the nonzero diagonal entries of $\mathbf{D}$ and transposing.
    • if $\mathbf{A}$ is wide (more columns than rows), $\mathbf{x}=\mathbf{A}^+\mathbf{y}$ is the solution of $\mathbf{Ax=y}$ with minimal Euclidean norm $\|\mathbf{x}\|_2$ among all possible solutions.
    • if $\mathbf{A}$ is tall (more rows than columns), there may be no exact solution; $\mathbf{x}=\mathbf{A}^+\mathbf{y}$ makes $\mathbf{Ax}$ as close as possible to $\mathbf{y}$ in Euclidean norm (both cases are sketched below).
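As referenced above, a minimal NumPy sketch of both cases (the matrices and right-hand sides are illustrative assumptions):

```python
import numpy as np

# Wide matrix: Ax = y has many solutions; x = A⁺y has minimal L^2 norm
A_wide = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0])
x = np.linalg.pinv(A_wide) @ y
assert np.allclose(A_wide @ x, y)  # an exact solution

# Tall matrix: usually no exact solution; x = A⁺y minimizes ||Ax - y||_2
A_tall = A_wide.T                  # shape (3, 2)
y3 = np.array([1.0, 2.0, 5.0])
x_ls = np.linalg.pinv(A_tall) @ y3
x_ref, *_ = np.linalg.lstsq(A_tall, y3, rcond=None)
assert np.allclose(x_ls, x_ref)    # agrees with the least-squares solution
```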
  • Trace Operator
    • Define: $\text{Tr}(\mathbf{A})=\sum\limits_i A_{i,i}$
    • Frobenius norm of a matrix: $\|\mathbf{A}\|_F=\sqrt{\text{Tr}(\mathbf{AA}^T)}$
    • $\text{Tr}(\mathbf{A})=\text{Tr}(\mathbf{A}^T)$
    • invariance to cyclic permutation: $\text{Tr}(\mathbf{ABC})=\text{Tr}(\mathbf{CAB})=\text{Tr}(\mathbf{BCA})$; more generally, $\text{Tr}(\prod\limits_{i=1}^n\mathbf{F}^{(i)})=\text{Tr}(\mathbf{F}^{(n)}\prod\limits_{i=1}^{n-1}\mathbf{F}^{(i)})$ (checked numerically below)
    • for a scalar: $a=\text{Tr}(a)$
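A quick numerical check of the trace identities (the random shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 3))
C = rng.random((3, 3))

# Invariance to cyclic permutation: Tr(ABC) = Tr(CAB) = Tr(BCA),
# even though the product shapes differ
t1 = np.trace(A @ B @ C)
t2 = np.trace(C @ A @ B)
t3 = np.trace(B @ C @ A)   # B @ C @ A is (4, 4), the others are (3, 3)
assert np.allclose([t1, t2], [t3, t3])

# Frobenius norm via the trace operator: ||A||_F = sqrt(Tr(A A^T))
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A @ A.T)))
```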
  • The Determinant
    • defined for a square matrix as the product of its eigenvalues: $\det(\mathbf{A})=\prod\limits_{i=1}^n\lambda_i$
    • The absolute value of the determinant can be thought of as a measure of how much multiplication by the matrix expands or contracts space.
    • If the determinant is 0, then space is contracted completely along at least one dimension, causing it to lose all of its volume.
    • If the determinant is 1, then the transformation preserves volume (see the NumPy sketch below).
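A minimal NumPy sketch of the determinant as the product of eigenvalues and as a volume scale factor (the matrix is an illustrative assumption):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# det(A) equals the product of the eigenvalues (here 1 * 3 = 3)
eigvals = np.linalg.eigvals(A)
assert np.isclose(np.linalg.det(A), np.prod(eigvals))

# |det(A)| is the volume scale factor of the map x -> Ax:
# the unit square maps to a parallelogram of area |det(A)| = 3
print(abs(np.linalg.det(A)))  # 3.0
```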