Matrix
Intro
Many of the ideas about matrices build on what we have already learned about vectors. Recall that we can express a pair of simultaneous equations as the multiplication of a matrix and a vector.

$$2a + 3b = 12$$
$$5a + b = 17$$

$$\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 12 \\ 17 \end{pmatrix}$$
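Although the solution method only comes later in the chapter, we can already sketch this system numerically. Assuming NumPy is available (the library is our addition, not part of the text), `np.linalg.solve` finds the vector $(a, b)$ satisfying both equations:

```python
import numpy as np

# Coefficient matrix and right-hand side of the simultaneous equations
A = np.array([[2, 3],
              [5, 1]])
v = np.array([12, 17])

# Solve A @ x = v for x = (a, b)
x = np.linalg.solve(A, v)
print(x)  # a = 3, b = 2
```

Substituting back: 2(3) + 3(2) = 12 and 5(3) + 2 = 17, so the solution checks out.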
In this chapter, we are going to look under the hood to understand what exactly happens during this multiplication operation. It’s intriguing to see how powerful matrices are in terms of transforming vectors. This concept is also at the heart of linear algebra.
We will then dive into the solution of simultaneous equations by calculating the matrix inverse and determinant. Understanding the matrix inverse then unlocks a general way to convert between vector spaces. Compared with what we learned previously about changing basis, we will be able to convert a vector to any space, no longer restricted to spaces with orthogonal basis vectors.
We are also going to look at two interesting extensions of matrices. One is the orthogonal matrix, the most convenient kind of matrix to invert. We will construct such a matrix with the Gram-Schmidt process. The other extension is eigenvectors and eigenvalues. Eigenvectors have special properties under matrix transformation, and knowing the eigenvectors of a matrix also makes repeated multiplication of that matrix by itself much simpler.
Let’s get started!
Matrix as a Transformation
We start by multiplying the matrix $\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix}$ by the basis vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

$$\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}$$

$$\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$

To read this result: the matrix $\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix}$ transforms the vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ into a new vector $\begin{pmatrix} 2 \\ 5 \end{pmatrix}$, and transforms the vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ into a new vector $\begin{pmatrix} 3 \\ 1 \end{pmatrix}$. Graphically, we get the transformation below from $e_1$ to $e_1'$ and from $e_2$ to $e_2'$.

We learned in the previous chapter on vectors that $e_1$ and $e_2$ form a basis of the vector space. After multiplication by the matrix $\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix}$, the original vector space changes to a new one described by $e_1'$ and $e_2'$. If we call $e_1$ and $e_2$ our input vectors and $e_1'$ and $e_2'$ our output vectors, what matrix multiplication does is therefore equivalent to a vector space transformation.
This part is tricky, but very critical for us to form a right intuition of matrix multiplication. Let’s do a matrix multiplication in a step-by-step manner to illustrate this idea of vector space transformation.
$$
\begin{aligned}
\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix}
&= \begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} \left( 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= \begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} (3 e_1 + 2 e_2) \\
&= 3 \begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} e_1 + 2 \begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix} e_2 \\
&= 3 \begin{pmatrix} 2 \\ 5 \end{pmatrix} + 2 \begin{pmatrix} 3 \\ 1 \end{pmatrix} \\
&= 3 e_1' + 2 e_2'
\end{aligned}
$$
Note here we make use of the associative and distributive properties of matrix multiplication.
The vector $\begin{pmatrix} 3 \\ 2 \end{pmatrix}$ is originally expressed in the basis vectors $e_1$ and $e_2$. After multiplication by the matrix $\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix}$, we have projected it onto the new basis vectors $e_1'$ and $e_2'$, i.e., $3e_1' + 2e_2'$. So we can think of matrix multiplication as the vector sum of the transformed basis vectors. The matrix $\begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix}$ tells us where the original basis vectors $e_1$ and $e_2$ go after the transformation. In addition, we can see that the columns of the transformation matrix are just the new basis vectors: $e_1' = \begin{pmatrix} 2 \\ 5 \end{pmatrix}$, $e_2' = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$.
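The observation that the columns of a matrix are the transformed basis vectors is easy to verify numerically. A minimal NumPy sketch (NumPy itself is our assumption, not part of the original text):

```python
import numpy as np

A = np.array([[2, 3],
              [5, 1]])
e1 = np.array([1, 0])
e2 = np.array([0, 1])

# The columns of A are exactly where the basis vectors land
print(A @ e1, A[:, 0])  # e1' = (2, 5) in both cases
print(A @ e2, A[:, 1])  # e2' = (3, 1) in both cases

# A @ (3, 2) equals the vector sum 3*e1' + 2*e2'
v = np.array([3, 2])
print(A @ v)                        # (12, 17)
print(3 * (A @ e1) + 2 * (A @ e2))  # (12, 17) again
```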
If we treat a matrix as a way to transform a vector space, how many ways can we transform it? Next, we will discuss some possible matrix transformations in a 2-dimensional space.
Identity Matrix
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is the identity matrix. It keeps the vector space unchanged: a vector is not altered at all when it is multiplied by the identity matrix. We can look at an example with the vector $\begin{pmatrix} 3 \\ 2 \end{pmatrix}$.

$$
\begin{aligned}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix}
&= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \left( 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \left( 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= 3 \left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + 2 \left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}
\end{aligned}
$$
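As a sanity check (again a NumPy sketch of our own, not from the text), `np.eye` builds an identity matrix of any size:

```python
import numpy as np

I = np.eye(2)             # 2x2 identity matrix
v = np.array([3.0, 2.0])
print(I @ v)              # (3, 2): the vector is unchanged
```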
Scaling Matrix
A scaling matrix has the form $\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$. It scales areas in the vector space by a factor of $ab$. Let's look at the example below.

$$
\begin{aligned}
\begin{pmatrix} 5 & 0 \\ 0 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 5 & 0 \\ 0 & 7 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 5 & 0 \\ 0 & 7 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 5 & 0 \\ 0 & 7 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 5 & 0 \\ 0 & 7 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 5 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 7 \end{pmatrix}
\end{aligned}
$$

The basis vectors for $\begin{pmatrix} x \\ y \end{pmatrix}$ have shifted from $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ to $\begin{pmatrix} 5 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 7 \end{pmatrix}$. This stretches the vector space from a 1×1 square to a 5×7 rectangle, as shown in the graph below.

Similarly, we can also have a scaling matrix with fractional values. This squashes the original vector space into a smaller one.
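Both the stretching and the area factor $ab$ can be confirmed in NumPy (a sketch of our own; `np.linalg.det` computes the determinant, which this chapter formally introduces later):

```python
import numpy as np

S = np.diag([5, 7])            # scaling matrix (5 0; 0 7)
print(S @ np.array([1, 0]))    # e1 -> (5, 0)
print(S @ np.array([0, 1]))    # e2 -> (0, 7)

# The unit square's area is scaled by a * b = 35
print(np.linalg.det(S))        # 35, up to floating-point error

# Fractional values squash the space instead of stretching it
S_small = np.diag([0.5, 0.5])
print(S_small @ np.array([4.0, 2.0]))  # (2, 1)
```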
Flipping Matrix
We can use the matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ to flip the horizontal basis vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ to the other side. Here is an illustration.

$$
\begin{aligned}
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} -1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\end{aligned}
$$

Graphically, this flips the basis vector $e_1$ across the vertical axis.

Similarly, we can use the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ to flip the basis vectors across the horizontal axis.

$$
\begin{aligned}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ -1 \end{pmatrix}
\end{aligned}
$$
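Both flips side by side, as a small NumPy sketch (our own illustration):

```python
import numpy as np

F_vert = np.array([[-1, 0],    # flips across the vertical axis
                   [ 0, 1]])
F_horiz = np.array([[1,  0],   # flips across the horizontal axis
                    [0, -1]])

v = np.array([3, 2])
print(F_vert @ v)   # (-3, 2): x-coordinate negated
print(F_horiz @ v)  # (3, -2): y-coordinate negated
```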

Inverse Matrix
When we flip the basis vectors in both the vertical and horizontal directions, we get a new vector space in which both basis vectors have their values negated. This matrix is $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I$. Note that this sign-inverting matrix is not the same thing as the matrix inverse $A^{-1}$ we will compute later in the chapter; geometrically, $-I$ is a 180° rotation.

$$
\begin{aligned}
\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} -1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ -1 \end{pmatrix}
\end{aligned}
$$
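We can check in NumPy (our own sketch) that composing the two flips really gives this matrix:

```python
import numpy as np

F_vert = np.array([[-1, 0], [0, 1]])    # flip across vertical axis
F_horiz = np.array([[1, 0], [0, -1]])   # flip across horizontal axis

# Applying both flips negates both coordinates
N = F_horiz @ F_vert
print(N)                     # [[-1, 0], [0, -1]]
print(N @ np.array([3, 2]))  # (-3, -2)
```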

Diagonal Mirroring Matrix
We can also mirror the vector space along the 45° line with the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

$$
\begin{aligned}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 0 \\ 1 \end{pmatrix} + y \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\end{aligned}
$$

The basis vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is shifted to $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and the basis vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is shifted to $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, as shown in the graph.

If we combine the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ from the previous section and the diagonal mirroring matrix, we get a mirroring of the vector space along the −45° line. This matrix is $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$.

$$
\begin{aligned}
\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 0 \\ -1 \end{pmatrix} + y \begin{pmatrix} -1 \\ 0 \end{pmatrix}
\end{aligned}
$$
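A NumPy sketch (our own) of both mirrors, including a check that the −45° mirror is indeed the composition of $-I$ and the +45° mirror:

```python
import numpy as np

M_pos = np.array([[0, 1],    # mirror along the +45° line
                  [1, 0]])
M_neg = np.array([[ 0, -1],  # mirror along the -45° line
                  [-1,  0]])

v = np.array([3, 2])
print(M_pos @ v)   # (2, 3): coordinates swapped
print(M_neg @ v)   # (-2, -3): swapped and negated

# M_neg is -I composed with the +45° mirror
print(np.array_equal(M_neg, (-np.eye(2)) @ M_pos))  # True
```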

Shearing Matrix
Our basis vectors can also be transformed in more interesting ways, such as shearing, where the vector space is no longer a square or rectangle. Let's see an example with the shearing transformation matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.

$$
\begin{aligned}
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\end{aligned}
$$

The new vector space after the transformation is a parallelogram. Shearing is not limited to just one basis vector: we can shear both $e_1$ and $e_2$ at the same time.
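A NumPy sketch (our own) of this shear; note the determinant is 1, so the parallelogram has the same area as the original unit square:

```python
import numpy as np

H = np.array([[1, 1],   # shear: e2 is pushed to (1, 1)
              [0, 1]])
print(H @ np.array([1, 0]))  # e1 -> (1, 0), unchanged
print(H @ np.array([0, 1]))  # e2 -> (1, 1)

# A shear preserves area: the determinant is 1
print(np.linalg.det(H))
```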
Rotation Matrix
Rotation is another common type of transformation. We can rotate the basis vectors by 90° anti-clockwise using the transformation matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

$$
\begin{aligned}
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \left( x \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \left( y \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \left( \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + y \left( \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \\
&= x \begin{pmatrix} 0 \\ 1 \end{pmatrix} + y \begin{pmatrix} -1 \\ 0 \end{pmatrix}
\end{aligned}
$$

The more general case of rotating the basis vectors by any angle $\theta$ anti-clockwise is given by $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.
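The general rotation matrix is easy to build as a small helper (our own NumPy sketch; the `rotation` function name is our choice). Setting $\theta = 90°$ recovers the matrix above:

```python
import numpy as np

def rotation(theta):
    """Matrix that rotates vectors by theta radians anti-clockwise."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R90 = rotation(np.pi / 2)
print(np.round(R90 @ np.array([1, 0])))  # e1 -> (0, 1)
print(np.round(R90 @ np.array([0, 1])))  # e2 -> (-1, 0)
```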
Composite Transformation Matrix
It is also possible to apply transformations to the basis vectors more than once. For example, we can first do a 90° anti-clockwise rotation using $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, followed by a flip across the vertical axis using $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$. The composite transformation is just the matrix product of the two basic transformation matrices.

$$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

We can add more matrices to the expression for more transformation steps. Note that the order of operations in matrix multiplication matters, because matrix multiplication is not commutative, i.e. $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$. The matrix for the later transformation always sits to the left of the matrix for the earlier transformation in the product.
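Both the composite product and the failure of commutativity can be checked directly (our own NumPy sketch):

```python
import numpy as np

R = np.array([[0, -1],   # 90° anti-clockwise rotation
              [1,  0]])
F = np.array([[-1, 0],   # flip across the vertical axis
              [ 0, 1]])

# Rotate first, then flip: the later matrix F goes on the left
print(F @ R)   # [[0, 1], [1, 0]], the +45° mirror matrix

# The reverse order gives a different transformation
print(R @ F)   # [[0, -1], [-1, 0]]
```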
Matrix transformation is a very important step in many machine learning applications, especially image-related tasks. One good example is face recognition, where you need to center and align the subject's face in the captured image.
(Inspired by Mathematics for Machine Learning lecture series from Imperial College London)