Linear Algebra

Key analogies for the subject.

Ahas from the BetterExplained Linear Algebra Guide

Linear algebra

  • Means “line-like relationships”. That is, relationships that scale perfectly and can be added together.

A matrix

  • A mini-spreadsheet. You can make computations on one spreadsheet from the commands in another spreadsheet, and get a new sheet.

Row/column multiplication

  • Imagine “pouring” the rows onto another.


Determinant

  • The scaling size of the transformation the matrix applies. If the input is unit size (area/volume of 1), the determinant is the size of the transformed area or volume. A determinant of 0 means the matrix is “destructive” and cannot be reversed (like multiplying by zero: information was lost).
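To make the determinant idea concrete, here is a minimal sketch in plain Python. The helper names `det2` and `apply` and the two example matrices are illustrative assumptions, not taken from the guide.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return tuple(row[0] * v[0] + row[1] * v[1] for row in m)

stretch = [[2, 0], [0, 3]]   # unit square -> 2-by-3 rectangle
flatten = [[1, 2], [2, 4]]   # second row is twice the first

print(det2(stretch))   # 6: the unit square's area is scaled by 6
print(det2(flatten))   # 0: the plane is squashed onto a line

# Two different inputs land on the same output, so the
# transformation cannot be undone.
print(apply(flatten, (2, 0)), apply(flatten, (0, 1)))   # (2, 4) (2, 4)
```

The last line is the "information was lost" part: once two inputs collapse to one output, no inverse can tell them apart.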


Eigenvector

  • The “axes of transformation”. In a spinning globe, every position changes location except the poles/axis. Along an eigenvector the direction doesn’t change, but the size might; the eigenvalue is the scaling factor.
  • TODO: link this to the Fourier Transform
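A quick way to see "direction doesn't change, only size" is to apply a matrix to a few vectors. This is a small sketch in plain Python; the matrix `M` and the helper `apply` are assumptions chosen so the numbers stay whole.

```python
def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return tuple(row[0] * v[0] + row[1] * v[1] for row in m)

M = [[2, 1], [1, 2]]   # illustrative matrix, not from the guide

print(apply(M, (1, 1)))    # (3, 3): same direction, scaled by 3
print(apply(M, (1, -1)))   # (1, -1): same direction, scaled by 1
print(apply(M, (1, 0)))    # (2, 1): direction changed, not an eigenvector
```

Here (1, 1) and (1, -1) are eigenvectors of `M` with eigenvalues 3 and 1, while (1, 0) gets turned as well as stretched.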

Some more insights are on Math Stack Exchange, which can be mined for Ahas. One key question is why the determinant measures the volume of the parallelepiped specified by the matrix.

Also: reddit thread on why eigenvectors are important

Another nice demo:

  • See eigenvectors as “attractors” that pull arbitrary vectors to them. (Makes sense, the components which are not part of the eigenvector will be transformed, so there’s less and less of the vector that is away from the eigenvector after each transformation.)
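The "attractor" picture can be tried out by applying a matrix over and over and rescaling each time (often called power iteration). A rough sketch in plain Python follows; the matrix `M`, the starting vector, and the step count are assumptions. The pull is toward the eigenvector with the largest eigenvalue in absolute value, and it only shows up if the starting vector has some component along it.

```python
import math

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return tuple(row[0] * v[0] + row[1] * v[1] for row in m)

def normalize(v):
    """Rescale v to length 1 so repeated products do not blow up."""
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

M = [[2, 1], [1, 2]]   # eigenvectors (1, 1) and (1, -1), eigenvalues 3 and 1
v = (1.0, 0.0)         # arbitrary start, not an eigenvector

for _ in range(20):
    v = normalize(apply(M, v))

print(v)   # both components end up close to 0.7071, i.e. the (1, 1) direction
```

Each step multiplies the (1, 1) part by 3 and the (1, -1) part by 1, so the share of the vector lying away from (1, 1) shrinks by a factor of three per step.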

Terence Tao’s course:

Source: These Math.StackExchange articles:

Why don’t we define matrix multiplication component-wise?

Or, in the words of your college-student self in Linear Algebra class: why is matrix multiplication so weird? Considering how matrix addition and subtraction work (and how multiplication relates to addition in other fields), component-wise multiplication seems more natural and intuitive. It is also associative, commutative, and distributive over addition, and it has an identity element. In fact, it is a christened operation: the Hadamard product.
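For a side-by-side look at the two candidates, here is a small sketch in plain Python. The function names `hadamard` and `matmul` and the example matrices are illustrative assumptions.

```python
def hadamard(A, B):
    """Component-wise product of two same-shaped matrices."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    """Traditional row-by-column matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

print(hadamard(A, B), hadamard(B, A))   # [[0, 2], [3, 0]] both times
print(matmul(A, B))                     # [[2, 1], [4, 3]]
print(matmul(B, A))                     # [[3, 4], [1, 2]]
```

The component-wise product does not care about order, while the traditional product does: `B` swaps columns when applied on the right and rows when applied on the left.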

But it has its problems. It is significantly less useful (it shows up mainly in things like compression algorithms) and less frequent in mathematics (just compare the lengths of their Wikipedia pages), so we cannot do away with traditional multiplication altogether. In fact, without it, all the fancy row-and-column notation of our matrices becomes meaningless.

Verdict: Traditional multiplication is very important, but would have been better off named something else (e.g., composition of matrices) to avoid the baggage of the word ‘multiplication’ (which should have been defined in the obvious way before).

Appendix: Matrices and Vectors

As Kalid also puts it in his linear algebra article, matrices and vectors are two different things and are not inextricably linked, as you might come to think. Rather, matrices are just glorified 2-D arrays (or spreadsheets, as Kalid puts it) and one possible representation of vectors.

Here’s a quote of another answer:

A vector is not a tuple of individual components. A vector is an element of some vector space.

When you’re writing a vector as such a tuple, you’re only referring to the expansion of the vector in some particular basis. But this basis is very often not even specified. Which is actually ok, because the “normal” vector operations don’t in fact depend on the choice, i.e. if you transformed all your vectors to be written out in some other basis, you would have all different numbers but the same calculations would still yield correct results.

But that wouldn’t work for component-wise multiplication, as was already shown. This operation simply does not work on the vectors but on their basis representation, which is only well-defined for some fixed choice of basis, which is not what you’re actually interested in when studying vectors.

Of course, there are plenty of applications where you are in fact interested in tuples of numbers, but those aren’t vectors then. There’s nothing wrong with the Hadamard product, but it doesn’t work on vectors but on matrices. If you want to multiply components, then your objects may be called tuples or arrays or lists or whatever, but hardly vectors.
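The basis-dependence claim in this quote can be checked with a tiny experiment. This is a sketch in plain Python; the second basis b1 = (1, 0), b2 = (1, 1) and all function names are assumptions picked for easy arithmetic.

```python
def to_new_basis(v):
    """Coordinates of v in the hypothetical basis b1 = (1, 0), b2 = (1, 1)."""
    x, y = v
    return (x - y, y)   # since v = (x - y) * b1 + y * b2

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def hadamard(u, v):
    return (u[0] * v[0], u[1] * v[1])

u, v = (1, 2), (3, 4)

# Addition: same answer whether we add first or change basis first.
print(to_new_basis(add(u, v)))                       # (-2, 6)
print(add(to_new_basis(u), to_new_basis(v)))         # (-2, 6)

# Component-wise product: the two routes disagree.
print(to_new_basis(hadamard(u, v)))                  # (-5, 8)
print(hadamard(to_new_basis(u), to_new_basis(v)))    # (1, 8)
```

Addition describes the same vector in either basis; the component-wise product gives a different vector depending on which coordinates you happened to write down.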

Adding some to Eigenvectors/Eigenvalues, and the general concept of coordinates:

Change of coordinates is often motivated by eigenvectors. If you can represent a linear transformation in its most natural coordinate system, it is often revealed to be equivalent to decoupled dilations across its different dimensions.
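As a small illustration of "decoupled dilations", here is a sketch in plain Python that rewrites a matrix in the coordinate system of its eigenvectors. The matrix `M`, the basis matrix `P`, and the helper names are assumptions for illustration; `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def matmul(A, B):
    """Traditional row-by-column matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def inverse2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]   # columns are the eigenvectors (1, 1) and (1, -1)

# M expressed in eigenvector coordinates.
D = matmul(inverse2(P), matmul(M, P))
print(D == [[3, 0], [0, 1]])   # True
```

In the new coordinates the matrix is diagonal: it stretches one axis by 3 and leaves the other alone, with no mixing between the two.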

In order to understand this properly, you have to first explain how matrices encode linear transformations. (YatharthROCK seems to be getting at this in an earlier comment, and several others have commented about this as well.) Every linear transformation is uniquely determined by how it transforms basis vectors. For me, this has always been a huge ‘aha!’ realization. The formal argument for this is straightforward, but pedagogically it might be best to start with a concrete example.

So suppose you have a linear transformation $ T : \mathbb{R}^2 \rightarrow \mathbb{R}^2 $. You’d like to know how to apply $T$ to an arbitrary vector $v \in \mathbb{R}^2$, let’s say, $v = (3, 4)$. Unfortunately, all you know about $T$ is what it does to your standard basis vectors $e_1 = (1, 0) $ and $e_2 = (0, 1)$ : for instance, you are given that $ T(e_1) = (-1, 1) $ and $ T(e_2) = (2, 1)$. How do you proceed?

Well, we can represent our input $v$ in terms of our basis vectors: $v = 3e_1 + 4e_2$. Then $T(v) = T(3e_1 + 4e_2)$. Finally, we can apply the basic properties of linearity to get that $$T(v) = 3T(e_1) + 4T(e_2) $$

Neat, so now it’s clear that all we needed to know was what $T$ did to $e_1$ and $e_2$. So the result is then
$$ 3 \cdot (-1, 1) + 4 \cdot (2, 1) = (5, 7) $$
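The same calculation as a minimal Python sketch, using only the two facts we were given about $T$. The names `T_e1`, `T_e2`, and `apply_T` are made up for illustration.

```python
T_e1 = (-1, 1)   # given: T(e1)
T_e2 = (2, 1)    # given: T(e2)

def apply_T(v):
    """Compute T(v) by linearity: T(x*e1 + y*e2) = x*T(e1) + y*T(e2)."""
    x, y = v
    return (x * T_e1[0] + y * T_e2[0],
            x * T_e1[1] + y * T_e2[1])

print(apply_T((3, 4)))   # (5, 7)
```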

It’s a pretty cool fact that all you need to know about a linear transformation in order to completely understand it is what it does to basis vectors. In fact, it’s so cool that people decided it would be really handy to spiffify their calculations by using matrices!

To wit, consider the matrix

$$ A =
\left( \begin{array}{cc}
-1 & 2 \\
1 & 1
\end{array} \right) $$

Try working out the matrix-vector product $Av$ and you’ll encounter a nice surprise:

$$ Av =
\left( \begin{array}{cc}
-1 & 2 \\
1 & 1
\end{array} \right) \left( \begin{array}{c}
3 \\
4
\end{array} \right) = \left( \begin{array}{c}
5 \\
7
\end{array} \right) $$


This is one way to motivate matrix multiplication: matrices are simply representations of linear transformations with respect to a particular basis. Each column of the matrix is just the output of what the transformation does to a basis vector. The usefulness of the representation is that it allows you to quickly carry out computations such as the above one. And after all, it makes perfect sense: all you need to know about a linear transformation is what it does to a basis, so why not just put that down in a little spreadsheet to represent that transformation?
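Here is one way to sketch "each column is what the transformation does to a basis vector" in plain Python. The formula inside `T` is an assumption: it is simply one way of writing a linear map with $T(e_1) = (-1, 1)$ and $T(e_2) = (2, 1)$, and `matrix_of` is a hypothetical helper name.

```python
def T(v):
    """The example transformation, written as a formula."""
    x, y = v
    return (-x + 2 * y, x + y)

def matrix_of(transform):
    """Build the 2x2 matrix whose columns are transform(e1) and transform(e2)."""
    col1 = transform((1, 0))
    col2 = transform((0, 1))
    return [[col1[0], col2[0]],
            [col1[1], col2[1]]]

A = matrix_of(T)
print(A)   # [[-1, 2], [1, 1]]
```

Feeding in the two standard basis vectors is enough to recover the whole spreadsheet.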

An under-taught but useful way of understanding these multiplications is that they are carried out column-by-coordinate. In other words, to multiply $A$ by $v$ as above, you take each column of $A$ from left to right as a vector, scale it by the corresponding coordinate of $v$ from top to bottom, and add up the results. Hence,
$$ Av =
3 \left( \begin{array}{c}
-1 \\
1
\end{array} \right)
+ 4 \left( \begin{array}{c}
2 \\
1
\end{array} \right)
= \left( \begin{array}{c}
5 \\
7
\end{array} \right) $$

Hopefully, this fancy multiplication actually jibes with our new understanding: $v$ is composed of $3$ parts of one basis vector and $4$ parts of the other, so to figure out how $A$ transforms it, we take $3$ parts of the first column of $A$ (which is by definition what $A$ did to the first basis vector) and $4$ parts of the second column of $A$ (by similar reasoning). This has always made way more sense to me than the usual approach of doing some sort of dot product between rows and columns.
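The two ways of reading the product can be compared directly. A minimal sketch in plain Python, with `by_rows` and `by_columns` as made-up names for the two views:

```python
A = [[-1, 2], [1, 1]]
v = (3, 4)

def by_rows(m, v):
    """Usual view: dot each row of m with v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in m)

def by_columns(m, v):
    """Column view: add up the columns of m, each scaled by a coordinate of v."""
    result = [0] * len(m)
    for j, coord in enumerate(v):
        for i in range(len(m)):
            result[i] += coord * m[i][j]
    return tuple(result)

print(by_rows(A, v))      # (5, 7)
print(by_columns(A, v))   # (5, 7)
```

Both give the same numbers; the column view just makes the "3 parts of the first column plus 4 parts of the second" reading explicit.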

There might be even a sort of geometric intuition behind all this too: if you specify what your linear transformation does on the ‘axes’ of your space (i.e., the basis vectors) then it seems reasonable to be able to extrapolate what it does to an arbitrary vector, which can be decomposed into a linear combination of the axes. The real heart of the question is, however, what should we choose for our axes? After all, the transformation is independent of the choice of basis. I’ll get to this idea and the relationship with eigenvectors in my next post.
