Adding a bit on eigenvectors/eigenvalues and the general concept of coordinates:
Change of coordinates is often motivated by eigenvectors. If you can represent a linear transformation in its most natural coordinate system, it often turns out to be nothing more than independent scalings (decoupled dilations) along each of its dimensions.
In order to understand this properly, you have to first explain how matrices encode linear transformations. (YatharthROCK seems to be getting at this in an earlier comment, and several others have commented about this as well.) Every linear transformation is uniquely determined by how it transforms basis vectors. For me, this has always been a huge ‘aha!’ realization. The formal argument for this is straightforward, but pedagogically it might be best to start with a concrete example.
So suppose you have a linear transformation $T : \mathbb{R}^2 \rightarrow \mathbb{R}^2$. You’d like to know how to apply $T$ to an arbitrary vector $v \in \mathbb{R}^2$, let’s say $v = (3, 4)$. Unfortunately, all you know about $T$ is what it does to the standard basis vectors $e_1 = (1, 0)$ and $e_2 = (0, 1)$: for instance, you are given that $T(e_1) = (-1, 1)$ and $T(e_2) = (2, 1)$. How do you proceed?
Well, we can represent our input $v$ in terms of our basis vectors: $v = 3e_1 + 4e_2$. Then $T(v) = T(3e_1 + 4e_2)$. Finally, we can apply the basic properties of linearity to get that $$T(v) = 3T(e_1) + 4T(e_2) $$
Neat, so now it’s clear that all we needed to know was what $T$ did to $e_1$ and $e_2$. So the result is then
$$ 3 \cdot (-1, 1) + 4 \cdot (2, 1) = (5, 7) $$
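If it helps to see that computation spelled out, here is a minimal Python sketch. The names `T_e1`, `T_e2` and `apply_T` are made up for illustration; the only inputs are the two basis images given in the example.

```python
# Images of the standard basis vectors, taken from the example.
T_e1 = (-1, 1)  # T(e_1)
T_e2 = (2, 1)   # T(e_2)


def apply_T(v):
    """Apply T to v = (a, b) using linearity: T(v) = a*T(e_1) + b*T(e_2)."""
    a, b = v
    return tuple(a * x + b * y for x, y in zip(T_e1, T_e2))


print(apply_T((3, 4)))  # (5, 7)
```

Note that `apply_T` never needs a formula for $T$ itself, only the two basis images.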
It’s a pretty cool fact that all you need to know about a linear transformation in order to completely understand it is what it does to basis vectors. In fact, it’s so cool that people decided it would be really handy to spiffify their calculations by using matrices!
To wit, consider the matrix
$$
A =
\left( \begin{array}{cc}
-1 & 2 \\
1 & 1
\end{array} \right)
$$
Try working out the matrix-vector product $Av$ and you’ll encounter a nice surprise:
$$ Av =
\left( \begin{array}{cc}
-1 & 2 \\
1 & 1
\end{array} \right) \left( \begin{array}{c}
3 \\
4
\end{array} \right) = \left( \begin{array}{c}
5 \\
7
\end{array} \right)
$$
Whoa!
This is one way to motivate matrix multiplication: matrices are simply representations of linear transformations with respect to a particular basis. Each column of the matrix is just the output of what the transformation does to a basis vector. The usefulness of the representation is that it allows you to quickly carry out computations such as the above one. And after all, it makes perfect sense: all you need to know about a linear transformation is what it does to a basis, so why not just put that down in a little spreadsheet to represent that transformation?
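As a rough sketch of the "spreadsheet" idea, the snippet below builds the matrix by writing each basis image into a column and then multiplies in the usual row-by-column way. The helper names `matrix_from_basis_images` and `matvec` are hypothetical, and plain Python lists stand in for a real matrix type.

```python
def matrix_from_basis_images(images):
    """Build a matrix (a list of rows) whose j-th column is images[j]."""
    return [list(row) for row in zip(*images)]


def matvec(A, v):
    """Usual row-by-column product: entry i is the dot product of row i with v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Columns are T(e_1) = (-1, 1) and T(e_2) = (2, 1) from the example.
A = matrix_from_basis_images([(-1, 1), (2, 1)])
print(A)                  # [[-1, 2], [1, 1]]
print(matvec(A, [3, 4]))  # [5, 7]
```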
An under-taught but useful way of understanding these multiplications is that they are carried out column by coordinate. In other words, to multiply $A$ by $v$ as above, you take each column of $A$, from left to right, as a vector, scale it by the corresponding coordinate of $v$, from top to bottom, and add up the results. Hence,
$$ Av =
3 \left( \begin{array}{c}
-1 \\
1
\end{array} \right)
+
4 \left( \begin{array}{c}
2 \\
1
\end{array} \right)
$$
Hopefully, this fancy multiplication actually jibes with our new understanding: $v$ is composed of $3$ parts of one basis vector and $4$ parts of the other, so to figure out how $A$ transforms it, we take $3$ parts of the first column of $A$ (which is by definition what $A$ does to the first basis vector) and $4$ parts of the second column of $A$ (by similar reasoning). This has always made way more sense to me than the usual approach of doing some sort of dot product between rows and columns.
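Here is the column-by-coordinate view as a small Python sketch. The function name `matvec_by_columns` is invented for this example, and the matrix is stored as a plain list of rows, so the columns have to be pulled out first.

```python
def matvec_by_columns(A, v):
    """Compute A v as a sum of the columns of A, each scaled by a coordinate of v."""
    columns = list(zip(*A))  # transpose the list of rows to get the columns
    result = [0] * len(A)
    for coeff, col in zip(v, columns):
        # Add coeff "parts" of this column to the running total.
        result = [r + coeff * c for r, c in zip(result, col)]
    return result


A = [[-1, 2], [1, 1]]
print(matvec_by_columns(A, [3, 4]))  # [5, 7]
```

It gives the same answer as the row-by-column dot products; the arithmetic is just grouped by column rather than by row.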
There might even be a sort of geometric intuition behind all this: if you specify what your linear transformation does on the ‘axes’ of your space (i.e., the basis vectors), then it seems reasonable to be able to extrapolate what it does to an arbitrary vector, which can be decomposed into a linear combination of the axes. The real heart of the question, however, is: what should we choose for our axes? After all, the transformation itself is independent of the choice of basis. I’ll get to this idea and its relationship with eigenvectors in my next post.