Well, you wrote a good explanation that even a not-so-smart guy can follow, but I think you missed the obvious: WHY does the gradient show the direction of greatest increase?
I think the principle of the gradient is quite easy, but understanding why it works the way it does is a bit tricky, and you should have focused on it more.
Hi Palo, that’s a great point! I’ve been feeling a bit guilty, if you can imagine it, because the article lacks that explanation.
I’m probably going to do a separate article on the reason why the gradient points in the direction of greatest increase – I have another explanation that it works well with. Thanks for the link and feedback!
You claim: “Points in the direction of greatest increase of a function”.
Why? It can also point in the direction of greatest decrease of a function.
A gradient is one or more directional derivatives. These derivatives are considered in a particular direction. In the case of single variable calculus, we generally talk about a directional derivative when we consider multiples of the x unit vector, i.e. k*(1,0). To consider the y unit vector, we deal with the partial derivatives with respect to y in a given direction. In three dimensions, the 3 partial derivatives form what we now call a ‘gradient’.
So in fact it is incorrect to call this a slope or anything else except to say that it describes the partial derivatives of a point in the direction of a given vector in space.
Does this make sense? Please visit my blog for some more interesting reading.
Hi John, thanks for writing. You’re right, formally the gradient is a set of partial derivatives, i.e. directional derivatives taken along each coordinate axis.
But when thinking about the intuitive meaning, I think it’s ok to consider the gradient as a vector that “points” in the direction of greatest increase (i.e. if you follow that direction your function will tend towards a local maximum).
Unless I’m mistaken, the gradient vector always points in the direction of greatest increase (greatest decrease would be in the opposite direction).
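Here is a quick numerical check of that claim, as a minimal sketch only. The function f(x, y) = x^2 + 3y, the point (1, 2), and all helper names are made up for illustration and are not from the article. The code samples unit directions around the point, keeps the one where f rises fastest, and compares it with the normalized gradient.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 + 3*y

def grad_f(x, y):
    # Analytic gradient of f: (df/dx, df/dy) = (2x, 3).
    return (2*x, 3.0)

def directional_derivative(x, y, ux, uy, h=1e-6):
    # Central-difference estimate of the rate of change of f
    # when moving from (x, y) along the unit vector (ux, uy).
    return (f(x + h*ux, y + h*uy) - f(x - h*ux, y - h*uy)) / (2*h)

def steepest_direction(x, y, samples=3600):
    # Try evenly spaced unit directions (0.1 degree apart by default)
    # and keep the one with the largest rate of increase.
    best_angle = max(
        (2*math.pi*k/samples for k in range(samples)),
        key=lambda a: directional_derivative(x, y, math.cos(a), math.sin(a)),
    )
    return (math.cos(best_angle), math.sin(best_angle))

px, py = 1.0, 2.0
gx, gy = grad_f(px, py)
norm = math.hypot(gx, gy)
unit_grad = (gx/norm, gy/norm)
best = steepest_direction(px, py)

print("normalized gradient:   ", unit_grad)
print("best sampled direction:", best)
print("rate along gradient:   ", directional_derivative(px, py, *unit_grad))
print("rate against gradient: ", directional_derivative(px, py, -unit_grad[0], -unit_grad[1]))
```

Under these assumptions, the best sampled direction should agree with the normalized gradient up to the sampling resolution, and the rate of change in the opposite direction should be the same size with a negative sign, which is the "greatest decrease" case.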
What I was saying is that it points either one way or the other; it is not restricted to the direction of greatest increase. As a simple example, consider what happens when you differentiate a parabola: you set the derivative equal to 0 and then determine whether it has a maximum or a minimum at its turning point. It is not always a maximum, just as it is not always a minimum. I think I have explained this correctly now.
Hi John, thanks for the clarification. I’d still politely disagree and say that in general, the gradient points in the direction of greatest increase :).
In the case of 2 dimensions (a single-variable function such as y = f(x)), the gradient/slope only gives a forward or backward direction. A positive slope means travel “forward” and a negative slope means travel “backwards”.
Consider f(x) = x^2, a regular parabola. The gradient is zero at the minimum (x = 0), and there is no single direction to go. At x = -1, the slope is negative, which means travel “backwards” (to x = -2) to increase your value. Similarly, at x = 1, the slope is positive, so you travel forward (to x = 2) to increase your value.
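The parabola example can be sketched in a few lines of code. This is only an illustration: the helper names `slope`, `direction_of_increase`, and `step` are made up here, and the step size of 1 is chosen to match the x = -1 to x = -2 example above.

```python
def f(x):
    # The parabola from the example.
    return x**2

def slope(x):
    # Derivative of x^2 with respect to x.
    return 2*x

def direction_of_increase(x):
    # The sign of the slope says which way to travel to increase f.
    s = slope(x)
    if s > 0:
        return "forward"
    if s < 0:
        return "backward"
    return "none"  # slope is zero: no single direction to follow

def step(x, size=1):
    # Move `size` units in the direction the slope points to.
    s = slope(x)
    if s > 0:
        return x + size
    if s < 0:
        return x - size
    return x

for x in (-1, 0, 1):
    print(x, slope(x), direction_of_increase(x), step(x), f(step(x)))
```

At x = -1 and x = 1 the step lands on x = -2 and x = 2, and f goes from 1 to 4 in both cases. At x = 0 the slope is zero, so the sketch stays put, even though a move either way would increase f.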
But, as you mention, strange things can happen when the derivative = 0. It can mean you are at a local maximum (no way to improve), or at a local minimum (no single direction to improve your position – forward or back will help). I consider the corner case of zero an exception to the general rule / intuition that the gradient is “the direction to follow” if you want to improve your function.
Thanks a bunch! I didn’t think it could be this simple to find the maximum increase at a point, so I thought I’d look it up. Thanks to your great explanation, it turns out it was as easy as it seemed it should be. Great job! Thanks!