Need help with a simple example where it's not clear that the gradient is in the direction of "steepest ascent"

by A_Weierstrass   Last Updated June 13, 2019 12:20 PM

Say I am at a point $(x^*,y^*)$ of a function $f(x,y)$ where the function value increases if I take a very small step in any positive direction (i.e. in the direction of a vector where the coordinates $x$ and $y$ are both positive), but the function increases MORE if I take a very small step in another direction, say a vector where the $x$-coordinate is positive but the $y$-coordinate is negative. Doesn't that mean that the gradient does not point in the direction of steepest ascent?

There was a great answer in this thread about seeing the region around the point as "almost planar", but I still don't see why the function can't be differentiable at that point and still increase in both directions (even if it's by an infinitesimal amount), and increase just a little bit more in one direction than another. Does it really HAVE to mean that there is a sharp turn just at that point? Why can't it be smooth but still not planar?

I have drawn two examples where I am imagining that the point I am evaluating the gradient at is $(0,0)$. From there, it is supposed to be steeper to go in the direction of $(-ax, -by)$ than $(ax, by)$:

Example 1

Example 2

I am fairly new to math and very technical explanations are still hard for me to understand. I know I am asking for much, but additional ways of looking at it which are not algebraic would help me the most.


Answers 1

Around any non-stationary point, a smooth function is well approximated by a planar model

$$f(x+u,y+v)\approx f(x,y)+g_x(x,y)u+g_y(x,y)v,$$ where $g_x,g_y$ are the components of the gradient and $(u,v)$ is a small step away from the point.
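To see what "well approximated" means in practice, here is a minimal numerical sketch. The function `f`, the base point `(x0, y0)` and the step direction are hypothetical choices made only for illustration; they are not from the question. The sketch compares the true value with the planar model for shrinking steps.

```python
import math

def f(x, y):
    # Hypothetical smooth function, chosen only for illustration.
    return math.sin(x) + x * y + y ** 2

def grad_f(x, y):
    # Analytic gradient (g_x, g_y) of the f above.
    return (math.cos(x) + y, x + 2.0 * y)

def planar_model(x, y, u, v):
    # Planar model around (x, y): f + g_x * u + g_y * v.
    gx, gy = grad_f(x, y)
    return f(x, y) + gx * u + gy * v

def model_error(x, y, u, v):
    # Gap between the true function and the planar model for the step (u, v).
    return abs(f(x + u, y + v) - planar_model(x, y, u, v))

# Assumed base point and a fixed unit direction (0.6, 0.8), scaled by s.
x0, y0 = 0.3, -0.2
step_sizes = (1e-1, 1e-2, 1e-3)
errors = [model_error(x0, y0, s * 0.6, s * 0.8) for s in step_sizes]

for s, e in zip(step_sizes, errors):
    print(f"step length {s:g}: planar model error {e:.3e}")
```

For this particular choice, a step ten times shorter gives an error roughly a hundred times smaller. So the curvature is still there (the surface is smooth but not planar), yet it fades faster than the slope does, which is why only the planar part matters when comparing infinitesimal steps.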

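If you look for the direction of largest increase, you can take a step of unit length $(u,v)=(\cos\theta,\sin\theta)$ and maximize

$$g_x\cos\theta+g_y\sin\theta,$$

which can be done by finding the roots of the derivative

$$-g_x\sin\theta+g_y\cos\theta=0.$$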

From this equation,

$$\tan\theta=\frac{g_y}{g_x},$$ and

$$\begin{cases}\cos\theta=\pm\dfrac{g_x}{\sqrt{g_x^2+g_y^2}},\\\sin\theta=\pm\dfrac{g_y}{\sqrt{g_x^2+g_y^2}},\end{cases}$$ where the signs are synchronized.

Hence after simplification,

$$f_{\max},f_{\min}=f\pm\sqrt{g_x^2+g_y^2}$$ are obtained for a unit step in opposite directions: the maximum along the gradient itself, the minimum directly against it. Always.
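The situation described in the question can be checked numerically. The sketch below uses a hypothetical function `f` (my own choice, not the one in the drawings) that increases from $(0,0)$ both towards "up-right" and towards "down-right", and more in the second. It scans all directions by brute force and compares the steepest one with the gradient direction.

```python
import math

def f(x, y):
    # Hypothetical smooth function, chosen only for illustration.
    return 3.0 * x - y + 0.5 * x * y + x ** 2

def grad_f(x, y):
    # Analytic gradient (g_x, g_y) of the f above.
    return (3.0 + 0.5 * y + 2.0 * x, -1.0 + 0.5 * x)

def directional_slope(x, y, theta, h=1e-6):
    # Central finite-difference slope of f along (cos theta, sin theta).
    u, v = math.cos(theta), math.sin(theta)
    return (f(x + h * u, y + h * v) - f(x - h * u, y - h * v)) / (2.0 * h)

def steepest_direction(x, y, n=3600):
    # Brute-force scan over n equally spaced angles in [0, 2*pi).
    angles = [2.0 * math.pi * k / n for k in range(n)]
    best_theta = max(angles, key=lambda t: directional_slope(x, y, t))
    return best_theta, directional_slope(x, y, best_theta)

gx, gy = grad_f(0.0, 0.0)
theta_grad = math.atan2(gy, gx)   # angle of the gradient
grad_norm = math.hypot(gx, gy)    # sqrt(g_x^2 + g_y^2)

theta_best, slope_best = steepest_direction(0.0, 0.0)
# Signed angular difference, wrapped into [-pi, pi).
angle_diff = (theta_best - theta_grad + math.pi) % (2.0 * math.pi) - math.pi

slope_up_right = directional_slope(0.0, 0.0, math.pi / 4)     # x > 0, y > 0
slope_down_right = directional_slope(0.0, 0.0, -math.pi / 4)  # x > 0, y < 0

print("slope towards up-right:  ", slope_up_right)
print("slope towards down-right:", slope_down_right)
print("steepest slope found:    ", slope_best, "vs gradient norm", grad_norm)
print("angle between steepest direction and gradient:", angle_diff)
```

Both diagonal slopes come out positive, with the down-right one larger, and the gradient here has a positive $x$-component and a negative $y$-component. So "increases in both directions, but more in one" does not contradict anything: it just means the gradient leans towards the steeper side.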

Yves Daoust
June 13, 2019 12:15 PM
