
note that "gradient descent" isn't AI either. it's more computational linear algebra: a heuristic for numerical methods used to solve (usually to a local extremum) systems of equations without direct analytical solution(s).



AI is gradient descent?

isn't it just gradient descent?

Gradient descent is for non-linear problems where you can't directly invert or somehow linearize the problem.

Thing is, gradient descent is not really a complex algorithm.

This is not abstract math, and the article does explain what it's doing before presenting the code snippets.

How can you explain or implement gradient descent without math? At some point I think you have to accept that this is a topic that involves math, and you're way better off understanding it on those terms rather than trying to avoid it.


I thought gradient descent was mostly calculus, not linear algebra. I was under the impression linear algebra was used to frame calculations so that GPUs could be utilized (since GPUs are very good at LA operations)

Or gradient descent if you mentally negate the number in question. It's the same thing.

I'm not an AI expert either, but let me give this a try.

I assume you are vaguely familiar with gradient descent. In gradient descent, we are basically trying to find the sweet spot where the value of a function is minimized. We do this by calculating the derivative of the function at a certain point and then using it to take small steps in the direction where we believe the function will have a lower value.
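A minimal sketch of that idea, using a made-up one-variable function f(x) = (x - 3)^2 whose minimum we know is at x = 3:

```python
# Minimal gradient descent on f(x) = (x - 3)**2, whose minimum is at x = 3.
# The derivative is f'(x) = 2 * (x - 3); each step moves opposite the derivative.

def gradient_descent(start, learning_rate=0.1, steps=100):
    x = start
    for _ in range(steps):
        grad = 2 * (x - 3)            # derivative of (x - 3)**2 at x
        x = x - learning_rate * grad  # small step downhill
    return x

print(gradient_descent(start=0.0))  # converges to roughly 3.0
```

Each iteration shrinks the distance to the minimum by a constant factor here, which is why so few steps suffice on a toy example like this.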

Gradient descent usually suffers from a problem where the algorithm gets stuck in local minima if the function is not convex.

However, when people use gradient descent to optimize functions with a very large number of parameters (as is the case in Deep Learning), another problem surfaces called saddle points. Imagine a 3-dimensional plot of the function at different values of its parameters (in reality the plot will be multi-dimensional). Now on this plot, there will be many regions where the derivatives of the components defining the surface become zero. This messes with our plan to use derivatives to find the direction in which to move. So we need to come up with strategies to escape saddle points during the gradient descent process.
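The classic textbook saddle is f(x, y) = x^2 - y^2: the gradient vanishes at the origin even though it is not a minimum. A small sketch, showing that plain gradient descent started exactly at the saddle never moves, while a tiny perturbation lets it escape:

```python
# Saddle point illustration on f(x, y) = x**2 - y**2.
# The gradient (2x, -2y) vanishes at the origin, which is a saddle, not a
# minimum: started exactly there, plain gradient descent is stuck forever.

def grad(x, y):
    return 2 * x, -2 * y

def descend(x, y, lr=0.1, steps=50):
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

print(descend(0.0, 0.0))   # stays at (0.0, 0.0): zero gradient, no movement
print(descend(0.0, 1e-6))  # the tiny y-perturbation grows each step: escapes
```

In practice, noise from mini-batch gradients plays the role of that perturbation, which is one reason stochastic gradient descent tends to wander off saddle points eventually.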


Gradient descent is a subset of survival of the fittest, described by Darwin in the 19th century, and has been applied in computer science since the '70s. An AGI will probably use some form of gradient descent during its training, yes, but I wouldn't argue that this has brought us even close to an AGI.

I've considered gradient descent for optimizing parameters on toy problems at university a few times. Never actually did it though, it's a lot of hassle for the advantage of less interaction at the cost of no longer building some intuition.

I know what gradient descent is, thanks, I was referring to the rest of that mess.

You are not off base at all, thanks for clarifying and sorry for the confusion, I did not mean to say it was using gradient descent. It's been a while. The term I was thinking of was "simulated annealing".

AI nowadays is curve fitting using gradient descent :p

So it doesn't use a neural network, but it is still optimized by gradient descent. Differentiability is the key!

FYI, gradient descent is covered in one of the very first weeks of Andrew Ng's Coursera machine learning class, so perhaps just watch those lessons (free)

Gradient descent is the approximate solution, basically, because getting the exact solution requires computing inverse matrices, which is apparently not yet feasible (it's too slow)


Machines learn to learn by gradient descent by gradient descent, not on YC.

People mostly use gradient descent to "solve" nonconvex problems.

That's something different: it allows doing gradient descent on the space of parameters for programs, not on the space of programs.

This is wrong. Gradient descent derives from multivariate calculus, not evolution.
