Predicting (function approximation) is perhaps the main use of neural nets, but there is another very important use: generating, or in other words, imagination. In generative mode we need to reason about probabilities and latent variables, so we need a little more than plain neural nets for this task.
I think an important thing to remember about neural nets is that they are basically a good way to overfit. So, yes, you can approximate any function because you have lots and lots of variables, but you shouldn't fool yourself that you are getting the same information as when you write a deterministic equation with a few variables ("with four parameters I can fit an elephant, with five I can make him wiggle his trunk"... von Neumann). You can catch the baseball, but you don't yet know the physics.
I think skilled biological actors use this overfitting to get really good at things without knowing how things actually work.
It's interesting, because the common wisdom that I've encountered is that neural networks are better when you can feed them large amounts of data. But some of the use cases here (e.g. approximating the solution to a set of partial differential equations) are so far outside the kind of work I do that I have a hard time conceptualizing them or how they work.
You don't understand. I am making this assertion as a former physicist. We have stumbled upon extremely general function approximators.
Effectively the same way that we can use equations to make inferences about reality. But because of the nature of mathematical notation and the limits of human ability, math, though powerful, only gets you so far. E.g. you can write out the idealized equations for heat propagation in a conductive medium, but solving them for a real object requires empirical simulation.
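To make that concrete, here is a minimal sketch of the kind of empirical simulation I mean: an explicit finite-difference solve of the idealized 1D heat equation. The grid size, time step, and diffusivity are made-up illustration values, not anything tuned for a real object.

    import numpy as np

    # Explicit finite-difference solve of the 1D heat equation u_t = alpha * u_xx.
    # All values (grid size, diffusivity, time step) are illustrative.
    alpha = 0.01               # thermal diffusivity
    nx, nt = 100, 500          # grid points in space, steps in time
    dx = 1.0 / (nx - 1)
    dt = 0.4 * dx**2 / alpha   # keep dt below the stability limit dx^2 / (2 * alpha)

    u = np.zeros(nx)
    u[nx // 2] = 1.0           # initial condition: a hot spot in the middle

    for _ in range(nt):
        # second spatial derivative via central differences on interior points
        u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])
        u[0] = u[-1] = 0.0     # fixed-temperature boundaries

    print(u.max())             # the hot spot has diffused outward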
Deep neural nets are the next step. Now you can essentially train these neural networks to infer not just the general idealized behaviors, but specific details, discrete values for X and Y on a fine grid that are beyond the practical limits of applied math.
But this is much bigger. It turns out that, much in the way that idealized equations apply to many problems (e.g. exponential growth arising from diff EQ), neural nets generalize to all manner of real world problems, provided the training data is appropriately curated.
These neural networks excel at learning human-like heuristics with machine-level precision. You can make inferences for both continuous and discrete probabilistic systems. This is a major development, and it's just starting. We've finally assembled the pieces in the last few years.
There are examples quite far from neural networks; the ones I can think of are broadly optimisation problems:
Many physics problems involve trying to find a function which minimises something -- energy, entropy, action. Or the state which makes the difference between two things zero. Sometimes adjusting many parameters slowly down the gradient is a good way to find these.
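As a toy illustration of "adjusting parameters slowly down the gradient": plain gradient descent on a made-up quadratic energy (the matrix, step size, and iteration count are arbitrary illustration values).

    import numpy as np

    # Find the state x that minimises a quadratic "energy"
    # E(x) = 0.5 * x^T A x - b^T x, whose gradient is A x - b.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])   # illustrative positive-definite matrix
    b = np.array([1.0, -1.0])

    x = np.zeros(2)
    lr = 0.1                                  # step size down the gradient
    for _ in range(1000):
        grad = A @ x - b
        x -= lr * grad

    print(x, np.linalg.solve(A, b))  # descent converges to the exact minimiser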
In Bayesian statistics, the basic problem is to sample, by some kind of Monte Carlo method, from a distribution which you know only indirectly. But the space to sample can be enormous. If I understand right, advanced ways of doing this exploit the gradients (of functions defining the distribution) to try to choose samples efficiently.
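A rough sketch of that idea: unadjusted Langevin dynamics, which nudges each sample along the gradient of the log-density plus noise. The target here is just a standard normal known only up to a constant; tools like Stan use the more careful Hamiltonian Monte Carlo, but the gradient-guided flavour is the same.

    import numpy as np

    # Gradient-guided sampling sketch (unadjusted Langevin dynamics).
    # Target: unnormalised log-density log p(x) = -x^2 / 2 (a standard normal);
    # its gradient -x steers proposals toward high-probability regions.
    rng = np.random.default_rng(0)

    def grad_log_p(x):
        return -x           # gradient of log p for a standard normal

    eps = 0.1               # step size (illustrative)
    x = 5.0                 # start far from the bulk of the distribution
    samples = []
    for _ in range(10_000):
        x = x + eps * grad_log_p(x) + np.sqrt(2 * eps) * rng.normal()
        samples.append(x)

    print(np.mean(samples), np.std(samples))  # roughly 0 and 1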
People have hacked tensorflow to do all sorts of things which its creators didn't intend. Or written tools specialised for another particular domain (like Stan). I guess the excitement is that instead of re-inventing the wheel in each domain, maybe this can be pushed down to become a language feature which everyone above uses.
I see neural nets as extensions of human thought - they explore the fuzzy depths we can't grasp directly, the way a microscope or a telescope enhances vision. They are enhanced correlation engines.
Neural nets are also likely to be used to model the behavior of other road users (i.e. 'how is this car/pedestrian/etc likely to move in the next few seconds'?). No-one seems to be seriously considering an end-to-end neural network (i.e. inputs are sensors, outputs are throttle and steering wheel), but neural nets are pervasive components of the full system.
Neural networks are function approximators. So if you 1) know an algorithm that is computationally expensive but not highly random and 2) have a lot of inputs and outputs of that algorithm, you can usually train a neural network to approximate it with what amounts to a closed-form formula: a bunch of matrix multiplies with some standard non-linear functions in between.
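A minimal sketch under those assumptions: the "expensive" algorithm here is just a stand-in function, and a one-hidden-layer network with made-up sizes is trained by plain gradient descent on sampled input/output pairs.

    import numpy as np

    # Learn a cheap surrogate for an "expensive" deterministic function from
    # sampled inputs/outputs. The stand-in for the costly algorithm is sin(3x);
    # the network is one tanh hidden layer (sizes and learning rate illustrative).
    rng = np.random.default_rng(0)

    def expensive_algorithm(x):          # placeholder for the real computation
        return np.sin(3 * x)

    X = rng.uniform(-1, 1, size=(256, 1))
    Y = expensive_algorithm(X)

    H = 32                               # hidden units
    W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
    lr = 0.05

    for _ in range(5000):
        # forward pass: matrix multiplies with a non-linearity in between
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - Y                   # mean-squared-error gradient
        # backward pass
        dW2 = h.T @ err / len(X); db2 = err.mean(axis=0)
        dh = err @ W2.T * (1 - h**2)
        dW1 = X.T @ dh / len(X); db1 = dh.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    # training error shrinks well below the initial fit
    print(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2))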
Neural nets are universal function approximators, so this sounds like a safe bet :) When I first read about them under that description, the subject clicked for me.