Deep learning was a noticeable improvement over previous neural models, sure. But deep learning is not the entire field of AI and ML. There has been plenty of other work going on, like Neural Turing Machines and Differentiable Neural Computers.
Is that still true? I've read of some significant improvements in machine learning in recent years with deep learning, not to mention cloud services with massive supercomputers and datasets.
Machine learning people basically agree that there haven't been any big breakthroughs in deep learning. The success and the hype are mostly a combination of more computing power and more data. The algorithms (convolutional neural networks, etc.) were invented back in the 1980s and even earlier.
There have been some improvements, but they are indeed incremental: wider use of ReLU activations, dropout, and so on. It's not a new paradigm at all.
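To make the ReLU/dropout point concrete, here's a minimal NumPy sketch of both pieces (the function names and defaults are my own, not anything from the thread):

    import numpy as np

    def relu(x):
        # ReLU: element-wise max(0, x).
        return np.maximum(0.0, x)

    def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
        # Inverted dropout: zero units with probability p during training
        # and rescale the survivors so the expected activation is unchanged.
        if not training or p == 0.0:
            return x
        mask = rng.random(x.shape) >= p
        return x * mask / (1.0 - p)

    h = relu(np.array([-1.0, 0.5, 2.0]))
    print(dropout(h, p=0.5))

Both are tiny tweaks to how a layer computes and trains, which is why they read as refinements rather than a new paradigm.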
While there have been some good incremental improvements, I think you are overstating the place of deep learning vis-à-vis everything in machine learning that came before it. The same thing happened with SVMs in the early 2000s, but with much less industry noise. NB I'm not trying to compare the impact of the two, just noting that it was also overstated.
After all, deep learning is fundamentally a continuation of much older techniques. And we absolutely could identify objects in pictures before; we've been doing that for 40 years now. Deep learning techniques have given a very nice jump in accuracy on some tasks, but they didn't come out of nowhere.
I agree the whole "AI" labelling is problematic. That is also a decades-old problem, though...
The moment I realized that deep learning is nothing more than non-linear matrix operations, I lost a great deal of respect for the field. I still believe the potential of deep learning is huge, since many companies, especially larger ones, have lots of data that just sits there waiting for some innovation. DL can deliver more productivity, i.e. higher quality, faster processes, etc. That is great. But it has very little to do with the fancy sci-fi version of AI.
Yes. The advances in deep learning are very impressive and are already proving useful. Still, there seem to be no real breakthroughs in actual intelligence, reasoning, or anything that resembles consciousness.
Deep learning has pushed the state of the art forward dramatically in the last few years. On some benchmarks, like speech recognition, deep learning methods improved the state of the art by what would have taken decades at the previous rate of progress. This year's ImageNet winner achieved 5% error; last year it was 11%, and the year before, 15%. (Note that percentage error isn't linear: each percentage point is harder than the last.)
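To see what "each percentage point is harder than the last" means in practice, here's a quick back-of-the-envelope check using the approximate error rates quoted above:

    # Relative reduction in remaining errors between the years quoted above.
    before, last_year, this_year = 0.15, 0.11, 0.05
    print((before - last_year) / before)        # ~0.27: about 27% of remaining errors removed
    print((last_year - this_year) / last_year)  # ~0.55: about 55% of remaining errors removed

Measured as the fraction of remaining errors eliminated, the later jump is roughly twice as large as the earlier one.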
Everything from machine vision to natural language processing to speech recognition is benefiting from this. We live in exciting times for AI, and everyone wants to get in on it.
Simple machine learning algorithms are interesting and even superior on simpler problems, but they don't scale up to complicated AI tasks like this. They are just doing simple correlations between variables, whereas NNs can theoretically learn any arbitrary function (recurrent ones are even Turing complete in theory). Usually this makes them overfit, but on more complicated problems underfitting is the bigger problem.
Yes, but before the ML step, the old approaches relied on expert-crafted features. The breakthroughs in those fields via deep learning came because people found architectures (CNNs/RNNs) that could learn those features much, much more efficiently than they could be hand-crafted.
I have no idea if it is the breakthrough of the decade, but I think deep learning isn't just taking a perceptron with many hidden layers and applying backpropagation to it, as you seem to say. All the interesting things about it you summarized as "fancy stuff" and "not making a big difference", without any context, references or arguments. I do not feel competent to discuss it, as I have very little experience in this field, but your take doesn't feel too well informed even given whatever little knowledge I have. Certainly faster computers and more data have helped, but just like in traditional algorithms research, they cannot completely make up for algorithms whose computational and data requirements grow exponentially. There have been large improvements in both respects in the deep learning community; in fact, the term "deep learning" rarely refers in practice to the traditional, completely supervised learning you are talking about.
I personally think that ML is on the verge of some major breakthroughs.
In particular, I think that results in "deep learning" are very promising. I've written about this approach in earlier HN comments.
"Deep learning" is the new big trend in Machine Learning. It promises general, powerful, and fast machine learning, moving us one step closer to AI. Deep learning has already made important advances in achieving state-of-the-art accuracy in vision and language, but with much less manual engineering that competing methods.
In fact, I think the major success of the deep learning movement has been to get the community to start focusing on figuring out how to get powerful learning algorithms actually to work. A lot of people used to work (and many still do, sadly) on making incremental improvements to learning algorithms that are implausibly simple. "Sure, we know this model can't achieve human level performance on vision or language or control (robotics) or planning, but with this neat refinement I can get a paper out of it." Deep learning begins its endeavor with the goal of AI, and rejects techniques whose upper bound isn't high enough. Simply the fact that the community is setting its sights high (not in terms of over-promising to the outside world, but merely in terms of the learning machinery being explored) and is actually trying to achieve AI is a step forward.
An algorithm is deep if the input is passed through several non-linearities before being output. Most modern learning algorithms (including decision trees, SVMs, and naive Bayes) are "shallow".
For intuition, imagine if I told you that your main routine can call subroutines, and your subroutines can call subsubroutines, but you can't have any more abstraction than that: no subsubsubroutines in your "shallow" program. You could still compute whatever you wanted in a "shallow" program, but your code would involve a lot of duplication and would not be as compact as it should be. Similarly, a shallow machine learning architecture requires a lot of duplicated effort to express things that a deep architecture can express more compactly. The point being, a deep architecture can more gracefully reuse previous computations.
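If it helps, here is a minimal NumPy sketch of that definition; the layer sizes and the tanh non-linearity are arbitrary illustration choices of mine:

    import numpy as np

    rng = np.random.default_rng(0)

    def layer(x, in_dim, out_dim):
        # One stage: affine map followed by a non-linearity.
        w = rng.standard_normal((in_dim, out_dim)) * 0.1
        b = np.zeros(out_dim)
        return np.tanh(x @ w + b)

    x = rng.standard_normal((4, 8))   # a batch of 4 eight-dimensional inputs

    # "Shallow": a single non-linear stage, then a linear readout.
    shallow = layer(x, 8, 32) @ rng.standard_normal((32, 1))

    # "Deep": the same kind of stage composed several times, so each stage
    # can build on (reuse) the features computed by the one before it.
    h = x
    for _ in range(4):
        h = layer(h, 8, 8)
    deep = h @ rng.standard_normal((8, 1))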
Deep learning is motivated by intuition, theoretical arguments from circuit theory, empirical results, and current knowledge of neuroscience. Here is a video where I give a high-level talk on deep learning, and describe this intuition: http://bilconference.com/videos/deep-learning-artificial-int....
Another aside: My colleague Hoifung Poon published exciting work in semantic parsing. It received best paper award at ACL 2009, the most prestigious NLP conference.
(http://www.cs.washington.edu/homes/hoifung/papers/poon09.pdf)
You read it and you're like: "Really? You're doing that? You're actually trying to solve NLP using purely automatic techniques. Whoa. I'd forgotten that was the goal, I was too busy doing feature engineering!"
He achieves impressive results on question-answering, and beats other systems in recall, giving answers to many more questions at the same level of accuracy as the competing methods.
The source code for his semantic parser is available (http://alchemy.cs.washington.edu/papers/poon09/) and you can use it to build a Q+A system. You can try a demo of it here, which I put up: http://bravura.webfactional.com/
He is about to talk about an updated version of this work, in which he induces ontologies purely automatically.
Deep learning has nothing to do with "science", and the last algorithmic advances that enabled it happened rather more than 10 years ago: the neocognitron in 1979, backpropagation (for neural network training) in 1986, Long Short-Term Memory recurrent neural nets in 1995, etc.
In general, all the interesting work that enabled today's deep learning boom happened towards the end of the 20th century and recent advances are primarily owed to increases in computational power and availability of data sets.
Says not me, but Geoff Hinton:
Geoffrey Hinton: I think it’s mainly because of the amount of computation and the amount of data now around but it’s also partly because there have been some technical improvements in the algorithms. Particularly in the algorithms for doing unsupervised learning where you’re not told what the right answer is but the main thing is the computation and the amount of data.
The algorithms we had in the old days would have worked perfectly well if computers had been a million times faster and datasets had been a million times bigger but if we’d said that thirty years ago people would have just laughed.
Correct me if I'm wrong, but I don't see that with 'deep learning' we have answered or solved any of the philosophical problems of AI that existed 25 years ago (I stopped paying attention about then).
Yes we have engineered better NN implementations and have more compute power, and thus can solve a broader set of engineering problems with this tool, but is that it?
I don't disagree with you, but this is the first time in history that ML (deep learning in particular) has become useful, thanks to hardware and software availability. I can now buy commodity hardware and run ML experiments.
Additionally, on the theoretical level there are new things being discovered every day, it seems. I mean, batch normalization wasn't even introduced until 2015.
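For reference, a minimal sketch of the batch-norm idea (training-time forward pass only, plain NumPy, variable names mine), following the 2015 formulation:

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # Normalise each feature over the batch, then rescale and shift
        # with the learned parameters gamma and beta.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)
        return gamma * x_hat + beta

    x = np.random.default_rng(0).standard_normal((32, 4))  # batch of 32, 4 features
    out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
    print(out.mean(axis=0).round(6), out.std(axis=0).round(3))

Conceptually simple, but it noticeably stabilises the training of deep networks, which is why it counts as a genuine (if incremental) discovery.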
There's a long road ahead, but I believe it will be filled with better and better inventions.
In computer vision, at least, deep learning has been a revolution. More than half of what I knew in the field became obsolete almost overnight (it took about a year or two, I would say), and a lot of tasks received an immediate boost in performance.
Yes, neural networks have been around for a while, gradually improving, but they were simply non-existent in many fields where they are now the favored solution.
There WAS a big, fundamental paradigm shift on the algorithmic side. Many people argue that it should not be called "neural networks" but rather "differentiable function networks". DL is not your dad's neural network, even if it looks superficially similar.
The shift is that now, if you can express your problem in terms of minimization of a continuous function, there is a whole new zoo of generic algorithms that are likely to perform well and that can benefit from throwing more CPU resources at the problem.
Sure, it uses transistors in the end, but revolutions do not necessarily mean a shift in hardware technology. And, by the way, if we one day switch from transistors to things like opto-thingies and it brings a measly 10x boost in performance, it won't be on par with the DL revolution we are witnessing.
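To illustrate the "minimization of a continuous function" point, here is a small sketch assuming PyTorch is available; the objective here is an arbitrary toy function, not anything specific:

    import torch

    x = torch.tensor([3.0, -2.0], requires_grad=True)
    opt = torch.optim.SGD([x], lr=0.1)

    def objective(x):
        # Any differentiable expression works; the optimiser doesn't care
        # whether it came from a neural net or from something else entirely.
        return (x[0] - 1.0) ** 2 + torch.sin(x[1]) ** 2

    for _ in range(200):
        opt.zero_grad()
        loss = objective(x)
        loss.backward()   # gradients come "for free" via autodiff
        opt.step()

    print(x.detach())     # close to a minimiser of the toy objective

The point is that the gradients are derived automatically, so the same machinery applies to any problem you can phrase as minimising a differentiable function.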
I've also seen that talk. In many cases I think the advantages to be had from deep learning are likely to be marginal: a few percent improvement over the next best algorithm. Multi-layer neural nets have existed for a couple of decades, and deep learning is really just a refinement of them, making the training process faster and more robust to overfitting.
Not to mention recent advancements in deep learning / machine learning with neural nets. It seems like that field has really proliferated over the past 5ish years. (I'm not sure what breakthrough specifically has led to all this, but we have TensorFlow, Torch, etc.)
Yes, I've seen what neural nets are now capable of: they are capable of exactly what they were always capable of, except "now" (in the last few years) we have more data and more compute to train them to actually do it. Says Geoff Hinton [1].
I have also seen what neural nets are incapable of. Specifically, generalisation and reasoning. Says François Chollet of Keras [2].
AI, i.e. the sub-field of computer science research that is called "AI" and that consists of conferences such as AAAI, IJCAI, NeurIPS, etc., and assorted journals, cannot progress on the back of a couple of neural net architectures incapable of generalisation and reasoning. We had reasoning down pat in the '80s. Eventually, the hype cycle will end, the Next Big Thing™ will come around and the hype cycle will start all over again. It's the nature of revolutions, see?
So hold your horses. Deep learning is much more useful for AI researchers who want to publish a paper in one of the big AI conferences, and to the FANG companies who have huge data and compute, than it is to anyone else. Anyone else who wants to do AI will need to wait their turn and hope something else comes around that has reasonable requirements to use, and scales well. Just as the original article suggests.
Geoffrey Hinton: I think it’s mainly because of the amount of computation and the amount of data now around but it’s also partly because there have been some technical improvements in the algorithms. Particularly in the algorithms for doing unsupervised learning where you’re not told what the right answer is but the main thing is the computation and the amount of data.
Say, for instance, that you could assemble a dataset of hundreds of thousands—even millions—of English language descriptions of the features of a software product, as written by a product manager, as well as the corresponding source code developed by a team of engineers to meet these requirements. Even with this data, you could not train a deep learning model to simply read a product description and generate the appropriate codebase.
That's just one example among many. In general, anything that requires reasoning—like programming, or applying the scientific method—long-term planning, and algorithmic-like data manipulation, is out of reach for deep learning models, no matter how much data you throw at them. Even learning a sorting algorithm with a deep neural network is tremendously difficult.
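For anyone who wants to poke at that sorting claim, here is a toy sketch of the kind of experiment it alludes to; it is my own construction, assumes PyTorch, and every choice (sizes, learning rate, steps) is arbitrary:

    import torch
    import torch.nn as nn

    n = 5
    model = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(2000):
        x = torch.rand(256, n)            # random inputs in [0, 1]
        y = torch.sort(x, dim=1).values   # target: the sorted input
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # The loss goes down, but the model only approximates sorting on the
    # training distribution; values outside [0, 1] (or a different length)
    # will likely not come out sorted, which is the point being made above.
    print(model(torch.tensor([[5.0, 1.0, 9.0, 3.0, 7.0]])))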
No, I don't think so. AFAIK, deep learning is essentially the same 1960s algorithms[1] (possibly modified a bit) running on much larger networks. Most progress is due to better hardware (and ad hoc configurations, made possible by the larger networks afforded by better hardware). Of course, SAT solvers, which have become extremely effective in recent years, are also still based on a 1960s algorithm[2], so use of an old algorithm doesn't imply lack of progress in effectiveness.
The two (NNs and SAT solvers) have seen little theoretical progress (and certainly no theoretical breakthrough) in the past several decades, but SAT solvers aren't marketed as "AI" in spite of their seemingly magical abilities. I know that ML researchers usually cringe at the name "AI" and often try to disassociate themselves from the sci-fi term, but still, the marketing is extremely aggressive and misleading.
I realize that in every generation, marketers like associating the name "AI" with some particular class of algorithms, but it's important to understand that currently, assigning that name to this class of statistical clustering algorithms (regardless of their remarkable effectiveness in some tasks) is a stretch, just as it was when the term was assigned to other algorithms.
I can't help feeling that research prior to deep learning was more rigorous and impressive, though. When I read papers from that era, they tend to be filled with statistical modeling and proofs, and they were somewhat intimidating. Now it seems like it's a lot of "oh, we made this model and it works".