
He's really talking about a path toward AGI, not about ML being able to do many tasks like this. There is work toward ML doing causal inference, but causal inference has been a major challenge for deep learning (a specific type of ML) and is likely not possible with it alone (see the reference to hybrid models). Of course, if he were saying that ML/DL hasn't improved significantly in recent decades, then yes, he would be being dishonest. There has even been work on explicit density models, symbolic manipulation, and causal inference, all things he (presumably) cares about. But these areas don't get nearly the hype or the research power, and so progress there is a lot slower. In the end there are really two camps: those trying to do things, and those trying to build models that understand. But note that he's using DL as a specific term, not in place of ML or AI.



> The issue is that ML has no conceptual and causal understanding of anything

Practitioners work with much more tightly defined objectives and methods for improving their systems than "conceptual and causal understanding of anything". From the perspective of folks in the industry, we've been making remarkable progress, well beyond what we could have expected. We're saturating major benchmarks, sometimes within one to two years. Back in the day, during the "AI winter", it was often 10 years before a new breakthrough happened.


> You appear to be denying the progress made over the past 30 years by deep learning, ML frameworks, constraint solvers, and immense computing power.

Most of the progress in the last 30 years came from immense computing power; almost all of the foundations of today's ML are revised old concepts. What you propose is AGI, and how do you want to achieve that? We don't even know where to start in theory. This is not just my opinion but that of current top names in the ML world [1], and it has been discussed on HN many times.

1. https://venturebeat.com/2018/12/17/geoffrey-hinton-and-demis...


> But if you remove gradients from your data, or shrink your data down to tens of samples, or shift the problem to logic, or need to use functions that aren't convex or differentiable, deep nets run smack into a wall,

We're making some progress on the convexity bit (heat functions, RL, etc.), but yes, there are other areas of statistical research working on exactly those sorts of problems.

ML is not necessarily a panacea, but just because you can point to problems that it doesn't solve doesn't mean it has no "real accomplishments."

> DL, like human cognition, are NOT driven by gradients.

???


>There has been next to ZERO progress towards genuine AGI

I mean, really? In the most pessimistic evaluation of ML research, we still know which techniques and paradigms won't work for AGI. That's not zero progress. Nobody is expecting this to happen overnight.


> RL is supposed to be the way to AGI

Could you expand on that? The more I read from folks like LeCun and Chollet, the more they seem to disagree strongly. Just this week Yann posted about unsupervised modeling (with or without DL) being the next path forward, and described RL as essentially a roundabout way of doing supervised learning.


>> Sadly though, whoever is working on serious hybrid systems will probably not be very popular in either of the rather extremist communities for pure logic or pure ML.

That is not true. I work in logic-based AI (a form of machine learning where everything, examples, learned models, and inductive bias, is represented as logic programs). I am not against hybrid systems, and the conference of my field, the International Joint Conference on Learning and Reasoning, has included NeSy, the International Conference on Neural-Symbolic Learning and Reasoning (and will again from next year, I believe). Statistical machine learning approaches and hybrid approaches are widespread in the literature of classical, symbolic AI, such as the literature on Automated Planning and Reasoning, and you need only take a look at the big symbolic conferences like AAAI, IJCAI, ICAPS (planning) and so on to see that a substantial fraction of papers are on either purely statistical or neuro-symbolic approaches.

But try going the other way and searching for symbolic approaches in the big statistical machine learning conferences: NeurIPS, ICML, ICLR. You may find the occasional paper from the Statistical Relational Learning community, but that's basically it. So the fanaticism only goes one way: the symbolicists have learned the lessons of the past and have embraced what works, for the sake of making things, well, work. It's the statistical AI folks who are clinging to doctrine, and my guess is they will continue to do so while their compute budgets hold. After that, we'll see.

What's more, the majority of symbolicists have a background in statistical techniques. I, for example, did my MSc in data science, and let me tell you, there was hardly any symbolic AI in my course. But ask a Neural Net researcher to explain the difference between, oh, I don't know, DFS with backtracking and BFS with loop detection, without searching or asking an LLM. Or, I don't know, let them ask an LLM and watch what happens.
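For anyone who wants that contrast made concrete, here is a minimal sketch of the two classical searches named above. The graph, goal, and function names are all invented purely for illustration; nothing in it comes from the original comment.

```python
# Minimal sketch: DFS with backtracking vs. BFS with loop detection,
# on a small made-up undirected graph.
from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def dfs_backtracking(node, goal, path=None):
    """Depth-first search with backtracking: extend the current path and
    retract (backtrack) when a branch dead-ends without reaching the goal."""
    path = path or [node]
    if node == goal:
        return path
    for nxt in graph[node]:
        if nxt not in path:                      # don't revisit nodes on the current path
            result = dfs_backtracking(nxt, goal, path + [nxt])
            if result is not None:
                return result
    return None                                  # dead end: backtrack to the caller

def bfs_loop_detection(start, goal):
    """Breadth-first search with loop detection: a visited set prevents cycles,
    and the FIFO queue guarantees a shortest path in edge count."""
    visited = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

print(dfs_backtracking("A", "E"))    # ['A', 'B', 'D', 'E']
print(bfs_loop_detection("A", "E"))  # ['A', 'B', 'D', 'E'] (a shortest path)
```

The difference is the order of exploration and the guarantee on path length, which is exactly the kind of textbook material being pointed at.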

Now, that ignorance is a problem. The statistical machine learning field has taken it upon itself in recent years to solve reasoning, I guess, with Neural Nets. That's a fine ambition to have, except that reasoning is already solved. At best, Neural Nets can do approximate reasoning, with caveats. In a fantasy world, which doesn't exist, one could rediscover sound and complete search algorithms and efficient heuristics with a big enough neural net trained on a large enough dataset of search problems. But why? Neural Net researchers could save themselves another 30 years of reinventing a wheel, or inventing a square wheel that only rolls on Tuesdays, if they picked up a textbook on basic Computer Science or AI (say, Russell and Norvig, which it seems some substantial minority regard as a failure because it didn't anticipate neural net breakthroughs 10 years later).

AI has a long history. Symbolicists know it, because they, or their PhD advisors, were there when it was being written and they have the facial injuries to prove it from falling down all the possible holes. But, what happens when one does not know the history of their own field of research?

In any case, don't blame symbolicists. We know what the statisticians do. It's them who don't know what we've done.


>First off right now there seems to be a drive to explain many phenomena in ML in particular why neural networks are good at what they do. A large body of them reaches a point of basically "they are good at modeling functions that they are good at modeling".

Since this is closely related to my current research, yes, ML research is kind of crappy at this right now, and can scarcely even be considered to be trying to actually explain why certain methods work. Every ML paper or thesis I read nowadays just seems to discard any notion of doing good theory in favor of beefing up their empirical evaluation section and throwing deep convnets at everything.

I'd drone on more, but that would be telling you what's in my research, and it's not done yet!


> where you need solid math basis and a deep understanding of the science behind it all.

I __REALLY__ really wish this were true. But I'll be honest: I know quite a number of researchers at high-level institutions (FAANG and top-10 unis) who don't understand things like probability distributions or the difference between likelihood and probability. There's a lot of "interpretability" left on the table simply through not understanding some basic mathematics, let alone advanced topics (high-dimensional statistics, differential geometry, set theory, etc.). The AI engineering side often "needs" less of an understanding.
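Since the likelihood/probability confusion comes up so often, here is a toy coin-flip sketch of the distinction. The numbers and helper name are made up for illustration and are not from the comment above.

```python
# Toy illustration: probability vs. likelihood under a binomial model.
# Probability: fix the parameter, vary the data.   P(k heads | n=10, p=0.5)
# Likelihood:  fix the data, vary the parameter.   L(p | k=7 heads out of n=10)
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability is a distribution over data: these values sum to 1 over k = 0..10.
probs = [binom_pmf(k, 10, 0.5) for k in range(11)]
print(sum(probs))  # 1.0 (up to floating point)

# Likelihood is a function of the parameter for fixed data: it does NOT sum
# (or integrate) to 1 over p, so it is not a probability distribution.
likelihoods = {p / 10: binom_pmf(7, 10, p / 10) for p in range(11)}
best_p = max(likelihoods, key=likelihoods.get)
print(best_p)  # 0.7, the maximum-likelihood estimate on this grid
```

The same formula is a probability when the parameter is fixed and the data varies, and a likelihood when the data is fixed and the parameter varies; only the former is a normalized distribution.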

But I don't think this state of affairs is a good thing. I specifically have been vocal about how this is going to cause real-world harm. Forget AGI; just look at how people are using models today without any understanding: how people think you can synthesize new data without considering the diversity of that data[0], or can create "self-healing code" that will generate high-quality code[1,2], how people think LLMs understand causality[3,4], or just how fucking hard evaluation really is[5] (I really cannot stress this last one enough).

There is a serious crisis in ML right now, and it is also the thing that made it explode in funding: hype. I don't think this is a bubble in the sense that AI will go away, but if we aren't careful with how we deal with this, it isn't unlikely that we'll see heavy governmental restrictions placed on these things. Plus, a lot of us are pretty confident that just learning from data is not enough to get to AGI. It just isn't a high enough level of abstraction, besides being a pain (see the semantic deduplication comments about generation).

But academia is railroaded into SOTA chasing because that's what conferences like. NLP as an entire field right now is almost entirely composed of people tuning big models instead of developing novel architectures (if you don't win, you struggle to get published, whatever the other differing factors). We let big labs spend massive amounts of compute to compare against little labs who can get similar performance with a hundredth of it, but we don't publish those works. It is the curse of benchmarkism and it is maddening. Honestly, a lot of times I feel like a crazy person for bringing this up. Because when I say "ML needs a solid math basis and a deep understanding of the science behind it," everyone agrees, but when the rubber hits the road and I suggest mathematical solutions, I'm laughed at or told it is unnecessary.

[0] https://news.ycombinator.com/item?id=36509816

[1] https://news.ycombinator.com/item?id=36297867

[2] https://news.ycombinator.com/item?id=35806152

[3] https://news.ycombinator.com/item?id=36036859

[4] https://www.cs.helsinki.fi/u/ahyvarin/papers/NN99.pdf

[5] https://news.ycombinator.com/item?id=36116939


> the data sets have gotten large enough where you can start to consider variable interactions in a way that’s becoming increasingly predictive. And there are a number of problems where the actual individual variables themselves don’t have a lot of meaning, or they are kind of ambiguous, or they are only very weak signals. There’s information in the correlation structure of the variables that can be revealed, but only through really huge amounts of data

This isn't really true, since this can be said of any ML model. ML is nothing new. Deep learning is new. It works because we have so much data that we can start to extract complex, nonlinear patterns.


> I've always been stronger at discrete type math/programming, which is why I tend to shy away from statistics-based stuff like ML.

I think there's a major misconception that ML in the form of deep learning is about statistics. There are no statistics in deep learning models. There are some statistical measurements made of final models, much in the same way a good computer science paper covering implementations of discrete data structures might make statistical statements about the performance of the author's implementation, but transformers, traditional neural nets, and backprop themselves have nothing to do with statistics.
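Whether or not one buys that framing, it may help to see what backprop itself mechanically is: a chain-rule gradient computation followed by an update. Here is a toy numpy sketch on made-up data; the shapes, learning rate, and step count are all invented for illustration.

```python
# Minimal sketch of backprop: forward pass, chain-rule backward pass, update.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))          # toy inputs
y = rng.normal(size=(32, 1))          # toy targets

W1 = rng.normal(size=(3, 8)) * 0.1
W2 = rng.normal(size=(8, 1)) * 0.1

for step in range(200):
    # forward pass through a tiny one-hidden-layer net
    h = np.tanh(X @ W1)
    pred = h @ W2
    loss = np.mean((pred - y) ** 2)

    # backward pass: apply the chain rule layer by layer
    d_pred = 2 * (pred - y) / len(X)
    d_W2 = h.T @ d_pred
    d_h = d_pred @ W2.T
    d_W1 = X.T @ (d_h * (1 - h ** 2))  # derivative of tanh

    # gradient descent update
    W1 -= 0.1 * d_W1
    W2 -= 0.1 * d_W2
```

Frameworks automate exactly this bookkeeping via automatic differentiation; whether the surrounding setup counts as "statistics" is the disagreement above.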


>It barely learns anything given the amount of computational effort and data that goes into it, but it just happens to be good enough to be practically preferable to old symbolic systems.

This is the part that's always bothered me. For many, ML has just become the default tool to throw at a problem where other (albeit less shiny) solutions exist that might give the same or even better results, and are very likely to end up being more efficient. Would it be too much of a leap to suggest that we're forgetting to think?


>The argument that this is 90% of what matters in ML seems a bit bold. AFAICT it is completely missing reinforcement learning,

Wouldn't that account for the other 10%? To me it sounds like the quote is not too far off when you consider that he said "of what matters today". There are many other things missing from that list that might become more relevant tomorrow.


> So I am doubling down on ML/DL.

The amount of free resources now available for learning machine learning/deep learning is substantial and easy to comprehend (indeed, Andrew Ng's Coursera class is very good). And running ML code is even easier, with libraries like Tensorflow/Theano to abstract the ML gruntwork (and Keras to abstract the abstraction!).
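For a sense of how abstracted the gruntwork has become, here is roughly the tutorial-level digit classifier the testimonials below describe, a minimal sketch assuming TensorFlow 2.x with Keras bundled; the layer sizes and epoch count are arbitrary choices of mine.

```python
# Tutorial-level MNIST classifier in Keras (assumes TensorFlow 2.x installed).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 image -> 784 vector
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"), # one output per digit class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```

A handful of lines like these typically reach high accuracy on MNIST, which is exactly why they make for impressive-looking but shallow testimonials.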

I suspect that there may be a machine learning knowledge crash, where the basics are repeated endlessly but there is less unique, real-world application of the knowledge learned. I've seen many Internet testimonials saying "I followed an online tutorial and now I can classify handwritten digits, AI is the future!" The meme that Kaggle competitions are a metric of practical ML skill encourages budding ML enthusiasts to minimize log-loss or maximize accuracy without considering time/cost tradeoffs, which doesn't reflect real-world constraints.

Unfortunately, many successful real world applications of ML/DL are the ones not being instructed in tutorials as they are trade secrets (this is the case with "big data" literature, to my frustration). OpenAI is a good step toward transparency in the field, but that won't stop the ML-trivializing "this program can play Pong, AI is the future!" thought pieces (https://news.ycombinator.com/item?id=13256962).


> I am an AI skeptic. I am baffled by anyone who isn’t. I don’t see any path from continuous improvements to the (admittedly impressive) ‘machine learning’ field that leads to a general AI

- I share the skepticism towards any progress towards 'general AI'. I don't think that we're remotely close, or even on the right path in any way.

- That doesn't make me a skeptic towards the current state of machine learning though. ML doesn't need to lead to general AI. It's already useful in its current forms. That's good enough. It doesn't need to solve all of humanity's problems to be a great tool.

I think it's important to make this distinction and for some reason it's left implicit or it's purposefully omitted from the article.


> In contrast, 'ML' assumes from the start that computing resources will be available.

Right, but I'm not sure that "better, cleaner names for things" actually follows. Instead, I find that the ML folks just hacked their way to results similar to traditional statistics, but in many cases were comfortable with the algorithms as "black boxes" rather than having a clear understanding of why the algorithms worked. In that sense, the author's "unprincipled" criticism is valid. This is less true today, but the newer research in convolutional neural nets shows how ML starts by hacking on things until they produce practical results and then backs into the theory of why they work. This habit has resulted in much duplication of effort and naming schemes. My ML prof at Georgia Tech (Isbell was awesome! http://www.cc.gatech.edu/~isbell/) constantly trashed "genetic algorithms" for being a silly form of randomized hill-climbing.
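For readers unfamiliar with the comparison, randomized hill-climbing amounts to something like the following toy sketch; the objective function, step size, and restart count are made up purely for illustration.

```python
# Toy randomized hill-climbing with random restarts, maximizing a made-up
# 1-D objective (a single bump with its peak at x = 3).
import random

def objective(x):
    return -(x - 3.0) ** 2 + 2.0

def hill_climb(steps=1000, step_size=0.1):
    x = random.uniform(-10, 10)                            # random restart point
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)  # random neighbor
        if objective(candidate) > objective(x):                # keep it only if better
            x = candidate
    return x

best = max((hill_climb() for _ in range(5)), key=objective)
print(round(best, 2))  # close to 3.0
```

Whether genetic algorithms buy you much beyond restarts like these was the substance of the jab.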

The beneficial side of these less-principled techniques is that they happen to work on larger scale datasets. It turns out approximate results are more scalable than exact results.


>ML in general is just applied statistics. That's not going to get you to AGI.

I don't see how we can rule it out. The statistical models we use are still dwarfed in size by the brains of intelligent animals, and we don't have any solid theory of intelligence to show how statistics comes up short as an explanation.


> What's the point of dumping a bunch of Google results here? At least half the results are about implementations of pretty traditional statistical / econometric inference techniques.

Here are some tools for causal inference (and a process for finding projects to contribute to instead of arguing about insufficiency of AI/ML for our very special problem domain here). At least one AGI implementation doesn't need to do causal inference in order to predict the outcomes of actions in a noisy field.

Weather forecasting models don't / don't need to do causal inference.

> A/B testing

Is a multi-armed bandit feasible for the domain? Or, in practice, are there too many concurrent changes in variables to have any sort of controlled experiment? If so, aren't you then trying to do causal inference with mostly observational data?
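To be concrete about what a multi-armed bandit would mean here, a toy epsilon-greedy sketch follows; the arm payout rates, epsilon, and horizon are all invented for illustration.

```python
# Toy epsilon-greedy multi-armed bandit (arm payout probabilities are made up).
import random

true_rates = [0.05, 0.07, 0.11]       # unknown to the algorithm
counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]              # running average reward per arm
epsilon = 0.1

for _ in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(len(true_rates))                 # explore
    else:
        arm = max(range(len(values)), key=values.__getitem__)   # exploit
    reward = 1.0 if random.random() < true_rates[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]         # incremental mean

print(values)  # roughly recovers the true rates, with arm 2 chosen most often
```

The question above still applies: if the underlying variables shift while this runs, the running averages are estimating a moving target, and you are back to doing causal inference on mostly observational data.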

> I really don't see how a RL would help with any of this. Care to come up with something concrete?

The practice of developing models and continuing with them when they seem to fit, and when citations or impact reinforce them, is very much an exercise in RL. This is a control system with a feedback loop, a "cybernetic system". It's not unique, and it's not too hard for symbolic or neural AI/ML. Stronger AI can or could do [causal] inference.


>> Good CS expert says: Most firms that think they want advanced AI/ML really just need linear regression on cleaned-up data

Not nearly true. The simple counter-argument is that prior to DL, we didn't have a good approach to really 'clean' data like images.

The author states this as if cleaning data were a piece of cake. It surely is not. In fact, part of DL's magic trick is its ability to automatically learn to extract useful, generalizable features from data. From another perspective, the whole DL frontend, everything prior to the very last layer, can be viewed as a data cleaning pipeline that is learned during training and optimized to pick out the useful signals.
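One way to picture that "frontend as a learned cleaning pipeline" view: freeze a pretrained backbone and put a single linear layer on top. The sketch below assumes TensorFlow 2.x and uses MobileNetV2 only as a convenient example backbone; that choice, and the 10-class head, are mine, not the commenter's.

```python
# Sketch: everything before the last layer acts as a learned "data cleaning"
# / feature-extraction pipeline; the final Dense layer is just a linear map.
import tensorflow as tf

backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,      # drop the original classifier head
    pooling="avg",          # global average pooling -> one feature vector per image
    weights="imagenet",
)
backbone.trainable = False  # freeze it: a fixed, pre-learned "cleaning" frontend

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(10, activation="softmax"),  # the "very last layer"
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(images, labels, ...)  # only the last layer's weights get trained here
```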

The author clearly isn't an expert on the matters he is making claims about, yet his statement comes with great confidence, or ignorance. This shows why this revolution will be a truly impactful one: even some of the supposed intellectuals cannot understand its importance and how it diverges from its predecessors. They will be caught off guard and left behind. It will be very enjoyable to watch what their reaction will be once that happens.


> (edit: I'm not claiming he's a field expert in a week guys, just that he can probably learn the basics pretty fast, especially given ML tech shares many base maths with graphics)

This may be his biggest impediment. ML has gotten very far with looking at problems as linear algebraic systems, where optimizing a loss function mathematically yields a good solution to a precisely defined (and well circumscribed) classification or regression problem. These techniques are very seductive and very powerful, but the problems they solve have almost nothing in common with AGI.
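To make the "precisely defined problem solved by optimizing a loss over a linear-algebraic system" framing concrete, here is a toy least-squares regression fit by gradient descent; the data is synthetic and the learning rate and step count are arbitrary choices.

```python
# Toy illustration: a well-circumscribed regression problem solved by
# minimizing a convex loss over a linear model (synthetic data).
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
y = X @ true_w + 0.1 * rng.normal(size=200)   # noisy linear targets

w = np.zeros(2)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)     # gradient of mean squared error
    w -= lr * grad

print(w)  # close to [1.5, -2.0]
```

The answer drops out of the loss landscape; nothing resembling general understanding is involved, which is both the seduction and the limitation being described.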

Put another way, Machine Learning as a field diverged from human learning (and cognitive science) decades ago, and the two are virtually unrecognizable to each other now. Human learning is the best example of AGI we have, and using ML tech as a way to get there may be a seductive dead end.

