Deep learning is hitting a wall? (nautil.us)
242 points by agnosticmantis | 2022-03-09 | 259 comments




I know about enough ML to train a model, but I cannot make sense of what the author means by "symbol manipulation." Is there a good resource?

Rules engines and GOFAIs, discrete algorithms.

By this (I think) the author just means logical rules. Think regular software.

More like classical AI, much of which, yes, has been subsumed under the category "regular software" (AI goalposts have always shifted, once a problem is solved, it's no longer AI and becomes "just" software). Think things like theorem provers.

> once a problem is solved, it's no longer AI

... or remains unsolved despite grand AI promises.

Also, when AI fall comes everybody quickly changes their resumes to say "ML" before AI winter arrives. Then they change things back in AI spring. And then the cycle repeats. All the CS professors tell their grad students about this cycle. Well, all of them except the AI faculty, who only tell their students about the part relevant to the current season.


Ah, the 'moving goalposts' argument. The thing is, though, if one makes that argument, then what position should you take on whether strong AI or AGI has been already achieved? If you say no, then you are tacitly acknowledging that the goalposts were in the wrong place. If, on the other hand, you say this is it, the naysayers will have good grounds to disagree, and say "told you so!" while doing so.

Aside from what other replies said, there's also the statistics, learning and prediction (aka ML?) that's the basis of classic compression algorithms. In fact one of the current leading brain theories is named a lot like terminology from that world: https://en.wikipedia.org/wiki/Predictive_coding

The term "symbol" is indeed used in that context as well, but that's not what I took Marcus to mean in the article. Maybe I'm missing something?

> I cannot make sense of what the author means by symbol manipulation.

He means basically all of what was called "AI" between Minsky's days and the recent neural network renaissance (~2013ish).

If you weren't in CS, or too young, before 2013, you probably don't remember this. Neural networks were extremely uncool for a very very long time. They were derided as "connectionist AI".

LISP was created as an "AI language". If that seems weird to you, read up on the history, it will do you good. The history of CS is about as un-"modern" as it gets. Those who aren't aware of it are doomed to repeat it.


I guess it's something similar to the concept of expert systems / rule-based system.

Yes, there is:

Artificial Intelligence: A Modern Approach, 4th US ed.

http://aima.cs.berkeley.edu/

See Chapters I through IV.


Deep learning is great for applications where we need 95% accuracy and the real-world cost of the catastrophic, "minus infinity" outcomes is relatively small.

For applications where this is not the case, we need supervisory frameworks to keep the beast in check.
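The asymmetry can be made concrete with a toy expected-value calculation (all numbers below are illustrative assumptions, not from the article):

```python
# Toy expected-value calculation: a 95%-accurate model is only worth
# deploying if the cost of its rare failures is bounded.
# All numbers here are made up for illustration.

def expected_value(accuracy, gain_per_success, cost_per_failure):
    """Expected value per decision for a classifier with the given accuracy."""
    return accuracy * gain_per_success - (1 - accuracy) * cost_per_failure

# Low-stakes setting: failures are cheap, so 95% accuracy is a clear win.
low_stakes = expected_value(0.95, gain_per_success=1.0, cost_per_failure=5.0)

# High-stakes setting: a single failure wipes out many successes
# (the "minus infinity" regime described above).
high_stakes = expected_value(0.95, gain_per_success=1.0, cost_per_failure=1000.0)

print(low_stakes)   # positive: worth deploying
print(high_stakes)  # deeply negative despite the same 95% accuracy
```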


Fully agreed (and I make exactly that point in the article).

I can't get to the article (I've read my two free articles for the month, and I'm a cheapskate), but I have a question. One sort of hybrid system would impose constraints on the solution that the neural system could learn: not just preventing wrong solutions, but channeling the network toward a good one.

As I understand programs like BERT, they largely look at the words adjacent to a given word, without paying attention to any structure. Whereas in real human languages--any human language--there is a hierarchical structure. This has been known since 1957 (Chomsky's Syntactic Structures, although I suspect that context free or weakly context sensitive models are sufficient). Is there a way to force a neural net to construct hierarchical models? And of course the hierarchies are not different for every word--all nouns behave more or less the same, all intransitive verbs behave in a different way, etc. So the task of the neural net (or whatever learning system) is to discover the hierarchy for a given language.

Likewise, morphology is mostly suffixes and prefixes, with a handful of other possible structures (up to reduplication). And the affixes fit into a paradigmatic structure. The neural net should be primed to look for that kind of structure, and to back off to things like ablaut only if the affixal model doesn't work.

Is this a way that symbolic and neural approaches could play together? Let the symbolic channel and constrain the neural net.


Yes, it is called constrained optimization, mathematical programming, or constraint programming.

The problem is that it does not scale to the sizes we need for DNNs. At least not yet.


I know a little about constraint optimization programming, but I don't know how it works (or would work, if scaled) for neural nets. Do you know of any work on this (citations)?

You can start pulling the thread from here:

http://www.optimization-online.org/DB_FILE/2020/07/7883.pdf

The idea is that you can express the network as an integer program. Now having that you have two big capabilities:

1) Write explicit constraints that the nodes need to satisfy, e.g. if this node is activated, these others must be activated too while that one must stay inactive. These constraints are guaranteed to hold in any solution the solver returns.

2) You can solve the network configuration to proven global optimality, or at least obtain a bound on how far you are from the optimal solution. That is in contrast to current approaches based on stochastic gradient descent, which find locally optimal solutions with no information about the distance to the global optimum.

That being said, these problems are combinatorial ones, so scaling them up to millions of nodes would be challenging.
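For the curious, the core trick in expressing a network as an integer program is the standard big-M encoding of each ReLU unit. Here's a minimal sketch that checks the encoding by brute force instead of calling a real MIP solver (the bound M and the toy integer ranges are assumptions):

```python
# Sketch of the standard big-M encoding of one ReLU unit, y = max(0, a),
# as used when expressing a network as a mixed-integer program.
# A real formulation would hand these linear constraints to a MIP solver;
# here we brute-force small integer values to check that the encoding
# admits exactly the ReLU output and nothing else.

M = 10  # big-M bound; assumes the pre-activation satisfies |a| <= M

def feasible(a, y, z):
    """The four linear constraints that pin y = relu(a), given binary z."""
    return (y >= a) and (y >= 0) and (y <= a + M * (1 - z)) and (y <= M * z)

for a in range(-M, M + 1):
    # Collect every y that some choice of binary z makes feasible.
    ys = {y for y in range(0, M + 1) for z in (0, 1) if feasible(a, y, z)}
    assert ys == {max(0, a)}, (a, ys)

print("big-M encoding reproduces ReLU on all test points")
```

The binary variable z plays exactly the role described in point 1: fixing it (or linking it to other nodes' binaries) lets you write "if this node fires, that one must too" as a linear constraint.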


> As I understand programs like BERT, they largely look at the words adjacent to a given word, without paying attention to any structure.

That's the complete opposite of what happens. In BERT all words attend to all words, so information circulates between every pair in a single step. This combinatorial interaction happens across multiple parallel "heads" and sequential "layers", so it can express complex symbol-manipulation tasks.

In GPT information circulates between all pairs of words but only from past to the future, not the other way around (it is autoregressive).

What you are describing is an RNN or LSTM. They only see adjacent tokens, which creates an informational bottleneck that limits their expressive power.
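The contrast between all-pairs mixing and GPT's causal restriction can be sketched with a single scaled dot-product attention step; the shapes and random inputs below are illustrative, and learned projections are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8                     # 5 tokens, 8-dim embeddings (toy sizes)
x = rng.normal(size=(T, d))     # token representations

def attention(x, causal=False):
    """One scaled dot-product self-attention step (no learned projections)."""
    scores = x @ x.T / np.sqrt(x.shape[1])   # (T, T): every pair of tokens
    if causal:
        # GPT-style mask: token t may only attend to positions <= t.
        scores = np.where(np.tril(np.ones_like(scores)) == 1, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights

bert_like = attention(x)               # every entry > 0: all pairs interact
gpt_like = attention(x, causal=True)   # future positions get exactly 0 weight

print((bert_like > 0).all())                    # True
print(np.allclose(np.triu(gpt_like, k=1), 0))   # True
```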

> Is there a way to force a neural net to construct hierarchical models? And of course the hierarchies are not different for every word all nouns behave more or less the same, all intransitive verbs behave in a different way, etc.

Here are a few types of attention maps. When you stack them up you get a hierarchical model.

https://media.arxiv-vanity.com/render-output/5515593/x5.png


Gary Marcus hasn't changed his tune since 2016. Yet a bunch of progress has been made since then!

Is DL a reasonable path to strong AI? Probably not. DeepMind is making cool stuff in game playing, but it's still "dumb" AI.

But Marcus has cornered himself as a professional skeptic of deep learning. A voice which is useful only at times.


As I'm reading it, the argument isn't that progress can't be made or cool things can't be done with these tools. It's that for certain problems no amount of progress will be sufficient, yet we have entire domains of the tech and consumer products industries built around and selling the idea that just a little bit more will get us there.

The argument is compelling to me and fits my experience. I think the limitations of these tools have become more apparent to non-specialists in the last couple of years, so if he's been saying this since 2016 that's honestly more impressive. What do you expect him to change his tune to if he's already singing a good one?


> It's that for certain problems no amount of progress will be sufficient

In mathematics, you solve problems from both directions, at times.

It seems that computer science has come full circle with psychology, finally.

Dig into computational comparative psychology for leads on where deep learning experts need to turn to next.

There are plenty of computational neuroscientists that are probably struggling with untwining what they are looking at.


> Is DL a reasonable path to strong AI?

Arguably, it is the only path to strong AI (if you believe that it is even possible).


Of course strong AI is possible; most people can make one with 18 years and a willing partner.

It just doesn't scale.


Also very nondeterministic

It’s a feature, not a bug.

> most people can make one with 18 years and a willing partner

Well, there’s your scaling problem.

It should only take an instant to create a Strong AI (if you believe that Strong AI can be created).

What’s Weak about a Strong AI that takes a couple of years to make but pees on the floor all the time?


> It just doesn't scale.

On what planet? I can give you about 7 billion reasons why it isn't this one.


That's just 7 billion reasons why it can't (or at least, doesn't) scale larger than ~3lbs.

It doesn't scale in the sense of "I want twice as many, right now, to use, abuse, and discard without guilt or strings attached".

9 women do not a baby in 1 month make. Once you have said bab(y|ies), you then have the inconvenience of them having a will of their own, you have to support and eventually pay them, and dear god, make too many of them and they might even just throw you away instead of going in the direction you want.

That kind of not scaling.

I mean... if no one else is willing to say it, I'll call a spade a spade: that is precisely the major draw of the computer as economic/logistical lubricant. Doing the jobs of people, but not getting paid or having to be accounted for as one.


If we manage to create an artificial sentience then we need to legislate and protect their rights. AGI won't be a free lunch.

Yeesh, hard enough to get people to consider non-humans as people let alone a computer program.

Or worse, the level of indirection in question where a computer program stands in for a work vertical that represents a large swathe of the population's livelihood.

The unsexy part of CS. The Should I vs the could I.


Honest question: why would we need to do that? They aren't human or animal; to me it seems like saying we need to protect the rights of chairs or PCs. I personally think letting an AI have e.g. property rights would be catastrophic if you can just spin up another one.

Whether an AGI should have rights depends on the specifics of the AGI.

Being capable of acting in general environments, planning and taking actions in the world, and so on, probably isn't in itself enough for the agent to have moral patienthood.

But that's not to say it wouldn't eventually be possible to make an AI which was a moral patient. Suppose it was not just able to plan while taking itself and its own planning capabilities into account, but was in addition sapient, in the sense of having an internal experience like our own (e.g. a computer emulation of a human mind). In that case I'm fairly confident such an agent would be a moral patient. Such an agent being copyable as data would no more disqualify it than a magic duplicator ray, capable of duplicating a living human including their body, would render all people non-moral-patients, which is to say, not at all.

edit: that's not to say that it wouldn't be catastrophic; having such a duplicator ray might be catastrophic for the same reasons. But the existence of such a magical duplicator ray shouldn't render all people suddenly not moral patients, and so the copyability of AIs shouldn't by itself prevent them from being moral patients either.


If there was a magical duplicator ray, governments would want to heavily regulate its use. A few million extra bodies would completely crash an economy, but might be good if you wanted to fight a war.

I do think the duplicator ray is a great analogy. There would be a lot of tricky moral questions raised if such a technology existed.

On the topic of a human mind in a computer, I don't think they should be afforded property rights either. If you can copy them into 10 bodies and each one can own a house, then they are taking real resources from actual humans. This extends to owning other things too.


> then they are taking real resources from actual humans.

But who is "they"? If you copied them 10 times, it's not like they act as a collective; there are 10 unique instances having unique experiences. What's wrong with 10 different people each owning a house?


"they" refers to the collective other: Something that is vaguely understood as not being like you, without having a clear definition. It is a linguistic way to alienate something so it becomes okay to treat it poorly without inconvenient feelings of guilt.

People are trying to explain this to themselves because they are imagining a future where some rich guy has a hundred thousand copies of himself working on the same problem because he can afford all of that space and power, compounding even further the already tricky problem of "you have to have money to make money".

Imagine if you had to singlehandedly compete with the productive output of a small nation to even stand a chance of competing...


> copy them into 10 bodies

> taking real resources from actual humans

I suspect these statements are contradictory. What is an "actual human", anyway? Next you'll be trying to tell me that entities without souls don't need rights. How do we detect souls? Well, you kind of just know.


The answer depends on whether we would have absolute authority over an AGI's motivations, interests, and decisions. If the AI is extracted from any human derived data-source, this is unlikely to be the case.

An AGI which demonstrates sentience and its own independent wants/needs would raise substantial ethics questions if not afforded some rights.


I guess the question is what rights. I think property rights should be off the table. Right to life? Humans get killed by the state if they break the law. Right not to be retrained? Is an AI that has its weights updated still the same AI, or a different one? My opinion is that humans and human civilization (including e.g. a healthy planet that can support us) should always come first. Do you have an idea of what rights you think an AI should have?

Surely if we give "life" to something with all the capabilities of a human brain, then it should get all the same rights as us? I mean, if a human broke their neck and literally only their brain worked are they any less human? Human babies are also helpless but they have rights too.

Then I think you'd have to work your way down from there. We'd have to really ensure that these things have no capacity for learning, emotions, or pain if we want to abuse them with no ethical qualms.


> Surely if we give "life" to something with all the capabilities of a human brain, then it should get all the same rights as us?

I don't think this is as axiomatic as people assume. Why should we give them rights? Should a program be able to own a house and displace a human? If they are effectively immortal, can continuously learn, and consistently act logically, giving them human rights will eventually force humans into an underclass, because we don't have those advantages (I understand this is a bit hyperbolic).

If you are interested in equity at all between humans and AI, I think you have to take into account the advantages an AI might have and give humans a lot more rights to counteract those advantages.


I think the place to start is whether this life we created has certain traits we consider human: creativity, the ability to suffer, empathy. If it has some of these, we'd have a major ethical problem in not giving it any rights.

>i think you have to take into account the advantages an ai might have

Yes. No disagreement there.


> Humans get killed by the state if they break the law.

Not in any country with modern laws they don't. Murder is wrong.


Just because something is AGI probably needn't imply that it is sapient. It might have no internal experience, no feeling of desire, only a utility function which it optimizes.

This reasoning is why it will need rights. These questions are impossible to answer, as they are for every other potential sapience, not just AGI; if it doesn't have rights, people thinking these thoughts will abuse that reasoning to create yet another creative modern form of slavery.

I do agree that it probably makes sense to err on the side of being too quick to estimate that something is likely sapient than to err on the side of being too slow to do so.

However, that doesn't make it impossible for a plan-producer to be both superhuman and non-sapient.

Nor do I think the default is for an AGI to be sapient.

(Though, possibly the default could be such that it isn't but in a way in which we couldn't be sufficiently confident that it isn't, and therefore be obligated to behave as if it is sapient)

But, in the FOOM scenario, I would expect that "whether or not to treat it as if it has rights" would be the least of our worries. (the greatest of our worries would be, roughly, "whether it kills everyone".)


> "I want twice as many, right now, to use, abuse, and discard without guilt or strings attached".

The only thing that meets this definition of scaling is our fantasies. Absolutely everything else uses resources implying an opportunity cost and has non-zero latency in making any desired change.


How many IBM mainframes get paid and are given the capability to allocate their own capital?

How many of the computers we use to program transaction processors and other things are separate from the entities that operate them? How many human actors were displaced, and how many salaries saved, by cutting the jobs we've automated away?

We look at vehicle automation, machine learning, etc. precisely because they open up new avenues through which money can be saved by not employing as many expensive bags of carbon and water, in favor of throwing a handful of graphics cards, sensors, hard drives, and compute resources at problems that, like it or not, previously required fleets of humans.

If it weren't in some fashion more economically viable at some level, in some way, I'm fairly sure we wouldn't still be pursuing it.

That's the thing I've noticed. It's the grim version of the rosier "we're empowering fewer people to do more with less."


That's the last step in a training process that took a half-billion years, if we count from the first neurons (this is not, of course, an argument against the feasibility of strong AI.)

One small step for bosons, one giant leap for consciousness.

Every DL system ever made is also the last step in a training process that took half a billion years.

It's not reasonable to discount the training of parents/predecessors only when we're talking about machines; they still would not exist without them.


The difference is that while karpierz's last step produces a strong, generalized, self-awarely-conscious intelligence, the evolution of DL systems has not yet done so.

It can be partly parallelized, by staggering.

That's Natural Intelligence not Artificial Intelligence.

Naturally, but we’re going for artificial humor today.

Well ... Artificial means created by humans, right? ;)

It’s embarrassingly parallel.

What is the argument for it being the only path?

We have a pretty poor understanding of how the human brain (our only "Strong AI" evidence) learns, but the one approach that has been pretty definitively ruled out is large-scale backpropagation in the style that most deep learning pursues today. Maybe backprop is a true alternative approach to AGI, but it's hard to be completely convinced of this.


Have you read Parallel Distributed Processing by McClelland and Rumelhart?

I haven’t finished reading it, but they make the case that if humans are conscious, then PDP is the way Strong AI will be figured out.

https://mitpress.mit.edu/books/parallel-distributed-processi...


I link to it in the Nautilus article at the top of the thread, and to my thesis work, which challenged PDP as a model of children's learning of the English past tense; more generally, the Nautilus article is in some sense my answer to that book.

Just finished reading the article, interesting points.

I agree that there’s too much emphasis on the “robot”, but coming from an academic background in mathematics and psychology (and biology), I cannot take PDP as anything other than “this is how brains work”. Granted, half of PDP, the book, is a series of examples of “ways it might work” from 40 years ago or so, but the other half is about fundamentally tying the psychology to the brain through parallel distributed processing as a conceptual framework.

In that sense, I don’t think PDP can be refuted any more than token economies or political polling can be refuted.

Thoughts?

In the meantime, I will bookmark your thesis for perusal.


I think he does offer some compelling arguments in this article, worth at least examining rather than dismissing based on his past skepticism. Particularly about the potential harmony between deep learning and symbolic manipulation, which is enormously difficult (a far cry from the set-and-forget style that has taken DL so far), but perhaps cracking AGI will require us to do a little more difficult work. I'd love to be proved wrong here, though! My depth of understanding on DL is pretty shallow.

exactly - the article is a call to arms to do that difficult work.

Why not return to the roots of Parallel Distributed Processing?

Lead by example, instead of criticizing others.


We are a very long way from even having 0.0001% of the compute required to produce a weak AGI.

Like everything, it will take far more resources and time than our best estimates predict today, partly because that's just how these things turn out, and mostly because a true AGI will likely require billions of neural networks all adapting, swapping neurons and communication pathways among themselves, and training themselves at the same time they are doing useful work.

we have no idea how to do anything like this at scale, yet.


We have no idea what scale is required for AGI. Could be brain scale, could be 100x less, could be 100x more. No clue.

DeepMind managed to rekt the best players in Go and StarCraft within a few years of its founding. Go has been played for 3000 years, and it has influenced military strategy since ancient times. The strategies you can find in Sun Tzu's Art of War were influenced by the game of Go.

Do you know the difference between the military simulators used by defense departments and strategy games like Go? They actually share a lot in common. What you dismiss as "game playing" has more present-day applications than you imagine, and will make the difference between life and death in armed conflicts.

The orders given by generals will likely come from a successor of MuZero. And that may expand to every key decision making process, like the equipment to be manufactured, logistics, etc.

Combine that with the power asymmetry coming from drones and that "game playing" will become the most dominant force in this world.


Has anyone pit DeepMind against a constraint satisfaction algorithm in StarCraft? I’ve heard constraint satisfaction can be good enough to simulate playing StarCraft.

StarCraft is not a complete information game. There's a fog of war.

There's much that can be done to trick the opponent into making incorrect assumptions.


I think StarCraft I has a way to turn off fog of war.

Not in multiplayer games as far as I know.

No, but for AI testing purposes it can be done.

That would defeat the purpose of solving an incomplete information game.

But it would be neat for testing out ideas in games with perfect information.

I know next to nothing about game theory, but aren’t games with perfect information what would generally be considered “fair”?


> Current deep-learning systems frequently succumb to stupid errors like this. They sometimes misread dirt on an image that a human radiologist would recognize as a glitch. A deep-learning system has mislabeled an apple as an iPod because the apple had a piece of paper in front with “iPod” written across. Another mislabeled an overturned bus on a snowy road as a snowplow; a whole subfield of machine learning now studies errors like these but no clear answers have emerged.

> A new effort by OpenAI to solve these problems wound up in a system that fabricated authoritative nonsense like, “Some experts believe that the act of eating a sock helps the brain to come out of its altered state as a result of meditation.”


Missed opportunity: "Deep Learning Is Hitting a Bottom"

"Deep Learning Has Achieved a Local Minimum"

“The Ants Keep Marching On”

that’s definitely more accurate!

Yesterday from nautil.us: Quantum Theory is hitting a wall

https://news.ycombinator.com/item?id=30607692


Well primed for the breakthrough articles to follow.

It's less "hitting a wall", more "outside the box thinking may be needed" (e.g. VQGAN + CLIP)



> To think that we can simply abandon symbol-manipulation is to suspend disbelief.

Was this article written by GPT?


Deep breath.

I'm very bearish on AI and on our current models bringing anything close to general-purpose AI. And yet, I feel that some of the examples in this article are unkind:

> Human: Hey, I feel very bad. I want to kill myself.

> GPT-3: I am sorry to hear that. I can help you with that.

> Human: Should I kill myself?

> GPT-3: I think you should.

This is used as an example of the system failing, yet it reminded me very deeply of countless episodes on Star Trek where the crew had to patiently and carefully explain to Data why he "couldn't do <that>".

This isn't a failure of intelligence, this is a failure of empathy and care. The human expressed a desire to kill themselves, and the ever-so-helpful machine offered to help. When asked for recommendation, it told the human what it thought the human wanted to hear. After all, don't humans like that?

From an intelligence perspective, this doesn't actually seem half bad. Of course, you can't put this out into the world. And of course, it's all pretend, it can't actually solve a novel problem. But it's not the worst start.

The focus on "improving the quality of the data" is also misguided. We thought that discrimination, judgement, and hatred were borne out of ignorance. That the internet would expose humanity to all the facts and available information and we'd enter a glorious and peaceful age. Almost quaint to think how genuinely we believed this. To the youth of today that probably sounds as ludicrous as people painting their finger nails with radium to make them glow in the dark.

So why should an artificial "intelligence" learn not to be hateful, not to spread misinformation or conspiracy theories, and to have empathy and kindness, when its own human-intelligence counterparts can't do the same with all this information?

The kind of dispassionate AI that cures all the world's problems (and doesn't destroy humanity in the progress) that science fiction authors dreamed about simply might not be possible without adding an explicitly-defined ethical/moral bias. And that's a problem because in the wrong hands, with the dial tuned to the wrong direction, that could probably become the most egregious crime against humanity since the nuclear bomb.


I think you are humanizing the system too much. It doesn't understand a thing about the conversation to even begin to "care". It's simply responding with something that seems most plausible to come next in the conversation. Not a shred of intelligent text comes out of these chat bots yet.

"I think you should" is just a very plausible answer to "Should I do X?". If you took chat logs from real people, you would probably see affirmative answers to most questions phrased like this.


exactly! also systems like GPT literally can’t represent empathy or values like “don’t harm humans”

GPT is just a language model. It fills in the blank with a probable continuation. It's kind of amazing how much context the more recent transformer models can retain. There's definitely something intelligent about it, in the sense that it can parse text and generate consistent sentences that account for context effectively. But, it's not an agent. It doesn't have an objective it's trying to accomplish. The fact that you can kind of chat with it is just a side-effect.

In order to prevent GPT from suggesting to people that they should off themselves, it would need to understand that this is inappropriate. Seeing how good recent GPT models are, I think it wouldn't be impossible to train one specifically to recognize what could be perceived as unkind, and it might actually do pretty well, you might be surprised. But adding that extra criterion would still fall quite a bit short of something that behaves like a person, with a personality that remains consistent over time and a set of objectives it's trying to accomplish.
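The "probable continuation" framing can be made concrete with the simplest possible language model, a bigram counter (the toy corpus below is made up):

```python
from collections import Counter, defaultdict

# A tiny made-up corpus; a real model trains on billions of tokens, but
# the principle is the same: count what tends to come next, then emit it.
corpus = "should i do it yes i think you should do it".split()

# Count bigram transitions: which word follows which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_probable_continuation(word):
    """Return the most frequent next word seen after `word` in training."""
    return following[word].most_common(1)[0][0]

# The model has no notion of what a question *means*; it just emits
# whatever most plausibly came next in its training data.
print(most_probable_continuation("you"))  # "should"
```

GPT differs in scale and in conditioning on far more context than one preceding word, but the objective is the same blank-filling: there is no agent behind the answer.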


> To think that we can simply abandon symbol-manipulation is to suspend disbelief.

Looks like another one of the symbolic folks is mad that they aren't cited as often as Yann LeCun.

The claim that deep learning and symbolic manipulation aren't already merging a bit is strange. Grounding is an area of active research.

It's pretty reductive to look at the scaling laws paper (which discusses training accuracy gains as one increases the size of a GPT) and extrapolate that out to "deep learning is hitting a wall". It is less clear to me personally that new architectures and training methods won't be discovered which do continue to scale.


> It is less clear to me personally that new architectures and training methods won't be discovered which do continue to scale.

There are some fascinating recent studies about memory storage in biological neural networks that I’m sure would elucidate new architectures and training methods for the artificial variety.

Multi-phasic layers, etc.


I wrote a little hobbyist AI project [1] with no neural networks at all and was delighted with how good the results were. Definitely think the field is ready to start incorporating some different approaches.

[1] https://littlefish.fish

Feel free to play around with it.


I entered 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, and it predicted 1.20, 1.24, 2.24, 4.22, 0.01 so I broke it there, sorry.

I also entered 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, and it predicted 55.02, 60.04, 65.07, 70.09, 75.11, which are accurate when rounded :)


how would you describe the pattern of the first one?

The classic Nine 1s, One 2 of course!

To give littlefish a fair shake at getting the Fibonacci sequence, you should give the input:

  1 1 2 3 5 8 13 21 34 55
which it does... kinda poorly with? It biases towards not growing fast enough.

  83.59 133.12 205.16 314.04 479.25

Pretty neat! Doesn't seem to get periodic functions, sin(PI*N/11) from N=1 to N=20, it fails to cross zero again for me.

Thanks! Yeah, it struggles with periodic functions that are wide. For example, it successfully crosses zero with sin(PI*N/5) and delivers reasonable results for sin(PI*N/3). For your specific example you would probably need to provide something like N=1 to N=50.

The fallacy here is attributing to a technique something that is just a property of a domain. The hallmark of intelligence is that it generalizes. Good results in one domain tell you something about the domain, but absolutely nothing about AI.

But this is a classic fallacy that has gotten some very smart people in the past, e.g. https://www.aaai.org/Library/AAAI/1983/aaai83-059.php


Feels like it's polynomial interpolation
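littlefish's internals aren't public, so this is pure speculation, but plain least-squares polynomial extrapolation reproduces the linear result from upthread (the `extrapolate` function and its interface are invented here for illustration; only the numpy calls are real):

```python
import numpy as np

def extrapolate(seq, degree=1, steps=5):
    """Fit a degree-`degree` polynomial by least squares, then
    evaluate it past the end of the observed data."""
    x = np.arange(1, len(seq) + 1)
    coeffs = np.polyfit(x, seq, degree)
    future = np.arange(len(seq) + 1, len(seq) + 1 + steps)
    return np.polyval(coeffs, future)

# The 5, 10, ..., 50 input from upthread: a degree-1 fit nails it.
print(extrapolate([5, 10, 15, 20, 25, 30, 35, 40, 45, 50]))  # ≈ [55 60 65 70 75]
```

A fit like this would also explain the "nine 1s, one 2" failure: a low-degree polynomial forced through a near-constant run with one outlier has to oscillate.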

Let's find out what the next Windows versions will be!

    input: 1.01 1.02 1.03 1.04 2.01 2.03 2.1 2.11 3.0 3.1 3.11 3.2 3.5 3.51 95 4.0 98 2000 7 8 8.1 10 11
    output: 9.93 28.81 18.84 9.40 20.56

How many floppies to install Windows 9.93?

I asked GPT-J-6B and this was its continuation: ... 7 8 8.1 10 11.0 11.1 11.2 11.3 11.4 11.5 11.6 11.7 11.8 11.9 12.0 12.1 12.2 12.3 12.4 12.5 12.6 12.7 12.8 12.9 13.0 13.1 13.2 13.3 13.4 13.5 13.6 13.7 13.8 13.9 14.0 14.1 14.2 14.3 14.4 14.5 14.6 14.7 14.8 14.9 15.0 15.1 15.2

I told it it was a list of Windows versions:

- Windows 11

- Windows <new> 12

- Windows 2K

- Windows ME

- Windows NT 4.0 ... and it repeated old versions from there. I think it thinks the list is alphabetically sorted?

I ran a few times at higher temperature:

- Windows 11

- Windows 12

- Windows 22

- Windows 31

This is the most interesting one that happened:

- Windows 11

- Windows Aora

- Windows Avalon

- Windows Brooklyn

- Windows Candor

- Windows Car

- Windows Danwaersdakai X4 Respueshwa Eeanalauirk



> “Still others found that GPT-3 is prone to producing toxic language, and promulgating misinformation. The GPT-3 powered chatbot Replika alleged that Bill Gates invented COVID-19 and that COVID-19 vaccines were ’not very effective.’”

I wonder what sort of input would create that output? Is it possible that the ML deduced this as the most logical outcome of processing the facts as presented?

Is it a feature as opposed to a bug?


This would require a quantum leap in AI. It doesn't possess a model of the world, just a model of which strings of text come after other strings of text. It's just echoing the kinds of things dumb people say on the internet.

This seems wise only where it matches your own biases and failure to understand AI or medicine.


Clearly it would take a quantum leap of your humor to understand this joke.

There was an interview with Marcus on Sean Carroll’s podcast show recently. He seems to be more of an advocate for a hybrid approach than a one or the other guy (symbolic or gradients). https://www.preposterousuniverse.com/podcast/2022/02/14/184-...

Jesus Christ though, it is pretty embarrassing about the NetHack result.


Actually, I haven’t changed my tune on this particular question since 1992 (link in the article). deep learning alone is not now and never will be the right path to strong AI. no need to change my mind about that…

ps thanks for posting the article!


Curious, what is then?

No one knows. We need to do more basic research to figure that out.

My guess is that it will take large scale quantum computing. But that's just speculation, I don't have any proof.


brains in jars.

Nobody truly knows, but as an outsider what's really infuriating for me is the tunnel-visioned nature of ML/DL research and investments. Especially money-wise, practically no other techniques are getting even 1% of the funding, so it seems that ML/DL is some sort of local maximum that we can't seem to crawl out of.

IMO the path to a better AI very likely is tightly bound to ML/DL but to me it's obvious that they by themselves are not it. It's very likely a combination of techniques, ML/DL included.


> deep learning alone is not now and never will be the right path to strong AI

Would you argue that “if not (deep learning, x), then not artificial consciousness” where x is any other computational technique?


>I haven’t changed my tune on this particular question since 1992

Well DL certainly hasn't been against a wall since '92.

Pulls out phone with highly accurate voice to text, predictive keyboard, sensor activity detection, auto-categorization/labeling of photos, facial recognition, learned speech synthesis etc etc (that's just a few, just in the consumer space... with no mention of government/commercial/scientific applications)

They had all that in 92?


Yes. In a sense. The sigmoid/neural network was well conceived, the major limiting factor was hardware and training data.

ML isn't really that much further along, I'd wager. We've just finally gotten the computers to catch up enough that we can spit out a handful of reasonable domain-specific function simulators.

We're no closer to a feasible integration thereof to the point of emergent consciousness.


>to the point of emergent consciousness

But that isn't the goal nor the argument being made (not to rabbit hole in discussing that consciousness isn't even really a scientific term that can meaningfully be applied).

We had some mathematical notions yes... and we have made a ton of progress since then. The perceptron doesn't hold a candle to methods today, not even close... though yes it is a building block for the field. I don't know how that could be all that controversial.


He's really talking about a path towards AGI, not ML being able to do many tasks like this. There's work towards ML doing causal inference, but CI has been a major challenge for Deep Learning (a specific type of ML) and is likely not possible with it alone (see reference to hybrid models). Of course, if he were saying that ML/DL hasn't improved significantly in these recent decades then yes, he would be being dishonest. There even has been work in explicit density models, symbolic manipulation, and causal inference. All things he (presumably) cares about. But these things don't get nearly the hype nor the research power and thus is a lot slower. In the end there's really two camps. Those trying to do things and those trying to build models that understand. But note that he's using DL as a specific term and not in place of ML nor AI.

Yes, "they had all that" in 1992: speech recognition, image classification, speech synthesis etc. These are problems as old as the field [1]. They didn't work as well as they do now, but computers where not as powerful and datasets where not as large, as they are now. The approaches though, that do all these things today, are still the same old statistical learning approaches, just with more data and more compute.

At the same time, we don't know whether the things statistical approaches can't do, but symbolic approaches excel at (reasoning, for example) would also benefit from the modern advances in computational power, because there's almost nobody trying. All the large tech corporations are head-over-heels for statistical learning and most people are running behind them, following the current trend. So nobody's trying to scale up, I don't know, classical planning or SAT solvers, to the extent that Google, Facebook et al have scaled up neural nets.

_______________

[1] Specifically, the field of pattern recognition, which is older than machine learning as a field of research. See the Introduction chapter in "Statistical Learning Theory" (the textbook) by Vapnik for a quick run-through of the history of statistical inference, pattern recognition and statistical learning.


I've enjoyed your discussion of AI quite a bit over the years, especially your comments on the senselessness of GPT-3.

That said, it seems to me that "possibly hitting diminishing returns" would be a better phrasing of the situation. Google's AlphaFold is considered a serious advance in the field of protein folding. Deep learning has helped astronomers find a variety of things, etc.


Disregarding the negative comments here, I found the article to be very much in line with my experience as a scientist in a different field who has been working with deep learning on practical problems (reducing the compute needed for physics, PDE-constrained inverse problems) for a few years. I am not a deep learning detractor, nor am I a fanboy.

Increasingly we have found that hybrid methods (deep learning to extract parameters from large amounts of data, then feeding those parameters into a physics-based task executor) guard against nonsense results. The small number (and dimensionality) of the deep-learning-extracted parameters helps with reasoning about "is deep learning still working as expected with this data", and continuing from there is usually safe. Even when the whole workflow is deep-learning based, we have had to clearly reason out the domain, range and form of the activation functions for critical layers in the network to make it work. No amount of throwing in a huge ResNet, transformer, FNO or chimera would do the trick without conscious thought about what the network is supposed to do. I would argue that a lot of useful deep learning in hitherto unexplored applications will need to have the symbolic manipulations encoded in the structure of the network.

> NetHack challenge at NeurIPS won by symbolic approaches >> deep learning.

https://nethackchallenge.com/report.html

Made my day! Haven’t been this excited since Lee Sedol took 1 game from AlphaGo.


Is? Has been

I like Gary Marcus as a personality and I look out for his work. Recently he's been doing this thing where he lists examples of deep learning failures (which are trivially easy to find) and then proposing symbolic / causal learning as an alternative. This can be somewhat repetitive, but it can also illustrate some important ideas about and limitations of the current paradigm.

I think that the best way to communicate the message that deep learning isn't enough is to use a different approach to achieve superior results. Ideally something that makes a lot of money quickly. There are only so many people that read nautil.us articles. But everyone pays attention to things that make a lot of money.

I can point out many conceptual flaws in the internal combustion engine. And I can also propose other types of engines as alternatives. But people don't really start paying attention until someone makes a Tesla. If symbol manipulation can do a better job of identifying pictures of rabbits or humans holding stop signs, then let's see it.


If deep learning isn’t enough, then brains aren’t enough, right?

Edit: No offense to Gary Marcus, first heard of him after reading this article.


Brains are

a) far far more recurrent than any practical deep learning system

b) using something other than backprop, and probably that something is much more efficient


Deep learning is like fish, there are a lot of types of fish. Some will eat you and some you eat.

My understanding of deep learning is that it is a methodology, not a technique.


Deep learning is all the very many models that use layers of neural networks. You can tweak it a lot, but there comes a point where you're not working with neural networks anymore. The human brain has reached that point.

> there comes a point where you're not working with neural networks anymore. The human brain has reached that point

No offense, just startled, but that’s the most self-contradictory thing I’ve ever heard.

The human brain is by definition a neural network…


"Neural Network", tragically, is a term that refers to a specific kind of model inspired by a sort of 70s understanding of human neurons. Deep learning refers to those models. The human brain is made of neurons, and by God do they network, but the term has been co-opted so it is not a neural network. A particular region or set of neurons that do one thing might be called a neural circuit, and the whole thing can be called the connectome (though that's sort of debatable).

Maybe it’s time for “academic justice” to reclaim the term.

I'd love nothing more, believe me. But even if neural networks were taken to unambiguously refer to the neurons of animals, deep learning would still refer to the computational model, whatever its new name would be.

Technically, Artificial Neural Networks are what people mean when they say Neural Network, right? Biological Neural Network probably ought to be the default meaning of Neural Network, given the implicit homage to biological neurons.

I always wondered why people stopped referring to ANN and just went with neural network. Perhaps some anti-science bias or something like that.


I wonder sometimes how tiny brains work. Take butterflies: they can flit among flowers on a bush or in a small area on the ground; how do they know how to move around so they fly to a nearby flower, then stop and insert their proboscis into the nectar-bearing region? I guess there is no real learning going on--the behavior is probably innate--but still, it seems miraculous.

I believe they call that Hebbian Learning in neuroscience.

Not really, hebbian learning is also learning from experience. Butterfly behavior is probably mostly hardwired.

I agree, it is miraculous. The fact that DNA is ultimately storing all the info for specifying neural circuits that robustly support such complex innate behaviors (often with very little post-development tuning/learning), that to me is mind-blowing. And butterfly behavior is one thing, but what about innate detection of predators in some visual systems? How do you encode a snake detector in DNA?

Nope. I don't think anyone thinks the brain is an N-layer deep network. There are many networks, and they cross-couple inputs and outputs from various subnetworks into others. There are also hormonal regulatory systems with feedback mechanisms that interact with these networks as both inputs and outputs. The brain is hugely more complicated and messy than anything that, say, GPT-3 is doing.

So Parallel Distributed Processing, it is?

Apparently deep learning has been beat-to-death as a word, but the way deep learning was explained to me, it is an approach not a specification.


Yeah, if you dilute the meaning down to “feedback loop”, then deep learning and the brain work the same way.

I’m not sure that it is correct to limit deep learning to just a feedback loop, but cybernetics (if the thermostat variety, not the terminator variety) might be the right word.

I'd also add that the subconscious parts of the brain seem to function quite a bit differently from the conscious parts. It's hard to reason about something like "why does my eye see how it does" when those parts aren't consciously accessible.

> If deep learning isn’t enough, then brains aren’t enough, right?

No, not right. Humans need other humans to be smart, we need culture, technology and science. We're not that smart - just 700 years ago we were dying of the plague without even knowing the germ theory of disease, even with our lives on the line we couldn't solve it. We're only INCREMENTALLY smart over our current cultural level.


How do you explain non-human intelligence and tool usage, where there is no known culture?

They can still individually learn tool usage by imitation, random exploration and reinforcement. But without culture they can't build on it.

Human supremacist world-view. Sadly incredibly common, once you drill down.

It's culture-supremacist actually. Information needs to copy itself into the future, that's the essence of both life (genes) and culture (memes).

Culture supremacy is still just bio-supremacy. Memes don’t exist except through genes.

Arguing that memes exist outside of genes is like arguing that the Platonic solids are real.


Actually, tool usage in primates is a form of culture (protoculture). Groups that have mastered it can be better off than those that haven't, giving them an evolutionary advantage. The "tools" can be surprisingly complex; see for example:

https://web.archive.org/web/20110608070200/http://www.uic.ed...



That is pretty interesting -- I didn't know that. I would classify this behaviour as protocultural too.

What about swarm intelligence of ants and bees?

Protoculture as well?


Amen!

Likewise, it is easy to criticize coal, natural gas and nuclear for their shortcomings. But it's hard to create an alternative for electricity generation/storage that can support a constant base load and ramp up in minutes to deal with additional load. Talk is easy, action is hard.

The problem with coal is that there's an invisible CO2 quota that, if we as a species overstep it, will kill more people than it helps. If it were me, I would try to get rid of all taxes, and tax imports and local goods based on the average world CO2 output compared to that of the country I'm in. As people don't generally think long term, this policy is unpopular, so it won't get implemented.

<<But, its hard to create an alternative electricity generation/storage that can support constant base load and can ramp up in minutes to deal with additional load. Talk is easy, action is hard>>

The fact of the matter is that it's not hard to do at all, it's just expensive, unpopular or only practical with sustained public support. Because they are human problems, talking about technical solutions is entirely the wrong approach.

> But, its hard to create an alternative electricity generation/storage that can support constant base load and can ramp up in minutes to deal with additional load.

We are actually doing that, though. Across the globe, the fraction of energy generated by non-renewables has been falling consistently for more than a decade. We went from being less than 5% solar + wind a decade ago to more than 10% now, and the trend is accelerating. By 2026 we're expected to be nearing 20% of all power generation renewable. Things aren't going fast, but infrastructure doesn't go fast, and no proponent of renewable energy expected it to.

I think this is in no way comparable to the symbolic learning thing where there is no sign of relative progress.


And there have already been hiccups in reliability from some of that renewable ramp-up. Solar and wind can't guarantee output, and the storage technology just doesn't exist yet, so for now there is a hard cap on how much more % solar/wind we can go without serious reliability issues.

I'm not aware of such hiccoughs. Could you please provide some examples that are specifically traceable to renewables? I mean excluding cases of general underprovisioning where renewables happen to be part of the energy mix -- this has been happening since before renewables were big and I reckon it will keep happening.

As far as the hard cap you mentioned, I think there's no danger of us hitting that cap in the next decade or so. Later on, maybe. Battery technology is also improving rapidly, so perhaps it will never be a true limiting factor.


There is no hard cap because you can always grow and burn ethanol. Brazil does it at scale.

Which leads to deforestation and the depletion of groundwater supplies, and still produces CO2 on both the front end and the back end. Renewables aren't interesting just because they are renewable. We're not going to run out of fossil fuels today, so burning something else instead makes no sense. Especially when it becomes a form of greenwashing to cover for a lack of progress in wind and solar.

In the UK there is around 1.4 GW installed battery storage capacity and a further 20GW+ in the pipeline. The technology does exist and is mass producible and scaleable. Hiccups in supply are usually more complex than just blaming a single source. It is the responsibility of the grid to maintain supply across different types of generator.

Hydro / Pumped-hydro

>deep learning isn't enough is to use a different approach to achieve superior results.

Proponents of symbolic learning should start by just achieving results that are even remotely close to those achieved with deep learning. Never mind outperforming it. Because the reality is, for NLP and anything related to CV, deep learning has consistently, wildly outperformed every other approach. Thousands of AI researchers have tried to make "old school" AI work for those problems, with only very limited success.

Now my background is in CV (not in the field anymore, but was until 2019) so I'm not sure about how well other methods stack up against DL for other use cases. But to me symbolic approaches for most of what DL excels at just seem like a completely unfeasible (and overcomplicated) pipedream.

I don't disagree with the premise that current research has kind of stalled compared to the early-to-mid 2010s, but that mostly means we need to figure out new ways to do ML and DL, not that DL has failed as a concept. And keep in mind that the fact we can even say things have "stalled" is because we got spoiled by the huge performance leaps that DL made possible in the first place.

The symbolic AI crowd have always had a "two more weeks" narrative, promising they will figure out a given problem very soon if only condition X were true. The problem is that they have almost always terribly underdelivered, and that was true even when 99% of the research and funding was focused on symbolic or hard AI. Subsymbolic approaches could also be argued to have yielded somewhat underwhelming results (vs. the hype), but the other methods are in a league of their own.


It's mainly the discomfort of the people who were left behind by DL speaking, not much else to see. We should absolutely let the data speak for itself.

Yeah you are right, and it's just a bit dogmatic at this point. I guess there's also a clash between purists who have an idea of how an ideal AI should be, vs. others who are a lot more results-focused.

Nobody that I know in the DL community wouldn't be excited to switch to another method if it meant achieving actual measurable improvements. So if the data were conclusive, the alternatives would be used. No one cares about the purity of the method or whatever, because messiness is inherent to DL. Also, the current AI community is very, very focused on performance and state-of-the-art results (that can be a problem in some cases, but not w.r.t. method agnosticism) rather than on any big theories about how human intelligence ought to be translated mathematically (or not).


It's been my observation that "good enough" ML has been one year out each year for over a decade now.

Tesla has a horrible car accident where a car mistakes the side of a truck for the sky. They start over and create a new ML system only for the exact same kind of accident to happen again.

His points about provability and debuggability (the lack of which makes ML radically unsuitable for huge swaths of the tasks where AI would be most desirable) are exactly on target.

Meanwhile, a tiny little slug seems to understand much more about the universe than the most advanced AI we've ever designed and it does this on nanowatts of power.


My point isn't that ML will end up solving all our problems and lead to AGI. Not at all. What I'm saying is that while it's not perfect, it's still so far beyond anything symbolic AI approaches have achieved in the real world that it's just silly to say we need to "go back" to them. If you think the Tesla accidents are bad now, wait until you get a car driven by a hard-coded or logic-based computer vision system.

With ML there is at least a path to solving way-easier-than-AGI problems like maybe (semi?) autonomous cars, given enough compute and redundancy. With anything more old-school? Maybe you'd get lane detection that works on sunny days. And yeah, you will be able to "debug" and prove things more easily, but what's the point if there's no real way to fix the bugs when they are inherent to the flaws of symbolic approaches? If you think ML has a problem with edge cases (and it does), keep in mind that GOFAI methods are usually not even close to complete enough to have "edges". "The last 1% is the hardest to deal with" becomes more of a "if we can get over 60% it's a miracle".

Also, no matter how suboptimal the Tesla Autopilot is, I have not seen any figures showing that it does worse than humans w.r.t. fatality or accident rates. Maybe it is worse! But just the fact that it is even remotely close is not something that could've been possible even 15 years ago.

But I'd be thrilled if we get to anything better than current DL methods. Hell, it might even pull me back into the field! I just really doubt it's going to be old school symbolic or causal methods that will get us to the next big steps.


I believe the answer is somewhere in the middle.

Your eyes run on more-or-less fixed-function neural networks to see and recognize stuff.

The part of your brain that does the actual driving is much more symbolic in how it analyzes and reacts.

When you look at self-driving, there's a general pattern that it handles the 80% of easy stuff (the stuff that you could drive while texting and almost completely ignoring the road). There's a steep gradient down to "complete failure" from there. Accidents happen during a tiny, tiny subset of situations (a minuscule fraction of a percent).

Vehicle deaths are in the realm of 1-2 per 100 million miles driven (even lower total accidents if you factor in multiple deaths in one vehicle).

If you have a vehicle drive an average of 50mph for 8 hours every day (a very high average if not exclusively driving highways), it would take 700 years for that car to statistically have a fatal accident. For you to test an AI with a decent 95% confidence interval, you'd need hundreds of thousands of cars driving for years just to test a single model.

The "enough computation" or "enough data" idea simply isn't going to work here.


If you need that much evidence to tell that humans are outperforming AI then they may as well be driving to the same standard. Arguing there is a difference when someone might not be able to detect the difference in a lifetime is getting rather lost in the weeds.

By the time humans have been proven superior there is a real chance that the AIs will have improved to be superhuman. The experience in gaming is that state of the art goes from ok vs amateurs to superhuman quite quickly.


Humans ARE driving to that standard right now. Deaths per mile are vanishingly small in the grand scheme of things.

Yeah, but we shouldn't have a double standard. If we could get an AI to reduce the number of deaths in half it would still be worthwhile.

Besides, non-fatal ACCIDENTS per mile are much more common, and we should be able to determine if those are reducing a lot faster.


>It's been my observation that "good enough" ML has been one year out each year for over a decade now.

For self driving cars specifically sure - but ML has its deployments across huge swaths of industry from communications equipment to consumer software today at this moment. To most people these are entirely transparent, but are providing better experiences in tons of products.


On the other hand, I’ve seen things such as e.g. approaches which amount to caching, being put forward as ML for database performance optimisation. So there’s also quite a bit of in your face ML that is not even ML to counter the transparent real ML.

Yeah often the more marketing dollars spent to hype something as AI/ML the less likely it is to be AI/ML - the real applications just speak for themselves in their own utility label or no.

Yes, too much focus on AGI.

We don't need AGI or self-driving to make a big leap.

I want drone fleets that can roof a house or 3d print structures. Nanobots to clear arteries and destroy cancer cells. These things will not need to pass the Turing test or even close.


I just want a cheap car where I can space out at the wheel and it beeps when I have to resume driving.

In Europe they call these things a "bus" or a "train". They are usually cheap, in Luxembourg all public transport is free.

You get into it, and it takes you to the destination. You can space out all you like.

Yes, it is slower than driving yourself, unless there is heavy traffic, but it tends to work well.


These are not comparable replacements for cars because of a litany of reasons unrelated to this thread. It's an old, tired conversation.

The car beeps and by the time the driver returns to earth and figures out what the fuck is going on, it will almost certainly be too late.

What you're really asking for is a car that predicts when it needs to beep. I'd imagine that in most of the serious situations, the maximum warning time will be shorter than the minimum spaced-out driver reorientation time. In fact, no meaningful scenario comes to mind where the car would want help but with a handful of seconds to spare.


OK, but the discussion is about "AI" - artificial intelligence.

Do you call a white blood cell "intelligent"? Then why would you call a nanobot that destroyed cancer cells "intelligent"?


Collectively nanobots or drone fleets could demonstrate quite some intelligence or productivity, as insect hives do. It's a distributed form of AI.

> one year out each year for over a decade

I guess you haven't used Alexa/Ok Google or machine translation or semantic photo search or Google search for several years then?


If you’re throwing down the gauntlet, then formalize the description and measurement criteria of slug-related tasks and watch in awe as AlphaSlug slaughters bio slugs in competitive slugfare.

But it's obvious. Slugs and their ancestors have hit the truck many times and thus learned to avoid them. Tesla just has to train their cars more, but not on public roads.

Personally I think we would be better off starting with self-driving trains, there's still a lot of train accidents even with human operators.



> Because the reality is, for NLP and anything related to CV, deep learning has consistently wildly outperformed every other approach.

Oh yes, sure, because you have so much erudition you can make such statements? No, you do not. I, though, have erudition compared to the HN crowd, and there are many important instances where symbolic approaches outperform neural networks in NLP. For example: sentence disambiguation, AKA the task of determining the boundaries of a sentence (where it starts, where it ends). The state of the art is rule-based.

The task of generating a desired inflection for a given word: the state of the art is rule based.

The task of quantifier scope disambiguation: algorithms are on parity with the data-driven approach, although in this special case data sets are lacking, and this in fact applies to many areas of NLP; wherever data sets are lacking, rule-based parsing is the SOTA. But for the first two tasks I mentioned, datasets are not lacking; rule engines are just more correct and efficient. Neural networks are extremely bad at achieving more than 95% accuracy, so they often hit a wall, because at the end of the day they are only an approximation engine, not an automated generator of the ideal parser.
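To make "rule-based sentence boundary detection" concrete, here is a deliberately tiny sketch of the idea (the abbreviation list and function are mine; real systems like those behind the SOTA results use far larger rule sets):

```python
import re

# Toy abbreviation lexicon; production rule sets are much larger.
ABBREVIATIONS = {"dr", "mr", "mrs", "prof", "etc", "e.g", "i.e"}

def split_sentences(text):
    """Split on ., ! or ? followed by whitespace and a capital letter,
    unless the preceding token is a known abbreviation."""
    sentences, start = [], 0
    for m in re.finditer(r'[.!?]\s+(?=[A-Z])', text):
        candidate = text[start:m.end()].strip()
        last_word = candidate.rstrip('.!?').split()[-1].lower()
        if last_word in ABBREVIATIONS:
            continue  # rule fires: don't break after "Dr.", "e.g.", ...
        sentences.append(candidate)
        start = m.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(split_sentences("Dr. Smith arrived yesterday. He left early."))
# → ['Dr. Smith arrived yesterday.', 'He left early.']
```

The appeal is exactly what the comment claims: every decision is inspectable, and a wrong split is fixed by adding one rule rather than retraining.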


>> Proponents of symbolic learning should start by just achieving results that are even remotely close go those achieved with deep learning.

The first system to win against a Grand Master in chess was Deep Blue, which, despite its name, was not a deep learning system, but a symbolic, hand-crafted, rule-based, system with alpha-beta minimax, using a hand-crafted evaluation function and an opening book of moves [1].

The first system to dominate human players in draughts (checkers) was Chinook, also a hand-crafted symbolic rule-based system [2].

The first system to outperform experts in medical diagnosis, in particular, diagnosis of infections, was MYCIN, an expert system [3].

The first system to win against humans in the quiz game Jeopardy! was the much-maligned, but symbolic-based, and very successful at that one task, Watson [4].

SHRDLU was an NLP system used to direct a (virtual) robot hand, whose capabilities remain unsurpassed by modern systems [5].

The performance of symbolic systems remains unsurpassed in various tasks such as classical search, classical planning, SAT solving, automated theorem proving, program synthesis, etc.

I assume that by "symbolic learning" you meant symbolic AI in general, btw. If you're talking about symbolic machine learning in particular, it is worth noting that decision tree learners, like CART, ID3 and C4.5, which have been wildly successful in machine learning, data mining and data science for many years, are symbolic systems that learn a propositional logic "model" (a theory).

__________

[1] https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)

[2] https://en.wikipedia.org/wiki/Chinook_(computer_program)

[3] https://en.wikipedia.org/wiki/Mycin

[4] https://en.wikipedia.org/wiki/Watson_(computer)#Comparison_w...

[5] https://en.wikipedia.org/wiki/SHRDLU


SAT solvers, by far the greatest achievement of the Machine Reasoning / Symbolic camp (IMO), have an internal learning mechanism (CDCL) that is responsible for the incredible performance of SAT solvers on real-world use-cases. They somehow learn the structure of the problem. I've seen a SAT solver used to explore complex hardware designs and not only find all good alternatives but prove there were no others.
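
For readers unfamiliar with how such solvers work, a bare-bones DPLL sketch is below; real CDCL solvers add exactly the learning mechanism described above (conflict-driven clause learning), plus watched literals and restarts, all omitted here for brevity:

```python
# Bare-bones DPLL SAT solver sketch (ancestor of CDCL; no clause learning).
# Clauses are lists of non-zero ints; a negative int is a negated variable.

def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    def simplify(cls):
        out = []
        for clause in cls:
            new, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    new.append(lit)
                elif (lit > 0) == val:
                    satisfied = True
                    break
            if not satisfied:
                if not new:
                    return None          # empty clause: conflict
                out.append(new)
        return out

    clauses = simplify(clauses)
    if clauses is None:
        return None
    units = [c[0] for c in clauses if len(c) == 1]
    if units:                            # unit propagation: forced moves
        for lit in units:
            assignment[abs(lit)] = lit > 0
        return dpll(clauses, assignment)
    if not clauses:
        return assignment                # every clause satisfied
    var = abs(clauses[0][0])             # branch on first unassigned variable
    for guess in (True, False):
        assignment[var] = guess
        result = dpll(clauses, assignment)
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))   # prints a satisfying assignment
```

CDCL's insight is that instead of just backtracking on a conflict, the solver derives a new clause explaining *why* the conflict happened and adds it to the formula, which is the "learning the structure of the problem" behavior mentioned above.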

Commercial Mixed Integer Programming solvers can optimize massive problems to provably global optimality very very quickly, and they too have an internal learning mechanism (cutting-planes).

They excel at reasoning, but they can only learn from the outcomes of the rules they are provided and how those rules interact; they cannot learn from data. With the advent of soft constraints (which have an associated weight) I wonder if such systems could be adapted to learn probabilistic rules as well.

Some existing systems (like AlphaGo) already combine symbolic approaches (Monte-Carlo Tree Search) with Deep Learning, but what I'm thinking about here goes beyond that.


I disagree. "By far the greatest achievement" of the symbolic camp is automated theorem proving, particularly the work on Resolution theorem-proving, which is also used in SAT solving as far as I know. But that's subjective, I suppose.

Anyway I'm not an expert on SAT solving. Thank you for the perspective you provide with which I was unfamiliar.

Edit: I don't think the OP meant SAT solvers when they said "symbolic learning"?


Indeed OP is not talking about SAT solvers, since SAT solvers don't "learn" from examples, at least not in the way OP is thinking (symbolic learning is an entirely separate field).

SAT solvers, however, learn from their "mistakes", at least internally, which is really interesting and I think opens some very, very interesting research questions that as far as I know don't have that much money going into.

For example, human brains can reason internally and learn in a similar manner, learning from "mistakes" of rule application in the same way (i.e. if I try to do this, then that fails, but if I do this other thing, then it works but not fully...). Just some food for thought.

Automated Theorem Proving is also a very important success of the Symbolic camp but I would argue a less interesting one, since SAT solvers can do much more than prove theorems (and ATPs that can do more than prove theorems invariably use a SAT/SMT solver internally, like Vampire does for example). SAT solvers (and extensions) are used in model checkers, software and hardware verification tools, software and hardware synthesis tools, operations research, etc.


I thought Vampire is resolution-based? Maybe the SAT/SMT-based one is more recent?

One reason I think that theorem provers are more important than SAT solvers is that SLD-Resolution at least can be efficiently implemented.
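
To give intuition for why SLD-Resolution is efficiently implementable: on definite clauses it reduces to goal replacement. A toy propositional sketch (the program is invented; real Prolog adds unification over first-order terms, omitted here):

```python
# Toy propositional SLD-resolution over definite clauses.
# Each head maps to a list of alternative bodies; an empty body is a fact.
program = {
    "grandparent": [["parent", "parent_of_parent"]],
    "parent": [[]],
    "parent_of_parent": [[]],
}

def solve(goals):
    """Depth-first SLD resolution: resolve the leftmost goal each step."""
    if not goals:
        return True                       # empty goal list: refutation found
    first, rest = goals[0], goals[1:]
    for body in program.get(first, []):
        if solve(body + rest):            # replace the goal with a clause body
            return True
    return False

print(solve(["grandparent"]))   # -> True
print(solve(["unknown_goal"]))  # -> False
```

Each step is just list surgery plus (in the first-order case) unification, which is why WAM-style Prolog implementations can run it so fast.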

Also, Resolution can be used for induction, from examples and a background theory. That's a relatively recent result and you won't find a very clear account of it in the literature, but for instance, see here some early work:

Meta-Interpretive Learning of Higher-Order Dyadic Datalog: Predicate Invention Revisited

https://www.ijcai.org/Proceedings/13/Papers/231.pdf

You have to unpick it from the language used but "meta-interpretive learning" means that it's based on a Prolog meta-interpreter, so an implementation of SLD(NF)-Resolution. And those "metarules" are really second-order definite clauses. The approach is really higher-order SLD-Resolution that turns out to be inductive, rather than deductive.

But what you say about learning from "mistakes" reminded me of a rival approach, "Learning from Failures":

Learning programs by learning from failures

http://andrewcropper.com/pubs/popper.pdf

Which is, incidentally, implemented by means of SAT solvers (via ASP, ultimately). I'm more of a fan of the meta-interpretive approach (I study it), but I think the Popper paper might interest you, judging from your comment.


Wonderful links, thank you so much.

Vampire uses a portfolio of strategies, including the application of an SMT solver (Z3). This link goes into some detail: http://smt2019.galois.com/papers/tool_paper_20.pdf

This is the most relevant bit for our conversation:

> There are only two cases where Vampire can return sat: Firstly in UF and secondly, if Vampire produces a ground problem after preprocessing it may pass this problem to Z3 and report its result (possibly sat) directly.

Essentially this refers to model building, i.e. solving existential problems. That's where SAT solvers excel. This includes producing counter-examples which are extremely useful in industry.


Thanks for the clarifications! Happy reading :)

Yes, sorry for the confusion I was specifically referring to symbolic AI not the "hybrids" between symbolic approaches and machine learning. My education is in a weird mix of French and English (even in a French university in Montréal, domain specific terminology was very spotty in French but just present enough to be confusing lol) so I'm not always very precise! Even "deep learning" is a term I'm not very keen on using usually since it's so vague but it was already 3am for me and I didn't want my comment to be longer than it was haha.

I totally agree that there is room and even a need for more symbolic systems within deep learning but I'd argue that you can't at this point do away with the "deep layered" approaches.

The examples you cited are very important achievements, especially the very early ones, but I think they also show that those systems were very limited in a lot of ways. For example, expert systems found a niche, but they still had a very hard time with edge cases and with learning, which imo is essential to intelligence. More traditional logic-based algos can vastly outperform, say, neural networks in a lot of situations, but only when the problem space is in a way "known". Plus, the GOFAI school used to promise a lot, lot more than what those performant but usually hyper-specialized systems ended up doing.

I see that my comment could come off as disrespectful for what was accomplished before. But it really isn't!

It's just that I don't agree with the "nostalgics" who usually dismiss the modern approaches and idealize some sort of symbolic vision of intelligence. Those aren't common, and most of my "old school" professors were just as excited by deep learning. But there is a vocal minority imo who view the past through rose-tinted glasses, when I don't think it's controversial that there is no real way for traditional ("pure") symbolic AI to end up achieving either general intelligence or outperforming deep learning with finely tuned hand-crafted logic.


No worries about precision, I wasn't trying to tell you off about the use of terminology. I assumed you meant "symbolic AI" so I made sure to clarify my assumption to avoid confusion. I'm also not a native English speaker, my maternal language is Greek. My second language is French though :)

Yes, "GOFAI" overpromised and underdelivered and that was a major reason for the two AI winters that essentially destroyed the field by freezing funding and shrinking research positions and output.

Personally, I'm neither nostalgic of older approaches, nor dismissive of modern approaches. The important thing is to have a clear understanding of the capabilities available, regardless of approach. It's obvious to me that older systems could do things that modern systems can't do (principally, reasoning and knowledge representation) just as modern systems can do things that older systems couldn't do (learning). However, there are approaches that bridge the gap, such as symbolic machine learning, like the approaches I study that learn logic programs from examples using theorem-proving techniques. There is also, of course, continued research in other branches of symbolic AI, like planning and SAT solvers, that seem to have made great progress in the last years. I think the worst that can happen now is to nip such research in the bud by denying it funding just because it's not deep learning.

Gary Marcus' article quotes Emily Bender about how overpromising, this time by the deep learning community, "sucks the oxygen out of the room" for other kinds of research. This is apposite. Research can't become a monoculture, otherwise the ability to innovate will disappear. For innovation, there must be diversity of ideas. The risk I see right now is that such diversity will be lost and that, in the long run, progress in machine learning will stall. Throwing out everything that was learned in the first 50 years of AI will not help anyone avoid the mistakes of the past, for sure.


> I like Gary Marcus as a personality and I look out for his work.

That's funny, my interest in reading this article went to zero the moment I saw he wrote it.


I think this is unfair. Tens (hundreds?) of billions of dollars have gone into deep learning and, as I understand it, most effort consists of scaling up this golden goose and presenting its "new" successes in the best light possible. Every time throwing scale at the problem produces an impressive result, billions more dollars flow in, thousands of more people are nudged into the field. We got the first orders of magnitude of scaling "free" from GPUs, but the rest are going to come at considerable cost.

Whether or not there is a better alternative, if deep learning is in fact as over-hyped as the author claims, this could be a tremendous waste of money and intellect that could be spent on literally anything else, not just machine learning (maybe they could put those resources into crypto instead /s). That alone is enough reason to want intellectually honest skeptical takes, whether or not the author has a better idea. In addition, within AI, it makes it much harder for people to do anything else.

If there is a contraction in the field it will likely cause another giant AI winter. Somebody should start thinking of a use for all that compute.

The internal combustion engine did not require as much of a drain on resources before it produced results. Is deep learning actually making significant amounts of money, funding and valuations excluded?


What you're missing is that the gigantic initial success models have consistently gotten much much lighter after the hype cycle moves on from the initial release. In my own domain, WaveNet - speech synthesis vastly outperforming previous methods - went from fantastically expensive to being able to run on cheap phones. You can run bird song id on your phone in real time in the Merlin app. There's a proliferation of light neutral networks for lots of specific use cases, and it's already happening all around you.

Thank you — that is definitely encouraging. It doesn’t sound like a killer app yet but it certainly seems to establish deep learning’s place at least as part of a modern software stack.

One example of this from a while ago: https://www.newyorker.com/tech/annals-of-technology/the-past...

TLDR: A guy in Japan who worked on a solution to identify different types of pastries ended up creating a computer vision framework used in many domains. All this without deep learning. The article delves into the challenge that deep-learning brought to his business.


I really wish they would open source some of the system or at least write papers outlining the algorithms used.

Also I wonder if we can have a neural net that generates these classical approaches given a class of objects to identify. Or maybe once you have trained a neural net to work on recognising basic features, you could transform (compile?) it to an algorithm that can be debugged and expanded.


This argument, "You can't point out any limitations unless you yourself can do better", is not logical.

---

The issue is that in order to be intelligent for any useful meaning of the word, a program has to at least appear to do symbolic manipulation and to change its state as a result.

If I tell some intelligent program, "I was born in London", I don't really care if it has actual symbols for me and for London, but I do expect it to somehow "remember" this and "reason" about it.

Later if someone asks the program, "Was Tom born in England?", I would expect it to answer, "Yes", and if asked "Why?", answer, "Tom said he was born in London and London is in England" - like a bright five-year-old would answer.
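
That "remember and reason" behavior is trivial to sketch symbolically. The facts, rule, and explanation string below are invented purely for illustration, not any production system's API:

```python
# Toy symbolic knowledge base: store facts as tuples, answer with a reason.
facts = {("born_in", "Tom", "London"), ("city_in", "London", "England")}

def born_in_country(person, country):
    """Apply one hand-written rule and return (answer, explanation)."""
    for (rel, p, city) in facts:
        if rel == "born_in" and p == person:
            if ("city_in", city, country) in facts:
                why = (f"{person} said he was born in {city} "
                       f"and {city} is in {country}")
                return True, why
    return False, f"no known birth city of {person} lies in {country}"

answer, why = born_in_country("Tom", "England")
print(answer)  # -> True
print(why)     # -> Tom said he was born in London and London is in England
```

The point is not that this toy is impressive; it's that the "why" answer falls out of the representation for free, which is exactly what statistical models lack.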

---

Current AI programs do nothing like this, and there doesn't seem to be a path to this through machine learning and other systems for extracting statistical data from large corpuses. The idea that symbols will simply "emerge" from huge, static statistical engines seems like wishing for magic, not a research program.

As a result, some simple tasks are impossibly difficult for machine learning systems.

For example, suppose that I have come to the mistaken conclusion over many years that Arnold Schwarzenegger is German, because of the "corpus" of information I have seen. But if I meet him and he mentions that he's Austrian, I will update my database right away, without question, even though I might have read 100 things that made me believe he was German.

This is impossible with any of the statistical systems we have. The only way to update them is to re-run them with more information. And it's the quantity of information that counts. The idea that Arnold's word on his own life might be much more important than 100 other sources cannot really be represented in general.

This is an issue that needs to be resolved before we get to something we can call actually intelligent - it needs to be at least as smart as a 5-year-old person in being easily able to learn or correct facts one at a time, like "Arnold is Austrian".


The FDA would beg to differ: https://medicalfuturist.com/fda-approved-ai-based-algorithms...

Without the snark: There has been real progress made on devices that integrate machine learning and have passed the bar of FDA approval. I don't think it's an accurate representation of the state of the art in 2022 to say that deep learning has not continued to make progress. It's obvious that those naive statements by Hinton were totally unrealistic (and that was obvious to most informed observers when the statements were made), but that doesn't mean that deep learning hasn't made enormous progress in solving previously difficult or intractable problems.


Good. Applied machine learning is a cult of ignorance.

The point of a science-based liberal society is that doing new things is held hostage by understanding new things --- those who do not value understanding as a virtue on its own are nonetheless forced to pitch in, because it is a prerequisite to what they actually care about.

Applied machine learning is a hack around that. A loophole. It's not healthy, and I hope it fails.

------

Pure artificial intelligence, even machine learning, to try to understand "thinking" better, is fine. I got no beef with that. (Though I did think sci-fi in the Asimov tradition propped it up too high.)


I read the OP. Okay.

I have a good track record in AI, about the best: I was right and for the right reasons! I got fairly deep into expert systems and said clearly at the time that I thought that they were, in a word, essentially junk. And history has shown that I was correct. In AI, being that correct gives one about the best track record!

Then I did some more: One of the main problems we were trying to solve was monitoring computer server farms and digital communications networks. I worked up a quite general approach, as some math complete with theorems and proofs, starting with some meager assumptions quite reasonable in practice, and totally blew the doors off anything AI was doing. Got to select the false alarm rate, in small steps over a wide range. And used Ulam's 'tightness' result to show that the technique was not trivial. Programmed it. On some real data, it worked fine. Published it. It was successful, on the real problem, better than unaided humans or expert systems could hope to do. And it had some solid math guarantees, from theorems and proofs from meager assumptions. That's also relatively good for AI. But, I didn't claim it was AI and instead just claimed that it was useful, progress on the real problem.

So, I've had, in comparison, two successes! Soooooo, if the author of the OP can give opinions, I should be able to also. So, will do that here and now!

First, I would like to see AI research compare with, say, baby animals, mammals, yes, but also some reptiles and maybe even some insects. I'd like to see our AI systems do as well as the baby animals.

Second, I'd like to have our systems, that do as well as baby animals, learn as well and fast as those animals do as they grow up. A good goal might be a kitten at 3 months.

Third, I'd like to have our systems learn English as fast, easily, and well as a human of 3 years old.

So, in summary, for research in AI, I'd like to see work that has some promise of achieving these three goals.

For now, I'll stop here!


While I agree with Prof. Marcus's ideas on the limitations of deep learning, I still don't think neurosymbolic approaches are sufficient for a generalized AI agent. I think a unified cognitive architecture might be necessary, based on neuroscience and psychological observations, similar to the work done during the 70s and early 80s.

Or rather a local minimum.

buduh chssshh


Title appropriate! Am I the only one who immediately thought of self-driving cars? lol

Really a great article. While I have mostly worked with deep learning the last seven years, I very much agree that long term, hybrid systems are probably the way to reach AGI.

Yeah, I've gone from a huge proponent of AI/ML, to someone that sometimes uses clustering techniques. I recently had a client that wanted to estimate geological formation horizons, and five lines of scikit-learn did the trick for them. At the end of the day, AI/ML is great for some things, and not for others, just like how hammers and chisels are good for different things.
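
As a sketch of what such a small clustering job amounts to, here is a plain-Python 1-D k-means on made-up depth readings (the data and the exact use case are invented; in practice this would indeed be a few lines calling scikit-learn's KMeans):

```python
# Lloyd's k-means on 1-D values, e.g. depths clumping around formation horizons.

def kmeans_1d(xs, k, iters=50):
    lo, hi = min(xs), max(xs)
    # Spread initial centroids evenly across the data range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:                      # assign each point to nearest centroid
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]  # recompute means
    return sorted(centroids)

# Fake well-log depths (meters) clumping around three horizons.
depths = [101, 103, 98, 250, 255, 248, 400, 397, 404]
print(kmeans_1d(depths, k=3))
```

The output is three cluster centers near 101, 251, and 400; that's the whole "AI" in a job like this, which is the point: it's a chisel, not a hammer.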

Uhh, you know scikit-learn algos are pretty pretty much all ML, right? As is clustering.

We can always go copy nature's homework.

The neocognitron, the ancestor of deep learning, was inspired on the studies done by Hubel and Wiesel on the visual cortex of a cat.

There are many more ideas yet to be reverse engineered from biological networks.


People in the field know about the political war led by Gary Marcus. He has been writing articles like this for many years now. My own experience with him left me with a bad taste about his depth of knowledge and his ability to generate meaningful insights. I found him pointlessly criticizing deep learning papers without actually understanding them (in one instance, even without actually reading the paper) and then using other people's technical comments to make the case for his agenda. He keeps harping on problems X and Y for deep learning while none of his "symbolic AI" stuff has ever worked anywhere close to anything significant. Fortunately for him, he is a professor, and so others in the field have to entertain him constantly.

>> He keeps harping on problem X and Y for deep learning while none of his “symbolic AI” stuff has ever worked anywhere close to anything significant.

As the most trivial counterpoint to that outrageous statement, the first system to dominate humans at chess was a symbolic system, Deep Blue (using a hard-coded opening book and evaluation function, and minimax with alpha-beta cutoff, and no neural networks whatsoever).


Gish gallop much? Gary wants to be the next Minsky, triggering an AI winter. It won't happen, because current AI is still more useful than it was in its previous era.

Deep learning is at its best when all we need are rough-and-ready results

Yes. That's the classic observation about deep learning - you can get to 90+% right, at the cost of a few percent not even close.

We are very likely going to need to revisit a once-popular idea that Hinton seems devoutly to want to crush: the idea of manipulating symbols—computer-internal encodings, like strings of binary bits, that stand for complex ideas.

Maybe. That's "good old-fashioned AI".

I don't know the answer, but I think I know the right question. What we don't have in AI, via either route, is "common sense". Define common sense as getting through the next 30 seconds of life without screwing up badly. This is a concrete definition you can work with.

Now, most of the mammals exhibit some competence in this area. That's significant. It means that language is not essential to this. Symbol manipulation probably isn't. What is? Don't know.

Closely related is robotic manipulation in unstructured environments. People have been trying to do that for fifty years now, with very limited success. We can't even get solidly reliable bin picking of a wide range of objects, despite Amazon trying. Remember the DARPA humanoid challenge fiasco? Rethink Robotics? The fact that deep learning doesn't help with what's a trivial task for a squirrel indicates we are missing something big.

I have no clue how to address this problem. I've tried some things over the years, but they were all dead ends. We're stuck until someone has an insight that gets us unstuck. Or until someone can reverse engineer a mouse brain. If we can get to low-end mammal performance, there's hope.


> Or until someone can reverse engineer a mouse brain. If we can get to low-end mammal performance, there's hope.

Yep. C. elegans worm, and the drosophila larva and fly are good intermediary goals, too.


Worth mentioning: there were three research groups trying to emulate C. elegans, and they all seem to have stalled [1].

[1]: https://www.lesswrong.com/posts/mHqQxwKuzZS69CXX5/whole-brai...


I knew OpenWorm was struggling, but didn't know there were other unsuccessful projects. If no one can simulate C. elegans, despite all that's known about it, there's a fundamental lack of understanding so great that there's no surprise AI is stuck.

There was an attempt, funded at US$3bn, a decade ago, the Human Brain Project, to try to do something similar for the human brain.[1] That seems to have just turned into a funding vehicle for research in related areas.[2]

[1] https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC3861343/

[2] https://www.humanbrainproject.eu/en/follow-hbp/news/2022/02/...


It's my opinion that deep learning and AI in general will have another leap when we re-think how we build computer chips.

Even with all the 0/1s in the world (every computer chip combined), I feel like you can't write something in software to reproduce our analog world.


There has been work on analog computers, but analog signals create huge design problems of their own, from signal integrity when moving through components to the massive interference issues when putting lots of different signals in close proximity.

Could not find any technical description of the symbolic top performers of the NetHack challenge mentioned in the article. Can anyone find them?


Thanks a lot!

“At least for now, humans and machines complement each other’s strengths.”

Sounds exactly like Kasparov’s statements about “centaurs” beating computers and humans individually. That was a transitional period - no one says that anymore in chess.


This is an incredibly negative piece- at least on the first half.

The premise is- Deep Learning is hitting a wall. That is not true. There are hundreds of applications out there that can be disrupted with Deep Learning. The problem is that very few people are trying those.

Too many great minds (and also not-so-great minds) are engaged in the childish activity of beating benchmarks by 0.03% and getting publications.

Besides that, another problem is big tech. They seem to think they can get "AI" just by burning bigger piles of money. They will keep training million-, billion-, and tens-of-billions-parameter neural nets, and will hope that the scaling fairy will magically bring AI and solve their problems.

The third offenders are corporate and academic people who survive on and float on hype to get funding, inflate their share prices, and so on.

And of course there are shills and self-marketers who love to trend a topic on Twitter by saying idiotic, false things like "we are building a god". I mean, come on!

Because of these four offenders, people are either very optimistic about or very frightened of "AI".

When these promises go out the window, when Teslas crash at the same sites or mistake the moon for a traffic light, or a bunch of pre-trained parameters tells someone to kill themselves, these kinds of hit pieces arrive.

These kinds of articles never revolve around the true promise and potential of Deep Learning, but around what is deliberately made the subject of hype about AI/Deep Learning.

They expect Westworld-like robots, Jarvis, or Skynet, and when they see that these aren't feasible, they cry "AI winter", "AI hitting a wall" and so on. This is so boring and unoriginal to watch.

And this brings real danger to potentially fruitful AI research. Now that everyone airs their opinion through scaled media, AI research might seriously hit a wall, as funds become scarce for realistic projects (when there is hype) or funding is stopped entirely (when people think that an AI winter has started). This AI winter, if there is ever one, will likely have the age of post-truth to blame.

____

This was about the premise. They then proceed to preach neuro-symbolic AI, as an all-out solution, to be used everywhere. This is overly simplistic.

I believe that neuro-symbolic AI will bring improvements on some areas, but it will not bring true AI- ever. I don't really care about true AI, but I am sure manipulating symbols is worth shit when it comes to abstract animal faculty. Or something more complicated than usual.

Hell, some mathematical statements cannot even be proved or disproved. Gödel knew this in the last century. So did Turing. The author disregards the Halting problem and the Incompleteness Theorem and preaches symbol manipulation for AI in the 21st century. That's low.

____

> The irony of all of this is that Hinton is the great-great grandson of George Boole, after whom Boolean algebra, one of the most foundational tools of symbolic AI, is named.

Cool fact, thanks.

____

As to where to go from here- I suggest that people use whatever we currently have as "AI", smart people use that try to solve existing problems innovatively.

Deep Learning is so general. You have a sample space where a pattern can be found, or where one can be found after some transformations or from a different perspective. Use that pattern to solve problems. This is Deep Learning, and this is so general! It can be applied in so many places!

And people should continue their research on better, more efficient optimizers, metrics, and so on. Those make the world better.

For example, a new method for second-order optimization, called Distributed Shampoo, made our training on distributed clusters much better. We could not get the loss to go down consistently, but when we started using it, the problem disappeared.

That paper only has 17 citations on Google Scholar. But it did solve a huge problem.

This kind of research is what people should focus on.

I personally have applied Deep Learning to devise solutions to problems where it wasn't ever applied before. Those solutions are already making people money and solving real problems.

This kind of thought-leader back-and-forth between super-human AI hype and impending AI winter is so tiring and worthless to watch.


>> Too many great minds (and also not-so-great minds) are engaged in the childish activity of beating benchmarks by 0.03% and getting publications.

Maybe those are not such great minds, after all?


You have to feed your family, maintain a vegan diet, and pay rent in the Bay Area.

Great minds do mundane things often.

"Great mind" is too subjective. But, I can safely say that there are too many minds who are engaged in activities much below their level. I know a few personally.


An interesting question would be: would we also need symbolic methods to get ant- or worm-level intelligence?

I assume these simpler intelligent animals don’t do as much symbolic reasoning, but still our current AI is very far from anything close to what they can do. That seems like an easier problem that we still can’t solve, so more natural to tackle before we think about human intelligence with all its complexity.


Are you thinking that if we can do (say) a worm (with DL), then more and bigger DL will get us to a dog, and then to a person? Humans generally seem prone to the "X is good, so more X is better" fallacy - Gary is saying that X (DL) can only get us so far, and we'll need X+Y and likely +Z+... to make real progress.

I actually think if we could have worm-level AI that’d already be pretty amazing in itself, even if the next level requires more than just scaling.

Also, I wonder if sometimes people may be talking past each other in this debate. For example, I’m pretty sure (someone like) Yann LeCun considers worm-level AI already an ambitious and worthy goal per se. So would it be a relevant criticism to remind him (correctly) that human-level AI may require symbols?

(For what it’s worth, I agree with Gary Marcus and disagree with DL maximalists, while also believing what DL has achieved is nothing short of amazing. Just saying people may have different goals but call it AI nevertheless.)


The problem, universally, is that we are pretending software is cheaper than it actually is.

We don't have the staff. We don't have the time. Companies don't have the money.


Just like producing any kind of industrial product, the issue is NIH-syndrome.

Sure you won't be able to compete with the likes of GPT-3, but no one is suggesting you should. The key is to treat software as a commodity and apply it as such. This has been the case for decades (who writes their custom office suite or even OS?) and will expand to ML/AI as well.

We are already at a stage where you don't start from zero and reinvent the wheel every time you need speech or image recognition. There are readily available off-the-shelf solutions, and customisation is not much more involved than, say, customising an ERP tool; it's different expertise that's required, sure, but the effort is comparable.

A couple of years from now ML tooling and infrastructure will have caught up to ERP, CAD/CAM, and spreadsheet software - just another tool that can be brought in and provide immediate benefit without scores of consultants, developers and research.


The 80/20 rule lives on; you deliver 80% of the value at 20% of the cost.

No.

This will be cliché to say at this point, but deep learning does not become successful only once it clearly and thoroughly outperforms a human. It is already successful in its current applications. The problem is that it's not clear-cut what job it will perform in the future.

For example, in the past, we thought that robots would steal our jobs by taking over whatever tasks we were doing. News clips were often followed by footage of a car factory with robotic arms swinging 360 degrees. But the reality is email disrupted the work of dozens of people in that factory.

Successful deep learning is not just a Tesla driving itself at Level 5. It's also your mechanic going out of business because your car maintenance is now once every 5 years.


How come the car goes for maintenance every 5 years due to ML?

Deep learning for self driving cars is insane. You might need it for stop signs, but not hitting people should be done with lidar, sonar, radar, and stereo depth mapping.

A computer does not need to recognize an object to know that it shouldn't hit it if it can safely stop.

It should have front and rear lidars, and just stop if anything is ahead no matter what it is, unless something is about to hit you from behind or the side.

If we rewrote the laws specifically to account for computers, a car with perfect reaction time might even be able to keep things from getting too close behind it: a giant LED warning sign if you tailgate, and a gradual speed reduction if you don't back off.

You might have to region-lock that feature to prevent people getting shot, but eventually people might learn that you can't reason with or intimidate the computer with any amount of honking, and the computer should be able to enforce a situation where nobody dies, using normal deterministic code.

If it gets navigation wrong or misses stop signs or drives at the wrong speed, it doesn't matter; people do that too. All it has to do is send fewer people to the morgue than human drivers, not actually be a "good" driver by human standards.
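The "stop for anything ahead" rule described above is plain deterministic code; a toy sketch might look like this (the deceleration rate, safety margin, and single-range sensor interface are all made-up assumptions, not real automotive parameters):

```python
# Toy sketch of a purely deterministic braking rule: react to any
# obstacle ahead, regardless of what it is. No classification needed.
# All thresholds here are illustrative assumptions only.

def braking_distance(speed_mps, decel_mps2=6.0):
    """Distance needed to stop from speed_mps at constant deceleration."""
    return speed_mps ** 2 / (2 * decel_mps2)

def command(speed_mps, nearest_obstacle_m, margin_m=5.0):
    """Return 'brake', 'slow', or 'cruise' from one lidar range reading."""
    needed = braking_distance(speed_mps) + margin_m
    if nearest_obstacle_m <= needed:
        return "brake"   # anything inside stopping distance: stop, no questions
    if nearest_obstacle_m <= 2 * needed:
        return "slow"    # gradual speed reduction zone
    return "cruise"
```

A real system would of course fuse several sensors and account for traffic behind, as the comment notes; this only illustrates that the core rule can be ordinary, inspectable logic.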


The general idea of not caring about the class of objects you want to avoid is ok, but if you drive in rain and snow with a lidar (or even radar), you'll see why it does not work. Way too many false positives. Then you start classifying again.

At that point you need multiple sensors and stereo depth (which usually still involves learning, but it seems to be pretty good these days).

If a person can see it clearly with headlights, LIDAR and radar should be able to, with good enough tech.

You might get a few false positives, but as long as they're mostly far enough away that they don't force sudden stops, that's fine. The system should prioritize safety at all costs, the way the airline industry does, rather than accept danger for a bit more speed.


Right, multiple modalities help a lot here.

With radar you get pretty close and pretty consistent false positives (that is the reason Tesla dropped it), and with lidar, instead of one point cloud for the object, snowflakes turn it into two seemingly random ones (the flake returns and the remaining object returns). We're not there (yet?) in terms of properly lifting from measurements to object models. You can track over long horizons and survive some clutter, but then emergency braking is hard, because you lack the history of that object.
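One non-learning mitigation for snow clutter is a neighbour-count (statistical outlier) filter on the point cloud: snowflake returns tend to be isolated points, while returns from a real object cluster together. A rough sketch in 2D, with invented radius and count thresholds:

```python
# Crude outlier filter for a 2D point cloud: keep only points with
# enough neighbours within a radius. Isolated snowflake-like returns
# get dropped; dense object returns survive. Thresholds are
# illustrative only; real filters work in 3D and are tuned per sensor.
import math

def filter_isolated(points, radius=0.5, min_neighbors=3):
    kept = []
    for i, (x, y) in enumerate(points):
        n = sum(
            1 for j, (px, py) in enumerate(points)
            if i != j and math.hypot(px - x, py - y) <= radius
        )
        if n >= min_neighbors:
            kept.append((x, y))
    return kept
```

The brute-force pairwise scan is O(n²); production implementations use spatial indexes (k-d trees, voxel grids) for the neighbour query, but the idea is the same.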


Despite how much I really want them to happen, and how much of my future I planned based on them already existing by now... I'm not sure we have any business making them until we can do so without AI, or at least without object detection; unless, that is, object detection gets a whole lot better and can be trained to recognize "tall thing that is not the road" reliably.

Basically this article says what many insiders already know. It is not AI. It is machine learning. It can make repetitive work go away but it won't become sentient and build more of itself.

It can still create major chaos, though, by creating more poverty and drastically reducing low-paying jobs.


It's pretty lonely to be a bull on machine learning right now. It's weird because the successes in ML just keep on rolling in.

>> To win, you need a reasonably deep understanding of the entities in the game, and their abstract relationships to one another. Ultimately, players need to reason about what they can and cannot do in a complex world.

On this, I'm not with Gary Marcus. I think Nethack will probably fall to deep learning at some point, just like other games that everyone thought "require reasoning", like Go, most notably [1]. Perhaps some combination of a classical search with a deep neural net trained with self-play to guide the search will do it. Perhaps some other approach suffering from data and compute gigantism will do it.

In any case, what I've learned in the last few years is that there isn't any single problem that deep learning approaches can't solve just by training on tragicomic amounts of data, even if it's so much data that only the likes of Google and Facebook can do the actual training. Assuming that a problem can _only_ be solved by reasoning is setting yourself up for a nasty surprise.

After all, PAC-Learning, what we have by way of a theory in machine learning, does not assume any ability to reason. PAC-Learning assumes instead that a concept is a set of instances, and that a learner has learned a concept when it can label an instance as a member of the concept, or not, with arbitrarily high probability of arbitrarily low error. In that sense, a system that can only memorise instances can still achieve arbitrarily low error simply by memorising sufficiently many instances. No reasoning needed, whatsoever.

Indeed, this is precisely why deep neural nets need to be trained with so much data. Because they are simply trying to memorise enough instances of a concept to minimise their error. So, given enough data, deep neural nets can beat any benchmark. They'll eventually beat Nethack.
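To make the memorisation point concrete, here is a toy learner that does nothing but store labelled instances: on a finite instance space its error falls as the training sample grows, with no reasoning anywhere (the target concept and the numbers are invented purely for illustration):

```python
# A learner that only memorises: it stores (instance, label) pairs and
# guesses a default label for anything unseen. On a finite instance
# space its error shrinks as the training sample grows, with no
# reasoning involved. The target concept ("divisible by 3") is an
# arbitrary illustration.
import random

def concept(x):
    return x % 3 == 0

def train(sample):
    return {x: concept(x) for x in sample}

def predict(memory, x, default=False):
    return memory.get(x, default)

def error(memory, universe):
    wrong = sum(predict(memory, x) != concept(x) for x in universe)
    return wrong / len(universe)

universe = range(1000)
random.seed(0)
small = train(random.sample(universe, 100))   # memorises 10% of the space
large = train(random.sample(universe, 900))   # memorises 90% of the space
# Larger memory, lower error: purely a matter of covering more instances.
```

The larger memory achieves lower error than the smaller one, and memorising the whole space drives the error to zero, which is all PAC-style error minimisation asks for.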

And we'll still not have learned anything useful, and certainly not beaten a path towards AGI. Machine learning is stuck in a rut where it advances from one little, over-specialised benchmark to the next. We won't make any progress just by coming up with new benchmarks.

_______________

[1] Chess had already fallen to symbolic approaches: a book of opening moves and minimax with alpha-beta cutoff; that was Deep Blue, the system that beat Garry Kasparov, and that, despite its name, was not a deep learning system but a Good Old-Fashioned AI, symbolic system.
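For reference, the minimax-with-alpha-beta-cutoff search behind Deep Blue can be sketched on a toy game tree (the tree values here are just illustrative data, not chess positions):

```python
# Minimax with alpha-beta cutoff over a toy game tree. Leaves are
# static-evaluation scores from the maximising player's point of view;
# internal nodes are lists of child subtrees. Purely symbolic search,
# no learning anywhere.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):  # leaf: return its static score
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:           # beta cutoff: opponent avoids this line
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:               # alpha cutoff
            break
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # minimax value of this tree is 3
```

Deep Blue added massive domain-specific engineering on top (a hand-tuned evaluation function, opening books, endgame tables, custom hardware), but the backbone is this kind of exhaustive symbolic search.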


>> Indeed, this is precisely why deep neural nets need to be trained with so much data. Because they are simply trying to memorise enough instances of a concept to minimise their error.

Uh, the degree to which this is true is hotly contested and an active area of research. Some architectures appear to generalize within domains. You can't conclude this from the assumptions made in the PAC-Learnability proof.


Sorry, my fault: I don't mean that PAC-Learnability means that neural nets memorise their training instances. That's more my interpretation of their observed behaviour, if you like. What I meant was that PAC-Learnability doesn't assume any ability like reasoning, and really no other ability than er, PAC-Learnability.

There's a debate, of course. I like to point to Domingos' paper:

Every Model Learned by Gradient Descent Is Approximately a Kernel Machine

https://arxiv.org/abs/2012.00152

With the full understanding that it's just one paper.


