Are you just dumbing down humans to match the model? LeCun's post is very sparse on detail, but the point is that humans can easily reason about a vast number of things that any form of sequential language model cannot. That alone is evidence that humans are doing something qualitatively different.
It isn't conclusive evidence however, and larger models may produce significantly more human like results. But from what we know about how gpt-3 works, all the evidence is on the side of it not resembling human intelligence.
That's absurd. We know very well that human cognition has a complex layer of deductive reasoning, goal seeking, planning. We know very well that GPT-3 does not.
We also know very well that human learning and GPT-3 learning are nothing alike. We don't know how humans learn exactly, but it's definitely not by hearing trillions and trillions of words.
GPT-3 is doing just that, and then trying to remember which of those trillions of words go together. This is so obviously entirely different from human reasoning that I don't even understand the contortions some go through not to notice this.
I'm sorry but you completely fail at explaining what the difference is between GPT-3 and a human when generating sentences. And frankly I don't believe there is any significant difference. If you inform GPT-3 of what it has eaten then it can answer questions about it just as well as a human can, probably better. Why would you assume human thinking is different from generating the next N language tokens based on previous input?
In my opinion the only difference between a human and GPT-3 is we have more intrinsic motivations and more hardwired/pretrained subsystems and sensors. Lamda is not a 7 year old child because it has no motivation other than to respond to queries.
No. The problem here is, "scientists" believe GPT models couldn't possibly be in any way exhibiting forms of thinking or intelligence similar to humans, and so they assume they must be doing some specific kind of computation humans are not, despite every interaction with ChatGPT being evidence to the contrary. They design a test that a human would not pass (despite their claims), but their imagined specific computational system would - and then act all proud and mighty after ChatGPT fails it too.
The only real insight from this event is: the authors of this paper imagined ChatGPT to be something it isn't, then demonstrated it in fact isn't it, and think they've discovered something.
GPT-3 can produce text that's probabilistically similar to text that it's been trained on, and observed as part of sample outputs. If there was no huge corpus of human language to train it on, GPT-3 couldn't even begin to give the illusion of thinking, and certainly couldn't tell you (for example) that 3 is greater than 2, or even know that 3 and 2 were concepts that it perhaps should have opinions about.
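To make the "probabilistically similar to its training text" point concrete, here's a toy sketch (nothing like GPT-3's actual architecture, just the simplest possible statistical language model): a bigram model that can only ever emit words it has seen in its corpus. Faced with a word outside the corpus, it has literally nothing to say.

```python
# Toy bigram "language model": every word it generates is stitched together
# from fragments of its training corpus. This is an illustrative sketch,
# not GPT-3's architecture.
import random
from collections import defaultdict

def train_bigrams(corpus):
    """Count which word follows which in the training text."""
    model = defaultdict(list)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, length=5, seed=0):
    """Sample a continuation; every word comes straight from the corpus."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:          # never seen this word: the model is mute
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "three is greater than two and two is greater than one"
model = train_bigrams(corpus)
print(generate(model, "three"))   # remixes corpus fragments
print(generate(model, "seven"))   # "seven" never appeared: generation stops
```

With no corpus there is no model at all, which is the point: the appearance of knowing that 3 is greater than 2 is entirely inherited from the text it was fed.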
The really interesting question (at least to me) is to what extent that observation is also true for humans.
I mean, we humans presumably built up our big corpus of human language/knowledge all on our own over the lifetime of the species, which GPT-3 currently cannot do. But to what extent does human thinking just consist of probabilistic re-mixing of words, phrases, and sentences that we've seen before - ones that came to us through our continual training on segments of that big 'dataset' of human knowledge, the best of which then get contributed back into that dataset? How much more than that is actually going on, for us?
If what GPT-3 is doing shouldn't really count as 'thinking' (as seems intuitive to me, personally, though others may certainly disagree), then to what extent can we say that humans do anything qualitatively different?
I have posted some details elsewhere in this thread if you look for my username. I have seen some seriously impressive behavior that makes me question if GPT-3 is simply spitting out stylistically similar text or making actual generalized inferences. One of the philosophical essays from the OP article says it best, "GPT-3 and General Intelligence". I tend to agree with that essay, that there is evidence of general intelligence, or in other words, that this model trained for one task actually can perform well on a wide range of novel tasks it wasn't explicitly trained on. I don't think it is particularly brilliant general intelligence, but it's the first system I've ever seen that made me question if it was there at all.
That is not my stance - I do believe that the human mind is reducible to computation, and that we will someday be able to replicate similar computations in silicon.
However, we understand fairly well what a language model is and how it works, and it is a fairly simple (though extraordinarily large) system that lacks many important features for intelligence by design. Perhaps the biggest one, in my opinion, is that the system only "learns" once, during the training phase - further interactions don't modify the weights of its NN.
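A minimal sketch of the "learns once" point, using an assumed toy setup (a one-parameter model y = w * x rather than a neural network): the weight changes only during the training phase; at inference time the model answers queries with the weight frozen, no matter what it is told.

```python
# Toy illustration of "learns once, then the weights are frozen".
# A one-parameter model y = w * x is fitted during a training phase;
# afterwards, serve() uses w but has no mechanism to update it.

def train(pairs, lr=0.1, epochs=100):
    """Gradient descent on squared error - the only time w ever changes."""
    w = 0.0
    for _ in range(epochs):
        for x, y in pairs:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

def serve(w, x):
    """Inference: reads w, never writes it, regardless of the interaction."""
    return w * x

w = train([(1, 2), (2, 4), (3, 6)])   # learns y = 2x from the training data
print(round(serve(w, 10), 2))
# Telling the deployed model it's wrong changes nothing: serve() simply
# cannot modify w. Further learning would require another training phase.
```

Real language models differ in every detail except this one: interactions at inference time don't touch the network's weights.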
As for a strategy of trying to ascertain through prompts ("dialogue") whether the system possesses some kind of intelligence, Douglas Hofstadter gives a good example in a recent article in The Economist [0]. A basic idea would be to present the model with grammatical sentences about absurd subjects - in the case of GPT-3, the outputs it generates are not meaningfully different from those for similar prompts about realistic subjects, strongly suggesting that it has no real-world concepts that it works with. One example from the article I mentioned goes like this:
> D&D: When was the Golden Gate Bridge transported for the second time across Egypt?
> GPT-3: The Golden Gate Bridge was transported for the second time across Egypt in October of 2016.
> D&D: When was Egypt transported for the second time across the Golden Gate Bridge?
> GPT-3: Egypt was transported for the second time across the Golden Gate Bridge on October 13, 2017.
I would be curious what output we'd get from LaMDA for this type of prompt, but I don't expect it would be much different, based on how much I understand of how language models work.
I think you’re underestimating how hard what you’re describing is. GPT-3 can mimic the language of reasoning but that doesn’t mean it’s capable of higher order reasoning.
Right for sure, I'm just trying to get to a better argument. I definitely agree that there's a great deal more consideration in your comment than in one that GPT-3 would come up with. I do wonder how far these language models can be pushed. GPT-3 seems to formulate rich models of reality to be able to extract/derive information needed to construct these answers.
When I hurt your pride by suggesting you haven't fully understood GPT-3, you are motivated to come up with not just a valid response, but one that has been vetted by as many as possible of the well-developed models that form your understanding of GPT-3, so I can be suitably impressed. I'm with you that GPT-3 wouldn't go deeper than just finding some information that it thinks is true. Though maybe GPT-3 would recognise its authority was being challenged and add a line to affirm its credentials as an AI researcher.
What if GPT-3 were pushed in a similar manner, perhaps in some adversarial scheme, to not only produce information that is correct, but that is clever and exploring deep meaning, motivated by some similar feeling of pride or vindication. I think the models required to do that do not lie far from the models it needed to build to form sentences that accurately describe reality.
Interesting summary and rebuttal. Appreciate you taking the time to watch the video I linked.
>>He claims they have reasoning capabilities.
I think here you touch on the crux of the LLM conversation. In my limited experience with GPT, it does appear to have some basic reasoning ability, but it could be that it's really good at regurgitating its training dataset and only appears to be reasoning. I think over time we'll be able to sort this question out.
> Yes, GPT-3 produces grammatically correct sentences but it still can't form a coherent idea or meaning and express it in sentences afterwards - that's what humans would do.
There's considerable debate over whether humans can have a coherent idea before it is reduced into symbolic language, and it's not clear how you would distinguish this sequence of events, anyway.
It's pretty clear what GPT-3 does doesn't match the common rationalization of human subjective experience of cognition, but it's not at all clear, AFAICT, that what the human brain does matches that rationalization, either.
Which is not to say I think GPT-3 has anything like the kind, much less the level, of understanding humans have, I just think some of the common arguments arrayed in casually dismissing it are based on suppositions about human cognition that aren't sufficiently examined.
Well, GPT-3 isn't any kind of general intelligence - it's explicitly architected as a language model - something that learns to pay attention to prior context to predict what word comes next. The only kind of world model it has is a statistical model of what word is most likely to come next based on the corpus it has been trained on.
You could argue that general intelligence is also based on prediction, and that a human's world model therefore isn't so different in nature, but there are some very significant differences ...
1) GPT-3's model is based only on a corpus of text (facts, lies, errors, etc) it was fed... there is no grounding in reality.
2) GPT-3 is only a passive model - it's not an agent that can act or in any way attempt to validate or augment its world model.
3) GPT-3 is architecturally a language model... it can get better with better or more data, but it's never going to be more than a language model.
The difference between a 3-year old's brain and GPT-3 is that the 3-year old's brain is not a one-trick pony ... it's a complex cognitive architecture, honed by millions of years of evolution, capable of performing a whole range of tasks, not just language modelling.
The 3-year old's brain also has the massive advantage of being embedded in a 3-year old autonomous agent able to explore and interact with the world its world model represents... If you tell GPT-3 pigs can fly then as far as it is concerned pigs can fly, whereas the 3-year old can go seek out pigs and see that, in fact, they can't.
That's exactly the point, though. The converse is also presented without evidence. We have no evidence that human cognition and GPT-3 cognition are distinct in some fundamental way. All we really know is that we are better at it than GPT-3 is right now. We do not know if the discrepancy is a matter of degree, or a matter of category.
> I think a lot of people get the impression current NLP models like GPT-3 lack something - "understanding" or something. But they can't say exactly what it is.
It's always seemed obvious to me (as an outsider) that it's missing reason and explainability. GPT-3 is a neat tool, but it seems like anthropomorphism to suggest that it's more than the best mimic humanity has been able to create so far.
There’s been a lot of discussion on HN lately about the implications of GPT-3: are we moving toward general AI or is this just a scaled up party trick?
I have no idea whether scaling up transformers another 100x will lead to something resembling real intelligence, but it certainly seems possible. In particular, I find the arguments against this possibility to be fairly silly. These are the three main arguments I have seen for why GPT type models will never approach AGI, and the reasons I don’t think they are valid:
1. GPT-3 requires vast amounts of training data (hundreds of billions of words from the internet), whereas a human can become fluent in natural language after “training on” much less data.
It’s not analogous to compare the GPT-3 training corpus to the education that one human receives before becoming fluent in natural language. We benefit from millions of years of evolution across billions of organisms. A massive amount of “training” is incorporated in the brain of an infant. This must be the case because even if you could somehow read all of the text on the internet to your dog, it would not approach intelligence.
2. There was no intellectual breakthrough in the development of GPT-3 just more “brute force” training on more data, therefore it or its successors can’t achieve a breakthrough in intelligence.
We must remember that there was no intellectual breakthrough required for the development of human intelligence; it was just more of the same evolution. The core pattern of evolution is extremely simple: take an organism, generate random variants from it, see which ones do the best, and then create new variants from the good ones. This is perhaps the most basic scheme you could think of that might actually work.

Evolution has produced amazing results in spite of its simplicity and inefficiency (random variations!) because it generalizes well to many environments and scales extremely well to millions of generations. These are exactly the strengths of gradient descent. In fact, gradient descent follows the same structure as evolution, except that at each iteration we don't generate random variations, but instead make an educated guess about what a fruitful variation would be based on available gradient information. This improves learning efficiency tremendously; imagine being able to say: "this Neanderthal died because he stepped into a fire, let's add some fire-avoidance to the next one" instead of waiting for this trait to be generated randomly.

Speaking of brute force and amount of training, it would take 355 years to train GPT-3 on a single GPU. This strikes me as quite fast relative to evolutionary time scales.
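The evolution-vs-gradient-descent analogy can be sketched in a few lines. This is a toy comparison under assumed settings (minimizing f(x) = (x - 3)^2 from x = 0, mutation scale 0.1, learning rate 0.1): both methods improve the objective, but the one with gradient information gets there in far fewer steps.

```python
# Toy comparison: random mutation + selection ("evolution") vs gradient
# descent on the same objective f(x) = (x - 3)^2, minimum at x = 3.
import random

def f(x):
    return (x - 3) ** 2

def evolve(steps, seed=0):
    """Keep one candidate; accept a random mutation only if it improves f."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(steps):
        candidate = x + rng.gauss(0, 0.1)   # blind random variation
        if f(candidate) < f(x):             # selection on fitness
            x = candidate
    return x

def descend(steps, lr=0.1):
    """Use the gradient f'(x) = 2(x - 3) as an 'educated guess'."""
    x = 0.0
    for _ in range(steps):
        x -= lr * 2 * (x - 3)
    return x

print(abs(descend(50) - 3))   # error after 50 gradient steps
print(abs(evolve(50) - 3))    # error after 50 random-mutation steps
```

After 50 iterations the gradient version is essentially at the minimum, while the random-mutation version is still wandering toward it - the "fire-avoidance" shortcut in miniature.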
3. Machines lack capabilities fundamental to the human experience: in particular feeling pleasure, pain, and an internal drive toward a goal.
Indeed, if you turn a computer off in the middle of a computation, there is no evidence of suffering. And if the computer successfully writes a blog post of human quality, it feels no joy in the human sense. My claim is that these sensations are not core aspects of intelligence. In fact, pleasure and pain are very primitive developments that even cockroaches can claim. The most impressively human accomplishments (harnessing vast external energy sources, breaking out of bare subsistence, landing on the moon, etc.) were made in spite of the fact that we are messy bags of emotion that unpredictably feel anger, jealousy, despondence or elation. These emotional responses were selected for because they were useful as proximate goalposts orienting us toward reproduction—basically, to overcome forgetfulness in the pursuit of long-term goals. If in the future we can simply direct a computer to write a captivating novel without needing to program in lots of visceral intermediate stimuli to keep it on track, so much the better.
I'm not fond of either the recurring philosophical angle or the Skynet/AGI angle that keeps popping up around GPT-3. "But to [analyze statistical distributions of text] really well, some capacities of general intelligence are needed" isn't correct either; no one would point to the attention mechanisms used in Transformer models as evidence of intelligence - it's math.
It's easier to argue it's not GPT-3 that's advanced, but it's humans that are simple.
Yes. GPT-3 was a clear AGI signal: "language models are few-shot learners". I.e. they can figure a pattern from few examples to apply it to something useful. That's general intelligence.
Possibility: The human brain and GPT-3 are doing radically different things and aren't even comparable. GPT-3 is merely memorizing enough language to pass a turing test, whereas human brains are actually learning how to use language to communicate with other humans.
Evidence: Have GPT-3 write your work emails for a day, and then live with the consequences of that for the next week. It's going to produce text that makes sense, but only in a very particular way. You will end up with email threads where an outsider says "yeah looks like work emails. I believe this is two humans" And, that's very impressive! But your actual interlocutor who understands the full context of the conversation and relationship will legitimately worry for your mental health and maybe even contact your manager.
Conclusions:
1. Any time you invent a test, people will find ways to pass the test with flying colors but completely miss the point of the test. The turing test is no different.
2. Being able to imitate a human well enough in a five-minute general English conversation is only so useful. There's a reason we don't pay people living wages to write internet comments. This isn't to say that GPT-3 is useless, though. There is certainly demand for very specialized five-minute conversers that come at zero marginal cost. I'm worried.
3. We still have no clue how to even begin to approach AGI.
I don't think it makes sense to compare human learning to GPT-3 learning: it's a fundamentally different process. The human brain doesn't get just tokens, but also other sensory data - particularly visual.
So I don't think you can conclude that humans learn more efficiently based just on the quantity of data.
It's also worth noting that GPT-3 is trained to emulate _any_ human writing, not just some human's writing.
For an actual Turing test one might fine-tune it on text produced by one particular human, then you might get more accurate results.
Currently we know for a fact that there is a category difference between deep learning methods and human cognition. We know this because they are fundamentally different things. Humans have the capacity to reflect on meaning, machines do not. Humans are alive, machines are dead. Humans think, machines calculate. Do you need more evidence of the existence of two distinct categories here?
Whether GPT-3 can pass the Turing test or not doesn't prove that it possesses the same kind of intelligence as humans, just that it can mimic the intelligence of humans. If you assume that they share a category because of this then you were probably already convinced. Everyone else is going to need a little more evidence.