Do any mainstream AGI researchers believe that GPT-style ML methods will plausibly get us to AGI? I have only a very shallow understanding of the state of the art, but from the outside playing with the tools it seems much more likely that we’ll get to a local maximum far below AGI and require a different approach altogether. I’d love to read some discussion on the topic if anyone has good non-hype/biased fanboy links to share.
I am a PhD student working in the learning and autonomy space, and every researcher I know thinks Gary Marcus is a joke. I'm not saying he doesn't know things; I'm saying that machine learning at scale is not his area of expertise, although he pretends it is. Period. He passes off very generic, obvious statements about the future without any details, and when someone does something in that direction he claims 'I told you so, you should have listened to me in the past!' Look at the entire chain of discussion between Gary Marcus and Yann LeCun in this thread and you'll get a sense of what I am talking about: https://twitter.com/ylecun/status/1523305857420824576
Gary Marcus is an academic grifter and to me he is no different than crypto bros who grift non-experts.
As a professional programmer and a relatively optimistic AGI enthusiast, why would the current ML methods not work, given sufficient CPU/RAM/GPU/latency/bandwidth/storage?
In theory, as long as you can translate your inputs and outputs to arrays of floats, a large enough neural network can approximate any function you care about. The required number of neurons might not fit in the world's best RAM, however, and the required weights and biases for those neurons might not be computable quickly enough on today's CPUs/GPUs.
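To make the "enough neurons can compute anything" point concrete, here is a minimal toy sketch of my own (not from any paper): a one-hidden-layer ReLU network whose weights are written down by hand, with no training, that interpolates sin(x). The target function and the 50 knots are arbitrary choices for illustration; more knots means more neurons and a smaller error.

    import numpy as np

    # One hidden layer of ReLU units can reproduce any piecewise-linear
    # interpolant, so with enough units it approximates any continuous
    # function on an interval. The weights here are constructed, not trained.
    f = np.sin                             # arbitrary target function
    knots = np.linspace(0, np.pi, 50)      # more knots -> more neurons -> smaller error
    values = f(knots)

    slopes = np.diff(values) / np.diff(knots)   # slope of each linear segment
    deltas = np.diff(slopes, prepend=0.0)       # change of slope at each knot

    def approx(x):
        # hidden layer: one ReLU per knot; output layer: weighted sum plus bias
        hidden = np.maximum(0.0, x[:, None] - knots[:-1][None, :])
        return values[0] + hidden @ deltas

    xs = np.linspace(0, np.pi, 1000)
    print("max error with 49 ReLU units:", np.abs(approx(xs) - f(xs)).max())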
Yes but that doesn’t mean you won’t need new architectures or training methods to get there, or data that doesn’t currently exist. We also don’t know how many neurons / layers we’d need, etc.
The brain itself is vastly more complex than artificial neural networks. Maybe we don't need all of what nature does to get there, but we are so many orders of magnitude off it's ridiculous. People talk about the number of neurons in the brain as if there's a 1:1 mapping with an ANN. Real neurons have chemical and physical properties, along with other things probably not yet discovered going on.
This is an interesting comment. I often hear the claim that "all we need is 86 billion neurons and we will have parity with the human brain", and I feel it is dubious to think this way, because there is no reason why simply matching that number must work.
I also think it is a bit strange to use the human brain as an analogy, because biological neurons supposedly act as booleans and work in groups to achieve float-level behavior. For example, I can have neurologic pain in my fingers that isn't on/off but rather varies in magnitude.
I think we should move away from the biology comparisons and just seek to understand if "more neurons = more better" is true, and if it is, how do we shove more into RAM and handle the exploding compute complexity.
The current AI approach is like a pure function in programming: no side effects, and given the same input you always get the same output. The "usage" and "training" steps are separate. There is no episodic memory, and in particular there is no short-term memory.
Biological networks that result in conscious “minds” have a ton of loops and are constantly learning. You can essentially cut yourself off from the outside world in something like a sensory deprivation bath and your mind will continue to operate, talking to itself.
No current popular and successful AI/ML approach can do anything like this.
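As a rough sketch of that contrast (with a hypothetical `generate` function standing in for any frozen model; none of this is a real API): the model call itself is a pure function, and anything resembling short-term memory has to be bolted on as an outer loop that feeds its own outputs back into its inputs.

    def generate(prompt: str) -> str:
        # Hypothetical stand-in for a frozen model: a pure function of its input.
        # Same prompt in, same output out; nothing is remembered between calls.
        return f"<model output for {prompt!r}>"

    class StatefulAgent:
        # The piece the comment says is missing: a loop that keeps running and
        # folds its past outputs back into its future inputs.
        def __init__(self):
            self.memory = []

        def step(self, observation: str) -> str:
            prompt = "\n".join(self.memory[-10:] + [observation])  # crude short-term memory
            thought = generate(prompt)
            self.memory.append(thought)   # the weights never change, but the state does
            return thought

    agent = StatefulAgent()
    for observation in ["", "", ""]:      # even with no external input...
        print(agent.step(observation))    # ...the loop keeps "talking to itself"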
Agreed, but I also wonder if this is a "necessary" requirement. A robot, perhaps pretrained in a highly accurate 3D physics simulation, which has an understanding of how it can move itself and other objects in the world, and of how to accomplish text-defined tasks, is already extremely useful and much more general than an image classification system. It is so general, in fact, that it would begin reliably replacing jobs.
Ok, so now we just have to define "AGI" then. A robot which knows its physical capabilities, which can see the world around it through a view frustum and identify objects by position, velocity, and rotation, which understands the passage of time and can predict future positions, and which can take text input and translate it into the list of steps it needs to execute, i.e. a robot functionally equivalent to an Amazon warehouse employee, and we are saying that is not AGI?
An Amazon warehouse worker isn’t a human, an Amazon warehouse worker is a human engaged in an activity that utilises a tiny portion of what that human is capable of.
A Roomba is not AGI because it can do what a cleaner does.
“Artificial general intelligence (AGI) is the ability of an intelligent agent to understand or learn any intellectual task that a human being can.”
I think the key word in that quote is "any" intellectual task. I don't think we are far from solving all of the mobility and vision-related tasks.
I am more concerned though if the definition includes things like philosophy and emotion. These things can be quantified, like for example with AI that plays poker and can calculate the aggressiveness (range of potential hands) of the humans at the table rather than just the pure isolated strength of their hand. But it seems like a very hard thing to generally quantify, and as a result a hard thing to measure and program for.
It sounds like different people will just have different definitions of AGI, which is different from "can this thing do the task i need it to do (for profit, for fun, etc)"
I think you're on to something very practical here.
ChatGPT allows for conversation that is pretty remarkable today. It hasn't learned the way we humans have, so what?
I think a few more iterations may lead to something very, very useful to us humans. Most humans may just as well say ChatGPT version X is Artificial, and Generally Intelligent.
One big gap is causal learning. A true general intelligence will have to learn how to intervene in the real world to cause desired outcomes in novel scenarios. Most current ML models capture only statistical knowledge: they can tell you which interventions have been associated with desired outcomes in the past. In some situations, replaying these associations looks like genuine causal knowledge, but in novel scenarios it falls short. Even in current models designed to make causal inferences, say for autonomous driving, the causal structure is more likely to have been built into the model by humans than inferred from observations.
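A toy simulation of that gap (my own made-up numbers, purely illustrative): a hidden confounder Z drives both the "intervention" X and the outcome Y, so the association a model would learn from logged data overstates what actually intervening on X achieves.

    import random
    random.seed(0)

    # Structural model: Z -> X, Z -> Y, plus a weak genuine effect X -> Y.
    def sample(intervene_x=None):
        z = random.random() < 0.5
        x = intervene_x if intervene_x is not None else (random.random() < (0.9 if z else 0.1))
        y = (random.random() < (0.7 if z else 0.2)) or (x and random.random() < 0.1)
        return x, y

    def p_y_given_x1(samples):
        kept = [y for x, y in samples if x]
        return sum(kept) / len(kept)

    observational = [sample() for _ in range(100_000)]
    interventional = [sample(intervene_x=True) for _ in range(100_000)]

    # The observed association (~0.69) overstates the effect of actually
    # setting X=1 (~0.50), because X=1 mostly happens when Z=1 and Z drives Y.
    print("P(Y=1 | X=1)     ~", round(p_y_given_x1(observational), 2))
    print("P(Y=1 | do(X=1)) ~", round(p_y_given_x1(interventional), 2))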
Well, I would turn the question around: why would it? When you understand how these things work, does it sound anything like what humans do? When prompted with a question, we do not respond by predicting the words that come next based on a gigantic corpus of pre-trained text. As a professional programmer, do you think human intelligence works like a Turing machine?
The first interesting thing I'm told is that real biological neurons operate as booleans, whereas in computer land it is preferable to use float neurons. I suppose that you could chain biological neurons together in groups to achieve float-like behavior.
So that's just one small example that we don't need AGI to be a model of the real human brain, with synapses and blood-brain barriers and everything. Rather, we just need one system to do n number of tasks at roughly the same level as a human for it to be "general". Maybe it's not AGI, but it also is not a hardcoded robotic arm that can only work with square objects of a certain dimension.
If you had a robot that was pretrained in a virtual world, assembled in the real world, and then it begins testing and observing and resolving its own physical capabilities (moving arms and legs to stand and jump and backflip)... and then it also had a vision system to scan for threats and objectives... and then it also could resolve text and voice prompts to learn its next objective ("go get my favorite beer can from the fridge")... and the robot knows to ask you more questions to learn what your favorite beer is and also it knows how to preserve its own life in case the dog attacks it or the fridge topples over on it... then I think you have an extremely useful tool that will change the world, regardless of if it is labeled as AGI or not.
There's a 2018 book of interviews of many well-known researchers where they're asked about future prospects: http://book.mfordfuture.com/ (list of interviewees on that page). The actual interview dates weren't specified but don't seem to be earlier than 2017, in my reading. Almost all of them greatly underestimated progress up to now, or refused to say much. (I'm hedging a little bit because it's a year since I read it, and memory is fuzzy.)
Shane Legg of DeepMind wrote a blog post at the opening of the 2010s where he stuck his neck out to predict AGI with a time distribution peaking around 2030. He thought the major development would be in reinforcement learning, rather than the self-supervised GPT stuff.
We still seem to be missing an equivalent of explicit memory formation: serializing digested perceptions into working, then short-term, then long-term memory. The however-many-thousand tokens in a GPT's buffer can cover a much larger span of time than the second's worth of sense impressions your brain can hold without consciousness[1] and memory getting involved, but the principle seems to be the same.
This isn't to say that there wouldn't be some simple hack to allow memory formation in chat agents, just that there's at least one advance we need besides simple scale.
[1] As in not subliminal, not anything to do with philosophical notions of qualia.
I haven't seen a good write-up on this, but it appears that large Transformer architectures directly learn a memory representation in their parameters. ChatGPT can easily return facts, and approximate summaries of facts. Unfortunately, ChatGPT is unable to differentiate what it knows from what it thinks it knows. I'd love to see a deep dive into how these models represent such knowledge.
How can we more effectively show a robot that circular objects with a certain position, scale, rotation, mass, color/pattern, or an arbitrary label ("that thing over there") are what we want? Seems like a fun problem to solve.
I don't think anyone serious thinks or talks in terms of AGI. The feverishly simplistic idea of the singularity is quite silly.
Most notably, neural networks alone will not reach any kind of AGI.
Start adding the capacity to read from massive knowledge stores, and a place to keep long term information (i.e., memory, probably also in a database), plus a feedback loop for the model to learn and improve? Plus the ability to call APIs? Now you're talking. I think all of those pieces are close to doable right now, maybe with a latency of 5s. If one of the big players puts that in place in a way that is well measured and they can iterate on, I think we'll start to see some really incredible advances.
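A sketch of roughly that loop, with a hypothetical `complete()` call standing in for the model and a toy keyword-overlap retriever standing in for the knowledge store (a real system would use an embedding index, a proper database for the memory, and an API/tool dispatcher at the same point):

    from collections import deque

    def complete(prompt: str) -> str:
        # Hypothetical stand-in for a hosted LLM call; API / tool calls
        # would also be dispatched from here in a fuller version.
        return "<model reply conditioned on the prompt above>"

    def retrieve(query: str, store: list, k: int = 3) -> list:
        # Toy retriever: rank snippets by keyword overlap with the query.
        words = set(query.lower().split())
        return sorted(store, key=lambda doc: -len(words & set(doc.lower().split())))[:k]

    def answer(query: str, knowledge: list, memory: deque) -> str:
        context = retrieve(query, knowledge) + list(memory)    # knowledge store + long-term memory
        prompt = "\n".join(context + ["Question: " + query])
        reply = complete(prompt)
        memory.append("Q: " + query + " -> " + reply)          # feedback loop: remember the exchange
        return reply

    knowledge = ["GPT models are trained on large text corpora.",
                 "SLAM builds maps while localizing.",
                 "Transformers use attention."]
    memory = deque(maxlen=100)                                 # would be a database in practice
    print(answer("How are GPT models trained?", knowledge, memory))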
One really exciting place things might improve is in data cleaning. Right now preprocessing your data and putting it in a format that can be learned efficiently and without bias is a huge pain // risk. This next generation is allowing us to largely ignore a lot of that work.
Similarly, transfer learning is finally good.
And the models are generalist, few-shot learners.
As a consequence, individuals with minimal expertise can set up a world class system to solve niche problems. That's really exciting and it's going to get easier.
Cross-language learning (what was referred to as an "interlingua" in the 90s) means we're seeing some stunning advances in low-resource languages. It used to be that everyone ignored languages other than English, or provided them with mediocre support.
I think we're at a point where there's very little excuse not to launch in many languages at once.
Interesting. How are these indexes stored and how are they fed into the transformer model so that GPT can use them? Does this require an additional training step?
Based on my reading of the docs (I haven't looked at the code), it appears that it uses the existing langchain[1] library (which I have used and is excellent) for constructing longer prompts through a technique known as prompt chaining. So, for example, summarizing long documents would involve a map/reduce-style effort where you get the LLM to summarize chunks of the document (with some degree of overlap between chunks) and then get the model to summarize the summaries.
For answering "queries", it appears like it iterates over the documents in the store, i.e., NOT using it like an index, and feeding each document as part of the context into the LLM.
Actually they don't. You can listen to the Lex Fridman podcasts with Yann LeCun or Andrej Karpathy on that topic. Basically, what LeCun is saying is that the information density in text is not high enough to learn a realistic representation of the world from it.
I'm completely a bystander, but I feel like one flag for me with current approaches is the ongoing separation between training and runtime. Robotics has been through a similar thing where you have one program that does SLAM while you teleop the robot, and you use that to map your environment, then afterward shut it down and pass the static map into a separate localization + navigation stack.
Just as robots have had to graduate to the world of continuous SLAM, navigating while building and constantly updating a map, I feel like there's a big missing piece in current AI for a system that can simultaneously act and learn, that can reflect on gaps in its own knowledge, and express curiosity in order to facilitate learning— that can ask a question out of a desire to know rather than as a party trick.
I think that depends on which definition of AGI you prefer. It (GPT-3) knows more than I do about most topics (though I still beat it at the stuff I'm best at), so I'd say it's got the A and the G covered, and it's "only" at the level of a university student at the stuff I've seen it tested on, so it's "pretty intelligent" in the way toy drone submarines are "pretty good swimmers".
It's not as fast a learner (efficient with samples) as humans are; it doesn't continuously learn from interactions with others like we do; and it's certainly not superhuman at everything (or probably anything other than breadth of knowledge)…
The problem with language models is that there's no thought behind what they say. It's just pattern recognition.
If you ask it an original, convoluted but simple natural-language question, it will spit out nonsense. There's no thought behind what it sees: when it sees 5x - 3 = 2x + 7, it doesn't think that x represents a variable and that you need to plug in a number such that both sides are equal. It just sees the symbols and looks up the answer in its memory. There's zero intelligence here.
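For contrast, here is the kind of explicit, step-by-step reasoning a student would apply to that example (just the standard algebra, written out):

    \begin{align*}
    5x - 3 &= 2x + 7 \\
    5x - 2x &= 7 + 3 \\
    3x &= 10 \\
    x &= \tfrac{10}{3}
    \end{align*}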
It's like a hypothetical student who simply memorizes all definitions without understanding at all. You can ask it for the definition of a linear system of equations, and you can ask it how to solve one, but there is zero understanding or thought there, in spite of apparent knowledge.
In a way, it reminds me of Feynman's anecdote about Brazilian universities.
Goalpost shifting aside, that presumes a definition of "thought". Perhaps these models are different from us in this regard (I'd assume so simply because it had to read more tokens than the total number of times an average single synapse in a human brain will fire in a lifetime), but I don't think humanity collectively yet has a concept of what it means to think that's good enough to prove the absence.
Which of those do you think were elementary grade? Because I'm fairly sure those are GCSE level[0]. In fact, why don't you suggest 20 "basic elementary school math problems", if you suggest 20 that are at that level, I'll put them in and post what it says; no cherry picking results, but I will grade it.
Also, when I was at school, 75%-82.5% was a good score, and looking up the current system, that percentage range is in the top three grades (out of 9) at that level, and can be top grade depending on year and exam board.
I haven't been in elementary school for some time, but I do remember linear equations being taught. Maybe except the sine question, since trigonometry is taught later.
Getting three quarters of simple elementary-school questions right indicates that someone is really bad at math. These are not difficult questions.
I don't have time to come up with 20 exercises, but I can suggest a few that I think the AI may have trouble with. All of these are trivial single-variable linear equations that a smart elementary school student should have no problem with. (Note that the numbers are intentionally scrambled; this does not increase the difficulty of the problem):
1. Solve for x: 113(6x - 45) = 92x
2. Multiply 13.4a - 18b by 18b + 13.4a
3. Four girls bought bus tickets. Anne bought seventeen 20-minute tickets and paid $323. Marianne bought twenty-eight 75-minute tickets and paid $784. Alice bought nineteen 20-minute tickets and eight 75-minute tickets, and Alex bought seven thousand eighty five 20-minute tickets and ninety six 75-minute tickets. How much did Alex pay?
4. An old man walked fifty nine kilometres in four hours through a flat field. He drove a car to France for nine hours, and then proceeded to walk for eighteen hours through a flat field at the same pace. How many kilometres did the old man walk in France?
5. Alice bought seventy-nine pens and twenty-six notebooks. The arithmetic mean of the cost of these articles was $9.31. The sum of the cost of the notebooks was $112.762. How much did a pen cost?
Elementary school is for children who are four to eleven years of age. Basic elementary school maths is the start of that age range, so counting with your fingers, the idea of base-10, and adding two small numbers with perhaps one carry. Late elementary school (for me "middle school", but the UK reforms education too much and too often to keep track) was still only basic arithmetic, fundamental "roll 2 d6" probability, nets of the most basic polyhedra.
Algebra and trig weren't until secondary school for me ("year 7" we called it, the school year beginning at age 11), though I personally had a head start from having learned to read with the Commodore 64 user manual.
Well, if you're not willing (or don't feel well calibrated enough), I should get some old exam papers and see how well it grades against students. I wonder if there even are any downloadable pre-GCSE exams…
What age was that? I'm wondering if something was lost in translation (no two countries I hear about have exactly the same school system so the terms don't line up), or if Poland just did algebra sooner than the UK?
Edit: also, where might I find some old Polish example exams and marking schemes? ChatGPT is inherently multilingual, so I might as well.
Elementary school is (or used to be back when I was there) 1-6th grades, and in 6th grade children are 12 years old, I believe.
There used to be something called a 6th-grade exam (testing both Polish and math) that everyone took before going to middle school, which is what I was looking at before. Some of these questions may be in the training set, though, so I would advise scrambling the numbers.
Here's an example: https://arkusze.pl/sprawdzian-szostoklasisty-probny-operon-j...
This thread is no longer on my first page of comments, so I may forget to reply, but I've downloaded those and do intend to test it against those exams, and will put the write up here: https://github.com/BenWheatley/Studies-of-AI
That kind of ML model is pretty general. Pretraining a big model has been extended to multi-modal environments. People are training them with RL to take actions. People are applying other generative techniques to them, and all sorts of other stuff. If you just look at it as 'predict the next word token,' then it's pretty limiting, but people have already gone way beyond that. TFA talks about some interesting directions people are taking them.
A more general form of your question is whether we can get to AGI with just incremental steps from where we are today, rather than step-change way-out-of-left-field kinds of ideas. People are split on that. Personally, I think that incremental changes from today's methods are sufficient with better hardware and data, but Big New Ideas could certainly speed up progress.
Extremely cynical take: GPT-style ML will get us to a place that we can fool people into thinking that AGI is just around the bend, or maybe one more breakthrough away. It gives people hope, but is not a realistic foundation.
But watch for lots of breathless "look how close we are!" messages.
The other thing GPT is good for is polluting our info-sphere with stuff that sounds confident, but may be anywhere from dead right to slightly off to completely wrong. Having automated means of producing high volumes of fine-sounding nonsense is not the path to anywhere good.
From my experience working with ChatGPT, I believe that it exhibits many of the characteristics of AGI. For example, it can correctly interpret and parse novel requests, complete tasks that are virtually impossible for a machine to accomplish (example below), perform at the level of an average high school student in most areas, demonstrate judgment and opinion (go ahead, ask it to give you an opinion about something you propose to it), and assemble plans for novel situations. However, ChatGPT still has limitations, such as difficulty with numeracy, handling individual characters, and visual tasks. It's also worth noting that ChatGPT's mistakes sometimes sound like those of a child, and it can sometimes produce code that needs minor changes to work. Despite these obvious shortcomings, it passes my personal standard for the first demonstration of limited AGI.
[above is based on chatgpt's summary of my long draft here: https://hastebin.com/raw/muzuvodupu - I also added the last line above and the parentheticals.]
As far as the supervisor/worker idea, it can act as both. You can just say "outline the steps for this" and then "implement step 1 above", "implement step 2". It does much better than when you try to get it to fulfill an entire complex task in one shot.
Quite interesting! Great job on what you put up so far.
I was thinking along the same lines. Can you put some form of contact on your profile, or contact me at mine? I'd love to chat with you about what you're doing.
One thing to note: I don't think a visual game is the best scenario for this because it is very bad at recognizing images, which means it is like asking a blind person to judge a painting. Not a fair task, since it doesn't have a visual cortex at all.
For example, I tried to get ChatGPT to output vector graphics in SVG format, but it didn't do a great job. I also asked it to recognize what object is in an SVG vector image; it took a guess that wasn't bad, but it fell well short of reliable object detection and didn't really recognize or describe the object in the SVG graphic accurately.
Here is the SVG I asked it to describe; it is the default image when you open this SVG editor:
I copied this default image into ChatGPT and asked it to describe it. You can likewise ask it to generate its own SVG, same as with code, but it does very badly.
For the example image, it thought it was a mobile phone, whereas it was a depiction of a piece of paper with the corner bent down and some writing on it. A mobile phone is not a bad guess, but it is far worse than what it does for language-based tasks, and it didn't describe any part of the square, the circle, or the text "SVG". I just repeated the experiment and got this interesting transcript you could look at if you want: https://hastebin.com/raw/cuqodivotu
As you can see it just does okay, not great.
So instead of these types of tasks, I think the supervision should be something that does not involve any imagery at all but is still a sort of generic puzzle. If you have some form of contact I'd love to chat with you about this area of research (this is not job-related).
If it's possible maybe you could click the Discord link in the upper right hand corner of that web site? That is my new Discord. I am runvnc in there. If not I am in the OpenAI Discord under the api-projects. You can also just email me runvnc at gmail dot com.
Yeah, it's actually kind of funny when you ask it to make something with SVG. It can get some of the shapes in there but doesn't know how to arrange them at all. I was aware of the lack of visual data in the training etc.; Flappy was just kind of a late-night random idea.
But I have done some simple tests, for example telling it to write a few basic unit tests and putting it in a loop. That was also kind of a mixed result, but it depends on how you do it.
AGI will probably come from something like GPT-style ML methods plus something else IMHO. They won't get there on their own but the research will be useful.
In 2008, I was taking COS 217 at Princeton - our final project was to write a program to play Othello competitively. We were told to use alpha-beta-pruning for grading, but that there would be a tournament at the end of class, so we could tweak our programs for that (the assignment was insanely hard, so of course no one had time to tweak).
To my surprise, I won the tournament, despite having done the bare implementation with no tweaks! I didn't get a perfect grade on the assignment though: apparently I had misimplemented the queue for the alpha-beta search and completely reversed the queue ordering.
Of course, searching “in reverse” through the alpha-beta tree turned out to be the difference that won me the tournament. It was rather humiliating, but a good lesson on sheer luck’s role in academic approaches to AI.
Isn't searching the right way round extremely important? With good move ordering, alpha-beta only has to examine roughly O(b^(d/2)) nodes to reach depth d; with the worst ordering it degrades toward the full O(b^d) minimax tree. Perhaps if there were no time constraints, doing something slightly different was enough to beat the other competitors, who cancelled each other out.
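For reference, a minimal negamax-with-alpha-beta sketch showing where the ordering enters; the `state` interface (legal_moves, apply, evaluate, heuristic) is hypothetical, and `evaluate()` is assumed to score from the side to move's perspective.

    import math

    def alphabeta(state, depth, alpha=-math.inf, beta=math.inf):
        # Negamax with alpha-beta pruning over a hypothetical game-state interface.
        moves = state.legal_moves()
        if depth == 0 or not moves:
            return state.evaluate()          # score from the side to move's perspective
        best = -math.inf
        # Move ordering is the whole trick: trying likely-good moves first makes
        # cutoffs happen early; the worst ordering forces far more of the tree
        # to be searched for the same final value.
        for move in sorted(moves, key=state.heuristic, reverse=True):
            score = -alphabeta(state.apply(move), depth - 1, -beta, -alpha)
            best = max(best, score)
            alpha = max(alpha, score)
            if alpha >= beta:
                break                        # cutoff: the opponent won't allow this line
        return best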
Sadly the mistake was even more basic than you're likely thinking: as I recall, min-max was necessarily breadth-first, and I simply pushed onto the queue from the wrong side (think: append instead of push), so the queue was backwards and the breadth-first traversal went from right to left. Same time complexity, but slightly different paths reached within the same time constraints!
This is a great read, but I keep getting pulled astray by the sheer volume of interesting links that the author has embedded in the text. I'm less than a third of the way through and have just finished my 3rd rabbit hole exploration.
I think Heinlein hit on something essential for AGI in The Moon Is a Harsh Mistress.
It needs many inputs and many outputs into the real world. A true AGI sees the reaction of the world to its own actions. That is the root of sentience.