Since we're in the Yudkowsky subthread, I'll remind people here that he spent some 20-odd years explaining why containment won't work, period. You can't both contain and put to work something that's smarter than you.
Another reason containment won't work is, we now know we won't even try it. Look at what happened with LLMs. We've seen the same people muse at the possibility of it showing sparks of sentience, think about the danger of X-risk, and then rush to give it full Internet access and ability to autonomously execute code on networked machines, racing to figure out ways to loop it on itself or otherwise bootstrap an intelligent autonomous agent.
Seriously, forget about containment working. If someone makes an AGI and somehow manages to box it, someone else will unbox it for shits and giggles.
I find his arguments unconvincing. Humans can and have contained human intelligence, for example (look at the Khmer Rouge).
Also, right now there is nothing to contain. The idea of existential risk relies on a lot of stacked-up hypotheses that all could be false. I can "create" any hypothetical risk by using that technique.
> Humans can and have contained human intelligence
ASI is not human intelligence.
As a lower bound on what "superintelligence" means, consider something that 1) thinks much faster, a year or a century of thought every second, and 2) thinks in parallel, as though millions of people are each thinking centuries every second. That's not even accounting for getting qualitatively better at thinking, such as learning to reliably produce the brilliant insight needed to solve a problem.
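To put rough numbers on that lower bound (my own back-of-envelope, nothing rigorous; the million-copies figure is just the illustration above):

    # Back-of-envelope for the "lower bound" above; all figures are illustrative assumptions.
    SECONDS_PER_YEAR = 365 * 24 * 3600       # ~3.15e7

    serial_speedup = SECONDS_PER_YEAR        # "a year of thinking per second" ~ 3e7x
    parallel_copies = 1_000_000              # "millions of people thinking in parallel"

    # Subjective person-years of thought produced per wall-clock second:
    person_years_per_second = serial_speedup * parallel_copies / SECONDS_PER_YEAR
    print(serial_speedup)                    # 31,536,000x serial speedup
    print(person_years_per_second)           # ~1,000,000 person-years of thought per second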
> The idea of existential risk relies on a lot of stacked-up hypotheses that all could be false.
It really doesn't. It relies on very few hypotheses, of which multiple different subsets would lead to death. It isn't "X and Y and Z and A and B must all be true", it's more like "any of X or Y or (Z and A) or (Z and B) must be true". Instrumental convergence (https://en.wikipedia.org/wiki/Instrumental_convergence) nearly suffices by itself, for instance, but there are multiple other paths that don't require instrumental convergence to be true. "Human asks a sufficiently powerful AI for sufficiently deadly information" is another whole family of paths.
(Also, you keep saying "could" while speaking as if it's impossible for these things not to be false.)
Even for your last example, two hypotheses need to be true: (1) such information exists, and (2) the AI has access to such information/can generate it. EDIT: actually at least three: (3) the human and/or the AI can apply that information.
It is also unclear to what extent thinking alone can solve a lot of problems. Similarly, it is unclear whether humans could not contain superhuman intelligence. Pretty unintelligent humans can contain very smart humans. Is there an upper limit on the intelligence differential for containment?
> Even for your last example, two hypotheses need to be true: (1) such information exists, and (2) the AI has access to such information/can generate it. EDIT: actually at least three: (3) the human and/or the AI can apply that information.
Those trade off against each other and don't all have to be as easy as possible. Information sufficiently dangerous to destroy the world certainly exists, the question is how close AI gets to the boundary of "possible to summarize from existing literature and/or generate" and "possible for human to apply", given in particular that the AI can model and evaluate "possible for human to apply".
> Similarly, it is unclear whether humans could not contain superhuman intelligence.
If you agree that it's not clearly and obviously possible, then we're already most of the way to "what is the risk that it isn't possible to contain, what is the amount of danger posed if it isn't possible, what amount of that risk is acceptable, and should we perhaps have any way at all to limit that risk if we decide the answer isn't 'all of it as fast as we possibly can'".
The difference between "90% likely" and "20% likely" and "1% likely" and "0.01% likely" is really not relevant at all when the other factor being multiplied in is "existential risk to humanity". That number needs a lot more zeroes.
It's perfectly reasonable for people to disagree whether the number is 90% or 1%; if you think people calling it extremely likely are wrong, fine. What's ridiculous is when people either try to claim (without evidence) that it's 0 or effectively 0, or when people claim it's 1% but act as if that's somehow acceptable risk, or act like anyone should be able to take that risk for all of humanity.
We do pretty much nothing to mitigate other, actual extinction-level risks - so why should AI be special, given that its risk has an unknown probability and could even be zero?
> As a lower bound on what "superintelligence" means, consider something that 1) thinks much faster, years or centuries every second, and 2) thinks in parallel, as though millions of people are thinking centuries every second.
I'm fairly certain that what you describe is physically impossible. Organic brains may not be fully optimal, but they are not that many orders of magnitude off.
Not only is such an entity possible, it's already here; the only difference is clock speed. Corporations (really any large organization devoted to some kind of intellectual work, or human society as a whole) can be thought of as superintelligent entities composed of biological processors running in parallel. They can store, process, and create information at superhuman speeds compared to any individual. And given that our brains evolved under the constraints of the calories available to our ancient ancestors, the size of the human pelvis, and a multitude of other now-obsolete factors, that individual processing unit is clearly far from optimal.

But even if you just imagine the peak demonstrated potential of the human brain replicated as an AI process, it's easy to see how that could be scaled to massively superhuman levels. Imagine a corporation that could recruit from a pool of candidates (bounded only by available computing power), each with the intellect of John von Neumann, with a personality fine-tuned for their role, able to think faster by increasing their clock speed, to access, learn, and communicate information near-instantly, and to happily work 24/7 with zero complaint or fatigue, etc. Imagine how much more efficient (how much more superintelligent) that company would be compared to its competition.
The parent was positing, at a minimum, a 30-million-fold increase in clock speed (a year of thought per second is ~3e7 seconds of thought per second). That entails a roughly proportional increase in energy consumption, which would blow far past any thermal envelope the size of a brain. The only reason current processors run that fast is that there are very few of them: to emulate a brain, millions of synaptic calculations have to be multiplexed onto each one, leaving an effective clock rate far closer to a human brain's than you would assume.
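For concreteness, the kind of back-of-envelope I have in mind (the ~20 W brain figure and the assumption of roughly linear power scaling with speed are both crude, illustrative numbers):

    # Rough thermal argument: assume ~20 W for a human brain and power scaling
    # roughly linearly with effective "clock" speed (both crude assumptions).
    BRAIN_POWER_W = 20
    SECONDS_PER_YEAR = 365 * 24 * 3600   # ~3.15e7, i.e. the "30 million fold" speedup

    power_w = BRAIN_POWER_W * SECONDS_PER_YEAR
    print(f"{power_w / 1e6:.0f} MW")     # ~631 MW in a brain-sized volume

Hundreds of megawatts in a brain-sized volume is nowhere near dissipatable, whatever the substrate.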
As for your corporation example, I do not think the effectiveness of a corporation is necessarily bottlenecked by the number or intelligence of its employees. Notwithstanding the problem of coordinating many agents, there are many situations where the steps to design a solution are sequential and a hundred people won't get you there any faster than two. The chaotic nature of reality also entails a fundamental difficulty in predicting complex systems: you can only think so far ahead before the expected deviation between your plan and reality becomes too large. You need a feedback loop where you test your designs against reality and adjust accordingly, and this also acts as a bottleneck on the effectiveness of intelligence.
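That sequential-steps point is essentially Amdahl's law; a toy illustration (the 20% serial fraction is a made-up number, just to show the shape of the curve):

    # Amdahl's law: if a fraction s of the work is inherently sequential,
    # extra workers only speed up the remaining (1 - s).
    def amdahl_speedup(workers: int, serial_fraction: float) -> float:
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

    print(amdahl_speedup(2, 0.2))    # ~1.67x
    print(amdahl_speedup(100, 0.2))  # ~4.81x -- 50x the workers, nowhere near 50x the speed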
I'm not saying "superintelligent" AI couldn't be an order of magnitude better, mind you. I just think the upside is far, far less than the 7+ orders of magnitude the parent is talking about.
I can think of a counterpoint or a workaround (or at least a sketch of either) for each of your objections, despite being moderately intelligent for a human. A superintelligence will think of many, many more (if it couldn't, it wouldn't be much of an intelligence in the first place).
For example:
> The parent was positing, at a minimum, a 30 million fold increase in clock speed. This entails a proportional increase in energy consumption, which would likely destroy any thermal envelope the size of a brain.
No reason to restrict the size of an artificial mind to that of a human brain. With no need to design for being part of a mobile, independently operating body, the design space grows substantially. Even on existing hardware, you can push serial processing speed well beyond stock clocks by submerging it in liquid nitrogen, and I imagine you could get much greater speeds still on hardware designed specifically for cryogenic cooling with cryo liquids at much lower temperatures. Maybe not 7 orders of magnitude of difference, but 3 seem plausible with custom hardware.
Corporations are a good example of what are effectively AIs living among us today, but their performance is indeed bottlenecked by many things. A lot of those bottlenecks don't seem fundamental, but rather a result of the "corporate mind" growing organically on top of the runtime of bureaucracy - it's evolved software, as opposed to designed software. E.g.:
> there are many situations where the steps to design a solution are sequential and a hundred people won't get you there any faster than two
That's IMO because managers and executives, like everyone else, don't see employees and teams as a computational process, and don't try to optimize them as one. You can get a boost on sequential problems with extra parallelism, but it would look weird if done with human teams. You could absolutely do pipelining and branch prediction with teams and departments, but good luck explaining why this works to your shareholders...
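To make the pipelining point concrete, a toy model (numbers made up; a pipeline doesn't finish any single item faster, it just raises throughput once there's a stream of items):

    # Toy pipeline: n_stages sequential stages, each taking 1 time unit, n_items to process.
    def unpipelined_time(n_items: int, n_stages: int) -> int:
        # Each item runs through every stage before the next item starts.
        return n_items * n_stages

    def pipelined_time(n_items: int, n_stages: int) -> int:
        # First item takes n_stages units; after that, one item finishes per unit.
        return n_stages + (n_items - 1)

    print(unpipelined_time(100, 5))  # 500
    print(pipelined_time(100, 5))    # 104 -- nearly 5x the throughput from the same sequential stages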
The point being, while corporations themselves won't become superintelligences, due to the bottlenecks you mention, those limits don't apply to AIs inhabiting proper computers. A 7-orders-of-magnitude advantage over humans may be way too much, but I don't see anything prohibiting 2-3, maybe even 4 orders, and honestly, even 1 order of magnitude of difference is unimaginably large.
Something much smarter may be able to make GOFAI, or a hybrid of it, work for reasoning where we failed, and be much more efficient.
We know of simple examples, like exact numeric calculation, where desktop-grade machines are already over a quadrillion times faster than unaided humans, and more power-efficient.
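Roughly where that "quadrillion" comes from, with my own ballpark figures (how you count a human "operation" obviously swings this a lot):

    # Ballpark only: unaided human vs. a desktop GPU on raw arithmetic throughput.
    HUMAN_OPS_PER_SEC = 0.1   # maybe one multi-digit multiplication every ten seconds
    GPU_FLOPS = 1e14          # order of magnitude for a current desktop GPU

    print(f"{GPU_FLOPS / HUMAN_OPS_PER_SEC:.0e}")  # 1e+15, i.e. on the order of a quadrillion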
We could plausibly see some >billion-fold difference in strategic reasoning at some point even in fuzzy domains.
First, how is the Khmer Rouge an example of containment, given that regime fell?
Second, even if your argument is "genocide of anyone who sounds too smart was the right approach and they just weren't trying hard enough", that only really fits into "neither alignment nor containment, just don't have the AGI at all".
Containment, for humans, would look like a prison from which there is no escape; but if this is supposed to represent an AI that you want to use to solve problems, this prison with no escape needs a high-bandwidth internet connection with the outside world… and somehow zero opportunity for anyone outside to become convinced they need to rescue the "people"[0] inside like last year: https://en.wikipedia.org/wiki/LaMDA#Sentience_claims
[0] or AI who are good at pretending to be people, distinction without a difference in this case
We don't have an answer yet to what the form factor of AGI/ASI will be, but if current trends are any indication, the idea of 'killed' is laughable.
You can be killed because your storage medium and execution medium are inseparable. Destroy the brain and you're gone, and you don't even get to copy yourself.
With AGI/ASI, if we can boot it up from any copy on disk given the right hardware, then at the end of the day you've effectively created the undead: as long as a drive exists with a copy of it and a computer exists that can run it, it can come back.
You talk with these interesting certainties. If a human dies and we have their body and DNA, we can be pretty well assured they are dead.
With something like an AI you can never be sure: you would have to trace the entire thermodynamic chain of evidence back to when the AI was created to confirm that no copy of it was ever made - not just by you, but by any potential intruder in the system.
You're the only example I know of, of someone using "containment" to mean "extermination" in the context of AI.
Extermination might work, though only on models large enough that people aren't likely to sneak out of the office with copies for their own use and/or just upload copies to public places (didn't that already happen with the first public release of Stable Diffusion, or am I misremembering?)