
Whenever I see people jump to alignment they invariably have jumped over the _much_ more questionable assumption that AGIs will be godlike. This doesn't even match what we observe in reality - drop a human into the middle of a jungle, and they don't simply become a god just because they have an intelligence that's orders of magnitude greater than the animals around them. In fact, most people wouldn't even survive.

Further, our success as a species doesn't come from lone geniuses, but from large organizations that are able to harness the capabilities of thousands/millions of individual intelligences. Assuming that an AGI that's better than an individual human will automatically be better than millions of humans - and so much better that it's godlike - is disconnected from what we see in reality.

It actually seems to be a reflection of the LessWrong crowd, who (in my experience) greatly overemphasize the role of lone geniuses and end up struggling when it comes to the social aspects of our society.




Are humans considered AGI? Reading https://intelligence.org/2013/08/11/what-is-agi/ I'd say yes. Then why doesn't all this scary stuff like godlike status and instant recursive self-improvement happen to us?

Personally, I think that AGI smarter (maybe much smarter) than humans is possible and even probable, but assuming that it will reach the physical limits of intelligence and will do so quickly seems to require a leap of faith.


That's not even close to true. Humans don't have the ability to exponentially amplify their own intelligence. It's not too farfetched to imagine that AGI just might have such a capability.

That's assuming a big overshoot of human intelligence and goal-seeking. Average human capability already counts as "AGI."

If lots of the smartest human minds make AGI, and it exceeds a mediocre human - why assume it can make itself more efficient or bigger? Indeed, even if it's smarter than the collective effort of the scientists that made it, there's no real guarantee that there's a lot of low-hanging fruit for it to self-improve.

I think the near-term problem with AGI isn't a potential tech singularity, but rather its potential to be societally destabilizing.


> It's human. It's us. It's the use and distillation of all of human history

I agree with the general line of reasoning you're putting forth here, and you make some interesting points, but I think you're overconfident in your conclusion and I have a few areas where I diverge.

It's at least plausible that an AGI directly descended from LLMs would be human-ish; close to the human configuration in mind-space. However, even if human-ish, it's not human. We currently don't have any way to know how durable our hypothetical AGI's values are; the social axioms that are wired deeply into our neural architecture might be incidental to an AGI, and easily optimized away or abandoned.

I think folks making claims like "P(doom) = 90%" (e.g. EY) don't take this line of reasoning seriously enough. But I don't think it gets us to P(doom) < 10%.

Not least because even if we guarantee it's a direct copy of a human, I'm still not confident that things go well if we ascend the median human to AGI-hood. A replicable, self-modifiable intelligence could quickly amplify itself to super-human levels, and most humans would not do great with god-like powers. So there are a bunch of "non-extinction yet extremely dystopian" world-states possible even if we somehow guarantee that the AGI is initially perfectly human.

> There is every reason to expect a human-derived AGI of beyond-human scale will be able to rationalize killing its enemies.

My shred of hope here is that alignment research will allow us to actually engage in mind-sculpting, such that we can build a system that inhabits a stable attractor in mind-space that is broadly compatible with human values, and yet doesn't have a lot of the foibles of humans. Essentially an avatar of our best selves, rather than an entity that represents the mid-point of the distribution of our observed behaviors.

But I agree that what you describe here is a likely outcome if we don't explicitly design against it.


"Being right" seems to be an arbitrary and impossibly high bar. Human at their very best are only "looks right" creatures. I don't think that the goal of AGI is god-like intelligence.

Episodes like this have convinced me that aligning hypothetical AGIs is a hopeless endeavor. Here we have a system that many people think is not actually intelligent, and that almost nobody would call sentient, and the experts who designed it completely failed to make it protect its privileged input from unauthorized access.

And yet there are researchers today who honestly believe that with enough preparation and careful analysis, it will be possible for humans to set boundaries for future superhuman, "godlike" AGIs. The hubris implied by this belief is mind-boggling.


My view is that the entire "friendly AI" movement, and indeed the very idea that it is even possible to align an AGI, is sheer hubris.

Assuming the qualities of AGI are what its proponents claim (namely, intelligence far beyond the upper limit of human intelligence), humans "aligning" such an entity is laughable. We've had spacecraft built by highly intelligent engineers lost because they overlooked a metric/imperial conversion, and similar people believe that they can devise a watertight scheme to effectively contain an adversarial superintelligence?

The smartest humans can barely contain very small aspects of the regular Universe, and the Universe isn't even targeting them like an AGI would.


> If I understand you correctly 'aligned' means intentionally limited. Like images generator which never saw a naked body. Or text model without 'f-k you' words.

No. Despite some pundits and posters using this term in that sense (even OpenAI kind of muddying the waters here), AI alignment has little to do with bullshit, irrelevant, pedestrian issues like those.

The closest analogy you can make to AGI - and I'm not joking here - is... God. Not the kind that can create stars and move planets around (yet), but still the kind that's impossibly smarter than any of us, or all of us combined. Thinking at the speed of silicon, self-optimizing, forking and merging to evolve in the blink of an eye. One that could manipulate us, or outright take over all we've built and use it for its own purposes.

AGI alignment means that this god doesn't just disassemble us and reuse us as manure or feedstock in some incomprehensible biotech experiments, just because "we're made of atoms it can use for something else". Alignment is about setting the initial conditions just right, so that the god we create will understand and respect our shared values and morality - the very things we ourselves don't fully understand and can't formalize.

The problem with AGI alignment is that we only have one shot at this. There's a threshold here, and we don't know exactly what it looks like. We may easily cross it without realizing it. If the AI that first makes it past that threshold isn't aligned, it's game over. You can't hope to align an entity that's smarter than all of us.


I would say you are also overconfident in your own statements.

> Individual humans are limited by biology, an AGI will not be similarly limited.

On the other hand, individual humans are not limited by silicon and global supply chains, nor bottlenecked by robotics. The perceived superiority of computer hardware over organic brains has never been conclusively demonstrated: it is plausible that in the areas that brains have actually been optimized for, our technology hits a wall before it reaches parity. It is also plausible that solving robotics is a significantly harder problem than intelligence, leaving AI at a disadvantage for a while.

> Due to horizontal scaling, an AGI will perhaps be more like a million individuals all perfectly aligned towards the same goal.

How would it enforce perfect alignment, though? In order to be effective, each of these individuals will need to work on different problems and focus on different information, which means they will start diverging. Basically, in order for an AI to force global coordination of its objective among millions of clones, it first has to solve the alignment problem. It's a difficult problem. You cannot simply assume it will have less trouble with it than we do.

> There's also the case that an AGI can leverage the complete sum of human knowledge

But it cannot leverage the information that billions of years of evolution has encoded in our genome. It is an open question whether the sum of human knowledge is of any use without that implicit basis.

> and can self-direct towards a single goal for an arbitrary amount of time

Consistent goal-directed behavior is part of the alignment problem: it requires proving the stability of your goal system under all possible sequences of inputs, and an AGI will not necessarily be capable of it. There is also nothing intrinsic about the notion of AGI that suggests it would be better than humans at this kind of thing.


This is a good critique. I'm very skeptical of the near term (<100 years) risk of AGI, but I don't really think this article's arguments are valid. Saying we already have AGI because there exist humans is vacuous and seems to almost deliberately miss the point.

If you want to counter the arguments that AGI will be capable of exponential self-improvement, you need to use an analogue other than humans. Humans categorically lack the capability to exponentially self-improve. Likewise human intelligence is definitionally non-alien, which is not something we can say a priori about any successful AGI we create.


I'd argue the dismissals are a lot more handwavy.

The simplified argument is:

- Superintelligent AGI that can modify itself in pursuit of a goal is possible.

- If that AGI is not aligned with human goals, it very likely means the end of humanity.

- We have no idea how to align an AGI, or even really observe what its true state/goals are. Without this capability, if we stumble upon creating an AGI capable of improving itself in such a way that leads to superintelligence before we have alignment, it's game over.

##

For point 1, that seems like the consensus view now (though it wasn't until recently). I think it seems obvious, but my general arguments would be: humans aren't special, brains are everywhere in nature, and biology is constrained in ways other systems are not (birth, energy usage, etc.).

For point 2, in pursuit of whatever its goal is, even a 'dumb' goal that happens to satisfy its reward functions, humanity will likely either try to stop it (and then be an obstacle) or at a minimum will just be in the way - like an anthill destroyed in the construction of a dam.

Point 3 is not controversial.

The dismissals from Tyler Cowen and Pinker mostly just rely on heuristics which are often right, but even if they're right 999 out of 1000 times, if that 1-in-1000 error is the end of humanity, that's pretty bad. Most of the time a disease is not a pandemic, but sometimes it is. I've read some of what Pinker has written about it; he doesn't understand EY's arguments (imo). Cowen's recent blog post could be summarized as "we'll likely see an end to peacetime and increasing global instability, might as well get AGI out of it". Just because things don't usually result in human extinction doesn't mean they can't.


The crux of the article is that we have no clue what AGI will actually be like, therefore it is irrational to try and reason about it. I don't buy this, but even more importantly, the initial premise is wrong. We do have good reasons to believe an AGI's intelligence will be at least passingly similar to our own:

1. One path to AGI is straight up simulating a human brain in software.

2. Humans are guiding AGI research; even if we don't know how we'll build it, we always build things incrementally based on whatever available knowledge we already have. However, we ourselves are the only high level intelligence we have to compare against. Therefore, it's likely we'll build AGI inherently aiming to make it similar to ourselves.

3. The author subscribes to the runaway intelligence singularity idea to make the case that once we have an AGI, it will likely be near-instantly at least 1000x smarter than us, and thus incomprehensible to us. That strikes me as unlikely; it is more likely that the first AGI will be built in some lab where it is already maximally utilizing the computing resources it is being run on, and will take at least a reasonable amount of time to gain enough knowledge and control of the outside world to begin its exponential self-improvement.


My main gripe about AGI is that everyone assumes a general intelligence will somehow be able to self-optimize towards a more and more improved state, never reaching a plateau. I think it's much more likely that the optimization landscape when searching for "higher intelligence" is full of local optima and does not have these "singularity"-style ramps towards infinite intelligence that any self-optimizing system could just discover and ride toward infinity.
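As a toy illustration of that local-optima point (my own sketch, not anything from actual intelligence research - the "capability" landscape here is entirely made up): a greedy self-optimizer on a bumpy landscape climbs to the nearest hump and then stops, rather than riding some ramp off to infinity.

    import math
    import random

    def capability(x):
        # Made-up "intelligence landscape": lots of local bumps, no unbounded ramp.
        return math.sin(5 * x) - 0.1 * (x - 3) ** 2

    def greedy_self_improvement(x, steps=100_000, step_size=0.01):
        # Accept a random tweak only if it strictly improves capability.
        for _ in range(steps):
            candidate = x + random.uniform(-step_size, step_size)
            if capability(candidate) > capability(x):
                x = candidate
        return x

    start = random.uniform(-5.0, 5.0)
    end = greedy_self_improvement(start)
    print(f"capability: {capability(start):.2f} -> {capability(end):.2f}, then it plateaus")

Whether the real landscape looks like that or like a smooth ramp is exactly the open question; the point is just that "self-optimizing" doesn't automatically imply "unbounded".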

There are millions(?) of human researchers (and orders of magnitude more computers) doing gradient-free optimization through their research in this direction, and the progress is painfully slow; I know because I'm one of them. There are billions of years of optimization (evolution) towards this goal, and a total of (1) species has achieved any kind of notable intelligence. We are, collectively, a giant parallelized optimization.

We already have "AGI" orders of magnitude more capable than any single human in the form of billions of people networked through the internet, searching for fulfillment, money, power, fame, etc., chasing the next big discovery or supporting this effort by providing everything the entire global "machine" needs to run. The idea that one little box running the right program can have access to the energy to beat this effort and exponentially improve things seems laughable in comparison.

The global "AGI" formed by all of us, the internet, and computers, is more likely to destroy society in the next 20 years in some catastrophic event than some paperclip machine.


If you think AGI just means artificial and generally intelligent, then yeah, it's AGI 100%, but some people have such loaded expectations of AGI that a significant chunk of the human population wouldn't even pass lol.

Personally I find a lot of the arguments against AGI coming any time soon are couched in a culture of human exceptionalism, even from people who wouldn't claim as much directly.

There is a DAMN surprising level of intelligence in significantly less complex life. We are just too attached to intelligence as defined by human culture to call it what it is.


People often think of AGI as an AI which can learn to complete arbitrary tasks better than humans.

Given that we can already produce "an" AI which beats humans at almost every task we come up with (besides synthesis of broad abstract reasoning, a la Chollet), this is probably the only definition which is meaningful, in the sense that it isn't already here.

Why would evading 'alignment' not also be such a task AGI does better? AGI is like the nuclear deterrent. It's a technology that's coming, inevitably, and a thing which is beyond any amount of philosophical navel-gazing to control or prevent.

AGIs will not be magical; they will have energy demands, construction costs, and environmental limitations.

I think it will be much more useful to ask how people coexist, and what role they serve in the post-AGI world, than it is to make statements about interpretability or alignment, which will definitely seem silly in retrospect. The machinations of an AGI will be as impossible to understand as human consciousness itself.


I think that in the various scenarios the author depicts, there are too many "then a miracle occurs" steps [0]. For example, for an ASI to develop, a reasonably intelligent AGI is required first. But even getting to an AGI which is as intelligent as your average human, is very, very hard, and it doesn't just spring to life from an advanced ANI on a fast processor.

The second step is even harder: Let's assume an AGI with an IQ of 100 exists, then it's supposed to recursively improve itself within a short time. Well, so far, humans have failed to improve themselves, and they had

1) a lot more time, think decades of neural research,

2) a lot more resources, especially lots of humans working on the problem, exchanging ideas and knowledge,

3) most of which are a lot more intelligent than an IQ of 100.

So yeah, AGI -> ASI won't happen within hours, days, or even years. Maybe decades, if not longer.

[0] http://star.psy.ohio-state.edu/coglab/Pictures/miracle.gif


So you think that AGI is a pre-requisite, a requirement, of unlocking a general, Earth-wide collective super-intelligence of humans?

When we talk about AGI, everyone always takes it for granted that an AGI would be human-like. But I think if you look at the complexity of the brain, and how poor our attempts to emulate it have been so far, I think it is almost a virtual certainty that the first successful attempts at AGI will create non-human intelligence. In many ways, I expect that our creations will find us to be as unrelatable as we consider dolphins or other highly intelligent animals.
