OpenAI Insider Estimates 70% Chance AI Will Destroy or Tragically Harm Humanity (futurism.com)
2 points by jc_811 | 2024-06-10 15:00:32 | 79 comments




At this point, you can practically predict what OpenAI will announce to the press next. It's the same Skinner/Chalmers routine over-and-over-and-over again.

"Good lord, what is happening in there? Life-changing AI implementations that could destroy all human life as we know it? Localized entirely within your laboratory?"

"Yes"

"May I see it?"

"No"


It's pathetic at this point: so-called AI is grossly overvalued based on imaginings of what it MIGHT become, but shows no signs of becoming. I always ask people what they'd value "AI" at if what we see now is all we ever get.

Of course this dog and pony show just got NVIDIA past $3 trillion, it's like crypto on steroids.


I bet those stocks are going to go even crazier when Apple announces their OpenAI partnership later today.

I'm sure they will; the nature of stocks is that it doesn't matter WHY things go up, just that they reliably do. Then, when it all comes crashing down at some remove from the hype and the smart money has already divested, it's just the average bag holder left wondering what happened to their 'sure thing'.

> imaginings of what it MIGHT become, but shows no signs of becoming

Just for reference, this is from nine years ago:

https://youtu.be/dbQh1I_uvjo

And this is from five years ago:

"You may not need hospitalization for respiratory problems for which there is pain or even full internal combustion. Breathing is particularly hard in the high-pitched high pitch, low pitched pitches. Air you inhale and exhale can cause a shortening in your breathing, so a ventilator is best. Doping can be administered on your own. If you fall, you must be at least 1 kilometer away from you to be considered for the testing. Your turn-of-the-seat's softerness may be a sign that something has gone wrong. Your depth should not exceed the mean of your seat height when wearing the seatbelt."


For one, Deepdream is cool but was never designed to be photorealistic. It was intentionally degraded to create psychedelic imagery, unlike the contemporary computer vision models we know today.

And for two... LLMs are also getting to be pretty old. BERT and GPT-2 are both more than 5 years old, and still have fundamental issues that we can't be sure are solvable. They lie confidently, they conflate facts and fiction, they create entirely made-up scenarios when asked about real-world happenings. It should be extremely alarming that the only successful AI strategy to date has been scaling up an already inefficient model.

AI will improve with time, but it seems entirely plausible to me that we've already hit the proverbial "bathtub curve" of progress. I genuinely cannot imagine what a "generational leap" in LLM technology would look like, outside of fixing the hallucination issues.


> LLMs are also getting to be pretty old. BERT and GPT-2 are both more than 5 years old

You sound like you have no clue that five years are the blink of an eye.


I'm someone who used both BERT and GPT-2 5 years ago, and am incredibly disappointed by how little progress the industry has made in that time. GPT-Neo and GPT-J are still arguably competitive with the latest AI models you can use today.

Cryptocurrency apologists also used this line of logic. "Yes, today the value of crypto is nothing... but imagine where it will be in five years!" Then we all wait 5 years, and cryptocurrency is still the victim of its own mindset. Like cryptocurrency, I think LLMs have "technically" solved what they set out to do: generate readable text. But you need more than a solution in search of a problem; there isn't really that much demand for marginally truthful text, in the same way completely decentralized currency isn't really necessary for the average well-meaning civilian. I'd even go further and argue that introducing AI into our daily lives requires you to replace something else, something that was probably more accurate and better-designed than the AI replacing it.

If 5 years is a "blink of an eye" for the industry, the entire field will be dead within 18 months. VC just moves that fast.


"Arguably competitive" is just plain wrong. I know it's fun to be dismissive of things, but making silly claims doesn't help comments like this read seriously.

I would legitimately argue they are competitive. GPT-J and GPT-Neo are both fundamentally quite similar to modern LLMs, and when compared against other 4b and 8b parameter models I would genuinely champion the quality of their responses over their contemporaries.

> when compared against other 4b and 8b parameter models I would genuinely champion the quality of their responses

You clearly have some very specific models in mind. Even if the latest 4B and 8B models don’t move the needle on the “results you would champion” metric, this does not advance your argument that the state of the art hasn’t significantly progressed from 5 years ago.

> I would legitimately argue

I’ll bet you would!


> generate readable text

Because this to you is "generate readable text"?

https://m.youtube.com/watch?v=MirzFk_DSiI

Sorry to be so direct, but you're in denial.


That is a party trick, that Hatsune Miku does every year for thousands of loving fans. The fact that an American company figured out vocaloid is not making any nerds lose their minds.

There's a reason why GPT-4o is not taking over YouTube and social media with its incredible capabilities; nobody cares.


> cannot imagine what a "generational leap" in LLM technology would look like

Well, if it encodes a world model, and can work on its own encoding, loop it into a critical revision of its contents and you have the Real Thing - something that is informed and reasons.


Why should I expect a world model to be any more accurate than the current and horribly flawed text and vision modalities?

Because it is «loop[ed] into a critical revision of its contents»...

The presence of a full world model is the precondition. So, first you have to establish that an LLM actually builds a world model in its inner structure.

Then you have to refine that world model, and be able to refine its intentionally spawned branches (e.g. "Imagine scenario X"...). If the iterative refinement succeeds, you have achieved the goal.

The point is implementing critical thinking; that the initial world model is flawed is the known starting point: at the beginning, it only knows it has "heard" a lot of information.


> that the initial world model is flawed is the known starting point: at the beginning, it only knows it has "heard" a lot of information.

If it has no concrete reasoning, how is this looped revision intended to be more accurate than a zero-shot guess?

I agree with the fundamental principle of not trusting input data from the start, but that just puts us back at square one, where we generate non-authoritative results dependent on a series of autoregressive parameters. It might generate better-reasoned results, but the outcome will be equally unreliable and prone to hallucination.


> If it has no concrete reasoning, how is this looped revision intended to be more accurate

It can work if the reasoning (the logic) is part of the world model and is similarly refined.

Then, like in the normal case, you gather a good amount of information, then draw your conclusions based on your logic and the "wisdom" built. Were this automated, you would have an automated champion in judgement - doing its best with the imperfect data and function available. Like us, just better.
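
To make that concrete, here is a minimal sketch (my own illustration, not anything the parent spells out) of what such a critique-and-revise loop might look like; draft_fn and critique_fn are hypothetical stand-ins for a model's generation and self-evaluation calls:

    from typing import Callable

    def refine(prompt: str,
               draft_fn: Callable[[str], str],
               critique_fn: Callable[[str, str], str],
               max_rounds: int = 3) -> str:
        """Iteratively revise an answer by feeding the model's own critique back in."""
        answer = draft_fn(prompt)
        for _ in range(max_rounds):
            critique = critique_fn(prompt, answer)
            if not critique:  # the critic found nothing to fix; stop early
                break
            # Fold the critique into a revision request and regenerate.
            answer = draft_fn(
                f"{prompt}\n\nPrevious answer:\n{answer}\n\n"
                f"Critique:\n{critique}\n\nRevised answer:"
            )
        return answer

Whether a loop like this converges on something more accurate, or just reshuffles the same flawed beliefs, is exactly the open question in this thread: every pass is still the same imperfect model judging itself.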


> but shows no signs of becoming

We started a very long time ago and you have to give it time. The current turn of events seems to have turned its back on some very critical aspects - and still, give it time.

Impatience at this stage seems very ill-posed.

Although you probably meant that "LLMs show no signs of 'emerging' (through upscaling) into what they should become".


Is there any technology you can't say that about?

>>70 percent chance that AI will destroy or catastrophically harm humanity

>Is there any technology you can't say that about?

Yes: all the technologies humanity has invented except for AI and maybe possibly nuclear bombs.

Or am I misinterpreting your question?


Nuclear bombs aren't the only technology that made WW2 so bad. Pretty much every technology has been used for harm one way or another. The question is, on balance, whether we are better off with or without.

Since you are bringing up WW2, I'm guessing you are not aware that by "destroy humanity", OP means "reduce the human population to exactly zero".

He added a lesser option, catastrophically harming humanity, so whatever he meant by the first is immaterial (“there’s a 70% chance of a hurricane or strong winds”). Furthermore, if it wasn’t a high number chosen for dramatic effect the estimated percentage would be completely arbitrary.

I disagree that 70% was chosen for dramatic effect.

No, you’re right, it was chosen because “trust me bro”.

Look, it may well be something he believes, and he’s free to prognosticate (or market) however he likes, but I see absolutely nothing to support the number outside of his own opinion.

Besides, there’s no time limit on p(doom), so it’s completely unfalsifiable (“on a long enough timescale…”), and it’s about the destruction of humanity which means it’s unprovable as well. That, in my view, makes his 70% guess a sensational statement lacking scientific merit.


No, the number is made up and the facts don’t matter so the statement can easily be reimagined as an ad lib.

> There’s a [arbitrary number] percent chance that [technology] will destroy or catastrophically harm humanity

Try these:

social media - the Internet - the large hadron collider - Starlink - Neuralink - quantum computers - the Y2K bug - the 2038 bug - electric cars - gasoline cars - the great firewall of China - the not so great firewall of asbestos - gain of function research - mRNA technology - nuclear bombs - nuclear energy - nuclear memes - paper clip manufacturers - scissors

I’m not saying it’s true that these have a $ARBITRARY_NUMBER percent chance of destroying or catastrophically harming humanity, but couldn’t you make the case? If you want to start with an easy one, try “scissors” ;)


Silly string and aglets seem safe for now.

Is that insider perhaps employed to do marketing?

Daniel Kokotajlo quit OpenAI earlier this year, and was one of the only people who refused to sign their secret non-disparagement agreement, even though (at the time) OpenAI told him this meant he would forfeit his equity. The idea that someone who quit OpenAI months ago, and who volunteered to give up all of his OpenAI stock in order to be able to publicly criticize OpenAI, is actually part of a secret OpenAI marketing strategy, reads like a galaxy-brained conspiracy theory.

> idea that someone who quit OpenAI months ago, and who volunteered to give up all of his OpenAI stock in order to be able to publicly criticize OpenAI, is actually part of a secret OpenAI marketing strategy

Wake me up when they go to D.C. and start working for the people at government wages. This isn’t part of OpenAI’s marketing, but it’s still self-aggrandizement.



> Christiano has

No, he hasn’t. He started a non-profit. I can’t find his pay.

Also, the Wikipedia article seems wrong. The Alignment Research Center is not a 501(c)(3) charity [1]; they simply accept donations through one [2]. They directly claim their evaluations team, METR, is a 501(c)(3) [3], but due to the name being so close to "metro," I'm having trouble verifying that. (They also launder at least smaller donations through the middleman charity [4].)

The closest Christiano has come to public service is his role at NIST, an appointment which prompted a revolt amongst staffers given his ties to effective altruism [5].

All a bit ironic for a pair of nonprofits preaching transparency.

[1] https://apps.irs.gov/app/eos/

[2] https://www.every.org/alignment?donateTo=alignment#/donate/c...

[3] https://www.alignment.org/donate/

[4] https://metr.org/donate

[5] https://web.archive.org/web/20240607060738/https://venturebe... having trouble loading the original Venture Beat article

[a] https://news.ycombinator.com/item?id=39643951


How is working at NIST not "going to DC and working for the people at government wages"? He is, in fact, currently working at NIST, he's on the website:

https://www.nist.gov/people/paul-christiano

The Alignment Research Center is, in fact, a 501(c)(3) charity; here is their Form 990:

https://projects.propublica.org/nonprofits/organizations/863...

METR is also a 501(c)(3). It doesn't have a Form 990 because it's so new it hasn't filed a return yet, but here is the GuideStar page:

https://www.guidestar.org/Profile/99-1219864

every.org is just a generic non-profit payment processor. Lots of people use payment processors, because managing payments is annoying. Calling this "laundering" is like saying that accepting third-party payments through Stripe is some sort of illegal sinister plot.


> How is working at NIST not "going to DC and working for the people at government wages"?

It is. I’m saying the sole evidence of outcome I could find was his staff revolting.

> Alignment Research Center is, in fact, a 501(c)3 charity, here is their Form 990

Thank you, I stand corrected. (Christiano’s compensation is an admirable zero.)

> every.org is just a generic non-profit payment processor

Sure, and given ARC and METR are 501(c)(3)s, it’s fine. Charity to charity. If either weren’t, and were simply non-profits, it would mean a charity was lending its tax-exempt status to a non-charity. That would be sketchy. But as you’ve shown, it’s not what’s going on here.


Kokotajlo is also a frequent poster at LessWrong, the favored watering hole for AI doomers who take Eliezer Yudkowsky seriously. I think that tells you all you need to know about him and why he left OpenAI.

Would have been far more believable if they said 71.24% chance. 70% sounds like they're just spitballing.

The threshold is “destroy or catastrophically harm humanity,” a threshold which one could argue the burning of coal, existence of nuclear weapons, social media for children and a good fraction of living world leaders have already reached [1].

[1] https://www.nytimes.com/2024/06/04/technology/openai-culture...


Agreed. In addition to sounding like spitballing, even the linked articles (from what I saw) don't show a clear way in which the former employee came to 70%. Overall this article comes across as fear-mongering but one of the linked sources was an interesting blog post (IMO) about why people should _stop_ providing specific estimations of p(doom).

https://www.lesswrong.com/posts/EwyviSHWrQcvicsry/stop-talki...


They are just spitballing. Nobody has any actual idea yet whether AGI is even possible. Nobody has a statistically valid sample of what AGIs have done to civilizations. Nobody actually knows anything. They're just spitballing.

That was my point. 70% is a made up number. If you're going to just make up a number, may as well throw some false precision in as well, lest you appear lazy.

Children playing with fire.

Did you try turning it off and back on again?

"Oh, but how will we ever get to the stars without it?"

Do you know what type-III civilizations don't appreciate? Stupidity.


If you haven't accepted that humanity is doomed to being a Type 1 on the Kardashev scale, I think you might need a refresher course in Malthusian growth and the progress of global warming. It would be convenient if we could blame AI for destroying our planet, but the sad truth is that we did this, with our genius expansionist industry. AI is the spittle-drooling hero we summoned in hopes that it would save us from our own demise.

The mere fact that something as dumb as LLMs could be marketed and sold to investors is enough of an indictment of human intelligence already. It's over, for us. We cannot even be bothered to value rational thought, how can you expect investors to behave in humanity's self-interest without AI?


Not asking "How?", and pressing the point, in most of these cases is journalistic malpractice.

“Does it involve creation of technology that doesn’t exist now?”

“Does it involve physically impossible nanotech?”

“Does it involve assuming the personal safety of every person with any degree of control over it?”


AI is already killing innocent people in war. All it takes is for AI to launch the wrong missile once.

In a traditional sense (not like Lavender), AI is applied in combat in extremely specific situations where it is almost impossible to mess up. Computer-vision bombs have to corroborate their target with GPS or INS information, and mostly serve to get a solid TV lock on a target; there is no LLM to jailbreak here. Cruise missiles are a dangerous asset, but that's the case with or without AI; we design these dangerous weapons specifically so they can't be overridden or controlled by third parties, AI included.

In theory, you're right. If an LLM was given complete control over ballistic missiles, there is a nonzero chance that something could go horribly wrong. But this is where we take two steps back and realize that no rational country will give AI complete control over their weapons. It would be an extremely human-based design failure to hand over control of nuclear weapons to a statistics model designed to be wrong.


> this is where we take two steps back and realize that no rational country will give AI complete control over their weapons

And to the extent they do, it’s replacing a system which otherwise wasn’t particularly concerning itself with civilian casualties on the other side.


That is not the algorithm or the black box: that is humans using it loosely, and intentionally.

It is another available piece of imposing power that can be used irresponsibly.


AI will not harm humanity. Nuclear weapons will not harm humanity. Bad humans will.

I think this is a strong argument. Just as you cannot hold a computer accountable, you can't hold any piece of technology responsible. Of course some technology is created to harm or destroy as its outcome, but that isn't most technology.

If someone inside OpenAI had said something like "I predict there is a 70% chance our current corporate culture and leadership will harm or destroy humanity", that would be a much more compelling argument.

It's what you do with the technology that counts.


Heh, the problem with AI may be what it does with us.

Yes, we know that it is ultimately bad humans pulling the trigger, but we also know for a fact that bad humans exist. So why are we arming them?

> Yes, we know that it is ultimately bad humans pulling the trigger, but we also know for a fact that bad humans exist. So why are we arming them?

Because tech people want to build, are paid to build, and "it is difficult to get a man to understand something, when his salary depends on his not understanding it."


Not sure if that was ironic, but it's wild how often I'm hearing people paraphrase the "guns don't kill people, people kill people" line lately.

I guess any group can become the NRA when their controversial interest comes under scrutiny.


Reminds me of the comic "Yes, the planet got destroyed. But for a beautiful moment in time we created a lot of value for shareholders."

There's a 70% chance that their ability to predict the future is flawed.

TLDR: Daniel Kokotajlo (ex-OpenAI employee) puts chances of AI doom at 70%.

Source is actually here: https://www.lesswrong.com/posts/xDkdR6JcQsCdnFpaQ/adumbratio...


A 70% figure, given future variables that are unknown and vast, just says he has an agenda to pull sensational numbers.

Sure, I don't agree with that estimate either.

Is this a time bound prediction, or is it 70% chance AI will tragically harm humanity between now and the death of our species?

The sun begins to die 5 billion years from now, so is this an estimate there's a 70% chance someone will "tragically harm humanity" with AI in the next 5 billion years?

Or are they positing humans will survive the death of our sun, and concluding between now and infinity, someone somewhere will tragically harm someone with AI?

If this is true, someone at some point between now and infinity should really get on this; perhaps we might estimate that around 2.5 billion years from now we'll need to write some regulations.
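
To put a rough number on why the missing time bound matters, here's a toy calculation (my own, with a completely arbitrary 0.01% annual risk figure): any fixed per-year probability of catastrophe compounds toward certainty once the horizon is unbounded, so "70%, eventually" is compatible with almost any underlying risk.

    # Toy calculation: cumulative probability of at least one catastrophe,
    # assuming an arbitrary, independent 0.01% risk per year.
    def cumulative_risk(annual_risk: float, years: int) -> float:
        return 1.0 - (1.0 - annual_risk) ** years

    for horizon in (100, 10_000, 1_000_000):
        print(f"{horizon:>9} years: {cumulative_risk(1e-4, horizon):.4f}")
    # Even a 0.01% annual risk crosses the 70% mark after roughly 12,000 years.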


Looks like knowing simple statistics isn’t a requirement at OpenAI.

There's too much alpha in being a doomsayer.

Tragic harm can take many forms. I think one likely tragic harm is the mundane one: our grandchildren live in a version of the spaceship from WALL-E, except it is their minds that are floating around in chaises longues drinking slurpees and getting fat.

Maybe I'm ignorant, but how does anyone go about estimating this, much less doing so in a realistic fashion? The article states AGI will be developed by these people by 2027, but we've heard all the estimates that we'd have self-driving cars by now and most people wouldn't drive; while self-driving cars are becoming a thing, they missed their target year by years now.

Unless I'm mistaken, even GenAI hasn't really changed anything worthwhile. Most of the changes I see are a) most of the scummy and creepy internet ads are now AI generated b) endless spam is now AI generated and c) the latest round of tech layoffs used GenAI as an excuse. Has there been an actual revolution I've missed?

...and so, if GenAI hasn't lived up to the hype, how am I supposed to believe that this industry will make actual existing AGI?


I am so tired of these meaningless warnings.

Prove it.

The open letter this article is based on still cites "existential risk."

Define what AGI is. Show us evidence the technology is on track to implement it.

This whole "AI safety" scene is becoming a circus for conspiracy theories and speculation.

As far as I can tell we're reaching the point of diminishing returns in training LLMs... which is already rather destructive.

What's telling is that we don't see similar "safety" committees trying to protect water in distressed ecosystems, regulate the building/zoning of data centres and their energy usage, protect the workers used in alignment and training, protect the workers displaced by the technology and poor policies, etc.


Ah yes, the primary threat to humanity, a random word generator.

It's up to humans to use AI well to improve the human condition or to harm it. We know for a fact that the axis of evil (RICIN) (Russia, Iran, China, Israel, North Korea) will use it to do their worst. The only way to counter bad AI is with good AI.

Looking at the way soldiers in Ukraine or civilians in Russia are being blown to pieces by small drones, I'd say we're already there. And these drones are driven by AI, so even if GPS is jammed, they are intelligent enough to find their target without it.

Now imagine that AI has access to anybody's search history, plus that person is using a trackable phone, who's to say that a small FPV drone cannot be sent to get the job done? The way it's done in Ukraine?

I think we as humanity are facing mortal danger RIGHT NOW, not in the future. AI is killing humans on an almost industrial scale. Especially because, well, some people might argue that's the entire goal. So AI is making it much easier.


Jürgen Schmidhuber said years ago during a live QnA that the debate is not dissimilar from that at the invention of fire: "Oh my, it's dangerous" // "Yes, but it is there - it is facts now".

Mass extinction through technology has been a recognized possibility for decades. The technology remains the enabler; the responsible party remains the human actor.


> The technology remains the enabler; the responsible party remains the human actor.

That's an awful lot like "it is difficult to get a man to understand something, when his salary depends on his not understanding it." Some tech people just want to keep building, consequences be damned. Mentally shifting the responsibility one step further down the road is just a way for them to dodge responsibility.

It's worth remembering the developer is also a "human actor." He's still responsible if he builds a dangerous technology and leaves it for someone else to push the button.


It depends case by case. Examples abound, too many.

The atomic bomb? But Nazis had the V2.

The videogame? But the creator valued the creative, entertainment and possibly artistic results more than the idea of collective time lost to addiction, or maybe thought "better than chemicals"...

The axe? It is for chopping trees. Dynamite? Mines. Death ray? To defend yourself from wolves etc.

Sure there will be cases in which someone develops something inherently and only harmful, but it is not the general rule.


It's worth remembering "past performance is not indicative of future results."

Don't reason about the safety of potential future technology from the safety of past technology, generally. That gets you into nonsense like "my highly contagious bio-weapon won't destroy civilization because the axe didn't." You've got to look at the particular characteristics of the new technology, with a cold and realistic understanding of human nature.


You are pointing to the risk that rogue engineers delude themselves into thinking that past technologies were not "that" devastating. That would be an error on more than one side: not only does "I haven't died yet" mid-fall not determine what happens at the ground (past trends do not determine the future), but also, some past technologies have been devastating.

Still, it is pretty uncommon in general to develop a «highly contagious bio-weapon»: when it happens - and it happens - the destructive context is clear. But there are also very many cases in which the productive or destructive adoption of an otherwise neutral technology is the critical question. The advances driven by Yann LeCun can recognize targets, or thieves, or citizens, or tumors...



'Some guy has an opinion, which also seems to be aligned with the importance of the products he works on'
