
Such a weird test. 99.9% of humans wouldn't even understand the question, let alone be able to formulate a coherent answer for it.



Being able to answer these questions is a prerequisite for AGI. After all, there ARE humans capable of doing that, so if the AI can't do it no matter how hard it tries, then there ARE human capabilities that the AI can't replicate (thus, it isn't an AGI). And it seems like no LLM is making any progress at all on that kind of prompt, which is why I use it as a core benchmark on my "AGI-meter".

Humans aren't able to do it in a chat session, though. Being able to work on the problem in the background for a couple of days may be a prerequisite for AI to solve these problems. And that would cost the asker money.

Anyone familiar with the syntax / jargon should be able to answer this specific problem in ~5 seconds of thinking, though. And I mean it, even a 10yo kid should...

I think you'll be using that meter for a long time, then. I don't really know anyone who's under the impression that the current direction of LLMs is going to produce AGI; it seems as if you're barking up a tree most people aren't really concerned exists.

That's fair enough

Except there’s a lot of not-so-informed people who think AGI has been here ever since ChatGPT came out. Even more think it’ll get there very shortly based on just bigger and bigger LLMs. Many have argued as much here on HN.

You're making a completely incoherent argument -- that if it can't do a single task that some percentage of people can do, it's not intelligent, when there is nobody on earth who can do everything that some small percentage of people on earth can do, by definition.

Why is this relevant to the performance of a computer program? It makes sense to me that computer programs & humans should continue to be judged by different standards.

If a good chunk of humans can't pass your "general intelligence test", then by definition it's not a general intelligence test, unless humans are not generally intelligent.

which is better than formulating a coherent but wrong answer
