Hacker Read

lucakiebel | karma 187 | avg karma 3.4 · 2023-02-15 11:53:49

If it wasn’t confidentially wrong all of the time. My calculator will display 80085, but not tell me that 2+2=5

metacritic12 | karma 1083 | avg karma 4.28 · 2023-02-15 11:56:06

To your point. I find the 2+2=5 cases more interesting, and would like to see more of those: when does it happen? When is ChatGPT most useful? Most deceptive?

The 80085 case is only interesting insofar as it reveals weaknesses in the tool, but it's so far from tool-use that it doesn't seem very relevant.

reply

currymj | karma 3365 | avg karma 4.01 · 2023-02-15 12:09:26

in my experience it happens pretty regularly if you ask one of these things to generate code (it will often come up with plausible library functions that don't exist), or to generate citations (comes up with plausible articles that don't exist).

potatolicious | karma 31022 | avg karma 5.68 · 2023-02-15 12:25:34

Considering that in its initial demo, on very anodyne and "normal" use cases like "plan me a Mexican vacation" it spit out more falsehoods than truth... this seems like a problem.

Agreed on the meta-point that deliberate tool mis-use, while amusing and sometimes concerning, isn't determinative of the fate of the technology.

But the failure rate without tool mis-use seems quite high anecdotally, which also comports with our understanding of LLMs: hallucinations are quite common once you stray even slightly outside of things that are heavily present in the training data. Height of the Eiffel Tower? High accuracy in recall. Is this arbitrary restaurant in Barcelona any good? Very low accuracy.

The question is how much of the useful search traffic is like the latter vs. the former. My suspicion is "a lot".

reply

williamcotton | karma 3110 | avg karma 1.49 · 2023-02-16 05:16:22

> But the failure rate without tool mis-use seems quite high anecdotally

The problem with your judgement is you click on every “haw haw, ChatGPT dumb” and you don’t read any of the articles that show how an LLM works, what is is quantitatively good at and bad at and how to improve performance on tasks using other methods such as PAL, Toolformer or other analytic augmentation methods.

Go read some objective studies and you won’t be yet another servomechanism blindly spreading incorrect assumptions based on anecdotes from attention starved bloggers.

reply

potatolicious | karma 31022 | avg karma 5.68 · 2023-02-16 11:29:43

Hi, I work on LLMs daily, along with some intensely talented, skilled, and experienced machine learning engineers who also work on LLMs daily. My opinion is formed by both my own experiences with LLMs as well as the opinions of those experts.

Wanna try again? Alternatively you can keep riding the hype train from techfluencers who keep promising the moon but failing to deliver, just like they did for crypto.

reply

scotty79 | karma 14043 | avg karma 1.62 · 2023-02-15 12:10:03

It's a language model not a knowledge model. As long as it produces the language it's by definition correct.

erulabs | karma 5450 | avg karma 5.4 · 2023-02-15 12:15:10

I'm not entirely sure that's as simple of a distinction as you might suppose. Language is more than grammar and vocabulary. Knowing and speaking truth have quite the overlap.

More specifically, without language, can you know that someone else knows anything?

reply

scotty79 | karma 14043 | avg karma 1.62 · 2023-02-15 14:18:44

> Language is more than grammar and vocabulary. Knowing and speaking truth have quite the overlap.

But speaking the truth is just minor and rare application of the language.

> More specifically, without language, can you know that someone else knows anything?

Honestly, just ask them to show you math. If they don't have any math they probably don't have any true knowledge. The only other form of knowledge is a citation.

Language and truth are orthogonal.

reply

nwienert | karma 4348 | avg karma 3.09 · 2023-02-15 14:26:03

Just like the model, you’re technically correct but missing the point. No one cares if it’s good at generating nonsense, so the metric were all measuring by is truth not language. At least if we’re staying on context here and debating the usefulness of these things in regards to search.

So as a product, that’s the game it’s playing and failing at. It’s unhelpfully pedantic to try and steer into technicalities.

reply

stonemetal12 | karma 1269 | avg karma 2.0 · 2023-02-15 15:30:07

>were all measuring by is truth not language.

If that is the measure you are using that's cool, but

>So as a product, that’s the game it’s playing and failing at.

It is failing that measure by such a wide margin that if "everyone" (certainly anyone at MS) was using that measure then the product wouldn't exist. The measure MS seems to be using is it entertaining and does it get people to visit the site. Heck this is probably the most I have heard about bing in at least 5 years.

reply

SergeAx | karma 2688 | avg karma 1.5 · 2023-02-15 18:01:05

I tell you more: language is an instrument of telling lies. Truth doesn't need to and actually cannot be spoken, it manifests itself as is. Lao Tzu: "He who knows, does not speak, and he who speaks does not know". Meaning: any truth put into words becomes a lie.

dralley | karma 13928 | avg karma 4.7 · 2023-02-15 15:54:52

Then maybe marketing it alongside a search engine is a bad idea?