
It will certainly decrease. Also, there are multiple ways to deal with hallucinations. You can sample GPT-4 not once, but 10, 100, or 1000 times. The chance of it hallucinating the same thing every time asymptotically approaches 0. It all depends on how much money you are willing to invest in getting the right opinion, which in the field of medicine can be quite a lot.
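The sampling argument above can be sketched in a few lines of Python. This is a toy model, assuming hallucinations on a given question are independent across samples (a strong assumption; real LLM errors are often correlated), with a simple majority vote over the sampled answers:

```python
from collections import Counter

def all_hallucinate_prob(p: float, n: int) -> float:
    """Probability that all n independent samples hallucinate,
    given a per-sample hallucination probability p."""
    return p ** n

def majority_vote(answers):
    """Pick the most common answer among repeated samples
    (the basic 'self-consistency' trick)."""
    return Counter(answers).most_common(1)[0][0]

# Under the independence assumption, the failure probability
# shrinks geometrically with the number of samples:
for n in (1, 10, 100, 1000):
    print(n, all_hallucinate_prob(0.1, n))
```

In practice the independent-samples assumption is the weak point: a model that is systematically wrong about a fact will be wrong in most samples, and no amount of voting fixes that.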



> You can sample GPT-4 not once, but 10, 100, 1000 times.

Is there a study on improved outcomes from simple repeated sampling of GPT-4? I would be very interested in that study. I don't think GPT hallucinations are like human hallucinations, where if you ask someone again after a temporary hallucination they might get it right the other 9 times, but I could be wrong. That would be an interesting result.


I don’t have hard numbers, but anecdotally hallucination has gone down significantly with GPT-4. It certainly still happens, though.

Try paying for GPT-4 - it barely hallucinates at all, at least as far as I've noticed.

GPT-4 hallucinates a lot less than 3.5. Same with the Claude models. This is from personal experience. There are also benchmarks (like TruthfulQA) that try to measure hallucinations and show the same thing.

The technical report[1] makes that claim at least:

> GPT-4 significantly reduces hallucinations relative to previous GPT-3.5 models (which have themselves been improving with continued iteration). GPT-4 scores 19 percentage points higher than our latest GPT-3.5 on our internal, adversarially-designed factuality evaluations.

[1] https://arxiv.org/abs/2303.08774 (text from page 10)


I get maybe one hallucination per twenty chats with gpt4.

Does GPT-4 really never hallucinate?

This is the first time I've personally heard someone claiming anything like that, though I don't tend to do anything with LLMs (due to this bullshit factor).


Do you think hallucinations will be solved with GPT-5? If so, that would be an amazing breakthrough. If not, it still won't be suitable for medical advice.

Idk how you're prompting, but hallucinations are rare for me, even with graduate-level material.

There's a reason GPT-4 scores so highly on advanced exams like the USMLE.


What is the hallucination rate of, for example, a Llama3 or GPT4?

I'm curious if you're using GPT-4 ($)? I find a lot of the criticisms about hallucination come from users who aren't, and my experience with GPT-4 is it's far less likely to make stuff up. Does it know all the answers, certainly not, but it's self-aware enough to say sorry I don't know instead of making a wild guess.

Haha, asking ChatGPT surely won't work. Everything can "feel" like a halting problem if you demand perfect results with zero error while uncertain and ambiguous new data keeps arriving.

My take: hallucinations can never be reduced to a perfect zero, but they can be reduced to a point where these systems hallucinate less than humans 99.99% of the time, and more often than not their divergences will turn out to be creative thought experiments (which I term healthy imagination). If it hallucinates less than a top human does, I say we win :)


True, "amount of hallucination" (very confident, but factually wrong) is probably something they can decrease in the next versions tho.

I also would not trust it with anything important, but there can be good applications for something that works 9/10 times.


My experience has been that hallucinations in GPT-4 are actually pretty rare. But in any case, if I choose to use code it suggests, I ask it for explanations and then verify those myself by other means, e.g. tests. I think it's too strong to call it a really bad TA. I'd say it's an imperfect TA: you need to check its work, but its work still has great value.

The issue of hallucinations is overblown. I use GPT4 all the time and don't see any hallucinations at all. It's a big problem with Google BARD and GPT3 and earlier models. But GPT4 fixed the issue of hallucinations completely.

I honestly haven't found hallucination to be a problem on GPT-4 when asking it to analyze or parse a dataset but can acknowledge it being possible (I just haven't encountered it).

I think that if we consider the accuracy rate as measured in various ways being roughly that of a human, then you're trading human mistakes for AI mistakes in exchange for dramatically lower costs and a dramatically higher speed of processing. You might even say a higher level of reasoning. In my own interactions it's been fantastic at reasoning clearly and quickly outside of complex trick questions. Most scenarios in life aren't generally full of trick questions.


I've done testing here and I have seen hallucinations throughout. It’s not that reliable.

Hallucinations are a feature, not a bug. GPT pre-training teaches the model to always produce an answer, even when it has little or no relevant training experience, and on average it does well at that. Part of the point of RLHF in ChatGPT was to teach the model not to hallucinate when it doesn't have good supporting experience encoded in its weights. This helps but is not perfect. However, it seems like there might be a path to far fewer hallucinations with more RL training data. As others pointed out, humans hallucinate all the time; we just have better training for what level of hallucination is appropriate given the supporting evidence and context.

Are we going to have to have this discourse for every single instance of GPT doing some semi-novel harm via hallucination?

(Probably, I suppose.)

