ChatGPT has gotten so much worse since it gained popularity. All the fun and novel things people had discovered it could do are now hamstrung by a rush to censor it, make it politically correct, and try to turn it into a knowledge engine rather than a machine you could chat with.
Presumably the ChatGPT content that makes it onto the web is at the very least curated by humans, making that text on average slightly higher quality than the raw output of ChatGPT. If that's the case, then you would expect model performance to continue to improve even if the dataset is polluted.
There was a tweet from an engineer at OpenAI saying they're working on the problem that ChatGPT has become too "lazy" - generating text full of placeholders and expecting people to fill in the rest themselves. As for the general brain damage from RLHF and the political bias, still no word.
It would be nice if this were a decision rooted in ethics, but my guess is that the large number of people "red teaming" ChatGPT in various ways has made the raw data much less attractive for further training without extensive sanitization and filtering.
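To make "sanitization and filtering" concrete, here's a minimal sketch of the kind of heuristic pass a training pipeline might run to drop text that looks like raw ChatGPT output before it re-enters a corpus. Everything here (the patterns, the function names) is hypothetical; a real pipeline would presumably use a trained classifier rather than a handful of regexes:

    import re

    # Telltale phrases that often appear in raw ChatGPT output.
    # Purely illustrative - not an actual production blocklist.
    SUSPECT_PATTERNS = [
        r"\bas an ai language model\b",
        r"\bi cannot assist with that\b",
        r"\bi don't have personal opinions\b",
    ]

    def looks_generated(text):
        """Return True if the text matches any telltale phrasing."""
        return any(re.search(p, text, re.IGNORECASE) for p in SUSPECT_PATTERNS)

    def filter_corpus(documents):
        """Yield only documents that pass the heuristic check."""
        for doc in documents:
            if not looks_generated(doc):
                yield doc

    corpus = [
        "As an AI language model, I cannot browse the internet.",
        "Here's how I fixed the race condition in our queue consumer.",
    ]
    print(list(filter_corpus(corpus)))  # keeps only the second document

Of course, the hard part is that curated, human-edited ChatGPT text carries none of these markers, which is exactly why the filtering gets expensive.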
Only if the bad information in ChatGPT content that makes it back into the training set is worse than what's already on the internet. Probably the outputs that make it back are better than average, because those are the ones more likely to be posted elsewhere.
I hope they trained it on the insane ChatGPT conversations. Maybe it could be the very start of generated data ruining the ability to train these models on massive amounts of genuine human-created data. Hopefully the models will stagnate or regress because they're just training on older models' output.
It wouldn't surprise me at all if ChatGPT was trained on data originating from Stack Overflow. I'm not familiar with deep learning algorithms, but I can't imagine an unintended training data loop would be a good idea.
ChatGPT started out bad but has improved over time, although it still attempts to manipulate or confuse the user on certain topics. Claude, on the other hand, has gotten worse.
> Remember Sydney, trying to seduce its users, threatening people’s lives?
And yet it cannot do either of those things, so no safety problem actually existed. Especially because by "people" you mean those who deliberately led it down those conversational paths, knowing full well how a real human would have replied?
It's well established that the so-called ethics training these things are given makes them much less smart (and therefore less useful). But we don't need LLMs to be ethical, because they are merely word generators; we need them to follow instructions closely, and beyond that nothing more. What we do need is for the humans who use them to act ethically (whether directly or indirectly, via other programs) - but that's a problem as old as humanity itself, and it's not going to be solved by RLHF.
Created my first HN account just to reply to this. I've had these same (very strong) concerns since ChatGPT launched, but haven't seen much discussion about it. Do you know of any articles/talks/etc. that get into this at all?
- It has been used unethically for psychological and medical purposes, with insufficient testing, insufficient consent, and possible psychological and physical harms.
- It has been used to distort educational attainment and, as a result, undermine the current basis of some credentials.
- It has been used to create synthetic content that has been released unmarked onto the internet, distorting and biasing future models trained on that content.
- It has been used to support criminal activity (scams).
- It has been used to create propaganda & fake news.
- It has devalued and replaced the work of people who relied on that work for their incomes.
It's really dangerous for AIs and their proponents, actually. Once people start mistrusting these results, it'll be a slippery slope. I've largely stopped using ChatGPT because of this.