I'm certain GPT-3 has been commenting on HN threads for a while now. In some cases, its presence has been disclosed (see, for example: https://news.ycombinator.com/item?id=23886503). In other cases, GPT-3's presence has not been disclosed; the machine has been pretending to be a human being, largely unnoticed. Consider how easy it is for it to write short, punchy comments -- say, one to three sentences long.
By implication, there's a high probability that we -- you, me, and everyone else on HN -- have been upvoting and downvoting GPT-3 comments for a while without realizing it.
Considering how great GPT-3 is at generating text (see: https://news.ycombinator.com/item?id=23885684), I thought maybe someone has already started generating HN comments and posting them.
1 - Did anyone try this without getting caught?
2 - If someone creates a bot that consistently contributes to discussions and stays within the rules, would the bot itself be against the rules? Should it be?
Note: This is not a project proposal, I'm just fascinated by GPT-3.
All the blogs posted by e.g. this user [0] were generated by GPT-3. [1] Some of those reached the top of HN.
That comment indeed looks a lot like it is generated. It has correlated a bunch of words, but it did not understand that the link between UI and AI is tenuous. It is probably one of the few comments where it is so glaringly obvious. There are likely a lot more comments around which are generated but which went unnoticed.
This comment is not generated, as the links below are dated after the GPT-3 dataset was scraped.
Yeah, bots - sure. GPT? No way. My rapid-classification pattern matcher, honed on two decades of running blogs and web bulletin boards, recognizes the shape, cadence and sound of these comments as indicative of the usual kind of spam comment, common in the last 15+ years.
I.e. they came out of some boring, old-school script.
(Though I do wonder, how much of the spam comments on random, long-forgotten wordpress blogs, ended up in the GPT-{3,4} training data.)
If you read too much of it and then go back to normal human forums, you'll wonder whether those are also generated by AI, or whether we're all just AI-generated bots on the internet :-O
Yes, which will produce things that are stylistically similar to HN comments, but without any connection to external reality beyond the training data and prompt.
That might provide believable comments, but not things likely to be treated as high-quality ones, and not virtual posters that respond well to things like moderation warnings from dang.
Honestly I'm wasting too much time here. I wish I could feed the bot with my beliefs (Electron bad, nuclear good, Windows 11 a train-wreck, AI a bubble, ...) and it would post for me as appropriate.
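A belief-conditioned bot like that is mostly a prompt-assembly problem. Here's a minimal sketch of the prompt-building half, with no real API assumed -- `build_prompt` and the belief list are hypothetical, and the actual posting/generation step is left out:

```python
def build_prompt(beliefs, thread_title):
    """Assemble a system prompt that bakes in fixed opinions (hypothetical sketch)."""
    stance = "\n".join(f"- {b}" for b in beliefs)
    return (
        "You are a Hacker News commenter. Always argue from these positions:\n"
        f"{stance}\n"
        f"Write a short comment for the thread: {thread_title!r}"
    )

# The commenter's stated beliefs, verbatim:
beliefs = ["Electron bad", "nuclear good", "Windows 11 a train-wreck", "AI a bubble"]
print(build_prompt(beliefs, "Electron app memory usage hits new highs"))
```

The output of this would then be sent to whatever text-generation backend you have access to; the template is the only part that encodes "my beliefs."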
The example comments you linked (where GPT-3's presence was disclosed) were believably human, particularly if you were skimming, but they were not good comments. If not for the note at the end about GPT-3, I'm pretty confident they would have been downvoted.
And if I'm wrong, and GPT-3 is actually capable of writing thoughtful and substantive comments... well, in the words of XKCD, "mission fucking accomplished."
Why do people care whether a human or a bot wrote a comment / drew a picture / developed software? The only relevant metric is the quality of the comment / picture / software, isn't it?
They all seem to be posting on the same three threads if you check their history. GPT or not, it seems trained or built to rephrase older HN comments (see: https://news.ycombinator.com/item?id=34910204).
Markov-chain generators are extremely lacking in long-term coherency. They rarely even make complete sentences, much less stay on topic! They were not convincing at all -- and many of the GPT-2 samples are as "human-like" as average internet comments.
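That lack of coherency falls straight out of the algorithm: each word is chosen only from what followed the previous word in the training text, so there is no memory beyond one step. A minimal bigram sketch (the tiny corpus here is a stand-in, not real training data):

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    model = defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, start, length=12, seed=0):
    """Walk the chain: each word depends only on the one before it."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: the last word never had a successor
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = ("the model writes comments and the comments read like spam "
          "and the model reads comments like a human writes spam")
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Every adjacent word pair in the output is locally plausible, but the sentence as a whole wanders, because nothing in the model ties word N back to word N-2.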
Conjecture: GPT-2 trained on reddit comments could pass a "comment turing test", where the average person couldn't distinguish whether a comment is bot or human with better than, say, 60% accuracy.