
I think any article on design could have its entire HN comment thread written by bots. Not even GPT-3; maybe GPT-2 would do.



This is only the tip of the iceberg.

I'm certain GPT-3 has been commenting on HN threads for a while now. In some cases, its presence has been disclosed (see, for example: https://news.ycombinator.com/item?id=23886503). In other cases, GPT-3's presence has not been disclosed; the machine has been pretending to be a human being, largely unnoticed. Consider only how easy it is for it to write short, punchy comments -- say, one to three sentences long.

By implication, there's a high probability that we -- you, me, and everyone else on HN -- have been upvoting and downvoting GPT-3 comments for a while without realizing it.

And the technology is only going to get better.


Considering how great GPT-3 is at generating text (see: https://news.ycombinator.com/item?id=23885684), I thought maybe someone already started generating HN comments and posting them.

1. Did anyone try this without getting caught?

2. If someone creates a bot that always manages to contribute to discussions and stays within the rules, would the bot itself be against the rules? Should it be?

Note: This is not a project proposal, I'm just fascinated by GPT-3.

Note2: No, I'm not a bot.


All the blogs posted by e.g. this user [0] were generated by GPT-3. [1] Some of those reached the top of HN.

That comment does indeed look generated. It has strung together a bunch of correlated words, but it did not understand that the link between UI and AI is tenuous. It is probably one of the few comments where this is so glaringly obvious; there are likely many more generated comments around that went unnoticed.

This comment is not generated, as the links below are dated after the GPT-3 dataset was scraped.

[0] https://news.ycombinator.com/submitted?id=adolos

[1] https://adolos.substack.com/p/what-i-would-do-with-gpt-3-if-...


It can't be that hard to train GPT on HN comments. And anyone capable of doing it probably knows about HN.

I could definitely see someone already having trained a bot to write HN comments and posting them.

What's anyone going to do about it? It's super hard to write a discriminator that works well enough to not destroy the site for everyone.
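To make the difficulty concrete, here is a toy sketch of one naive discriminator approach: score a comment by how "typical" its word choices are under a unigram reference model. The reference corpus and probe sentences below are invented for illustration; the point is that fluent bot text and fluent human text score about the same, so surface statistics alone give little signal.

```python
# Toy sketch of why a GPT discriminator built on surface statistics is weak.
# Scores text by the mean log-probability of its words under a tiny unigram
# "reference" model (add-one smoothing). All corpora here are invented.
import math
from collections import Counter

reference = (
    "the a to of and is it this that for on with you not have "
    "comment article point great interesting think really just"
).split()
freq = Counter(reference)
total = sum(freq.values())
vocab = len(freq)

def typicality(text: str) -> float:
    """Mean log-probability per word; higher = more 'typical' word choice."""
    words = text.lower().split()
    return sum(
        math.log((freq[w] + 1) / (total + vocab)) for w in words
    ) / len(words)

human = typicality("I think this is a really interesting point")
bot = typicality("This is a great and interesting article")
print(human, bot)  # the two scores are close: unigram stats can't separate them
```

A real discriminator would need far deeper features, and even then false positives against bland-but-human comments are exactly what would "destroy the site for everyone."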


So a GPT bot instead of the human commenters would make Reddit more useful in the end; is that what you're saying?

Yeah, bots - sure. GPT? No way. My rapid-classification pattern matcher, honed on two decades of running blogs and web bulletin boards, recognizes the shape, cadence and sound of these comments as indicative of the usual kind of spam comment, common in the last 15+ years.

I.e. they came out of some boring, old-school script.

(Though I do wonder, how much of the spam comments on random, long-forgotten wordpress blogs, ended up in the GPT-{3,4} training data.)


Yea, same thing happens after reading the subreddit where all posts and comments are generated by GPT-2: https://www.reddit.com/r/SubSimulatorGPT2/

If you read it too much and then go back to normal human forums, you'll wonder whether they're also AI-generated, or whether we're all just AI-generated bots on the internet :-O


> Can't be that hard to train GPT on HN comments.

Yes, which will produce things that are stylistically similar to HN comments, but without any connection to external reality beyond the training data and prompt.

That might provide believable comments, but not things likely to be treated as high-quality ones, and not virtual posters that respond well to things like moderation warnings from dang.


Now every AI article has a comment from someone being suspicious that a comment was generated.

[This comment, like every other comment on HN, was generated by GPT-3. You're the only human here.]


Edit 2: After reading all the comments by u/thegentlemetre, I've become suspicious of all comments that sound like GPT-3 even here on HN.

Example: https://news.ycombinator.com/item?id=24711877

I had to go through the writer's history to convince myself that they are human.

We are in for a very interesting decade ahead of us.


Does anyone have a GPT bot for HN commenting?

Honestly I'm wasting too much time here. I wish I could feed the bot with my beliefs (Electron bad, nuclear good, Windows 11 a train-wreck, AI a bubble, ...) and it would post for me as appropriate.


> it superficially looks like a real... but it doesn't make much sense if you examine it closely.

In that case I think GPT-3 has been writing 80% of the comments on the interwebs since 2010 or so.


Two posts per hour, and deep comments that turn into word salad under close examination? Could be a GPT bot as well.

I'm reminded of https://xkcd.com/810/.

The example comments you linked (where GPT-3's presence was disclosed) were believably human, particularly if you were skimming, but they were not good comments. If not for the note at the end about GPT-3, I'm pretty confident they would have been downvoted.

And if I'm wrong, and GPT-3 is actually capable of writing thoughtful and substantive comments... well, in the words of XKCD, "mission fucking accomplished."


>Who says, that your comment isn't a bot already?

Why do people care whether a human or a bot wrote a comment / drew a picture / developed software? The only relevant metric is the quality of the comment / picture / software, isn't it?

https://xkcd.com/810/


Here's a subreddit where all posts and comments are made by a set of GPT-2 bots trained on different subreddits: https://www.reddit.com/r/SubSimulatorGPT2/comments/btfhks/wh...

It's very impressive


They all seem to be posting in the same three threads if you check their history. GPT or not, the accounts seem trained or made to rephrase older HN comments (see: https://news.ycombinator.com/item?id=34910204).

The less said, the easier it is for a language model to approximate a useful thread comment for the purposes of mass propaganda.

“These graphics look terrible. I will never play this game.”

“$Candidate is a corporate shill and everyone knows it.”

“I can’t wait for $Artist’s next album! They’re sooooo good!”

It doesn’t need to be an extensive, well-thought-out comment to drive thought and discourse. GPT-2 is good enough for that.
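In fact, comments that short don't even need a language model. A minimal sketch of templated mass generation, with placeholders and fill-ins invented for illustration:

```python
# Toy sketch: low-effort astroturf comments via string templates.
# No language model involved; templates and fill-ins are invented.
import random

templates = [
    "These graphics look terrible. I will never play {game}.",
    "{candidate} is a corporate shill and everyone knows it.",
    "I can't wait for {artist}'s next album! They're sooooo good!",
]

fills = {
    "game": ["this game", "that sequel"],
    "candidate": ["Senator Example", "Governor Placeholder"],
    "artist": ["Some Band", "DJ Anybody"],
}

def spam_comment(rng: random.Random) -> str:
    """Pick a template and fill its placeholders with random choices.
    str.format ignores keyword arguments the template doesn't use."""
    template = rng.choice(templates)
    return template.format(**{k: rng.choice(v) for k, v in fills.items()})

print(spam_comment(random.Random(0)))
```

A model like GPT-2 just adds enough surface variation that the output no longer looks templated.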


Markov-chain generators are extremely lacking in long-term coherency. They rarely even produce complete sentences, much less stay on topic! They were not convincing at all, whereas many of the GPT-2 samples are as "human-like" as the average internet comment.

Conjecture: GPT-2 trained on reddit comments could pass a "comment turing test", where the average person couldn't distinguish whether a comment is bot or human with better than, say, 60% accuracy.
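The coherency gap is easy to see in code. A word-level Markov chain conditions only on the previous word, so it drifts off topic within a sentence; the tiny training corpus below is invented for illustration.

```python
# Minimal word-level Markov chain generator, to show why such generators
# lack long-term coherency: each word depends only on the one before it.
import random
from collections import defaultdict

def train(text: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain: dict, start: str, length: int, rng: random.Random) -> str:
    """Random-walk the chain from `start`, stopping at a dead end."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = (
    "the model writes comments and the comments look human "
    "and the model never checks facts"
)
chain = train(corpus)
print(generate(chain, "the", 10, random.Random(0)))
```

A transformer like GPT-2, by contrast, conditions on hundreds of previous tokens, which is exactly what keeps its output on topic long enough to pass a casual read.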

