
Markov-chain generators are extremely lacking in long-term coherence. They rarely even produce complete sentences, much less stay on topic! They were not convincing at all, whereas many of the GPT-2 samples are as "human-like" as average internet comments.

Conjecture: GPT-2 trained on reddit comments could pass a "comment Turing test", where the average person couldn't distinguish whether a comment is bot or human with better than, say, 60% accuracy.




The other way to look at it is that HN comments are indistinguishable from GPT-3 generated sentences. Hell, even a standard Markov chain would suffice.

Was this comment written by a Markov chain? Subreddit Simulator?

So you're saying a GPT bot, instead of the human commenters, would make reddit more useful in the end?

I always like to point people to /r/SubSimulatorGPT2 [1] as a good example of what GPT2 is able to accomplish.

It's a subreddit filled entirely with bots; each user is trained on a specific subreddit's comments matching its username (so politicsGPT2Bot is trained on comments from the politics subreddit).

Go click through a few comment sections and see how mind-bendingly real some comment chains seem. They reply quoting other comments; they generate links entirely on their own (the links almost always go to a 404 page, but they're formatted so plausibly that I think they're real every time I hover over one); they have full conversations back and forth; they make jokes; they argue "opinions" (often across multiple comments, keeping track of which "side" each comment is on); and they vary from single-word comments to multi-paragraph ones.

Take a look at this thread [2] specifically. The headline is made up and the link it goes to is made up, but the comments look insanely real at first glance. Some of them even seem to quote the contents of the article (which, again, doesn't exist)!

If you threw something like 50% "real humans" in the mix, I genuinely don't think I'd be able to pick out the bots on my own.

[1] https://www.reddit.com/r/SubSimulatorGPT2/

[2] https://www.reddit.com/r/SubSimulatorGPT2/comments/fzwso5/nr...


I'm also wondering whether the 'handpick[ed]...to ensure a high coherence and high relevance' GPT-2 comments actually outperform the comparatively trivial sentence-spinning script in getting approved by MTurkers.

I think https://www.reddit.com/r/SubSimulatorGPT2/ is more impressive than a study where half of the GPT-2 comments handpicked for being human-like by one human were accepted by another human. Particularly given that some of the comments in question were three or four words long...


Yeah, bots? Sure. GPT? No way. My rapid-classification pattern matcher, honed by two decades of running blogs and web bulletin boards, recognizes the shape, cadence, and sound of these comments as indicative of the usual kind of spam comment that's been common for the last 15+ years.

I.e. they came out of some boring, old-school script.

(Though I do wonder how many of the spam comments on random, long-forgotten WordPress blogs ended up in the GPT-{3,4} training data.)


I wrote a Markov Chain HN comment generator a few months ago. I'd say a solid 70% of them were pretty funny.
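
For anyone curious, a word-level Markov chain like that fits in a few lines of Python. A minimal sketch of my own (not the commenter's actual code; corpus is assumed to be a list of comment strings):

    import random
    from collections import defaultdict

    def build_chain(corpus, order=2):
        # Map each `order`-word prefix to every word observed right after it.
        chain = defaultdict(list)
        for comment in corpus:
            words = comment.split()
            for i in range(len(words) - order):
                chain[tuple(words[i:i + order])].append(words[i + order])
        return chain

    def generate(chain, order=2, max_words=40):
        # Start from a random prefix, then take a frequency-weighted random walk.
        out = list(random.choice(list(chain)))
        for _ in range(max_words):
            followers = chain.get(tuple(out[-order:]))
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

Each transition looks locally plausible, but nothing ties word 30 back to word 1, which is exactly the long-term-coherence failure described at the top of the thread.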

> Can't be that hard to train GPT on HN comments.

Yes, which will produce things that are stylistically similar to HN comments, but without any connection to external reality beyond the training data and prompt.

That might provide believable comments, but not things likely to be treated as high-quality ones, and not virtual posters that respond well to things like moderation warnings from dang.


>A good old markov chain can simulate the average reddit thread

Someone did this on Reddit 9 years ago. It was remarkably good.

https://old.reddit.com/r/SubredditSimulator/comments/3g9ioz/


This reads like a Markov chain trained on HN comments.

These aren't exactly bot submissions, and the process is not really scalable:

> To quickly weed out inappropriate comments, I handpick from generated comments those that ensure a high coherence and high relevance sample for submission.

So basically it's a validation that GPT-2 makes sense in small amounts of text. Judging from the demo test page, the texts are pretty good, but he himself said that larger texts betray the bot. So I'm not sure what he's trying to prove by using MTurkers, since this doesn't attack the problem mentioned in his introduction: the fake FCC comments were weeded out through text analysis, not via human work.

In all, I'm not sure this is something people didn't already know about GPT-2. The title is certainly not justified; perhaps "Curated bot comments can't be distinguished by humans as obviously fake" would be better, but also more banal.


It puts every comment through that GPT output detector, then colors the HN comment and annotates it with a short note based on a threshold, like you see in the screenshot: >0.7 is probably AI, >0.9 is definitely AI, and anything lower is most likely human. Most comments still appear to be human.

It only becomes reliable after about 50 tokens (one token is around 4 characters), so I mark the comments that are too short in gray and make no assessment of those.

I've put it on https://github.com/chryzsh/GPTCommentDetector
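
For context, the thresholding described above boils down to something like this (a sketch reconstructed from the comment, not code from that repo; the gray-for-short rule and the ~4-characters-per-token heuristic are from the comment, while the specific colors for the other buckets are my guesses):

    def estimate_tokens(text):
        # Rough rule of thumb from the comment: one token is around 4 characters.
        return len(text) // 4

    def label(text, prob_ai):
        # prob_ai: the detector's probability that the text is machine-generated.
        if estimate_tokens(text) < 50:
            return "gray"    # too short, detector unreliable: no assessment
        if prob_ai > 0.9:
            return "red"     # definitely AI
        if prob_ai > 0.7:
            return "orange"  # probably AI
        return "green"       # most likely human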


Can't be that hard to train GPT on HN comments. Plus, anyone who could do it probably knows about HN.

I could definitely see someone already having trained a bot to write HN comments and posting them.

What's anyone going to do about it? It's super hard to write a discriminator that works well enough to not destroy the site for everyone.
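
For what it's worth, the generation side really isn't hard. A minimal fine-tuning sketch with Hugging Face transformers might look like the following (hn_comments.txt, one scraped comment per line, is a hypothetical input, and the hyperparameters are placeholders):

    from datasets import load_dataset
    from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                              GPT2TokenizerFast, Trainer, TrainingArguments)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # hn_comments.txt is hypothetical: one scraped HN comment per line.
    dataset = load_dataset("text", data_files={"train": "hn_comments.txt"})
    tokenized = dataset["train"].map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True,
        remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="hn-gpt2", num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=tokenized,
        # mlm=False means causal-LM training: labels are copies of the inputs.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The hard part, as the comment above notes, is the discriminator: any detector with a non-trivial false-positive rate would end up flagging real users.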


>(Aside: The autogenerated spam comments there are also strangely interesting - they sound almost poetic.)

Given the right corpus and parameters, Markov chains can do a surprisingly scary job of producing content that seems profound and/or humorous.


I think if we ran the GPT2 spotting model against anyone's comment history we'd find a couple comments that scored really high. I doubt it's actually related to your writing style.

An alternative hypothesis is that it's not about how you're writing but where. HN appears to have been used to train GPT-3.5. I don't know whether it was used to train GPT-2, but it might have been. So your comments might be in the training set.


Funny that the GPT detector is giving 75-99 percent AI probability for that comment.

I've trained a Transformer encoder-decoder model (this was slightly before GPT2 came out) to generate HN comments from titles. There is a demo running at https://hncynic.leod.org

Did you generate this comment using social-comments-GPT? https://social-comments-gpt.com/

Just because someone's post is clear, well-structured, and includes a final recap doesn't mean it was AI-generated.

Then again, maybe I'm a cat using CatGPT.


This is a very valid concern, and one that has been raised before on this platform. With the recent advancements in AI and language models, it's becoming increasingly difficult to distinguish between comments generated by machines and those written by humans. This has the potential to undermine the authenticity and credibility of discussions on online forums like Hacker News.

----

The above was generated by ChatGPT.

