Honestly I'm wasting too much time here. I wish I could feed the bot with my beliefs (Electron bad, nuclear good, Windows 11 a train-wreck, AI a bubble, ...) and it would post for me as appropriate.
Considering how great GPT-3 is at generating text (see: https://news.ycombinator.com/item?id=23885684), I thought maybe someone already started generating HN comments and posting them.
1 - Did anyone try this without getting caught?
2 - If someone creates a bot that always manages to contribute to discussions and keeps it within the rules, would that be against the rules itself? Should it be?
Note: This is not a project proposal, I'm just fascinated by GPT-3.
I'm certain GPT-3 has been commenting on HN threads for a while now. In some cases, its presence has been disclosed (see, for example: https://news.ycombinator.com/item?id=23886503). In other cases, GPT-3's presence has not been disclosed; the machine has been pretending to be a human being, largely unnoticed. Consider how easy it is for it to write short, punchy comments -- say, one to three sentences long.
By implication, there's a high probability that we -- you, me, and everyone else on HN -- have been upvoting and downvoting GPT-3 comments for a while without realizing it.
With a proper AI system you don’t even need to specify the exact article and nature of the comment.
For example here’s the prompt I use to generate all my HN comments:
“The purpose of this task is to subtly promote my professional brand and gain karma points on Hacker News. Based on what you know about my personal history and my obsessions and limitations, write comments on all HN front page articles where you believe upvotes can be maximized. Make sure to insert enough factual errors and awkward personal details to maintain plausibility. Report back when you’ve reached 50k karma.”
Working fine on GPT-5 so far. My… I mean, its 8M context window surely helps to keep the comments consistent.
I wonder if we could train something that would automatically generate typical comments on HN threads.
- This is EEE!
- No it's not, there is no third E
- Well, back in the 90s such and such happened
- Microsoft has changed! Look at VS Code!
- Electron is slow.
- Buy a better machine.
- what about the millions who can't afford a better machine?
- Electron can be good, just code better.
- but muh privacy
- exactly, VS Code is going to take over the world with its knowledge about your coding habits
- It's all about Azure nowadays, anyway.
- Azure has terrible customer support, we should use bare metal.
I've trained a Transformer encoder-decoder model (this was slightly before GPT-2 came out) to generate HN comments from titles. There is a demo running at https://hncynic.leod.org
I want a GPT-3 trained on HN - if it can look at your title and generate basically your content/comment, don’t post it. Similarly, if it can look at the comment you’re replying to and generate essentially your reply (let’s say in top-5 most probable), don’t post it.
Edit, love the irony that both responses are about GPT. Point proven?
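The "top-5 most probable" rule could be sketched roughly like this, with a toy bigram table standing in for GPT-3 (the corpus, the bigram model, and k=5 are all placeholder assumptions for illustration; a real filter would score the reply against an actual LM's token probabilities):

```python
from collections import Counter, defaultdict

def top_k_table(corpus, k=5):
    """For each word in the corpus, the k most frequent next words."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return {w: [nxt for nxt, _ in c.most_common(k)] for w, c in counts.items()}

def too_predictable(reply, table):
    """True if every word of the reply is among the model's top-k
    predictions given the previous word -- i.e. the model could have
    written it, so by the proposed rule you shouldn't post it."""
    words = reply.split()
    return all(nxt in table.get(cur, []) for cur, nxt in zip(words, words[1:]))

# Toy usage: train on past comments, then screen a draft reply.
table = top_k_table("electron is slow electron is bad electron is slow")
print(too_predictable("electron is slow", table))   # fully predictable
print(too_predictable("electron is great", table))  # contains a surprise
```

The same idea scales up by replacing the bigram table with per-token top-k checks against a real language model's logits.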
Nice. I just sort of assumed early on my comments were training some future AI, and I hope that in some small way I have been able to moderate some of its stupider urges.
A version where you can turn knobs of flavored contributors would be pretty funny. I know my comment style is easily identifiable and reproducible, and it encodes a certain type of logical conjugation, albeit biased with some principles and trigger topics, and I think there is enough material on HN that there may be such a thing as a distinct, motohagiographic lens. :)
Yes, which will produce things that are stylistically similar to HN comments, but without any connection to external reality beyond the training data and prompt.
That might provide believable comments, but not things likely to be treated as high-quality ones, and not virtual posters that respond well to things like moderation warnings from dang.
The other way to look at it is that HN comments are indistinguishable from GPT-3 generated sentences. Hell, even a standard Markov chain would suffice.
Markov-chain generators are extremely lacking in long-term coherency. They rarely even make complete sentences, much less stay on topic! They were not convincing at all -- and many of the GPT-2 samples are as "human-like" as average internet comments.
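For reference, a word-level Markov chain is just a table of observed next words, and each step conditions only on the previous word -- which is exactly why its output drifts off topic after a few words. A minimal sketch (the one-line corpus and order-1 chain are made up for illustration):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, length, rng):
    """Random walk over the chain: each step sees only the previous word,
    so there is no mechanism for staying on topic."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "electron is slow and electron is everywhere and azure is fine"
chain = build_chain(corpus)
print(generate(chain, "electron", 8, random.Random(0)))
```

Every adjacent word pair in the output occurs somewhere in the corpus, yet the whole string has no global structure -- the contrast with transformer samples is the point being made above.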
Conjecture: GPT-2 trained on reddit comments could pass a "comment turing test", where the average person couldn't distinguish whether a comment is bot or human with better than, say, 60% accuracy.