Google's penalty against The Online Slang Dictionary (onlineslangdictionary.com)
234 points by martin-t | 2020-10-18 23:47:57+00:00 | 75 comments




This needs to make it to the front page, especially as I have heard rumours over the last several years that these programs have grown in usage and that there is virtually nothing to be done about it.

Well, it's extremely interesting to have hard confirmation that they do this. I wonder what other sites have secret artificial boosts and penalties locked away in Google's systems somewhere, and for what reasons.

the Attention Economy is a cancer on mankind.

Decades from now, our children will look back on this time and wonder how we could be so stupid to think that all this software, costing billions to build, was being given to us for free by kind-hearted technology companies in exchange for a few simple ads. This is, and always has been, a devil's bargain.

I was there at the beginning and I wish more than anything we could have saved the web from becoming this mutated monstrosity that is hurting our civilization so badly.


> our children will look back on this time and wonder how we could be so stupid to think that all this software, costing billions to build, was being given to us for free by kind-hearted technology companies in exchange for a few simple ads.

What will they think instead?


I'd imagine they'll think the same thing, considering my nephew's computer is a school-provided Chromebook. They're growing up with all the things we've slowly slipped into and will see no problems with it.

I have a penalty on https://garyshood.com/rsclient/ which used to be the number one result for "autoclicker" several years running.

I think it was due to comment spam, since I didn't moderate the comments, and since then the invisible penalty has never gone away. I've given up on trying. People just have to search "garys autoclicker" if they want to find it now, or have it bookmarked. "autoclicker" is pushed all the way back to page 4 for my site now. On Bing I'm on page 2, behind basically autoclicker dot org and every other variation of the TLD.


Hey I remember using this. Good stuff. That VB6 icon brings back memories.

Good. Blatant cheating software. You're not even trying to hide it since you specifically talk about using it in RuneScape. I hope "this program has never been detected or banned by any site or game" is put there as a honeypot to get cheaters banned because it's clearly not true.

While I agree with you on the moral judgement of this software for a multiplayer game, the site has this to say a tiny bit further down:

> Q: Can I get banned for using this?

> A: Sure, but don't be stupid and let it alch for you at 1.6 seconds for 100 hours, and I'm sure you will be OK.


lol it’s not a honeypot. it’s just a way to avoid clicking the same location on your screen for 1.6 days in a row to get level 99 magic. it doesn’t even have movement capability, it just clicks the same spot at a set interval. the stuff that ranks above me is much better suited for cheating since it can actually macro.

A similar complaint: https://ungoogle.us/

This is one of many reasons that search needs more competition. Having one dominant search engine means that abuses towards site owners can't simply be remedied by being listed somewhere else. Yes, there are other search engines right now, but they either serve Google's results or Bing's, which means they're actually furthering this problem.

Full disclosure: I recently founded a search engine, and we serve our own index to fight problems exactly like this.

https://www.whize.co


Yo, I’m on an old SE and I can’t click the consent. It doesn’t scroll, and it’s off screen.

Alright, I just pushed a change and, for good measure, cleared our site caches, since we've had a few changes to that banner recently. Check back in a few minutes and you should be good.

I think you need some kind of message when it doesn't have any results. It took me a while to figure out that it wasn't broken due to uBlock Origin or something. [javascript graphing library] was the query (spent half the afternoon searching variations on that in DDG and Google). The results for just [javascript] are pretty funny.

Yeah, I think that's good feedback. We currently have 100M pages representing 40K sites, so we cover most of the most-visited sites right now, except FB, GitHub, and a few others, because of concerns we have over indexing some things -- we're working those out now, and then we'll have better results for you there.

Edit: Also, I just saw it -- that is actually hilarious -- we'll get that fixed.


I used to manage a fairly popular web site. About 30,000 unique visitors a day. Harmless content about a particular profession. Informative. Updated almost daily. Ranked #1 for dozens of terms for years and years.

Then one day Google decided it didn’t like the site and it disappeared from the results. You could do a site: search and the pages would come up, so they were indexed. But they just didn’t rank anymore. Webmaster Tools was zero help. Showed everything was normal. No errors. No oddities.

The obvious notion is that it got hacked or there was content spam or something like that, but there wasn’t. No commenting system to game. Pages were all static HTML with no ads. Not even JavaScript to screw with.

The regulars kept coming, but with no Google referrals, traffic dwindled. I eventually took it offline since it wasn’t fun anymore.

The Google giveth, and the Google taketh away.


Lately I've been thinking there's a market for an anti-SEO search engine. Basically, hide anything with ads and Google Analytics.

I don't think there is a market for something that hides at least half the internet (market meaning you can make money off it) but I think there may be people who would enjoy it.

There's https://millionshort.com/

> The search engine, which brands itself as “more of a discovery engine,” allows users to filter the top million websites on the internet out of their search, resulting in a unique set of results and placing an emphasis on content discovery. This approach to search is also designed to combat the impact that aggressive black and grey hat SEO practices have on mainstream search results.[0]

[0] https://en.wikipedia.org/wiki/Million_Short


This sounds like a great idea, and something I would have used if it worked with javascript disabled.

If the owners are around: would you consider adding no-JS support? I know it's an edge-case request, sorry to be that guy. But then again, "search without top results" is not exactly a concept that will bring you the average web user :)


While I think this particular stance is a bit extreme, I do agree that some of Google's larger signals have been heavily gamed, and even though they change them up every few years, they've had a hard time combating people who try to play the system. I very much think that looking into alternative ranking methods less subject to gaming is a good idea.

It's the case for pretty much all of the content that's online these days. Online reviews aren't really trustworthy, online news and social media aren't either, nor are search rankings.

I'm not sure there's a real solution, though; pretty much every mechanism for solving this creates far too many incentives to break it for traffic/clicks/purchases. There's just too much money involved, so the metrics get gamed.


I think that because there is so little competition in search in particular, they are able to rest on that. If you saw real competition in the space for users, I think it would at least get better.

Yuuuuuuup lol, under penalty for almost a decade now.

In my case I had a forum (remember those!) in support of my software product.

Bots would occasionally create accounts and post links to knockoff handbags and watches. I'd tolerate (and swiftly kill!) them because our users really loved having a place to meet. (This was back in 2011; ironically, reCAPTCHA landed in 2012.)

Unbeknownst to me those links were part of a larger spam network where thousands of low-quality links pointed back to my site, presumably to those fake accounts(?).

When the penalty hit, the process of trying to figure out what the heck went wrong and trying to do something about it was identical.

In short, I've been penalized out of existence because of an obvious and, in my humble opinion, easy-to-identify spam campaign. Sadly, Google placed the cleanup burden on me, and try as I did, nothing actually helped. The article's mention of "hidden" penalties feels... accurate.

I often tell folks that when you perform a Google search you're given worse results than you deserve. My site, and goodness knows how many others, have been placed so far below the fold that, if we're not outright killed, we never reach the users and potential we should.

It would be no biggie if the search market were more diverse; sadly, that is simply not the world we live in.


>when you perform a Google search you're given worse results than you deserve.

This may be the most relevant synopsis of Google search that has ever been crafted.


Amen.

Did you make the outbound links in forum posts nofollow? That should remove the incentive to use your forum in spam networks.
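For context, "nofollow" is just an attribute on the anchor tag (`<a href="…" rel="nofollow">`) that asks search engines not to pass ranking credit through the link. A minimal sketch of how forum software might tag user-posted links; the function name and the regex approach are illustrative only, since real forum engines use a proper HTML sanitizer:

```python
import re

def add_nofollow(html: str) -> str:
    """Add rel="nofollow" to every <a> tag that has no rel attribute.

    Simplified sketch: regexes are fragile on real-world HTML, so a
    production forum would use an HTML parser/sanitizer instead, and
    would merge "nofollow" into any existing rel value rather than
    skipping tags that already have one.
    """
    def fix(match: re.Match) -> str:
        tag = match.group(0)
        if "rel=" in tag:
            return tag  # leave tags with an existing rel attribute alone
        return tag[:-1] + ' rel="nofollow">'

    return re.sub(r"<a\b[^>]*>", fix, html)

print(add_nofollow('<p>Spam: <a href="http://example.com/bags">cheap bags</a></p>'))
# → <p>Spam: <a href="http://example.com/bags" rel="nofollow">cheap bags</a></p>
```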

This suggests spammers care about the quality of the links their bots post rather than the quantity. I suspect that isn't the case. Spam is always worthwhile to post: the forum might remove the nofollow in the future, a forum user might follow a link, and it's not worth the effort to check whether a forum is providing 'value' to the network. In other words, it's easier to spam everyone and hope some of it proves useful.

I think they meant specifically in the context of avoiding being penalised by Google; my admittedly lay understanding is that "nofollow" would've helped here -- the spammers would still spam of course, but the penalty would not be as severe? Or apply at all, perhaps? Please correct me if I'm misunderstanding though

This does not work. Many spammers are not exactly the most clever folks; often they neither check the effectiveness of their links nor whether the links are removed within a few hours anyway. Instead they simply point automated tools at your site, tools they often did not even create themselves.

In addition, there are a couple more reasons why spammers sometimes even actively chase nofollow links. For one, many people believe a certain amount of nofollow links is part of a "healthy" link profile: having 99% follow links might be considered a "bad signal" by Google, because nofollow is so common that you are expected to have many nofollow links.

I will not judge this theory, but the theory does exist and is followed by some people.

Plus, the fact that it seems Google simply started considering nofollow as a kind of hint, not a decision, did not help the spam situation either.

As to whether nofollow at least helps you avoid the penalty even if you are still spammed: nobody knows whether that works. Google is usually tight-lipped about what it does or does not do.


> Plus, the fact that it seems Google simply started considering nofollow as a kind of hint, not a decision, did not help the spam situation either.

When nofollow is used for all user-generated content, they kind of have to take it as no more than a suggestion. Ignoring UGC entirely when it comes to ranking would throw away too much of the web.


I can see how that sucked for you, but personally I don't want Google sending me to forums infested with spam bots. I don't agree with Google's hidden penalties -- I think transparency is crucial -- but the burden of cleaning up your spam-filled website was yours, and I'd expect every search engine to bury you until you managed to get it under control.

> you managed to get it under control.

He did, though. He cleaned up his website and deleted all the spam comments. How do you expect him to prevent other websites from linking to his?


Via Google's link disavowal tool.
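For anyone who hasn't used it: the disavow tool takes a plain-text file uploaded through Search Console, one URL or `domain:` entry per line, with `#` marking comments. A sketch of the documented format (the spam domains below are made up):

```
# spam network discovered in forum comments, 2012
domain:replica-handbags-example.com
domain:cheap-watches-example.net
http://spam-directory-example.org/links/page7.html
```

Google says it ignores the listed links when assessing your site, though as this thread shows, that doesn't always lift a penalty in practice.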

Great feedback but yeah, big picture we're talking pretty small numbers.

The 10 (or so) spam accounts posted about 30 times. The longest lasting survived a weekend, after which I enabled manual account activation. From start to finish the spam issue lasted around 3 weeks. Eventually, I got rid of the forum altogether.

Alas, I tried all the usual things, from the disavow tool to the Webmaster forums and dozens of site changes; sadly nothing helped. My penalty felt permanent back then, and still appears to be.

It may help to know that pre-penalty my rank was quite high: first page for most relevant search terms. (Immediately after, page 20 or lower; now around page 10.)

Ironically, while the rank was nice to have, I never actually did anything for it.

I simply built a fast, human first(!), site that inadvertently followed Google's site quality guidelines.

For example, my software generates web forms. One common growth tactic my competitors use is placing a link at the bottom of every form back to the parent site.

Me -- I never did that. I strongly felt that under no circumstances should the output of my software be used as a marketing tool. Sure, it may have harmed growth, but it felt right, and that was enough for me.

Years after my penalty I read just that: Google frowns upon, and may penalize sites for, using such "widely distributed site links".

My focus was and always has been on the user, so of course I'm the one who gets penalized lol.

Anyway, I think the most damning part of the process was not having a reasonable path to knowing what exactly happened and what I could do about it.

If I were crafting legislation, that's where I'd start. I can't help that Google has the market share it does, but it does mean we all have to play within their world.

All I ask is that the rules, wherever they are, be fairly, justly, and evenly applied.


Hey, I don't know much about SEO and have a question. Does a sub-domain (forum.example.com) vs. a sub-path (example.com/forum) make a difference? Nowadays, would it be better to isolate anything with user-generated content onto a completely separate domain (example.net)?

It doesn't look like there is any penalty against The Online Slang Dictionary. Google even one-boxes their site as the preferred answer for some searches!

For example:

https://www.google.com/search?q=slang+dictionary+%22as+much+...

https://www.google.com/search?q=slang+dictionary+%22squeege%...

For some queries, Online Slang Dictionary even ranks above sites like Stack Exchange or Urban Dictionary. (Usually Urban Dictionary isn't so reputable, but for this sort of query, it's probably pretty relevant.)

What's the evidence here -- some anonymous Google employee claimed the site was penalized back in 2013? That is not "hard confirmation" at all. Most likely, the site is not penalized at all. It just doesn't show up as often as Urban Dictionary and sites like english.stackexchange.com, and the operator is angry about that; but honestly, it just isn't quite as good as those sites.


You may have missed this part of the article:

“Frequently asked questions

I just did a Google search and your site appeared in the first few results. Does that mean that the penalty has been removed?”


The explanation of "Google Juice" given in the article is generally incorrect. Google very often does penalize sites by kicking them out of the index entirely. For example, see https://www.searchenginewatch.com/2013/08/01/what-is-pure-sp... .

For the queries I linked, I gave a specific example of oneboxes because this is a user interface that's specifically promoting a result as good. If there's any sort of penalty on a site, it's unlikely to be placed in such a UI.

At the end of the day, it isn't entirely visible to us what the Google algorithm is doing. Google does manually blacklist pages that Google considers to be spam - see that searchenginewatch article for some more information. But it doesn't appear that this is happening here.

The Online Slang Dictionary is not being unfairly penalized by Google. It's just not that great of a website.


Honestly, the whole thing reads as a conspiracy fed by an anonymous (outdated?) source, and the Wikileaks-esque drip of leaks "culminating with full headers!" doubles down on feeling more like a cry for attention/help than an actual argument against (or even proof of) any kind of manual penalty actually being applied.

I did come to the same conclusion, with the complaint itself as my sole source of information.

It is unfortunate that this type of argument gains a lot of popularity, even among technical readers.

Of course, it is also possible that the complaint is completely justified, and it's just a coincidence that it's written in this conspiracy/paranoia style. Among thousands of conspiracy theories, there are probably some that are actually true. Sorting through them is not something I am interested in doing, though.


You are looking for emotional excuses for Google’s uncompetitive behavior.

Yeah, if you hang around HN long enough you'll get the privilege of seeing the man himself lament the supposed penalty. It's gotten really old, and it's the same spiel every time.

At no point does he consider that there is no penalty, his site is just not as useful as competitors and that is why it does not rank highly.


Agreed. The whole "some pages are ranked lower than what they've earned on their merit" bit seems pretty vague and subjective. Even if you can prove there's a penalty against your website, I don't see any evidence that those penalties are invalid. I agree that it's unfair for Google not to disclose this stuff, but it seems to me that this guy is just hoping Google gets enough bad press that they work with him, which seems like a bad attitude to take when asking for help.

> fed by an anonymous (outdated?) source

The majority of Google employees don't work on search. Even fewer have access to webspam tools/code, for obvious reasons.


When you put "slang dictionary" in the query it's going to significantly boost The Online Slang Dictionary's results, because you've included part of the site's name in the query.

If you just search squeege, Urban Dictionary is #2 and The Online Slang Dictionary is not in the top 10. If you just search "as much use as a chocolate teapot", Urban Dictionary is #5 and Online Slang Dictionary is #9.


That seems fair. It wouldn't make sense for Google to generally rank The Online Slang Dictionary higher; Urban Dictionary is more popular, and algorithms like PageRank would naturally prefer Urban Dictionary. The Online Slang Dictionary ranks pretty well, just not as high as Urban Dictionary, and if you include part of its name it ranks higher; that seems like appropriate search results.

Also, citations would be outgoing references, and that is different from being linked TO. The in/out ratio was abused by SEO spammers, and as a result Google came to value sites with more inbound links. Every single word on UD has links to tweet/like/share. If this Walter guy added that feature and gave the site a few typographic finishing touches, who knows? Maybe the automatic penalty would be smaller without a 5700-out/??-in ratio?

> Urban Dictionary is more popular

I think this is exactly the point of contention: why is Urban Dictionary more popular? The author never says it outright, but appears to believe that it's because of the cumulative effect of this penalty over the last 8 years. This assumption is bolstered by the choice to include a photograph of the ex-googler and owner of Urban Dictionary posing with the person who ran the team instituting the penalties.


>of my conversation with the Google employee who told me in secret about the penalty.

I wonder if Google will be able to identify that employee.


In my eyes, headers such as "Matt Cutts lies" lose a lot of credibility for Walter (the author of the complaint).

IMHO, Matt Cutts (when he was at Google) had no incentive to lie to anyone, as he could simply have ignored any questions from the website owners who wanted his response.

The fact that Matt actually bothered responding suggests that he was trying to help. He may have failed to help because he just didn't care enough, or ran out of time, or didn't understand what was going on, etc. But referring to his attempt as "lies" tells me a lot about Walter. In particular, it tells me that Walter isn't the type of person I'd like to help. Why bother? If things don't go to his satisfaction, he might just call me a "liar" later.


Mmm, sure, but I think your comment is a little "missing the forest for the trees". Regardless of whether Walter is lashing out, the fact that this was one of the few avenues he had to get help is telling about Google. They don't care about their users, whether that's site owners, ad buyers, or searchers.

Walter should never have had to have that interaction in the first place, where Matt has to decide whether to help him; there should have been support avenues and no secret penalties to begin with. Blaming him is like blaming the abused.


The fact that it's hard to get Google customer service to respond is a well-known and unfortunate problem. That does not mean it's OK to call people who try to help "liars". I don't know how much Walter was "abused" by Google's bad customer service; maybe a lot, maybe not at all. But given that I only have Walter's side of the story, and I don't see him as a reliable or honest source of information, I choose to discard this complaint as "probably nothing serious".

Okay, when I see a large co vs. an independent site owner, I cut Google absolutely zero slack. They don't need the protection, and in my opinion, given the continuous string of recent revelations, they shouldn't be afforded the benefit of the doubt either. You're well within your rights to have your opinion, but I think it's the wrong one.

Yeah, every one of us has their own prior (in this case, on how likely it is that big corp is at fault versus a small business owner is at fault). We then update our priors based on our interpretation of available information.

Both of these steps are highly subjective. Especially the choice of the priors, since they are based on years of unique personal experiences and personal analysis of those experiences.

I would never attempt to change or challenge anyone's prior. I don't know whether human civilization will someday develop a way for people to converge on similar priors, but it's certainly not happening this century, so it won't be relevant to me.


> Matt Cutts (when he was at Google) had no incentive to lie to anyone

Umm. Yes, he had a paycheck and bonus package as incentive. Performance reviews too.

But as to Walter's attacks on Cutts: that's just someone not willing to put in the work of being a responsible adult. When it comes to Google search, you can't accidentally hurt yourself or your site unless you're a bull in a china shop. Google's search is really complicated and I can't generalize for his specific situation, but it's extremely likely that it was the OP's fault for using 'black hat' techniques (either directly or indirectly).


Accusations like that can be a bad look, but I can understand the frustration of having your website snuffed out by the whims of Google, with no way to figure out what happened and nobody who can tell you what really went wrong or how to fix it.

I even understand that there's a ton of actual spammers out there, and it's not scalable to tell each of them "yes, you're really a spammer, you deserve de-ranking, go away". Still, I'd like to see someone at least try to find a better way to fix both problems.


If we have to regulate the financial system, I don't see how we can't regulate information systems, at least to the point where search engines are required to provide reasoning and demonstrate that their policies are applied consistently. 'Search' is like 'banking' or 'electricity': it's just a core good. That, or do something to ensure competition.

Regardless of the particular complaint's merits, yeah I definitely agree -- though it will have some chilling effects on new companies entering the space, the sheer amount of power these information systems have amassed with precious little regulation is a bit disappointing.

AML and KYC-like regulations for the information monopolies. I think we're heading in that direction, seeing the backlash to Facebook et al.


I am sure we discussed this previously but I can’t find it. This grudge is as old as the hills and unsubstantiated.

There needs to be a whistleblower from Google about what's going on.

I strongly suspect that a few Googlers have too much time on their hands. This speaks volumes to Google’s eventual downfall that they can’t keep their employees busy doing more important work.

My second theory it’s tied to Google’s outsourced workforce in India that handles their advertising accounts and the advertising platform. I think those workers are penalizing the sites that don’t respond or buy more advertising. This workforce has too much access to try to manipulate advertising spend.

Also, everyone should know by now that Google counts on brands to buy advertising for their trademarks to keep them as the first result. Otherwise these company names show up as 2-10.

Google knows which sites should be #1 for any given term. These are manually curated.


> I am going to start releasing details of my email conversation with the Google employee, culminating in my release of an MBOX file including full headers.

That seems like a bad thing to do against the whistleblower, since according to the redacted excerpts the whistleblower was using their Google mail address.


This doesn't pass a sniff test:

- No details; promise to "leak" details later (but that never happened, and this page seems from ~2013?)

- Emotional language doesn't inspire confidence in reasonable and rational thinking about this, and it doesn't consider the possibility that Cutts was merely mistaken in his email, or mixed things up, or that there is something else not obvious that could explain the email.

- Calls the Google ranking algorithm a "penalty"; that it penalizes pages with quotations and the like is unfortunate, but lifting text from random websites is a technique spammers use frequently. This is a rather unnuanced way to describe it.

- Even the mysterious "whistleblower" email doesn't strike me as very convincing.

- Insinuation that the site is penalized just because Cutts has ties to UrbanDictionary; which strikes me as unlikely.

Are parts of it true? Probably. The "automated penalty" probably is, but I wouldn't phrase it like that myself. The rest? Meh.


> No details; promise to "leak" details later (but that never happened, and this page seems from ~2013?)

Last update was October 2020.

That he intends to release the whistle-blower mails with full headers suggests that he is confident they were sent from Google servers. We'll see, I guess.

Do agree that the language leans a bit too much in the emotional/conspiracy direction. Really doesn't help his case.


Add:

- no SSL because there was "no time"? It is trivial to enable SSL on such a simple website.

- the person seems to have no idea how SEO works; I see no mention of noindex, nofollow, disavow, or other common tools for fighting penalties.

I don't like Google at all (just look at my Twitter), but as arp242 said, this doesn't pass the sniff test at all.


It's not much of a dictionary if it doesn't even have "big brain". Rather than rag on it I should probably submit it though...as you were.

I am sympathetic, but blaming having no SSL on it being a "time suck" is weak imo.

I don't think this is necessarily a product of malfeasance or anything, but rather a lack of complete knowledge in certain areas.

Put another way: it sucks for this guy that Google seems not to know about dictionaries and the importance of citations, and that whoever placed the manual penalty (assuming there is one) isn't completely well versed in intellectual property law.

I suppose that, again, the solution would be more competition; it would encourage Google to improve on nuances like those.

