
I think the problem is just that the solution isn't in Google's wheelhouse: There is no algorithmic ranking system that can't be gamed. Human moderation and curation is the only way to provide true quality, and Google is allergic to solutions that don't automate and scale.

I think a really good search engine would still algorithmically search its index, but the content library should be human-curated with a goal of ingesting content via author, not via platform. Once a given author was human-approved as a quality source of information, content they produce could be automatically ingested going forward, and conditionally re-reviewed by a human if there were reports the quality had decreased.
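To make that workflow concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: the `CuratedIndex` class, its method names, and the `report_threshold` value are my own illustration, and the substring search is only a stand-in for real algorithmic search over the curated library.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedIndex:
    """Toy model of an author-curated search index (all names hypothetical)."""

    # Assumed number of quality reports that sends an author back to human review.
    report_threshold: int = 3
    approved: set = field(default_factory=set)
    review_queue: list = field(default_factory=list)
    reports: dict = field(default_factory=dict)
    documents: list = field(default_factory=list)

    def approve(self, author: str) -> None:
        """A human reviewer marks an author as a quality source."""
        self.approved.add(author)
        self.reports[author] = 0
        if author in self.review_queue:
            self.review_queue.remove(author)

    def submit(self, author: str, text: str) -> bool:
        """Ingest automatically if the author is approved, else queue for review."""
        if author in self.approved:
            self.documents.append((author, text))
            return True
        if author not in self.review_queue:
            self.review_queue.append(author)
        return False

    def report(self, author: str) -> None:
        """Record a quality complaint; enough of them trigger a human re-review."""
        if author not in self.approved:
            return
        self.reports[author] = self.reports.get(author, 0) + 1
        if self.reports[author] >= self.report_threshold:
            self.approved.discard(author)
            self.review_queue.append(author)

    def search(self, term: str) -> list:
        """Stand-in for algorithmic search: case-insensitive substring match."""
        return [text for _, text in self.documents if term.lower() in text.lower()]
```

The point of the design is that the expensive human step happens once per author, not once per page, and only recurs when reports pile up.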




Imo the problem mainly lies in Google's court. SEO tools exist because it's possible to game search results in the first place. It's inevitable that companies will offer tools and services to do so when there's such a huge market for it.

Google has gotten better at telling crap from quality content, but they still have a long way to go, and I'm unsure if it can be fully fixed given the way search engines currently work.


Disclaimer: I work for Google. I do not work in search quality (or on search at all). Below are strictly my personal views.

I can't (and won't) comment on the specifics of this change. In fact, my knowledge of them is pretty much what everyone else's is.

What I will say is that in the last few months there have been several stories about the quality of Google search results (e.g. scraper sites ranking higher than the original).

The problem I see with the criticism leveled in those episodes (and this one too) is the implicit premise that Google's search results and algorithm are static. This most recent change should be evidence that they are not.

So while Google's search is algorithmic the people who are in charge of it are not. To put it another way: if you try and game Google's system, it will possibly work for a time but at some point, when the problem is viewed as being of sufficient severity to warrant attention, that algorithm will change.

Search, as I see it, is an arms race. SEO, particularly black hat SEO, is on the other side of that. But this isn't as simple as SEO. The world changes over time too. New business models form. New memes come into existence (eg the idea of social search).

So let's assume for a second the OP's argument is sound and that Google has merely killed off Demand Media's competition. If true, there are now a lot fewer content farms on highly ranked pages than there used to be. Sounds like a win to me. Is it a perfect solution? No. But is it better? Absolutely.

Google's mission is to deliver quality content to its users. The more people use our properties, the more money we make. We are very focused on the user experience. Gaming our system is, at best, a short-term proposition, as there are an awful lot of bright and talented people here constantly striving to defeat such attempts.


I've always been a critic of the algorithm over curation for search. Cheap, low-effort content has always been orders of magnitude easier to generate than real quality content. The only way to prevent declines of the real content is to put a wall (curation) around serious information providers so they cannot be gamed out of the way by bad actors.

I wish.

It's not because they are Google, it's because their algorithm is so good that any alternative to Google search that comes along will need to behave the same way.

I deal with content on a daily basis. Most of the challenge in writing content is making it satisfy the more than 200 factors Google uses in its algorithm to determine the quality of that content.

You can think about it as a massive supermarket checklist: only when all 200 boxes are checked will your wife tell you "good job".


The problem is, Google isn't very curated. All the AI-generated SEO spam is going to pollute its indexes.

I think it's sad that so much of this is driven by SEO -- a desire to satisfy an algorithm -- rather than to make good quality human readable content in its own right. In a desire to get on the front page of Google so much useless content is being created, and the content itself is tainted by the inclusion of keywords. This is because, as the article mentions, Google's algorithm can't tell a quality article from one that isn't.

What if I like the content Google tunes its algorithms to rank badly? It's the same issue ...

Google is in such a hard place. I unintentionally came across an SEO conference while I was on vacation, and every single SEO practitioner there was using AI tools to fill the web with articles and low-quality rehashes, exploiting Google's inability to punish them without also punishing Forbes and the like (which also publish low-quality articles at times). I honestly don't know how Google is going to solve this one, or what things will look like in another decade.

My biggest gripe with Google is how the ranking algorithm has essentially killed any innovation on the content and user experience part of the web.

I was searching for some poems by Pablo Neruda. The first few results all had awfully designed pages filled with ads, pop ups, bad fonts - things you don’t want to see when reading a poem.

Yet, these awful sites continue to dominate the SERPs for queries like these because they are all ancient, have tons of backlinks from ages ago, and thus in the eyes of Google, are “authoritative”.

No one wants to put in the effort to create a website that collects, say, poems with a better user experience because they know that most traffic will come from search and they can’t really rank against these ugly old dinosaurs.


Google was good at launch because it was harvesting data from webrings and directories to provide it "high quality" link ranking data. However, they didn't thank or credit or share any of their revenue with the sites whose human curation helped their results become so impressive. Seeing that Google search was effective, most human curators stopped curating directories and webrings. The SEO industry picked up the slack and began curating "blogs" that are junk links to junk products. This pair of outcomes led to the gradual and ongoing decay of Google's result quality.

Google has not yet discovered how to automate the "is this a quality link?" evaluation, since they can't tell the difference between "an amateur who's put in 20 years and just writes haphazardly" and "an SEO professional who uses Markov-generated text to juice links". They have started to select "human-curated" sources of knowledge to promote above search results, which has resulted in various instances of e.g. a political party's search results showing a parody image. They simply cannot evaluate trust without the data they initially harvested to make their billions, and without curation their algorithm will continue to fail.


To me this isn’t a problem with Google, but with SEO. Gaming the system to get a better ranking is an entire industry now, and that won’t stop no matter which engine is the most popular.

There have been a lot of discussions on Google Search quality lately on HN. (eg here: [1])

I wonder how much of the negative sentiment towards Google is partially motivated by subconscious dissatisfaction with their main product. I think I'm probably guilty of that.

Others have mentioned that the web environment has changed a lot and I've come to agree. PageRank style algorithms are probably pretty useless in an era of low effort reposts, aggregators, optimizing for clicks and a large SEO industry.
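To illustrate why link-based ranking struggles in that environment, here is a rough sketch of a PageRank-style power iteration on a made-up link graph. The page names, the damping value of 0.85, and the iteration count are assumptions for illustration only; this is the textbook form of the idea, not a claim about what Google actually runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: distribute its rank uniformly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: two genuine fans link to the original article,
# while a farm of twenty throwaway pages links to a scraped copy.
graph = {"fan1": ["original"], "fan2": ["original"]}
graph.update({f"farm{i}": ["copy"] for i in range(20)})

scores = pagerank(graph)
print(scores["copy"] > scores["original"])
```

In this toy graph the scraped copy ends up ranked above the original simply because manufacturing inbound links is cheap, which is the weakness being described.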

In addition, most of the user-generated content is now inside "walled gardens" like Facebook, Instagram and Reddit, and is either impossible to index, or much harder to quantify and rank than websites linking to each other.

There has to be something better out there, but it will be a difficult challenge.

Ironically, the thing I would appreciate most is a manually curated, Wikipedia inspired, hierarchical directory.

[1] https://news.ycombinator.com/item?id=21515181


The results are human curated, as much as Google would like to publicly pretend otherwise.

I think a more fundamental problem is a large portion of content production is now either unindexable or difficult to index - Facebook, Instagram, Discord, and YouTube to name a few. Pre-Facebook the bulk of new content was indexable.

YouTube is relatively open, but the content and context of what is being produced are difficult to extract, if only because people talk differently than they write. That doesn’t mean, in my opinion, that the quality of a YouTube video is lower than what would have been written in a blog post 15 years ago, but it makes it much more difficult to extract snippets of knowledge.

Ad monetization has created a lot of noise too, but I’m not sure there would be less noise without it; rather, it’s a profit-motive issue. For many, many searches I just go straight to Wikipedia and wouldn’t for a moment consider using Google.

Frankly I think the discussion here is way better than the pretty mediocre to terrible “case study” that was posted.


The linked tweets imply that this decline in search results was Google's choice, driven by a desire for further monetization or by exec incompetence, but I think Google is simply facing an impossible task.

Receiving an arbitrary question and finding the most helpful site for that question out of the entire web is already nearly impossible.

Now consider the above problem, except the sites you have to search are highly adversarial. More precisely, the internet is roughly divided into people who post useful content and have little interest in SEO and those who only care about SEO and clicks and not about creating useful content. The latter are more motivated and have more resources. For every useful site, they can take that site and create 100 of their own copies with the same content, more aggressive SEO and their own ads.

How is Google, or anyone else, supposed to navigate this landscape?


The problem with web content is that search systems have no opinion on skill, effort, scholarship or accuracy.

Ranking has become so disconnected from quality that metrics of quality are seen as arbitrary, unrelated attributes.


it's to make content so good it seems as if a human wrote it. If that's actually the case there's no conflict in Google -- good content gets pushed to the top.

There's an easy way to achieve that: have an actual human write it. This solution does not necessarily win one good will with Google: for one obvious example, most of the content farms were farming manually rather than farming with Markov chains.

I also think "good content" only goes partway toward describing both a) what actually ranks on Google and b) what, in an ideal world, Google thinks would rank on Google.


The problem, I find, is that Google results tend to be OK until there is a commercial intent. Then you just get the biggest advertiser, the biggest brand (sometimes 3 links for the same thing) and the most SEO-optimised results.

A human-curated alternative would likely end the same way, as the urge to monetise such a resource would be overwhelming.


The whole reason Google spends billions on their search engine is that humanity does not yet have a program that can differentiate fake relevancy from real relevancy with perfect accuracy.

The Venn diagram of SEO and user satisfaction is gradually being compressed into a circle by Google as they improve their algorithm.

SEO is already basically human-oriented now; anyone selling mumbo-jumbo SEO magic is a crank. That kind of trick used to actually work quite well.


Human curation is something that is definitely missing these days and was severely undervalued with the growth of the internet. I rarely conduct a Google search without appending "reddit" to the query. Because I know I want actual human answers. We are, ironically, using Google today as if it were the Yahoo! curated catalog of 1997. Because we can't trust that the links Google returns aren't some autogenerated SEO-driven affiliate link garbage.

The same applies to music. Spotify is missing that certain ingredient that local radio stations (before they were sold and cannibalized by Clear Channel/iHeartMedia) and early 1980s MTV really nailed. We want a knowledgeable human to guide us through the landscape of books, music, film, and everything else.

> Not "highest rated" or "most popular" - but books I wouldn't get otherwise exposed to that an employee has selected to carry and display.

In the past few years Netflix introduced their "Top 10" feature, which is prominent near the top of their app. This was the reason I unsubscribed from Netflix. I had been a member since 2004. The "Top 10" feature reminded me on a daily basis that I have no interest in the content Netflix has. Previously I had assumed (because of their rather horrible discovery mechanisms) that Netflix had a much deeper catalog and that I could find something to watch if I kept searching. The top 10 list made me finally realize their offerings simply were bad. The value of the service wasn't there.
