Google Interferes with Its Search Algorithms and Changes Your Results (www.wsj.com)
135 points by sahin-boydas | karma 10239 | avg karma 4.94 | 2019-11-17 15:44:23+00:00 | 91 comments




Similar discussion 2 days ago: https://news.ycombinator.com/item?id=21544537

yes, same research, different publication spinning it, much deeper / better HN discussions on the issues on that thread imho. They maybe should be merged.

That discussion seems to be on the same WSJ article, but syndicated through msn.com

It seems to conflate search and auto-complete, and then ignores the context that search has adversarial groups consistently trying to manipulate rankings. While it's possible this was a good-faith article, given News Corp's broader conflicts with Google recently I'd guess it's intentional.

We are down to like one or two remaining professional journalism shops (FT and Economist, imho).

Sad to see once decent, thoughtful operations like WSJ, NY Times, WaPo, basically turned into conflict generation drivel producers.

Edit to add: I don't mean to ignore efforts like Pro Publica, but they are very small ops and don't really move the needle readership-wise.


I agree. I had Bloomberg on that list but it's gotten steadily worse over the last few years. Some parts of WSJ and NYT are still pretty good, but they're flooded by drivel and hard to find.

I would argue that The Economist takes much stronger editorial stances than other publications allow. They haven’t given any space to, or said anything kind about, Russia in the last 10 years, nor have they given any space to the pro-Brexit side of the debate. They do great reporting inside their editorial stance, but they’re a long way from “Just the facts, ma’am.”

> They haven’t given any space to, or said anything kind about, Russia in the last 10 years, nor have they given any space to the pro-Brexit side of the debate.

Signs of a quality publication, surely? There's not a lot of papers saying nice things about a mafia oligarchy state unless they've been paid to do so, and the pro-Brexit arguments are mostly nonsense.

The economist has firmly located itself in an economic and socially liberal policy position, but they're willing to look at other cogently argued positions.


> mafia oligarchy state

Could be talking about Russia, the U.S., or China at this point, really.

> the pro-Brexit arguments are mostly nonsense

Well, the Economist will assiduously avoid any examination of how the common currency & market are handmaidens for austerity imposed on the peripheral countries to ensure bondholders are always made whole in the EU financial core, that's for sure. Not sure what makes this point "nonsense" though.


Not sure that your point relates to Brexit, to be fair, even if I find little else to disagree with.

UK uses Euros? TIL!

That argument may have been valid for Grexit, but UK austerity is entirely native Tory policy, and the primacy of bondholders is pretty much global doctrine (see Argentina pari passu fiasco).

Anyway, that has zero bearing on how Brexit has been conducted, and especially on the opposition to free movement of people.


> There's not a lot of papers saying nice things about a mafia oligarchy state unless they've been paid to do so

There was one, called "The Economist in the year X" for all X such that whichever combination of oligarchs, mafia and security services are in power happen to be friendly to Western business interests.

> The economist has firmly located itself in an economic and socially liberal policy position

They are the voice of Empire and always have been.

http://exiledonline.com/exile-classic-the-economist-the-worl...


What is there to say about pro Russia or Brexit?

I met some British people yesterday and they also didn't have anything positive to say about that.


I have no idea who this guy is or if his ideas are valid but here's his 10 reasons why Brexit is good

https://dominicfrisby.com/news/ten-reasons-im-voting-to-leav...


> I have no idea who this guy is or if his ideas are valid but

Then why are you sharing/promoting his viewpoint?

There are endless starting points for any debate, picking the right one is critical.


I shared it because they all sounded like extremely reasonable points. They didn't sound crazy or irrational. The question was "are there any positive things to say about Brexit" and the answer is "yes". Does that mean Brexit = good? I have no idea, but the answer is clearly not as simple as anti-Brexit people seem to claim.

What about Reuters?

So many abbreviations I had a hard time following you.

What's FT?


Financial Times

The Financial Times.

>> Sad to see once decent, thoughtful operations like WSJ, NY Times, WaPo, basically turned into conflict generation drivel producers.

I'm not sure that's a fair description of news orgs. The world's fairly complex and you're going to end up with slant one way or another however you try to describe it.


I don't get what the article means by "Google Interferes" beyond what we already know.

I can't imagine any notion of neutrality applied to search results, except making your search algorithm public. But then, Google is not a public service: it's Google's results you came looking for by using Google. If you don't like them (and more and more people are unhappy with the results), something else will emerge and win.


I would point out that people made this argument back in 2005, 2010, 2015, and now 2019, and all that's happened is that the argument has evolved from "you can use X other search engines!" to "well even if they have a monopolistic position, someone else will make one eventually".

There is Duck Duck Go. Unfortunately, in my experience its results are still worse than Google's.

I agree. As much as I want to support it, it is unusably bad.

The internet is a much bigger place than it was in the 90s when we last had multiple competing search engines (and even manually-curated directories!). It is much harder to effectively index "all the things" than it was back then. Google managed to get ahead of that scale curve.

I think any future competitor will have to be an extension of Dogpile or Jeeves, where you aggregate the results of individual engines and rank results based on relevance + domain specialty or credibility of each engine/indexer, like a revamped EBSCO or VirusTotal.
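
As a rough illustration of the aggregation idea above, here is a minimal sketch that merges ranked lists from several engines using a weighted reciprocal-rank score. The engine names, the weights, and the `aggregate` helper are all hypothetical, chosen only for illustration; a real metasearch engine would need far more than this (deduplication, per-domain weights, spam handling).

```python
from collections import defaultdict

# Hypothetical per-engine credibility weights; the values are made up.
ENGINE_WEIGHTS = {"engine_a": 1.0, "engine_b": 0.6}


def aggregate(results_by_engine, weights=ENGINE_WEIGHTS):
    """Merge ranked URL lists from several engines into one ranking.

    Each URL earns weight / rank from every engine that returned it,
    so a URL that several credible engines rank highly rises to the top.
    """
    scores = defaultdict(float)
    for engine, urls in results_by_engine.items():
        weight = weights.get(engine, 0.0)  # unknown engines contribute nothing
        for rank, url in enumerate(urls, start=1):
            scores[url] += weight / rank
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda url: (-scores[url], url))


merged = aggregate({
    "engine_a": ["x.example", "y.example"],
    "engine_b": ["y.example", "z.example"],
})
print(merged)
```

Here `y.example` ends up first because both engines returned it, even though neither weight alone would beat `engine_a`'s top result.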


I find google unusably bad. It changes results based on where I am and on the time of day. It's unstable. It gives me what it thinks I want, not what I ask for.

In contrast ddg is always giving the same result, changing only with the addition of new data to the web over time.

I'm curious what you find unusable about ddg in contrast?

For several years I have never had a situation where dropping into Google gave me the result I thought I could find. If ddg isn't finding it, it doesn't exist.

Could we be using search differently?


The thing is that for me, Google is often right when it gives me what it thinks I want, and Duck Duck Go gives pages of things I don't.

For many easy queries (where I really know where the thing I'm looking for is, but it's easier to just do a quick search and click the result) both are fine. But when I have an obscure problem and don't quite know what the right search terms would be, Google often guesses right and DDG doesn't try.


The question is whether or not Google results are relevant to your search query. There are many alternatives to Google, you can't claim that Google results are not relevant to what you are looking for because Google "interferes" with them and at the same time claim they are so good no alternative is capable of competing.

Yes, you can easily claim that the interference introduces bias for the vast majority of people doing basic searches, even while google is indispensable for the smaller percentage of more complex searches.

There is no contradiction.


These are two different arguments, of course. Google isn't a public service, so I do not expect their algorithm to be public. But Google's dominant position also might make it almost impossible for a competitor to emerge. I am not saying it couldn't happen, but the old capitalist idea has not been shown to work well with monopolies.

The difference with Google is that it's a search box. It's not like Microsoft Windows or the iPhone, where the monopoly is supported by an ecosystem of apps and compatibility. If the product is bad and shows irrelevant results compared to another one, in the case of Google, capitalism will work its magic.

The weasel words are ‘compared to another one’.

I worked for a defunct company that had a product search engine. As soon as you start looking at the results, you realize you need a lever for removing specific results for a query, and writing an algorithm that always gets it right isn't possible.
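
The kind of "lever" described above might look something like the sketch below: a manual override table applied after the ranking algorithm has run. The `BLOCKED` table, the `normalize` and `apply_overrides` helpers, and the example URLs are all hypothetical; this is not a description of any real search engine's mechanism.

```python
# Hypothetical override table: normalized query -> results to suppress for it.
BLOCKED = {"cheap widgets": {"spam.example/widgets"}}


def normalize(query):
    """Lowercase and collapse whitespace so overrides match loosely."""
    return " ".join(query.lower().split())


def apply_overrides(query, ranked, blocked=BLOCKED):
    """Drop manually blocked results for this query, keeping the ranker's order."""
    banned = blocked.get(normalize(query), set())
    return [url for url in ranked if url not in banned]


ranked = ["spam.example/widgets", "shop.example/widgets"]
print(apply_overrides("Cheap  Widgets", ranked))  # override applies
print(apply_overrides("blue widgets", ranked))    # no override for this query
```

The override only touches the queries it names, which is why it works as a patch for cases the general algorithm gets wrong.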

> something else will emerge and win

No, no it won't.

That's capitalist spin not considering how monopolies or psychology actually work.


Search results used to be pretty neutral 20 years ago... but yeah, Google Search is not neutral.

Quality has gone down. Some of it due to “freshness” bias. Some due to content farms. And now we have more explicit search results squelching.

I used to have some favorite sites with different things, some recipes, some other technical things. I can’t surface those results anymore. They are buried by all sorts of uninteresting results.


I believe the bolt-on ML algorithms like Vince, Panda, and Penguin add up to something like "prefer a well-linked brand site over anything else".

The Vince update is interesting. Google characterized it as being about trust: https://searchengineland.com/google-searchs-vince-change-goo...


There is also another reason. More and more new content is not visible to web search engines, for instance groups on social media like Facebook or LinkedIn. Very often the results for a very specific query are pages from forums or discussion boards. Since more and more discussions take place in one walled garden or another, more and more relevant content will not be accessible through independent search engines. I'm afraid that in the future a larger and larger fraction of content will be fragmented, and the ability to use a single search engine to find everything will become a thing of the past.

>rdxm 8 minutes ago [dead] [-] We are down to like one or two remaining professional journalism shops (FT and Economist, imho). WSJ, NY Times, WaPo, basically turned into conflict generation drivel producers. Edit to add: I don't mean to ignore efforts like Pro Publica, but they are very small ops and don't really move the needle readership-wise.

This comment was marked dead but seems cogent and on point. The article being commented did not add much of anything to the discussion of search engine algorithms.


To vouch for a dead comment, turn Show Dead on, then click the time associated with the dead comment and click Vouch.

Try googling: "american inventors".

At the top I see a lot of American inventors. There are some exceptions. Tesla wasn't born in the US, and classifying Gates and Jobs as inventors is probably a stretch.

Then try googling "us inventors." The difference is that the former appears in the often-used phrase "African American inventors," but I'm not surprised that the right-wing outrage machine that brought us "The War on Christmas" knows its audience is too feeble-minded to figure that out.

You can argue that Google's (and Baidu's and Yandex's) results aren't very good in the former case, and I would agree, but to imply bias intentionally added by the company is to ignore what a search engine does.


Blacks are still over-represented in that one. Not as blatantly, though.

Hard to say what's causing this. Try "is abortion good or bad" on Google vs Bing and you get very different results. Bing provides evenhanded results and Google is almost entirely pro-abortion.

It's pretty easy to say what's causing this, as I've given the exact reason why and shown two non-US search engines with the same behavior.

As far as your abortion example, Bing very clearly took an editorial decision to put its thumb on the search results ranking, going so far as showing a "vs." with two different articles at the top. Google just let its default relevancy algorithm do the ranking, which means articles that contain both the words "good" and "bad" will tend to rank higher than articles that just contain "bad."
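
To make the term-matching point concrete, here is a toy sketch that scores a page by how many distinct query words it contains. It is a deliberate simplification and not Google's actual ranking; the `term_match_score` function and the sample texts (with a neutral query of the same shape) are hypothetical.

```python
import re


def term_match_score(query, text):
    """Count how many distinct query words appear in the text."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    text_words = set(re.findall(r"[a-z]+", text.lower()))
    return len(query_words & text_words)


query = "is coffee good or bad"
both = "Is coffee good or bad? Arguments on each side"
one = "Why coffee is bad"
print(term_match_score(query, both), term_match_score(query, one))
```

Under this scoring, a page that uses both "good" and "bad" matches more of the query than a page that uses only one of them, with no editorial decision involved.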

The exact same checks apply to this case to throw out the Google bias argument as right wing outrage hysterics targeted at simpletons who don't know how to verify the assertion. My Google results contain two first page links that say without argument that abortion is bad. Both Yandex and Bing have 0.


It's likely school assignments like "do a paper on an African-American inventor" impact Google's algorithms in that sort of fashion.

Clearly an ML system is not perfectly accurate here! That is known to happen from time to time.

Once upon a time, Google autocorrected "she invented" to "he invented".

It does raise the question though, that bias is in the real-world data. Why should a search engine ignore the real world to paint a happier picture?


The entire premise here is ridiculous. It "interferes" and "changes your results" to "shape what you see." It's like me asking you a question and you "interfering with your vocal cords and changing your response to shape what I hear."

> It's like me asking you a question and you "interfering with your vocal cords and changing your response to shape what I hear."

No, that is analogous to Google setting a font and colour theme you like, which they don’t do, btw.


This was my impression too, although they could have adjusted the wording to say "manually" or "directly" or something to indicate it isn't purely algorithmic.

Of course algorithms can have bias too, but still.


People feel like algorithms are neutral and people are biased. Having people actively intervene in something that only algorithms would otherwise have touched feels bad if you already suspected that your results aren't what they should be.

Yeah, people who don't have practical experience with them seem to think of algos as a kind of designed policy, like laws. Exceptions to the laws seem to be hypocrisy. In reality, algos are just a way to scale and automate some goal. Exceptions compensate for imperfect execution of that goal.

Not exactly. There is an implicit expectation of truth when searching on Google, sort of like an encyclopedia. The interference happens when the results do not match reality.

Try these yourself to see how this can change public opinion: Google: "is abortion good or bad" returns almost entirely pro-abortion results. Do the same on Bing and you get actual results such as "Is abortion a good or bad thing?"

Further reading: https://en.wikipedia.org/wiki/Search_engine_manipulation_eff...


I posted this in another thread about this, and my opinion is such:

The American people aren't this stupid. It's really arrogant of Google to think they can get away with controlling information at this scale. The forces of capitalism will eventually surface another search engine that will compete with them.


What are "my results"? All results are generated by Google and I don't have a copyright claim on them. They can show me whatever result they want. If I'm not happy, I'll look elsewhere.

> All results are generated by Google and I don't have a copyright claim on them. They can show me whatever result they want. If I'm not happy, I'll look elsewhere.

Google gets legal protections from claiming that they don't editorialize their search results.


Why is this being downvoted? Doesn’t Section 230 apply to them if they don’t editorialize?

Section 230 protects them from being liable for user generated content regardless of if they editorialize or not.

Here's the Search Quality Evaluator Guidelines document the article is referencing [0]

"Expertise" is mentioned 99 times, "authoritative" 55 times, and "trust" 69 times. Of course, tweaks to generate search results favoring websites that show expertise, authoritativeness, and trust are going to favor larger, more established companies.

Seems like News Corp shot the arrow and then drew an 8,800-word bullseye around it.

[0] https://static.googleusercontent.com/media/guidelines.raterh...


Interesting to skim through the PDF.

Here's an excerpt from Section 3.1 which outlines the "most important factors" that influence Page Quality ratings. One thing I found particularly interesting is that raters are encouraged to search for external sources to determine a website's reputation (one example they provide in the PDF is that Kernel.org should receive the highest rating for term "Linux Kernel archives" because of Wikipedia's kernel.org page vouching for its authoritativeness).

---

- The Purpose of the Page

- Expertise, Authoritativeness, Trustworthiness: This is an important quality characteristic. Use your research on the additional factors below to inform your rating.

- Main Content Quality and Amount: The rating should be based on the landing page of the task URL.

- Website Information/information about who is responsible for the MC: Find information about the website as well as the creator of the MC.

- Website Reputation/reputation about who is responsible for the MC: Links to help with reputation research will be provided.


This is a gold mine for SEO people

Alternative Headline: "News Corp Interferes with Its Journalists and Changes Your Context"

Are you saying we should expect Google results to be as biased as news outlets?

No, I’m saying the WSJ is trash.

I couldn’t care less about Google’s choices. They’re a private company not a public utility.

In any case, the algorithms Google uses have always been biased. That’s the whole point of returning anything resembling “relevant” search results.


You are right - the algorithms have always been biased.

The issue is that Google has maintained that they aren’t. Unlike the WSJ, which doesn’t claim to be unbiased.


No, that is your issue. Like I said, I couldn't care less.

If Google once said, "PageRank is a combination of butterfly migration patterns and random chicken bones scattered on the floor..." I wouldn't care. If Matt Cutts laughed maniacally after each interview while nodding in a knowing way at the camera, I wouldn't care. If Larry and Sergey personally dug a hole to the center of the Earth to power their personal space stations using microwave transmitters fueled by geothermal energy, I wouldn't care.

I also don't care what the WSJ says about itself either. It also happens to be a private company and it can do whatever it wants. If they want to claim to be a source of "trusted news" (and they do), that is their business. I don't care. I just happen to think that they are trash.

My opinion has nothing to do with Google or their butterflies and like all opinions (including yours and everyone else's) it is essentially worthless.

But do go on if you want... I don't care.


You don’t have to care, and it’s not my issue. It is the reason articles like this are relevant at all.

If it was generally accepted that Google is just another biased actor, there wouldn’t be any news to this.

I‘m just pointing out that your comments are irrelevant to the general complaint here.

It seems to be a curious behavior of yours that you write elaborate comments about things you don’t care about.


Don't be disingenuous; it's absolutely your issue. You're running up and down this thread and hopping onto other posts about the same topic. I don't care if it's your personal crusade, but you should at least own it and enjoy it.

As for my "curious behavior," let me remind you that you're the one who stuck your nose in. I made a joke and you decided to try and be a bully. I don't like bullies but I do take great pleasure in leading them down very long, nonsensical pathways for no other reason than to waste their time.

It's a time-honored tradition that can be traced back centuries if not millennia. You'll find it in many great works of literature as well as those of the lesser-known poets. Lewis Carroll was particularly well known in this regard and one could certainly do worse than to emulate his example.

Mind you, I'm certainly not claiming to be any of these things myself. No, I'm just an average traveller, but I do like the shape of the joke. It never ceases to amuse me when deployed, as the bully in question (that would be you) never seems to catch on even when the game is laid out in front of them.

For what it's worth, it is 100% relevant to comment on a source regardless of the topic. This is especially true of the WSJ, which is still trash. And while we're at it, this whole topic is not news anyway. People have been talking about Google's bias since the very beginning of time.

I'd recommend that you go outside and have a walk or maybe a nice cup of tea.


Can you explain how you are being ‘bullied’?

It seems like you think of yourself as some kind of extremely clever and sophisticated victim, but it’s not clear how you have been hurt.

I’m fine with criticizing the WSJ, I just don’t think we should use that to deflect criticism from Google.


I'm glad you have an opinion.

I think this is a good article. People should be aware that the information they are being exposed to is being controlled.

Most of the complaints about the article are purely political.


I'm not sure that's entirely fair. Most of the complaints about the article are coming from people who know enough to understand that there's no alternative to interfering. There's no magic "correct results" machine at Google HQ; there are only strategies for better results, and sometimes the best known strategy involves manual intervention. So they see an article that's saying something obviously true in conspiratorial tones; I don't think it's political to be annoyed by that.

The best argument I see in favor of the WSJ article is that many people actually do think Google has a magic machine. If so it's fair to publish stories about how that's not true. Yes, it couldn't work any other way if you think about it, but the point of a newspaper article is to get people informed enough to think about it.


>Most of the complaints about the article are coming from people who know enough to understand that there's no alternative to interfering

Except Google claims they don't interfere, which is why this is news.


As far as I know, Google has never made any general claims that they don't interfere. Of course they interfere. They just claim that there are specific reasons they won't use to justify interference and specific interference methods they won't employ.

(The WSJ article does contain some claims that Google's interfering in ways they previously denied, and that part is unquestionably news if true, but it really doesn't seem like the main point they're trying to get across.)


You are really trying to twist this in google's favor. The fact is they claim they don't editorialize and they do.

Any choice of results to produce is an editorialization, barring the existence of an oracle to tell us the one true answer. Thus your complaint about editorializing reduces to nothing but "Google chose the results they produce." About which, well, duh; that has been true since the very first search result they served.

I need to subscribe to read. I won't.

Aw gee why are they bullying their most productive and obviously right wing algorithms?

Could someone who can access the article please paste the text? I was only able to access what's below, even through archive.is.

---

WSJ INVESTIGATION

How Google Interferes With Its Search Algorithms and Changes Your Results

The internet giant uses blacklists, algorithm tweaks and an army of contractors to shape what you see

By Kirsten Grind, Sam Schechner, Robert McMillan and John West

Nov. 15, 2019 8:15 am ET

Every minute, an estimated 3.8 million queries are typed into Google, prompting its algorithms to spit out results for hotel rates or breast-cancer treatments or the latest news about President Trump.

They are arguably the most powerful lines of computer code in the global economy, controlling how much of the world accesses information found on the internet, and the starting point for billions of dollars of commerce. Twenty...

TO READ THE FULL STORY


Nobody has mentioned StartPage.com. I believe that uses Google Search in the background, while not sending your personal data or returning a bunch of advertising garbage.

There was a story about Startpage earlier this month:

https://news.ycombinator.com/item?id=21371577


Search quality is hard, and it's hard to make everyone happy. I wish every search engine had an easy-to-use personalization option (IoC).

Not that I defend Google by any means, but it's hard to get everything right with a system like this.


Yes! A toggle on/off button for personalization would be amazing!

I remember it used to be easy for me to find obscure things on google using the advanced features. As they "improved" Google by giving it better results for the average user with sort of a "dumbed down" kind of feel, my ability to do that went way down. A few years after that, the personalized results started to get really really good, around 2014-2015 or so. Since then, it's gotten bad again, and I'm not sure why, but I suspect it has to do with $$. Basically it started with the Wikipedia article not being the first result and has continued to devolve. I never did recover the ability to get nuanced searches like I used to. It's kind of just a luck thing now. Although I actually am having better luck with duckduckgo now.


What I would be interested in knowing is what metrics Google uses to optimize their systems. When rolling out a new update, do they only look at click-through rates? Do they have human moderators rate a sample of search results?

Given the large number of people such ranking algorithms affect I foresee regulations being passed to make them more transparent.

Companies should be required to publish audits about whether their algorithms are biased towards protected categories and other things like their competitors or big companies.



The written rules and algorithms are only half the story. The other half is how seriously and with what integrity the criteria are applied. The Soviet Union had a more liberal Constitution than the US, but it was interpreted by the Communist Party to allow the Gulags and controlled news and academic writings.

A follow-up to my earlier post about algorithms being only half the story. I follow the Google news aggregation feed. Despite the algorithm and their "Search Quality Evaluator Guidelines" their news feed is still a left-wing propaganda rag. It isn't a place to go to get an accurate model of what is going on in the world. That should come as no surprise because, based on the news coming out of Google, their employees are always demanding that Google not work for certain DoD programs, not work with ICE and support most Politically Correct causes.

The first thing Google needs to do to improve their news and search engine quality is to hire and train people who have quality and integrity as their guiding principles. Leftist ideologues need not apply for these functions although they can contribute in most other parts of the company.

