
That comment shouldn't have aged well one nanosecond after it was posted.

Search results are biased. The entire idea of a "search engine" is to bias the set of all possible data in the crawled universe to select for the information you're searching for, then sort that information by "likeliest to be what you wanted", because the interface can't just cram all the results straight into your brain.

... and the company writing the search engine is always the final arbiter of what that means in implementation.

In this specific case, DDG is announcing they are aware of some sites where the information is likely to be untrue and they're downranking it on account of it being a datasource unlikely to deliver what the user wants. That's their job, in exactly the same sense that it's their job to figure out that when I search for "hacker news" I mean this site and not the r/hackernews Reddit mirror.
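To make that concrete, here is a minimal sketch in Python (all domains, scores, and weights below are made up; this is not DDG's actual ranking code) of how "downranking a dubious source" and "preferring the site the user probably meant" are mechanically the same operation: a per-source weight folded into a query-relevance score.

    # Hypothetical example: downranking and disambiguation are both just
    # per-source weights applied to a relevance score.

    RELEVANCE = {  # made-up query-match scores for "hacker news"
        "news.ycombinator.com": 0.90,
        "reddit.com/r/hackernews": 0.88,
        "example-disinfo.site": 0.85,
    }

    SOURCE_WEIGHT = {  # this is where the editorial judgement lives
        "news.ycombinator.com": 1.2,     # canonical source for this query
        "reddit.com/r/hackernews": 0.8,  # mirror, probably not what was meant
        "example-disinfo.site": 0.1,     # judged unlikely to be accurate
    }

    def rank(candidates):
        """Order results by relevance scaled by a per-source weight."""
        return sorted(
            candidates,
            key=lambda site: RELEVANCE[site] * SOURCE_WEIGHT.get(site, 1.0),
            reverse=True,
        )

    print(rank(list(RELEVANCE)))
    # -> ['news.ycombinator.com', 'reddit.com/r/hackernews', 'example-disinfo.site']

Whatever the real pipeline looks like, that judgement has to live somewhere in the scoring; the only real choice is whose judgement it is and how openly it's applied.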




> they are explicitly adding bias here.

All search engines are explicitly biased. That is the point: they generate a ranking of results. Heck, even how you tokenize text is an explicit bias in what you match against.
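A toy illustration of the tokenization point (hand-rolled tokenizers and made-up documents, not any real engine's code): the tokenizer alone decides which documents a query can match at all, before any ranking has even happened.

    import re

    DOCS = {
        "a": "Covid-19 vaccine efficacy data",
        "b": "covid 19 vaccine efficacy data",
    }

    def tokens_keep_hyphen(text):
        # Treat "covid-19" as a single token.
        return set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))

    def tokens_split_hyphen(text):
        # Split on anything non-alphanumeric, so "covid-19" -> "covid", "19".
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def match(query, tokenize):
        # A document matches if it contains every token of the query.
        q = tokenize(query)
        return [doc_id for doc_id, text in DOCS.items() if q <= tokenize(text)]

    print(match("covid-19 vaccine", tokens_keep_hyphen))   # ['a']
    print(match("covid-19 vaccine", tokens_split_hyphen))  # ['a', 'b']

Neither tokenizer is "correct"; each is a design decision about what counts as the same word.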


> So where does that leave us? Everything is biased?

Search results are created by humans and intended to be consumed by humans, so yes. (And just to head it off: a web crawler is ultimately just an abstraction for the humans who will later consume the search results; it just happens to have particularly fast and strict browsing habits.)

Now that said, accepting that "(un)biased" is an extremely broad term, I'd very easily believe that the intent of the tweet was to point at some specific type of bias that DDG (at the time) intended to avoid.


DDG's ranking algorithm is under attack, and they're acting in defense of it.

Nothing more, nothing less. To think there's some objective "unbiased" search result is naive.


> How would DDG do this even if they wanted? Send fact checkers to the ground?

If you want DDG to independently verify every decision it makes via primary sources, you are going to get less useful search results. DuckDuckGo doesn't have a team of scientists to reproduce every research paper they see. Nevertheless, they can decide to intervene in situations where they are reasonably certain that a source isn't trustworthy.

Of course, people are free to disagree with them. Is the disagreement here that people think they're blocking sites that aren't misinformation? That's difficult to debate given that we don't know the list of sites, but my personal prior is that the sites probably aren't the victims of smear campaigns; they probably are peddling deliberate misinformation. But once again, arguing that DuckDuckGo is wrong about whether these sources are trustworthy is not the same as saying that they shouldn't be able to downrank a bad news source without first forming their own team of investigative journalists.

----

> I agree. The problem here is they are only flagging disinformation from one side as factually incorrect, and do not even bother to do so for disinformation coming from the other side, thus creating bias, which is political in nature for the reasons explained above.

So, there are two things here:

First, yes, search engines have bias for the same reason that all ranking systems have bias. Remember that DuckDuckGo is literally in the business of ranking certain sites above other sites. There is no one in the world and no algorithm that is capable of ranking information without incorporating some degree of worldview into that decision about how rankings should work. This bias is why we use search engines, and it's why diversity in search engines would be a good thing. We want sorting systems to have opinions about how information should be sorted.

This is still very difficult to talk about when the word "political" is being used in such a broad sense. Do you mean political in the sense that all editorial decisions are political by nature because they either reinforce or question a status quo? Or do you mean political in a more narrow way -- that applying more strict standards to a subgroup of sources is the thing that makes this political? If you mean "political" in a broad sense, then sure, I agree, but also there's no such thing as a web search engine that is apolitical in that broad sense and I question whether it's possible to build one that is apolitical without also being completely useless for most users. If you mean political in the second sense, that there is a narrow category of political topics and the lack of fairness is the thing that makes it political... again, I just don't understand how you square that with the regular filtering that search engines do all the time.

When Google Ads pays special attention to lockpicking ads because they're a popular spam category, but doesn't scrutinize other ads to the same degree, is that suddenly political?

The second issue I have here: if the problem is a lack of flagging of misinformation in other contexts, why would the answer not be more rigorous flagging of that misinformation? Why would the answer necessarily be that DuckDuckGo results should be a free-for-all whenever someone searches for the word Ukraine? There's a big jump from "I think they're not doing a thorough enough job and I think they're taking sides in a conflict" to "they shouldn't even be trying to do this at all".

There are some services where that viewpoint makes sense, but I don't see how DDG is one of them. I personally have argued that companies like Cloudflare fundamentally shouldn't be in the business of releasing content filters at all. I personally have argued that TLDs shouldn't be involved in censorship. I have personally argued that ISPs should not be allowed to filter content that is not illegal. The important difference: none of those are companies whose primary service is sorting content, and none of them are companies that we go to with the explicit request that they give us information based on what they think is relevant and accurate.

How do you make the jump from disapproval of DDG's standard for misinformation and how it's applied to the idea that they shouldn't be involved in filtering of misinformation at all?

----

TLDR, I still don't really understand why editorial decisions about political content are a slippery slope, but abandoning editorial decisions based on a word ("political") that doesn't seem particularly rigorously defined isn't also a slippery slope.


Unbiased search is impossible, so DDG should be deliberately biased?

Undoubtedly there is bias in the design of any software. But, you'd hope that software designers would try to keep their own biases out of it!

This is not what DDG are doing though. They are explicitly inserting their bias - no doubt for a good cause - but I don't want any overt bias at all! I want to see the data as is. As a grown-up, I'll mediate my own searches - thanks! I don't need a parental filter!

That they were even able to implement this sort of explicit bias functionality so quickly is also a concern to me. They must have had a means to manually add certain keywords etc. to some sort of banned or weighted listing.

So there is some reason to think their censorship was by design... but why? Perhaps it was done at the request of a three-letter agency, via some secret court judgement? Are there other explicit biases that they are not telling us about? How can we know?

We can't know, but we do know that they are not operating in good faith - that they are editing results.


Kudos to duckduckgo.com! Looks like they have it right.

Quick comment:

I don't really blame the search engines for listing a result with bad data, because sites change so often. Who's to say the data wasn't right at one point, and a recent update introduced erroneous data after it was already indexed?

It seems to me like "caveat emptor" applies here and no matter where a site falls on search rankings the reader / consumer should perform some due diligence.


You're right. Fundamentally, there is no unbiased search engine. Ranking results, by definition, creates a bias.

(To be a pedant on my own post: I guess you could build a search engine that collates all the results relating to a keyword and then just shows you a random one first, but I doubt that's what people want.)
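For what it's worth, that pedantic alternative is about the only ordering with no built-in preference, which is exactly why nobody wants it. A throwaway sketch (hypothetical result set, not any real engine's output):

    import random

    results = [
        "peer-reviewed-journal.example/paper",
        "seo-content-farm.example/listicle",
        "random-forum.example/thread",
    ]

    def unbiased_order(candidates):
        """No opinion about which result is better: shuffle and hope."""
        shuffled = candidates[:]
        random.shuffle(shuffled)
        return shuffled

    print(unbiased_order(results))  # a different, equally unhelpful order each run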


Search algorithms are preferential and biased by design. They are always tuned for a certain result, usually with a bias towards accurate information. Obviously they've failed at this in the past and will again in the future, but it should hardly be surprising that literal propaganda produced by a highly militarized government run by a tyrant is being downranked.

> How do you make the jump from disapproval of DDG's standard for misinformation and how it's applied to the idea that they shouldn't be involved in filtering of misinformation at all?

It is quite easy to disapprove of their current standard on record because, according to it, misinformation can only come from a "Russian" source, and we all know there is much more misinformation in the world than that. Can we agree on this?

I am not saying that a search engine shouldn't be involved in filtering misinformation. On the contrary, I think that DDG (and any other search engine) should absolutely be in the business of filtering all the misinformation they can. The key word here is "all".

But being selective, and in this case based on a particular political view (and I use the word political in the context of world politics), introduces a bias which may negatively affect its users, without any particular benefit.

Makes sense?


Which is why I edited my top post. No need to spread inaccurate information.

Since you're going on the offensive, a defensive move might be in order: no, I wasn't. I said "odds are", not "DDG does", because I didn't know for sure and did not intend to claim to. The odds were in my favor; the fact is an aberration from otherwise almost-uniform behavior by search providers.


I guess this did not age well:

“[W]hen you search, you expect unbiased results, but that’s not what you get on Google,” @matthewde_silva quotes @yegg

https://twitter.com/DuckDuckGo/status/1114524914227253249

Also, they probably do not realize that they will have to start with Twitter if they want to be consistent.


Search results are by definition quite biased - a search engine returns an ordered list, not an unordered set. Googlers' personal biases can and do enter into those results, often in subtle ways where they feel they are being (from their perspective) unbiased. It's an automated system, but it's still written by subjective humans.

I agree that every Googler will tell you that impartiality of search results is an overriding priority. It's the crucial tenet underpinning the entire edifice.


One problem with platforms having a clear bias in their search results is that their classifications of concepts like "disinformation" aren't what they say they are. They can't be. A business that needs to increase profits every quarter is never going to be charitably filtering data to get rid of "bad actors" (unless it's enforced by law); it's going to do whatever it can to make more money. I don't think the owners of DDG are making this mistake maliciously. However, controlling information based on subjective opinions about narratives like the Russia/Ukraine conflict (which 99% of the people talking about it haven't even been to the region) is short-sighted, unless they openly state: "We have a clear bias and are making our own claims about what good information is."

Nope, you were just plain wrong.

The odds would have been in your favor if you had said "odds are a search engine chosen at random does this too".

But, instead, you said: "odds are DDG does this too", thus binding the odds to DDG.

Since the general stance of DDG is pro privacy, you should have reasoned that the odds were in favor of DDG NOT doing this.

(Not to mention that posting that you "were close" didn't add anything to the discussion - apart from some ill-conceived face-saving on your part.)


"That seems like its not only reasonable"

Is it though? There is bias in everything.

There is bias in Google's original search - it crawled content, and that content will be biased.

Google search has been biased since the start. It has nothing to do with AI.

Bias is also contextual: what does it mean for all of us non-Americans to see tons of American content in everything? The "American bias" is overwhelmingly the strongest bias - where are the concerns about that?

And how does bias imply a 'lack of ethics'?

The entire consideration is ridiculous:

1) AI is not special and does not deserve its own ethical oversight. Every social tech has issues and it all needs to be thought about.

2) Someone with no real training in the issue may very well merely be injecting their own politics into the situation, and possibly overstating the issues.

3) There's nothing objective about morality or ethics, so it's really hard to even find such a thing as "objective".

What the company needs is a clean, comprehensive framework for the issue, and probably some independent oversight that can give them private assessments of where there are red flags.

Not individuals who want to make a name for themselves on the issue publicly, and who might be part of some kind of ideological movement.


> In the hands of a human, decisions like these might be viewed as biased. For a Google algorithm, they are simply a matter of numbers.

This is pure nonsense. Algorithms are written by humans and are just as biased as their creators.


You can really see the "search term bias" in that snippet.

All search engines have some form of implicit bias. An unbiased search engine would be beyond useless at actually finding anything except for extremely well-specified queries. The trick is to tune the bias to favor results that are interesting and relevant.

This is also why having just one big search engine is a bad idea.

