Is Firefox lying to users about viruses in downloads? (www.theindy.us)
74 points by newman8r | karma 2539 | avg karma 2.19 | 2018-07-04 12:37:30+00:00 | 76 comments






Spoiler: no, it sometimes flags safe content from shady sources.

Saved me some time reading, thanks (also, Betteridge's Law)...

Guess it's the same for IE - I've had a few issues with executables that probably don't see many downloads...

...no, I don't use it voluntarily - but sometimes have to at work...


In my book, flagging as malware something that is not malware is lying, or at least a false positive.

So, spoiler: it's a bit more nuanced than "yes" or "no".


In my book, there is a lot of distance between "lying" and "false positive"

Which really shrinks when it's about presenting the known-fallible process as 100% reliable, which the wording/UX seem to suggest in this case.

I can understand the reasoning behind this, I've seen it done multiple times to my own projects by the UX guys. I don't really know if it's a net-positive strategy or not, but the fact that there is a lie involved at some level is undeniable. At least to those who didn't drink too much Kool-Aid and don't think the average user is too stupid to comprehend the truth, and therefore that it makes no sense to present it to the users.

I may be biased in the other direction, but every time I see a discussion about "dark patterns" in UI I get a feeling that they differ from the "light" ones only in what is the goal of the manipulation, but with methods surprisingly similar on both sides.


... so, materially yes? As a user, I don't care that they have an excuse, I care that they said "this file is a virus" when the file was not a virus.

Not really. "Lying" includes an intent to deceive. I would have been okay with "raises false alarms", "generates false positives" or even "confuses and/or misleads users".

As a user, I want to be better safe than sorry - but would perhaps be happier with a finer classification than "no problem/OMG VIRUS!"


I agree that finer classification is needed. However: if you decide to declare to a user that every individual file that might contain a virus does contain a virus, that's intentionally deceiving them. It's deciding "X% of the time, this statement will not be true, and we are ok with that" instead of building the uncertainty into the statement itself. So I don't see how this can not be considered lying, unless they believed there would be no false positives.

Until the moment where you really need that file.

This. Firefox has become far more aggressive in policing use. For example, if HTTPS is misconfigured, 60.0.2 no longer offers an option for one-time exceptions; now there are only 'get me out of here' and 'report to mozilla' buttons. Perhaps someone more familiar with FF's config knobs will post the knob name.

I'm running 60.0.2 and I still have the option to add an exception, under "Advanced". What you describe only happens when the site is so badly misconfigured that no connection is possible (e.g. protocol errors).

You cannot add an exception for a domain using HSTS.

Having false positives != lying.

Not inherently, but in this case all positives are matched to the same wording, and that wording expresses 100% certainty to the user that something in the file will harm their computer. That's a lie, because that is not what has been determined at all.

Firefox isn’t necessarily scanning the files for viruses, they’re often just using databases that list domains suspected of hosting malware.

IIRC, Chrome does the same thing too. I don't think it's much of an issue for Firefox to flag stuff downloaded from suspected URLs as malware, since it's not uncommon for a system to get infected by those sites' content. Firefox is just trying its best to prohibit any sort of system infection through itself.


Then (as the article suggests) Firefox should actually tell the user that it is the site that is untrusted and not the file, and allow the user to save it so that they can actually scan it themselves.

> Firefox is just trying its best to prohibit any sort of system infection through itself.

In that case I would change the wording from "contains malware" to "may contain malware".

Also, they're prohibiting nothing. They still give you the option to open the file (which, as explained by the article, opens a whole different, albeit small, can of worms)


> In that case I would change the wording from "contains malware" to "may contain malware".

Or even more precise: "may contain copyright infringement".


I wonder why they can’t integrate some service like VirusTotal into their downloader. Sure, for brand-new objects it’ll take longer, but they have a long list of many, many file hashes and their reputation DB.

Because I don't want Firefox to send what I download to some (third party) company. Or anywhere for that matter.

Their (and Chrome's) current solution for blocking malware domains uses a client-side Bloom filter, afaik.

If you tried to build the same thing client-side into Firefox, you'd just have built another (bad) antivirus.

Might as well integrate clamav into firefox then.
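
For anyone unfamiliar with the Bloom filter idea mentioned above, here is a minimal sketch in Python. It is only an illustration of the data structure, not Firefox's or Chrome's actual Safe Browsing code; the class name, sizes, and the `malware.example` entry are all made up for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, but false positives are possible."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "possibly listed"; False means "definitely not listed".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical blocklist with a single made-up domain.
blocklist = BloomFilter()
blocklist.add("malware.example")
```

The relevant property for this thread: a hit only ever means "possibly on the list", which is exactly the kind of uncertainty the warning text glosses over.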


Can’t they just compare hashes and for those where they have to do a scan, serve as an anonymous intermediary/proxy?

Perhaps similarly to Pwned Passwords API, except for files: https://www.troyhunt.com/ive-just-launched-pwned-passwords-v...
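
A rough sketch of what that could look like for files, in Python. The Pwned Passwords range API has the client send only the first five hex characters of a SHA-1 hash and compare the returned suffixes locally; the version below applies the same idea to SHA-256 file hashes. Everything here is hypothetical: `KNOWN_BAD`, `range_query`, and `check_file` are invented names, and the "server" is just a local dictionary standing in for a remote service.

```python
import hashlib

# Hypothetical server-side reputation data: full SHA-256 hex digest -> verdict.
KNOWN_BAD = {
    hashlib.sha256(b"pretend malware bytes").hexdigest(): "malicious",
}


def range_query(prefix):
    """Stand-in for a remote endpoint.

    Returns {suffix: verdict} for every known hash starting with `prefix`,
    so the server never learns which full hash the client is asking about.
    """
    return {
        digest[len(prefix):]: verdict
        for digest, verdict in KNOWN_BAD.items()
        if digest.startswith(prefix)
    }


def check_file(data, prefix_len=5):
    """Look up a file's reputation while revealing only a short hash prefix."""
    digest = hashlib.sha256(data).hexdigest()
    prefix, suffix = digest[:prefix_len], digest[prefix_len:]
    candidates = range_query(prefix)  # only the prefix leaves the client
    return candidates.get(suffix, "unknown")
```

Note that this only covers files the service has already seen; an "unknown" result says nothing about safety, and scanning a genuinely new file would still mean uploading its contents somewhere.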

There are quite a few reasons I can think of:

* Privacy

* Performance

* They would likely have to work with Virus Total to support their infrastructure as I can only imagine how quickly they'd take such a cloud service offline if everyone started using it by default

* And then what happens if / when the cloud service does have an outage? Does that mean people are blocked from downloading things?

* Same question for people on a corporate network who might have Virus Total blocked

* Same question for people on poorer internet connections as now the user has to transfer twice as much data if it's not a hash already stored on Virus Total.

That all said, it's a cool idea for a third party browser add-on (if it hasn't already been done?)


Those are good things to think about, but I'll give them a stab:

_Privacy: They [FF] can act as anonymous proxy.

_Perf: Yes, for new objects, but for known objects, it should be minimal.

_Service outage: Build a system which can allow an override (download at your own peril, while service is out)

_Corp should already have enterprisey systems in place

_New, unknown object: Yes an issue.

As someone else said [GlitchMr], if they can do it à la haveibeenpwned I think it's worth looking into.


>_Privacy: They [FF] can act as anonymous proxy.

I think he's talking about the file contents.


Indeed I was. I should have been more verbose on that bullet. :)

Ah! Good point.

cue the inevitable "Firefox is leaking every file you downloaded!"

Oh, if Chrome does it, it’s totally fine for Firefox to do it, too!!

Wait, why am I not using Chrome then in the first place?


I recently stumbled across just such a false positive, in the NextCloud showcase, no less. Not much they can do about it, as far as anyone can tell.

https://github.com/nextcloud/server/issues/9916


This is how every other browser does it too. I'd be more alarmed if my browser had an inbuilt antivirus.


You would be alarmed that the largest attack vector decided to build in a system to detect attacks?

It would be an interesting experiment to see if being direct and clear but at times inaccurate protects more novice users than being a little more wishy washy but accurate.

If only the voting public understood about statistics, false positives, and false negatives.

Even if Firefox did a full virus scan, there would still be false positives and false negatives in the results.

The system designer always has to put their reporting threshold somewhere, and that always means making a decision to bias towards false positives or false negatives. Eliminating false positives means exploding the number of false negatives.

Unfortunately, people like the OP don’t understand this, and yell angry things like “Firefox is lying to me!” Others here suggest FF should inject a bunch of weasel words like maybe and could and might. That’s a seemingly rational thing to do if you believe you’re informing a customer population that understands things like false positives and false negatives, but we have pretty clear evidence that they don’t. So FF makes a conscious informed decision to prefer reporting false positives over false negatives and then allows the user to override if they believe they know enough to do so. Sure, there are some costs to false positives, but they are dwarfed by the cost of false negatives.
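
To make the threshold trade-off concrete, here is a toy Python illustration. The scores and labels are entirely made up for the example (they are not measurements of any real scanner), and `count_errors` is a hypothetical helper.

```python
# Hypothetical (suspicion_score, is_actually_malicious) pairs.
SAMPLES = [
    (0.1, False),
    (0.2, False),
    (0.4, False),
    (0.5, True),
    (0.6, False),
    (0.7, True),
    (0.9, True),
]


def count_errors(threshold):
    """Return (false_positives, false_negatives) when flagging score >= threshold."""
    false_positives = sum(1 for score, bad in SAMPLES if score >= threshold and not bad)
    false_negatives = sum(1 for score, bad in SAMPLES if score < threshold and bad)
    return false_positives, false_negatives


for threshold in (0.3, 0.55, 0.8):
    print(threshold, count_errors(threshold))
# 0.3  -> (2, 0): nothing malicious slips through, two clean files are flagged
# 0.55 -> (1, 1)
# 0.8  -> (0, 2): no false alarms, two malicious files slip through
```

On this toy data there is no threshold that gets both numbers to zero, which is the point: the designer can only choose which kind of error to prefer.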


I would suggest it might also force some sites to more actively police their site for malware to get off of the list.

And if only developers understood UX.

The "lie" the author was complaining about was that Firefox is miscommunicating what it did: It warned that a specific file contained malware when it had actually found a suspicious domain.

Depending on context, that might make a huge difference - e.g., if a user got such a warning for a file they uploaded themselves, they might get the wrong impression that their system is compromised.

As the author noted, simply describing the actual threat would clear this up.

> Sure, there are some costs to false positives, but they are dwarfed by the cost of false negatives.

This strategy has blown up a number of times already. If you present too many false positives, users might lose trust in you and ignore your predictions altogether.


> And if only developers understood UX.

I've upvoted you, since I think your point is good, I'd just add that as a developer, I've found myself on more than one occasion pushing for the type of UX being advocated here, and getting pushback from designers/PMs. Typically, the issue I run into is a desire for the UX to be "simple", sometimes simpler than the system underlying it actually is (or is even capable of being), and so instead of clear and correct messaging, you get simplified and incorrect messaging.

(My current example of this would be ZIP codes. People like to simplify them to a geographic area in their heads, and then think of them as polygons, and then from there, think you can ask "is this lat/lng inside this ZIP?"; ZIPs aren't polygons (they're defined as segments of roads) so answering that question requires approximations; those approximations are sometimes wrong.)


IMHO, the most reasonable solution is to have a link for “More details” for the interested/advanced users to dig into. It’s hostile for software to fail because of incorrect assumptions and then give the sensible user no way to handle the collateral damage.

Yeah, that happens all the time in software. It’s the developer’s job to push back and explain the probability distribution of the results to the designers/PM so they understand the algorithm’s limitations, and everyone’s job to fight for the user. It needs to be a conversation, not a contract.

Designer: This label is required to say what store the user is at given his location.

Developer: the location sensor is imprecise. We could have 20m of precision or worse. The store location data is also known to be inaccurate and incomplete. We can provide top 10 candidate stores with their probabilities. The “top” one we return could be 95% likely or 35% likely. Your move!

Bad Designer: Too much numbers for my brain. We’ll just show the top result. YOLO!

Better designer: We could only show a result if it’s above a certain threshold. Or we could show top 3. Or maybe we need to talk about changing the requirement.

Better developer: There may be other signals and inputs we could use to help make the results more confident...
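
The "better designer" option from the exchange above might look roughly like this in Python. It is a sketch under stated assumptions: `stores_to_show`, the 0.8 threshold, and the top-3 fallback are all invented for illustration, and the candidates are assumed to arrive as (name, probability) pairs.

```python
def stores_to_show(candidates, threshold=0.8, fallback_count=3):
    """Pick what the UI displays from (store_name, probability) candidates.

    If the best guess is confident enough, show just that one;
    otherwise show the top few so the user can disambiguate.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if ranked and ranked[0][1] >= threshold:
        return ranked[:1]
    return ranked[:fallback_count]


# A 95% guess is shown alone; a 35% guess is shown alongside the runners-up.
confident = stores_to_show([("Store A", 0.95), ("Store B", 0.03)])
unsure = stores_to_show(
    [("Store A", 0.35), ("Store B", 0.30), ("Store C", 0.20), ("Store D", 0.15)]
)
```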


I generally agree, though in this case, I'd agree with the yolo designer.

If I consulted Google Maps and it told me I'm with 48.25% likelihood at address A, with 39.81% likelihood at address B and with 11.94% at another location, what exactly am I to do with this information?

I think Google Maps actually shows a good design to communicate the uncertainty: They show the uncertainty as a blue circle of varying size. It's visual but (more importantly) gives you as a user the ability to reduce the uncertainty with your own information. E.g., if you know you're at an intersection and the circle covers only one, you now have a precise location.

In general, I'd say, it's important to know in which context the information is used.


Understand security: if there is malware in one file on your server, you burn down the server and set up a new one.

If your machine gets infected, you format everything, because it is insecure by definition. You might even need to throw out the physical machine...

If you get one piece of malware, it downloads ten others, and you don't know which of them will get past your virus scanner.

It is not fun and games anymore; it is not silly nerds having fun who are doing it. It is actual crime, and really bad guys who would kill you without blinking an eye are doing malware.

If there is one infected file on your domain, you consider the whole domain compromised.


What you say is true, but how is this relevant to the discussion at hand?

It speaks to the argument that “if this site is serving up infected files, all files it serves should be treated as potentially infected”

That would include domains like dropbox.com or drive.google.com then?

Security is always complex, and simple rules always have flaws. That doesn’t mean simple rules are always bad, it just means people who build systems around them do need to understand they need to do more than just blindly follow simple rules. I’m pretty sure you’ll find FF doesn’t alert on those sites, because they are being handled by a more complex rule than a vanilla low or medium traffic site.

Funny thing is, the article says "Unfortunately, this message isn’t always accurate. Apparently, sometimes this message is an outright lie.".

So it is not like this domain hosts one bad file and it got flagged, it is apparently more often if it "sometimes .. is .. lie".


Do you trust dropbox.com to separate their servers? Do you trust Google to do a good job on security and to contain malware to a single file? If yes, then there is no need to include those.

On the other hand, if you don't trust them, then those should be included.

The question is, does Mozilla trust Google enough? Does Mozilla trust some random website where people host pirated content?

That is a random article written by some random guy. He seems more technical than the average Joe, but he does not have any statistics to show why this behaviour was implemented. It just looks like a complaint that his pirated downloads get flagged by Firefox. It is backed up by a bunch of people on /r/libgen who also use it. If they don't like it they can move to IE6.


Every file you download from a site serving pirated content has a high potential for malware.

I trust Mozilla more than some pirated content website or its users.

If you need that content more than you fear being hacked, and you understand the risks involved, you are good to go.

People who are not aware of that threat need to be warned. Telling them nicely does not work; people will click OK without reading.

As I wrote, it is not fun and games, and magic unicorns from the internetz are not giving away the latest movies/programs for free.

Yes, it is censorship, but as I wrote, if you understand the risk and know what to do, you will find a way to download it without Firefox.


Unfortunately people like yourself feel the need to respond to articles without reading them first.

And why the hell should the browser act as an antivirus? It's a browser, not an antivirus, it should assume the user knows what he's doing, not treat him like a toddler.

It's just a huge annoyance with no considerable benefit.


In the web security space it's typically called the browser security model and there are differences between browsers. The browser is not just a portal forwarding/displaying the intentions of the developers. If you want to see one of the more obvious examples look at the same origin policy and how it’s implemented across browsers.

Having said that, I do agree that browsers shouldn’t implement lots of crazy features but I personally don’t mind if they have some kind of malicious file scanning feature.


How likely do you think it is that this has something to do with some of the organisations 'donating' to Mozilla? As anti-piracy lobbying for instance.

As an advanced user you can disable a lot of "helpful" features, including safe browsing. It's in about:config

browser.safebrowsing.malware.enabled

Just set it to false IF you consider yourself a power user.


I'm not as bothered by the main point of the article, but it does raise an interesting point near the end that the comments so far have missed: if you're flagging files as potentially harmful, giving a user a choice to either execute it or delete it is kind of bad design!

No, it's not "bad design".

If there's an uncertainty in detection, there are false positives and letting the user decide is the only correct option.

Edit: Nevermind. Re-reading the article - they should indeed allow saving the file in addition to deleting or opening it.


The implied better alternative is to allow saving, instead of opening, so users can scan the files themselves.

So it's whatever list they use of malware hashes that's the issue, not Firefox, right? It's probably publishers that are adding their pirated works to it, not Mozilla. Or does Mozilla maintain it themselves for Firefox in particular?

There are several ebooks that have been uploaded to libgen that contain PDF exploits, and from what I understand there's no way to remove them.

The way that their library database works is by linking a book number to a file's md5 sum. On the filesystem they are stored something like `$drive:\$batch\$sum` where `$drive` is a Windows drive letter, `$batch` is the primary key of the document rounded to the nearest 1k, 10k or 100k depending on collection and `$sum` is the `md5sum` of the file data. The archive's file data is shared via torrents, usenet and other means in those batches, and to keep that in sync they have a policy of the primary key and sum of each file being immutable.

So if you do happen to download the literary works of mankind via their torrents, you have to do so with your antivirus turned off and hope nobody has uploaded anything too illegal over the last decade.
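
As a rough illustration of the storage layout described above, here is a Python sketch. It is a guess at the scheme, not libgen's actual code: `storage_path` is a hypothetical helper, and it assumes "rounded" means rounding the primary key down to the start of its batch.

```python
import hashlib


def storage_path(drive, doc_id, data, batch_size=1000):
    """Build a `$drive:\\$batch\\$sum`-style path for a document.

    Assumption: the batch is the primary key rounded DOWN to a multiple of
    batch_size (1k here; the comment above also mentions 10k and 100k).
    """
    batch = (doc_id // batch_size) * batch_size
    checksum = hashlib.md5(data).hexdigest()  # the file's md5 sum names the file
    return f"{drive}:\\{batch}\\{checksum}"


# Document 12345 lands in batch 12000 on a hypothetical drive D.
example = storage_path("D", 12345, b"")
```

Because the path depends only on the primary key and the md5 sum, replacing a file's contents would change its name and break sync with every copy already distributed, which fits the stated policy of keeping both immutable.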


I could be wrong but I think technically a PDF exploit only affects a single viewer program, like Acrobat on windows, right?

Well yes, and in this case we're talking files that contain an exploit for a version of Acrobat from 2006 or so and an infection vector that only works on Windows XP, and connects to a botnet that is either long dead or now an NSA/CIA asset.

But Windows Defender quite rightly still quarantines the file.


It would depend on the exploit. For a simple example, an exploit that was a result of a flaw in the file specification could result in it being cross platform.

It's going to be rarer to find something of that scope, maybe even to the point of you being effectively right.


Also dodgy files can contain multiple exploits, potentially for different platforms. The problem here, from the malicious actor's point of view, is that each vector for attack is also a vector for detection, so rather than a cesspool of exploits it makes more sense to use a single new and mostly unknown exploit that targets software used by the greatest number of victims.

It depends on the exploit and on the reader. If, for example, the reader supports javascript then it can be attacked, apart from other weaknesses. Chrome on Linux executes javascript in PDF, while Firefox does not.

Here is an example file: https://we.tl/q90gXERGmx

Built with https://github.com/cornerpirate/JS2PDFInjector


Or don't use a vulnerable PDF viewer, or OS.

Or put it anywhere a vulnerable PDF viewer or OS might stumble upon, where an overzealous scanner has write access to, or where some snitch might grab a copy from and blacklist your domains.

I'm now wondering what happens if I start posting the string "X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*" to various places. Would it result in the site getting flagged?

If anyone else doesn't like this behavior of sending hashes for everything you download to Google, set:

browser.safebrowsing.malware.enabled=false

And if you don't want phishing warnings either: browser.safebrowsing.phishing.enabled=false


Who cares? Flagging sites probably provides an overall benefit for people to be careful on sites where viruses have been detected. People pirating books will still download them ‘cause if they are on a known pirate books site they’ve already accepted some risk. People who don’t know what they're doing will still get infected.

In part I agree, but then again, it can become a problem à-la windows security prompt, where people have learned to basically ignore it and give permission to mostly anything.

In other words, in my opinion, don't cry wolf unless there's an actual wolf, or at least something that could reasonably look like one.


The history of Google's "SafeBrowsing" with respect to Firefox is a real mixed bag. The privacy protections Mozilla asked for from Google were begrudgingly implemented. As a Linux user, I used to just turn it off.

I agree that straight up calling it a virus is wrong. "This file comes from a suspicious source and might be dangerous" would be much better wording. Enough to make the user cautious, but not scare the shit out of them (yes, I've literally seen people jump out of their seats and scream when they saw a virus warning).
