I also don't understand the attack vector everyone seems worried about, especially considering that perceptual hashing isn't new, as far as I'm aware, and it hasn't yet led to any wave of innocent folks being screwed over by sham illicit images.

I think there _is_ an argument to be made about a system like this being used to track the spread of political material, and it's easy to see how such a system would be terrible for anyone trying to hide from an authoritarian power of some type, but that would already require Apple to be absolutely and completely compromised by an authoritarian regime, which isn't high on my list of concerns.




Thank you for this perspective. I've never worked at an organization of this magnitude, so I am definitely lacking some perspective.

> It's also clear Apple put a lot of thought into addressing the privacy concerns for this. Technologically, it's sophisticated, impressive.

I'm not sure about this. How is a perceptual hash sophisticated and impressive given that it can be abused by governments demanding Apple scan for political content, etc?


You know, I'm not the sharpest tool in the shed, but I mostly see a bunch of disingenuous logic from the pro-surveillance forces.

Being able to uniquely identify a picture via a hash, while a copy of that same picture sits at The Borg, is essentially the same as sending a copy; it's just a form of image compression.

Much hand-waving results as to the actual rules for pushing the data upstream, but that's just a detail, and a definable one at that.

It's probably best to view Apple products as privacy oriented to corporations not named 'Apple', but transparent to governments. That's probably enough for most people.


The answer to why people don't like this is simple: if a government like China says "Apple, you're going to add these image hashes to the database and report any device that has them in the next update, or you're going to leave China," what do you think Apple is going to do?

I have read their papers, I understand the system and the safeguards they put in place, but none of them are good enough to have scanning on my device. There is nothing that is good enough. On device scanning for "illicit" content is a box that cannot be closed.


The implementation also means Apple has plausible deniability if the “CSAM” in their database actually contains images associated with political enemies of whatever regime supplies the source material. How would you know the hash you are testing against isn’t just a Winnie the Pooh? You really can’t.

Really, it’s probably the best way to keep the police state from destroying your business while trying to sleep at night.


One thing I don’t understand about this debate is that one of the bigger concerns folks who are against the measure have is that Apple might one day add non-CSAM photos to the set of photos they scan for.

As far as I understand it, the CSAM hash database is part of the OS, which Apple can update in any way they like, including to read your messages or surreptitiously compromise the encryption of your photos (and they can force your device to install this code via a security update). We trust them not to do these things (they already have a track record of resisting the creation of backdoors), so I'm not sure why we don't trust them to also use this capability only for CSAM.

Sure, it would be technically easier for them to add the hash of the tank man photo (an example of something an oppressive government might be interested in) to their database after something like this is implemented, but it’s also not very hard for them to add scanning-for-the-tank-man-photo to their OS as it currently exists. Indeed, if the database of hashes lives on your device it makes it easier for researchers to verify that politically sensitive content is not present in that database.


The Apple system is a dangerous surveillance apparatus on many levels. The fact that I pointed out in a post that one element was broken doesn't mean that I don't consider others broken.

My primary concern about its ethics has always been the breach of your device's obligation to act faithfully as your agent. My secondary concern was the use of strong cryptography to protect Apple and its sources from accountability. Unfortunately, the broken hash function means that even if they weren't using crypto to conceal the database, it wouldn't create accountability.

Attacks on the hash function are still relevant because:

1. The weak hash function allows state actors to deniably include non-child-porn images in their database, and even to get non-cooperating states to include those hashes too.

2. The attack is lower risk for the attacker if they never need to handle unlawful images themselves. E.g., they turn a bunch of lawful porn images into matches; if they get caught with them, they just point to the lawful origin of the images, while the victim won't know where they came from.


Exactly right. The tech Apple uses can be one of two things:

1. It requires a perfect 1:1 match (their documentation says this is not the case); or

2. It has some freedom in detecting a match, probably accepting anything above a certain similarity threshold.

If it's the former, it's completely useless: add a watermark, or change a single randomly chosen pixel to a slightly different hue, and the hash would be completely different.
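
For a concrete illustration, here's a minimal Python sketch using the open-source imagehash library (a generic perceptual hash, not Apple's NeuralHash; the filename is a placeholder). A one-pixel tweak flips a cryptographic hash entirely, while the perceptual hash stays within a tiny Hamming distance:

    # Minimal sketch with a generic perceptual hash (imagehash), NOT Apple's
    # NeuralHash. "photo.jpg" is a placeholder filename.
    import hashlib

    from PIL import Image
    import imagehash  # pip install imagehash

    original = Image.open("photo.jpg").convert("RGB")

    # Nudge a single pixel's red channel by one.
    tweaked = original.copy()
    r, g, b = tweaked.getpixel((0, 0))
    tweaked.putpixel((0, 0), (min(r + 1, 255), g, b))

    # Cryptographic hash: any change to the bytes flips the digest entirely.
    print(hashlib.sha256(original.tobytes()).hexdigest())
    print(hashlib.sha256(tweaked.tobytes()).hexdigest())  # completely different

    # Perceptual hash: the edit lands within a tiny Hamming distance, so a
    # threshold-based comparison still treats the two images as the same.
    print(imagehash.phash(original) - imagehash.phash(tweaked))  # typically 0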

So, it's not #1. It's going to be #2. And that's where it becomes dangerous. The government of the USA is going to look for child predators. The government of Saudi Arabia is going to track down known memes shared by atheists, and they will be put to death; heresy is a capital offence over there. And China will probably do their best to track down Uyghurs so they can make the process of elimination even easier.

It's not like Apple hasn't given in to dictatorships in the past. This tech is absolutely going to kill people.


Literally, Apple said in their own FAQ that they are using a perceptual (similarity-based) hash and that their employees will review images when flagged. If that's not good enough (somehow), then even the New York Times article about it says the same thing. What other evidence do you need?

All I'm saying is that I understand why Apple doesn't want CSAM on their servers, while also reserving the ability to implement end-to-end encryption for photo galleries in the future.

It's a tough line for them to walk.

Personally I don't see any substantial moral difference between what Apple was planning and what Google and Facebook already do, aside from the somewhat icky aspect that some initial CPU cycles occur on hardware I own.

I'm equally concerned that a purported need to scan for CSAM could be used by legislators as a way to convince the population that the Government needs more encryption backdoors. Personally I'd rather have my photos inconsequentially hashed than have encryption legislatively broken.


I've created and posted on github a number of visually high quality preimages against Apple's 'neuralhash' [1][2] in recent days.

I won't be posting any more preimages for the moment. I've come to learn that Apple has begun responding to this issue by telling journalists that they will deploy a different version of the hash function[3].

Given Apple's consistently dishonest[4] conduct on the subject, I'm concerned that they'll simply add the examples here to their training set to make sure those specific cases are fixed, without resolving the fundamental weaknesses of the approach, or that they'll use improvements in the hash function to obscure the gross recklessness of their whole proposal. I don't want to be complicit in improving a system with such a potential for human rights abuses.

I'd like to encourage people to read some of my posts on the Apple proposal to scan users' data, which were made prior to the hash function being available. I'm doubtful they'll meaningfully fix the hash function -- this entire approach is flawed -- but even if they do, it hardly improves the ethics of the system at all. In my view the gross vulnerability of the hash function is mostly relevant because it speaks to a pattern of incompetence and a failure to adequately consider attacks and their consequences.

- https://news.ycombinator.com/item?id=28111959 Your device scanning and reporting you violates its ethical duty as your trusted agent.

- https://news.ycombinator.com/item?id=28111908 Apple's human review exists for the express purpose of quashing your Fourth Amendment right against warrantless search.

- https://news.ycombinator.com/item?id=28121695 Apple is not being coerced to perform these searches and if they were that would make their actions less ethical, not more.

- https://news.ycombinator.com/item?id=28097304 Apple uses complex crypto to protect themselves from accountability.

- https://news.ycombinator.com/item?id=28124716 A simplified explanation of a private set intersection.

- https://news.ycombinator.com/item?id=28101009 Perceptual hashes at best slightly improve resistance to false negatives at the expense of destroying any kind of cryptographic protection against false positives (as this thread has shown!). Smart perverts can evade any perceptual hash, dumb ones won't alter the images.

- https://news.ycombinator.com/item?id=28097508 Apple's system and ones like it likely create an incentive to abuse more children.

And these posts written after:

- https://news.ycombinator.com/item?id=28260264 A second "secret" hash function cannot be secret from the state actors that produce the database for Apple.

- https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issues/1#issuecomment-903181678 Fuzzy hashes with resistance against false positives traceable to SHA-256 are possible, but require you to value privacy over avoiding false negatives.

[1] https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issues/1#issuecomment-903094036

[2] https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issues/1#issuecomment-902977931

[3] "Apple however told Motherboard in an email that that version analyzed by users on GitHub is a generic version, and not the one final version that will be used for iCloud Photos CSAM detection." https://www.vice.com/en/article/wx5yzq/apple-defends-its-anti-child-abuse-imagery-tech-after-claims-of-hash-collisions

[4] https://news.ycombinator.com/item?id=28221538


What makes you think that Apple has a database of actual child sex abuse images? Does that feel like a thing you'd be OK with? "Oh, this is Jim, he's the guy who keeps our archive of sex abuse photographs here at One Infinite Loop"? If you feel OK with that at Apple, how about at Facebook? Tencent? What about the new ten-person SV start-up would-be Facebook killer whose main founder had a felony conviction in 1996 for violating the Mann Act? Still comfortable?

Far more likely, Apple takes a bunch of hashes from a third party on the law enforcement side of things (i.e. cops) and trusts that the third party is only giving them hashes that protect against the Very Bad Thing that Apple's customers are worried about.

Whereupon what you're actually trusting isn't Tim Cook, it's a cop. I'm told there are good cops. Maybe all this is done exclusively by good cops. For now.

Now, I don't know about the USA, but around here we don't let cops just snoop about in our stuff, on the off-chance that by doing so they might find kiddie porn. So it should be striking that apparently Apple expects you to be OK with that.


The concern is that Apple is handing governments a new tool to go “give me a list of all users that have this photo”. It could be used to track down dissidents, and combined with metadata it is probably sufficient to pinpoint who took a particular picture.

Think you shared that picture of police brutality anonymously? Think again.
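
To make that concrete, here's a purely hypothetical Python sketch; the index, the hash values, and the filename are all invented for illustration and have nothing to do with Apple's published design:

    # Hypothetical illustration only; nothing here reflects Apple's actual design.
    import imagehash
    from PIL import Image

    # Toy index a surveillance-minded operator might hold:
    # perceptual hash (as a hex string) -> users whose libraries matched it.
    hash_index: dict[str, set[str]] = {
        "d1c4e6a9b2f07358": {"user_17", "user_204"},  # made-up entries
    }

    # Hash the photo a government wants traced (placeholder filename).
    target = str(imagehash.phash(Image.open("protest_photo.jpg")))

    # Everyone whose library contained a match is now a person of interest.
    print(hash_index.get(target, set()))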


For argument (1), they are only looking for matches against the existing database of hashes that NCMEC provides. They are not developing general AI to identify new pictures; they are only trying to stop redistribution of known files. Because of that, their claimed 1-in-1-trillion false-positive rate might actually be close to correct, since it is easy to validate during development. Also, there is human verification before law enforcement is involved.

For argument (2), this might be valid, but yet again, all we can do is trust Apple, as we already do by using their closed-source system. The model can be changed, but isn't it still a better option than storing everything unencrypted? (Assuming you mean forging hashes to decrypt content.)

As for surveillance, that is not a strong argument because, again, the system is closed and we only know what they tell us. Creating such a model is trivial, and nothing stops a government from demanding it if Apple were willing to allow that. The system would be no different from the antivirus engines that have existed since the 1980s.

This is such a PR failure for Apple, because all their upcoming features actually improve privacy in the CSAM area; everything negative comes from speculation about abuse that was equally possible before.


I’m not sure which side you’re arguing here.

The biggest concern about Apple’s system is that it’s very easy to add new items to a hash list. That is an argument about the technical similarity of scanning for CSAM and scanning for other things like classified documents (for example).

But there is a vast difference in principle. Pretty much everyone wants to stop child abuse. But many people—including major news organizations—believe citizens should sometimes have the opportunity to view classified documents.

Different categories of things to scan for will be different in principle, even if the technical approach is similar. This difference in principle is what Apple leans on when they say they will oppose any request to expand their system beyond CSAM.


Again a "what if" scenario.

It is not trivial to identify a picture that is uniquely present on Assange's phone, at least not without access to Assange's phone. And if you have access to Assange's phone, you probably already have way more information than such a trick would give you.

But yes, it is a problem, and a reason why I dislike CSAM scanning. But from "allowing the government to track highly sensitive individuals" to "you, the average citizen, will go to jail", there is a huge gap.

Also, I think one thing many people missed is that a hash match is not enough. The only thing a match does is allow a reviewer at Apple to decrypt the matching file. If the picture turns out to be nothing special, nothing will happen.

In fact, that feature is much weaker than people make it out to be; it just puts Apple on par with what others like Google, Microsoft, etc. can already do. I think the reason it got so much negative press is that Apple made privacy a big selling point. In a way it serves them right, since I don't like how Apple treats privacy as just a way to attack Google without really committing to it, but on the facts it is a bit unfair.


Well, first of all, it's not provided by the US government. It's a non-profit, and Apple has already said they're going to look for another DB from another nation and only include hashes that appear in the intersection of the two, to prevent exactly this kind of attack.
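
A rough sketch of that safeguard as I read it (the hash values are made up; this is not Apple's actual pipeline): a hash only ships to devices if it appears in the databases of two independent jurisdictions, so neither source can unilaterally slip in a political image.

    # Rough sketch of the two-jurisdiction safeguard described above.
    # Hash values are made up; this is not Apple's actual pipeline.
    ncmec_hashes = {"aa11", "bb22", "cc33"}
    other_jurisdiction_hashes = {"bb22", "cc33", "dd44"}

    # Only hashes vouched for by BOTH sources make it into the on-device set.
    shippable = ncmec_hashes & other_jurisdiction_hashes
    print(shippable)  # {'bb22', 'cc33'}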

If what you mean by blinded is that you don't know what the source image is for the hash, that's true. Otherwise Apple would just be putting a database of child porn on everyone's phones. You gotta find some kind of balance here.

What do you mean you can't verify it doesn't contain extra hashes? Meaning that Apple will say "here are the hashes on your phone" while secretly keeping extra hashes they're not telling you about? Not only is that the kind of thing security researchers would quickly find, but you're also assuming something very sinister of Apple: that they'll only tell you half the story. If that were the case, then why offer the hashes at all? It's an extremely cynical take.

The reality is that the complaints about this system started with this specific implementation, and as details got revealed, they've shifted to future hypothetical situations. I'm personally concerned about future regulations, but those regulations could/would exist independently of this specific system. Further, Dropbox, Facebook, Microsoft, Google, etc. all have user data unencrypted on their servers and are just as vulnerable to said legislation. If the argument is that this is searching your device, well, the current implementation only searches what would be uploaded to a server anyway. If you suggest that could change to anything on your device due to legislation, wouldn't that happen anyway? And then what is Google going to do... not follow the same laws? Both companies would have to implement new architectures and systems to comply.

I'm generally concerned about the future of privacy, but I think people (including myself initially) have gone too far in losing their minds.


I would argue that this Apple system is not so much surveillance as a censorship mechanism.

They check for a finite set of "bad" things that no one is allowed to have. Because they went so far out of their way to avoid learning anything else about your photos, I think the argument is going to get very messy if we try to argue the surveillance angle. It gets very nuanced very quickly, and public opinion doesn't do nuance well.

It's a censorship tool. This argument is straightforward and easy. China can add Tank Man to the list of bad hashes, and now nobody is allowed to see him. The entire argument is now about what information should be censored, and who we trust to maintain the badlist.

(Edit: Otherwise I agree with everything that matthewdgreen wrote above.)


This is a bizarre result, but so... what's the conclusion? That only a few things like Apple's proprietary image lookup are able to tap into the ANE so far? Or that it's actually just a marketing gimmick?

Reading this makes me wonder if it's not just a placeholder for some kind of intrusive system that will neural-hash everything you own, but I'm sure I'm just being paranoid.


Assuming the user trusts Apple, that may be palatable to some. But the crux is that the images they're matching against are in a database maintained by an unaccountable nonprofit (NCMEC).

Apple did this so their staff wouldn't have to see the offending images until/unless a match was actually confirmed.

However, the matching is all algorithmic and who knows what's in the CSAM database. It could be images of police brutality so law enforcement can ferret out protesters, for all we know. Want to round up all communists? This is an easy way to do so.
