
Well first of all, it's not provided by the US government. It's a non-profit, and Apple has already said they're going to source a second database from another nation and only include hashes that appear in both (the intersection of the two) to prevent exactly this kind of attack.

If what you mean by blinded is that you don't know what the source image is for the hash, that's true. Otherwise Apple would just be putting a database of child porn on everyone's phones. You gotta find some kind of balance here.

What do you mean you can't verify it doesn't contain extra hashes? Meaning that Apple will say here are the hashes on your phone, but secretly will have extra hashes they're not telling you about? Not only is this the kind of thing that security researchers would quickly find, you're also assuming a very sinister posture from Apple: that they'll only tell you half the story. If that were the case, then why offer the hashes at all? It's an extremely cynical take.

The reality is that the complaints about this system started with this specific implementation, and then, as details got revealed, it became all about future hypothetical situations. I'm personally concerned about future regulations, but those regulations could/would exist independently of this specific system. Further, Dropbox, Facebook, Microsoft, Google, etc. all have user data unencrypted on their servers and are just as vulnerable to said legislation. If the argument is that this is searching your device, well, the current implementation only searches what would be uploaded to a server anyway. If you suggest that could change to anything on your device due to legislation, wouldn't that happen anyway? And then what is Google going to do... not follow the same laws? Both companies would have to implement new architectures and systems to comply.

I'm generally concerned about the future of privacy, but I think people (including myself initially) have gone too far in losing their minds.




Thank you for this perspective. I've never worked at an organization of this magnitude, so I am definitely lacking some perspective.

> It's also clear Apple put a lot of thought into addressing the privacy concerns for this. Technologically, it's sophisticated, impressive.

I'm not sure about this. How is a perceptual hash sophisticated and impressive given that it can be abused by governments demanding Apple scan for political content, etc?


The Apple system is a dangerous surveillance apparatus at many levels. The fact that I pointed out one element was broken in a post doesn't mean that I don't consider others broken.

My primary concern about its ethics has always been the breach of your device's obligation to act faithfully as your agent. My secondary concern was the use of strong cryptography to protect Apple and its sources from accountability. Unfortunately, the broken hash function means that even if they weren't using crypto to conceal the database, it wouldn't create accountability.

Attacks on the hash-function are still relevant because:

1. The weak hash function allows state actors to deniably include non-child-porn images in their database and even get non-cooperating states to include those hashes too.

2. The attack is lower risk for the attacker if they never need to handle unlawful images themselves. E.g., they turn a bunch of lawful porn images into matches; if they get caught with them, they just point to the lawful origin of the images, while the victim won't know where they came from.


How would Apple know what the content was that was flagged if all they are provided with is a list of hashes? I completely agree it's ludicrous, but there are plenty of countries that want that exact functionality.

Their use of a highly vulnerable[1] "neural" perceptual hash function makes the database unauditable: an abusive state actor could obtain child porn images and invisibly alter them to match the hashes of the ideological or ethnically related images they really want to match. If challenged, they could produce child porn images matching their database, and they could hand these images to other governments to include, unknowingly or with plausible deniability.

...but they don't have to do anything that elaborate because Apple is using powerful cryptography against their users to protect themselves and their data sources from any accountability for the content of the database: The hashes in the database are hidden from everyone who isn't Apple or a state agent. There is no opportunity to learn, much less challenge the content of the database.

[1] https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...
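To make the perceptual-hash point concrete, here is a toy difference hash (dHash) in Python. Apple's NeuralHash is a learned neural-network hash, not dHash, and the file names below are hypothetical; the sketch only illustrates why a hash designed to survive small visual changes can also be steered by small, deliberate pixel changes.

    # Toy perceptual hash (dHash) -- not Apple's NeuralHash, just an illustration
    # of why "visually similar images get the same hash" cuts both ways.
    from PIL import Image  # pip install Pillow

    def dhash(path: str, hash_size: int = 8) -> int:
        """Difference hash: compare adjacent pixels of a tiny grayscale version."""
        img = Image.open(path).convert("L").resize((hash_size + 1, hash_size), Image.LANCZOS)
        px = list(img.getdata())
        bits = 0
        for row in range(hash_size):
            for col in range(hash_size):
                left = px[row * (hash_size + 1) + col]
                right = px[row * (hash_size + 1) + col + 1]
                bits = (bits << 1) | (left > right)
        return bits

    def hamming(a: int, b: int) -> int:
        """Number of differing bits between two hashes."""
        return bin(a ^ b).count("1")

    # Hypothetical files: a near-duplicate lands within a small Hamming distance,
    # which is exactly the property an attacker exploits by nudging pixels until
    # an innocuous-looking image matches a targeted hash.
    # print(hamming(dhash("original.jpg"), dhash("slightly_edited.jpg")))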


There's already a problem that Apple can't verify the hashes. Say a government wants to investigate a certain set of people. Those people probably share specific memes and photos. Add those hashes to the list and now you have reasonable cause to investigate these people.

Honestly this even adds to the danger of hash collisions because now you can get someone on a terrorist watch list as well as the kiddy porn list.


The government doesn't supply the hashes in an unauditable way; that is a totally false statement.

The hashes are supplied by NCMEC, a non-profit which is auditable, not a secret government agency.

In any case, even if a non-CSAM hash were somehow in the database, Apple reviews the images before making reports, and those reports are used in normal criminal prosecutions.


The FT article mentioned it was US only, but I'm more afraid of how other governments will try to pressure Apple to adapt said technology to their needs.

Can they trust a random government to give them a database of only CSAM hashes and not insert some extra politically motivated content that they deem illegal?

Because once you've launched this feature in the "land of the free", other countries will require their own implementation for their own needs and demand (through local legislation which Apple will need to abide by) to control said database.

And how long until they also scan browser history for the same purpose? Why stop at pictures? This is opening a very dangerous door that many here will be uncomfortable with.

Scanning on their own premises (which, as far as we know, they can do) would be a much better choice; this is everything but privacy-forward, whatever the linked "paper" tries to say.


I think the remote execution with a remote hash database is the key part that you need to focus on. Checking for faces is something that doesn't need much information outside your phone itself.

What Apple is proposing is basically adding a feature to scan any user's phone for a collection of hashes. Even if they say they will only use this for CSAM, this sends a strong message to all government agencies around the world that the capability is there. Maybe for US citizens this doesn't sound dangerous, but if I were a minority or a critic of the government in a more authoritarian country, I would jump ship from Apple products right away.


For argument (1), they are only looking for matches against the existing database of hashes that NCMEC provides. They are not developing general AI to identify new pictures; they only try to stop redistribution of known files. Because of that, their claim of 1-in-1-trillion false positives might actually be close to correct, since it is easily validated during development (a rough sanity check follows below). Also, there is human verification before law enforcement is involved.

For argument (2), this might be valid, but yet again, all we can do is trust Apple, as we do all the time by using their closed-source system. The model can be changed, but it is still a better option than storing everything unencrypted? (In case you mean forging hashes to decrypt content.)

As for surveillance, it is not a strong argument because, again, the system is closed and we only know what they say. Creating such a model is trivial, and nothing is stopping a government from demanding it if Apple were willing to allow that. The system would be no different from antivirus engines, which have existed since the 1980s.

This is such a PR failure for Apple, because all of these incoming features improve privacy in the CSAM area; everything negative comes from speculation about things that were equally possible already.
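Here is the sanity check promised above on the 1-in-1-trillion claim. The per-image false-match rate and library size are made-up assumptions (Apple has not published a per-image rate); the 30-match threshold is the figure Apple has cited. The point is just that the threshold, not the per-image hash accuracy alone, drives the account-level rate down.

    # Back-of-the-envelope account-level false-positive estimate.
    # ASSUMPTIONS (not Apple's published numbers): per-image false-match
    # probability p and library size n below are invented; the threshold of 30
    # is the figure Apple has cited publicly.
    from math import comb

    p = 1e-6        # assumed per-image false-match probability
    n = 10_000      # assumed number of photos being uploaded
    threshold = 30  # matches required before anything is decryptable

    # P(at least `threshold` false matches) for Binomial(n, p); the first few
    # terms dominate because they shrink extremely fast.
    prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(threshold, threshold + 10))
    print(f"P(account falsely flagged) ~ {prob:.2e}")

With those assumed inputs the result is around 10^-93, i.e. even a per-image rate far worse than advertised would still leave the account-level rate dominated by the threshold.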


I also don't understand the vector everyone seems worried about, especially considering that perceptual hashing isn't new, as far as I'm aware, and hasn't seemingly yet led to any sort of wave of innocent folks being screwed over by sham illicit images.

I think there _is_ an argument to be made about a system like this being used to track the spread of political material, and it's easy to see how such a system would be terrible for anyone trying to hide from an authoritarian power of some type, but that'd already require Apple is absolutely and completely compromised by an authoritarian regime, which isn't high on my list of concerns.


Find the websites distributing it, infiltrate them, generally do the legwork to find what's going on - which is exactly what they've been doing.

"But the children!" is not a skeleton key for privacy, as far as I'm concerned.

I reject on-device scanning for anything in terms of personal content as a thing that should be done, so, no, I don't have a suggested way to securely accomplish privacy invasions of this nature.

I'm aware that they claim it will only be applied to iCloud based uploads, but I'm also aware that those limits rarely stand the test of governments with gag orders behind them, so if Apple is willing to deploy this functionality, I have to assume that, at some point, it will be used to scan all images on a device, against an ever growing database of "known badness" that cannot be evaluated to find out what's actually in it.

If there existed some way to independently have the database of hashes audited for what was in it, which is a nasty set of problems for images that are illegal to store, and to verify that the database on device only contained things in the canonical store, I might object slightly less, but... even then, the concept of scanning things on my private, encrypted device to identify badness is still incredibly objectionable.

In the battle between privacy and "We can catch all the criminals if we just know more," the government has been collecting huge amounts of data endlessly (see Snowden leaks for details), and yet hasn't proved that this is useful to prevent crimes. Given that, I am absolutely opposed to giving them more data to work with.

I would rather have 10 criminals go free than one innocent person go to prison, and I trust black box algorithms with that as far as I can throw the building they were written in.


The EFF article refers to a "classifier", not just matching hashes.

So, three different things.

I don't know how much you know about them, but this is what the EFF's role is. Privacy can't be curtailed uncritically or unchecked. We don't have a way to guarantee that Apple won't change how this works in the future, that it will never be compromised domestically or internationally, or that children and families won't be harmed by it.

It's an unauditable black box that places one of the highest, most damaging penalties in the US legal system against a bet that it's a perfect system. Working backwards from that, it's easy to see how anything that assumes its own perfection is an impossible barrier for individuals, akin to YouTube's incontestable automated bans. Best case, maybe you lose access to all of your Apple services for life. Worst case, what, your life?

When you take a picture of your penis to send to your doctor and it accidentally syncs to iCloud and trips the CSAM alarms, will you get a warning before police appear? Will there be a whitelist to allow certain people to "opt-out for (national) security reasons" that regular people won't have access to or be able to confirm? How can we know this won't be used against journalists and opponents of those in power, like every other invasive system that purports to provide "authorized governments with technology that helps them combat terror and crime[1]".

Someone's being dumb here, and it's probably the ones who believe that fruit can only be good for them.

[1] https://en.wikipedia.org/wiki/Pegasus_(spyware)


What makes you think that Apple has a database of actual child sex abuse images? Does that feel like a thing you'd be OK with? "Oh, this is Jim, he's the guy who keeps our archive of sex abuse photographs here at One Infinite Loop" ? If you feel OK with that at Apple, how about at Facebook? Tencent? What about the new ten-person SV start-up would-be Facebook killer whose main founder had a felony conviction in 1996 for violating the Mann Act. Still comfortable?

Far more likely, Apple takes a bunch of hashes from a third party on the law enforcement side of things (i.e. cops) and trusts that the third party is definitely giving them hashes to protect against the Very Bad Thing that Apple's customers are worried about.

Whereupon what you're actually trusting isn't Tim Cook, it's a cop. I'm told there are good cops. Maybe all this is done exclusively by good cops. For now.

Now, I don't know about the USA, but around here we don't let cops just snoop about in our stuff, on the off-chance that by doing so they might find kiddie porn. So it should be striking that apparently Apple expects you to be OK with that.


Isn't it? Honestly asking.

Apple are having people's handsets check file hashes against a hash list and reporting anyone who has files with hashes on the list, right? And there is a threshold of matches below which you don't get reported. Above that, they lock your account.

The fact they're currently limiting it to image files and the US doesn't seem like much of a difference.

Am I missing some clever defence against misuse?
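Spelled out, the mechanic described above is roughly the following. This is a deliberately stripped-down sketch: it uses plain SHA-256 over a local list rather than Apple's perceptual hash and blinded database, and the 30-match threshold is the figure Apple has cited, so treat the details as illustrative only.

    # Minimal sketch of "hash every file, compare to a list, report only past a
    # threshold". The real system uses a perceptual hash and a blinded database;
    # plain SHA-256 and a local set are used here purely to show the reporting logic.
    import hashlib
    from pathlib import Path

    KNOWN_BAD_HASHES: set[str] = set()  # hypothetical hash list shipped to the device
    REPORT_THRESHOLD = 30               # below this, nothing is reported

    def sha256_of(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def account_flagged(photo_dir: Path) -> bool:
        matches = sum(1 for f in photo_dir.glob("*.jpg") if sha256_of(f) in KNOWN_BAD_HASHES)
        return matches >= REPORT_THRESHOLD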


This argument falls exactly into the point I make about not understanding the implementation or the status quo.

Here's what would need to change about the system to fulfil your requirements.

1. The local government would need to pass a law allowing such a change.

2. The hashes would then need to come from the government, instead of from the intersection of two independent CSAM databases.

3. The review process would need to be redesigned and deployed to provide a lower threshold and permit the reviewers to see high-resolution images. Neither of these changes is trivial.

4. The reports would then need to go to the government.

5. What's stopping these same governments from requesting such things from Google and Meta, both of which have comparable systems with lower oversight?

Apple doesn't "ALWAYS follow the laws of the land"; one can see this in the recent "secret" agreement between Apple and China, which details the various ways that Apple hasn't complied with Chinese requests (e.g. the requests for source code).


> If hashes are uploaded to devices, they can be extracted and images that clash against it can be created.

Many organizations have the hashes, so they could leak nonetheless. Either way, I don't think that's a major problem. If the system interprets a picture of a pineapple as CSAM, you only need to produce the picture of a pineapple to defend yourself against any accusations. If clashes are too commonplace, the entire system would become unreliable and would have to be scrapped.

In any case, I have looked it up. The database is indeed on the device, but it's encrypted:

https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

> Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices.

Overall, after reading the PDF, here is my understanding of the process:

1. Apple gathers a set of "bad hashes"

2. They upload to each device a map from a hashed bad hash to an encrypted bad hash

3. The device runs an algorithm that determines whether there are matches with hashed bad hashes

4. For each match, the device uploads a payload encrypted using a secret on-device key, and a second payload that contains a "share" of the secret key, encrypted using the neural hash and encrypted bad hash.

5. The device also periodically uploads fake shares with dummy data to obfuscate the number of matches that actually occurred. Apple can't tell fake shares from real ones unless they have enough real shares.

6. Once Apple has enough real shares, they can figure out the secret key and know which hashes caused a match.

The main concern I have, and as a non-expert, is step 2: it requires Apple to provide their key to an auditor who can cross-check with child protection agencies that everything checks out and no suspect hashes are included in the payload. In theory, that needs to be done every time a new on-device database is uploaded, but if it is done, or if child protection agencies are given the secret so that they can check it themselves, I think this is a fairly solid system (notwithstanding the specifics of the encryption scheme which I don't have the competence to evaluate).

The thresholding is also a reassuring aspect of the system, because (if it works as stated) the device can guarantee that Apple can't see anything at all until a certain number of images match, not even the count of matching images. The threshold could only be changed with an OS update.
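As an illustration of how a threshold like that can be enforced by the cryptography rather than by policy, here is a minimal Shamir-style secret-sharing sketch. The parameters are toy values, and whether Apple's vouchers use exactly this construction is my assumption based on the summary rather than a claim from the PDF: below the threshold the shares reveal nothing about the key; at the threshold the key falls out.

    # Toy Shamir secret sharing over a prime field: any `threshold` shares
    # reconstruct the secret; fewer shares are useless. Not Apple's actual
    # construction -- just the standard primitive behind "enough vouchers
    # unlock the decryption key".
    import random

    PRIME = 2**127 - 1  # a Mersenne prime, comfortably larger than the secret

    def make_shares(secret: int, threshold: int, count: int) -> list[tuple[int, int]]:
        # Random polynomial of degree threshold-1 whose constant term is the secret.
        coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
        def f(x: int) -> int:
            return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
        return [(x, f(x)) for x in range(1, count + 1)]

    def reconstruct(shares: list[tuple[int, int]]) -> int:
        # Lagrange interpolation at x = 0 recovers the constant term.
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num, den = 1, 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * (-xj) % PRIME
                    den = den * (xi - xj) % PRIME
            secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
        return secret

    device_key = 0x5EC12E7                       # stand-in for the per-device key
    shares = make_shares(device_key, threshold=30, count=100)
    assert reconstruct(shares[:30]) == device_key    # 30 shares: key recovered
    assert reconstruct(shares[:29]) != device_key    # 29 shares: wrong value, key stays hidden

Presumably the fake shares mentioned in step 5 are points that don't lie on the real polynomial, so they add noise without ever aiding reconstruction.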

There's certainly a lot of things to discuss and criticize about their system, but it's going to be difficult to do so if nearly no one even bothers reading about how it works. It's frustrating.


First off, I don’t think this is some evil plan to kill our privacy. I think this project is done with good intentions, if nothing else.

However I think this is an interesting question: how does Apple know that the hashes they’re supplied match CSAM, and not, say, anti-government material? How would they know if the people they got hashes from started supplying anti-government hashes? Apple will only be receiving the hashes here - by design, even they won’t have access to the underlying content to verify what the hashes are for.


>They can't see the scan result until the device tells them that 30 images have matched kiddie porn

Isn't this FALSE? The device hashes the images but it does not have the database, so the hashes are sent to the server, and Apple's servers compare your hashes with the secret database, so Apple knows how many matches you have.

Your argument would make sense ONLY IF your images were encrypted and Apple had no way to decrypt them, so that the only way to compute the hash is with the creepy on-device code.


> Apple decided to implement a similar process, but said it would do the image-matching on a user's iPhone or iPad, before it was uploaded to iCloud.

Is this list of hashes already public? If not, it seems like adding it to every iPhone and iPad will make it public. I get the "privacy" angle of doing the checks client-side, but it's a little like verifying your password client-side. I guess they aren't concerned about the bogeymen knowing with certainty which images will escape detection.

