So, who can have access to the data that Fastly has access to? Because it seems that, URL by URL, the Chrome browser will send a user's entire browsing history to a third party.
1. If Google is paying Fastly, what prevents Google from paying Fastly some more to divulge the user information as well? As far as I understand OHTTP, the model assumes that the parties are not aligned, which is not true for the Fastly:Google relationship.
2. Even if 1. is not true: if the encrypted body is not salted (randomized), it is effectively just a unique hash of the original URL, which means a fourth party could infer the content of the message by encrypting many candidate URLs of interest under the Google encryption key. Salt is not mentioned in the article. (A rough sketch of this concern follows below.)
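To make that concern concrete, here is a rough sketch, purely hypothetical: if the client's output for a given URL were deterministic, anyone able to compute the same transformation could recover URLs by test-encoding a candidate list. Randomized per-request encryption (as in HPKE encapsulation) is exactly what prevents this.

    # Hypothetical illustration only: if a URL were encoded with an unsalted,
    # deterministic transformation, an observer with the same capability could
    # recover it from a candidate list. Randomized (per-request) encryption,
    # as in HPKE encapsulation, is what prevents this.
    import hashlib

    def deterministic_encode(url: str) -> bytes:
        # Stand-in for any unsalted, deterministic encoding of a URL.
        return hashlib.sha256(url.encode()).digest()

    candidates = [
        "https://example.com/health/condition-a",
        "https://example.com/forum/topic-b",
        "https://example.com/news/story-c",
    ]
    dictionary = {deterministic_encode(u): u for u in candidates}

    observed = deterministic_encode("https://example.com/forum/topic-b")
    print(dictionary.get(observed))  # recovers the original URL, no key needed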
I'm not a Chrome user, so this is just idle curiosity. In fact I like Google and many of the things they've made affordable to the world, but I also do not like the tracking of people, for most of the reasons it is done and most of the ways it is done.
How far does this scenario go, though? It would be easier for Google to just deploy a modified client (Chrome) that skips the blinding infrastructure, or that sends the user information to them directly anyway. In the end, unless you're reading and building every line of code (and maybe even the hardware), you are delegating trust.
How far would a multi-billion-dollar company go in order to protect its main source of income? They have a history of privacy assurances that fell apart under serious inspection. The FLoC scheme was a few months back, the Location History scandal was a few years back, and I'm only mentioning the ones at the top of my head.
That's fair. I suppose my point is that there are easier ways; I don't think there's any way to technically achieve full privacy on the current design of the internet that couldn't be circumvented by operational changes. But perhaps the point is more that technologies like this are used to provide false assurance through obscurity. Although that does sound skeptical and bleak, given there are no real solutions to the problem.
Correct me if I'm wrong, but either the user's encryption key is unique, and hence the user is identifiable to Google, or a finite number of keys is used, which makes the encrypted text subject to identification because only a finite number of inputs can produce a given output.
My cryptography education is rather superficial, so I might be missing something.
I know nothing about what has been implemented by Google/Fastly. All I can point you to is the details of Cloudflare's implementation and RFCs (which we co-authored): https://datatracker.ietf.org/doc/draft-ietf-ohai-ohttp/
OHTTP does require that the parties don't collude, which is why Google has engaged Fastly to run the relay service (which sees end-user identifying data) and is itself running the gateway service (which sees the end-user request body).
The contract terms include not delivering log data to Google for this service, among other provisions that help ensure this separation of knowledge is upheld.
First, thanks for the answers - both this comment and the other in the thread. HN shines again when it comes to access to the sources of information.
Second, as I said in another comment, I'm not a Chrome user and I'm asking more for personal entertainment. However, I think I'm asking the questions that anyone outside the space would ask looking in from the outside. Hopefully your answers are of use to someone else.
Third, my personal, biased opinion is that this will not resolve any of the issues surrounding Google and online tracking. I lost my trust in Google many years ago and things haven't changed since then. Even this initiative, which is supposed to underpin the privacy and the choice of the user, is delivered as a corporate project, with Google choosing who decides on the allowed URLs, who the OHTTP provider is, and everything else about the parameters of the "deal". As I said, I cannot comment on the cryptography, but nothing else in the whole story gives me confidence that user choice has been upheld as a value. I doubt anyone will have their opinion changed by all of this.
Measures that could have demonstrated some transparency: Google not being the sole authority over the allow list, people being able to choose the OHTTP provider, the authority being granted to an NGO with transparent rules and a transparent decision-making process, independent oversight...
Appreciate your questions and feedback. There's nothing wrong with some healthy skepticism. Ultimately this solution depends on the tech and implementation but it also requires a degree of user trust. I've been happy to see both Fastly and Google being pretty transparent about what's going on and how it works, in order to start establishing that trust.
I can't speak to your points about Google specifically, but I have appreciated in my interactions with the Privacy Sandbox team that they are putting a lot of energy into delivering these services while also respecting user privacy.
On the Fastly side, I see an opportunity to deliver OHTTP services for a bunch of additional use cases and to other customers. I think this could be a powerful tool to enable privacy for all sorts of things, like metrics and log collection and other kinds of API access. The spec right now requires the client to know various things, which forces a tight coupling between client -> relay -> gateway -> target, but I think there are ways that could be adjusted in future revisions. And not all of the opportunities I'm exploring are for commercial entities, to your point about NGOs.
I'm also working on some other privacy enablement services, like Fastly Privacy Proxy (which is one of the underlying providers for Apple's iCloud Private Relay) and some unannounced things. Between these various technologies, I think Fastly can help raise the bar for end-user privacy across the industry.
Ultimately we are a business and we like making money. I think we can do that in this space by delivering real value to our customers and their end users via these building-block services that help them build privacy-enabled products. I'm hopeful that, as we explore more opportunities in this space and OHTTP adoption increases, user trust continues to be built in both the OHTTP technology and Fastly's privacy enablement services.
More about Oblivious HTTP and what Fastly is doing here is in a blog post that I wrote [1]. I wrote the OHTTP relay service for Fastly and was heavily involved in this deal.
Some points about how the service operates:
- Fastly does not receive your Chrome browsing history by virtue of running this service, because there is not a 1-1 mapping between URLs browsed and OHTTP requests made. We also cannot view the encapsulated request (which is passed to Google).
- Fastly does not capture access logs for this service, and no logs are sent to Google. There is only access to service-level metrics.
- Google does not have access to modify the configuration of this Fastly service, and does not own the domain or TLS key associated with it.
Yes, I'm working on bringing Fastly's OHTTP Relay to GA, which will allow us to offer it to more customers. That's ultimately more of a pricing and business process thing than any additional technical work. The implementation is feature complete at this point. Planning for that in Q2 (mid-April if all goes well).
I'm not (currently) planning to support customer self-service for this, because I anticipate that most customers may want:
1. Fastly to operate the OHTTP relay service, so that they can clearly state that they can't interfere with its operation to their end users.
2. Customization around business logic. We do plan to re-use the core service implementation across customers, but I've found with the initial implementations that there is an additional layer of business logic that's valuable (things like specifically which headers to strip or pass, using a backend API key, verifying a client shared secret, etc.); a rough sketch of that layer is below.
However, if it becomes apparent that self-service is desirable here, I'll definitely consider that. There would be a bit more work on the engineering side to enable that.
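Purely as illustration, the kind of thing I mean by that per-customer business-logic layer; the header names, API key, and shared secret here are placeholders, not our actual configuration:

    # Purely illustrative per-customer relay business logic; header names,
    # the API key, and the shared secret are placeholders, not Fastly's
    # actual configuration.
    import hmac

    CUSTOMER_CONFIG = {
        "strip_headers": {"cookie", "authorization", "x-forwarded-for"},
        "pass_headers": {"content-type", "content-length"},
        "backend_api_key": "placeholder-backend-key",
        "client_shared_secret": "placeholder-client-secret",
    }

    def prepare_relayed_request(headers: dict, presented_secret: str) -> dict:
        cfg = CUSTOMER_CONFIG
        # Reject clients that don't present the expected shared secret.
        if not hmac.compare_digest(presented_secret, cfg["client_shared_secret"]):
            raise PermissionError("client secret mismatch")
        # Forward only an explicit allow-list of headers; never identifying ones.
        forwarded = {k: v for k, v in headers.items()
                     if k.lower() in cfg["pass_headers"]
                     and k.lower() not in cfg["strip_headers"]}
        # Authenticate the relay to the backend with its own API key.
        forwarded["authorization"] = "Bearer " + cfg["backend_api_key"]
        return forwarded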
If you might be interested in that service, I'm happy to discuss: <hn username> @ fastly dot com
The relay doesn't learn the content because the content isn't encrypted to it. The target receives the payload from the relay, so it doesn't learn the client's IP address or anything else the relay removed. A much-simplified sketch of the flow is below.
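This sketch is not the exact OHTTP/RFC encapsulation, just the shape of it, assuming an ephemeral X25519 exchange plus HKDF and AES-GCM: the client encrypts to the gateway's public key, the relay forwards opaque bytes, and only the gateway can decrypt.

    import os
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric.x25519 import (
        X25519PrivateKey, X25519PublicKey)
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    # Gateway's long-term keypair; the public half is what the client fetches.
    gateway_priv = X25519PrivateKey.generate()
    gateway_pub = gateway_priv.public_key()

    def derive_key(shared: bytes) -> bytes:
        return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                    info=b"toy-ohttp").derive(shared)

    def client_encapsulate(request: bytes):
        # Fresh ephemeral key per request: the same request encrypts
        # differently every time, so ciphertexts can't be linked or matched.
        eph = X25519PrivateKey.generate()
        key = derive_key(eph.exchange(gateway_pub))
        nonce = os.urandom(12)
        blob = nonce + AESGCM(key).encrypt(nonce, request, None)
        eph_pub = eph.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw)
        return eph_pub, blob

    def gateway_decapsulate(eph_pub: bytes, blob: bytes) -> bytes:
        key = derive_key(
            gateway_priv.exchange(X25519PublicKey.from_public_bytes(eph_pub)))
        return AESGCM(key).decrypt(blob[:12], blob[12:], None)

    # The relay only forwards (eph_pub, blob) verbatim plus the client's IP;
    # without gateway_priv it can neither read nor correlate the payload.
    eph_pub, blob = client_encapsulate(b"GET /v1/k-anon-check?set=abc")
    assert gateway_decapsulate(eph_pub, blob) == b"GET /v1/k-anon-check?set=abc"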
Correct, we developed Oblivious DNS (first as a research paper and then as a service) to do this. Here's a quick writeup we did about our alpha service that we're planning to roll out globally:
Their diagrams omit that the public key used for signing is fetched by the client directly from the Google k-anonymity server.
Shouldn't the key used be either shipped with the client, or digitally signed by a shipped Google anchor & hosted by the relay? Fetching the key from Google's servers directly means that Google has a pool of IPs (and cert fetch timestamps) to help de-anonymise, and they also have the ability to serve particular clients unique public keys to deanonymise them.
We’ve been calling this problem key consistency, basically the risk that Google (or any OHTTP gateway operator) gives out unique keys to different users and builds a {user -> request} mapping based on the keys used to encrypt requests. The OHTTP authors published a draft spec discussing exactly this problem.
https://datatracker.ietf.org/doc/draft-ietf-privacypass-key-...
Frankly, none of the technical solutions we considered were satisfying enough given the added complexity. It's actually kind of tricky to form a convincing distribution model. The designs we looked at mostly involve multi-party schemes where trust is required either that the parties don't collude with each other, or explicitly between the user and one party. Admittedly OHTTP is also premised on multiple parties not colluding, but we decided to punt on the key consistency problem for the sake of simplicity. We also recognize that it's possible for savvy users to fetch the keys from the endpoints we're serving them on, compare those keys with other users, and fact-check our claims. We honestly have nothing to hide here; everyone is receiving the same key.
To specifically address the two proposals you made:
1) Shipping the key with the client: it’s not clear this is any better than fetching it right from Google. Chrome updates and our experiment framework (known as Chrome Variations) are both Google-managed distribution mechanisms, not terribly different from fetching the key at a Google API endpoint.
2) Having the relay serve the keys: these keys are actually designed to protect data from the relay. If the relay could change the public key Chrome uses to something the relay knows the private key for, it would undermine the purpose of the HPKE encryption. You suggested the relay could serve a key signed by Google. The issue with that is that it complicates key revocation. We expect to rotate the key we use, and we want to maintain the ability to revoke keys if needed (say the relay learned a private key). Sure, we could serve a signing-key revocation list, but then we're back into the world of complicating the multi-party relationships.
All this being said, addressing the key consistency issue is on our radar, and we hope to improve the distribution model in the future. We welcome feedback, of course, and we'll take it into consideration as we continue to evolve the designs used here. There is a general blog post on Privacy Sandbox feedback at [1], and you can see specifically how Chrome fetches keys in [2]. In case you're curious, the key we're serving today has a cipher suite of (DHKEM_X25519_HKDF_SHA256, HKDF_SHA256, AES_256_GCM), and at the moment we have two gateways (one for read traffic and one for write traffic), each using its own single keypair.
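To the "savvy users can fact-check us" point, a rough sketch of such a spot check (the key-config URL below is a placeholder, not the real endpoint Chrome uses):

    # Rough spot check of key consistency: fetch the published key config,
    # print a digest, and compare it out-of-band with what other users see.
    # KEY_CONFIG_URL is a placeholder, not the real endpoint.
    import hashlib
    import urllib.request

    KEY_CONFIG_URL = "https://example.invalid/ohttp-key-config"  # placeholder

    def key_config_digest(url: str = KEY_CONFIG_URL) -> str:
        with urllib.request.urlopen(url) as resp:
            return hashlib.sha256(resp.read()).hexdigest()

    # If two users fetching at about the same time see different digests,
    # that's evidence of per-user keys (the key-consistency failure mode).
    print(key_config_digest())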
You'll need to spend a Private State Token to call the k-anon API; minting PSTs is rate-limited on the server side, and the client and server also cooperate to make forging PST requests hard (e.g. device integrity attestation, requiring the user to be signed in to Chrome). What I think this means is that you could undermine the k-anonymity in individual cases with a large amount of manual work, but not at scale.
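In other words, the server-side throttle amounts to something like the following toy sketch; the window and limit are made-up numbers, not the real ones:

    # Toy sketch of server-side rate-limited token minting: each first-party
    # identity (e.g. a signed-in account) can mint at most LIMIT tokens per
    # WINDOW, so forging enough spends to distort k-anonymity doesn't scale.
    # The numbers are illustrative, not the real limits.
    import secrets
    import time
    from collections import defaultdict, deque
    from typing import Optional

    WINDOW_SECONDS = 24 * 3600
    MINT_LIMIT_PER_WINDOW = 10  # made-up value

    _mints = defaultdict(deque)  # identity -> timestamps of recent mints

    def mint_token(identity: str) -> Optional[str]:
        now = time.time()
        recent = _mints[identity]
        while recent and now - recent[0] > WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MINT_LIMIT_PER_WINDOW:
            return None  # over quota: no token, so no k-anon write
        recent.append(now)
        return secrets.token_urlsafe(32)  # stand-in for a blind-signed PST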
I'm calling bullshit on the unlinkability asserted in the GitHub reading here. Sure, you can make it so the API in question can't relink, but that doesn't mean you can't use extraneous, unmentioned metadata to do it. Without proof that no other systems are siphoning off or tracking individual token state, this just looks like more "let's use dubious cryptography to get pressure off our backs until a credible researcher we haven't hired/paid to be quiet blows the whistle".
The having-to-be-signed-in-to-Chrome bit is exactly what has me thinking that something about that arrangement allows them to deanonymize; otherwise they wouldn't even be able to tell real requests from fake ones. They'd also be more than happy to make an explicit, business-related contractual commitment not to share logs, because (1) they don't need the logs, and (2) to the uninitiated it looks like an active attempt to anonymize things, despite the fact that they have enough extra out-of-band telemetry to continue with business as usual.
K-anonymity isn't anonymity at all; at best it makes you less identifiable within the dataset.
That's the general idea. We're trying to limit the amount of abuse a single Google Account, device, or other first-party identity can cause. We have some more ideas and details we hope to share over time, but the goal is to keep writes to the API anonymous while preventing, as best we can, Sybil attacks. We're willing to ignore/sacrifice some writes to achieve this (since any writes beyond the ~k most recent ones don't contribute directly to the utility of the system).
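Conceptually, the write side behaves like a threshold counter over a sliding window. A toy sketch (the real service differs; k and the window here just echo the "50 users in 7 days" figure quoted elsewhere in the thread):

    # Toy sketch of a k-anonymity threshold over a sliding window: a value
    # (say, an ad creative) only reports "over threshold" once at least k
    # distinct identities have written it recently; writes beyond the first
    # ~k add nothing, which is why some writes can safely be dropped.
    import time

    K = 50                          # echoes the "crowd of 50 users" figure
    WINDOW_SECONDS = 7 * 24 * 3600

    _writes = {}                    # creative_id -> {identity: last_write_time}

    def record_write(creative_id: str, identity: str) -> None:
        _writes.setdefault(creative_id, {})[identity] = time.time()

    def is_over_threshold(creative_id: str) -> bool:
        now = time.time()
        recent = {i: t for i, t in _writes.get(creative_id, {}).items()
                  if now - t <= WINDOW_SECONDS}
        _writes[creative_id] = recent
        return len(recent) >= K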
Beyond Private State Tokens we also have new cryptography we're researching that should let us improve unlinkability between issuance and redemption further.
Google uses this to further gatekeep the online advertising industry, since non-Google networks have to get Google's permission from the k-anonymity server to show their ads. Maybe someday someone will sue to break up this mess.
As far as I can tell from reading the proposal, this claim is not true. The API is used from the bidding scripts run locally by the browser, and it is the ad space seller that decides which buyers' scripts are eligible to run.
Aren't there a few dozen attorneys general suing Google on various antitrust grounds right now? Perhaps you should reach out to one of them and explain the specifics of this case. I'm sure they'd be happy to collect more examples of Google engaging in such anti-competitive behavior.
The Justice Department is also suing them in what (I believe) is a separate lawsuit [0].
I'm actually connected to one of the lawyers consulting on the case. I could reach out and explain to her if I understood the details (which I don't in this particular instance).
> We will require a crowd of 50 users per creative within the past 7 days before the ad can be rendered.
Am I missing something? 50 is not a big number. And 50 over 7 days works out to less than one new user every three hours (50 / 168 hours ≈ 0.3 per hour), which suggests that coarse-grained time-based attacks, or simply changing the targeting every few hours, could end up effectively targeting a single user. Or just requesting the ad 49 times yourself before unleashing it?