I don’t think that hash method would work, since that part of the URL isn’t sent to the server. It’s strictly used by the client to decide which part of the response to show.
Since the entire purpose of transmitting URLs is to make decisions based on them (payment, providing info to potential customers), hashing wouldn't do anything.
> Surely Google has something that maps the hashes to actual URL patterns, but like the other commenter said, the partial hash you send is only sent when it matches a local DB already.
It sends the first 4 bytes of a 32-byte SHA-256 hash of the URL. There isn't a reasonable map back from that.
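To make the sizes concrete, here is a minimal sketch of truncating a SHA-256 digest to a 4-byte prefix. The function name `hash_prefix` and the example URL are my own placeholders, and any normalization a real client applies to the URL before hashing is left out.

```python
import hashlib


def hash_prefix(url: str, prefix_len: int = 4) -> bytes:
    """Return the first `prefix_len` bytes of the SHA-256 digest of `url`.

    Hypothetical helper for illustration only; URL normalization is omitted.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()  # always 32 bytes
    return digest[:prefix_len]


if __name__ == "__main__":
    url = "example.com/some/page"  # placeholder URL
    full = hashlib.sha256(url.encode("utf-8")).digest()
    prefix = hash_prefix(url)
    print(len(full), "byte digest,", len(prefix), "byte prefix:", prefix.hex())
```

Only 32 of the 256 bits leave the machine, so many different URLs share any given prefix.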
I'm guessing a single hash could be problematic for detections based on the domain, for example. But this could be circumvented by sending hashed parts of the URL.
For example they could hash the domain, path and query separately.
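A rough sketch of what hashing the parts separately could look like, assuming SHA-256 and Python's standard URL splitting. `hash_parts` and the field names are hypothetical, not something any browser is known to do.

```python
import hashlib
from urllib.parse import urlsplit


def hash_parts(url: str) -> dict[str, str]:
    """Hash the domain, path and query of `url` independently.

    Hypothetical scheme for illustration: each component gets its own
    SHA-256 hex digest, so a domain-level rule can match without the
    receiver seeing the path or query in the clear.
    """
    parts = urlsplit(url)
    fields = {"domain": parts.netloc, "path": parts.path, "query": parts.query}
    return {
        name: hashlib.sha256(value.encode("utf-8")).hexdigest()
        for name, value in fields.items()
    }


if __name__ == "__main__":
    a = hash_parts("https://example.com/checkout?item=1")
    b = hash_parts("https://example.com/about")
    # Same domain hash, different path hash.
    print(a["domain"] == b["domain"], a["path"] == b["path"])
```

Two URLs on the same site share the domain hash while their path hashes differ, which is what a domain-based detection would need.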
Comparing hashes would help a bit on both anonymity and size concerns.
I also think that, in a majority of cases, one could remove all of the query parameters from a URL and still get the same page. I'm not 100% confident about this, though.
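Stripping the query string is simple to do with the standard library. A minimal sketch, where `strip_query` is a made-up helper name; whether the resulting URL still serves the same page depends entirely on the site.

```python
from urllib.parse import urlsplit


def strip_query(url: str) -> str:
    """Return `url` with its query string removed.

    Hypothetical helper: scheme, host, path and fragment are kept as-is.
    """
    parts = urlsplit(url)
    return parts._replace(query="").geturl()


if __name__ == "__main__":
    print(strip_query("https://example.com/article?utm_source=newsletter"))
```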
> 6. Ask the source of step 1 to give you the full set of hashes corresponding to the prefix from (3)
This is quite similar to sending the URL. There might only be a single site with that hash. Or there might be a handful, of which one is far more likely than the others because it has 1000x their DAU.
The addresses from the user's address book should be hashed before sending to the server and compared to hashed addresses on the server. Then only positive matches are registered, and the server doesn't see more private information than it needs.
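A minimal sketch of that hashed matching, assuming plain unsalted SHA-256 over lowercased addresses. The names `hash_address` and `match_contacts` are hypothetical, and the normalization rule is my assumption.

```python
import hashlib
from typing import Iterable


def hash_address(address: str) -> str:
    """Hash one address; trimming and lowercasing are assumed normalization."""
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_contacts(client_hashes: Iterable[str], server_hashes: Iterable[str]) -> set[str]:
    """Return only the hashes present on both sides (the positive matches)."""
    return set(client_hashes) & set(server_hashes)


if __name__ == "__main__":
    # Server side: hashes of already registered addresses (placeholder data).
    registered = {hash_address("alice@example.com")}
    # Client side: the address book is hashed before upload.
    uploaded = [hash_address(" Alice@Example.com "), hash_address("bob@example.com")]
    print(len(match_contacts(uploaded, registered)), "match")
```

The server only ever handles digests here, and non-matching entries are discarded.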
That doesn't help at all. If the server has a database to match this hash against, then it knows which URL corresponds to the hash.
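To illustrate the objection, here is a sketch of a server reversing full hashes with a precomputed table. `build_lookup`, `reverse_hash` and the URL list are hypothetical, and it assumes unsalted SHA-256 over the exact URL string.

```python
import hashlib
from typing import Iterable, Optional


def build_lookup(known_urls: Iterable[str]) -> dict[str, str]:
    """Precompute hash -> URL for every URL the server already knows."""
    return {hashlib.sha256(u.encode("utf-8")).hexdigest(): u for u in known_urls}


def reverse_hash(hash_hex: str, lookup: dict[str, str]) -> Optional[str]:
    """Return the URL behind `hash_hex` if it is in the table, else None."""
    return lookup.get(hash_hex)


if __name__ == "__main__":
    lookup = build_lookup(["https://example.com/", "https://example.org/login"])
    received = hashlib.sha256(b"https://example.org/login").hexdigest()
    print(reverse_hash(received, lookup))
```

Any URL already in the server's table is recovered exactly; only URLs the server has never seen stay hidden.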