
> URLs for example should be hashed before going out.

That doesn't help at all. If the server has a database of URLs, it can hash those too and match them against what you send, so it knows which URL corresponds to the hash.




I don't think that hash method would work, since that part of the URL isn't sent to the server. It's strictly used by the client to decide which part of the response to show.

Server has no need for it.


Not for query parameters sent to the remote server, but there are definitely pages/applications that store state in the hash.

For example, look at the URLs used by mega.nz, or any encrypted pastebin (they store the decryption key in the hash so it's not sent to the server).
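To illustrate how the fragment stays client-side, here is a minimal Python sketch (the URL is made up): the fragment is parsed as a separate component and never appears in the request target that goes over the wire.

    from urllib.parse import urlsplit

    # Hypothetical URL in the style of an encrypted pastebin:
    # everything after '#' is the decryption key.
    url = "https://paste.example/abc123#9f8e7d6c5b4a"

    parts = urlsplit(url)
    print(parts.fragment)  # '9f8e7d6c5b4a' -- available only to the client

    # The HTTP request target is just the path (plus query, if any);
    # the fragment is stripped before the request is made.
    request_target = parts.path + ("?" + parts.query if parts.query else "")
    print(request_target)  # '/abc123'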


Since the entire purpose of transmitting URLs is making decisions based on the URL (payment, providing info to potential customers), hashing wouldn't do anything.

> Surely Google has something that maps the hashes to actual URL patterns, but like the other commenter said, the partial hash you send is only sent when it matches a local DB already.

It sends the first 4 bytes of a 32 byte SHA-256 hash of the URL. There isn't a reasonable map back for that.
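To get a feel for how little information that prefix carries, here is a sketch of computing one. Note this hashes the raw string only; the real Safe Browsing protocol first canonicalizes the URL and hashes several host-suffix/path-prefix expressions, which is omitted here.

    import hashlib

    def hash_prefix(url: str, length: int = 4) -> bytes:
        # First `length` bytes of the SHA-256 hash of `url`.
        return hashlib.sha256(url.encode("utf-8")).digest()[:length]

    print(hash_prefix("https://example.com/").hex())

With only 2^32 possible 4-byte prefixes, an enormous number of distinct URLs map to each one, which is why reversing a prefix to a single URL isn't feasible.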


There could be several unmemorable hashes pointing at the same IP; the server needs to know which one it should serve.

(don't think this makes much sense, just answering your question...)


> I would not only hash it already on the client side,

How does hashing on the client side help?


> As I understand it, the point of the hash is purely for caching purposes, not strictly for validating that the downloaded file matches it.

That is a very useful consequence, though: it prevents serving malicious content that doesn't match the hash.


Couldn't you just store the URL and hash together, or salt the hash with the URL?
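A sketch of what salting the hash with the URL might look like, assuming the hash covers some fetched content (the function and separator are illustrative, not any particular product's scheme):

    import hashlib

    def url_bound_hash(url: str, content: bytes) -> str:
        # Hash the content together with the URL it came from, so the
        # same bytes served from a different URL yield a different hash.
        h = hashlib.sha256()
        h.update(url.encode("utf-8"))
        h.update(b"\x00")  # separator to avoid ambiguous concatenation
        h.update(content)
        return h.hexdigest()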

What? This doesn't make any sense at all. They just check the length restriction server-side before hashing it.

That uses a hashed URL.

I'm guessing a single hash could be problematic for detections based on, for example, the domain. But this could be worked around by sending hashed parts of the URL.

For example, they could hash the domain, path, and query separately.
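A minimal sketch of that, assuming SHA-256 per component (the split and the hash choice are assumptions, not any particular product's protocol):

    import hashlib
    from urllib.parse import urlsplit

    def hashed_parts(url: str) -> dict:
        # Hash the domain, path, and query separately, so the server
        # can match on the domain without learning the full URL.
        parts = urlsplit(url)
        digest = lambda s: hashlib.sha256(s.encode("utf-8")).hexdigest()
        return {
            "domain": digest(parts.netloc),
            "path": digest(parts.path),
            "query": digest(parts.query),
        }

    print(hashed_parts("https://example.com/page?id=42"))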


What about the same URL? Or one with added query parameters?

What's to prevent another NFT from pointing to the same data and copying the hash?


Comparing hashes would help a bit on both anonymity and size concerns.

I also think that, in a majority of cases, one could remove all of the query parameters from a URL and still get the same page. I'm not 100% confident about this, though.
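Stripping the query (and fragment) before hashing could look like this sketch, with the caveat above that some pages genuinely depend on their query parameters:

    from urllib.parse import urlsplit, urlunsplit

    def strip_query(url: str) -> str:
        # Drop the query string and fragment; keep scheme, host, path.
        parts = urlsplit(url)
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

    print(strip_query("https://example.com/article?utm_source=hn#top"))
    # https://example.com/article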


> 6. Ask the source of step 1 to give you the full set of hashes corresponding to the prefix from (3)

This is quite similar to sending the URL. There might be only a single site with that hash prefix, or there might be a handful, of which one is far more likely than the others because it has 1000x their DAU.
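A toy sketch of that lookup step (the in-memory table stands in for the real list; actual responses carry metadata and caching rules omitted here):

    import hashlib

    def full_hash(url: str) -> bytes:
        return hashlib.sha256(url.encode("utf-8")).digest()

    # Toy server-side list of known-bad URLs.
    BAD_HASHES = [full_hash(u) for u in
                  ("https://evil.example/a", "https://evil.example/b")]

    def hashes_for_prefix(prefix: bytes) -> list:
        # Server side: return every full hash starting with `prefix`.
        return [h for h in BAD_HASHES if h.startswith(prefix)]

    # Client side: send only the 4-byte prefix, compare full hashes locally.
    candidate = full_hash("https://evil.example/a")
    print(candidate in hashes_for_prefix(candidate[:4]))  # True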


It doesn't solve the problem, because of the issues the OP raised.

Using a hash would allow you to load them from any URL, including the blocked ones.


The addresses from the user's address book should be hashed before being sent to the server and compared to hashed addresses on the server. Then only positive matches are registered, and the server doesn't see more private information than it needs.
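A sketch of that client-side step (the normalization is an assumption; the usual caveat is that identifiers like phone numbers have a small input space, so plain hashes can be brute-forced):

    import hashlib

    def hash_contact(address: str) -> str:
        # Normalize a contact identifier and hash it before upload.
        normalized = address.strip().lower()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    # The client uploads only hashes; the server intersects them with
    # its own hashed user list and registers the positive matches.
    upload = [hash_contact(c) for c in ("alice@example.com", "Bob@Example.com")]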

Hashing is done on the server. Hashing on the client would defeat the whole purpose.

Don't forget that the hash part of the URL is client-side; Google doesn't know about it until the page is loaded.

Their server could trivially fetch the URL and store the resulting hash. This would also give them a chance to scan for malicious content.
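In this thread's terms, that server-side step might look like the following sketch (using the requests library; error handling omitted):

    import hashlib
    import requests

    def fetch_and_hash(url: str) -> str:
        # Fetch the URL and hash the response body; the body could
        # also be handed to a malware scanner at this point.
        body = requests.get(url, timeout=10).content
        return hashlib.sha256(body).hexdigest()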
