
I've wondered about that on several sites.

So it's true that it still requires:

a) a user who is supposed to have the URL giving it to them,

or

b) someone guessing the URL, which is effectively like guessing a password, since it's usually a long URL with a GUID of some kind.

But is that acceptable for some reason? Easier development?

Because otherwise you need to:

a) issue a one-time/short-lived token for accessing an image on S3, for instance, for every image that will be shown on a page when an authenticated user loads it (sketched below),

or

b) proxy the image requests: make an API endpoint for image names, authenticate those requests, fetch the image from a place only the server has access to, and stream the image data back.
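
For option (a), a minimal sketch of what issuing such short-lived tokens might look like, assuming S3 with boto3's pre-signed URLs; the bucket and key names are hypothetical:

    import boto3

    s3 = boto3.client("s3")

    def presigned_image_url(key: str, expires_in: int = 300) -> str:
        """Return a short-lived URL for a private S3 object.

        The object itself stays private; anyone holding the URL can
        fetch it only until the embedded signature expires.
        """
        return s3.generate_presigned_url(
            "get_object",
            Params={"Bucket": "my-private-images", "Key": key},  # hypothetical bucket
            ExpiresIn=expires_in,
        )

    # At page-render time, generate one URL per image shown to the
    # authenticated user and drop it into the <img src="..."> attribute.
    print(presigned_image_url("users/42/avatar.png"))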




I think he is talking about a web application. So providing an <img> tag with the URL to the image hosted on S3 as the src attribute seems like a better solution than embedding the image in the HTML response (and blocking page rendering for longer).
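
For contrast, a rough sketch of what embedding the image in the HTML response would mean, i.e. inlining the bytes as a Base64 data URI instead of pointing at S3; the file name is hypothetical:

    import base64

    def inline_img_tag(path: str, mime: str = "image/png") -> str:
        """Build an <img> tag with the image bytes inlined as a data URI.

        The whole file has to be read and encoded before the HTML can be
        sent, which is what blocks the response compared to emitting a URL.
        """
        with open(path, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("ascii")
        return f'<img src="data:{mime};base64,{encoded}">'

    print(inline_img_tag("thumbnail.png"))  # hypothetical file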

I'm assuming you secure the image URL with HTTPS at least to minimize the chances that an ISP could intercept it on the wire?

I don't disagree with your overall point, but there is definitely a difference between being able to save an image from a webpage and being able to share the URL that image was loaded from and have it work for unauthenticated users.

This approach to generating images brings a lot of benefits when you're working at scale. Before, when we wanted an image that had been thumbnailed, cropped, or modified in a certain way, we would generate the image (or check whether it already existed) at page render time, and pass on the URL. This meant that page rendering blocked on either checking for the image or generating a new one. Further, when our flaky file store started freaking out, the entire site went down.

By encoding all the relevant information in the URL, like Amazon does, we offload that work asynchronously to separate web requests. If there's information in the URL that we want hidden from users (spam-protected email addresses, say), we encrypt it with AES and Base64-encode the result. If the file store were to go down--which hasn't happened since we switched to S3--users would only see broken images instead of the entire site crashing. Further, we store the images on S3 under the exact same URL we fetch them from, and have our static web server check there first. That means if the image is cached it never hits our application stack.

Now, when we generate a page with one of these dynamic images, our app stack only has to generate a URL instead of relying on the vagaries of our file store and image processing libraries. It's made our site more reliable, easier to manage, and faster. If we suddenly start serving up many more images, we could easily replace our image generation service with something higher-performance or more reliable, since it's just a web service. Right now, it works great.

Clearly Amazon has seen the benefits of the same approach.
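
A minimal sketch of what encoding the relevant information in the URL could look like, assuming Python's cryptography library (its Fernet recipe is AES-based and emits URL-safe Base64), and with made-up parameter names and path layout:

    import json
    from cryptography.fernet import Fernet

    SECRET_KEY = Fernet.generate_key()  # in practice this lives in configuration
    fernet = Fernet(SECRET_KEY)

    def image_url(source: str, width: int, height: int) -> str:
        """Encode the resize parameters into the URL itself.

        The image service decodes the token, does the work, and writes the
        result to S3 under the same path, so later requests for this URL
        never touch the application stack.
        """
        params = json.dumps({"src": source, "w": width, "h": height})
        token = fernet.encrypt(params.encode()).decode()  # URL-safe Base64
        return f"/images/{token}.jpg"

    def decode_image_url(token: str) -> dict:
        """The inverse, run by the image generation service."""
        return json.loads(fernet.decrypt(token.encode()))

    print(image_url("uploads/cat.png", 320, 240))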


This isn't about exposed credentials though. It would be like an automatic image uploader that could pick an image hosting site such as Imgur, upload the image for you, and give you a link. Services are offering the ability to host images for you. You aren't stealing Imgur's S3 credentials; they just let any user upload images for free, despite the fact that it technically costs them money to host the file for you. Similarly, there are sites offering the ability to serve LLM requests for you for free.

Any particular reason you need to proxy the data through your own server? Seems like a 302 to the original image URL would be quite a lot easier and more scalable.
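
A rough sketch of that redirect approach, assuming a Flask endpoint; the route, session check, and lookup helper are all hypothetical:

    from flask import Flask, abort, redirect, session

    app = Flask(__name__)

    def lookup_image_url(image_id: str) -> str:
        # Hypothetical lookup; in reality this would hit a database or
        # compute the original image's URL from the id.
        return f"https://images.example.com/{image_id}"

    @app.route("/images/<image_id>")
    def image(image_id: str):
        if "user_id" not in session:  # authenticate the request first
            abort(401)
        # Let the browser fetch the image directly rather than proxying
        # the bytes through this server.
        return redirect(lookup_image_url(image_id), code=302)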

Unfortunately the platform makes the annoying solution easy, and the one you prefer hard.

Drag-and-drop and forms give full access to the image data relatively easily, and it works completely client-side, without the server's involvement.

OTOH, due to the same-origin policy, getting data from a random URL on the Internet is (intentionally) hard. "Simple" proxying via your own server is tricky to do without accidentally building a tool that could be abused for DoS, or for anonymously downloading something you wouldn't want to be associated with.
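
A minimal sketch of the kind of safeguards such a proxy would need, assuming Flask and the requests library; the host allowlist and size cap here are hypothetical and not a complete defence:

    from urllib.parse import urlparse

    import requests
    from flask import Flask, Response, abort, request

    app = Flask(__name__)

    ALLOWED_HOSTS = {"i.imgur.com", "images.example.com"}  # hypothetical allowlist
    MAX_BYTES = 5 * 1024 * 1024  # refuse anything larger than 5 MB

    @app.route("/proxy")
    def proxy_image():
        url = request.args.get("url", "")
        if (urlparse(url).hostname or "") not in ALLOWED_HOSTS:
            abort(400)  # don't let the proxy fetch arbitrary URLs

        upstream = requests.get(url, stream=True, timeout=5)
        if upstream.status_code != 200:
            abort(502)
        if int(upstream.headers.get("Content-Length", 0)) > MAX_BYTES:
            abort(413)

        return Response(
            upstream.iter_content(chunk_size=8192),
            content_type=upstream.headers.get("Content-Type", "application/octet-stream"),
        )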


> The service would need to send out the Cross Origin Resource Sharing headers in order for the image to be accessible via <canvas> and the service also needs a means for the querying server to test if a certain image is indeed the one associated with the user.

// EDIT: ignore what was here, you're right.


Once you have the full URL to the image, you can share that too - authorization checks don't happen when fetching the image from googleusercontent.com, as far as I can tell.

But really - once you share an image with someone, there is no stopping them from downloading it and sharing it somewhere else anyway. So I'm not sure what the point of this is.


It's one URL in the token pointing to a JSON file with the metadata. In the JSON file there's another URL to the image itself. So you need to go through two different centralized servers to access the image. The only permanent thing is the first URL to the JSON file but nothing prevents the hosts from changing the content.
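
A minimal sketch of that two-hop resolution, assuming the requests library and a made-up metadata layout with an "image" field:

    import requests

    def resolve_token_image(metadata_url: str) -> bytes:
        """Follow the token's URL to the metadata JSON, then follow the image
        URL inside it. Either host can change or remove the content at any
        time; only the first URL is fixed in the token."""
        metadata = requests.get(metadata_url, timeout=10).json()
        image_url = metadata["image"]  # hypothetical field name
        return requests.get(image_url, timeout=10).content

    # Hypothetical metadata URL taken from a token.
    data = resolve_token_image("https://metadata.example.com/token/123.json")
    print(len(data), "bytes")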

Assuming SSL, a string in the URL could be more secure than a password, because it can be longer than a human could comfortably remember. Longer = harder to brute force, plus you can block (or teergrube, or whatever) any IPs that try to guess a URL and fail.

You can add an option to delete/rekey the image too. At that point the URL is exactly as secure as the method you use to send the URL -- just like a password.
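
A quick sketch of generating such a string, assuming Python's standard secrets module; 32 random bytes is far more entropy than any memorable password:

    import secrets

    def unguessable_image_url(base: str = "https://example.com/img") -> str:
        """Build a URL whose path segment is a 256-bit random token.

        At roughly 43 URL-safe characters it's hopeless to brute force,
        and any client that repeatedly guesses wrong can be blocked or
        tarpitted.
        """
        return f"{base}/{secrets.token_urlsafe(32)}.jpg"

    print(unguessable_image_url())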


Images are like any other web asset and can be protected either by the web server (htpasswd) or the application's logic.

I think you've been brainwashed by modern storage services (S3), where the URL is essentially always public and out of your control. It's trivial to password-protect an image when it's on your own server. I would assume any medical provider would protect their images by checking the session, and not serve them via AWS.
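
A minimal sketch of serving an image behind the application's own session check, assuming Flask; the directory layout and permission helper are hypothetical:

    from pathlib import Path

    from flask import Flask, abort, send_file, session

    app = Flask(__name__)
    IMAGE_DIR = Path("/srv/private-images")  # hypothetical dir, not exposed by the web server

    def user_may_view(user_id: str, image_name: str) -> bool:
        # Stand-in for the application's real authorization logic.
        return True

    @app.route("/patient-images/<name>")
    def patient_image(name: str):
        if "user_id" not in session:
            abort(401)
        if not user_may_view(session["user_id"], name):
            abort(403)
        path = (IMAGE_DIR / name).resolve()
        # Keep requests from escaping the image directory.
        if IMAGE_DIR not in path.parents or not path.is_file():
            abort(404)
        return send_file(path)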


I think it's pretty safe as long as no one without the permissions can find (or guess/extrapolate) that URL. The images are probably just hosted by a CDN and serving up the files with authentication might slow it down or complicate the setup.

There is a dilemma for web developers with images loaded from CDNs or APIs. Regular <img> tags can't set an Authorization header with a token for the request, the way you can with fetch() for API requests. The only options are adding a token to the URL or using cookie authentication.

Cookie auth only works if the CDN is on the same domain; even a subdomain can be problematic in many cases.
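
A minimal sketch of the token-in-the-URL option, assuming the application signs an expiring token with Python's hmac module and the CDN (or an edge function in front of it) verifies the same signature; all names here are made up:

    import hashlib
    import hmac
    import time

    SECRET = b"shared-with-the-cdn"  # hypothetical shared secret

    def signed_image_url(path: str, ttl: int = 300) -> str:
        """Append an expiry and HMAC signature as query parameters, so a
        plain <img src="..."> can carry its own credential."""
        expires = int(time.time()) + ttl
        sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
        return f"https://cdn.example.com{path}?expires={expires}&sig={sig}"

    def verify(path: str, expires: str, sig: str) -> bool:
        """What the CDN edge would run before serving the file."""
        if int(expires) < time.time():
            return False
        expected = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, sig)

    print(signed_image_url("/images/scan-001.png"))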


Clearly the best way is to link to a popular site and have your webserver be overwhelmed by traffic. Can't serve images? Then users can't steal them!

Hm, I see. I’m pretty sure I can figure out how to serve a greyscale image. I suggest making that proxy explicit in your docs. You suggest using HTTPS and an obscure URL to keep our data “secret” but then you’re literally processing our content and could be doing anything with it.

But the actual image data isn't stored in the token, just its URL (usually). So what you own is not the image but rather whatever resource resides at a particular URL, and there's no guarantee that that resource doesn't also live at some other URL.

Surely it would be cached based on the image URL though, not based on the page URL, and different images could use different URLs. I still don't see how this would pose a problem.

Another reason proxying requests to external images is important is that some browsers will display an HTTP Basic Auth dialog if the image request responds with the right kind of 401. It's confusing and sort of disconcerting to see a basic auth dialog pop up from a random site if you're just browsing GitHub.