
Imagine you have a site that displays some sort of custom content when a client hits a 404 error. Even if it's just a static page, the web server still has to follow that code path and open it. Even if it caches the page internally, it still has to regularly check whether the file has been updated on disk.

The amount of time spent by the web server is fairly small, but the parent comment mentioned it in the context of an attack, so that small per-page effort, multiplied by many connections, adds up to a more substantial load.
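
To make that per-request cost concrete, here is a minimal sketch (stdlib Python; the 404.html filename and port are made up, not taken from anyone's actual setup): even when the error page is held in memory, the server still checks the file's mtime on every miss, and that small amount of work is what gets multiplied across many connections.

    import os
    from http.server import BaseHTTPRequestHandler, HTTPServer

    NOT_FOUND_PAGE = "404.html"  # hypothetical custom error page on disk

    class Handler(BaseHTTPRequestHandler):
        _cached_body = None
        _cached_mtime = None

        def do_GET(self):
            # Treat every request as a miss to focus on the 404 path.
            body = self._load_404()
            self.send_response(404)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        @classmethod
        def _load_404(cls):
            # Even with an in-memory copy, the server still stats the file on
            # every request to notice edits; that small repeated cost is what
            # an attacker multiplies across many connections.
            mtime = os.path.getmtime(NOT_FOUND_PAGE)
            if cls._cached_body is None or mtime != cls._cached_mtime:
                with open(NOT_FOUND_PAGE, "rb") as f:
                    cls._cached_body = f.read()
                cls._cached_mtime = mtime
            return cls._cached_body

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()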




Cached per browser, though, which is significantly different than cached per request.

Even if you're caching and serving static content efficiently, it still adds load to the server.


Ah, fair point. Now I see what you were trying to say.

Assuming the server has no way of determining which assets the client has cached (which, depending on the implementation, may not be the case), you're of course correct. However, after step 2 the page has already fully loaded in both cases, so step 3 doesn't really slow anything down.


Once it's cached, though, loading new pages or new data is trivial.

It really depends on the use case: if I'm navigating my banking website, I would prefer a longer load time up front if it made navigating between accounts really fast. If I'm checking my power bill, I'm probably only going to view one page, so I just want it loaded ASAP.


The worst-case scenario is that the browser has to request it from the server. This is bad because execution would have to stop and wait, but in most cases that would only be for a few milliseconds. And the developer should take that into account and maybe prep the cache for possible paths.
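
As a rough sketch of "prepping the cache for possible paths" (plain Python for consistency with the other examples here; the URL list, timeout, and helper names are all hypothetical): fetch the resources a user is likely to need next in the background, so a later lookup is a cache hit instead of a blocking round trip.

    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical list of resources the user is likely to need next.
    LIKELY_NEXT_PATHS = [
        "https://example.com/css/site.css",
        "https://example.com/js/app.js",
    ]

    _cache = {}  # url -> response body
    _pool = ThreadPoolExecutor(max_workers=4)

    def _fetch(url):
        with urllib.request.urlopen(url, timeout=5) as resp:
            _cache[url] = resp.read()
        return _cache[url]

    def prefetch(urls):
        # Warm the cache in the background so later lookups are hits.
        for url in urls:
            if url not in _cache:
                _pool.submit(_fetch, url)

    def get(url):
        # Worst case: not cached yet, so execution stops and waits on a
        # round trip (or re-fetches if the prefetch hasn't finished).
        if url not in _cache:
            return _fetch(url)
        return _cache[url]

    if __name__ == "__main__":
        prefetch(LIKELY_NEXT_PATHS)       # e.g. right after the first page load
        body = get(LIKELY_NEXT_PATHS[0])  # ideally a cache hit by now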

The first time the page is hit, the browser should keep it cached for a year.
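
For what that looks like on the wire, a minimal sketch (stdlib Python; the body and port are placeholders): the server marks the response as reusable for a year, so repeat visits can be served straight from the browser cache. In practice this is usually done for versioned or fingerprinted assets rather than the HTML itself.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    ONE_YEAR = 365 * 24 * 60 * 60  # seconds

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"<html><body>hello</body></html>"  # placeholder content
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            # Tell the browser it may reuse this response for a year without
            # contacting the server again.
            self.send_header("Cache-Control", f"public, max-age={ONE_YEAR}, immutable")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8001), Handler).serve_forever()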

How much time will avoiding that first hit actually save?

Keeping a file compiled could be more generic too: just check which URLs get hit a lot, and keep those pre-compiled as a separate feature.
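
A small sketch of that "keep hot URLs pre-compiled" idea (plain Python; the threshold and render function are made up): count hits per URL and keep pre-rendered output only for the paths that cross the threshold.

    from collections import Counter

    HOT_THRESHOLD = 100  # hypothetical: pre-render after this many hits

    hit_counts = Counter()
    prerendered = {}  # url -> rendered page, kept for hot URLs only

    def render(url):
        # Stand-in for the expensive template/compile step.
        return f"<html><body>content for {url}</body></html>"

    def handle(url):
        hit_counts[url] += 1
        if url in prerendered:
            return prerendered[url]           # hot path: no recompilation
        page = render(url)
        if hit_counts[url] >= HOT_THRESHOLD:  # promote frequently hit URLs
            prerendered[url] = page
        return page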


> But when it’s on every page, from a web performance perspective, it equates to a lot of data.

How does browser caching come into play here? Doesn't it make a difference?


> But when it’s on every page, from a web performance perspective, it equates to a lot of data

It's not cached?


Even if the CSS is cached, your browser still needs to ask the server whether the files have changed and receive back a "304 Not Modified," so you're still incurring the overhead of those connections.
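
A minimal sketch of that revalidation round trip (stdlib Python; the stylesheet body and port are placeholders): the browser sends back the validator it saved (If-None-Match), and if it still matches, the server answers 304 with no body, so you pay for the connection but not for the transfer.

    import hashlib
    from http.server import BaseHTTPRequestHandler, HTTPServer

    CSS_BODY = b"body { margin: 0; }"  # stand-in for a stylesheet on disk
    CSS_ETAG = '"%s"' % hashlib.sha1(CSS_BODY).hexdigest()

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.headers.get("If-None-Match") == CSS_ETAG:
                # Validator still matches: no body, just "use your cached copy".
                self.send_response(304)
                self.send_header("ETag", CSS_ETAG)
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/css")
            self.send_header("ETag", CSS_ETAG)
            self.send_header("Content-Length", str(len(CSS_BODY)))
            self.end_headers()
            self.wfile.write(CSS_BODY)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8002), Handler).serve_forever()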

Although I take your point that people do a lot of needless computation these days, I don't quite understand the dig at caching. Isn't adding an HTML file to your server just manual caching?

Ah, right, I understand your point now. When sites determine if something is cached or not, are they doing it purely based on the time taken to load it?

The other side, though, is whether something is cached at the client, right? I thought that was what people were talking about with cache hits.

Then the server doesn't need to know about repeat visits that don't hit it, and it would be nice to maintain caching support if the page content is static.

Browsers have caches

Your example is pretty bad for a number of reasons:

1) If you're building a high-availability file server, then you would expect the sysadmin to configure how the OS caches files rather than run with the defaults. Likewise, if you're building a busy site, then you'd need to configure caching to fit your specific application.

2) The OS file cache can run with pretty basic defaults because file system files are static (yes, they can change, but you have to go through the kernel ABIs anyway, so changes are easy to track). Website content can't be guessed at, because even "static" content can be, and often is, dynamically generated, so no assumptions can be accurately made. This is also why there are so many different levels of caching on busy web sites.

3) File caching in the OS only needs to happen in one place (as touched on in the previous point). However, websites are built from a plethora of different frameworks, which are far too numerous to name.

That all said, there are some specific web frameworks that ship with caching defaults (more typically in the case of browser caching), and some web applications ship with recommended plugins for enhanced caching (e.g. Redis / memcached). However, pragmatically, I think that if a web developer is smart enough to write code, then they should be smart enough to implement caching. Given how frequent website attacks are, the unpredictability of which sites go viral and when, and the ease with which anyone can build and host a website, I really do think some web developers need to up their game rather than blame the complexity of the tools or the lack of sane defaults. Yes, the current web model is a mess of edge cases and hidden traps, but if you're a developer then there's no excuse not to properly learn the tools you've been given, regardless of (or maybe especially because of) how poor those tools are at protecting you from sawing your own hand off.
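
As a concrete version of "implement caching yourself," a minimal sketch in Python: a TTL'd lookup in front of the expensive render step. The dict stands in for whatever shared cache the application actually uses (Redis, memcached), and the TTL and render function are made up.

    import time

    CACHE_TTL = 60  # seconds; made-up value
    _cache = {}     # key -> (expires_at, value); stands in for Redis/memcached

    def render_page(path):
        # Stand-in for the expensive "dynamically generated" content.
        return f"<html><body>page for {path}</body></html>"

    def get_page(path):
        now = time.time()
        hit = _cache.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]                       # cache hit: skip the render
        page = render_page(path)
        _cache[path] = (now + CACHE_TTL, page)  # cache miss: render and store
        return page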


Headers: W3 Total Cache/0.9.2.4

So he's serving a static page 90% of the time? I'd love to see how this thing operates without a cache.


Makes me wonder how caching and other things are handled by browsers, and what impact that has on load times. Anyway, cool concept.

It's far from just the browser cache. Consider things like network location, download speed, interaction path, browser parsing edge cases, script enabled, privacy add-ons enabled, etc...

The browser can cache the HTTP page too!

Can you give us a summary?

I’m struggling to see how a way to treat the browser cache as a file system is supposed to inject sanity into human communication.

