Imagine you have a site that displays some sort of custom content when a client hits a 404. Even if it's just a static page, the web server still has to follow the code path and open it. Even if it caches the page internally, it still has to regularly check whether the file on disk has been updated.
The amount of time spent by the web server is fairly small, but the parent comment mentioned it in the context of an attack, so that small per-page effort multiplied by many connections adds up to a substantial load.
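A rough sketch of what I mean, assuming a Node/TypeScript server; "404.html" and the port are just placeholders:

```ts
// Sketch of the server-side cost of a custom 404: serve the page from an
// in-memory cache, but still stat the file on every request to notice edits.
import { createServer } from "node:http";
import { readFileSync, statSync } from "node:fs";

let cachedBody: Buffer | null = null;
let cachedMtimeMs = 0;

function customNotFoundPage(): Buffer {
  const { mtimeMs } = statSync("404.html"); // a disk hit even when the page is cached
  if (!cachedBody || mtimeMs !== cachedMtimeMs) {
    cachedBody = readFileSync("404.html");  // reload only when the file changed
    cachedMtimeMs = mtimeMs;
  }
  return cachedBody;
}

createServer((req, res) => {
  // Every bogus URL still runs this handler: a stat(), a cache check, a write.
  res.writeHead(404, { "Content-Type": "text/html" });
  res.end(customNotFoundPage());
}).listen(8080);
```

Cheap per request, but it's work the server can't skip, and that's exactly what gets multiplied.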
Ah, fair point. Now I see what you were trying to say.
Assuming the server has no way of determining which assets the client has cached (which, depending on the implementation, may not be the case), you're of course correct. However, after step 2 the page has already fully loaded in both cases, so step 3 doesn't really slow anything down.
Once it's cached, though, loading new pages or new data is trivial.
It really depends on the use case: if I'm navigating my banking website, I would prefer a longer load time up front if it made navigating between accounts really fast. If I'm checking my power bill, I'm probably only going to view one page, so I just want it loaded ASAP.
The worst-case scenario is that the browser has to request it from the server. That's bad because execution has to stop and wait, but in most cases it's only for a few milliseconds. And the developer should take that into account and maybe prep the cache for likely paths.
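For instance, a browser-side sketch of that kind of cache prep; the list of paths is hypothetical and would really come from whatever navigation the app expects next:

```ts
// Sketch: warm the browser's cache for pages the user is likely to open next.
const likelyNextPaths = ["/account/summary", "/account/history"];

for (const path of likelyNextPaths) {
  const link = document.createElement("link");
  link.rel = "prefetch"; // low-priority fetch, stored in the HTTP cache
  link.href = path;
  document.head.appendChild(link);
}
```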
Even if the CSS is cached, your browser still needs to ask the server whether the file has changed and receive back a "304 Not Modified," so you're still incurring the overhead of those connections.
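Roughly, that revalidation exchange looks like this on the server side; a minimal Node/TypeScript sketch, with "styles.css" and the port as placeholders:

```ts
// The browser sends If-Modified-Since for the cached CSS; the server answers
// with a body-less 304 if nothing changed.
import { createServer } from "node:http";
import { readFileSync, statSync } from "node:fs";

createServer((req, res) => {
  const { mtime } = statSync("styles.css");
  const mtimeSec = Math.floor(mtime.getTime() / 1000); // HTTP dates have 1s precision

  const since = req.headers["if-modified-since"];
  if (since && Math.floor(new Date(since).getTime() / 1000) >= mtimeSec) {
    res.writeHead(304); // nothing to send, but the round trip still happened
    res.end();
    return;
  }
  res.writeHead(200, {
    "Content-Type": "text/css",
    "Last-Modified": mtime.toUTCString(),
  });
  res.end(readFileSync("styles.css"));
}).listen(8080);
```

The 304 carries no body, but the connection and the round trip still cost something.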
Although I take your point that people do a lot of needless computation these days, I don't quite understand the dig at caching. Isn't adding an HTML file to your server just manual caching?
Ah, right, I understand your point now. When sites determine if something is cached or not, are they doing it purely based on the time taken to load it?
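The only non-timing approach I can think of would be something like the Resource Timing API, where a transferSize of 0 usually means a cache hit (only a heuristic, since opaque cross-origin entries also report 0):

```ts
// Heuristic cache detection via the Resource Timing API rather than raw load times.
for (const entry of performance.getEntriesByType("resource")) {
  const res = entry as PerformanceResourceTiming;
  const likelyCached = res.transferSize === 0 && res.decodedBodySize > 0;
  console.log(res.name, likelyCached ? "likely cached" : "fetched over the network");
}
```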
Then the server doesn't need to know about repeat visits that don't hit it, and it would be nice to maintain caching support if the page content is static.
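Something like this is all it takes on the server side; a minimal Node/TypeScript sketch, where the file name, the one-week max-age, and the port are arbitrary examples:

```ts
// Give static assets a long max-age so repeat visits never reach the server,
// while the HTML itself stays revalidated.
import { createServer } from "node:http";
import { readFileSync } from "node:fs";

createServer((req, res) => {
  if (req.url === "/app.css") {
    res.writeHead(200, {
      "Content-Type": "text/css",
      // The browser may reuse this for a week without contacting the server.
      "Cache-Control": "public, max-age=604800",
    });
    res.end(readFileSync("app.css"));
    return;
  }
  res.writeHead(200, { "Content-Type": "text/html", "Cache-Control": "no-cache" });
  res.end('<link rel="stylesheet" href="/app.css"><p>hello</p>');
}).listen(8080);
```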
Your example is pretty bad for a number of reasons:
1) If you're building a high-availability file server, you would expect the sysadmin to configure the way the OS caches files rather than run with the defaults. Likewise, if you're building a busy site, you'd need to configure caching to fit your specific application.
2) The OS file cache can run with pretty basic defaults because file system files are static (yes, they can change, but those changes go through the kernel's interfaces anyway, so they're easy to track). Website content can't be treated that way, because even static content can be, and often is, dynamically generated, so no assumptions can safely be made. This is also why there are so many different levels of caching on busy websites.
3) File caching in the OS only needs to happen in one place (as touched on in the previous point). Websites, however, are built with a plethora of different frameworks, far too numerous to name.
That all said, some web frameworks do ship with caching defaults (more typically for browser caching), and some web applications ship with recommended plugins for enhanced caching (e.g. Redis or memcached). Pragmatically, though, I think that if a web developer is smart enough to write code, they should be smart enough to implement caching. Given how frequent website attacks are, the unpredictability of which sites go viral and when, and the ease with which anyone can build and host a website, I really do think some web developers need to up their game rather than blame the complexity of the tools or the lack of sane defaults. Yes, the current web model is a mess of edge cases and hidden traps, but if you're a developer there's no excuse not to properly learn the tools you've been given, regardless of (or maybe especially because of) how poor those tools are at protecting you from sawing your own hand off.
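As a concrete example of "smart enough to implement caching": even a tiny application-level cache in front of an expensive render goes a long way. A TypeScript sketch, where renderPage() and the 30-second TTL are made up for illustration, and a real deployment would more likely reach for Redis or memcached than process memory:

```ts
// A minimal TTL cache sitting in front of dynamic page generation.
const ttlMs = 30_000;
const cache = new Map<string, { body: string; expires: number }>();

function renderPage(path: string): string {
  // Stand-in for whatever "static but dynamically generated" work the site does.
  return `<html><body>content for ${path} rendered at ${new Date().toISOString()}</body></html>`;
}

function getPage(path: string): string {
  const hit = cache.get(path);
  if (hit && hit.expires > Date.now()) {
    return hit.body; // cache hit: no re-render
  }
  const body = renderPage(path);
  cache.set(path, { body, expires: Date.now() + ttlMs });
  return body;
}
```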
It's far from just the browser cache. Consider things like network location, download speed, interaction path, browser parsing edge cases, whether scripts are enabled, whether privacy add-ons are enabled, etc.