A Guide to HTTP/2 Server Push (www.smashingmagazine.com)
189 points by okket | 2017-04-10 | 60 comments




> As of now, Nginx doesn’t support HTTP/2 server push, and nothing so far in the software’s changelog has indicated that support for it has been added. This may change as Nginx’s HTTP/2 implementation matures.

Forget Apache or Nginx. Use the H2O server instead: https://h2o.examp1e.net

I've been using it on a production server with a Django app for about a year now, with HTTP/2 push support, and it's been great. It includes an advanced feature that detects whether a web browser already has pushed content in its cache, so it doesn't resend it. Its architecture seems to offer the best of Nginx's multiprocessing capabilities as well, and configuration is as simple as Nginx's. (I went from Apache -> Nginx -> H2O)


H2O also allows an HTTP/1.1 backend to push by sending links [1] with rel=preload — these links are automatically converted to pushes by H2O. This can be further sped up with early hints [2].

[1] https://tools.ietf.org/html/rfc5988 [2] https://tools.ietf.org/html/draft-ietf-httpbis-early-hints
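
As a rough illustration of what the backend side can look like, here's a minimal Node.js sketch that emits such a preload link (the asset path is made up, and whether it actually becomes a push depends on how H2O is configured in front):

    // HTTP/1.1 backend emitting a Link header; a push-capable front proxy
    // such as H2O can turn this into an HTTP/2 push of /static/app.css.
    import { createServer } from "node:http";

    createServer((req, res) => {
      res.setHeader("Link", "</static/app.css>; rel=preload; as=style");
      res.setHeader("Content-Type", "text/html");
      res.end('<html><link rel="stylesheet" href="/static/app.css">...</html>');
    }).listen(8080);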


So wait, if the server pushes assets before the client requests them, does that mean the server pushes them regardless of the cache status on the client?

If so, that should really be limited to small assets to be beneficial.


I think it still obeys the browser's cache:

"When the user navigates to a subsequent page that requires that asset, it can be pulled from the cache, eliminating the need for additional requests to the server."

I'm not clear on whether there is any relation to existing cache headers. I would have thought even the first page hit could have Link'ed assets pulled from the cache if present, if the client had previously visited that push-enabled resource (edit: in which case it's not the first hit, I guess). I guess the client, upon hitting a cached asset referenced in a Link header, ignores it, where otherwise it'd suck it down the socket as part of the same initial request.


The H2O server, on the first request, sends a cookie that includes a signature of all the files the server pushes: https://h2o.examp1e.net/configure/http2_directives.html#http...

If, on subsequent requests, the cookie is sent along with the request, H2O doesn't send those files.

This is still a bit of a hack, and ultimately browsers will have to include some information on what's in the actual cache, with proposals on that pending: https://datatracker.ietf.org/doc/draft-ietf-httpbis-cache-di...
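
The underlying trick can be sketched roughly like this (the cookie name, digest scheme, and push() helper are all hypothetical; H2O's actual implementation differs):

    // Fingerprint the set of pushable assets; skip the push when the client
    // already carries a cookie acknowledging that exact set.
    import { createHash } from "node:crypto";

    const PUSHED_ASSETS = ["/app.css", "/app.js"];

    function maybePush(cookies: Record<string, string>,
                       push: (path: string) => void): string | null {
      const digest = createHash("sha256")
        .update(PUSHED_ASSETS.join(","))
        .digest("base64url");
      if (cookies["h2-pushed"] === digest) return null; // client has them
      PUSHED_ASSETS.forEach(push);
      return `h2-pushed=${digest}`; // value for a Set-Cookie header
    }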


As I understand it, the client has an opportunity to proactively cancel pushes of assets it already has cached.

The server will send a "push promise" which basically says "I'm going to send this file to you", and then the client can come back and say "don't bother, I already have it". And this all happens in parallel with the download of other assets (like the main page), so it doesn't really slow anything down.

Here's an article I read which goes into how server push works in a lot more detail: https://hpbn.co/http2/#server-push


I had a bit of a look around after reading the article; this page might also be useful in explaining how push interacts with the cache, with some suggestions on when to apply it:

https://www.shimmercat.com/en/blog/articles/whats-push/#inte...

Sounds painful to me.


Setting it up isn't that hard... just set it and forget it, and let the browser & server handle things from there.

In practice, it's going to be a lot faster than the default of not pushing resources.


Doesn't that require more round trips than just having the client ask for it?

Yes. This makes push better for things that you'd otherwise inline. link[rel=preload] is generally good enough for other resources.

No, even in the worst case it's exactly the same number of round trips.

Without server push:

1. Client requests main page ->

2. Client receives main page <-

3. Client requests subresources ->

4. Client receives subresources <-

With server push:

1. Client requests main page ->

2. Client receives push promises and main page <-

3. Client cancels promises it doesn't need ->

4. Client receives subresources <-

The only difference is that with server push, steps 2, 3 and 4 happen in parallel, and step 3 can be omitted entirely in the event that the client doesn't need to cancel any pushes.


Or, what may be a more common case, depending on caching method:

Without server push:

1. Client requests main page ->

2. Client receives main page <-

With server push:

1. Client requests main page ->

2. Client receives push promises and main page <-

3. Client cancels promises it doesn't need ->


Ah, fair point. Now I see what you were trying to say.

Assuming the server has no way of determining which assets the client has cached (which, depending on the implementation, may not be the case), you're of course correct. However, after step 2 the page has already fully loaded in both cases, so step 3 doesn't really slow anything down.


It can slow things down, because push works like this:

The client requests some page (e.g. index.html), which triggers the push on the server, e.g. one for asset1.png. The server initiates the push by sending a push promise frame down the wire, which contains the content of a simulated HTTP request for that exact resource, as if the client had sent a GET request for asset1.png through a headers frame. The client can only react to this frame once it receives it and sees that it already has asset1.png in cache. It can then cancel the stream by sending an RST_STREAM frame. Obviously, though, that takes some time: the push promise frame first has to arrive at the client. Meanwhile the server might decide to send not only the push promise frame but also part of (or the complete) data. So asset1.png already goes down the wire and blocks bandwidth that could be used for other things.
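
Here's roughly what that server side looks like with Node's built-in http2 module (file names are placeholders): the PUSH_PROMISE goes out via pushStream(), and if the client answers with RST_STREAM the pushed stream simply closes early, after whatever bytes were already in flight:

    import { createSecureServer, constants } from "node:http2";
    import { readFileSync } from "node:fs";

    const server = createSecureServer({
      key: readFileSync("key.pem"),
      cert: readFileSync("cert.pem"),
    });

    server.on("stream", (stream, headers) => {
      if (headers[":path"] !== "/index.html") return;
      stream.pushStream({ ":path": "/asset1.png" }, (err, pushStream) => {
        if (err) return; // client disabled push or connection is gone
        pushStream.on("close", () => {
          if (pushStream.rstCode === constants.NGHTTP2_CANCEL) {
            // Client cancelled (e.g. already cached), but bytes written
            // before the RST_STREAM arrived were still sent.
          }
        });
        pushStream.respond({ ":status": 200, "content-type": "image/png" });
        pushStream.end(readFileSync("asset1.png"));
      });
      stream.respond({ ":status": 200, "content-type": "text/html" });
      stream.end(readFileSync("index.html"));
    });

    server.listen(8443);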


That might slow down downloads from other pages or processes, but as far as the download from the current page goes (which uses multiplexed requests and responses over a single connection) I don't see how that's any worse than it'd be without server push.

In the worst case scenario with server-push that you described, the server decides to push content the client already has cached before pushing content the client doesn't have, and begins transmitting that redundant content before the client has a chance to cancel the push. Once the client receives push promises for the content it doesn't need, it can immediately send a request to cancel the download of those resources. Once those cancellation requests reach the server, it immediately stops transmitting the resources the client has cached, and starts transmitting the data the client actually needs.

Contrast this with the same scenario without server push, where the client parses the HTML for the main page to determine what resources it needs, then sends requests for those items, which the server must receive before it starts transmitting them. In both cases a round trip from the server to the client and back is needed before the server can start transmitting the necessary assets, but without server push the client needs to download and parse the main page before it can start telling the server which resources it needs, whereas with server push it can send cancellation notices before the current page is downloaded or parsed.


The difference is that bandwidth is being used by incoming pushed assets before the abort. This might not matter much on fast wired connections but can make a big difference on mobile networks.

Exactly. The start of the push will already consume downstream bandwidth until it's canceled by the remote side. I guess in the worst case it consumes up to the maximum window size of the stream (typically 64kB). The "classic" approach, by comparison, is the client initiating another GET request for each asset after it has received the index page. This requires more upstream traffic. However, the web server can directly respond with a 304 for cached assets, which means less downstream traffic.

Basically yes. The client can cancel the streams for pushed assets that are in cache. But the server doesn't have this information (natively) when one page is requested.

As others have said, there are workarounds using cookies. Which is yet another abuse of a very bad feature of HTTP.


I'm not sure that any of the current browsers actually do this, though.

It's unspecified how the server should handle this. AFAIK, it's "one of the great unsolved problems" with HTTP2.

There is ongoing work at the HTTP working group to standardize “cache digests” to solve this problem:

https://tools.ietf.org/html/draft-ietf-httpbis-cache-digest


You are precisely correct; server push is not cache aware today.

According to the spec, browsers have the ability to abort pushes that the server has started. So if you start sending down a 3MB file, your browser can send the server a request to stop sending it because it's already in cache. But no browser actually does this today, and the first part of the file gets sent down regardless.

There's a draft spec for browsers to use "Cache Digests," a compact representation of the content already in the cache. http://httpwg.org/http-extensions/cache-digest.html But no browsers support that today.

Luckily, you can work around/polyfill Cache Digests using Service Workers. Your Service Worker can provide the Cache Digest as a custom HTTP header on subsequent requests, and then the server can know to avoid re-pushing cached resources.

Alternately, and more simply, if you configure your Service Worker to return/render stale content on first load, your Service Worker can make a request in the background to fetch the latest content, passing a simple header to the server to disable push. Since the stale content has already rendered, the round-trip time to download updated resources is less important.

http://calendar.perfplanet.com/2016/cache-digests-http2-serv...
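
A minimal sketch of that custom-header idea (the header name and digest format here are made up, not the actual Cache Digest encoding, and it only handles GET):

    /// <reference lib="webworker" />
    // Service Worker: advertise the cached URLs to the server so it can
    // skip re-pushing them.
    declare const self: ServiceWorkerGlobalScope;

    self.addEventListener("fetch", (event: FetchEvent) => {
      event.respondWith((async () => {
        const cache = await caches.open("v1");
        const cached = (await cache.keys()).map((r) => new URL(r.url).pathname);
        const headers = new Headers(event.request.headers);
        headers.set("x-cache-digest", cached.join(","));
        return fetch(event.request.url, { headers });
      })());
    });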


Some web servers[1] additionally store cache information in cookies. The same author that (co-)wrote your linked draft implemented it as a precursor to the draft. If only CDNs supported it :-)

[1]: https://h2o.examp1e.net/configure/http2_directives.html#http...


> but no browser actually does this today, and the first part of the file gets sent down regardless.

Are browsers really not doing this? My understanding was that just one round-trip was wasted for the client to send an RST_STREAM frame, or does it really re-download the entire pushed file where normally it would be a 304 on request?


The RST_STREAM frame is, indeed, how the browser was supposed to abort the download, per the spec. But it's a SHOULD, not a MUST. In fact, browsers just don't do it.

So why not? Priorities. Not very many sites are using push, and this is partly due to a chicken-and-egg problem: push isn't that good, so nobody uses it, so there's not much incentive to make push better.

But even if new browsers started correctly aborting pre-cached pushes today, that still wouldn't be as awesome as Cache Digests, which could avoid re-pushing cached content without wasting any bytes. So, if you're going to spend a few minutes working on H2 Push, you should probably spend it working on Cache Digests, rather than working on aborting pushes.

And Cache Digests can even be polyfilled by Service Workers, which are incredibly powerful in their own right. So if you're not Chrome or Firefox, rather than even spending time on Cache Digests, you should probably focus on implementing Service Workers first, so the users can at least polyfill their way out of the problem.

So the order should be: Service Workers first, then Cache Digests, then push-aborts. So, uh, don't expect to see any push aborts from the major browsers any time this year, IMO.


Push shouldn't be used for GETting things that you can cache. For that you can use pull with HTTP pipelining.

Push should only be used for updates, including updating the cache. Ideally you'd only be sending the deltas (changes).


Flashback to the mid-nineties and push media.

http://www.javaworld.com/article/2077287/marimba-software--p...


Literally, nothing like that.

Always remembered the brand and name: Marimba Castanet, but never its purpose – thanks for the refresh! :-)

I was surprised when I read that Server Push wasn't intended for event sourcing. Anyone know why?

These days people use Server Sent Events and WebSockets for this: The client requests data from the server (possibly through a "subscription" type of request), and the server keeps sending messages (e.g. chat messages) until the client cancels the request or closes the connection.

Not having looked deeply into Server Push, it sounds ideal for this sort of thing, and then we could do away with both SSE and WebSockets, neither of which are very HTTP-y.
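
For reference, the client side of the SSE pattern is about as small as it gets (the endpoint name is made up):

    // Subscribe once; the server keeps writing "data:" lines down the same
    // response until either side closes the connection.
    const events = new EventSource("/chat/stream");
    events.onmessage = (e) => console.log("chat message:", e.data);
    // To cancel the subscription: events.close();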


Confusing name but different use cases. HTTP/2 does introduce efficiency gains that will benefit SSE, but it's unlikely to supplant WebSockets entirely.

https://www.infoq.com/articles/websocket-and-http2-coexist


SSE is just a long-running connection, so it fits with HTTP/2 pretty well (and HTTP/1 for that matter).

H2 push needs to be linked to an initial response, and I don't think it's acceptable to push resources once the initial response is complete. Also, H2 push is pretty much a request/response pair, so it can be cached, so it doesn't really fit the SSE model.

You're right about websockets though. At some point they'll be replaced with something more H2-like, but it won't be push-based.


If the next protocol iteration is something like QUIC, then the WS could just be implemented over QUIC UDP packets.

Right; Server Push isn't "send the client arbitrary data"; Server Push is "move link/subresource prefetch logic to the server, where you can take advantage of optimizations like single JOIN-queries that give you all you need to output multiple resources, actually outputting multiple responses."

Server push sends responses to requests, only these requests are “promised” by the server. So, for every event pushed, the server would have to send at least three frames: promise (the synthesized request), response headers, and response data.

Also, you would still need something else for HTTP/1.x. Even an HTTP/2 client does not have to support server push, and can even ask the server to disable it.

Why do you think server-sent events are not HTTP-y? I found them as elegant as modern Web technology gets.


SSE is not extremely HTTP-y, but at its core it's the sort of thing we've used for ages for the same things we use WebSockets for. The only difference between SSE and good old polling is the presence of a nice API on the browser side.

We have worked with SSE, HTTP/2 and HTTP/2 Push of SSE event streams and it is a combination made in heaven. In particular, we use SSE events to have the server announce to the browser when a particular segment of an interlaced or progressive image has been delivered. With this combo, the SSE events can be delivered on the wire very close to the data frames for the image pieces themselves.


> HTTP/2 Push of SSE event streams

That's... brilliant, actually! It's like making a request in Go, and being returned a response promise, along with an "events that occurred during response processing" channel, which you can select() on until the promise is resolved.


Why would you do away with them just because they aren't HTTP-y?

WebSockets aren't perfect but they're a lightweight full-duplex protocol that is great for many real-time scenarios. Not being the same as HTTP is the point.

FWIW there is an old spec draft for WebSockets over HTTP/2: https://tools.ietf.org/html/draft-hirano-httpbis-websocket-o...

See the related curl blog post 'No websockets over HTTP/2' [1], which talks about that spec and how it didn't generate enough interest from concerned parties, and goes into some alternatives and perspectives on the matter.

[1] https://daniel.haxx.se/blog/2016/06/15/no-websockets-over-ht...


Good post.

> Sticking to HTTP/2 pretty much allows you to go back and use the long-polling tricks of the past before websockets was created.

Interesting how things turned out. The old bad way is now a good way. Last time I checked, Google Hangouts also used polling instead of WebSockets.


The fetch API is probably the closest to what you're asking for, as it allows for streaming a response, but is basically a plain HTTP request.
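
A quick sketch of that (the endpoint is made up): the request itself is ordinary HTTP, but the body can be consumed incrementally as chunks arrive:

    const res = await fetch("/stream");
    const reader = res.body!.getReader();
    const decoder = new TextDecoder();
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      console.log(decoder.decode(value, { stream: true }));
    }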

A couple of gripes with HTTP2...

1) HTTP2 Server Push is a hobbled protocol. It was designed for serving static assets to desktop browsers for pre-caching. Most internet traffic is now moving to mobile apps or rich browser clients. Developers would greatly benefit from a true bi-directional API that could be fully leveraged by event-based or reactive frameworks. The community could even build its own APIs if the browsers just exposed the DATA frame primitives of HTTP2. But they don't.

2) Open-source proxy servers like NGINX and HAProxy, and many key IaaS providers like Cloudflare and Google Cloud, do not support full-duplex HTTP2 connections (one side only at best, translated to HTTP1 on the backend). This is one year after RFC 7540 was officially released and five years after SPDY. Why even bother baking considerations for HTTP2 into your application layer when your edge infrastructure won't let you leverage it anyway?


The lack of HTTP/2 on the backend to origin connections is a big sad mystery.

If anything it would greatly increase performance, especially for smaller servers and longer distances, with the multiplexing and stream priority settings alone. Required TLS would also increase security.


>The lack of HTTP/2 on the backend to origin connections is a big sad mystery.

Sad, sure, but it's not a mystery. As of HTTP/1.1 these proxies already pool TCP/IP connections between the proxy and its origin (the backend server), so they don't gain the same connection multiplexing/windowing/cardinality benefits that HTTP/2 to the end client gets. This means that for many proxy code bases, H2 to origin is just a nice-to-have. Obviously, by prioritizing like that you miss other nice-to-haves like push.


HTTP/1 connections are limited to one request at a time and suffer from head-of-line blocking. HTTP/2 connections will also be pooled, but they can sustain many more concurrent requests thanks to multiplexing, while also allowing certain requests to be marked as higher priority. It's a major improvement for high-volume services.

The performance and features are why Google's gRPC uses it as the foundation for microservice communication at scale and at low latency.
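
For a concrete picture, here's a sketch of two requests multiplexed over a single backend HTTP/2 connection with Node's http2 client (the origin URL is a placeholder):

    import { connect } from "node:http2";

    const session = connect("https://origin.internal:8443");
    for (const path of ["/users/1", "/users/2"]) {
      // Both requests share one TCP connection as separate streams.
      const req = session.request({ ":path": path });
      req.setEncoding("utf8");
      req.on("data", (chunk) => console.log(path, chunk));
      req.on("end", () => console.log(path, "done"));
    }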


>HTTP/1 connections are limited to one request on the connection at a time and head-of-line blocking.

This is true; however, this limitation just increases the size of the connection pool needed for the same concurrency. I'm not saying HTTP/2 isn't better, I'm saying it's less of a win for middleware than for end-user connection optimization.


If you have 100k people who've opened e.g. websocket connections to your backend, I doubt you're going to structure things as a 100k-socket connection pool open to your backend. You've probably either:

A. got that scaled across at least 100 backend servers, even if those servers' CPUs are almost entirely idle; or

B. have chosen a completely different, extremely roundabout architecture involving client async requests with HTTP 202 responses + client polling of a "status of server-side promises" endpoint; or

C. are effectively manually doing what HTTP2 does automatically, by having a "stateful-connection load-balancer server" (e.g. SockJS) that mediates between long-lived client connections and short-lived RPC requests with async responses from your backend.

Whereas, with HTTP2, that could very easily be one machine, taking in those 100k nearly-idle connections, and passing them over one TCP socket to one backend.


> Required TLS would also increase security.

TLS is not inherently required for HTTP/2 on the backend.

TLS is sort of required on the frontend mainly because browser vendors don’t want to deal with broken middleware that expects all traffic on TCP port 80 to be HTTP/1.x.

This is less of a concern on the backend, and there are practical implementations of cleartext HTTP/2 today, including nginx and Apache on the server side, libcurl on the client side, and others.


Sure, not absolutely required but it's the default way and most of the time you want all the links to be encrypted. What's the use of a secure last-mile when the origin isn't?

Since backend servers will usually be running internet-facing web servers, HTTPS on HTTP/2 is already included. I'm sure certificate checking can be skipped in certain cases, but the required-by-default mode is overall much better than HTTP/1's.


HTTP2 Server Push is "for what it's for"—it's not trying to be web sockets, or Server Sent Events, or WebRTC Data Channels; it's just trying to (transparently) make HTTP a little bit more optimized while completely keeping all current HTTP semantics.

1. you can use Server Push just as well to e.g. notice when JSON responses you're about to send out contain "hypermedia annotations" (i.e. "link" keys), and set up async self-subrequests to push the linked resources as well—then the backend can avoid ever having to think in terms of something like GraphQL's nested schemas, instead just answering queries for individual resources with fields and foreign key IDs; and the frontend can just dumbly consider responses to map directly to objects, instead of having to have a complex OSI-6 presentation layer that unpacks nested-schema messages into their resource-object equivalents. Suddenly HTTP PUT/PATCH has coherent semantics again, without losing any performance.

2. This is "a good problem to have"—and a good design decision on HTTP2's part, in my opinion. Most sites don't need HTTP2, and can stay HTTP1. While, if you realize that you need what HTTP2 provides, that need will probably be driven by an individual application you have (e.g. Google Search) that is at such huge scale that it has its own custom infrastructure driven by your scaling needs, of which "and supports HTTP2" will just be one more. (And this argument seems to hold water: according to https://www.searchdatalogy.com/blog/http2-on-top-sites/, a large percentage of the top bigcorps doing web traffic have switched.)
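
To make point 1 concrete, a sketch of the hypermedia idea (push() is a hypothetical application helper, not a real API):

    // Scan an outgoing JSON body for hypermedia links and push each linked
    // resource alongside the response.
    async function respondWithLinks(
      body: { link?: string[] } & Record<string, unknown>,
      push: (path: string) => Promise<void>,
    ): Promise<string> {
      for (const href of body.link ?? []) {
        await push(href); // async self-subrequest per linked resource
      }
      return JSON.stringify(body);
    }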


That argument hinges on whether you view the subset of legacy HTTP semantics as a useful and comprehensive abstraction for web client-server development in 2017. For a 2.0 release, and with Server Push specifically being an addition, they could have been more forward-thinking than just offering... "hey, here's a way to serve static resources slightly faster".

The phrase "for a 2.0 release" makes it sound like you think that this was some sort of big branded push, like a new major version of Microsoft Office that Microsoft wants you to buy.

The 2.0 in HTTP2 is just a semver version—it's 2 because it's not wire-compatible with 1. In terms of features, it's pretty much 1.2.

And besides, "serving static resources slightly faster" was enough to get a lot of big corporate players excited to switch. It might not excite you as a developer because it's "just" an infrastructure-level solution to an ops-time scaling problem, rather than a change in development-time workflows. But ops-time scaling problems are real problems, and Internet wire protocols change in response to those problems more than they ever do in response to developers' concerns. (There's a reason all current wire protocol standards are handled by the IETF—a body of data-center and NOC ops people, basically—rather than by developer-oriented groups like Khronos or WHATWG.)

Also, "slightly" is an understatement on severely latency-constrained links, like those of cellular or satellite internet connections, or over Tor(!). If you can turn 10 (tiny) HTTP requests into one request with 10 (tiny) responses, you can often turn five minutes' load time into 20-30 seconds.


It's true that the mechanics of triggering HTTP/2 Push are done with the Link rel=preload directive (defined by the W3C [1]), but the advice to hardcode HTTP headers on outgoing responses that point to file references is simplistic at best, which shows that this whole thing (still) isn't thought out too well.

Most of the content of my earlier post on this topic [2] still stands; since then, Caddy has implemented rule-based ways of specifying what to push [3][4] (although not via Google's push manifest [5], despite being asked to). It'd be far more productive if the community coalesced around one clear declarative way of specifying what to push -- Google's push manifest does this well enough -- which could be ingested by server software and applied as needed.

If this were done, the problem space is moved to somehow generating that manifest, which can be done by hand, or scraping resources, or as an output of a more complex tool [6] that knows more about the info-space of the resources.

And of course, keep in mind the advice that Google has prepared as part of the research they've done on deployments of HTTP/2 Push [7].

[1] https://www.w3.org/TR/preload/#server-push-http-2 [2] https://news.ycombinator.com/item?id=12722383 [3] https://github.com/mholt/caddy/issues/816 [4] https://github.com/mholt/caddy/pull/1215 [5] https://github.com/GoogleChrome/http2-push-manifest [6] https://github.com/webpack/webpack/issues/1223#issuecomment-... [7] https://docs.google.com/document/d/1K0NykTXBbbbTlv60t5MyJvXj...


Looking at the graphs, all of the variations come in at 2-4 seconds, in most cases 10-25% average difference. So it's all slow, but not unbearable; it seems like a bit of a wash.

Using `Link: rel=preload` is interesting, but it misses the real opportunity: getting the critical resources out before the HTML and growing the congestion window early. At best you are in a race with the browser's preloader/speculative parser. At worst, your pushed resources are competing with the base HTML request.

The better solution is to push these resources before the HTTP headers and status code come back from the application framework. The real opportunity is to get the content downloaded early, grow the congestion window ahead of receiving the HTML, then yield the socket to the HTML, and then continue pushing content until the browser starts making its own HTTP requests. As with all things, mileage will vary. Sometimes this is a large opportunity (e.g. a user in Australia requesting content from NY, or DR failover); other times the push opportunity might be negligible (e.g. a cached HTML page). On high-RTT networks, this saves you the full round trip for the request.

For more details on this check out my talk at Velocity Amsterdam: https://youtu.be/GjWD1pOkxUk?t=1534 https://speakerdeck.com/colinbendell/promise-of-push

I also built a tool (on top of webpagetest.org) that helps you evaluate the potential for push here: https://shouldipush.com


What I'd love in the HTTP protocol is some way to control the IPs used by the client, e.g. failing over to a different IP if the first is unresponsive. Such a small client-side change would make back ends much simpler.

There's already a system to do so: it's called DNS.

You can already check for timeouts using XHR, and if a server doesn't respond, then having it send a new IP wouldn't work anyway.

The problem with using DNS that way is the usual 30-second timeout. On the other hand, values lower than 30 seconds can cause problems for many users with slow or mobile connections.

Honestly I can't think of a reason why the HTTP specifications should contain anything about IP addresses.

