Brotli compression for the web (calendar.perfplanet.com)
88 points by ssttoo | 2015-12-26 01:54:38 | 46 comments




The support is really only on firefox: http://caniuse.com/#search=brotli

Chromium is still milling on this: https://code.google.com/p/chromium/issues/detail?id=452335

But there seems to be some progress: https://chromium.googlesource.com/chromium/src.git/+log/lkcr...


That's fine. It's quite possible they could sort out Chrome and most of the world's browsers would have the feature by the end of the year. We live in exciting times :-)


Thank you. That is loads more informative than the linked blog post, which does not even bother to compare to LZMA.

It doesn't bother to compare LZMA because no major browser is planning to add LZMA.

And one more comparative benchmark featuring Brotli: https://github.com/mavam/compbench

I still don't understand why it's 'br' in the header and not 'brotli'

Maybe because the actual name is "Brötli" (and "Zöpfli"), from the Swiss German names for small breads. Umlauts are usually converted like so: "ö" to "oe", but not always. So there may be confusion as to whether it's "brotli" or "broetli".

Then again, this is just some wild speculation. :)


Actually, they decided the header would be .bro

During a lot of overplayed drama it was decided to rename it to .br instead.


I feel like ".brot" would've been a reasonable compromise compared to ".br". (Though I also still think absolutely nothing was wrong with ".bro".)

That's because you're a man.

I am a man, but I don't think that's why I think it. I'd be perfectly fine with a ".girl" or ".chick" or ".fem" or ".sis" extension, or any similar variation, if it happened to be a valid acronym for some technology.

Unless the issue is that "bro" has some other kind of objectively offensive connotations, which I completely disagree with. Many women (and some men) use "sis" (or, much more rarely, "dudette"), in the exact same way as "bro".

(And for the record, I didn't downvote you.)


There are some common connotations for the type of people who call each other "bro", e.g. the muscle-bound narcissist who still smells like last night's kegger, where he got to feel up three different blackout-drunk girls. Of course, that isn't actually representative of the population of bro-sayers, so it's just as much a bigoted stereotype as anything else is.

I've actually known some stereotypical sorority-ish girls who use "sis" / "sister" in a very parallel way (shallow, narcissistic, petty, immoral). Doesn't somehow invalidate all use of the word. (And obviously not all sororities are like that, hence "stereotypical". Nor are all fraternities.)

You pretty much can't win in that culture. You're a man, so you can't even grasp what it's like to be a woman, but trust us, it's horrible. And if you wouldn't mind if it were "sis", that's because you haven't been oppressed by women.

I understand there are gender issues, but if you have to overcorrect so much that you're changing an abbreviation to avoid the short form of "brother", you've gone wrong.

I am very thankful that my culture doesn't have this gender hostility and easy offense that's prevalent mostly in the US.


Women can't oppress men because men as a group are privileged.

Don't ask me how this works but this is What (intersectional/third-wave) Feminists Actually Think.


.sis is already taken, it is (was?) the package files for Symbian. Basically .apk before Android, if you'd like.

What about .sys files?

Thank god the man utility was created a long time ago, nowadays it would need to be called something else.

You probably have also heard the joke where shells have the "herstory" command.

And "kill" system call is eliminated for obvious reasons; processes can decide themselves whether they do anything when they receive the "euthanise" signal. Plus the biased "mail" delivery is replaced by "gendre".


Tumblr aside, "kill" is a dumb name anyway. It just sends signals, and its default signal isn't kill, it's terminate. And hangup doesn't have anything to do with terminating.

A better design would have been to name it "signal" (as in the verb), to avoid a default signal (how often do you actually send TERM?), and to use lowercase signal names, consistent with other commands.


  > And hangup doesn't have anything to do with terminating.
The default hangup action is termination because people generally don't want interactive processes sticking around after their I/O is gone.

  > how often do you actually send TERM?
Each common termination trigger has its own signal so that programs can distinguish the source and potentially act differently for each: HUP for hangup, INT for the tty interrupt key, TERM for kill(1).
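
For anyone who wants to check the default actions described above, here is a minimal sketch, assuming a POSIX system. It starts a throwaway child process with default signal dispositions, sends it one signal, and reports which signal ended it. The helper name `default_action` and the child script are made up for illustration.

```python
import signal
import subprocess
import sys

# Child script (hypothetical): reset both signals to their default
# disposition, announce readiness, then idle.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGHUP, signal.SIG_DFL)\n"
    "signal.signal(signal.SIGTERM, signal.SIG_DFL)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def default_action(signal_name):
    """Send the named signal to an idle child; return the name of the
    signal that terminated it, or None if it exited some other way."""
    sig = getattr(signal, signal_name)
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    try:
        proc.stdout.readline()      # wait until dispositions are reset
        proc.send_signal(sig)
        proc.wait(timeout=10)
    finally:
        if proc.poll() is None:     # clean up if the signal did not end it
            proc.kill()
            proc.wait()
        proc.stdout.close()
    # A negative return code -N means "terminated by signal N".
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None
```

Calling `default_action("SIGTERM")` and `default_action("SIGHUP")` should both report that the child was terminated by the signal it was sent, which is the point made above: neither is "kill", yet both end a process that has not installed a handler.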

> The default hangup action is termination because people generally don't want interactive processes sticking around after their I/O is gone.

Sure but 'hangup' is used to reload configs (which may not actually mean terminating the process).


That came later, though. Assuming re-purposing of an existing signal, SIGHUP was a smart choice because it's exactly the one that a daemon (not attached to a terminal) will never ‘legitimately’ see.

I don't know the history of SIGHUP-to-reload (perhaps a good question for TUHS) but I'd guess it originated outside of Bell or Berkeley, who would have simply added a new signal.


What overplayed drama? The discussion happened on Firefox bug 366559, 'bro' was raised in comment 146 and 'br' was proposed/"accepted" by 152. There were of course a lot of complaints/trolling after the decision was made, but the actual rename happened in a small number of comments over the course of a few hours.

[146]: https://bugzilla.mozilla.org/show_bug.cgi?id=366559#c146 [152]: https://bugzilla.mozilla.org/show_bug.cgi?id=366559#c152


Feminism is a fucking religion at this point. It even has its own clergy that should be consulted before making important decisions.

Indeed, although I said it this way to avoid starting another debate – obviously, that failed.

Look, talking about "a lot of overplayed drama" thus introducing your opinion that "a lot of overplayed drama" had taken place is surely NOT the right way to avoid starting another debate - but I'm inclined to believe you already know that...

Overplayed drama because I consider the whole situation to be ridiculous. It does not matter if it’s bro, br, brötli or whatever.

The whole drama around it, and "feminism gone too far" is ridiculous, and, worst of all, everyone claims the other side did far worse stuff than what actually happened.


Because it's shorter.

But nobody tries to use 'df' for deflate.

They probably would've if they were doing it now.

Previous discussion from 3 months ago:

https://news.ycombinator.com/item?id=10257305


Brotli is an awesome codec, but in a lot of my tests it was not really compelling versus LZHAM in terms of compression speed or compression ratio. In some test cases it edges it out on speed or ratio, but on others it's a tie if not worse. Sadly, in my testing I also hit at least one crash bug in the Brotli compressor, which makes it feel like an immature codec. On the bright side, the bug was fixed extremely quickly. I hope integration in browsers will help push the codec forward and encourage the developers of other codecs to try harder so that we get a really heated competition :-)

> " it was not really compelling versus LZHAM "

For those of us building things on the web, things supported by browsers are far more compelling than those that are not.


I'm hoping for a compression algorithm which allows for separated codebooks, so e.g. one can compactly send updates when previous versions have already been transferred.

What do you mean by a codebook? Something like a static dictionary?

For LZ algorithms, the dictionary made when compressing the first version would be reused for subsequent "versions" of the same document. So sort of a static dictionary, but specialized for what you're sending.

That or maybe he means the Huffman codes, in which case you'd do the same thing but s/LZ/Huffman/ s/dictionary/tree/g.

I tend to favor algorithms like LZF and LZ4 that don't have a separate dictionary for compressing text though, since there's just about no performance loss on either end and compression is "good enough".


> For LZ algorithms, the dictionary made when compressing the first version would be reused for subsequent "versions" of the same document. So sort of a static dictionary, but specialized for what you're sending.

Ah, so you mean something like what SDCH does. This works well for files that are smaller than the window size, with rapidly diminishing returns as file size grows past that.
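
To make the "reuse the previous version as a dictionary" idea concrete, here is a rough sketch using zlib's preset-dictionary support (the `zdict` argument in Python's `zlib` module). This is deflate, not Brotli or SDCH, so it only illustrates the principle. The `sample_document` helper is made up to produce two versions that differ in a single line.

```python
import hashlib
import zlib


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Deflate `data` with `dictionary` preloaded into the LZ77 window."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse of compress_with_dict; needs the exact same dictionary."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


def sample_document(version: int, lines: int = 200) -> bytes:
    """Hypothetical test data: lines of hex digests that deflate cannot
    shrink much on its own; only the middle line depends on `version`."""
    rows = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(lines)]
    rows[lines // 2] = hashlib.sha256(f"v{version}".encode()).hexdigest()
    return "\n".join(rows).encode()


old = sample_document(1)   # version both sides already have
new = sample_document(2)   # version to transfer

plain = zlib.compress(new, 9)
delta = compress_with_dict(new, old)
```

On this sample the dictionary-primed stream (`delta`) should come out far smaller than `plain`, because almost everything in `new` can be encoded as back-references into `old`. The window-size caveat applies here too: deflate's window is at most 32 KB, so content further back than that in the old version cannot be referenced.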


Chrome has something like this called SDCH, but it hasn't seen a ton of adoption outside Google:

https://en.wikipedia.org/wiki/SDCH

https://engineering.linkedin.com/shared-dictionary-compressi...

https://aprescott.com/posts/sdch

Guessing part of the hurdle is that you can't simply drop it in like gzip; you have to generate a dictionary and serve it and refresh it now and then (the LinkedIn post explains). And a CDN can't just cache "the SDCH'd version" of a resource; ideally it'd cache one version for each recently used dict.

The other, slightly happier "problem," is that other tuning can make the additional gains from SDCH less dramatic. I looked, and the Google homepage has a blob of JS that's 40KB gzip+sdch'd--but it's ~100KB gzip -9'd, and probably costs me 0KB most of the time because it's cacheable and its Last-Modified is about a month ago.

CloudFlare's Railgun proxy does use delta coding to good effect, though in a much different context (origin server to edge location).

I can imagine other variations, like priming Brotli's (de)compressor's history buffer with the first bytes of the old version of a cacheable resource in an If-None-Match request, if both sides have the old version handy and advertise the encoding. Sounds like what you're saying. Like SDCH, that still isn't drop-in; the server has to remember old versions of things. And it only applies to cached resources.

Anyhow, interesting space.


> priming Brotli's (de)compressor's history buffer with...the old version...if both sides...advertise the encoding

Too late to edit, but I mean a separate Accept-Encoding value for the diff-like usage (brdiff or whatever), not piggy-backing on the value already being added for Brotli. Clearly if you're messing with the history buffer it's no longer Brotli-as-we-know-it.
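
A sketch of what negotiating such a separate value might look like on the server side. The "brdiff" token is the made-up name from the comment above, not a registered content coding, and `pick_encoding` is a hypothetical helper that handles only simple `q=` parameters.

```python
def pick_encoding(accept_encoding, supported=("br", "gzip")):
    """Return the first server-preferred coding that the client accepts
    with a non-zero q-value, falling back to 'identity'."""
    accepted = {}
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0                      # default weight when no q= is given
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0              # treat a malformed weight as "not acceptable"
        accepted[token] = q

    wildcard = accepted.get("*", 0.0)
    for coding in supported:
        if accepted.get(coding, wildcard) > 0:
            return coding
    return "identity"


# Server that also understands the hypothetical diff-like coding.
SERVER_CODINGS = ("brdiff", "br", "gzip")
```

A client that only sends `Accept-Encoding: gzip, br` would get plain `br` from this server, while one that also advertises `brdiff` would get the diff-like coding. That is the reason for keeping the two values separate.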


Anyone know if there's something planned for haproxy?

You can make your own tests and your data with TurboBench: https://github.com/powturbo/TurboBench
