Profile: dtaht

2014-06-01 11:09:39

On Linux 3.13 and later it's easier to just add fq or fq_codel to /etc/sysctl.conf, which will enable it for all your devices, including those with hardware multi-queues. Or, at the command line:

sysctl -w net.core.default_qdisc=fq # if you are a host sysctl -w net.core.default_qdisc=fq_codel # if you are a router

(but this requires that each interface be taken up and down. Better to just stick it in sysctl.conf and forget about it)

2014-06-01 16:17:59+00:00

Stuart Cheshire is the godfather of the bufferbloat project.

With algorithms like fq_codel we've reduced induced network latencies by orders of magnitude...

Two examples of fq_codel in action:

100Mbit:

http://snapon.lab.bufferbloat.net/~d/nuc-client/results.html

Fixing a cable modem:

http://burntchrome.blogspot.com/2014/05/fixing-bufferbloat-o...

I enjoyed reading all the analogies on this thread. Stephen Hemminger and myself try to explain the latency problems involved in queueing with demos using water bottles in various configurations.

http://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-video...

2014-07-24 02:28:17

I note that the bufferbloat.net project is looking for new hardware, and sponsors. The main foci of the project are apolitical in nature - just trying to build a faster, better internet. Most of cerowrt's original goals have been met, except the mandate to seriously improve wifi, which is next on the list.

2014-07-24 02:33:58+00:00

I don't regard cerowrt so much as "my" project. There's a couple hundred people on the mailing list - it's the "reference router project" for the bufferbloat-fighting community,

And: cerowrt is much closer to a "branch" of openwrt than a fork. It's synced with openwrt on a nearly weekly basis, and the net delta between it and openwrt at this point is a few dozen patches (most code under test), and a bunch of test tools maintained in the ceropackages feed. We have always collaborated closely with the openwrt folk (several help out on a regular basis). A core difference between openwrt and cerowrt is process - by focusing on one router only (where they focus on hundreds), we were able to spin up and prove out new ideas faster in some cases than they, and maybe get "Stabler", faster - (that said, see bug 442)

Unfortunately - as so many have noted, that router is EOL, and we'e been trying to find a replacement platform for over a year now.

2014-07-24 02:36:29+00:00

Thank you for the performance statistics. We have done a lot of work on other hardware platforms to get them up to being able to saturate gigE or higher, with tuning and things like "Byte Queue Limits" support. The problem with the ppc thing is that most likely it has bufferbloat in a blob we can't fix. (news to the contrary welcomed). And hearing that the APU peaks at 670Mbit is disappointing, but perhaps someone will get some time to profile it and figure out why it is so bad - certainly we get gigE easily with intel based hardware.

2014-07-24 05:29:46+00:00

I'd rather like better data on the problem you describe, as it does not seem to line up with the experimental data I have so far on the interaction of AQM and packet scheduling techniques with bittorrent.

The first paper we did was: http://perso.telecom-paristech.fr/~drossi/paper/rossi13tma-b...

Which predated the development of fq_codel, which is the fair queue (flow queue) + aqm hybrid now deployed in cerowrt, openwrt, and countless other QoS systems. That paper only explored RED or SFQ, not a hybrid.

What I think the positive effects we are seeing now are several factors. 1) a "torrent" is usually on 6 flows at a time, rotating to a new flow every 15 seconds or so. 2) The additional peers trying to negotiate IS a problem at very low rates (say below 4mbit) but invisible at higher rates. 3) reducing latencies under load to under 5ms has enormous gains in bidirectional throughput, which more than compensate for losing torrent's delay based backoff mechanism, which only backs off at 100ms. Which would you rather have, 100ms latency or 5ms?

See, for example, what the fq_codel system does for verizon and comcast here:

https://www.bufferbloat.net/projects/codel/wiki/RRUL_Rogues_...

We got back 5x free bandwidth on the verizon test. And if you are experiencing 5ms worth of latency does it really matter if you have 6 torrents in the background?

(80, yes! but it's a function of your bandwidth also and... and... more testing is indicated.)

... so I haven't got around to redoing the research and writing a successor paper on it, and probably won't anytime soon. The original authors of that paper have gone off to do several papers of extensive analyses of RED vs ledbat, and I think they've largely gone off into the weeds, not that I mind having that analysis as a basis if ever we (or someone) gets around to analyzing fq_codel with some of those techniques.

Please feel free to test re torrent with a system that uses something like openwrt's qos-scripts or cerowrt's SQM system.

The conclusion of the original paper was that only classification seemed to be an answer to even further deprioritize torrent in an AQM'd and FQ'd world. (which the above systems can do also)

http://tools.ietf.org/html/draft-hoeiland-joergensen-aqm-fq-...

http://tools.ietf.org/html/draft-nichols-tsvwg-codel-02

2014-07-24 06:05:19+00:00

IMHO: EFF doesn't yet grok the importance of pushing out good routing protocols to the edge, and the need for mesh networking the edge together in the advent of emergency... or adversity.

CeroWrt include every mesh networking protocol available - olsr, batman, and babel, for starters. The default is babel in it's source specific variant, and it's on by default. Tinq is also in there as an optional package, and recently RTT based metrics were added to babel to make meshing over vpns saner.

I'm not sure what you mean by "secure mesh networking"?

One of the cool features (not in the current cerowrt release) of the quagga-babeld protocol implementation was the ability to securely exchange routes over an insecure medium, which not only has an implementation but is an official RFC.

http://tools.ietf.org/html/rfc7298

I am under the impression commotion-wireless is doing something similar. If we are ever to replace BGP we need to start somewhere...

2014-08-04 01:58:18+00:00

I do look forward to the day where such complex congestion control protocols can be implemented in hardware...

Until then... fq codel is the best thing going.

2014-08-04 02:26:47+00:00

My problem with the new paper is that it computes for a baseline rtt of 100 or 150ms. Real world average rtts are in the range of 4 ms for Google fiber 18 ms for fios and 38 ms for cable, with the ethernet rtt in a Datacenter far lower than that. I would be very happy to see Remy produce a CC for these rtts one day also.

2014-08-06 12:27:06+00:00

I put up all I have to say on the subject at: https://lists.bufferbloat.net/pipermail/bloat/2014-August/00...

2014-09-16 10:51:24+00:00

I guess no one reads Vint Cerf anymore. Sigh.

2015-02-28 14:26:02

And that we have fixed bufferbloat on many technologies, and thus a mandate for more packet loss, rather than less - or the deployment of ECN - would be WAY better than further political debate. example of the fix for cable modems.

http://snapon.lab.bufferbloat.net/~cero2/jimreisert/results....

2015-07-11 12:48:25+00:00

It is everywhere... pretty thoroughly quantified now (example:

http://www.dslreports.com/speedtest/results/bufferbloat?up=1

)

It is serious and remains so... but

Fixes have emerged (fq_codel, pie, cake) that are widely available in linux and linux derived gear -

which had spectacular results:

http://burntchrome.blogspot.fr/2014/05/fixing-bufferbloat-on...

and pie was mandated as part of the docsis 3.1 std.

and standardization efforts of the new aqms is taking place at the ietf. For example, this RFC just emerged: https://tools.ietf.org/html/rfc7567 and there are other products of that working group nearing finalization.

While wifi and lte remain to be fixed, I sleep better knowing that the tide is turning.

2017-03-02 19:14:02

I would be very interested in trying to see the results of the firebind stuff, with our stuff (fq_codel and cake based "sqm") in place on the link. Could you drop by the bufferbloat mailing list and chat with us?

Also: We use a tool that we consider much better than iperf alone, it's from "flent.org" and gives us the abilty to inject loads and measure the side-effects.

2018-12-19 23:14:38+00:00

On the point to multipoint problem, I think we thoroughly licked that last year (and in openwrt 17.01) and later, with: https://www.usenix.org/system/files/conference/atc17/atc17-h...

I look forward to seeing some more field results from that.

2019-07-14 21:54:40+00:00

I am pleased to see no one coming at us with pitchforks and torches ablaze.

240/4 already works in many idea also. 127 and 225/8-232/8 can be reallocated as unicast also.

2019-07-14 22:01:35

Drop myself and the other project members a line and see the unicast extensions project on github.

It is a trivial number of patches to enable all the open source code in the world to work.

2019-07-14 22:03:59+00:00

This point was the overarching point of my talk at netdevconf. We are going to need more ipv4.

2019-07-14 22:06:21+00:00

Thx for the pointers to the work and talks!

2019-07-14 22:10:02

225/8-231/8 were reserved for future multicast use and never allocated by iana. As near as we can tell by grepping the entire world's source they are entirely unused.

128m addresses....

2019-07-16 22:22:44

This work is a fallout of the conclusion of this report:

https://www.internetgovernance.org/2019/02/20/report-on-ipv6...

"legacy IPv4 will coexist with IPv6 indefinitely"

which was referenced in the original talk, and I wish more "just deploy ipv6" advocates would read. I have worked very hard on making ipv6 deployable (notably in the cerowrt project) and came to the conclusion after reading that that if we wanted sustained internet innovation we needed to expand and improve ipv4, also. Thus 240/4, 0/8, 225/8-231/8 (pending) and yes, even a large portion of 127 and a push to clean up other problems in ipv4 in general.

2019-07-16 22:23:28

2019-07-16 22:27:21

We have had many talks at many levels over the years. This is an attempt to break the logjam of finger pointing that ensued. We expect wide adoption of this and other ipv4 related patches over the next few years across the open source stacks... and then a standards dialog can take place.

John Gilmore (creator of the unicast extensions project) was the co-inventor of bootp in 85, which is what made using 0.0.0.0/8 feasible, then. The fact that it's been feasible since and no movement in the standards orgs to fix it, has kind of been a record long wait from bug fix to deployment in both our cases.

2019-07-16 22:29:42+00:00

We've had 225/8-231/8 up and running for 6+ months, with patches for various routing daemons. Running a gauntlet with 240/4 and 0/8 now working seemed best.

In our exploration of converting these 120m addresses from a multicast definition to unicast, we only had to change the kernel with two tiny patches, and recompile 89 fedora packages. Openwrt was less. The patches for frr and bird, straightforward.

2019-07-16 22:30:59+00:00

As for private or public use, certainly there is demand for a larger CGNAT address space, somewhere, and allocation of a portion of 225-231 for that might drive adoption there.

Still, the addresses need to become routable, and from there (aside from politics and depoloyment delay), globally routable is the dream for most of these new addresses, in 5-10 years.

2019-08-06 17:40:07+00:00

Do keep hoping everybody complaining here about their speed, also check their latency on dslreports.com. Bufferbloat makes slow internet a lot slower than it needs to be, and there's cures for it (sqm/fq_codel,sch_cake etc) - deploying that on your link can be a rather nice upgrade.

2019-09-10 09:56:19

I think I've read everything she ever wrote about congestion control. I'd always wanted to meet her. :(

2021-11-16 11:16:11

We've been having regular videoconferences to discuss this concept tuesdays 9:30 AM PDT over here:

https://starwrt.v.taht.net:8443/group/bufferbloat/

(use any login, no password, chrome is best)

There's also the "rpm" mailing list on lists.bufferbloat.net

2021-11-16 16:47:24

I note the 240/4 and 0/8 and especially zeroth ideas are the more viable of this bunch.

What I mentally see as a possible use for 127/8 (with 127/16 still held in reserve) - is the really painfully untransparent string of vm -> container -> OS -> offload engine that we have little insight into today.

2021-11-16 17:45:20

I'd like that too, having spent decades on ipv6.

Making 0/8 "just work" took an hour and some testing. Ripping out the check for that reservation saved a nanosecond. And: 240/4 had been working for a decade, and nobody noticed.

2021-11-16 18:01:04

the zeroth draft gives folk with an allocation a free new IP address.

2021-11-16 18:45:53

At current prices .mil selling off 22/8 would net them at least 2 F-22s. Not quite the same as selling off tsla stock, but...

2021-11-17 00:51:05

We could use an 'upgrade in place': https://docs.google.com/document/d/1T21on7g1MqQZoK91epUdxLYF...

2021-11-17 00:58:44

filed with the fcc, 6 years back: http://www.taht.net/~d/fcc_saner_software_practices.pdf

2021-11-17 01:01:42

Please contact me via email - dave dot taht at gmail dot com. Working on a successor to this:

http://www.taht.net/~d/fcc_saner_software_practices.pdf

2021-11-17 07:34:47

We have been at this, technically, for years and has been discussed in multiple forums prior to now. It's a done project for me (my primary focus is bufferbloat), as all the technical work I could do is complete and the core patches for 240/4, and 0/8 and zeroth in the linux kernel and BSD. The internet drafts are now submitted for consideration for standards track. If they fail to gain support, the existing and future usages of these ipv4 spaces will remain unstandardized.

2021-11-17 07:36:21

A host nowadays consists of one or more vms, piled on top of a container, piled on top of a hypervisor.

2021-11-17 07:58:55

Other normal traffic on the machine going through that conditioner has side effects on your application as well.

2021-11-17 08:02:37

Over here we tuned up virgin media fiber to have the least latency under normal working conditions yet seen, using sqm and the sch_cake algorithm. Pretty graphs here:

https://forums.overclockers.co.uk/threads/virgin-media-discu...

I don't know what rpm score that network would get as yet.

2021-11-17 08:03:46

It's not "one click", but flent is the goto tool we use to probe networks for a multiplicity of problems. It's available for most OSes, including OSx. See flent.org.

2021-11-19 01:11:40

A very good overview, aside from these proposals, of the problems that internet addressing has developed over the years was also presented at the same working group meeting.

https://datatracker.ietf.org/doc/draft-jia-intarea-scenarios...

This should also be a candidate for careful extrapolation:

https://www.larus.net/blog/IPv6_adoption_then_and_now

2021-11-19 01:15:17

I am a bit sad that what I felt was the proposal of most immediate benefit to the internet was not the one that got all the attention. That one is here:

https://datatracker.ietf.org/doc/draft-schoen-intarea-unicas...

2021-12-02 11:39:24

sometimes... I hate how popular avery's posts are. (he's my former boss). I simply can't write like that. Or think like that. good show, ex-boss!

2021-12-02 13:17:37

AQM technologies were described as necessary best current practice by the IETF back in 1992, and recommendations revised in 2013 https://datatracker.ietf.org/doc/rfc7567/ because the algorithm for it chosen in 1992 (RED) didn't work well enough. Two new algorithms appeared in 2012 (Pie and codel/fq-codel) that did.

I was very frustrated by the network neutrality "debate", partisans excluding the idea that it was actually a huge technical flaw with how the internet is structured that was at the root of the problem - when many flows attempted to co-exist: https://blog.cerowrt.org/post/net_neutrality_customers/

Now Solved. Thoroughly. By those two theoretical breakthrus. Since, zillions of knowlegable users and new products (like those from openwrt, eero, google wifi, and many others) managed to fix the underlying bufferbloat problem for themselves via "smart queue management" or "optimizing for conferencing and gaming", but it required manual configuration and tweaking and the right place for better bandwidth does indeed lie within the ISPs shapers and CPE.

And on by default.

However, it's not as simple of "call of duty or zoom" going "ahead". Better multiplexing (FQ), and shorter queues (AQM) lead to those applications' packets mixing into the heavier netflix flows without manual intervention.

Comcast

2021-12-02 13:22:49

Keeping the queues short with AQM makes every application share more fairly. Low rate applications like gaming and zoom benefit.

QoS, as you describe, has been tried (see diffserv and intserv), and fell down because mostly everyone felt "their" application had priority. Being fair, instead, and improving statistical multiplexing in particular lets 'packets be packets'.

2021-12-02 13:34:30

by multiplexing better, you generally need to not differentiate between high and low priority. Being that I've been trying to convince people of this for years - all I can do is try to point at resources that demonstrate how we think about how packets work is frequently wrong, and/or try to provide a demonstrable example.

Let's take voip vs netflix over 60ms - netflix, like most DASH video traffic from most video services, is very bursty. It tries to grab a chunk of video, and over the course of 1-2 seconds, fills the pipe, and voip sends one tiny packet every 20ms, which kind of looks like this:

What you want:

NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -> you V V V

What you get:

VVVNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -> you

What AQM does is keep that induced queue short (roughly 16ms with DOCSIS-PIE). The netflix "pulse" goes away and you end up with

NNNNNNNNNNNVNNNNNNNNNNNNNN -> you

Netflix loads the next segment a tiny fraction slower, but your voip call doesn't get the jitter and latency side-effect anymore.

2021-12-02 13:38:46

I've given more than a few lectures on how packets and queues and aqm are supposed to work, with very physical demos, using "packets as people", such as this one: https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-...

I'm cited in the article, participated in the study, and have been trying to counter this mis-impression about "priority" vs filling the pipes, not the queues, as the answer to better network latency for many, many years now. It's still a tough slog, it seems, and I fear this will be a long day on reddit for me! But with comcast's now enormous and deployed existence proof, perhaps more will deeply grok it, and we'll see movement by other isps to apply the same technologies so all their users would benefit.

2022-01-13 14:45:36

"It is time to update our understanding of the primary factors directly affecting end-user Internet performance.What we have learned is that high throughput alone is not sufficient." - from the executive summary

2022-01-30 09:47:18

I tried to leave people laughing, here: https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-...

And the online book, freely available and primarily on applying fq_codel to everything (and also sch_cake) is here: https://bufferbloat-and-beyond.net/

In the last decade we've managed to eliminate fifos from most of linux, most 3rd party firmwares notably with openwrt and sqm, ios and OSX. The only major things left unfixed are unfortunately home routers and edges.

2022-01-30 09:53:29

The modern AQM's are based on "Time in Queue". Both pie and codel work brilliantly with pause frames, so long as BQL is also in the driver. fq+aqm (be it codel, pie or cobalt) works even better.

See:

https://datatracker.ietf.org/doc/html/rfc8290

https://datatracker.ietf.org/group/aqm/documents/

https://arxiv.org/abs/1804.07617

Adding AQM and FQ wifi was way, way harder, (apenwarr drove the group at google that did some of it), but there is full support for fq_codel now in the mt76, ath9k, ath10k, iwl, and one new realtek chipset in the linux kernel. https://lwn.net/Articles/705884/