Gentoo is secure, as far as you can be secure while building a bunch of code few have time to review, on hardware few have time to fully understand, running in an insecure world.
So, I don’t fault it.
However, if you don’t include dependencies and you don’t manage them, which would be the case in modern environments for the majority of users in the world, how is that safe?
Once Nix moves more to fixed output derivations this will be killer!
You can easily upgrade dependencies by having the linker resolve to different libraries without needing to rebuild the world while still retaining a sensible memory model.
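A minimal sketch of what that looks like in practice, using zlib as a stand-in for any shared dependency (the build line assumes a typical Linux toolchain with zlib installed):

    /* app.c - dynamically linked against the system zlib.
     * Build: cc app.c -lz -o app
     */
    #include <stdio.h>
    #include <zlib.h>

    int main(void) {
        /* zlibVersion() is resolved at load time from libz.so.1. If the
         * distro ships a patched zlib under the same soname, this binary
         * (and every other dynamically linked consumer) picks up the fix
         * on the next run, with no rebuild. A statically linked copy of
         * zlib would only be fixed by rebuilding and reshipping the app. */
        printf("linked against zlib %s\n", zlibVersion());
        return 0;
    }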
Correct me if I'm wrong (really) but isn't static linking mostly a problem for packagers? The less a package maintainer modifies the application the better IMO.
If applications A and B rely on dependency D, which turns out to have a vulnerability fixed in D', then _why_ do we think it is anyone but A and B's developers' responsibility to update to D' and distribute patched versions? If the packager tries to do it, the chances of diverging A and B due to some other incompatibility are too high.
Either you're running well-maintained software (OSS or commercial) and get the update in a timely manner, or you're running unmaintained software and don't get the update.
Only in the second case does it make sense to patch in a dynamically linked dependency. And I'm sure there are plenty of examples of this, but the real issue is running unmaintained software!
I am much happier keeping in close sync with timely releases from Go/Rust projects than I am with Debian et al.'s style of freezing the world.
The sad truth is that without package maintainers, too many people would be running horribly out-of-date and insecure software, as there is zero incentive to upgrade a working system.
I think if the software were labeled "insecure - requires update", people would update. If you don't know that something is insecure, then there is zero incentive.
The OP is talking about technical details like static linking and language ecosystems, but if you zoom one level out, this comment correctly pinpoints where the underlying problem actually lies: the distribution maintainers inject themselves into the development process of all the software they ship.
When A/B are not well maintained upstream, distributions can assist by updating their dependency D, but even when A/B are well maintained upstream, distributions still might monkey with them when D changes, even when that change to D actually breaks the software. To me, the root cause is that the distributions are effectively attempting to participate in the development of the software they ship, but do so by getting in the middle rather than by participating upstream.
As a former author of some well-maintained upstream software, I found their involvement made the software overall worse; but as a user of a distribution, I find their maintenance of otherwise unmaintained software sometimes helpful. In other words, I think the goal is admirable but the mechanism is wrong, and doing things like adding more dynamic linking only further enables the bad behavior.
I think it would help if distributions could snatch a piece of real estate in upstream software. Something like: "in every project, the debian/ root folder belongs to the Debian project and follows their rules". The packagers could then verify this folder and put their patches, build scripts, etc. there. This would help upstream communication a lot, I guess.
The problem is actually far worse than that. While the GP has what seems to be a reasonable solution, the issue is that you can't actually know what distribution is going to package your software, or what environment it is going to run in. There are code bases out there that still work decades later without any active maintenance, but if you need maintenance just to be able to build software in a new environment, something is wrong.
The underlying issue is that there are no standards for how to package software in a multi-language environment. If I want to go from a state where (require 'module-name) fails to a state where (require 'module-name) succeeds, there are a potentially infinite number of ways that could be accomplished; a single software project cannot ever specify all the possible ways of building software. What they can try to do is use uniform interfaces and standard patterns for building their software, in a way that delegates dependency management to an external system. It seems that it is hard for software developers to admit that other people know more about how and where their software will be running than they do.
Good engineering practice seems to dictate that dependency and environment management should be completely orthogonal to the development of an individual component or piece of functionality. The OP is three stories about what happens when the two are not kept orthogonal. The fundamental problem is that it is often easier for individual projects to make decisions that conflate the individual project with its dependencies (no longer orthogonal).
To my knowledge, there is not a universal or well-understood set of requirements that could be used to specify what a stable interface between an individual software project and its dependencies looks like. There are a number of candidates, such as Gentoo ebuilds, RPM spec files, etc.; however, I have not seen one that effectively accommodates all of them. Further, there are languages where the implementation (or even the design) makes it impossible to keep dependencies and individual projects orthogonal.
The end result of non-orthogonal systems is more work for everyone, more wasted CPU cycles, and worse security. Distros can't stop people from using languages that conflate the two, but they can tell them that they are on their own, and that the distros can't depend on components written in such languages in the core of the OS. To everyone pushing the "rewrite it in Rust" meme, this should be a wake-up call. The current design decisions in the language and limitations of the implementation make it less secure than C or C++, because swapping out dependencies is bottlenecked by the centralized primary development team, and maintainers and users can't take orthogonal action to fix an issue.
This organizational aspect could be outsourced to one dedicated organization. Not all distributions have to join. I think if 20% did, 80% of the problems would be solved.
>but as a user of a distribution I find their maintenance of otherwise unmaintained software sometimes helpful.
Sometimes. Often they break the software and leave the users high and dry, wondering why A or B doesn't work anymore. Literally my experience on Gentoo as a user for the last few years.
The distribution provides support for much longer than upstream does. Also, packaging is actual work. Just because some hipsters in some company consider it cool to release new features every 4 weeks does not mean that the volunteers of some Linux distribution can keep up with that. So if you want your distribution to work well, you should focus on workable and transparent interfaces.
If you are happy with the support cycles of your upstream and can live with a black box, on the other hand you don't need a distribution in the first place.
Besides, the whole "vendor everything, link statically" idea works well only for the leaves of the tree. Guess what Rust would do if LLVM were not a readily usable library but a bunch of C++ files used directly inside Clang? Thanks to the Rust mode of operation, it is impossible to do with the Rust compiler what they did with LLVM.
> on the other hand you don't need a distribution in the first place.
I think you're right about that, I personally want as little distribution as possible. FreeBSD ports or Arch AUR work well for me.
The idea of solidifying other people's software into a bundle and then maintaining it (with necessarily limited expertise) seems like a losing battle.
As someone who uses the AUR a lot, it's really the perfect example of how you just won't update software if it isn't done automatically by the distro package manager, particularly when you use -git packages.
> Correct me if I'm wrong (really) but isn't static linking mostly a problem for packagers? The less a package maintainer modifies the application the better IMO.
IMHO that's precisely why dependencies should be unpinned.
Let's say application A relies on dependencies B, C, and D, and dependencies B, C, and D depend on dependency E. Let's say dependency E has a critical security vulnerability and needs to be updated today to E'.
Let's say you have unpinned versions:
The packager for E updates E to E'. The end.
Let's say you have pinned versions:
The developer for B updates package B to depend on E', the developer for C updates C to depend on E', the developer for D updates D to depend on E', the developer for A updates A to depend on B', C', and D'. The packager for B updates the package for B, the packager for C updates the package for C, the packager for D updates the package for D, and the packager for A updates the package for A.
You'll notice that there's a timing issue here. The packager for C cannot move until the developer for C has done the work, the developer for A cannot move until the developer for B, C, and D have done their work, and the packager for A cannot do anything until everyone has completed their work. If, for instance, the developers for C all live in Texas and their power's been out for a few days, and when they get power back they're busy with other stuff for a while, it might take quite some time for C's developers to get an official package posted. But it's that important that A gets updated, because A is a network service with a port open to the internet and E is openssl or whatever. So now what?
In a perfect world, all software dependencies would have active, attentive, prompt maintainers, but it tends to not be that way. Lots of critical internet infrastructure packages have a maintainer who's just some random person in Nebraska, and they go on vacation, or lose interest, go to sleep at night, go to little league games on the weekend, some of them have day jobs. If we lived in a world where Apache can't be updated to use the latest dynamic library for openssl because the developer for leftpad is watching a movie and has their phone turned off, that's a very serious problem, and it's a crazy world I would not want to be a sysadmin in.
Certainly, maybe the packager for E is gonna be off this week, but a distro's packaging team tends to have a much easier time filling in for a maintainer who's away if the package is loosely coupled with the application's build process. 90% of the time, if a package in Gentoo requires an update, all you need to do is `mv foo-1.2.3.ebuild foo-1.2.4.ebuild`, `repoman manifest` and git commit+push. (I can't speak for other distros.)
The system isn't perfect, but IMHO it's much more robust to the unfortunate realities of the ugly, soft underbelly of the world than static linking is.
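To make the diamond from the scenario above concrete, here is a rough sketch in C terms (all names are made up for illustration): with E as a shared library, the fix is one drop-in file; with each consumer pinning or embedding its own copy of E, the fix has to ripple through every layer first.

    /* libE: the common dependency with the vulnerability (declared only;
     * in the shared case it lives in libE.so.1). */
    int e_parse(const char *input);

    /* B, C and D each call into E... */
    int b_do_thing(const char *s) { return e_parse(s); }
    int c_do_thing(const char *s) { return e_parse(s); }
    int d_do_thing(const char *s) { return e_parse(s); }

    /* ...and application A calls B, C and D. */
    int a_handle_request(const char *s) {
        return b_do_thing(s) + c_do_thing(s) + d_do_thing(s);
    }

    /* Unpinned/shared: shipping the fixed E' as libE.so.1 patches A, B, C
     * and D at once. Pinned/static: B, C and D each carry their own copy
     * of E's code, so each must release against E' before A and its
     * package can be rebuilt - the cascade described above. */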
Making sure that there are multiple people with the knowledge and access to produce new releases of E that incorporate security fixes in a timely way is definitely good. But I'm not convinced that distro maintainers are a good answer to that problem; the distro landscape is very fragmented, and distro maintainers are often not very closely involved with the packages they're notionally maintaining or aware of best practices / pitfalls that apply to that ecosystem. I suspect that something along the lines of the rust platform efforts might have a better chance of pushing out releases of all reverse-dependencies of some package that had a security flaw in a timely way, with minimal risk of breakages.
>> Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes.
Or because we don't want accidental or malicious security vulnerabilities to get automatically incorporated into the software.
This stuff works both ways. You don't automatically incorporate fixes, but you don't automatically incorporate new problems either.
The vast, vast majority of updates fix security issues. It's like not vaccinating in case you're the one in a million who has an allergic reaction. Supply chain attacks are rare, not the norm. We hear about such things (and only rarely at that) because they're exceptional enough to make the news.
Which means extra maintenance work: checking, for every piece of software that anyone uses, whether it uses another library it needs to be recompiled against and, if that fails, working out how to use the new version.
If there are automatic updates, at least it either works and is more secure, or it breaks automatically and unsafe software stops working.
Whether you prefer people to use MSIE6 because "it just works" or whether you prefer old sites that only worked with MSIE6 to break because it's no longer maintained, that's the trade-off you have to choose between.
As a security person, I'm obviously biased, I can only advise what I see from a professional perspective. All I was saying above is that automatic updates being considered a security risk is on the same scale of odds as considering vaccines dangerous -- in regular cases, that is: of course the advice is different if you're a special (sensitive) organisation or a special (immunocompromised) person.
Nah, I don't buy it. If it's "just" bug fixes (for which I might have implemented a hack that now depends on the bug), I prefer nightly builds with the latest (and re-pinned) dependencies available. Releases are just a re-tag after extra QA.
Fantastic article. I now have something to point people to when they ask "what is wrong with pinning?" or "what is wrong with static linking?" or "why can't you just use pip?" Michal has had to deal with some pretty crazy stuff the last couple of months ... scratch that, years.
The recent attempts to empower developers to distribute their own software means that there is now the potential for there to be as many bad security practices as there are pieces of software, because systematic security design that used to be managed by distributions has been pushed down to individual software projects, which have only a tiny view of the issues, and thus repeatedly make locally convenient decisions that are disastrous for everyone else. What do you mean someone is using a project other than ours that shares a dependency? Why would they do that?
One other thing to keep in mind is that from a security standpoint the approach that Nix takes is not good. Nix can deal with pinned dependencies on a per-package basis, but if those pinned versions are insecure and the developers don't have a process for keeping dependencies up to date, then users are sitting ducks.
Unfortunately pinning is a symptom of at least three underlying issues. The first is that developers do not properly assess the costs of adding a dependency to a project both in terms of complexity, and in terms of maintenance burden. The second is that many of the dependent libraries make breaking changes without providing space for the old api and the new api to coexist at the same time for a certain period so that developers can transition over (I have been bitten by this with werkzeug and pint). Mutual exclusion of versions due to runtime failure is a nasty problem, and pinning hides that issue. Finally it seems that at least some developers are engaged in the Frog and Toad are Cofounders continuous integration story, but with a twist, rather than deleting failing tests, they pin packages so that they don't have to see the accumulating cost of pulling in additional dependencies. Externalities, externalities everywhere.
IMO, pip is great, but it has exactly two use cases where it's warranted. One is where you need to be your own maintainer, e.g. you've developed a private web server to run your website, and you need to manage its dependency versions precisely. The other is for development purposes: it's really great to be able to use different versions of Python or test against different libraries than your distribution ships.
It's not (or shouldn't be) for shipping software to end users. That's the problem being complained about in the essay. This particular way of handling dependencies (use system libraries by default, but allow the user / developer to create an entire virtualized Python installation if they want one) is actually why I think Python handles this better than most other languages, including older ones.
I agree. The context was missing from my original, which is, "why can't you just use pip to install dependencies in production," which is effectively the answer that Michal once got from a PyPA maintainer when asking about setup.py install.
> I now have something to point people to when they ask "what is wrong with pinning?" or "what is wrong with static linking?" or "why can't you just use pip?" Michal has had to deal with some pretty crazy stuff the last couple of months ... scratch that, years.
Literally the only argument in this article against static linking is "it will take an extra couple hours for the distribution to recompile all affected packages and then require the user to download a larger update, meaning time to fix for a security issue will be negligibly longer"... since you seem to believe this article represents the strongest statement of your argument, this has actually moved me further away from where you want me to be ;P. (FWIW, the best argument I could make for dynamic linking involves memory efficiency for shared pages, the disk cache, and maybe the i-cache, not security.)
This I doubt. On a statically-linked application, all calls to functions are going through a regular function call instruction. When you dynamically link, every call to a function that might cross library boundaries (which on ELF systems defaults to every call that's not to a static function) instead calls into the PLT, which will itself dispatch through to the underlying function call.
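A rough illustration of the indirection being described; exactly how it shows up depends on the compiler, flags and platform:

    /* plt_demo.c - build as an ordinary dynamically linked executable:
     *   cc plt_demo.c -o plt_demo && objdump -d plt_demo | less
     */
    #include <stdio.h>

    /* File-local (static) function: the compiler emits a direct call,
     * or inlines it entirely - no PLT involved. */
    static int local_helper(int x) {
        return x * 2;
    }

    int main(void) {
        /* puts() lives in libc.so, so the call is typically emitted as
         * `call puts@plt`: a hop through the procedure linkage table,
         * which the dynamic linker points at the real puts. */
        puts("hello");
        return local_helper(21);
    }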
> It's worth noting that the Swift devs disagree with the Rust and C++ codegen orthodoxy in one major way: they care much more about code sizes (as in the amount of executable code produced). More specifically, they care a lot more about making efficient usage of the cpu's instruction cache, because they believe it's better for system-wide power usage. Apple championing this concern makes a lot of sense, given their suite of battery-powered devices.
It sounds like pinning dependencies is just done because we developers are a lazy bunch that just want to guard against the rare case of a breaking upstream change. I was burned too often in the past by breaking changes, or more often behavior changes, in transitive dependencies not to put up some defense. But I still don't pin versions in the main dependency files (Cargo.toml, Gemfile, or similar). I just have the generated lockfile and put that under version control, so all developers on the team get the same versions of direct and transitive dependencies. I define the dependencies with version ranges. Every package manager works differently here. I like Cargo's the best, since a statement like 'version = "1.2.1"' means give me a version of at least '1.2.1' but lower than '2.0.0'. One should never add lockfiles for library projects, as they need to be open for future updates. I also add Dependabot to all repos and let the bot inform me about updates.
No really, neither users nor developers care, nor should they. I've been using Linux on the desktop for well over a decade and I'm tired of seeing this plea for everything to behave exactly like C or scripting languages because every distro wants to be its own special snowflake and it would be too hard to adapt.
The world has changed and distros have to stop pretending it hasn't. Flatpak, Nix: those are better models that reflect our reality, where developers don't want to worry about ten thousand different distros with wildly different downstream configurations producing different kinds of bugs, and where users just want to get the darned app.
If you're worried about security you must always, first, work with upstream. Upstream should be the party responsible for its users, not you. Upstream not fast enough and you want to go the extra mile? Well then your packaging tool-belt should have support for patching a series of libraries in an automated fashion and rebuilding the dependency tree; and make sure that you return to upstream as soon as possible.
If you want your downstream to diverge from upstream because you want to act as a barrier as the old distros do, then you'll have to accept the fact that you're maintaining a fork and not pretend to upstream that you're distributing their work or that your bug reports are compatible. Otherwise, again, just limit yourself to distributing the latest upstream stable release with the bare minimum patching necessary for getting things to work with the distro's chosen config.
With some luck, after the Linux community comes to terms with the fact that the distro model must shift, we can begin to finally share some packaging efforts between all distros and we can leave much of the incompatibility bullshit in the past.
Imagine one day everyone just building from shared Nix derivations? Very ironically, it would look a lot like Portage and Gentoo's USE flags but with everyone building off those generic derivations and offering binary caches for them.
It's hard to read this and not feel like a little bit of it is some maintainers worried about losing control. When I get on IRC and say the word "pip" they certainly send me those vibes.
> I'm tired of seeing this plea for everything to behave exactly like C or scripting languages because every distro wants to be its own special snowflake and it would be too hard to adapt.
The same argument can be used the other way around: "I'm tired of seeing this plea for everything to behave exactly like Docker or static .exe because every software wants to be its own special snowflake and it would be too hard to adapt to software distributions."
Over the last several weeks I was working on an essay about this exact problem, including the connection between static linking and bundling. This one is so well done that I probably won't even publish it.
But I'll add this, for people who may not immediately see why this is important. I think that the real danger these new technologies represent is not inherently bad technology, but the possibility of ecosystem damage. The distribution / maintainer / package manager approach has proven to be an extremely reliable way to get trustworthy software. Many of us love it and want to see it stick around. And it's been possible because "upstream" developers in the open source ecosystem have been willing (or forced) to work with distributions to include their software. But this seems to be changing. A highlight from my essay:
'Many software projects are not good citizens in the open source ecosystem. They are working with a model of development and distribution that does not mesh well with how open source software is predominantly produced. ... These [new] languages are now developing their own ecosystems, with their own expectations for how software should be created and distributed, how dependencies should be handled, and what a "good piece of software" looks like. Regardless, an increasing amount of software is being built in these ecosystems, including open source software.
There might come a day in which open source is fractured. Two different communities create two very different kinds of software, which run on the same systems, but are created and distributed in very different ways. I begin to worry that the community I care about might not survive. Even if my ecosystem continues on its way, I don't want this split to take place. I think being part of the open source ecosystem is good for software, and I think having all the software you could want available within that ecosystem is good for users. If anything, this essay is a call to those who agree to be more careful about the software they create. Make sure it's something that Debian maintainers could be proud to ship. If you're working on a Rust program, for example, ask questions like "how can I make this program as easy for maintainers to distribute as possible?" If you have the ability to work on projects like Rust or Go, do what you can to give their applications the ability to easily split dependencies and support system provided libraries. Let's try to make sure the software ecosystem we love is around for the next generation.'
> The distribution / maintainer / package manager approach has proven to be an extremely reliable way to get trustworthy software. Many of us love it and want to see it stick around.
I disagree, it's proven to be inadequate for modern software development and that's why these new languages/ecosystems are springing up. The least reliable way to package and distribute software is by relying on traditional package managers.
> Do what you can to give their applications the ability to easily split dependencies and support system provided libraries
This is unrealistic. I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
> how can I make this program as easy for maintainers to distribute as possible
By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
The universe I'd like to live in is one where the only use cases for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS. My dream package manager would whitelist those system libraries and forbid distribution of any bundle that does contain the shared objects with the symbols it needs.
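For what it's worth, the kind of launcher described above is usually just a thin wrapper around the dynamic linker's environment knobs. A minimal sketch; the paths and library names are made up:

    /* launcher.c - hypothetical bundle launcher: prefer the libraries
     * shipped inside the bundle over whatever the host system provides,
     * then exec the real binary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        (void)argc;
        /* Search the bundle's own lib directory first... */
        setenv("LD_LIBRARY_PATH", "/opt/myapp/lib", 1);
        /* ...and interpose specific symbols ahead of the system's copies. */
        setenv("LD_PRELOAD", "/opt/myapp/lib/libcompat.so", 1);
        execv("/opt/myapp/bin/myapp-real", argv);
        perror("execv"); /* only reached if the exec failed */
        return 127;
    }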
> I disagree, it's proven to be inadequate for modern software development
Well, that's exactly why the OP (and my essay) are "anti" modern software development in many ways. The view is that we're moving away from the traditional open source ecosystem and methods of software development with these new technologies, which (to be clear) are good technologies, but were created mostly to solve problems that some large corporations have, not to solve the problems that the open source ecosystem has.
> The least reliable way to package and distribute software is by relying on traditional package managers.
Not sure what you mean by this, but it's entirely untrue in my experience. Anything I install with a package manager just works, 100% of the time. Stuff I try to get any other way is a shitshow, and the lack of "quality control" provided by maintainers speaks for itself. I mean, just look at Android apps or the Chrome extension store. Heaven forbid we go back to the days of curl | bash off someone's website.
> By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
I know you know this, but just to be clear, that's not a solution to the problem of "making things easier for maintainers to distribute", that's cutting maintainers out of the loop. The whole point of my focus on ecosystems is that this is something that I, as a user, don't want to happen.
The open source solutions are primarily for C/C++. A bit for Perl, a bit for Python. But they haven't really moved on. Java has been tacked on since forever. Same for .NET, JavaScript, whatever.
And if you wanted your program propagated to all major distros you'd have to wait a decade.
Nobody has time for that. Not corporations, not mom and pop stores, and I doubt many hobbyists.
That's not the problem. (*) The actual problem is that the distro maintainers want split packages (for security and so on), not vendored, and this requirement was already burdensome for many languages other than C/C++. If vendored packages were acceptable I believe people would have of course contributed them. Maintainers made this obstacle themselves (for a good cause, arguably) and it seems farfetched for them to then complain other languages are uncooperative.
* Was "isn't that a problem?", which was not what I meant to say.
> If you want to make your program popular then packaging is part of the process needed.
That is demonstrably not true. I'm not making a value judgment (maybe it SHOULD be true, I dunno), but it's not true. There's a lot of popular unpackaged software out there.
That actually clarifies some things about the debate for me. Thanks.
I'm not sure I'm convinced based on the semi-frequent posts from maintainers and security pros about the issues with vendoring dependencies for software that is widely deployed.
This "better way", since we lack a more concrete name. Seems to be really great if you're running a web app, or server software in your own company and can rebuild and run a rolling deploy pretty easily. For someone pushing software to users all over the world, and as one of those users the downside to allow every application to be responsible for updating this stuff seems pretty steep.
I'd disagree that the problems of large organizations are different from the problems of the FOSS ecosystem. Organizations just have a financial incentive to fix them; the FOSS ecosystem does not. If mutually incompatible dependencies and security updates breaking software weren't problems for both corporate and FOSS ecosystems, these new technologies wouldn't have needed to exist. They'd just use the existing platforms.
And mind you, this is not a corporate/open source split. The burgeoning ecosystems are also full of FOSS technologies doing new and exciting things, they just don't break when a dependency updates!
>Anything I install with a package manager just works, 100% of the time
I run into issues with packages weekly. So much so I've spent engineer days purging references to system packages. It's universal too - yum, apt, pacman, brew, macports, I have to make sure nothing tries to reference packages installed outside a local working directory for an application because of mutual incompatibilities. Maybe it's because I'm trying to write software that runs on multiple targets and not use software where someone else has already spent the time and money to resolve these issues.
> I know you know this, but just to be clear, that's not a solution to the problem of "making things easier for maintainers to distribute", that's cutting maintainers out of the loop. The whole point of my focus on ecosystems is that this is something that I, as a user, don't want to happen.
They should be cut out of the loop. Maintainers don't have a right to dictate what design decisions I put into my applications because they don't think it adds value (the value is: it doesn't just run on their distro!). Another comment in this thread put it better: maintainers shouldn't place themselves in the development process.
Which is exactly what developers have done, and what this guy is complaining about: developers have figured out a different approach to maintenance that works better.
I don't pretend to have an answer but I'm trying to listen to both sides so maybe I can formulate "the question" a little bit better and improve the discussion around this.
It seems like we have posts every few months where security professionals, maintainers, and sysadmins explain that the "developer usability first" approach to maintenance and dependency management is having massive consequences for our ability to keep systems secure, and even to know whether an affected version of a library is on a system.
That is a different problem that doesn't seem to be addressed at all in the current iteration of these tools. If you're aware of efforts to solve that problem in the Rust and Go ecosystems (since those are the ones cited here), I'd love to read about it.
Knowing whether a vulnerable version is somewhere in your dependency tree, and making sure it gets fixed, is absolutely being done and being made part of CI etc. (I don't know about Rust/Go specifically, but the JVM ecosystem is the subject of similar complaints and we're absolutely doing vulnerable dependency scanning, as well as things like "edge builds" where we bump every transitive dependency to the latest version and see if anything breaks). Nowadays GitHub itself will give you an alert without you even needing to do anything.
Frankly, most distribution maintainers seem to not know or care about how upstream software is built; they have an idea about what's "best practice" in the handful of languages they're using to build their distribution (which is mostly, like, C and Perl) and insist that they know best, without realising the rest of the world has passed them by.
I disagree. This is how you get the Google Play store or the "freeware" app marketplace. It sucks. As a user, I'm quite happy to continue using a traditional distribution even if it means I don't get to use a handful of flashy programs by developers that disagree with the concept of maintainers. So far that choice has been much more positive than negative for me, and I'm doing my best (by promoting the open source ecosystem) to keep it that way.
If you put in a bunch of crap that doesn't belong in an application (ads, for example), I'm glad I have a maintainer that can either strip this out thanks to the GPL (or BSD / MIT etc), or else choose not to include your app in the distribution at all.
This only works if "core" refuses to use Rust, so it's not sustainable. We already can't build Firefox and GNOME without Rust. Maybe Apache and curl next. And then?
> As a user, I'm quite happy to continue using a traditional distribution even if it means I don't get to use a handful of flashy programs by developers that disagree with the concept of maintainers.
As a user, I'm afraid that you don't seem to be representative of typical users. I would be happy to use a traditional distribution if it didn't break on updates, which is still not the case. It's clear that traditional distros have not been satisfactory even at keeping their own promises. I know they work hard, but ultimately the visible outcome says everything.
Distributions are not breaking software, they are just distributing broken software. Blame upstream for lack of testing. Or create your own distribution of flawless software and keep it up to date and flawless for the rest of your life.
Oh, sure. Software is broken; you can't fix that. Traditional distros just happen to (falsely) believe that they can somehow fix it at their level. What we need instead is a distro that is resilient to software breakage, not flawless software. Does my hope look that unreasonable to you?
Clearly you missed the disclaimer of warranty in the licensing terms.
Packagers are welcome to maintain their own patches, or their own fork, if they like. But they don't have any right to tell upstream what to do or demand particular guarantees from upstream.
It's temporary, I'm sure. When broken software or a weaponized patch slips through the weak fence of maintainers and breaks critical infrastructure, Congress will vote for something to stop that.
I’ve been on Debian since Potato, so I totally see what you’re saying. But...
> that's cutting maintainers out of the loop
Is this necessarily a bad thing? The market has seen the need to fill a hole, and it seems to be working.
I first started with Slackware, and dependency nightmares are what got me into Debian in the first place. Although Debian is nice because of its slow and stable base (which makes me happy for production), I've recently moved to Arch and have been so happy, as it's brought back Slackware's idea of getting as close to upstream as possible and it handles dependencies! And to be honest, I'm loving it. As an added bonus, I'm getting more and more surprised at how many of the packages I've installed are Rust apps.
So, coming back to your comment:
> that's cutting maintainers out of the loop
With systems like Arch that get us closer and closer to upstream, are maintainers the unnecessary middlemen? Of course they’re not entirely redundant, but maybe a new model of distros like Arch will be more commonplace in the future
Thoughtful comment, thanks. I'm an Arch user as well and agree broadly with its approach to a desktop operating system. (I.e. stick as closely to upstream as possible.)
That said, I disagree that this means maintainers are unnecessary middlemen, even though their role on a distribution like Debian is obviously more prominent. The essay I linked to in my top level comment is actually by an Arch maintainer, explaining why they still see maintainers as playing an important role. http://kmkeen.com/maintainers-matter/
>With systems like Arch that get us closer and closer to upstream, are maintainers the unnecessary middlemen? Of course they’re not entirely redundant, but maybe a new model of distros like Arch will be more commonplace in the future
Arch is an old distribution, very much in the same class as Fedora, Debian, Gentoo and all the traditional ones.
What makes you think Arch makes maintainers even slightly redundant?
We still deal with security issues. We still need to figure out which Go software uses a library with a CVE, with no tooling to help (go list and grep go a long way). And we still need to deal with pinned dependencies in upstream projects.
That is nonsense and you seem to spread this misinformation in a lot of places. You should also add a disclaimer that you are part of the Arch team.
AUR: Anyone can create an account and upload PKGBUILDs. There are no checks at all. AUR users should verify whether PKGBUILDs are not malicious. In practice, a lot of people use things like yaourt to install packages from the AUR without verifying the PKGBUILDs.
nixpkgs: anyone can contribute a PR with a new package, package update, or package modification. However, changes only get added to nixpkgs after someone with commit privileges verifies the PR and merges it. Also, a common misconception is that nixpkgs package maintainers can merge changes. This is false, only a much smaller set of committers can merge changes in the actual nixpkgs repository.
nixpkgs is more like the Arch Community repository, where committers are long-time contributors with a track record of high-quality contributions. Parts of nixpkgs are like Arch Core/Extra, because they are marked using the GitHub codeowners mechanism and changes are generally not merged unless approved through the code owners.
Disclaimer: I am a nixpkgs committer, former Arch user and AUR contributor.
>That is nonsense and you seem to spread this misinformation in a lot of places. You should also add a disclaimer that you are part of the Arch team.
The AUR comment is unfair, uncalled for, and adds nothing to the conversation. I apologize and hope our previous conversations have been more productive. It was meant more tongue-in-cheek than as some grand claim about the quality of nixpkgs, and stems mostly from the frustration of the entire vendoring issue.
I don't see your point. In fact, you'll most likely find that distros that follow upstream more closely than the "slow and stable releases" model will get their patches as soon as upstream fixes them.
This assumes an ideal upstream: This is not always the case.
If someone publishes a CVE for a Go or Rust library, it's not always the case that the project is well maintained or that the dev cares to update the dependency. Even if they did, there are no guarantees the upstream will decide to publish a minor release just to update dependencies. Because that is what vendoring dependencies gets you.
Instead of applying one patch to a shared library I'd need to hunt down all upstreams utilizing the library and manually patch between 10-140 packages independently and submit them upstream.
If upstream is not keeping their software up to date, why are you using that software? Imagine if Google stopped updating Chrome. Would you keep using Chrome?
Traditionally, package maintainers have kept dependencies up to date. This has changed with vendored dependencies, where upstreams have to do the work. They are not always up for that work. This isn't strange or weird, and the comparison to Chrome and Google doesn't make much sense.
The problem is that the modern practices are being adopted to meet the changing needs of this code, which weren't an issue in the enthusiast linux and BSD communities, but are an issue when open source code is being put in mission critical professionally managed production environments on AWS, or into cars and appliances.
The userland is much more complex now, so freezing dependencies and bundling them in order to have a smaller number of test cases may not have been necessary when the environment was smaller and simpler. The reliability expectations were lower, and there weren't big money professional support contracts which required you to validate your application against a set of well defined environments. All that has changed.
Why are you comparing distro package managers to the Play store or the Chrome extension store? You should be comparing them to npm, pypi, etc. That's clearly the context of this discussion. No one is saying that the Chrome extension store does a better job than package managers. The language ecosystems do an amazingly better job.
> The universe I'd like to live in is one where the only use cases for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS. My dream package manager would whitelist those system libraries and forbid distribution of any bundle that does contain the shared objects with the symbols it needs.
This idea seems to be predicated on a belief that OS vendor APIs and cryptographic libraries are the only attack surface for serious user-affecting software exploits.
They're obviously not, but they're the ones that package managers can help mitigate automatically. The rest is going to be up to the developers to patch.
Some of the things you mention are not incompatible with package managers. Have you considered Nix?
I actually agree with the parent post in terms of strongly preferring package managers to other means of distributing software. I've always found Linux much easier to admin than other OSes simply because of package managers.
I was really pumped after playing with Nix for a bit, but got bitten twice in the first day. I tried to run my work project using Nix. It needs Python's python-prctl library - that doesn't work in Nix. So that's a dud. Next I tried to use it as a Nim environment - Nim is unable to compile anything in Nix due to some missing (bundled) glibc symbol. (Nim works nicely in my standard installation.) The cool "export full container" thing mentioned in the tutorial failed for all cases, even the simplest. So I am kinda disillusioned by Nix.
> I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
I agree. About half the issues I've had with dependencies have been due to distributions fiddling with upstream for some reason or another. That's probably the main reason I like Arch (which has a policy of just following vanilla upstream, though they aren't immune: 'python' being Python 3 is probably the biggest pain point).
>I disagree, it's proven to be inadequate for modern software development
You're not disagreeing with the post you responded to; you're just stating a different priority.
hctaw said the distro/maintainer/pm approach is extremely reliable at producing trustworthy software. That means trustworthy to the user. It says nothing at all about how hard producing that software is for the developer.
You are saying that the distro/maintainer/pm approach makes it harder for the developer. That's true. But it doesn't contradict the above at all.
And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that. If instead I have umpteen zillion different statically linked applications installed, each of which packages all of its own dependencies, then instead of just relying on my distro to push security fixes to shared libraries that everyone uses, I have to rely on every single one of those developers to do it for their own packages. And most of them won't do it, or they'll do it when they get around to it instead of when I, the user, need it.
> The universe I'd like to live in is where the only use case for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS
This won't work either, because those are certainly not the only places where security flaws can happen that I, the user, need a fix for ASAP.
> And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that. If instead I have umpteen zillion different statically linked applications installed, each of which packages all of its own dependencies, then instead of just relying on my distro to push security fixes to shared libraries that everyone uses, I have to rely on every single one of those developers to do it for their own packages. And most of them won't do it, or they'll do it when they get around to it instead of when I, the user, need it.
This is my main worry as well. We are currently in the early, easy period of this new development paradigm, where developers are constantly releasing new code and fixes are easy to deploy. I worry that in 10 or 15 years, when a security bug is found in a critical imported function, the developers aren't going to be around anymore to fix it; they will have moved on to the next new, hot language, and will have as much interest in maintaining their old Go/Rust code as developers today do in maintaining their old C code.
As a user, my response is simple: I don't use software that's built that way. Outside of code that I write myself, I simply refuse to use software that's not accompanied by a distribution and maintenance infrastructure that I trust. For most software, that means it's packaged by my distro. Some big players might be able to convince me to take their software from them directly, but they will be very few, because there are very few big players that are as reliable as my distro, and that's the standard they have to meet.
> Outside of code that I write myself, I simply refuse to use software that's not accompanied by a distribution and maintenance infrastructure that I trust. For most software, that means it's packaged by my distro.
Okay, that's a reasonable approach, but distros are essentially saying that, for new software, they can't continue to maintain that standard; they're just taking software as it comes, without securing it. I think you will find that there are lots of small utility programs and libraries that you won't have available to you in 10 years with this approach. YMMV.
> I think you will find that there are lots of small utility programs and libraries that you won't have available to you in 10 years with this approach.
Then I'll either find an alternate source that has enough reliability to satisfy me, or write them myself, or do without.
(Or I'll end up building what amounts to my own distro. Which is something I have indeed thought of doing, because my preferences are rather idiosyncratic.)
> And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that.
Not my experience at all. The distro maintainer generally takes significantly longer to push out a fix than the upstream developer.
> The distro maintainer generally takes significantly longer to push out a fix than the upstream developer.
Of course this is true in a sense, because the distro maintainer has to wait for upstream to push a fix before they can package it.
However, for the upstream developer, "pushing a fix" means "pushing updated source code". For the distro maintainer, "pushing a fix" means "compiling the updated source code and packaging the resulting binaries for all supported versions".
There are some upstream developers who could probably accomplish the latter at least as fast as distros do, but not many. But the latter is what I, as a user, need.
> it's proven to be inadequate for modern software development
Sure, because it's not meant for software development. It's meant for users to run software.
Software development packaging/distribution is pretty bad. Most languages do it their own way and never seem to learn the lessons of previous ones. And on top of that, they encourage poor habits, like writing your own module that's the same as somebody else's with one new function, rather than contributing to the existing module or writing an extension for it.
> I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
Probably this is because both you + the library developer are not coordinating with the distributions on how you release your code. Distros get a lot of flack for breaking changes, but they're working from software released by developers, and rely entirely on their own user base to test changes. If developers cared about their software working they'd be more involved in its packaging & distribution.
> The least reliable way to package and distribute software is by relying on traditional package managers.
Checks and balances produce reliability. Having a packaging process and people distinct from the code author validating and enforcing it, most certainly produces much more reliable, secure and stable packages and distributions.
Sure, it's slower. Slower is actually a feature in this use case.
Having anyone push out their code to the world without any constraints or care, breaking compatibility day to day, is certainly faster and easier. But reliable? Certainly not. This is the culture that regularly produces things like the leftpad debacle and assorted malicious packages.
It really doesn't. I've worked on upstream projects where a significant fraction of all bugs reported were created by distributions screwing up packaging and patching in ways they weren't at all qualified to understand and frequently led to non-obvious failures.
When we tried to work with them to fix this, about half the time they flamed us and quoted distro 'policy' as a reason not to fix their bugs, so we just refused to accept bug reports from anyone using those packages anymore.
The fact is that a lot of old-school Linux distributions are built by people who have only a very vague understanding of the software they're packaging, and frequently are closer to the sysadmin side of things than the large-scale software development side. It makes the relationships very frustrating and that's why proprietary software vendors invariably opt-out of distro packaging. Even with apps statically linked to the max possible level Linux users generate disproportionate levels of support tickets due to the general flakiness of the distros they use, so allowing them to modify tested software even further is a losing proposition.
Basically the whole concept of a Linux distribution is obsolete, fading away and irretrievably broken. Hence the proliferation of containers.
> By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
So, you propose to ship your own OS, as a single image, for your application? What stops you from doing this? Drivers? Then ship your own hardware, like smartphone vendors do.
> Two different communities create two very different kinds of software, which run on the same systems, but are created and distributed in very different ways.
This is where the disconnect is coming from. The distro maintainers are coming from a world of multi-user systems where backwards compatibility and updating deps without disturbing a user's workload / forcing them to recompile is paramount.
Go (and a fair amount of Rust/Python work) comes from the land of CI/CD and, to a lesser extent, monorepos. When you are rebuilding the world above a bare minimum of the OS literally on every commit (or at least several times per day), it's easier to reason about the code that is running if you can look at the commit a binary was built from and know exactly what's inside (including all deps).
I agree. I think the difference has been that until recently "the land of CI/CD" and so on has been certain segments of the corporate world, and not how typical open source developers did things*. So when the former developed new technologies and new languages, they created build tools for them that anticipated being used in the ways that they usually produce software.
The "problem", in the sense that it's a problem, is that these languages and related technologies are all pretty good! And so it's understandable that many developers who would traditionally be in the open source ecosystem want to use them. As a result they end up creating software that can't easily be shipped in traditional distributions. Ecosystem fragmentation is the unavoidable result.
* By typical open source developers, I mean the sort of developers (and their development practices) that produced most of the software on my computer. I don't mean Firefox: Mozilla and Google have much more standard corporate development practices despite both producing quite a bit of open source software.
Although continuous integration started in proprietary software, it's been present in Free Software for at least two decades. Netscape may well be the second or third medium-large software outfit to do continuous integration the way it's done today (we know Microsoft had a team doing this by hand every single day for Windows NT, but that's completely insane), because some of its team had experienced this approach elsewhere and knew they needed it if they wanted to ship software that actually works. When Mozilla was created, Tinderbox (that system), along with the Mozilla browser (and so today Firefox) and Bugzilla (a bug tracker), were freed.
I know it probably seems like last week, but that was more than twenty years ago.
> The "problem", in the sense that it's a problem, is that these languages and related technologies are all pretty good!
Yes; and frankly the development ecosystem for making software for Linux & friends on top of apt etc. is terrible - at least from the perspective of a modern professional software engineer. The assumption is C, and C programs have no package manager - so of course dependencies get bundled / vendored sometimes, when the alternative is linking to potentially out-of-date dependencies in apt. Autoconf/automake is awful to learn and understand. CMake is better - but it's horrendously complicated, because it tries to solve the impossible job of paving over all the junky custom compilation scripts that came before. (And it still has no cargo equivalent for actually fetching your deps.)
And then, to work around all of that, each distribution will make weird, custom, maybe buggy patches to your software before adding it to their package managers. (Which has caused some high-profile bugs and security issues a number of times.) Now when there's a bug, nobody knows whose fault it is!
This worked in a world when there wasn't much software, when releases were rare and when most programs only had one or two dependencies. None of these properties are true any more.
Rust, Go, Python and Node.js don't fit well with Linux's package managers. The obvious alternative would be putting every crate, gem, pip package and npm package into apt, rpm and all the rest, and keeping them up to date with every version. But let's be real - that would be horrible. Apt et al. aren't (currently) up to the task. (Can you imagine every npm package needing a maintainer in apt alone? I can just imagine the GitHub issues: "I'm on Debian stable and this transitive dep you're using only has version 0.1 available, from 6 years ago. What do I do?". Yikes.)
I'm sympathetic to the argument that modern million dependency software development has its own problems; but right now it (sadly) has no competition in terms of ergonomics and build reliability.
> This worked in a world when there wasn't much software, when releases were rare and when most programs only had one or two dependencies. None of these properties are true any more.
I don't think this is even the problem.
It's that upstream maintainers have stopped worrying about compatibility.
Once upon a time you would have regular minor releases of some package. 3.0.2, 3.0.3, 3.0.4, but they were all backwards compatible. If you had version 3.0.4 and some software that was built against 3.0.2, it still worked against 3.0.4 because the only difference was that things were added or compatibly improved, not removed or incompatibly changed.
Version 3.0.x wasn't compatible with version 2.9.x, but then the package maintainer for the distribution only has to package versions 3.0.4 and 2.9.16, i.e. suitably recent minor versions of each compatibility revision. Compatibility revisions so old that nobody relevant uses them anymore can be ignored, so they only had to package two or three incompatible versions which together are compatible with everything in active use.
The problem today is that everything is a compatibility-breaking change, so there are dozens of releases from this year alone that are all mutually incompatible and would have to be packaged separately. And that doesn't scale.
So, you deliberately chose a distro intended for users, with low version churn, instead of a distro intended for developers, e.g. Fedora, which sometimes even ships pre-release versions, and now you blame ... apt? Just curious, what are you using for coding? MS Word or Excel? For example, Linus uses Fedora/MATE/Emacs.
It's relatively easy to convert packages between different packagers/distros. Automatic converters exist for deb/rpm/pip/cpan/ctan/cargo, so it's easy to convert all existing packages into one packaging system and drop all of them into a huge mono-repo.
Yes, it's much easier to throw a new version into the repo in many non-Linux repositories. It's the equivalent of rolling distros, such as Arch, or the development version of a distro, such as Rawhide in Fedora or Sid in Debian. However, it's also much easier to: break the world and make the news, distribute keyloggers and steal passwords and keys, forget to backport security fixes to users of older versions, pivot into a completely orthogonal thing, etc. No code reviews means no responsibility.
If you want your software to work everywhere, you either need to take responsibility for making it build everywhere (yes, including old versions of Debian which don’t support the versions of your dependencies you need) or you punt that work to someone else - in which case your software simply won’t work on lots of computers. From the perspective of an upstream maintainer, the status quo is pretty awful.
There’s a reason people are turning to docker - because the “portable executable” idea on Linux is so often broken by weird incompatibilities between libc versions, or by some important dependency being missing from or broken on some users’ systems. Automatic package translation suffers the exact same problem - a dynamically linked binary you build on your computer often won’t run on my computer.
If you want your software to work everywhere but want to avoid a compilation step, then use Perl/Python/PHP with bindings to Qt/GTK.
Linux has perfect backward compatibility; I'm still able to compile and run a 70-year-old app, developed on a completely different OS and processor.
IMHO, you think that your app/lib binary will work flawlessly on all combinations of OS/processor without recompilation, which is not true. Nobody promises that. It's by design.
I disagree that rust, go or python inherently do not work with linux package managers; instead there's a culture within subsections of those communities that does not value stability (I'm not sure about the nodejs community though), which causes these disagreements. C isn't immune to this (and never has been)—scientific codebases are infamous for their lack of care around stability (and IMHO one of the reasons for scientific python's success over alternatives has been this stability).
I'd also disagree with the idea that the alternative is reliable—try building and modifying a project that hasn't been touched in 6 months, and see how reliable that is. With your tree of dependencies (absent specific projects which abstract over a set of unstable dependencies and hence implicitly stabilise them, e.g. SDL), the stability of your project (whether it is a library, application or framework) is set by the least stable dependency you have. Increasing your dependencies increases the risk, but if you have an ecosystem which values stability, larger dependency trees should not see a significant increase in risk (personally, I'd love to see the scientific rust ecosystem achieve similar stability to python's).
That's not to say linux distros are perfect (change can be slow ;)), but there are lots of little things they do get right (how many projects handle updating configuration files correctly—that kind of thing is built into distro tooling), and they enable ecosystem-wide changes more than the alternative does (e.g. https://reproducible-builds.org/).
This makes me wonder if there's a Linux base system suitable for servers that embraces the newer approach. That is, a minimal base system that's built for immutable container images, providing just what's needed to bootstrap the current generation of language-specific build and package systems. The Alpine Linux Docker images might be a good choice for now, but IIUC, Alpine Linux itself still embraces the older distro approach.
Maybe the Nix package manager / NixOS is what you're looking for? I think it takes the best features from both worlds.
Every package installed with Nix is isolated into content-addressable* directories, so for example, my install of Firefox is located at /nix/store/c7pmng2x05dkigpbhnjs8fdzd8kk31np-firefox-85.0.2/bin/firefox. This is pretty inconvenient to use directly, so Nix generates a profile that symlinks all your packages into one place (eg. /run/current-system/sw, ~/.nix-profile), and then environment variables like PATH can just include <PROFILE_DIR>/bin.
With this approach, I can have multiple versions of the same package installed simultaneously, without them conflicting with each other. Like in a traditional distro, any dependencies that are shared between packages aren't duplicated, but if a package needs to explicitly depend on a different version, it can.
Also, because Nix is designed as a functional package manager for building packages from source (even though it has a binary cache), you can trace back exactly what sources were used to build your package and its dependencies, all the way back to the bootstrap binaries used to build any self-hosting compilers (gcc, rust, openjdk, ...)
* Most packages use a hash that's generated from the inputs used to build it, rather than the output that's generated.
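To make the layout concrete, here is a toy Python sketch of the idea described above: derive a directory name from a hash of the build inputs, install into that per-package prefix, and expose the result through a profile of symlinks. This is only an illustration under my own assumptions, not Nix's actual implementation; the paths and helper names are made up.

    import hashlib, os

    STORE = "/tmp/toy-store"          # stand-in for /nix/store
    PROFILE = "/tmp/toy-profile/bin"  # stand-in for a profile's bin directory

    def store_path(name, version, inputs):
        """Derive an input-addressed directory name (roughly the idea behind the store hash)."""
        digest = hashlib.sha256(repr((name, version, sorted(inputs))).encode()).hexdigest()[:32]
        return os.path.join(STORE, f"{digest}-{name}-{version}")

    def install(name, version, inputs):
        """'Build' a fake package into its own store directory and link it into the profile."""
        prefix = store_path(name, version, inputs)
        os.makedirs(os.path.join(prefix, "bin"), exist_ok=True)
        binary = os.path.join(prefix, "bin", name)
        with open(binary, "w") as f:                 # placeholder for a real build step
            f.write(f"#!/bin/sh\necho {name} {version}\n")
        os.chmod(binary, 0o755)
        os.makedirs(PROFILE, exist_ok=True)
        link = os.path.join(PROFILE, name)
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(binary, link)                     # PATH only needs to contain PROFILE
        return prefix

    # Two versions coexist because they hash to different store paths;
    # the profile link simply points at whichever one is currently selected.
    print(install("hello", "2.10", inputs=["gcc-10"]))
    print(install("hello", "2.12", inputs=["gcc-12"]))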
That's https://cr.yp.to/slashpackage.html which is dated at least 2001. It's not quite the same as the Nix approach though, which also tracks all direct and transitive dependencies.
> That is, a minimal base system that's built for immutable container images, providing just what's needed to bootstrap the current generation of language-specific build and package systems.
At least for the first half, that sounds sort of like Fedora/Red Hat CoreOS[0], the predecessor CoreOS fork Flatcar Container Linux[1], or the Amazon distribution BottleRocket[2].
Fun thing about BottleRocket relevant to this thread: it uses Cargo as its build system. It's really wild and very interesting and more people should know about it, IMHO.
> it's easier to reason about code that is running if you can look at the commit that a bin was built from and know exactly what's inside (including all deps).
Believe me, it's usually the opposite.
Lack of proper releases, testing, and versioning results in unending checkout-fu to figure out what commits for each of 20 libraries will work for each other.
The idea is plainly stupid, without any redeeming qualities.
The entirety of this cargo cult hinges on the fact that if the people who have been calling it a genius invention for the last 8-10 years admitted it isn't one, they would incur a major loss of reputation and credibility.
> Lack of proper releases, testing, and versioning results in unending checkout-fu to figure out what commits for each of 20 libraries will work for each other.
Admittedly I'm basing this on my experience in a very large monorepo environment, but there's no figuring out which commits will work with each other. Every commit with every library will work, otherwise it doesn't get committed.
Yes, this involves massive CI infra and tooling to aid in refactoring.
You want to make a breaking change to a lib? Great, it's on you to update every piece of code that calls it.
It's great when you can control every piece of your infra, but I totally get how it's unfeasible (and maintainer hell) for the distro community.
Distros are great for off-the-shelf software, especially if you don't care too much about which version you get. When versions matter, you quickly get into dependency hell. So long-lived software tends to stabilize, and then remain unchanged.
K8s, Go and even Ruby tend to change and evolve. It's usually a bad idea to pull such software from distros, even when it's available.
The means to get software is simply too different, and it's a non-problem for everyone but completist distro maintainers.
Are you actually talking about static linking here, or just venting your spleen about sloppiness in software engineering practice more generally?
Because tracking dependencies in source control (specifically, checking in lock files) is tremendous for reproducibility. It means that bisecting through the commit history to find when a problem began is not just bisecting through the local source code, but also through the specific versions of every dependency.
So regardless of whether the issue you're investigating is in the package, a dependency, or an unexpected interaction between the two, you're able to find the first commit that introduced the issue in O(log(commits)) time, rather than needing O(commits * (num dependencies * dependency versions)) time.
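As a sketch of why this works, here is a hypothetical bisection loop in Python: because each commit carries its own lockfile, checking out a commit also pins every dependency, so one binary search covers both the code and the dependency versions. The commands and file names (requirements.lock, the test path) are placeholders, not anyone's actual tooling.

    import subprocess

    def run(*cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    def is_bad(commit):
        """Check out a commit, install the exact deps recorded in its lockfile, run the test."""
        run("git", "checkout", "--quiet", commit)
        run("pip", "install", "--quiet", "-r", "requirements.lock")  # hypothetical lockfile name
        return run("pytest", "tests/test_regression.py").returncode != 0

    def first_bad(commits):
        """commits is ordered oldest-to-newest; assumes the oldest is good and the newest is bad."""
        lo, hi = 0, len(commits) - 1        # O(log(commits)) probes, dependencies included
        while lo < hi:
            mid = (lo + hi) // 2
            if is_bad(commits[mid]):
                hi = mid
            else:
                lo = mid + 1
        return commits[lo]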
> you're able to find the first commit that introduced the issue in O(log(commits)) time, rather than needing O(commits * (num dependencies * dependency versions)) time.
Try the oldest and newest compatible version of your code and the oldest and newest compatible version of each dependency. This is O(num dependencies), which is generally small N and you can often guess which to try first. Now do each version of the code or dependency that actually made a difference. O(log(N)) on the versions of that code. You're at O(num dependencies) + O(log(N)), not O(commits * (num dependencies * dependency versions)).
Also, doing it the other way can often lead to frustration. The actual bug is introduced in commit #1234 but is timing dependent, then a dependency is upgraded from version 2 to version 3 in commit #5678 which tickles the bug, e.g. the new dependency version moved around some cache lines. Now you're looking in entirely the wrong place at an innocent dependency.
Whereas if you notice that the bug exists with dependency version 3 and the latest commit and then do binary search on all the commits while holding the dependency at version 3, plausibly the bug shows up between commit #1233 and #1234.
While version 3 of the dependency is innocent, commit 5678 is not. Something went wrong in the interaction between the code and its dependencies in that change and discovering that change quickly is valuable.
From there you can start stepping through the code, or simplifying the situation to get a minimal repro of what's going on, or even bisecting with dependency version 3 held constant to see if that's diagnostic.
In my experience, with a bug that's so timing sensitive that some cache lines moving around will trigger it, you're likely to discover pretty quickly that something weird is happening as you try to get a more minimal repro.
Meanwhile, if you're not tracking dependency versions, such a test is likely to appear weirdly flaky. You're trying to track down this failure but something else comes up and you have to set it aside for a few days. When you come back to it, you can no longer get the test to fail because an untracked dependency change has moved cache lines around again. Which change? Which dependency? Since it's not in source control, you've got a painful process of guessing plausible combinations of recent versions of all transitive dependencies.
> While version 3 of the dependency is innocent, commit 5678 is not. Something went wrong in the interaction between the code and its dependencies in that change and discovering that change quickly is valuable.
The trouble is that this will tend to point the finger at large changes that jostle many things around at once and become a rabbit hole rather than the two line commit with a typo that actually caused the problem.
The main advantage you're putting forth is to know the versions of each dependency needed to reproduce the problem. But you can get that from the person reporting the bug. You can add a switch to your software to output the versions of every dependency it's using and then it's there in the bug report. And once you have a combination that can reproduce the bug, the process of experimenting with things to identify the cause is basically the same either way.
Additionally, when you have languages with rich library ecosystems, the OS kind of becomes irrelevant: the platform is the language ecosystem.
Just to pick Go as an example (so as not to get lost discussing VMs and such), it doesn't matter if I am targeting bare metal, Linux, Windows, IBM z/OS, AWS special cloud runtime, whatever.
As long as the Go code is the same, and someone has done the low level runtime support, it is a compile away and done.
Finally by pushing containers no matter what, the Linux community has made this even easier.
What if there's another software in another language you want to interoperate with? What if you want to avoid containers with their complexity and dubious security record?
Fragmentation is fundamental to Open Source. Fragmentation brings mutation, and mutation brings evolution. Fragmentation is a virtue of open source. Not a vice. The world simply doesn't value security above convenience, as I intuit you may.
I think the natural tendency of all organic evolution is to take the path of least resistance.
It will not be effective to ask developers to do more work. That is going against nature. The packaging ecosystem has to compete and win against its competitors. That is the true way of Open Source, of organic systems.
It is likely that the packaging system you are advocating for is more difficult than the competitors you mentioned. Reducing that friction is perhaps a more effective place to apply your focus. What are these competing ecosystems doing that makes them more attractive? Why is your ecosystem losing users? How do Flathub, Snapcraft and NixOS fit into this view? How do the Mac App Store and the Windows Store fit into it? What is the one true way to package an application?
The distro model did not work, which is why the other model took over.
I used to try to religiously follow the recommendation only to install Python packages which had been repackaged by Debian. Fine, but it meant that you couldn't use any even slightly obscure package, nor one that was younger than some timelag, which ranged from a few months to a couple of years.
Inevitably, you want to pip install something. Then the repercussions of mixing Debian packages and pip packages are a whole new set of problems. And you can't get anyone to look at your problem, even if it's a common issue which the Debian packager could fix or workaround, because 'pip is not supported, you should install this via `apt install python-foo`'.
The best solution is and was to only use pip packages, along with some form of isolation from the wider system, whether virtual environments, containers or what. Python now has extremely good native tools to work this way, and so do most modern languages. I only work like this now, and so, it appears, do all the maintainers of my dependencies and my transitive dependencies. Development cycles are much faster, and it all just works.
Would it be feasible for Rust to ship "re-link" scripts? That is, when a cargo dependency gets updated, the distribution can check whether a package's code needs to be re-linked? Similarly, when distributed code (say libssl) changes, the Rust packages can be re-linked by the distribution on-site?
This article doesn't make a good case against static linking, and the author doesn't seem to understand what vendoring is either:
> Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it.
No, vendoring means including a copy of the source code of dependencies in your repo. You can bundle dependencies without vendoring them.
The only argument presented against static linking is that when a library is updated rebuilding dependants takes longer (and people will have to download bigger updates but I doubt many people care about that).
That may be true, but is it really that big of an issue? I somewhat doubt it (for sane distros that don't make users build everything from source themselves anyway).
The author clearly doesn't like how people do development these days but hasn't stopped to think why they do it like that.
> The only argument presented against static linking is that when a library is updated rebuilding dependants takes longer (and people will have to download bigger updates but I doubt many people care about that).
Not really; the argument is that instead of rebuilding one library you now have to rebuild hundreds of applications to get the benefits of whatever the library update was for... assuming upstream even cares enough to bump the version number for the who-knows-how-many libs they are vendoring, because they have other stuff to do which is way more important.
Right but that can be done automatically for static linking just as easily as it can for dynamic linking. So the only difference is time and downloads.
I think you're conflating static linking and bundling and vendoring a bit, like the author is. They're all different things.
Here is the solution to this Debian/Gentoo/<other true free software distribution> packager's dilemma. Packagers realize that their numbers are small and they can't keep up fixing all the modern big software (BS) projects that go against their philosophy.
They define the core of the OS that they can keep managing in this traditional way (kernel, basic userland, basic desktop, basic server services, basic libraries, security support for all that).
"But people want Firefox, Rust, Kubernetes, modern OS has to provide them".
No it doesn't. Let the new complicated juggernaut software be deployed and configured by the users, according to developers' instructions, using whatever modern mechanism they prefer, containers/chroots/light-weight VMs, whatever. Packagers can then forget about the nasty big software and focus on quality of the core OS. Users will have all the newest versions of BS they need from the developers directly.
There is no reason to keep fast-pace, ever-changing and very complicated programs/projects in the main OS distribution. It only leads to lots of OS-specific packaging work with mediocre and obsolete results, and that leads to discontent users, developers and packagers.
Exactly. When you want to install one of those fast-paced pieces of software on your system, you're often dissatisfied with the old version coming with your OS anyway.
I see no problem with bumping the versions of a few packages, rebuilding them, and installing them via the system package manager. Are you a developer or a user? Of course, some software may not work after that, but the package manager never gets in your way, unless you have no idea how to use it.
I'm a software developer and a user of package managers. A package manager that allows me to only have one version of a software package installed will inevitably get in my way if I need two programs that need different, incompatible versions of that dependency. It happens quite a lot in my experience.
Basically, you want to put two versions of the same file(s) into one place.
If you need two versions of almost the same set of files, then choose a different name for the package, e.g. package-2; choose a different base directory for the package files, e.g. /usr/share/package-2; and choose different names for the binaries, e.g. /usr/bin/binary-2. It's not magic. Just look at existing examples, e.g. python2 and python3.
You can create your own repository, where you will be the maintainer of your packages.
You can use it locally, as directory on disk, or you can put it on a server and share with others, or you can compile it using openSUSE Build Service, which supports OpenSuSE, SLE, Fedora, RedHat, CentOS, Debian, Arch, etc.: https://en.opensuse.org/openSUSE:Build_Service_supported_bui...
I think that there is a good reason to keep those projects in the main distribution: it makes them more robust and prevents the build systems and bootstrapping process from becoming a massive pile of technical debt.
For example, in order to get Clojure 1.10 running on Gentoo I had to dig through the git history to figure out a point where spec-alpha, core-specs-alpha and Clojure itself did not form a circular dependency in Maven. Because Clojure was not being packaged in a variety of ways, they are now at risk of making Maven Central a hard dependency, which makes the whole project less robust.
Do you mean "robust for people using Gentoo" or "robust for everybody"? Isn't the latter responsibility of the Clojure developers rather than packagers?
Robust for everyone. In theory it is the responsibility of the Clojure devs, but if there isn't a packaging workflow that they are aware of, how can they be expected to integrate with it and make design and development decisions that it would reveal to them? The point is more that, the greater variety of different environments a piece of software runs in the more likely it is that systematic weaknesses will be revealed, sort of like chaos engineering, except for whole environments instead of sporadic failures.
Yes, packaging for different distributions can reveal unknown problems or enhancement possibilities in the project; however, I would be surprised if developers in general were interested at all in supporting that activity. Maybe Clojure's are, I don't know.
Packaging is beneficial mainly to users. I expect developers to choose a single platform and support that; if there are more, great. But most developers aren't interested in the massive distraction that accommodating 520+ distributions would be.
You're making a distinction between "packagers" and "users" which does not exist. Packagers are advanced users that take the initiative to improve their distro when they find software they want to use and that isn't integrated in their distro.
> Let the new complicated juggernaut software be deployed and configured by the users, according to developers' instructions, using whatever modern mechanism they prefer
I'm not sure what "they" refers to here (developers or users). The existence of packagers is proof that some subset of users "prefer" that software "be deployed and configured" via distro package managers.
> The existence of packagers is proof that some subset of users "prefer" that software "be deployed and configured" via distro package managers.
Not necessarily. It might be inertia - back when linux distributions started being organised this way, those language dependency management mechanisms pretty much didn't exist.
CPAN has existed since 1993 and has been online since 1995. CPAN has a package manager which is able to install Perl software along with its dependencies. It's not a new, unknown technology.
If you are sure that you can create a functional, stable, bug-free, hole-free, up-to-date distro full of useful software with 10 independent package managers instead of one, then just do it. We will enjoy it. You will spend about 10x more time than the current maintainers do, but your time is free, so that's not a problem. Of course, we will say a huge THANK YOU (with nine zeroes) for your incredible effort, except for some haters, who will blame your perfect distro on HN for no reason.
Debian in particular put a fair bit of effort into integrating apt deeply with CPAN so that you could install a package from there that depended on system libraries and vice versa, and then for subsequent languages they... didn't. You'd think these were well-known technologies, but as far as linux distribution maintainers are concerned they're new and scary.
> If you are sure that you can create functional, stable, bug free, hole free, up to date, full of useful software distro with 10 independent package managers instead of one, then just do it.
I'd rather leave the distribution model behind entirely. And you know what? I do, and it works great. You just get the occasional complaint like this article, but it doesn't actually matter in the real world.
I meant ordinary users should rely on developers' defaults. If the user is advanced enough, he will be able to go his own way.
Yes I prefer the package manager too, when it works and gives me what I want. But for the software that changes fast, like new languages / compilers, or juggernaut software like Kubernetes, packagers can't keep up and I do not expect them to.
I actually wish this is feasible, but it isn't. GNOME is basic desktop, GNOME includes librsvg, librsvg build-depends on Rust. So you need to package Rust (hence LLVM) to have GNOME. Your proposal only works if "core OS" refuses to use Rust.
I think it's likely that there will be "system Rust" which is used to compile libraries used by the OS, but this is independent of the toolchain and crates that most developers use for their own development.
In a binary distribution, you don't necessarily need to install the system toolchain, unless you want to work on the system.
Yeah, I get that. GNOME is a horribly big piece of software. I would not count it as a "basic desktop", because it is so big and buggy and its dependencies are out of control. By basic desktop I mean something like Xfce, with modular components and minimal dependencies, so the desktop system is simple and unixy. You would still have the option to install GNOME on such a system, if only the GNOME developers made a canonical version of GNOME that was modular and would install on any clean Linux system with Xorg or Wayland.
It's the pragmatic thing to do, and it's de facto what happens on Linux distros too. Distros offer pip/docker/cargo, but when some libraries/programs are popular enough, someone will inevitably package them and offer them in a 3rd-party repo or the main distro repo. Examples of this would be numpy, ripgrep etc.
This will be a somewhat intemperate response, because as a developer of a significant library I found this quite irritating.
If you publish a Python library without pinned dependencies, your code is broken. It happens to work today, but there will come a day when the artifact you have published no longer works. It's only a matter of time. The command the user had run before, like "pip install spacy==2.3.5" will no longer work. The user will have to then go to significant trouble to find the set of versions that worked at the time.
In short unpinned dependencies mean hopeless bit-rot. It guarantees that your system is a fleeting thing; that you will be unable to today publish an end-to-end set of commands that will work in 2025. This is completely intolerable for practical engineering. In order to fix bugs you may need to go back to prior states of a system and check behaviours. If you can't ever go back and load up a previous version, you'll get into some extremely difficult problems.
Of course the people who are doing the work to actually develop these programs refuse to agree to this. No we will not fucking unpin our dependencies. Yes we will tell you to get lost if you ask us to. If you try to do it yourself, I guess we can't stop you, but no we won't volunteer our help.
It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. You might have a change that alters the time-complexity for niche inputs, making some call time-out that used to succeed. You might introduce a new default keyword argument that throws off a *kwargs. If you call these things "breaking" changes, you will constantly be increasing the major version. But if you increase the major version every release, what's the point of semver! You're not actually conveying any information about whether the changes are "breaking".
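As a tiny illustration of the *kwargs point above (hypothetical library and function names, nothing to do with spaCy's actual API): a "non-breaking" minor release that merely adds a default keyword argument can silently change the behaviour of a caller that was passing options through **kwargs.

    # libtext 1.0 (hypothetical): unknown keyword options ride along in **kwargs
    def tokenize(text, lowercase=False, **kwargs):
        # in the imagined library, kwargs would be forwarded to plugins; here they are unused
        return text.lower().split() if lowercase else text.split()

    # application code written against 1.0: "max_len" is meant for a downstream plugin
    print(tokenize("Hello World", max_len=5))        # ['Hello', 'World']

    # libtext 1.1: an "additive", supposedly non-breaking minor release
    def tokenize(text, lowercase=False, max_len=None, **kwargs):
        words = text.lower().split() if lowercase else text.split()
        return words if max_len is None else words[:max_len]

    # the very same call now has its output truncated, even though nothing
    # was removed or renamed in the public API
    print(tokenize("one two three four five six", max_len=5))   # 5 tokens, not 6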
I don't have a particularly strong viewpoint on this, but I find it noteworthy that in your example the user themselves is asking for a specific version of the software. You don't seem to be intending for users to ask for simply the latest version and have that work, but a specific one, and you want that specific version to work exactly as it did whenever it was published.
I can see some instances in which this expectation is important, and others where it is likely not or else certainly less important than the security implications.
For the extremes, I see research using spaCy has a very strong interest in reproducibility and the impact of any security issues would likely be minimal on the whole simply due to the relatively few people likely to run into them.
On the other extreme, say some low-level dependency is somehow so compromised that simply running the code will end up with the user ransomware'd, after just long enough a delay that this whole scenario is marginally plausible. Then say spaCy gets incorporated into some other project that goes up the chain a ways and ultimately ends up in LibreOffice. If all of these projects have pinned dependencies, there is now no way to quickly or reasonably create a safe LibreOffice update. It would require a rather large number of people to sequentially update their dependencies and publish the new version, so that the next project up the chain can do the same. LibreOffice would remain compromised or at best unavailable until the whole chain finished, or else somebody found a way to remove the offending dependency without breaking LibreOffice.
I'm not sure how to best reconcile these two competing interests. I think it seems clear that both are important. Even more than that, a particular library might sit on both extremes simultaneously depending on how it is used.
The only solution - though a totally unrealistic and terrible one - that comes to mind is to write all code such that all dependencies can be removed without additional work and all dependent features would be automatically disabled. With a standardized listing of these feature-dependency pairs you could even develop more fine-grained workarounds for removal of any feature from any dependency.
The sheer scale of possible configurations this would create is utterly horrifying.
At any rate, your utter rejection of the article's point seems excessively extreme and even ultimately user-hostile. I can understand your point of view, particularly given the library you develop; however, I think you should probably give some more thought to indirect users, i.e. users of programs that (perhaps ultimately) use spaCy. I don't know that it makes sense to practically change how you do anything, but I don't think the other viewpoint is as utterly wrongheaded as you seem to think.
> I'm not sure how to best reconcile these two competing interests.
What would help a lot is if the requirements were specified outside of the actual artifact, as metadata. Then the requirements metadata could be updated separately.
Libraries pinning dependencies only fixes a narrow portion of the problem and introduces a bunch of others (particularly in ecosystems where only a single version of a package can exist in a dependency tree). It is attractive mainly because it makes life slightly easier for the library developers. However, if every library pinned deps, it becomes much harder to use multiple libraries together: suppose an app used libraries A and B, and A depends on X==1.2.3, while B depends on X==1.2.4. It’s then pushed on to every downstream developer to work out the right resolution of each conflict, rather than upstream libraries having accurate constraints.
Pinning dependencies in applications/binaries/end-products is clearly the right choice, but it’s much fuzzier for libraries.
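A minimal sketch of that conflict using the packaging library (a real PyPI package; the libraries A, B and X themselves are hypothetical): if A pins X==1.2.3 and B pins X==1.2.4, the combined constraint is unsatisfiable, even though either library would very likely work with either version.

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    a_requires = SpecifierSet("==1.2.3")   # library A's pin on X
    b_requires = SpecifierSet("==1.2.4")   # library B's pin on X
    combined = a_requires & b_requires     # what an app depending on both A and B must satisfy

    for candidate in ["1.2.3", "1.2.4"]:
        print(candidate, Version(candidate) in combined)   # both print False: no version fits

    # With accurate, non-pinned constraints the conflict disappears:
    relaxed = SpecifierSet(">=1.2,<2") & SpecifierSet(">=1.2.3,<2")
    print(Version("1.2.4") in relaxed)     # True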
Can you give an example of a real ecosystem that can't handle such a conflict? In my actual experience, the package manager will either automatically use the latest version, or in one case has more complex rules but still picks a version on its own (but I stay away from that one due to the surprise factor). Your argument has force against bad package managers and against using very strict dependency requirements, but not against pinning dependencies sensibly in a good ecosystem.
The only conflict I've seen that can't be automatically resolved is when I had some internal dependencies with a common dependency, and one depended on the git repo of the common dep (the "version" being the sha hash of a commit), and another depended on a pinned version of the common dep. Obviously there's no good way to auto-resolve that conflict, so you should generally stick with versions for library deps and not git shas.
I think you're really under-rating how important it is to be able to do something like "pip install 'requests==1.0.5'" or whatever, in order to reconstruct the past state of a project. If requests hasn't pinned its dependencies, that command will simply not work. The only way you'll be able to install that version of requests is to manually go back and piece together the whole dependency snapshot at that point in time.
There's pretty much no point in setuptools automatically installing library dependencies for you if you expect the library dependencies to be unpinned. In fact it would be actively harmful --- it just leads people to rely on a workflow that works today but will break tomorrow.
You're asking for an ecosystem where there's no easy way to go back and install a particular version of a particular library. That's not better than having version conflicts.
The other thing I'd note is that it's quite an understatement to say that pinning dependencies makes life "slightly easier" for library developers. We're not going to accept builds just breaking overnight, and libraries that depend on us aren't going to accept us breaking their builds either.
Sure, it sucks that unpinned dependencies lose historical context as the deps move forward, and I’ve personally suffered this in my own library maintenance work... but there’s still the fundamental issue of conflicting pinned versions if there’s multiple libraries.
(At the app level, the right approach to “going back in time” is for those apps to pin all their deps, with a lockfile or ‘pip freeze’, not just top level ones. That is, one records the deps of requests==1.0.5 in addition to requests itself.)
If you publish a Python library with pinned dependencies, your code is broken as soon as someone tries to use it with another Python library with pinned dependencies, unless you happened to pin exactly the same version of the dependencies you have in common.
Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries. There are tools like Pipenv and Poetry to make that easy.
This is less of an issue in (say) Node.js, where you can have multiple different versions of a library installed in different branches of the dependency tree. (Though Node.js also has a strong semver culture that almost always works well enough that pinning exact versions isn’t necessary.)
The most frustrating thing is that pip doesn't make it easy to use more loose declared dependencies while freezing to actual concrete dependencies for deployment. Everybody rolls their own.
> Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries.
Is the pypi package awscli an application or a library?
poetry is frustrating in that it doesn't allow you to override a library's declared requirements to break conflicts. They refuse to add support [1][2] for the feature too. awscli for example causes huge package conflict issues that make poetry unusable. It's almost impossible not to run into a requirement conflict with awscli if you're using a broad set of packages, even though awscli will operate happily with a more broad set of requirements than it declares.
For this purpose, I’m defining a “library” as any PyPI package that you expect to be able to install alongside other PyPI packages. This includes some counterintuitive ones like mypy, which needs to extract types from packages in the same environment as the code it’s checking.
The awscli documentation recommends installing it into its own virtualenv, in which case pinned dependencies may be reasonable. There are tools like pipx to automate that.
Though in practice, there are reasons that installing applications into their own virtualenv might be inconvenient, inefficient, or impossible. And even when it’s possible, it still comes with the risk of missing security updates unless upstream is doing a really good job of staying on top of them.
I don’t think that respecting declared dependency bounds is a Poetry bug. Pip respects them too (at least as of 20.3, which enables the new resolver by default: https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip...). If a package declares unhelpful bounds, the package should be fixed. (And yes, that means its maintainer might have to deal with some extra issues being filed—that’s part of the job.)
> Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries.
This is essentially what we do where I work. When we make a tagged release, we create a new virtual environment, run a pip install, run all the tests and then run pip freeze. The output of pip freeze is what we use for the install_requires parameter in the setup method in setup.py.
That said, a library could certainly update its old releases with a patch release and specify a <= requirement on a particular dependency when versions newer than that no longer work. It would be a bit of work, though, since indirect dependencies would also have to be accounted for.
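For what it's worth, a minimal sketch of that workflow in a hypothetical setup.py: the frozen output is written to a file at release time and read back as install_requires. The file and package names here are made up, and whether a library should do this at all is exactly what's being debated in this thread.

    # setup.py -- pins taken from "pip freeze > requirements.frozen.txt" at release time
    from pathlib import Path
    from setuptools import setup, find_packages

    frozen = Path(__file__).with_name("requirements.frozen.txt")
    pins = [
        line.strip()
        for line in frozen.read_text().splitlines()
        if line.strip() and not line.startswith("#")
    ]

    setup(
        name="example-lib",            # hypothetical package
        version="1.4.2",
        packages=find_packages(),
        install_requires=pins,         # e.g. ["requests==2.25.1", "urllib3==1.26.3", ...]
    )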
> It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. ... If you call these things "breaking" changes, you will constantly be increasing the major version.
One of the things that prompted the OP was this breakage in Python's cryptography package [1] (OP actually opened this issue) due to the introduction of a Rust dependency in a 0.0.x release. The dependency change didn't change the public API at all, but did still cause plenty of issues downstream. It's a great question on the topic of semver to think about how to handle major dependency changes that aren't API changes. Personally, I would have preferred a new major release, but that's exactly your point syllogism — it's a matter of opinion.
As a sidenote, Alex Gaynor, one of the cryptography package maintainers is on a memory-safe language crusade. Interesting to see how that crusade runs into conflict with the anti-static linking crusade that distro packagers are on. I find both goals admirable from a security perspective. This stuff is hard.
It's hard because underneath is a battle of who bears the maintenance and testing costs that no one wants to bear.
Asking a publisher to qualify their library against a big range of versions just means that they need to do a lot more testing and support. Obviously they want to validate their code against one version, not 20, and they certainly don't want an open-ended ">" version constraint, which would force them to re-validate each time a minor dep is released.
Similarly when publishers say I will only work against version X, this puts a bigger burden on the user to configure their dependencies and figure out which version they can use. They would like to push that work onto vendors.
What's a bit depressing is that these economic concerns are not raised openly as the primary subject matter; the discussion is always veiled in terms of engineering best practices. You're not gonna engineer your way out of paying some cost. Just agree on who bears the cost and how you will compensate them for it, and then the engineering concerns become much easier.
> In short unpinned dependencies mean hopeless bit-rot.
No, this is not true, for the simple reason that there will _always_ be unpinned dependencies (e.g. your compiler, your hardware, your processor) and thus _those_ are the ones that will guarantee bitrot.
Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Compilers of languages like C, C++, Rust, Go etc go above and beyond to maintain backwards compatibility. It is extremely likely that you will still be able to compile old code with a modern compiler.
> your processor
Processor architectures are common enough that people go out of their way to build backwards-compatibility shims: things like Rosetta, QEMU, and all the various emulators for old gaming systems.
> your hardware
Apart from your CPU (see above), your hardware is accessed through abstraction layers designed to maintain long-term backwards compatibility: things like OpenGL, Vulkan, Metal, etc. The abstraction layers are in widespread enough use that as older ones become outdated, people start implementing them on top of the newer layers. E.g. here is OpenGL on top of Vulkan: https://www.collabora.com/news-and-blog/blog/2018/10/31/intr...
> [Your kernel]
Ok, you didn't say this part, but it's the other big unpinned dependency. And it too goes above and beyond to maintain backwards compatibility. In fact Linus has a good rant on nearly this exact topic that I'd recommend watching: https://www.youtube.com/watch?v=5PmHRSeA2c8&t=298s
> Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Dependencies are far more likely to rot because they change in incompatible ways than the underlying hardware does, even before considering emulators. It's hard to take this suggestion seriously at all.
> Dependencies are far more likely to rot because they change in incompatible ways than the underlying hardware does
Yes, that is true. It is also very likely that you can more easily go back to a previous version of a dependency than you can go back to a previous hardware. The argument is that, therefore, pinning can only speed up your rotting.
If you don't statically link your dependencies, and due to an upgrade something breaks, you can always go back to the previous version. If you statically link, and the hardware, compiler, processor, operating system, whatever causes your software to break, now you can't update the dependency that is causing the breakage. And it is likely that your issue is within that dependency.
If developers are unwilling to maintain dependencies and be good citizens of the larger language community, should they be adding those dependencies in the first place?
If you're not operating in the large ecosystem then fine. But if your project is on e.g. pypi, then there is an issue.
(edit: Note, yes I know the virtualenvs exist, docker exists, etc. but those are space and complexity trade-offs made as a workaround for bad development practices)
Perhaps I still fail to explain myself: what I am saying is that _not pinning_ only _adds_ more choices, so by definition it can only work better.
Pinned or not, if a software update breaks things, you can always just revert back to a previous version of your dependencies. This applies to a myriad soft problems including a dependency changing interface.
However, when pinning, when one of your static dependencies is broken due to a change outside your control (e.g. hardware, operating system, security issue making it unusable, or something else), the user's only recourse is to call the developer to fix the software.
I am not claiming that one happens more frequently than the other, or that hardware changes cannot break the main software itself, which would often nullify the point. All these issues can happen to software with either static or dynamic linking. However, dynamic linking has at least one extra advantage that static linking cannot have, and the opposite is not true.
> have you ever worked as an application developer? Responsible for getting working artifacts to users as a means to an end?
Look, ironically I find that all of this crap discussion exists because of a newer generation of "application developers" who do not yet know what it means to "deliver working artifacts to users". Imagine my answer to that question.
> However, when pinning, when one of your static dependencies is broken due to a change outside your control (e.g. hardware, operating system, security issue making it unusable, or something else), the user's only recourse is to call the developer to fix the software.
In practice, this happens so infrequently it can be ignored as a risk. (When it does happen, users generally don't expect the software to continue to work.)
> dynamic linking has at least one extra advantage...
You don't seem to be acknowledging the downside risk to dynamic linking which motivates the discussion in the first place. An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
> In practice, this happens so infrequently it can be ignored as a risk.
Well I disagree there. Security issues or external protocol changes (e.g. TLSv1.2 to TLSv1.3) are rather frequent, not to mention usually customer wants to upgrade their machines (old ones broke) and existing operating system no longer supports the new hardware.
> An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
Again, I agree. A "surreptitious" dependency update breaking the software is much more common. However, I have already acknowledged that _twice_, and the point I'm making is that it doesn't matter whether you are pinning dependencies or not: the customer CAN FIX these issues without help from the developer. They just have to roll back the update!
On the other hand, the customer CAN'T fix the first issue (e.g. new hardware).
> No, this is not true, for the simple reason that there will _always_ be unpinned dependencies (e.g. your compiler. your hardware. your processor) and thus _those_ are the ones that will guarantee bitrot.
Docker with sha256 tags fixes that issue (and Docker containers even specify a processor architecture).
Libraries should absolutely not pin their dependencies. Applications should if you care about reproducible builds (not necessarily byte-for-byte, but "can build today == can build tomorrow").
Installing both libraries and applications in the same way in the same environment is a fundamental mismatch that pip encourages, and yes - it leads to fragile binaries.
You’re completely wrong and this advice is somewhat harmful. What you’re describing is how a Python application should be managed. Not a library. Libraries should absolutely not lock their advertised dependencies to arbitrary point-in-time versions for fairly obvious reasons.
Picking a suitable dependency specifier depends heavily on the maturity of the library you’re using and if you need any specific features added or removed in a specific release.
Saying your library depends on “spacy==2.3.5” is a lie that will mean any other library that depends on spacy>=2.3.6 can’t be used. Even if your code will realistically work fine with any spacy 2.x release.
Everyone needs commands like "pip install 'spacy==2.3.5'" to work reliably in the future, so that you can go back and bisect errors. You need to be able to get back to a particular known-good state, and work through the changes.
I'm not saying we pin our dependencies to exact specific versions, but we absolutely do set an upper bound, usually to the minor version.
> I'm not saying we pin our dependencies to exact specific versions, but we absolutely do set an upper bound, usually to the minor version.
OK. That's more sensible, but "pinning" implies == to a specific version. If you know a library does semantic versioning and breaks their API then ~= is fine. Just not ==.
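For anyone unfamiliar with the operators, a small illustration of the difference using the packaging library: ~= is the "compatible release" operator, so it accepts later releases within the same series, whereas == accepts exactly one version.

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    exact = SpecifierSet("==2.3.5")    # pin: only 2.3.5 satisfies it
    compat = SpecifierSet("~=2.3")     # compatible release: >=2.3, ==2.*
    patch = SpecifierSet("~=2.3.5")    # compatible release: >=2.3.5, ==2.3.*

    for v in ["2.3.5", "2.4.0", "3.0.0"]:
        print(v, Version(v) in exact, Version(v) in compat, Version(v) in patch)
    # 2.3.5  True   True   True
    # 2.4.0  False  True   False
    # 3.0.0  False  False  False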
Even applications shouldn't really be pinning dependencies. The only time to pin dependencies is when deploying that application. That could mean bundling it with pyinstaller or making a docker image. But someone should still be able to install it from source with their own dependencies.
> It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. You might have a change that alters the time-complexity for niche inputs, making some call time-out that used to succeed. You might introduce a new default keyword argument that throws off a *kwargs. If you call these things "breaking" changes, you will constantly be increasing the major version. But if you increase the major version every release, what's the point of semver! You're not actually conveying any information about whether the changes are "breaking".
That's the point of the link to Hyrum's law. The article argues that the practice of pinning encourages that attitude: consumers feel free to depend on internal implementation details, producers feel free to change behaviour arbitrarily, and no-one takes responsibility for specifying and maintaining a stable interface, which is how you actually break that knot - producers need to specify which parts are stable interfaces and which are not, consumers need to respect that and not depend on implementation details, and then you can actually use semver because it's clear what's a breaking change and what isn't.
> If you publish a Python library without pinned dependencies, your code is broken.
> you will be unable to today publish an end-to-end set of commands that will work in 2025
Not necessarily.
Since ~2010 I have maintained an application with an unpinned requirements.txt; it doesn't even have version constraints at all.
The only breakages I had were either:
1. when switching from Python 2 to Python 3 (obviously)
2. when a new Python version introduces a bug (but Python is not pinnable anyway)
3. once, when a dependency released a new major version and removed an internal attribute I was using in my tests out of laziness (so that one is entirely on me)
The trick is to only use good libraries, that care about not breaking other people's code.
---
It's also worth noting that it's not your job as a developer to make sure your application can be installed anywhere; it's the packager's job to make sure your app can be installed in their distribution.
And if your users want to use pip (which is kind of the Python equivalent of wget + ./configure + make install) instead of apt/yum/... to get the very latest version of your software, then they should be able to figure out how to fix those issues.
It's clear that your approach is a possible approach:
1. 'only use good libraries'
2. 'it's not your job as a developer to make sure your application can be installed'
3. 'if your users want to use pip... they should be able to fix those issues'
However, this isn't a solution to the problem that led to the existence of language ecosystems. It is a refusal to acknowledge the problem.
If you're introducing breaking changes in every new release you should still be in the 0.x stage of SemVer. You're doing something wrong if you end up on v77.0.0. The Node ecosystem's strict compliance with SemVer works fine 99% of the time because SemVer is indeed an effective versioning system (when people use it right).
I get the impression that this advice is accurate for the python ecosystem, but that’s because the entire ecosystem is broken with respect to backwards compatibility.
The exact same mechanisms work fine with other programming languages, and (more importantly, probably) different developer communities.
In fairness, Python’s lack of static types does make things worse than the situation for compiled languages. (Though that’s a general argument against writing non-throwaway code in python).
People claim node does better, even though JS is also missing static types, so presumably they solved this issue somehow (testing, maybe?). I don’t use it, so I have no idea.
Whilst I don't totally disagree with many of the points here, I think there's a wider picture to many of these issues.
The author is concerned with installing packages on user machines: which are typically very long-lived installs - maybe a user has the same machine with the same dependencies for years.
However, for many engineers (such as myself), a binary may not be used past even a few days from when it was first compiled - e.g. as part of a service in a quickly, continuously integrated system.
I might even argue that _most_ software is used in this way.
When software is built this way, many of the points in this article are very helpful to keep builds stable and to make deployment fast - and in fact for the case of security, we usually _don't_ want dependencies to auto-update, as we do not want to automatically deploy new code if it has not been audited.
Maybe there's a future where OSs become more like this, where binaries are more short-lived... maybe not. Although I don't think it's strictly fair to label all of these as "Bad" with a capital B :)
The way iOS and Fuchsia are dealing with the problem is to completely lockdown the operating system with a tight permissions system. An app can be compromised but the damage is limited. Perhaps it is time for servers to move to a similar model.
You mean cgroups or zones, don't you? Docker was (last time I heard) a security disaster: not generating robust layer hashes, lacking user isolation, and plenty just running as root...
> An app can be compromised but the damage is limited
AKA the "we don't care" security model. What exact use is the fact that the web browser is "contained" if it is compromised? The mail client? Your PIM program? On a server, what use is that the database engine is contained if it is compromised?
I am the first to accept the security benefits of sandboxing, but it is just _one_ thing. It doesn't even help against the majority of issues. Not even on Android/iOS.
There is a lot I could say about this article, but I've kinda been in this whole Texas natural disaster situation. But I do think it's worth pointing out one thing:
> Rust bundles a huge fork of LLVM
It is not super huge, and we try to upstream patches regularly to minimize the fork. We also regularly re-base all current patches on each release of LLVM, so it's less 'fork' and more 'maintain some extra commits to fix bugs.'
> and explicitly refuses to support to distributions using the genuine LLVM libraries.
We always support the latest LLVM release, and try to maintain compatibilities with older ones as long as is reasonable and possible.
IIRC, the last time we raised the base LLVM requirement was in Rust 1.49, at the end of last year. The minimum version it was raised to was LLVM 9, which was released in September of 2019. The current release is 11.
Again, because of the lack of the extra patches, you may see miscompilation bugs if you use the stock LLVM. There's pros and cons to every choice here, of course.
The Rust project is and has been interested in collaborating on issues to make things easier for folks when there's demand. We've sought feedback from distros in the past and made changes to make things easier!
(I also posted a slightly modified version of this to their blog comments as well. EDIT: the author has commented that suggesting Rust's fork is large was in error)
EDIT: you completely changed your comment. My original response is below the line, but I'll also respond to your new one above it.
I don't actually understand grammatically what you're asking for or saying in this comment. Rust, Cargo, and a bunch of Rust applications are packaged in Debian today, from Buster onwards.
I think what you're saying is that rustc head should be able to be built with the llvm that lives in Debian stable? That is a choice that could be made, I'm sure, but then you're talking about supporting five API-incompatible releases of LLVM. The folks from Debian aren't even asking for that level of compatibility, and it would be quite a bit of work, for unclear benefit.
--------------------------------------------
There are lots of demands, from lots of people. Managing a project requires prioritizing different demands of all of your different users.
> When will the problem mentioned in the article
The article talks about many more things than one single problem. They range from easy, like the "we already let you use stock llvm" I mentioned above, to difficult and long-running, like "define a Rust ABI." The needs of packagers have to be balanced with the needs of other folks as well. "Define a Rust ABI," specifically, would help some packagers, but it may harm other users. It also may be something that's good to have, but that we aren't quite ready for yet.
A willingness to work together to solve issues, and to find solutions that help as many different kinds of folks as possible, is what matters here. We have had folks from other distros pitching in to help make things better for their distros, and I can't imagine we wouldn't accept the help of the OP either.
What's the problem with focusing on upstreaming the patches and getting rid of rust's "staging fork"? Especially if they fix something as serious as miscompilations, wouldn't those patches be VERY much needed upstream? I'm asking out of genuine interest.
That's already done. The point is that it takes time, and there are always more of them, so at any point in time there are likely to be a handful still in the process of moving upstream.
> Strongly prefer to upstream all patches to LLVM before including them in rustc.
That is, this is already the case. We don't like maintaining a fork. We try to upstream as much as we can.
But, at the same time, even when you do this, it takes tons of time. A contributor was talking about exactly this on Twitter earlier today, and estimated that, even if the patch was written and accepted as fast as possible, it would still take roughly a year for that patch to make it into the release used by Rust. This is the opposite side of the whole "move slow and only use old versions of things" tradeoff: that would take even longer to get into, say, the LLVM in Debian stable, as suggested in another comment chain.
So our approach is, upstream the patches, keep them in our fork, and then remove them when they inevitably make it back downstream.
(EDIT: Also, what Rusky said: there's basically always going to be more than one patch, in various stages of this process...)
Ok, maybe I'm just too spoiled by my previous experience then. I never had any problems getting patches upstreamed relatively quickly, but then e.g. meson or picolibc are much smaller than LLVM and patches are simpler to reason about (I know some compiler construction).
Remember that in my comment above, it’s assuming a speedy review. Release schedules are still a thing; I’m not at my computer anymore, but IIRC, llvm releases once or twice a year, so that’s the inherent limit of making it into a release, not necessarily the review time.
Can't there be a build option to not use the LLVM submodule, and instead use the system LLVM? Assuming there are tests for these LLVM bugs, and assuming the patches are indeed being merged, wouldn't a CI be able to catch when it is safe for downstream users that want to use upstream LLVM to update their Rust installation?
Hmm, I couldn't find that last time I looked at building Rust -- all I saw was an option to grab precompiled builds of Rust LLVM from the CI server, but maybe I missed something.
It's a risk calculation: do I want to risk being vulnerable to 0days, or do I want to risk my application not running for any user because a dependency changed its headers/api?
As software engineers we want to be in control of as much as possible when running our application, to make it as deterministic as possible. For that we select versions of our dependencies, build it, and then test it, and if our tests pass, we release it.
If we let the OS determine the dependency versions, without testing, what guarantees can we give that our code will work? Do you write tests for future, unknown changes in libraries? Do you write tests against randomness? Because that's the kind of unpredictability you're introducing.
Keeping the application working in the face of dependency updates is the distro's job, if they're supported in doing it. They won't be pushing dependency updates at random except to channels explicitly marked as unstable (at least, assuming a minimally sane distro).
I'm less familiar with Gentoo, but Debian-based distros ought to be safe from that threat (and if anything, you ought to be worried about the reverse problem). So your question becomes: do I want to risk being vulnerable to zero-days?
Why do only *nix folks pull their hair out about this? On Windows programs bundle their dependencies all the time (sometimes even as shared libraries! but without the independent update benefits) and hardly anybody loses sleep over it. Heck, users actually like the fact that it minimizes friction. Nobody claims it's rock-solid security, but does it need to be?
Actually, now that I wrote it above, I think I might have found an answer to my own question: while of course infosec experts will freak out about any vulnerability existing anywhere for even a shred of a nanosecond, in the real world this is really only a big deal for servers, not clients. And I'm guessing this issue affects Linux folks more because servers tend to run on Linux, with security-sensitive software exposed to the open internet all the time. Maybe we need to realize & embrace this trade-off instead of fighting it till the end of time?
(I suppose one could even argue servers don't need this either if all the packages are kept up-to-date anyway; I guess that might also be debatable, but it's beside my point.)
The unix culture comes from a shared multi-user perspective, whereas the windows point of view tends to be towards a single user desktop. In a shared environment, it's not acceptable for applications to change shared system components. Although this is maybe less important today, the culture still persists.
> In a shared environment, it's not acceptable for applications to change shared system components.
That era is long gone on Windows (literally since XP I think). In fact I've had to do this far more on Linux than on Windows in recent memory... the latest instance ironically being the glibc CLOCK_REALTIME mess on WSL. Right now programs bundle whatever they need, and these go into the app directory (like \Program Files). e.g., I have maybe ~20 sqlite3 DLLs on my system besides the system ones.
Your larger point about the culture might still be correct though, I don't know.
> while of course infosec experts will freak out about any vulnerability existing anywhere for even a shred of a nanosecond, in the real world this is really only a big deal for servers, not clients
I don't think that the linked article really reflects a consensus among the security community at all. I don't really find the "dynamic libraries are more secure" argument overly compelling, certainly not stated so broadly.
People honestly need to push back on the constant focus on security. It's an important concern, but it shouldn't make life worse for users, and in some ways it is in fact being used that way.
The day I stop static linking is the day I can compile on one distro and ship to many without worrying about users reporting loader errors. That day is not today.
Until then, I'll keep doing it, because it saves me time and money.
I don't really buy that dynamic linking all the things is such a boon to security. But I'll link to this: https://drewdevault.com/dynlib
> Not including libc, the only libraries which had "critical" or "high" severity vulnerabilities in 2019 which affected over 100 binaries on my system were dbus, gnutls, cairo, libssh2, and curl. 265 binaries were affected by the rest.
> The total download cost to upgrade all binaries on my system which were affected by CVEs in 2019 is 3.8 GiB. This is reduced to 1.0 GiB if you eliminate glibc.
> It is also unknown if any of these vulnerabilities would have been introduced after the last build date for a given statically linked binary; if so that binary would not need to be updated. Many vulnerabilities are also limited to a specific code path or use-case, and binaries which do not invoke that code path in their dependencies will not be affected. A process to ascertain this information in the wake of a vulnerability could be automated.
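To make the "could be automated" part concrete: for statically linked Go binaries at least, `go version -m` reports the module versions baked into the binary, so a small script can flag binaries that embed a known-bad dependency. A rough sketch (the advisory table below is a made-up placeholder, not real data):

```python
#!/usr/bin/env python3
# Sketch: flag statically linked Go binaries that embed a known-vulnerable
# module version, using the build info that `go version -m` reports.
# The VULNERABLE table is a hypothetical example, not real advisory data.

import subprocess
import sys

VULNERABLE = {
    "golang.org/x/text": {"v0.3.7"},   # hypothetical affected version
}

def embedded_deps(binary):
    """Yield (module, version) pairs baked into a Go binary."""
    out = subprocess.run(["go", "version", "-m", binary],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        fields = line.split("\t")
        # dependency lines look like: "\tdep\t<module>\t<version>\t<hash>"
        if len(fields) >= 4 and fields[1] == "dep":
            yield fields[2], fields[3]

if __name__ == "__main__":
    for binary in sys.argv[1:]:
        for module, version in embedded_deps(binary):
            if version in VULNERABLE.get(module, ()):
                print(f"{binary}: embeds {module} {version} - needs a rebuild")
```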
Maybe do the trendy thing and link your whole distro together with your app (a.k.a containers)?
The fact that containers exist and are prevalent is a damning indictment of the Linux model imho.
Producing software that can simply launch is so wildly complex that the solution is... to snapshot an entire operating system install to run the application.
> Now, for the worst of all — one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it. The exact consequences of bundling vary depending on the method used.
Let’s consider this statement at face value. The WORST thing a program can do is... include the things the program needs to run? So programs should... NOT include the things they REQUIRE to run?
I get it. I understand the argument. But if you take just a half step back it should be clear how utterly broken the whole situation is.
I'm not sure it's fair to characterize a container as a snapshot of an entire OS install - you can have containers that are very lightweight, e.g. just a static Go binary copied into a "scratch" container - but often the required effort is not put in to reduce size.
You are comparing having containers versus having no containers and no kind of bundling at all, which is not a reasonable comparison. If you need an OS snapshot, the only option on other operating systems is often to use a VM. That's what containers are used as an alternative to. (The other alternative there is to never install patches or upgrades on the production machines, which probably creates as many problems as using containers)
I’m not sure I made that comparison. I’m saying the popularity of containers is an existence proof that the distro-oriented shared library system is failing to meet the needs of common use cases.
Programs that bundle dependencies have a radically reduced need for snapshots, VMs, or containers. Such tools do provide a variety of value. But they don’t become virtually requirements to merely launch a program without error.
Ship your damn dependencies says I. Either statically or dynamically, but ship them.
You can't get rid of the distro-oriented shared library system, it exists as long as you are building on top of an operating system. To that end, when you say "we depend on operating system minimum version X.Y.Z" that now becomes another dependency you have to ship at some point if you're running the production machines. Containers are popular because they actually solve that exact problem. If there was some other comparison you were making I'd be interested to hear it, but AFAIK containers can only help here so it's not clear what else you were comparing to.
Parent is saying that the distribution model of modern distros is so legacy and out of alignment with what people actually want that a solution involving bundling the entire damn OS to sidestep that pit of snakes has now become the de facto way to deploy software.
To put it another way, what do you think the relative popularity of distributing your own internal software via a private apt repo is compared to bundling it as a stateless container and putting it on a registry? Some big companies that pre-date containers do it with apt. Most don’t. For good reason.
The company I work at has an internal yum repository that we use for applications and certain application dependencies. It's worked reasonably well for us, mainly because we stick with using dependencies provided through the internal or public yum repositories, or some other external yum repositories if a particular dependency is not available otherwise.
I'm sure a similar solution could also work with apt.
For an entire OS, this doesn't make sense. It does make sense for business-critical software. We did this 20 years ago, even calling the subdir VENDOR, for some 3rd-party components. It's part of the design, and even of supporting your stuff.
What's new is containers and k8s. People will put more into that, for dev and critical sw.
However, for full desktop or OS it's probably cargo-cult or hard to scale and maintain.
I don't think this is a fair characterization of why containers are a killer app. Full-OS virtualization was a killer app for hosting providers because it enabled multi-tenancy, allowing you to oversubscribe physical resources knowing that most applications you host aren't going to be experiencing anywhere near peak load 100% of the time. Containers allow you to take this a step further by allowing multiple applications with different versions of the same dependencies to run together on the same physical or virtual server without needing to worry about symbol clashes and without having to install one application's dependencies into some different than expected location while manipulating LD_LIBRARY_PATH and PATH.
This solved a huge problem for us in the geointelligence ground processing community. The various product generation algorithms tend to all be native code, but they're produced by different contractors, on different delivery schedules, some of whom are no longer on contract at all to provide updates. What do you do when four of them all depend on libxerces-c but all four depend on a different version? Tell them to stop pinning dependencies and update? How, if they're not on contract to do it? Tell the US Congress to get off their butts and award more money to the backing agencies so they can get their developers back on contract? Good luck. In practice, what we ended up doing, without wanting to stripe certain applications to only run on specific servers when some of them are used a ton more than others, was to install dependencies on an NFS mount shared by all the fast compute hosts, into application-specific subdirectories, and then set PATH and LD_LIBRARY_PATH before calling the executable that did all the work.
This system was terrible! But before containers, it was the best we could do. Now we just deploy each individual processing algorithm in its own container, and they can depend on whatever they want, all using standard system paths under the illusion that they have an entire OS and filesystem all to themselves, and never clashing with each other.
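For anyone who hasn't lived through that: the pre-container workaround boiled down to a per-application launcher roughly like this (paths and names invented for illustration):

```python
#!/usr/bin/env python3
# Sketch of the pre-container workaround described above: launch a vendor's
# binary with PATH/LD_LIBRARY_PATH pointing at an application-specific
# dependency prefix. Paths and names are invented for illustration.

import os
import sys

def run_with_private_deps(app_prefix, exe, args):
    env = dict(os.environ)
    # Prepend the app's private bin/ and lib/ so its pinned dependency
    # versions win over the system-wide ones.
    env["PATH"] = f"{app_prefix}/bin:" + env.get("PATH", "")
    env["LD_LIBRARY_PATH"] = f"{app_prefix}/lib:" + env.get("LD_LIBRARY_PATH", "")
    os.execvpe(exe, [exe, *args], env)   # replaces this process with the app

if __name__ == "__main__":
    # e.g. run_with_private_deps("/mnt/apps/algo-foo-1.2", "algo-foo", [...])
    run_with_private_deps(sys.argv[1], sys.argv[2], sys.argv[3:])
```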
> Containers allow you to take this a step further by allowing multiple applications with different versions of the same dependencies to run together on the same physical or virtual server without needing to worry about symbol clashes and without having to install one application's dependencies into some different than expected location while manipulating LD_LIBRARY_PATH and PATH.
We agree 100% on the problem. I enthusiastically agree with all of your complaints.
My point is that the Linux model of system-wide shared libraries and PATH/LD_LIBRARY_PATH bullshit is terrible. And the fact that containers are required to resolve that spiderweb nightmare is a damning indictment of the Linux library model.
Containers are one possible solution. An alternative is for those applications to bundle their dependencies. If all applications bundled their dependencies then everything would “just work”. No need to hack bullshit envvars. No need to containerize.
Yes, that means it's harder to deploy security fixes. But if everyone is using containers, those images need to be updated too. At which point, what have you even gained?
> I don't really buy that dynamic linking all the things is such a boon to security.
Agreed. It's really not a panacea. When you upgrade a library, you probably want to restart the running applications that depend upon this. Dynamic linking won't save you.
The solution is having a graph of your dependencies! IIRC, NixOS gets this right. I don't think Debian's apt did?
I would really like to know how NixOS solves this problem for things like yarn, webpack, parcel, esbuild, npm, pip, conda, cargo, stack, go modules, ruby gems, etc.
Usually there are tools to extract the necessary dependency information (url and hash) from a package manager's lockfile and produce a fixed-output derivation so a Nix expression can reproduce the build environment. I know of yarn2nix, cargo2nix, and node2nix, and I wrote and maintain gradle2nix for JVM projects.
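At their core those converters mostly boil down to pulling the pinned name/version/hash entries out of the lockfile so each dependency can become a fixed-output fetch. A rough sketch of that step for a Cargo.lock, assuming the usual crates.io download URL layout (the real tools do much more, including emitting the Nix expressions):

```python
#!/usr/bin/env python3
# Sketch of the core step those converters perform: read pinned
# (name, version, checksum) entries out of a lockfile so each dependency can
# become a fixed-output fetch. Shown here for Cargo.lock; the real tools also
# generate the Nix expressions and handle git/path dependencies, workspaces, etc.

import sys
import tomllib  # Python 3.11+

# Assumes the usual crates.io download URL layout.
CRATE_URL = "https://static.crates.io/crates/{name}/{name}-{version}.crate"

def pinned_crates(lockfile):
    with open(lockfile, "rb") as f:
        lock = tomllib.load(f)
    for pkg in lock.get("package", []):
        # Only registry packages carry a checksum; path/git deps are skipped here.
        if "checksum" in pkg:
            yield (pkg["name"], pkg["version"], pkg["checksum"],
                   CRATE_URL.format(name=pkg["name"], version=pkg["version"]))

if __name__ == "__main__":
    for name, version, sha256, url in pinned_crates(sys.argv[1]):
        print(f"{name} {version}\n  url:    {url}\n  sha256: {sha256}")
```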
It doesn't really, and that's the problem. There are automated tools and scripts to help, but there are always edge cases, so you end up going back to using pip, for example.
Nix and language package managers just don't really play well together.
There are scripts and daemons that help you determine what needs restarting[1,2]. NixOS installs can go in separate directory prefixes when there are conflicts. For Gentoo and other Linux distributions, maintainers usually won't mark something stable without resolving conflicts, and this usually means sticking to older stable versions of libraries until newer versions are fully supported by all installed packages. This can definitely be more work for maintainers, and as the blog post says, it's a sisyphean task.
Thanks, checkrestart looks useful. I wasn't aware of it.
Given that it's in a goodies package though, I assume it's not integrated with apt—at least by default.
I'm thinking of that apt prompt that says "There are services installed on your system which need to be restarted when certain libraries, such as libpam, libc, and libssl, are upgraded." I presume then that uses a hardcoded list of important services.
I can't help with checkrestart or apt questions, but if I can plug 'needrestart -r a': it restarts services, messages outdated shells/user logins and checks for newer kernels/microcode. It will also tell you if any interpreters/containers need restarting, but YMMV here. It's never blown up for me even when restarting boot services, and should work on any distro as far as I can tell. I use it on Gentoo.
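For the curious, the core trick behind checkrestart/needrestart is simple: a process that still maps a shared object whose file has been replaced shows the old mapping as "(deleted)" in /proc/<pid>/maps, so it needs a restart. A stripped-down sketch of that check (the real tools handle far more edge cases):

```python
#!/usr/bin/env python3
# Stripped-down version of the check behind checkrestart/needrestart: a process
# that still maps a shared object whose file was replaced by an upgrade shows
# the old mapping as "(deleted)" in /proc/<pid>/maps, so it needs a restart.
# The real tools handle many more cases (interpreters, containers, ...).

import os

def stale_processes():
    stale = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/maps") as maps:
                libs = {
                    line.split(None, 5)[5].strip()
                    for line in maps
                    if line.rstrip().endswith("(deleted)") and ".so" in line
                }
        except OSError:
            continue  # process exited, or we lack permission; skip it
        if libs:
            try:
                with open(f"/proc/{pid}/comm") as comm:
                    name = comm.read().strip()
            except OSError:
                name = "?"
            stale[int(pid)] = (name, sorted(libs))
    return stale

if __name__ == "__main__":
    for pid, (name, libs) in sorted(stale_processes().items()):
        print(pid, name)
        for lib in libs:
            print("   ", lib)
```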
> The day I stop static linking is the day I can compile on one distro and ship to many without worrying about users reporting loader errors. That day is not today.
That's completely missing the point of the article.
You're free to ship your program binary however you want as the upstream.
It only becomes a problem if dynamic linking is a second-class citizen in the language or build tool you use, if you bundle dependencies and don't support un-bundling, or if you pin specific dependency versions.
I feel like this confuses a lot of things by assuming an extremely sophisticated end user. Sure, if you're a double-threat dev/sysadmin using linux, then when some vulnerability gets discovered in some dynamically linked library on your system, you have the capacity to (a) receive information about that fact, and (b) update it.
But now suppose you're an ordinary person. You use software. Maybe you even have a windows machine.[1] Which is more likely to actually get a security update to you?
(a) You have to update a single piece of software, which you know you've installed through a recognized distribution channel like an app store or something, and all its dependencies come with it.
(b) You have to either learn what a DLL is, learn how to update it, and then hope that nothing you rely on breaks in some mysterious way because of a dependency on a dependency on a dependency on a dependency. Or you have to accept a whole operating system update, assuming the operating system update even includes the DLL fix, and with it all of the other crap that comes with operating system updates from Microsoft (and Apple): bugginess from complex updates, incompatibilities between new versions of operating systems and software (or hardware) that you rely on, new obnoxious security rules that you might not agree to (looking at you, Cupertino), and massive disruptions to your ability to actually use your computer to do your work.
No thanks.
[1] Maybe this article is specifically targeted against the linux ecosystem? If so, perhaps this issue is ameliorated somewhat, but it still seems to put a fairly substantial burden on end users, that seems to be inconsistent with actually letting non-experts use linux OSes.
Alternatively, realise that the problems in (b) are largely solved for reasonable OSes where the vendor takes responsibility both for automatically getting you security patches and for keeping you working without major disruptions.
I'm not sure where you've got the idea that updates are all-or-nothing, but some of us have been living a life you seem to think can't exist for decades at this point.
That sounds like a good argument for using package managers with automatic security updates. The user doesn't need to be an expert, just reboot when the system tells them to.
> (b) You have to either learn what a DLL is and learn how to update it ...
That's not at all what the process is for the ordinary person. The ordinary person sees a notification pop-up from "Ubuntu Software Center" that says 8 packages or whatever need to be updated, with one button that says "update everything now" or whatever and one that says "remind me later" or whatever.
It's up to you to choose a distro that applies the appropriate amount of rigor with regards to testing dynamic library updates. For bleeding-edge distros like Gentoo or Arch, it's not that much. Upstream publishes a release, and the Gentoo package maintainer chucks it into the testing branch. After 30 days, if no one complains, it gets marked stable. The user chooses a certain amount of risk (although it's been several years at least since ABI breakage has been an issue for me on Gentoo testing; note that security-critical updates are fast-tracked into stable). For other distros like RHEL and Debian stable, the package maintainer spends considerably more effort ensuring a random update of openssl-1.1.1i to openssl-1.1.1j doesn't break stuff. The user chooses a certain amount of stability at the expense of not having the latest version of whatever.
On the other hand, on my Windows computer at work, the process for updating is significantly more intrusive. Few OS updates can be applied without a reboot, Visual Studio updates do not permit me to continue working during an update, there are half a dozen auto-updaters, and some programs don't get updated unless I manually go to their website and check for an update. I don't even know when I last updated 7-zip.
Within the past week, there was a bug report that Python has an RCE with untrusted floats or something. On my linux systems, the package manager had an update within hours, and because there is only one Python installation on each of my linux machines, I know that all applications leveraging Python are now protected from that RCE. On my Windows work machine, I have not been notified that any of the applications I use which embed Python need to be updated. Presumably, this means my Windows machine is vulnerable.
You do not need manual version management if you have multiple versions of Python installed. By default, it will use whichever interpreter supported by the application is listed first in PYTHON_TARGETS. If you want, you can override that by calling the Python interpreter you want manually, e.g. `python3.8 <program name>` or `pypy <program name>`. And if python3.9 is the "default" interpreter (because it's listed first) but the application's ebuild only claims support for python3.8, running the application with no qualifiers will start the python3.8 interpreter.
Obviously there were many, many years when python2 and python3 needed to be installed side by side, and it's reasonably common for people to have multiple versions of python3 installed side by side. My VPS has both python3.9 and python3.8 installed side by side, for instance, because apparmor is slow to pick up python3.9 support.
In addition to the other replies noting that the way you wrote (b) is very unfair, I'd like to point out that (a) assumes the developer of the application is aware of the security update to their dependency and pushes a fix quickly.
For an OS like Linux, the thing about static linking is 100% spot-on.
However, for vetted systems, like app stores, this makes no difference, as the only dylibs allowed to be patched at runtime are the OS-supplied (ABI) ones. The whole app needs to be rebuilt, re-vetted, and re-released.
Frankly, I would be worried about dylibs, but for a different reason, as more than one hack has been done by hijacking dynamic links. I'm trying to remember how it went, but there was once a common practice for Mac programs to deliberately take advantage of this, in order to add functionality.
This is moronic in that it says everything is equally bad.
1. The linking strategy doesn't matter. Just rebuild everything. The reason Rust and Haskell don't do dynamic linking is pervasive inlining. This is required for many abstractions to work well, and asking optimization boundaries to coincide with software origin boundaries is stupid anyways.
The ABI argument is stupid because replacing all the dynamic libraries whose ABI changed is no worse than rebuilding everything because of static linking. Therefore, ignore that part and just think about inlining.
2. Lock files are fine as long as one can in fact bump the minor versions. Yes, we should make tools to enforce that minor versions are non-breaking, and it's crazy we didn't have those from day one. But until we do, lock files are fine.
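As a tiny illustration of what "bump the minor versions" means in practice, a version-number-only compatibility check looks something like this (real tooling would also have to diff the actual API surface, not just the number):

```python
# Tiny illustration: decide whether an upgrade is a "compatible" (minor/patch)
# bump under the usual semver convention. Real tooling would also need to
# verify the API surface, not just compare version numbers.

def is_compatible_bump(pinned, candidate):
    old = tuple(int(x) for x in pinned.split("."))   # assumes plain x.y.z versions
    new = tuple(int(x) for x in candidate.split("."))
    if new <= old:
        return False                  # not an upgrade at all
    if old[0] == 0:
        return new[:2] == old[:2]     # 0.x.y: only patch bumps count as compatible
    return new[0] == old[0]           # >=1.0.0: same major means compatible

assert is_compatible_bump("1.4.2", "1.5.0")
assert not is_compatible_bump("1.4.2", "2.0.0")
assert not is_compatible_bump("0.3.1", "0.4.0")
```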
Centralized repositories are not a good, general solution to the software distribution problem.
First: They don't have all the packages and versions the user could ever need, which means that you'll always have a mixture of software installed through the package manager and software installed through tarballs or curl | bash.
Second: They distort the relationship between application developer and application user.
Third: they neglect offline use cases.
The only sane, generalizable solution is for the OS to be a platform upon which bundles can be installed. The OS should just define a sane format for said bundles and the user gets their applications directly from the application developer.
If your package management is worth its salt then you don't care about static linking, you just update every package that links to that package. IBM's AIX had the best package management ever IMO: you could roll forward or back and things would just work. Completely deterministic. On the backend, whenever a package was updated they would regression test it and everything that depended on it to ensure the update didn't break anything; when a bug did get through they would fix it and add another regression test so it didn't happen again. All of that kinda broke when they adopted RPM as a secondary format, because the RPMs didn't make the same guarantees.
One of the best features of Golang is being able to cut a single statically linked Linux binary that works on Alpine, Red Hat, CentOS, etc. With Go you're also not linking to OpenSSL or Libxml, and those two packages are responsible for every emergency roll-out I've ever had to do.
Time and again, the end result of version pinning is that when developers are forced to update, they have to advance through multiple major version updates, and what should be a patch becomes a rewrite. I have to constantly deal with a bunch of web sites on jQuery 2 and the developers boo-hooing every day because jQuery 3.5 is completely different, and all I can tell them is that jQuery 2 is no longer maintained, so they need to stop using it for new projects and they need to update or retire their existing projects using it.
One of the things I liked about Golang was that it didn’t have any versions to pin so it avoided all of that nonsense altogether. However then they added “modules” and it’s getting harder to not use those, so that killed that magic. Though I did make sure the CI/CD pipelines I use do a go get -u so hopefully that will force the local Go devs to keep their code fresh.
> Static linking, dependency pinning and bundling are three bad practices that have serious impact on the time and effort needed to eliminate vulnerabilities from production systems.
I'm astonished how different this perspective is. As a developer I see that software is developed faster nowadays; size, library reuse and functionality are all increasing.
And distributions are just not able to keep up.
I feel like they never did; they just made exceptions for packages that were too important to ignore, like browsers or office suites.
Really, it's not the software. It's the distros.
Not using libraries is not an option; you don't want devs to write their own crypto.
Not pinning dependencies is bad: incompatibilities and security issues could make their way into the code. The test surface also gets bigger, and it raises the question of which combinations are supported.
Update: I think what I want to say is, distributions should accept that they can only provide that level of "stability" for a limited set of applications. The new and shiny stuff will always happen elsewhere.
> We try hard to unpin the dependencies and test packages with the newest versions of them. However, often we end up discovering that the newer versions of dependencies simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or even are actively hostile to us for not following their pins.
I don't get it. They unpin the versions and then are disappointed that the software does not work with untested versions? It would be nice if semantic versioning could always work but it does not. I know Elm, Rust and Nix have some solutions to propose for this problem.
Well, not pinning does not cause everything that uses the vulnerable dependency to magically rebuild; it just makes the builds less deterministic. The solution is not to blame pinning but to use tooling that informs you about vulnerable dependencies.
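That tooling largely exists already; at its core it's just "look up each pinned version in a vulnerability database". A minimal sketch against the public OSV API (https://osv.dev), assuming a pip-style requirements file with exact pins:

```python
#!/usr/bin/env python3
# Minimal sketch of such tooling: look up each exactly-pinned dependency from a
# pip-style requirements file in the public OSV database (https://osv.dev).
# The lockfile parsing here is deliberately simplistic.

import json
import sys
import urllib.request

OSV_QUERY = "https://api.osv.dev/v1/query"

def parse_requirements(path):
    """Yield (name, version) for lines like 'requests==2.25.1'."""
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if "==" in line:
                name, version = line.split("==", 1)
                yield name.strip(), version.strip()

def known_vulns(name, version):
    payload = json.dumps({
        "package": {"name": name, "ecosystem": "PyPI"},
        "version": version,
    }).encode()
    req = urllib.request.Request(OSV_QUERY, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

if __name__ == "__main__":
    for name, version in parse_requirements(sys.argv[1]):
        for vuln in known_vulns(name, version):
            print(f"{name}=={version}: {vuln['id']} {vuln.get('summary', '')}")
```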
One of the issues with unbundling everything into its own package really comes to a head when it comes to how Python is packaged on Debian based systems.
They split out core components that are built into Python so that they can distribute them separately, thereby breaking the default tooling that is supposed to ship with that Python version, in the name of "not having dev tools installed alongside non-dev".
Package maintainers have made life miserable for those of us who have to then help people through that mess so that they can use the software we have written and/or maintain. We have to write our quick-start guides to tell users how to get a proper working version of Python on their system, because the package maintainers deem their way of deploying Python the best way to deploy it.
"Your instructions are wrong, `python3 -mvenv venv` doesn't do anything, it just errors out saying I need to install something else, but when I install that package it still doesn't work".
Packagers that insist on splitting every dependency then put the onus back on the community to support it; they don't have to deal with it themselves.
> Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all that has another underlying problem — the combination of not being concerned with API stability on upstream part, and not wishing to unnecessarily update working code (that uses deprecated API) on downstream part. Truth is, pinning makes this worse because it sweeps the problem under the carpet, and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API. Hyrum’s Law in practice.
Exactly one of the problems with pinning your project dependencies, whatever language your project is in. It's better to unpin and continuously integrate upstream changes as early as possible: it's less work this way at the end of the year, and more secure.
My goodness, It’s as if [the bundling part of] this post speaks directly to me. I use Gentoo as my daily driver, and school forces us to use Zoom for classes. Like most other proprietary software vendors, Zoom chooses to bundle a great deal of shared objects along with their Linux binary. Gentoo however, chooses to let Zoom use the system libraries instead of those bundled. Through some sort of ABI incompatibility, Zoom’s use of my system’s libraries causes joining a class to get stuck on a “Connecting...” screen. It doesn’t happen every time, but it happens often enough to annoy my teachers, who end up having to let me in multiple times.
I don't really want to run Zoom's precompiled libraries, and the version numbers of the bundled objects match my system's one to one. But alas, Zoom probably modified the libraries to fit their own application's needs. I can only hope that a future update of Zoom magically solves my problems.
So, I don’t fault it.
However, if you don’t include dependencies and you don’t manage them, which would be the case in modern environments for the majority of users in the world, how is that safe?
Answer: it’s not.