Gentoo is secure, as far as you can be secure while building a bunch of code few have time to review, on hardware few have time to fully understand, running in an insecure world.
So, I don’t fault it.
However, if you don’t include dependencies and you don’t manage them, which would be the case in modern environments for the majority of users in the world, how is that safe?
Once Nix moves more to fixed output derivations this will be killer!
You can easily upgrade dependencies by having the linker resolve to different libraries without needing to rebuild the world while still retaining a sensible memory model.
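A minimal sketch of what that looks like in practice, using zlib as a stand-in for any shared dependency (the build line assumes a typical Linux toolchain with zlib installed):

    /* app.c - dynamically linked against the system zlib.
     * Build: cc app.c -lz -o app
     */
    #include <stdio.h>
    #include <zlib.h>

    int main(void) {
        /* zlibVersion() is resolved at load time from libz.so.1. If the
         * distro ships a patched zlib under the same soname, this binary
         * (and every other dynamically linked consumer) picks up the fix
         * on the next run, with no rebuild. A statically linked copy of
         * zlib would only be fixed by rebuilding and reshipping the app. */
        printf("linked against zlib %s\n", zlibVersion());
        return 0;
    }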
Correct me if I'm wrong (really) but isn't static linking mostly a problem for packagers? The less a package maintainer modifies the application the better IMO.
If applications A and B rely on dependency D, which turns out to have a vulnerability fixed in D', then _why_ do we think it is anyone but A and B's developers' responsibility to update to D' and distribute patched versions? If the packager tries to do it, the chances of diverging A and B due to some other incompatibility are too high.
Either you're running well-maintained software (OSS or commercial) and get the update in a timely manner, or you're running unmaintained software and don't get the update.
Only in the second case does it make sense to patch in a dynamically linked dependency. And I'm sure there are plenty of examples of this, but the real issue is running unmaintained software!
I am much happier keeping in close sync with timely releases from Go/Rust projects than I am with Debian et al.'s style of freezing the world.
The sad truth is that without package maintainers, too many people would be running horribly out-of-date and insecure software, as there is zero incentive to upgrade a working system.
I think if the software were labeled "insecure - requires update", people would update. If you don't know that something is insecure, then there is zero incentive.
The OP is talking about technical details like static linking and language ecosystems, but if you zoom one level out, this comment correctly pinpoints where the underlying problem actually lies: the distribution maintainers inject themselves into the development process of all the software they ship.
When A/B are not well maintained upstream, distributions can assist by updating their dependency D, but even when A/B are well maintained upstream, distributions still might monkey with them when D changes, even when that change to D actually breaks the software. To me, the root cause is that the distributions are effectively attempting to participate in the development of the software they ship, but do so by getting in the middle rather than by participating upstream.
As a former author of some well-maintained upstream software, I found their involvement made the software overall worse; but as a user of a distribution, I find their maintenance of otherwise unmaintained software sometimes helpful. In other words, I think the goal is admirable but the mechanism is wrong, and doing things like adding more dynamic linking only further enables the bad behavior.
I think it would help if distributions could snatch a piece of real estate in upstream software. Something like: "in every project, the debian/ root folder belongs to the Debian project and follows their rules". The packagers could then verify this folder and put their patches, build scripts, etc. there. This would help upstream communication a lot, I guess.
The problem is actually far worse than that. While the GP has what seems to be a reasonable solution, the issue is that you can't actually know what distribution is going to package your software, or what environment it is going to run in. There are code bases out there that still work decades later without any active maintenance, but if you need maintenance just to be able to build software in a new environment, something is wrong.
The underlying issue is that there are no standards for how to package software in a multi-language environment. If I want to go from a state where (require 'module-name) fails to a state where (require 'module-name) succeeds, there are a potentially infinite number of ways that could be accomplished; a single software project cannot ever specify all the possible ways of building software. What they can try to do is use uniform interfaces and standard patterns for building their software, in a way that delegates dependency management to an external system. It seems that it is hard for software developers to admit that other people know more about how and where their software will be running than they do.
Good engineering practice seems to dictate that dependency and environment management should be completely orthogonal to the development of an individual component or piece of functionality. The OP is three stories about what happens when the two are not kept orthogonal. The fundamental problem is that it is often easier for individual projects to make decisions that conflate the individual project with its dependencies (no longer orthogonal).
To my knowledge, there is not a universal or well-understood set of requirements that could be used to specify what a stable interface between an individual software project and its dependencies looks like. There are a number of candidates, such as Gentoo ebuilds, RPM spec files, etc.; however, I have not seen one that effectively accommodates all of them. Further, there are languages where the implementation (or even the design) makes it impossible to keep dependencies and individual projects orthogonal.
The end result of non-orthogonal systems is more work for everyone, more wasted CPU cycles, and worse security. Distros can't stop people from using languages that conflate the two, but they can tell them that they are on their own, and that the distros can't depend on components written in such languages in the core of the OS. To everyone pushing the "rewrite it in Rust" meme, this should be a wake-up call. The current design decisions in the language and limitations of the implementation make it less secure than C or C++, because swapping out dependencies is bottlenecked by the centralized primary development team, and maintainers and users can't take orthogonal action to fix an issue.
This organizational aspect could be outsourced to one dedicated organization. Not all distributions have to join. I think if 20% did, 80% of the problems would be solved.
>but as a user of a distribution I find their maintenance of otherwise unmaintained software sometimes helpful.
Sometimes. Often they break the software and leave the users high and dry, wondering why A or B doesn't work anymore. Literally my experience on Gentoo as a user for the last few years.
The distribution provides support for much longer than upstream does. Also, packaging is actual work. Just because some hipsters in some company consider it cool to release new features every 4 weeks does not mean that the volunteers of some Linux distribution can keep up with that. So if you want your distribution to work well, you should focus on workable and transparent interfaces.
If you are happy with the support cycles of your upstream and can live with a black box, on the other hand you don't need a distribution in the first place.
Besides, the whole "vendor everything, link statically" idea works well only for the leaves of the tree. Guess what Rust would do if LLVM were not a readily usable library but a bunch of C++ files used directly inside Clang? Thanks to the Rust mode of operation, it is impossible to do with the Rust compiler what they did with LLVM.
> on the other hand you don't need a distribution in the first place.
I think you're right about that, I personally want as little distribution as possible. FreeBSD ports or Arch AUR work well for me.
The idea of solidifying other people's software into a bundle and then maintaining it (with necessarily limited expertise) seems like a losing battle.
As someone who uses the AUR a lot, it's really the perfect example of how you just won't update software if it isn't done automatically by the distro package manager, particularly when you use -git packages.
> Correct me if I'm wrong (really) but isn't static linking mostly a problem for packagers? The less a package maintainer modifies the application the better IMO.
IMHO that's precisely why dependencies should be unpinned.
Let's say application A relies on dependencies B, C, and D, and dependencies B, C, and D depend on dependency E. Let's say dependency E has a critical security vulnerability and needs to be updated today to E'.
Let's say you have unpinned versions:
The packager for E updates E to E'. The end.
Let's say you have pinned versions:
The developer for B updates package B to depend on E', the developer for C updates C to depend on E', the developer for D updates D to depend on E', the developer for A updates A to depend on B', C', and D'. The packager for B updates the package for B, the packager for C updates the package for C, the packager for D updates the package for D, and the packager for A updates the package for A.
You'll notice that there's a timing issue here. The packager for C cannot move until the developer for C has done the work, the developer for A cannot move until the developer for B, C, and D have done their work, and the packager for A cannot do anything until everyone has completed their work. If, for instance, the developers for C all live in Texas and their power's been out for a few days, and when they get power back they're busy with other stuff for a while, it might take quite some time for C's developers to get an official package posted. But it's that important that A gets updated, because A is a network service with a port open to the internet and E is openssl or whatever. So now what?
In a perfect world, all software dependencies would have active, attentive, prompt maintainers, but it tends to not be that way. Lots of critical internet infrastructure packages have a maintainer who's just some random person in Nebraska, and they go on vacation, or lose interest, go to sleep at night, go to little league games on the weekend, some of them have day jobs. If we lived in a world where Apache can't be updated to use the latest dynamic library for openssl because the developer for leftpad is watching a movie and has their phone turned off, that's a very serious problem, and it's a crazy world I would not want to be a sysadmin in.
Certainly, maybe the packager for E is gonna be off this week, but a distro's packaging team tends to have a much easier time filling in for a maintainer who's away if the package is loosely coupled with the application's build process. 90% of the time, if a package in Gentoo requires an update, all you need to do is `mv foo-1.2.3.ebuild foo-1.2.4.ebuild`, `repoman manifest` and git commit+push. (I can't speak for other distros.)
The system isn't perfect, but IMHO it's much more robust to the unfortunate realities of the ugly, soft underbelly of the world than static linking is.
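To make the diamond from the scenario above concrete, here is a rough sketch in C terms (all names are made up for illustration): with E as a shared library, the fix is one drop-in file; with each consumer pinning or embedding its own copy of E, the fix has to ripple through every layer first.

    /* libE: the common dependency with the vulnerability (declared only;
     * in the shared case it lives in libE.so.1). */
    int e_parse(const char *input);

    /* B, C and D each call into E... */
    int b_do_thing(const char *s) { return e_parse(s); }
    int c_do_thing(const char *s) { return e_parse(s); }
    int d_do_thing(const char *s) { return e_parse(s); }

    /* ...and application A calls B, C and D. */
    int a_handle_request(const char *s) {
        return b_do_thing(s) + c_do_thing(s) + d_do_thing(s);
    }

    /* Unpinned/shared: shipping the fixed E' as libE.so.1 patches A, B, C
     * and D at once. Pinned/static: B, C and D each carry their own copy
     * of E's code, so each must release against E' before A and its
     * package can be rebuilt - the cascade described above. */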
Making sure that there are multiple people with the knowledge and access to produce new releases of E that incorporate security fixes in a timely way is definitely good. But I'm not convinced that distro maintainers are a good answer to that problem; the distro landscape is very fragmented, and distro maintainers are often not very closely involved with the packages they're notionally maintaining or aware of best practices / pitfalls that apply to that ecosystem. I suspect that something along the lines of the rust platform efforts might have a better chance of pushing out releases of all reverse-dependencies of some package that had a security flaw in a timely way, with minimal risk of breakages.
>> Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes.
Or because we don't want accidental or malicious security vulnerabilities to get automatically incorporated into the software.
This stuff works both ways. You don't automatically incorporate fixes, but you don't automatically incorporate new problems either.
The vast, vast majority of updates fix security issues. It's like not vaccinating in case you're the one in a million who has an allergic reaction. Supply chain attacks are rare, not the norm. We hear about such things (and only rarely at that) because they're exceptional enough to make the news.
Which means extra maintenance work: checking, for every piece of software that anyone uses, whether it uses another library it needs to be recompiled against and, if that fails, working out how to use the new version.
If there are automatic updates, at least it either works and is more secure, or it breaks automatically and unsafe software stops working.
Whether you prefer people to use MSIE6 because "it just works" or whether you prefer old sites that only worked with MSIE6 to break because it's no longer maintained, that's the trade-off you have to choose between.
As a security person, I'm obviously biased, I can only advise what I see from a professional perspective. All I was saying above is that automatic updates being considered a security risk is on the same scale of odds as considering vaccines dangerous -- in regular cases, that is: of course the advice is different if you're a special (sensitive) organisation or a special (immunocompromised) person.
Nah, I don't buy it. If it's "just" bug fixes (for which I might have implemented a hack that now depends on the bug), I prefer nightly builds with the latest (and re-pinned) dependencies available. Releases are just a re-tag after extra QA.
Fantastic article. I now have something to point people to when they ask "what is wrong with pinning?" or "what is wrong with static linking?" or "why can't you just use pip?" Michal has had to deal with some pretty crazy stuff the last couple of months ... scratch that, years.
The recent attempts to empower developers to distribute their own software means that there is now the potential for there to be as many bad security practices as there are pieces of software, because systematic security design that used to be managed by distributions has been pushed down to individual software projects, which have only a tiny view of the issues, and thus repeatedly make locally convenient decisions that are disastrous for everyone else. What do you mean someone is using a project other than ours that shares a dependency? Why would they do that?
One other thing to keep in mind is that from a security standpoint the approach that Nix takes is not good. Nix can deal with pinned dependencies on a per-package basis, but if those pinned versions are insecure and the developers don't have a process for keeping dependencies up to date, then users are sitting ducks.
Unfortunately pinning is a symptom of at least three underlying issues. The first is that developers do not properly assess the costs of adding a dependency to a project both in terms of complexity, and in terms of maintenance burden. The second is that many of the dependent libraries make breaking changes without providing space for the old api and the new api to coexist at the same time for a certain period so that developers can transition over (I have been bitten by this with werkzeug and pint). Mutual exclusion of versions due to runtime failure is a nasty problem, and pinning hides that issue. Finally it seems that at least some developers are engaged in the Frog and Toad are Cofounders continuous integration story, but with a twist, rather than deleting failing tests, they pin packages so that they don't have to see the accumulating cost of pulling in additional dependencies. Externalities, externalities everywhere.
IMO, pip is great, but it has exactly two use cases where it's warranted. One is where you need to be your own maintainer, e.g. you've developed a private web server to run your website, and you need to manage its dependency versions precisely. The other is for development purposes: it's really great to be able to use different versions of Python or test against different libraries than your distribution ships.
It's not (or shouldn't be) for shipping software to end users. That's the problem being complained about in the essay. This particular way of handling dependencies (use system libraries by default, but allow the user / developer to create an entire virtualized Python installation if they want one) is actually why I think Python handles this better than most other languages, including older ones.
I agree. The context was missing from my original, which is, "why can't you just use pip to install dependencies in production," which is effectively the answer that Michal once got from a PyPA maintainer when asking about setup.py install.
> I now have something to point people to when they ask "what is wrong with pinning?" or "what is wrong with static linking?" or "why can't you just use pip?" Michal has had to deal with some pretty crazy stuff the last couple of months ... scratch that, years.
Literally the only argument in this article against static linking is "it will take an extra couple hours for the distribution to recompile all affected packages and then require the user to download a larger update, meaning time to fix for a security issue will be negligibly longer"... since you seem to believe this article represents the strongest statement of your argument, this has actually moved me further away from where you want me to be ;P. (FWIW, the best argument I could make for dynamic linking involves memory efficiency for shared pages, the disk cache, and maybe the i-cache, not security.)
This I doubt. On a statically-linked application, all calls to functions are going through a regular function call instruction. When you dynamically link, every call to a function that might cross library boundaries (which on ELF systems defaults to every call that's not to a static function) instead calls into the PLT, which will itself dispatch through to the underlying function call.
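A rough illustration of the indirection being described; exactly how it shows up depends on the compiler, flags and platform:

    /* plt_demo.c - build as an ordinary dynamically linked executable:
     *   cc plt_demo.c -o plt_demo && objdump -d plt_demo | less
     */
    #include <stdio.h>

    /* File-local (static) function: the compiler emits a direct call,
     * or inlines it entirely - no PLT involved. */
    static int local_helper(int x) {
        return x * 2;
    }

    int main(void) {
        /* puts() lives in libc.so, so the call is typically emitted as
         * `call puts@plt`: a hop through the procedure linkage table,
         * which the dynamic linker points at the real puts. */
        puts("hello");
        return local_helper(21);
    }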
> It's worth noting that the Swift devs disagree with the Rust and C++ codegen orthodoxy in one major way: they care much more about code sizes (as in the amount of executable code produced). More specifically, they care a lot more about making efficient usage of the cpu's instruction cache, because they believe it's better for system-wide power usage. Apple championing this concern makes a lot of sense, given their suite of battery-powered devices.
It sounds like pinning dependencies is just done because we developers are a lazy bunch that just want to guard against the rare case of a breaking upstream change. I was burned too often in the past by breaking changes, or more often behavior changes, in transitive dependencies not to put up some defense. But I still don't pin versions in the main dependency files (Cargo.toml, Gemfile, or similar). I just have the generated lockfile and put that under version control, so all developers on the team get the same versions of direct and transitive dependencies. I define the dependencies with version ranges. Every package manager works differently here. I like Cargo's the best, since a statement like 'version = "1.2.1"' means give me a version of at least '1.2.1' but lower than '2.0.0'. One should never add lockfiles for library projects, as they need to be open for future updates. I also add Dependabot to all repos and let the bot inform me about updates.
No really, neither users nor developers care, nor should they. I've been using Linux on the desktop for well over a decade and I'm tired of seeing this plea for everything to behave exactly like C or scripting languages because every distro wants to be its own special snowflake and it would be too hard to adapt.
The world has changed and distros have to stop pretending it hasn't. Flatpak, Nix: those are better models that reflect our reality, where developers don't want to worry about ten thousand different distros with wildly different downstream configurations producing different kinds of bugs, and where users just want to get the darned app.
If you're worried about security you must always, first, work with upstream. Upstream should be the party responsible for its users, not you. Upstream not fast enough and you want to go the extra mile? Well then your packaging tool-belt should have support for patching a series of libraries in an automated fashion and rebuilding the dependency tree; and make sure that you return to upstream as soon as possible.
If you want your downstream to diverge from upstream because you want to act as a barrier as the old distros do, then you'll have to accept the fact that you're maintaining a fork and not pretend to upstream that you're distributing their work or that your bug reports are compatible. Otherwise, again, just limit yourself to distributing the latest upstream stable release with the bare minimum patching necessary for getting things to work with the distro's chosen config.
With some luck, after the Linux community comes to terms with the fact that the distro model must shift, we can begin to finally share some packaging efforts between all distros and we can leave much of the incompatibility bullshit in the past.
Imagine one day everyone just building from shared Nix derivations? Very ironically, it would look a lot like Portage and Gentoo's USE flags but with everyone building off those generic derivations and offering binary caches for them.
It's hard to read this and not feel like a little bit of it is some maintainers worried about losing control. When I get on IRC and say the word "pip" they certainly send me those vibes.
> I'm tired of seeing this plea for everything to behave exactly like C or scripting languages because every distro wants to be its own special snowflake and it would be too hard to adapt.
The same argument can be used the other way around: "I'm tired of seeing this plea for everything to behave exactly like Docker or static .exe because every software wants to be its own special snowflake and it would be too hard to adapt to software distributions."
Over the last several weeks I was working on an essay about this exact problem, including the connection between static linking and bundling. This one is so well done that I probably won't even publish it.
But I'll add this, for people who may not immediately see why this is important. I think that the real danger these new technologies represent is not inherently bad technology, but the possibility of ecosystem damage. The distribution / maintainer / package manager approach has proven to be an extremely reliable way to get trustworthy software. Many of us love it and want to see it stick around. And it's been possible because "upstream" developers in the open source ecosystem have been willing (or forced) to work with distributions to include their software. But this seems to be changing. A highlight from my essay:
'Many software projects are not good citizens in the open source ecosystem. They are working with a model of development and distribution that does not mesh well with how open source software is predominantly produced. ... These [new] languages are now developing their own ecosystems, with their own expectations for how software should be created and distributed, how dependencies should be handled, and what a "good piece of software" looks like. Regardless, an increasing amount of software is being built in these ecosystems, including open source software.
There might come a day in which open source is fractured. Two different communities create two very different kinds of software, which run on the same systems, but are created and distributed in very different ways. I begin to worry that the community I care about might not survive. Even if my ecosystem continues on its way, I don't want this split to take place. I think being part of the open source ecosystem is good for software, and I think having all the software you could want available within that ecosystem is good for users. If anything, this essay is a call to those who agree to be more careful about the software they create. Make sure it's something that Debian maintainers could be proud to ship. If you're working on a Rust program, for example, ask questions like "how can I make this program as easy for maintainers to distribute as possible?" If you have the ability to work on projects like Rust or Go, do what you can to give their applications the ability to easily split dependencies and support system provided libraries. Let's try to make sure the software ecosystem we love is around for the next generation.'
> The distribution / maintainer / package manager approach has proven to be an extremely reliable way to get trustworthy software. Many of us love it and want to see it stick around.
I disagree, it's proven to be inadequate for modern software development and that's why these new languages/ecosystems are springing up. The least reliable way to package and distribute software is by relying on traditional package managers.
> Do what you can to give their applications the ability to easily split dependencies and support system provided libraries
This is unrealistic. I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
> how can I make this program as easy for maintainers to distribute as possible
By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
The universe I'd like to live in is one where the only use cases for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS. My dream package manager would whitelist those system libraries and forbid distribution of any bundle that does contain the shared objects with the symbols it needs.
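For what it's worth, the kind of launcher described above is usually just a thin wrapper around the dynamic linker's environment knobs. A minimal sketch; the paths and library names are made up:

    /* launcher.c - hypothetical bundle launcher: prefer the libraries
     * shipped inside the bundle over whatever the host system provides,
     * then exec the real binary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        (void)argc;
        /* Search the bundle's own lib directory first... */
        setenv("LD_LIBRARY_PATH", "/opt/myapp/lib", 1);
        /* ...and interpose specific symbols ahead of the system's copies. */
        setenv("LD_PRELOAD", "/opt/myapp/lib/libcompat.so", 1);
        execv("/opt/myapp/bin/myapp-real", argv);
        perror("execv"); /* only reached if the exec failed */
        return 127;
    }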
> I disagree, it's proven to be inadequate for modern software development
Well, that's exactly why the OP (and my essay) are "anti" modern software development in many ways. The view is that we're moving away from the traditional open source ecosystem and methods of software development with these new technologies, which (to be clear) are good technologies, but were created mostly to solve problems that some large corporations have, not to solve the problems that the open source ecosystem has.
> The least reliable way to package and distribute software is by relying on traditional package managers.
Not sure what you mean by this, but it's entirely untrue in my experience. Anything I install with a package manager just works, 100% of the time. Stuff I try to get any other way is a shitshow, and the lack of "quality control" provided by maintainers speaks for itself. I mean, just look at Android apps or the Chrome extension store. Heaven forbid we go back to the days of curl | bash off someone's website.
> By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
I know you know this, but just to be clear, that's not a solution to the problem of "making things easier for maintainers to distribute", that's cutting maintainers out of the loop. The whole point of my focus on ecosystems is that this is something that I, as a user, don't want to happen.
The open source solutions are primarily for C/C++. A bit for Perl, a bit for Python. But they haven't really moved on. Java has been tacked on since forever. Same for .NET, JavaScript, whatever.
And if you wanted your program propagated to all major distros you'd have to wait a decade.
Nobody has time for that. Not corporations, not mom and pop stores, and I doubt many hobbyists.
That's not the problem. (*) The actual problem is that the distro maintainers want split packages (for security and so on), not vendored, and this requirement was already burdensome for many languages other than C/C++. If vendored packages were acceptable I believe people would have of course contributed them. Maintainers made this obstacle themselves (for a good cause, arguably) and it seems farfetched for them to then complain other languages are uncooperative.
* Was "isn't that a problem?", which was not what I meant to say.
> If you want to make your program popular then packaging is part of the process needed.
That is demonstrably not true. I'm not making a value judgment (maybe it SHOULD be true, I dunno), but it's not true. There's a lot of popular unpackaged software out there.
That actually clarifies some things about the debate for me. Thanks.
I'm not sure I'm convinced based on the semi-frequent posts from maintainers and security pros about the issues with vendoring dependencies for software that is widely deployed.
This "better way", since we lack a more concrete name. Seems to be really great if you're running a web app, or server software in your own company and can rebuild and run a rolling deploy pretty easily. For someone pushing software to users all over the world, and as one of those users the downside to allow every application to be responsible for updating this stuff seems pretty steep.
I'd disagree that the problems of large organizations are different from the problems of the FOSS ecosystem. Organizations just have a financial incentive to fix them; the FOSS ecosystem does not. If mutually incompatible dependencies and security updates breaking software weren't problems for both corporate and FOSS ecosystems, these new technologies wouldn't have needed to exist. They'd just use the existing platforms.
And mind you, this is not a corporate/open source split. The burgeoning ecosystems are also full of FOSS technologies doing new and exciting things, they just don't break when a dependency updates!
>Anything I install with a package manager just works, 100% of the time
I run into issues with packages weekly. So much so I've spent engineer days purging references to system packages. It's universal too - yum, apt, pacman, brew, macports, I have to make sure nothing tries to reference packages installed outside a local working directory for an application because of mutual incompatibilities. Maybe it's because I'm trying to write software that runs on multiple targets and not use software where someone else has already spent the time and money to resolve these issues.
> I know you know this, but just to be clear, that's not a solution to the problem of "making things easier for maintainers to distribute", that's cutting maintainers out of the loop. The whole point of my focus on ecosystems is that this is something that I, as a user, don't want to happen.
They should be cut out of the loop. Maintainers don't have a right to dictate what design decisions I put into my applications because they don't think it adds value (the value is: it doesn't just run on their distro!). Another comment in this thread put it better: maintainers shouldn't place themselves in the development process.
Which is exactly what developers have done, and what this guy is complaining about: developers have figured out a different approach to maintenance that works better.
I don't pretend to have an answer but I'm trying to listen to both sides so maybe I can formulate "the question" a little bit better and improve the discussion around this.
It seems like we have posts every few months where security professionals, maintainers, and sysadmins explain that the "developer usability first" approach to maintenance and dependency management is having massive consequences for our ability to keep systems secure, and even to know whether an affected version of a library is on a system.
That is a different problem that doesn't seem to be addressed at all in the current iteration of these tools. If you're aware of efforts to solve that problem in the Rust and Go ecosystems (since those are the ones cited here), I'd love to read about it.
Knowing whether a vulnerable version is somewhere in your dependency tree, and making sure it gets fixed, is absolutely being done and being made part of CI etc. (I don't know about Rust/Go specifically, but the JVM ecosystem is the subject of similar complaints and we're absolutely doing vulnerable dependency scanning, as well as things like "edge builds" where we bump every transitive dependency to the latest version and see if anything breaks). Nowadays GitHub itself will give you an alert without you even needing to do anything.
Frankly, most distribution maintainers seem to not know or care about how upstream software is built; they have an idea about what's "best practice" in the handful of languages they're using to build their distribution (which is mostly, like, C and Perl) and insist that they know best, without realising the rest of the world has passed them by.
I disagree. This is how you get the Google Play store or the "freeware" app marketplace. It sucks. As a user, I'm quite happy to continue using a traditional distribution even if it means I don't get to use a handful of flashy programs by developers that disagree with the concept of maintainers. So far that choice has been much more positive than negative for me, and I'm doing my best (by promoting the open source ecosystem) to keep it that way.
If you put in a bunch of crap that doesn't belong in an application (ads, for example), I'm glad I have a maintainer that can either strip this out thanks to the GPL (or BSD / MIT etc), or else choose not to include your app in the distribution at all.
This only works if "core" refuses to use Rust, so it's not sustainable. We already can't build Firefox and GNOME without Rust. Maybe Apache and curl next. And then?
> As a user, I'm quite happy to continue using a traditional distribution even if it means I don't get to use a handful of flashy programs by developers that disagree with the concept of maintainers.
As a user, I'm afraid that you don't seem to be representative of typical users. I would be happy to use a traditional distribution if it didn't break on updates, which is still not the case. It's clear that traditional distros have not been satisfactory even at keeping their own promises. I know they work hard, but ultimately the visible outcome says everything.
Distributions are not breaking software, they are just distributing broken software. Blame upstream for lack of testing. Or create your own distribution of flawless software and keep it up to date and flawless for the rest of your life.
Oh, sure. Software is broken; you can't fix that. Traditional distros just happen to (falsely) believe that they can somehow fix it at their level. What we need instead is a distro that is resilient to software breakage, not flawless software. Does my hope look that unreasonable to you?
Clearly you missed the disclaimer of warranty in the licensing terms.
Packagers are welcome to maintain their own patches, or their own fork, if they like. But they don't have any right to tell upstream what to do or demand particular guarantees from upstream.
It's temporary, I'm sure. When broken software or a weaponized patch slips through the weak fence of maintainers and breaks critical infrastructure, Congress will vote for something to stop that.
I’ve been on Debian since Potato, so I totally see what you’re saying. But...
> that's cutting maintainers out of the loop
Is this necessarily a bad thing? The market has seen the need to fill a hole, and it seems to be working.
I first started with Slackware, and dependency nightmares are what got me into Debian in the first place. Although Debian is nice because of its slow and stable base (which makes me happy for production), I've recently moved to Arch and have been so happy, as it's brought back Slackware's idea of getting as close to upstream as possible and it handles dependencies! And to be honest, I'm loving it. As an added bonus, I'm getting more and more surprised at how many of the packages I've installed are Rust apps.
So, coming back to your comment:
> that's cutting maintainers out of the loop
With systems like Arch that get us closer and closer to upstream, are maintainers the unnecessary middlemen? Of course they’re not entirely redundant, but maybe a new model of distros like Arch will be more commonplace in the future
Thoughtful comment, thanks. I'm an Arch user as well and agree broadly with its approach to a desktop operating system. (I.e. stick as closely to upstream as possible.)
That said, I disagree that this means maintainers are unnecessary middlemen, even though their role on a distribution like Debian is obviously more prominent. The essay I linked to in my top level comment is actually by an Arch maintainer, explaining why they still see maintainers as playing an important role. http://kmkeen.com/maintainers-matter/
>With systems like Arch that get us closer and closer to upstream, are maintainers the unnecessary middlemen? Of course they’re not entirely redundant, but maybe a new model of distros like Arch will be more commonplace in the future
Arch is an old distribution, very much in the same class as Fedora, Debian, Gentoo and all the traditional ones.
What makes you think Arch makes maintainers even slightly redundant?
We still deal with security issues. We still need to figure out which Go software uses a library with a CVE, with no tooling to help (go list and grep go a long way). And we still need to deal with pinned dependencies in upstream projects.
That is nonsense and you seem to spread this misinformation in a lot of places. You should also add a disclaimer that you are part of the Arch team.
AUR: Anyone can create an account and upload PKGBUILDs. There are no checks at all. AUR users should verify whether PKGBUILDs are not malicious. In practice, a lot of people use things like yaourt to install packages from the AUR without verifying the PKGBUILDs.
nixpkgs: anyone can contribute a PR with a new package, package update, or package modification. However, changes only get added to nixpkgs after someone with commit privileges verifies the PR and merges it. Also, a common misconception is that nixpkgs package maintainers can merge changes. This is false, only a much smaller set of committers can merge changes in the actual nixpkgs repository.
nixpkgs is more like the Arch Community repository, where committers are long-time contributors with a track record of high-quality contributions. Parts of nixpkgs are like Arch Core/Extra, because they are marked using the GitHub codeowners mechanism and changes are generally not merged unless approved through the code owners.
Disclaimer: I am a nixpkgs committer, former Arch user and AUR contributor.
>That is nonsense and you seem to spread this misinformation in a lot of places. You should also add a disclaimer that you are part of the Arch team.
The AUR comment is unfair, uncalled for, and adds nothing to the conversation. I apologize and hope our previous conversations have been more productive. It was meant more tongue-in-cheek than as some grand claim about the quality of nixpkgs, and stems mostly from the frustration of the entire vendoring issue.
I don't see your point. In fact, you'll most likely find that distros that follow upstream more closely than the "slow and stable releases" model will get their patches as soon as upstream fixes them.
This assumes an ideal upstream: This is not always the case.
If someone publishes a CVE for a Go or Rust library, it's not always the case that the project is well maintained or that the dev cares to update the dependency. Even if they did, there are no guarantees the upstream will decide to publish a minor release just to update dependencies. Because that is what vendoring dependencies gets you.
Instead of applying one patch to a shared library I'd need to hunt down all upstreams utilizing the library and manually patch between 10-140 packages independently and submit them upstream.
If upstream is not keeping their software up to date, why are you using that software? Imagine if Google stopped updating Chrome. Would you keep using Chrome?
Traditionally, package maintainers have kept dependencies up to date. This has changed with vendored dependencies, where upstreams have to do the work. They are not always up for that work. This isn't strange or weird, and the comparison to Chrome and Google doesn't make much sense.
The problem is that the modern practices are being adopted to meet the changing needs of this code, which weren't an issue in the enthusiast linux and BSD communities, but are an issue when open source code is being put in mission critical professionally managed production environments on AWS, or into cars and appliances.
The userland is much more complex now, so freezing dependencies and bundling them in order to have a smaller number of test cases may not have been necessary when the environment was smaller and simpler. The reliability expectations were lower, and there weren't big money professional support contracts which required you to validate your application against a set of well defined environments. All that has changed.
Why are you comparing distro package managers to the Play store or the Chrome extension store? You should be comparing them to npm, pypi, etc. That's clearly the context of this discussion. No one is saying that the Chrome extension store does a better job than package managers. The language ecosystems do an amazingly better job.
> The universe I'd like to live in is one where the only use cases for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS. My dream package manager would whitelist those system libraries and forbid distribution of any bundle that does contain the shared objects with the symbols it needs.
This idea seems to be predicated on a belief that OS vendor APIs and cryptographic libraries are the only attack surface for serious user-affecting software exploits.
They're obviously not, but they're the ones that package managers can help mitigate automatically. The rest is going to be up to the developers to patch.
Some of the things you mention are not incompatible with package managers. Have you considered Nix?
I actually agree with the parent post in terms of strongly preferring package managers to other means of distributing software. I've always found Linux much easier to admin than other OSes simply because of package managers.
I was really pumped after playing with Nix for a bit, but got bitten twice in the first day. I tried to run my work project using Nix. It needs Python's python-prctl library - that doesn't work in Nix. So that's a dud. Next I tried to use it as a Nim environment - Nim is unable to compile anything in Nix due to some missing (bundled) glibc symbol. (Nim works nicely in my standard installation.) The cool "export full container" thing mentioned in the tutorial failed for all cases, even the simplest. So I am kinda disillusioned by Nix.
> I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
I agree. About half the issues I've had with dependencies have been due to distributions fiddling with upstream for some reason or another. That's probably the main reason I like Arch (which has a policy of just following vanilla upstream, though they aren't immune: 'python' being Python 3 is probably the biggest pain point).
>I disagree, it's proven to be inadequate for modern software development
You're not disagreeing with the post you responded to; you're just stating a different priority.
hctaw said the distro/maintainer/pm approach is extremely reliable at producing trustworthy software. That means trustworthy to the user. It says nothing at all about how hard producing that software is for the developer.
You are saying that the distro/maintainer/pm approach makes it harder for the developer. That's true. But it doesn't contradict the above at all.
And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that. If instead I have umpteen zillion different statically linked applications installed, each of which packages all of its own dependencies, then instead of just relying on my distro to push security fixes to shared libraries that everyone uses, I have to rely on every single one of those developers to do it for their own packages. And most of them won't do it, or they'll do it when they get around to it instead of when I, the user, need it.
> The universe I'd like to live in is where the only use case for dynamic linking are OS vendor APIs and cryptographically secure functions like TLS
This won't work either, because those are certainly not the only places where security flaws can happen that I, the user, need a fix for ASAP.
> And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that. If instead I have umpteen zillion different statically linked applications installed, each of which packages all of its own dependencies, then instead of just relying on my distro to push security fixes to shared libraries that everyone uses, I have to rely on every single one of those developers to do it for their own packages. And most of them won't do it, or they'll do it when they get around to it instead of when I, the user, need it.
This is my main worry as well. We are currently in the early, easy period of this new development paradigm, where developers are constantly releasing new code and fixes are easy to deploy. I worry that in 10 or 15 years, when a security bug is found in a critical imported function, the developers aren't going to be around anymore to fix it; they will have moved on to the next new, hot language, and will have as much interest in maintaining their old Go/Rust code as developers today do in maintaining their old C code.
As a user, my response is simple: I don't use software that's built that way. Outside of code that I write myself, I simply refuse to use software that's not accompanied by a distribution and maintenance infrastructure that I trust. For most software, that means it's packaged by my distro. Some big players might be able to convince me to take their software from them directly, but they will be very few, because there are very few big players that are as reliable as my distro, and that's the standard they have to meet.
> Outside of code that I write myself, I simply refuse to use software that's not accompanied by a distribution and maintenance infrastructure that I trust. For most software, that means it's packaged by my distro.
Okay, that's a reasonable approach, but distros are essentially saying that, for new software, they can't continue to maintain that standard; they're just taking software as it comes, without securing it. I think you will find that there are lots of small utility programs and libraries that you won't have available to you in 10 years with this approach. YMMV.
> I think you will find that there are lots of small utility programs and libraries that you won't have available to you in 10 years with this approach.
Then I'll either find an alternate source that has enough reliability to satisfy me, or write them myself, or do without.
(Or I'll end up building what amounts to my own distro. Which is something I have indeed thought of doing, because my preferences are rather idiosyncratic.)
> And anyway, as a user, I don't care. I want my software to work, to be stable, and to not have security flaws; and if a security flaw is found, I want a fix to be pushed to me ASAP. The distro/maintainer/pm approach does that.
Not my experience at all. The distro maintainer generally takes significantly longer to push out a fix than the upstream developer.
> The distro maintainer generally takes significantly longer to push out a fix than the upstream developer.
Of course this is true in a sense, because the distro maintainer has to wait for upstream to push a fix before they can package it.
However, for the upstream developer, "pushing a fix" means "pushing updated source code". For the distro maintainer, "pushing a fix" means "compiling the updated source code and packaging the resulting binaries for all supported versions".
There are some upstream developers who could probably accomplish the latter at least as fast as distros do, but not many. But the latter is what I, as a user, need.
> it's proven to be inadequate for modern software development
Sure, because it's not meant for software development. It's meant for users to run software.
Software development packaging/distribution is pretty bad. Most languages do it their own way and never seem to learn the lessons of previous ones. And on top of that, they encourage poor habits, like writing your own module that's the same as somebody else's with one new function, rather than contributing to the existing module or writing an extension for it.
> I do not trust system provided libraries to function with my applications because I've been burned so many times in the past.
Probably this is because both you + the library developer are not coordinating with the distributions on how you release your code. Distros get a lot of flack for breaking changes, but they're working from software released by developers, and rely entirely on their own user base to test changes. If developers cared about their software working they'd be more involved in its packaging & distribution.
> The least reliable way to package and distribute software is by relying on traditional package managers.
Checks and balances produce reliability. Having a packaging process and people distinct from the code author validating and enforcing it, most certainly produces much more reliable, secure and stable packages and distributions.
Sure, it's slower. Slower is actually a feature in this use case.
Having anyone push out their code to the world without any constraints or care, breaking compatibility day to day, is certainly faster and easier. But reliable? Certainly not. This is the culture that regularly produces things like the leftpad debacle and assorted malicious packages.
It really doesn't. I've worked on upstream projects where a significant fraction of all bugs reported were created by distributions screwing up packaging and patching in ways they weren't at all qualified to understand and frequently led to non-obvious failures.
When we tried to work with them to fix this, about half the time they flamed us and quoted distro 'policy' as a reason not to fix their bugs, so we just refused to accept bug reports from anyone using those packages anymore.
The fact is that a lot of old-school Linux distributions are built by people who have only a very vague understanding of the software they're packaging, and frequently are closer to the sysadmin side of things than the large-scale software development side. It makes the relationships very frustrating and that's why proprietary software vendors invariably opt-out of distro packaging. Even with apps statically linked to the max possible level Linux users generate disproportionate levels of support tickets due to the general flakiness of the distros they use, so allowing them to modify tested software even further is a losing proposition.
Basically the whole concept of a Linux distribution is obsolete, fading away and irretrievably broken. Hence the proliferation of containers.
> By statically linking everything as much as possible and shipping everything else in a self contained bundle with a launcher that overrides any symbols that might inadvertently be pulled in from the system.
So, you propose to ship your own OS, as a single image, for your application? What stops you from doing this? Drivers? Then ship your own hardware, like smartphone vendors do.
> Two different communities create two very different kinds of software, which run on the same systems, but are created and distributed in very different ways.
This is where the disconnect is coming from. The distro maintainers are coming from a world of multi-user systems where backwards compatibility and updating deps without disturbing a user's workload / forcing them to recompile is paramount.
Go (and a fair amount of Rust/Python work) comes from the land of CI/CD and, to a lesser extent, monorepos. When you are rebuilding the world above a bare minimum of the OS literally on every commit (or at least several times per day), it's easier to reason about the code that is running if you can look at the commit a binary was built from and know exactly what's inside (including all deps).
I agree. I think the difference has been that until recently "the land of CI/CD" and so on has been certain segments of the corporate world, and not how typical open source developers did things*. So when the former developed new technologies and new languages, they created build tools for them that anticipated being used in the ways that they usually produce software.
The "problem", in the sense that it's a problem, is that these languages and related technologies are all pretty good! And so it's understandable that many developers who would traditionally be in the open source ecosystem want to use them. As a result they end up creating software that can't easily be shipped in traditional distributions. Ecosystem fragmentation is the unavoidable result.
* By typical open source developers, I mean the sort of developers (and their development practices) that produced most of the software on my computer. I don't mean Firefox: Mozilla and Google have much more standard corporate development practices despite both producing quite a bit of open source software.
Although continuous integration started in proprietary software, it's been present in Free Software for at least two decades. Netscape may well be the second or third medium-large software outfit to do continuous integration the way it's done today (we know Microsoft had a team doing this by hand every single day for Windows NT, but that's completely insane), because some of its team had experienced this approach elsewhere and knew they needed it if they wanted to ship software that actually works. When Mozilla was created, Tinderbox (that system), along with the Mozilla browser (and so today Firefox) and Bugzilla (a bug tracker), were freed.
I know it probably seems like last week, but that was more than twenty years ago.
> The "problem", in the sense that it's a problem, is that these languages and related technologies are all pretty good!
Yes; and frankly the development ecosystem for making software for Linux & friends on top of apt etc. is terrible - at least from the perspective of a modern professional software engineer. The assumption is C, and C programs have no package manager - so of course dependencies get bundled / vendored sometimes, when the alternative is linking to potentially out-of-date dependencies in apt. Autoconf/automake is awful to learn and understand. CMake is better - but it's horrendously complicated, because it tries to solve the impossible job of paving over all the junky custom compilation scripts that came before. (And it still has no cargo equivalent for actually fetching your deps.)
And then, to work around all of that, each distribution will make weird, custom, maybe buggy patches to your software before adding it to their package managers. (Which has caused some high-profile bugs and security issues a number of times.) Now when there's a bug, nobody knows whose fault it is!
This worked in a world when there wasn't much software, when releases were rare and when most programs only had one or two dependencies. None of these properties are true any more.
Rust, Go, Python and Node.js don't fit well with Linux's package managers. The obvious alternative would be putting every crate, gem, pip package and npm package into apt, rpm and all the rest, and keeping them up to date with every version. But let's be real - that would be horrible. Apt et al. aren't (currently) up to the task. (Can you imagine every npm package needing a maintainer in apt alone? I can just imagine the GitHub issues: "I'm on Debian stable and this transitive dep you're using only has version 0.1 available, from 6 years ago. What do I do?". Yikes.)
I'm sympathetic to the argument that modern million dependency software development has its own problems; but right now it (sadly) has no competition in terms of ergonomics and build reliability.
> This worked in a world when there wasn't much software, when releases were rare and when most programs only had one or two dependencies. None of these properties are true any more.
I don't think this is even the problem.
It's that upstream maintainers have stopped worrying about compatibility.
Once upon a time you would have regular minor releases of some package. 3.0.2, 3.0.3, 3.0.4, but they were all backwards compatible. If you had version 3.0.4 and some software that was built against 3.0.2, it still worked against 3.0.4 because the only difference was that things were added or compatibly improved, not removed or incompatibly changed.
Version 3.0.x wasn't compatible with version 2.9.x, but then the package maintainer for the distribution only has to package versions 3.0.4 and 2.9.16, i.e. suitably recent minor versions of each compatibility revision. Compatibility revisions so old that nobody relevant uses them anymore can be ignored, so they only had to package two or three incompatible versions which together are compatible with everything in active use.
The problem today is that everything is a compatibility-breaking change, so there are dozens of releases from this year alone that are all mutually incompatible and would have to be packaged separately. And that doesn't scale.
So, you deliberately chose a distro intended for users, with low version churn, instead of a distro intended for developers, e.g. Fedora, which sometimes even ships pre-release versions, and now you blame ... apt? Just curious, what are you using for coding? MS Word or Excel? For example, Linus uses Fedora/MATE/Emacs.
It's relatively easy to convert packages between different packagers/distros. Automatic converters exist for deb/rpm/pip/cpan/ctan/cargo, so it's easy to convert all existing packages into one packaging system and drop all of them into a huge mono-repo.
Yes, it's much easier to throw a new version into the repo in many non-Linux repositories. It's the equivalent of rolling distros, such as Arch, or the development version of a distro, such as Rawhide in Fedora or Sid in Debian. However, it's also much easier to: break the world and make the news, distribute keyloggers and steal passwords and keys, forget to backport security fixes to users of older versions, pivot into a completely orthogonal thing, etc. No code reviews means no responsibility.
If you want your software to work everywhere, you either need to take responsibility for making it build everywhere (yes, including old versions of Debian which don’t support the versions of your dependencies you need) or you punt that work to someone else - in which case your software simply won’t work on lots of computers. From the perspective of an upstream maintainer, the status quo is pretty awful.
There’s a reason people are turning to docker - because the “portable executable” idea on Linux is so often broken by weird incompatibilities between libc versions, or by some important dependency being missing from or broken on some users’ systems. Automatic package translation suffers the exact same problem - a dynamically linked binary you build on your computer often won’t run on my computer.
If you want your software to work everywhere but want to avoid a compilation step, then use Perl/Python/PHP with bindings to Qt/GTK.
Linux has perfect backward compatibility; I'm still able to compile and run a 70-year-old app, developed on a completely different OS and processor.
IMHO, you think that your app/lib binary will work flawlessly on all combinations of OS/processor without recompilation, which is not true. Nobody promises that. It's by design.
I disagree that rust, go or python inherently do not work with linux package managers; instead there's a culture within subsections of those communities that does not value stability (I'm not sure about the nodejs community though), which causes these disagreements. C isn't immune to this (and never has been)—scientific codebases are infamous for their lack of care around stability (and IMHO one of the reasons for scientific python's success over alternatives has been this stability).
I'd also disagree with the idea that the alternative is reliable—try building and modifying a project that hasn't been touched in 6 months, and see how reliable that is. With your tree of dependencies (absent specific projects which abstract over a set of unstable dependencies and hence implicitly stabilise them, e.g. SDL), the stability of your project (whether it is a library, application or framework) is set by the least stable dependency you have. Increasing your dependencies increases the risk, but if you have an ecosystem which values stability, larger dependency trees should not see a significant increase in risk (personally, I'd love to see the scientific rust ecosystem achieve similar stability to python's).
That's not to say linux distros are perfect (change can be slow ;)), but there are lots of little things they do get right (how many projects handle updating configuration files correctly—that kind of thing is built into distro tooling), and they enable ecosystem-wide changes more than the alternative does (e.g. https://reproducible-builds.org/).
This makes me wonder if there's a Linux base system suitable for servers that embraces the newer approach. That is, a minimal base system that's built for immutable container images, providing just what's needed to bootstrap the current generation of language-specific build and package systems. The Alpine Linux Docker images might be a good choice for now, but IIUC, Alpine Linux itself still embraces the older distro approach.
Maybe the Nix package manager / NixOS is what you're looking for? I think it takes the best features from both worlds.
Every package installed with Nix is isolated into content-addressable* directories, so for example, my install of Firefox is located at /nix/store/c7pmng2x05dkigpbhnjs8fdzd8kk31np-firefox-85.0.2/bin/firefox. This is pretty inconvenient to use directly, so Nix generates a profile that symlinks all your packages into one place (eg. /run/current-system/sw, ~/.nix-profile), and then environment variables like PATH can just include <PROFILE_DIR>/bin.
With this approach, I can have multiple versions of the same package installed simultaneously, without them conflicting with each other. Like in a traditional distro, any dependencies that are shared between packages aren't duplicated, but if a package needs to explicitly depend on a different version, it can.
Also, because Nix is designed as a functional package manager for building packages from source (even though it has a binary cache), you can trace back exactly what sources were used to build your package and its dependencies, all the way back to the bootstrap binaries used to build any self-hosting compilers (gcc, rust, openjdk, ...)
* Most packages use a hash that's generated from the inputs used to build it, rather than the output that's generated.
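To make the layout concrete, here is a toy Python sketch of the idea described above: derive a directory name from a hash of the build inputs, install into that per-package prefix, and expose the result through a profile of symlinks. This is only an illustration under my own assumptions, not Nix's actual implementation; the paths and helper names are made up.

    import hashlib, os

    STORE = "/tmp/toy-store"          # stand-in for /nix/store
    PROFILE = "/tmp/toy-profile/bin"  # stand-in for a profile's bin directory

    def store_path(name, version, inputs):
        """Derive an input-addressed directory name (roughly the idea behind the store hash)."""
        digest = hashlib.sha256(repr((name, version, sorted(inputs))).encode()).hexdigest()[:32]
        return os.path.join(STORE, f"{digest}-{name}-{version}")

    def install(name, version, inputs):
        """'Build' a fake package into its own store directory and link it into the profile."""
        prefix = store_path(name, version, inputs)
        os.makedirs(os.path.join(prefix, "bin"), exist_ok=True)
        binary = os.path.join(prefix, "bin", name)
        with open(binary, "w") as f:                 # placeholder for a real build step
            f.write(f"#!/bin/sh\necho {name} {version}\n")
        os.chmod(binary, 0o755)
        os.makedirs(PROFILE, exist_ok=True)
        link = os.path.join(PROFILE, name)
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(binary, link)                     # PATH only needs to contain PROFILE
        return prefix

    # Two versions coexist because they hash to different store paths;
    # the profile link simply points at whichever one is currently selected.
    print(install("hello", "2.10", inputs=["gcc-10"]))
    print(install("hello", "2.12", inputs=["gcc-12"]))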
That's https://cr.yp.to/slashpackage.html which is dated at least 2001. It's not quite the same as the Nix approach though, which also tracks all direct and transitive dependencies.
> That is, a minimal base system that's built for immutable container images, providing just what's needed to bootstrap the current generation of language-specific build and package systems.
At least for the first half, that sounds sort of like Fedora/Red Hat CoreOS[0], the predecessor CoreOS fork Flatcar Container Linux[1], or the Amazon distribution BottleRocket[2].
Fun thing about BottleRocket relevant to this thread: it uses Cargo as its build system. It's really wild and very interesting and more people should know about it, IMHO.
> it's easier to reason about code that is running if you can look at the commit that a bin was built from and know exactly what's inside (including all deps).
Believe me, it's usually the opposite.
Lack of proper releases, testing, and versioning results in unending checkout-fu to figure out what commits for each of 20 libraries will work for each other.
The idea is plainly stupid, without any redeeming qualities.
The entirety of this cargo cult hinges on the fact that if the people who have been calling it a genius invention for the last 8-10 years admitted it isn't one, they would incur a major loss of reputation and credibility.
> Lack of proper releases, testing, and versioning results in unending checkout-fu to figure out what commits for each of 20 libraries will work for each other.
Admittedly I'm basing this on my experience in a very large monorepo environment, but there's no figuring out which commits will work with each other. Every commit with every library will work, otherwise it doesn't get committed.
Yes, this involves massive CI infra and tooling to aid in refactoring.
You want to make a breaking change to a lib? Great, it's on you to update every piece of code that calls it.
It's great when you can control every piece of your infra, but I totally get how it's unfeasible (and maintainer hell) for the distro community.
Distros are great for off-the-shelf software, especially if you don't care too much about which version you get. When versions matter, you quickly get into dependency hell. So long-lived software tends to stabilize, and then remain unchanged.
K8s, Go and even Ruby tend to change and evolve. It's usually a bad idea to pull such software from distros, even when it's available.
The means to get software is simply too different, and it's a non-problem for everyone but completist distro maintainers.
Are you actually talking about static linking here, or just venting your spleen about sloppiness in software engineering practice more generally?
Because tracking dependencies in source control (specifically, checking in lock files) is tremendous for reproducibility. It means that bisecting through the commit history to find when a problem began is not just bisecting through the local source code, but also through the specific versions of every dependency.
So regardless of whether the issue you're investigating is in the package, a dependency, or an unexpected interaction between the two, you're able to find the first commit that introduced the issue in O(log(commits)) time, rather than needing O(commits * (num dependencies * dependency versions)) time.
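As a sketch of why this works, here is a hypothetical bisection loop in Python: because each commit carries its own lockfile, checking out a commit also pins every dependency, so one binary search covers both the code and the dependency versions. The commands and file names (requirements.lock, the test path) are placeholders, not anyone's actual tooling.

    import subprocess

    def run(*cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    def is_bad(commit):
        """Check out a commit, install the exact deps recorded in its lockfile, run the test."""
        run("git", "checkout", "--quiet", commit)
        run("pip", "install", "--quiet", "-r", "requirements.lock")  # hypothetical lockfile name
        return run("pytest", "tests/test_regression.py").returncode != 0

    def first_bad(commits):
        """commits is ordered oldest-to-newest; assumes the oldest is good and the newest is bad."""
        lo, hi = 0, len(commits) - 1        # O(log(commits)) probes, dependencies included
        while lo < hi:
            mid = (lo + hi) // 2
            if is_bad(commits[mid]):
                hi = mid
            else:
                lo = mid + 1
        return commits[lo]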
> you're able to find the first commit that introduced the issue in O(log(commits)) time, rather than needing O(commits * (num dependencies * dependency versions)) time.
Try the oldest and newest compatible version of your code and the oldest and newest compatible version of each dependency. This is O(num dependencies), which is generally small N and you can often guess which to try first. Now do each version of the code or dependency that actually made a difference. O(log(N)) on the versions of that code. You're at O(num dependencies) + O(log(N)), not O(commits * (num dependencies * dependency versions)).
Also, doing it the other way can often lead to frustration. The actual bug is introduced in commit #1234 but is timing dependent, then a dependency is upgraded from version 2 to version 3 in commit #5678 which tickles the bug, e.g. the new dependency version moved around some cache lines. Now you're looking in entirely the wrong place at an innocent dependency.
Whereas if you notice that the bug exists with dependency version 3 and the latest commit and then do binary search on all the commits while holding the dependency at version 3, plausibly the bug shows up between commit #1233 and #1234.
While version 3 of the dependency is innocent, commit 5678 is not. Something went wrong in the interaction between the code and its dependencies in that change and discovering that change quickly is valuable.
From there you can start stepping through the code, or simplifying the situation to get a minimal repro of what's going on, or even bisecting with dependency version 3 held constant to see if that's diagnostic.
In my experience, with a bug that's so timing sensitive that some cache lines moving around will trigger it, you're likely to discover pretty quickly that something weird is happening as you try to get a more minimal repro.
Meanwhile, if you're not tracking dependency versions, such a test is likely to appear weirdly flaky. You're trying to track down this failure but something else comes up and you have to set it aside for a few days. When you come back to it, you can no longer get the test to fail because an untracked dependency change has moved cache lines around again. Which change? Which dependency? Since it's not in source control, you've got a painful process of guessing plausible combinations of recent versions of all transitive dependencies.
> While version 3 of the dependency is innocent, commit 5678 is not. Something went wrong in the interaction between the code and its dependencies in that change and discovering that change quickly is valuable.
The trouble is that this will tend to point the finger at large changes that jostle many things around at once and become a rabbit hole rather than the two line commit with a typo that actually caused the problem.
The main advantage you're putting forth is to know the versions of each dependency needed to reproduce the problem. But you can get that from the person reporting the bug. You can add a switch to your software to output the versions of every dependency it's using and then it's there in the bug report. And once you have a combination that can reproduce the bug, the process of experimenting with things to identify the cause is basically the same either way.
Additionally, when you have languages with rich library ecosystems, the OS kind of becomes irrelevant: the platform is the language ecosystem.
Just to pick Go as an example (so as not to get lost discussing VMs and such), it doesn't matter if I am targeting bare metal, Linux, Windows, IBM z/OS, AWS special cloud runtime, whatever.
As long as the Go code is the same, and someone has done the low level runtime support, it is a compile away and done.
Finally by pushing containers no matter what, the Linux community has made this even easier.
What if there's another software in another language you want to interoperate with? What if you want to avoid containers with their complexity and dubious security record?
Fragmentation is fundamental to Open Source. Fragmentation brings mutation, and mutation brings evolution. Fragmentation is a virtue of open source. Not a vice. The world simply doesn't value security above convenience, as I intuit you may.
I think the natural tendency of all organic evolution is to take the path of least resistance.
It will not be effective to ask developers to do more work. That is going against nature. The packaging ecosystem has to compete and win against its competitors. That is the true way of Open Source, of organic systems.
It is likely that the packaging system you are advocating for is more difficult than the competitors you mentioned. Reducing that friction is perhaps a more effective place to apply your focus. What are these competing ecosystems doing that makes them more attractive? Why is your ecosystem losing users? How do Flathub, Snapcraft and NixOS fit into this view? How do the Mac App Store and the Windows Store fit into it? What is the one true way to package an application?
The distro model did not work, which is why the other model took over.
I used to try to religiously follow the recommendation only to install Python packages which had been repackaged by Debian. Fine, but it meant that you couldn't use any even slightly obscure package, nor one that was younger than some timelag, which ranged from a few months to a couple of years.
Inevitably, you want to pip install something. Then the repercussions of mixing Debian packages and pip packages are a whole new set of problems. And you can't get anyone to look at your problem, even if it's a common issue which the Debian packager could fix or workaround, because 'pip is not supported, you should install this via `apt install python-foo`'.
The best solution is and was to only use pip packages, along with some form of isolation from the wider system, whether virtual environments, containers or what. Python now has extremely good native tools to work this way, and so do most modern languages. I only work like this now, and so, it appears, do all the maintainers of my dependencies and my transitive dependencies. Development cycles are much faster, and it all just works.
Would it be feasible for Rust to ship "re-link" scripts? That is, when a cargo dependency gets updated, the distribution can check whether a package's code needs to be re-linked? Similarly, when distributed code (say libssl) changes, the Rust packages can be re-linked by the distribution on-site?
This article doesn't make a good case against static linking, and the author doesn't seem to understand what vendoring is either:
> Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it.
No, vendoring means including a copy of the source code of dependencies in your repo. You can bundle dependencies without vendoring them.
The only argument presented against static linking is that when a library is updated rebuilding dependants takes longer (and people will have to download bigger updates but I doubt many people care about that).
That may be true, but is it really that big of an issue? I somewhat doubt it (for sane distros that don't make users build everything from source themselves anyway).
The author clearly doesn't like how people do development these days but hasn't stopped to think why they do it like that.
> The only argument presented against static linking is that when a library is updated rebuilding dependants takes longer (and people will have to download bigger updates but I doubt many people care about that).
Not really; the argument is that instead of rebuilding one library you now have to rebuild hundreds of applications to get the benefits of whatever the library update was for... assuming upstream even cares enough to bump the version number for the who-knows-how-many libs they are vendoring, because they have other stuff to do which is way more important.
Right but that can be done automatically for static linking just as easily as it can for dynamic linking. So the only difference is time and downloads.
I think you're conflating static linking and bundling and vendoring a bit, like the author is. They're all different things.
Here is the solution to this Debian/Gentoo/<other true free software distribution> packager's dilemma. Packagers realize that their numbers are small and they can't keep up fixing all the modern big software (BS) projects that go against their philosophy.
They define the core of the OS that they can keep managing in this traditional way (kernel, basic userland, basic desktop, basic server services, basic libraries, security support for all that).
"But people want Firefox, Rust, Kubernetes, modern OS has to provide them".
No it doesn't. Let the new complicated juggernaut software be deployed and configured by the users, according to developers' instructions, using whatever modern mechanism they prefer, containers/chroots/light-weight VMs, whatever. Packagers can then forget about the nasty big software and focus on quality of the core OS. Users will have all the newest versions of BS they need from the developers directly.
There is no reason to keep fast-pace, ever-changing and very complicated programs/projects in the main OS distribution. It only leads to lots of OS-specific packaging work with mediocre and obsolete results, and that leads to discontent users, developers and packagers.
Exactly. When you want to install one of those fast-paced pieces of software on your system, you're often dissatisfied with the old version coming with your OS anyway.
I see no problem with bumping the versions of a few packages, rebuilding them, and installing them via the system package manager. Are you a developer or a user? Of course, some software may not work after that, but the package manager never gets in your way, unless you have no idea how to use it.
I'm a software developer and a user of package managers. A package manager that allows me to only have one version of a software package installed will inevitably get in my way if I need two programs that need different, incompatible versions of that dependency. It happens quite a lot in my experience.
Basically, you want to put two versions of the same file(s) into one place.
If you need two versions of almost the same set of files, then choose a different name for the package, e.g. package-2; choose a different base directory for the package files, e.g. /usr/share/package-2; and choose different names for the binaries, e.g. /usr/bin/binary-2. It's not magic. Just look at existing examples, e.g. python2 and python3.
You can create your own repository, where you will be the maintainer of your packages.
You can use it locally, as directory on disk, or you can put it on a server and share with others, or you can compile it using openSUSE Build Service, which supports OpenSuSE, SLE, Fedora, RedHat, CentOS, Debian, Arch, etc.: https://en.opensuse.org/openSUSE:Build_Service_supported_bui...
I think that there is a good reason to keep those projects in the main distribution: it makes them more robust and prevents the build systems and bootstrapping process from becoming a massive pile of technical debt.
For example, in order to get Clojure 1.10 running on Gentoo I had to dig through the git history to figure out a point where spec-alpha, core-specs-alpha and Clojure itself did not form a circular dependency in Maven. Because Clojure was not being packaged in a variety of ways, they are now at risk of making Maven Central a hard dependency, which makes the whole project less robust.
Do you mean "robust for people using Gentoo" or "robust for everybody"? Isn't the latter responsibility of the Clojure developers rather than packagers?
Robust for everyone. In theory it is the responsibility of the Clojure devs, but if there isn't a packaging workflow that they are aware of, how can they be expected to integrate with it and make design and development decisions that it would reveal to them? The point is more that, the greater variety of different environments a piece of software runs in the more likely it is that systematic weaknesses will be revealed, sort of like chaos engineering, except for whole environments instead of sporadic failures.
Yes, packaging for different distributions can reveal unknown problems or enhancement possibilities in the project; however, I would be surprised if developers in general were interested at all in supporting that activity. Maybe Clojure's are, I don't know.
Packaging is beneficial mainly to users. I expect developers to choose a single platform and support that; if there are more, great. But most developers aren't interested in the massive distraction that accommodating 520+ distributions would be.
You're making a distinction between "packagers" and "users" which does not exist. Packagers are advanced users that take the initiative to improve their distro when they find software they want to use and that isn't integrated in their distro.
> Let the new complicated juggernaut software be deployed and configured by the users, according to developers' instructions, using whatever modern mechanism they prefer
I'm not sure what "they" refers to here (developers or users). The existence of packagers is proof that some subset of users "prefer" that software "be deployed and configured" via distro package managers.
> The existence of packagers is proof that some subset of users "prefer" that software "be deployed and configured" via distro package managers.
Not necessarily. It might be inertia - back when linux distributions started being organised this way, those language dependency management mechanisms pretty much didn't exist.
CPAN has existed since 1993 and has been online since 1995. CPAN has a package manager which is able to install Perl software along with its dependencies. It's not a new, unknown technology.
If you are sure that you can create a functional, stable, bug-free, hole-free, up-to-date distro full of useful software with 10 independent package managers instead of one, then just do it. We will enjoy it. You will spend about 10x more time than the current maintainers do, but your time is free, so that's not a problem. Of course, we will say a huge THANK YOU (with nine zeroes) for your incredible effort, except for some haters, who will blame your perfect distro on HN for no reason.
Debian in particular put a fair bit of effort into integrating apt deeply with CPAN so that you could install a package from there that depended on system libraries and vice versa, and then for subsequent languages they... didn't. You'd think these were well-known technologies, but as far as linux distribution maintainers are concerned they're new and scary.
> If you are sure that you can create functional, stable, bug free, hole free, up to date, full of useful software distro with 10 independent package managers instead of one, then just do it.
I'd rather leave the distribution model behind entirely. And you know what? I do, and it works great. You just get the occasional complaint like this article, but it doesn't actually matter in the real world.
I meant ordinary users should rely on developers' defaults. If the user is advanced enough, he will be able to go his own way.
Yes I prefer the package manager too, when it works and gives me what I want. But for the software that changes fast, like new languages / compilers, or juggernaut software like Kubernetes, packagers can't keep up and I do not expect them to.
I actually wish this is feasible, but it isn't. GNOME is basic desktop, GNOME includes librsvg, librsvg build-depends on Rust. So you need to package Rust (hence LLVM) to have GNOME. Your proposal only works if "core OS" refuses to use Rust.
I think it's likely that there will be "system Rust" which is used to compile libraries used by the OS, but this is independent of the toolchain and crates that most developers use for their own development.
In a binary distribution, you don't necessarily need to install the system toolchain, unless you want to work on the system.
Yeah, I get that. GNOME is a horribly big piece of software. I would not count it as a "basic desktop", because it is so big and buggy and its dependencies are out of control. By basic desktop I mean something like Xfce, with modular components and minimal dependencies, so the desktop system is simple and unixy. You would still have the option to install GNOME on such a system, if only the GNOME developers made a canonical version of GNOME that was modular and would install on any clean Linux system with Xorg or Wayland.
It's the pragmatic thing to do, and it's de facto what happens on Linux distros too. Distros offer pip/docker/cargo, but when some libraries/programs are popular enough, someone will inevitably package them and offer them in a 3rd-party repo or the main distro repo. Examples of this would be numpy, ripgrep etc.
This will be a somewhat intemperate response, because as a developer of a significant library I found this quite irritating.
If you publish a Python library without pinned dependencies, your code is broken. It happens to work today, but there will come a day when the artifact you have published no longer works. It's only a matter of time. The command the user had run before, like "pip install spacy==2.3.5" will no longer work. The user will have to then go to significant trouble to find the set of versions that worked at the time.
In short unpinned dependencies mean hopeless bit-rot. It guarantees that your system is a fleeting thing; that you will be unable to today publish an end-to-end set of commands that will work in 2025. This is completely intolerable for practical engineering. In order to fix bugs you may need to go back to prior states of a system and check behaviours. If you can't ever go back and load up a previous version, you'll get into some extremely difficult problems.
Of course the people who are doing the work to actually develop these programs refuse to agree to this. No we will not fucking unpin our dependencies. Yes we will tell you to get lost if you ask us to. If you try to do it yourself, I guess we can't stop you, but no we won't volunteer our help.
It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. You might have a change that alters the time-complexity for niche inputs, making some call time-out that used to succeed. You might introduce a new default keyword argument that throws off a *kwargs. If you call these things "breaking" changes, you will constantly be increasing the major version. But if you increase the major version every release, what's the point of semver! You're not actually conveying any information about whether the changes are "breaking".
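As a tiny illustration of the *kwargs point above (hypothetical library and function names, nothing to do with spaCy's actual API): a "non-breaking" minor release that merely adds a default keyword argument can silently change the behaviour of a caller that was passing options through **kwargs.

    # libtext 1.0 (hypothetical): unknown keyword options ride along in **kwargs
    def tokenize(text, lowercase=False, **kwargs):
        # in the imagined library, kwargs would be forwarded to plugins; here they are unused
        return text.lower().split() if lowercase else text.split()

    # application code written against 1.0: "max_len" is meant for a downstream plugin
    print(tokenize("Hello World", max_len=5))        # ['Hello', 'World']

    # libtext 1.1: an "additive", supposedly non-breaking minor release
    def tokenize(text, lowercase=False, max_len=None, **kwargs):
        words = text.lower().split() if lowercase else text.split()
        return words if max_len is None else words[:max_len]

    # the very same call now has its output truncated, even though nothing
    # was removed or renamed in the public API
    print(tokenize("one two three four five six", max_len=5))   # 5 tokens, not 6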
I don't have a particularly strong viewpoint on this, but I find it noteworthy that in your example the user themselves is asking for a specific version of the software. You don't seem to be intending for users to ask for simply the latest version and have that work, but a specific one, and you want that specific version to work exactly as it did whenever it was published.
I can see some instances in which this expectation is important, and others where it is likely not or else certainly less important than the security implications.
For the extremes, I see research using spaCy has a very strong interest in reproducibility and the impact of any security issues would likely be minimal on the whole simply due to the relatively few people likely to run into them.
On the other extreme, say some low-level dependency is somehow so compromised that simply running the code will end up with the user ransomware'd, after just long enough a delay that this whole scenario is marginally plausible. Then say spaCy gets incorporated into some other project that goes up the chain a ways and ultimately ends up in LibreOffice. If all of these projects have pinned dependencies, there is now no way to quickly or reasonably create a safe LibreOffice update. It would require a rather large number of people to sequentially update their dependencies and publish the new version, so that the next project up the chain can do the same. LibreOffice would remain compromised or at best unavailable until the whole chain finished, or else somebody found a way to remove the offending dependency without breaking LibreOffice.
I'm not sure how to best reconcile these two competing interests. I think it seems clear that both are important. Even more than that, a particular library might sit on both extremes simultaneously depending on how it is used.
The only solution - though a totally unrealistic and terrible one - that comes to mind is to write all code such that all dependencies can be removed without additional work and all dependent features would be automatically disabled. With a standardized listing of these feature-dependency pairs you could even develop more fine-grained workarounds for removal of any feature from any dependency.
The sheer scale of possible configurations this would create is utterly horrifying.
At any rate, your utter rejection of the article's point seems excessively extreme and even ultimately user-hostile. I can understand your point of view, particularly given the library you develop; however, I think you should probably give some more thought to indirect users, i.e. users of programs that (perhaps ultimately) use spaCy. I don't know that it makes sense to practically change how you do anything, but I don't think the other viewpoint is as utterly wrongheaded as you seem to think.
> I'm not sure how to best reconcile these two competing interests.
What would help a lot is if the requirements were specified outside of the actual artifact, as metadata. Then the requirements metadata could be updated separately.
Libraries pinning dependencies only fixes a narrow portion of the problem and introduces a bunch of others (particularly in ecosystems where only a single version of a package can exist in a dependency tree). It is attractive mainly because it makes life slightly easier for the library developers. However, if every library pinned deps, it becomes much harder to use multiple libraries together: suppose an app used libraries A and B, and A depends on X==1.2.3, while B depends on X==1.2.4. It’s then pushed on to every downstream developer to work out the right resolution of each conflict, rather than upstream libraries having accurate constraints.
Pinning dependencies in applications/binaries/end-products is clearly the right choice, but it’s much fuzzier for libraries.
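A minimal sketch of that conflict using the packaging library (a real PyPI package; the libraries A, B and X themselves are hypothetical): if A pins X==1.2.3 and B pins X==1.2.4, the combined constraint is unsatisfiable, even though either library would very likely work with either version.

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    a_requires = SpecifierSet("==1.2.3")   # library A's pin on X
    b_requires = SpecifierSet("==1.2.4")   # library B's pin on X
    combined = a_requires & b_requires     # what an app depending on both A and B must satisfy

    for candidate in ["1.2.3", "1.2.4"]:
        print(candidate, Version(candidate) in combined)   # both print False: no version fits

    # With accurate, non-pinned constraints the conflict disappears:
    relaxed = SpecifierSet(">=1.2,<2") & SpecifierSet(">=1.2.3,<2")
    print(Version("1.2.4") in relaxed)     # True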
Can you give an example of a real ecosystem that can't handle such a conflict? In my actual experience, the package manager will either automatically use the latest version, or in one case has more complex rules but still picks a version on its own (but I stay away from that one due to the surprise factor). Your argument has force against bad package managers and against using very strict dependency requirements, but not against pinning dependencies sensibly in a good ecosystem.
The only conflict I've seen that can't be automatically resolved is when I had some internal dependencies with a common dependency, and one depended on the git repo of the common dep (the "version" being the sha hash of a commit), and another depended on a pinned version of the common dep. Obviously there's no good way to auto-resolve that conflict, so you should generally stick with versions for library deps and not git shas.
I think you're really under-rating how important it is to be able to do something like "pip install 'requests==1.0.5'" or whatever, in order to reconstruct the past state of a project. If requests hasn't pinned its dependencies, that command will simply not work. The only way you'll be able to install that version of requests is to manually go back and piece together the whole dependency snapshot at that point in time.
There's pretty much no point in setuptools automatically installing library dependencies for you if you expect the library dependencies to be unpinned. In fact it would be actively harmful --- it just leads people to rely on a workflow that works today but will break tomorrow.
You're asking for an ecosystem where there's no easy way to go back and install a particular version of a particular library. That's not better than having version conflicts.
The other thing I'd note is that it's quite an understatement to say that pinning dependencies makes life "slightly easier" for library developers. We're not going to accept builds just breaking overnight, and libraries that depend on us aren't going to accept us breaking their builds either.
Sure, it sucks that unpinned dependencies lose historical context as the deps move forward, and I’ve personally suffered this in my own library maintenance work... but there’s still the fundamental issue of conflicting pinned versions if there’s multiple libraries.
(At the app level, the right approach to “going back in time” is for those apps to pin all their deps, with a lockfile or ‘pip freeze’, not just top level ones. That is, one records the deps of requests==1.0.5 in addition to requests itself.)
If you publish a Python library with pinned dependencies, your code is broken as soon as someone tries to use it with another Python library with pinned dependencies, unless you happened to pin exactly the same version of the dependencies you have in common.
Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries. There are tools like Pipenv and Poetry to make that easy.
This is less of an issue in (say) Node.js, where you can have multiple different versions of a library installed in different branches of the dependency tree. (Though Node.js also has a strong semver culture that almost always works well enough that pinning exact versions isn’t necessary.)
The most frustrating thing is that pip doesn't make it easy to use more loose declared dependencies while freezing to actual concrete dependencies for deployment. Everybody rolls their own.
> Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries.
Is the pypi package awscli an application or a library?
poetry is frustrating in that it doesn't allow you to override a library's declared requirements to break conflicts. They refuse to add support [1][2] for the feature too. awscli for example causes huge package conflict issues that make poetry unusable. It's almost impossible not to run into a requirement conflict with awscli if you're using a broad set of packages, even though awscli will operate happily with a more broad set of requirements than it declares.
For this purpose, I’m defining a “library” as any PyPI package that you expect to be able to install alongside other PyPI packages. This includes some counterintuitive ones like mypy, which needs to extract types from packages in the same environment as the code it’s checking.
The awscli documentation recommends installing it into its own virtualenv, in which case pinned dependencies may be reasonable. There are tools like pipx to automate that.
Though in practice, there are reasons that installing applications into their own virtualenv might be inconvenient, inefficient, or impossible. And even when it’s possible, it still comes with the risk of missing security updates unless upstream is doing a really good job of staying on top of them.
I don’t think that respecting declared dependency bounds is a Poetry bug. Pip respects them too (at least as of 20.3, which enables the new resolver by default: https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip...). If a package declares unhelpful bounds, the package should be fixed. (And yes, that means its maintainer might have to deal with some extra issues being filed—that’s part of the job.)
> Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries.
This is essentially what we do where I work. When we make a tagged release, we create a new virtual environment, run a pip install, run all the tests and then run pip freeze. The output of pip freeze is what we use for the install_requires parameter in the setup method in setup.py.
That said, a library could certainly update its old releases with a patch release and specify a <= requirement on a particular dependency when versions newer than that no longer work. It would be a bit of work, though, since indirect dependencies would also have to be accounted for.
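For what it's worth, a minimal sketch of that workflow in a hypothetical setup.py: the frozen output is written to a file at release time and read back as install_requires. The file and package names here are made up, and whether a library should do this at all is exactly what's being debated in this thread.

    # setup.py -- pins taken from "pip freeze > requirements.frozen.txt" at release time
    from pathlib import Path
    from setuptools import setup, find_packages

    frozen = Path(__file__).with_name("requirements.frozen.txt")
    pins = [
        line.strip()
        for line in frozen.read_text().splitlines()
        if line.strip() and not line.startswith("#")
    ]

    setup(
        name="example-lib",            # hypothetical package
        version="1.4.2",
        packages=find_packages(),
        install_requires=pins,         # e.g. ["requests==2.25.1", "urllib3==1.26.3", ...]
    )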
> It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. ... If you call these things "breaking" changes, you will constantly be increasing the major version.
One of the things that prompted the OP was this breakage in Python's cryptography package [1] (OP actually opened this issue) due to the introduction of a Rust dependency in a 0.0.x release. The dependency change didn't change the public API at all, but did still cause plenty of issues downstream. It's a great question on the topic of semver to think about how to handle major dependency changes that aren't API changes. Personally, I would have preferred a new major release, but that's exactly your point syllogism — it's a matter of opinion.
As a sidenote, Alex Gaynor, one of the cryptography package maintainers is on a memory-safe language crusade. Interesting to see how that crusade runs into conflict with the anti-static linking crusade that distro packagers are on. I find both goals admirable from a security perspective. This stuff is hard.
It's hard because underneath is a battle of who bears the maintenance and testing costs that no one wants to bear.
Asking a publisher to qualify their library against a big range of versions just means that they need to do a lot more testing and support. Obviously they want to validate their code against one version, not 20, and they certainly don't want an open-ended ">" version constraint, which would force them to re-validate each time a minor dep is released.
Similarly when publishers say I will only work against version X, this puts a bigger burden on the user to configure their dependencies and figure out which version they can use. They would like to push that work onto vendors.
What's a bit depressing is that these economic concerns are not raised openly as the primary subject matter; the discussion is always veiled in terms of engineering best practices. You're not gonna engineer your way out of paying some cost. Just agree on who bears the cost and how you will compensate them for it, and then the engineering concerns become much easier.
> In short unpinned dependencies mean hopeless bit-rot.
No, this is not true, for the simple reason that there will _always_ be unpinned dependencies (e.g. your compiler, your hardware, your processor) and thus _those_ are the ones that will guarantee bitrot.
Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Compilers of languages like C, C++, Rust, Go etc go above and beyond to maintain backwards compatibility. It is extremely likely that you will still be able to compile old code with a modern compiler.
> your processor
Processor architectures are common enough that people go out of their way to build backwards-compatibility shims: things like Rosetta, QEMU, and all the various emulators for old gaming systems.
> your hardware
Apart from your CPU (see above), your hardware is accessed through abstraction layers designed to maintain long-term backwards compatibility: things like OpenGL, Vulkan, Metal, etc. The abstraction layers are in widespread enough use that as older ones become outdated, people start implementing them on top of the newer layers. E.g. here is OpenGL on top of Vulkan: https://www.collabora.com/news-and-blog/blog/2018/10/31/intr...
> [Your kernel]
Ok, you didn't say this part, but it's the other big unpinned dependency. And it too goes above and beyond to maintain backwards compatibility. In fact Linus has a good rant on nearly this exact topic that I'd recommend watching: https://www.youtube.com/watch?v=5PmHRSeA2c8&t=298s
> Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Dependencies are far more likely to rot because they change in incompatible ways than the underlying hardware does, even before considering emulators. It's hard to take this suggestion seriously at all.
> Dependencies are far more likely to rot because they change in incompatible ways than the underlying hardware does
Yes, that is true. It is also very likely that you can more easily go back to a previous version of a dependency than you can go back to a previous hardware. The argument is that, therefore, pinning can only speed up your rotting.
If you don't statically link your dependencies, and due to an upgrade something breaks, you can always go back to the previous version. If you statically link, and the hardware, compiler, processor, operating system, whatever causes your software to break, now you can't update the dependency that is causing the breakage. And it is likely that your issue is within that dependency.
If developers are unwilling to maintain dependencies and be good citizens of the larger language community, should they be adding those dependencies in the first place?
If you're not operating in the large ecosystem then fine. But if your project is on e.g. pypi, then there is an issue.
(edit: Note, yes I know the virtualenvs exist, docker exists, etc. but those are space and complexity trade-offs made as a workaround for bad development practices)
Perhaps I still fail to explain myself: what I am saying is that _not pinning_ only _adds_ more choices, so by definition it can only work better.
Pinned or not, if a software update breaks things, you can always just revert back to a previous version of your dependencies. This applies to a myriad soft problems including a dependency changing interface.
However, when pinning, when one of your static dependencies is broken due to a change outside your control (e.g. hardware, operating system, security issue making it unusable, or something else), the user's only recourse is to call the developer to fix the software.
I am not claiming that one happens more frequently than the other, or that hardware changes cannot break the main software itself, which would often nullify the point. All these issues can happen to software with either static or dynamic linking. However, dynamic linking has at least one extra advantage that static linking cannot have, and the opposite is not true.
> have you ever worked as an application developer? Responsible for getting working artifacts to users as a means to an end?
Look, ironically I find that all of this crap discussion exists because of a newer generation of "application developers" who do not yet know what it means to "deliver working artifacts to users". Imagine my answer to that question.
> However, when pinning, when one of your static dependencies is broken due to a change outside your control (e.g. hardware, operating system, security issue making it unusable, or something else), the user's only recourse is to call the developer to fix the software.
In practice, this happens so infrequently it can be ignored as a risk. (When it does happen, users generally don't expect the software to continue to work.)
> dynamic linking has at least one extra advantage...
You don't seem to be acknowledging the downside risk to dynamic linking which motivates the discussion in the first place. An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
> In practice, this happens so infrequently it can be ignored as a risk.
Well I disagree there. Security issues or external protocol changes (e.g. TLSv1.2 to TLSv1.3) are rather frequent, not to mention usually customer wants to upgrade their machines (old ones broke) and existing operating system no longer supports the new hardware.
> An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
Again, I agree. A "surreptitious" dependency update breaking the software is much more common. However, I have already acknowledged that _twice_, and the point I'm making is that it doesn't matter whether you are pinning dependencies or not: the customer CAN FIX these issues without help from the developer. They just have to roll back the update!
On the other hand, the customer CAN'T fix the first issue (e.g. new hardware).
> No, this is not true, for the simple reason that there will _always_ be unpinned dependencies (e.g. your compiler. your hardware. your processor) and thus _those_ are the ones that will guarantee bitrot.
Docker with sha256 tags fixes that issue (and Docker containers even specify a processor architecture).
Libraries should absolutely not pin their dependencies. Applications should if you care about reproducible builds (not necessarily byte-for-byte, but "can build today == can build tomorrow").
Installing both libraries and applications in the same way in the same environment is a fundamental mismatch that pip encourages, and yes - it leads to fragile binaries.
You’re completely wrong and this advice is somewhat harmful. What you’re describing is how a Python application should be managed. Not a library. Libraries should absolutely not lock their advertised dependencies to arbitrary point-in-time versions for fairly obvious reasons.
Picking a suitable dependency specifier depends heavily on the maturity of the library you’re using and if you need any specific features added or removed in a specific release.
Saying your library depends on “spacy==2.3.5” is a lie that will mean any other library that depends on spacy>=2.3.6 can’t be used. Even if your code will realistically work fine with any spacy 2.x release.
Everyone needs commands like "pip install 'spacy==2.3.5'" to work reliably in the future, so that you can go back and bisect errors. You need to be able to get back to a particular known-good state, and work through the changes.
I'm not saying we pin our dependencies to exact specific versions, but we absolutely do set an upper bound, usually to the minor version.
> I'm not saying we pin our dependencies to exact specific versions, but we absolutely do set an upper bound, usually to the minor version.
OK. That's more sensible, but "pinning" implies == to a specific version. If you know a library does semantic versioning and breaks their API then ~= is fine. Just not ==.
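For anyone unfamiliar with the operators, a small illustration of the difference using the packaging library: ~= is the "compatible release" operator, so it accepts later releases within the same series, whereas == accepts exactly one version.

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    exact = SpecifierSet("==2.3.5")    # pin: only 2.3.5 satisfies it
    compat = SpecifierSet("~=2.3")     # compatible release: >=2.3, ==2.*
    patch = SpecifierSet("~=2.3.5")    # compatible release: >=2.3.5, ==2.3.*

    for v in ["2.3.5", "2.4.0", "3.0.0"]:
        print(v, Version(v) in exact, Version(v) in compat, Version(v) in patch)
    # 2.3.5  True   True   True
    # 2.4.0  False  True   False
    # 3.0.0  False  False  False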
Even applications shouldn't really be pinning dependencies. The only time to pin dependencies is when deploying that application. That could mean bundling it with pyinstaller or making a docker image. But someone should still be able to install it from source with their own dependencies.
> It's maddening to hear people say things like, "Oh if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. You might have a change that alters the time-complexity for niche inputs, making some call time-out that used to succeed. You might introduce a new default keyword argument that throws off a *kwargs. If you call these things "breaking" changes, you will constantly be increasing the major version. But if you increase the major version every release, what's the point of semver! You're not actually conveying any information about whether the changes are "breaking".
That's the point of the link to Hyrum's law. The article argues that the practice of pinning encourages that attitude: consumers feel free to depend on internal implementation details, producers feel free to change behaviour arbitrarily, and no-one takes responsibility for specifying and maintaining a stable interface, which is how you actually break that knot - producers need to specify which parts are stable interfaces and which are not, consumers need to respect that and not depend on implementation details, and then you can actually use semver because it's clear what's a breaking change and what isn't.
> If you publish a Python library without pinned dependencies, your code is broken.
> you will be unable to today publish an end-to-end set of commands that will work in 2025
Not necessarily.
Since ~2010 I have maintained an application with an unpinned requirements.txt; it doesn't even have version constraints at all.
The only breakages I had were either:
1. when switching from Python 2 to Python 3 (obviously)
2. when a new Python version introduces a bug (but Python is not pinnable anyway)
3. once, when a dependency released a new major version and removed an internal attribute I was using in my tests out of laziness (so that one is entirely on me)
The trick is to only use good libraries, that care about not breaking other people's code.
---
It's also worth noting that it's not your job as a developer to make sure your application can be installed anywhere; it's the packager's job to make sure your app can be installed in their distribution.
And if your users want to use pip (which is kind of the Python equivalent of wget + ./configure + make install) instead of apt/yum/... to get the very latest version of your software, then they should be able to figure out how to fix those issues.
It's clear that your approach is a possible approach:
1. 'only use good libraries'
2. 'it's not your job as a developer to make sure your application can be installed'
3. 'if your users want to use pip... they should be able to fix those issues'
However, this isn't a solution to the problem that led to the existence of language ecosystems. It is a refusal to acknowledge the problem.
If you're introducing breaking changes in every new release you should still be in the 0.x stage of SemVer. You're doing something wrong if you end up on v77.0.0. The Node ecosystem's strict compliance with SemVer works fine 99% of the time because SemVer is indeed an effective versioning system (when people use it right).
I get the impression that this advice is accurate for the python ecosystem, but that’s because the entire ecosystem is broken with respect to backwards compatibility.
The exact same mechanisms work fine with other programming languages, and (more importantly, probably) different developer communities.
In fairness, Python’s lack of static types does make things worse than the situation for compiled languages. (Though that’s a general argument against writing non-throwaway code in python).
People claim node does better, even though JS is also missing static types, so presumably they solved this issue somehow (testing, maybe?). I don’t use it, so I have no idea.
Whilst I don't totally disagree with many of the points here, I think there's a wider picture to many of these issues.
The author is concerned with installing packages on user machines: which are typically very long-lived installs - maybe a user has the same machine with the same dependencies for years.
However, for many engineers (such as myself), a binary may not be used past even a few days from when it was first compiled - e.g. as part of a service in a quickly, continuously integrated system.
I might even argue that _most_ software is used in this way.
When software is built this way, many of the points in this article are very helpful to keep builds stable and to make deployment fast - and in fact for the case of security, we usually _don't_ want dependencies to auto-update, as we do not want to automatically deploy new code if it has not been audited.
Maybe there's a future where OSs become more like this, where binaries are more short-lived... maybe not. Although I don't think it's strictly fair to label all of these as "Bad" with a capital B :)
The way iOS and Fuchsia are dealing with the problem is to completely lockdown the operating system with a tight permissions system. An app can be compromised but the damage is limited. Perhaps it is time for servers to move to a similar model.
You mean cgroups or zones, don't you? Docker was (last time I heard) a security disaster: not generating robust layer hashes, lacking user isolation, and plenty just running as root...
> An app can be compromised but the damage is limited
AKA the "we don't care" security model. What exact use is the fact that the web browser is "contained" if it is compromised? The mail client? Your PIM program? On a server, what use is that the database engine is contained if it is compromised?
I am the first to accept the security benefits of sandboxing, but it is just _one_ thing. It doesn't even help against the majority of issues. Not even on Android/iOS.
There is a lot I could say about this article, but I've kinda been in this whole Texas natural disaster situation. But I do think it's worth pointing out one thing:
> Rust bundles a huge fork of LLVM
It is not super huge, and we try to upstream patches regularly to minimize the fork. We also regularly re-base all current patches on each release of LLVM, so it's less 'fork' and more 'maintain some extra commits to fix bugs.'
> and explicitly refuses to support to distributions using the genuine LLVM libraries.
We always support the latest LLVM release, and try to maintain compatibilities with older ones as long as is reasonable and possible.
IIRC, the last time we raised the base LLVM requirement was in Rust 1.49, at the end of last year. The minimum version it was raised to was LLVM 9, which was released in September of 2019. The current release is 11.
Again, because of the lack of the extra patches, you may see miscompilation bugs if you use the stock LLVM. There's pros and cons to every choice here, of course.
The Rust project is and has been interested in collaborating on issues to make things easier for folks when there's demand. We've sought feedback from distros in the past and made changes to make things easier!
(I also posted a slightly modified version of this to their blog comments as well. EDIT: the author has commented that suggesting Rust's fork is large was in error)
EDIT: you completely changed your comment. My original response is below the line, but I'll also respond to your new one above it.
I don't actually understand grammatically what you're asking for or saying in this comment. Rust, Cargo, and a bunch of Rust applications are packaged in Debian today, from Buster onwards.
I think what you're saying is that rustc head should be able to be built with the llvm that lives in Debian stable? That is a choice that could be made, I'm sure, but then you're talking about supporting five API-incompatible releases of LLVM. The folks from Debian aren't even asking for that level of compatibility, and it would be quite a bit of work, for unclear benefit.
--------------------------------------------
There are lots of demands, from lots of people. Managing a project requires prioritizing different demands of all of your different users.
> When will the problem mentioned in the article
The article talks about many more things than one single problem. They range from easy, like the "we already let you use stock llvm" I mentioned above, to difficult and long-running, like "define a Rust ABI." The needs of packagers have to be balanced with the needs of other folks as well. "Define a Rust ABI," specifically, would help some packagers, but it may harm other users. It also may be something that's good to have, but that we aren't quite ready for yet.
A willingness to work together to solve issues, and to find solutions that help as many different kinds of folks as possible, is what matters here. We have had folks from other distros pitching in to help make things better for their distros, and I can't imagine we wouldn't accept the help of the OP either.
What's the problem with focusing on upstreaming the patches and getting rid of rust's "staging fork"? Especially if they fix something as serious as miscompilations, wouldn't those patches be VERY much needed upstream? I'm asking out of genuine interest.
That's already done. The point is that it takes time, and there are always more of them, so at any point in time there are likely to be a handful still in the process of moving upstream.
> Strongly prefer to upstream all patches to LLVM before including them in rustc.
That is, this is already the case. We don't like maintaining a fork. We try to upstream as much as we can.
But, at the same time, even when you do this, it takes tons of time. A contributor was talking about exactly this on Twitter earlier today, and estimated that, even if the patch was written and accepted as fast as possible, it would still take roughly a year for that patch to make it into the release used by Rust. This is the opposite side of the whole "move slow and only use old versions of things" tradeoff: that would take even longer to get into, say, the LLVM in Debian stable, as suggested in another comment chain.
So our approach is, upstream the patches, keep them in our fork, and then remove them when they inevitably make it back downstream.
(EDIT: Also, what Rusky said: there's basically always going to be more than one patch, in various stages of this process...)
Ok, maybe I'm just too spoiled by my previous experience then. I never had any problems getting patches upstreamed relatively quickly, but then e.g. meson or picolibc are much smaller than LLVM and patches are simpler to reason about (I know some compiler construction).
Remember that in my comment above, it’s assuming a speedy review. Release schedules are still a thing; I’m not at my computer anymore, but IIRC, llvm releases once or twice a year, so that’s the inherent limit of making it into a release, not necessarily the review time.
Can't there be a build option to not use the LLVM submodule, and instead use the system LLVM? Assuming there are tests for these LLVM bugs, and assuming the patches are indeed being merged, wouldn't a CI be able to catch when it is safe for downstream users that want to use upstream LLVM to update their Rust installation?
Hmm, I couldn't find that last time I looked at building Rust -- all I saw was an option to grab precompiled builds of Rust LLVM from the CI server, but maybe I missed something.
It's a risk calculation: do I want to risk being vulnerable to 0days, or do I want to risk my application not running for any user because a dependency changed its headers/api?
As software engineers we want to be in control of as much as possible when running our application, to make it as deterministic as possible. For that we select versions of our dependencies, build it, and then test it, and if our tests pass, we release it.
If we let the OS determine the dependency versions, without testing, what guarantees can we give that our code will work? Do you write tests for future, unknown changes in libraries? Do you write tests against randomness? Because that's the kind of unpredictability you're introducing.
Keeping the application working in the face of dependency updates is the distro's job, if they're supported in doing it. They won't be pushing dependency updates at random except to channels explicitly marked as unstable (at least, assuming a minimally sane distro).
I'm less familiar with Gentoo, but Debian-based distros ought to be safe from that threat (and if anything, you ought to be worried about the reverse problem). So your question becomes: do I want to risk being vulnerable to zero-days?
Why do only *nix folks pull their hair out about this? On Windows programs bundle their dependencies all the time (sometimes even as shared libraries! but without the independent update benefits) and hardly anybody loses sleep over it. Heck, users actually like the fact that it minimizes friction. Nobody claims it's rock-solid security, but does it need to be?
Actually, now that I wrote it above, I think I might have found an answer to my own question: while of course infosec experts will freak out about any vulnerability existing anywhere for even a shred of a nanosecond, in the real world this is really only a big deal for servers, not clients. And I'm guessing this issue affects Linux folks more because servers tend to run on Linux, with security-sensitive software exposed to the open internet all the time. Maybe we need to realize & embrace this trade-off instead of fighting it till the end of time?
(I suppose one could even argue servers don't need this either if all the packages are kept up-to-date anyway; I guess that might also be debatable, but it's beside my point.)
The unix culture comes from a shared multi-user perspective, whereas the windows point of view tends to be towards a single user desktop. In a shared environment, it's not acceptable for applications to change shared system components. Although this is maybe less important today, the culture still persists.
> In a shared environment, it's not acceptable for applications to change shared system components.
That era is long gone on Windows (literally since XP I think). In fact I've had to do this far more on Linux than on Windows in recent memory... the latest instance ironically being the glibc CLOCK_REALTIME mess on WSL. Right now programs bundle whatever they need, and these go into the app directory (like \Program Files). e.g., I have maybe ~20 sqlite3 DLLs on my system besides the system ones.
Your larger point about the culture might still be correct though, I don't know.
> while of course infosec experts will freak out about any vulnerability existing anywhere for even a shred of a nanosecond, in the real world this is really only a big deal for servers, not clients
I don't think that the linked article really reflects a consensus among the security community at all. I don't really find the "dynamic libraries are more secure" argument overly compelling, certainly not stated so broadly.
People honestly need to push back on the constant focus on security. It's an important concern, but it shouldn't make life worse for users, and in some ways it is in fact being used that way.
The day I stop static linking is the day I can compile on one distro and ship to many without worrying about users reporting loader errors. That day is not today.
Until then, I'll keep doing it, because it saves me time and money.
I don't really buy that dynamic linking all the things is such a boon to security. But I'll link to this: https://drewdevault.com/dynlib
> Not including libc, the only libraries which had "critical" or "high" severity vulnerabilities in 2019 which affected over 100 binaries on my system were dbus, gnutls, cairo, libssh2, and curl. 265 binaries were affected by the rest.
> The total download cost to upgrade all binaries on my system which were affected by CVEs in 2019 is 3.8 GiB. This is reduced to 1.0 GiB if you eliminate glibc.
> It is also unknown if any of these vulnerabilities would have been introduced after the last build date for a given statically linked binary; if so that binary would not need to be updated. Many vulnerabilities are also limited to a specific code path or use-case, and binaries which do not invoke that code path in their dependencies will not be affected. A process to ascertain this information in the wake of a vulnerability could be automated.
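To make the "could be automated" part concrete: for statically linked Go binaries at least, `go version -m` reports the module versions baked into the binary, so a small script can flag binaries that embed a known-bad dependency. A rough sketch (the advisory table below is a made-up placeholder, not real data):

```python
#!/usr/bin/env python3
# Sketch: flag statically linked Go binaries that embed a known-vulnerable
# module version, using the build info that `go version -m` reports.
# The VULNERABLE table is a hypothetical example, not real advisory data.

import subprocess
import sys

VULNERABLE = {
    "golang.org/x/text": {"v0.3.7"},   # hypothetical affected version
}

def embedded_deps(binary):
    """Yield (module, version) pairs baked into a Go binary."""
    out = subprocess.run(["go", "version", "-m", binary],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        fields = line.split("\t")
        # dependency lines look like: "\tdep\t<module>\t<version>\t<hash>"
        if len(fields) >= 4 and fields[1] == "dep":
            yield fields[2], fields[3]

if __name__ == "__main__":
    for binary in sys.argv[1:]:
        for module, version in embedded_deps(binary):
            if version in VULNERABLE.get(module, ()):
                print(f"{binary}: embeds {module} {version} - needs a rebuild")
```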
Maybe do the trendy thing and link your whole distro together with your app (a.k.a containers)?
The fact that containers exist and are prevalent is a damning indictment of the Linux model imho.
Producing software that can simply launch is so wildly complex that the solution is... to snapshot an entire operating system install to run the application.
> Now, for the worst of all — one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it. The exact consequences of bundling vary depending on the method used.
Let’s consider this statement at face value. The WORST thing a program can do is... include the things the program needs to run? So programs should... NOT include the things they REQUIRE to run?
I get it. I understand the argument. But if you take just a half step back it should be clear how utterly broken the whole situation is.
I'm not sure it's fair to characterize a container as a snapshot of an entire OS install - you can have containers that are very lightweight, e.g. just a static Go binary copied into a "scratch" container - but often the required effort is not put in to reduce size.
You are comparing having containers versus having no containers and no kind of bundling at all, which is not a reasonable comparison. If you need an OS snapshot, the only option on other operating systems is often to use a VM. That's what containers are used as an alternative to. (The other alternative there is to never install patches or upgrades on the production machines, which probably creates as many problems as using containers)
I’m not sure I made that comparison. I’m saying the popularity of containers is an existence proof that the distro-oriented shared library system is failing to meet the needs of common use cases.
Programs that bundle dependencies have a radically reduced need for snapshots, VMs, or containers. Such tools do provide a variety of value. But they don’t become virtually requirements to merely launch a program without error.
Ship your damn dependencies says I. Either statically or dynamically, but ship them.
You can't get rid of the distro-oriented shared library system, it exists as long as you are building on top of an operating system. To that end, when you say "we depend on operating system minimum version X.Y.Z" that now becomes another dependency you have to ship at some point if you're running the production machines. Containers are popular because they actually solve that exact problem. If there was some other comparison you were making I'd be interested to hear it, but AFAIK containers can only help here so it's not clear what else you were comparing to.
Parent is saying that the distribution model of modern distros is so legacy and out of alignment with what people actually want that a solution involving bundling the entire damn OS to sidestep that pit of snakes has now become the de facto way to deploy software.
To put it another way, what do you think the relative popularity of distributing your own internal software via a private apt repo is compared to bundling it as a stateless container and putting it on a registry? Some big companies that pre-date containers do it with apt. Most don’t. For good reason.
The company I work at has an internal yum repository that we use for applications and certain application dependencies. It's worked reasonably well for us, mainly because we stick with using dependencies provided through the internal or public yum repositories, or some other external yum repositories if a particular dependency is not available otherwise.
I'm sure a similar solution could also work with apt.
For an entire OS, this doesn't make sense. It does make sense for business-critical software. We did this 20 years ago, even calling the subdir VENDOR, for some 3rd-party components. It's part of the design, and even of supporting your stuff.
What's new is containers and k8s. People will put more into that, for dev and critical sw.
However, for full desktop or OS it's probably cargo-cult or hard to scale and maintain.
I don't think this is a fair characterization of why containers are a killer app. Full-OS virtualization was a killer app for hosting providers because it enabled multi-tenancy, allowing you to oversubscribe physical resources knowing that most applications you host aren't going to be experiencing anywhere near peak load 100% of the time. Containers allow you to take this a step further by allowing multiple applications with different versions of the same dependencies to run together on the same physical or virtual server without needing to worry about symbol clashes and without having to install one application's dependencies into some different than expected location while manipulating LD_LIBRARY_PATH and PATH.
This solved a huge problem for us in the geointelligence ground processing community. The various product generation algorithms tend to all be native code, but they're produced by different contractors, on different delivery schedules, some of whom are no longer on contract at all to provide updates. What do you do when four of them all depend on libxerces-c but all four depend on a different version? Tell them to stop pinning dependencies and update? How, if they're not on contract to do it? Tell the US Congress to get off their butts and award more money to the backing agencies so they can get their developers back on contract? Good luck. In practice, what we ended up doing, without wanting to stripe certain applications to only run on specific servers when some of them are used a ton more than others, was to install dependencies on an NFS mount shared by all the fast compute hosts, into application-specific subdirectories, and then set PATH and LD_LIBRARY_PATH before calling the executable that did all the work.
This system was terrible! But before containers, it was the best we could do. Now we just deploy each individual processing algorithm in its own container, and they can depend on whatever they want, all using standard system paths under the illusion that they have an entire OS and filesystem all to themselves, and never clashing with each other.
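For anyone who hasn't lived through that: the pre-container workaround boiled down to a per-application launcher roughly like this (paths and names invented for illustration):

```python
#!/usr/bin/env python3
# Sketch of the pre-container workaround described above: launch a vendor's
# binary with PATH/LD_LIBRARY_PATH pointing at an application-specific
# dependency prefix. Paths and names are invented for illustration.

import os
import sys

def run_with_private_deps(app_prefix, exe, args):
    env = dict(os.environ)
    # Prepend the app's private bin/ and lib/ so its pinned dependency
    # versions win over the system-wide ones.
    env["PATH"] = f"{app_prefix}/bin:" + env.get("PATH", "")
    env["LD_LIBRARY_PATH"] = f"{app_prefix}/lib:" + env.get("LD_LIBRARY_PATH", "")
    os.execvpe(exe, [exe, *args], env)   # replaces this process with the app

if __name__ == "__main__":
    # e.g. run_with_private_deps("/mnt/apps/algo-foo-1.2", "algo-foo", [...])
    run_with_private_deps(sys.argv[1], sys.argv[2], sys.argv[3:])
```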
> Containers allow you to take this a step further by allowing multiple applications with different versions of the same dependencies to run together on the same physical or virtual server without needing to worry about symbol clashes and without having to install one application's dependencies into some different than expected location while manipulating LD_LIBRARY_PATH and PATH.
We agree 100% on the problem. I enthusiastically agree with all of your complaints.
My point is that the Linux model of system-wide shared libraries and PATH/LD_LIBRARY_PATH bullshit is terrible. And the fact that containers are required to resolve that spiderweb nightmare is a damning indictment of the Linux library model.
Containers are one possible solution. An alternative is for those applications to bundle their dependencies. If all applications bundled their dependencies then everything would “just work”. No need to hack bullshit envvars. No need to containerize.
Yes, that means it's harder to deploy security fixes. But if everyone is using containers, those images need to be updated too. At which point, what have you even gained?
> I don't really buy that dynamic linking all the things is such a boon to security.
Agreed. It's really not a panacea. When you upgrade a library, you probably want to restart the running applications that depend upon this. Dynamic linking won't save you.
The solution is having a graph of your dependencies! IIRC, NixOS gets this right. I don't think Debian's apt did?
I would really like to know how NixOS solves this problem for things like yarn, webpack, parcel, esbuild, npm, pip, conda, cargo, stack, go modules, ruby gems, etc.
Usually there are tools to extract the necessary dependency information (url and hash) from a package manager's lockfile and produce a fixed-output derivation so a Nix expression can reproduce the build environment. I know of yarn2nix, cargo2nix, and node2nix, and I wrote and maintain gradle2nix for JVM projects.
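At their core those converters mostly boil down to pulling the pinned name/version/hash entries out of the lockfile so each dependency can become a fixed-output fetch. A rough sketch of that step for a Cargo.lock, assuming the usual crates.io download URL layout (the real tools do much more, including emitting the Nix expressions):

```python
#!/usr/bin/env python3
# Sketch of the core step those converters perform: read pinned
# (name, version, checksum) entries out of a lockfile so each dependency can
# become a fixed-output fetch. Shown here for Cargo.lock; the real tools also
# generate the Nix expressions and handle git/path dependencies, workspaces, etc.

import sys
import tomllib  # Python 3.11+

# Assumes the usual crates.io download URL layout.
CRATE_URL = "https://static.crates.io/crates/{name}/{name}-{version}.crate"

def pinned_crates(lockfile):
    with open(lockfile, "rb") as f:
        lock = tomllib.load(f)
    for pkg in lock.get("package", []):
        # Only registry packages carry a checksum; path/git deps are skipped here.
        if "checksum" in pkg:
            yield (pkg["name"], pkg["version"], pkg["checksum"],
                   CRATE_URL.format(name=pkg["name"], version=pkg["version"]))

if __name__ == "__main__":
    for name, version, sha256, url in pinned_crates(sys.argv[1]):
        print(f"{name} {version}\n  url:    {url}\n  sha256: {sha256}")
```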
It doesn't really, and that's the problem. There are automated tools and scripts to help, but there are always edge cases, so you end up going back to using pip, for example.
Nix and language package managers just don't really play well together.
There are scripts and daemons that help you determine what needs restarting[1,2]. NixOS installs can go in separate directory prefixes when there are conflicts. For Gentoo and other Linux distributions, maintainers usually won't mark something stable without resolving conflicts, and this usually means sticking to older stable versions of libraries until newer versions are fully supported by all installed packages. This can definitely be more work for maintainers, and as the blog post says, it's a sisyphean task.
Thanks, checkrestart looks useful. I wasn't aware of it.
Given that it's in a goodies package though, I assume it's not integrated with apt—at least by default.
I'm thinking of that apt prompt that says "There are services installed on your system which need to be restarted when certain libraries, such as libpam, libc, and libssl, are upgraded." I presume then that uses a hardcoded list of important services.
I can't help with checkrestart or apt questions, but if I can plug 'needrestart -r a': it restarts services, messages outdated shells/user logins and checks for newer kernels/microcode. It will also tell you if any interpreters/containers need restarting, but YMMV here. It's never blown up for me even when restarting boot services, and should work on any distro as far as I can tell. I use it on Gentoo.
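For the curious, the core trick behind checkrestart/needrestart is simple: a process that still maps a shared object whose file has been replaced shows the old mapping as "(deleted)" in /proc/<pid>/maps, so it needs a restart. A stripped-down sketch of that check (the real tools handle far more edge cases):

```python
#!/usr/bin/env python3
# Stripped-down version of the check behind checkrestart/needrestart: a process
# that still maps a shared object whose file was replaced by an upgrade shows
# the old mapping as "(deleted)" in /proc/<pid>/maps, so it needs a restart.
# The real tools handle many more cases (interpreters, containers, ...).

import os

def stale_processes():
    stale = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/maps") as maps:
                libs = {
                    line.split(None, 5)[5].strip()
                    for line in maps
                    if line.rstrip().endswith("(deleted)") and ".so" in line
                }
        except OSError:
            continue  # process exited, or we lack permission; skip it
        if libs:
            try:
                with open(f"/proc/{pid}/comm") as comm:
                    name = comm.read().strip()
            except OSError:
                name = "?"
            stale[int(pid)] = (name, sorted(libs))
    return stale

if __name__ == "__main__":
    for pid, (name, libs) in sorted(stale_processes().items()):
        print(pid, name)
        for lib in libs:
            print("   ", lib)
```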
> The day I stop static linking is the day I can compile on one distro and ship to many without worrying about users reporting loader errors. That day is not today.
That's completely missing the point of the article.
You're free to ship your program binary however you want as the upstream.
It only becomes a problem if dynamic linking is a second-class citizen in the language or build tool you use, if you bundle dependencies and don't support un-bundling, or if you pin specific dependency versions.
I feel like this confuses a lot of things by assuming an extremely sophisticated end user. Sure, if you're a double-threat dev/sysadmin using linux, then when some vulnerability gets discovered in some dynamically linked library on your system, you have the capacity to (a) receive information about that fact, and (b) update it.
But now suppose you're an ordinary person. You use software. Maybe you even have a windows machine.[1] Which is more likely to actually get a security update to you?
(a) You have to update a single piece of software, which you know you've installed through a recognized distribution channel like an app store or something, and all its dependencies come with it.
(b) You have to either learn what a DLL is, learn how to update it, and then hope that nothing you rely on breaks in some mysterious way because of a dependency on a dependency on a dependency on a dependency. Or you have to accept a whole operating system update, assuming the operating system update even includes the DLL fix, and with it all of the other crap that comes with operating system updates from Microsoft (and Apple): bugginess from complex updates, incompatibilities between new versions of operating systems and software (or hardware) that you rely on, new obnoxious security rules that you might not agree to (looking at you, Cupertino), and massive disruptions to your ability to actually use your computer to do your work.
No thanks.
[1] Maybe this article is specifically targeted against the linux ecosystem? If so, perhaps this issue is ameliorated somewhat, but it still seems to put a fairly substantial burden on end users, that seems to be inconsistent with actually letting non-experts use linux OSes.
Alternatively, realise that the problems in (b) are largely solved for reasonable OSes where the vendor takes responsibility both for automatically getting you security patches and for keeping you working without major disruptions.
I'm not sure where you've got the idea that updates are all-or-nothing, but some of us have been living a life you seem to think can't exist for decades at this point.
That sounds like a good argument for using package managers with automatic security updates. The user doesn't need to be an expert, just reboot when the system tells them to.
> (b) You have to either learn what a DLL is and learn how to update it ...
That's not at all what the process is for the ordinary person. The ordinary person sees a notification pop-up from "Ubuntu Software Center" that says 8 packages or whatever need to be updated, with one button that says "update everything now" or whatever and one that says "remind me later" or whatever.
It's up to you to choose a distro that applies the appropriate amount of rigor with regards to testing dynamic library updates. For bleeding-edge distros like Gentoo or Arch, it's not that much. Upstream publishes a release, and the Gentoo package maintainer chucks it into the testing branch. After 30 days, if no one complains, it gets marked stable. The user chooses a certain amount of risk (although it's been several years at least since ABI breakage has been an issue for me on Gentoo testing; note that security-critical updates are fast-tracked into stable). For other distros like RHEL and Debian stable, the package maintainer spends considerably more effort ensuring a random update of openssl-1.1.1i to openssl-1.1.1j doesn't break stuff. The user chooses a certain amount of stability at the expense of not having the latest version of whatever.
On the other hand, on my Windows computer at work, the process for updating is significantly more intrusive. Few OS updates can be applied without a reboot, Visual Studio updates do not permit me to continue working during an update, there are half a dozen auto-updaters, and some programs don't get updated unless I manually go to their website and check for an update. I don't even know when I last updated 7-zip.
Within the past week, there was a bug report that Python has an RCE with untrusted floats or something. On my linux systems, the package manager had an update within hours, and because there is only one Python installation on each of my linux machines, I know that all applications leveraging Python are now protected from that RCE. On my Windows work machine, I have not been notified that any of the applications I use which embed Python need to be updated. Presumably, this means my Windows machine is vulnerable.
You do not need manual version management if you have multiple versions of Python installed. By default, it will use whichever interpreter supported by the application is listed first in PYTHON_TARGETS. If you want, you can override that by calling the Python interpreter you want manually, e.g. `python3.8 <program name>` or `pypy <program name>`. And if python3.9 is the "default" interpreter (because it's listed first) but the application's ebuild only claims support for python3.8, running the application with no qualifiers will start the python3.8 interpreter.
Obviously there were many, many years when python2 and python3 needed to be installed side by side, and it's reasonably common for people to have multiple versions of python3 installed side by side. My VPS has both python3.9 and python3.8 installed side by side, for instance, because apparmor is slow to pick up python3.9 support.
In addition to the other replies noting that the way you wrote (b) is very unfair, I'd like to point out that (a) assumes the developer of the application is aware of the security update to their dependency and pushes a fix quickly.
For an OS like Linux, the thing about static linking is 100% spot-on.
However, for vetted systems, like app stores, this makes no difference, as the only dylibs allowed to be patched at runtime are the OS-supplied (ABI) ones. The whole app needs to be rebuilt, re-vetted, and re-released.
Frankly, I would be worried about dylibs, but for a different reason, as more than one hack has been done by hijacking dynamic links. I'm trying to remember how it went, but there was once a common practice for Mac programs to deliberately take advantage of this, in order to add functionality.
This is moronic in that it says everything is equally bad.
1. The linking strategy doesn't matter. Just rebuild everything. The reason Rust and Haskell don't do dynamic linking is pervasive inlining. This is required for many abstractions to work well, and asking optimization boundaries to coincide with software origin boundaries is stupid anyways.
The ABI argument is stupid because replacing all the dynamic libraries whose ABI changed is no worse than rebuilding everything because of static linking. Therefore, ignore that part and just think about inlining.
2. Lock files are fine as long as one can in fact bump the minor versions. Yes, we should make tools to enforce that minor versions are non-breaking, and it's crazy we didn't have those from day one. But until we do, lock files are fine.
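As a tiny illustration of what "bump the minor versions" means in practice, a version-number-only compatibility check looks something like this (real tooling would also have to diff the actual API surface, not just the number):

```python
# Tiny illustration: decide whether an upgrade is a "compatible" (minor/patch)
# bump under the usual semver convention. Real tooling would also need to
# verify the API surface, not just compare version numbers.

def is_compatible_bump(pinned, candidate):
    old = tuple(int(x) for x in pinned.split("."))   # assumes plain x.y.z versions
    new = tuple(int(x) for x in candidate.split("."))
    if new <= old:
        return False                  # not an upgrade at all
    if old[0] == 0:
        return new[:2] == old[:2]     # 0.x.y: only patch bumps count as compatible
    return new[0] == old[0]           # >=1.0.0: same major means compatible

assert is_compatible_bump("1.4.2", "1.5.0")
assert not is_compatible_bump("1.4.2", "2.0.0")
assert not is_compatible_bump("0.3.1", "0.4.0")
```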
Centralized repositories are not a good, general solution to the software distribution problem.
First: They don't have all the packages and versions the user could ever need, which means that you'll always have a mixture of software installed through the package manager and software installed through tarballs or curl | bash.
Second: They distort the relationship between application developer and application user.
Third: they neglect offline use cases.
The only sane, generalizable solution is for the OS to be a platform upon which bundles can be installed. The OS should just define a sane format for said bundles and the user gets their applications directly from the application developer.
If your package management is worth its salt then you don't care about static linking, you just update every package that links to that package. IBM's AIX had the best package management ever IMO: you could roll forward or back and things would just work. Completely deterministic. On the backend, whenever a package was updated they would regression test it and everything that depended on it to ensure the update didn't break anything; when a bug did get through they would fix it and add another regression test so it didn't happen again. All of that kinda broke when they adopted RPM as a secondary format, because the RPMs didn't make the same guarantees.
One of the best features of Golang is being able to cut a single statically linked Linux binary that works on Alpine, Red Hat, CentOS, etc. With Go you're also not linking to OpenSSL or Libxml, and those two packages are responsible for every emergency roll-out I've ever had to do.
Time and again, the end result of version pinning is that when developers are forced to update, they have to advance through multiple major version updates, and what should be a patch becomes a rewrite. I have to constantly deal with a bunch of web sites on jQuery 2 and the developers boo-hooing every day because jQuery 3.5 is completely different, and all I can tell them is that jQuery 2 is no longer maintained, so they need to stop using it for new projects and they need to update or retire their existing projects using it.
One of the things I liked about Golang was that it didn’t have any versions to pin so it avoided all of that nonsense altogether. However then they added “modules” and it’s getting harder to not use those, so that killed that magic. Though I did make sure the CI/CD pipelines I use do a go get -u so hopefully that will force the local Go devs to keep their code fresh.
> Static linking, dependency pinning and bundling are three bad practices that have serious impact on the time and effort needed to eliminate vulnerabilities from production systems.
I'm astonished how different this perspective is. As a developer I see that software is developed faster nowadays; size, library reuse and functionality are all increasing.
And distributions are just not able to keep up.
I feel like they never did; they just made exceptions for packages that were too important to ignore, like browsers or office suites.
Really, it's not the software. It's the distros.
Not using libraries is not an option; you don't want devs to write their own crypto.
Not pinning dependencies is bad: incompatibilities and security issues could make their way into the code. The test surface also gets bigger, and it raises the question of which combinations are supported.
Update: I think what I want to say is, distributions should accept that they can only provide that level of "stability" for a limited set of applications. The new and shiny stuff will always happen elsewhere.
> We try hard to unpin the dependencies and test packages with the newest versions of them. However, often we end up discovering that the newer versions of dependencies simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or even are actively hostile to us for not following their pins.
I don't get it. They unpin the versions and then are disappointed that the software does not work with untested versions? It would be nice if semantic versioning could always work but it does not. I know Elm, Rust and Nix have some solutions to propose for this problem.
Well, not pinning does not cause everything that uses the vulnerable dependency to magically rebuild; it just makes the builds less deterministic. The solution is not to blame pinning but to use tooling that informs you about vulnerable dependencies.
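That tooling largely exists already; at its core it's just "look up each pinned version in a vulnerability database". A minimal sketch against the public OSV API (https://osv.dev), assuming a pip-style requirements file with exact pins:

```python
#!/usr/bin/env python3
# Minimal sketch of such tooling: look up each exactly-pinned dependency from a
# pip-style requirements file in the public OSV database (https://osv.dev).
# The lockfile parsing here is deliberately simplistic.

import json
import sys
import urllib.request

OSV_QUERY = "https://api.osv.dev/v1/query"

def parse_requirements(path):
    """Yield (name, version) for lines like 'requests==2.25.1'."""
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if "==" in line:
                name, version = line.split("==", 1)
                yield name.strip(), version.strip()

def known_vulns(name, version):
    payload = json.dumps({
        "package": {"name": name, "ecosystem": "PyPI"},
        "version": version,
    }).encode()
    req = urllib.request.Request(OSV_QUERY, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

if __name__ == "__main__":
    for name, version in parse_requirements(sys.argv[1]):
        for vuln in known_vulns(name, version):
            print(f"{name}=={version}: {vuln['id']} {vuln.get('summary', '')}")
```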
One of the issues with unbundling everything into its own package really comes to a head when it comes to how Python is packaged on Debian based systems.
They split out core components that are built into Python so that they can distribute them separately, thereby breaking the default tooling that is supposed to ship with that Python version, in the name of "not having dev tools installed alongside non-dev".
Package maintainers have made life miserable for those of us who have to then help people through that mess so that they can use the software we have written and/or maintain. We have to write our quick-start guides to tell users how to get a proper working version of Python on their system, because the package maintainers deem their way of deploying Python the best way to deploy it.
"Your instructions are wrong, `python3 -mvenv venv` doesn't do anything, it just errors out saying I need to install something else, but when I install that package it still doesn't work".
Packagers that insist on splitting every dependency then put the onus back on the community to support it; they don't have to deal with it themselves.
> Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all that has another underlying problem — the combination of not being concerned with API stability on upstream part, and not wishing to unnecessarily update working code (that uses deprecated API) on downstream part. Truth is, pinning makes this worse because it sweeps the problem under the carpet, and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API. Hyrum’s Law in practice.
Exactly one of the problems with pinning your project dependencies, whatever language your project is in. It's better to unpin and continuously integrate upstream changes as early as possible: it's less work this way at the end of the year, and more secure.
My goodness, It’s as if [the bundling part of] this post speaks directly to me. I use Gentoo as my daily driver, and school forces us to use Zoom for classes. Like most other proprietary software vendors, Zoom chooses to bundle a great deal of shared objects along with their Linux binary. Gentoo however, chooses to let Zoom use the system libraries instead of those bundled. Through some sort of ABI incompatibility, Zoom’s use of my system’s libraries causes joining a class to get stuck on a “Connecting...” screen. It doesn’t happen every time, but it happens often enough to annoy my teachers, who end up having to let me in multiple times.
I don't really want to run Zoom's precompiled libraries, and the version numbers of the bundled objects match my system's one to one. But alas, Zoom probably modified the libraries to fit their own application's needs. I can only hope that a future update of Zoom magically solves my problems.
So, I don’t fault it.
However, if you don’t include dependencies and you don’t manage them, which would be the case in modern environments for the majority of users in the world, how is that safe?
Answer: it’s not.