Package management is a wicked problem

(nesbitt.io)

106 points | by zdw 5 days ago

13 comments

  • 8organicbits 19 hours ago
    Andrew has been writing a ton of interesting blog posts related to package management (https://nesbitt.io/posts/). He's had some great ideas, like testing package managers similar to database Jepsen testing.
    • cbsmith 17 hours ago
      Not to take credit away from Andrew for his ideas and writing, because at least he came up with the idea and wrote about it, but I don't understand how that idea of Jepsen style testing of package managers is a novel idea. Like... what testing would you want to do if you were building a package manager?
  • mooracle 19 hours ago
    cargo works because rust was young enough to be opinionated. try that with npm and enjoy your mass exodus to the next thing that will also betray you

    "but bun!" — faster shovel, same hole

    • skrebbel 18 hours ago
      NPM is plenty opinionated. For all its mistakes, it got lots of things uniquely right too. For example it’s very uncommon in JS land to have version conflicts (“dependency hell”). If two deps both need SuperFoo but different versions, NPM just installs both and things Generally Just Work. Exceptions are gross libraries with lots of global state (such as React) but fortunately those are very uncommon in JS land.

      People love to complain about node_modules being a black hole but that size bought JS land an advantage that’s not very common among popular languages.
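
      For example (not npm's real resolver, just a toy TypeScript sketch of the idea, with made-up package names like super-foo), the reason conflicts rarely bite is that a dependency's own dependencies can live nested under it:

        // Toy model of nested installation: each package's deps are installed
        // under that package, so two versions of "super-foo" coexist instead of
        // the resolver having to pick a single winner.
        type Dep = { name: string; version: string };

        // What each (hypothetical) package asks for.
        const requires: Record<string, Dep[]> = {
          app: [{ name: "dep-a", version: "1.0.0" }, { name: "dep-b", version: "1.0.0" }],
          "dep-a": [{ name: "super-foo", version: "2.0.0" }],
          "dep-b": [{ name: "super-foo", version: "1.4.0" }],
          "super-foo": [],
        };

        type Tree = { version: string; node_modules: Record<string, Tree> };

        // Build a nested node_modules-like tree; no conflict is ever reported.
        function install(name: string, version: string): Tree {
          const node_modules: Record<string, Tree> = {};
          for (const dep of requires[name] ?? []) {
            node_modules[dep.name] = install(dep.name, dep.version);
          }
          return { version, node_modules };
        }

        // super-foo shows up twice: 2.0.0 under dep-a and 1.4.0 under dep-b.
        console.log(JSON.stringify(install("app", "1.0.0"), null, 2));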

      • spankalee 17 hours ago
        Yeah, npm never has "version lock" where it can't figure out a valid solution to the version constraints.

        This is mostly good, but version lock does encourage packages to accept wide ranges of dependencies, and to update their dependency ranges frequently, instead of just sitting there on old versions.

    • pjmlp 19 hours ago
      And it only works to the extent that it's a pure Rust codebase; add a few other languages to the mix and it becomes a build.rs mess as well.
    • ragall 17 hours ago
      Cargo doesn't work. I'm trying to use it in a monorepo and its caching story is horrible. The devs refused when I proposed switching it to Bazel years ago, and now they're regretting it.
  • finally7394 20 hours ago
    I like that the author calls out the naming overloading, because when I hear "package management" I think `pacman`, `winget`, and `apt`
    • pxc 19 hours ago
      All three of those are "system package managers" (if you count winget as a package manager at all, which I would not). Pacman and APT are binary package managers while Homebrew is a source-based package manager. Cargo and NPM are language-specific package managers, which is a name I've settled on but don't love.

      Imo there's an identifiable core common to all of these kinds of package managers, and it's not terribly hard to work out a reasonably good hierarchical ontology. I think OP's greater insight in this section is that internally, every package manager has its own ontology with its own semantics and lexicon:

      > Even within a single ecosystem, the naming is contested: is the unit a package, a module, a crate, a distribution? These aren’t synonyms. They encode different assumptions about what gets versioned, what gets published, and what gets installed.

      • morpheuskafka 19 hours ago
        The confusing part is that in many cases, end users are using NPM, pip, Go packaging, and to a lesser extent cargo, etc., to install finished end-user software. I've never written a line of JS but have installed all kinds of command-line utilities with npm/npx.

        Normally with a system package manager you would have a -lib package for use in your own code (or simply required by another package), a -src package, and then a package without these suffixes would be some kind of executable binary.

        But with npm and pip, I'm never sure whether a package installs binaries or not, and if it does, is it also usable as a library for other code, or is it compiled? (Homebrew, as you mentioned, is source-based but uses precompiled "bottles" in most cases, I think?) And then there is some stuff that's installed with npm but is not even JavaScript, like font packages for webdev.
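
        (One signal npm does give, at least, is the "bin" field in package.json: packages that declare executables there get them linked into node_modules/.bin, or onto your PATH for a global/npx install. A quick check, with a hypothetical package name:)

          // Does an installed package expose CLI binaries, or is it library-only?
          // "node_modules/some-package" is a placeholder path for illustration.
          import { readFileSync } from "node:fs";

          const pkg = JSON.parse(
            readFileSync("node_modules/some-package/package.json", "utf8"),
          );
          console.log(pkg.bin ?? "no executables declared; library only");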

        The other interesting thing about these language package managers is that they completely eliminate the role of the distribution in packaging a lot of end-user software. Ironically, in the oldest days you would download a source tarball and compile it yourself, so I guess it's just a return to that approach but with go or cargo replacing wget and make.

        • cozzyd 19 hours ago
          And plenty of people use pip for programs not even written in python!
      • RetroTechie 19 hours ago
        > Imo there's an identifiable core common to all of these kinds of package managers (..)

        Indeed. It's hard to see why e.g. a programming language would need its own package management system.

        Separate mechanics from policy. Different groups of software components in a system could have different policies for when to update what, which repositories are allowed, etc. But then use the same single package manager (the system's) to do the work.

        • yxhuvud 43 minutes ago
          It is easy to see why the system package managers that are in use are not sufficient. The distro packages are too bureaucratic and too opinionated about all the wrong things.

          I don't disagree that it would be nice if there were more interoperability, but so far I haven't seen anyone even try to tackle the different needs that exist for different users. Heck, we really haven't seen anyone trying to build a cross-language package system yet, and that should be a lot easier than trying to bridge the development-distro chasm.

  • nacozarina 4 days ago
    Naming things, cache invalidation, and off-by-one errors: package management heavily emphasizes the hardest ‘blue-collar’ problems in CS.
    • bradgessler 18 hours ago
      Today, sales and marketing are the two hardest problems in computer science.
    • dizhn 20 hours ago
      Feature creep and not invented here too. (Bikeshedding?)
      • taeric 19 hours ago
        I confess "not invented here" is a problem I think too many people focus on. Lots of things are redone all of the time.

        That said, feature creep is absolutely a killer. And it is easy to see how these stack on each other, where people will insist that for this particular project they need to try and reinvent the state of the art in solvers to get a product out the door.

    • iberator 19 hours ago
      This is a stupid and unproven quote. Citation needed. I hate that HN keeps repeating it over and over; it's not even real, nor funny, nor a new joke.

      Try saying that at a job interview if you don't believe me.

      • pixl97 18 hours ago
        And to add further to the joke here, the full saying goes more like:

        > There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

        And if you actually work in software, a very large portion of your hard-to-troubleshoot/fix issues are going to be the above.

        • troupo 17 hours ago
          It's not DNS

          It can't be DNS

          There's no chance in hell it's DNS

          ...

          It was DNS

          • swiftcoder 17 hours ago
            DNS is a special hell: naming things and caching rolled into one!
          • spl757 8 hours ago
            First thing I check: ping 8.8.8.8
      • AlotOfReading 18 hours ago
        It's not to be taken as a serious assessment of actual "hardest problems", but they're all difficult. Naming things is obviously impossible. Everyone gets cache invalidation wrong at first, from Intel/AMD to your build system.
      • swiftcoder 19 hours ago
        > Try saying that at a job interview if you don't believe me

        If your interviewer doesn't at least crack a smile when you make the off-by-one joke, run, do not walk, to the nearest exit. You don't want to work with that dude

      • anyonecancode 18 hours ago
        Well, there's the variation I heard recently:

        There are only two problems in computer science. We only have one joke, and it's not very funny.

      • bena 18 hours ago
        Naming things is one of the hardest problems we have. In general. Taxonomy is incredibly difficult because it is essentially classification.

        And things never fit neatly into boxes. Giving us such bangers as: Tomatoes are fruit; Everything is a fish or nothing is a fish; and Trees aren't real.

      • lo_zamoyski 18 hours ago
        To spell it out for you...

        1. It's a joke. The hyperbole is intentional, but it does communicate something relatable.

        2. You don't need a citation. Probably anyone with enough software development experience understands the substance of the claim and understands that it is (1).

      • razingeden 17 hours ago
        In case you need to hear this again,

        > “Sarcasm is difficult to grasp on the internet, but some people apparently have more visceral reactions to their misunderstanding than others.”

      • antonvs 17 hours ago
        Yes, and we also need a citation about that quote about a horse and a duck walking into a bar. It doesn't sound very likely to me.

        Martin Fowler has some history of this joke: https://martinfowler.com/bliki/TwoHardThings.html

  • themafia 13 hours ago
    Repositories require at least one, but probably multiple, additional semantic layers and client-side filtering that can take advantage of them. Otherwise all you have is a large uncurated catalog with a "take it or leave it" strategy for clients.

    There was a time when this was sufficient. We've moved well past that point.

  • DarkNova6 19 hours ago
    Is it not curious that languages known for their rigor have solid package manager/build tools while the remaining languages do not?

    This is not a technical problem. It’s a cultural one.

    • no_wizard 19 hours ago
      I don’t think those have much to do with it.

      Certainly Go is a more rigorous language than, say, JavaScript, but its package management was abysmal for years. It's not even all that great now.

      C/C++ is the same deal. The way it handles anything resembling packages is quite dated (though I think Conan has attempted to solve at least some of this)

      I think Cargo and others had the hindsight of their peers to draw on, rather than their quality being due to any rigorous attribute of the language

      • the__alchemist 16 hours ago
        Concur: C and C++ are a great example of languages used for rigorous purposes whose building/packaging is a mess. And I think the big advantage Cargo/Rust has is learning from past mistakes, taking the good ideas that have come up and discarding the bad.
      • pjmlp 19 hours ago
        And vcpkg, not only Conan.
      • DarkNova6 13 hours ago
        I mostly had typical application programming languages in mind, such as C# and Java. Go doesn't exactly fit that bill, and I've seen it used more for technical plumbing that needs a good concurrency model. And Maven isn't exactly new.

        Frankly, PHP also has a very good package manager in Composer. In general, PHP has made surprisingly good and sane decisions for the language and has extremely solid support for static typing by now.

        But yeah, Cargo definitely had the benefit of hindsight.

    • AnthonyMouse 17 hours ago
      Tacking package management onto a language is feature creep to begin with. You can pretty obviously have a program in one language that uses a library or other dependency written in a different one.

      The real problem is that system package managers need to be made easier to use and have better documentation, so that everyone stops trying to reinvent the wheel.

    • bee_rider 19 hours ago
      Yes, we can even see—the languages with the best culture and superior rigor have the best package manager: C and Fortran, which just use the filesystem and the user to manage their packages.
  • meisel 17 hours ago
    This all just sounds like the problems we see when making new features, of any sort, for customers: a feature is never objectively done, there are many opinions on its goodness or badness, once it's released its mistakes can stick with it, etc.

    If this is a wicked problem, then so is much of other real-world engineering.

  • pxc 20 hours ago
    All this, and yet package management is still so much better than managing software any other way, and there are continually real advancements both in foundations and in UX. It is indeed full of wicked problems in a way that suggests there can be no clear "endgame". But it's also a space where the tools and improvements to them regularly make huge positive differences in people's computing experiences.

    The uneven terrain also makes package managers more interesting to compare to each other than many other kinds of software, imo.

  • fridder 17 hours ago
    Honestly, just look at the dismal history of Python and package management: easy_install, setuptools, pip(x), conda, poetry, uv. Hell, I might even be missing one.
    • the__alchemist 17 hours ago
      uv (and a similar tool I built earlier) does solve it, with the important note that this was made feasible by standardizing on pyproject.toml and wheel files, and by being able to compile a different wheel for each OS/arch combo and have the correct one downloaded and installed automatically. And, in the case of Linux, the manylinux target. I think the old Python libs that did arbitrary things in setup.py were a lost cause.
      • fridder 17 hours ago
        I hope it solves it, but I've seen that stated before
        • nylonstrung 12 hours ago
          I think uv has genuinely, permanently solved Python package management about as well as it could possibly be solved in 2026

          None of the other pip replacements were actually good software like uv

        • the__alchemist 17 hours ago
          Hah yea I agree with that mindset. Poetry, Pipenv, pyenv, venv and Conda were all fakers for me!
  • mystraline 19 hours ago
    It is and isn't.

    Version hell is a thing. But Nix's solution is to trade storage space for solving the version problem.

    And I think it's probably the right way to go.

  • tonyhart7 17 hours ago
    So what is the "best" package manager humankind has right now?
    • nylonstrung 12 hours ago
      Nix
      • deknos 3 hours ago
        They lost it when they hardwired it to systemd.

        Nothing against systemd, but hardwiring is not a good idea in that regard.

        With guix, you at least install things in a container.

        Sadly, Guix also went the non-Conda route, so you can't use it just as a Conda replacement :(

    • the__alchemist 17 hours ago
      GPOS software: Static-linked executables

      Programming languages: Cargo

  • pydry 19 hours ago
    I don't really agree. Package management has a number of pretty well-defined patterns (e.g. lockfiles, isolation, semver, transactionality) which solve use cases that are largely common across package managers.

    It is unfortunately one of the most thankless tasks in software engineering, so these are not applied consistently.

    This was symbolized quite nicely by Google pushing out a steaming turd of a version 1 of Go package management while simultaneously putting the creator of Homebrew in the no-hire pile because he couldn't reverse a binary tree.

    In this respect it is a bit like QA - neglected because it is disrespected.

    What makes it seem like a wicked problem is probably that it is the tip of the software iceberg.

    It is the front line for every security issue and/or bug, especially the nastiest class of bug - "no man's land" bugs where package A blames B for using it incorrectly and vice versa.

    • cxr 19 hours ago
      Every package manager lock file format or requirements file is an inferior, ad hoc, formally-specified, error-prone, incompatible reimplementation of half of Git.

      Supply chain vulnerabilities are a choice. It's a problem you have to opt in to.

      <https://news.ycombinator.com/item?id=46008744>

      • spankalee 17 hours ago
        There is actually a huge difference between checking in all of your dependencies and checking in a lock-file. Some people work with hundreds of repositories on their local machine and checking in dependencies would lead to massive bloat. It really only works if you primarily work in a single monorepo.
        • cxr 16 hours ago
          > It really only works if you primarily work in a single monorepo.

          That's simply not true; it doesn't come down to "monorepo-or-not?"

          It comes down to whether or not the code size of an app's dependencies and transitive dependencies is still reasonable or has gotten out of control.

          The trend of language package managers to store stuff out of repo (and their recent, reluctant adoption of lockfiles to mitigate the obvious problems this causes*) is and always has been designed to paper over the dependency-size-is-out-of-control problem—that's the reason that this package management strategy exists.

          You can work on dozens of projects (unrelated; from disjoint domains) that you maintain or contribute to while having all the source for every library/subroutine that's needed to be able to build the app all right there, checked into source control—but it does mean actually having a handle on things instead of just throwing caution to the wind and sucking down a hundred megabytes or more of simultaneously over- and under-engineered third-party dependencies right before build time.

          It's no different from, "Our app consumes way too much RAM", or, "We don't have a way to build the app aside from installing a monstrously large IDE" (both belonging to the category of, "We could do something about it if we cared to, but we don't.")

          > There is actually a huge difference between checking in all of your dependencies and checking in a lock-file.

          Yes, huge difference indeed: the hugeness of YOLO maintainers' dependency trees.

          * what could possibly go wrong if we devise a scheme to subvert the operations of a tool where the entire purpose of it was to be able to unambiguously keep track of the revisions/content of the source tree at a given point in time?

      • jen20 6 hours ago
        _in_formally specified, surely?
    • hansvm 19 hours ago
      Assuming the binary tree thing is the whole story, that still doesn't sound like a terrible choice on Google's part. Your first few years at Google you won't have enough leeway to do something like "make homebrew," and you will have to interact with an arcane codebase.

      For tree reversal in particular, it shouldn't be any harder than:

      1. If you don't know what a binary tree is then ask the interviewer (you probably _ought_ to know that Google asks you questions about those since their interview packet tells you as much, but let's assume you wanted to wing it instead).

      2. Spend 5-10min exploring what that means with some small trees.

      3. Then start somewhere and ask what needs to change. Clearly the bigger data needs to go left, and the smaller data needs to go right (using an ascending tree as whatever small example you're working on).

      4. Examine what's left, and see what's out of order. Oh, interesting, I again need to swap left and right on this node. And this one. And this one.

      5. Wait, does that actually work? Do I just swap left/right at every node? <5-10min of frantically trying to prove that to yourself in an interview>

      6. Throw together the 1-5 lines of code implementing the algorithm (sketched below).
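
      In TypeScript, those few lines might look something like this (a sketch, with names made up for illustration):

        interface TreeNode {
          value: number;
          left?: TreeNode;
          right?: TreeNode;
        }

        // Swap left and right at every node, recursing into both subtrees.
        function invert(root?: TreeNode): TreeNode | undefined {
          if (!root) return undefined;
          const oldLeft = invert(root.left);
          root.left = invert(root.right);
          root.right = oldLeft;
          return root;
        }

        // { value: 1, left: {2}, right: {3} } becomes { value: 1, left: {3}, right: {2} }.
        console.log(JSON.stringify(invert({ value: 1, left: { value: 2 }, right: { value: 3 } })));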

      It's a fizzbuzz problem, not a LeetCode Hard. Even with significant evidence to the contrary, I'd be skeptical of their potential next 1-3 years of SWE performance with just that interview to go off of.

      That said, do they actually know that was the issue? With 4+ interviews I wouldn't ordinarily reject somebody just because of one algorithms brain-fart. As the interviewer I'd pivot to another question to try to get evidence of positive abilities, and as the hiring manager I'd consider strong evidence of positive abilities from other interviews much more highly than this one lack of evidence. My understanding is that Google (at least from their published performance research) behaves similarly.

  • iberator 19 hours ago
    apt-get solved this 'problem' like 25 years ago.
    • EvanAnderson 18 hours ago
      RPM "solved" it too.

      I hate package management so much. I hate installing unnecessary cruft to get a box with what I want on it.

      It makes me pine for tarballs built on boxes w/ compilers installed and deployed directly onto the filesystem of the target machines.

      Edit: I'd love to see package management abstracted to a set of interfaces so I could use my OS package manager for all of the bespoke package management that every programming language seems hell-bent on re-implementing.

      • dzr0001 17 hours ago
        I think there's a fundamental difference between programming language repos and package repositories like the official RPM, deb, and ports trees.

        These (typically) operating system repos have oversight and are tested to work within a set of versions. Repositories with public contribution and publishing don't have any compatibility guarantees, so the cruft described in the article must be kept indefinitely.

        Unfortunately, I don't think abstracting those repositories to work within the OS package ecosystem would solve that problem and I suspect the package manager SAT solvers would have a hard time calculating dependencies.

        • EvanAnderson 17 hours ago
          I agree re: the fundamental difference when it comes to compiled languages. I wrote rashly and out of frustration without thinking about it too deeply.

          re: interpreted languages, though, I think it's still a shit show. I don't want to run "composer" or "npm" or whatever the Ruby and Python equivalents are on my production environment. I just want packages analogous to binaries that I can cleanly deploy / remove with OS package management functionality.

      • themafia 13 hours ago
        > It makes me pine for tarballs built on boxes w/ compilers installed and deployed directly onto the filesystem of the target machines.

        You're effectively describing Gentoo.

        Just a personal opinion but it's awesome.

    • Am4TIfIsER0ppos 18 hours ago
      Isn't it `apt` these days?
      • droopyEyelids 18 hours ago
        Your parent comment is referring to its inception, 25 years ago.