> Ironically, one of the reasons for the rise in the number of command line options is another McIlroy dictum, "Write programs to handle text streams, because that is a universal interface" (see ls for one example of this).
That is unfortunately not the reason.
There are 2 main reasons why today (GNU) command line utilities have a lot of options:
1). GNU. Apparently they like to add features to programs
2). Heritage of SYSV/BSD. Traditionally BSD options were different from SYSV and the new GNU utilities were made to understand both.
Adding features isn't itself a bad thing. But when you cannot iterate or remove features (even the smallest change to existing functionality could break a script somewhere!) then you end up with the jumbled mess we have.
I don’t see an argument from you that (1) wasn’t caused by the point that Luu raises.
It’s not strange to end up adding features to programs if the effective IPC is so impoverished that you would need to sandwich a bunch of repetitive formatting commands if you didn’t add some convenient built-in formatting options.
The UHHB has very few points that are still valid. It is a humorous piece and was written about 3 decades ago. We shouldn't continue citing it as valid criticism.
> The UHHB has very few points that are still valid.
I disagree. Yes, nobody uses sendmail anymore, filesystems have vastly improved and X isn't a resource hog any more. I reread it some months ago and the main points still hold.
For example the part about `kill`:
Most operating systems have a “kill” command. So does Unix. In most operating systems, the kill command kills a process. The Unix implementation is much more general: the “kill” command sends a process a message. This illustrates the first Unix design principle:
• Give the user power by making operations fully general.
The kill command is very powerful; it lets you send all sorts of messages to processes. For example, one message you can send to a process tells it to kill itself. This message is -9. -9 is, of course, the largest single-digit message, which illustrates another important Unix design principle:
• Choose simple names that reflect function.
I don't find the argument about kill convincing. You can use `kill pid` and it will send a SIGTERM signal to the process. The -9 option is only necessary when you want to send a SIGKILL signal. I don't think it is a bad thing that kill gives users the option to choose which signal to send. I only use the -9 option when I want to terminate a process without letting it handle the signal.
While I agree that there's much of the UHHB that is still valid (and what's not valid is still funny), I don't think that's a good example. Modern kill allows you to provide the signal name if you wish. I don't think it's a particularly good criticism of kill that it also allows you to use a numeric shorthand if you wish (even if that shorthand is very popular).
I think the UHHB is mostly worth reading to understand the absolutely dismal state of proprietary Unix in the 90s. It explains the technical reasons that contributed to the growth of both Windows NT and GNU.
I mean the empirical evidence alone seems to point at the Unix Philosophy not actually being popular in practice. GNU tools are much more widely used on boxes humans interact with than BSD/busybox. You don't see everyone trying to figure out how to install all of the suckless tools immediately after installing linux, etc. And then of course we get into systemd and that whole ball of wax.
I do think there's a growing split between server like things and desktop like systems that people actually regularly use. For example, Alpine is super popular in containers, not so much on desktops. Same thing with Busybox, but that's more explicitly meant for embedded envs.
Almost nobody used the original Unix-tools on any Unix (like Irix, Solaris, AIX, HP-UX and Tru64). Everybody installed the GNU tools, that's why after some time every vendor shipped them with their Unix as add-on disks.
Not really, because devs would have to put up with what the likes of people like myself actually made available on the servers they had to telnet into for development.
The only big UNIX where I actually did install GNU stuff was on Solaris 2.6 via the GNU packages repository, because I was requested to do so.
All our HP-UX and Aix boxes used the vendor tooling (including the C compilers).
> Not really, because devs would have to put up with what the likes of people like myself actually made available on the servers they had to telnet into for development.
Fortunately that didn't work when you needed 3D graphics hardware. But I guess you worked in a 'server' company, not a graphics (3D or engineering) one?
> All our HP-UX and Aix boxes used the vendor tooling (including the C compilers).
Of course, GCC was only needed for most of the later (early 2000s) OSS like Mozilla that didn't work with the native compilers because of missing GNU extensions.
That's one thing I don't miss - the prices of compilers. SGI took more than 10,000 € for their MipsPro C, C++ and Fortran compilers but without 'advanced' optimization options like auto parallelization and their 'IDE' tools, Developer Workshop.
That reminded me of TotalView (a debugger from ???): Perforce has since bought them
https://totalview.io/
Sorry, I badly phrased that. I was talking about a company that used Unix on their servers, instead of workstations (usually for 3D Work). Although many used Indys and later O2s and Octanes as terminals for their Crimson or Onyx (2) (but not via telnet).
That's what _I_ did and in these environments (usually Irix, some Solaris, AIX and HP-UX) the GNU tools were usually installed (except on single purpose computers that only ran one program, like F/F/I, Softimage or Maya or some 3D CAD).
Maybe in the big automotive companies (using 3D CAD/CAM) the admins were, well 'enterprisey', but normally you were root on your workstation as developer.
Sure, I didn't mean to make it sound like there was only one way to do it, I just happened to be more used to such enterprise-style configurations.
Actually one way to mix both, while controlling what people were allowed to do on their workstations was to boot via tftp and have $HOME mapped noexec over NFS.
A solution used in some labs at the university campus.
Kind of hard to separate cause and effect there. Why does my dad prefer Windows? He's used it for 30 years and he's used to it. If he switched to Ubuntu, would he prefer that? Perhaps, but there's a lot of stuff to learn. So he prefers Windows, in some sense.
I think for a large number of people it would make no difference as they are in the browser at all times, or the software they use on the daily is available for Linux, too.
> I mean the empirical evidence alone seems to point at the Unix Philosophy not actually being popular in practice. GNU tools are much more widely used on boxes humans interact with than BSD/busybox.
I think that I can present an even better piece of empirical evidence (not to diminish the poignancy of your own):
Almost no large or successful piece of software is composed of Unix command-line utilities.
Firefox, Linux, gcc, Visual Studio Code, Windows, Chrome, Google services, Discord, Spotify, Blender, GIMP, Apache, Slack, Office, Steam, Matrix/Element...almost every single non-trivial piece of software in existence is not composed out of Unix shell utility pipelines.
"Do one thing well" allows you to do one thing well, tautologically - but it doesn't allow you to do many things well. The fact that Unix utilities compose at a very superficial level is irrelevant, because the use of text as an interchange format is such a heart-stoppingly bad idea (consider a programming language where the only type is a string - you wanna write a web server in that?) that it renders Unix tools near-useless for building larger tools (as evidenced by the fact that nothing is made with them), and only useful for immediate interactive use, and building small scripts with (that would already be better served by a real programming language).
The Unix Philosophy is self-contradictory. "Do one thing well" is meant to enable composition, for building larger programs - but "Text is the universal interface" is in direct opposition to building larger programs. But, because tools are only supposed to do one thing well, they're feature-impoverished and not very useful on their own, so you have to compose them. You're screwed either way.
Shell programming through small cli tools was never meant to scale into massive complex software so you're comparing apples and bananas.
Do one thing well and text as the universal interface works well in the small for parsing files, doing sys admin tasks, gluing a few things together. That's what it's intended for. It's obviously the wrong tool to build a system in. Used in the right context they compose beautifully.
Different tools serve different purposes. Just because you wouldn't write a web browser/server/IDE/OS in a shell scripting language doesn't mean they don't have value or are failed paradigms.
I've written tons of shell scripts for one-off jobs or simple work that would've taken 2, 3, maybe 4 times the amount of time to write in a "real" programming language. (E.g. throwaway code to generate 10,000 test files that are slightly different would take me 5-10 minutes in bash, but maybe half an hour in node or python.)
Passing around strings isn't the BEST paradigm but it's extremely flexible when you need flexibility. There are times when you need something more solid and that's fine too.
And fwiw I'd bet a majority of the software you listed uses bash somewhere in either their build process or SAAS stack.
On the projects I've hacked on, a lot of time has been spent trying to replace bash and autotools in the build process and reduce its usage. They are brittle and nobody really likes to maintain giant shell scripts filled with fragile regular expressions and other unreadable awk invocations. Plus I've lost count of the number of build scripts I've seen that do strange things like failing to handle spaces in paths because something is not escaped correctly. They are really not a good solution. This can be caught with CI, but if you use any kind of CI then you have even more incentive to get rid of it because bash and makefiles (which fork an additional shell on every command) are very slow.
Different tools serve different purposes, yes. However, some tools have much narrower uses than others. For instance, most developers agree that writing programs in assembly language is a very bad idea for about 99.9% of programs (1 out of 1000 - which is generous). And, from both empirical evidence, and reasoning from first principles (stringly-typed, quoting and escaping issues, lack of tooling, low performance, implementation-defined behaviors, poor ecosystem, try-to-continue instead of fail-quick design), we can see that shell is entirely unsuited for everything except interactive use.
Moreover, "time to write script" by itself is a bad metric, for several reasons: (1) the cost of understanding and maintaining the script is ignored (and obviously higher for bash than Python) (2) you're not including the higher likelihood of bugs in your bash programs (and all it takes is a single bug that requires 20 minutes to find and fix...) (3) if you had more experience with Python (or with specific libraries useful for writing shell-script-like things - this often goes overlooked) then your development time would probably be shorter and (4) the pool of Python devs is far greater than the pool of bash devs.
> Passing around strings...[is] extremely flexible when you need flexibility
This isn't accurate. Passing around typed data is just as flexible as strings - except that it adds a bunch of extra safety and type-checks. All data in "string" format, with the singular exception of actual English text that is being passed to an NLP system or something, is actually in a structured text format (the operations that you perform on which are merely isomorphic to existing operations on structured data)...that is very likely under-specified relative to, and more brittle than, actual typed data. All that using a string representation does is make errors easier to make and harder to find.
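To make that concrete (a small generic Python sketch, not taken from any of the tools under discussion): numeric data that travels as text sorts lexicographically unless every consumer remembers to convert it, which is exactly the kind of silent error typed data rules out.

    # File sizes that arrived as text, e.g. from a pipeline or a CSV column.
    sizes_as_text = ["9", "10", "2"]

    print(sorted(sizes_as_text))                   # ['10', '2', '9'] - lexicographic, silently wrong
    print(sorted(int(s) for s in sizes_as_text))   # [2, 9, 10]       - typed data sorts as intended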
> And fwiw I'd bet a majority of the software you listed uses bash somewhere in either their build process or SAAS stack.
Yes, because bash is a (bad) habit for tooling devs, not because it scales well or adds safety/performance/maintainability - as evidenced by the fact that there aren't any large pieces of software written in it.
That is - bash is used "somewhere" and not "everywhere" because you can only get away with it at small scales - at large scales, it just falls apart.
> Firefox, Linux, gcc, Visual Studio Code, Windows, Chrome, Google services, Discord, Spotify, Blender, GIMP, Apache, Slack, Office, Steam, Matrix/Element...almost every single non-trivial piece of software in existence is not composed out of Unix shell utility pipelines.
All true.
However, the development environment (and frequently the build systems) used for all of the above are likely substantially composed out of Unix shell utilities and pipelines, aided by very unix-y tools such as Perl or awk.
Nice point. Pretty sure GP's observation, while correct, is wrong-headed, and something akin to saying, "Honda, Toyota, Volkswagen, Audi, Porsche, Chevrolet, Ford and even Tesla... every single non-trivial transportation vehicle in existence is not composed out of garage tools."
Thanks! I really appreciate your straw man and shameless baiting, but my point was made. To be perfectly clear, if I can dumb it down a bit, though the observation is correct and maybe even mildly interesting, maybe it is even a unique observation no one else has noticed, it is, regardless, not a valid criticism, in the same way it is not a valid criticism of garage tools that cars are not made out of them. So criticizing UNIX commands because no major applications are made out of them is, if not wrong-headed, then it is barking up the wrong tree, because, at least my understanding is, major applications are developed with programming languages. I suspect there are a myriad of programs that fail to rise to the level to be classified as major applications that are basically ugly and slow and strung-together UNIX commands, so it can be done, if desired, just as I expect one could build a vehicle out of garage tools if one had the drive to do so rather than use the appropriate materials. So there is perhaps some question begging in your observation, in the hidden and incorrect assumption that "UNIX tools are intended to be used as elements of major applications, that's what they're for", so that you may criticize in the precise way that you did. There are plenty of valid criticisms of UNIX tools, but yours is not one of them. Though, it seems obvious, you have a keen and interesting mind, possibly with a touch of an inability to be disagreed with or to tolerate criticism of your observations. Maybe I'm wrong, it just seemed that way just then.
Point is clearly made for all to see, but I will give it another attempt to help you understand. Your criticism of UNIX commands is not valid in the same way that it is not a valid criticism of garage tools that major vehicle manufacturers do not build cars out of garage tools. My understanding is major applications aren't made using UNIX commands because they are ordinarily developed using programming languages. So your observation, while incidentally true, as a criticism it is invalid, wrong-headed, misguided, or barking up the wrong tree, whichever way strikes you with the least amount of grief and personal discomfort. I believe your observation is, in fact, a straw man fallacy in the precise way it is fallacious if I observe, as a criticism of UNIX tools, that no major coffee manufacturers make their product with UNIX tools. While this is probably true, as an argument or criticism of UNIX tools it is simply not valid and not a cogent criticism.
No, it is not. You said "wrong-headed" and then made a very vague analogy. Your point is the opposite of "clear" - in fact, it's unclear whether it exists at all.
> Your criticism of UNIX commands is not valid in the same way that it is not a valid criticism of garage tools that major vehicle manufacturers do not build cars out of garage tools.
That's not a valid argument, either. That's a flawed, leaky, vague analogy that does nothing to actually describe, concretely, what your point is. To summarize: you made a non-argument, called it "clear", and then made the same non-argument again.
Here are my points, concretely: the fact that no major application has ever been built even mostly out of Unix shell very strongly indicates that it's unsuitable as a programming language, in addition to indicating its unpopularity. The fact that the most popular variants of Unix shell do not adhere well to the Unix philosophy also indicates that the Unix philosophy isn't a very good one for interactive use, either. The fact that all of the Unix shells (whether Unix-y or not) aren't used at all for building large applications also indicates that the Unix philosophy isn't suitable for building large programs.
> My understanding is major applications aren't made using UNIX commands because they are ordinarily developed using programing language.
The Unix shells are programming languages that implement the Unix philosophy. The fact that they aren't used to build any large or particularly useful piece of software means that the Unix philosophy itself isn't useful for those things.
> a straw man fallacy in the precise way it is fallacious if I observe as a criticism of UNIX tools that no major coffee manufacturers make their product with UNIX tools.
You used a lot of fancy words here to conceal the fact that your point is completely invalid. There's no relationship between coffee and Unix tools, but there's a very strong relationship between the Unix philosophy and the Unix tools. Therefore, the Unix tools not being used to make large and useful systems directly implies that the Unix philosophy is bad at the same.
Moreover, one of the selling points of the Unix tools is that you can use them as a programming language and compose them to make larger programs - so, the fact that people do not do that is, indeed, a valid and cogent criticism of them.
Very insightful. I've been thinking about this myself.
One of the possibilities is "just legacy code, bad habits, and bandwagoning", but that's a bit of a cop-out.
Another is "Unix utilities have a much better API to the filesystem and invoking other programs than most languages available today" - which is a (somewhat more-) concrete issue that we can address in other languages!
A third possibility is that the Unix tools transition very well from interactive use to batch use in a larger program - again, something that we can work on improving in other languages. Emacs seems to also do this - functions that are usable both in interactive mode (either invoked with M-x or through a keybinding) and as pieces in larger scripts/functions are pervasive. Breck Yunits calls these "user methods"[1] and I think it's a good name for a great pattern.
> use of text as an interchange format is such a heart-stoppingly bad idea ... consider a programming language where the only type is a string
Posix shell is a programming language where the only type is a string. Text as an interchange format works reasonably well in practice for a wide range of tasks.
To me, the shell, with its easy combination of unix commands and text as an interchange format, is a clear example of the New Jersey style from "Worse is better" [1], and the critique comes mostly from followers of the MIT/Stanford style.
In my career I wrote hundreds of small shell scripts which use unix commands and text as an interchange format. Yes, I see that some bugs can be prevented/detected by using something more strongly typed than shell, but all my scripts worked reliably enough and didn't suffer from any significant bugs, so why bother? (disclaimer: I'm old fashioned and like to read man pages; I imagine shell can be full of traps if you ignore documentation and use only a trial and error approach or practice stackoverflow driven development).
Of course for large projects, and even some small ones, other tools/languages fit better than shell, but shell has its niche.
> Posix shell is a programming language where the only type is a string.
You got it!
> Text as an interchange format works reasonably well in practice for a wide range of tasks.
Solutions using real programming languages score better on almost every conceivable metric (robustness, tooling, debuggability, maintainability, extensibility, performance, library ecosystem, developer pool, language pitfalls) other than initial time-to-develop, where shell only wins by a small margin (e.g. 10 minutes to develop a script in a real language instead of 2 in shell) - so, unless you're developing software purely for your own enjoyment, it's somewhat irresponsible to use shell instead of Python (or some other real, but script-amenable, programming language).
> To me, the shell, with its easy combination of unix commands and text as an interchange format, is a clear example of the New Jersey style from "Worse is better" [1], and the critique comes mostly from followers of the MIT/Stanford style.
The New Jersey style is not only flawed on a theoretical level (just go look at any of those characteristics - tell me that it's more important for your bank or X-ray machine software, or your compiler or firewall to be simple than correct - and it's pretty clear that software that prioritizes simplicity above all else doesn't actually usefully serve humans, which is the point of software in the first place) but also unpopular on a practical level for actually building useful pieces of software.
The New Jersey style is a style guide that is only useful for making art projects (where the value is in beauty, not utility) that aren't required to be performant, maintainable, feature-complete, or have correct behavior.
> all my scripts worked reliably enough and didn't suffer from any significant bugs so why bother
Because of the above list of metrics that real languages score better than shell on - the only thing that shell wins on is initial time-to-development! It's going to be harder for others to maintain, debug, optimize, understand, and extend your scripts after you leave, harder to diagnose bugs, harder to find people who can write bash...
> shell has its niche
Yes - but it's an extremely narrow one where time-to-development is the most important metric and you're only working with small problems (since above a relatively low problem complexity, shell scripting is slower than using real languages anyway) - and these conditions are very rarely true at the same time (e.g. in infrastructure you might have a small problem, but correctness is far more important than time-to-development - while in a startup you want low time-to-development, but your problems are non-trivial).
I think the whole value of the Unix tool ecosystem is that you don't need to build large pieces of software. If you look at all the GUI tools that Microsoft had to build for Windows administration in the earlier days and how they eventually came out with PowerShell because of the inadequacy of those tools, you can see how successful Unix tools have been by the --lack-- of big software that does their job.
Git is probably the most notable example of a complex software program that is composed of a series of independent tools, though gcc is actually also architecturally a pipeline. Visual Studio Code might also not be a great example: it's an IDE whose IDE features are based on separate process servers which communicate over a textual protocol... it's very Plan 9 esque.
> GNU tools are much more widely used on boxes humans interact with than BSD/busybox. You don't see everyone trying to figure out how to install all of the suckless tools immediately after installing linux, etc.
Or perhaps those BSDers, suckless tool users, etc get setup and quietly just get to work. I feel that’s plausible.
I think you’re right re: server/desktop split, which decades ago might have been hard to tell apart. Is desktop-computing and some loud but ignorant (of Unix) majority directing the future of Unix down the garden path or are the traditionalists out of touch and just yelling at clouds and kids to stay off their lawn?
I fall into the traditionalist camp - I think the Unix that got us here is Good and to know it is to love it.
I think pushing back against complexity is (and has been for a long time) an additional challenge for BSD/Linux, besides its primary role of being a great compute environment. The BSDs have done marvellously here.
> I mean the empirical evidence alone seems to point at the Unix Philosophy not actually being popular in practice.
I’m worried this is polling 100 people off the street about how they care for their kitchen knives and trying to tie that back to Michelin Star restaurant kitchens. I don’t know…
Usually, both things happen: the Unix Philosophy gets uncritical praise, and many people complain about how unstructured shell application data is. The great thing about this essay is that it connects the two dots and finally concludes that “do one thing well” and “text is the universal interface” go together like oil and water.
> The Unix philosophy of “do one thing well” doesn’t actually work that well. Dan Luu explains this better than I could.
That quote from the Uncomfortable Truths article is strange because Dan Luu doesn't explain that "it doesn't actually work well". He just explains that the philosophy is not consistently followed, but he is not unhappy with it.
> He just explains that the philosophy is not consistently followed, but he is not unhappy with it.
Which is exactly the reason for the glut of command line options.
A great, more recent example of following the Unix philosophy is the ssh client on plan 9. It is split into three separate programs, each with their own man page: ssh(1) for terminal access, sshfs(4) for mounting a remote file tree, and sshnet(4) which imports a remote machine's TCP stack. Keep it simple, stupid.
In years past, some Unix utilities used to do things like silently truncating lines that were too long for the static buffers used by those tools. The GNU Coding Standards document (originally written in the early 90s I believe) specifically says not to do things like this (GNU's Not Unix, after all..)
> For example, Unix programs often have static tables or fixed-size strings, which make for arbitrary limits; use dynamic allocation instead.
> Avoid arbitrary limits on the length or number of any data structure, including file names, lines, files, and symbols, by allocating all data structures dynamically. In most Unix utilities, “long lines are silently truncated”. This is not acceptable in a GNU utility.
There's a subtle thing about perl that's not mentioned often, which is how it borrowed ideas from many other tools, and that made perl "intuitive" to learn for people who knew the other tools. But once you're embedded in just perl for a while, you forget the other tools, and then perl's grab-bag of borrowed ideas becomes sort of burdensome; it loses "intuitiveness".
Only after I had to do some non-trival Bash scripting did I stop hating Perl's syntax. For one thing, at least Perl wasn't as bad as Bash. For another, I now understood where Perl came from!
Perl's syntax is a relic of its time, and it blazed a trail for what to do (and what to avoid!) in later scripting languages.
An interesting alternative approach to Unix pipes is systems where everything is an object and where objects can be composed, similar to the idea of function composition in mathematics (e.g., f(g(x))). This can not only be used for command-line applications (for example, PowerShell), but this approach can be taken in graphical-user interfaces. The Smalltalk-80 environment is the ideal substrate for creating such an ecosystem, and Windows has support for implementing such component-based technology, namely (1) Microsoft's OLE (https://docs.microsoft.com/en-us/cpp/mfc/ole-background?view...), (2) the Component Object Model (https://docs.microsoft.com/en-us/windows/win32/com/the-compo...), and (3) the .NET Common Language Infrastructure (https://en.wikipedia.org/wiki/Common_Language_Infrastructure).
Back in the mid-1990's Apple once promoted OpenDoc (https://en.wikipedia.org/wiki/OpenDoc), a standard that was envisioned to support an ecosystem of component-based software, where users could mix and match components to create modular solutions instead of depending on large, monolithic software applications. Here is a nice short video promo from roughly 1994 describing OpenDoc: https://www.youtube.com/watch?v=oFJdjk2rq4E.
OpenDoc was nixed when Steve Jobs returned to Apple; my opinion for this nixing is because (1) Apple at the time needed to focus on one technical direction (OpenDoc was one of many competing software visions at Apple before Steve Jobs united Apple toward a software vision built on top of OpenStep/Cocoa), and (2) it's hard to promote a component-based software ecosystem when the Mac was on life support and needed support from popular vendors of large, monolithic applications (I'm talking mainly about Adobe and Microsoft) in order for Mac users of these applications to stay on the Mac. I believe this is the same reason why we haven't seen much of a component-based software ecosystem on Windows outside of Microsoft Office's interoperability among its applications: it's easier for software vendors to sell integrated solutions than to sell components that users will have to integrate themselves.
I wish the FOSS desktop ecosystem, which doesn't have the same commercial concerns as the world of proprietary software, embraced component-based software beyond Unix pipes. However, the FOSS desktop ecosystem rallied behind KDE and GNOME in the latter half of the 1990's, and thus many of us still rely on large applications like LibreOffice, Firefox, and the GIMP instead of component-based alternatives.
Components are used quite a lot on Windows, especially in dev tooling; have a look at DevExpress, Telerik and so on.
The biggest pain with COM is that Microsoft (except for C++/CX) never put much effort into making it a pleasant experience, the way Borland or VB 6 did.
.NET was supposed to replace it, but WinDev rebelled against it, so now we have both, with COM still having the tooling stuck in 2000 (C++/WinRT is a regression to those days).
On UNIX side, there are a couple of possibilities gRPC, D-BUS, DCOP, KParts, Android Binder/Intents,...
macOS builds on the NeXT Tooling and also offers XPC in addition to them.
However where BSD and Linux clones fail is having a full stack approach to the platform, thus turning such approaches in mini-silos, e.g. you can enjoy D-BUS alongside KParts as long as you stay with KDE and KDE specific apps.
The comment adds nothing, since it lacks any context and presents nothing but a vague opinion. It's actually a recursive comment, because the poster heard that programmers love recursion.
A gentle suggestion for bloggers: if you're going to write an article with a title like "The growth of command line options, 1979-Present", please also arrange to prominently display the date of composition.
Really, any blog post should have a prominent date on it. For a lot of things, it's essential context. Older posts can be useful/interesting but they can also be pretty much completely useless depending upon the topic.
Maybe it's because I'm not the biggest Unix purist, but I don't really mind the proliferation of command line options. The two things that do irk me are:
* Using short options in scripts. Scripts should almost always use long options. As a corollary, tools should always provide long options unless the tool isn't intended for scripting.
* Lack of discoverability. We still don't have a standard way to ask an arbitrary tool which flags it supports, or for a rough semantic mapping of those flags (e.g. the classic -v/-V verbose and version confusion). Developing a standard here would go a long way towards automating things like tab completion without each tool needing to maintain N scripts for N shells or having its own slightly-different `--completions=SHELL` generator.
The last point would be amazing, imo. It's unfortunate that, because of history, commands accept and parse arbitrary strings as input instead of formally specifying it like a function signature + docblock. If I could rewrite the universe, commands would use a central API for specifying the names, data types, descriptions, etc. for all the input they take. In our timeline, maybe some file format could be standardized that describes the particular inputs and options a command takes, and e.g. shells would hook into it. Kind of like header files, but distributed either in a community repository or by each of the tools themselves.
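To illustrate the idea (purely hypothetical: no such standard exists today, and the format and field names below are invented for the sketch), the spec could be a small machine-readable description of a command's inputs that shells read for completion and validation, here modeled as Python data:

    # Hypothetical machine-readable description of a command's inputs.
    # The schema is invented purely for illustration; nothing like it is standardized.
    grep_spec = {
        "command": "grep",
        "options": [
            {"flags": ["-i", "--ignore-case"], "type": "bool",
             "description": "case-insensitive matching"},
            {"flags": ["-e", "--regexp"], "type": "string", "repeatable": True,
             "description": "pattern to match"},
        ],
        "positional": [{"name": "files", "type": "path", "arity": "*"}],
    }

    def complete(spec, prefix):
        """Toy tab-completion: every flag that starts with the typed prefix."""
        return [flag for opt in spec["options"] for flag in opt["flags"]
                if flag.startswith(prefix)]

    print(complete(grep_spec, "--i"))   # ['--ignore-case']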
Yeah, the inertia is the real killer here. I'd love to see a fully embedded solution (particularly given that ecosystems like Go and Rust discourage sidecar files like manpages), but thus far the only one I've really seen is individual tools supporting a `--completions=SHELL` style argument that spits out an `eval`-able script for the requested shell.
The real dream would be some kind of standard way for a program to indicate that it's opting into more intelligent CLI handling without any execution. I've thought about trying to hack something like that together with a custom ELF section, but that wouldn't fly for programs written in scripting languages.
> ... commands accept and parse arbitrary strings as input instead of formally specifying it like a function signature + docblock. If I could rewrite the universe, commands would use a central API for specifying the names, data types, descriptions, etc. for all the input they take
Which is exactly what powershell does: Commands in powershell (called "cmdlets") are like functions with type hints. A command does not parse the input strings itself, rather it exposes this type information to the shell. The shell is the one responsible for discovering the parameters (and types) and parsing the command line parameters and coercing types before passing them (strongly typed) to the actual command.
This means that the information about parameter names and types is readily available for doc-generation, auto tab-completion and language servers which allow syntax highlighting, completion etc to work even inside editors such as vscode or emacs.
The point is that to specify a cmdlet you must declare parameters, in much the same way that for a function in a programming language to accept parameters it must declare those as formal parameters.
And with PowerShell Crescendo, the same experience can be provided for native commands, although I think it is a bit too much expecting everyone to create Crescendo configuration files.
> In our timeline, maybe some file format could be standardized that describes the particular inputs and options a command takes, and e.g. shells would hook into it. Kind of like header files, but distributed either in a community repository or by each of the tools themselves.
This sort of sounds like what we're working on at Fig. We've defined a declarative standard for specifying the inputs to a CLI tool and have a community repo with all supported tools: https://github.com/withfig/autocomplete
And documentation, or when answering questions. SE answers are often like "just type `blah -jraiorjg` and you're done", which means having to look up all these options in the manual. Short options should be the exception, when working in the shell mostly, not the rule.
Re-typing commands from SE answers without consulting the man page is a bad idea even with long options. SE is a good place when you don't know where to start or which man page to read, but once you see an answer it is better to check in the man pages/docs how the suggested command would work and what it will do. But I agree that if you want to help someone who doesn't know a particular command it is better to use long options.
Yes, there's stuff that you can't get just from parsing the man page too, but it's a huge help. I know it's not done at every startup; I have that running as part of my "update everything" script.
It would be nice to have a uniform, standardized basic command on all platforms to get information about a command, like 'help'. Annoyingly, when I try to find a command's options in the terminal I have to guess between help, h, -help, --help, -h, and other variants. Can we just have a standardized basic command so everyone can expect certain things, instead of making users check the manual or the documentation (sometimes they don't even list it there)? Don't make it a pain for the end-users.
Unportability is, for better or worse, a moot point when the GNU coreutils form the de facto standard for most actual shell scripting.
It's also not relevant in the context of shell scripting with tools that aren't intended to be part of POSIX or any other standard. Plenty of ecosystems and workflows are built on proprietary, internal, or just not standardized tools.
For most linux distributions sure, GNU might be considered a standard. But if you want your script to work on macOS or BSD, or even for example Alpine linux, then you can't just treat GNU as standard.
There are two different kinds of satisfaction:
1. Having a problem and solving it with a small amount of effort.
2. Having a mental model of some external thing and delighting in seeing that your model is accurate with respect to the actual thing.
Elegant, minimal systems excel at providing #2. Big sprawling systems with lots of options and special cases excel at providing #1.
Many software engineers seem to be calibrated to extract a lot more joy from experiencing #2 than #1. It's probably something to do with the nature of the work. We work "backwards" by having a mental model of a thing we wish existed and then bring it into being. If we didn't find deep satisfaction in harmony between our mental model and software model, we wouldn't have the drive to keep fixing all of the bugs in our software.
But there is nothing intrinsically superior to the joy of #2 compared to #1. It's just a psychological/emotional preference. Many people, if not most, outside of the world of software engineering seem to prefer #1. They just like to get stuff done and make stuff happen. You rarely see woodworkers lamenting about the variety of clamps, jigs, and tools they have. Instead, they delight in finding just the right specific tool to make a job easy. Fishermen love their tackleboxes.
Of course, there is an appreciation of #2 in all of us, and you see minimalism everywhere. But I've never seen it raised to such a religious, moral level as I see in software engineering. It's important to remember that that's a psychological choice we're making—a personal preference we impose—and not something fundamental to systems.
> in reality you spend so much time configuring and debugging that you rarely actually save any time.
Conversely, that's the sales pitch for minimalism. But the reality is that well-intentioned, thoughtful designers and users seem to end up spending most of their time building and using systems with a lot of complexity. If it really ended up costing them more time to use those systems, it begs the question of why we keep building them.
> it begs the question of why we keep building them.
It's harder to build a minimal system which fits users' needs like a glove, than to stuff in options to fit each new issue discovered. So developers build complex systems (eg. I built corrscope's arcane settings system), often because they don't know the domain well enough to design a clean elegant solution with the right abstractions. But the resulting complex configuration system is harder to learn and use than the ideal solution (which may exist, may be possible to build, or may be impossible).
Though minimal systems aren't always viable either, when different users may want different programs. Some people want MS Word for its WordArt, some people want tables of contents, some people want something like Markdown, some people want change tracking. Some people want Vim, some people want Emacs, some people want Notepad++, some people want an IDE. I want tabs in a terminal emulator, because I don't want to throw out my knowledge of stacking WMs which don't have tabs (I don't know why not, BeOS and Haiku are stacking WMs with tabs), and learn a tiling WM with tabs.
Number 1 can be efficient but it isn’t satisfying if it was taken straight from a cookbook.
> But there is nothing intrinsically superior to the joy of #2 compared to #1. It's just a psychological/emotional preference.
I’m not sure about that. Feeling mastery and competence is intrinsically rewarding. And that’s easier to achieve with point number 2; if your mental model aligns more with the real thing then you are less likely to make mistakes and to have to look up reference material—you are less likely to get estranged from the task at hand.[1] And you are less likely to be bewildered by all the apparent complexity that surrounds you and that the sage old ones just tell you to put up with and to not question. I don’t think you would even mind being less efficient as long as you are in that sweet groove of point number two.
Yes, we derive satisfaction from our ability to slot things into a preexisting framework, proving our opinions correct (which we all love to happen) and releasing dopamine (probably). There is no such reward from applying a specific special case in a piece of software.
> Number 1 can be efficient but it isn’t satisfying if it was taken straight from a cookbook.
That may be true to one degree or another for you, but isn't universal. I also prefer to understand solutions (why pick #1 or #2 when you can have both?), but I have certainly been happy to see a problem evaporate even if I'm not 100% certain why. The result has its own value.
> I’m not sure about that. Feeling mastery and competence is intrinsically rewarding.
Yes, we all experience #2. My point is just that it's not intrinsically superior to other sources of joy. Preferring it is just that, a preference. There's nothing morally wrong with people who prefer to simply get things done, nor is there something fundamentally flawed with complex software that caters to that.
> I have certainly been happy to see a problem evaporate even if I'm not 100% certain why.
And it reappears months or years later to haunt another user with a slightly different configuration, or more or less CPU cores, or the software itself has changed to re-expose the underlying race condition.
>Feeling mastery and competence is intrinsically rewarding. And that’s easier to achieve with point number 2
I started to write a point about how this wouldn't necessarily generalize to every field. The woodworker, for example, likely feels mastery and competence by knowing exactly which specific tool is best for the task at hand.
I'm not sure it really works as a counterpoint though, because in that scenario "knowing which tool is best for the task" kind of is the mental model.
> I'm not sure it really works as a counterpoint though, because in that scenario "knowing which tool is best for the task" kind of is the mental model.
Yes.
It would be point number one if he had to ask some oracle about what tool to use.
In my experience, #2 is often superior to #1 though.
For some limited, well understood context, a well defined application, a well defined task which will not be extended later and where only lots of variations are needed, then yes, #1 is probably better. It's nice to have options like enable_fancy_feature=True.
However, in practice, you often have none of these. The context is often not well defined. It's often not yet well understood. The application is not well defined, and neither are the specific tasks. The scope and tasks will likely be extended during the development.
In that case, it can often turn out that you made the wrong abstractions, the wrong options in case of #1. The software itself becomes more and more complex and becomes very hard to maintain, and new things which should be simple are now difficult to do because of too many interdependencies.
Now, an elegant minimal system, or collections of small building blocks or tools, will turn out to be much more effective in solving some new task later on.
The issue is where developers spend most of their time. Almost by definition, we don't spend time dealing with scenario #1, because things just work and we are done.
Where we spend our time is when things don't work. If it doesn't work, but everything fits our mental model, solving the problem is generally straightforward. We may need to see how everything fits together. We may need to add some logic to put things together correctly, but it is generally just a straightforward part of the job.
When things don't work, and don't fit your mental model, the work becomes an exercise in frustration. Do that enough times and you start to develop calluses to avoid the frustration.
Interesting article, but I don't see how the increase in number of arguments invalidates the "do one thing well".
Sure, ls has 58 (!) arguments but it's still a tool to list files ultimately.
> I've heard people say that there isn't really any alternative to this kind of complexity for command line tools, but people who say that have never really tried the alternative, something like PowerShell.
I've spent a lot of time passing around JSON and XML, not the same but similar. While it's true that structured data makes things easier, it's not without its own faults.
I subscribe to unix philosophy of "do one thing well" and pass text around because it's served me well. You have to question status quo to come up with something new and better, but so far I'm not seeing it.
> Sure, ls has 58 (!) arguments but it's still a tool to list files ultimately.
No, it has evolved into a tool to get files/directories of a directory and sort them and format them.
Powershell separates those. `ls` in Powershell gets filesystem objects. It does not have any of the nix craziness for controlling the output and sorting, yet it is possible to do the exact same thing by piping output from ls into a format-command like format-table(ft), format-list(fl) or even convertto-json
And yet there aren't --json (and/or --yaml) universal output option flags to structure your data in a reliably parseable format.
Shouldn't every option for a major UNIX command (insert your definition for major) have an extensive list of useful examples/use cases? Man still fails me here. As does the AWS CLI too as a side complaint. If your option is important and useful, then it demands a good example.
I read in an article (this one - joke) that all lower case letters except `jvyz` are in use - it looks like `-j` and `-y` are just sitting there waiting for someone to add some output formats - then we just need to add a verbose flag and... maybe a switch that automatically gzips the output? We're so close to the 100% completion achievement - we can get there in our lifetimes.
It doesn't really make sense to have unix commands output json or yaml. They're already doing a form of parsing internally, you're then asking them to encode the output in another text format so you can parse it again. Why? You'd have better results just using the underlying API to get a list of files, and this is what every other program that needs to get a list of files actually does. From a scripting perspective this is one of the things that powershell actually got right.
Well, because ls, lsblk, ps, and a host of other table-based outputs aren't reliably parseable. Their columns overflow. They aren't robust to line breaks in identifiers and other UTF8 and control character oddities.
You know what is? A decent data output format. And JSON is the best of the compromises that exist now. YAML would be nice because if you are outputting json, yaml output is easy and it's a bit more readable.
That doesn't violate the UNIX philosophy of text output. It probably improves it.
Underlying APIs aren't as universal, portable, or well-known. We are talking basically GNU cross-platform programs.
But you don't actually want to parse them though, that's the mechanism by which you receive the data but the real end goal is you just want the data in its canonical format so it can be used programmatically; in powershell there is no parsing, the output is always the canonical representation.
I don't get what you meant in the last sentence, the underlying API is glibc which is also cross platform and is incredibly well known. Other cross-platform languages like Java, Golang, Rust, etc have their own portable APIs to enumerate directories that are cross platform.
Perl. Python. Bash or shell scripts. They all have "invoke this shell command" at a minimum. Then you get into ssh and a host of other command invocations to get information.
The shell is the convenient way to invoke commands to get the information: locally, remotely, quickly, no setup of C bindings, lots of invocation options.
If you are polyglot, do you want to use a command you know (ls) or learn each language's infernal syntax for filtering, recursive listing, etc, or learn it once with shell?
>If you are polyglot, do you want to use a command you know (ls) or learn each language's infernal syntax for filtering, recursive listing, etc, or learn it once with shell?
Probably learn the language's syntax because it's faster and more powerful, e.g. in python and perl I'd say don't bother with running shell commands and doing unnecessary forking and parsing if you can avoid it, use the built-in glob instead:
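For instance, a minimal sketch of the built-in glob the parent comment refers to (Python here; Perl's File::Glob is analogous):

    import glob

    # Recursive listing without forking `ls`/`find` and parsing their text output.
    for path in glob.glob("**/*.py", recursive=True):
        print(path)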
Most languages just have this now because it's easier and faster than invoking external programs and is a better cross-platform solution; glibc even has it, as glob(3).
There are, to some extent, in FreeBSD - you can eg do "vmstat --libxo=json". Not for ls(1), though - I think it's just easier to list files using facilities provided by your language of choice than running an external tool and parsing JSON.
With Python installed everywhere I often question the logic behind writing long or non-trivial shell scripts.
It always seems so brittle since the output of a command line tool can change (unless it's frozen in some standard) and relies way too much on regex and assumptions about what the data returned will look like. I much prefer to keep bash around only for interactive commands and ad-hoc one-liners.
Or the program should legitimately work in both ways, at different times, depending on context and the user's current needs. Sometimes I want `ls -t`, sometimes I want `ls -S` and sometimes I just want plain `ls`. It would be silly to make them three different programs, and if those options didn't exist, that would be foolish.
Sure, you could say that `ls -t` should be `ls | sort --time`, and if command output was structured rather than just text, and standardized, that would work and be easy to implement in a cross-tool manner. But now we have `sort --time`, which is also an option that you say shouldn't exist. So now what? `ls | timesort` and `ls | sizesort`? That's just silly.
Ok, how about something that doesn't affect how the output is presented, but what output is presented. Most of the time I just want `ls`, but sometimes I want to see dotfiles, so I have `ls -a`. How do we fix that? `ls` and `lshidden`? Again: silly. Do we just say "dotfiles are stupid; we shouldn't have this artificial concept of hidden files"? No, that's silly too.
I get that your follow-up post softens the absolutism with "far too often", but your original post did assert that "every" command line option is a bug, and I'm tired of this "options are bad" argument. Overall I would rather have the option to change behavior than not. Take those options away, and we get complaints that the maintainers are dumbing things down and removing choice.
It's really a no-win situation: either your software is bloated, or it's not flexible enough. Few people agree on where the happy medium is. Discussions about this are, IMO, just tiring and pointless, and are some of the worst forms of bike-shedding. The only reason a potentially-useful option should be removed (or not added in the first place) is if the option creates a maintenance burden that the maintainer is not comfortable taking on.
> but your original post did assert that "every" command line option is a bug, and I'm tired of this "options are bad" argument.
Every option doubles the amount of testing required to exhaustively test the program. This soon reaches the level of impossibility. My point is to encourage people to take a hard look at the options and consider redesigns and specification changes to eliminate as many as one can.
The same applies to `if` statements in code. Finding ways to change the data structures and program logic to eliminate edge cases and the `if` cases is a significant goal of mine.
It also applies to compiler warnings. In the case of C, the warnings are there because the C Standard allows practices that are known to cause many problems, but the C compiler vendor has to compile them anyway, and so choose to make them warnings. But for a newer language, the warnings come about because the language designer did not design the language rules well enough.
> Every option doubles the amount of testing required to exhaustively test the program.
True, but pairwise-independent combinatorial testing provides almost-exhaustive coverage with non-exponential growth in test cases: https://github.com/microsoft/pict
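To make the contrast concrete, a rough back-of-the-envelope sketch (in Python, for a hypothetical tool with 20 independent boolean options):

    from itertools import combinations

    n_flags = 20  # hypothetical tool with 20 boolean options

    # Exhaustive testing: every combination of flag settings.
    exhaustive = 2 ** n_flags                                        # 1,048,576 test cases

    # Pairwise testing only requires that every pair of flag settings
    # appears together in at least one test case; a covering array packs
    # many pairs into each case, so the suite stays small.
    pairs_to_cover = len(list(combinations(range(n_flags), 2))) * 4  # 190 pairs * 4 settings = 760

    print(exhaustive, pairs_to_cover)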
`ls | sort --time` is very close to `ls | sort LastWriteTime` which is how you would do it in PowerShell.
In PowerShell `ls` is only used for retrieving the filesystem objects (files and directories). It does not have options for sorting or formatting[1]
Sorting is generally handled by `sort` (alias for the Sort-Object cmdlet).
Formatting is generally handled by one of the format commands, like format-table (alias `ft`) or format-list (alias `fl`) or by converting into another format like e.g. json or xml through `convertto-json`.
I.e. if I want to display the full path of every file sorted descending by last access, I would write:
ls | sort -d LastAccessTime | select fullname
[1] except for the `-name` switch which is more of an optimization which only retrieves the local name as string objects.
What really is the difference between `sort LastWriteTime` and `sort --LastWriteTime`? An argument without the dashes doesn't seem any fundamentally different from a flag.
- Tool options increasing monotonically seems like the very definition of bloat. Probably since the marginal cost of adding an option is very low and the cost of removing an option is very high (angry users).
- There's a cognitive cost to bloat. Personally, I know maybe 5-10 options to all the commands in the regular *nix tool set and that's about the limit my brain could hold.
- Options are useless if they are not discoverable at lower mental effort than the alternatives. Man/info pages are great, but I'm not wading through multiple pages on the off chance that ls has an option that can output in JSON if it's a Wednesday and my POSIX locale is unset. Discoverability is a serious problem in large systems, and I'm not sure anyone knows what the solution is. I mean, we've had google, Stack Overflow, man X intro, GNU info, ...
So what does that mean in practice? Is there a way to square modernity with the UNIX philosophy? I would suggest that
1. Tool authors ruthlessly restrict options to fit in the working set memory of the average human user. Let's say <10 options. If you need more options, you need a new tool.
2. Tools are pluggable, so that the base implementation is simple and known-to-be-present (shades of POSIX here), but there is an expansion pack mechanism for people who need it. Imagine DLC for ls. You want extra magicka in your find(1)? Make sure you have the plugin and off you go.
3. Tool designers and users are clear about the decision criteria to *not* use a specific tool. Sure I could do crypto in awk, or parse \0 in ls, but I shouldn't do so. There's no shame in reaching for a better solution, and, over time, solutions will coalesce into new tools. For example, everyone uses jq not awk to parse their JSON...right?
Edit: I just returned from my kitchen with a IRL example: my dishwasher has 15 programs but only two buttons show any sign of wear: "quick wash" and "start"...
Gotta love the `clear` command. Started with zero options and still has zero.
Does one thing well. Would you want an option to clear half the screen ;)
"Complaining that memory usage grew by a factor of one thousand when a (portable!) machine that's more than an order of magnitude cheaper has four million times more memory seems a bit ridiculous."
Consider a different user from the author who is designing a system where memory is the filesystem. "Disk" space is memory. The more "disk" a program occupies, the less memory available for everything else. (The same is true for non-memory backed storage space.)
For this user, the complaint is not ridiculous at all. BusyBox commands do not have the full range of command line options. NetBSD has versions of common utilities that have reduced options, intended to be used on small-sized and/or memory-backed filesystems. There are many examples.
The idea that today's computers have more memory/storage and therefore today's programs should be larger does not make sense if the user intended to use the increased memory/storage for something other than programs. Purchasing computers over a few decades with increases in resources for the same price ("bigger bang for the same buck"), the purchaser might expect a decrease in the ratio of (a) memory or storage used by others' programs to (b) free memory or storage she can use for her own things. When the size and memory usage of software not written by the computer purchaser increases because of the increased availability of storage and memory, then that takes away those resources from use by other things. Who do the resources belong to, who paid for them, and who should have the final say in how they are used?
For example, consider programs the computer owner has written or wishes to write, or files she wishes to store. Fewer resources are available for those things because someone else thinks that the bigger bang the purchaser is getting for her buck is not entirely the purchaser's. It belongs to anyone who can write programs. Under this theory, the purchaser does not get to decide the quantity of resources other programmers get to use. Other programmers declare they should be unconstrained to use as many resources as they think they need. There is more memory and storage, thus they get to use it. Yet those programmers did not pay for that increased storage and memory. The computer purchaser did.
How dare a user complain that a program grew by a factor of 1000. The computer has more resources. Therefore anyone can use them. Except the resources do not belong to anyone else besides the person who purchased the computer. Who should get to decide how they are used?
McIlroy is just telling it like it is; it is strange that younger people today want to "disagree". Bloat is bloat. More resources does not make bloat cease to exist as a concept. If someone is eating too much food (more than their body requires), then the availability of more food does not change the fact they are eating too much. Not to mention when someone else is paying for it. "There is so much food available. I am only eating a tiny proportion of what is available. Therefore, I am not eating too much." Statements like "We were doing X in the 1970's", when the author was not even born or did not yet have a fully formed adult mind in the 1970's, are peculiar. Is "we" the appropriate word? "They" were different people.
IIRC, McIlroy once said something like, "The hero is the negative coder." Bloat has become so pervasive that IMO such coders really are heroes, maybe not to other programmers, but to this user for sure. Write enormous, complex programs with 50+ options, have fun, be "productive", enjoy convenience and resource abundance, but trying to "disagree" with McIlroy's (accurate) observations is quite silly.
For a user who only uses 20% of the command line options of a program, the code and resulting size increase corresponding to the other 80% of options she never uses is perhaps something to consider, especially when this applies to a large number of programs. The shell can keep track of utility usage so one can see which programs she is using the most and which programs she does not need. Perhaps it would be useful to measure options usage so she can assess which options are used the most and which options she does not need. Ideally, unutilised options might be disabled at compile-time.
> We used to sit around in the UNIX room saying "what can we throw out? Why is there this option?"
This reminded me of Antoine de Saint-Exupéry's quote that "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."