Things I Wish I'd Known About Bash (zwischenzugs.com)
781 points by zwischenzug | 2018-01-06 | 272 comments




My take on the same topic:

- use the unofficial strict mode: http://redsymbol.net/articles/unofficial-bash-strict-mode/

- use parameter substitutions like ${foo#prefix}, ${foo%suffix} instead of invoking sed/awk

- process substitution instead of named pipes: <(), >()

- know the difference between an inline group {} and a subshell ()

- use printf "%q" when passing variables to another shell (e.g. assembling a command locally and executing it via SSH)
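A few quick illustrations of these (host and file names are just placeholders):

    # parameter substitution instead of sed/awk
    file="report.tar.gz"
    echo "${file%.gz}"      # -> report.tar
    echo "${file##*.}"      # -> gz

    # process substitution instead of a named pipe
    diff <(sort old.txt) <(sort new.txt)

    # subshell () vs inline group {}: the cd below doesn't leak out
    ( cd /tmp && ls )

    # printf "%q" when handing a variable to another shell, e.g. over ssh
    arg='a file with spaces and $dollars'
    ssh somehost "touch $(printf '%q' "$arg")"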


While process substitution is great on the surface, it comes with a troubling downside: there is no way that I have found to reliably catch errors. That is, this:

  some-command <(some-failing-command)
...will succeed. And -e and pipefail do nothing here. I have not found any way to push the error up. Plenty of questions on Stackoverflow and elsewhere, no good answers.
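For instance, a minimal reproduction (using `false` as the failing command):

  $ set -e -o pipefail
  $ cat <(false)        # the substituted command fails...
  $ echo $?
  0                     # ...but the overall status is still success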

> pipefail do nothing here

Isn't that because the syntax provided doesn't use a pipe? Which is why actually using a pipe, rather than paren redirection, is easier on the eyes and the shell. It reads more like the flow of text:

    maybe_failing() {
        if ! some-failing-command; then
           echo "That did not shake out" >&2
           return 1
        fi
    }

    maybe_failing | some-command
Is that one of the things StackOverflow said? And if so, why is it not a "good" answer (verbosity concerns aside)?

This requires that some-command takes input from stdin (many commands don't) and that there's only a single input. What about:

  process-stuff --file <(make-input) \
    --extra-data <(make-more-stuff | grep blah)
This is the point at which one reaches for tempfiles, usually.

The point remains that <() was made for a purpose, yet lacks error handling, thus undermining its usefulness to the point of uselessness.

Edit: Isn't substitution using pipes, though? As in FIFO pipes? You get a file descriptor device which is closed at the end of the script. I don't know if the fd is wired directly to the command's stdout or whether the output is written in its entirety to a hidden tempfile first; the former sounds more natural and efficient.


>I don't know if the fd is wired directly to the command's stdout or whether the output is written in its entirety to a hidden tempfile first; the former sounds more natural and efficient.

Nope. <() and >() are equivalent to creating a named pipe with mkfifo and writing to it:

    tmp_pipe_dir="$(mktemp -d)"
    mkfifo "$tmp_pipe_dir/pipe"
    make-input >"$tmp_pipe_dir/pipe" &
    process-stuff --file "$tmp_pipe_dir/pipe"
That's the whole point of it, so that the data can be consumed in parallel with the producer.

Right, that's what I meant. The command is run in a child process with its stdout writing to the pipe. The script gets the read end.

Yes, you absolutely have me on the stdin part.

But, the multiple case example highlights that if (for argument's sake) `make-input` and `make-more-stuff` had run _prior_ to that line, writing their (possibly empty) output to a (file|FIFO), how then would you want bash to behave? It would still open file descriptors to those (file|FIFO)s, which would still be just as blank.

It seems to me that if one wishes for more fine-grained control over the error handling in a multi-subshell-fd-trickery situation, then creating the FIFO(s) and managing the sender's exit status is the supervising script's responsibility.
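For example, a rough sketch of doing that by hand, reusing the placeholder commands from above:

    fifo_dir="$(mktemp -d)"
    mkfifo "$fifo_dir/pipe"
    make-input >"$fifo_dir/pipe" &     # producer writes into the FIFO in the background
    producer_pid=$!
    process-stuff --file "$fifo_dir/pipe"
    consumer_status=$?
    wait "$producer_pid" || echo "make-input failed" >&2
    rm -rf "$fifo_dir"

(With the usual caveat that the producer blocks until something opens the FIFO for reading.)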

> a file descriptor device which is closed at the end of the script

I checked, and it's actually not even at the end of the script; those fds only exist for that one child, as `process-stuff` is exec-ed by bash.

> whether the output is written in its entirety to a hidden tempfile

It's not a temp-file, it's an actual file descriptor, which bash `dup2`s for the subprocess, then cheekily uses the `/dev/fd/63` syntax to make it appear as a file; you can peer into its brain a little:

    $ showme() {
        echo "showme.args=$@" >&2
        the_fd="${1##/dev/fd/}"
        cat <&${the_fd}
    }
    $ showme <(date -u)
    showme.args=/dev/fd/63
    Sat Jan  6 21:31:25 UTC 2018

Thanks so much for highlighting this scenario; I learned a ton about how that works researching this answer. That's why I like answering stuff on S.O., too: win-win

Good point about how to abort. Maybe it would be possible to short-circuit the fd somehow, so that the reader got an EOF, and that would in turn cause failure in the program reading the stream. At the same time, any success code from the program should be turned into an error exit code so the script could detect the failure. Not perfect, but arguably better than no failure at all.
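One workable compromise (losing the streaming that is the whole point of the FIFO) is to run the producer first and only then substitute its captured output:

    if ! output="$(some-failing-command)"; then
        echo "producer failed, aborting" >&2
        exit 1
    fi
    some-command <(printf '%s\n' "$output")

That gives up the parallelism, but the failure no longer goes unnoticed.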

Parameter substitution is far from intuitive. Every time I have to open one of my older shell scripts (>3 months ago), I'm thankful I wrote comments. Otherwise I'd have to man/google things again. I really wish they'd use function names or something, instead (sub, etc.).

My "mnemonic" is that # means prefix, because every shell script starts with a shebang too. From this I can deduce that ## means longest prefix, therefore % and %% mean suffix.
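For example:

    $ f=archive.tar.gz
    $ echo "${f#*.}"     # shortest prefix match removed
    tar.gz
    $ echo "${f##*.}"    # longest prefix match removed
    gz
    $ echo "${f%.*}"     # shortest suffix match removed
    archive.tar
    $ echo "${f%%.*}"    # longest suffix match removed
    archive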

> if [ x$(grep not_there /dev/null) = 'x' ]

This is still wrong if the command can output spaces or meta-characters. You should quote the left operand, and then you don't need to prepend x:

if [ "$(grep not_there /dev/null)" = '' ]


The legacy of autoconf! People look at autoconf's output to learn shell, but autoconf output is full of bad practices, working around bugs in shells that no one's ever heard of.

Autoconf output is expected to run on buggy shells with broken empty-string comparison. Your scripts probably aren't.


Because when displayed in a variable-width font, two single quotes can look very similar to one double quote, that last line is very easy to misread. Here it is in code mode:

  if [ "$(grep not_there /dev/null)" = '' ]

Does the second only work with bash? Because IIRC it didn't work with FreeBSD's /bin/sh, you needed the initial x too.

No, this should work in any POSIX-compliant shell.

#0: If you're writing scripts that are destined for other users, use POSIX sh instead.

Can you explain why?

For portability. bash is not available everywhere, but POSIX sh is a requirement of POSIX so you can expect it to be there.

Exactly: GNU bash is on most GNU/Linux systems, but:

* macOS uses an ancient version of bash

* Embedded systems use busybox sh

* Android uses its own version of sh

* None of the BSDs come with bash, since, y'know, GNU licensing

EDIT: Formatting


And to make things interesting, Ubuntu uses dash as sh.

https://wiki.ubuntu.com/DashAsBinSh


In fairness, that's just for sh; if for some reason you actually need bash you can just use `#!/usr/bin/env bash`

If you actually //need// bash, ask for it, not sh.

This comes from more amateur script authors not being precise enough with requirements, or not targeting the more common and portable spec.


`sh` is only supposed to be _a_ POSIX-compliant shell, it's not odd/interesting/confusing that Ubuntu uses dash any more than it is that something else uses bash.

The shebang line #!/bin/sh should only be used for POSIX-compliant scripts; if it needs {bash, dash, python, etc.} it should start #!/usr/bin/env {bash, dash, python, etc.}.


Not all systems have bash.

Not all systems have POSIX, perl, python, javascript, C++, rust, go, etc., so should we write all our programs and scripts in sh?

The relevant point is that all systems running bash should also be POSIX compliant. So, taking some care to avoid bash-only constructs in a shell script can reap substantial gains in portability at fairly minor cost.

Why should I care about all systems? If the targeted systems have bash 4.x and GNU tools, why should I write scripts in sh? If you care about all systems, ANSI C is a better choice, IMHO.

If you're targeting a limited set of systems and can reasonably assume the requirements won't change in the future, obviously use whatever tools you know will be available. I write Bash and Python all the time for those reasons.

But it's silly to think that there's nothing in between "complete control over target environment" and "use ANSI C so it can be compiled to any architecture and platform under the Sun." POSIX compliance will cover a lot. Try browsing /usr/bin on any Unix system and you'll see plenty of use cases. I just looked at /usr/bin on my laptop and saw that mysqld_safe is a Bourne shell script, for just one example.


Less magic (which you start to care about once you have to use other people's magic), easier to port stuff to zsh, etc.

Do you know of any learning resources that are strictly POSIX sh instead of a specific shell? I'm looking to learn and it will be very helpful.


No need to learn strict POSIX shell. If you learn bash from a good guide, it should explicitly flag bash-specific features. This way you can learn both at the same time.

On top of that you can use the `checkbashisms` tool to lint your script for bash-specific syntax.
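For example (myscript.sh being whatever script you want to check):

    checkbashisms myscript.sh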


Check out this presentation: https://youtu.be/olH-9b3VJfs

And the accompanying reference site: http://shellhaters.org/


Are you writing your scripts in sh instead of bash, perl, python, JavaScript?

I prefer sh over bash for most of my scripts, but I will sometimes use Python instead.

So you use python instead of sh, but you argue that we should use sh instead of bash, perl, or python. I'm not convinced.

IMHO, we should use the proper tool for the job instead of imposing artificial limitations. Bash is the proper tool in a lot of cases, unless an old proprietary OS or a very limited embedded OS is targeted.


>So you use python instead of sh

That's not what I said

>we should use proper tool for the job

This is the implication of what I said.

I do not think that bash is the proper tool in most cases.


So it boils down to your personal taste.

For most workloads where bash is suited, so is sh. For workloads where sh is insufficient, bash probably is too and you should just use a higher level language.

Chill. You don't have to hold bash so close to your heart.


I'm the author of the bash-modules project. It's my attempt to create a set of libraries for easier scripting in bash in strict mode. Most bash libraries are not designed for strict mode, so I created my own. If you prefer sh over bash so much, you can help me make it compatible with sh, or just fork it.

See https://github.com/vlisivka/bash-modules


Or option three: not care about your project because bash sucks and continue to write sh without it.

> old proprietary OS or very limited embedded OS

i.e. all the *BSDs? Or the many Linux distros that ship with only BusyBox, or FreeBSD/Linux type systems (AFAIK Debian and maybe Gentoo allowed FreeBSD userlands)?

POSIX sh is quite capable. I keep even my bashrc POSIX-compatible, in order to not have any problems when I want to run my config on some remote machine, or on some local BSD installation (and I did run FreeBSD for more than a year; it's quite a feasible option for a workstation IMO).


I see so many #!/bin/sh scripts with bashisms in them. These scripts are broken and will cause errors. /bin/sh is not always Bash even on Linux systems. It definitely is not on BSDs. There is no reason to use Bash for shell scripts when you can do the same thing portably.

Your example shows that the advice to "use /bin/sh" doesn't help but creates problems. If someone wants to write a portable script, then they must test it on a significant subset of target systems.

I disagree with this.

There's nothing wrong with writing bash scripts, with a shebang like `#!/usr/bin/env bash`, just like there's nothing wrong with writing python scripts (with `#!/usr/bin/env python`), or Haskell scripts (with `#!/usr/bin/env runhaskell`), or whatever other language you like/think is appropriate/etc.

It's true that bash is unavailable by default on (say) microcontrollers, but I don't see the relevance given that loads of common scripting languages aren't available by default on microcontrollers (Python, Ruby, JS, PHP, etc.).

PS: With this said, don't start your bash scripts with a `sh` path like `#!/bin/sh` or `#!/usr/bin/sh`. In fact, don't use a hard-coded path like `#!/bin/bash` or `#!/usr/bin/bash` either, always use `#!/usr/bin/env bash` unless you have a good reason not to. (Whilst `/usr/bin/env` is a hard-coded path, it can also be treated as a single special-case by systems which don't follow FHS, like NixOS, GuixSD and GoboLinux).


Bash is a special case where there exists a similar tool which is standardized in POSIX, and for most cases where sh is insufficient bash is probably insufficient too. Therefore POSIX sh is almost always the better choice.

Does fish-shell have an equivalent for '<()'?


Thanks a lot. Shells have so many features it's easy to miss one.

Watch out, there are some limitations to fish's psub preventing it from working like bash's >()

https://github.com/fish-shell/fish-shell/issues/1786


The sections on quoting and globbing suggest this author, though obviously trying to be helpful, isn't really knowledgeable enough to be writing such a guide.

I suggest reading this instead: http://www.grymoire.com/Unix/Sh.html

Granted, it's about the Bourne shell, but since Bash, Korn, and every other standard UNIX shell are supposed to be compatible with it, it's well worth learning.

And IMO, if you need more than what the Bourne shell provides, you should be using a proper programming language like Python instead.


...it'd be nice if what was actually happening was explained instead of just statements like "[ is the original form for tests, and then [[ was introduced, which is more flexible and intuitive"

The primary differences are:

* word splitting and pathname expansion don't happen in [[...]];

* "==" does pattern matching in [[...]], but it does string comparison in [...].
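A quick demonstration of both:

    $ f="two words"
    $ [ -n $f ] && echo yes            # word splitting: [ gets too many operands
    bash: [: too many arguments
    $ [[ -n $f ]] && echo yes          # no word splitting inside [[ ]]
    yes
    $ [[ $f == two* ]] && echo match   # == pattern-matches inside [[ ]]
    match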


I recommend reading the Conditional Expressions section of `man bash`. It's remarkably readable and useful for a man page. It also cleanly explains the difference between the two.

    rename -n 's/(.*)/new$1$2/' *
There's a chance this won't work on your system.

There are two incompatible versions of rename in the wild:

1) Perl one: https://metacpan.org/pod/distribution/File-Rename/rename.PL

2) from util linux: http://man7.org/linux/man-pages/man1/rename.1.html

Debian (and derivatives) ship the former; other Linux distros likely ship the latter.


I think this is not specific to the shell you are using, so it happens in bash, fish, ksh, zsh, and so on.

It’s misleading because rename is a command that accepts regex arguments. The shell isn’t doing anything more sophisticated with the quoted argument than passing it to rename as a positional parameter.

The shell of course does expand unquoted globs.


Can someone explain :h? The article's description didn't make it clear to me at all what was going on there.

Yeah, I couldn't make that work for me (on Linux or macOS)... although I'd love it if there were a way to quickly get 'just the directory' or 'just the filename' in bash with a shortcut, instead of having to resort to $(dirname blah) and so on... I'm sure there is some way, but :h doesn't look to be the shortcut as expected.

Assuming blah is in a var named name, you can use:

    dirname => "${name%/*}"
    basename => "${name##*/}"
Sadly you can't use it with !$ since it's not a variable. The closest you can do is:

    $ ls foo/bar/baz
    ls: foo/bar/baz: No such file or directory
    $ last=!$; echo "${last##*/}"
    baz
    $ echo "${last%/*}"
    foo/bar

    $ echo foo bar
    foo bar

    $ echo !$
    echo bar
    bar
The bang syntax is for history expansion, so it won't work on directories directly. But you could hack that with something like:

    $ls /root/my-dir
    [...]

    $echo !:t
    echo my-dir
    my-dir

Wild guess: perhaps OP confused bash syntax with vim syntax, where :h removes the last component?

http://vimdoc.sourceforge.net/htmldoc/cmdline.html#filename-...


It's !:h - not just a plain :h

    $ echo foo /bar
    foo /bar

    $ !:h
    echo foo
    foo
It's under `man bash` in the History Expansion section.

That's not quite what the author was talking about; your example does the same thing as !:0-

This is what the author meant:

  $ echo foo/bar
  foo/bar

  $ !:h
  echo foo
  foo

Yes, the article is flat out wrong there. Probably a typo or pasto though.

Instead of

    ls /long/path/to/some/file/or/other.txt:h
it should be

    ls !:$:h
which would produce

    other.txt

Hang out in #bash on Freenode IRC and you will be a Bash jedi. http://mywiki.wooledge.org/BashFAQ is the best resource IMO for quick Bash syntax lookups; I always need to refer to the BashFAQ to remember parameter expansion sub-string retrieval.

  parameter     result
  -----------   ------------------------------
  $name         polish.ostrich.racing.champion
  ${name#*.}           ostrich.racing.champion
  ${name##*.}                         champion
  ${name%%.*}   polish
  ${name%.*}    polish.ostrich.racing

Extending your example above, how would you get a result of racing.champion or polish.ostrich ?

You'd have to do it twice, using a temporary variable:

    $ name="polish.ostrich.racing.champion"
    $ temp="${name#*.}"
    $ echo "${temp#*.}"
    racing.champion
That's if you want it generic (i.e., splitting on "." characters). In one shot, you could make the pattern more specific:

    $ echo "${name#*ostrich.}"
    racing.champion
...or use something like sed or awk:

    $ awk 'BEGIN {FS=OFS="."} {print $(NF-1), $NF}' <<< "${name}"
    racing.champion

    ${name%.*.*}   polish.ostrich
    ${name#*.*.}                  racing.champion

They are needlessly rude and mean at #bash. A bunch of scumbags, actually.

I think it's a result of constantly dealing with people who ask for help, receive good advice, and then ignore it

If that annoys them, they can always quit. Being rude in this situation is either a choice or a lack of ability to cope with stress. Neither excuses being rude...

Complete opposite of my experience.

I understand "the hard way" is a commonly used phrase, but it does seem to infringe a bit on Zed Shaw's entire Learn Code the Hard Way series. Easily confusing. Other than that, good work.

Bash has a huge number of little shortcuts that are difficult to learn. When one encounters a sequence of symbols like $(...), it is difficult to Google for its meaning. The reason shells nevertheless have these shortcuts is of course because they are shells: from the commandline it can be very convenient to use shortcuts.

But, in my opinion, that's where it should stop: one shouldn't use a shell language for scripting. In scripts, it is simpler to use more verbose and clear constructs, because most editors are very powerful and provide shortcuts themselves.


Yeah, but why are you trying to use a search engine that cares less and less about exact matches, when there's a manual?

    >man bash
    /\$\(  # search pattern needs escaping
    ...
    value is evaluated as an arithmetic expression even if the $((...)) expansion is not used (see Arithmetic Expansion below).   Word  split-
    ...
    n      # go to next match
    Command Substitution
       Command substitution allows the output of a command to replace the command name.  There are two forms:

              $(command)
       or
              `command`
    (detailed description follows)
I'm still in favor of using more verbose and especially more clear constructs, but not because they are easier for Google, but because they ideally hold enough information on their own that you don't even need to look it up to know what it does.

Man pages are specifically reference documents, not tutorials or guidebooks. To say one should use a man page is to say that one must completely digest the entirety of the tool prior to ever actually using it. That’s just simply not feasible, nor should it be expected of anyone beyond trivial tools. Man pages simply don’t provide the context for solving a problem like a guidebook or tutorial would, which is why there are so many sites that start with a problem and then explain the tools.

Which is why the parent advised to treat the man page like a reference document, by searching in it. Some man pages are just badly written and are indigestible even when searching for a specific thing, but in general, that approach works quite often.

And even in the worst case, the man pages give you more context to use in your subsequent web search.

To use the original example, once you've identified that $(...) is Command Substitution, you have something that pulls up meaningful results in every search engine.


Is there some trick to searching man pages that I don’t know? Because my usual experience is:

  type man foo
  type /-p
  type n n n n n n n n n
as there are a bunch of matches like “...does bar when combined with -p...”

A presentation of man pages that used hypertext would make me a lot happier.


You might want to search with

    /   -p

Thanks, but I think this validates a demand for real hypertext.

While you’re waiting for the rest of the world to agree with you and then implement the true hypertext manuals, consider sharpening your regex saw.

In OpenBSD, we have true hypertext manuals today: https://news.ycombinator.com/item?id=16089300

    man man
    man less
Not a joke. Learn the simple tools that help you daily.

Did you have something in mind? I’ve read those man pages before, and I read them again today, and with the possible exception of tags in less (but I’m not sure about that), nothing seems relevant.

I'm sorry for sounding condescending! I did not understand your problem initially.

What helps me somewhat is the fact that definitions in man pages usually start on a new line and are indented by several spaces.

    man bash 
    / -o
This finds an inline mention, not very useful.

    man bash
    /^ +-o
This finds the definition: start of line, then some space, then the -o.

If foo has a Texinfo manual (GNU tools like bash usually do) then you can try `info foo` and search the index with i or I for -p. Texinfo manuals also have hyperlinks you can press enter on.

info is a greatly underused system and I'd recommend any *nix users to spend some time learning how to navigate it.


Thanks. Is there a good way to open that in a browser, rather than a console?

In general, the best way to read Info documentation is inside Emacs.

"If you are a bash newbie, you should read the bash manual. If you want proper search for the manual, you should use info. If you want proper use of info, you should use Emacs."

Kind of a deep rabbit hole, isn't it?


Not in this case; the bash manual is not in Info form, but a regular old-style Unix man page.

True. You can do just fine in general eschewing info wankery in favor of man pages. Stallman may disapprove, but I don’t lose sleep over it.

I don't know if you can locally, but they're often published online in HTML format, like at https://www.gnu.org/software/bash/manual/html_node/index.htm....

I get that they may be unfamiliar, but learning your system’s tools will pay off over the long run. Search facilities in less and info are generally much more powerful than in your browser.

In addition to the HTML info pages hosted by GNU[1], a variety of GUI texinfo readers are available, such as tkinfo[2].

[1] https://www.gnu.org/software/bash/manual/html_node/index.htm...

[2] http://math-www.uni-paderborn.de/~axel/tkinfo/


For bash: 'i' gives me "no indices found" and 'I' says "no index". I can do a '/' search, which finds some "-p" strings, but "n" doesn't work to find the next.

From some other comments, it sounds like "info" uses emacs at its core? So I suppose I'd have to learn some emacs commands, if that's the case.


info works without emacs, but some people prefer emacs' info viewer to the console one. The console viewer does have some idiosyncratic keybindings - for example, "n" means "next node at same level", and "}" means "search for next occurrence". "H" will give you a quick overview.

I'm surprised indexes aren't working for you - unfortunately I don't know of any suggestions to fix that.


Thank you for the comment! I've always been aware of the info pages, but it's honestly been years since I've tried to use them.

This thread is a good reminder to give it another go.


You can also do keyword searches across all man pages like:

    man -k <keyword>

> as there are a bunch of matches like "...does bar when combined with -p..."

It has been my experience that the definition of switches, unlike their use in examples or other text, occurs as the first thing on the line, so my search expression would be:

    /^ *-p 
(you likely can't see it, but there is a trailing space, too, to ensure it's just that flag, and not "-parallel" or whatever)

It's possible the text will be tab-indented, in which case:

    /^[ ^I]*-p[ ^I]
(most pagers will accept just pressing the tab key in the search string, and it may show up as ^I or the literal tab character)

You can use apropos to search names and descriptions inside man pages. For example:

    $ apropos timezone
    Date::Manip::DM5abbrevs (3pm) - A list of all timezone abbreviations
    dm_zdump (1p)        - timezone dumper
    Time::Zone (3pm)     - - miscellaneous timezone manipulations routines
    timezone (3)         - initialize time conversion information
    tzfile (5)           - timezone information
    tzselect (1)         - view timezones
    tzselect (8)         - select a timezone
    zdump (8)            - timezone dumper
    zic (8)              - timezone compiler
Can even search using regular expressions.

Usually options are typeset similar to

    -p

    Potrzebie mode ...
So search for it, e.g.,

    /^ *-p
The caret regex anchor matches at the beginning of line, and the Kleene star matches zero or more of the previous pattern. In English, the above pattern reads “match -p only when it occurs as the first nonblank characters on the line.”

The surrounding context may be different, so adapt your search pattern accordingly.


BSD man pages (and many Linux man pages, though a minority) are written in semantic “mdoc” macros, rather than the classic “man” macros that are strictly presentational.

If you’re using mandoc (http://mandoc.bsd.lv/) as your man(1) program—the default on OpenBSD and a couple of Linuxes like Void and Alpine—it will use these semantics to generate hyperlinks in the terminal using more(1) and less(1)’s ctags support.

So on my machine, your example becomes:

    type man foo
    type :t
    type p
This brings me to the first instance of a command-line flag named “-p” or environment variable named “p” in an itemized list.

It translates to HTML too—check out the links generated by the web viewer, which uses the same backend. https://man.openbsd.org/ls.1


That’s awesome! I think a table of contents could be nice addition for the HTML version (at least for pages that are longer than less).

I generally find things easier to learn with reference documents than tutorials unless I'm unfamiliar with the problem domain and have no mental model to map things to. That's very rare these days, so reference documents are the way to go.

Amusingly, reference documents are themselves increasingly rare, and what you usually get is a rough tutorial and a set of examples that cover perhaps 15% of the feature set, and you need to dive into the source to figure everything else out.


Look up the name of the concept in the manual and search for that. Compare Google or Stack Overflow searches for <() versus “process substitution.”

It is a reference, and searching for the pattern GP mentioned finds you references to arithmetic expression evaluation and to command substitution. Those are the terms one can Google for if further explanation is required.

This is why the less program has the / command — so you can search the man page for the syntax you are curious about.

It's not just from the command line that these shortcuts are useful. Once you have a decent baseline on shell scripting fundamentals, the constructs are often highly optimized towards efficient and readable scripts.

Not having to put every command in quotes, for example, is a huge advantage all by itself, especially if any argument lists have quoted arguments.

The $(...) construct is pretty fundamental to bash scripting and should be covered in any decent tutorial.
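For instance:

    # command substitution: capture a command's output into a variable
    today="$(date +%F)"
    echo "backup-${today}.tar.gz"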


explainshell.com is your friend here.

+1 for the parts that are portable to Bourne shell/ksh/zsh.

-1000 for the parts that are specific to Bash. Stuff like that has been huge pain in my ass over the years. Some clever programmer uses some bash-ism and the build breaks on some ancient hardware that doesn't have bash.

I realize my complaint sounds like Henry Spencer's Ten Commandments and perhaps it feels outdated but trust me, you don't want to wade into thousands of lines of shell to track down why something doesn't work on the stupid AIX box.


The article doesn't mention which are portable and which aren't.

It's a cultural thing. GNU is intended to replace Unix, so interoperability is (at best) not a priority. Same as Microsoft's “embrace, extend, extinguish”, which shows how important a good slogan is if you want your ambitions recognized.

A POSIX (or older Bourne-descended) shell can be distinguished from ‘bash --posix’ in a script with a fragment like “date&>F”, because Bash authors either didn't realize or didn't care that the sequence ‘&>’ already had meaning.


I wanted to argue that POSIX didn't define this behavior. But, reading the spec: I agree with you, Bash is non-compliant there.

According to POSIX (2016 edition)

    date&>F
should be equivalent to

    date &
    (exec >F)
where in Bash, it's equivalent to

    date >F 2>&1

>Stuff like that has been huge pain in my ass over the years. Some clever programmer uses some bash-ism and the build breaks on some ancient hardware that doesn't have bash.

Shouldn't the problem be the "ancient hardware that doesn't have bash" itself?


A brand new MacBook Pro will not have bash 4.

The machine may not be ancient but I would still argue that the problem there is the ~10 year old version of Bash that Apple has decided to ship rather than the programmers that use features added to the shell within the last 10 years.

(I should add that I don't know what version High Sierra ships with but Sierra seemed to ship with 3.2.5x-ish which was 9 years old at the time.)


Well, put it this way:

You can assume everyone has a modern bash, and make it the end user's problem if they don't, or you can write portable shell scripts and know they will work.

Honestly the things you can't do in posix shell compared to bash border on "use a fully featured language" anyway.


Thing is that Bash is something you can assume to be reasonably widely available[0] — like Perl or Python — but I wouldn't expect to have to avoid any features from the last 10 years of either of those two. Sure, a certain grace period is to be expected but I think 10 years is way past that.

[0]: I know POSIX is supposed to be even more widely available but depending on what you're targeting then it may not be the best option [source: https://en.wikipedia.org/wiki/POSIX#POSIX-oriented_operating...].


AFAIK neither perl nor python is GPL3-only licensed.

That's the blocker on macOS.


How do you write a portable shell script? The programs invoked by your shell script need to behave the same everywhere. Even fundamental things like cp, rm, etc. don’t universally behave the same across the various Unix and Unix-like systems.

The joke is that your shell is actually more portable than your shell script :)


Most of the shell utilities described by POSIX have standard flags, and then GNU/BSD extra flags.

If you use the standard ones (and use the "posix mode" flags when available) you're mostly ok.

Also, a shell script can have logic to handle different tools available (either different flavours of the same tool or even different tools that do similar things).
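For example, a rough sketch of that kind of branching (here telling GNU date apart from BSD date):

    # pick flags depending on which flavour of the tool is installed
    if date --version >/dev/null 2>&1; then
        yesterday="$(date -d yesterday +%F)"   # GNU date
    else
        yesterday="$(date -v -1d +%F)"         # BSD date
    fi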

If the basic syntax it uses (or the shebang) is bash-specific, then you need bash to run it.


And it takes all of 2 minutes to add it as your default shell (including installing brew itself).

  22:06  ~  $ bash --version
  GNU bash, version 4.4.12(1)-release (x86_64-apple-
  darwin16.3.0)
Much easier than constraining oneself about what to put in one's script (assuming one is indeed targeting Linux, OS X etc released in the last 10+ years and not some embedded etc platforms).

If your project says it requires something from brew, that's a blocker for a number of people.

A mythical kind of project that depends on macOS having a recent bash?

If your project depends on a bash script and that script depends on bash 4, it won't run out of the box on a brand new macOS machine.

A POSIX-compatible shell script has no such limitation. That's all I'm saying. I'm not arguing the merits of whether bash is better or what a project might need.


I wrote a book on Bash too. The most important thing for anyone to know about Bash is that it's intended as a command language, not a general purpose scripting language. If it's longer than 10 lines, or if it uses two or more variables, you should probably have written it in something other than Bash.

Amen to that. Now if only I could have gotten my Systems Programming professor to feel the same way... :)

Would you consider non-trivial install scripts as an exception to this general rule? I mean, you wouldn't write some install script in ruby or python, right?

I don't know why not?

At the least, rather than Bash, you might consider Perl as a default, lowest common denominator for scripts that need to run anywhere.

- It's nearly as ubiquitous as bash.

- It has approximately the same kinds of file/path operations built in.

- It has reasonably good support for strings/regexes/etc. all built-in, so you don't have to call out to tools like sed/awk/grep all the time and hope that they are available and compatible across your target platforms.

- It provides reasonably good arrays and hashes, which are horribly horrible in bash.[1]

- You can use syscalls very easily if you really need to, but usually you don't.

[1] Of course, no language can save you from the file system disaster (https://www.dwheeler.com/essays/fixing-unix-linux-filenames....), but being able to know that "foo bar" is a string instead of two array elements is a good start.

Mostly this all applies to Ruby or Python too, modulo perhaps the degree of ubiquity.


Great point about ubiquity.

Perl regexes are the best of breed that everyone else replicates — far better than “reasonably good.”

The Perl erasure in this HN thread is startling.


> "The Perl erasure in this HN thread is startling."

"Erasure" to me implies some active effort to remove Perl from discourse. I don't see anything like that here: indeed, there are a number of positive mentions, and no negative ones I see. Granted, Python and Ruby are both mentioned more often, but none of those is at Perl's expense. Am I misunderstanding what you mean by 'erasure'?


This is what Perl is designed to be. Perl unlike the others is almost certain to be on any Unix or Linux installation. Several commenters leaving out Perl in discussions of the next step up from bash scripts is truly strange.

I suppose being ignored beats the typical herp-derp anti-Perl bigotry, but I’d prefer all-around civility.


> "Several commenters leaving out Perl in discussions of the next step up from bash scripts is truly odd."

I'm having a hard time following you here. Do you think that they're doing so for any other reason than that Perl is no longer their go-to tool? There are communities where Perl is still used: PostgreSQL for example uses Perl for some of its scripting, as well as its build farm tool, in particular because of its portability on older systems.

That said, from what I've seen over the past 10 years or so, Perl hasn't had much of a presence in areas where a lot of computer work in tech is being done. For example, in cloud computing, or scientific computing, or machine learning, or web frameworks. Please don't read this to mean that Perl couldn't be or isn't being used in these cases or wouldn't be a better fit. (As an aside, I think Perl missed out a lot while a large portion of the community was focused on Perl 6: there's only so much energy in a community, and that absorbed on Perl 6 wasn't focusing on evangelism. But that's not something I'm interested in litigating here.) Or that there isn't something a bit frustrating in seeing the wheel reinvented time and time again. And so many examples on the web use bash as a common denominator. This puts Perl further out of mind if it's not already part of your everyday workflow. And how many developers today have come of age without seeing Perl in their everyday environments?

Consider the current forum. What's the percentage of front-page posts that are about Perl or tools where Perl is a part of the tool chain? It would be understandable for the people who frequent HN to not view Perl as their go-to. I don't consider it uncivil for people to neglect to mention some other language when it's not something they'd actually think of reaching for. It seems the solution would be to share examples of where Perl provides advantages, both in the comments here and in submissions to HN.


Well said. Perl might theoretically be the best match in this specific problem domain, but the thing is there are only so many programming languages one can learn.

If I had to choose only one of ruby/python or perl, I would choose the former, and it would be able to cover my bases both as glue code and for more substantial programs. Perl would maybe make the glue code a bit easier, but instead I would be much less employable and have a much harder time finding other people who can read the glue. I'm not qualified to have an opinion on Perl's capabilities for other programs, but I'm sure there are valid reasons most people prefer other alternatives.


Perl erasure?

After using the Linux command line (or its many «relatives» like cygwin, macOS, the unixes) for more than 10 years now, I've talked to exactly one person who used Perl to accomplish anything at all. He used it to edit text files, so he could have done the same in awk/sed/vim in my opinion.

I know more people who write Fortran 77 than Perl.

People just don’t seem to use (or like) Perl very much.


Yes, you would write non-trivial install scripts in ruby.

https://en.m.wikipedia.org/wiki/Chef_(software)


If any of my bash scripts get long enough to where I want a reusable class, I rewrite in Ruby.

Bash has too many surprising edge cases. A lot of my install scripts have at least snippets of Perl in them. The installation/upgrade/maintenance scripts for my employer's main product are Ruby-based.

We support a lot of different OSes, and there's usually less variance between the Perl deployed on them than there is in the shell.


I would not consider those an exception, and would typically prefer to see the use of some other language. Bash is excellent at dealing with semi-structured text and as part of a command pipeline, and if you have no alternative other than to write POSIX sh, well, it exists. Bash does not have niceties like typed variables or named function arguments, and arrays are best avoided. Even parsing command line arguments is more fun in other languages.

The problem is that Bash has about the lowest barrier to entry which can be found: just dump the things you were going to type anyway into a text file and mark it executable. It's simple, and then you bloody your nose on one of Bash's many idiosyncrasies, and people will say, "Oh yes, ']' is just a required last argument to '['. You gotta watch for that." And the true Bash master knows that my rule is silly and that anything may be written in Bash.

Just...please don't.


I say please don't write any goddamned software at all.

Isn't that what chef and Capistrano are at some level?

I disagree. Here's how I decide:

Do I need to manipulate rich data structures like hash-maps or nested lists? That sort of thing tends to stretch the capabilities of Bash to its limits and I tend to set the bar fairly low here.

Is the program oriented around commands? If I'm gluing executable scripts and binaries, using bash is often superior to a scripting language. Argument passing is more natural and convenient and the built-in support for the standard i/o streams makes it easy for the different commands to pass data between them. The number of variables or lines of logic is usually less important than the number of different installed commands I need to combine. (Or the number of variations on a single command)

Do I need to modify the environment in a significant way? In a bash script it's trivial to source in an environment script, which can be done conditionally or even interactively. This is usually more tedious to do in scripting languages.
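For instance, a tiny sketch of that last point (env.sh and build-tool are made-up names):

    # pull extra variables into the current shell only if the env file exists
    [ -f ./env.sh ] && . ./env.sh
    build-tool --target "${TARGET:-default}"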


Also: do I need to run a producer & consumer (and maybe some intermediary filters) on a stream of data, and want to run them in parallel? Shell pipelines are trivially easy to set up & test.
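E.g. something like this runs all three stages concurrently (the command names are placeholders):

    set -o pipefail
    generate-events | grep -v '^DEBUG' | consume-events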

As someone who has to occasionally modify 100+ line bash scripts written by Coworkers from Christmas Past which matched your spec in terms of what they had to do, please please just use Python (or similar).

Yes, you will have a few extra lines but it will be vastly more readable and maintainable.

And yes, I know I will get the standard "the person who wrote the script did a bad job" response, but at some point it should be okay to blame the tools instead of the workman if workmen disproportionately create worse results with a set of tools.


As someone who also has to semi-frequently modify 100+ line bash scripts written by others, I'd suggest every serious bash scripter read the bash man page. It is much smaller than any book on Python.

For scripts written in Python, I'd use a similar argument and suggest every serious Python scripter learn Python. As for Perl, or Ruby, or Julia, or anything really. It's just that learning bash from its man page is, IMO, much easier than learning any of those languages to the same degree.

Of course (and it goes without saying) some things are just not suited to bash, and in those cases a suitable language/framework must be used, and learned if not already known. As far as process calling, environment management, or stdio stream management are concerned, bash is better suited than (all the other languages I've tried) Python, Ruby, or Go.


The problem is most people don't want to consider themselves "serious bash scripters" but still think they can write bash, which always results in unstable and vulnerable scripts. Python on the other hand can be written by unserious python scripters and more often be at least accidentally correct. Another big issue is that the quoting, escaping and expansion rules in bash can be daunting even for serious bash scripters, with silent errors or completely different behaviour just because you forgot a : before (or was it after?) the $-

As a test, run shellcheck on any random shell-script written by these Coworkers from Christmas Past (or your own past) and it will spew serious warnings on almost every single line, run an equivalent analyzer on an equivalently unserious python file and you might in bad cases get 2 or 3 minor warnings per 100 lines.

The comparison is a bit unfair because just by getting these compiler errors and exceptions from a real language you force yourself into a more serious mental mode of programming instead of happy scripting, but that is just another argument in favor of not using bash, IMO.


I agree with you, mostly, and let me point out where I don't.

1. People who don't want to consider themselves "serious bash scripters" shouldn't be writing non-trivial bash scripts unless they're okay with them turning out buggy. I agree this is a personal standards thing, and reality is often less simple and more lenient than that.

2. The "compiler errors and exceptions" that you speak of in regards to Python also have equivalents in Bash. Agreed, they're still optional and non-"serious bash scripters" don't often know of them. Which is why I make my coworkers use them when I review their code.

3. There are classes of bugs that would happen in Python (and Go, from my experience) that wouldn't happen in bash, simply because they are in areas where bash shines. At work, I've seen process management and stdio management bugs — some of which have bit us in the field — simply because the non-bash language (Go) has a weird affinity to its child processes, or it (both Python and Go) defaults to not wiring up the stdio of child processes. In the latter case, the proper thing for the developer to do was the same as with bash: read the documentation. Most of our process management code is now in bash (because it's simple and safe) and systemd (because it's thorough and absolute).


In my experience, those kinds of shell scripts get written by people who needed to automate something quickly and didn't have (or at least perceive) any other language available besides maybe Perl. They're either Unix admins without much programming experience or they're part-time programmers who primarily work in some other language and just don't have the familiarity with Python or Ruby needed to write a good glue script.

But they know how to accomplish this task interactively in the shell, so scripting what they're already doing (or already know how to do) seems like the natural next step. So you wind up with an imperative shell script that's basically a long, flat sequence of commands with some logic and variables sprinkled in haphazardly as they realized they needed it.

Due to the organic way these scripts often emerge, it's not like advocating Python is an easy sell. By the time they think to consider alternatives, the shell version already exists.


I agree. And I agree with your bash 'whitelist'. I'd also add a hard blacklist for any serious string manipulation. A series of awks and seds looks clever but is quite annoying to deal with.

If it's just a bunch of cuts or tr, sure.


And it's SO MUCH MORE expensive to fork all those processes. Just use a real language, please.

"A series of awks and seds look clever but they are annoying to deal with."

Would the following be annoying for you to deal with?

   #!/bin/sh
   sed 's/#.*//' \
   | sed 's/:/#/g' \
   | cat AMD64 - \
   | ./qhasm-ops \
   | ./qhasm-regs \
   | ./qhasm-fp \
   | ./qhasm-as \
   | sed 's/%32/d/g' \
   | sed 's/%raxd/%eax/g' \
   | sed 's/%rbxd/%ebx/g' \
   | sed 's/%rcxd/%ecx/g' \
   | sed 's/%rdxd/%edx/g' \
   | sed 's/%rsid/%esi/g' \
   | sed 's/%rdid/%edi/g' \
   | sed 's/%rbpd/%ebp/g'
where qhasm-as and qhasm-fp are each awk scripts (222 and 427 lines, respectively).

source: http://cr.yp.to/qhasm/qhasm-20061116.tar.gz qhasm-20061116/qhasm-amd64


I'd argue that it's much easier to write bad Bash than bad Python.

Assuming that "bad" doesn't mean merely "ugly to glance at." I find that depends largely on the problem at hand.

I think this might help others, but ShellCheck[0] is a good place to start to help eliminate poor shell scripting.

And I would argue, though, that even large shell scripts in bash have their place.

I often write scripts in either Node or Python, but only when I need things bash is bad about (any sort of proper data structure beyond strings or arrays).

But there are just so many things bash makes insanely easy, especially with operating on files and directories.

And functions that are used as completions or need access to aliases or functions in the current process are also better in bash.

I wish there were a scripting language like bash, but enhanced with at least some hash maps and proper array manipulation, and maybe some formal IPC to allow scripts to request info from the parent process.



Whoops! Thanks for that

> As someone who has to occasionally modify 100+ line bash scripts written by Coworkers from Christmas Past which matched your spec in terms of what they had to do, please please just use Python (or similar).

As someone who has inherited thousand-line shell scripts, and had to debug many 3rd party scripts, I stand by my assertion.

> Yes, you will have a few extra lines but it will be vastly more readable and maintainable.

Readability is important but it's not the only aspect of maintainability, nor is maintainability the sole concern of a tool. A low bug rate helps maintainability, and actually having the features you need, in an acceptable timeframe, is also important.

For example, the OP mentioned the 'set -e' option that causes the script to exit if any command returns a non-zero exit code. In Python, you'd either have to remember to check the return code for every subprocess or define a wrapper, which adds complexity, reducing readability and can lead to bugs and errors. Nor is Python always the best answer for readability anyway. In many cases, it's not like it's just a few lines you're saving. Here are some functions I've used when scripting in Python

    import subprocess, shlex
    def process_run(cmd_string, stdin=None):
        return subprocess.Popen(shlex.split(cmd_string),
                                stdin=stdin,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    
    def process_results(process_object):
        (stdout, stderr)=process_object.communicate()
        return (process_object.returncode, stdout, stderr)
    
    def process(cmd_string, stdin=None):
        return process_results(process_run(cmd_string, stdin=stdin))
It's 10 lines of boilerplate to set up an approximation of behavior that is trivial to achieve in any shell language. There are actually 7 more functions I use to handle different common subprocess execution patterns. For example, the "stdin" in that process_run function needs to be a filehandle (at least in Python 2.7, I'm not sure about python 3). To pass a string to standard input you'll need something like this:

    from tempfile import SpooledTemporaryFile
    f=SpooledTemporaryFile()
    f.write(stdin_string)
    f.seek(0)
    results=process(cmd_string, stdin=f)
    f.close()
    return results
> And yes, I know I will get the standard the person who wrote the script did a bad job but at some point it should be okay to blame the tools instead of the workman if workmen disproportionately create worse results with a set of tools.

Actually what I'd say first is that it's quite possible the person writing the script knew what they were doing. I've inherited bad code in my life, I've inherited some real gems, and I've inherited a lot of code in between. One thing I've learned is that I tend to be unfairly critical of average code. It's hard to read unfamiliar code and easy to criticize inconvenient design choices when you have to adapt their code to some new problem that they never anticipated. Usually I'll be better off just buckling down and untangling the spaghetti.


subprocess.run does exactly what your wrapper does; subprocess.check_output returns stdout only and automatically raises an exception on a non-zero return code, and is the function you should be using 99% of the time. Both functions accept a string for stdin via the input= parameter.

Subprocess.run is Python 3+ only. If we're talking about replacing bash, Python 2.7 (possibly with 2.6 compatibility) is the more reasonable target. CentOS 7 and Debian 8 (I've not used 9 yet) still ship with Python 2.7.

Also, who is to say what I should be using "99% of the time?" Each problem has different constraints and different priorities.


Please have a look at https://pythonclock.org/ and stop riding dead horses.

The domain under discussion is scripting and specifically comparisons with Bash. Python 3 has not achieved anywhere close to the platform deployment that Python 2.7 has. When the common OS distributions you're likely to need to script on ship with Python 3 as the default rather than python 2.7, we can start using Python 3 in random comparisons with shell scripts. Until then, Python 2.7 is the language for comparison no matter what rhetoric you want to employ.

Most distros ship with python3 (though `python` will refer to python2).

Is there a reason you cannot just say `#!/usr/bin/env python3` in your scripts? I don't see why you require python3 to be the default, am I missing something here?


The first OS I've used that includes Python3 in the base install is Debian 8(Jessie) and I no longer use Debian in production. CentOS 7 does not include Python3 in the base install. You can install it, sure, but why bother when you can just use the python that is already there? Or better yet, /bin/bash...

Again, in the context of this discussion, the whole argument is yet another point in shell's favor. There's no major backwards-incompatible change in the language. With Bash, you just decide whether POSIX compliance is something you need, and that's basically it. Both versions are still supported and no one interrupts discussions to announce that beatings will continue until morale improves whenever the deprecated version of Python comes up.


If you are using ten-year-old software and can't install a package, that's your problem. Quit acting like it is the default situation.

Or you could put the wrapper functions in a module and call it a day, either way problem solved.


He's not acting like it's the default situation. For the distributions he mentioned, it is the default situation.

The problem is that it is difficult to anticipate; it may look like a simple 'glue the commands together' task, but it may turn out to be trickier.

My favorite advice was from a search giant's dev infra engineer who said that "any Python script over 100 lines should be rewritten in Bash, because at least that way you're not kidding yourself into thinking it's production quality"

Great arguments. Add one: bash is a more lightweight dependency, frequently available even on Windows, weird *nix versions and minimalistic environments (e.g. busybox linux, fresh arch install, ...).

[shameless]

> Do I need to manipulate rich data structures like hash-maps or nested lists? That sort of thing tends to stretch the capabilities of Bash

> If I'm gluing executable scripts and binaries, using bash is often superior to a scripting language

The two points above resonate with my view that there is a missing piece. On one hand we have bash, which is optimized for being the glue (second point). On the other hand we have Ruby, Python, Perl, Go, etc., which are good for the first point. What I think is missing is a newer, more powerful shell, which supports both use cases and more. I'm working on it:

https://github.com/ilyash/ngs

Please note that I'm not the only one that thinks there is room for more powerful shells. See the readme for links to other projects.


I disagree too. Bash has many downsides, but there are very few languages out there which have the ability to 'connect' different programs so easily.

Bash scripts are slow as hell (as most commands have to spawn new processes), it is hard to write "secure" code (if even possible), handling whitespaces can be a pain in the * and the amount of repetition is awful. If your kid has done something wrong, just tell it to write a bash script: it is the equivalent of writing a hundred times:

  x="$(...)"
Nevertheless, I enjoy writing bash scripts. Many times it starts with a simple curl command and by the end of the day you have a new OS installer. Granted, there are tasks which other languages can do better, but that's the real power of bash: it doesn't care. Then off you go to write your super complicated algorithm in Rust, Go, Python, R or whatever and just call the other program; that's what Bash is good at.

It is the glue which keeps everything working together.

Disclaimer: Please don't build a complete cathedral out of glue.


"Bash is the glue which keeps everything working together. Please don't build a complete cathedral out of glue."

--JepZ


The reason it can connect programs so "easily" is because it basically ignores errors and robustness - it really relies on there being a user looking at the output and going "hmm that looked like it failed".

Doing things properly in Python or Go may take a few more lines (not much more really) but it is 100 times more robust, and you need that if you are writing anything more than a 10-line one-off hack.


No, it doesn't. Unix error codes are extremely well understood. Tell the Git maintainers that their program ignores errors and robustness—they'll be surprised.

Even when well understood, that doesn't mean they're easy to work with. If you're lucky, the app you're calling only has two states: success+result, or failure+error message. But working with text commands, one day you'll get a "skipped file Xyz" somewhere in the output, because it's neither an error nor a success. If you're very lucky, you'll get it in stderr, otherwise it will be mixed with the output. If you're not lucky, the command will print out the error and exit with 0 anyway. What crazy app would do that? For example, standard initctl on Ubuntu: https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/55278...

Exit codes are a poor substitute for proper error handling with verbose error reports. They do the job most of the time, as long as you remember exactly which command behaves which way. And that's a clear path to mistakes :-(


I'd rather use Perl for these use cases. It's almost as concise as bash for subprocess management, it's as ubiquitous as bash, and it doesn't have a Python-like 2 vs 3 version issue.

I think the 10 line limit seems a little harsh. Bash is great to prototype something, especially because of all the great commands at your disposal versus writing it yourself. e.g. grep, sed, cut are all commands I use frequently. Once you've got a working script and it proves useful or needs to scale - that's a good time to go the 'real language' route like Python.

> I think the 10 line limit seems a little harsh.

I think it's a fine guide. If I'm writing something that starts approaching a program rather than a few lines of utility-throw-away, it's time to at least immediately start co-developing in something more sane. My exception is for Makefiles, if one considers them "shell".


How do you feel about the use of bash for AWS userdata and the like?

You pretty much have to put all the init stuff in a bash script. I can't think of any real-world examples of userdata scripts being less than 10 lines.

The alternative of course would be a three line script that downloads a file and executes it. But what language would that file be in? Probably bash. And it would make the infrastructure-as-code tracking and deployment process much more complicated.


Not OP, but ...

You can use Go/Rust for static linked bootstrap instead of Bash. (See rustup.)

I'm thinking about combining GitLab (private token access + repository files: https://docs.gitlab.com/ce/api/repository_files.html#reposit... ) and a bootstrap script in Bash, that then launches something better. Plus self registration back into a GitLab repo (hey, it has the access token already, so it can push).

Yes, Ansible/SaltStack/Chef/Puppet reinvented, but you're now not bound to the horrible database, language, bootstrap, speed (slowness) and workflow of any of them.

If you want more access control, then creating a HTTP microservice that has the real private token (or proper OAuth2 access) to GitLab plus handles the self-registration and token handout/revocation is easy compared to the other parts. (And it can also store everything in GitLab.)


I believe in your userdata script you can write a line to download a Python script from S3, then execute it. Don't quote me on this since I don't use userdata.
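
Roughly something like this (the bucket and key are made up, and it assumes the AMI has the AWS CLI plus an instance role that can read the object):

    #!/bin/bash
    # hypothetical userdata: fetch the real bootstrap logic from S3, then run it
    aws s3 cp s3://example-bucket/bootstrap.py /tmp/bootstrap.py
    python /tmp/bootstrap.py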

However, note that userdata will only run once at launch time. If you ever need to re-run userdata, you have to stop the instance, remove the userdata from settings, and then add again. YMMV.

I do the Python way when launching a beanstalk instance (basically cloud init).


I'd make the limit a lot more than 10 lines.

I certainly prefer Perl for large scripts (Python is good too but I happen to know Perl better), but I often have to write scripts for targets that have bash but don't have Perl or Python. Since I've been forced to use bash, I've found it to be a better scripting language than I expected it to be.


10 lines is a rule which is meant to be broken. Rules exist to guide the novice, and to inform the expert. Bash is an extremely useful tool, but not one which should be employed carelessly or casually.

If you want all the args to the previous (or earlier) command just use !. (e.g. a common idiom for me is cat `!`). Or you tried a git mv out of habit but the dir isn’t being managed by git: !gi:

Oops, that's the asterisk (star character) that HN interpreted as italics. In Hacker News formatting my two examples are

  !* and !gi:*

Damn it this boils down to RTFM, specifically the bash man page.

The real 'Learn bash the hard way' is to read the man page top to bottom every 6 months. Ye gods, I've read it so many times…

Also: lol at HN downvoting advice to read a tool's docs. Bash is terrible for a lot of reasons, but not because of its lack of informative documentation.


From the HN guidelines:

Please don't comment about the voting on comments. It never does any good, and it makes boring reading.


I've read the guidelines; you can just downvote instead.

Yes, the bash man page is a treasure trove. Reading the article I kept thinking, "this is in the man page", over and over.

There seem to be things like "Testing" and "RTFM" that many people (including me) resist until they actually try it and see for themselves. I can remember the feeling of revelation when I finally learned to try "$ man foo" on everything...


Two important things missing:

1) This is a huge pet peeve of mine, but it kills me when my coworkers "bash" emacs and sing the praises of vim, then proceed to explain to me that ctrl-R in bash searches for the last command with a given pattern. They also often refuse to believe me that they're basically using emacs controls (because, you know, emacs is dirty). So, please, if you're in love with vim and use ksh or bash, learn about "set -o vi". Oh, and stop preaching!

2) "help xxx" for any xxx bash functionality, right there from the command line!


> they're basically using emacs controls

But not well implemented! You're supposed to be able to edit your search string, and I've never found a way.


Huh? Backspace works the same for me in Bash C-r as it does in Emacs C-r.

It never does anywhere for me. Good to know it's supposed to, though; maybe I've got something mapped weird, or the versions of the shell I'm using are old enough to be unwelcoming in this way, or I don't know what, but knowing it's not by design means there's a fix to be found for it. Thanks!

You should try hstr (https://github.com/dvorka/hstr). It replaces CTRL-R with a full page interactive history search that really works.

Demo GIF here: https://unix.stackexchange.com/a/375914


If you use either vim or emacs, there's something to be said about using that knowledge for your command line history.

hstr has a vi mode.

shellcheck (https://www.shellcheck.net) is an absolute godsend when writing Bash/POSIX sh scripts.

It catches so many errors that I think it's a must have in every programmer's toolbox. It even catches bash-isms when you are targeting POSIX sh. It saved me many many hours of grief trying to debug shell scripts I wrote and changed the way I write them for the better.


This is also my number one thing I wish I'd known about Bash. It saves on so many trivial bugs.

The documentation is especially great: for every problem it detects you get a unique reference which you can look up on the wiki, e.g. https://github.com/koalaman/shellcheck/wiki/SC2086 It then not only describes the problem but also shows different ways of solving it, with some great examples and reasoning. I think I learned more Bash from Shellcheck than anywhere else.


Targeting POSIX as much as possible is really important if you don't want to force Bash on people, especially with open-source public code. Many OSes don't have Bash in the default install, but many just assume that bash is available on all the target systems.

like which?

I'm hard pressed to think of a modern unix that doesn't include bash by default. Solaris maybe?


everything embedded - OpenWRT/LEDE uses busybox with ash; Android uses mksh (I think?), which only supports a subset of bash features. Every Debian/Ubuntu has dash as /bin/sh, which only supports POSIX.

I'd argue that writing scripts for an embedded target and for a regular server/workstation target are fundamentally different problems; almost anything I'd write for those platforms would be targeted for them, not for general-purpose unix.

Portability is only desirable if you need portability; otherwise it frequently adds complexity for little return benefit.


All the current BSD systems, and some GNU/Linux distributions. I'd guess that illumos does not have bash installed by default, too.

shellcheck is amazing. I write enough shell scripts that I remember most of the syntax and tricks, but am far from an expert. shellcheck has taught me so many things and makes me feel more safe writing scripts in the event I forget one of the million gotchas.

Using readline is a great thing to know about too.

My favourite little-known readline command is operate-and-get-next:

https://www.gnu.org/software/bash/manual/html_node/Miscellan...

You can use it to search back in history with C-r and then execute that command with C-o and keep pressing C-o to execute the commands that followed that one in history. Very helpful for executing a whole block of history.

For some reason, this documentation is hard to find! It's not here, for example:

http://readline.kablamo.org/emacs.html

I'm a bit saddened when readline replacements don't implement C-o. For example, the Python REPLs don't have it.


I've overridden ctrl-r in my local Bash to search with fzf[0] and I'm using my history so much more now.
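
For anyone curious, the override can be as small as this (a sketch, not fzf's official setup; the helper name is made up, the key-bindings.bash file that ships with fzf does this more robustly, and bind -x with READLINE_LINE needs bash 4+):

    __fzf_history_search() {
      local selected
      # strip the history numbers, pick a line with fzf (newest first)
      selected=$(history | sed 's/^ *[0-9]* *//' | fzf --tac) || return
      READLINE_LINE=$selected
      READLINE_POINT=${#READLINE_LINE}
    }
    bind -x '"\C-r": __fzf_history_search'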

Didn't know about ctrl-o though, it sounds great! I hope that my ctrl-r override doesn't somehow break it.

[0]: https://github.com/junegunn/fzf

E: Fixed link.


That link is a 404 for me?

It had a trailing >, https://github.com/junegunn/fzf

I have since gotten out of the habit, but when I was just starting out I gave myself CTS one Christmas doing a giant refactor with just vi, to remove a disastrous idiom from the codebase that was O(n^2).

I knew quite a few commands, but I didn't know the block indentation commands, and those were about a third of my typing over that week.

After that, part of my regular “I’m tired or there’s nothing to work on” routine included reading the accelerator key documentation for my editor of choice. I found all sorts of good stuff and it really made me faster for quite some time. Now I do more analysis work and the accelerators that help there are less numerous. Fast code nav being a critical one.

But some of these tools make it pretty hard to find all their features, which is a shame.


Another obscure readline feature is that ~/.inputrc accepts key sequences bound to arbitrary (quoted) macros, including macros that contain more key sequences.

    # a basic macro that types "foo" bound to META+"f" 
    "\ef": "foo"
The cool part is that bash recursively checks the macro output for more key sequences. For example, I use these standard bindings:

    # <META>-k
    "\ek": shell-kill-word
    # <CONTROL>-y
    "\C-y": yank
    # <SHIFT>-RIGHT
    "\e[c": shell-forward-word
    # <SHIFT>-LEFT
    "\e[d": shell-backward-word
...which are used by these macros that use s-LEFT and s-RIGHT to move the current command-line argument one position left or right:

    # <SUPER>-LEFT
    "\e[D": "\e[d\ek\e[d^Y"
    # <SUPER>-RIGHT
    "\e[C": "\e[d\ek\e[c^Y"
    #        ^   ^  ^   ^
    #        |   |  |   > yank
    #        |   |  > shell-forward-word
    #        |   > shell-kill-word
    #        > shell-backward-word
(the actual key sequences depend on the terminal. Check with C-v <KEY>)

ctrl-o doesn't seem to do anything for me.

ctrl-r, find a command, press ctrl-o -> it inserts ^o in the command and exits out of ctrl-r.

Running bash 4.4.12(1) on Linux.


This is the most helpful bash diagram ever... why didn't I search for this before! https://zwischenzugs.files.wordpress.com/2018/01/shell-start...

As given in the article, it's also the most annoying bash diagram ever...it needs an explanation of what the 7 different colors of arrows mean. All that was given is:

> It shows which scripts bash decides to run from the top, based on decisions made about the context bash is running in (which decides the colour to follow).

> So if you are in a local (non-remote), non-login, interactive shell (eg when you run bash itself from the command line), you are on the ‘green’ line [...]

With a bit of Googling I believe I found the origin of that diagram:

https://blog.flowblok.id.au/2013-02/shell-startup-scripts.ht...

The author there explains how the colors work:

> Fortunately, I’ve read the man pages for you, and drawn a pretty diagram. To read it, pick your shell, whether it's a login shell, whether it's interactive, and follow the same colour through the diagram. When the arrows split out to multiple files, it means that the shell will try to read each one in turn (working left to right), and will use the first one it can read


I'd imagine they don't include it because the descriptions are very intertwined and complicated. I don't even know what they all mean, but I usually just google this mess.

This diagram is not entirely correct.

The remote bash startup order is further complicated by the existence of a compile time flag SSH_SOURCE_BASHRC. This flag determines if a remote non-interactive shell will load the ~/.bashrc file.

This flag is turned off by default and stays off in some distributions (like Archlinux), but is turned on in others (Debian, Fedora, ...) to replicate very old rsh behaviour.


This is why I use xonsh (http://xon.sh). It lets me use Python as a shell, and I don't have to remember too much awkward syntax.

The author mentions the substitution !:1-$ to insert all the arguments from the last command (like !$ substitutes the last argument and !! the full command). Note that !* does exactly the same thing, which is a bit easier to type/remember.
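
For example (a hypothetical interactive session):

    $ touch a.txt b.txt
    $ ls -l !*      # history-expands to: ls -l a.txt b.txt
    $ head -n1 !$   # !$ is just the last argument: head -n1 b.txt
    $ !!            # repeats the whole previous command: head -n1 b.txt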

I really wonder why the author didn't just leave off the first two, and call it "Eight Things I...". Starting off with "backslash escapes can be confusing" and "/* doesn't just match things consisting of 0 or more slashes!" makes no sense if you're later going to skip over !! because it's "obvious".

You shouldn't call your book 'learn X the hard way' if it isn't (also) freely available online if you ask me.

> !$ - I use this dozens of times a day. It repeats the last argument of the last command.

Press ESC then full stop instead. Less key presses.


Fewer key presses, but doesn't work if you're using vi bindings.

ESC _ or M-_ does work in vi mode though (and does the same thing as ESC ./M-.). Search for "yank-last-arg" in the bash manual.

Thank you.

Went searching how to do it in Zsh as well when the Zshell Line Editor (zle) is configured in vi mode:

$ bindkey -M viins '\e.' insert-last-word

Will make ESC-. work from insert mode.


bash, or a minimal shell such as ash, is critical for embedded systems where python/perl etc. are too fat.

    `` vs $()
I really appreciate authors who attack ambiguity or non-obvious equivalence in a subject head on. It's one of those aspects of explaining that takes to heart the perspective of the learner.
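
For anyone who hasn't seen the two forms side by side, a small sketch of the practical difference (they substitute the same output, but $() nests without escaping):

    out=`ls /tmp`        # legacy backtick form
    out=$(ls /tmp)       # POSIX $() form, same result

    parent=$(basename "$(dirname /tmp/a/b)")   # nests cleanly -> "a"
    parent=`basename \`dirname /tmp/a/b\``     # backticks need escaping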

Another recent example I encountered is from the online book http://neuralnetworksanddeeplearning.com: there are some confusing ambiguities and plainly misleading contradictions in NN terminology, one being that multi-layer perceptrons do not use perceptron neurons. These things can really screw with you while you're learning, especially the more subtle ones.

Some authors seem to have the ability to have full empathy for the novice while retaining expert knowledge and deep understanding by being able to both predict and answer relevant questions at the right point in an explanation at the right level of detail.


Interesting, but from a modern perspective isn't Perl a better scripting language to learn?

I have never used bash scripts professionally, i.e. as a language rather than a simple script with just a command in it.

Likewise, back in '87 or so I got trained in sed and I have only used it once since, and that was when I was playing around with early Linuxes (when they came on a huge number of floppies).


You have a terminal if you're on a Mac or Linux machine. It takes bash commands by default. Knowing perl is good, but knowing more bash is always good.

I hadn't known that <(echo "hi") is treated as a file containing that command's output, which simplifies commands that take files as arguments.


I meant using bash as a scripting language not as a shell.

    $ grep somestring file1 > /tmp/a
    $ grep somestring file2 > /tmp/b
    $ diff /tmp/a /tmp/b
You shouldn't do that, but not because it's not neat enough. /tmp is world-writable, so you might be writing to somebody else's file, or over a symlink that was set up by someone else. Use mktemp¹ for creating temporary files.

¹ http://man7.org/linux/man-pages/man1/mktemp.1.html
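
For example, a rough equivalent of that snippet using mktemp, with the temporary files cleaned up on exit:

    a=$(mktemp)
    b=$(mktemp)
    trap 'rm -f "$a" "$b"' EXIT
    grep somestring file1 > "$a"
    grep somestring file2 > "$b"
    diff "$a" "$b"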


Could you not do that with pipes instead, something like:

    $ diff <(grep somestring file1) <(grep somestring file2)

That's what the article is recommending.

If you use a thing a lot, you should probably invest some time in reading the manual for it.

The manual for bash consists of its Unix-style "man" page, and is therefore more of a reference than an instruction manual. I suggest using the Advanced Bash-Scripting Guide (http://tldp.org/guides.html#abs).


What was the surprising part about

    echo '*'
    echo "*"
? Both prints an asterisk. Is '*' some sort of BASH variable?

The asterisk * is a glob. Since double quotes allow variables inside them to be dereferenced rather than quoting everything literally, presumably he was expecting that double quotes might still let globs expand, when in fact both single and double quotes suppress globbing. So he was expecting it might print out "a".
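
To make that concrete, assuming a directory containing a single file named a:

    $ ls
    a
    $ echo *      # unquoted: the glob expands to matching filenames
    a
    $ echo '*'    # single quotes suppress globbing
    *
    $ echo "*"    # so do double quotes (only $, `, \ stay special inside them)
    *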

One of my favorite Bash patterns is the following. I call it 'feed the fish':

  . <(curl https://example.com/trusted.sh)
While it is extremely dangerous I found it so easy to remember, that it stuck in my head. It downloads the script and executes it in the current shell. So if anything unexpected happens you probably have a real problem ;-)

Background: A few years ago I was writing a Bash based OS installer. So after booting from a live CD I had to fetch the installer and execute it, which led me to using that pattern frequently.

While I love it, I can't stress enough how dangerous it is.


    if [ x$(grep not_there /dev/null) = 'x' ]
See now, I never get why people do this. -z has existed forever.
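
i.e. the same check, as a sketch:

    if [ -z "$(grep not_there /dev/null)" ]; then
      echo "no output"
    fi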

Because its mnemonic is very unclear. I think it stands for "zero" (as in zero-length string), but I am not sure.

re 9) I feel better about always feeling at least slightly confused about what files are being sourced.

The graph included seems to originate from https://blog.flowblok.id.au/2013-02/shell-startup-scripts.ht...


That graph is not entirely correct: https://news.ycombinator.com/item?id=16088866

The trickiest part of Bash I know is that "$(command)" results in the truncation of the output of the command before the newline. It's both handy and damning depending on what you're trying to do.

I don't think that's correct, do you have an example? This works for me:

  $ var=$(echo $'a\nb')
  $ echo "$var"
  a
  b

Oh, I meant before the trailing newlines, sorry for being unclear. Try using a\nb\n\n\n instead of a\nb and observe that the output doesn't change.
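
A quick way to see it:

  $ var=$(printf 'a\nb\n\n\n')
  $ echo "${#var}"   # 3: just "a", newline, "b" - the trailing newlines were stripped
  3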

Ah, okay, got it. Yes, that can be surprising.

I haven't used the ! history for many years. It's simply easier and faster to use command line editing.

! was useful before command line editing; but not much since.


For me, the biggest gotcha in bash is whether or not a sub-process/shell will be invoked, which can affect things like mutable variables and the number of open file handles. For example:

    COUNT=0
    someCommand | while read -r LINE
                  do
                    COUNT=$(( COUNT + 1 ))
                  done
    echo "$COUNT"
This will always print `0`, since the `COUNT=` line will be run in a sub-process due to the pipe, and hence it can't mutate the outer-process's `COUNT` variable. The following will count as expected, since the `<()` causes `someCommand` to run in a sub-process instead:

    COUNT=0
    while read -r LINE
    do
      COUNT=$(( COUNT + 1 ))
    done < <(someCommand)
    echo "$COUNT"
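(An aside going slightly beyond the comment above: bash 4.2+ also has shopt -s lastpipe, which runs the last stage of a pipeline in the current shell, so the piped version can work too; it only takes effect when job control is off, i.e. in scripts rather than interactive shells:)

    shopt -s lastpipe
    COUNT=0
    someCommand | while read -r LINE
                  do
                    COUNT=$(( COUNT + 1 ))
                  done
    echo "$COUNT"    # with lastpipe, the while loop ran in this shell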
Another issue I ran into is `$()` exit codes being ignored when spliced into strings. For example, if `someCommand` errors-out then so will this:

    set -e
    FOO=$(someCommand)
    BAR="pre $FOO post"
Yet this will fail silently:

    set -e
    BAR="pre $(someCommand) post"

    $ bash -ec 'BAR="pre $(someCommand) post"; echo hi'
    bash: someCommand: command not found
    $
I get the same behavior in bash, zsh, dash, and busybox sh. shbot says the same for bash versions since 1.14, ksh, and even the original bourne shell (modified to use ` instead of $()).

I think I've run into the first issue you describe, and I'm having a hell of a time trying to understand it. Would you mind taking a look at my example and helping me out?

Consider the following:

  cd /tmp/
  echo -e "hello world\nhello world\n:)" >> hello.txt
  cat hello.txt
Outputs:

  hello world
  hello world
  :)
Then running

  bash -c 'sed s/"hello"/"hiiii"/ hello.txt | tee hello.txt'
  cat hello.txt
yields

  hiiii world
  hiiii world
  :)
Resetting to the original `hello.txt` and running

  ssh localhost -t 'cd /tmp && sed s/"hello"/"hiiii"/ hello.txt | tee hello.txt'
  cat hello.txt
yields an empty file.

Replacing the command with

  ssh localhost -t 'cd /tmp && sed s/"hello"/"hiiii"/ hello.txt >> hello.txt'

yields

  hello world
  hello world
  :)
  hiiii world
  hiiii world
  :)

I'm trying to figure this out so I can publish a script to set up a test env for a package I'm trying to publish, and somewhat stuck on this step...

You're using hello.txt as both an input file and an output file, which seems like it's just asking for race-condition problems.

What about using sed -i to make changes to the file, and not doing any bash redirection?

Alternatively, if the real problem is more complex than that, try using a different name for the output file and rename after it's finished.


The `>>` operator in use is /appending/ to the file.

Also, as mentioned in a sibling comment, your hello.txt is both an input and output.


let COUNT=COUNT+1

is all you need


or:

let count++


What are the best alternatives to bash for writing fairly large automation scripts? I have already looked at Python, Lua and Guile, the last two especially since I like using them and they have some sort of POSIX interfaces. I haven't looked at Perl 6.

Why didn’t you consider Perl 5?

I've been looking for something like <() for a long time. Thanks for sharing it!

Regarding :h, is it any different than just using dirname?


It's 2018 and we're still talking about writing non-trivial bash scripts?

It's 2018 and bash is still the most convenient language for an enormous number of tasks. Funny, that...

Blog states that author has 20 years of development experience.

Blog post suggests he knew little of basic shell scripting until recently.

Blog also reveals he is selling a book on shell scripting with Bash, "Learn Bash the Hard Way."


> "Blog post suggests he knew little of basic shell scripting until recently."

I don't see how you could come to this conclusion based on the title. From the article, the author elaborates:

> "Recently I wanted to deepen my understanding of bash by researching as much of it as possible."

There seems to be a lot of good commentary in this thread indicating that people are finding the post useful. There are points here that I found useful myself. Do you have specific disagreements with the post contents?


If you "set -e", you probably also want to "set -o pipefail". By default, a pipeline returns the return value of the last element. pipefail means that if any element of the pipeline fails, then the pipeline as a whole will fail. I discovered this the hard way when I had:

make run-asan-test | c++filt

And even if the tests failed, the script would succeed.
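
i.e. roughly:

    set -eo pipefail
    make run-asan-test | c++filt   # a failing make now fails the pipeline, and the script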


The thing I wished I had learned earlier is "quick and dirty assertions". If you write lots of functions in Bash, you quickly end up getting tripped up by cases where an argument is omitted and the function does something totally batshit given the missing (empty string) argument. Now, the canonical way to handle this is to put validators on your input, (and make sure those validators don't crash with cryptic errors if someone calling your function is using "set -u") like so:

  function myfunc() {
    local foo="${1:-}"
    if [ -z "$foo" ]; then
      echo "Invalid first parameter!" >&2
      return 127
    fi
    ...
  }
...but man, that's time consuming when you have lots of parameters.

Instead, the quick and dirty way is to just "assert" via [parameter expansion](https://www.gnu.org/software/bash/manual/html_node/Shell-Par...):

  function myfunc() {
    local foo="${1:?First parameter must be provided}"
    ...
  }
Much quicker, especially when throwing things together in a hurry. It has a gotcha, though: ":?" assertion doesn't cause a function to return early, it shuts down the whole interpreter after outputting the error. So it's more like a true assert() statement than an input validator. If you'd only ever call your function in a subshell, this won't matter (because the subshell will exit early with a nonzero code, big deal), but otherwise it can be a nasty surprise to users when an argument-validation issue inside a function shuts the program down. Then again, the "return 127" in the first example would also shut the program down if someone was using "set -e".

...and while we're on the subject of "set -e", I think that the ["unofficial Bash strict mode"](http://redsymbol.net/articles/unofficial-bash-strict-mode/) (putting "set -euo pipefail" and "IFS=$'\n\t'" at the top of your scripts) has been a bigger bug-prevention/rapid development aide to me than anything else. To be clear, I think it's a means of detecting some kinds of bugs. I've read Wooledge and others' objections to those patterns, especially "set -e", and agree with the point that this does not make your programs objectively safer and shouldn't be counted on as a crutch. Then again, neither does a linter, but it still helps you detect and avoid some kinds of bugs, so why not use it?


I usually do something like that:

    [ -z "$1" ] && echo "Invalid first parameter!" >&2 && exit 127
as a precondition. I must admit that it's longer to write, but I can write a bunch of preconditions for my function then write the logic with an appeased mind

9) The remote bash startup order is further complicated by the existence of a compile time flag SSH_SOURCE_BASHRC. This flag determines if a remote non-interactive shell will load the ~/.bashrc file.

This flag is turned off by default and stays off in some distributions (like Archlinux), but is turned on in others (Debian, Fedora, ...) to replicate very old rsh behaviour.


I didn't see it mentioned, but I use it frequently so here is my tip. I'm not sure if it is bash specific (I just use it!).

Instead of typing !$ for the previous command's final argument, you can use the keyboard shortcut alt+. (alt+period). Pressing it multiple times will go to the last argument of previous commands. I use this quite a bit and found it easier than !$, because you can see which command it will be : )

I still don't always understand exactly what is happening with subprocesses vs subshells (chriswarbo's post is very useful in pointing out how wrinkly this can be), so I try to keep my bash usage simple.


Something I wish people teaching intermediate or advanced Bash tricks would emphasize more is how to make your program compatible with other shells. With the rise of Zsh's popularity, and the switch to Dash for Ubuntu/some Debian derivatives, I see a lot of people repeating bashisms in code they share without the knowledge that a) their code may not work for an unexpectedly large number of people, and b) switching to compatible equivalents doesn't make their code worse or less performant in many/most cases.

The most common bashisms and ways to avoid them are:

- Double brackets ([[) around conditions. Yes, I know that [ is a program (don't believe me? "which ["). That doesn't mean Bash uses it; it uses a builtin which is (almost) equivalent to [[ instead. Use that and your code will work in zsh/dash/all other POSIX shells. And while you're at it, stop using "which" as an authority for "is this a shell builtin or not?" [type()](http://linuxcommand.org/lc3_man_pages/typeh.html) is your friend.

- When comparing strings for equality, use a single equals sign "=", not "==" (e.g. 'if [ "$foo" = "some string" ]'). I know it feels dirty if you've programmed in any other language, but it changes nothing about your code's behavior and makes it compatible with several other shells.

- Don't use "function funcname()" syntax. It adds nothing over the basic "funcname()" syntax, but prevents your code running in many/most non-Bash shells. And consider putting your function-opening brace on a separate line (someone once told me that there are shells that won't accept any other function declaration style, but I've never seen one, so ymmv).

- Don't use "local" if you need to interoperate with ksh. Abandoning "local" pollutes global namespaces, though, so your call.

- Don't use substring expansion (e.g. extracting the 3rd-10th characters of a string via 'substr="${somevar:3:7}"'). That's not supported in many other shells. Alternatives include sed/awk/etc., or, if invoking external programs is absolutely unacceptable to you, something horrific like:

    substr()
    {
        local input="${1:?String is required}"  
        local dist_from_start="${2:?Start position is required}"
        local dist_from_end="${3:-${#input}}" # Here, it's actually 'offset', not distance.
        local start_nulls=
        local end_nulls=

        dist_from_end=$(( 5 * (${#input} - ($dist_from_start + $dist_from_end)) ))
        dist_from_start=$(( 5 * $dist_from_start ))

        # Make a string of the regex for "any not null character" that "masks" the
        # characters in the input before the start point, and the characters after
        # the end of the substring. This is disgusting, and is only done because the
        # parameter expansion statements can't contain repetitions (e.g. [^\0]{5})
        # without the bash-only 'extglob' shell option.
        # The not-null character is used because it will never match in a shell
        # string.
        while true; do
            if [ "${#start_nulls}" -lt $dist_from_start ]; then
                start_nulls="${start_nulls}[^\0]"
            elif [ "${#end_nulls}" -lt $dist_from_end ]; then
                end_nulls="${end_nulls}[^\0]"
            else
                break
            fi
        done

        input="${input#$start_nulls}"
        echo "${input%$end_nulls}"

    }

    substr "$@"
...actually, please never use that code. Ew.

Anyway, some more bashisms: https://mywiki.wooledge.org/Bashism


I didn't know a few of these, and I think I'll find `<()` specially useful.

I wasted a lot of time on this one:

If you declare a local variable and set it in the same step, e.g. local MYVAR=$(/bin/false), the return code you get is from the local declaration, not from assigning a value to the variable. It can be quite confusing when you afterwards check the return code with $? and it returns 0. Avoid it by assigning the value in a separate command.
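
A sketch of the difference:

    f() {
      local a=$(/bin/false)
      echo "$?"    # 0 - the status of `local`, not of /bin/false

      local b
      b=$(/bin/false)
      echo "$?"    # 1 - the command substitution's status is preserved
    }
    f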

