shell

Shell script suffers from popular prejudice, somewhat deserved but mostly misconceived (much as Perl hype promoted the mistaken notion that sed & awk were obsolete). Ugly and strange as it may be, shell script has useful advantages which make it well worth learning, but it can indeed be troublesome (as it was for me at least), so I offer these notes to reduce such trouble.

why?

speed/efficiency

Some shells (ksh, dash) are surprisingly fast, and often fast enough (faster execution stops being useful beyond a certain threshold). Shell script is also relatively quick to develop and addresses many applications efficiently.

less indirection

In contrast to languages which use different syntax and wrapper functions to encapsulate shell commands, each line in a shell script is essentially a terminal prompt awaiting input (hence the name of the Plan 9 shell "rc": run commands).

Shell syntax is simpler and more concise in many cases, such as pattern matching, and manipulating files and data streams (pipelines, &c.).
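As a small illustration of that concision, here is a minimal sketch using only POSIX parameter expansion, case, and a pipeline. The path and the function names (classify, distinct_words) are hypothetical, chosen just for this example.

```shell
#!/bin/sh
# Hypothetical path, used only for illustration.
path=/tmp/archive/report.tar.gz

name=${path##*/}    # strip the longest prefix matching "*/"  -> report.tar.gz
dir=${path%/*}      # strip the shortest suffix matching "/*" -> /tmp/archive
ext=${name#*.}      # strip the shortest prefix matching "*." -> tar.gz

# classify: glob-style matching with case, no external command needed.
classify() {
    case $1 in
        *.tar.gz|*.tgz) echo archive ;;
        *.txt|*.md)     echo text ;;
        *)              echo unknown ;;
    esac
}

# distinct_words: a pipeline which counts the distinct words on standard input.
distinct_words() {
    tr -cs '[:alpha:]' '\n' | sort -u | wc -l
}

echo "$name $dir $ext $(classify "$name")"
```

Each expansion replaces what would otherwise be a call to basename, dirname, or sed, and the pipeline reads much as one would type it at the prompt.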

concurrency

Shell script supports concurrency: built in, both as background jobs and in pipelines; and also by way of commands which support concurrency (e.g. xargs --max-procs=n, split -n, GNU Parallel).
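A minimal sketch of those three forms follows. The work function is a hypothetical stand-in for a slow task, and -P is the short form of --max-procs in GNU xargs (BSD xargs accepts it too, but it is an extension rather than something every xargs is guaranteed to have).

```shell
#!/bin/sh
# work: hypothetical stand-in for a slow task.
work() {
    sleep 1
    echo "done $1"
}

tmp=$(mktemp -d)

# Background jobs: start three tasks at once, then wait for all of them.
for i in 1 2 3; do
    work "$i" > "$tmp/$i.out" &
done
wait

results=$(cat "$tmp"/*.out)
rm -r "$tmp"

# Pipeline: each stage runs as its own process, concurrently with the others.
sum=$(printf '%s\n' 1 2 3 4 5 | awk '{ s += $1 * $1 } END { print s }')

# xargs: run up to three commands at a time (completion order is not
# guaranteed, hence the sort).
parallel=$(printf '%s\n' a b c | xargs -n 1 -P 3 echo | sort)

echo "$results"
```

The three background tasks take about one second in total rather than three, since they sleep at the same time.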

interactive

Shell script can be interactive, via the terminal or a GUI (using external tools), or both adaptively depending on context. Development is interactive as the terminal serves as a REPL (read/evaluate/print loop).
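One way to adapt to context is to test whether standard input is a terminal and whether a display is available. The following is only a sketch: the names channel, notify, and NOTIFY_LOG are hypothetical, and zenity is assumed as the GUI tool (any dialog program could take its place).

```shell
#!/bin/sh
# channel: decide how to reach the user in the current context.
channel() {
    if [ -t 0 ]; then
        echo terminal                 # standard input is a terminal
    elif [ -n "${DISPLAY:-}" ] && command -v zenity >/dev/null 2>&1; then
        echo gui                      # no terminal, but an X display and zenity
    else
        echo log                      # neither: fall back to a log file
    fi
}

# notify: deliver a message by whichever channel applies.
# NOTIFY_LOG is a hypothetical variable naming the fallback log file.
notify() {
    case $(channel) in
        terminal) printf '%s\n' "$1" >&2 ;;
        gui)      zenity --info --text="$1" ;;
        log)      printf '%s\n' "$1" >> "${NOTIFY_LOG:-/dev/null}" ;;
    esac
}
```

Run from a terminal, notify prints the message; started from a desktop launcher, it opens a dialog; run from cron, it appends to the log.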

portable

Shell script is fairly portable, like other popular interpreted languages.

which?

There are many Unix shells, and choosing one can be confusing. I use three, depending on context.

scripting: ksh, dash

I use ksh for scripting because it's fast and powerful (floating point arithmetic, &c.). Ksh is large (even larger than bash) but it's very fast (even faster than dash, except in startup).

I use dash only for exceptional cases which benefit more from fast load time (startup) and low memory usage than from fast execution.

interactive: bash

Bash is my interactive shell because its command completion is well supported and its syntax is mostly compatible with (i.e. mostly copied from) ksh. I would use ksh if it had similar completion support and a prompt command (and ksh would surely be the default had it been freely licensed before bash was born). I tried zsh a few times but it's not worth the effort to me (though that may change as it gathers more support).

sh: dash

Dash is my default system shell (/bin/sh) because it's POSIX-compliant, simple, and light on system resources.

configuration

Configuration can be confusing considering the various files involved and the different conditions which cause them to be sourced. Separating configuration by context works well (e.g. to avoid loading GUI-related configuration in console mode).

~/.profile
    basic user environment (executed on login)
~/.env-interactive
    generic user environment for interactive shells (including aliases/functions)
~/.inputrc
    readline configuration (for command line editing/history)
~/.bashrc
    executed for shells which are both interactive and non-login
    only bash-specific configuration here
~/.bash_profile
    executed for login shells
    keep it simple: just source ~/.bashrc and ~/.profile (because bash ignores ~/.profile if ~/.bash_profile exists)
~/.xinitrc (~/.xprofile for GUI login)
    xorg-specific configuration
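A minimal sketch of how the bash files might chain together. The guard on interactivity and the sourcing of ~/.env-interactive from ~/.bashrc are my assumptions about one reasonable arrangement, not the only one.

```shell
# ~/.bash_profile: read by bash login shells instead of ~/.profile,
# so source the generic environment explicitly.
if [ -r ~/.profile ]; then
    . ~/.profile
fi

# Login shells do not read ~/.bashrc on their own; source it only when
# the shell is interactive (assumption: nothing in it is needed otherwise).
case $- in
    *i*)
        if [ -r ~/.bashrc ]; then
            . ~/.bashrc
        fi
        ;;
esac

# ~/.bashrc would then begin by loading the shell-neutral interactive
# environment before any bash-specific settings:
#
#     if [ -r ~/.env-interactive ]; then
#         . ~/.env-interactive
#     fi
```

Keeping aliases and functions in ~/.env-interactive means another interactive shell could source the same file from its own startup file.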

trouble

Bourne-style syntax is strange (if not ugly) and it's tempting to just solve the problem at hand rather than learn how the syntax works. In hindsight, I would have saved time and avoided trouble by understanding the details before getting caught in their inevitable pitfalls (like issues with quoting, word splitting, globbing, ...). The following resources help avoid such trouble.
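The pitfalls named above are easy to demonstrate. This is a small sketch in POSIX sh; count_args and the file name are hypothetical.

```shell
#!/bin/sh
# count_args: report how many arguments were received.
count_args() { echo $#; }

file='my notes.txt'               # hypothetical name containing a space

unquoted=$(count_args $file)      # word splitting yields two arguments
quoted=$(count_args "$file")      # quoting keeps it as one argument

pattern='*'
# Unquoted, $pattern would expand to the names in the current directory;
# quoted, it stays a literal asterisk.
literal=$(printf '%s' "$pattern")

# "$@" preserves each argument as given; unquoted $* splits them again.
set -- 'a b' c
with_at=$(count_args "$@")        # two arguments: "a b" and "c"
with_star=$(count_args $*)        # three arguments: "a", "b", and "c"

echo "$unquoted $quoted $literal $with_at $with_star"
```

The rule of thumb which follows: quote every expansion unless splitting or globbing is specifically wanted.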

practice

Shell script can be a mess (like any versatile language) but it need not be, if one is careful to maintain consistent structure and style.

structure

I generally divide my shell scripts into several sections:

  1. preliminary – metadata, configuration defaults (just assignment)
  2. functions – function definitions only
  3. main logic
    a. read options (getopts)
    b. initial validation (arguments, execution context, &c.)
    c. main logic (simplified by use of functions)
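The outline above might look like this in practice. It is only a sketch: the script name, options, and functions are hypothetical, and wrapping the main logic in a main function is my choice here (it makes the logic easy to call and test), not a requirement of the structure.

```shell
#!/bin/sh
# 1. preliminary: metadata and configuration defaults (assignment only)
prog=greet                 # hypothetical script name
greeting=Hello
count=1

# 2. functions: definitions only
usage() {
    echo "usage: $prog [-g greeting] [-n count] [name]" >&2
}

die() {
    echo "$prog: $*" >&2
    exit 1
}

greet() {
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$greeting, $1"
        i=$((i + 1))
    done
}

# 3. main logic
main() {
    # a. read options
    OPTIND=1
    while getopts g:n: opt; do
        case $opt in
            g) greeting=$OPTARG ;;
            n) count=$OPTARG ;;
            *) usage; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))

    # b. initial validation
    [ $# -le 1 ] || { usage; return 2; }
    case $count in
        ''|*[!0-9]*) die "count must be a non-negative integer" ;;
    esac

    # c. main logic (simplified by use of functions)
    greet "${1:-world}"
}

main "$@"
```

Saved as greet, running it with -g Hi -n 2 Bob prints "Hi, Bob" twice, and running it with no arguments prints "Hello, world".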

style

I do the following to aid legibility and reduce error:

prudence

In preventative practice:

performance

Shell script need not be slow!