Command line and bash notes

Revision as of 16:31, 19 August 2018 by Helpful

Safer/cleaner scripts

Shell expansion

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Bash shell expansion comes in the kinds described in the following sections, which are applied in that order.

While powerfully brief, it is also hard to truly understand, and depends on environment settings (environment variables, shell options), so it will bite you, and if you want something robust it is best avoided.

Usually the suggestion is to use a scripting language instead, one in which it is easier to be correct and clear while still brief (so not perl). Python is a common option, being ubiquitous on modern linux.

For example, can you say why

for fn in `ls *.txt`; do echo $fn; done

is a problem while

for fn in *.txt; do echo $fn; done

is mostly-fine-except-for-a-footnote-or-two? And what the better-form-yet is?
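To spell out one commonly suggested better form (a sketch, not the only way): let the glob expand directly, quote the variable, and guard against a non-matching glob being passed through literally:

```shell
# shopt -s nullglob would instead make a non-matching glob expand to nothing
for fn in *.txt; do
  [ -e "$fn" ] || continue   # skip the literal '*.txt' when nothing matched
  echo "$fn"
done
```

The `ls` version breaks because the command substitution's output gets word-split, mangling names with spaces; the plain glob hands you each name as one word.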

Some examples below are demonstrated via a command called something like argshow, which prints how many arguments it received and delimits each one. Make this yourself with the following contents and a chmod +x:

#!/bin/bash
printf "%d args:" $#
printf " <%s>" "$@"
echo  # ends the line

brace expansion

Combinatorial expansion:

# echo {a,b,c}{3,2,1}
a3 a2 a1 b3 b2 b1 c3 c2 c1
# echo a{d,c,b}e
ade ace abe
ls -l /usr/{bin,sbin}/h* 

Sequence expression (integers):

# echo {1..6}
1 2 3 4 5 6
# echo {1..10..2}                                                                                                                                                                           
1 3 5 7 9

Sequence expression (characters, in C locale):

# echo {a..f}
a b c d e f                 


  • Expanded left to right.
  • order is preserved as specified, not sorted
  • things stuck to the braces on the outside are treated as preamble (to prepend to each result) and postscript (to append to each result), see second example
  • a single element in braces (e.g. {a}) is not expanded at all - brace expansion needs a comma or a sequence
  • when using it for filenames, keep in mind that
it generates names without requiring they exist
it happens before pathname expansion (meaning you can combine with globs - and that you should consider cases where they don't expand)
  • may be nested, is treated flattened(verify)
# echo {1,2}-{{_,-},{X,Y,Z}}                                                                                                                                                                
1-_ 1-- 1-X 1-Y 1-Z 2-_ 2-- 2-X 2-Y 2-Z 
# echo {a,b}{{_,-}{X,Y,Z}}                                                                                                                                                                  
a{_X} a{_Y} a{_Z} a{-X} a{-Y} a{-Z} b{_X} b{_Y} b{_Z} b{-X} b{-Y} b{-Z}
# echo {00{1..9},0{10..50}}
001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 
026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 
  • you can't combine a sequence and a set (e.g. {1,3..5} expands as the two strings 1 and 3..5)


tilde expansion

The two best-known:

  • ~
    is your shell's $HOME
    (with some footnotes around su)
  • ~username
    is that user's home path

And these can start path expressions, e.g.
ls -l ~/.ssh ~nobody/.ssh

Keep in mind that these paths come from the account database and do not necessarily exist (though they usually do).

There are a few others, such as ~+ and ~-
for the current and previous working directory ($PWD and $OLDPWD).
You may know the latter from the special-cased
cd -

parameter/variable expansion

On delimiting

The bracket style ${var} allows more unambiguous delimiting (and doesn't need whitespace to separate it from adjacent text).

It also allows the conditional replacement mentioned below.

Plain $var is fine in basic cases.


o="One";t="Two" ; echo $otfoo ; echo $o t foo ; echo ${o}t foo ; echo ${o}${t}foo

Conditional replacement

Warning if not set

${VAR:?word} - if VAR is unset or null, bash complains with the given message, and a non-interactive shell (i.e. a script) will exit
# yn="";echo ${yn:?Missing value}
-bash: yn: Missing value

Return this value if not set

${VAR:-word} - if VAR is unset or null, (the expansion of) word is returned instead
# Take device from first command line argument, default to eth0 if not given
DEV=${1:-eth0}

# Reports all files containing a certain pattern. Call like:  
#   fileswith greppattern [file [file...]]
PATTERN="$1"
shift # consume that pattern from the cmdline arg list so we can use $@:
FILES=${@:-*}
grep -l $PATTERN $FILES | tr '\n' ' '

"If variable not set, return this other value and assign to the variable":

#If ans was set, keep its value. 
#If ans was not set, will return no and set ans to it.
#nice in that later code can safely assume it is set
echo $ans

"Use given value when set at all"

For example "any actual answer is taken as 'yes', non-answers are unchanged"

yn="";echo ${yn:+yes}
yn="wonk";echo ${yn:+yes}

Pattern and substring stuff


${var%pattern} removes a string from the end of var, allowing globs.

Say you have files like

a-001.txt a-002.txt b-001.txt b-002.txt b-003.txt b-004.txt

...and want to handle them as sets; then one way is to look for all firsts, strip down to the base, and expand again:

for fn in *-001.txt; do
 basename=${fn%-001.txt}
 echo $basename*
 # cat $basename* > ${basename}-all.txt
done

will print

a-001.txt a-002.txt
b-001.txt b-002.txt b-003.txt b-004.txt

The difference between % and %% is that when you use a glob, % will remove the shortest match and %% the longest, e.g.

$ export v=abcabcabc
$ echo ${v%b*c}
abcabca
$ echo ${v%%b*c}
a


${var#pattern} and ${var##pattern} remove a string from the start of var, again allowing globs, shortest and longest match respectively.

For example:

  • ${0##*/}
    is a good imitation of basename $0, i.e. the script's own name
  • ${filename##*.}
    gets the filename's extension

arithmetic expansion

Basically, using
$(( expr ))
evaluates expr according to shell arithmetic rules (integer arithmetic in bash).
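For example (integers only; bash has no floating point here):

```shell
x=5
echo $(( x * 2 + 1 ))   # 11 - variables don't need a $ inside
echo $(( 7 / 2 ))       # 3  - integer division truncates
echo $(( 7 % 2 ))       # 1  - modulo
```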

command substitution

$(command) and `command` will be replaced by the stdout from that command

(the former is mildly preferred, in that it has fewer edge cases in parsing characters)


  • it's executed in a subshell
  • trailing newlines are stripped
  • note that word splitting applies, except when this appears in double quotes (single quotes would avoid evaluation)
# argshow $(echo a b)
2 args: <a> <b>            
# argshow "$(echo a b)"
1 args: <a b>
# argshow '$(echo a b)'
1 args: <$(echo a b)>
  • $(< file)
    is done without subshell(verify) so is faster than
    $(cat file)
  • can be nested
(backquote style needs escaped backquotes to do so)
echo $(echo $(ls)) 
echo `echo \`ls\``
  • evaluated left-to-right

word splitting


Most of what you need to know:

  • Word splitting is performed on almost all unquoted expansions
  • if no expansion occurs, no splitting will occur either (verify)
  • Will split on any run of the characters in $IFS
if unset, default is whitespaces
if set to empty string (a.k.a null), no splitting occurs

If IFS isn't set, it defaults to act like
" \t\n"
(space, tab, newline),
which is why naive loops misbehave around files with spaces in them. One partial workaround is to remove the space from $IFS, i.e. set it to tab-and-newline.
IFS=$'\t\n'  # bash's $'' quoting parses the escapes; a plain IFS="\t\n" would be those four literal characters
for fn in `ls *.txt`; do echo $fn; done
unset IFS # unless you want everything later to behave differently

These delimiters are ignored at the edges (so empty-argument results are avoided)

You can use IFS for other tricks, like:

IFS=:
while read username pwd uid gid gecos home shell; do
   echo $username
done < "/etc/passwd"
unset IFS


  • Double-quoting suppresses word splitting,
  • ...except for "$@" and "${array[@]}"
  • no word splitting in
    • bash keywords such as [[ and case (verify)
    • expansions in assignments
  • You can see what's in IFS currently with something like
    echo -n "$IFS" | od -t x1
20 09 0a is space tab newline
(echo -n means it won't add its own newline; the doublequoting avoids word splitting :)
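A quick way to demonstrate the quoted-versus-unquoted difference (my own sketch, using set -- to count resulting words):

```shell
v="a b  c"
set -- $v     # unquoted: split on runs of IFS characters
echo $#       # 3
set -- "$v"   # quoted: no splitting
echo $#       # 1
```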

pathname expansion

Other notes

Shell stuff I occasionally look up



Fixed values and variables

for arg in "$var1" "$var2"; do 
  echo $arg

The doublequoting is good practice because you usually want to avoid word splitting on arbitrary values.

Which is mostly a variant of something like:

for body in Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune; do
   echo $body
done


for lastoct in `seq 2 254`; do
  echo 192.168.0.$lastoct
done

Both rely on word splitting of the expanded list.


Mostly: See test, and a few of the notes for for

syntax error near unexpected token done
...often means you didn't put a semicolon/newline between the condition and do

A poor man's watch, which I use to get shell colors without forcing them:

while true; do
  ls
  sleep 1
done
# Or as a one-liner
while true; do ls; sleep 1; done 
# You can use 
#while :           
#while [ 1 ] 
# ...if you find them easier to remember


  • :
    is a historical shorthand for
    true
    and is also sometimes useful as a short no-op

Redirecting, basic

  • <
    feed file into stdin
  • >
    write stdout to file (overwrite contents)
  • >>
    write stdout to file, appending if it already exists

For example:

ls dir1 > listing       # would overwrite each time
ls dir2 >> listing      # would append if exists
sort  <listing  >sorted_listing

By default this applies to stream 1, stdout, because that's where most programs put their most pertinent output.

The standard streams are numbered, and (unless redirected) are:

  • stdin is 0
  • stdout is 1
  • stderr is 2

So e.g.

find >output 2>errors
# or, equivalently
find 1>output 2>errors



Piping is redirecting between programs.

When starting multiple processes, you redirect an output stream from one to the input stream of another.

For example:

locate etc/ | less
cat infile | sort | tee sorted_list | uniq > unique_list

This can also be combined with redirection, e.g.

find . 2>&1 | less   # don't ignore the errors

Redirecting, fancier

You'll want to know that there is some syntax variation (particularly between shells). In bash,

&> filename
>& filename
are equivalent, and short for:
>filename 2>&1
i.e. stdout and stderr are written to the same file, because it says:
write stdout to filename
write stderr to what stdout currently points to

Also, some of this is specific to bash;
e.g. dash will trip over &>,
saying Syntax error: Bad fd number

Consider how multiple requests are handled - primarily that changes are processed in order. Consider:

prog >x 2>&1 >y
This means:
connect stdout to file named x
connect stderr to what stdout currently points to (which is the file named x) (actually duplicates the file descriptor(verify))
connect stdout to file named y
The net effect is "connect stderr to a file named x, and stdout to a file named y".
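You can convince yourself of that left-to-right processing with something like the following (my own sketch; x and y are scratch files):

```shell
cd "$(mktemp -d)"
# stdout to x, then stderr to "where stdout points right now" (x), then stdout on to y
{ echo out; echo err >&2; } >x 2>&1 >y
cat x   # err
cat y   # out
```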

Redirection, less common

tee copies stdin to stdout verbatim and also writes it to the named file.

This is sometimes a nice streaming thing, though usually just for command brevity:

# log output and show it live
find / 2>&1 | tee allfiles
# writes both sorted and unique list
cat infile | sort | tee sorted_list | uniq > unique_list

<< (bash-specific, not bourne?(verify)) - pipe in a here document [3]
    • Example: TODO

<<< (bash-specific, not bourne?) - here string [4]
    • goes through most interpretation. Some use this syntax primarily for its short command substitution
    • Example: TODO
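Filling those TODOs with my own minimal examples (EOF is an arbitrary marker):

```shell
# here document: the lines up to the marker become the command's stdin
tr 'a-z' 'A-Z' <<EOF
hello
EOF
# here string (bash): one string (plus a trailing newline) becomes stdin
tr 'a-z' 'A-Z' <<< "hello"
```

Both print HELLO.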

The pv utility copies stdin to stdout and prints how fast it is going on stderr.
Can be nice to see how fast data is moving through a pipeline.
Can deal with showing multiple streams, e.g. to test how people's homedirs would compress on average:
tar cvf - /home 2>/dev/null | pv -c -N RAW | pigz -3 - | pv -c -N COMP > /dev/null


piping/catching both stdout and stderr

These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.

When you call an external program and read from one stream, you typically use blocking reads for simple 'wait until it does something' logic.

Doing that from both stdout and stderr is a potential problem, in that you can have output on one while not getting any on the other. Usually you can get away with this, but it can produce deadlock-like situations.

Generally, you want to either:

  • use non-blocking reads (probably in a loop with a small sleep to avoid hammering the system with IO)
  • test streams with select() before read()ing
    • In some cases, your OS or language (standard) library does not expose select(), you cannot find the file descriptor to select on, it does not let you select on pipes, or there is some other problem.

Other workarounds:

  • redirect both to the same stream (but that can be annoying to do from an exec()-style call, because you need to wrap it in a shell - redirection is shell stuff)
  • for non-interactive stuff, write both streams to a file, read those after the programs exit

Shell escaping


You'll occasionally create a string to be evaluated in another context (or immediately via expr or backticks) -- and run into problems with escaping/delimiting.

'Not safe' below tends to mean one of:

  • Will open some interpreted, to-be-closed range (e.g. `)
  • Interpreted differently if in script or on command line (e.g. "\")
  • terminates some parse by odd tokenization, such as spaces in filenames

In various cases I prefer a scripting language that more or less forces you to do things in a stricter (if longer) way, simply because I won't spend as much time convincing myself that the bash script is correct, or at least good enough.

single quotes: 'string'

  • Not safe to dump in:
    ' (the single quote itself, which ends the quoted range), possibly more
  • Safe:
    most everything else, including " $ ` \ and whitespace
    (safe as in "not interpreted as anything more than a character")

double quotes: "string"

  • Not safe to dump in:
    " $ ` \ (and, interactively, ! because of history expansion), and probably more
  • Safe:
    ' and whitespace, among others

backslashing\ each\ necessary\ character

  • Potentially safer than the above (solves mentioned nonsafe character problems)
  • But: interpretation of backslashes unsafe themselves - or rather, they depend on quotes again:
    • 'single quotes' (no interpretation?)
      • echo '\z' → \z
      • echo '\\z' → \\z
      • echo '\\\z' → \\\z
      • echo '\\\\z' → \\\\z
    • outside quotes
      • echo \z → z
      • echo \\z → \z
      • echo \\\z → \z
      • echo \\\\z → \\z
    • "double quotes"
      • echo "\z" → \z
      • echo "\\z" → \z
      • echo "\\\z" → \\z
      • echo "\\\\z" → \\z

Further notes:

Here documents (those
<<EOF things) act differently from the above descriptions, apparently acting like escapes inside backquotes (command substitution). But frankly, if you're doing shell scripting that complex, you're dangerous to begin with :)

Using escaping from the shell (in most shells, anyway) gets a layer of pre-interpretation that would not be applied in a script (!)

Shell conditionals and scripting

Conditional execution

Say that you have a regularly-running script conceptually like:

graphdata > file.png
mv file.png /var/www/mywebserver

...and you want to do some parts only if the earlier bit succeeds.

Basically: Make programs return meaningful return codes (most do), and test for them and use the result.

The short syntax is
&& ('if success') and
|| ('if failed').

You can even use both (though syntaxwise this is cheating a little bit - guess why), like in:

/bin/true  && echo "Jolly good." || echo Drat.
/bin/false && echo "Jolly good." || echo Drat.

A brief one-liner with bash syntax is to use
&&, for example:
collectdata && graphdata > file.png && mv file.png /var/www/mywebserver

If this is not a one-liner (e.g. in your crontab) but a longer script, it's probably cleaner to do something like:

collectdata                       || { echo "Data collection failed"; return 1; }
graphdata > file.png              || { echo "Data graphing failed"; return 2; }
mv file.png /var/www/mywebserver  || { echo "Moving graph failed"; return 3; }

Note: Prefer return
over exit in anything that may be sourced rather than executed - exit would e.g. render an xterm unstartable.

For the pedantic: The && and || essentially mean 'if zero return code' and 'if nonzero' -- which is inverted from the way true and false works within almost all programming languages. It's often less confusing if you don't think about the values :)


if, test

See also [[, extended test

In bourne-style scripts you frequently see lines like:

if [ "$val" -lt 2 ]; then
if test "$val" -lt 2; then
These two are functionally equivalent. The difference is that [
(which is an executable, with that somewhat unusual name) looks for a closing ]. People seem to prefer this form.

You can negate tests with !
if test ! -r ~/.hushlogin; then
  echo "La la you haven't shut up motd yet"
fi
test ! -d /var/run/postgresql && mkdir -p /var/run/postgresql

Actual tests include: (list needs to be (verify)'d)


  • -eq
    equal
  • -ne
    not equal
  • -lt, -gt
    less than, greater than
  • -le, -ge
    less than or equal, greater than or equal


  • -r
    exists and can be read
  • -w
    exists and can be written
  • -x
    exists and can be executed
  • -s
    file exists and isn't empty (size isn't zero)
  • -e
    file exists (may not appear in all implementations(verify))
  • -f
    exists and is regular file
  • -d
    exists and is directory
  • -h
    or -L: exists and is a symbolic link
  • -p
    exists and is a pipe
  • -b
    exists and is a block device
  • -c
    exists and is a character device
  • -S
    exists and is a socket


  • -n
    nonzero string length (you probably want doublequotes around a variable)
  • -z
    zero string length (you probably want doublequotes around a variable)
  • =
    string equality
  • !=
    string inequality
  • Nonstandard: 'lexically comes before' and 'lexically comes after', \< and \>, but be careful: without correct escaping these become file redirection.

boolean combinations -- which are nonstandard

  • -a
    logical and
  • -o
    logical or

Other operators test ownership by set or effective user or group, by relative age, by inode equality and others.

On empty/missing arguments

Things can get a little finicky in this case.

Common mistake #1: Unquoted empty variables

Consider that if $var is not set, or an empty string, then

[ $var = '' ]
[ -n $var ]
[ -z $var ]

would expand into:

[ = '' ]
[ -n ]
[ -z ]

The first is a syntax error. The second isn't but doesn't do what you want (returns true without an argument). The third is basically fine. Regardless, you should be in the habit of always using quotes: (probably doublequotes)

[ "$var" = '' ]
[ -n "$var" ]
[ -z "$var" ]

Common annoyance #1: No substring test

It's not there.

But it sort of is -- in bash and sh (recent/all?(verify)), you can use case for this, e.g.:

case "$var" in
      echo "Saw error, stopping now"
      exit 0 ;;
      echo "We're probably good, doing stuff"

The following is also possible, and arguably more generic, because it uses something external (that we know the behaviour of):

grep -o "pattern1" <<< "$var" && echo "do something"

Also, both are fixed in the extended test command, which is a bashism (not standard POSIX, not available in e.g. sh or dash which can be your system's default shell, so only use around a /bin/bash hashbang).

Since test and conditional commands
also set the exit code, you see shell script lines like:
# stop now if we are running in interactive context
[ -z "$PS1" ]      && return           
# source this if it exists
[ -f /etc/bashrc ] && . /etc/bashrc

While experimenting with command success/failure, you may find it useful to show test's exit status, for example:

test -n "`find . -name '*.err' -print0`" ; echo $?

...but this can be confusing -- the logic is wrapped around program exit codes, so 0 is true and nonzero is false, which is the opposite of how most programming logic works. You usually don't need to think about that until you're consuming it as a number. For example:

test 2 -eq 2 ; echo $?     # 0: success, i.e. true
test 4 -eq 2 ; echo $?     # 1: failure, i.e. false

[[, extended test

The extended test command, [[, was adopted by bash from ksh around bash 2.02 (~1998) (regex support since 3.1, ~2005) [6], and also available in zsh.

Note: when using in scripts, use explicit #!/bin/bash hashbang to avoid problems with default shells not being bash (or ksh).

This because while test and [ are POSIX, [[ is not. Bash is very typically installed (if not, the system can be considered exotic), but don't assume it's your system's default shell (used e.g. for scripts). Pay attention to systems:

using plain sh (bourne shell, not bash)
using the lightweight dash (e.g. various Ubuntu and BSD do this, and dash doesn't do [[)
embedded systems using busybox (so ash(verify)[7], the origin of dash)

Most interesting details compared to [ :
  • parsed before other processing,
in particular it sees things before any word splitting or glob expansion
it's more predictable, as it's less likely to be mangled by something you didn't think about
it is no longer a mistake to omit (double)quotes for variables, file tests, or spaceless literals:
[[ $VAR != yes ]]
[[ -e $b ]]
  • now standard instead of deprecated:
&& for logical AND within a test
|| for logical OR within a test
( ) for logical grouping
  • glob matching, which includes substring matching
[[ abc = a* ]]
[[ $var = *a* ]]
  • regexp pattern matching
[[ abb =~ ab+ ]]

Note that it still makes sense to use (( for arithmetic, [[ is for string and file stuff.
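A small demonstration of those matches (my own sketch; requires bash):

```shell
var="some error text"
[[ $var == *error* ]] && echo "substring match"   # glob on the unquoted right side
[[ $var =~ ^some ]]   && echo "regex match"       # ERE; quoting the pattern would make it literal
```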


Some practical tests

Test whether variable is set

If you only care to distinguish 'set to a nonempty string' from 'empty, null, or unset', the following is fine:

if [ -z "$var" ]

If you care to distinguish 'unset' from 'set, even if to null'(verify), then there is the following (which happens to be valid with and without quotes):

if [ -z ${var+s} ]

If you care about the difference between 'set and not null', 'set and null', and 'unset', things are even more interesting.
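To see the ${var+s} behaviour concretely (my own sketch):

```shell
unset var ; echo "[${var+s}]"   # []  - unset
var=""    ; echo "[${var+s}]"   # [s] - set, even though empty
var="x"   ; echo "[${var+s}]"   # [s] - set
```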

Test whether command exists


One of the most robust, POSIX-compliant tests seems to be:

if [ ! -x "$(command -v progname 2>/dev/null)" ]; then
  echo progname not installed


If it sets a return code, you could generally just run it
(suppressing stderr to avoid -bash: progname: command not found)
if ! progname 2>/dev/null; then
  echo progname not installed
fi

...except you may run into the command-not-found handle that suggests installing things.


Generally preferred: command,
with option -v to describe (which prints something you probably want to hide) rather than run. Basically:
if ! command -v progname >/dev/null 2>&1; then
   echo progname not installed
fi

command may match weirdly defined aliases, and some implementations (e.g. dash's(verify)) will also match non-executables, meaning the slightly better variant is:

if [ ! -x "$(command -v progname 2>/dev/null)" ]; then
  echo progname not installed


Bash has a builtin, type,
that mentions whether something is a builtin, keyword, disk executable, etc. - and will give an error return code if it doesn't find anything (and complain on stderr), so:
if ! builtin type progname >/dev/null 2>&1; then
   echo progname not installed
fi
A similar bash-builtin-ism is hash,
which is basically its builtin which.

(The main difference to type is that it'll only report disk executables)

if ! builtin hash progname 2>/dev/null; then
   echo progname not installed
fi

The use of builtin is optional, but preferred in case you have functions/commands of your own called hash or type.

type and hash are POSIX too[8][9] so will also work in zsh, dash and ash (busybox). Note that some have non-POSIX extensions that you shouldn't use if you want this portability. (some suggest implementations vary more -- TODO: figure out(verify))

Why not which? - it's not portable. While it's typically present, and it may work well on your system, it is not a standard tool, not all implementations set a return code, and some do heavy stuff in package management.[10]


More notes

If you want to take out some block of code via an if (faster than commenting a lot of lines) then:

if [ false ] # or whatever string in there 

...won't work because bash uses string variables, and the default operation is -n ("is string non-empty").

The shortest thing that does what you want is:

if [ ]

or perhaps:

if [ "" ]

or, if you want something more obvious to passing readers, you could do:

if [ ignore = block ]


case

Note: only in the bourne-style shells (verify)

For example:

case "$var" in
   *pattern1* )            echo "seeing pattern 1" ;;
   *pattern2*|*pattern3* ) echo "seeing pattern 2 or 3" ;;
   * )                     echo "fallback case" ;;

The thing this has over test/if is proper wildcard behaviour.


You'll know that a bash script can act as a batch file, running one command after the other in the hope nothing will screw up. Bash, however, offers more useful functionality, in and out of scripts (there is, in fact, no noticeable difference). For example:

for pid in `pidof swipl pl`; do renice 5 $pid; done 
29180: old priority 0, new priority 5
26858: old priority 0, new priority 5

...will re-nice the named processes, because for expects a space-separated list, pidof returns a list of pids, and backquotes (`) mean "substitute the output of the command"

The above could have been spread among lines:

bash-2.05b $ for pid in `pidof swipl pl`
> do 
>   renice 5 $pid
> done

Something similar goes for if-else constructs. These allow you to build scripts that catch errors, run differently depending on how other commands fared, on environment variables, and whatnot. Scripting tends to beat real programming for simple little jobs.

You can do this for files by using a wildcard, but it is generally a bad idea and you shouldn't learn it this way, because it won't work in two situations:

  • when filenames contain spaces (and possibly other less usual but legal characters)
  • when there are so many files that bash expands the command to something longer than it can use (see Argument list too long), although this is less of a problem now

If you want to do it robustly / properly, learn to use find and xargs.


while is a conditional loop.

You can do things like

while [ 1 ]; do (clear; df; sleep 5); done
#which imitates   watch -n 5 df


let c=0
while [ $c -lt 10 ]; do  # better served by a for
  echo $c
  let c=c+1 
done

User input

read reads user input into a variable, for example:

read -p "Do you want to continue? " usercont
echo $usercont

There are some options; since read is a builtin, look in help read rather than a separate man page.


Substring (by position,length):

# s="foobarquu";echo ${s:3:5}

Regexp is possible, but strange and limited. Use of awk and/or sed is probably handier.
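That said, bash's [[ =~ operator puts its captures in BASH_REMATCH, which covers simple cases (my own sketch; bash 3.x+):

```shell
s="foo123bar"
if [[ $s =~ ([0-9]+) ]]; then
  echo "whole match: ${BASH_REMATCH[0]}"   # 123
  echo "group 1:     ${BASH_REMATCH[1]}"   # 123
fi
```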

sourcing scripts

Usually, running a script means creating a process, and running the listed commands in that process.

When you want to alter a current shell's environment, it is useful to run another script's commands in our context. This is what source
is for.
source /etc/profile
# (bourne-style?) shorthand:
. /etc/profile

Backgrounding processes

Ctrl-Z, fg and bg

I occasionally see people use a shell only to run a program, typing e.g. the program's name and then minimizing the shell as useless clutter, even though it is simple to background a program, like:
netscape &

(In the case of KDE, you can of course use the run dialog, Alt-F2. The parent of the process will be kdeinit then, I believe.) If you wish to have the same effect as the & after you didn't initially use it, you can use Control-Z to pause the current foreground process, which should print something like:

[1]+ Stopped       firefox
...which is a shell-specific (bash, here) job management list. You can then run
bg
to have the same effect as the ampersand, or
fg
to continue the program as before - an effective pause.

You can use the job ids if you want detailed control over more than one process, but I've never needed this. When a backgrounded program's parent shell is terminated, the program should keep running, although there are likely details there that I've never checked out. For a more certain, permanent and convenient running-in-the-background solution, use screen, which is probably more useful in the first place.

Directory of script being run

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

Console scrolling

Shift-PgUp and Shift-PgDown (often)

Useful for those happy-go-verbose programs, you can scroll back as far as the screen history goes. This usually works in text consoles, and is usually imitated by X terminal consoles.

Note that various things (PuTTy/Konsole/xterm, but also screen) may have their own configurable limit to how many past lines they keep, and in the case of screen, their own way of looking at it (screens are not really regular terminals, after all...)

Shell aliases and functions


Aliases are short identifiers that expand to longer things. For bash, the syntax is like setting a variable. Potentially useful examples:

alias webdir="cd /var/www/www/htdocs"                # go to some directory you regularly work in
alias weblogtail="tail -F /var/log/apache2/*"        # watch web server log
alias logs="tail -F /var/log/*.log /var/log/*/*.log /var/log/syslog"  # watch various current logs
alias ..='cd ..'                                     # funky shortcut
Customized alternatives, such as an ls that uses ISO date formats, hides . and .. (but shows other dotfiles), hides the groupname, hides backups (*~), adds a / to directory names, sorts by mtime (most recent last), uses human-readable sizes, and color when appropriate:
alias l="ls -lAhGBptr --time-style=long-iso --color=auto "

Some examples:

alias vf='cd'                            # catch typo
alias duh="du -h "                       # use human-readable sizes
alias dud="du --max-depth=1 -h "         # human, one-deep (often more readable)
alias lslast="ls -lrt "                  # show last modified last
alias lsd='find * -prune -type d -ls'    # 'list directories under curdir'
alias hexdump='od -t x1z'                # "show hex representation, of single bytes at a time, show text alongside"
alias dlpage="wget -r -l 1 "             # save page and direct links
alias lesscol="less -R"                  # less that allows color (...control codes)
alias psgrep="ps aux | grep"             # short way of grepping through process list
alias go-go-gadget=sudo
# change default verbosity
alias df="df -hT "                       # use human-readable sizes and show filesystem type
alias bzip2="bzip2 -p "                  # always print progress when bzipping
alias pstree='pstree -pu '               # always show pid, and show usernames where UID changes
# change default behaviour:
alias grep="egrep "                      # always use extended grep (always have regexp)
alias bc="bc -lq "                       # bc always does float calculations


  • aliases can be removed with unalias name
  • naming an alias the same as the command it wraps is possible, but can mean arguments are hard to negate, arrive twice, and similar confusion. You may want to know that prefixing a backslash (e.g. \grep), or using command grep, bypasses the alias.
  • aliases don't have arguments as such - they expand, and any arguments land at the end (note that bash functions do have arguments, so they can be a better choice)
  • if instead of an alias you can use an environment variable (e.g. <tt>GREP_COLOR=</tt> for grep's --color=), that may be preferable, as it is more flexible.
  • some distributions ship different default behaviour via aliases, such as having rm do rm -i.
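To make that argument distinction concrete, a small sketch (tn and backup are made-up names here): an alias can only have arguments pasted at the end, while a function can place and reuse them anywhere:

```shell
# an alias just expands, with any arguments landing at the end:
alias tn='tail -n'           # 'tn 5 file' becomes 'tail -n 5 file'

# a function can use its arguments anywhere, even twice:
backup() {
    cp -- "$1" "$1.bak"      # refers to $1 twice, which no alias can do
}
```

So backup notes.txt copies notes.txt to notes.txt.bak, something an alias cannot express.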


Bash functions have a somewhat flexible syntax. They look like the following, although the 'function' keyword is optional:

# alternative to cd that lists content when you switch directory
function cdd() { cd "${1}" ; echo "$PWD" ; ls -FC --color ; }

More adaptively:

rot13 () { 
   if [ $# -eq 0 ]; then  # no arguments? act as an eternal per-line filter
      tr '[a-m][n-z][A-M][N-Z]' '[n-z][a-m][N-Z][A-M]'
   else                   # translate all arguments
      echo "$*" | tr '[a-m][n-z][A-M][N-Z]' '[n-z][a-m][N-Z][A-M]'
   fi
}


  • Functions can be removed with
    unset -f name
  • Neither aliases nor functions run in a separate process - both execute in the current shell

Renaming many files

There are different things called rename out there.

rename (Perl script)

If running rename without arguments says:
Usage: rename [-v] [-n] [-f] perlexpr [filenames]

This is the perl-script variant, and seems to be the default on Debian/Ubuntu and derivatives. Many other distros have it in a package called something like prename.

You use it something like:

rename 's/[.]htm$/.html/' *.htm
rename 's/[a-z]+_([0-9]+)[.]html$/frame_$1.html/' *.html

The perl expression is typically a regex substitution, but could be any perl code that alters $_.

In a substitution you may often want /g, or you'll get only one replacement per filename. The -n option shows what would be renamed without actually doing it.

"rename (util-linux" variant)

If running without arguments starts with:

call: rename from to files...

OR (...there seems to be a mix of old and new in the wild) (verify)

rename: not enough arguments

 rename [options] expression replacement file...

Then it's a simple substring replace and you use it like:

rename '.htm' '.html' *.htm

And e.g:

rename '' 'PrependMe ' *
rename 'RemoveMe' '' *
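If you're unsure which rename you have (or it's missing entirely), a plain bash loop handles the simple cases. A sketch of the .htm to .html rename, demonstrated in a scratch directory:

```shell
cd "$(mktemp -d)"            # demo area with predictable contents
touch index.htm about.htm

for f in *.htm; do
    [ -e "$f" ] || continue          # skip when the glob matched nothing
    mv -- "$f" "${f%.htm}.html"      # ${f%.htm} strips the suffix
done

ls                           # now about.html and index.html
```

The ${f%suffix} expansion is plain POSIX shell, so this also works where neither rename exists.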

See also:


Haven't used yet. TODO

Configurable autocompletion

Bash >=2.04 has configurable autocompletion. See man bash, somewhere under SHELL BUILTIN COMMANDS.

Actions are pre-made completion behaviour.

complete -A directory  cd rmdir         # complete only with directories for these two
complete -A variable   export           # assist re-exports
complete -A user       mail su finger   # complete usernames
complete -A hostname   ping scp         # complete hostnames (presumably from /etc/hosts)
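When the valid arguments are just a fixed set of words, -W takes a word list. The compgen builtin applies the same matching rules, so you can preview what such a completion would offer (myservice is a made-up command name here):

```shell
# complete a hypothetical 'myservice' command with fixed subcommands
complete -W "start stop restart status" myservice

# compgen applies the same matching, handy for testing a word list:
compgen -W "start stop restart status" -- st    # start, stop, status
```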

Filter patterns are usually for filenames (-f), to filter out completion candidates, for example to filter out everything that doesn't end in '.(zip|ZIP)' when the command is unzip.

This can be helpful but also potentially really annoying: if you know a file is an archive but it doesn't have the exact extension the completion expects, you will have to type out the filename (or temporarily change the command).

Manual filters: You can use a bash function, and inside it do whatever you want, including calling applications to get and process your options (just don't make them heavy ones). The following example (found somewhere, and rewritten) completes killall with the names of processes the current user owns:

_processnames() {
   local cur=${COMP_WORDS[COMP_CWORD]}           # the partial word typed so far
   COMPREPLY=( $( ps --no-headers -u "$USER" -o comm | \
       awk '{if($0  ~ /^'"$cur"'/)  print $0}'  | \
       awk '{if($0 !~ /\/[0-9]$/)   print $0}' ) )
   return 0
}
complete -F _processnames  killall

The first awk takes out names that don't start with what you typed so far; the second filters out some kernel thread names (in the form processname/0) that you probably can't and don't want to kill.


shopt sets a number of optional shell behaviours in bash.

Options are set (-s) or unset (-u).

You can see the current state with a
shopt -p
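For example, toggling a single option and checking its state (using histappend, mentioned below):

```shell
shopt -s histappend       # set it
shopt -p histappend       # prints 'shopt -s histappend', i.e. currently set
shopt -u histappend       # unset it again
shopt -q histappend       # -q is silent; the exit status tells you the state
```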

Most of the settings are low-level and are probably already set to sensible values.

Things that might interest you include:

  • checkwinsize
    update LINES and COLUMNS environment variables whenever the shell has control. Useful for resizeable shell windows, e.g. remote graphical ones.
  • histappend
    appends to, instead of overwriting, the history file. Useful when you often have multiple shells open on the same host.
  • dotglob
    includes .dotfiles in pathname expansion (globbing)
  • nocaseglob
    matches case-insensitively while globbing. This can be useful if you, say, want '*.jpg' to include '.JPG', '.Jpg', etc. files too. (You may wish to be a bit more careful when you have this set, though)

If you generally want case sensitive matching, but sometimes case insensitive matching, say,

ncg ls *.jpg    # case insensitive 
ls *.jpg        # case sensitive

...then you can use a trick to temporarily disable e.g. nocaseglob:

alias ncg='shopt -s nocaseglob; ncgf'
ncgf() {
  "$@"                 # run the actual command
  shopt -u nocaseglob
}
This works because the alias is expanded before the rest of the line is evaluated (so nocaseglob is set by the time the glob expands), while the function runs as part of the command itself and can switch the option off afterwards.
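A quick demonstration of what nocaseglob changes, run in a scratch directory so the globs are predictable:

```shell
cd "$(mktemp -d)"
touch a.jpg b.JPG

shopt -u nocaseglob
echo *.jpg                # a.jpg

shopt -s nocaseglob
echo *.jpg                # a.jpg b.JPG

shopt -u nocaseglob       # back to the default
```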

Key shortcuts

To get an old command, instead of pressing Up a lot, you can search for a substring with Control-R. When you get the one you want, use enter to run it, or most anything else to change it first.

See also

Some shell-fu exercise

Links and sites

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

These are general sites, here partly because they need some place. You may find some of them interesting to read, but none are of the "Read this before you go on" type.

Things to look at:

Here documents

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

You've probably seen scripts with something like:

wall <<EOF
Hello there.
Please be aware the system is going down in half an hour.
EOF

<< means feeding the data that follows into stdin of the preceding command - everything up to the token mentioned immediately after it. People often use EOF as a recognizable convention, but it could be xx62EndOfMessageZorp just as easily.

Here documents can be easier than trying to construct an echo command to do your multi-line escaped bidding.
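One detail that matters in scripts: variables and command substitutions expand inside a here document unless you quote the delimiter:

```shell
name=world

# unquoted delimiter: $name expands
cat <<EOF
Hello, $name
EOF

# quoted delimiter: the text is passed through literally
cat <<'EOF'
Hello, $name
EOF
```

The first prints "Hello, world", the second "Hello, $name".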

Combinations with other shell syntax (redirection, piping, backgrounding) can look weirdly positioned

wall <<EOF &

...until you realize that the here-document start is really just a trigger for behaviour that starts after the rest of the command is parsed and evaluated

strace -eopen workhard <<EOF 2>&1 | grep datafile

See also:

Quick and dirty utilities

du with better kilo, mega, giga behaviour

Written to use du with size sorting and human-readable size output.

Made to be used in bash function (sort of like aliases, but allowing further arguments):

function duk()  { du --block-size=1 "${1:-.}" | sort -n | kmg; }
function duk1() { du --block-size=1 --max-depth=1 "${1:-.}" | sort -n | kmg; }
function duk2() { du --block-size=1 --max-depth=2 "${1:-.}" | sort -n | kmg; }

That kmg script (e.g. put it in /usr/local/bin and chmod +x it):

""" Looks for inital number on a line. If large, is assumed to be summarizable in kilo/mega/giga """
import sys,re
def kmg(bytes,kilo=1024):
    """ Readable size formatter.                                                                            
        Binary-based kilos by default. Specify kilo=1000 if you want decimal kilos.                         
    if abs(bytes) > 0.95*tera:
        return "%.1fT"%(bytes/float(tera))
    if abs(bytes) > 0.95*giga:
        return "%.0fG"%(bytes/float(giga))
    if abs(bytes) > 0.9*mega:
        return "%.0fM"%(bytes/float(mega))
    if abs(bytes) > 0.85*kilo:
        return "%.0fK"%(bytes/float(kilo))
        return "%d"%bytes
firstws = re.compile('^[0-9]+(?=[\t\ ])')  # look for initial number, followed by space or tab
for line in sys.stdin:
    m = firstws.match(line)
    if m:
        bytesize = int( line[m.start():m.end()], 10)   
        #for du uses, we could filter out below a particular size (if argument given)
        sys.stdout.write("%s %s"%(kmg( bytesize ),line[m.end():])) # using stdout.write saves a rstrip()

technical notes

Return codes

Return codes a.k.a. exit status are a number that a process returns.

Often either:

  • via a return from the main() function
  • via a call to exit(), which also causes the termination

It's regularly treated as an 8-bit value (It seems to be 32-bit in windows. In POSIX it's 32-bit internally, and it uses part of it for the wait/waitid/waitpid syscalls, but masks what you see elsewhere)

This can be used in simple shell logic, and a shell can typically read out the most recent exit code, e.g. in bash:
diff one two ; echo $?

The only thing you can truly count on is stdlib.h's definition of:

0        EXIT_SUCCESS
1        EXIT_FAILURE

Or more widely, 0 means success and anything else means error,
which is what makes if statements, and shell logic like && and ||, behave the way they do.
Beyond that, there are only conventions. Some of those include

  • using 1, 2, 3, 4, etc.. for specific reasons as you invent them
  • using -1, -2 in a similar way (after the 8-bit masking these show up as 255, 254, and so on)
  • passing through errno (though note those could exceed 255 in theory)
  • using 128+ only for serious errors
  • sysexits.h added some entries a bunch of years later (originating from mail servers, apparently). You see them around, but not very widely.
64       command line usage error 
65       data format error 
66       cannot open input 
67       addressee unknown 
68       host name unknown 
69       service unavailable 
70       internal software error 
71       system error (e.g., can't fork) 
72       critical OS file missing 
73       can't create (user) output file 
74       input/output error 
75       temp failure; user is invited to retry 
76       remote error in protocol 
77       permission denied 
78       configuration error 
  • Bash (mainly meaning bash scripts) seems to add:
126      Command invoked cannot execute	(e.g. Permission problem or command is not an executable)
127      "command not found". Also seems to include the case of "error while loading shared libraries" 
128      Invalid argument to bash's exit
128+n    Fatal error signal n (signals go up to 64ish, see kill -l), so e.g.
         130 terminated by SIGINT (Ctrl-C)
         137 terminated by SIGKILL
         143 terminated by SIGTERM
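A few of these are easy to see from a shell; the signal case below relies on bash's 128+n convention:

```shell
true;  echo $?            # 0
false; echo $?            # 1

# a process killed by SIGTERM exits with 128+15 = 143:
sleep 30 &
kill -TERM $!
wait $!
echo $?                   # 143
```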

See also

tty, pty, pts, and such

  • tty - teletypewriter.
    • broad term: can include physical terminals [11], virtual terminals (e.g. the text-mode terminals in various unices), and pseudoterminals (see below)
    • also regularly refers to 'the terminal that this process is wrapped in' (which is what the tty command reports - see its man page).

  • pty - pseudoterminal
    • ...which is a pair of a ptmx (pseudoterminal master) and pts (pseudoterminal slave)
    • see
      man pts
    • most recognizably used in cases like remote logins (e.g. sshd) and graphical terminals

On linux, /dev/tty* are text-mode terminals (getty), while /dev/pts/* entries typically belong to graphical terminal emulators and sshd sessions.

(This may be useful to know when inspecting the output of things like ps and who.)
Be lazy, type less

Tab autocompletion

You don't have to type out long names. Most shells will autocomplete both command names and file names, up to the point of ambiguity.

For example, if you have three files in your current directory:


You can complete to the third with e.g.

cat inTab

and the second with:

cat ITabtTab

Pressing tab twice at a point of ambiguity will show the options. For example, psTabTab will likely list ps2pdf, ps2pdf13, ps2pdfwr, ps2ps, psscale, pslatex and more.

Using your history

In bash, there are two basic tools to use commands from your history.

I prefer to use only the search feature: Ctrl-R, then type a substring. If you want to change it before running it, make sure to accept your choice with some key that is not Enter.

In bash, there is also the exclamation mark. Other shells have similar functionality, but the details will differ.

It does not autocomplete, so the most use I see for it is repeating a very recent command you know you can safely run verbatim. For example, if you recently used a long command line (e.g. pdflatex, bibtex, pdflatex, pdflatex) you can repeat the last command starting with pdf with:

!pdf

Backgrounding, pausing, and detaching processes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

backgrounding, job control (Ctrl-Z, bg, fg, effective pausing)

Most shells have job control. The following describes bash's.

Job control means you can run multiple things from one shell, and the shell need not be occupied and useless while it's running something.

Say you want to update the database that backs the locate command. The update command is updatedb, and it takes a while. Running
updatedb &
will start the program, but disconnect its stdin from your terminal. (Not that this particular program asks for any further input via stdin, but in other cases that can be a problem.)

In other words, it now runs in the background.

Its stdout and stderr are still connected to your terminal (so it'll spout any output while you're doing other things -- in updatedb's case mostly warnings), and the process is still the shell's direct child (so will be killed when your terminal quits).

When a program occupies your terminal, Ctrl-Z will disconnect it from your stdin and effectively pause it.

If you follow that with fg, it continues running in the foreground. (Sometimes this is a convenient way to pause a program, though anything watching the clock may get confused, so this mostly makes sense for simple shell utilities.) If you follow it up with bg, it will continue running as a background process,

which is functionally equivalent to having started the process with & (note: bg and fg are bash-specific; other shells do job control differently)

avoiding dependency on starting process

When a shell starts a process, it is the child of that shell. Normally, killing a process with children means the HUP signal is sent to each child -- a message meaning "your controlling terminal has closed". The default signal handler for HUP terminates the process.

A shell is itself the child of something -- with SSH login it's the sshd process for the network connection, with local graphical login it's the xterm, itself a child of your window manager, which is a child of your login session, etc. Particularly in the graphical login case, this default to clean up is a useful thing.

There are actually two relevant ways a child and parent are related. One is the process tree cleanup described above. The other is how the stdin/stdout/stderr streams are attached (by default to the controlling terminal), because closing one end tends to break the program on the other end.

When you want to run a job that may take a while, both of the above mean it may quit purely because its startup shell was closed. If you are running long-term jobs, this is too fragile.

This is where nohup is useful. It starts the given command with the HUP signal ignored, so the process survives its starting shell going away; when its parent stops, the process is reparented to the init process (which is always running).

The nohup utility will also not connect the standard streams to the controlling terminal. Instead, stdin is connected to /dev/null, stdout is written to a file called nohup.out (current directory, falling back to the home directory), and stderr goes to the same place as stdout (so errors also end up in nohup.out).

If you started a process that you want to become immune to HUP without having to restart it, your shell may provide for this. In bash, the command is disown (with a jobspec).

on bash jobspecs
In bash, jobs will give you a list like:
[1]   Running                 sleep 200 &
[2]   Running                 sleep 200 &
[4]   Running                 sleep 200 &
[5]-  Running                 sleep 200 &
[7]+  Running                 sleep 200 &

Use of jobspecs looks like:

disown %4
kill %1      # this kill is a bash builtin, /bin/kill won't understand this

There's more, but I've never needed the complex stuff.
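A minimal demonstration; this also works in a script, since jobs tracks background children even when interactive job control is off:

```shell
sleep 30 &                # start a background job
jobs                      # lists it, e.g. '[1]+  Running  sleep 30 &'
disown %+                 # remove the most recent job from the job table
jobs                      # no longer listed (but the process still runs)
kill $! 2>/dev/null       # clean up the sleep we started
```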


  • some shells have their own built-in nohup, which supersedes the nohup executable. (For example, csh's built-in nohup acts differently from the standalone utility: in particular, it does not redirect stdout and stderr.)

See also:

Limitations and problems


Changing to common directories

When working on a project or dataset is likely to take a while, I like to have a few-key method of going there.

Using tmux/screen solves that half of the time (because you return to a shell in the right directory), but it's still nice to allow new shells to move quickly.

The simple way is an alias:

alias work="cd /home/proj/code/mybranch"

I've worked at a place where you would commonly want to go to directories with known names that were annoying to type, or even to tab-complete.

While you can't get a subprocess to change directory for you (its environment is its own), a cd in a bash function applies to the shell it's called from, so you can do:

workdir () {
    cd "$(/usr/local/bin/ "$@")"
}

My version of that script

  • was hardcoded to glob a few directories for subdirectories you might want to go to,
  • does a case-insensitive substring match,
  • ...and when that's ambiguous, or matches nothing, it outputs the current directory to effectively not change directories, and prints some friendly information on stderr.


When something in the shell wants to show a human more than one screenful at a time, it may choose to invoke the PAGER on that output.

PAGER is an environment variable controlling what command is used for this.

If not set, many things fall back to a built-in default, often less or more.

While you can set PAGER to anything, it should be well behaved in the shell, and actually present, so less is a very sensible default.

It's used by things like man - and also by anything else that chooses to, including things like python's help(), psql's pager, mycli's pager, etc.

tweaking less's behaviour

You can set other behaviour by putting the options in the LESS environment variable, for example:

export LESS="-Saz-5j3R"
#and potentially have it work differently in general than as the PAGER
export PAGER="less -SaXz-5j3R"

Some options that I've used:

  • -S: crop lines, don't wrap. Means it acts consistently as if you have a rectangular window on the text. (By default, less does line wrapping when you are positioned to the left, and disables line wrapping once you look to the right at all. Since I'm usually looking at data or code, not man-page-like text, I find this annoying)
  • -a: while searching and asking for the next match, those currently visible are considered as having been seen, and won't be paused at. Useful when there are a lot of hits on the same screen
  • -X: don't clear the screen (meaning very short files are shown inline in the terminal)
  • -z-5: PgUp/PgDn will scroll (a screenful minus five) lines instead of an exact screenful. I like having this bit of context when reading code and text.
  • -j3: search results show up on third line instead of top line, for some readable context (negative number has different meaning; see man page)
  • -R: Allow through ANSI control codes (mostly for colors), and try to estimate how that affects layout so that we don't mess up layout too much. It's probably a good idea to set this conditionally, e.g. have bashrc include something like (the exact terminal patterns are up to you):
case "$TERM" in
    xterm*|screen*|linux )
        # allow raw ANSI too
        export LESS="-SaXz-5j3R" ;;
    * )
        export LESS="-SaXz-5j3" ;;
esac
  • -n: Suppress line numbering. Very large files load faster. It does disable some line-related features, but I rarely use them. (When less is busy counting lines, you can also cancel that with a single Ctrl-C.)

Further notes:

  • less filename and cat filename | less are not entirely identical.

When less takes input from stdin (the second way above), it will show contents more or less verbatim. When invoked directly, less may apply preprocessing.

  • Preprocessing basically means less runs something on the data, depending on what it is. For example to show the decompressed version of files, rendering HTML via a text-mode browser, showing the text from a PostScript file, showing music's metadata, colorizing code, and more.
see 'input preprocessor' in the man page for more details
  • less is usually the default PAGER, a variable which contains the executable that programs can call to show long content. For example, man uses PAGER.