Command line and bash notes
Linux-related notes
Shell, admin, and both:
Contents
- 1 Safer scripts
- 2 Shell expansion
- 3 Shell stuff I occasionally look up
- 3.1 For
- 3.2 While
- 3.3 Redirecting, basic
- 3.4 Piping
- 3.5 Redirecting, fancier
- 3.6 Redirection, less common
- 3.7 piping/catching both stdout and stderr
- 3.8 Shell escaping
- 3.9 Shell conditionals and scripting
- 3.9.1 Conditional execution
- 3.9.2 Control
- 3.9.3 User input
- 3.9.4 Strings
- 3.9.5 sourcing scripts
- 3.9.6 Backgrounding processes
- 3.9.7 Directory of script being run
- 3.9.8 Console scrolling
- 3.9.9 Shell aliases and functions
- 3.9.10 Renaming many files
- 3.9.11 Configurable autocompletion
- 3.9.12 shopt
- 3.9.13 Key shortcuts
- 3.9.14 See also
- 3.9.15 Some shell-fu exercise
- 3.10 Links and sites
- 3.11 Here documents
- 4 Quick and dirty utilities
- 5 technical notes
- 6 Be lazy, type less
- 7 tweaking less's behaviour
Safer scripts
Shell expansion
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
Introduction
Bash shell expansion consists of the kinds described in the following sections, applied in that order.
While powerfully brief, it is also hard to truly understand,
and depends on environment settings (environment variables, shell options), so it will bite you; if you want something robust, it is often best avoided.
The usual suggestion is to use a scripting language instead, one in which it is easier to be correct, clear, and still brief (...so not perl). Python is an option, being ubiquitous on modern linux.
For example, can you say why
for fn in `ls *.txt`; do echo $fn; done
is a problem while
for fn in *.txt; do echo $fn; done
is mostly-fine-except-for-a-footnote-or-two? And what the better-form-yet is?
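For reference, one mostly-robust form of the latter (a sketch: the -e guard covers the case where the glob matches nothing and stays a literal '*.txt', the doublequotes cover spaces in names):

for fn in *.txt; do
  [ -e "$fn" ] || continue   # glob didn't match anything; skip the literal
  echo "$fn"
done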
Some examples below are demonstrated via a command called something like argshow. Make this yourself with the following contents and a chmod +x
#!/bin/bash
printf "%d args:" $#
printf " <%s>" "$@"
echo
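Once that's in place, it shows you how the shell split your arguments, e.g.:

# argshow one "two three"
2 args: <one> <two three>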
brace expansion
Combinatorial expansion:
# echo {a,b,c}{3,2,1}
a3 a2 a1 b3 b2 b1 c3 c2 c1

# echo a{d,c,b}e
ade ace abe

# ls -l /usr/{bin,sbin}/h*
/usr/bin/h2ph /usr/bin/h2xs /usr/bin/h5c++ /usr/bin/h5cc ... /usr/sbin/httxt2dbm
Sequence expression (integers):
# echo {1..6}
1 2 3 4 5 6
# echo {1..10..2}
1 3 5 7 9
Sequence expression (characters, in C locale):
# echo {a..f}
a b c d e f
Notes:
- Expanded left to right.
- order is preserved as specified, not sorted
- things stuck to the braces on the outside are treated as preamble (to prepend to each result) and postscript (to append to each result), see second example
- a single list with nothing around it effectively just expands to its elements, in order
- when using it for filenames, keep in mind that
- it generates names without requiring they exist
- it happens before pathname expansion (meaning you can combine with globs - and that you should consider cases where they don't expand)
- may be nested, is treated flattened(verify)
# echo {1,2}-{{_,-},{X,Y,Z}}
1-_ 1-- 1-X 1-Y 1-Z 2-_ 2-- 2-X 2-Y 2-Z

# echo {a,b}{{_,-}{X,Y,Z}}
a{_X} a{_Y} a{_Z} a{-X} a{-Y} a{-Z} b{_X} b{_Y} b{_Z} b{-X} b{-Y} b{-Z}

# echo {00{1..9},0{10..50}}
001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
- you can't combine a sequence and a list in one brace (e.g. {1,3..5} expands as the two strings 1 and 3..5)
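Two practical uses (both standard bash behaviour; the names here are just illustrations): creating several directories at once, and the empty-alternative backup idiom:

mkdir -p project/{src,doc,test}    # project/src project/doc project/test
cp config.txt{,.bak}               # expands to: cp config.txt config.txt.bak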
See also:
- https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html
- http://wiki.bash-hackers.org/syntax/expansion/brace
tilde expansion
The two best-known:
- ~ is your shell's $HOME
  - some footnotes with su
- ~username is that user's home path
Keep in mind these come from the account database, and the paths do not necessarily exist (though they usually do).
- you may also like ~- ($OLDPWD), much like the special-cased cd -
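A quick illustration (exact paths depend on your account database and history):

echo ~          # e.g. /home/yourname
echo ~root      # e.g. /root
echo ~+         # current directory ($PWD)
echo ~-         # previous directory ($OLDPWD), cf. the special-cased cd -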
parameter/variable expansion
On delimiting
The bracket style ${var} delimits more unambiguously (it doesn't need whitespace to separate it from surrounding text).
It also allows the conditional replacement mentioned below.
The $var is fine in basic cases.
Example:
o="One";t="Two" ; echo $otfoo ; echo $o t foo ; echo ${o}t foo ; echo ${o}${t}foo
Conditional replacement
Warning if not set
- ${VAR:?message} - if VAR is unset or null, bash complains with the message (and a non-interactive shell, i.e. a script, stops with a nonzero status)
# yn=""; echo ${yn:?Missing value}
-bash: yn: Missing value
Return this value if not set
- ${VAR:-word} where if VAR is unset or null, (the expansion of) word is returned instead
# Take device from first command line argument, default to eth0 if not given
DEVICE=${1:-eth0}
#!/bin/bash
# Reports all files containing a certain pattern. Call like:
#   fileswith greppattern [file [file...]]
PATTERN=${1}
shift    # consume the pattern from the command-line argument list so we can use $@:
FILES=${@:-*}
grep -l $PATTERN $FILES | tr '\n' ' '
"If variable not set, return this other value and assign to the variable":
: ${ans:=no}   # If ans was set, keep its value.
               # If ans was not set, return no and assign it to ans.
               # (the leading : is a no-op, so the result isn't executed as a command)
               # nice in that later code can safely assume it is set
echo $ans
"Use given value when set at all"
For example "any actual answer is taken as 'yes', non-answers are unchanged"
yn="";echo ${yn:+yes} yn="wonk";echo ${yn:+yes} yes
Pattern and substring stuff
${var%pattern} removes a matching string from the end of var, allowing globs.
Say you have
a-001.txt a-002.txt b-001.txt b-002.txt b-003.txt b-004.txt c-001.txt
...and want to handle them as sets, then one way is look for all firsts, strip down to the base, and expand again:
for fn in *-001.txt; do
  basename=${fn%-001.txt}
  echo $basename
  echo $basename*
  # cat $basename* > ${basename}-all.txt
done
will print
a
a-001.txt a-002.txt
b
b-001.txt b-002.txt b-003.txt b-004.txt
c
c-001.txt
The difference between % and %% is that when you use a glob, % will remove the shortest match and %% the longest, e.g.
$ export v=abcabcabc
$ echo ${v%b*c}
abcabca
$ echo ${v%%b*c}
a
${var#pattern} and ${var##pattern} remove a matching string from the start of var, allowing globs, again shortest and longest.
For example:
- ${0##*/} is a good imitation of basename for the script's own path
- ${filename##*.} gets the filename's extension
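A compact summary of the four, on a hypothetical path:

p=/var/log/app/error.log
echo ${p#*/}    # var/log/app/error.log  (shortest match from start)
echo ${p##*/}   # error.log              (longest from start - like basename)
echo ${p%.*}    # /var/log/app/error     (shortest from end - strips extension)
echo ${p%%.*}   # /var/log/app/error     (longest from end; same here, only one dot)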
arithmetic expansion
Basically, $(( expression )) evaluates integer arithmetic and is replaced by the result, e.g. echo $(( 2**10 )) prints 1024.
command substitution
The following will be replaced by stdout from that command
$(command)
`command`
(the former is mildly preferred in that it has fewer edge cases in parsing characters)
Notes:
- it's executed in a subshell
- trailing newlines are stripped
- note that word splitting applies, except when this appears in double quotes (single quotes would avoid evaluation)
# argshow $(echo a b)
2 args: <a> <b>
# argshow "$(echo a b)"
1 args: <a b>
# argshow '$(echo a b)'
1 args: <$(echo a b)>
- $(< file) is done without a subshell(verify) so is faster than $(cat file)
- can be nested
- (backquote style needs escaped backquotes to do so)
echo $(echo $(ls))
echo `echo \`ls\``
- evaluated left-to-right
word splitting
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
Most of what you need to know:
- Word splitting is performed on almost all unquoted expansions
- if no expansion occurs, no splitting will occur either (verify)
- Will split on any run of the characters in $IFS
- if unset, the default is whitespace (space, tab, newline)
- if set to empty string (a.k.a null), no splitting occurs
- which is why it misbehaves around files with spaces in them. One partial workaround is to remove space from $IFS, i.e. set it to tab-and-newline.
IFS=$'\t\n'   # note: IFS=$(echo -en "\t\n") would lose the newline, since command substitution strips trailing newlines, and IFS="\t\n" would actually be those four literal characters
for fn in `ls *.txt`; do echo $fn; done
unset IFS     # unless you want everything later to behave differently
These delimiters are ignored at the edges (so empty-argument results are avoided)
You can use IFS for other tricks, like:
IFS=":" while read username pwd uid gid gecos home shell do echo $username done < "/etc/passwd" unset IFS
Notes:
- Double-quoting suppresses word splitting,
- ...except for "$@" and "${array[@]}"
- You can see what's in IFS currently with something like echo -n "$IFS" | od -t x1
  - 20 09 0a is space tab newline
  - echo -n means it won't add its own newline, doublequoting avoids word splitting :)
- read is weird (verify)
pathname expansion
Other notes
Shell stuff I occasionally look up
For
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
Fixed values and variables
for arg in "$var1" "$var2"; do echo $arg done
The doublequoting is good practice because you usually want to avoid word splitting on arbitrary values.
Which is mostly a variant of something like:
for body in Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune; do
  echo $body
done
or
for lastoct in `seq 2 254`; do
  echo 192.168.0.$lastoct
done
Both rely on the shell's word splitting.
While
Mostly: See test, and a few of the notes for for
- a syntax error near `do' often means you didn't put a semicolon/newline between the condition and do
A poor man's watch, which I use to get shell colors without forcing them:
while true
do
  echo
  ls
  sleep 1
done

# Or as a one-liner
while true; do echo; ls; sleep 1; done

# You can use
#   while :
#   while [ 1 ]
# ...if you find them easier to remember
Notes:
- : is a historical shorthand for true, and is also sometimes useful as a short no-op
Redirecting, basic
- < feeds a file into stdin
- > writes stdout to a file (overwriting its contents)
- >> writes stdout to a file, appending if it already exists
For example:
ls dir1 > listing     # would overwrite each time
ls dir2 >> listing    # would append if exists
sort <listing >sorted_listing
By default this applies to stream 1, stdout, because that's where most programs put their most pertinent output.
The standard streams are numbered, and (unless redirected) are:
- stdin is 0
- stdout is 1
- stderr is 2
So e.g.
find >output 2>errors
# or, equivalently
find 1>output 2>errors
Piping
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
Piping is redirecting between programs.
When starting multiple processes, you redirect an output stream from one to the input stream of another.
For example:
locate etc/ | less
cat infile | sort | tee sorted_list | uniq > unique_list
This can also be combined with redirection, e.g.
find . 2>&1 | less # don't ignore the errors
Redirecting, fancier
You'll want to know that there is some syntax variation (particularly between shells). In bash,
&> filename
>& filename
- are equivalent, and short for:
>filename 2>&1
- i.e. stdout and stderr are written to the same file, because it says:
- write stdout to filename
- write stderr to what stdout currently points to
Also, some of this is specific to bash
- e.g. dash[2] will trip over >&, saying Syntax error: Bad fd number
Note how multiple redirections are handled - primarily, that they are processed in order. Consider:
prog >x 2>&1 >y
- This means:
- connect stdout to file named x
- connect stderr to what stdout currently points to (which is the file named x) (actually duplicates the file descriptor(verify))
- connect stdout to file named y
- The net effect is "connect stderr to a file named x, and stdout to a file named y".
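In practice (prog and the file names being placeholders):

prog > both.log 2>&1    # stdout and stderr both end up in both.log
prog 2>&1 > out.log     # stderr goes to the terminal (what stdout pointed at), stdout to out.log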
Redirection, less common
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
This is sometimes a nice streaming thing, though usually just for command brevity
# log output and show it live
find / 2>&1 | tee allfiles

# writes both sorted and unique list
cat infile | sort | tee sorted_list | uniq > unique_list
- Example: TODO
- goes through most interpretation. Some use this syntax primarily for its short command substitution
- Example: TODO
- can be nice to see how fast data is moving through
- can deal with showing multiple streams. E.g. to test how people's homedirs would compress on average
tar cvf - /home 2>/dev/null | pv -c -N RAW | pigz -3 - | pv -c -N COMP > /dev/null
See also:
- http://www.gnu.org/software/bash/manual/html_node/Redirections.html
- http://stackoverflow.com/questions/2341023/what-does-the-ampersand-indicate-in-this-bash-command-12
piping/catching both stdout and stderr
These are primarily notes It won't be complete in any sense. It exists to contain fragments of useful information. |
When you call an external program and read from one stream, you typically use blocking reads for simple 'wait until it does something' logic.
Doing that from both stdout and stderr is a potential problem, in that you can have output on one while not getting any on the other. Usually you can get away with this, but it can produce deadlock-like situations.
Generally, you want to either:
- use non-blocking reads (probably in a loop with a small sleep to avoid hammering the system with IO)
- test streams with select() before read()ing
- in some cases, your OS or language (standard) library does not expose select(), you cannot find the file descriptor to select on, it does not let you select on pipes, or some other problem.
Other workarounds:
- redirect both to the same stream (but that can be annoying to do from an exec()-style call, because you need to wrap it in a shell - redirection is shell stuff)
- for non-interactive stuff, write both streams to a file, read those after the programs exit
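A minimal sketch of those two workarounds in shell terms (somecommand is a placeholder for whatever you run):

# merge stderr into stdout, then read a single stream
output=$(somecommand 2>&1)

# or write each stream to its own file and read them after exit
somecommand >/tmp/out.$$ 2>/tmp/err.$$
status=$?
cat /tmp/out.$$ /tmp/err.$$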
Shell escaping
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
You'll occasionally create a string to be evaluated in another context (or immediately via expr or backticks) -- and run into problems with escaping/delimiting.
'Not safe' below tends to mean one of:
- Will open some interpreted, to-be-closed range (e.g. `)
- Interpreted differently if in script or on command line (e.g. "\")
- terminates some parse by odd tokenization, such as spaces in filenames
In various cases I prefer a scripting language that more or less forces you to do things in a stricter (if longer) way, simply because I won't spend as much time convincing myself that the bash script is correct, or at least good enough.
single quotes: 'string'
- Not safe to dump in: ' (possibly more)
- Safe: ` " $ ( ) \ ("safe" as in "not interpreted as anything more than a character")
double quotes: "string"
- Not safe to dump in: ! $ " ` \ and probably more
- Safe: '
backslashing\ each\ necessary\ character
- Potentially safer than the above (solves mentioned nonsafe character problems)
- But: interpretation of backslashes unsafe themselves - or rather, they depend on quotes again:
- 'single quotes' (no interpretation?)
- echo '\z' → \z
- echo '\\z' → \\z
- echo '\\\z' → \\\z
- echo '\\\\z' → \\\\z
- outside quotes
- echo \z → z
- echo \\z → \z
- echo \\\z → \z
- echo \\\\z → \\z
- "double quotes"
- echo "\z" → \z
- echo "\\z" → \z
- echo "\\\z" → \\z
- echo "\\\\z" → \\z
Further notes:
Using escaping from the shell (in most shells, anyway) adds a layer of pre-interpretation that would not be applied in a script (!)
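Related: if you need to produce a shell-safe version of an arbitrary string from within bash itself, the printf builtin's %q format backslash-escapes it for reuse as shell input:

var='has spaces, a $dollar and a `backtick`'
printf '%q\n' "$var"
# prints something like: has\ spaces\,\ a\ \$dollar\ and\ a\ \`backtick\`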
Shell conditionals and scripting
Conditional execution
Say that you have a regularly-running script conceptually like:
collectdata
graphdata > file.png
mv file.png /var/www/mywebserver
...and you want to do some parts only if the earlier bit succeeds.
Basically: Make programs return meaningful return codes (most do), and test for them and use the result.
You can even use both (though syntaxwise this is cheating a little bit - guess why), like in:
/bin/true && echo "Jolly good." || echo Drat.
/bin/false && echo "Jolly good." || echo Drat.
collectdata && graphdata > file.png && mv file.png /var/www/mywebserver
If this is not a one-liner (e.g. in your crontab) but a longer script, it's probably cleaner to do something like:
collectdata || { echo "Data collection failed"; exit 1; }
graphdata > file.png || { echo "Data graphing failed"; exit 2; }
mv file.png /var/www/mywebserver || { echo "Moving graph failed"; exit 3; }
For the pedantic: The && and || essentially mean 'if zero return code' and 'if nonzero' -- which is inverted from the way true and false works within almost all programming languages.
It's often less confusing if you don't think about the values :)
See also
Control
if, test
See also [[, extended test
In bourne-style scripts you frequently see lines like:
if [ "$val" -lt 2 ]; then if test "$val" -lt 2; then
if test ! -r ~/.hushlogin; then
  echo "La la you haven't shut up motd yet"
fi

test ! -d /var/run/postgresql && mkdir -p /var/run/postgresql
Actual tests include: (list needs to be (verify)'d)
integers
- -eq equal
- -ne not equal
- -lt, -gt less than, greater than
- -le, -ge less than or equal to, greater than or equal to
filesystem
- -r exists and can be read
- -w exists and can be written
- -x exists and can be executed
- -s file exists and isn't empty (size isn't zero)
- -e file exists (may not appear in all implementations(verify))
- -f exists and is a regular file
- -d exists and is a directory
- -h or -L: exists and is a symbolic link
- -p exists and is a pipe
- -b exists and is a block device
- -c exists and is a character device
- -S exists and is a socket
strings
- -n nonzero string length (you probably want doublequotes around a variable)
- -z zero string length (you probably want doublequotes around a variable)
- = string equality
- != string inequality
- Nonstandard: 'lexically comes before' and 'lexically comes after', \< and \>, but be careful: without correct escaping these become file redirection.
boolean combinations -- which are nonstandard
- -a and
- -o or
Other operators test ownership by set or effective user or group, by relative age, by inode equality and others.
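A couple of these in combination (the names are placeholders):

if [ -d "$dir" ] && [ -w "$dir" ]; then
  echo "can write into $dir"
fi

[ -s logfile ] && echo "logfile exists and is non-empty"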
On empty/missing arguments
Things can get a little finicky in this case.
- Common mistake #1: Unquoted empty variables
Consider that if $var is not set, or an empty string, then
[ $var = '' ]
[ -n $var ]
[ -z $var ]
would expand into:
[ = '' ]
[ -n ]
[ -z ]
The first is a syntax error. The second isn't but doesn't do what you want (returns true without an argument). The third is basically fine. Regardless, you should be in the habit of always using quotes: (probably doublequotes)
[ "$var" = '' ] [ -n "$var" ] [ -z "$var" ]
- Common annoyance #1: No substring test
It's not there.
But it sort of is -- in bash and sh (recent/all?(verify)), you can use case for this, e.g.:
case "$var" in *error*) echo "Saw error, stopping now" exit 0 ;; *) echo "We're probably good, doing stuff" ;; esac
The following is also possible, and arguably more generic, because it uses something external (that we know the behaviour of):
- grep -q "pattern1" <<< "$var" && echo "do something"    (-q for quiet; plain grep would also print the match)
Also, both are addressed by the extended test command, which is a bashism (not standard POSIX, not available in e.g. sh or dash, which can be your system's default shell), so only use it under a /bin/bash hashbang.
test and conditional commands
Since tests are just commands with an exit status, they combine naturally with && and || - for example, from bashrc-style files:

# stop now if we are not running interactively
[ -z "$PS1" ] && return

# source this if it exists
[ -f /etc/bashrc ] && . /etc/bashrc
While experimenting with command success/failure, you may find it useful to show test's exit status, for example:
test -n "`find . -name '*.err' -print0`" ; echo $?
...but this can be confusing -- the logic is wrapped around program exit codes, so 0 is true and nonzero is false, which is the opposite of how most programming logic works. You usually don't need to think about that until you're consuming it as a number. For example:
test 2 -eq 2 ; echo $?
0
test 4 -eq 2 ; echo $?
1
[[, extended test
The extended test command, [[, was adopted by bash from ksh around bash 2.02 (~1998) (regex support since 3.1, ~2005) [6], and also available in zsh.
Note: when using in scripts, use explicit #!/bin/bash hashbang to avoid problems with default shells not being bash (or ksh).
This is because while test and [ are POSIX, [[ is not. Bash is very typically installed (a system without it can be considered exotic), but don't assume it's your system's default shell (used e.g. for scripts). Pay attention to systems:
- using sh (a real Bourne shell, or something emulating it)
- using the lightweight dash (e.g. various Ubuntu and BSD do this, and dash doesn't do [[)
- embedded systems using busybox (so ash(verify)[7], the origin of dash)
- [[ is parsed before other processing,
- in particular it sees things before any word splitting or glob expansion
- it's more predictable as it's less likely to be mangled by something you didn't think about
- it is no longer a mistake to omit (double)quotes for variable or file tests:
[[ $var = yes ]]
[[ -e $b ]]
- now standard instead of deprecated:
- &&for logical AND within a test
- || for logical OR within a test
- () for logical grouping
- glob matching, which includes substring matching
[[ abc = a* ]]
[[ $var = *a* ]]
- regexp pattern matching
[[ abb =~ ab+ ]]
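When the regexp contains groups, bash puts the captures in the BASH_REMATCH array (${BASH_REMATCH[0]} is the whole match):

s="error 42 occurred"
if [[ $s =~ ([0-9]+) ]]; then
  echo "number: ${BASH_REMATCH[1]}"    # number: 42
fi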
Note that it still makes sense to use (( for arithmetic, [[ is for string and file stuff.
See also:
More notes
If you want to take out some block of code via an if (faster than commenting a lot of lines) then:
if [ false ] # or whatever string in there
...won't work because bash uses string variables, and the default operation is -n ("is string non-empty").
The shortest thing that does what you want is:
if [ ]
or perhaps:
if [ "" ]
or, if you want something more obvious to passing readers, you could do:
if [ ignore = block ]
case
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
Note: only in the bourne-style shells (verify)
For example:
case "$var" in *pattern1* ) echo "seeing pattern 1" ;; *pattern2*|*pattern3* ) echo "seeing pattern 2 or 3" ;; * ) echo "fallback case" ;; esac
The thing this has over test/if is proper wildcard behaviour.
for
You'll know that a bash script can act as a batch file, running one command after the other in the hope nothing will screw up. Bash, however, offers more useful functionality, in and out of scripts (there is, in fact, no noticeable difference). For example:
for pid in `pidof swipl pl`; do renice 5 $pid; done
29180: old priority 0, new priority 5
26858: old priority 0, new priority 5
...will re-nice the named processes: for expects a word-split list, pidof returns a list of pids, and backquotes (`) mean "replace this with the output of the command".
The above could have been spread among lines:
bash-2.05b $ for pid in `pidof swipl pl`
> do
>   renice 5 $pid
> done
Something similar goes for if-else constructs. These allow you to construct scripts that catch errors, run differently depending on how other commands fared, on environment variables, and whatnot. Scripting tends to beat real programming for simple little jobs.
You can do this for files by using a wildcard, but that is generally a bad way to learn it, because it won't work in two situations:
- when files contain spaces (possibly also other less usual but legal characters)
- when there are so many files that bash expands the command to something longer than it can use (see Argument list too long; this is less of a problem now)
If you want to do it robustly / properly, learn to use find and xargs, as in the sketch below.
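A sketch of that robust style (gzip just as an example action; -print0 and -0 keep filenames with spaces intact, and xargs splits over multiple invocations so the argument list never gets too long):

find . -name '*.txt' -print0 | xargs -0 gzip -9

# or let find run the command itself:
find . -name '*.txt' -exec gzip -9 {} +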
while
while is a conditional loop.
You can do things like
while [ 1 ]; do (clear; df; sleep 5); done #which imitates watch -n 5 df
or
let c=0
while [ $c -lt 10 ]; do    # better served by a for
  echo $c
  let c=c+1
done
User input
read reads user input into a variable, for example:
read -p "Do you want to continue? " usercont echo $usercont
There are some options that many examples don't mention - see help read.
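A few of those in use (these flags - -p, -s, -t, -n - exist in bash's builtin read; details may vary per version):

read -p  "Continue? "        usercont   # prompt
read -sp "Password: "        pw; echo   # silent (no echoing); the echo restores the newline
read -t 5 -p "You have 5s: " ans || echo "(timed out)"
read -n 1 -p "Any key: "     key; echo  # return after one character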
Strings
Substring (by position,length):
# s="foobarquu";echo ${s:3:5} barqu
Regexp is possible, but strange and limited. Use of awk and/or sed is probably handier.
sourcing scripts
Usually, running a script means creating a new process, and running the listed commands in that process. Sourcing a script instead runs its commands in the current shell, so things like variable assignments persist:
source /etc/profile
# (bourne-style?) shorthand:
. /etc/profile
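The difference in effect, with a throwaway example file:

echo 'FOO=bar' > /tmp/setfoo.sh

bash /tmp/setfoo.sh ; echo "[$FOO]"   # []    - ran in a child process
.    /tmp/setfoo.sh ; echo "[$FOO]"   # [bar] - ran in this shell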
Backgrounding processes
Ctrl-Z, fg and bg
I occasionally see people using a shell only to run a program, typing e.g. firefox & (in the case of KDE, you can of course use the run dialog, Alt-F2; the parent of the process will be kdeinit then, I believe). If you wish to have the same effect as the & after you didn't initially use it, you can use Control-Z to pause the current foreground process, which should print something like:
[1]+  Stopped                 firefox
...which is a shell-specific (bash, here) job management list. You can then run bg to let it continue in the background, or fg to bring it back to the foreground.
Directory of script being run
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
Console scrolling
Shift-PgUp and Shift-PgDown (often)
Useful for those happy-go-verbose programs, you can scroll back as far as the screen history goes. This usually works in text consoles, and is usually imitated by X terminal consoles.
Note that various things (PuTTY/Konsole/xterm, but also screen) may have their own configurable limit to how many past lines they keep, and in the case of screen, its own way of looking at it (screens are not really regular terminals, after all...)
Shell aliases and functions
Aliases
Aliases are short identifiers that expand to longer things. For bash, the syntax is like setting a variable. Potentially useful examples:
alias webdir="cd /var/www/www/htdocs"    # go to some directory you regularly work in
alias weblogtail="tail -F /var/log/apache2/*"    # watch web server log
alias logs="tail -F /var/log/*.log /var/log/*/*.log /var/log/syslog"    # watch various current logs
alias ..='cd ..'    # funky shortcut
alias l="ls -lAhGBptr --time-style=long-iso --color=auto "
Some examples:
alias vf='cd'                            # catch typo
alias duh="du -h "                       # use human-readable sizes
alias dud="du --max-depth=1 -h "         # human, one-deep (often more readable)
alias lslast="ls -lrt "                  # show last modified last
alias lsd='find * -prune -type d -ls'    # list directories under curdir
alias hexdump='od -t x1z'                # show hex of single bytes at a time, text alongside
alias dlpage="wget -r -l 1 "             # save page and direct links
alias lesscol="less -R"                  # less that allows color (...control codes)
alias psgrep="ps aux | grep"             # short way of grepping through process list
alias go-go-gadget=sudo

# change default verbosity
alias df="df -hT "                       # use human-readable sizes and show filesystem type
alias bzip2="bzip2 -p "                  # always print progress when bzipping
alias pstree='pstree -pu '               # always show pid, and show usernames where UID changes

# change default behaviour:
alias grep="egrep "                      # always use extended grep (always have regexp)
alias bc="bc -lq "                       # bc always does float calculations
Notes:
- aliases can be removed with unalias
- naming an alias the same as the command it wraps is possible, but arguments can become hard to negate or may arrive twice, which can cause confusion
- aliases don't have arguments as such - they expand and let arguments come at the end (note that bash functions can have arguments, so they can be a better choice)
- if instead of an alias you can use an environment variable (e.g. GREP_COLOR for grep's --color=), that may be preferable, as it is more flexible.
- some distributions change default behaviour via aliases, such as making rm do "rm -i".
Functions
Bash functions have a somewhat flexible syntax. They look like the following, although the 'function' keyword is optional:
# alternative to cd that lists content when you switch directory
function cdd() { cd ${1} ; echo $PWD ; ls -FC --color ; }
More adaptively:
rot13 () {
  if [ $# -eq 0 ]; then
    # no arguments? eternal per-line translation
    tr '[a-m][n-z][A-M][N-Z]' '[n-z][a-m][N-Z][A-M]'
  else
    # translate all arguments
    echo $* | tr '[a-m][n-z][A-M][N-Z]' '[n-z][a-m][N-Z][A-M]'
  fi
}
Notes:
- Functions can be removed with unset -f name
- Neither aliases nor functions are forked
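An example of the 'functions can take arguments' point, a common little convenience:

# make a directory (and parents) and change into it
mkcd() { mkdir -p "$1" && cd "$1"; }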
Renaming many files
There are different things called rename out there.
rename (Perl script)
If running it without arguments prints:
Usage: rename [-v] [-n] [-f] perlexpr [filenames]
...then it's the perl-script variant. This seems to be the default on Debian/Ubuntu/derived; many other distros have it in a package called something like prename.
You use it something like:
rename 's/[.]htm$/.html/' *.htm
rename 's/[a-z]+_([0-9]+)[.]html$/frame_$1.html/' *.html
(perlexpr is typically a regex, but could be any perl code that alters $_)
In a regex, you may often want to use /g, or you'll get only one replacement per name.
"rename (util-linux" variant)
If running without arguments starts with:
call: rename from to files...
OR (...there seems to be a mix of old and new in the wild) (verify)
rename: not enough arguments
Usage:
 rename [options] expression replacement file...
Then it's a simple substring replace and you use it like:
rename '.htm' '.html' *.htm
And e.g.:
rename '' 'PrependMe ' * rename 'RemoveMe' '' *
See also:
mmv
Haven't used yet. TODO
Configurable autocompletion
Bash >=2.04 has configurable autocompletion.
See help complete and help compgen.
Actions are pre-made completion behaviour.
complete -A directory cd rmdir    # complete only with directories for these two
complete -A variable export       # assist re-exports
complete -A user mail su finger   # complete usernames
complete -A hostname ping scp     # complete hostnames (presumably from /etc/hosts)
Filter patterns are usually for filenames (-f), to filter out completion candidates, for example filtering out everything that doesn't end in .(zip|ZIP) when the command is unzip.
This can be helpful but also potentially really annoying: if you know a file is an archive but it doesn't have the exact extension the completion expects, you have to type out the filename (or change the command temporarily).
Manual filters: You can use a bash function, and inside that do whatever you want, including calling applications to get and process your options (just don't make them heavy ones). The following example (found somewhere, and rewritten) illustrates:
The following allows killall completion, with the names of the currently processes that the current user owns:
_processnames() {
  local cur=${COMP_WORDS[COMP_CWORD]}   # the partial thing you typed already
  COMPREPLY=( \
    $( ps --no-headers -u $USER -o comm | \
       awk '{if($0 ~ /^'$cur'/) print $0}' | \
       awk '{if($0 !~ /\/[0-9]$/) print $0}' ) \
  )
  return 0
}
complete -F _processnames killall
That first awk takes out things that don't start with what you typed so far; the second filters out some 2.6-kernel process names (of the form processname/0) that you probably can't and don't want to kill.
shopt
shopt sets some of bash's optional shell behaviours.
Shopt things are set (-s) or unset (-u).
You can see the current state with a bare shopt.
Most of the settings are low-level and are probably already set to sensible values.
Things that might interest you include:
- checkwinsize: update LINES and COLUMNS environment variables whenever the shell has control. Useful for resizeable shell windows, e.g. remote graphical ones.
- histappend: append to instead of overwriting the history file. Seems to be useful when you often have multiple shells on the same host.
- dotglob: considers .dotfiles in filename completion
- nocaseglob: ignores case while completing. This can be useful if you, say, want '*.jpg' to include '.JPG', '.Jpg', etc. files too. (You may wish to be a bit more careful when you have this set, though)
If you generally want case sensitive matching, but sometimes case insensitive matching, say,
ncg ls *.jpg   # case insensitive
ls *.jpg       # case sensitive
...then you can use a trick to temporarily disable e.g. nocaseglob:
alias ncg='shopt -s nocaseglob; ncgf'
ncgf() {
  shopt -u nocaseglob
  "$@"
}
This works because an alias is evaluated before the main command, a function after.
Key shortcuts
To get an old command, instead of pressing Up a lot, you can search for a substring with Control-R. When you get the one you want, use enter to run it, or most anything else to change it first.
See also
Some shell-fu exercise
Links and sites
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
These are general sites, here partly because they need some place. You may find some of them interesting to read, but none are of the "read this before you go on" type.
- GNU/Linux most wanted (command cheatsheet)
- UNIX/Linux Bourne/Bash Shell Scripting Tutorial (includes shell programming)
- BASH Programming - Introduction HOW-TO
- Advanced Bash-Scripting Guide
- Bash by example, part 1 and part 2
- http://linuxcommand.org/
- http://en.wikibooks.org/wiki/Bourne_Shell_Scripting
- http://www.linuxdevcenter.com/linux/cmd/
- http://www.tuxfiles.org/linuxhelp/cli.html (command list)
- http://en.wikibooks.org/wiki/Linux
- http://www.linux-tutorial.info/
- http://www.chongluo.com/books/rute/
- http://free-electrons.com/training/intro_unix_linux/
- http://www.unix-manuals.com/
- http://www.cns.uni.edu/cns-computing/help/unix-concepts.html
- http://www.felixgers.de/teaching/unix/unix_concepts.html
- http://www.tldp.org
- http://www.linuxvirgins.com/
Things to look at:
Here documents
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
You've probably seen scripts with something like:
wall <<EOF
Hello there.
Please be aware the system is going down in half an hour.
EOF
<< means feeding the data that follows into stdin of the preceding command - everything up to the token mentioned immediately after it. People often use EOF as a recognizable convention, but it could be xx62EndOfMessageZorp just as easily.
Here documents can be easier than trying to construct an echo command to do your multi-line escaped bidding.
Combining this with other shell syntax (redirection, piping, backgrounding) can look weirdly positioned
wall <<EOF &
Test
EOF
...until you realize that the here-document start is really just a trigger for behaviour that starts after the rest of the command is parsed and evaluated
strace -eopen workhard <<EOF 2>&1 | grep datafile
Test
EOF
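One more detail worth knowing (standard shell behaviour): quoting the token suppresses expansion inside the document:

cat <<'EOF'
$HOME is taken literally here
EOF

cat <<EOF
$HOME is expanded here
EOF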
See also:
Quick and dirty utilities
du with better kilo, mega, giga behaviour
Made to be used in bash functions (sort of like aliases, but allowing further arguments):
function duk()  { du --block-size=1 ${1} | sort -n | kmg; }
function duk1() { du --block-size=1 --max-depth=1 ${1} | sort -n | kmg; }
function duk2() { du --block-size=1 --max-depth=2 ${1} | sort -n | kmg; }
That kmg script (e.g. put it in /usr/local/bin and chmod +x it):
#!/usr/bin/python
""" Looks for initial number on a line.
    If large, it is assumed to be summarizable in kilo/mega/giga """
import sys, re

kilo = 1024
mega = kilo*kilo
giga = mega*kilo
tera = giga*kilo

def kmg(bytes, kilo=1024):
    """ Readable size formatter. Binary-based kilos by default.
        Specify kilo=1000 if you want decimal kilos. """
    if abs(bytes) > 0.95*tera:
        return "%.1fT" % (bytes/float(tera))
    if abs(bytes) > 0.95*giga:
        return "%.0fG" % (bytes/float(giga))
    if abs(bytes) > 0.9*mega:
        return "%.0fM" % (bytes/float(mega))
    if abs(bytes) > 0.85*kilo:
        return "%.0fK" % (bytes/float(kilo))
    else:
        return "%d" % bytes

firstws = re.compile(r'^[0-9]+(?=[\t\ ])')  # look for initial number, followed by space or tab

for line in sys.stdin:
    m = firstws.match(line)
    if m:
        bytesize = int(line[m.start():m.end()], 10)
        # for du uses, we could filter out below a particular size (if argument given)
        sys.stdout.write("%s %s" % (kmg(bytesize), line[m.end():]))  # using stdout.write saves a rstrip()
    else:
        sys.stdout.write(line)
technical notes
Return codes
Return codes, a.k.a. exit status, are a number that a process returns when it terminates,
often either
- via the return from the main() function
- via a function like exit() that also causes the termination
It's regularly treated as an 8-bit value. (It seems to be 32-bit in windows. In POSIX it's 32-bit internally, and part of it is used in the wait/waitid/waitpid syscalls, but what you see elsewhere is masked to 8 bits.)
diff one two ; echo $?
The only thing you can truly count on is stdlib.h's definition of:
0 EXIT_SUCCESS
1 EXIT_FAILURE
Or more widely: 0 means success and anything else means some kind of error, which makes simple success/failure testing easy.
Beyond that, there are only conventions. Some of those include
- using 1, 2, 3, 4, etc.. for specific reasons as you invent them
- using -1, -2 in a similar way
- passing through errno (though note those could exceed 255 in theory)
- using 128+ only for serious errors
- sysexits.h added some entries a bunch of years later (originating from mail servers, apparently). You see them around, but not very widely.
64 command line usage error
65 data format error
66 cannot open input
67 addressee unknown
68 host name unknown
69 service unavailable
70 internal software error
71 system error (e.g., can't fork)
72 critical OS file missing
73 can't create (user) output file
74 input/output error
75 temp failure; user is invited to retry
76 remote error in protocol
77 permission denied
78 configuration error
- Bash (mainly meaning bash scripts) seems to add:
126 command invoked cannot execute (e.g. permission problem, or not an executable)
127 command not found; also seems to include the case of "error while loading shared libraries"
128 invalid argument to bash's exit
- 128+n Fatal error signal n (signals go up to 64ish, see kill --list), so e.g.
- 130 terminated by SIGINT (Ctrl-C)
- 137 terminated by SIGKILL
- 143 terminated by SIGTERM
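Consuming these in a script might look like (somecmd being a placeholder for whatever you run):

somecmd
case $? in
  0)   echo "ok" ;;
  127) echo "command not found" ;;
  *)   echo "some other failure" ;;
esac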
See also
- https://en.wikipedia.org/wiki/Exit_status
- https://www.gnu.org/software/libc/manual/html_node/Exit-Status.html
tty, pty, pts, and such
- tty - teletypewriter.
- broad term: can include physical terminals [8], virtual terminals (e.g. the text-mode terminals in various unices), and pseudoterminals (see below)
- also regularly refers to 'the terminal that this process is wrapped in' (which is what the tty command reports - see its man page).
- pty - pseudoterminal
- ...which is a pair of a ptmx (pseudoterminal master) and pts (pseudoterminal slave)
- see man pts
- most recognizably used in cases like remote logins (e.g. sshd) and graphical terminals
On linux, /dev/tty* are text-mode terminals (getty), while /dev/pts/* typically belong to graphical terminals and sshd sessions
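For illustration (the device numbers will vary):

$ tty      # in a graphical terminal or ssh session
/dev/pts/0
$ tty      # on a text-mode console (e.g. Ctrl-Alt-F2)
/dev/tty2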
Be lazy, type less
Tab autocompletion
You don't have to type out long names. Most shells will autocomplete both command names and file names, up to the point of ambiguity.
For example, if you have three files in your current directory:
Iamalongfilenameyoubet Iamalongfilenametoo.verboseout inane-innit
You can complete to the third with e.g.
cat in<Tab>
and the second with:
cat I<Tab>t<Tab>
Pressing tab twice at a point of ambiguity will show the options. For example, ps<Tab><Tab> will likely list ps2pdf, ps2pdf13, ps2pdfwr, ps2ps, psscale, pslatex and more.
Using your history
In bash, there are two basic tools to use commands from your history.
I prefer to use only the search feature: Ctrl-R, then type a substring.
If you want to change it before running it, make sure to accept your choice with some key that is not Enter.
In bash, there is also the exclamation mark. Other shells have similar functionality, but the details will differ.
It does not autocomplete, so the most use I see for it is repeating a very recent command you know you can safely use verbatim. For example, if you recently used a long command line (e.g. pdflatex, bibtex, pdflatex, pdflatex) you can repeat the most recent matching one with:
!pdfl
Backgrounding, pausing, and detaching processes
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me) |
- backgrounding, job control (Ctrl-Z, bg, fg, effective pausing)
Most shells have job control; the following describes bash's.
Job control means you can run multiple things from one shell, and the shell need not be occupied and useless while it's running something.
Appending & to a command, e.g. updatedb &, runs it but disconnects its stdin from your terminal. (Not that this particular program asks for further input via stdin, but in other cases that can be a problem.)
In other words, it now runs in the background.
Its stdout and stderr are still connected to your terminal (so it'll spout any output while you're doing other things -- in updatedb's case mostly warnings), and the process is still the shell's direct child (so will be killed when your terminal quits).
When a program occupies your terminal, Ctrl-Z will disconnect it from your stdin and effectively pause it.
If you follow that with a bg, the process continues running in the background, which is functionally equivalent to having started it with & (note: bg and fg are bash-specific; other shells do job control differently)
- avoiding dependency on starting process
When a shell starts a process, it is the child of that shell. Normally, killing a process with children means the HUP signal is sent to each child - a message meaning "controlling terminal is closed". The default signal handler for HUP terminates the process.
A shell is itself the child of something -- with SSH login it's the sshd process for the network connection, with local graphical login it's the xterm, itself a child of your window manager, which is a child of your login session, etc. Particularly in the graphical login case, this default to clean up is a useful thing.
There are actually two relevant ways a child and parent are related.
One is the process tree cleanup described above.
The other is how the stdin/stdout/stderr streams are attached (by default to the controlling terminal), because closing one end tends to break the program on the other end.
When you want to run a job that may take a while, both of the above mean it may quit purely because its startup shell was closed.
If you are running long-term jobs, this is too fragile.
This is where nohup is useful. Nohup tells the process it starts to ignore the HUP signal, which means that when its parent stops, the process survives and is reparented to the init process (which will always be running).
The nohup utility will also not connect the standard streams to the controlling terminal. Instead, stdin is connected to /dev/null, stdout is written to a file called nohup.out (current directory or home directory), and stderr goes to stdout (?why?)
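Typical use (long_job standing in for your command; the explicit redirect avoids the nohup.out default):

nohup long_job > job.log 2>&1 &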
- on bash jobspecs
[1]   Running    sleep 200 &
[2]   Running    sleep 200 &
[4]   Running    sleep 200 &
[5]-  Running    sleep 200 &
[7]+  Running    sleep 200 &
Use of jobspecs looks like:
disown %4
kill %1    # this kill is a bash builtin; /bin/kill won't understand jobspecs
There's more, but I've never needed the complex stuff.
Notes:
- some shells have their own nohup, which supersedes the nohup executable. (example: csh's built-in nohup acts differently from bash's. In particular, it does not redirect stdout and stderr)
See also:
- Limitations and problems
TODO
Changing to common directories
When working on a project or dataset is likely to take a while, I like to have a few-key method of going there.
Using tmux/screen solves that half of the time (because you return to a shell in the right directory), but it's still nice to allow new shells to move quickly.
The simple way is an alias:
alias work="cd /home/proj/code/mybranch"
I've worked at a place where you would commonly need directories with known names that were annoying to type out, or even to tab-complete.
While you can't get a subprocess to change directory for you (its environment is its own), a cd in a bash function applies to the shell it's called from, so you can do:
workdir () {
  cd $(/usr/local/bin/resolve-workdir.py $@)
}
My version of that script
- was hardcoded to glob a few directories for subdirectories you might want to go to,
- does a case-insensitive substring match,
- ...and when that's ambiguous, or matches nothing, it outputs the current directory to effectively not change directories, and prints some friendly information on stderr.
tweaking less's behaviour
You can set default behaviour by setting the options in the LESS environment variable, for example:
export LESS="-SaXz-5j3R"
Some options that I've used:
- -S: crop lines, don't wrap. Means it acts consistently as if you have a rectangular window on the text. (By default, less does line wrapping when you are positioned to the left, and disables line wrapping once you look to the right at all. Since I'm usually looking at data or code, not man-page-like text, I find this annoying)
- -a: while searching and asking for the next match, those currently visible are considered as having been seen, and won't be paused at. Useful when there are a lot of hits on the same screen
- -X: don't clear the screen (meaning very short files are shown inline in the terminal)
- -z-5: PgUp/PgDn will scroll (a screenful minus five) lines instead of an exact screenful. I like having this context when reading code and text.
- -j3: search results show up on third line instead of top line, for some readable context (negative number has different meaning; see man page)
- -R: Allow through ANSI control codes (mostly for colors), and try to estimate how that affects layout so that we don't mess up layout too much. It's probably a good idea to set this conditionally, e.g. have bashrc include something like:
case "$TERM" in xterm*|rxvt*|screen*) # allow raw ANSI too export LESS="-SaXz-5j3R" ;; * ) export LESS="-SaXz-5j3" ;; esac
- -n: Suppress line numbering. Very large files load faster. Does disable some line-related features, but I rarely use them.
- Note that you can also cancel that counting while it's busy, with a single Ctrl-C
Further notes:
- less filename and cat filename | less are not entirely identical.
When less takes input from stdin (the second way above), it will show contents more or less verbatim. When invoked directly, less may apply preprocessing.
- Preprocessing basically means less runs something on the data, depending on what it is - for example showing the decompressed version of compressed files, rendering HTML via a text-mode browser, showing the text from a PostScript file, showing music files' metadata, colorizing code, and more.
- see 'input preprocessor' in the man page for more details
- less is usually the default PAGER, a variable which contains the executable that programs can call to show long content. For example, man uses PAGER.