Sed

📃 These are primarily notes, intended to be a collection of useful fragments, that will probably never be complete in any sense.
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Sed is a stream editor, meaning it does things to a stream of (character) data, which makes it interesting for some shell-fu and some basic text editing.

I for one rarely use sed (or awk), because by the time I might need them, the task is usually complex enough to call for a proper script. Yet both definitely have their uses.


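Note: you often want the -r argument (also spelled -E, which is the more portable form), for extended regexps. sed's default is basic (classical unix) regexps, in which things you probably expect, like +, ?, | and plain ( ) grouping, are not special (GNU sed accepts them in escaped form, e.g. \+). Groups and backreferences do exist in basic regexps, but the groups have to be written \( \).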
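A quick way to see the difference between the two regexp flavours is to write the same substitution both ways. This is a minimal sketch on made-up input strings, using -E rather than -r on the assumption that your sed accepts it (GNU and BSD sed both do):

```shell
# Both of these print "a-c"
bre=$(echo "abbbc" | sed 's/b\{1,\}/-/')   # basic regexp: 'one or more' spelled as an interval
ere=$(echo "abbbc" | sed -E 's/b+/-/')     # extended regexp: + works as you expect
echo "$bre" "$ere"

# Groups and backreferences in a basic regexp, written \( \) and \1
swap=$(echo "key=value" | sed 's/\(.*\)=\(.*\)/\2=\1/')
echo "$swap"    # value=key
```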


Basics

sed processes its input one line at a time. It lets you apply regular expressions and related operations to a stream of data, e.g. doing replacement:

> echo "All day and all night" | sed 's/day/night/'
All night and all night

There are three things you can do:

  • s/substitute/text/
  • y/transliterate/text/ (equal-length strings in both parts; this is basically a dumb tr)
  • /match text/, which is only interesting in combination with other options; more about that later.



First (default) versus all (/g)

By default, sed will only operate on the first match:

> echo "All day and all day" | sed 's/day/night/'
All night and all day


To work on all matches, add /g:

> echo "All day and all day" | sed 's/day/night/g'
All night and all night

Delimiters

You can use an arbitrary character as the delimiter. For example:

find /etc | sed 's_/etc/__'

...is less confusing than s/\/etc\///.

-e: multiple expressions

You can apply multiple expressions per line. They are applied sequentially, each one working on the result of the previous one, much like connected pipes.

This means you can do things like double output, sequential transformations, simple conditionals, and such.

This also makes things like -n (don't automatically print lines) and /p ('print this line') more interesting (see below).
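To illustrate the sequential behaviour, here is a small sketch on a made-up input line. The variable names are only there for illustration:

```shell
# The second expression sees the result of the first:
# "day" becomes "night", and that "night" then becomes "week"
chained=$(echo "All day" | sed -e 's/day/night/' -e 's/night/week/')
echo "$chained"    # All week

# Double output: p prints the line as it is at that point,
# then the automatic print at the end shows it after the substitution
doubled=$(echo "All day" | sed -e 'p' -e 's/day/night/')
echo "$doubled"
```

The second command prints "All day" followed by "All night", which is the kind of thing that makes -n worth knowing about.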

Grouping


You will mostly use groups to capture parts of a match, so that you can reuse them in the replacement. To write groups with plain parentheses, as below, you need to add -r (extended regexp).

For example, to selectively rewrite lines like:

   'key':'value',

to

   'value':'key',

You could use something like:

sed -r 's/\s+([^:]+):([^,]+),/    \2:\1,/g'

(Yes, in general you should properly parse data like that, because a regexp like this is cheating hard, based on knowing what cannot appear in the key or value. The [^something] matches not-something characters. Having it match up to some point you know it should stop is one simple-and-stupid way of making a long match non-greedy, which you need because sed's regexps have no non-greedy quantifiers like +?)
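As a check on a single made-up input line, the swap can be sketched like this. It is a variant rather than the exact command above: it assumes -E is available and uses [[:space:]] instead of \s so that it does not rely on GNU-only escapes:

```shell
# Each negated class stops at the first character it cannot match,
# which is what keeps the two groups from swallowing too much
swapped=$(printf "   'key':'value',\n" | sed -E 's/^[[:space:]]+([^:]+):([^,]+),/    \2:\1,/')
echo "$swapped"    # prints 'value':'key', indented by four spaces
```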



In the replacement part of s///, you can also get the entire matched text using &

find /etc/ | sed -n -r 's/\.(conf|cnf|cf|ini|rc)$/: configuration file (type: &)/p'
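Note that & is the whole match, not just the group. A small sketch on one hypothetical path (fed in with echo, so no real file is needed) shows the difference between & and \1:

```shell
# & is the whole match, so it includes the dot; the extension itself is replaced away
whole=$(echo "/etc/my.cnf" | sed -n -E 's/\.(conf|cnf|cf|ini|rc)$/: configuration file (type: &)/p')
echo "$whole"    # /etc/my: configuration file (type: .cnf)

# Start the replacement with & to keep the name intact, and use \1 for the bare extension
kept=$(echo "/etc/my.cnf" | sed -n -E 's/\.(conf|cnf|cf|ini|rc)$/&: configuration file (type: \1)/p')
echo "$kept"     # /etc/my.cnf: configuration file (type: cnf)
```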

Advanced features you may not need

Addresses: line-based conditions

You can make expressions apply according to what line of input you're on. For example, you can do:

find /etc/ | sed '3   s_/etc_/foo_'   #does the replacement only on the third line
find /etc/ | sed '2,5 s_/etc_/foo_'   #does the replacement on line two to five
find /etc/ | sed '$   s_/etc_/foo_'   #does the replacement on the last line
find /etc/ | sed '1~5 s_/etc_/foo_'   #does the replacement on every fifth line, starting at line 1 (GNU sed only)

You can add a ! between the address and the command to negate the logic of the address. For the above, that would respectively mean 'every line but the third', 'line 1, and from 6 onwards', 'everything but the last line', and 'four of every five lines'.
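Addresses and their negation are easiest to see on a few identical lines. A minimal sketch with made-up input:

```shell
only=$(printf 'line\nline\nline\nline\n' | sed '2,3 s/line/LINE/')
echo "$only"      # lines 2 and 3 are changed

others=$(printf 'line\nline\nline\nline\n' | sed '2,3!s/line/LINE/')
echo "$others"    # lines 1 and 4 are changed

# $! means 'every line but the last': add a trailing comma to all but the final item
commas=$(printf 'a\nb\nc\n' | sed '$!s/$/,/')
echo "$commas"
```

The single quotes matter for the last one, since the shell would otherwise try to interpret the $ and ! itself.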

See the sed man page for more.

-n and /p: grep-like behaviour

When you use the -n option on sed, it does not print what it reads, only what you tell it to using /p:

For example, "print only lines that contain 'All' or 'all', and replace 'all' with 'boo'":

 > echo -e "All day \n and all day\n  not too long" | sed -n -e '/All/p' -e 's/all/boo/p'
 All day
  and boo day

Side note: the -e option on echo makes it interpret escapes, so that it outputs each \n as an actual newline and not as the two characters \ and n.

# A slightly more useful example:
#   From the log of initial install (debian/ubuntu), 
#   use just the lines mentioning the package
#   and print only the name part of that line:
zcat /var/log/installer/initial-status.gz | sed -n 's/^Package: //p'

y///: transliteration

y/sourcelist/targetlist/: transliterates single characters from the first list into the second, for example:

> echo "All day and all day" | sed 'y/aA/Aa/'
all dAy And All dAy

This is basically a simple version of what the tr utility does, though in combination with other expressions you can make it more advanced than tr.

Matching as a condition for operations

Useful for things like character substitution/transliteration.

The following means 'if the line matches the expression, apply operation to whole line' (not 'apply only to the match').

sed '/[A-Z]/y/0123456789/          /'
sed '/[A-Z]/s/[0-9]/\ /g'

Both of these do the nonsense operation of replacing digits with spaces only when there is a capital letter on the line.
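To make the effect visible, here is a sketch of the same idea on two made-up lines, replacing digits with underscores instead of spaces. It uses [[:upper:]] rather than [A-Z], on the assumption that you want to avoid locale-dependent ranges:

```shell
# Only the first line contains a capital, so only its digits are replaced.
# The substitution applies to the whole line, not just to the part that matched the condition.
out=$(printf 'Abc 123\nabc 123\n' | sed '/[[:upper:]]/s/[0-9]/_/g')
echo "$out"
```

This prints "Abc ___" followed by an untouched "abc 123".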

/i and /a: inserting text before and after a line

For example, the following decorates 'Section ...':

 cat mytext | sed -e '/^[Ss]ection [0-9]*$/i #####' -e '/^[Ss]ection [0-9]*$/a ###'

With this,

Section 1

would become

#####
Section 1
###

(this may have uses in the context of LaTeX, sometimes XML)


Halfway useful examples

Extract single value

...by regexp, here for the CUDA version:

nvcc -V 2>/dev/null | sed -n -r 's/.*release ([0-9]+[.][0-9]+),.*/\1/p'


Replace something in one or more files

For example, to change a naming convention in code, do an in-place replace:

sed -i.bak 's/\bgetSession\b/get_session/g' *.py

The \b ('only if at word boundary') avoids potentially nasty substring replaces.

The .bak argument to -i makes sed make a backup (with that extension), just in case this messed up everything and you want to restore the old copy/copies. If you're happy (you could diff the two to check) you can remove these.
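The whole replace, check, clean-up cycle might look like the following sketch. The scratch directory and the file demo.py are hypothetical, created only so that nothing real gets touched, and \b assumes GNU sed:

```shell
# Hypothetical scratch directory and file
workdir=$(mktemp -d)
printf 'getSession()\nforgetSessions()\n' > "$workdir/demo.py"

# In-place replace, keeping the original as demo.py.bak
sed -i.bak 's/\bgetSession\b/get_session/g' "$workdir"/*.py

# Review what changed (diff exits non-zero when the files differ, hence the || true)
diff "$workdir/demo.py.bak" "$workdir/demo.py" || true

# Happy with the result? Then the backups can go
rm "$workdir"/*.bak
result=$(cat "$workdir/demo.py")
echo "$result"
```

Only the first line changes: forgetSessions() is left alone because getSession is not at a word boundary there.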

Note: other ways of doing this include perl -pi -e (not perl -pie, which makes -i treat the e as a backup extension), and replace. Using sed seems a decent balance of brief and powerful. See also replacing text in multiple files.


Indent an entire file

sed 's/^/    /'


Double-space a file

sed 'G'
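This works because G appends a newline plus the contents of the hold space to the current line, and the hold space is empty unless you put something in it. A minimal sketch on two made-up lines:

```shell
# Each input line comes out followed by an empty line
# (the $(...) capture drops the trailing newlines at the very end)
spaced=$(printf 'a\nb\n' | sed 'G')
echo "$spaced"
```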


Insert newlines

e.g. to make an XML file that had all data on a single line greppable

 cat isbns.xml | sed 's_</isbn><isbn>_</isbn>\n<isbn>_g' > isbns_viewable.xml


Extracting things from security logs

See SSH_-_loose_notes#Check_whether_people_brute-force_you

Common errors

invalid reference on `s' command's RHS

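Usually you forgot the -r (extended regexp) option, so the ( ) in your expression are literal characters and the group that a reference like \1 points to does not exist.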
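The error is easy to reproduce and to fix. A sketch on a made-up string, where the exact wording of the message is an assumption that depends on your sed version:

```shell
# Without -E/-r the parentheses are literal, so there is no group 1 and sed refuses the expression
if err=$(echo "abc" | sed 's/(b)/[\1]/' 2>&1); then failed=no; else failed=yes; fi
echo "failed: $failed"
echo "$err"

# With extended regexps the same expression works
fixed=$(echo "abc" | sed -E 's/(b)/[\1]/')
echo "$fixed"    # a[b]c
```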
