Text processing notes

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Command line

*nix-ly basics:

  • cut, tr, wc, grep
  • awk, sed
  • but also things like recode

...see e.g. [1]


There are various interesting tricks you can do with grep, awk, and sed.

Variants on grep (see also [2]):

  • sgrep (structured grep) interprets HTML and similar structures and has a GCL-like query syntax (verify)
  • agrep (approximate grep) returns matches that differ from the pattern by up to a given number of errors (inserted, deleted, or substituted characters)
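
To make the idea behind approximate matching concrete, here is a minimal sketch in Python. It is not agrep's actual implementation; it uses a plain dynamic-programming edit distance in which a match may start anywhere in the line. The names approx_find and approx_grep are made up for this illustration.

```python
def approx_find(pattern: str, text: str, max_errors: int) -> bool:
    """Return True if some substring of text is within max_errors edits of pattern."""
    # prev[i] is the cost of matching pattern[:i] against a substring of the
    # text ending at the current position (initially: the empty substring).
    prev = list(range(len(pattern) + 1))
    if prev[-1] <= max_errors:
        return True
    for ch in text:
        # A match may start at any position, so the empty pattern always costs 0.
        cur = [0]
        for i, pc in enumerate(pattern, start=1):
            cur.append(min(
                prev[i] + 1,               # extra character in the text
                cur[i - 1] + 1,            # pattern character missing from the text
                prev[i - 1] + (pc != ch),  # exact match (0) or substitution (1)
            ))
        if cur[-1] <= max_errors:
            return True
        prev = cur
    return False


def approx_grep(pattern, lines, max_errors=1):
    """Keep only the lines that contain an approximate match of pattern."""
    return [line for line in lines if approx_find(pattern, line, max_errors)]
```

For example, approx_grep("color", ["the colour red", "colr scheme", "nothing here"]) keeps the first two lines, since "colour" and "colr" are each one edit away from "color".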

Ideas

  • finite state automata for string recognition, simple string translation
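
A rough sketch of both ideas in Python, as one possible reading of the note above. The recognizer is a small table-driven automaton that accepts optionally signed integers; the translator is a two-state machine that squeezes runs of spaces, roughly what tr -s ' ' does. All names (TRANSITIONS, recognizes, squeeze_spaces) are invented for this example.

```python
def char_class(ch: str) -> str:
    """Map a character to the input symbol the automaton works with."""
    if ch in "+-":
        return "sign"
    if ch in "0123456789":
        return "digit"
    return "other"


# Transition table: (state, input symbol) -> next state.
# Any missing entry means the input is rejected.
TRANSITIONS = {
    ("start", "sign"): "signed",
    ("start", "digit"): "digits",
    ("signed", "digit"): "digits",
    ("digits", "digit"): "digits",
}
ACCEPTING = {"digits"}


def recognizes(s: str) -> bool:
    """Accept strings such as '42', '-7', '+100'; reject everything else."""
    state = "start"
    for ch in s:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


def squeeze_spaces(s: str) -> str:
    """Translate the input, emitting only the first space of each run of spaces."""
    out = []
    in_space = False  # the machine's only state: was the previous char a space?
    for ch in s:
        if ch == " ":
            if not in_space:
                out.append(" ")
            in_space = True
        else:
            out.append(ch)
            in_space = False
    return "".join(out)
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to extend the recognizer (say, with a decimal point) without touching the loop.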



Unsorted

Command line:

See also