Newlines

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)

ASCII byte values

Line separators in plain-text files are encoded by:

  • \n is LF (LineFeed): 0x0a in hex, 10 in decimal, 12 in octal
  • \r is CR (Carriage Return): 0x0d in hex, 13 in decimal, 15 in octal
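To check which of these bytes a file actually contains, a hex dump is the quickest route. A minimal sketch, assuming a POSIX-style od, tr and sed are available; hexbytes is a made-up helper name, not a standard utility:

```shell
# Print every byte of stdin as two hex digits, space-separated, on one line.
hexbytes() {
    od -An -v -tx1 |          # hex dump without the offset column
        tr -s ' \n' '  ' |    # join od's output lines, squeeze runs of spaces
        sed 's/^ //; s/ $//'  # trim the leading and trailing space
}

printf 'a\r\nb\n' | hexbytes
# 61 0d 0a 62 0a
```

Here 0d 0a is a CRLF pair and the final 0a is a bare LF.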


How they are used

Different systems use the two in different ways:

  • Most unices use LF (\n, 0x0A) by itself.
  • DOS and various Windows programs use CRLF (\r\n, 0x0D 0x0A).
  • Macs up to OS9 used CR (\r, 0x0D) by itself (verify) -- OSX sometimes uses this, sometimes unix style.

I have seen mention of \n\r, but this seems to be confusion about which character is which.


Note that most programming languages use 'newline' to refer specifically to LF (\n, 0x0a), regardless of OS. This applies mostly to output, though there may also be some newline handling/translating code for reading, most commonly for file reading.

Many Windows programs will understand both \r\n and \n, though some won't.

In *nix, many utilities will read lines, absorb CRLF and print as LF without you needing to worry about it. Those that do not will often show the CR as ^M. Many programs do not know about the old Mac convention (CR only).
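If you suspect stray CRs, you can make them visible or count them yourself. A rough sketch, assuming your cat supports -v (GNU and BSD versions do, though it is not in POSIX); count_crlf_lines is a hypothetical helper name:

```shell
CR=$(printf '\r')   # a literal carriage return

# Count the lines of stdin that end in CR, i.e. CRLF-terminated lines.
count_crlf_lines() {
    grep -c "${CR}\$" || true   # grep exits non-zero when the count is 0
}

printf 'one\r\ntwo\n' | cat -v             # prints "one^M" then "two"
printf 'one\r\ntwo\n' | count_crlf_lines   # prints 1
```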



Translating

There are some utilities to convert these, though most may not be installed. It can be useful to know some tricks with standard utilities.


General solutions aren't very simple.

Very specific cases often are, though. For example, if you have a wordlist that you want only LFs in, you can do:

tr -s '\r' '\n' < wordlist.crlf > wordlist.lfonly

This is not a general solution: without the squeeze (-s) every CRLF becomes two LFs, which double-spaces the file, and with it any empty lines are removed. It would be more accurate to convert every byte sequence '\r\n' into '\n'.
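The effect is easy to see on a tiny input. A small sketch, where sample is a made-up helper that prints three CRLF-terminated lines with the middle one empty. The tr -d variant is an addition here, and note that it deletes every CR, including any that are not part of a CRLF pair:

```shell
# Three CRLF-terminated lines, the middle one empty.
sample() { printf 'a\r\n\r\nb\r\n'; }

sample | tr '\r' '\n' | wc -l      # 6 lines: every line break was doubled
sample | tr -s '\r' '\n' | wc -l   # 2 lines: the empty line was squeezed away
sample | tr -d '\r' | wc -l        # 3 lines: deleting CR keeps the line structure
```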


Applications that do not absorb \r will usually see it as just another (control) character, so you can usually say that you want to remove a \r when it is last on a line. For example, the following effectively converts CRLF to LF:

sed 's/\r$//'
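The \r escape in a sed pattern is understood by GNU sed but not necessarily by other implementations. A sketch that avoids it by putting a literal CR in a shell variable, and that also covers the other directions; the function names are made up for illustration, and the stated assumptions about the input matter:

```shell
CR=$(printf '\r')   # a literal carriage return, so sed need not understand \r

# CRLF -> LF: drop a CR only when it is the last character on a line.
crlf_to_lf() { sed "s/${CR}\$//"; }

# LF -> CRLF: append a CR to every line (assumes the input contains no CRs yet).
lf_to_crlf() { sed "s/\$/${CR}/"; }

# Old Mac CR -> LF: every CR becomes an LF (assumes the input contains no LFs).
cr_to_lf() { tr '\r' '\n'; }
```

All three read stdin and write stdout, so usage would look like crlf_to_lf < wordlist.crlf > wordlist.lfonly.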


For more automatic handling and conversion, from and to all formats, you'll have to detect what a file actually contains.
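One way to approach detection is to count CRLF pairs, bare LFs and bare CRs, and decide from the totals. A rough sketch, assuming POSIX od, tr, wc and awk; detect_newlines is a hypothetical name, and because it joins the whole hex dump into a single line it is meant for modest file sizes only:

```shell
# Print how many CRLF pairs, bare LFs and bare CRs the file "$1" contains.
detect_newlines() {
    # Total CR and LF bytes in the file.
    cr=$(tr -cd '\r' < "$1" | wc -c)
    lf=$(tr -cd '\n' < "$1" | wc -c)
    # CRLF pairs: hex-dump the file as one long line and count "0d 0a".
    crlf=$(od -An -v -tx1 "$1" | tr -s ' \n' '  ' |
           awk '{ n += gsub(/0d 0a/, "") } END { print n + 0 }')
    # Whatever is not part of a pair is a bare LF or a bare CR.
    echo "crlf=$crlf lf=$(( $lf - $crlf )) cr=$(( $cr - $crlf ))"
}
```

Calling detect_newlines wordlist.crlf on a pure CRLF file with two lines would print crlf=2 lf=0 cr=0; a file where more than one count is non-zero has mixed line endings and probably needs a closer look before converting.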

See also

Related software: