Binary files, text files: Difference between revisions
Revision as of 14:10, 16 January 2024
What do these terms even mean?
Pragmatically,
- text file = "all of the data is useful as text"
  - characters in a sequence that you could edit at will in the simplest "one character after another" style of editor
  - human-interpretable, human-editable
- binary file = "not just text". It's a catch-all.
  - a binary file is one you probably can't edit by hand without severely breaking its structure
    - and where it probably wouldn't occur to you to try, e.g. because the most useful data isn't text to start with.
  - probably not human-readable, probably not human-editable
Even that needs footnotes, and we haven't even gotten technical yet.
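To make the pragmatic distinction a little more concrete, here is a minimal sketch of the kind of guess a tool might make about whether some bytes are "just text". It is only a heuristic, not a definition: the function name `looks_like_text`, the choice of UTF-8 as the assumed encoding, and the set of control characters it tolerates are all assumptions made for this illustration.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic guess: does this byte sequence look like plain text?

    Assumptions (illustrative only): text means valid UTF-8,
    contains no NUL bytes, and uses no control characters other
    than tab, newline, carriage return and form feed.
    """
    # NUL bytes are rare in text files and common in binary formats.
    if b"\x00" in data:
        return False

    # If it does not decode under the assumed encoding, call it binary.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False

    # Allow ordinary whitespace controls, reject anything else unprintable.
    allowed_controls = "\t\n\r\f"
    return all(ch.isprintable() or ch in allowed_controls for ch in text)


if __name__ == "__main__":
    print(looks_like_text(b"hello\nworld\n"))          # plain ASCII text
    print(looks_like_text(b"\x89PNG\r\n\x1a\n\x00"))   # start of a PNG-like header
    print(looks_like_text("hi".encode("utf-16")))      # text, but in another encoding
```

Note that the last case is perfectly good text that this guess rejects, because UTF-16 puts NUL bytes between ASCII characters. That is one of the footnotes: "is it text?" depends on which encoding you assume.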
'Binary' seems to come from a time before a lot of different file formats existed,
when computer use mostly meant computer programming,
and when we mostly had two kinds of code: the code that humans wrote,
and that same code in compiled, machine-readable form.
The compiler output was often called 'the binary', and that term is still used. So arguably it's short for 'a binary executable' or some such term.