Programming language typology and glossary
Typing (glossary)
type annotation, type checking, and type hinting
Type declaration is putting types on function arguments, return values, and variables
- which usually implies static typing
Type checking is checking that these declared types are adhered to.
- In static, compiled languages this can be done at compile time (though static languages that still allow dynamic casts can still subvert things).
- In dynamic languages this check can only be done at runtime, and whether and when it is checked depends on the language and/or the programmer
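As a minimal sketch of a runtime check that is left to the programmer, assuming Python as the dynamic language (the function name half is made up for illustration):

```python
def half(value):
    # Python will not check the argument type for us, so we check at runtime.
    # bool is excluded explicitly because it is a subclass of int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {type(value).__name__}")
    return value / 2


print(half(5))  # a valid call

try:
    half("5")
except TypeError as error:
    # The mistake is only noticed when this call actually runs.
    print("rejected:", error)
```

Without the isinstance test, the bad call would still fail here, but only because the division happens to reject strings, and with a less helpful message.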
Type annotation is a name we may prefer to use when types are mentioned, but not checked - really just inline documentation of what we expect.
This applies only to dynamic languages (in statically typed languages it would be a syntax error to not specify types, or to not have them inferred, for languages that do that). Type annotation can still be very useful for editors to hint to you what to hand in, and for certain kinds of checks.
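A small sketch of annotation that is mentioned but not checked, assuming Python 3 (the function name double is made up for illustration):

```python
def double(n: int) -> int:
    # The annotations document intent; Python itself does not enforce them.
    return n * 2


as_intended = double(4)          # an int, as annotated
not_as_intended = double("ab")   # a str: runs fine, and repeats the string

print(as_intended, not_as_intended)

# The annotations are merely stored on the function, for tools to read.
print(double.__annotations__)
```

The second call contradicts the annotation and still runs. An editor or a separate checker could flag it, but nothing does at runtime.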
Type hinting is a vaguer term, and can refer to things like:
- type annotation as described above (not enforced in any way)
- mainly there so that IDEs can show this to programmers
- e.g. Python 3[1] (see also Python_notes_-_syntax_and_language#Type_annotation)
- type annotation that is enforced at runtime, but optional to specify
- e.g. PHP (though it calls it type declaration) [2]
- around generics, it can refer to requesting which types should be precompiled
- (a specific case of a compiler hint)
- e.g. tensorflow[3]
Some languages add some further distinctions, like Java's 'declaration annotation' [4][5]
Conversion, casting, coercing
The shorthand terms (like conversion, casting, and coercing) are somewhat fuzzy.
Or, rather, context-dependent. They are often defined very clearly within a language's specs -- but different languages may use the terms differently, so once you step outside a single language, definitions vary.
More verbose terms may be more specific, but even they can vary between specific languages' type systems (and sensibly so within each).
The result is that people will often use the terms consistent with the language they know best, so it really helps to understand the underlying concepts. Context then solves most things.
More pragmatically
Programmers often deal with:
- having to explicitly convert values (explicit converting type cast)
- e.g. (float)intvar
- cases where that's done for you (implicit converting type cast)
- e.g. expressions like 2 + 2.2
- e.g. a function taking a float, which you can call with an int because the language's coercion rules specifically allow that as an implicit conversion and do it for you
- the typing system, and its coercion rules (which may be alterable)
- that e.g. says "math mixing integers and floats will always become float" or "it becomes the type of the left value"
- that e.g. emits "this is a common source of mistakes" warnings, e.g. around signedness, or pointer-type conversion (C)
- that raises compiler errors like "no you can't turn a float to a string directly" or "wrong pointer type"
- that makes some things implicit and forces other things to be explicit
- seeing underlying bytes in a different way (explicit non-converting type cast)
- often hackish, sometimes useful.
For example, consider the C code:
int i1 = 10;
int i2 = 4;
float f1 = i1 / i2;
float f2 = (float)i1 / i2;
f1 will store 2.0, because that expression is an integer/integer division resulting in 2, which is implicitly converted to a float only because you happen to store it into one.
The f2 line explicitly converts (only) i1 to a float, and then counts on coercion to convert i2 as well, so that the division happens as a float/float division, and the result 2.5 can be assigned (without conversion).
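For contrast, a sketch of the same calculation in Python 3, whose rules differ from C's: there, / always does float division, and integer division has its own operator, //. The variable names mirror the C code, and f3 is added for illustration.

```python
i1 = 10
i2 = 4

f1 = float(i1 // i2)   # integer division gives 2, explicitly converted to 2.0
f2 = i1 / i2           # true division: the result is the float 2.5
f3 = float(i1) / i2    # explicit conversion of i1; i2 is converted implicitly

print(f1, f2, f3)
```

The same expression i1 / i2 means different things in the two languages, which is why the rules have to be learned per language.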
More technically
(these terms are a lot clearer, but still not universal)
- a converting type cast, sometimes type conversion
- changes the underlying bits according to known interpretation of both old and new types
- often to get the best possible, or most useful representation, in another type
- e.g. int as float (typically accurate enough),
- or float as string (typically rounded for human feedback)
- a non-converting type cast (always explicit)
- does not change the underlying bits, but sees those same bits with a different interpretation
- e.g. "see these four adjacent bytes as one int32"
- which sometimes makes sense, mostly for speed reasons
- and is often neither necessary nor safe, so not all languages expose the ability, or make it easy
- always has to be done explicitly
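The difference between the two kinds of cast can be sketched in Python. Python has no cast syntax for reinterpreting bits, so this sketch goes through the standard struct module, which copies the bytes rather than truly viewing them in place; the variable names are made up for illustration.

```python
import struct

value = 1.0

# Converting cast: the number is kept (as well as the new type allows),
# while the underlying bits change.
converted = int(value)

# Non-converting cast: pack the float into its four IEEE 754 single-precision
# bytes, then read those same four bytes as one unsigned 32-bit integer.
raw = struct.pack("<f", value)
reinterpreted = struct.unpack("<I", raw)[0]

print(converted)             # 1
print(raw.hex())             # the four bytes, in little-endian order
print(hex(reinterpreted))    # 0x3f800000
```

The reinterpreted integer has nothing to do with the value 1; it is just what the float's bit pattern happens to mean when read as an integer.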
A converting type cast:
- explicit type conversion
- e.g. when you do (float)intvalue
- implicit type conversion, often called coercion
- the language's type system allowing certain conversions and doing them for you, things like
- e.g. handing an integer to a function expecting a float, or converting smaller to larger integers
- e.g. expressions like 2 + 2.2
- e.g. in the 2+2.2 case many languages have coercion rules that effectively say "any integer-float mix becomes a float", while some others always focus on the left value
- This will sometimes emit "this is a common source of mistakes" warnings
- e.g. C around signedness, or pointer-type conversion, both of which make sense
Coercion mostly dictates how implicit conversion can work, and as such is often used as a near-synonym for it.
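A short sketch of explicit versus implicit conversion, assuming Python's rules (other languages will differ in the details):

```python
import math

# Explicit conversion: we ask for it by name.
explicit = float(3)

# Implicit conversion (coercion): mixing int and float in an expression
# converts the int operand for us, and the result is a float.
mixed = 2 + 2.2

# Handing an int to a function that works in floats: math.sqrt converts
# its argument and returns a float.
root = math.sqrt(4)

print(type(explicit).__name__, type(mixed).__name__, type(root).__name__)
```

All three results are floats, but only the first conversion is visible in the code.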
Typing (typology)
strong typing, weak typing
Mainly describes how easily one type is coerced to another within an expression.
Strongly typed languages have stricter and more enforced rules about mixing types,
meaning you need to do more conversions explicitly.
Weak typing often means there are more type-operator-type combinations predefined - and that they won't always do what you expect.
Consider 2 + "2"
- In strong languages, this will give an error
- In weak ones, it will do... something. Depending on the language, it may be 4, it may be "22".
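As a sketch of the strong end of the scale, this is how Python handles that expression (the variable names are made up for illustration):

```python
try:
    result = 2 + "2"
except TypeError as error:
    # Python refuses to guess what mixing an int and a str should mean.
    result = None
    print("refused:", error)

# Saying which meaning we want makes both outcomes available.
as_number = 2 + int("2")   # 4
as_text = str(2) + "2"     # "22"

print(as_number, as_text)
```

The two explicit conversions are exactly the two guesses a weakly typed language might make on your behalf.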
Strong versus weak typing is actually a sliding scale.
- some things are more easily coerced in most languages.
- In particular, most languages will allow mixing of ints and floats, so 2 + 2.2 is usually valid (and usually becomes a float), because it's pretty convenient.
Overly weak typing is often disliked, in part because such languages are often also dynamically typed, meaning there are a lot of hidden rules to the typing, like "order matters a lot" or "actually in that case it coerces via integers and not strings like everything else" or other "dunno, specs say so" arguments.
This amounts to "the correctness depends on whether you have internalized this particular language's typing model".
It's potentially more opaque around variables than around literals, because there is no clear indication of the type a variable currently has.
Overly strong typing is also disliked, in that it makes you type out absolutely everything, often needlessly.
It's a balance between verbosity and how obvious mistakes are.
dynamic typing, static typing
In statically typed languages, variables have types.
In dynamically typed languages, values do.
One way to look at it is that variables in statically typed languages make some space according to the type, and you can only store things of that type in it, e.g.
int i=0;
i="foo";  # is invalid
Whereas in dynamically typed languages, variables are just temporary names that point at a value-that-comes-with-a-type.
i=0
i='foo'  # i now points to a string instead
Note that this is independent of strong/weak typing. Strong typing may still be in place.
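A sketch of both points in Python, which is dynamically but fairly strongly typed (the helper variable names are made up for illustration):

```python
i = 0
first_type = type(i).__name__    # the value 0 carries the type int

i = "foo"                        # the same name now points at a str value
second_type = type(i).__name__

print(first_type, second_type)

# Dynamic does not mean weak: the str value still refuses to mix with an int.
try:
    i + 1
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(mixing_allowed)
```

The name i never had a type of its own; only the values it pointed at did.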
Static is liked because the explicit typing is clearer to both people and compilers, and because it lets you know about typing errors at compile time.
Static is disliked when it makes you write things out too much. Implicit typing (type inference) is a nice feature some languages have to lessen this.
Dynamic is liked for its flexibility and often shorter code.
Dynamic is disliked for its ability to hide bugs, and for some of those bugs only being discovered at runtime.
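A sketch of such a hidden bug, assuming Python (the function describe is made up for illustration). The faulty line is perfectly legal to define and only fails once it is actually executed:

```python
def describe(n):
    if n >= 0:
        return "count: " + str(n)
    # Bug: str + int is a type error, but nothing checks this line
    # until a negative number actually reaches it.
    return "negative: " + n


print(describe(3))       # works, so the bug can go unnoticed for a long time

try:
    describe(-1)
    bug_surfaced = False
except TypeError:
    bug_surfaced = True

print(bug_surfaced)
```

A statically typed language would typically reject the faulty line at compile time, whether or not that branch is ever taken.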