Statistics notes - types of data

From Helpful
Revision as of 23:15, 21 April 2024 by Helpful (talk | contribs)

This is more an overview for my own use than material for teaching or exercises.


This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

One possible typology

Ratio data (ordered, meaningful zero point, linear scale, (often) continuous)


  • ordered/monotonic (a larger value means a larger represented thing)
  • linearly comparable (e.g. twice the distance between two numbers means twice the difference in the represented thing)
  • meaningful zero point
  • the combination of the above implies the numbers are proportional: twice the number directly means twice the amount of the represented thing


  • weight
  • length
  • amount of time - reaction time, hours of study, time required to run a marathon
  • age
  • temperature in Kelvin
  • number of responses (note: overlap with discrete numeric)
  • many physical measurements in general
though not all - it can depend on the units and their implied zero point. For example, Fahrenheit and Celsius do not place their zero at zero thermal energy.
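
To make the unit caveat concrete, here is a minimal Python sketch. The helper name celsius_to_kelvin and the two example temperatures are my own, chosen only for illustration. It suggests that 'twice the number' only reads as 'twice the amount' on a scale with a meaningful zero, such as Kelvin.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin (fixed offset of 273.15)."""
    return c + 273.15

# Two illustrative temperatures, in degrees Celsius (made-up values).
cold_c, warm_c = 10.0, 20.0

# Naive ratio on the Celsius numbers: looks like "twice as warm".
ratio_celsius = warm_c / cold_c

# The same two temperatures on the Kelvin scale, which has a meaningful zero.
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(ratio_celsius)           # 2.0
print(round(ratio_kelvin, 3))  # 1.035
```

The Celsius ratio of 2.0 is an artifact of where that scale happens to put its zero; in Kelvin the same pair differs by only a few percent.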

Interval data (ordered, no meaningful zero point, possible linear scale, (often) continuous)


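  • ordered/monotonic (a larger value means a larger represented thing)
  • differences between values can be compared, but ratios of values cannot (20 °C is not 'twice as warm' as 10 °C); whether the scale is really linear depends on the case
  • no particularly meaningful zero point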

Interval data is quantitative data in a numbering system in which there is no sensible zero point.

This means that assuming proportional (and sometimes even linear) relationships may be incorrect; this is often the most important difference compared to ratio data.

  • ...because of the zero point
  • ...because the scale is arbitrary
  • ...and/or because of other reasons
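
As a rough illustration of the zero-point issue, the sketch below converts three made-up temperatures from Celsius to Fahrenheit (the helper name and values are my own assumptions). For a scale like temperature, which is linear apart from its arbitrary zero, the ratio of two values changes with the unit, while the ratio of two differences does not.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Three illustrative temperatures in Celsius (made-up values).
a_c, b_c, c_c = 10.0, 20.0, 40.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Ratio of two values: depends on which unit (and zero point) is used.
value_ratio_c = b_c / a_c   # 20 / 10
value_ratio_f = b_f / a_f   # 68 / 50

# Ratio of two differences: the arbitrary zero cancels out.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)   # 20 / 10
diff_ratio_f = (c_f - b_f) / (b_f - a_f)   # 36 / 18

print(value_ratio_c, value_ratio_f)  # 2.0 1.36
print(diff_ratio_c, diff_ratio_f)    # 2.0 2.0
```

This only covers the zero-point reason; a scale that is arbitrary in other ways (for example, one whose steps are not equal) would not behave this nicely.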

Discrete numeric data (ordered, linear scale, discrete)

Ordinal data (ordered, but no obvious numbering so not linear; discrete)


  • highest level of education
  • questionnaire items of the 'strongly disagree to strongly agree in five steps' sort
  • age groups ('age up to 20', 'age 20-29', 'age 30-39') (note: in this case based on ratio data)
  • socioeconomic status (sort of)
  • most any ranking
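
The age-group example can be sketched in Python as follows. This is only a sketch under my own assumptions: the boundaries treat 'age up to 20' as 'under 20', and the 'age 40+' label is a hypothetical catch-all that is not in the list above. It shows ratio data (age in years) being reduced to ordered groups.

```python
from bisect import bisect_right

# Assumed lower bounds of each group after the first; 'age 40+' is a
# hypothetical catch-all added so every age maps to some group.
BOUNDS = [20, 30, 40]
LABELS = ["age up to 20", "age 20-29", "age 30-39", "age 40+"]

def age_group_rank(age):
    """Return the position of the group an age falls into (0 = youngest)."""
    return bisect_right(BOUNDS, age)

def age_group(age):
    """Return the label of the group an age falls into."""
    return LABELS[age_group_rank(age)]

ages = [18, 20, 29, 35, 52]          # made-up ages in years
groups = [age_group(a) for a in ages]
ranks = [age_group_rank(a) for a in ages]

print(groups)
print(ranks)   # [0, 1, 1, 2, 3]
```

The ranks keep the order of the groups, but the information about how far apart two ages are within or between groups is gone, which is what makes the result ordinal rather than ratio data.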

Ratings that people give on a few-point scale are usually seen as ordinal, though in practice they are often treated as if they were continuous interval data, so there is overlap.
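
A small Python sketch of that overlap, using a five-step agreement scale. The responses are made up, and the middle label 'neutral' is my own assumption about how the five steps are named. The median only relies on the ordering; the mean of the rank numbers additionally assumes the steps are equally spaced, which is the interval-style reading.

```python
from statistics import mean, median_low

# Five ordered answer options; the exact labels are assumed.
LEVELS = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
RANK = {label: i for i, label in enumerate(LEVELS)}

# Made-up questionnaire responses.
responses = ["agree", "neutral", "strongly agree", "agree", "disagree"]
ranks = sorted(RANK[r] for r in responses)

# Ordinal reading: the median needs only the order of the options.
median_label = LEVELS[median_low(ranks)]

# Interval reading: averaging ranks assumes equal distance between steps.
mean_rank = mean(ranks)

print(ranks)         # [1, 2, 3, 3, 4]
print(median_label)  # agree
print(mean_rank)     # 2.6
```

Whether the mean rank of 2.6 means anything depends on whether 'disagree' to 'neutral' is really the same step size as 'agree' to 'strongly agree', which the data itself cannot tell you.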

Nominal/categorical data (unordered; qualitative; discrete)


  • labels, such as {T,A,G,C}
  • choosing from multiple choice options
  • distinguishers such as {green,blue} or {true,false}, discrete genders, blood type
  • brand names
  • left/right-handedness
  • political affiliation
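
For nominal data, about all you can do is compare labels for equality and count them. A minimal Python sketch, using the {T,A,G,C} labels from the list above on a made-up sequence:

```python
from collections import Counter

# A made-up sequence of nominal labels.
bases = list("GATTACA")

# Counting occurrences is meaningful for unordered labels...
counts = Counter(bases)

# ...and so is the most frequent label (the mode).
mode_label, mode_count = counts.most_common(1)[0]

print(dict(counts))
print(mode_label, mode_count)  # A 3
```

Python will happily sort these strings alphabetically, but that order is a property of the label text, not of the thing being labeled, so a median or mean has no interpretation here.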

More words

More complex cases


Further terms that matter

Continuous data refers to numeric values that are not restricted to being discrete/integer.

So, typically, ratio or interval data in the above list.

Quantitative data - basically anything not categorical. Qualitative data, in contrast, refers to nominal/categorical data.

Variables, dimensions, and measurement, and experiments