Video format notes

From Helpful
Revision as of 19:24, 28 January 2011 by Helpful (Talk | contribs)

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)
These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.


Digital video (files, streaming)


This is meant primarily as a technical overview of the codecs in common and/or current use (with some historical relations where they are interesting, or just easy to find), without too many details; there are just too many old and specialist codecs and details that are not interesting to most readers.


Note that some players hand off reading/parsing file formats to libraries, while others do it themselves.

For example, VLC does a lot of work itself, particularly by using its own decoders. This puts it in control, allowing it to be more robust against somewhat broken files, and more CPU-efficient in some cases. At the same time, it may fail on unusual files where it does not perfectly imitate other common implementations, and it will not be as quick to support codecs it doesn't know about; in these cases, players that hand off the work to other components (such as mplayerc) will work better.


Container formats


Containers are file types that usually allow various streams of various types and using various codecs, though in some cases some codec choices are less robust in a container than others.


General-purpose container formats include:

AVI (Audio Video Interleave)

A fairly common container format (a RIFF derivative; see also IFF), though not ideal for things such as MPEG-4 video tracks, VBR MP3 audio tracks, and some other things, as this old format does not really allow that without minor hacks and agreed-on conventions. Many AVIs in the wild violate the AVI standard but play fine on most players.


Derived:

  • Google Video (.gvi) files use MPEG-4 ASP and MP3 in a mild variant of the AVI container [1] (and do not really exist anymore)
  • Files with the .divx extension are usually AVIs (...containing DivX video)

MKV (Matroska Video)

An open standard, preferred by some as it is a fairly well-designed many-stream format, and also because it allows subtitle embedding, meaning less hassle with external subtitle files.

Ogg

Ogg is an open standard.

Extension is usually .ogg, or .ogm (though .ogv, .oga, and .ogx are also seen).

Note that many people say ogg when thinking of the Ogg Vorbis audio format made for music - the Vorbis codec stored in the Ogg container.

See also Ogg notes.


Ogg Media (.ogm) is a somewhat hackish extension (of Ogg), which supports (multiple) subtitle tracks, (multiple) audio tracks, and some other things that make it more practical than AVI, and put it alongside things like Matroska.

Ogg Media is not really necessary and will probably not be developed further, in favour of letting Matroska become a wider, more useful container format instead.(verify)

MPEG-related

MPEG-1, 2, and 4 (MPEG in general) support a limited number of stream types, whether quite specifically settled (such as in DVD VOBs) or less so (various MPEG video in the wild)

  • MPEG-4:
    • MP4 usually refers to MPEG-4's container format (specified in MPEG-4 Part 14). Note that .m4a files are MPEG-4 containers with no video, only audio.
    • 3GP (3GPP, and 3GPP2 (.3g2)) is a simplified version of this MP4 container format, made for mobile device support
  • MPEG-PS (more specific encapsulation, used in DVD, HD-DVD, more) [2]
  • MPEG-TS (more specific encapsulation, used in DVB) [3]

Proprietary/minor/other

A number of container formats support only a limited number of choices, particularly if they are proprietary and/or specific-purpose. The container then often implies the codec commonly stored in it, and it may support only a handful of pre-defined choices (sometimes just one audio and one video format).

Such container formats include:

  • Flash video (.flv) [4]
  • NUT (.nut), a competitor to avi/ogg/matroska [5]
  • Quicktime files (.mov) are containers, though without extensions to quicktime, they support relatively few codecs. In recent versions, MPEG-4 was added.
  • ASF (Advanced Systems Format), a proprietary format from Microsoft, most commonly storing wma and wmv content; it sees little other use in practice (partly because of patents and active legal protection). [6]
  • RealMedia (.rm)
  • DivX Media Format (DMF, usually seen with the .divx extension)


Fairly specific-purpose:

  • Digital Picture Exchange (.dpx) [8]
  • Material Exchange Format (.mxf) [9]
  • Smacker (.smk), used in some video games [10]
  • Bink (.bik), used in some video games [11]
  • ratDVD
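
As a rough illustration of how a player or library might tell some of the containers above apart, here is a minimal sketch that looks at the first bytes of a file. The signature bytes are widely documented magic numbers rather than something taken from these notes, the function name and labels are made up for this example, and real detection code checks much more than this.

```python
def sniff_container(header):
    """Guess a container format from the first bytes of a file.

    `header` should be at least the first 189 bytes (needed for the
    MPEG-TS check). Returns a human-readable label, or "unknown".
    """
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    if header[:4] == b"\x1a\x45\xdf\xa3":
        # EBML header; Matroska is the best-known EBML-based format.
        return "Matroska (or another EBML-based format)"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[4:8] == b"ftyp":
        return "MP4 family (MP4, 3GP, newer QuickTime)"
    if header[:3] == b"FLV":
        return "Flash video"
    if header[:4] == b"\x00\x00\x01\xba":
        # Pack start code of an MPEG program stream (as in DVD VOBs).
        return "MPEG-PS"
    if header[:1] == b"\x47" and header[188:189] == b"\x47":
        # Transport stream packets are 188 bytes, each starting with 0x47.
        return "MPEG-TS"
    return "unknown"


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "->", sniff_container(handle.read(189)))
```

Note that this only identifies the container; which codecs sit inside it is a separate question.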

DVD-Video layout


(Summary: data in MPEG-2 PS, some DVD-specific metadata/layout around it)

A VIDEO_TS directory with VOB, IFO, and BUP files is, in a fashion, a container format, as it is the DVD-Video way of laying out:

  • metadata about stream data (chapters, languages of tracks, angles, etc.)
  • Video streams (usually MPEG-2, sometimes MPEG-1)
  • Audio streams (AC-3, MPEG-1 Layer II (MP2), PCM, or DTS)
  • Subtitle streams (bitmap images)

(note: The AUDIO_TS directory is used by DVD-Audio discs, which are fairly rare. On DVD-Video discs, this directory is empty, and the audio you hear is one of the streams in the VOBs.)


IFO stores metadata for the streams inside the VOB files (e.g. chapters; subtitles and audio tracks). BUP files are simply a backup copy of the IFO files.


VOB files are containers based on MPEG-2 PS, and store the audio, video, and image tracks.

VOB files are split into files no larger than 1GB, a design decision meant to avoid problems with filesystems' file size limits (since the largest possible file size on a DVD was larger than various filesystems at the time could deal with).


DVD players are basic computers in that they run a virtual machine. DVD-Video discs with menus are based on such bytecode, although most are actually very simple if you consider the flexibility of the VM -- there are even a few DVD games, playable by any DVD player.


See also:

Stream identifiers (FourCCs, etc.)


The wide concept that is MPEG

MPEG can refer to one of three formats, MPEG-1, MPEG-2, and MPEG-4, formats that can store video and/or audio streams, and a little more. (There is no MPEG-3: the work planned under that name was folded into MPEG-2. It should not be confused with MP3, which is short for MPEG-1 Audio Layer III.)


MPEG-1, MPEG-2


MPEG-1 and MPEG-2 see use in DVDs and some related earlier formats (such as VCDs, and variants).

They are relatively simple, meaning that hardware players were easier to get right, but also that these are not the most flexible formats.

Using them now tends to mean picking a relatively high bitrate for a given quality, which is acceptable on DVDs since they have a lot of space to store (almost invariably) a single movie.

These are not a very efficient choice when space is scarcer, though.


See also:

MPEG-4

MPEG-4 is a standard with many parts. People commonly use it to refer to the standard as a whole, the MP4 container format, or to one of two fairly separate parts of the standard that refer to two different (styles of) video coding.

They only describe how video should be decoded, so different encoders exist, some of which take more time and squeeze more quality out of what could be called the same format. Their output should work on players that comply with the respective part of MPEG-4, although in some cases it also involves file format details which may prevent it from being playable on fully standard players.

As such, MPEG-4 video refers to two categories that each cover many more specific codecs:

  • MPEG-4 ASP (defined in MPEG-4 Part 2); implementations include:
    • MS MPEG4 (v1, v2, v3), (primarily used in ASFs, and not strictly compliant) (FourCC: MP42, MP43, DIV3, WMV7, WMV8, AP41, COL1) [12]
    • DivX ;-), DivX (initially hack of MS MPEG4 v3 to allow use in AVIs. DivX was later commercially developed)
    • Xvid (succeeded OpenDivX)
    • 3ivx (v1, v2) (FourCC: 3IV1, 3IV2)
    • Nero Digital, mostly an internal format
    • HDX4
    • others
  • MPEG-4 AVC, a.k.a. H.264 ('Advanced Video Coding', defined in MPEG-4 Part 10 as well as ITU-T H.264); implementations include:

To see how much wider MPEG-4 is, see e.g. [13]


Video codecs

H.26x family (related to MPEG and ITU standards. H.something is the ITU name):

  • H.261, a format for videoconferencing (specifically over ISDN), from before the more widely used H.263. [14]
  • H.262, which is identical to part of the MPEG-2 standard
  • H.263: ITU standard for videoconferencing (sometimes used in H.323). Also the base of various other codecs, including:
    • VIVO 1.0, 2.0, I263 and other h263(+) variants
    • Early RealVideo
    • Sorenson (including early Flash video)
      • Sorenson 1 (SVQ1, svq1, svqi), apparently based on vector quantization rather than on H.263 (verify)
      • Sorenson Spark (based on H.263) (Used in Flash 6, 7 (and later) for video)
      • Sorenson 3 (SVQ3), apparently based on H.264 draft
    • See also [15]
  • H.264, (a.k.a./identical to) MPEG-4 Part 10, MPEG-4 AVC
    • FourCC depends on the encoder (not too settled?).
      • ffmpeg/mencoder: FMP4 (which it also uses for MPEG-4 ASP, i.e. DivX and such. It seems this is mostly meant to send these files to ffdshow(verify), but not all players understand that)
      • Apple: avc1
      • Various: H264, h264 (verify)
      • Some: x264 (verify)


  • MPEG-4 part 2, a.k.a. MPEG-4 ASP
    • DivX, XviD, and many variants/derivatives
    • [16] mentions FourCCs 3IV2, 3iv2, BLZ0, DIGI, DIV1, div1, DIVX, divx, DX50, dx50, DXGM, EM4A, EPHV, FMP4, fmp4, FVFW, HDX4, hdx4, M4CC, M4S2, m4s2, MP4S, mp4s, MP4V, mp4v, MVXM, RMP4, SEDG, SMP4, UMP4, WV1F, XVID, XviD, xvid, XVIX
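
Since FourCCs are just four bytes stored in a header, a small sketch may make the lists above more concrete. It assumes the AVI-style convention of storing the code as a 32-bit little-endian integer, and the lookup table is only a hand-picked subset of the FourCCs mentioned in these notes; the function names are made up for this example.

```python
import struct

# A few FourCCs from the lists above, keyed in upper case because the
# same code is seen in both upper and lower case in the wild.
CODEC_FAMILY = {
    "XVID": "MPEG-4 ASP",
    "DIVX": "MPEG-4 ASP",
    "DX50": "MPEG-4 ASP",
    "FMP4": "MPEG-4 ASP (ffmpeg also uses this for H.264)",
    "AVC1": "H.264",
    "H264": "H.264",
    "X264": "H.264",
}


def fourcc_to_int(code):
    """Pack a four-character code into a 32-bit little-endian integer."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC is exactly four bytes")
    return struct.unpack("<I", raw)[0]


def int_to_fourcc(value):
    """Inverse of fourcc_to_int."""
    return struct.pack("<I", value).decode("ascii")


def codec_family(code):
    """Look up the codec family, ignoring case; 'unknown' if not listed."""
    return CODEC_FAMILY.get(code.upper(), "unknown")
```

For example, `codec_family("xvid")` and `codec_family("XviD")` both give "MPEG-4 ASP", while a code that is not in the table comes back as "unknown" rather than raising.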


RealVideo uses different names internally and publicly, some of which are confusable:

  • RealVideo (FourCC RV10, RV13) (based on H.263)
  • RealVideo G2 (fourCC rv20) used in version 6 (and 7?) (based on H.263)
  • RealVideo 3 (FourCC rv30) used in version 8 (apparently based on a draft of H.264)
  • RealVideo 4 (FourCC RV40, and also UNDF) is the internal name/number for the codec used in version 9. Version 10 is the same format, but the encoder is a little better.
  • The H.263-based versions (up to and including 7) were not very impressive, while versions 9 and 10 are quite decent. All are proprietary and generally only play on RealPlayer itself, unless you use something like Real Alternative.


Microsoft:

  • Windows Media Video: (often in .wmv file - asf containers) [17]
    • version 7 (FourCC: WMV1) (based on MPEG-4 part 2)
    • version 8 (FourCC: WMV2)
    • version 9 (FourCC: WMV3)
  • RTVideo [18]
  • VC-1 [19]


Apple:

  • Quicktime [20]
    • 1: simple graphics, and RPZA video [21]
    • 5: Added Sorenson Video 3 (apparently based on an H.264 draft rather than on H.263)
    • 6: MPEG-2, MPEG-4 Part 2 support. Later versions also added Pixlet [22] [23]
    • 7: H.264/MPEG-4 AVC, better general MPEG-4 support
  • Internal formats like 'Intermediate Codec' [24] and ProRes [25]


Dirac [26] is a new, royalty-free codec from the BBC, and is apparently comparable to H.264(verify).


On2 (Duck and TrueMotion also refer to the same company):

  • VP3 (FourCC: VP30, VP31, VP32): relatively low-quality [27]
  • VP4 (FourCC: VP40) [28]
  • VP5 (FourCC: VP50): [29] [30]
  • VP6 (FourCC: VP60, VP61, VP62): Used for some broadcasting [31] [32]
  • VP7 (FourCC: VP70, VP71, VP72): A competitor for MPEG-4 [33] [34]


Xiph's Theora codec is based on (and better than) On2's VP3 [35]


Flash Video: [36]

  • Flash video used Sorenson Spark (based on H.263)
  • Flash 8 added support for VP6 [37]
  • (Flash 9 betas added support for H.264)


Older formats:

  • Flic (.fli, .flc), primarily video-only files used in Autodesk Animator [38]
  • Intel Indeo:
    • Indeo 2 (FourCC: RT21) [40]
    • Indeo 3 (FourCC: IV31 for 3.1, IV32 for 3.2) [41]
    • Indeo 4 (FourCC: IV40, also IV41 for 4.1) [42]
    • Indeo 5.0 (FourCC: IV50) [43]
  • MJPEG is mostly just a sequence of JPEG images (FourCC: AVDJ, AVID, AVRn, dmb1, MJPG, mjpa, mjpb). [44] [45] (There are also some variations on this theme)
  • Various RLE-like formats, often used primarily for very simple animations


Unsorted

  • Uncompressed Raw YUV [46]
  • Compressed YUV, e.g.
    • HuffYUV (lossless, and easily over 20GB/hour)
  • RawRGB (FourCC: 'raw ', sometimes 0x00000000) [47]
  • Hardware formats: (verify)
    • AVID
    • VCR2
    • ASV2
  • Flash video, by player version (preferred codec first, followed by the others it will play) (verify)
    • in Flash 6: Sorenson Spark
    • in Flash 7: Sorenson Spark
    • in Flash 8: VP6, Sorenson Spark
    • in Flash 9: H.264, VP6, Sorenson Spark (and understands MP4, M4V, M4A, 3GP and MOV containers)
    • in Flash 10: (verify)



See also:

Pixel/color formats (and their relation to codecs)

Streaming, streaming support protocols


See Streaming audio and video

Subtitles


Hardsubs is jargon for subtitles that are part of the video image itself, not a separate part of the file. This doesn't run into support issues, and generally looks good, but gives no choice of language, or of whether to display the subtitles at all.

Softsubs refer to separate subtitle data, historically often as a separate file with the same name and a different extension, and more recently as a part of container formats which support multiple streams (such as MKV), which can also store multiple different subtitles (e.g. languages) at once.


There are a number of formats, and not all file extensions are very obvious. Particularly things like .sub and .txt may be one of various formats.


Formats include:

  • Plain-text:
    • Various, but most importantly:
    • SRT (SubRip [48] [49]) - very simple (but not too well standardized, and some advanced features are not well handled by some players/editors)
  • Subtitle editors' internal formats (text, binary, xml, other), some of which became more widely used:
    • SSA (SubStation Alpha, software of the same name) [50] [51]
    • ASS (Advanced Substation Alpha, aegisub) - an extension of SSA ([52])
    • AS5 (ASS version 5) [53]
    • JacoSub [54] [55]
    • XombieSub [56]
    • AQTitle(verify) [57]
    • Turbotitler (old) [58]
  • Image-based: (larger, but avoids font problems)
    • VOBsub (.sub and .idx) are subtitle streams as used by DVD (verify)
    • MicroDVD (.sub), specific to the MicroDVD player [59] (this one is actually a frame-number-based text format, not image-based)
    • PRS (Pre-rendered subtitles) stores (PNG) images [60]
  • XML-based
    • USF (Universal Subtitle Format) [61] [62], an XML format that is not very common outside of Matroska containers
    • SSF (Structured Subtitles Format) is a newer XML-based format (apparently with no current major interest or support [63])
  • Other/unsorted (and other internal formats):
    • SAMI (.smi) [64], often used in Korea
    • DVD-based/derived (CC, SPU, VobSub)
    • Karaoke formats (.lrc, .vkt)
    • MPSub (.sub), a format internal to mplayer [65]
    • MPEG-4 Timed Text [66]
    • Power DivX (.psb) [67]
    • ViPlay Subtitle File (.vsf)
    • Phoenix Japanimation Society (.pjs) [68] (old(verify))
    • Subsonic (.sub) [69]
    • ZeroG (.zeg) [70]
    • Adobe Encore (.txt) [71]
    • MPL2 [72]
    • VPlayer [73]
    • Sasami Script (.s2k)
    • SubViewer (verify)
    • RT (verify)
    • DVB (verify)
    • Teletext (verify)


Editors and other utilities:


Player support

Video encoding notes


Codec choices


Software

Webcam/frame-oriented software


(What you may want to look for in) more-than-webcam software

Mainly for editing

Mainly for conversion

Some specific tools

See also

Video cable/plug types

See Common_plugs_and_connectors#Video_cables.2Fplugs


Quality considerations and connection types:

  • Fully digital transfer is best (theoretically also over longer distances, though various cables have low-latency designs and are distance-limited)
  • Analog, (well-)separated components is good (over short distances)
    • also depending somewhat on how many conversions are involved
    • and how much it is possible for those conversions to be done by marginal hardware
    • Of the older, non-specialized cable types (component, composite and S-video), only component can carry better-than-TV resolutions
  • Simplified combinations (e.g. composite) can be a little worse


So, roughly:

HDMI, DVI-D, etc.
 > 
DVI-A, 
VGA, 
Component (YPbPr) 
SCART RGB 
 > 
S-Video
 > 
Composite

...although it practically also depends considerably on the video signal you want to transfer, and how suited the source, transfer, and target are for each other, including details such as:

  • the resolution(s) the source device offers
  • how suited the resolution and video connection are for the eventual display
  • how suited a resolution is for any given type of video connection
  • the quality of the output hardware for each alternative a device offers

For example, composite video outputs on digital cameras are usually ugly, for more than one of these reasons. Computer screen output does not display ideally on TVs. Network/drive-based media players may encode better on one output than another.

Frame rate, analog TV format, and related


(I'm not sure about all this - there is so much fuzzy and/or conflicting information out there)


Frame rates

Movies on media like DVD come in different frame rates. This does not matter to computer playability, so unless you are compressing or converting video, you probably want to ignore this completely.

Common rates

Some of the more common rates seem to be: (verify)

  • 24 (exactly)
    • common uses / suggests source: used to shoot most film, and used in most cinema projection
  • 24000/1001
    • common uses / suggests source: progressive conversion from film to TV using equipment based on NTSC rates (Um?)
    • also referred to as: 'NTSC film'
    • approx: 23.976, 23.97
  • 25 exactly (verify)
    • common uses / suggests source: the speed of rasters transferred (not shown frames) in (interlaced) broadcasts such as PAL (except PAL M) and SECAM
    • also referred to as: 'PAL film', 'PAL video'
  • 30000/1001
    • common uses / suggests source: the speed of rasters transferred (not shown frames) in (interlaced) broadcasts such as NTSC M (the most common NTSC form) and also PAL M. (Bonus trivia: pre-1953 NTSC broadcasts were exactly 30.0fps)
    • approx: 29.97
  • 50 exactly (verify)
    • common uses / suggests source: can refer to 50 frames per second progressive, or to 25 frame per second interlaced content that is being played (and possibly deinterlaced) as its 50 contained fields per second (as e.g. in PAL and SECAM TV (except PAL M))
    • also referred to as: 'PAL film', 'PAL video'
  • 60000/1001
    • common uses / suggests source: sometimes refers to (roughly) 60 frames per second progressive, but perhaps more commonly to 30 frame per second, 60 field per second interlaced content (as e.g. in NTSC TV)
    • approx: 59.94

These are the most common, but other rates than these exist. For example, there is double rate and quad rate NTSC and PAL (~60fps, ~120fps; 50fps, 100fps), often used for editing or as intermediates in conversions, such as in interlaced material.
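
To keep these rates exact rather than rounded, they can be stored as fractions. The following is a small sketch using Python's standard `fractions` module; the dictionary keys are informal labels chosen for this example, not official names.

```python
from fractions import Fraction

# Frame (or field) rates from the list above, kept as exact fractions.
RATES = {
    "film": Fraction(24),
    "NTSC film": Fraction(24000, 1001),
    "PAL": Fraction(25),
    "NTSC": Fraction(30000, 1001),
    "PAL fields": Fraction(50),
    "NTSC fields": Fraction(60000, 1001),
}


def describe(rate):
    """Show a rate as its exact fraction next to a rounded decimal."""
    return f"{rate.numerator}/{rate.denominator} = {float(rate):.5f}"


for name, rate in RATES.items():
    print(f"{name:12s} {describe(rate)}")

# Double and quad rate are plain multiples of the base frame rate.
double_rate_ntsc = 2 * RATES["NTSC"]   # 60000/1001, about 59.94
quad_rate_ntsc = 4 * RATES["NTSC"]     # 120000/1001, about 119.88
```

Working with the fraction and only rounding for display avoids baking an approximation such as 29.97 into later calculations.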


A framerate might suggest the source of the video (film, PAL broadcast, NTSC broadcast) and/or the way it is played (e.g. 60000/1001 usually refers to playing content in an interlaced way), though it doesn't necessarily imply whether it is interlaced or progressive, or telecined or not.


Movies still mostly use 24fps, not because it's better in some way, but seemingly mostly because we're used to it. Higher framerates are one aspect that reminds us of home video and its associated cheapness, while more sluggish but still acceptable motion is associated with movies, content that we think of as higher quality. These associations are also quite entangled with camerawork and other aspects, so it's not quite that simple.

On approximations

The approximations you see, such as 29.97 for 30000/1001, 23.97 or 23.976 for 24000/1001, are inaccurate.

This matters when reprocessing content, such as going between video from/for PAL and NTSC hardware/broadcast (broadcast standards are the origin of the 1001 divisor in said fractions).

The difference between the fraction and its approximation is milliseconds in the best case and for a few minutes of video, but multiple seconds in the worst case and a movie's length. If audio and video desynchronize that much, it becomes almost impossible to watch.

When recording or transcoding video, it helps to know the exact nature of the content to avoid such desynchronization trouble.
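
To get a feel for the size of this effect, here is a minimal sketch that computes how far video drifts from its audio when frames recorded at the exact rate are played back at a rounded rate. The function name and the two-hour example length are assumptions made for illustration.

```python
from fractions import Fraction


def drift_seconds(true_rate, assumed_rate, duration_s):
    """Seconds by which the video ends late (positive) or early (negative)
    relative to its audio, when `duration_s` seconds of video recorded at
    `true_rate` are played back at `assumed_rate` frames per second."""
    frames = true_rate * duration_s
    return frames / assumed_rate - duration_s


exact = Fraction(24000, 1001)
two_hours = 2 * 60 * 60

# Rounding to 23.976 is nearly harmless: roughly 7 milliseconds in two hours.
small = drift_seconds(exact, Fraction("23.976"), two_hours)

# Rounding to 23.97 is not: roughly 1.8 seconds in two hours.
large = drift_seconds(exact, Fraction("23.97"), two_hours)

print(f"23.976: {float(small) * 1000:.1f} ms, 23.97: {float(large):.2f} s")
```

The same function can be used for 30000/1001 versus 29.97 or 30; treating NTSC video as exactly 30fps gives a drift of several seconds over a movie's length.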

Common constraints

When video is made for (analog) broadcast, it is very much constrained by that standard's color and frame coding, and more significantly in this context, its framerate.


When video is made for film, it is limited by projector ability, which has historically mostly been 24fps.

When it is made for NTSC or PAL broadcast, it is usually 30000/1001 or 25, respectively.


Computers will usually play anything, since they are not tied to a specific rate. Even though monitors have a refresh rate, almost all common video is slower than that, meaning you'll see all frames anyway. (Computer playback is often synchronized to the audio rate.)

Common conversions

Conversion from 24fps film to 25fps PAL broadcast rate can be done by playing the frames and audio faster by a factor 25/24, either letting the pitch be higher (as will happen in analog systems) or the same pitch by using digital filters (optional in digital conversion).

Few people will notice the ~4% difference in speed, pitch or video length.

This does not work for NTSC, as most people easily notice a ~25% difference. Instead, telecining is often used, though it also makes film on TV a little jumpier for the 60Hz/30fps countries (including the US).
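
The numbers behind these two paragraphs can be worked out directly. This is a small sketch; the 120-minute film length is just an example value, and the semitone figure assumes the pitch is left uncorrected.

```python
import math
from fractions import Fraction

pal_speedup = Fraction(25, 24)    # 24fps film played at 25fps
naive_ntsc = Fraction(30, 24)     # what a plain speed-up to ~30fps would need

pal_percent = float(pal_speedup - 1) * 100     # about 4.2 percent faster
ntsc_percent = float(naive_ntsc - 1) * 100     # 25 percent faster

# Pitch rise if the audio is simply played faster, in semitones.
pal_semitones = 12 * math.log2(float(pal_speedup))   # about 0.7


def sped_up_runtime(film_minutes, speedup=pal_speedup):
    """Running time after the speed-up, in minutes."""
    return film_minutes / speedup


print(f"PAL: {pal_percent:.2f}% faster, {pal_semitones:.2f} semitones higher")
print(f"A 120 minute film runs {float(sped_up_runtime(120)):.1f} minutes")
print(f"Naive NTSC speed-up: {ntsc_percent:.0f}% faster")
```

A two-hour film loses a little under five minutes in the PAL case, which matches the claim that few people notice; the 25% case would shorten it by 24 minutes.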


Interlacing, telecining and such


Progressive

Progressive means that each frame is drawn fully on screen, and that frames are drawn in simple sequence.

Seems very obvious; it is best understood in contrast with interlacing and telecining (Pixel packing is also somewhat related).


Constraints in bandwidth and framerates are the main reasons for interlacing and telecining. Computers are not as constrained in those two aspects as broadcasting standards and CRT displays are, and as such, other factors (such as the compression codecs) tend to control the nature of video that is digitally available. This is one reason that much of it is progressive.

Interlacing


Interlacing refers to a method of display that, every screen update, updates only half the image -- specifically every second line, alternately refreshing the even and odd-numbered lines.

The human visual system is less sensitive to detail flickering than to large-area flickering, and TV phosphors are historically relatively slow compared to other types of screens, which are the main two reasons that interlacing is a practical way of reducing bandwidth by a factor of two when the end display can display interlaced content.


Interlacing is a lossy trick of getting more frames to display without using more bandwidth, initially created for TVs to get the fixed-rate broadcast/reception systems to display a higher framerate at the display end (60000/1001 for NTSC, 50 for PAL), for smoother movement.


There is even something to be said for interlacing in systems where bandwidth is not fixed, as advanced de-interlacing algorithms, supported by newer and faster hardware, can bring quality back to near-original levels - given a number of assumptions, but mainly those that are usually true in video.

It remains a question whether interlacing should remain in an age with less and less analog video, though, because there is always some quality loss involved - the original cannot be exactly calculated. While it is not as bad as many assume it is, in the context of digital video there is less added value of interlacing in and of itself than there was in analog broadcasts.

The main argument against interlacing is that both it and deinterlacing are inherently lossy techniques. In transfer of compressed video data (rather than analog data), one should consider that codecs do not deal well with interlacing -- that is, the displayed result of alternating lines (you could call it the what-you-see-update progressive-ish view) contains jagged lines that are very contrasty and, as such, hard to deal with in compression. Once a codec knows about interlaced content, it can treat it as the two frames (fields) that are transferred right up to the eventual display, and do significantly better. However, this is still a frame-halved, framerate-doubled variation, and codecs are often smart enough to deal at least as well (and sometimes rather better) with progressive originals.


Note that depending on how (and where in the chain) you look at the content, you could refer to interlaced content's rate either by the transferred rasters or the transferred frames - the transferred framerate or the shown framerate (30000/1001 or 60000/1001 for NTSC, 25 or 50 for PAL). This leads to some confusion.

From this perspective of displaying things, each image that is received/read contains two frames, shot at slightly different times, which are updated on the display at different times.


This is not the only relevant view. Digital video will often not see interlaced video as having a doubled framerate, but instead have each captured frame contain two (chronologically adjacent) half updates. This makes sense as this is all the data that is displayed and avoids having to spend a lot of bandwidth -- but note that this basically stores interlaced video in progressive frames, which looks worse than on TV because while you're showing the same content, you're doing so at half the update speed again, which also means the interlacing lines are more apparent once captured than on TV.

For this reason, de-interlacing is often a good idea, which actually refers to a few different processes that make the video look better.


Interlacing, in general and in analog systems in particular, comes at the cost of display artifacts under particular conditions: while interlacing is usually not particularly visible on TVs, specific types of resulting problems are brought out by things such as fast pans (particularly horizontal movement), sharp contrast and fine detail (such as small text with serifs, computer output to TV in general), and shirts with small stripes (which can be lessened with some pre-broadcast processing).


Interlacing is also one of a few reasons that a TV recording of a movie will look a little less detailed than the same thing on a (progressive) DVD, even when/though DVDs use the same resolution as TVs (other reasons for the difference are that TV is analog, and involves various lossy steps coming to your TV from, eventually, the original film material).


For notes on telecined content, see below. For now, note also that interlacing is a technique applied at the playing end (it involves taking an image and playing it as two frames), while telecining just generates extra frames initially, which play progressively (or have inverse telecine applied to them; see details).



Note that when a storage format and player are aware of interlacing, it can be used smartly again. For example, DVDs may mix progressive, telecined, and interlaced behaviour. The content is marked, and the player will display it accordingly. Interlaced DVD content is stored in the one-image-is-two-frames way, and the player will display it on the TV in the double-framerate-half-updates way described above.


Deinterlacing

Deinterlacing takes interlaced material and produces a progressive-scan result. It is often applied to make interlacing's visual artefacts, particularly jagged edges (a.k.a. saw tooth edge distortion, mice teeth, combing, serrations), less noticeable, either for display or for (re)compression (as lossy video compression that isn't counting on interlaced video deals with it quite badly).

Note that in most circumstances, deinterlacing is an inherently lossy process, in that it throws away data and isn't exactly reversible. On the whole it may be perceptually quite worth it, though.


There are also smarter and dumber ways of doing deinterlacing (most of those detailed below are the simple and relatively dumb variants), and the best choice depends on

  • whether you are doing it for display or storage/(re)compression
  • whether you are displaying on a CRT (phosphor and scanline) or a digital display
  • the nature of the video before interlacing (are the two fields from different times or not?)
  • restrictions on the output (e.g. TV/broadcast framerate, or little restriction in )

You may like to know that:

  • Analog TV has to adhere to broadcast standards from the 1930s and is interlaced, but whether the two fields in a frame are from different times depends
    • in PAL countries, most films are not, coming from 24fps reels (in NTSC countries films are likely telecined)
    • most other things are, because it looks smoother (think in particular things like sports)
  • different types of camcorders may only ever store half-height interlaced video, or may store progressive frames but still output interlaced video on their analog TV-out.
  • TV capture cards (verify)


For the examples below, assume a 25 frame per second stream with 50 fields per second. The text below will mix 'field per second' and 'frame per second' (and sometimes fps for progressive, where there is no difference), so pay attention.


Weave

Weaving creates progressive frames by showing both fields at the same time, interlaced. This can also be described as 'doing nothing'.

For example, you can weave a 25 frame per second, 50 field per second interlaced video into a 25 frame per second progressive video.

  • If both fields in the frame actually came from the same frame, this just reconstructs the original 25fps video.
  • If the fields came from 50fps footage, each 25fps progressive frame will show images from two fields, 0.02 second (1/50th) apart. This retains all the video data, but looks jagged.
    • This would look more acceptable on a TV than on a computer, because the phosphors that light up die out relatively slowly and effectively smooth the frames.

Digital capture from interlaced media (e.g. a TV capture card) will often capture in a way that effectively weaves frames.
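
As a rough illustration of weaving, here is a minimal Python sketch. It assumes a frame is simply a list of rows (each row a list of pixel values) and that the top field holds the even-numbered lines; the function names are made up for this example.

```python
def split_fields(frame):
    """Split a frame (list of rows) into top (even lines) and bottom (odd lines) fields."""
    return frame[0::2], frame[1::2]


def weave(top_field, bottom_field):
    """Interleave two equal-height fields into one full-height frame."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


# Splitting a frame into fields and weaving them back is lossless.
original = [[0, 0], [1, 1], [2, 2], [3, 3]]
top, bottom = split_fields(original)
rebuilt = weave(top, bottom)
```

If the two fields were sampled at different times, the same code produces the jagged frames described above; weaving itself never looks at the pixel content.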


Discard

A very-simple-to-code method is discarding every second field (a.k.a. single field mode) and drawing each line of the other field twice (line doubling), to get a same-sized, same-framerate result.

This of course throws away half the vertical detail, and if the video came from 50fps footage, also throws away half the temporal information.

You would probably only do this when displaying video on a device with too little processing power.
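
A minimal Python sketch of the discard approach, assuming a frame is a list of rows with an even number of lines; the function name and the `keep` parameter are illustrative only.

```python
def discard_deinterlace(frame, keep="top"):
    """Keep one field and draw each of its lines twice (line doubling)."""
    start = 0 if keep == "top" else 1
    kept_field = frame[start::2]
    out = []
    for row in kept_field:
        out.append(list(row))
        out.append(list(row))
    # Trim in case the input had an odd number of lines.
    return out[:len(frame)]


frame = [[10], [20], [30], [40]]
doubled_top = discard_deinterlace(frame, keep="top")
doubled_bottom = discard_deinterlace(frame, keep="bottom")
```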


Blending

The simplest method that uses all the data is blending (a.k.a. averaging, field combining), which blends both fields into a single frame.

The jagging effects are much reduced, but you blur everything slightly, and motion will have a sort of ghosting to it, because you have removed half the time resolution (or rather, didn't use it).

It keeps the same framerate. For example, 25 frame, 50 field per second → 25 frame per second. If the original content was 25fps, this yields a slightly blurred version of the original. If the original was 50fps, it blurs movement as well.

(Note that sizing down video has a similar effect, and can even be used as a quick and dirty way of deinterlacing if software does not offer any deinterlacing options at all)
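
One simple way to blend, sketched in Python: average each pair of adjacent lines (one from each field) and write the result to both lines. This is only one of several possible blending kernels, and it assumes a frame is a list of rows with an even number of lines.

```python
def blend_deinterlace(frame):
    """Average each top-field line with the bottom-field line below it."""
    out = []
    for y in range(0, len(frame) - 1, 2):
        top_row, bottom_row = frame[y], frame[y + 1]
        mixed = [(a + b) / 2 for a, b in zip(top_row, bottom_row)]
        # Both output lines get the same averaged values.
        out.append(mixed)
        out.append(list(mixed))
    return out


# A moving edge: the top field sees 0, the bottom field sees 100.
combed = [[0, 0], [100, 100], [0, 0], [100, 100]]
blended = blend_deinterlace(combed)
```

The alternating 0/100 lines (the jaggies) become a uniform 50, which is the ghosting/blur trade-off mentioned above.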


Bob

Bob, Bobbing (also 'progressive scan', but that is an ambiguous term) refers to taking both fields from the frame and displaying them in sequence (each scaled up vertically with some basic interpolation). For example, it would take 25 frame per second, 50 field per second video and create 50 frame per second video.

If the original was 50 frame per second video, this creates video that avoids jagginess and looks more fluid (...than any method that would create 25fps progressive video because, unlike them, you keep the time information actually present)

If the original was 25fps, you're mostly just doubling the amount of bandwidth necessary to store the video, which will display looking like 25fps as pairs of frames will be (nearly) identical.

This is often a good choice for display on PCs because you can actually use the higher frame rate to make it look smoother, and also because you avoid choices more likely to show jaggedness (and the higher video bandwidth often doesn't matter).

When re-encoding video, the fact that you are scaling up features may also effectively size up encoding artefacts. If the original was film, you're doubling the bandwidth unnecessarily and will waste space even if the codec is smart about differential coding.
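
A small Python sketch of bobbing, under a few assumptions: frames are lists of rows, the top field comes first in time, and missing lines are filled by averaging the lines directly above and below (real implementations may use better interpolation). The helper names are made up.

```python
def field_to_frame(field, parity):
    """Put field lines at their original positions (parity 0 = even lines,
    1 = odd lines) and fill the missing lines by simple interpolation."""
    height = 2 * len(field)
    frame = [None] * height
    for i, row in enumerate(field):
        frame[2 * i + parity] = list(row)
    for y in range(1 - parity, height, 2):  # the lines this field lacks
        above = frame[y - 1] if y > 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is None:
            frame[y] = list(below)
        elif below is None:
            frame[y] = list(above)
        else:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return frame


def bob(frame):
    """Turn one interlaced frame into two progressive frames (top field first)."""
    top, bottom = frame[0::2], frame[1::2]
    return [field_to_frame(top, 0), field_to_frame(bottom, 1)]


first, second = bob([[0], [10], [20], [30]])
```

Each input frame yields two output frames, which is where the doubling from 25 to 50 frames per second comes from.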


Bob-and-weave and optional cleverness

(verify) Bob and weave refers to the combination of bob and weave.

Telecine

Telecine is a portmanteau of 'television' and 'cinema' and refers to the process of converting video between these -- though its methods are more widely applicable than that.

It is perhaps most often mentioned in the context of frame-rate conversions from film (often 24fps) to NTSC television (30fps, adding intermediate frames) or PAL television (25 frames/s, often by playing the frames at 25fps and the audio a little faster).
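
To put numbers on the PAL case, a short Python calculation, assuming the film is simply played at 25fps instead of 24fps and the audio is sped up by the same factor without pitch correction:

```python
import math
from fractions import Fraction

FILM_FPS = Fraction(24)
PAL_FPS = Fraction(25)

# Playing 24fps material at 25fps speeds everything up by 25/24 (about 4%).
speedup = PAL_FPS / FILM_FPS


def pal_runtime_minutes(film_minutes):
    """Running time after the speed-up, as an exact fraction of minutes."""
    return Fraction(film_minutes) / speedup


# Pitch rise in semitones if the audio is just played faster.
pitch_shift_semitones = 12 * math.log2(float(speedup))

two_hour_film = pal_runtime_minutes(120)  # a bit under five minutes shorter
```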


Frame rate conversion from 24fps film to 30000/1001 broadcast NTSC is usually done using 2:3 pulldown, which uses a constant pattern of interlace-like half-updating frames and some duplication to end up with ~6 additional intermediate frames per second. Since you end up with 30 frames for a second, which look approximately the same as the 24fps original, the audio speed can stay the same. It still means leaving some frame content on-screen longer than others, which is visible in some types of scenes. For example, a slow smooth camera pan would be shown with a slight judder after telecining.
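
The 2:3 pattern can be sketched in Python by working with labels rather than pixels. This assumes each film frame is alternately held for 2 and 3 fields, that fields alternate top/bottom, and that consecutive field pairs form the video frames; the function name is made up.

```python
def pulldown_2_3(film_frames):
    """Return video frames as (top_field_source, bottom_field_source) pairs."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 2 if i % 2 == 0 else 3  # the 2:3 cadence
        fields.extend([frame] * repeat)
    # Pair consecutive fields (top, bottom) into interlaced video frames.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


cadence = pulldown_2_3(["A", "B", "C", "D"])
one_second = pulldown_2_3(list(range(24)))
mixed = [pair for pair in one_second if pair[0] != pair[1]]
```

Four film frames become five video frames (AA, BB, BC, CD, DD), of which two mix fields from different film frames; over a second, 24 film frames become 30 video frames.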

Telecining is reversible, in that you can calculate the original frames from a stream of telecined frames (though when telecined content is spliced this may mean a dropped frame or two).

When the display is not bound to a 30fps refresh rate, you could apply inverse telecine to yield the original 24fps content - which can make sense whenever you can play that back at that rate, since it removes the judder. Inverse telecine is also useful when (re)compressing telecined video, since many codecs don't like the interlace-like effects of telecining and may deal with it badly.
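
A toy Python sketch of inverse telecine, under strong assumptions: the cadence is a clean AA BB BC CD DD pattern, its phase is known (the stream starts on an AA frame), and frames are (top_field, bottom_field) tuples. Real inverse telecine has to detect the cadence from the picture content and cope with splices; this only shows the regrouping step. Both function names are hypothetical.

```python
def telecine_2_3(film):
    """film: list of (top_field, bottom_field). Produces AA BB BC CD DD."""
    video = []
    for i in range(0, len(film) - 3, 4):
        a, b, c, d = film[i:i + 4]
        video += [a, b, (b[0], c[1]), (c[0], d[1]), d]
    return video


def inverse_telecine(video):
    """Rebuild four film frames from each group of five video frames."""
    film = []
    for i in range(0, len(video) - 4, 5):
        f0, f1, f2, f3, f4 = video[i:i + 5]
        film.append(f0)               # A
        film.append(f1)               # B
        film.append((f3[0], f2[1]))   # C: top from CD, bottom from BC
        film.append(f4)               # D
    return film


film = [(name + "t", name + "b") for name in "ABCDEFGH"]
video = telecine_2_3(film)
recovered = inverse_telecine(video)
```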


Hard telecining refers to storing telecined content, e.g. 30000/1001 fps for NTSC generated from 24fps film content, so that the given content can be played (progressively) on NTSC equipment. The upside is that the framerate is correct and the player doesn't have to do anything fancy; the downside is that it usually has a negative effect on quality.

Soft telecining refers to storing video using the original framerate (e.g. 24000/1001 or 24fps film) and flagging the content to be telecined, so that the eventual player (e.g. set-top DVD players) will telecine it to 30000/1001 on the fly. NTSC DVDs with content originating from 24fps film usually store 24000/1001fps progressive (verify), with pulldown flags set so that the DVD player will play it as 30000/1001fps.
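
The odd-looking rates are easier to follow as exact fractions. A small Python check, assuming the usual 4-film-frames-to-5-video-frames ratio of 2:3 pulldown:

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)        # roughly 29.97
STORED_FILM_FPS = Fraction(24000, 1001)  # roughly 23.976

# 2:3 pulldown turns every 4 stored frames into 5 displayed frames.
PULLDOWN_RATIO = Fraction(5, 4)

displayed_fps = STORED_FILM_FPS * PULLDOWN_RATIO

# Film shot at exactly 24fps is slowed down by a factor of 1000/1001
# to reach the stored rate.
slowdown = STORED_FILM_FPS / Fraction(24)
```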


Mixed content

Note that (particularly digitally stored) video may contain mixes of different frame rates in a single video, and may mix progressive and telecined, progressive and interlaced, and sometimes even all three.


You generally want to figure out whether a video is progressive, interlaced or telecined. One way to do this is using a player that allows per-frame advancing (such as mplayer). Make sure it's not applying filters to fix interlacing/telecining, find a scene with movement (preferably horizontal movement/panning), and see whether there are interlace-style jaggies.

  • If there are none, it is progressive (or possibly already deinterlaced by the player)
  • If they appear in every frame, it is interlaced
  • If they appear in only some frames, it is telecined (two out of five in 2:3 24-to-30fps telecine).
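
The same check can be roughly automated. The Python sketch below is a crude heuristic, not an established detector: it assumes frames are lists of rows of brightness values, scores 'combing' by comparing differences between adjacent lines with differences between lines two apart, and uses arbitrary thresholds. Like the manual method, it is only meaningful on frames with motion.

```python
def comb_score(frame):
    """Higher when adjacent lines differ much more than same-field lines."""
    adjacent = 0
    same_field = 0
    for y in range(len(frame) - 2):
        adjacent += sum(abs(a - b) for a, b in zip(frame[y], frame[y + 1]))
        same_field += sum(abs(a - b) for a, b in zip(frame[y], frame[y + 2]))
    return adjacent / (same_field + 1)


def classify(frames, threshold=2.0):
    """Guess the video type from the share of frames that look combed."""
    combed = [comb_score(f) > threshold for f in frames]
    share = sum(combed) / len(combed)
    if share == 0:
        return "progressive"
    if share > 0.9:
        return "interlaced"
    return "telecined or mixed"


smooth = [[y * 10] * 4 for y in range(6)]                # gentle gradient
jagged = [[0] * 4 if y % 2 == 0 else [100] * 4 for y in range(6)]
guess = classify([smooth, smooth, jagged, jagged, smooth])
```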

Note that things like credits may be different (apparently often telecined on DVDs).


While telecining uses regular patterns of extra frames, splicing after telecining means the video will usually not follow that pattern around a splice, meaning that inverse telecine may not be able to decode all original frames. This is often the cause of encoders/players complaining about a few skipped and/or duplicate frames in a movie's worth of frames, and you can ignore this - hardware players do the same.

(Analog) TV formats

There are a number of variants on NTSC, PAL and SECAM that may make TVs from different countries incompatible. NTSC is used in North America and part of South America (mostly NTSC M), and Japan (NTSC J).

PAL is used in most of Europe, part of South America, part of Africa, and Asia. SECAM is used in a few European countries, part of Africa, and Russia.

PAL M (used in Brazil) is an odd one out, being incompatible with other PAL standards, and instead resembling NTSC M - in fact being compatible with the monochrome part of the NTSC M signal.


CRT TVs often support just one of these, as it would be more complex to receive and convert/display more than one, and few people would care for this feature.


It should be noted that the actual broadcast signal uses more lines than are shown on the screen. Only some of the video lines make up the raster, the imagery that will be shown on screen.

  • 525-scanline video (mostly NTSC) has 486 in the raster, and many show/capture only 480(verify)
  • 625-scanline video (mostly PAL) has 576 in the raster

The non-raster lines historically were the CRT's vertical blanking interval (VBI), but now often contain things like teletext, closed captioning, station identifiers, timecodes, sometimes even things like content ratings and copy protection information (note: not the same as the broadcast flag in digital television).

Video recording/capture will often strip the VBI, so it is unlikely that you will ever have to deal with it. Some devices, like the TiVo, will use the information (e.g. respect copy protection) but do not record it (as lines of video, anyway).

Devices exist to add and alter the information here.


PAL ↔ NTSC conversion

Note that various DVD players do this, although others do not, and neither the fact that they do nor that they don't is necessarily advertised very clearly.


PAL to NTSC conversion consists of:

  • Reducing 625 lines to 525 lines
  • creating ~5 more frames per second

NTSC to PAL conversion consists of:

  • increasing 525 to 625 lines
  • removing ~5 frames per second


In general, the simplest method, which cheaper on-the-fly conversion devices often use, is to duplicate/omit lines/frames. This tends to not be the best looking solution, though.

Linear interpolation (of frames or lines) can offer smoother-looking motion and fewer artifacts, but is more computationally expensive, and has further requirements - such as working on deinterlaced content.

Fancier methods can use things like motion estimation (similar to fancy methods of deinterlacing)
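
A Python sketch of the two simple approaches, assuming progressive (already deinterlaced) frames stored as lists of rows, and integer frame rates of 25 and 30 for simplicity. The function names are made up, and real converters are considerably more involved.

```python
def duplicate_or_drop(frames, src_fps, dst_fps):
    """Change frame rate by repeating or omitting whole frames."""
    n_out = len(frames) * dst_fps // src_fps
    return [frames[i * src_fps // dst_fps] for i in range(n_out)]


def resample_lines(frame, new_height):
    """Change the number of lines by linear interpolation between neighbours."""
    old_height = len(frame)
    out = []
    for y in range(new_height):
        # Position of output line y in source line coordinates.
        pos = y * (old_height - 1) / (new_height - 1) if new_height > 1 else 0
        lo = int(pos)
        hi = min(lo + 1, old_height - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frame[lo], frame[hi])])
    return out


pal_second = list(range(25))                        # frame indices only
ntsc_second = duplicate_or_drop(pal_second, 25, 30)  # 5 frames repeated
stretched = resample_lines([[0], [10], [20]], 5)
```

The same `resample_lines` call with 576 source lines and `new_height=480` would be the PAL to NTSC raster case.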

Digital / HD broadcasting

ATSC in the US, DVB in Europe


Semi-sorted

Resolution references

References like 480i and 720p became more common in the era commonly known as HD (now), partly just because it has become more relevant to be clear about which is meant.

These references are not often seen alongside monitor resolutions, perhaps because "720p" and "1080p HD" are easier to market when you don't consider that they pack about as much information as a minimal and a mid-range monitor from a decade or more ago, just on a bigger screen. (For TV image quality, the pixels make much less difference than the new digital content/broadcast standards that came along with them.)


References such as 480i and 720p refer to the vertical resolution (the number of lines) and whether the video is interlaced or progressive.

The common vertical resolutions:

  • 480 (for NTSC compatibility)
    • 480i or 480p
  • 576 (for PAL compatibility)
    • 576i or 576p
  • 720
    • always 720p; 720i does not exist as a standard
    • 1280x720 (sometimes 960x720)
  • 1080 (HD)
    • 1080i or 1080p
    • usually 1920x1080


There are some other newish resolutions, many related to content for laptops/LCDs, monitor/TV hybrids, widescreen variations, and such.


HD TV broadcasts are often either 1080i or 720p. While 1080i has greater spatial resolution (1920x1080 versus 1280x720), 720p does not have interlace artifacts and may look smoother.


The 480 and 576 variants usually refer to content from/for (analog) TVs, so often refer to more specific formats used in broadcast.

  • 576 often refers to PAL, more specifically:
    • analogue broadcast TV, PAL - specifically 576i, and then often specifically 576i50
    • EDTV PAL is progressive, 576p
  • 480 often refers to NTSC, more specifically:
    • analogue broadcast TV, NTSC - specifically 480i, and then often specifically 480i60
    • EDTV NTSC is progressive, 480p
  • 486 active lines seems to refer to older NTSC - it now usually has 480 active lines

There is more variation with various extensions - widescreen, extra resolution as in e.g. PALPlus, and such.



Sometimes the frame rate is also added, such as 720p50 - which usually refers to the display frequency applicable.

In cases like 480i60 and 576i50 you know this probably refers to content from/for NTSC and PAL TV broadcast.
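
For scripts that need to pick such labels apart, a small Python sketch. The accepted syntax (lines, then i or p, then an optional rate) is an assumption based on the examples above, not a formal standard, and the function name is made up.

```python
import re

MODE_PATTERN = re.compile(r"(\d+)([ip])(\d+(?:\.\d+)?)?")


def parse_video_mode(label):
    """Split a label like '720p50' into line count, scan type and rate."""
    match = MODE_PATTERN.fullmatch(label.strip().lower())
    if match is None:
        raise ValueError("not a recognised mode label: %r" % label)
    lines, scan, rate = match.groups()
    return {
        "lines": int(lines),
        "scan": "interlaced" if scan == "i" else "progressive",
        # The rate is kept as written; None when the label omits it.
        "rate": float(rate) if rate else None,
    }


hd = parse_video_mode("720p50")
pal = parse_video_mode("576i50")
bare = parse_video_mode("1080i")
```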


On TV horizontal pixel resolution

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)


More resolutions

See e.g. the image on http://en.wikipedia.org/wiki/Display_resolution



Screen and pixel ratios

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)



Unsorted