Ogg notes

From Helpful
📃 These are primarily notes, intended to be a collection of useful fragments, that will probably never be complete in any sense.

Format and parsing details

Ogg

Ogg is a container format, consisting of a number of pages.


One physical bitstream (file, or streamed bitstream) can contain many logical bitstreams (e.g. video, audio, anything else).

Logical streams are identified with serial numbers, making it easy to decode multiple streams in parallel.

That means no two streams in the same physical stream should have the same serial number, even if those do not overlap.
(So you can't blindly append physical Ogg bitstreams. Since a bitstream is mainly a series of pages, appending is already mostly fine, but you should do it with a utility that renumbers serial numbers where necessary.)


Physical bitstreams consist of a number of pages, each consisting of a header and a variable number of segments (up to 255 per page, each up to 255 bytes long).

If you consider the segment table part of the header, then the header is a variable-sized thing.

You can still seek quickly by looking for the capture pattern, decoding the header, and verifying the page CRC. If that checks out we have page sync and can start decoding pages end-to-end.
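
As a rough illustration of that resync step, here is a minimal Python sketch that scans a byte buffer for the capture pattern, reads the segment table to learn the page length, and accepts the candidate only if the CRC matches. It relies on the page layout listed further down. The names `ogg_crc` and `find_page` are made up for this example, and it assumes the data is already in memory.

```python
import struct

def _make_crc_table():
    # Ogg's CRC: polynomial 0x04c11db7, MSB-first, no bit reflection.
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table

_CRC_TABLE = _make_crc_table()

def ogg_crc(data):
    # Initial value 0 and no final XOR.
    crc = 0
    for b in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[(crc >> 24) ^ b]
    return crc

def find_page(buf, start=0):
    """Return (offset, length) of the first CRC-valid page at or after start, or None."""
    pos = buf.find(b"OggS\x00", start)
    while pos != -1:
        if pos + 27 <= len(buf):
            nseg = buf[pos + 26]                 # number of lacing values
            header_len = 27 + nseg
            if pos + header_len <= len(buf):
                body_len = sum(buf[pos + 27:pos + header_len])
                end = pos + header_len + body_len
                if end <= len(buf):
                    page = bytearray(buf[pos:end])
                    stored = struct.unpack_from("<I", page, 22)[0]
                    page[22:26] = b"\x00\x00\x00\x00"  # zero the checksum field
                    if ogg_crc(page) == stored:
                        return pos, end - pos
        # Not a real page (or truncated): keep looking one byte further on.
        pos = buf.find(b"OggS\x00", pos + 1)
    return None
```

Calling `find_page(data, offset)` repeatedly, each time starting at the end of the previous hit, walks the pages and skips over any damaged region in between.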


Each page contains:

  • 'OggS'
  • header (together with 'OggS', 27 bytes up to and including the lacing entry count byte)
    • Ogg version (1 byte, can currently only be 0x00) (so you can currently sync on 'OggS\x00')
    • header type (1 byte), a set of bit flags:
      • 0x01 set means continued packet, otherwise fresh
      • 0x02 set means the first page in a logical stream
      • 0x04 set means the last page in a logical stream
    • 'absolute granule position' (a 64-bit int). Its meaning differs per content type
    • stream's serial number (a 32-bit int)
    • page sequence number (a 32-bit int)
    • checksum (a 32-bit int)
    • segment table:
      • amount of lacing values that follow (1 byte)
      • that many values (each a byte)
  • data (as many bytes as the sum of the values in the segment table specifies)
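
The layout above maps fairly directly onto a `struct` format string. The following is a minimal Python sketch under that reading; `OggPage` and `parse_page` are names invented here, the granule position is read as a signed 64-bit value, and no CRC check is done.

```python
import struct
from collections import namedtuple

# Header type bit flags
CONTINUED, FIRST_PAGE, LAST_PAGE = 0x01, 0x02, 0x04

OggPage = namedtuple(
    "OggPage",
    "version header_type granule_pos serial sequence checksum lacing body",
)

def parse_page(buf, offset=0):
    """Parse one page starting at offset; return (OggPage, offset of the next page)."""
    if buf[offset:offset + 4] != b"OggS":
        raise ValueError("no capture pattern at offset %d" % offset)
    if len(buf) < offset + 27:
        raise ValueError("truncated page header")
    # '<' means little-endian with no padding: 1+1+8+4+4+4+1 = 23 bytes after 'OggS'.
    (version, header_type, granule_pos, serial,
     sequence, checksum, nseg) = struct.unpack_from("<BBqIIIB", buf, offset + 4)
    body_start = offset + 27 + nseg
    lacing = bytes(buf[offset + 27:body_start])
    body_end = body_start + sum(lacing)     # data length is the sum of the lacing values
    if len(lacing) != nseg or body_end > len(buf):
        raise ValueError("truncated page")
    page = OggPage(version, header_type, granule_pos, serial, sequence,
                   checksum, lacing, bytes(buf[body_start:body_end]))
    return page, body_end
```

Something like `page, nxt = parse_page(data, 0)` followed by `page.header_type & FIRST_PAGE` then tells you whether this page starts a logical stream.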


Notes:

  • Integers are coded little-endian (least significant byte first)
  • Sequence numbers should be strictly sequential within a (logical(verify)) stream.
  • The first packet in a logical stream defines its type
e.g. the name and revision of Vorbis, the audio rate and such (as per the Vorbis definition).
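  • The checksum should be taken over the page as a whole (header and data), with the header checksum bytes zeroed out.
The CRC32 details are somewhat unusual (polynomial 0x04c11db7, initial value 0, no bit reflection, no final XOR), so a stock zlib-style crc32 will not match; see the Ogg specs.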
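
A minimal Python sketch of the checksum rule, assuming a page is available as one bytes object that starts at 'OggS'. The function names are invented for this example. The CRC parameters used are polynomial 0x04c11db7, processed most significant bit first, starting from 0, with no final XOR.

```python
import struct

def _make_crc_table():
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            # Shift left; when the top bit falls out, fold in the polynomial.
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table

_CRC_TABLE = _make_crc_table()

def ogg_crc(data):
    crc = 0                                   # initial value is 0, not 0xffffffff
    for b in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[(crc >> 24) ^ b]
    return crc                                # no final XOR

def page_checksum(page):
    """CRC over the whole page with the checksum field (bytes 22..25) zeroed."""
    zeroed = bytearray(page)
    zeroed[22:26] = b"\x00\x00\x00\x00"
    return ogg_crc(zeroed)

def checksum_ok(page):
    return struct.unpack_from("<I", page, 22)[0] == page_checksum(page)

def with_fixed_checksum(page):
    """Return a copy of the page with its checksum field recomputed."""
    fixed = bytearray(page)
    struct.pack_into("<I", fixed, 22, page_checksum(page))
    return bytes(fixed)
```

`with_fixed_checksum` is the piece you need after editing anything in a page, since any change to header or data invalidates the stored value.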



Invalid/corrupt files

The first symptom is usually the fact that a file won't play.


When you ask ogginfo about the file, it will report errors.

Note that the xiph ogg decoders (including ogginfo) are simplistic about reporting errors. They don't try to resync on pages, they just trust the size of each page, so once that information is wrong they will probably fail on the rest of the file, often in a very verbose way.

I wrote a simple script that resyncs on headers, and gives a little more information (...to those informed about the format).


You can get invalid oggs in various ways, including:

Concatenation

The Ogg spec says/implies you can't just concatenate oggs.

Yes, the page-based nature with stream markers makes that trivial and correct at a mechanical level, in that it's just more pages that follow others, and the streams are marked anyway.


However, every ogg has its own stream serial numbers (identifiers), and if two streams have the same serial number it will create ambiguity which will give errors like:

Warning: illegally placed page(s) for logical stream 1
This indicates a corrupt ogg file: Page found for stream after EOS flag.
Warning: sequence number gap in stream 1. Got page 1 when expecting page 881. 
Indicates missing data.

In theory, random serial numbers would make this very unlikely, but a lot of things just start numbering at 0 or 1, which makes this almost certain to happen.


It is still easy to make an ogg concatenator that simply makes sure this is not a problem.

It is even easy to separate logical streams with the same serial number, because the sequence numbering alone should indicate the point of change.

Yet this is still a violation of specs, so ogg demuxers are likely to complain.
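
A concatenator that avoids serial number clashes could look roughly like the Python sketch below. It is only an illustration: it assumes both inputs are well-formed Ogg data held in memory, picks the next free number for any clashing serial (a random one would do as well), and recomputes the CRC of every page it copies. All function names are invented here.

```python
import struct

def _make_crc_table():
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table

_CRC_TABLE = _make_crc_table()

def ogg_crc(data):
    crc = 0
    for b in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[(crc >> 24) ^ b]
    return crc

def iter_pages(buf):
    """Yield each page as a mutable bytearray, trusting the segment tables."""
    pos = 0
    while pos < len(buf):
        if buf[pos:pos + 4] != b"OggS":
            raise ValueError("lost page sync at offset %d" % pos)
        nseg = buf[pos + 26]
        end = pos + 27 + nseg + sum(buf[pos + 27:pos + 27 + nseg])
        yield bytearray(buf[pos:end])
        pos = end

def serial_of(page):
    return struct.unpack_from("<I", page, 14)[0]

def concat(first, second):
    """Append second to first, renumbering serials in second that first already uses."""
    used = {serial_of(p) for p in iter_pages(first)}
    remap = {}
    out = bytearray(first)
    for page in iter_pages(second):
        old = serial_of(page)
        if old not in remap:
            new = old
            while new in used:
                new = (new + 1) & 0xFFFFFFFF
            used.add(new)
            remap[old] = new
        struct.pack_into("<I", page, 14, remap[old])    # new serial number
        struct.pack_into("<I", page, 22, 0)             # zero the checksum field
        struct.pack_into("<I", page, 22, ogg_crc(page)) # then store the fresh CRC
        out += page
    return bytes(out)
```

Streams inside the second file keep their identity relative to each other, because the mapping from old to new serial is remembered per stream.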


ID3

Having a program add ID3 tags to your Oggs is, from an Ogg-container-format view, an invalid thing to do, and some decoders/players will give you errors or warnings.

You could strip them again, for example with a command-line utility like id3v2.
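
If you would rather see what is going on at the byte level, the following Python sketch strips a leading ID3v2 tag and a trailing ID3v1 tag from data held in memory. It is an illustration only, not a replacement for a proper tagging tool: `strip_id3` is a made-up name, the leading tag is only removed when an 'OggS' capture pattern follows directly after it, and the trailing check simply looks for 'TAG' 128 bytes before the end.

```python
def strip_id3(data):
    """Return data without a leading ID3v2 tag and without a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2: 'ID3', two version bytes, one flags byte, four 7-bit ("syncsafe") size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)
        tag_len = 10 + size
        if data[5] & 0x10:          # footer flag (ID3v2.4) adds another 10 bytes
            tag_len += 10
        # Only trust the tag size if Ogg data really starts right after it.
        if data[tag_len:tag_len + 4] == b"OggS":
            start = tag_len

    # ID3v1: a fixed 128-byte block at the very end, starting with 'TAG'.
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]
```

Whatever comes out should start with 'OggS'; if it does not, the file has some other problem.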


Other cases

A bug in taglib 1.4 (version from 2005, used by various other programs) caused it to resize and change tag data but apparently not the lacing values in the second page(verify). This leads to the file not playing and ogginfo reporting, among other things:

Warning: Hole in data (4500 bytes) found at approximate offset something bytes. Corrupted ogg.
Warning: Hole in data (9000 bytes) found at approximate offset something bytes. Corrupted ogg.

Followed by:

Warning: sequence number gap in stream 1. Got page 2 when expecting page 1. 
Indicates  missing data.
Warning: discontinuity in stream (1)

And many, many mentions of:

Warning: Could not decode vorbis header packet 1 - invalid vorbis stream (1)

However, in a good number of cases there is no real data corruption, only an invalid header.

I wrote a script to fix this, by rewriting the second page's lacing table based on the actual page content, and recalculating the CRC value.

This fixed many but not all of my affected files. The problem cases were possibly those where the metadata size became larger at the time of the bad change(verify) - or could just be a faulty assumption or bug in my code.
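
For reference, the relation between packet sizes and lacing values that any such repair has to restore can be sketched in Python as below. This is not the script mentioned above, just an illustration with invented function names. It assumes the packet sizes are already known from elsewhere (for example by parsing the codec's header packets), which is the hard part of a real fix. Each packet is stored as a run of 255s followed by one value below 255, which may be 0.

```python
def lacing_values(packet_sizes):
    """Lacing values for a list of complete packets placed on one page."""
    values = []
    for size in packet_sizes:
        values.extend([255] * (size // 255))
        values.append(size % 255)    # a value below 255 terminates the packet
    if len(values) > 255:
        raise ValueError("too many segments for a single page")
    return values

def packet_sizes(lacing):
    """Inverse: return (sizes of completed packets, bytes of an unfinished packet)."""
    sizes, current = [], 0
    for value in lacing:
        current += value
        if value < 255:
            sizes.append(current)
            current = 0
    # A nonzero remainder means the last packet continues on the next page.
    return sizes, current
```

After writing a corrected lacing table into a page, the page checksum has to be recalculated as well, since it covers the segment table.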