EBML

From MobileRead

EBML is short for Extensible Binary Meta Language. It specifies a binary, byte-aligned (octet-aligned) format inspired by the principles of XML. EBML itself is a generalized description of the technique of binary markup. Like XML, it is completely agnostic to the data it contains.

The Matroska project is a specific implementation using the rules of EBML: it seeks to define a subset of the EBML language in the context of audio and video data (though EBML itself is not limited to this purpose). The format is made of two parts: the semantics and the syntax. The semantics specify a number of IDs and their basic types, and are not included in the data file or stream. A separate project deals with EBML in more detail and carries more recent updates.

Just like XML, the specific "tags" (IDs in EBML parlance) used in an EBML implementation are arbitrary. However, the semantics of EBML outline general data types and IDs.

The known basic types are:

As well as defining standard data types, EBML uses a system of Elements to make up an EBML "document". Each Element consists of an Element ID, a descriptor for the size of the element, and the binary data itself. Elements can also be nested: an Element may contain Elements of a lower "level".
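The nesting of Elements can be pictured as a small tree. The following is only an illustrative sketch in Python: the `Element` class, its field names, the `walk` helper, and the ID values are hypothetical and are not taken from any EBML library or from the Matroska specification.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Element:
    """Hypothetical in-memory model of one EBML Element."""
    element_id: int                      # the Element ID
    data: bytes = b""                    # uninterpreted binary payload
    children: List["Element"] = field(default_factory=list)  # nested Elements


def walk(element: Element, depth: int = 0) -> Iterator[Tuple[int, Element]]:
    """Yield (depth, element) pairs, with depth 0 for the outermost Element."""
    yield depth, element
    for child in element.children:
        yield from walk(child, depth + 1)


# Made-up IDs, chosen only to show one Element containing two others.
leaf_a = Element(element_id=0x81, data=b"\x01")
leaf_b = Element(element_id=0x82, data=b"\x02\x03")
root = Element(element_id=0x4001, children=[leaf_a, leaf_b])

depths = [(depth, hex(el.element_id)) for depth, el in walk(root)]
```

Here an Element either carries raw data or contains further Elements; a real parser would decide which from the type that the semantics assign to the ID.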

Elements are laid out as follows: first the Element ID (also called the EBML ID), then the Data Size, and then the uninterpreted binary data itself.

The Element ID is coded with a UTF-8-like system:

bits, big-endian
1xxx xxxx                                  - Class A IDs (2^7 -1 possible values) (base 0x8X)
01xx xxxx  xxxx xxxx                       - Class B IDs (2^14-1 possible values) (base 0x4X 0xXX)
001x xxxx  xxxx xxxx  xxxx xxxx            - Class C IDs (2^21-1 possible values) (base 0x2X 0xXX 0xXX)
0001 xxxx  xxxx xxxx  xxxx xxxx  xxxx xxxx - Class D IDs (2^28-1 possible values) (base 0x1X 0xXX 0xXX 0xXX)
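The table above implies that the position of the first set bit in the leading byte gives the length of the ID. A minimal sketch of that rule in Python follows; the function names are hypothetical, the sample bytes are chosen only to exercise each class, and the code does not attempt to reject any reserved ID values.

```python
from typing import Tuple


def element_id_length(first_byte: int) -> int:
    """Return the ID length in bytes (1 to 4) from the leading byte's marker bit."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    for length in range(1, 5):
        # Class A marker is 0x80, Class B 0x40, Class C 0x20, Class D 0x10.
        if first_byte & (0x80 >> (length - 1)):
            return length
    raise ValueError("leading byte does not start a Class A-D ID")


def read_element_id(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (id_value, length); the marker bits stay part of the ID value."""
    length = element_id_length(data[offset])
    raw = data[offset:offset + length]
    if len(raw) < length:
        raise ValueError("truncated Element ID")
    return int.from_bytes(raw, "big"), length


# Sample bytes: a four-byte (Class D) ID followed by one unrelated byte.
sample = bytes([0x1A, 0x45, 0xDF, 0xA3, 0x00])
id_value, id_length = read_element_id(sample)
```

With these sample bytes, `id_value` is `0x1A45DFA3` and `id_length` is `4`, because the leading byte `0x1A` is `0001 1010` in binary and so matches the Class D pattern.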

Some Notes:
