ePub 3

From MobileRead

ePub version 3 is the newest version of the standard and has been recommended by the IDPF standards committee. This page describes its features in relation to the existing ePub 2.0.1 version. See also Fixed layout ePub. As of June 2014, version 3.0.1 is the approved standard.


Document Organization

In version 2.0.1 there were three defining documents: the OPF (Open Packaging Format), the OCF (Open Container Format), and the OPS (Open Publication Structure). The OPS referenced a DAISY standard for the NCX file. The new 3.0 standard has four defining documents with new names. The OPF becomes the ePub Publications standard. The OCF remains the same, and the OPS, which received the most changes, becomes the ePub Content Documents standard. That document also defines the Navigation Document, which replaces the old NCX specification. A fourth document covers Media Overlays, a new feature of ePub version 3.

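The root package element of the OPF file must declare version 3.0 and name the unique identifier, for example:

<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid" xml:lang="en">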

Open Container Format

An ePub file continues to be a self-contained document with everything held in one zip file. Version 2.0 did not define how links could reach resources outside the zip file; 3.0 addresses this omission by limiting outside access to specific cases. The OCF specifies the following files, all of which live in the META-INF directory.

  • container.xml [required] Identifies the file that is the point of entry for each embedded Publication.
  • signatures.xml [optional] Contains digital signatures for various assets.
  • encryption.xml [optional] Contains information about the encryption of Publication resources. (This file is required if font obfuscation is used.)
  • metadata.xml [optional] Used to store metadata about the container.
  • rights.xml [optional] Used to store information about digital rights.
  • manifest.xml [allowed] A manifest of container contents as allowed by Open Document Format
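
As a rough illustration of the container layout, the following Python sketch writes a skeleton zip with a container.xml pointing at a package file. It is not a complete, valid book. The path OEBPS/content.opf and the function name build_container are assumptions made for the example. The mimetype entry is not mentioned above but is part of the OCF: it must be the first entry in the archive and must be stored uncompressed.

```python
import io
import zipfile

# Template for META-INF/container.xml; {path} is filled in below.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="{path}" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_container(package_path="OEBPS/content.opf", package_xml="<package/>"):
    """Return the bytes of a skeleton ePub zip (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # mimetype goes first and is stored without compression.
        archive.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # container.xml identifies the entry point of the publication.
        archive.writestr("META-INF/container.xml",
                         CONTAINER_XML.format(path=package_path),
                         compress_type=zipfile.ZIP_DEFLATED)
        # Placeholder package document; a real one needs metadata,
        # manifest, and spine.
        archive.writestr(package_path, package_xml,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

The optional files (signatures.xml, encryption.xml, and so on) would be added to META-INF in the same way when they are needed.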

Publications standard

The version 2.0.1 standard for OPF remains basically intact with a few additions. A dcterms:modified property has been added; combined with the unique identifier, it gives a consistent way to identify a particular release of a publication. The metadata elements have been expanded so that descriptions can target specific portions of a document as well as the whole document. A new link element can be used to reference external metadata sources. A new properties attribute on manifest items describes publication resources, for example marking the Navigation Document.
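
A minimal sketch of producing a dcterms:modified value in Python follows. The value is a UTC timestamp of the form CCYY-MM-DDThh:mm:ssZ. The function names are hypothetical, and the "@" separator used to join the unique identifier and the modification date is how the EPUB 3 Publications specification describes the combined identifier.

```python
from datetime import datetime, timezone


def dcterms_modified(moment=None):
    """Format a time as CCYY-MM-DDThh:mm:ssZ (UTC) for dcterms:modified.

    A naive datetime is interpreted as local time by astimezone().
    """
    moment = moment or datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def release_identifier(unique_id, modified):
    """Join the unique identifier and the modification date with '@'."""
    return f"{unique_id}@{modified}"
```

For example, dcterms_modified(datetime(2015, 12, 19, 19, 23, 7, tzinfo=timezone.utc)) yields the string 2015-12-19T19:23:07Z.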

The Content Documents

This is the biggest area of change.

  • HTML5, in its XHTML (XML) serialization, has been adopted as the content format. DTBook is no longer supported as an option for ePub. HTML5 allows constructions that are not XHTML (XML) compliant; these are not permitted. There are also ePub extensions to HTML5.
  • SVG documents can now appear in the spine. They no longer have to be inside an XHTML document. Not all of SVG is supported and the constructions must be XML compliant.
  • MathML is now a supported format.
  • Semantic Inflection - the epub:type attribute attaches structural meaning to elements. Headings and the like have no fixed global meaning but depend on where in the file they appear.
  • Content switching was introduced in OPS 2 but is now simplified.
  • The <!DOCTYPE html PUBLIC... statement near the top of the file is no longer allowed.
  • Named entities (entity references) other than the five predefined by XML are not supported unless they are defined in the document. For example:
<!DOCTYPE html [
   <!ENTITY nbsp "&#160;">
]>
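
An alternative to declaring entities is to replace them with numeric character references before packaging. The sketch below does this with the entity table from Python's standard library. The function name is hypothetical, and the regular expression is a simplification that does not try to skip CDATA sections or comments.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML can stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def numeric_entities(markup):
    """Replace HTML named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"
    return ENTITY_RE.sub(replace, markup)
```

With this, the string a&nbsp;b becomes a&#160;b, while &amp; is left alone.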

Navigation

A Navigation Document is required in ePub 3. It uses a navigation syntax defined for ePub 3: a human- and machine-readable grammar for publication-wide navigation information, based on the HTML5 nav element. Because it is an XHTML document, it is more like an inline TOC in a book. The Navigation Document can also be included in the spine, where it appears as an inline document. An example follows for a top level "toc":

<nav epub:type="toc" id="toc">
  <h1>Table of contents</h1>
  <ol>
    <li><a href="chap1.xhtml">Chapter 1</a>
      <ol>
        <li><a href="chap1.xhtml#sec-1.1">Chapter 1.1</a>
          <ol hidden="">
            <li><a href="chap1.xhtml#sec-1.1.1">Section 1.1.1</a></li>
            <li><a href="chap1.xhtml#sec-1.1.2">Section 1.1.2</a></li>
          </ol>
        </li>
        <li><a href="chap1.xhtml#sec-1.2">Chapter 1.2</a></li>
      </ol>
    </li>
    <li><a href="chap2.xhtml">Chapter 2</a></li>
  </ol>
</nav>

The table of contents is identified by an epub:type of "toc". There can be only one such nav element in a document. The items are in ordered lists that follow the flow of the document. All sections are expected to be present, although some can be suppressed from the visual display with the hidden attribute, as shown for the third level.
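
The nested ordered lists can be generated from a simple tree of entries. The following Python sketch assumes each entry is a (title, href, children) tuple; that structure and the function names are inventions for the example, not part of any ePub tool.

```python
from xml.sax.saxutils import escape, quoteattr


def render_ol(entries, indent=1):
    """Render (title, href, children) tuples as lines of a nested <ol>."""
    pad = "  " * indent
    lines = [f"{pad}<ol>"]
    for title, href, children in entries:
        link = f"<a href={quoteattr(href)}>{escape(title)}</a>"
        if children:
            # A nested list sits inside the <li> of its parent entry.
            lines.append(f"{pad}  <li>{link}")
            lines.extend(render_ol(children, indent + 2))
            lines.append(f"{pad}  </li>")
        else:
            lines.append(f"{pad}  <li>{link}</li>")
    lines.append(f"{pad}</ol>")
    return lines


def render_toc_nav(entries, heading="Table of contents"):
    """Wrap the rendered list in a nav element of type toc."""
    lines = ['<nav epub:type="toc" id="toc">', f"  <h1>{escape(heading)}</h1>"]
    lines += render_ol(entries)
    lines.append("</nav>")
    return "\n".join(lines)
```

The result is only the nav fragment; it still has to be placed in an XHTML document that declares the epub namespace prefix.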

Note that a toc.ncx file (NCX) is still permitted in a publication to provide backward compatibility for existing readers.

Page numbers

There is also a navigation item that will provide page numbers mapping to a hardcopy book. The page-list nav element (<nav epub:type="page-list">) is a container for pagination information. It provides navigation to positions in the Publication content that correspond to the locations of page boundaries present in a print source being represented by this ePub Publication. Its form looks just like the TOC form shown above except:

  • The page-list nav element should contain only a single <ol> descendant (i.e., it should be a flat list, not a nested structure of navigation items).
  • The order of <li> elements contained within a page-list <nav> structure must match the order of the actual pages inside each targeted ePub Content Document and must also follow the order of Content Documents in the Publication spine.
  • The page-list <nav> element is optional in ePub Navigation Documents and must not occur more than once.
  • The hidden attribute can be used to suppress showing the list in the normal flow.

Here is an example of page number navigation.

<nav epub:type="page-list" hidden="">
    <h2>Pagebreaks of the print version, third edition</h2>
    <ol>
        <li><a href="frontmatter.xhtml#pi">I</a></li>
        <li><a href="frontmatter.xhtml#pii">II</a></li>
        <li><a href="chap1.xhtml#p1">1</a></li>
        <li><a href="chap1.xhtml#p2">2</a></li>
    </ol>
</nav>

The page-list nav element corresponds to the pageList element in the superseded NCX. [OPF2]

Note: The dc:source [Publications30] element provides a means of identifying the source publication to which the given pagination information applies.
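
Two of the page-list rules above (at most one page-list nav, and a single flat list inside it) are easy to check mechanically. This Python sketch assumes the Navigation Document is well-formed XHTML with the usual namespace declarations; the function name check_page_list is hypothetical, and the check does not verify page order.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
EPUB = "http://www.idpf.org/2007/ops"


def check_page_list(nav_document):
    """Return a list of problems with the page-list nav (empty if none)."""
    root = ET.fromstring(nav_document)
    # epub:type may hold several space-separated values.
    navs = [nav for nav in root.iter(f"{{{XHTML}}}nav")
            if "page-list" in nav.get(f"{{{EPUB}}}type", "").split()]
    problems = []
    if len(navs) > 1:
        problems.append("page-list nav occurs more than once")
    for nav in navs:
        # A flat list means exactly one <ol> anywhere below the nav.
        if len(list(nav.iter(f"{{{XHTML}}}ol"))) != 1:
            problems.append("page-list should contain a single, flat ol")
    return problems
```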

Other nav elements

There are several other navigation elements defined in ePub 3. Nav elements can reference other nav elements in secondary lists such as the 'lot' list of tables or 'loi' list of illustrations (see Figure) which are identified using the epub:type attribute. For multimedia there could also be 'loa' list of audio and 'lov' list of video.

The landmarks element is used to identify and point to these other elements. There can only be one landmarks entry in the document. Its use is optional. An example is shown below.

<nav epub:type="landmarks">
    <ol>
        <li><a epub:type="toc" href="#toc">Table of Contents</a></li>
        <li><a epub:type="loi" href="content.html#loi">List of Illustrations</a></li>
        <li><a epub:type="bodymatter" href="content.html#bodymatter">Start of Content</a></li>
    </ol>
</nav>

Linking

Several methods of linking are planned.

  • EPUB CFI (Canonical Fragment Identifier) is a newly defined linking scheme. It provides a method of linking to a location inside an ePub from somewhere outside the document. It uses ids defined in the document as well as counting of locations, so it requires specific knowledge of the content as seen in the code view.
  • CSS linking gains alternate style sheets, selected by the class attribute of the link element. Permitted values include vertical, horizontal, day, and night.
<link rel="stylesheet" href="horizontal.css" class="horizontal"/>

Scripting

Scripting is now supported with specific limitations to make it robust even if the reading application doesn't support it.

CSS

EPUB 3 defines a profile of CSS based on CSS 2.1 with added modules from CSS3. All of CSS 2.1 is applicable, with the following exceptions and notes:

  • The fixed value of the position property is not part of the EPUB 3 CSS Profile.
  • The direction and unicode-bidi properties must not be included in an EPUB Style Sheet.
  • Language should not be set in CSS; it should be set in the HTML5 markup.
  • The CSS must be UTF-8 or UTF-16 encoded.
  • ePub extensions to CSS are prefixed with -epub- (instead of the oeb- prefix used in ePub 2.0.1).

CSS3 specific items include:

  • The EPUB 3 CSS Profile includes @font-face rules as defined in the CSS Fonts Module Level 3, with the following descriptors:
    • font-family
    • font-style
    • font-weight
    • src
    • unicode-range
  • The CSS3 Speech module is used, with the following prefixed properties:
    • -epub-cue
    • -epub-pause
    • -epub-rest
    • -epub-speak
    • -epub-speakability
    • -epub-voice-family

Embedded Fonts

Fonts can be embedded with or without obfuscation. Support for OpenType and WOFF fonts is required.

Media Overlays

HTML5 adds video and audio elements directly but there is additional support in ePub 3 with a defined format and processing model for publication-wide synchronization of text and audio. Multiple features to assist Text-to-Speech (TTS) engines have been added. This support extends to read aloud capability using recorded audio files.

Systems that support audio playback must support MP3 audio and should support MP4 AAC LC audio. SMIL is required to control the audio insertion and synchronize the data. See SMIL for more details.

Any ePub Content Document associated with a Media Overlay may contain embedded media such as video, audio, and images. The Media Overlay text element may be used in such instances to reference the embedded media by its ID.

For a Media Overlay to work, the overlay documents and the audio files they point to must be listed in the manifest. The reading system must show the appropriate page on the screen while the audio is playing. The user can navigate the document normally and the appropriate media will automatically keep pace. An ePub file may also contain audio or video elements that are not part of a Media Overlay. These are not synchronized, and the user interface should provide media controls for them.
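
To give a feel for what a Media Overlay document contains, the sketch below builds a small SMIL file in Python. Each par element pairs a text fragment with an audio clip. The file names, ids, and the build_overlay helper are made up for the example, and only the basic par, text, and audio elements are shown; consult the Media Overlays specification for the full vocabulary.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_overlay(clips):
    """Build a SMIL document from (text_src, audio_src, begin, end) tuples."""
    # Serialize SMIL elements without a prefix (this setting is global).
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for number, (text_src, audio_src, begin, end) in enumerate(clips, start=1):
        # One <par> synchronizes one text fragment with one audio clip.
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par{number}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}text", {"src": text_src})
        ET.SubElement(par, f"{{{SMIL_NS}}}audio",
                      {"src": audio_src, "clipBegin": begin, "clipEnd": end})
    return ET.tostring(smil, encoding="unicode")
```

A call such as build_overlay([("chap1.xhtml#s1", "chap1.mp3", "0s", "10s")]) produces one par element whose text child points at the fragment s1 in chap1.xhtml.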

Available Readers

Available Publishing Tools

  • IGP Digital Publisher - for Windows
  • RoboHelp - from Adobe for Windows
  • BlueGriffon - based on the BlueGriffon WebEditor is a WYSIWYG EPUB editor and reader. Available for Windows, Mac, and Linux. Does both ePub 3 and ePub 2.
  • ePubSTAR - converts from Word, TXT, CHM to ePub 2 or ePub 3. For Windows only. Has a recommended schedule of donations.
  • Pubcoder.com - For OS X. Fixed-format ePub 3, KF8 and "Android app". Optimized for Apple iBooks, Kobo, Readium, Azardi, Gitden Reader, Google Play Books and Kindle KF8.
  • ViewPorter - Targeted at ePub 2 and ePub 3. Includes check tools. Looks to be a spinoff of Sigil. For Windows and Mac. Seems to be focussed on fixed-format EPUB3 in recent versions.
  • KITABOO publishing - creates interactive digital content with multimedia. Has cloud-based technology to securely publish and distribute eBooks on all mobile platforms and devices, with analytics.
  • 3D Issue - an eBook creator tool that converts content into ePub 3 or Kindle publications. Content can be imported from Word documents and PDFs or pasted in, and the resulting eBooks can be uploaded and pushed to readers on their eReader devices.
  • oXygen XML Editor - the oXygen XML suite comes in three versions: Editor (the full package); Author (focuses on authoring documents in XML format); and Developer (focussed on XML editing, schema development, and XSLT editing and debugging). Both the Editor and Author editions support the creation and editing of EPUB 2 and EPUB 3 documents.
  • eXeLearning.net XHTML & HTML5 Editor - eXeLearning is a free / libre software tool under GPL-2 that can be used to create educational interactive web content. eXeLearning can generate interactive contents in XHTML, HTML5 and ePub3 format. It allows you to create easily navigable web pages including text, images, interactive activities, image galleries or multimedia clips.
  • Sigil has a plugin, available for version 0.8.2 or later, that can convert an ePub 2 to ePub 3.
  • Pandoc can convert many input formats to ePub 3 (or ePub 2), including markdown.

Convert ePub 2 to ePub 3

sketchytech has a complete description of lessons learned. Note that since most eBook readers can also parse ePub 2, you may get away with not doing all of these things, particularly if you want the eBook to also be readable on an ePub 2 compatible device. An ePub 2 device is unlikely to recognize the version number, so it would be ignored.

  • Basically the OPF package must claim it is an ePub 3 file: <package version="3.0" ...
  • Remove the text xmlns:opf="http://www.idpf.org/2007/opf" from the metadata tag.
  • Change the identifier tag to the new format
    • from <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:56ca9730-7e4a-446b-962b-74db6533d168</dc:identifier>
    • to <dc:identifier id="uid">56ca9730-7e4a-446b-962b-74db6533d168</dc:identifier>
    • the unique-identifier attribute of the package element must name the same id (here uid)
  • Change the role tag
    • from <dc:creator opf:role="aut">Nathaniel Stern</dc:creator>
    • to <dc:creator id="aut">Nathaniel Stern</dc:creator>
    • the id alone does not carry the role; to keep it, add <meta refines="#aut" property="role" scheme="marc:relators">aut</meta>
  • Add a dcterms:modified entry to show the modification date
    • <meta property="dcterms:modified">2015-12-19T19:23:07Z</meta>
  • If you remove the TOC.ncx file, you also need to remove the references to it from the OPF file, including the toc attribute in <spine toc="ncx">. (There is no need to remove the TOC.ncx.)
  • You will need a new TOC referenced as: <nav epub:type="toc" id="toc">
    • See above for an example of a nav element for the TOC. It can also appear in the document as an inline TOC, which is a plus for this format.
    • The file needs to be referenced in the manifest section
    • <item href="content.xhtml" id="nav" media-type="application/xhtml+xml" properties="nav" />
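
Some of the steps above can be sketched with Python's ElementTree. This is a partial illustration under stated assumptions, not a full converter: the function name upgrade_package and its parameters are hypothetical, the nav file is assumed to already exist at nav_href, and the sketch simply drops opf:scheme and opf:role attributes rather than rewriting the identifier or preserving the role.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
DC = "http://purl.org/dc/elements/1.1/"


def upgrade_package(opf_xml, modified, nav_href="content.xhtml"):
    """Apply a few ePub 2 to ePub 3 changes to an OPF document string."""
    # Keep the usual prefixes when serializing (these settings are global).
    ET.register_namespace("", OPF)
    ET.register_namespace("dc", DC)
    package = ET.fromstring(opf_xml)
    package.set("version", "3.0")

    metadata = package.find(f"{{{OPF}}}metadata")
    for element in metadata.iter():
        # Remove ePub 2 attributes such as opf:scheme and opf:role.
        for name in [n for n in element.attrib if n.startswith(f"{{{OPF}}}")]:
            del element.attrib[name]

    # Add the required modification date.
    stamp = ET.SubElement(metadata, f"{{{OPF}}}meta",
                          {"property": "dcterms:modified"})
    stamp.text = modified

    # Reference the Navigation Document in the manifest.
    manifest = package.find(f"{{{OPF}}}manifest")
    ET.SubElement(manifest, f"{{{OPF}}}item", {
        "href": nav_href, "id": "nav",
        "media-type": "application/xhtml+xml", "properties": "nav"})
    return ET.tostring(package, encoding="unicode")
```

Once no opf: attributes remain, the serializer no longer emits the xmlns:opf declaration, which covers the step of removing it from the metadata tag.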

For more information

ePub 3 conversion services
