EpubCheck

From MobileRead

EpubCheck is the official checking tool for ePub files. It currently checks both ePub 2 and ePub 3. It does not check CSS.

Overview

EpubCheck is a Java tool and is normally run from the command line through the Java launcher. You can also write a simple script (batch file) that invokes Java and runs EpubCheck, so that you can drag an ePub onto the script's icon. A GUI front-end is also available for download.

Syntax

java -jar epubcheck-3.0.1.jar singleFile [-mode MODE] [-v VERSION]
   MODE must be one of the following:
       opf for package document validation;
       nav for navigation document validation (available only for version 3.0);
       mo for media overlay validation (available only for version 3.0);
       xhtml;
       svg;
       exp for Expanded EPUB validation (see next section)

   VERSION must be one of
       2.0
       3.0

Note that when validating a single file, only a subset of the available tests is run. Also, when validating a full EPUB, both mode and version are ignored.

Expanded mode

java -jar epubcheck-3.0.1.jar folder/ -mode exp [-save]

When using expanded mode, there's an optional flag -save to save the created archive upon validation.

Additional options

   -out file.xml outputs an assessment XML document
   -quiet or -q produces output only if there are warnings or errors
   -help, --help or -? displays a help message
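
The invocations described in this section can be wrapped in small shell functions, as in the sketch below. It is only an illustration: the function names and the JAVA and EPUBCHECK_JAR variables are hypothetical helpers and not part of EpubCheck, and the file names in the usage lines are placeholders. Only the options listed above are used.

```shell
#!/bin/sh
# Sketch only: JAVA and EPUBCHECK_JAR are hypothetical shell variables,
# not settings that EpubCheck itself reads.
JAVA="${JAVA:-java}"
EPUBCHECK_JAR="${EPUBCHECK_JAR:-epubcheck-3.0.1.jar}"

# Validate a complete EPUB. -mode and -v are ignored for a full EPUB,
# so they are left out.
check_epub() {
    "$JAVA" -jar "$EPUBCHECK_JAR" "$1"
}

# Validate a single file: $1 = file, $2 = MODE, $3 = VERSION.
check_single() {
    "$JAVA" -jar "$EPUBCHECK_JAR" "$1" -mode "$2" -v "$3"
}

# Validate an unpacked EPUB folder and keep the archive created for it.
check_expanded() {
    "$JAVA" -jar "$EPUBCHECK_JAR" "$1" -mode exp -save
}

# Validate a complete EPUB and write the assessment XML document to $2.
check_with_report() {
    "$JAVA" -jar "$EPUBCHECK_JAR" "$1" -out "$2"
}
```

After sourcing the file, typical calls might be check_epub book.epub, check_single content.opf opf 3.0, check_expanded book/ or check_with_report book.epub report.xml.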

Simple Script

Here is a Windows batch file (easily modified for Unix). Adjust the Java path to match your own installation:

"C:\Program Files (x86)\Java\jre7\bin\java.exe" -jar epubcheck.jar %1 > errors.txt 2>&1
START type errors.txt

To use this script, download EpubCheck and place the script in the folder where the EpubCheck jar is located, then rename the jar file to simply epubcheck.jar. See the MobileRead forum for the discussion.
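
For Unix, the batch file above could be adapted roughly as follows. This is a sketch under stated assumptions: java is on the PATH (or the hypothetical JAVA variable points to it), the jar has been renamed to epubcheck.jar as described, and check_and_show is an illustrative name only.

```shell
#!/bin/sh
# Unix sketch of the batch file above; check_and_show and JAVA are
# hypothetical names, not part of EpubCheck.
JAVA="${JAVA:-java}"   # assumes java is on the PATH unless JAVA is set

# $1: folder that holds epubcheck.jar, $2: the ePub to check
check_and_show() {
    # Send both normal output and error messages to errors.txt
    "$JAVA" -jar "$1/epubcheck.jar" "$2" > errors.txt 2>&1
    # Show the results (the batch file opens them in a new window instead)
    cat errors.txt
}

# Example call from a script stored next to the jar:
#   check_and_show "$(dirname "$0")" "$1"
```

As with the batch file, errors.txt is written to the current directory and overwritten on each run.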

Downloads

Web based

Releases

4.2.4

Features

  • update HTML schemas from the HTML Checker
  • downgrade PKG-012 (non-ASCII filenames) to USAGE
  • downgrade RSC-004 (cannot decrypt resource) to INFO
  • report empty title elements in XHTML Content Documents
  • ARIA: allow doc-epigraph on 'section' and doc-cover on 'img'
  • update the XML output to the new JHOVE schema

Improvements

  • improve reporting of invalid URL host parts
  • harmonize quotes usage in messages
  • add an Automatic-Module-Name entry to the jar manifest
  • deps upgrade commons-compress to v1.20 to remediate
  • deps upgrade guava to v24.1.1 to remediate
  • Will flag empty tags.

4.2

Features

  • add new 'voicing' link relationship (97e9f1c)

Bug Fixes

  • allow any role on 'a' elements with no href (b9ed8f6), closes #1022
  • check trailing spaces in mimetype file (123c69f)
  • remove restrictions on MathML annotation-xml (8a1b650), closes #1024
  • report ZIP checks after the 'Validating…' message (73b0ee8), closes #1025

Localization

  • update localized messages for Danish, French, German, Italian, Japanese, Korean, and Spanish.

4.1

Bug Fixes

  • silence a Saxon warning (Schematron XSLT) (5045d78b), closes #859
  • fix path resolution in EpubNCXCheck (ctc package) (f572a861)
  • handle IllegalStateException in NCX checker (25336894), closes #666
  • check that the mimetype file is uncompressed (6764e250), closes #303
  • fix wrong exit message for single file validation (68af5a9a), closes #740
  • allow ARIA role attributes in SVG (49412e05), closes #769
  • allow empty xml:lang attributes (392c2f68), closes #777
  • handle no src uri in fonts, correct embedded font boolean in the XML output (a26f9c13), closes #773
  • fix issues with landmarks checks ACC-008 (74d0bdd1), closes #457, #734
  • fix focus issue when using EPUBCheck in a GUI app (cd63a166), closes #665
  • fix incorrect warning ACC_011 (5e6a69af), closes #680
  • make the type attribute optional on SVG style elements (275f6b6a), closes #688
  • exit with error when directory is not found in expanded mode (e42d189c), closes #525
  • fix a NullPointerException when checking an empty meta rendition element in OPF (42d75297), closes #727
  • fix DefaultReportImpl to avoid duplicate path info in message locations (9321355b), closes #729
  • fix broken OPF_060 and OPF_061 message format (9f0e7d12), closes #658
  • fix broken OPF_060 and OPF_061 checks for duplicate ZIP entries (05e96f40), closes #728

Features

  • allow the configuration of EPUBCheck’s locale (9b249956), closes #650, #498
  • report invalid dc:identifier UUIDs (as WARNING) (48800a04), closes #853
  • change --version and -version command line options to output EPUBCheck version (e498c61d), closes #743
  • check files with extensions other than .epub (1b67e046), closes #490
  • report file:// URL as INFO (8f7a2b7d), closes #289
  • improve messages for OPF-058 and OPF-059 (5e33645e), closes #804
  • enable NCX_001 check also for EPUB 3 when an NCX file is present (9715c352)
  • report non-matching identifiers in OPF and NCX as an error again (515682dc)
  • improve CSS font size validation (25c0b372), closes #529
  • issue a WARNING when landmarks anchors are not unique (557308ef), closes #493
  • issue a WARNING when guide/reference elements are not unique (25f28c01), closes #493
  • partial update of OPF 2.0 RelaxNG schema to latest version (changing datatype text to anyURI for href attributes) (251aa936), closes #725
  • display error/warning count in EPUBCheck results (b7babedf), closes #655
  • add file path info in uri attributes of the XML report (c958c117), closes #540
  • update the XHTML 1.1 RelaxNG schema to latest version (4c6fb49a)
  • update the OPF20 RNG schema in sync with official schema to validate empty guide elements (6540b03d)
  • report an ERROR when @clipBegin equals @clipEnd in SMIL Media Overlays (00716768), closes #568
  • improve Nav Doc validation (d32de854), closes #763, #759
  • update the NCX RelaxNG schema to add fixed list of pageTarget type values (b2c9e939), closes #761
  • improve URL checks (a44a596b), closes #708
  • rephrase messages RSC-005, RSC-016, RSC-017 (5ef44973)
  • add JHove XSD schema declaration in XML output (e55039c9), closes #736
  • add detailed resource info in RSC-008 messages (5f5ef7b7), closes #720
  • add detailed resource info in RSC-007 messages (71a76ee4), closes #475

Maintenance

  • change the project name to 'EPUBCheck' (dfd7fd27)
  • update the minimum source code compatibility to Java 1.7 (9b249956)
  • update the Saxon dependency to v9.8 (bf10f380)
  • update the Apache commons-compress dependency to v1.18 (e7dfedd8)
  • update the Google Guava dependency to v24.0 (befd9fc3)
  • update the continuous integration build matrix, now testing from Java 7 up to Java 11 (fb84b23c)
  • various translation updates (39a9a093, 6e3a8b41)

4.0.2

Enhancements

  1. 673 – Enhanced XML report output:
  2. 486 – @subMessage and @severity attributes on <message> element
  3. 517 – Include list of all resources + media types
  4. 670 – Fix illegal characters in XML output
  5. 657 – New method Archive.createArchive(File) to specify file paths when using this in 3rd party tools

Bug fixes

  1. Security fix for the critical vulnerability CVE-2016-9487, an XML external entity (XXE) flaw that let a specially crafted EPUB read arbitrary files from the machine running the validation.
  2. 689 – Fix for unclosed ImageInputStreams on image file validation
  3. 678 – Clarify ACC-009 message: 'alt' -> 'alttext' attribute
  4. 686 – Make BitmapChecker.ImageHeuristics a public object
  5. 711 – Bugfix for false positive error messages due to locale settings

4.0

The pre-release of v4.0 brought support for ePub 3.0.1 and initial support for the EDUPUB profile.

Changes related to EPUB 3.0.1

  • Package files (OPF)
    • new collection element
    • multiple dc:type elements are now allowed
    • multiple dc:source elements are now allowed
    • allow "record" as value of link rel attribute (requires media-type set)
    • new belongs-to-collection and collection-type metadata properties
    • new media:playback-active-class metadata property
    • new source-of metadata property
    • new rendition:* metadata properties
    • new reserved prefix schema for schema.org vocabularies
    • improved prefix declaration parsing
  • XHTML Content Documents
    • RDFa and Microdata attributes are now allowed and checked for correctness
    • improved prefix declaration parsing
    • improved checking of epub:type attribute values
    • allows custom (namespaced) attributes on any element
    • new triggers ev:defaultAction, ev:phase and ev:propagate
    • new attribute aria-describedat
    • requires xhtml extension on files.
  • SVG Content Documents
    • the epub:type attribute is now allowed on any element
  • Structural semantics vocabulary
    • new term assessment
    • new term learning-objective
    • new term learning-resource
    • new term loa
    • new term lov
    • new term qna
    • new term revision-history

Changes related to EDUPUB

  • Identification of EDUPUB content from the dc:type edupub
  • Custom OPF checks for EDUPUB metadata rules
  • Support for distributable-object and manifest collections
  • checks for epub:type semantics
  • checks for headings-related rules

Changes to EpubCheck's internals:

  • Early parsing of dc:type in the OPFData object
  • Possibility to set multiple XMLValidator (i.e. schemas) in most checkers
  • Revamped prefix attribute parsing
  • New API for representing vocabularies and property-datatype values.

For more information
