E-book conversion

From MobileRead

Often the book we want to read on our ebook reader is in a format that is incompatible or just doesn’t look good. When this happens we have three choices: (1) forget the whole thing, (2) get another ebook reader, or (3) convert the file to something we can use. If you picked #3, then this page of the wiki is for you. If you can't find what you need here, you might have to resort to digitizing Paper Books to Ebooks.

For a handy quick reference see the Conversion matrix.

This section is divided into two areas – (1) software from commercial firms (including their freeware and shareware offerings) and (2) non-commercial software developed and distributed under a GNU or similar license.

Commercial eBook Conversion Utilities

Note: These are commercial applications; however, trial versions are available for many. Some are free.

Windows

  • ABC Amber CHM Converter - converts Windows CHM to most popular formats (commercial, trial version limited to 5 pages per book)
  • ABC Amber LIT Converter - converts Microsoft LIT to PDF + many others (freeware)
  • ABC Amber Palm Converter - converts Palm PDB and PRC to RTF, PDF, DOC, TXT, HTML, CHM, and others (freeware)
  • ABC Amber PDF Converter - converts Adobe Acrobat PDF to RTF, DOC, TXT, HTML, CHM, and others (commercial, trial available)
  • ABC Amber Rocket eBook Converter - converts Rocket eBook EB to RTF, PDF, DOC, TXT, HTML, CHM, and others (commercial, trial available)
  • ABC Amber Sony LRF Converter - converts Sony BBeB LRF format ebooks to RTF, PDF, DOC, LIT, TXT, HTML, CHM, and others (freeware)
  • ABC Amber Text Converter - converts almost every major (and many minor) text formats to RTF, PDF, LIT, RB, PDB, DOC, TXT, and many more file types (commercial, trial available)
  • Adobe Acrobat Pro - edits and converts (exports) PDF to jpg, png, tif, html, Word, RTF or xml (commercial, expensive)
  • DeskUNPDF - an all-purpose PDF unconverter to xls, jpg, png, tif, html, xhtml, odt or xml (trial version limited to 4 pages per book)
  • eBook Publisher - from eBook Technologies is the content conversion and publishing tool for creating professionally formatted content on the EBookwise-1150 and REB 1200 (IMP format). This free GUI-based tool runs under Windows and MacOS.
  • eBook Studio - from Palm Digital Media, it takes text, RTF, HTML, PML (Palm Markup Language), and OPF (Open eBook Package Format) and produces PDB files for all Palm Readers and for use with eReader and eReaderPro (reader software is free)
  • GEBLibrarian - by Steve Breen. GEB Librarian allows users of the Gemstar eBook device (REB 1100, REB 1200, EBookwise-1150) to create (from DOC, RTF, HTML, RB and TXT files) and download personal content to their eBook devices directly from their PCs. Also manages their own personal content via Bookshelves.
  • LIT Addin for MS Word - An add-in for MS Word 2002 and 2003 that produces LIT format output. (free)
  • Mobipocket Creator - from Mobipocket, it takes text, HTML, PDF and MS Word and produces Mobipocket PRC files for all Mobipocket readers, with optional DRM. (Freeware with all features enabled.)
  • Par is a package that reformats paragraphs. It is similar to fmt but much better for eBooks. Source and Windows binaries are available. Command line application.
  • Overdrive Readerworks is an authoring and conversion tool to create LIT format output. The standard version is free.
  • PDFCropper - an application designed to prepare normal-sized (A4, B4, C4, letter, etc.) PDFs for reading on relatively small devices (Sony Reader PRS500/PRS505, iRex Iliad, etc.).
  • pdfFactory - a pdf printer driver that converts any printable document into a PDF. (trial version available)
  • Primo PDF - a virtual printer, converts a web page into a PDF. Also works from most any source file (freeware)
  • Prince is a software package that converts HTML and XML to PDF. They have a free version for home use.
  • Readerette News Transferrer - converts RSS/Atom feeds, including blogs, to PDF for the Sony PRS-500 (commercial, trial available) - (site no longer works)
  • RoverSoft has text2image software (RTEXTasImage) to convert text for devices that can only display images. Devices include Zune, Image viewers, Archos video viewers, etc.
  • Stanza - eBook reader and converter from Lexcycle. Supports reading from Plain text, HTML, PDF, ePUB/OEB, Mobipocket, Amazon Kindle, MS LIT, MS Word, RTF, PalmDOC, and RAR compressed books. Can export to iPhone bookmarklets, Kindle, Mobipocket, HTML, epub, and PDF.
  • Adolix PDF Converter - PDF converter software, converts to PDF using a virtual printer. PDF pages can be exported to BMP, JPEG and TIFF. (shareware)

Macintosh

  • eBook Publisher - from eBook Technologies is the content conversion and publishing tool for creating professionally formatted content on the EBookwise-1150 and REB 1200. This free GUI-based tool runs under Windows and MacOS.
  • eBook Studio - from Palm Digital Media, it takes text, RTF, HTML, PML (Palm Markup Language), and OPF (Open eBook Package Format) and produces PDB files for all Palm Readers and for use with eReader and eReaderPro (reader software is free)
  • HTML to PDF - just use the "Save as PDF..." button in your browser's Print dialog box
  • Lit2html - see MacOSX Hints (freeware AppleScript wrapper for ConvertLit)
  • Prince is a software package that converts HTML and XML to PDF. They have a free version for home use.
  • Tubby - converts CHM to HTML (freeware)

Linux (also potentially Solaris or FreeBSD)

  • Prince is a software package that converts HTML and XML to PDF. They have a free version for home use.
  • Par is a package that reformats paragraphs. It is similar to fmt but much better for eBooks. Source is available.
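
As a rough illustration of what a paragraph reformatter such as Par or fmt does, the following Python sketch joins hard-wrapped lines into paragraphs and re-wraps them to a chosen width. It is not Par's algorithm; the function name `reflow` and the default width are assumptions made for this example, and it treats any blank line as a paragraph break.

```python
import textwrap


def reflow(text: str, width: int = 60) -> str:
    """Join hard-wrapped lines into paragraphs, then re-wrap to `width`.

    Hypothetical helper for illustration only; blank lines are treated
    as paragraph separators and indentation is discarded.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: part of the current paragraph.
            current.append(line.strip())
        elif current:
            # Blank line: close the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    # Re-wrap each paragraph and separate paragraphs with one blank line.
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


if __name__ == "__main__":
    sample = "It was the best of\ntimes, it was the\nworst of times.\n\nSecond paragraph\nhere.\n"
    print(reflow(sample, width=40))
```

A very large width effectively removes hard line breaks inside paragraphs, which is often what a reader device with its own text reflow needs.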

Non-Commercial eBook Conversion Utilities

Note: Some are works in progress, some require you to compile them, and some are commercial quality. Before using please read the posted MobileRead thread where available. At times there are other programs and utilities that must be loaded prior to using certain tools listed below. Note that some of these programs have their own content pages in this wiki.

  • BBeB Binder - creates BBeB (for Sony Reader only) from HTML. Requires Windows .Net 2.0 and Internet Explorer. Check out this discussion
  • BBeBook - a Java port of Make LRF. Supported input is HTML and PDF and the output creates PNG for the images
  • BookCreator Tool is an eBook creator tool. This tool is an MS Word template with VBA macro code.
  • Book Designer - a Windows only conversion tool (Requires MS Word) that can output LRF, IMP, MOBI and several other formats.
  • Calibre - by Kovid Goyal, utility for working with ebooks including file transfer to the SONY Reader. Uses Python. For Windows/Linux/OSX. Can convert HTML, TXT, RTF, LIT and PDF files to LRF. Also has utilities to download websites and automatically convert them to LRF.
  • ComicLRF - Comic Book (CBR/CBZ) one step converter to BBeB (LRF).
  • deimp.exe by Nick Rapallo. The text decompressor (extractor) for .imp files. Version 0.1 can extract just the basic text from within any non-DRM'ed compressed .imp file.
  • Docudesk PRS Browser for OS X: Free Mac OS X tool for manipulating files on the Sony Reader.
  • eCub - a simple to use EPUB and MobiPocket ebook creator
  • EditLRFmeta - edits LRF metadata such as "Author" and "Title"
  • FB2 to LRF Converter - batch mode fast FB2 to LRF converter. Windows GUI.
  • fixLRF - fixes errors in LRF files created with DLL from Book Creator (XYLogParser.Dll)
  • Flat LRF - designed to convert a multilevel web site into a single LRF file. Uses Java
  • Formatting Gutenberg Texts - designed for OpenOffice and MS Word to convert the text files from Project Gutenberg to RTF or PDF (OpenOffice only)
  • Gutclean - A wordwrap utility for Project Gutenberg text files.
  • GuteBook by Nick Rapallo - Windows GUI & Perl script that converts a Project Gutenberg and PG Australia ebook referenced by its EText-No. and/or URL directly into various .EPUB/.LRF/.MOBI/.LIT/.IMP/.RB formats simultaneously.
  • GutenMark - Produces formatted HTML from Project Gutenberg text files.
  • GUTLRF - uses HTML2LRF (in calibre) to create Sony Reader BBeB (LRF) from Project Gutenberg HTML (or text) ETexts.
  • HTML2LRF - discussion thread - creates Sony Reader BBeB (LRF) from HTML.
  • html2pml by Shu Ning Bian. Basic conversion of HTML to PML for use with DropBook.
  • Imp Librarian, by L. Landwehr. Catalogs a collection of .IMP files and produces .csv files for import into MS Excel or equiv.
  • Ishtar by Yves Sagnier. Command line tool to convert HTML to RTF. Available in 16 bit MS-DOS and 32 bit Windows console versions. Freeware. (See Martha below)
  • JAP - an image-book creation program that produces readable files for some popular e-book reading devices (Reb-1200/GEB-2150, Sony Reader/Librie, Hanlin V8, PPC) from PDF and DjVu files, or from a set of pictures/scans
  • JE Comics Converter - converts JPG into PDF and optimizes it for viewing on a Sony Reader or iRex iliad. Requires Java. Alternative discussion
  • Libriate - a GUI for Make LRF, for Mac OSX only
  • LIT2SB - a conversion tool with detailed instructions on how to convert (many) Microsoft reader .LIT files to .IMP formats (for the EBookwise-1150 and/or REB 1200)
  • LIT2LRF - creates LRF from LIT. Converts LIT's html to a Xylog XML LRS file, then uses LRSParser to compile a LRF. Windows only
  • LRF2LRS - converts LRF to Xylog XML LRS. Written in Python
  • LRF Parser - parses an LRF file, dumps all tags and streams. Also check out the Yahoo Librie Group (requires registration)
  • LRS2LRF - converts LRS to LRF. Command line tool that uses a DLL from Book Creator (not included) to compile a Xylog XML (BBeB source format) (download access requires registration)
  • LRF Unpack - extracts data from LRF. Requires Microsoft .Net 2.0
  • LRS to LRF Converter - converts LRS to LRF. Does not use XYLogParser.dll from Book Creator. Now included in Book Designer. Works both for the Sony Librie and Sony Reader
  • Make LRF - the first conversion tool written for the Sony Librie, predates the Sony Reader. Also check out the Yahoo Librie Group (requires registration)
  • Martha by Yves Sagnier. Command line tool to convert RTF to HTML. Available in 16 bit MS-DOS and 32 bit Windows console versions. Freeware. (See Ishtar above)
  • MBP_reader - a program to export user notes from mobipocket mbp files, to plain text.
  • Mobi2IMP by Nick Rapallo - Windows GUI & Perl script to make IMP files from MOBI files as well as other things; also has a DOS executable and useful batch file.
  • Mobiperl is a collection of Perl scripts to manipulate MOBI files. There are also some compiled Windows binaries.
  • odt2pml - Free OpenOffice extension, converts Writer documents to Palm eReader format. WYSIWYG, within PML limitations.
  • ODF converter - An add-in for OpenOffice and Microsoft Office that allows the exchange of OpenOffice and Microsoft OpenXML files.
  • OOo FBTools - Free OpenOffice extension, converts Writer documents to FictionBook2 format. WYSIWYG.
  • PaperCrop converts a PDF to images. It removes borders and multiple columns to make viewing easier on portable devices.
  • pdflrf - Converts PDF, DJVU, and CBZ to LRF. Has controls to manage the thickness of the lines and can split PDF pages.
  • PDFRead is a tool for converting primarily PDF and DJVU documents for reading on eBook devices. It does this by creating an image out of each page, enhancing the image and then collating the images in a device-specific format (supports .IMP/.RB/.OEB/.LRF/.PRC/.HTML)
  • pielrf - Converts text/light html to lrf similar to makelrf, but feature-rich. Includes Table Of Contents in the Reader Menu, Headers, Curly Quotes, paragraph autoflow, etc. Python based for Mac OS X, Linux, Windows.
  • Plucker Desktop - Free, open source, cross-platform application to "pluck" HTML pages and create pdb files that can be read e.g. by the Plucker viewer, available for PalmOS and other platforms, and e.g. by FBReader.
  • Simplicissimus BookMaker Extension is a free OpenOffice extension that makes optimized PDF for iRex iLiad, Cybook Gen3, BeBook, Sony Reader, Hanlin and Readius starting from all OpenOffice Writer supported formats.
  • SoftSnow Merger is a utility to combine HTML files into one large file.
  • Tidy is one of several programs with tidy in their names. This one is often called HTML Tidy. Technically it is not a conversion program but rather a program to clean up HTML files.
  • UnRTF is a command-line program written in C which converts documents in Rich Text Format (.rtf) to HTML, LaTeX, PostScript, and other formats. Converting to HTML, it supports a number of features of Rich Text Format.
  • Web2Book formerly RSS2Book - by Geekraver and featured at MobileRead. Creates formatted Sony Reader PDFs, RTFs, LRFs or HTML from RSS feeds, HTML pages, Wikipedia entries, Project Gutenberg books, and other sources. Windows only (.Net 2.0 required). Can be configured for other page sizes. Can sync directly to Sony Readers. Supports a plugin architecture for sources, formats and target devices.
  • Word Macro for Formatting RTF - developed by Stingo and featured at MobileRead. Works within Microsoft Word
  • Xpdf - an open source viewer for PDF files, also includes a PDF text extractor, PDF-to-PostScript converter, and various other utilities.
  • yWriter is really a word processor for writing novels on Windows. It will import RTF files.
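
Several of the tools above take a single HTML file as input, so multi-file books often have to be merged first. The Python sketch below shows one simple way to do that: it pulls the contents of each document's body element and concatenates them into one file. It is only an assumption-laden illustration, not how SoftSnow Merger or any other listed tool works; `merge_html` and `BODY_RE` are names invented here, and head elements, stylesheets, and links between the files are ignored.

```python
import html
import re

# Matches the contents of the first <body> element, case-insensitively.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def merge_html(documents, title="Merged book"):
    """Concatenate the body contents of several HTML strings into one page.

    Hypothetical helper for illustration; a document without a <body>
    element is treated as a bare fragment and used as-is.
    """
    parts = []
    for doc in documents:
        match = BODY_RE.search(doc)
        parts.append(match.group(1).strip() if match else doc.strip())
    # Separate the original files with a horizontal rule.
    body = "\n<hr>\n".join(parts)
    return (
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>"
    )


if __name__ == "__main__":
    chapters = [
        "<html><body><h1>Chapter 1</h1><p>One</p></body></html>",
        "<html><body><h1>Chapter 2</h1><p>Two</p></body></html>",
    ]
    print(merge_html(chapters, title="Example"))
```

Running the merged result through a cleanup tool such as HTML Tidy before conversion is a reasonable follow-up step.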

Web Based

  • ePubNow! is a powerful XML Workflow for Publishers to convert to ePub format from HTML extracted from PDF files. ePubNow! Google Group provides important information on how to convert your precious content to industry-grade quality epub format ebooks.
  • Feedbooks - Supports on-the-fly generation of both epub (new e-book standard for Adobe DE and FBReader, also supported on the Sony PRS-505) and PDF (templates for A4, Sony Reader, iRex iLiad and custom). FeedBooks allows you to publish texts in the public domain and your own works, as well as RSS feeds and sudoku. Still in the beta stage, FeedBooks currently supports English and French. There is no charge for the site although registration is required to submit a book.
  • Smashwords Free publishing service allows authors and publishers to upload source manuscripts as Word .doc or RTF files, then converts the files to nine ebook formats: two online reading formats (HTML and Javascript), PDF, EPUB, MOBI, PDB, RTF and text (two versions). Authors/publishers set the price of the book.
  • mBook will convert any text and images online to a Java eBook (JAR) for mobile phones.
  • zinepal.com will convert web pages, blogs and Atom/RSS feeds to PDF, ePub and Mobipocket format files.

Java Based

  • JE-Comics will convert comic book images to PDF files.
  • lib2go.com converts documents to LRF format.

SDK

A Software Development Kit (SDK) lets software developers write tools that generate eBooks in a particular format.

The Microsoft SDK includes a DLL that lets Microsoft Visual Studio C++ developers generate LIT formatted eBooks. Microsoft also has tools for dictionary development, TTS and other capabilities. These are free downloads.

Conversion Services

  • Digital Media Initiatives (DMI) is a technology consulting and conversion house in the e-publishing domain. Based in several locations in India, Australia and the USA, it has developed e-publishing tools for authors and publishers. DMI also helps publishers implement XML workflows to automate their in-house production and content management, and assists in converting to various formats including Amazon Kindle, Mobipocket, ePub, eReader, and DocBook XML.
  • ePubNow! is an online epub and DocBook v5.0 XML production platform where, on registration, authors and publishers are given access to an online book authoring interface and automated publishing to the ePub format. Publishers can follow a planned workflow to extract HTML from print PDFs and convert it to DocBook v5.0 XML for content management, revision management, and publishing to a validated .epub format ebook.
  • eBook Architects - A provider of conversion services to individuals and publishers, with more than 6 years of eBook development and formatting experience. Specializes in creating eBooks for the Amazon Kindle, Mobipocket, ePub, Smashwords, eReader, and others.
  • Aptara - Aptara offers full service eBook solutions including eBook creation for both legacy portfolios and digital book publishing, along with a range of publishing and content management solutions tailored to customer-specific requirements. Aptara’s PowerXEditor™ publishing tool enables collaboration between content professionals, subject matter experts (SMEs) and reviewers for ebook creation and for web-ready and print content. Supported formats include ePub, Amazon's Kindle, Mobipocket, Zinio, Vital Source, and others from almost any input format. DRM for Adobe Digital Editions is also available.
  • eBook Conversion - It Global Solution is an ebook conversion service provider that converts PDF, Word documents, paperbacks, or any text format to widely used eBook formats such as Mobipocket, Microsoft Reader, Kindle, and ePUB. They can also do OCR, and can convert eBooks to any other available format according to customer requirements.
  • Apps publisher will publish eBooks primarily for mobile phones. They accept input in a variety of formats and generate an executable containing the eBook.

See Also

Best Conversion Practices

The purpose of this section is to present the tips, tricks, usage instructions, and best methods or practices for each package.

This is an evolving area and subject to rapid and dynamic change as the conversion tools themselves improve. While some of the entries may seem quixotic, what is old hat for some may be a revelation to others.
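
One practice that comes up repeatedly is chaining tools when no single utility converts directly between two formats. The Python sketch below shows how such a chain could be looked up with a breadth-first search over a small table of converters. The table is a hand-picked subset of the tools listed on this page, not the wiki's Conversion matrix, and the names `TOOLS` and `find_chain` are invented for this example. It ignores platform requirements, DRM, and output quality, all of which matter in practice.

```python
from collections import deque

# (input format, output format, tool) - an illustrative subset of this page.
TOOLS = [
    ("CHM", "HTML", "Tubby"),
    ("LIT", "HTML", "Lit2html"),
    ("LIT", "LRF", "LIT2LRF"),
    ("HTML", "LRF", "HTML2LRF"),
    ("HTML", "RTF", "Ishtar"),
    ("HTML", "PML", "html2pml"),
    ("RTF", "HTML", "Martha"),
    ("LRF", "LRS", "LRF2LRS"),
    ("LRS", "LRF", "LRS2LRF"),
    ("FB2", "LRF", "FB2 to LRF Converter"),
]


def find_chain(source, target, tools=TOOLS):
    """Return the shortest list of (from, to, tool) steps, or None.

    Hypothetical helper: breadth-first search guarantees the fewest
    conversion steps, which usually means the least formatting loss.
    """
    if source == target:
        return []
    graph = {}
    for src, dst, name in tools:
        graph.setdefault(src, []).append((dst, name))
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, path = queue.popleft()
        for nxt, name in graph.get(fmt, []):
            if nxt in seen:
                continue
            step = path + [(fmt, nxt, name)]
            if nxt == target:
                return step
            seen.add(nxt)
            queue.append((nxt, step))
    return None  # no chain of listed tools connects the two formats


if __name__ == "__main__":
    for src, dst, tool in find_chain("CHM", "LRF") or []:
        print(f"{src} -> {dst} using {tool}")
```

Each extra step in a chain is another chance to lose formatting, so a direct converter is generally preferable when one exists.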
