Pandoc
If you need to convert files from one markup format into another, pandoc is your Swiss Army knife.
Overview
Pandoc can convert documents in markdown, reStructuredText, Textile, HTML, DocBook, LaTeX, MediaWiki markup, OPML, or Haddock markup to:
- HTML formats: XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides.
- Word processor formats: Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML
- Ebooks: EPUB version 2 or 3, FictionBook2
- Documentation formats: DocBook, GNU TexInfo, Groff man pages, Haddock markup
- Outline formats: OPML
- TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
- PDF via LaTeX
- Lightweight markup formats: Markdown, reStructuredText, AsciiDoc, MediaWiki markup, Emacs Org-Mode, Textile.
- Custom formats: custom writers can be written in Lua.
Pandoc is a command-line tool.
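As a rough sketch of how the formats listed above map to command-line usage, the following converts one small Markdown file to a few of them. The file name notes.md is a placeholder, not something from this page, and the script assumes pandoc is installed. When -t is not given, pandoc infers the output format from the extension passed to -o.

```shell
# Create a small Markdown input file (notes.md is a placeholder name).
cat > notes.md <<'EOF'
# Notes

Some *emphasised* text.
EOF

# Run the conversions only if pandoc is on the PATH.
if command -v pandoc >/dev/null 2>&1; then
    pandoc notes.md -o notes.docx          # Word docx, format inferred from the extension
    pandoc notes.md -o notes.odt           # OpenDocument text, also inferred
    pandoc notes.md -t latex -o notes.tex  # LaTeX, writer named explicitly with -t
    pandoc notes.md -t rst -o notes.rst    # reStructuredText
fi
```

Without -s, the LaTeX and reStructuredText outputs are fragments rather than complete documents.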
Features
Pandoc understands a number of useful markdown syntax extensions, including document metadata (title, author, date); footnotes; tables; definition lists; superscript and subscript; strikeout; enhanced ordered lists (start number and numbering style are significant); running example lists; delimited code blocks with syntax highlighting; smart quotes, dashes, and ellipses; markdown inside HTML blocks; and inline LaTeX. If strict markdown compatibility is desired, all of these extensions can be turned off.
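The effect of these extensions can be seen by converting the same input with and without them. The sketch below assumes a pandoc version that accepts the markdown_strict reader name and the format-extension syntax for switching a single extension off (older releases used a separate strictness option instead). The file names are placeholders.

```shell
# A short input that uses three pandoc extensions:
# subscript (~2~), strikeout (~~...~~) and a footnote ([^1]).
cat > ext.md <<'EOF'
Water is H~2~O, and this is ~~struck out~~.[^1]

[^1]: A footnote.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Pandoc's own markdown: all of the extensions above are active.
    pandoc ext.md -f markdown -t html -o ext-pandoc.html

    # Strict markdown: the extensions are turned off.
    pandoc ext.md -f markdown_strict -t html -o ext-strict.html

    # Pandoc's markdown with only the footnotes extension disabled.
    pandoc ext.md -f markdown-footnotes -t html -o ext-nofootnotes.html
fi
```

Comparing the three HTML files shows which constructs each reader recognises and which it passes through as plain text.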
Pandoc includes a powerful system for automatic citations and bibliographies, using pandoc-citeproc, which is based on Andrea Rossato’s citeproc-hs.
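A minimal sketch of a citation run follows. The bibliography entry is a made-up placeholder, and refs.bib and paper.md are hypothetical file names. The script uses pandoc-citeproc as a filter when it is installed. Otherwise it assumes a newer pandoc release in which citation processing is built in and enabled with --citeproc.

```shell
# A placeholder BibTeX entry; the book and author are invented for illustration.
cat > refs.bib <<'EOF'
@book{doe2020,
  author    = {Doe, Jane},
  title     = {An Example Book},
  publisher = {Example Press},
  year      = {2020}
}
EOF

# A document that cites the entry by its key.
cat > paper.md <<'EOF'
Citation keys such as [@doe2020] are resolved against the bibliography.

# References
EOF

if command -v pandoc >/dev/null 2>&1; then
    if command -v pandoc-citeproc >/dev/null 2>&1; then
        # pandoc-citeproc runs as a filter over the parsed document.
        pandoc paper.md --filter pandoc-citeproc --bibliography refs.bib -o paper.html
    else
        # Assumption: a newer pandoc release with citation support built in.
        pandoc paper.md --citeproc --bibliography refs.bib -o paper.html \
            || echo "this pandoc build has no citation support available" >&2
    fi
fi
```

The formatted bibliography is placed at the end of the document, which is why the input ends with an empty References heading.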
Pandoc is available for Windows, Mac OS X, Linux, and BSD.
For EPUB output, pandoc can take Dublin Core (dc) metadata and a CSS stylesheet, along with the list of documents that make up the eBook.
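The following sketch puts those three pieces together. All file names are placeholders. It assumes a pandoc version in which --epub-metadata takes a file of Dublin Core elements and --css supplies the EPUB stylesheet; older releases used a dedicated EPUB stylesheet option instead, so check the manual for the installed version.

```shell
# Dublin Core elements to be added to the EPUB metadata.
cat > metadata.xml <<'EOF'
<dc:rights>Creative Commons</dc:rights>
<dc:language>en-US</dc:language>
EOF

# A small stylesheet for the book.
cat > epub.css <<'EOF'
body { font-family: serif; }
EOF

# The book itself, split over two input files; the title block goes in the first.
cat > ch1.txt <<'EOF'
% My Book
% Sam Smith

# Chapter One

Chapter one is over.
EOF

cat > ch2.txt <<'EOF'
# Chapter Two

Chapter two has just begun.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Input files are concatenated in the order given on the command line.
    pandoc ch1.txt ch2.txt --epub-metadata=metadata.xml --css=epub.css -o book.epub
fi
```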
Sample
Create the following file and save it as test1.md:
# Test!

This is a test of *pandoc*.

- list one
- list two
Now run the following command on the saved file:
pandoc test1.md -f markdown -t html -s -o test1.html
The filename test1.md tells pandoc which file to convert. The -s option says to create a “standalone” file, with a header and footer, not just a fragment. And the -o test1.html says to put the output in the file test1.html. Note that -f markdown and -t html could have been omitted, since the default is to convert from markdown to HTML, but it doesn’t hurt to include them.
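To make the difference between a fragment and a standalone file concrete, the sketch below runs the same input both ways and relies on the defaults for -f and -t. It assumes pandoc is installed and recreates test1.md so that it can be run on its own.

```shell
# Recreate the sample input so the script is self-contained.
cat > test1.md <<'EOF'
# Test!

This is a test of *pandoc*.

- list one
- list two
EOF

if command -v pandoc >/dev/null 2>&1; then
    # No -s and no -o: an HTML fragment is printed to standard output.
    pandoc test1.md

    # With -s: a complete HTML document, written to test1.html.
    # -f markdown and -t html are omitted because they are the defaults.
    pandoc test1.md -s -o test1.html
fi
```

Some pandoc versions print a warning when a standalone HTML file has no title; the conversion still completes.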
Here is a second sample. Save the following as mybook.txt:
% My Book
% Sam Smith

This is my book!

# Chapter One

Chapter one is over.

# Chapter Two

Chapter two has just begun.
Now run the following command on the file:
pandoc mybook.txt -o mybook.epub
To include an image in the book, add the following line to mybook.txt:
![Juliet](images/sun.jpg)
Rerun pandoc and it will include the image in the eBook. The text in the square brackets is the caption for the figure; the text in the parentheses is the path and filename of the image to load.
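As a sketch of the full round trip, the script below writes mybook.txt with the image line included and rebuilds the eBook. Placing the image inside Chapter One is an arbitrary choice made for this example. The conversion runs only if pandoc is installed and images/sun.jpg actually exists. Because an EPUB file is a zip archive, listing its contents is one way to check that an image was bundled.

```shell
# mybook.txt with the image line added (its position in Chapter One is arbitrary).
cat > mybook.txt <<'EOF'
% My Book
% Sam Smith

This is my book!

# Chapter One

Chapter one is over.

![Juliet](images/sun.jpg)

# Chapter Two

Chapter two has just begun.
EOF

# Rebuild only when pandoc and the image file are both available.
if command -v pandoc >/dev/null 2>&1 && [ -f images/sun.jpg ]; then
    pandoc mybook.txt -o mybook.epub

    # List the archive to see the files pandoc packed into the EPUB.
    if command -v unzip >/dev/null 2>&1; then
        unzip -l mybook.epub
    fi
fi
```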