% Pandoc User's Guide
% John MacFarlane
% March 20, 2010

Synopsis
========

pandoc [*options*] [*input-file*]...

Description
===========

Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [Textile], [reStructuredText], [HTML],
and [LaTeX]; and it can write plain text, [markdown], [reStructuredText],
[HTML], [LaTeX], [ConTeXt], [RTF], [DocBook XML], [OpenDocument XML], [ODT],
[GNU Texinfo], [MediaWiki markup], [EPUB], [Textile], [groff man] pages,
[Emacs Org-Mode], and [Slidy] or [S5] HTML slide shows.

Pandoc's enhanced version of markdown includes syntax for footnotes,
tables, flexible ordered lists, definition lists, delimited code blocks,
superscript, subscript, strikeout, title blocks, automatic tables of
contents, embedded LaTeX math, citations, and markdown inside HTML block
elements. (These enhancements, described below under [Pandoc's markdown
vs. standard markdown](#pandocs-markdown-vs.-standard-markdown),
can be disabled using the `--strict` option.)

In contrast to most existing tools for converting markdown to HTML, which
use regex substitutions, Pandoc has a modular design: it consists of a
set of readers, which parse text in a given format and produce a native
representation of the document, and a set of writers, which convert
this native representation into a target format. Thus, adding an input
or output format requires only adding a reader or writer.

Using Pandoc
------------

If no *input-file* is specified, input is read from *stdin*.
Otherwise, the *input-files* are concatenated (with a blank
line between each) and used as input. Output goes to *stdout* by
default (though output to *stdout* is disabled for the `odt` and
`epub` output formats). For output to a file, use the `-o` option:

    pandoc -o output.html input.txt

Instead of a file, an absolute URI may be given. In this case
pandoc will fetch the content using HTTP:

    pandoc -f html -t markdown http://www.fsf.org

If multiple input files are given, `pandoc` will concatenate them all (with
blank lines between them) before parsing.

The format of the input and output can be specified explicitly using
command-line options. The input format can be specified using the
`-r/--read` or `-f/--from` options, the output format using the
`-w/--write` or `-t/--to` options. Thus, to convert `hello.txt` from
markdown to LaTeX, you could type:

    pandoc -f markdown -t latex hello.txt

To convert `hello.html` from html to markdown:

    pandoc -f html -t markdown hello.html

Supported output formats are listed below under the `-t/--to` option.
Supported input formats are listed below under the `-f/--from` option. Note
that the `rst`, `textile`, `latex`, and `html` readers are not complete;
there are some constructs that they do not parse.

If the input or output format is not specified explicitly, `pandoc`
will attempt to guess it from the extensions of
the input and output filenames. Thus, for example,

    pandoc -o hello.tex hello.txt

will convert `hello.txt` from markdown to LaTeX. If no output file
is specified (so that output goes to *stdout*), or if the output file's
extension is unknown, the output format will default to HTML.
If no input file is specified (so that input comes from *stdin*), or
if the input files' extensions are unknown, the input format will
be assumed to be markdown unless explicitly specified.
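
The fallback rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not pandoc's implementation: the function names and the two-entry extension table are hypothetical, and pandoc's real table of recognized extensions is not reproduced here.

```python
import os

# Hypothetical extension table for illustration only; it lists just the
# two extensions used in the examples above.
KNOWN_EXTENSIONS = {
    ".tex": "latex",
    ".html": "html",
}


def guess_format(filename, default):
    """Return the format suggested by the extension of filename.

    filename is None when stdin or stdout is used; default is returned
    in that case and whenever the extension is not in the table.
    """
    if filename is None:
        return default
    extension = os.path.splitext(filename)[1].lower()
    return KNOWN_EXTENSIONS.get(extension, default)


def guess_reader(input_file):
    # Unknown or missing input extension: assume markdown.
    return guess_format(input_file, "markdown")


def guess_writer(output_file):
    # Unknown or missing output extension: fall back to HTML.
    return guess_format(output_file, "html")
```

With this sketch, `guess_reader("hello.txt")` falls back to `markdown` and `guess_writer("hello.tex")` selects `latex`, which mirrors the `pandoc -o hello.tex hello.txt` example.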

Pandoc uses the UTF-8 character encoding for both input and output.
If your local character encoding is not UTF-8, you
should pipe input and output through `iconv`:

    iconv -t utf-8 input.txt | pandoc | iconv -f utf-8

Wrappers
========

`markdown2pdf`
--------------

The standard Pandoc installation includes `markdown2pdf`, a wrapper
around `pandoc` and `pdflatex` that produces PDFs directly from markdown
sources. The default behavior of `markdown2pdf` is to create a file with
the same base name as the first argument and the extension `pdf`; thus,
for example,

    markdown2pdf sample.txt endnotes.txt

will produce `sample.pdf`. (If `sample.pdf` exists already,
it will be backed up before being overwritten.) An output file
name can be specified explicitly using the `-o` option:

    markdown2pdf -o book.pdf chap1 chap2

If no input file is specified, input will be taken from *stdin*.
All of `pandoc`'s options will work with `markdown2pdf` as well.

`markdown2pdf` assumes that `pdflatex` is in the path. It also
assumes that the following LaTeX packages are available:
`unicode`, `fancyhdr` (if you have verbatim text in footnotes),
`graphicx` (if you use images), `array` (if you use tables),
and `ulem` (if you use strikeout text). If they are not already
included in your LaTeX distribution, you can get them from
[CTAN]. A full [TeX Live] or [MacTeX] distribution will have all of
these packages.

`hsmarkdown`
------------

A user who wants a drop-in replacement for `Markdown.pl` may create
a symbolic link to the `pandoc` executable called `hsmarkdown`. When
invoked under the name `hsmarkdown`, `pandoc` will behave as if the
`--strict` flag had been selected, and no command-line options will be
recognized. However, this approach does not work under Cygwin, due to
problems with its simulation of symbolic links.

[Cygwin]: http://www.cygwin.com/
[`iconv`]: http://www.gnu.org/software/libiconv/
[CTAN]: http://www.ctan.org "Comprehensive TeX Archive Network"
[TeX Live]: http://www.tug.org/texlive/
[MacTeX]: http://www.tug.org/mactex/

Options
=======

`-f` *FORMAT*, `-r` *FORMAT*, `--from=`*FORMAT*, `--read=`*FORMAT*
:   Specify input format. *FORMAT* can be `native` (native Haskell),
    `json` (JSON version of native AST), `markdown` (markdown),
    `textile` (Textile), `rst` (reStructuredText), `html` (HTML),
    or `latex` (LaTeX). If `+lhs` is appended to `markdown`, `rst`,
    or `latex`, the input will be treated as literate Haskell source:
    see [Literate Haskell support](#literate-haskell-support),
    below.

`-t` *FORMAT*, `-w` *FORMAT*, `--to=`*FORMAT*, `--write=`*FORMAT*
:   Specify output format. *FORMAT* can be `native` (native Haskell),
    `json` (JSON version of native AST), `plain` (plain text),
    `markdown` (markdown), `rst` (reStructuredText),
    `html` (HTML), `latex` (LaTeX), `context` (ConTeXt), `man` (groff man),
    `mediawiki` (MediaWiki markup), `textile` (Textile), `org` (Emacs
    Org-Mode), `texinfo` (GNU Texinfo), `docbook` (DocBook XML),
    `opendocument` (OpenDocument XML), `odt` (OpenOffice text document),
    `epub` (EPUB book), `slidy` (Slidy HTML and javascript slide show),
    `s5` (S5 HTML and javascript slide show), or `rtf` (rich text
    format). Note that `odt` and `epub` output will not be directed to
    *stdout*; an output filename must be specified using the `-o/--output`
    option. If `+lhs` is appended to `markdown`, `rst`, `latex`, or `html`,
    the output will be rendered as literate Haskell source:
    see [Literate Haskell support](#literate-haskell-support),
    below.

`-s`, `--standalone`
:   Produce output with an appropriate header and footer (e.g. a
    standalone HTML, LaTeX, or RTF file, not a fragment).

`-o` *FILE*, `--output=`*FILE*
:   Write output to *FILE* instead of *stdout*. If *FILE* is
    `-`, output will go to *stdout*. (Exception: if the output
    format is `odt` or `epub`, output to stdout is disabled.)

`-p`, `--preserve-tabs`
:   Preserve tabs instead of converting them to spaces (the default).

`--tab-stop=`*TABSTOP*
:   Specify the number of spaces per tab (default is 4).

`--strict`
:   Use strict markdown syntax, with no pandoc extensions or variants.
    When the input format is HTML, this means that constructs that have no
    equivalents in standard markdown (e.g. definition lists or strikeout
    text) will be parsed as raw HTML.

`--reference-links`
:   Use reference-style links, rather than inline links, in writing markdown
    or reStructuredText. By default inline links are used.

`-R`, `--parse-raw`
:   Parse untranslatable HTML codes and LaTeX environments as raw HTML
    or LaTeX, instead of ignoring them. Affects only HTML and LaTeX
    input. Raw HTML can be printed in markdown, reStructuredText, HTML, Slidy,
    and S5 output; raw LaTeX can be printed in markdown, reStructuredText,
    LaTeX, and ConTeXt output. The default is for the readers to omit
    untranslatable HTML codes and LaTeX environments. (The LaTeX reader
    does pass through untranslatable LaTeX *commands*, even if `-R` is not
    specified.)

`-S`, `--smart`
:   Produce typographically correct output, converting straight quotes
    to curly quotes, `---` and `--` to dashes, and `...` to ellipses.
    Nonbreaking spaces are inserted after certain abbreviations, such
    as "Mr." (Note: This option is significant only when the input format is
    `markdown` or `textile`. It is selected automatically when the input
    format is `textile` or the output format is `latex` or `context`.)

`-m` *URL*, `--latexmathml=`*URL*
:   Use the [LaTeXMathML] script to display embedded TeX math in HTML output.
    To insert a link to a local copy of the `LaTeXMathML.js` script,
    provide a *URL*. If no *URL* is provided, the contents of the
    script will be inserted directly into the HTML header, preserving
    portability at the price of efficiency. If you plan to use math on
    several pages, it is much better to link to a copy of the script,
    so it can be cached.

`--mathml`
:   Convert TeX math to MathML. In standalone mode, a small javascript
    will be inserted that allows the MathML to be viewed on some browsers.

`--jsmath=`*URL*
:   Use [jsMath] to display embedded TeX math in HTML output.
    The *URL* should point to the jsMath load script (e.g.
    `jsMath/easy/load.js`); if provided, it will be linked to in
    the header of standalone HTML documents.

`--mathjax=`*URL*
:   Use [MathJax] to display embedded TeX math in HTML output.
    The *URL* should point to the `MathJax.js` load script.

`--gladtex`
:   Enclose TeX math in `<eq>` tags in HTML output. These can then
    be processed by [gladTeX] to produce links to images of the typeset
    formulas.

`--mimetex=`*URL*
:   Render TeX math using the [mimeTeX] CGI script. If *URL* is not
    specified, it is assumed that the script is at `/cgi-bin/mimetex.cgi`.

`--webtex=`*URL*
:   Render TeX formulas using an external script that converts TeX
    formulas to images. The formula will be concatenated with the URL
    provided. If *URL* is not specified, the Google Chart API will be used.

`-i`, `--incremental`
:   Make list items in Slidy or S5 display incrementally (one by one).
    The default is for lists to be displayed all at once.

`--offline`
:   Include all the CSS and javascript needed for a Slidy or S5 slide
    show in the output, so that the slide show will work even when no
    internet connection is available.

`--xetex`
:   Create LaTeX output suitable for processing by XeTeX.

`-N`, `--number-sections`
:   Number section headings in LaTeX, ConTeXt, or HTML output.
    By default, sections are not numbered.

`--section-divs`
:   Wrap sections in `<div>` tags, and attach identifiers to the
    enclosing `<div>` rather than the header itself.
    See [Section identifiers](#header-identifiers-in-html), below.

`--no-wrap`
:   Disable text wrapping in output. By default, text is wrapped
    appropriately for the output format.

`--email-obfuscation=`*none|javascript|references*
:   Specify a method for obfuscating `mailto:` links in HTML documents.
    *none* leaves `mailto:` links as they are. *javascript* obfuscates
    them using javascript. *references* obfuscates them by printing their
    letters as decimal or hexadecimal character references.
    If `--strict` is specified, *references* is used regardless of the
    presence of this option.

`--id-prefix`=*STRING*
:   Specify a prefix to be added to all automatically generated identifiers
    in HTML output. This is useful for preventing duplicate identifiers
    when generating fragments to be included in other pages.

`--indented-code-classes=`*CLASSES*
:   Specify classes to use for indented code blocks--for example,
    `perl,numberLines` or `haskell`. Multiple classes may be separated
    by spaces or commas.

`--toc`, `--table-of-contents`
:   Include an automatically generated table of contents (or, in
    the case of `latex`, `context`, and `rst`, an instruction to create
    one) in the output document. This option has no effect on `man`,
    `docbook`, `slidy`, or `s5` output.

`--base-header-level=`*LEVEL*
:   Specify the base level for headers (defaults to 1).

`--template=`*FILE*
:   Use *FILE* as a custom template for the generated document. Implies
    `--standalone`. See [Templates](#templates) below for a description
    of template syntax. If this option is not used, a default
    template appropriate for the output format will be used. See also
    `-D/--print-default-template`.

`-V` *KEY=VAL*, `--variable=`*KEY:VAL*
:   Set the template variable *KEY* to the value *VAL* when rendering the
    document in standalone mode. This is only useful when the
    `--template` option is used to specify a custom template, since
    pandoc automatically sets the variables used in the default
    templates.

`-c` *URL*, `--css=`*URL*
:   Link to a CSS style sheet.

`-H` *FILE*, `--include-in-header=`*FILE*
:   Include contents of *FILE*, verbatim, at the end of the header.
    This can be used, for example, to include special
    CSS or javascript in HTML documents. This option can be used
    repeatedly to include multiple files in the header. They will be
    included in the order specified. Implies `--standalone`.

`-B` *FILE*, `--include-before-body=`*FILE*
:   Include contents of *FILE*, verbatim, at the beginning of the
    document body (e.g. after the `<body>` tag in HTML, or the
    `\begin{document}` command in LaTeX). This can be used to include
    navigation bars or banners in HTML documents. This option can be
    used repeatedly to include multiple files. They will be included in
    the order specified. Implies `--standalone`.

`-A` *FILE*, `--include-after-body=`*FILE*
:   Include contents of *FILE*, verbatim, at the end of the document
    body (before the `</body>` tag in HTML, or the
    `\end{document}` command in LaTeX). This option can be used
    repeatedly to include multiple files. They will be included in the
    order specified. Implies `--standalone`.

`--reference-odt=`*FILE*
:   Use the specified file as a style reference in producing an ODT.
    For best results, the reference ODT should be a modified version
    of an ODT produced using pandoc. The contents of the reference ODT
    are ignored, but its stylesheets are used in the new ODT. If no
    reference ODT is specified on the command line, pandoc will look
    for a file `reference.odt` in the user data directory (see
    `--data-dir`). If this is not found either, sensible defaults will be
    used.

`--epub-stylesheet=`*FILE*
:   Use the specified CSS file to style the EPUB. If no stylesheet
    is specified, pandoc will look for a file `epub.css` in the
    user data directory (see `--data-dir`, below). If it is not
    found there, sensible defaults will be used.

`--epub-metadata=`*FILE*
:   Look in the specified XML file for metadata for the EPUB.
    The file should contain a series of Dublin Core elements,
    as documented at <http://dublincore.org/documents/dces/>.
    For example:

        <dc:rights>Creative Commons</dc:rights>
        <dc:language>es-AR</dc:language>

    By default, pandoc will include the following metadata elements:
    `<dc:title>` (from the document title), `<dc:creator>` (from the
    document authors), `<dc:language>` (from the locale), and
    `<dc:identifier id="BookId">` (a randomly generated UUID). Any of
    these may be overridden by elements in the metadata file.

`-D` *FORMAT*, `--print-default-template=`*FORMAT*
:   Print the default template for an output *FORMAT*. (See `-t`
    for a list of possible *FORMAT*s.)

`-T` *STRING*, `--title-prefix=`*STRING*
:   Specify *STRING* as a prefix at the beginning of the title
    that appears in the HTML header (but not in the title as it
    appears at the beginning of the HTML body). Implies
    `--standalone`.

`--bibliography=`*FILE*
:   Specify bibliography database to be used in resolving
    citations. The database type will be determined from the
    extension of *FILE*, which may be `.mods` (MODS format),
    `.bib` (BibTeX format), `.bbx` (BibLaTeX format),
    `.ris` (RIS format), `.enl` (EndNote format),
    `.xml` (EndNote XML format), `.wos` (ISI format),
    `.medline` (MEDLINE format), `.copac` (Copac format),
    or `.json` (citeproc JSON). If you want to use multiple
    bibliographies, just use this option repeatedly.

`--csl=`*FILE*
:   Specify [CSL] style to be used in formatting citations and
    the bibliography. If *FILE* is not found, pandoc will look
    for it in

        $HOME/.csl

    in unix and

        C:\Documents And Settings\USERNAME\Application Data\csl

    in Windows. If the `--csl` option is not specified, pandoc
    will use a default style: either `default.csl` in the
    user data directory (see `--data-dir`), or, if that is
    not present, the Chicago author-date style.

`--data-dir=`*DIRECTORY*
:   Specify the user data directory to search for pandoc data files.
    If this option is not specified, the default user data directory
    will be used:

        $HOME/.pandoc

    in unix and

        C:\Documents And Settings\USERNAME\Application Data\pandoc

    in Windows. A `reference.odt`, `epub.css`, `templates` directory,
    or `s5` directory placed in this directory will override pandoc's
    normal defaults.

`--dump-args`
:   Print information about command-line arguments to *stdout*, then exit.
    This option is intended primarily for use in wrapper scripts.
    The first line of output contains the name of the output file specified
    with the `-o` option, or `-` (for *stdout*) if no output file was
    specified. The remaining lines contain the command-line arguments,
    one per line, in the order they appear. These do not include regular
    Pandoc options and their arguments, but do include any options appearing
    after a `--` separator at the end of the line.

`--ignore-args`
:   Ignore command-line arguments (for use in wrapper scripts).
    Regular Pandoc options are not ignored. Thus, for example,

        pandoc --ignore-args -o foo.html -s foo.txt -- -e latin1

    is equivalent to

        pandoc -o foo.html -s

`-v`, `--version`
:   Print version.

`-h`, `--help`
:   Show usage message.
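
A wrapper script that relies on `--dump-args` has to split that output back into an output file name and a list of arguments. The following Python sketch shows one way to do so, based only on the line layout described under `--dump-args` above. The function name `parse_dump_args` is hypothetical, and the sample text in the usage note is constructed from that description rather than captured from a real run.

```python
def parse_dump_args(dump_output):
    """Split --dump-args output into (output_file, arguments).

    The first line names the output file, with "-" standing for stdout
    (returned here as None). Every following line is one argument.
    """
    lines = dump_output.splitlines()
    if not lines:
        raise ValueError("expected at least one line of --dump-args output")
    output_file = None if lines[0] == "-" else lines[0]
    return output_file, lines[1:]
```

For instance, feeding it the text `"foo.html\nfoo.txt\n-e\nlatin1\n"` gives back the output file `foo.html` and the arguments `foo.txt`, `-e`, and `latin1`.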

[LaTeXMathML]: http://math.etsu.edu/LaTeXMathML/
[jsMath]: http://www.math.union.edu/~dpvc/jsmath/
[MathJax]: http://www.mathjax.org/
[gladTeX]: http://www.math.uio.no/~martingu/gladtex/index.html
[mimeTeX]: http://www.forkosh.com/mimetex.html
[CSL]: http://CitationStyles.org

Templates
=========

When the `-s/--standalone` option is used, pandoc uses a template to
add header and footer material that is needed for a self-standing
document. To see the default template that is used, just type

    pandoc -D FORMAT

where `FORMAT` is the name of the output format. A custom template
can be specified using the `--template` option. You can also override
the system default templates for a given output format `FORMAT`
by putting a file `templates/FORMAT.template` in the user data
directory (see `--data-dir`, above).

Templates may contain *variables*. Variable names are sequences of
alphanumerics, `-`, and `_`, starting with a letter. A variable name
surrounded by `$` signs will be replaced by its value. For example,
the string `$title$` in

    <title>$title$</title>

will be replaced by the document title.

To write a literal `$` in a template, use `$$`.
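
The substitution rule can be illustrated with a small Python sketch. It is not pandoc's template engine: the name `interpolate` is hypothetical, and rendering an unset variable as the empty string is an assumption made for the example.

```python
import re

# "$$" is a literal dollar sign; "$name$" is a variable whose name starts
# with a letter and continues with alphanumerics, "-", or "_".
TOKEN = re.compile(r"\$\$|\$([A-Za-z][A-Za-z0-9_-]*)\$")


def interpolate(template, variables):
    """Replace $name$ with variables[name] and $$ with a literal $."""
    def replace(match):
        name = match.group(1)
        if name is None:
            # The "$$" alternative matched, so emit one dollar sign.
            return "$"
        # Assumption: an unset variable renders as the empty string.
        return str(variables.get(name, ""))
    return TOKEN.sub(replace, template)
```

Calling `interpolate("<title>$title$</title>", {"title": "My Document"})` yields `<title>My Document</title>`, and `interpolate("Price: $$5", {})` yields `Price: $5`.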

Some variables are set automatically by pandoc. These vary somewhat
depending on the output format, but include:

`header-includes`
:   contents specified by `-H/--include-in-header` (may have multiple
    values)
`toc`
:   non-null value if `--toc/--table-of-contents` was specified
`include-before`
:   contents specified by `-B/--include-before-body` (may have
    multiple values)
`include-after`
:   contents specified by `-A/--include-after-body` (may have
    multiple values)
`body`
:   body of document
`title`
:   title of document, as specified in title block
`author`
:   author of document, as specified in title block (may have
    multiple values)
`date`
:   date of document, as specified in title block

Variables may be set at the command line using the `-V/--variable`
option. This allows users to include custom variables in their
templates.

Templates may contain conditionals. The syntax is as follows:

    $if(variable)$
    X
    $else$
    Y
    $endif$

This will include `X` in the template if `variable` has a non-null
value; otherwise it will include `Y`. `X` and `Y` are placeholders for
any valid template text, and may include interpolated variables or other
conditionals. The `$else$` section may be omitted.
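
As a rough illustration of how such a conditional is evaluated, here is a Python sketch. It is a simplification, not pandoc's implementation: the name `expand_conditionals` is hypothetical, nested conditionals are not handled, and treating empty values as null is an assumption.

```python
import re

# Matches one non-nested conditional; the $else$ part is optional.
CONDITIONAL = re.compile(
    r"\$if\(([A-Za-z][A-Za-z0-9_-]*)\)\$(.*?)(?:\$else\$(.*?))?\$endif\$",
    re.DOTALL,
)


def expand_conditionals(template, variables):
    """Expand non-nested $if(name)$ ... $else$ ... $endif$ blocks."""
    def replace(match):
        name, if_text, else_text = match.groups()
        # Assumption: a missing or empty value counts as null.
        if variables.get(name):
            return if_text
        # else_text is None when the $else$ section was omitted.
        return else_text or ""
    return CONDITIONAL.sub(replace, template)
```

With `{"toc": "yes"}` the template `$if(toc)$X$else$Y$endif$` expands to `X`; with an empty dictionary it expands to `Y`.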

When variables can have multiple values (for example, `author` in
a multi-author document), you can use the `$for$` keyword:

    $for(author)$
    <meta name="author" content="$author$" />
    $endfor$

You can optionally specify a separator to be used between
consecutive items:

    $for(author)$$author$$sep$, $endfor$
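
The loop and separator behavior can be sketched in Python as follows. This is an illustrative simplification rather than pandoc's code: `expand_loops` is a hypothetical name, nested loops are not supported, and the loop body is assumed to reference only the loop variable.

```python
import re

# Matches one non-nested $for(name)$ ... $endfor$ block.
FOR_LOOP = re.compile(
    r"\$for\(([A-Za-z][A-Za-z0-9_-]*)\)\$(.*?)\$endfor\$",
    re.DOTALL,
)


def expand_loops(template, variables):
    """Expand non-nested $for(name)$ ... $endfor$ blocks."""
    def replace(match):
        name, body = match.groups()
        # Text after $sep$, if present, is placed between consecutive items.
        item_template, _, separator = body.partition("$sep$")
        values = variables.get(name, [])
        if not isinstance(values, (list, tuple)):
            values = [values]
        placeholder = f"${name}$"
        rendered = [item_template.replace(placeholder, str(v)) for v in values]
        return separator.join(rendered)
    return FOR_LOOP.sub(replace, template)
```

Given `{"author": ["Ann", "Bob"]}`, the template `$for(author)$$author$$sep$, $endfor$` expands to `Ann, Bob`.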

Pandoc's markdown vs. standard markdown
=======================================

In parsing markdown, Pandoc departs from and extends [standard markdown]
in a few respects. Except where noted, these differences can
be suppressed by specifying the `--strict` command-line option.

[standard markdown]: http://daringfireball.net/projects/markdown/syntax
  "Markdown syntax description"

Backslash escapes
-----------------

Except inside a code block or inline code, any punctuation or space
character preceded by a backslash will be treated literally, even if it
would normally indicate formatting. Thus, for example, if one writes

    *\*hello\**

one will get

    <em>*hello*</em>

instead of

    <strong>hello</strong>

This rule is easier to remember than standard markdown's rule,
which allows only the following characters to be backslash-escaped:

    \`*_{}[]()>#+-.!

A backslash-escaped space is parsed as a nonbreaking space. It will
appear in TeX output as `~` and in HTML and XML as `&#160;` or
`&nbsp;`.

A backslash-escaped newline (i.e. a backslash occurring at the end of
a line) is parsed as a hard line break. It will appear in TeX output as
`\\` and in HTML as `<br />`. This is a nice alternative to
markdown's "invisible" way of indicating hard line breaks using
two trailing spaces on a line.

Subscripts and superscripts
---------------------------

Superscripts may be written by surrounding the superscripted text by `^`
characters; subscripts may be written by surrounding the subscripted
text by `~` characters. Thus, for example,

    H~2~O is a liquid. 2^10^ is 1024.

If the superscripted or subscripted text contains spaces, these spaces
must be escaped with backslashes. (This is to prevent accidental
superscripting and subscripting through the ordinary use of `~` and `^`.)
Thus, if you want the letter P with 'a cat' in subscripts, use
`P~a\ cat~`, not `P~a cat~`.
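
The effect of the escaped-space rule can be seen in a small Python sketch that converts the two constructs to HTML. It only illustrates the rule and is not pandoc's parser: the name `render_scripts` is hypothetical, other inline syntax is ignored, and the escaped space is emitted as `&nbsp;` on the assumption that HTML is the target.

```python
import re

# Content may contain any character except the delimiter, whitespace,
# or a backslash, plus the two-character sequence backslash-space.
SUBSCRIPT = re.compile(r"~((?:[^~\s\\]|\\ )+)~")
SUPERSCRIPT = re.compile(r"\^((?:[^^\s\\]|\\ )+)\^")


def render_scripts(text):
    """Convert ~sub~ and ^sup^ spans to <sub> and <sup> elements."""
    def wrap(tag):
        def replace(match):
            # A backslash-escaped space becomes a nonbreaking space.
            content = match.group(1).replace("\\ ", "&nbsp;")
            return f"<{tag}>{content}</{tag}>"
        return replace
    text = SUPERSCRIPT.sub(wrap("sup"), text)
    return SUBSCRIPT.sub(wrap("sub"), text)
```

Here `P~a\ cat~` becomes `P<sub>a&nbsp;cat</sub>`, while `P~a cat~` is left unchanged because the unescaped space prevents a match.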

Strikeout
---------

To strike out a section of text with a horizontal line, begin and end it
with `~~`. Thus, for example,

    This ~~is deleted text.~~

Nested Lists
------------

Pandoc behaves differently from standard markdown on some "edge
cases" involving lists. Consider this source:

    1.  First
    2.  Second:
        -   Fee
        -   Fie
        -   Foe

    3.  Third

Pandoc transforms this into a "compact list" (with no `<p>` tags around
"First", "Second", or "Third"), while markdown puts `<p>` tags around
"Second" and "Third" (but not "First"), because of the blank space
around "Third". Pandoc follows a simple rule: if the text is followed by
a blank line, it is treated as a paragraph. Since "Second" is followed
by a list, and not a blank line, it isn't treated as a paragraph. The
fact that the list is followed by a blank line is irrelevant. (Note:
Pandoc works this way even when the `--strict` option is specified. This
behavior is consistent with the official markdown syntax description,
even though it is different from that of `Markdown.pl`.)

Ordered Lists
-------------

Unlike standard markdown, Pandoc allows ordered list items to be marked
with uppercase and lowercase letters and roman numerals, in addition to
arabic numerals. (This behavior can be turned off using the `--strict`
option.) List markers may be enclosed in parentheses or followed by a
single right parenthesis or period. They must be separated from the
text that follows by at least one space, and, if the list marker is a
capital letter with a period, by at least two spaces.[^2]

[^2]: The point of this rule is to ensure that normal paragraphs
    starting with people's initials, like

        B. Russell was an English philosopher.

    do not get treated as list items.

    This rule will not prevent

        (C) 2007 Joe Smith

    from being interpreted as a list item. In this case, a backslash
    escape can be used:

        (C\) 2007 Joe Smith

Pandoc also pays attention to the type of list marker used, and to the
starting number, and both of these are preserved where possible in the
output format. Thus, the following yields a list with numbers followed
by a single parenthesis, starting with 9, and a sublist with lowercase
roman numerals:

     9)  Ninth
    10)  Tenth
    11)  Eleventh
           i. subone
          ii. subtwo
         iii. subthree

Note that Pandoc pays attention only to the *starting* marker in a list.
So, the following yields a list numbered sequentially starting from 2:

    (2) Two
    (5) Three
    1.  Four
    *   Five
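
The "starting marker only" rule can be made concrete with a Python sketch. It is a simplified illustration and not pandoc's list parser: `renumber` is a hypothetical name, and only decimal starting markers are handled (letters and roman numerals are left out).

```python
import re

# The first item fixes the numbering style: optional "(", a decimal
# start number, and a closing "." or ")".
FIRST_MARKER = re.compile(r"^(\()?(\d+)([.)])\s+(.*)$")
# Later items may carry any ordered or bullet marker; it is discarded.
ANY_MARKER = re.compile(r"^(?:\(?\w+[.)]|[*+-])\s+(.*)$")


def renumber(items):
    """Renumber list items using only the style of the first marker."""
    first = FIRST_MARKER.match(items[0])
    if first is None:
        raise ValueError("first item must start with a decimal marker")
    open_paren, start, closer, _ = first.groups()
    result = []
    for offset, item in enumerate(items):
        match = ANY_MARKER.match(item)
        if match is None:
            raise ValueError(f"missing list marker: {item!r}")
        number = int(start) + offset
        result.append(f"{open_paren or ''}{number}{closer} {match.group(1)}")
    return result
```

Applied to the four items above, the sketch produces markers `(2)`, `(3)`, `(4)`, and `(5)`, regardless of the markers on the later items.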

If default list markers are desired, use `#.`:

    #.  one
    #.  two
    #.  three

Numbered examples
-----------------

The special list marker `@` can be used for sequentially numbered
examples. The first list item with a `@` marker will be numbered '1',
the next '2', and so on, throughout the document. The numbered examples
need not occur in a single list; each new list using `@` will take up
where the last stopped. So, for example:

    (@)  My first example will be numbered (1).
    (@)  My second example will be numbered (2).

    Explanation of examples.

    (@)  My third example will be numbered (3).

Numbered examples can be labeled and referred to elsewhere in the
document:

    (@good)  This is a good example.

    As (@good) illustrates, ...

The label can be any string of alphanumeric characters, underscores,
or hyphens.
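
The document-wide counter and the label lookup can be sketched in Python. This is an illustration of the numbering rule only, not pandoc's implementation: `number_examples` is a hypothetical name, input is taken as a list of lines, and any `(@label)` at the very start of a line is treated as a list item rather than a reference.

```python
import re

# A list item: "(@)" or "(@label)" at the start of a line, then whitespace.
ITEM = re.compile(r"^\(@([A-Za-z0-9_-]*)\)(?=\s)")
# A reference to a labeled example anywhere in the text.
REFERENCE = re.compile(r"\(@([A-Za-z0-9_-]+)\)")


def number_examples(lines):
    """Number (@) items across the whole document and resolve references."""
    labels = {}
    counter = 0
    numbered = []
    for line in lines:
        match = ITEM.match(line)
        if match:
            counter += 1
            if match.group(1):
                labels[match.group(1)] = counter
            line = f"({counter})" + line[match.end():]
        numbered.append(line)

    def resolve(match):
        label = match.group(1)
        # Unknown labels are left untouched.
        return f"({labels[label]})" if label in labels else match.group(0)

    return [REFERENCE.sub(resolve, line) for line in numbered]
```

A second pass is used for references so that a label may be cited before the example that defines it.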

Definition lists
----------------

Pandoc supports definition lists, using a syntax inspired by
[PHP Markdown Extra] and [reStructuredText]:[^3]

    Term 1

    :   Definition 1

    Term 2 with *inline markup*

    :   Definition 2

            { some code, part of Definition 2 }

        Third paragraph of definition 2.

Each term must fit on one line, which may optionally be followed by
a blank line, and must be followed by one or more definitions.
A definition begins with a colon or tilde, which may be indented one
or two spaces. A term may have multiple definitions, and each definition
may consist of one or more block elements (paragraph, code block, list,
etc.), each indented four spaces or one tab stop.

If you leave space after the definition (as in the example above),
the blocks of the definitions will be considered paragraphs. In some
output formats, this will mean greater spacing between term/definition
pairs. For a compact definition list, do not leave space between the
definition and the next term:

    Term 1
      ~ Definition 1
    Term 2
      ~ Definition 2a
      ~ Definition 2b

[^3]: I have also been influenced by the suggestions of [David Wheeler](http://www.justatheory.com/computers/markup/modest-markdown-proposal.html).

[PHP Markdown Extra]: http://www.michelf.com/projects/php-markdown/extra/

Reference links
---------------

Pandoc allows implicit reference links with just a single set of
brackets. So, the following links are equivalent:

    1. Here's my [link]
    2. Here's my [link][]

    [link]: linky.com

(Note: Pandoc works this way even if `--strict` is specified, because
`Markdown.pl` 1.0.2b7 allows single-bracket links.)

Footnotes
---------

Pandoc's markdown allows footnotes, using the following syntax:

    Here is a footnote reference,[^1] and another.[^longnote]

    [^1]: Here is the footnote.

    [^longnote]: Here's one with multiple blocks.

        Subsequent paragraphs are indented to show that they
    belong to the previous footnote.

            { some.code }

        The whole paragraph can be indented, or just the first
        line. In this way, multi-paragraph footnotes work like
        multi-paragraph list items.

    This paragraph won't be part of the note, because it isn't indented.

The identifiers in footnote references may not contain spaces, tabs,
or newlines. These identifiers are used only to correlate the
footnote reference with the note itself; in the output, footnotes
will be numbered sequentially.
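
The following Python sketch illustrates how identifiers map to sequential numbers. It is not how pandoc processes notes: `number_footnotes` is a hypothetical name, the bracketed output format is arbitrary, and numbering by order of first reference is an assumption made for the example.

```python
import re

# A footnote reference: "[^" followed by an identifier without
# whitespace, then "]".
NOTE_REFERENCE = re.compile(r"\[\^([^\s\]]+)\]")


def number_footnotes(text):
    """Replace [^id] references with sequential numbers.

    Returns the rewritten text and the identifier-to-number mapping.
    """
    numbers = {}

    def replace(match):
        identifier = match.group(1)
        if identifier not in numbers:
            # Assumption: notes are numbered in order of first reference.
            numbers[identifier] = len(numbers) + 1
        return f"[{numbers[identifier]}]"

    return NOTE_REFERENCE.sub(replace, text), numbers
```

For the reference line in the example above, `[^1]` and `[^longnote]` become `[1]` and `[2]`; the identifier text itself does not affect the number.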

The footnotes themselves need not be placed at the end of the
document. They may appear anywhere except inside other block elements
(lists, block quotes, tables, etc.).

Inline footnotes are also allowed (though, unlike regular notes,
they cannot contain multiple paragraphs). The syntax is as follows:

    Here is an inline note.^[Inline notes are easier to write, since
    you don't have to pick an identifier and move down to type the
    note.]

Inline and regular footnotes may be mixed freely.

Tables
|
|
------
|
|
|
|
Three kinds of tables may be used. All three kinds presuppose the use of
|
|
a fixed-width font, such as Courier.
|
|
|
|
**Simple tables** look like this:
|
|
|
|
      Right     Left     Center     Default
    -------     ------ ----------   -------
         12     12        12            12
        123     123       123          123
          1     1          1             1

    Table:  Demonstration of simple table syntax.
|
|
|
|
The headers and table rows must each fit on one line. Column
|
|
alignments are determined by the position of the header text relative
|
|
to the dashed line below it:[^4]
|
|
|
|
- If the dashed line is flush with the header text on the right side
|
|
but extends beyond it on the left, the column is right-aligned.
|
|
- If the dashed line is flush with the header text on the left side
|
|
but extends beyond it on the right, the column is left-aligned.
|
|
- If the dashed line extends beyond the header text on both sides,
|
|
the column is centered.
|
|
- If the dashed line is flush with the header text on both sides,
|
|
the default alignment is used (in most cases, this will be left).
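
The four rules above can be sketched in Python. This is a minimal illustration rather than pandoc's parser: the function name `column_alignments` is hypothetical, and the sketch assumes that the header line and the dashed line use spaces (not tabs) and that each run of dashes marks one column.

```python
import re

def column_alignments(header_line, dash_line):
    """Classify each column by comparing header text with its dash run."""
    # Pad the header so slicing never runs past the end of the line.
    padded = header_line.ljust(len(dash_line))
    alignments = []
    for run in re.finditer(r"-+", dash_line):
        cell = padded[run.start():run.end()]
        gap_left = cell[0] == " "    # dashes extend beyond the text on the left
        gap_right = cell[-1] == " "  # dashes extend beyond the text on the right
        if gap_left and gap_right:
            alignments.append("center")
        elif gap_left:
            alignments.append("right")
        elif gap_right:
            alignments.append("left")
        else:
            alignments.append("default")
    return alignments

header = "  Right     Left     Center     Default"
dashes = "-------     ------ ----------   -------"
print(column_alignments(header, dashes))
```

Because the decision depends only on character positions, the rules work only when the source is viewed in a fixed-width font.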
|
|
|
|
[^4]: This scheme is due to Michel Fortin, who proposed it on the
|
|
[Markdown discussion list](http://six.pairlist.net/pipermail/markdown-discuss/2005-March/001097.html).
|
|
|
|
The table must end with a blank line, or a line of dashes followed by
|
|
a blank line. A caption may optionally be provided (as illustrated in
|
|
the example above). A caption is a paragraph beginning with the string
|
|
`Table:` (or just `:`), which will be stripped off. It may appear either
|
|
before or after the table.
|
|
|
|
The column headers may be omitted, provided a dashed line is used
|
|
to end the table. For example:
|
|
|
|
    -------     ------ ----------   -------
         12     12        12             12
        123     123       123           123
          1     1          1              1
    -------     ------ ----------   -------
|
|
|
|
When headers are omitted, column alignments are determined on the basis
|
|
of the first line of the table body. So, in the table above, the columns
|
|
would be right, left, center, and right aligned, respectively.
|
|
|
|
**Multiline tables** allow headers and table rows to span multiple lines
|
|
of text (but cells that span multiple columns or rows of the table are
|
|
not supported). Here is an example:
|
|
|
|
    -------------------------------------------------------------
     Centered   Default           Right Left
      Header    Aligned         Aligned Aligned
    ----------- ------- --------------- -------------------------
       First    row                12.0 Example of a row that
                                        spans multiple lines.

      Second    row                 5.0 Here's another one. Note
                                        the blank line between
                                        rows.
    -------------------------------------------------------------

    Table: Here's the caption. It, too, may span
    multiple lines.
|
|
|
|
These work like simple tables, but with the following differences:
|
|
|
|
- They must begin with a row of dashes, before the header text
|
|
(unless the headers are omitted).
|
|
- They must end with a row of dashes, then a blank line.
|
|
- The rows must be separated by blank lines.
|
|
|
|
In multiline tables, the table parser pays attention to the widths of
|
|
the columns, and the writers try to reproduce these relative widths in
|
|
the output. So, if you find that one of the columns is too narrow in the
|
|
output, try widening it in the markdown source.
|
|
|
|
Headers may be omitted in multiline tables as well as simple tables:
|
|
|
|
    ----------- ------- --------------- -------------------------
       First    row                12.0 Example of a row that
                                        spans multiple lines.

      Second    row                 5.0 Here's another one. Note
                                        the blank line between
                                        rows.
    -------------------------------------------------------------

    : Here's a multiline table without headers.
|
|
|
|
It is possible for a multiline table to have just one row, but the row
|
|
should be followed by a blank line (and then the row of dashes that ends
|
|
the table), or the table may be interpreted as a simple table.
|
|
|
|
**Grid tables** look like this:
|
|
|
|
    : Sample grid table.

    +---------------+---------------+--------------------+
    | Fruit         | Price         | Advantages         |
    +===============+===============+====================+
    | Bananas       | $1.34         | - built-in wrapper |
    |               |               | - bright color     |
    +---------------+---------------+--------------------+
    | Oranges       | $2.10         | - cures scurvy     |
    |               |               | - tasty            |
    +---------------+---------------+--------------------+
|
|
|
|
The row of `=`s separates the header from the table body, and can be
|
|
omitted for a headerless table. The cells of grid tables may contain
|
|
arbitrary block elements (multiple paragraphs, code blocks, lists,
|
|
etc.). Alignments are not supported, nor are cells that span multiple
|
|
columns or rows. Grid tables can be created easily using [Emacs table mode].
|
|
|
|
[Emacs table mode]: http://table.sourceforge.net/
|
|
|
|
Delimited Code blocks
|
|
---------------------
|
|
|
|
In addition to standard indented code blocks, Pandoc supports
|
|
*delimited* code blocks. These begin with a row of three or more
|
|
tildes (`~`) and end with a row of tildes that must be at least
|
|
as long as the starting row. Everything between the tilde-lines
|
|
is treated as code. No indentation is necessary:
|
|
|
|
~~~~~~~
|
|
{code here}
|
|
~~~~~~~
|
|
|
|
Like regular code blocks, delimited code blocks must be separated
|
|
from surrounding text by blank lines.
|
|
|
|
If the code itself contains a row of tildes, just use a longer
|
|
row of tildes at the start and end:
|
|
|
|
~~~~~~~~~~~~~~~~
|
|
~~~~~~~~~~
|
|
code including tildes
|
|
~~~~~~~~~~
|
|
~~~~~~~~~~~~~~~~
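
The rule that the closing row must be at least as long as the opening row is what lets shorter rows of tildes appear inside the code. A minimal Python sketch of that rule follows. It is not pandoc's parser: the name `delimited_blocks` and both regular expressions are hypothetical, and the sketch ignores the requirement of surrounding blank lines.

```python
import re

# Opening row: three or more tildes, optionally followed by {attributes}.
OPENER = re.compile(r"^(~{3,})[ \t]*(\{[^}]*\})?[ \t]*$")

def delimited_blocks(lines):
    """Return the body (a list of lines) of each delimited code block."""
    blocks = []
    i = 0
    while i < len(lines):
        opener = OPENER.match(lines[i])
        if opener:
            width = len(opener.group(1))
            # The closing row needs at least as many tildes as the opener.
            closer = re.compile(r"^~{%d,}[ \t]*$" % width)
            for j in range(i + 1, len(lines)):
                if closer.match(lines[j]):
                    blocks.append(lines[i + 1:j])
                    i = j  # resume scanning after the closing row
                    break
        i += 1
    return blocks
```

Applied to the example above, the two inner rows of ten tildes are too short to close the block opened by sixteen tildes, so they are returned as part of the code.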
|
|
|
|
Optionally, you may specify the language of the code block using
|
|
this syntax:
|
|
|
|
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
    qsort []     = []
    qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
                   qsort (filter (>= x) xs)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Some output formats can use this information to do syntax highlighting.
|
|
Currently, the only output format that uses this information is HTML.
|
|
|
|
If pandoc has been compiled with syntax highlighting support, then the
|
|
code block above will appear highlighted, with numbered lines. (To see
|
|
which languages are supported, do `pandoc --version`.)
|
|
|
|
If pandoc has not been compiled with syntax highlighting support, the
|
|
code block above will appear as follows:
|
|
|
|
<pre class="haskell">
|
|
<code>
|
|
...
|
|
</code>
|
|
</pre>
|
|
|
|
Images with captions
|
|
--------------------
|
|
|
|
An image occurring by itself in a paragraph will be rendered as
|
|
a figure with a caption.[^5] (In LaTeX, a figure environment will be
|
|
used; in HTML, the image will be placed in a `div` with class
|
|
`figure`, together with a caption in a `p` with class `caption`.)
|
|
The image's alt text will be used as the caption.
|
|
|
|
![This is the caption](/url/of/image.png)
|
|
|
|
[^5]: This feature is not yet implemented for RTF, OpenDocument, or
|
|
ODT. In those formats, you'll just get an image in a paragraph by
|
|
itself, with no caption.
|
|
|
|
If you want a regular inline image, make sure it is not
the only thing in the paragraph. One way to do this is to insert a
nonbreaking space after the image:
|
|
|
|
![This image won't be a figure](/url/of/image.png)\
|
|
|
|
Title blocks
|
|
------------
|
|
|
|
If the file begins with a title block
|
|
|
|
% title
|
|
% author(s) (separated by semicolons)
|
|
% date
|
|
|
|
it will be parsed as bibliographic information, not regular text. (It
|
|
will be used, for example, in the title of standalone LaTeX or HTML
|
|
output.) The block may contain just a title, a title and an author,
|
|
or all three elements. If you want to include an author but no
|
|
title, or a title and a date but no author, you need a blank line:
|
|
|
|
%
|
|
% Author
|
|
|
|
% My title
|
|
%
|
|
% June 15, 2006
|
|
|
|
The title may occupy multiple lines, but continuation lines must
|
|
begin with leading space, thus:
|
|
|
|
    % My title
      on multiple lines
|
|
|
|
If a document has multiple authors, the authors may be put on
|
|
separate lines with leading space, or separated by semicolons, or
|
|
both. So, all of the following are equivalent:
|
|
|
|
    % Author One
      Author Two

    % Author One; Author Two

    % Author One;
      Author Two
|
|
|
|
The date must fit on one line.
|
|
|
|
All three metadata fields may contain standard inline formatting
|
|
(italics, links, footnotes, etc.).
|
|
|
|
Title blocks will always be parsed, but they will affect the output only
|
|
when the `--standalone` (`-s`) option is chosen. In HTML output, titles
|
|
will appear twice: once in the document head -- this is the title that
|
|
will appear at the top of the window in a browser -- and once at the
|
|
beginning of the document body. The title in the document head can have
|
|
an optional prefix attached (`--title-prefix` or `-T` option). The title
|
|
in the body appears as an H1 element with class "title", so it can be
|
|
suppressed or reformatted with CSS. If a title prefix is specified with
|
|
`-T` and no title block appears in the document, the title prefix will
|
|
be used by itself as the HTML title.
|
|
|
|
The man page writer extracts a title, man page section number, and
|
|
other header and footer information from the title line. The title
|
|
is assumed to be the first word on the title line, which may optionally
|
|
end with a (single-digit) section number in parentheses. (There should
|
|
be no space between the title and the parentheses.) Anything after
|
|
this is assumed to be additional footer and header text. A single pipe
|
|
character (`|`) should be used to separate the footer text from the header
|
|
text. Thus,
|
|
|
|
% PANDOC(1)
|
|
|
|
will yield a man page with the title `PANDOC` and section 1.
|
|
|
|
% PANDOC(1) Pandoc User Manuals
|
|
|
|
will also have "Pandoc User Manuals" in the footer.
|
|
|
|
% PANDOC(1) Pandoc User Manuals | Version 4.0
|
|
|
|
will also have "Version 4.0" in the header.
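
How the man page writer splits the title line can be sketched in Python. This is an illustration only: the function name `parse_man_title` and the returned keys are hypothetical, and the sketch assumes the title line is passed in as plain text with its leading `%`.

```python
import re

def parse_man_title(line):
    """Split a man page title line into title, section, footer, and header."""
    text = line.lstrip("%").strip()
    # The title is the first word; everything after it is footer/header text.
    first_word, _, rest = text.partition(" ")
    # An optional single-digit section number in parentheses, with no space.
    match = re.fullmatch(r"(.+)\((\d)\)", first_word)
    if match:
        title, section = match.group(1), match.group(2)
    else:
        title, section = first_word, None
    # A single pipe separates the footer text from the header text.
    footer, _, header = rest.partition("|")
    return {
        "title": title,
        "section": section,
        "footer": footer.strip(),
        "header": header.strip(),
    }
```

For the last title line above, this sketch yields the title `PANDOC`, section `1`, footer `Pandoc User Manuals`, and header `Version 4.0`.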
|
|
|
|
Markdown in HTML blocks
|
|
-----------------------
|
|
|
|
While standard markdown leaves HTML blocks exactly as they are, Pandoc
|
|
treats text between HTML tags as markdown. Thus, for example, Pandoc
|
|
will turn
|
|
|
|
<table>
|
|
<tr>
|
|
<td>*one*</td>
|
|
<td>[a link](http://google.com)</td>
|
|
</tr>
|
|
</table>
|
|
|
|
into
|
|
|
|
<table>
|
|
<tr>
|
|
<td><em>one</em></td>
|
|
<td><a href="http://google.com">a link</a></td>
|
|
</tr>
|
|
</table>
|
|
|
|
whereas `Markdown.pl` will preserve it as is.
|
|
|
|
There is one exception to this rule: text between `<script>` and
|
|
`</script>` tags is not interpreted as markdown.
|
|
|
|
This departure from standard markdown should make it easier to mix
|
|
markdown with HTML block elements. For example, one can surround
|
|
a block of markdown text with `<div>` tags without preventing it
|
|
from being interpreted as markdown.
|
|
|
|
Header identifiers in HTML
|
|
--------------------------
|
|
|
|
Each header element in pandoc's HTML output is given a unique
|
|
identifier. This identifier is based on the text of the header. To
|
|
derive the identifier from the header text,
|
|
|
|
- Remove all formatting, links, etc.
|
|
- Remove all punctuation, except underscores, hyphens, and periods.
|
|
- Replace all spaces and newlines with hyphens.
|
|
- Convert all alphabetic characters to lowercase.
|
|
- Remove everything up to the first letter (identifiers may
|
|
not begin with a number or punctuation mark).
|
|
- If nothing is left after this, use the identifier `section`.
|
|
|
|
Thus, for example,
|
|
|
|
  Header                                  Identifier
  -------------------------------------   ---------------------------
  Header identifiers in HTML              `header-identifiers-in-html`
  *Dogs*?--in *my* house?                 `dogs--in-my-house`
  [HTML], [S5], or [RTF]?                 `html-s5-or-rtf`
  3. Applications                         `applications`
  33                                      `section`
|
|
|
|
These rules should, in most cases, allow one to determine the identifier
|
|
from the header text. The exception is when several headers have the
|
|
same text; in this case, the first will get an identifier as described
|
|
above; the second will get the same identifier with `-1` appended; the
|
|
third with `-2`; and so on.
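
The derivation rules and the handling of repeated headers can be sketched in Python. This is a minimal illustration, not pandoc's code: the function names are hypothetical, the input is assumed to be the header text with formatting and links already removed, and each space is replaced individually, as the rule is worded.

```python
import re

def header_identifier(text):
    """Derive an identifier from plain header text, following the rules above."""
    # Keep letters, digits, whitespace, underscores, hyphens, and periods.
    kept = "".join(ch for ch in text
                   if ch.isalnum() or ch.isspace() or ch in "_-.")
    # Replace spaces and newlines with hyphens, then lowercase.
    lowered = re.sub(r"\s", "-", kept).lower()
    # Drop everything up to the first letter.
    for index, ch in enumerate(lowered):
        if ch.isalpha():
            return lowered[index:]
    return "section"  # nothing is left

def assign_identifiers(headers):
    """Append -1, -2, ... to identifiers that repeat."""
    seen = {}
    result = []
    for header in headers:
        base = header_identifier(header)
        count = seen.get(base, 0)
        result.append(base if count == 0 else "%s-%d" % (base, count))
        seen[base] = count + 1
    return result
```

With this sketch, `assign_identifiers(["Math", "Math", "Math"])` gives `math`, `math-1`, and `math-2`. The sketch does not guard against a collision between a generated `math-1` and a header whose own text produces `math-1`.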
|
|
|
|
These identifiers are used to provide link targets in the table of
|
|
contents generated by the `--toc` (`--table-of-contents`) option. They
|
|
also make it easy to provide links from one section of a document to
|
|
another. A link to this section, for example, might look like this:
|
|
|
|
See the section on [header identifiers](#header-identifiers-in-html).
|
|
|
|
Note, however, that this method of providing links to sections works
|
|
only in HTML.
|
|
|
|
If the `--section-divs` option is specified, then each section will
be wrapped in a `div`, and the identifier will be attached to the
enclosing `<div>` tag rather than the header itself. This allows entire
sections to be manipulated using JavaScript or treated differently in
CSS.
|
|
|
|
Blank lines before headers and blockquotes
|
|
------------------------------------------
|
|
|
|
Standard markdown syntax does not require a blank line before a header
|
|
or blockquote. Pandoc does require this (except, of course, at the
|
|
beginning of the document). The reason for the requirement is that
|
|
it is all too easy for a `>` or `#` to end up at the beginning of a
|
|
line by accident (perhaps through line wrapping). Consider, for
|
|
example:
|
|
|
|
I like several of their flavors of ice cream:
|
|
#22, for example, and #5.
|
|
|
|
Math
|
|
----
|
|
|
|
Anything between two $ characters will be treated as TeX math. The
|
|
opening $ must have a character immediately to its right, while the
|
|
closing $ must have a character immediately to its left. Thus,
|
|
`$20,000 and $30,000` won't parse as math. If for some reason
|
|
you need to enclose text in literal $ characters, backslash-escape
|
|
them and they won't be treated as math delimiters.
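
The delimiter rule can be approximated with a regular expression. The Python sketch below is an illustration, not pandoc's parser: the name `find_inline_math` is hypothetical, "a character" is taken to mean a non-space character, and only single-line inline math is considered.

```python
import re

# An unescaped $ followed by a non-space character opens math; an unescaped $
# preceded by a non-space character closes it.
INLINE_MATH = re.compile(r"(?<!\\)\$(?=\S)(.+?)(?<=\S)(?<!\\)\$")

def find_inline_math(text):
    """Return the TeX source of each inline math span found in text."""
    return [match.group(1) for match in INLINE_MATH.finditer(text)]
```

In `$20,000 and $30,000`, the second `$` is preceded by a space, so it cannot close a math span, and the sketch finds no math.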
|
|
|
|
TeX math will be printed in all output formats. In Markdown, LaTeX,
Org-Mode, and ConTeXt output, it will appear verbatim between $
characters.
|
|
|
|
In reStructuredText output, it will be rendered using an interpreted
|
|
text role `:math:`, as described
|
|
[here](http://www.american.edu/econ/itex2mml/mathhack.rst).
|
|
|
|
In Texinfo output, it will be rendered inside a `@math` command.
|
|
|
|
In groff man output, it will be rendered verbatim without $'s.
|
|
|
|
In MediaWiki output, it will be rendered inside `<math>` tags.
|
|
|
|
In Textile output, it will be rendered inside `<span class="math">` tags.
|
|
|
|
In RTF, DocBook, and OpenDocument output, it will be rendered, as far as
possible, using Unicode characters, and will otherwise appear verbatim.
Unknown commands and symbols, and commands that cannot be dealt with
this way (like `\frac`), will be rendered verbatim. So the results may
be a mix of raw TeX code and properly rendered Unicode math.
|
|
|
|
In HTML, Slidy, and S5 output, the way math is rendered will depend on the
|
|
command-line options selected:
|
|
|
|
1.  The default is to render TeX math as far as possible using Unicode
    characters, as with RTF, DocBook, and OpenDocument output. Formulas
    are put inside a `span` with `class="math"`, so that they may be
    styled differently from the surrounding text if needed.
|
|
|
|
2. If the `--latexmathml` option is used, TeX math will be displayed
|
|
between $ or $$ characters and put in `<span>` tags with class `LaTeX`.
|
|
The [LaTeXMathML] script will be used to render it as formulas.
|
|
(This trick does not work in all browsers, but it works in Firefox.
|
|
In browsers that do not support LaTeXMathML, TeX math will appear
|
|
verbatim between $ characters.)
|
|
|
|
3. If the `--jsmath` option is used, TeX math will be put inside
|
|
`<span>` tags (for inline math) or `<div>` tags (for display math)
|
|
with class `math`. The [jsMath] script will be used to render
|
|
it.
|
|
|
|
4. If the `--mimetex` option is used, the [mimeTeX] CGI script will
|
|
be called to generate images for each TeX formula. This should
|
|
work in all browsers. The `--mimetex` option takes an optional URL
|
|
as argument. If no URL is specified, it will be assumed that the
|
|
mimeTeX CGI script is at `/cgi-bin/mimetex.cgi`.
|
|
|
|
5. If the `--gladtex` option is used, TeX formulas will be enclosed
|
|
in `<eq>` tags in the HTML output. The resulting `htex` file may then
|
|
be processed by [gladTeX], which will produce image files for each
|
|
formula and an `html` file with links to these images. So, the
|
|
procedure is:
|
|
|
|
pandoc -s --gladtex myfile.txt -o myfile.htex
|
|
gladtex -d myfile-images myfile.htex
|
|
# produces myfile.html and images in myfile-images
|
|
|
|
6. If the `--webtex` option is used, TeX formulas will be converted
|
|
to `<img>` tags that link to an external script that converts
|
|
formulas to images. The formula will be URL-encoded and concatenated
|
|
with the URL provided. If no URL is specified, the Google Chart
|
|
API will be used (`http://chart.apis.google.com/chart?cht=tx&chl=`).
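
The URL construction described for `--webtex` can be sketched in Python. This is only an illustration of "URL-encode and concatenate": the function name `webtex_image_url` is hypothetical, and the exact set of characters that pandoc escapes is an assumption (here every reserved character is escaped).

```python
from urllib.parse import quote

# Default base URL quoted in the text above.
GOOGLE_CHART_URL = "http://chart.apis.google.com/chart?cht=tx&chl="

def webtex_image_url(formula, base_url=GOOGLE_CHART_URL):
    """URL-encode a TeX formula and append it to the base URL."""
    return base_url + quote(formula, safe="")

print(webtex_image_url(r"\frac{a}{b}"))
```

The resulting URL would then be used as the `src` of the `<img>` tag for that formula.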
|
|
|
|
LaTeX Macros
|
|
------------
|
|
|
|
For output formats other than LaTeX, pandoc will parse LaTeX `\newcommand` and
|
|
`\renewcommand` definitions and apply the resulting macros to all LaTeX
|
|
math. So, for example, the following will work in all output formats,
|
|
not just LaTeX:
|
|
|
|
\newcommand{\tuple}[1]{\langle #1 \rangle}
|
|
|
|
$\tuple{a, b, c}$
|
|
|
|
Inline TeX
|
|
----------
|
|
|
|
Inline TeX commands will be preserved and passed unchanged to the
|
|
LaTeX and ConTeXt writers. Thus, for example, you can use LaTeX to
|
|
include BibTeX citations:
|
|
|
|
This result was proved in \cite{jones.1967}.
|
|
|
|
Note that in LaTeX environments, like
|
|
|
|
\begin{tabular}{|l|l|}\hline
|
|
Age & Frequency \\ \hline
|
|
18--25 & 15 \\
|
|
26--35 & 33 \\
|
|
36--45 & 22 \\ \hline
|
|
\end{tabular}
|
|
|
|
the material between the begin and end tags will be interpreted as raw
|
|
LaTeX, not as markdown.
|
|
|
|
Inline LaTeX is ignored in output formats other than Markdown, LaTeX,
|
|
and ConTeXt.
|
|
|
|
Citations
|
|
---------
|
|
|
|
Pandoc can automatically generate citations and a bibliography in a number of
|
|
styles (using Andrea Rossato's `hs-citeproc`). In order to use this feature,
|
|
you will need a bibliographic database in one of the following formats:
|
|
|
|
  Format            File extension
  -------------     --------------
  MODS              .mods
  BibTeX            .bib
  BibLaTeX          .bbx
  RIS               .ris
  EndNote           .enl
  EndNote XML       .xml
  ISI               .wos
  MEDLINE           .medline
  Copac             .copac
  JSON citeproc     .json
|
|
|
|
You will need to specify the bibliography file using the `--bibliography`
|
|
command-line option (which may be repeated if you have several
|
|
bibliographies).
|
|
|
|
By default, pandoc will use a Chicago author-date format for citations
|
|
and references. To use another style, you will need to use the
|
|
`--csl` option to specify a [CSL] 1.0 style file. A primer on
|
|
creating and modifying CSL styles can be found at
|
|
<http://citationstyles.org/downloads/primer.html>.
|
|
|
|
Citations go inside square brackets and are separated by semicolons.
|
|
Each citation must have a key, composed of '@' + the citation
|
|
identifier from the database, and may optionally have a prefix,
|
|
a locator, and a suffix. Here are some examples:
|
|
|
|
Blah blah [see @doe99, pp. 33-35; also @smith04, ch. 1].
|
|
|
|
Blah blah [@doe99, pp. 33-35, 38-39 and *passim*].
|
|
|
|
Blah blah [@smith04; @doe99].
|
|
|
|
A minus sign (`-`) before the `@` will suppress mention of
|
|
the author in the citation. This can be useful when the
|
|
author is already mentioned in the text:
|
|
|
|
Smith says blah [-@smith04].
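
The structure of a bracketed citation can be sketched in Python. This is a rough illustration, not the parser pandoc uses: the name `parse_citation_group` is hypothetical, keys are assumed to consist of letters, digits, and underscores, and the locator and suffix are returned together as `rest` rather than separated.

```python
import re

# prefix, optional minus sign, @key, then locator and suffix (kept together)
CITATION = re.compile(
    r"^(?P<prefix>.*?)(?P<minus>-?)@(?P<key>[A-Za-z0-9_]+)(?P<rest>.*)$"
)

def parse_citation_group(bracketed):
    """Split '[see @doe99, pp. 33-35; also @smith04]' into its citations."""
    inner = bracketed.strip()[1:-1]  # drop the square brackets
    citations = []
    for part in inner.split(";"):    # citations are separated by semicolons
        match = CITATION.match(part.strip())
        if match is None:
            continue
        citations.append({
            "prefix": match["prefix"].strip(),
            "key": match["key"],
            "suppress_author": match["minus"] == "-",
            "rest": match["rest"].lstrip(", ").strip(),
        })
    return citations
```

For `[-@smith04]` the sketch reports the key `smith04` with `suppress_author` set, which corresponds to the case where the author is already named in the text.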
|
|
|
|
You can also write an in-text citation, as follows:
|
|
|
|
@smith04 says blah.
|
|
|
|
@smith04 [p. 33] says blah.
|
|
|
|
If the style calls for a list of works cited, it will be placed
|
|
at the end of the document. Normally, you will want to end your
|
|
document with an appropriate header:
|
|
|
|
last paragraph...
|
|
|
|
# References
|
|
|
|
The bibliography will be inserted after this header.
|
|
|
|
Producing HTML slide shows with Pandoc
|
|
======================================
|
|
|
|
You can use Pandoc to produce an HTML + JavaScript slide presentation
that can be viewed via a web browser. There are two ways to do this,
using [S5] or [Slidy].
|
|
|
|
Here's the markdown source for a simple slide show, `eating.txt`:
|
|
|
|
% Eating Habits
|
|
% John Doe
|
|
% March 22, 2005
|
|
|
|
# In the morning
|
|
|
|
- Eat eggs
|
|
- Drink coffee
|
|
|
|
# In the evening
|
|
|
|
- Eat spaghetti
|
|
- Drink wine
|
|
|
|
--------------------------
|
|
|
|
![picture of spaghetti](images/spaghetti.jpg)
|
|
|
|
To produce the slide show, simply type
|
|
|
|
pandoc -w s5 -s eating.txt > eating.html
|
|
|
|
for S5, or
|
|
|
|
pandoc -w slidy -s eating.txt > eating.html
|
|
|
|
for Slidy.
|
|
|
|
A title page is constructed automatically from the document's title
|
|
block. Each level-one header and horizontal rule begins a new slide.
|
|
|
|
The file produced by pandoc with the `-s/--standalone` option embeds
links to JavaScript and CSS files, which are assumed to be available at
the relative path `ui/default` (for S5) or at the Slidy website at
`w3.org` (for Slidy). If the `--offline` option is specified, the
scripts and CSS will be included directly in the generated file, so that
it may be used offline.
|
|
|
|
You can change the style of the slides by putting customized CSS files
|
|
in `$DATADIR/s5/default` (for S5) or `$DATADIR/slidy` (for Slidy),
|
|
where `$DATADIR` is the user data directory (see `--data-dir`, above).
|
|
The originals may be found in pandoc's system data directory (generally
|
|
`$CABALDIR/pandoc-VERSION/s5/default`). Pandoc will look there for any
|
|
files it does not find in the user data directory.
|
|
|
|
Incremental lists
|
|
-----------------
|
|
|
|
By default, these writers produce lists that display "all at once."
|
|
If you want your lists to display incrementally (one item at a time),
|
|
use the `-i` option. If you want a particular list to depart from the
|
|
default (that is, to display incrementally without the `-i` option and
|
|
all at once with the `-i` option), put it in a block quote:
|
|
|
|
> - Eat spaghetti
|
|
> - Drink wine
|
|
|
|
In this way incremental and nonincremental lists can be mixed in
|
|
a single document.
|
|
|
|
Literate Haskell support
|
|
========================
|
|
|
|
If you append `+lhs` to an appropriate input or output format (`markdown`,
|
|
`rst`, or `latex` for input or output; `html` for output only), pandoc
|
|
will treat the document as literate Haskell source. This means that
|
|
|
|
- In markdown input, "bird track" sections will be parsed as Haskell
|
|
code rather than block quotations. Text between `\begin{code}`
|
|
and `\end{code}` will also be treated as Haskell code.
|
|
|
|
- In markdown output, code blocks with class `haskell` will be
|
|
rendered using bird tracks, and block quotations will be
|
|
indented one space, so they will not be treated as Haskell code.
|
|
In addition, headers will be rendered setext-style (with underlines)
|
|
rather than atx-style (with '#' characters). (This is because ghc
|
|
treats '#' characters in column 1 as introducing line numbers.)
|
|
|
|
- In reStructuredText input, "bird track" sections will be parsed
  as Haskell code.

- In reStructuredText output, code blocks with class `haskell` will
  be rendered using bird tracks.
|
|
|
|
- In LaTeX input, text in `code` environments will be parsed as
|
|
Haskell code.
|
|
|
|
- In LaTeX output, code blocks with class `haskell` will be rendered
|
|
inside `code` environments.
|
|
|
|
- In HTML output, code blocks with class `haskell` will be rendered
|
|
with class `literatehaskell` and bird tracks.
|
|
|
|
Examples:
|
|
|
|
pandoc -f markdown+lhs -t html
|
|
|
|
reads literate Haskell source formatted with markdown conventions and writes
|
|
ordinary HTML (without bird tracks).
|
|
|
|
pandoc -f markdown+lhs -t html+lhs
|
|
|
|
writes HTML with the Haskell code in bird tracks, so it can be copied
|
|
and pasted as literate Haskell source.
|
|
|
|
Authors
|
|
=======
|
|
|
|
© 2006-2010 John MacFarlane (jgm at berkeley dot edu). Released under the
|
|
[GPL], version 2 or greater. This software carries no warranty of
|
|
any kind. (See COPYRIGHT for full copyright and warranty notices.)
|
|
Other contributors include Recai Oktaş, Paulo Tanimoto, Peter Wang,
|
|
Andrea Rossato, Eric Kow, infinity0x, Luke Plant, shreevatsa.public,
|
|
Puneeth Chaganti, Paul Rivier, rodja.trappe, Bradley Kuhn, thsutton,
|
|
Nathan Gass, Jonathan Daugherty, Jérémy Bobbio, Justin Bogner.
|
|
|
|
[markdown]: http://daringfireball.net/projects/markdown/
|
|
[reStructuredText]: http://docutils.sourceforge.net/docs/ref/rst/introduction.html
|
|
[S5]: http://meyerweb.com/eric/tools/s5/
|
|
[Slidy]: http://www.w3.org/Talks/Tools/Slidy/
|
|
[HTML]: http://www.w3.org/TR/html40/
|
|
[LaTeX]: http://www.latex-project.org/
|
|
[ConTeXt]: http://www.pragma-ade.nl/
|
|
[RTF]: http://en.wikipedia.org/wiki/Rich_Text_Format
|
|
[DocBook XML]: http://www.docbook.org/
|
|
[OpenDocument XML]: http://opendocument.xml.org/
|
|
[ODT]: http://en.wikipedia.org/wiki/OpenDocument
|
|
[Textile]: http://redcloth.org/textile
|
|
[MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting
|
|
[groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html
|
|
[Haskell]: http://www.haskell.org/
|
|
[GNU Texinfo]: http://www.gnu.org/software/texinfo/
|
|
[Emacs Org-Mode]: http://org-mode.org
|
|
[EPUB]: http://www.idpf.org/
|
|
[GPL]: http://www.gnu.org/copyleft/gpl.html "GNU General Public License"
|
|
|