`) tag rather than the header itself. This allows entire
+sections to be manipulated using JavaScript or treated differently in
+CSS.
+
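The derivation rules for `auto_identifiers` (drop punctuation other than underscores, hyphens, and periods; turn spaces and newlines into hyphens; lowercase; drop everything before the first letter; fall back to `section`; number duplicates with `-1`, `-2`, and so on) can be sketched in a few lines. This is only an illustrative approximation, not pandoc's implementation: the function names `auto_identifier` and `unique_identifiers` are hypothetical, and the input is assumed to be plain header text from which formatting, links, and footnotes have already been removed.

```python
import re


def auto_identifier(header_text):
    """Derive an identifier from plain header text (hypothetical helper)."""
    # Keep letters, digits, whitespace, underscores, hyphens, and periods.
    kept = "".join(
        ch for ch in header_text
        if ch.isalnum() or ch.isspace() or ch in "_-."
    )
    # Replace each space or newline with a hyphen, then lowercase.
    # Trimming the ends first is an assumption of this sketch.
    ident = re.sub(r"\s", "-", kept.strip()).lower()
    # Remove everything up to the first letter.
    first_letter = re.search(r"[^\W\d_]", ident)
    ident = ident[first_letter.start():] if first_letter else ""
    # If nothing is left, fall back to "section".
    return ident or "section"


def unique_identifiers(headers):
    """Append -1, -2, ... to identifiers that repeat (hypothetical helper)."""
    seen = {}
    result = []
    for text in headers:
        base = auto_identifier(text)
        count = seen.get(base, 0)
        seen[base] = count + 1
        result.append(base if count == 0 else f"{base}-{count}")
    return result
```

Under these assumptions, `auto_identifier("3. Applications")` gives `applications` and `auto_identifier("33")` gives `section`. The sketch does not guard against a numbered duplicate colliding with another header's own identifier.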
+#### Extension: `ascii_identifiers` ####
+
+Causes the identifiers produced by `auto_identifiers` to be pure ASCII.
+Accents are stripped off of accented Latin letters, and non-Latin
+letters are omitted.
+
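As a rough illustration of what `ascii_identifiers` describes, the sketch below strips accents by Unicode decomposition and then drops every remaining non-ASCII character. It is an assumption-laden approximation rather than pandoc's own mapping: the name `to_ascii` is hypothetical, and letters without a canonical decomposition (such as `ø` or `ß`) are simply dropped here, which may differ from pandoc's behavior.

```python
import unicodedata


def to_ascii(identifier):
    """Reduce an identifier to pure ASCII (hypothetical helper)."""
    # NFD splits an accented Latin letter into its base letter
    # followed by combining marks.
    decomposed = unicodedata.normalize("NFD", identifier)
    # Dropping everything outside ASCII removes the combining marks
    # and omits non-Latin letters entirely.
    return "".join(ch for ch in decomposed if ord(ch) < 128)
```

With this sketch, an identifier such as `résumé` becomes `resume`, while the letters of a Cyrillic or Greek word are omitted altogether.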
+Math Input
+----------
+
+The extensions [`tex_math_dollars`](#extension-tex_math_dollars),
+[`tex_math_single_backslash`](#extension-tex_math_single_backslash), and
+[`tex_math_double_backslash`](#extension-tex_math_double_backslash)
+are described in the section about Pandoc's Markdown.
+
+However, they can also be used with HTML input. This is handy for
+reading web pages formatted using MathJax, for example.
+
+Raw HTML/TeX
+------------
+
+The following extensions (especially how they affect Markdown
+input/output) are also described in more detail in their respective
+sections of [Pandoc's Markdown].
+
+#### [Extension: `raw_html`] {#raw_html}
+
+When converting from HTML, parse elements that are not representable
+in pandoc's AST as raw HTML.
+By default, this is disabled for HTML input.
+
+#### [Extension: `raw_tex`] {#raw_tex}
+
+Allows raw LaTeX, TeX, and ConTeXt to be included in a document.
+
+This extension can be enabled/disabled for the following formats
+(in addition to `markdown`):
+
+input formats
+: `latex`, `org`, `textile`
+
+output formats
+: `textile`
+
+#### [Extension: `native_divs`] {#native_divs}
+
+This extension is enabled by default for HTML input. This means that
+`div`s are parsed to pandoc native elements. (Alternatively, you
+can parse them to raw HTML using `-f html-native_divs+raw_html`.)
+
+When converting HTML to Markdown, for example, you may want to drop all
+`div`s and `span`s:
+
+ pandoc -f html-native_divs-native_spans -t markdown
+
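A format string such as `html-native_divs+raw_html` combines a base format name with extensions that are enabled by `+EXTENSION` and disabled by `-EXTENSION`. The following sketch shows how such a string decomposes. It is not pandoc's parser: `parse_format_spec` is a hypothetical helper, it does not check that the names are real extensions, and it does not resolve aliases such as `lhs`.

```python
import re


def parse_format_spec(spec):
    """Split a format string into (base, enabled, disabled)."""
    # The base format name runs up to the first '+' or '-'.
    match = re.match(r"[^+-]+", spec)
    if match is None:
        raise ValueError(f"no format name in {spec!r}")
    base = match.group(0)
    enabled, disabled = set(), set()
    # Each following '+name' or '-name' toggles one extension;
    # a later modifier overrides an earlier one for the same name.
    for sign, name in re.findall(r"([+-])([^+-]+)", spec[match.end():]):
        if sign == "+":
            enabled.add(name)
            disabled.discard(name)
        else:
            disabled.add(name)
            enabled.discard(name)
    return base, enabled, disabled
```

For instance, `parse_format_spec("html-native_divs-native_spans")` yields the base `html` with both `native_divs` and `native_spans` in the disabled set.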
+#### [Extension: `native_spans`] {#native_spans}
+
+Analogous to `native_divs` above.
+
+
+Literate Haskell support
+------------------------
+
+#### Extension: `literate_haskell` ####
+
+Treat the document as literate Haskell source.
+
+This extension can be enabled/disabled for the following formats:
+
+input formats
+: `markdown`, `rst`, `latex`
+
+output formats
+: `markdown`, `rst`, `latex`, `html`
+
+If you append `+lhs` (or `+literate_haskell`) to one of the formats
+above, pandoc will treat the document as literate Haskell source.
+This means that
+
+ - In Markdown input, "bird track" sections will be parsed as Haskell
+ code rather than block quotations. Text between `\begin{code}`
+ and `\end{code}` will also be treated as Haskell code. For
+ ATX-style headers the character '=' will be used instead of '#'.
+
+ - In Markdown output, code blocks with classes `haskell` and `literate`
+ will be rendered using bird tracks, and block quotations will be
+ indented one space, so they will not be treated as Haskell code.
+ In addition, headers will be rendered setext-style (with underlines)
+ rather than ATX-style (with '#' characters). (This is because GHC
+ treats '#' characters in column 1 as introducing line numbers.)
+
+ - In reStructuredText input, "bird track" sections will be parsed
+ as Haskell code.
+
+ - In reStructuredText output, code blocks with class `haskell` will
+ be rendered using bird tracks.
+
+ - In LaTeX input, text in `code` environments will be parsed as
+ Haskell code.
+
+ - In LaTeX output, code blocks with class `haskell` will be rendered
+ inside `code` environments.
+
+ - In HTML output, code blocks with class `haskell` will be rendered
+ with class `literatehaskell` and bird tracks.
+
+Examples:
+
+ pandoc -f markdown+lhs -t html
+
+reads literate Haskell source formatted with Markdown conventions and writes
+ordinary HTML (without bird tracks).
+
+ pandoc -f markdown+lhs -t html+lhs
+
+writes HTML with the Haskell code in bird tracks, so it can be copied
+and pasted as literate Haskell source.
+
+Note that GHC expects the bird tracks in the first column, so indented literate
+code blocks (e.g. inside an itemized environment) will not be picked up by the
+Haskell compiler.
+
+Other extensions
+----------------
+
+#### Extension: `empty_paragraphs` ####
+
+Allows empty paragraphs. By default empty paragraphs are
+omitted.
+
+This extension can be enabled/disabled for the following formats:
+
+input formats
+: `docx`, `html`
+
+output formats
+: `markdown`, `docx`, `odt`, `opendocument`, `html`
+
+#### Extension: `amuse` ####
+
+In the `muse` input format, this enables Text::Amuse
+extensions to Emacs Muse markup.
+
+#### Extension: `citations` {#org-citations}
+
+Some aspects of [Pandoc's Markdown citation syntax](#citations) are also accepted
+in `org` input.
+
+
Pandoc's Markdown
=================
@@ -1705,11 +1958,9 @@ Pandoc understands an extended and slightly revised version of
John Gruber's [Markdown] syntax. This document explains the syntax,
noting differences from standard Markdown. Except where noted, these
differences can be suppressed by using the `markdown_strict` format instead
-of `markdown`. An extensions can be enabled by adding `+EXTENSION`
-to the format name and disabled by adding `-EXTENSION`. For example,
-`markdown_strict+footnotes` is strict Markdown with footnotes
-enabled, while `markdown-footnotes-pipe_tables` is pandoc's
-Markdown without footnotes or pipe tables.
+of `markdown`. Extensions can be enabled or disabled to specify the
+behavior more granularly. They are described in the sections below. See
+also [Extensions] above for extensions that also work with other formats.
Philosophy
----------
@@ -1801,6 +2052,8 @@ pandoc does require the space.
### Header identifiers ###
+See also the [`auto_identifiers` extension](#extension-auto_identifiers) above.
+
#### Extension: `header_attributes` ####
Headers can be assigned attributes using this syntax at the end
@@ -1837,55 +2090,6 @@ is just the same as
# My header {.unnumbered}
-#### Extension: `auto_identifiers` ####
-
-A header without an explicitly specified identifier will be
-automatically assigned a unique identifier based on the header text.
-To derive the identifier from the header text,
-
- - Remove all formatting, links, etc.
- - Remove all footnotes.
- - Remove all punctuation, except underscores, hyphens, and periods.
- - Replace all spaces and newlines with hyphens.
- - Convert all alphabetic characters to lowercase.
- - Remove everything up to the first letter (identifiers may
- not begin with a number or punctuation mark).
- - If nothing is left after this, use the identifier `section`.
-
-Thus, for example,
-
- Header Identifier
- ------------------------------- ----------------------------
- `Header identifiers in HTML` `header-identifiers-in-html`
- `*Dogs*?--in *my* house?` `dogs--in-my-house`
- `[HTML], [S5], or [RTF]?` `html-s5-or-rtf`
- `3. Applications` `applications`
- `33` `section`
-
-These rules should, in most cases, allow one to determine the identifier
-from the header text. The exception is when several headers have the
-same text; in this case, the first will get an identifier as described
-above; the second will get the same identifier with `-1` appended; the
-third with `-2`; and so on.
-
-These identifiers are used to provide link targets in the table of
-contents generated by the `--toc|--table-of-contents` option. They
-also make it easy to provide links from one section of a document to
-another. A link to this section, for example, might look like this:
-
- See the section on
- [header identifiers](#header-identifiers-in-html-latex-and-context).
-
-Note, however, that this method of providing links to sections works
-only in HTML, LaTeX, and ConTeXt formats.
-
-If the `--section-divs` option is specified, then each section will
-be wrapped in a `div` (or a `section`, if `html5` was specified),
-and the identifier will be attached to the enclosing `<div>`
-(or `<section>`) tag rather than the header itself. This allows entire
-sections to be manipulated using JavaScript or treated differently in
-CSS.
-
#### Extension: `implicit_header_references` ####
Pandoc behaves as if reference links have been defined for each header.
@@ -3028,8 +3232,6 @@ HTML, Slidy, DZSlides, S5, EPUB
command-line options selected. Therefore see [Math rendering in HTML]
above.
-This extension can be used with both `markdown` and `html` input.
-
[interpreted text role `:math:`]: http://docutils.sourceforge.net/docs/ref/rst/roles.html#math
Raw HTML
@@ -3457,33 +3659,6 @@ they cannot contain multiple paragraphs). The syntax is as follows:
Inline and regular footnotes may be mixed freely.
-Typography
-----------
-
-#### Extension: `smart` ####
-
-Interpret straight quotes as curly quotes, `---` as em-dashes,
-`--` as en-dashes, and `...` as ellipses. Nonbreaking spaces are
-inserted after certain abbreviations, such as "Mr." This
-option currently affects the input formats `markdown`,
-`commonmark`, `latex`, `mediawiki`, `org`, `rst`, and `twiki`,
-and the output formats `markdown`, `latex`, and `context`.
-It is enabled by default for `markdown`, `latex`, and `context`
-(in both input and output).
-
-Note: If you are *writing* Markdown, then the `smart` extension
-has the reverse effect: what would have been curly quotes comes
-out straight.
-
-In LaTeX, `smart` means to use the standard TeX ligatures
-for quotation marks (` `` ` and ` '' ` for double quotes,
-`` ` `` and `` ' `` for single quotes) and dashes (`--` for
-en-dash and `---` for em-dash). If `smart` is disabled,
-then in reading LaTeX pandoc will parse these characters
-literally. In writing LaTeX, enabling `smart` tells pandoc
-to use the ligatures when possible; if `smart` is disabled
-pandoc will use unicode quotation mark and dash characters.
-
Citations
---------
@@ -3746,8 +3921,6 @@ TeX math, and anything between `\[` and `\]` to be interpreted
as display TeX math. Note: a drawback of this extension is that
it precludes escaping `(` and `[`.
-This extension can be used with both `markdown` and `html` input.
-
#### Extension: `tex_math_double_backslash` ####
Causes anything between `\\(` and `\\)` to be interpreted as inline
@@ -3790,12 +3963,6 @@ simply skipped (as opposed to being parsed as paragraphs).
Makes all absolute URIs into links, even when not surrounded by
pointy braces `<...>`.
-#### Extension: `ascii_identifiers` ####
-
-Causes the identifiers produced by `auto_identifiers` to be pure ASCII.
-Accents are stripped off of accented Latin letters, and non-Latin
-letters are omitted.
-
#### Extension: `mmd_link_attributes` ####
Parses multimarkdown style key-value attributes on link
@@ -3839,12 +4006,6 @@ in several respects:
we must either disallow lazy wrapping or require a blank line between
list items.
-#### Extension: `empty_paragraphs` ####
-
-Allows empty paragraphs. By default empty paragraphs are
-omitted. This affects the `docx` reader and writer, the
-`opendocument` and `odt` writer, and all HTML-based readers and writers.
-
Markdown variants
-----------------
@@ -3878,34 +4039,21 @@ variants are supported:
: `raw_html`, `shortcut_reference_links`,
`spaced_reference_links`.
-We also support `gfm` (GitHub-Flavored Markdown) as a set of
-extensions on `commonmark`:
+We also support `commonmark` and `gfm` (GitHub-Flavored Markdown,
+which is implemented as a set of extensions on `commonmark`).
+Note, however, that `commonmark` and `gfm` have limited support
+for extensions. Only those listed below (and `smart` and
+`raw_tex`) will work. The extensions can, however, all be
+individually disabled.
+Also, `raw_tex` only affects `gfm` output, not input.
+
+`gfm` (GitHub-Flavored Markdown)
: `pipe_tables`, `raw_html`, `fenced_code_blocks`, `auto_identifiers`,
`ascii_identifiers`, `backtick_code_blocks`, `autolink_bare_uris`,
`intraword_underscores`, `strikeout`, `hard_line_breaks`, `emoji`,
`shortcut_reference_links`, `angle_brackets_escapable`.
- These can all be individually disabled. Note, however, that
- `commonmark` and `gfm` have limited support for extensions:
- extensions other than those listed above (and `smart` and
- `raw_tex`) will have no effect on `commonmark` or `gfm`.
- And `raw_tex` only affects `gfm` output, not input.
-
-Extensions with formats other than Markdown
--------------------------------------------
-
-Some of the extensions discussed above can be used with formats
-other than Markdown:
-
-* `auto_identifiers` can be used with `latex`, `rst`, `mediawiki`,
- and `textile` input (and is used by default).
-
-* `tex_math_dollars`, `tex_math_single_backslash`, and
- `tex_math_double_backslash` can be used with `html` input.
- (This is handy for reading web pages formatted using MathJax,
- for example.)
-
Producing slide shows with pandoc
=================================
@@ -4257,57 +4405,6 @@ with the `src` attribute. For example:
-Literate Haskell support
-========================
-
-If you append `+lhs` (or `+literate_haskell`) to an appropriate input or output
-format (`markdown`, `markdown_strict`, `rst`, or `latex` for input or output;
-`beamer`, `html4` or `html5` for output only), pandoc will treat the document as
-literate Haskell source. This means that
-
- - In Markdown input, "bird track" sections will be parsed as Haskell
- code rather than block quotations. Text between `\begin{code}`
- and `\end{code}` will also be treated as Haskell code. For
- ATX-style headers the character '=' will be used instead of '#'.
-
- - In Markdown output, code blocks with classes `haskell` and `literate`
- will be rendered using bird tracks, and block quotations will be
- indented one space, so they will not be treated as Haskell code.
- In addition, headers will be rendered setext-style (with underlines)
- rather than ATX-style (with '#' characters). (This is because ghc
- treats '#' characters in column 1 as introducing line numbers.)
-
- - In restructured text input, "bird track" sections will be parsed
- as Haskell code.
-
- - In restructured text output, code blocks with class `haskell` will
- be rendered using bird tracks.
-
- - In LaTeX input, text in `code` environments will be parsed as
- Haskell code.
-
- - In LaTeX output, code blocks with class `haskell` will be rendered
- inside `code` environments.
-
- - In HTML output, code blocks with class `haskell` will be rendered
- with class `literatehaskell` and bird tracks.
-
-Examples:
-
- pandoc -f markdown+lhs -t html
-
-reads literate Haskell source formatted with Markdown conventions and writes
-ordinary HTML (without bird tracks).
-
- pandoc -f markdown+lhs -t html+lhs
-
-writes HTML with the Haskell code in bird tracks, so it can be copied
-and pasted as literate Haskell source.
-
-Note that GHC expects the bird tracks in the first column, so indentend literate
-code blocks (e.g. inside an itemized environment) will not be picked up by the
-Haskell compiler.
-
Syntax highlighting
===================