
% Pandoc filters
% John MacFarlane

# Summary

Pandoc provides an interface for users to write programs (known
as filters) which act on pandoc’s AST.

Pandoc consists of a set of readers and writers. When converting
a document from one format to another, text is parsed by a
reader into pandoc’s intermediate representation of the
document---an "abstract syntax tree" or AST---which is then
converted by the writer into the target format.
The pandoc AST format is defined in the module
[`Text.Pandoc.Definition` in the `pandoc-types` package
](https://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html).

A "filter" is a program that modifies the AST, between the
reader and the writer.

    INPUT --reader--> AST --filter--> AST --writer--> OUTPUT

Pandoc supports two kinds of filters:

- **Lua filters** use the Lua language to
  define transformations on the pandoc AST.  They are
  described in a [separate document](lua-filters.html).

- **JSON filters**, described here, are pipes that read from
  standard input and write to standard output, consuming and
  producing a JSON representation of the pandoc AST:

                             source format
                                  ↓
                               (pandoc)
                                  ↓
                          JSON-formatted AST
                                  ↓
                            (JSON filter)
                                  ↓
                          JSON-formatted AST
                                  ↓
                               (pandoc)
                                  ↓
                            target format

Lua filters have a couple of advantages.  They use a Lua
interpreter that is embedded in pandoc, so you don't need
to have any external software installed.  And they are
usually faster than JSON filters.  But if you wish to
write your filter in a language other than Lua, you may
prefer to use a JSON filter. JSON filters may be written
in any programming language.

You can use a JSON filter directly in a pipeline:

    pandoc -s input.txt -t json | \
     pandoc-citeproc | \
     pandoc -s -f json -o output.html

But it is more convenient to use the `--filter` option,
which handles the plumbing automatically:

    pandoc -s input.txt --filter pandoc-citeproc -o output.html
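
To make the plumbing concrete, here is a minimal sketch of what a JSON filter does with its input and output, using only the Python standard library. The names `run_filter` and `identity` are illustrative, not part of pandoc, and the sample document is hand-written with placeholder version numbers rather than produced by a particular pandoc release.

```python
import io
import json


def run_filter(source, sink, transform):
    """Read a JSON AST from `source`, transform it, write it to `sink`."""
    document = json.load(source)
    json.dump(transform(document), sink)


def identity(document):
    """A do-nothing transformation: the AST passes through unchanged."""
    return document


# A tiny hand-written document standing in for `pandoc -t json` output.
# The version numbers are placeholders, not tied to a specific release.
SAMPLE = json.dumps({
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [{"t": "Para", "c": [{"t": "Str", "c": "Hello"}]}],
})

output = io.StringIO()
run_filter(io.StringIO(SAMPLE), output, identity)
result = json.loads(output.getvalue())
print(sorted(result))  # ['blocks', 'meta', 'pandoc-api-version']
```

In a real filter script the same function would be called as `run_filter(sys.stdin, sys.stdout, identity)`, so that the script can sit between two pandoc invocations or be passed to `--filter`.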

For a gentle introduction to writing your own filters,
continue reading this guide. There’s also a [list of third-party filters
on the wiki](https://github.com/jgm/pandoc/wiki/Pandoc-Filters).


# A simple example

Suppose you wanted to replace all level 2+ headings in a markdown
document with regular paragraphs, with text in italics. How would you go
about doing this?

A first thought would be to use regular expressions. Something
like this:

    perl -pe 's/^##+ (.*)$/\*\1\*/' source.txt

This should work most of the time.  But don't forget
that ATX-style headings can end with a sequence of `#`s
that is not part of the heading text:

    ## My heading ##

And what if your document contains a line starting with `##` in an HTML
comment or delimited code block?

    <!--
    ## This is just a comment
    -->

    ~~~~
    ### A third level heading in standard markdown
    ~~~~

We don't want to touch *these* lines.  Moreover, what about Setext-style
second-level headings?

    A heading
    ---------

We need to handle those too.  Finally, can we be sure that adding
asterisks to each side of our string will put it in italics?
What if the string already contains asterisks around it? Then we'll
end up with bold text, which is not what we want. And what if it contains
a regular unescaped asterisk?

How would you modify your regular expression to handle these cases? It
would be hairy, to say the least.

A better approach is to let pandoc handle the parsing, and
then modify the AST before the document is written. For this,
we can use a filter.

To see what sort of AST is produced when pandoc parses our text,
we can use pandoc's `native` output format:

~~~~
% cat test.txt
## My heading

text with *italics*
% pandoc -s -t native test.txt
Pandoc (Meta {unMeta = fromList []})
[Header 2 ("my-heading",[],[]) [Str "My",Space,Str "heading"]
, Para [Str "text",Space,Str "with",Space,Emph [Str "italics"]] ]
~~~~

A `Pandoc` document consists of a `Meta` block (containing
metadata like title, authors, and date) and a list of `Block`
elements.  In this case, we have two `Block`s, a `Header` and a `Para`.
Each has as its content a list of `Inline` elements.  For more details on
the pandoc AST, see the [haddock documentation for `Text.Pandoc.Definition`].

[haddock documentation for `Text.Pandoc.Definition`]: https://hackage.haskell.org/package/pandoc-types
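
A JSON filter sees the same structure in JSON form, where each element is an object with a `t` (type) field and, when it has contents, a `c` field. As a rough sketch, the document above can be written out by hand and its element types listed with a small recursive generator. The helper name `element_tags` is made up for this illustration, and the API version numbers are placeholders.

```python
def element_tags(node):
    """Yield the "t" tag of every element found in a JSON AST, in document order."""
    if isinstance(node, dict):
        if "t" in node:
            yield node["t"]
        for value in node.values():
            yield from element_tags(value)
    elif isinstance(node, list):
        for item in node:
            yield from element_tags(item)


# Hand-written JSON form of the two blocks shown in the native output above.
DOC = {
    "pandoc-api-version": [1, 23],  # placeholder version numbers
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [2, ["my-heading", [], []],
                              [{"t": "Str", "c": "My"},
                               {"t": "Space"},
                               {"t": "Str", "c": "heading"}]]},
        {"t": "Para", "c": [{"t": "Str", "c": "text"},
                            {"t": "Space"},
                            {"t": "Str", "c": "with"},
                            {"t": "Space"},
                            {"t": "Emph", "c": [{"t": "Str", "c": "italics"}]}]},
    ],
}

tags = list(element_tags(DOC))
print(tags[0], tags[4])  # Header Para
```

The `Header` contents are a three-item list (level, attributes, inlines), mirroring the three fields in the native representation.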

We can use Haskell to create a JSON filter that transforms this
AST, replacing each `Header` block with level >= 2 with a `Para`
with its contents wrapped inside an `Emph` inline:

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- behead.hs
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter behead

behead :: Block -> Block
behead (Header n _ xs) | n >= 2 = Para [Emph xs]
behead x = x
~~~~

The `toJSONFilter` function does two things.  First, it lifts
the `behead` function (which maps `Block -> Block`) onto a
transformation of the entire `Pandoc` AST, walking the AST
and transforming each block.  Second, it wraps this `Pandoc ->
Pandoc` transformation with the necessary JSON serialization
and deserialization, producing an executable that consumes
JSON from stdin and produces JSON to stdout.
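
For readers who do not know Haskell, the two steps can be imitated in a few lines of Python. This is only a sketch of the idea, not how `toJSONFilter` is implemented: `lift`, `to_json_filter` and the Python `behead` are names invented here, and the sample document is hand-written with placeholder version numbers.

```python
import json


def lift(element_fn):
    """Step 1: turn an element -> element function into a whole-tree walk."""
    def walk(node):
        if isinstance(node, list):
            return [walk(item) for item in node]
        if isinstance(node, dict):
            node = {key: walk(value) for key, value in node.items()}
            # Only objects carrying a "t" tag are AST elements.
            return element_fn(node) if "t" in node else node
        return node
    return walk


def to_json_filter(element_fn):
    """Step 2: wrap the lifted function in JSON decoding and encoding."""
    walk = lift(element_fn)
    return lambda text: json.dumps(walk(json.loads(text)))


def behead(element):
    """Replace level 2+ headers with a paragraph of emphasized text."""
    if element["t"] == "Header" and element["c"][0] >= 2:
        return {"t": "Para", "c": [{"t": "Emph", "c": element["c"][2]}]}
    return element


SAMPLE = json.dumps({
    "pandoc-api-version": [1, 23],  # placeholder version numbers
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [1, ["title", [], []], [{"t": "Str", "c": "Title"}]]},
        {"t": "Header", "c": [2, ["my-heading", [], []], [{"t": "Str", "c": "My"}]]},
    ],
})

result = json.loads(to_json_filter(behead)(SAMPLE))
print([block["t"] for block in result["blocks"]])  # ['Header', 'Para']
```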

To use the filter, make it executable:

    chmod +x behead.hs

and then

    pandoc -f SOURCEFORMAT -t TARGETFORMAT --filter ./behead.hs

(It is also necessary that `pandoc-types` be installed in the
local package repository. To do this using cabal-install, run
`cabal v2-update && cabal v2-install --lib pandoc-types`.)

Alternatively, we could compile the filter:

    ghc -package-env=default --make behead.hs
    pandoc -f SOURCEFORMAT -t TARGETFORMAT --filter ./behead

Note that if the filter is placed in the system PATH, then the initial
`./` is not needed.  Note also that the command line can include
multiple instances of `--filter`:  the filters will be applied in
sequence.


# LaTeX for WordPress

Another easy example. WordPress blogs require a special format for
LaTeX math.  Instead of `$e=mc^2$`, you need: `$LaTeX e=mc^2$`.
How can we convert a markdown document accordingly?

Again, it's difficult to do the job reliably with regexes.
A `$` might be a regular currency indicator, or it might occur in
a comment or code block or inline code span.  We just want to find
the `$`s that begin LaTeX math. If only we had a parser...

We do.  Pandoc already extracts LaTeX math, so:

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- wordpressify.hs
import Text.Pandoc.JSON

main = toJSONFilter wordpressify
  where wordpressify (Math x y) = Math x ("LaTeX " ++ y)
        wordpressify x = x
~~~~

Mission accomplished. (I've omitted type signatures here,
just to show it can be done.)


# But I don't want to learn Haskell!

While it's easiest to write pandoc filters in Haskell, it is fairly
easy to write them in Python using the `pandocfilters` package.
The package is in PyPI and can be installed using `pip install
pandocfilters` or `easy_install pandocfilters`.

Here's our "beheading" filter in Python:

~~~ {.python}
#!/usr/bin/env python

"""
Pandoc filter to convert all level 2+ headings to paragraphs with
emphasized text.
"""

from pandocfilters import toJSONFilter, Emph, Para

def behead(key, value, format, meta):
  if key == 'Header' and value[0] >= 2:
    return Para([Emph(value[2])])

if __name__ == "__main__":
  toJSONFilter(behead)
~~~

`toJSONFilter(behead)` walks the AST and applies the `behead` action
to each element.  If `behead` returns nothing, the node is unchanged;
if it returns an object, the node is replaced; if it returns a list,
the new list is spliced in.

Note that, although these parameters are not used in this example,
`format` provides access to the target format, and `meta` provides access to
the document's metadata.
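
The three return conventions can be illustrated with a small stand-alone walker. The sketch below is loosely modelled on the behaviour just described and uses only the standard library; it is not the `pandocfilters` source, and the `walk` and `simplify` functions here are written for this example only.

```python
def walk(node, action, format, meta):
    """Apply `action` to every element that appears inside a list."""
    if isinstance(node, list):
        result = []
        for item in node:
            if isinstance(item, dict) and "t" in item:
                outcome = action(item["t"], item.get("c"), format, meta)
                if outcome is None:
                    # Nothing returned: keep the node, but visit its children.
                    result.append(walk(item, action, format, meta))
                elif isinstance(outcome, list):
                    # A list returned: splice its items in place of the node.
                    result.extend(walk(z, action, format, meta) for z in outcome)
                else:
                    # An object returned: it replaces the node.
                    result.append(walk(outcome, action, format, meta))
            else:
                result.append(walk(item, action, format, meta))
        return result
    if isinstance(node, dict):
        return {key: walk(value, action, format, meta)
                for key, value in node.items()}
    return node


def simplify(key, value, format, meta):
    if key == "SoftBreak":
        return {"t": "Space"}  # object: replaces the soft break
    if key == "Emph":
        return value           # list: the emphasized inlines are spliced in
    return None                # nothing: node left unchanged


doc = {
    "pandoc-api-version": [1, 23],  # placeholder version numbers
    "meta": {},
    "blocks": [{"t": "Para", "c": [
        {"t": "Str", "c": "a"},
        {"t": "SoftBreak"},
        {"t": "Emph", "c": [{"t": "Str", "c": "b"}]},
    ]}],
}

# "html" stands in for the target format pandoc would pass on the command line.
result = walk(doc, simplify, "html", doc["meta"])
print([inline["t"] for inline in result["blocks"][0]["c"]])  # ['Str', 'Space', 'Str']
```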

There are many examples of Python filters in [the pandocfilters
repository](https://github.com/jgm/pandocfilters).

For a more Pythonic alternative to pandocfilters, see
the [panflute](https://pypi.org/project/panflute) library.
Don't like Python? There are also ports of pandocfilters in

- [PHP](https://github.com/vinai/pandocfilters-php),
- [Perl](https://metacpan.org/pod/Pandoc::Filter),
- TypeScript/JavaScript via Node.js
  - [pandoc-filter](https://github.com/mvhenderson/pandoc-filter-node),
  - [node-pandoc-filter](https://github.com/mu-io/node-pandoc-filter),
- [Groovy](https://github.com/dfrommi/groovy-pandoc), and
- [Ruby](https://heerdebeer.org/Software/markdown/paru/).

Starting with pandoc 2.0, pandoc includes built-in support for
writing filters in Lua.  The Lua interpreter is built into
pandoc, so a Lua filter does not require any additional software
to run.  See the [documentation on Lua
filters](https://pandoc.org/lua-filters.html).

# Include files

So far, none of our transforms have involved IO. How about a script that
reads a markdown document, finds all the code blocks with
attribute `include`, and replaces their contents with the contents of
the file given?

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- includes.hs
import Text.Pandoc.JSON
import qualified Data.Text.IO as TIO
import qualified Data.Text as T

doInclude :: Block -> IO Block
doInclude cb@(CodeBlock (id, classes, namevals) contents) =
  case lookup (T.pack "include") namevals of
       Just f     -> CodeBlock (id, classes, namevals) <$>
                      TIO.readFile (T.unpack f)
       Nothing    -> return cb
doInclude x = return x

main :: IO ()
main = toJSONFilter doInclude
~~~~

Try this on the following:

    Here's the pandoc README:

    ~~~~ {include="README"}
    this will be replaced by contents of README
    ~~~~

# Removing links

What if we want to remove every link from a document, retaining
the link's text?

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- delink.hs
import Text.Pandoc.JSON

main = toJSONFilter delink

delink :: Inline -> [Inline]
delink (Link _ txt _) = txt
delink x              = [x]
~~~~

Note that `delink` can't be a function of type `Inline -> Inline`,
because the thing we want to replace the link with is not a single
`Inline` element, but a list of them. So we make `delink` a function
from an `Inline` element to a list of `Inline` elements.
`toJSONFilter` can still lift this function to a transformation of type
`Pandoc -> Pandoc`.

# A filter for ruby text

Finally, here's a nice real-world example, developed on the
[pandoc-discuss](https://groups.google.com/group/pandoc-discuss/browse_thread/thread/7baea325565878c8) list.  Qubyte wrote:

> I'm interested in using pandoc to turn my markdown notes on Japanese
> into nicely set HTML and (Xe)LaTeX. With HTML5, ruby (typically used to
> phonetically read chinese characters by placing text above or to the
> side) is standard, and support from browsers is emerging (Webkit based
> browsers appear to fully support it). For those browsers that don't
> support it yet (notably Firefox) the feature falls back in a nice way
> by placing the phonetic reading inside brackets to the side of each
> Chinese character, which is suitable for other output formats too. As
> for (Xe)LaTeX, ruby is not an issue.
>
> At the moment, I use inline HTML to achieve the result when the
> conversion is to HTML, but it's ugly and uses a lot of keystrokes, for
> example
>
> ~~~ {.xml}
> <ruby>ご<rt></rt>飯<rp>（</rp><rt>はん</rt><rp>）</rp></ruby>
> ~~~
>
> sets ご飯 "gohan" with "han" spelt phonetically above the second
> character, or to the right of it in brackets if the browser does not
> support ruby.  I'd like to have something more like
>
>     r[はん](飯)
>
> or any keystroke saving convention would be welcome.

We came up with the following script, which uses the convention that a
markdown link with a URL beginning with a hyphen is interpreted as ruby:

    [はん](-飯)

~~~ {.haskell}
{-# LANGUAGE OverloadedStrings #-}
-- handleRuby.hs
import Text.Pandoc.JSON
import System.Environment (getArgs)
import qualified Data.Text as T

handleRuby :: Maybe Format -> Inline -> Inline
handleRuby (Just format) x@(Link attr [Str ruby] (src,_)) =
  case T.uncons src of
    Just ('-',kanji)
      | format == Format "html" -> RawInline format $
        "<ruby>" <> kanji <> "<rp>(</rp><rt>" <> ruby <>
        "</rt><rp>)</rp></ruby>"
      | format == Format "latex" -> RawInline format $
        "\\ruby{" <> kanji <> "}{" <> ruby <> "}"
      | otherwise -> Str ruby
    _ -> x
handleRuby _ x = x

main :: IO ()
main = toJSONFilter handleRuby
~~~

Note that, when a script is called using `--filter`, pandoc passes
it the target format as the first argument.  When a function's
first argument is of type `Maybe Format`, `toJSONFilter` will
automatically assign it `Just` the target format or `Nothing`.

We compile our script:

    # first, make sure pandoc-types is installed:
    cabal install --lib pandoc-types --package-env .
    ghc --make handleRuby

Then run it:

    % pandoc -F ./handleRuby -t html
    [はん](-飯)
    ^D
    <p><ruby>飯<rp>(</rp><rt>はん</rt><rp>)</rp></ruby></p>
    % pandoc -F ./handleRuby -t latex
    [はん](-飯)
    ^D
    \ruby{飯}{はん}

Note:  to use this to generate PDFs via LaTeX, you'll need
to use `--pdf-engine=xelatex`, specify a `mainfont` that has
the Japanese characters (e.g. "[Noto Sans CJK JP](https://fonts.google.com/noto/specimen/Noto+Sans+JP)"), and add
`\usepackage{ruby}` to your template or header-includes.

# Exercises

1.  Put all the regular text in a markdown document in ALL CAPS
    (without touching text in URLs or link titles).

2.  Remove all horizontal rules from a document.

3.  Renumber all enumerated lists with roman numerals.

4.  Replace each delimited code block with class `dot` with an
    image generated by running `dot -Tpng` (from graphviz) on the
    contents of the code block.

5.  Find all code blocks with class `python` and run them
    using the python interpreter, printing the results to the console.

# Technical details of JSON filters

A JSON filter is any program which can consume and produce a
valid pandoc JSON document representation. This section describes
the technical details surrounding the invocation of filters.

## Arguments

The program will always be called with the target format as the
only argument. A pandoc invocation like

    pandoc --filter demo --to=html

will cause pandoc to call the program `demo` with argument `html`.
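
A filter written without a helper library can read this argument itself. The following is a small sketch in Python; the function name `target_format` is an invention for this example, and the fallback to an empty string when the script is run by hand is a choice made here, not something pandoc requires.

```python
import sys


def target_format(argv):
    """Return the target format passed by pandoc, or "" if none was given."""
    return argv[1] if len(argv) > 1 else ""


# Simulated argument vectors: the first mirrors the `demo` invocation above,
# the second mirrors running the script directly with no arguments.
print(target_format(["demo", "html"]))  # html
print(repr(target_format(["demo"])))    # ''
```

Inside an actual filter script the call would be `target_format(sys.argv)`.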

## Environment variables

Pandoc sets additional environment variables before calling a
filter.

`PANDOC_VERSION`
:   The version of the pandoc binary used to process the document.
    Example: `2.11.1`.

`PANDOC_READER_OPTIONS`
:   JSON object representation of the options passed to the input
    parser.

    Object fields:

    `abbreviations`
    :   set of known abbreviations (array of strings).

    `columns`
    :   number of columns in terminal; an integer.

    `default-image-extension`
    :   default extension for images; a string.

    `extensions`
    :   integer representation of the syntax extensions bit
        field.

    `indented-code-classes`
    :   default classes for indented code blocks; array of
        strings.

    `standalone`
    :   whether the input was a standalone document with header;
        either `true` or `false`.

    `strip-comments`
    :   HTML comments are stripped instead of parsed as raw HTML;
        either `true` or `false`.

    `tab-stop`
    :   width (i.e. equivalent number of spaces) of tab stops;
        integer.

    `track-changes`
    :   track changes setting for docx; one of
        `"accept-changes"`, `"reject-changes"`, and
        `"all-changes"`.
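
As an illustration, a filter might read these variables as follows. This is a hedged sketch: `pandoc_environment` is a hypothetical helper, the environment is passed in as a plain mapping so the example runs without pandoc, and the option values shown are made up.

```python
import json
import os


def pandoc_environment(environ=None):
    """Return (version, reader options) from the variables pandoc sets."""
    environ = os.environ if environ is None else environ
    version = environ.get("PANDOC_VERSION")
    raw = environ.get("PANDOC_READER_OPTIONS")
    # The reader options arrive as a JSON object; fall back to {} if unset.
    options = json.loads(raw) if raw else {}
    return version, options


# Made-up values standing in for what pandoc would export.
fake_environ = {
    "PANDOC_VERSION": "2.11.1",
    "PANDOC_READER_OPTIONS": '{"columns": 80, "tab-stop": 4, "standalone": false}',
}

version, options = pandoc_environment(fake_environ)
print(version, options["tab-stop"], options["standalone"])  # 2.11.1 4 False
```

Calling `pandoc_environment()` with no argument reads the real process environment, which only contains these variables when the script is started by pandoc.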

## Supported interpreters

Files passed to the `--filter`/`-F` parameter are expected to be
executable. However, if the executable bit is not set, then
pandoc tries to guess a suitable interpreter from the file
extension.

  file extension   interpreter
  ---------------- --------------
  .py              `python`
  .hs              `runhaskell`
  .pl              `perl`
  .rb              `ruby`
  .php             `php`
  .js              `node`
  .r               `Rscript`
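
The lookup in this table can be pictured with a short sketch. It only mirrors the rule as stated here and is not pandoc's implementation; `filter_command` is a hypothetical name, and returning `None` for an unknown extension is a placeholder, since what pandoc does in that case is not covered by this section.

```python
import os

# Extension-to-interpreter pairs copied from the table above.
INTERPRETERS = {
    ".py": "python",
    ".hs": "runhaskell",
    ".pl": "perl",
    ".rb": "ruby",
    ".php": "php",
    ".js": "node",
    ".r": "Rscript",
}


def filter_command(path, target):
    """Return the argument list used to run a filter, or None if unknown."""
    if os.access(path, os.X_OK):
        # Executable files are run directly.
        return [path, target]
    interpreter = INTERPRETERS.get(os.path.splitext(path)[1])
    if interpreter is None:
        return None
    return [interpreter, path, target]


# These file names are assumed not to exist, so the extension rule applies.
print(filter_command("no-such-filter.py", "html"))
print(filter_command("no-such-filter.xyz", "html"))
```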
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								% Pandoc filters
 								% John MacFarlane
 								# Summary
 								Pandoc provides an interface for users to write programs (known
 								as filters) which act on pandoc’s AST.
 								Pandoc consists of a set of readers and writers. When converting
 								a document from one format to another, text is parsed by a
 								reader into pandoc’s intermediate representation of the
 								document---an "abstract syntax tree" or AST---which is then
 								converted by the writer into the target format.
 								The pandoc AST format is defined in the module
-												add docs about customizing pandoc (#4972)

closes #3288
											
										
										
											2018-10-16 18:10:34 +02:00
+								[`Text.Pandoc.Definition` in the `pandoc-types` package
 								](https://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html).
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								A "filter" is a program that modifies the AST, between the
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								reader and the writer.
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								    INPUT --reader--> AST --filter--> AST --writer--> OUTPUT
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								Pandoc supports two kinds of filters:
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								- **Lua filters** use the Lua language to
 								  define transformations on the pandoc AST.  They are
 								  described in a [separate document](lua-filters.html).
 								- **JSON filters**, described here, are pipes that read from
 								  standard input and write to standard output, consuming and
 								  producing a JSON representation of the pandoc AST:
 								                             source format
 								                                  ↓
 								                               (pandoc)
 								                                  ↓
 								                          JSON-formatted AST
 								                                  ↓
 								                            (JSON filter)
 								                                  ↓
 								                          JSON-formatted AST
 								                                  ↓
 								                               (pandoc)
 								                                  ↓
 								                            target format
 								Lua filters have a couple of advantages.  They use a Lua
 								interpreter that is embedded in pandoc, so you don't need
 								to have any external software installed.  And they are
 								usually faster than JSON filters.  But if you wish to
 								write your filter in a language other than Lua, you may
 								prefer to use a JSON filter. JSON filters may be written
 								in any programming language.
 								You can use a JSON filter directly in a pipeline:
 								    pandoc -s input.txt -t json | \
 								     pandoc-citeproc | \
 								     pandoc -s -f json -o output.html
 								But it is more convenient to use the `--filter` option,
 								which handles the plumbing automatically:
 								    pandoc -s input.txt --filter pandoc-citeproc -o output.html
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								For a gentle introduction into writing your own filters,
 								continue this guide. There’s also a [list of third party filters
 								on the wiki](https://github.com/jgm/pandoc/wiki/Pandoc-Filters).
 								# A simple example
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								Suppose you wanted to replace all level 2+ headings in a markdown
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								document with regular paragraphs, with text in italics. How would you go
 								about doing this?
 								A first thought would be to use regular expressions. Something
 								like this:
 								    perl -pe 's/^##+ (.*)$/\*\1\*/' source.txt
 								This should work most of the time.  But don't forget
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								that ATX style headings can end with a sequence of `#`s
 								that is not part of the heading text:
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								    ## My heading ##
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								And what if your document contains a line starting with `##` in an HTML
 								comment or delimited code block?
 								    <!--
 								    ## This is just a comment
 								    -->
 								    ~~~~
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								    ### A third level heading in standard markdown
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								    ~~~~
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								We don't want to touch *these* lines.  Moreover, what about Setext
 								style second-level heading?
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								    A heading
 								    ---------
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								We need to handle those too.  Finally, can we be sure that adding
 								asterisks to each side of our string will put it in italics?
 								What if the string already contains asterisks around it? Then we'll
 								end up with bold text, which is not what we want. And what if it contains
 								a regular unescaped asterisk?
 								How would you modify your regular expression to handle these cases? It
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								would be hairy, to say the least.
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								A better approach is to let pandoc handle the parsing, and
 								then modify the AST before the document is written. For this,
 								we can use a filter.
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								To see what sort of AST is produced when pandoc parses our text,
 								we can use pandoc's `native` output format:
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								~~~~
 								% cat test.txt
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								## my heading
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								text with *italics*
 								% pandoc -s -t native test.txt
 								Pandoc (Meta {unMeta = fromList []})
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								[Header 2 ("my-heading",[],[]) [Str "My",Space,Str "heading"]
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								, Para [Str "text",Space,Str "with",Space,Emph [Str "italics"]] ]
 								~~~~
 								A `Pandoc` document consists of a `Meta` block (containing
 								metadata like title, authors, and date) and a list of `Block`
 								 elements.  In this case, we have two `Block`s, a `Header` and a `Para`.
 								Each has as its content a list of `Inline` elements.  For more details on
 								the pandoc AST, see the [haddock documentation for `Text.Pandoc.Definition`].
-												Fix broken links in documents (#5473)

Fix broken links in doc/epub.md, doc/getting-started.md,
doc/customizing-pandoc.md, doc/using-the-pandoc-api.md.
Also, use absolute links to pandoc.org when possible, so that
the links can be followed by people viewing these documents
on GitHub.

											
										
										
											2019-05-02 02:09:36 +02:00
+								[haddock documentation for `Text.Pandoc.Definition`]: https://hackage.haskell.org/package/pandoc-types
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								We can use Haskell to create a JSON filter that transforms this
 								AST, replacing each `Header` block with level >= 2 with a `Para`
 								with its contents wrapped inside an `Emph` inline:
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
 								~~~~                          {.haskell}
 								#!/usr/bin/env runhaskell
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								-- behead.hs
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								import Text.Pandoc.JSON
 								main :: IO ()
 								main = toJSONFilter behead
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
 								behead :: Block -> Block
 								behead (Header n _ xs) | n >= 2 = Para [Emph xs]
 								behead x = x
-												Removed customizing-pandoc.md from doc/, added filters.md.

filters.md is essentially the scripting tutorial from the
webiste.

											
										
										
											2017-09-17 08:00:20 +02:00
+								~~~~
-												Update filter documentation.

Remove example using pandoc API directly (we have other
docs for that and it was outdated).

Closes #6065.

											
										
										
											2020-01-14 20:18:24 +01:00
+								The `toJSONFilter` function does two things.  First, it lifts
 								the `behead` function (which maps `Block -> Block`) onto a
 								transformation of the entire `Pandoc` AST, walking the AST
 								and transforming each block.  Second, it wraps this `Pandoc ->
 								Pandoc` transformation with the necessary JSON serialization
 								and deserialization, producing an executable that consumes
 								JSON from stdin and produces JSON to stdout.
To use the filter, make it executable:

    chmod +x behead.hs

and then

    pandoc -f SOURCEFORMAT -t TARGETFORMAT --filter ./behead.hs

(It is also necessary that `pandoc-types` be installed in the
local package repository. To do this using cabal-install, run
`cabal v2-update && cabal v2-install --lib pandoc-types`.)

Alternatively, we could compile the filter:

    ghc -package-env=default --make behead.hs
    pandoc -f SOURCEFORMAT -t TARGETFORMAT --filter ./behead

Note that if the filter is placed in the system PATH, then the initial
`./` is not needed.  Note also that the command line can include
multiple instances of `--filter`:  the filters will be applied in
sequence.

# LaTeX for WordPress

Another easy example. WordPress blogs require a special format for
LaTeX math.  Instead of `$e=mc^2$`, you need: `$LaTeX e=mc^2$`.
How can we convert a markdown document accordingly?

Again, it's difficult to do the job reliably with regexes.
A `$` might be a regular currency indicator, or it might occur in
a comment or code block or inline code span.  We just want to find
the `$`s that begin LaTeX math. If only we had a parser...

We do.  Pandoc already extracts LaTeX math, so:

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- wordpressify.hs
import Text.Pandoc.JSON
import qualified Data.Text as T

main = toJSONFilter wordpressify
  where wordpressify (Math x y) = Math x (T.pack "LaTeX " <> y)
        wordpressify x = x
~~~~

Mission accomplished. (I've omitted type signatures here,
just to show it can be done.)

# But I don't want to learn Haskell!

While it's easiest to write pandoc filters in Haskell, it is fairly
easy to write them in Python using the `pandocfilters` package.
The package is on PyPI and can be installed using `pip install
pandocfilters` or `easy_install pandocfilters`.

Here's our "beheading" filter in Python:

~~~ {.python}
#!/usr/bin/env python

"""
Pandoc filter to convert all level 2+ headings to paragraphs with
emphasized text.
"""

from pandocfilters import toJSONFilter, Emph, Para

def behead(key, value, format, meta):
  if key == 'Header' and value[0] >= 2:
    return Para([Emph(value[2])])

if __name__ == "__main__":
  toJSONFilter(behead)
~~~

`toJSONFilter(behead)` walks the AST and applies the `behead` action
to each element.  If `behead` returns nothing, the node is unchanged;
if it returns an object, the node is replaced; if it returns a list,
the new list is spliced in.

Note that, although these parameters are not used in this example,
`format` provides access to the target format, and `meta` provides access to
the document's metadata.

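
The three return conventions can be illustrated with a simplified walker that uses only the standard library. This is a sketch written for this tutorial, not the code of `pandocfilters` itself, and the `tidy` action is a made-up example: it replaces soft line breaks with spaces, splices the contents of emphasis into the surrounding text, and drops footnotes when the target format is `plain`.

```python
def walk(node, action, fmt, meta):
    """Apply `action` to every tagged element found inside a list."""
    if isinstance(node, list):
        result = []
        for item in node:
            if isinstance(item, dict) and "t" in item:
                res = action(item["t"], item.get("c"), fmt, meta)
                if res is None:
                    # Nothing returned: keep the node, but visit its children.
                    result.append(walk(item, action, fmt, meta))
                elif isinstance(res, list):
                    # List returned: splice its elements in place of the node.
                    result.extend(walk(new, action, fmt, meta) for new in res)
                else:
                    # Object returned: it replaces the node.
                    result.append(walk(res, action, fmt, meta))
            else:
                result.append(walk(item, action, fmt, meta))
        return result
    if isinstance(node, dict):
        return {key: walk(value, action, fmt, meta) for key, value in node.items()}
    return node


def tidy(key, value, fmt, meta):
    """Example action exercising all three return conventions."""
    if key == "SoftBreak":
        return {"t": "Space"}   # replace the node
    if key == "Emph":
        return value            # splice the emphasized inlines in
    if key == "Note" and fmt == "plain":
        return []               # an empty list removes the node
    return None                 # leave everything else unchanged
```

Because the target format is passed to the action on every call, the same filter can behave differently for `plain` output than for, say, `html`.
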
There are many examples of Python filters in [the pandocfilters
repository](https://github.com/jgm/pandocfilters).

For a more Pythonic alternative to pandocfilters, see
the [panflute](https://pypi.org/project/panflute) library.
Don't like Python? There are also ports of pandocfilters in

- [PHP](https://github.com/vinai/pandocfilters-php),
- [Perl](https://metacpan.org/pod/Pandoc::Filter),
- TypeScript/JavaScript via Node.js
  - [pandoc-filter](https://github.com/mvhenderson/pandoc-filter-node),
  - [node-pandoc-filter](https://github.com/mu-io/node-pandoc-filter),
- [Groovy](https://github.com/dfrommi/groovy-pandoc), and
- [Ruby](https://heerdebeer.org/Software/markdown/paru/).

Starting with pandoc 2.0, pandoc includes built-in support for
writing filters in Lua.  The Lua interpreter is built into
pandoc, so a Lua filter does not require any additional software
to run.  See the [documentation on Lua
filters](https://pandoc.org/lua-filters.html).

# Include files

So far, none of our transforms have involved IO. How about a script that
reads a markdown document, finds all the code blocks with
attribute `include`, and replaces their contents with the contents of
the file given?

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- includes.hs
import Text.Pandoc.JSON
import qualified Data.Text.IO as TIO
import qualified Data.Text as T

doInclude :: Block -> IO Block
doInclude cb@(CodeBlock (id, classes, namevals) contents) =
  case lookup (T.pack "include") namevals of
       Just f     -> CodeBlock (id, classes, namevals) <$>
                      TIO.readFile (T.unpack f)
       Nothing    -> return cb
doInclude x = return x

main :: IO ()
main = toJSONFilter doInclude
~~~~

Try this on the following:

    Here's the pandoc README:

    ~~~~ {include="README"}
    this will be replaced by contents of README
    ~~~~

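
For comparison, here is a hedged Python sketch of the same per-block logic, working directly on the JSON form of a code block. It assumes the JSON encoding of `CodeBlock` is `[[identifier, classes, key-value pairs], text]`; `do_include` and `read_utf8` are illustrative names, and the reader is a parameter so the function can be tried without touching the file system.

```python
from pathlib import Path


def read_utf8(path):
    """Default reader: return the file's contents as text."""
    return Path(path).read_text(encoding="utf-8")


def do_include(block, read_text=read_utf8):
    """Replace a CodeBlock's text with the file named by its `include` attribute."""
    if block.get("t") != "CodeBlock":
        return block
    attr, _old_text = block["c"]
    _identifier, _classes, keyvals = attr
    # Like Haskell's `lookup`, take the first matching key.
    path = next((value for key, value in keyvals if key == "include"), None)
    if path is None:
        return block
    return {"t": "CodeBlock", "c": [attr, read_text(path)]}
```

As in the Haskell version, blocks without an `include` attribute are returned untouched, so the function is safe to apply to every block in a document.
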
# Removing links

What if we want to remove every link from a document, retaining
the link's text?

~~~~                          {.haskell}
#!/usr/bin/env runhaskell
-- delink.hs
import Text.Pandoc.JSON

main = toJSONFilter delink

delink :: Inline -> [Inline]
delink (Link _ txt _) = txt
delink x              = [x]
~~~~

Note that `delink` can't be a function of type `Inline -> Inline`,
because the thing we want to replace the link with is not a single
`Inline` element, but a list of them. So we make `delink` a function
from an `Inline` element to a list of `Inline` elements.
`toJSONFilter` can still lift this function to a transformation of type
`Pandoc -> Pandoc`.

# A filter for ruby text

Finally, here's a nice real-world example, developed on the
[pandoc-discuss](https://groups.google.com/group/pandoc-discuss/browse_thread/thread/7baea325565878c8) list.  Qubyte wrote:

> I'm interested in using pandoc to turn my markdown notes on Japanese
> into nicely set HTML and (Xe)LaTeX. With HTML5, ruby (typically used to
> phonetically read Chinese characters by placing text above or to the
> side) is standard, and support from browsers is emerging (Webkit based
> browsers appear to fully support it). For those browsers that don't
> support it yet (notably Firefox) the feature falls back in a nice way
> by placing the phonetic reading inside brackets to the side of each
> Chinese character, which is suitable for other output formats too. As
> for (Xe)LaTeX, ruby is not an issue.
>
> At the moment, I use inline HTML to achieve the result when the
> conversion is to HTML, but it's ugly and uses a lot of keystrokes, for
> example
>
> ~~~ {.xml}
> <ruby>ご<rt></rt>飯<rp>（</rp><rt>はん</rt><rp>）</rp></ruby>
> ~~~
>
> sets ご飯 "gohan" with "han" spelt phonetically above the second
> character, or to the right of it in brackets if the browser does not
> support ruby.  I'd like to have something more like
>
>     r[はん](飯)
>
> or any keystroke saving convention would be welcome.

We came up with the following script, which uses the convention that a
markdown link with a URL beginning with a hyphen is interpreted as ruby:

    [はん](-飯)

~~~ {.haskell}
{-# LANGUAGE OverloadedStrings #-}
-- handleRuby.hs
import Text.Pandoc.JSON
import System.Environment (getArgs)
import qualified Data.Text as T

handleRuby :: Maybe Format -> Inline -> Inline
handleRuby (Just format) x@(Link attr [Str ruby] (src,_)) =
  case T.uncons src of
    Just ('-',kanji)
      | format == Format "html" -> RawInline format $
        "<ruby>" <> kanji <> "<rp>(</rp><rt>" <> ruby <>
        "</rt><rp>)</rp></ruby>"
      | format == Format "latex" -> RawInline format $
        "\\ruby{" <> kanji <> "}{" <> ruby <> "}"
      | otherwise -> Str ruby
    _ -> x
handleRuby _ x = x

main :: IO ()
main = toJSONFilter handleRuby
~~~

Note that, when a script is called using `--filter`, pandoc passes
it the target format as the first argument.  When a function's
first argument is of type `Maybe Format`, `toJSONFilter` will
automatically assign it `Just` the target format or `Nothing`.

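
A filter written in another language has to pick up that argument itself. As a rough sketch (not part of pandoc or `pandocfilters`), here is how the same per-inline decision might look in Python once the target format is in hand. It assumes the JSON encoding of `Link` is `[attr, inlines, [url, title]]` and of `RawInline` is `[format, text]`; `handle_ruby` is a hypothetical name.

```python
def handle_ruby(fmt, inline):
    """Rewrite a link whose URL starts with '-' as ruby markup for `fmt`."""
    if not fmt or inline.get("t") != "Link":
        return inline
    _attr, label, (url, _title) = inline["c"]
    # Only links whose text is a single Str are treated as ruby candidates.
    if len(label) != 1 or label[0].get("t") != "Str":
        return inline
    if not url.startswith("-"):
        return inline
    ruby, kanji = label[0]["c"], url[1:]
    if fmt == "html":
        raw = "<ruby>" + kanji + "<rp>(</rp><rt>" + ruby + "</rt><rp>)</rp></ruby>"
    elif fmt == "latex":
        raw = "\\ruby{" + kanji + "}{" + ruby + "}"
    else:
        # Other formats just get the phonetic reading as plain text.
        return {"t": "Str", "c": ruby}
    return {"t": "RawInline", "c": [fmt, raw]}
```

Passing the format in as a parameter, rather than reading it inside the function, mirrors the `Maybe Format` argument of the Haskell version and makes the function easy to try out by hand.
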
We compile our script:

    # first, make sure pandoc-types is installed:
    cabal install --lib pandoc-types --package-env .
    ghc --make handleRuby

Then run it:

    % pandoc -F ./handleRuby -t html
    [はん](-飯)
    ^D
    <p><ruby>飯<rp>(</rp><rt>はん</rt><rp>)</rp></ruby></p>
    % pandoc -F ./handleRuby -t latex
    [はん](-飯)
    ^D
    \ruby{飯}{はん}

Note:  to use this to generate PDFs via LaTeX, you'll need
to use `--pdf-engine=xelatex`, specify a `mainfont` that has
the Japanese characters (e.g. "[Noto Sans CJK JP](https://fonts.google.com/noto/specimen/Noto+Sans+JP)"), and add
`\usepackage{ruby}` to your template or header-includes.

# Exercises

1.  Put all the regular text in a markdown document in ALL CAPS
    (without touching text in URLs or link titles).

2.  Remove all horizontal rules from a document.

3.  Renumber all enumerated lists with roman numerals.

4.  Replace each delimited code block with class `dot` with an
    image generated by running `dot -Tpng` (from graphviz) on the
    contents of the code block.

5.  Find all code blocks with class `python` and run them
    using the Python interpreter, printing the results to the console.

# Technical details of JSON filters

A JSON filter is any program which can consume and produce a
valid pandoc JSON document representation. This section describes
the technical details surrounding the invocation of filters.

## Arguments

The program will always be called with the target format as the
only argument. A pandoc invocation like

    pandoc --filter demo --to=html

will cause pandoc to call the program `demo` with argument `html`.

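
In a filter script this is just the first command-line argument. A minimal Python sketch, where `target_format` is a hypothetical helper and the empty-string fallback (for a script run by hand, without pandoc) is an assumption of this example:

```python
import sys


def target_format(argv):
    """Return the target format pandoc passed as the first argument, or ""."""
    return argv[1] if len(argv) > 1 else ""
```

A script would call it as `target_format(sys.argv)`; for the invocation above that yields `html`.
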
## Environment variables

Pandoc sets additional environment variables before calling a
filter.

`PANDOC_VERSION`
:   The version of the pandoc binary used to process the document.
    Example: `2.11.1`.

`PANDOC_READER_OPTIONS`
:   JSON object representation of the options passed to the input
    parser.

    Object fields:

    `abbreviations`
    :   set of known abbreviations (array of strings).

    `columns`
    :   number of columns in terminal; an integer.

    `default-image-extension`
    :   default extension for images; a string.

    `extensions`
    :   integer representation of the syntax extensions bit
        field.

    `indented-code-classes`
    :   default classes for indented code blocks; array of
        strings.

    `standalone`
    :   whether the input was a standalone document with header;
        either `true` or `false`.

    `strip-comments`
    :   HTML comments are stripped instead of parsed as raw HTML;
        either `true` or `false`.

    `tab-stop`
    :   width (i.e. equivalent number of spaces) of tab stops;
        integer.

    `track-changes`
    :   track changes setting for docx; one of
        `"accept-changes"`, `"reject-changes"`, and
        `"all-changes"`.

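
A filter can read these like any other environment variables. The following Python sketch shows one way to do it; the helper names are made up, returning an empty result when a variable is missing is a choice of this example, and the version parser assumes a purely numeric, dot-separated version such as `2.11.1`.

```python
import json
import os


def pandoc_version(environ=os.environ):
    """Return PANDOC_VERSION as a tuple of integers, or () if it is unset."""
    raw = environ.get("PANDOC_VERSION", "")
    return tuple(int(part) for part in raw.split(".")) if raw else ()


def reader_options(environ=os.environ):
    """Return PANDOC_READER_OPTIONS decoded from JSON, or {} if it is unset."""
    raw = environ.get("PANDOC_READER_OPTIONS")
    return json.loads(raw) if raw else {}
```

Tuples compare element by element, so a check such as `pandoc_version() >= (2, 11)` can guard code that depends on a newer pandoc, and `reader_options().get("tab-stop")` looks up a single field.
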
## Supported interpreters

Files passed to the `--filter`/`-F` parameter are expected to be
executable. However, if the executable bit is not set, then
pandoc tries to guess a suitable interpreter from the file
extension.

  file extension   interpreter
  ---------------- --------------
  .py              `python`
  .hs              `runhaskell`
  .pl              `perl`
  .rb              `ruby`
  .php             `php`
  .js              `node`
  .r               `Rscript`