Improved documentation for plugins in README.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1522 788f1e2b-df1e-0410-8736-df70ead52e1b
fiddlosopher 2009-01-24 19:58:33 +00:00
parent 720e9ce3a5
commit 46a3b228fa

README

@@ -1120,21 +1120,46 @@ Plugins
=======
Pandoc's plugin system allows users to modify pandoc's behavior by writing
short Haskell programs. A plugin is a Haskell module that exports a function
`transform`, of type `a -> a` or `a -> IO a`, where `a` is `Pandoc`,
`Block`, `Inline`, `[Block]`, or `[Inline]`. The `transform` function will
be used to transform the pandoc document generated by the reader, before
it is transformed by the writer.
short Haskell programs. To see how this works, and why it is useful, we
need to understand that pandoc transforms one format (the source
format) into another (the target format) by first converting from the
source format into a Haskell data structure representing the document,
and then converting this data structure into the target format. For
example:
-------------------------------------------------------------------------------
Document      Format     Contents
------------  ---------  -----------------------------------------------------
source        markdown   `Hello *world*.`

   ↓             ↓          ↓

intermediate  native     `Pandoc (Meta [] [] "")
                         [Para [Str "Hello",Space,Emph [Str "world"],Str "."]]`

   ↓             ↓          ↓

target        HTML       `<p>Hello <em>world</em>.</p>`
-------------------------------------------------------------------------------
We can use standard text-processing tools (`perl`, `sed`, `awk`, etc.)
to modify the source or target documents. But what if we want to modify
the intermediate representation -- the parsed document -- before it is
written to the target format? That's where plugins are needed.
A plugin is a Haskell module that exports a function `transform`, which
will be used to transform the native representation, after it is generated
by the reader, but before it has been transformed by the writer.
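
The reader, transform, writer pipeline described above can be sketched in a few lines. This is only an illustration, not pandoc's implementation: the `Inline` and `Block` types are cut-down stand-ins for the ones in `Text.Pandoc.Definition`, the parsed document is written out by hand instead of coming from a real reader, and `shout`, `transformBlock`, and `render` are hypothetical names invented for the example. The toy writer does no HTML escaping.

```haskell
import Data.Char (toUpper)

-- Cut-down stand-ins for pandoc's document types (assumption: NOT the real
-- definitions from Text.Pandoc.Definition, just enough for this example).
data Inline = Str String | Space | Emph [Inline]
  deriving (Show, Eq)

data Block = Para [Inline]
  deriving (Show, Eq)

-- The intermediate form of the markdown source `Hello *world*.`,
-- written by hand here in place of a real reader.
parsed :: [Block]
parsed = [Para [Str "Hello", Space, Emph [Str "world"], Str "."]]

-- A hypothetical transformation applied between reader and writer:
-- upper-case every piece of text, recursing into emphasized spans.
shout :: Inline -> Inline
shout (Str s)   = Str (map toUpper s)
shout (Emph xs) = Emph (map shout xs)
shout Space     = Space

-- Apply the inline transformation to every inline in a block.
transformBlock :: Block -> Block
transformBlock (Para xs) = Para (map shout xs)

-- A toy HTML writer (no escaping of special characters).
inlineToHtml :: Inline -> String
inlineToHtml (Str s)   = s
inlineToHtml Space     = " "
inlineToHtml (Emph xs) = "<em>" ++ concatMap inlineToHtml xs ++ "</em>"

blockToHtml :: Block -> String
blockToHtml (Para xs) = "<p>" ++ concatMap inlineToHtml xs ++ "</p>"

render :: [Block] -> String
render = concatMap blockToHtml
```

Loaded in GHCi, `render parsed` gives `<p>Hello <em>world</em>.</p>`, the target shown in the table, while `render (map transformBlock parsed)` gives `<p>HELLO <em>WORLD</em>.</p>`. The point is that the change is made on the parsed structure, so the writer never needs to know about it.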
An example will help make this clearer. Suppose we want to use pandoc with
the WordPress blog engine. WordPress provides support for LaTeX math, but
instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`. Prior to plugins,
there was no good way to make pandoc do this. We could have tried using
regex replacements on the markdown input or HTML output, but this would have
been error-prone: we'd have to make sure we weren't capturing non-math text
between dollar signs (for example, text inside a code block). Besides,
pandoc's markdown reader has already identified the math bits; why not
make use of that? By writing a plugin, we can:
been error-prone: if someone writes `$e = mc^2$` in a code block, for
example, we wouldn't want to insert `LaTeX` there. There's no good way to
identify the math chunks without parsing the whole document. And pandoc
is already doing that, so why not make use of this work? By writing a
plugin, we can. Here's the whole plugin:
~~~ {.haskell}
-- WordPressPlugin.hs
@@ -1156,6 +1181,8 @@ import Text.Pandoc
just define the name of the module (`WordPressPlugin`), the names of any
exported functions (for a plugin, this will always just be `transform`),
and the modules that will be used in the program itself (`Text.Pandoc`).
Every plugin must export a function named `transform`.
The real meat of the program is the three-line definition of `transform`:
~~~ {.haskell}
@@ -1166,14 +1193,19 @@ transform x = x
The first line defines the type of the function: it is a function that
takes an `Inline` element and returns an `Inline` element. (For the definition
of `Inline`, see the module `Text.Pandoc.Definition`.) The next line says
that when the input matches the pattern `Math x y`, the string `LaTeX `
should be inserted at the beginning of `y`. (`x` just specifies whether the
math element is inline or display math, so we leave it alone.) The last
line says, in effect, that the `transform` function has no effect on any
other kind of `Inline` element -- it just passes it through. When the plugin
is applied, this transformation will be used on every `Inline` element in
the document, and `LaTeX ` will be inserted where needed in math elements.
of `Inline`, see the module `Text.Pandoc.Definition`.) The `transform`
function in a plugin need not be `Inline -> Inline`, but it must have
type `a -> a` or `a -> IO a`, where `a` is `Pandoc`, `Block`, `Inline`,
`[Block]`, or `[Inline]`.
The next line says that when the input matches the pattern `Math x y`,
the string `LaTeX ` should be inserted at the beginning of `y`. (The `x`
just specifies whether the math element is inline or display math, so
we leave it alone.) The last line says, in effect, that the `transform`
function has no effect on any other kind of `Inline` element -- it just
passes it through. When the plugin is applied, this transformation will
be used on every `Inline` element in the document, and `LaTeX ` will be
inserted where needed in math elements.
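
To make the effect concrete, here is a small self-contained sketch of the behavior just described. It is an approximation under stated assumptions: the `Inline` and `MathType` types are simplified stand-ins for pandoc's own, `transform` is reconstructed from the prose description (its actual lines are elided in this diff), and `everywhereInline` is a hypothetical helper that mimics what pandoc does when it applies a plugin to every `Inline` in the document.

```haskell
-- Simplified stand-ins for pandoc's types (assumption: NOT the real
-- definitions from Text.Pandoc.Definition).
data MathType = InlineMath | DisplayMath
  deriving (Show, Eq)

data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Code String
  | Math MathType String
  deriving (Show, Eq)

-- Reconstructed from the description: prefix math content with "LaTeX ",
-- leave the inline/display flag alone, pass everything else through.
transform :: Inline -> Inline
transform (Math x y) = Math x ("LaTeX " ++ y)
transform x          = x

-- Hypothetical helper: apply a function to an inline and to every inline
-- nested inside it (only Emph has children in this toy type).
everywhereInline :: (Inline -> Inline) -> Inline -> Inline
everywhereInline f (Emph xs) = f (Emph (map (everywhereInline f) xs))
everywhereInline f x         = f x

-- A math element nested in emphasis, plus code that merely looks like math.
example :: [Inline]
example =
  [ Str "Energy:", Space
  , Emph [Math InlineMath "e = mc^2"]
  , Space
  , Code "$e = mc^2$"
  ]

result :: [Inline]
result = map (everywhereInline transform) example
```

Here `result` contains `Math InlineMath "LaTeX e = mc^2"` inside the `Emph`, while the `Code` element is unchanged. That is the advantage over regex replacement: the dollar signs inside code were never parsed as math, so they are never touched.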
To use this plugin, we just specify the module (or alternatively the filename)
with the `--plugins` option:
@@ -1186,13 +1218,28 @@ with the `--plugins` option:
>
Let's look at a more complex example, involving IO. Suppose we want to include
some graphviz diagrams in our document. Of course, we could use a Makefile to
generate the diagrams, then use regular images in our document. But wouldn't it
be nicer just to include the graphviz code in the document itself, perhaps in
a specially marked delimited code block?
some [graphviz](http://www.graphviz.org/) diagrams in our document.
Of course, we could use a Makefile to generate the diagrams, then use
regular images in our document. But wouldn't it be nicer just to include
the graphviz code in the document itself, perhaps in a specially marked
delimited code block?
~~~ {.dot name="diagram1"}
digraph G {Hello->World}
graph G {
e
subgraph clusterA {
a -- b;
subgraph clusterC {
C -- D;
}
}
subgraph clusterB {
d -- f
}
d -- D
e -- clusterB
clusterC -- clusterB
}
~~~
This can be accomplished by a plugin: