Improved documentation for plugins in README.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1522 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-01-24 19:58:33 +00:00
commit 46a3b228fa
parent 720e9ce3a5
1 changed file with 69 additions and 22 deletions
--- a/91
+++ b/91
@@ -1120,21 +1120,46 @@ Plugins
 =======

 Pandoc's plugin system allows users to modify pandoc's behavior by writing
-short Haskell programs.  A plugin is a Haskell module that exports a function
-`transform`, of type `a -> a` or `a -> IO a`, where `a` is `Pandoc`,
-`Block`, `Inline`, `[Block]`, or `[Inline]`.  The `transform` function will
-be used to transform the pandoc document generated by the reader, before
-it is transformed by the writer.
+short Haskell programs.  To see how this works, and why it is useful, we
+need to understand that pandoc transforms one format (the source
+format) into another (the target format) by first converting from the
+source format into a Haskell data structure representing the document,
+and then converting this data structure into the target format. For
+example:
+
+ -------------------------------------------------------------------------------
+   Document     Format                       Contents
+ ------------  ---------   -----------------------------------------------------
+     source     markdown                 `Hello *world*.`
+
+       ↓           ↓                            ↓
+
+ intermediate    native               `Pandoc (Meta [] [] "")
+                           [Para [Str "Hello",Space,Emph [Str "world"],Str "."]]`
+
+       ↓           ↓                            ↓
+
+     target       HTML               `<p>Hello <em>world</em>.</p>`
+ -------------------------------------------------------------------------------
+
+We can use standard text-processing tools (`perl`, `sed`, `awk`, etc.)
+to modify the source or target documents.  But what if we want to modify
+the intermediate representation -- the parsed document -- before it is
+written to the target format? That's where plugins are needed.
+A plugin is a Haskell module that exports a function `transform`, which
+will be used to transform the native representation, after it is generated
+by the reader, but before it has been transformed by the writer.
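
The reader/transform/writer pipeline described above can be sketched in a few lines. This is only an illustration: `Inline` and `Block` here are simplified stand-ins rather than pandoc's real types from `Text.Pandoc.Definition`, and `shout`, `writeInline`, `writeBlock`, and `render` are hypothetical names invented for this sketch. The toy writer also skips HTML escaping.

```haskell
import Data.Char (toUpper)

-- Simplified stand-ins for pandoc's document types (NOT the real
-- definitions, which have many more constructors).
data Inline = Str String | Space | Emph [Inline]
  deriving (Show, Eq)

data Block = Para [Inline]
  deriving (Show, Eq)

-- What a reader might produce for the markdown source `Hello *world*.`
parsed :: [Block]
parsed = [Para [Str "Hello", Space, Emph [Str "world"], Str "."]]

-- A toy transformation: uppercase the text of every Str element,
-- descending into emphasized text as well.
shout :: Inline -> Inline
shout (Str s)   = Str (map toUpper s)
shout (Emph xs) = Emph (map shout xs)
shout x         = x

-- A toy HTML writer for the stand-in types (no escaping).
writeInline :: Inline -> String
writeInline (Str s)   = s
writeInline Space     = " "
writeInline (Emph xs) = "<em>" ++ concatMap writeInline xs ++ "</em>"

writeBlock :: Block -> String
writeBlock (Para xs) = "<p>" ++ concatMap writeInline xs ++ "</p>"

-- Reader output -> transformation -> writer.
render :: (Inline -> Inline) -> [Block] -> String
render f = concatMap (\(Para xs) -> writeBlock (Para (map f xs)))

main :: IO ()
main = do
  putStrLn (render id parsed)     -- prints <p>Hello <em>world</em>.</p>
  putStrLn (render shout parsed)  -- prints <p>HELLO <em>WORLD</em>.</p>
```

The point of the sketch is that `shout` never sees markdown or HTML text. It works only on the parsed structure, so it cannot be confused by markup characters in the source or target.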

 An example will help make this clearer.  Suppose we want to use pandoc with
 the WordPress blog engine.  WordPress provides support for LaTeX math, but
 instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`.  Prior to plugins,
 there was no good way to make pandoc do this.  We could have tried using
 regex replacements on the markdown input or HTML output, but this would have
-been error-prone: we'd have to make sure we weren't capturing non-math text
-between dollar signs (for example, text inside a code block). Besides,
-pandoc's markdown reader has already identified the math bits; why not
-make use of that? By writing a plugin, we can:
+been error-prone:  if someone writes `$e = mc^2$` in a code block, for
+example, we wouldn't want to insert `LaTeX` there.  There's no good way to
+identify the math chunks without parsing the whole document. And pandoc
+is already doing that, so why not make use of this work? By writing a
+plugin, we can.  Here's the whole plugin:

 ~~~ {.haskell}
 -- WordPressPlugin.hs
@@ -1156,6 +1181,8 @@ import Text.Pandoc
 just define the name of the module (`WordPressPlugin`), the names of any
 exported functions (for a plugin, this will always just be `transform`),
 and the modules that will be used in the program itself (`Text.Pandoc`).
+Every plugin must export a function named `transform`.
+
 The real meat of the program is the three-line definition of `transform`:

 ~~~ {.haskell}
@@ -1166,14 +1193,19 @@ transform x          = x

 The first line defines the type of the function:  it is a function that
 takes an `Inline` element and returns an `Inline` element.  (For the definition
-of `Inline`, see the module `Text.Pandoc.Definition`.)  The next line says
-that when the input matches the pattern `Math x y`, the string `LaTeX `
-should be inserted at the beginning of `y`. (`x` just specifies whether the
-math element is inline or display math, so we leave it alone.)  The last
-line says, in effect, that the `transform` function has no effect on any
-other kind of `Inline` element -- it just passes it through.  When the plugin
-is applied, this transformation will be used on every `Inline` element in
-the document, and `LaTeX ` will be inserted where needed in math elements.
+of `Inline`, see the module `Text.Pandoc.Definition`.)  The `transform`
+function in a plugin need not be `Inline -> Inline`, but it must have
+type `a -> a` or `a -> IO a`, where `a` is `Pandoc`, `Block`, `Inline`,
+`[Block]`, or `[Inline]`.
+
+The next line says that when the input matches the pattern `Math x y`,
+the string `LaTeX ` should be inserted at the beginning of `y`. (The `x`
+just specifies whether the math element is inline or display math, so
+we leave it alone.) The last line says, in effect, that the `transform`
+function has no effect on any other kind of `Inline` element -- it just
+passes it through. When the plugin is applied, this transformation will
+be used on every `Inline` element in the document, and `LaTeX ` will be
+inserted where needed in math elements.
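
To make "used on every `Inline` element" concrete, here is a minimal sketch of such a traversal. It is not how pandoc itself applies plugins: the types are simplified stand-ins for those in `Text.Pandoc.Definition`, `transform` is reconstructed from the description above, and `everyInline` and `applyPlugin` are hypothetical helper names.

```haskell
-- Simplified stand-ins, not pandoc's real types.
data MathType = InlineMath | DisplayMath
  deriving (Show, Eq)

data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Math MathType String
  deriving (Show, Eq)

data Block = Para [Inline]
  deriving (Show, Eq)

-- The rewrite described above: prefix the math string with "LaTeX ",
-- leave the inline/display marker and all other elements alone.
transform :: Inline -> Inline
transform (Math t s) = Math t ("LaTeX " ++ s)
transform x          = x

-- Apply f to every Inline, children first, so that elements nested
-- inside emphasis are reached too.
everyInline :: (Inline -> Inline) -> Inline -> Inline
everyInline f (Emph xs) = f (Emph (map (everyInline f) xs))
everyInline f x         = f x

-- Run the transformation over a whole (stand-in) document.
applyPlugin :: (Inline -> Inline) -> [Block] -> [Block]
applyPlugin f = map (\(Para xs) -> Para (map (everyInline f) xs))

main :: IO ()
main = print (applyPlugin transform doc)
  where
    doc = [Para [Str "Energy:", Space, Emph [Math InlineMath "e = mc^2"]]]
```

Because the traversal is separate from `transform`, the plugin author only has to say what happens to a single element; reaching the nested math inside the `Emph` is handled by the walk.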

 To use this plugin, we just specify the module (or alternatively the filename)
 with the `--plugins` option:
@@ -1186,13 +1218,28 @@ with the `--plugins` option:
    >

 Let's look at a more complex example, involving IO.  Suppose we want to include
-some graphviz diagrams in our document.  Of course, we could use a Makefile to
-generate the diagrams, then  use regular images in our document. But wouldn't it
-be nicer just to include the graphviz code in the document itself, perhaps in
-a specially marked delimited code block?
+some [graphviz](http://www.graphviz.org/) diagrams in our document.
+Of course, we could use a Makefile to generate the diagrams, then use
+regular images in our document. But wouldn't it be nicer just to include
+the graphviz code in the document itself, perhaps in a specially marked
+delimited code block?

    ~~~ {.dot name="diagram1"}
-    digraph G {Hello->World}
+    graph G {
+      e
+      subgraph clusterA {
+        a -- b;
+        subgraph clusterC {
+          C -- D;
+        }
+      }
+      subgraph clusterB {
+        d -- f
+      }
+      d -- D
+      e -- clusterB
+      clusterC -- clusterB
+    }
    ~~~
 
 This can be accomplished by a plugin: