Removed plugin support.
Plugins will probably be added in a later version, but for now we want to avoid the heavy GHC API dependency. Also, plugins are very slow with the current system. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1548 788f1e2b-df1e-0410-8736-df70ead52e1b
This commit is contained in:
parent
bb639d5506
commit
5f3a147dfd
6 changed files with 8 additions and 332 deletions
217
README
217
README
|
@ -309,14 +309,6 @@ For further documentation, see the `pandoc(1)` man page.
|
|||
repeatedly to include multiple files. They will be included in the
|
||||
order specified.
|
||||
|
||||
`-P` *MODULE[,MODULE...]*, `--plugins`*=MODULE[,MODULE...]*
|
||||
: specifies plugins to load, by module name or source file pathname.
|
||||
Plugins should export a function `transform` of type `a -> a`
|
||||
or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`,
|
||||
`[Inline]`, or `[Block]`. This function will be used to transform
|
||||
the pandoc document after it is parsed by the reader and before it
|
||||
is written out by the writer. (See below on [Plugins](#plugins).)
|
||||
|
||||
`-T` or `--title-prefix` *string*
|
||||
: includes *string* as a prefix at the beginning of the title that
|
||||
appears in the HTML header (but not in the title as it appears at
|
||||
|
@ -1124,212 +1116,3 @@ ordinary HTML (without bird tracks).
|
|||
writes HTML with the Haskell code in bird tracks, so it can be copied
|
||||
and pasted as literate Haskell source.
|
||||
|
||||
Plugins
|
||||
=======
|
||||
|
||||
Pandoc's plugin system allows users to modify pandoc's behavior by writing
|
||||
short Haskell programs. To how this works, and why it is useful, we
|
||||
need to understand that pandoc transforms one format (the source
|
||||
format) into another (the target format) by first converting from the
|
||||
source format into a Haskell data structure representing the document,
|
||||
and then converting this data structure into the target format. For
|
||||
example:
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
Document Format Contents
|
||||
------------ --------- -----------------------------------------------------
|
||||
source markdown `Hello *world*.`
|
||||
|
||||
↓ ↓ ↓
|
||||
|
||||
intermediate native `Pandoc (Meta [] [] "")
|
||||
[Para [Str "Hello",Space,Emph [Str "world"],Str "."]]`
|
||||
|
||||
↓ ↓ ↓
|
||||
|
||||
target HTML `<p>Hello <em>world</em>.</p>`
|
||||
-------------------------------------------------------------------------------
|
||||
|
||||
We can use standard text-processing tools (`perl`, `sed`, `awk`, etc.)
|
||||
to modify the source or target documents. But what if we want to modify
|
||||
the intermediate representation -- the parsed document -- before it is
|
||||
written to the target format? That's where plugins are needed.
|
||||
A plugin is a Haskell module that exports a function `transform`, which
|
||||
will be used to transform the native representation, after it is generated
|
||||
by the reader, but before it has been transformed by the writer.
|
||||
|
||||
An example will help make this clearer. Suppose we want to use pandoc with
|
||||
the WordPress blog engine. WordPress provides support for LaTeX math, but
|
||||
instead of `$e = mc^2$`, WordPress wants `$LaTeX e = mc^2$`. Prior to plugins,
|
||||
there was no good way to make pandoc do this. We could have tried using
|
||||
regex replacements on the markdown input or HTML output, but this would have
|
||||
been error-prone: if someone writes `$e = mc^2$` in a code block, for
|
||||
example, we wouldn't want to insert `LaTeX` there. There's no good way to
|
||||
identify the math chunks without parsing the whole document. And pandoc
|
||||
is already doing that, so why not make use of this work? By writing a
|
||||
plugin, we can. Here's the whole plugin:
|
||||
|
||||
~~~ {.haskell}
|
||||
-- WordPressPlugin.hs
|
||||
module WordPressPlugin (transform) where
|
||||
import Text.Pandoc
|
||||
|
||||
transform :: Inline -> Inline
|
||||
transform (Math x y) = Math x $ "LaTeX " ++ y
|
||||
transform x = x
|
||||
~~~
|
||||
|
||||
This is a Haskell program, but a very short one. The lines
|
||||
|
||||
~~~ {.haskell}
|
||||
module WordPressPlugin (transform) where
|
||||
import Text.Pandoc
|
||||
~~~
|
||||
|
||||
just define the name of the module (`WordPressPlugin`), the names of any
|
||||
exported functions (for a plugin, this will always just be `transform`),
|
||||
and the modules that will be used in the program itself (`Text.Pandoc`).
|
||||
Every plugin must export a function named `transform`.
|
||||
|
||||
The real meat of the program is the three-line definition of `transform`:
|
||||
|
||||
~~~ {.haskell}
|
||||
transform :: Inline -> Inline
|
||||
transform (Math x y) = Math x $ "LaTeX " ++ y
|
||||
transform x = x
|
||||
~~~
|
||||
|
||||
The first line defines the type of the function: it is a function that
|
||||
takes an `Inline` element and returns an `Inline` element. (For the definition
|
||||
of `Inline`, see the module `Text.Pandoc.Definition`.) The `transform`
|
||||
function in a plugin need not be `Inline -> Inline`, but it must have
|
||||
type `a -> a` or `a -> IO a`, where `a` is `Pandoc`, `Block`, `Inline`,
|
||||
`[Block]`, or `[Inline]`.
|
||||
|
||||
The next line says that when the input matches the pattern `Math x y`,
|
||||
the string `LaTeX ` should be inserted at the beginning of `y`. (The `x`
|
||||
just specifies whether the math element is inline or display math, so
|
||||
we leave it alone.) The last line says, in effect, that the `transform`
|
||||
function has no effect on any other kind of `Inline` element -- it just
|
||||
passes it through. When the plugin is applied, this transformation will
|
||||
be used on every `Inline` element in the document, and `LaTeX ` will be
|
||||
inserted where needed in math elements.
|
||||
|
||||
To use this plugin, we just specify the module (or alternatively the filename)
|
||||
with the `--plugins` option:
|
||||
|
||||
% echo "Hello, $e=mc^2$." | pandoc -m --plugins=WordPressPlugin.hs
|
||||
<p
|
||||
>Hello, <span class="LaTeX"
|
||||
>$LaTeX e=mc^2$</span
|
||||
>.</p
|
||||
>
|
||||
|
||||
Let's look at a more complex example, involving IO. Suppose we want to include
|
||||
some [graphviz](http://www.graphviz.org/) diagrams in our document.
|
||||
Of course, we could use a Makefile to generate the diagrams, then use
|
||||
regular images in our document. But wouldn't it be nicer just to include
|
||||
the graphviz code in the document itself, perhaps in a specially marked
|
||||
delimited code block?
|
||||
|
||||
~~~ {.dot name="diagram1"}
|
||||
graph G {
|
||||
e
|
||||
subgraph clusterA {
|
||||
a -- b;
|
||||
subgraph clusterC {
|
||||
C -- D;
|
||||
}
|
||||
}
|
||||
subgraph clusterB {
|
||||
d -- f
|
||||
}
|
||||
d -- D
|
||||
e -- clusterB
|
||||
clusterC -- clusterB
|
||||
}
|
||||
~~~
|
||||
|
||||
This can be accomplished by a plugin:
|
||||
|
||||
~~~ {.haskell}
|
||||
-- DotPlugin.hs
|
||||
module DotPlugin (transform) where
|
||||
import Text.Pandoc
|
||||
import Text.Pandoc.Shared
|
||||
import System.Process (readProcess)
|
||||
import Data.Char (ord)
|
||||
-- from the utf8-string package on HackageDB:
|
||||
import Data.ByteString.Lazy.UTF8 (fromString)
|
||||
-- from the SHA package on HackageDB:
|
||||
import Data.Digest.Pure.SHA
|
||||
|
||||
transform :: Block -> IO Block
|
||||
transform (CodeBlock (id, classes, namevals) contents) | "dot" `elem` classes = do
|
||||
let (name, outfile) = case lookup "name" namevals of
|
||||
Just fn -> ([Str fn], fn ++ ".png")
|
||||
Nothing -> ([], uniqueName contents ++ ".png")
|
||||
result <- readProcess "dot" ["-Tpng"] contents
|
||||
writeFile outfile result
|
||||
return $ Para [Image name (outfile, "")]
|
||||
transform x = return x
|
||||
|
||||
-- | Generate a unique filename given the file's contents.
|
||||
uniqueName :: String -> String
|
||||
uniqueName = showDigest . sha1 . fromString
|
||||
~~~
|
||||
|
||||
The heart of this plugin is the `transform` function, which converts a `Block`
|
||||
to a `Block`. Again, there are two clauses, one for code blocks that are marked
|
||||
with the "dot" class, one for all other blocks. Code blocks with ".dot" are
|
||||
replaced with links to an image file; this file is generated by running
|
||||
`dot -Tpng` on the contents of the code block.
|
||||
|
||||
Because `transform` performs file reads and writes, it needs to be in the
|
||||
IO monad, hence the type: `Block -> IO Block`.
|
||||
|
||||
One more example. Suppose we want emphasized text to be CAPITALIZED
|
||||
instead of italicized. We could use a plugin:
|
||||
|
||||
~~~ {.haskell}
|
||||
module CapitalizeEmphasisPlugin (transform) where
|
||||
import Text.Pandoc
|
||||
import Data.Char (toUpper)
|
||||
|
||||
transform :: [Inline] -> [Inline]
|
||||
transform (Emph xs : ys) = processWith capStr xs ++ transform ys
|
||||
transform (x : ys) = x : transform ys
|
||||
transform [] = []
|
||||
|
||||
capStr :: Inline -> Inline
|
||||
capStr (Str x) = Str (map toUpper x)
|
||||
capStr x = x
|
||||
~~~
|
||||
|
||||
Here `transform` converts a whole list of `Inline` elements to another
|
||||
such list. The key clause is
|
||||
|
||||
~~~ {.haskell}
|
||||
transform (Emph xs : ys) = processWith capStr xs ++ transform ys
|
||||
~~~
|
||||
|
||||
This applies the `capStr` function recursively to all inlines in the
|
||||
list of emphasized inlines and puts the transformed list in place
|
||||
of the original. `capStr` is a simple `Inline` transformation that
|
||||
capitalizes `Str` elements and leaves everything else alone. The
|
||||
function `processWith`, defined in `Text.Pandoc.Definition`, uses some
|
||||
`Data.Generics` magic to apply its argument (here `capStr`) to every
|
||||
`Inline` element in a list, including elements that are deeply buried in
|
||||
other elements. Thus
|
||||
|
||||
processWith captStr [Str "one", Strong [Str "two", Space]] ==>
|
||||
[Str "ONE", Strong [Str "TWO", Space]]
|
||||
|
||||
There are other sample plugins in the `plugins` subdirectory of the
|
||||
pandoc source code.
|
||||
|
||||
**Note:** Do not attempt to use plugins when running pandoc in the
|
||||
directory containing pandoc's source code. The interpreter will try to
|
||||
load the files directly from the source code, rather than reading the compiled
|
||||
versions, and pandoc will hang.
|
||||
|
||||
|
|
16
changelog
16
changelog
|
@ -16,19 +16,9 @@ pandoc (1.2)
|
|||
+ Added tests (run only if highlighting support compiled in).
|
||||
+ Documented lhs features in man page and README.
|
||||
|
||||
* Added a plugin system, based on hint. This allows users to
|
||||
write small interpreted Haskell programs that transform the
|
||||
AST between the reader and the writer.
|
||||
|
||||
+ In Text.Pandoc.Definition, added processWith, processWithM,
|
||||
and queryWith, and deprecated processPandoc and queryPandoc
|
||||
for these more general functions, which are useful in writing
|
||||
plugins.
|
||||
+ Added module Text.Pandoc.Plugins.
|
||||
+ Added a --plugins option to pandoc, and code to run the parsed
|
||||
document through all the plugins.
|
||||
+ Provided five sample plugin files in the plugins/ directory.
|
||||
+ Documented --plugin in the pandoc man page and README.
|
||||
* In Text.Pandoc.Definition, added processWith, processWithM,
|
||||
and queryWith, and deprecated processPandoc and queryPandoc
|
||||
for these more general functions.
|
||||
|
||||
* Fixed bug in mediawiki writer: improper closing tags in tables.
|
||||
Thanks to Benct Philip Jonsson for reporting the bug.
|
||||
|
|
|
@ -181,14 +181,6 @@ to Pandoc. Or use `html2markdown`(1), a wrapper around `pandoc`.
|
|||
`context`, `docbook`, `man`, `markdown`, `opendocument`,
|
||||
`rst`, `rtf`).
|
||||
|
||||
-P *MODULE[,MODULE...]*, \--plugins=*MODULE[,MODULE...]*
|
||||
: Specify plugins to load, by module name or source file pathname.
|
||||
Plugins should export a function `transform` of type `a -> a`
|
||||
or `a -> IO a`, where `a` is `Inline`, `Block`, `Pandoc`,
|
||||
`[Inline]`, or `[Block]`. This function will be used to transform
|
||||
the pandoc document after it is parsed by the reader and before it
|
||||
is written out by the writer.
|
||||
|
||||
-T *STRING*, \--title-prefix=*STRING*
|
||||
: Specify *STRING* as a prefix to the HTML window title.
|
||||
|
||||
|
|
11
pandoc.cabal
11
pandoc.cabal
|
@ -38,13 +38,7 @@ Data-Files:
|
|||
-- documentation
|
||||
README, INSTALL, COPYRIGHT, BUGS, changelog,
|
||||
-- wrappers
|
||||
markdown2pdf, html2markdown, hsmarkdown,
|
||||
-- sample plugins
|
||||
plugins/WordPressPlugin.hs,
|
||||
plugins/CapitalizeEmphasisPlugin.hs,
|
||||
plugins/IncludeFilePlugin.hs,
|
||||
plugins/DotPlugin.hs,
|
||||
plugins/ListLinksPlugin.hs
|
||||
markdown2pdf, html2markdown, hsmarkdown
|
||||
Extra-Source-Files:
|
||||
-- sources for man pages
|
||||
man/man1/pandoc.1.md, man/man1/markdown2pdf.1.md,
|
||||
|
@ -159,7 +153,7 @@ Library
|
|||
mtl >= 1.1, network >= 2, filepath >= 1.1,
|
||||
process >= 1, directory >= 1, template-haskell >= 2.2,
|
||||
bytestring >= 0.9, zip-archive >= 0.1.1,
|
||||
utf8-string >= 0.3, old-time >= 1, hint >= 0.3.0.0
|
||||
utf8-string >= 0.3, old-time >= 1
|
||||
if impl(ghc >= 6.10)
|
||||
Build-depends: base >= 4, syb
|
||||
else
|
||||
|
@ -178,7 +172,6 @@ Library
|
|||
Text.Pandoc.CharacterReferences,
|
||||
Text.Pandoc.Shared,
|
||||
Text.Pandoc.ODT,
|
||||
Text.Pandoc.Plugins,
|
||||
Text.Pandoc.LaTeXMathML,
|
||||
Text.Pandoc.DefaultHeaders,
|
||||
Text.Pandoc.Highlighting,
|
||||
|
|
|
@ -1,69 +0,0 @@
|
|||
{-
|
||||
Copyright (C) 2008 John MacFarlane <jgm@berkeley.edu>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
-}
|
||||
|
||||
{- |
|
||||
Module : Text.Pandoc.Pluigns
|
||||
Copyright : Copyright (C) 2008 John MacFarlane
|
||||
License : GNU GPL, version 2 or above
|
||||
|
||||
Maintainer : John MacFarlane <jgm@berkeley.edu>
|
||||
Stability : alpha
|
||||
Portability : portable
|
||||
|
||||
Support for plugins.
|
||||
-}
|
||||
|
||||
module Text.Pandoc.Plugins (getPlugin)
|
||||
where
|
||||
|
||||
import Language.Haskell.Interpreter
|
||||
import Text.Pandoc
|
||||
import Control.Monad (unless, liftM)
|
||||
import Control.Monad.Error (throwError)
|
||||
import Data.List (isInfixOf)
|
||||
|
||||
-- | Returns the function named @transform@ in the specified
|
||||
-- module. The module may be identified either by module name
|
||||
-- or by path name. The @transform@ function should have type
|
||||
-- @a -> a@ or @a -> IO a@, where @a@ is an instance of 'Data':
|
||||
-- for example, @Pandoc -> Pandoc@, @Inline -> IO Inline@,
|
||||
-- @Block -> Block@, or @[Inline] -> IO [Inline]@.
|
||||
getPlugin :: String -> IO (Pandoc -> IO Pandoc)
|
||||
getPlugin modsrc = do
|
||||
res <- runInterpreter (evaluatePlugin modsrc)
|
||||
case res of
|
||||
Right func -> return func
|
||||
Left (WontCompile xs) -> error $ "WontCompile error for plugin '" ++ modsrc ++ "'\n" ++ unlines (map errMsg xs)
|
||||
Left (NotAllowed x) -> error $ "NotAllowed error for plugin '" ++ modsrc ++ "'\n" ++ x
|
||||
Left (UnknownError x) -> error $ "UnknownError for plugin '" ++ modsrc ++ "'\n" ++ x
|
||||
Left (GhcException x) -> error $ "GhcException for plugin '" ++ modsrc ++ "'\n" ++ x
|
||||
|
||||
evaluatePlugin :: String -> Interpreter (Pandoc -> IO Pandoc)
|
||||
evaluatePlugin modsrc = do
|
||||
set [installedModulesInScope := False]
|
||||
loadModules [modsrc]
|
||||
modnames <- getLoadedModules
|
||||
setTopLevelModules modnames
|
||||
setImports ["Prelude", "Text.Pandoc", "Text.Pandoc.Definition"]
|
||||
exports <- liftM concat $ mapM getModuleExports modnames
|
||||
unless ((Fun "transform") `elem` exports) $
|
||||
throwError $ UnknownError $ "The plugin module must define a function 'transform'."
|
||||
transformType <- typeOf "transform"
|
||||
if "-> IO" `isInfixOf` transformType
|
||||
then interpret "processWithM transform" (as :: Pandoc -> IO Pandoc)
|
||||
else interpret "return . (processWith transform)" (as :: Pandoc -> IO Pandoc)
|
|
@ -32,7 +32,7 @@ writers.
|
|||
module Main where
|
||||
import Text.Pandoc
|
||||
import Text.Pandoc.ODT
|
||||
import Text.Pandoc.Shared ( HTMLMathMethod (..), splitBy, tabFilter, ObfuscationMethod (..) )
|
||||
import Text.Pandoc.Shared ( HTMLMathMethod (..), tabFilter, ObfuscationMethod (..) )
|
||||
import Text.Pandoc.Highlighting ( languages )
|
||||
import System.Environment ( getArgs, getProgName, getEnvironment )
|
||||
import System.Exit ( exitWith, ExitCode (..) )
|
||||
|
@ -48,8 +48,7 @@ import System.IO.UTF8
|
|||
import Text.CSL
|
||||
import Text.Pandoc.Biblio
|
||||
#endif
|
||||
import Text.Pandoc.Plugins (getPlugin)
|
||||
import Control.Monad (foldM, when, unless)
|
||||
import Control.Monad (when, unless)
|
||||
|
||||
copyrightMessage :: String
|
||||
copyrightMessage = "\nCopyright (C) 2006-8 John MacFarlane\n" ++
|
||||
|
@ -365,15 +364,6 @@ options =
|
|||
"FILENAME")
|
||||
"" -- "File to use for custom header (implies -s)"
|
||||
|
||||
, Option "P" ["plugins"]
|
||||
(ReqArg
|
||||
(\arg opt -> do
|
||||
let pluginModules = splitBy ',' arg
|
||||
plugins <- mapM getPlugin pluginModules
|
||||
return opt { optPlugins = plugins })
|
||||
"MODULE[,MODULE...]")
|
||||
"" -- "Haskell modules"
|
||||
|
||||
, Option "T" ["title-prefix"]
|
||||
(ReqArg
|
||||
(\arg opt -> return opt { optTitlePrefix = arg,
|
||||
|
@ -540,7 +530,6 @@ main = do
|
|||
, optReferenceLinks = referenceLinks
|
||||
, optWrapText = wrap
|
||||
, optSanitizeHTML = sanitize
|
||||
, optPlugins = plugins
|
||||
, optEmailObfuscation = obfuscationMethod
|
||||
#ifdef _CITEPROC
|
||||
, optBiblioFile = biblioFile
|
||||
|
@ -653,9 +642,7 @@ main = do
|
|||
return doc
|
||||
#endif
|
||||
|
||||
doc'' <- foldM (flip ($)) doc' plugins
|
||||
|
||||
let writerOutput = writer writerOptions doc'' ++ "\n"
|
||||
let writerOutput = writer writerOptions doc' ++ "\n"
|
||||
|
||||
case writerName' of
|
||||
"odt" -> saveOpenDocumentAsODT outputFile sourceDirRelative writerOutput
|
||||
|
|
Loading…
Add table
Reference in a new issue