Added MediaWiki writer.

+ Added Text/Pandoc/Writers/MediaWiki.hs
+ Added to pandoc.cabal
+ Added to Main.hs and Text/Pandoc.hs exports.
+ Added tests for mediawiki writer & table writer.
+ Added information on MediaWiki writer to README.
+ Added mediawiki markup to list of formats in pandoc(1) man page.
+ Updated debian/control with mediawiki output format.
+ Added mediawiki markup to description in macports portfile.
+ Updated freebsd package description to include mediawiki format.
+ Mention MediaWiki output format in web page index.
+ Added mediawiki demo to website.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1337 788f1e2b-df1e-0410-8736-df70ead52e1b
This commit is contained in:
fiddlosopher 2008-07-27 03:25:51 +00:00
parent 891e06d4aa
commit ddaec05d8f
15 changed files with 1172 additions and 30 deletions

View file

@ -88,6 +88,7 @@ writers = [("native" , (writeDoc, ""))
,("man" , (writeMan, ""))
,("markdown" , (writeMarkdown, ""))
,("rst" , (writeRST, ""))
,("mediawiki" , (writeMediaWiki, ""))
,("rtf" , (writeRTF, defaultRTFHeader))
]

28
README
View file

@ -6,9 +6,9 @@ Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [reStructuredText], [HTML], and [LaTeX], and
it can write [markdown], [reStructuredText], [HTML], [LaTeX], [ConTeXt],
[RTF], [DocBook XML], [OpenDocument XML], [GNU Texinfo], [groff man]
pages, and [S5] HTML slide shows. Pandoc's version of markdown contains
some enhancements, like footnotes and embedded LaTeX.
[RTF], [DocBook XML], [OpenDocument XML], [GNU Texinfo], [MediaWiki markup],
[groff man] pages, and [S5] HTML slide shows. Pandoc's version of
markdown contains some enhancements, like footnotes and embedded LaTeX.
In contrast to existing tools for converting markdown to HTML, which
use regex substitutions, Pandoc has a modular design: it consists of a
@ -26,6 +26,7 @@ or output format requires only adding a reader or writer.
[RTF]: http://en.wikipedia.org/wiki/Rich_Text_Format
[DocBook XML]: http://www.docbook.org/
[OpenDocument XML]: http://opendocument.xml.org/
[MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting
[groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html
[Haskell]: http://www.haskell.org/
[GNU Texinfo]: http://www.gnu.org/software/texinfo/
@ -73,15 +74,16 @@ To convert `hello.html` from html to markdown:
Supported output formats include `markdown`, `latex`, `context`
(ConTeXt), `html`, `rtf` (rich text format), `rst` (reStructuredText),
`docbook` (DocBook XML), `opendocument` (OpenDocument XML), `texinfo`,
`man` (groff man), and `s5` (which produces an HTML file that acts
like powerpoint). Supported input formats include `markdown`, `html`,
`latex`, and `rst`. Note that the `rst` reader only parses a subset of
reStructuredText syntax. For example, it doesn't handle tables, option
lists, or footnotes. But for simple documents it should be adequate. The
`latex` and `html` readers are also limited in what they can do. Because
the `html` reader is picky about the HTML it parses, it is recommended
that you pipe HTML through [HTML Tidy] before sending it to `pandoc`, or
use the `html2markdown` script described below.
`mediawiki` (MediaWiki markup), `man` (groff man), and `s5` (which
produces an HTML file that acts like powerpoint). Supported input
formats include `markdown`, `html`, `latex`, and `rst`. Note that the
`rst` reader only parses a subset of reStructuredText syntax. For
example, it doesn't handle tables, option lists, or footnotes. But for
simple documents it should be adequate. The `latex` and `html` readers
are also limited in what they can do. Because the `html` reader is picky
about the HTML it parses, it is recommended that you pipe HTML through
[HTML Tidy] before sending it to `pandoc`, or use the `html2markdown`
script described below.
If you don't specify a reader or writer explicitly, `pandoc` will
try to determine the input and output format from the extensions of
@ -927,6 +929,8 @@ In Texinfo output, it will be rendered inside a `@math` command.
In groff man output, it will be rendered verbatim without $'s.
In MediaWiki output, it will be rendered inside `<math>` tags.
In RTF, Docbook, and OpenDocument output, it will be rendered, as far as
possible, using unicode characters, and will otherwise appear verbatim.
Unknown commands and symbols, and commands that cannot be dealt with

View file

@ -76,6 +76,7 @@ module Text.Pandoc
, writeDocbook
, writeOpenDocument
, writeMan
, writeMediaWiki
, writeRTF
, prettyPandoc
-- * Writer options used in writers
@ -105,6 +106,7 @@ import Text.Pandoc.Writers.Docbook
import Text.Pandoc.Writers.OpenDocument
import Text.Pandoc.Writers.Man
import Text.Pandoc.Writers.RTF
import Text.Pandoc.Writers.MediaWiki
import Text.Pandoc.DefaultHeaders
import Text.Pandoc.UTF8
import Text.Pandoc.Shared

View file

@ -0,0 +1,394 @@
{-
Copyright (C) 2008 John MacFarlane <jgm@berkeley.edu>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-}
{- |
Module : Text.Pandoc.Writers.MediaWiki
Copyright : Copyright (C) 2006-7 John MacFarlane
License : GNU GPL, version 2 or above
Maintainer : John MacFarlane <jgm@berkeley.edu>
Stability : alpha
Portability : portable
Conversion of 'Pandoc' documents to MediaWiki markup.
MediaWiki: <http://www.mediawiki.org/wiki/MediaWiki>
-}
module Text.Pandoc.Writers.MediaWiki ( writeMediaWiki ) where
import Text.Pandoc.Definition
import Text.Pandoc.Shared
import Text.Pandoc.XML ( escapeStringForXML )
import Data.List ( intersect )
import Network.URI ( isURI )
import Control.Monad.State
data WriterState = WriterState {
stNotes :: Bool -- True if there are notes
, stListLevel :: [Char] -- String at beginning of list items, e.g. "**"
, stUseTags :: Bool -- True if we should use HTML tags because we're in a complex list
}
-- | Convert Pandoc to MediaWiki.
writeMediaWiki :: WriterOptions -> Pandoc -> String
writeMediaWiki opts document =
evalState (pandocToMediaWiki opts document)
(WriterState { stNotes = False, stListLevel = [], stUseTags = False })
-- | Return MediaWiki representation of document.
pandocToMediaWiki :: WriterOptions -> Pandoc -> State WriterState String
pandocToMediaWiki opts (Pandoc _ blocks) = do
let before = writerIncludeBefore opts
let after = writerIncludeAfter opts
let head' = if writerStandalone opts
then writerHeader opts
else ""
let toc = if writerTableOfContents opts
then "__TOC__\n"
else ""
body <- blockListToMediaWiki opts blocks
notesExist <- get >>= return . stNotes
let notes = if notesExist
then "\n== Notes ==\n<references />"
else ""
return $ head' ++ before ++ toc ++ body ++ after ++ notes
-- | Escape special characters for MediaWiki.
escapeString :: String -> String
escapeString = escapeStringForXML
-- | Convert Pandoc block element to MediaWiki.
blockToMediaWiki :: WriterOptions -- ^ Options
-> Block -- ^ Block element
-> State WriterState String
blockToMediaWiki _ Null = return ""
blockToMediaWiki opts (Plain inlines) =
inlineListToMediaWiki opts inlines
blockToMediaWiki opts (Para inlines) = do
useTags <- get >>= return . stUseTags
listLevel <- get >>= return . stListLevel
contents <- inlineListToMediaWiki opts inlines
return $ if useTags
then "<p>" ++ contents ++ "</p>"
else contents ++ if null listLevel then "\n" else ""
blockToMediaWiki _ (RawHtml str) = return str
blockToMediaWiki _ HorizontalRule = return "\n-----\n"
blockToMediaWiki opts (Header level inlines) = do
contents <- inlineListToMediaWiki opts inlines
let eqs = replicate (level + 1) '='
return $ eqs ++ " " ++ contents ++ " " ++ eqs ++ "\n"
blockToMediaWiki _ (CodeBlock (_,classes,_) str) = do
let at = classes `intersect` ["actionscript", "ada", "apache", "applescript", "asm", "asp",
"autoit", "bash", "blitzbasic", "bnf", "c", "c_mac", "caddcl", "cadlisp", "cfdg", "cfm",
"cpp", "cpp-qt", "csharp", "css", "d", "delphi", "diff", "div", "dos", "eiffel", "fortran",
"freebasic", "gml", "groovy", "html4strict", "idl", "ini", "inno", "io", "java", "java5",
"javascript", "latex", "lisp", "lua", "matlab", "mirc", "mpasm", "mysql", "nsis", "objc",
"ocaml", "ocaml-brief", "oobas", "oracle8", "pascal", "perl", "php", "php-brief", "plsql",
"python", "qbasic", "rails", "reg", "robots", "ruby", "sas", "scheme", "sdlbasic",
"smalltalk", "smarty", "sql", "tcl", "", "thinbasic", "tsql", "vb", "vbnet", "vhdl",
"visualfoxpro", "winbatch", "xml", "xpp", "z80"]
let (beg, end) = if null at
then ("<pre>", "</pre>")
else ("<source lang=\"" ++ head at ++ "\">", "</source>")
return $ beg ++ escapeString str ++ end
blockToMediaWiki opts (BlockQuote blocks) = do
contents <- blockListToMediaWiki opts blocks
return $ "<blockquote>" ++ contents ++ "</blockquote>"
blockToMediaWiki opts (Table caption aligns widths headers rows) = do
let alignStrings = map alignmentToString aligns
captionDoc <- if null caption
then return ""
else do
c <- inlineListToMediaWiki opts caption
return $ "<caption>" ++ c ++ "</caption>"
colHeads <- colHeadsToMediaWiki opts alignStrings widths headers
rows' <- mapM (tableRowToMediaWiki opts alignStrings) rows
return $ "<table>\n" ++ captionDoc ++ colHeads ++ vcat rows' ++ "\n</table>"
blockToMediaWiki opts x@(BulletList items) = do
oldUseTags <- get >>= return . stUseTags
let useTags = oldUseTags || not (isSimpleList x)
if useTags
then do
modify $ \s -> s { stUseTags = True }
contents <- mapM (listItemToMediaWiki opts) items
modify $ \s -> s { stUseTags = oldUseTags }
return $ "<ul>\n" ++ vcat contents ++ "</ul>\n"
else do
modify $ \s -> s { stListLevel = stListLevel s ++ "*" }
contents <- mapM (listItemToMediaWiki opts) items
modify $ \s -> s { stListLevel = init (stListLevel s) }
return $ vcat contents
blockToMediaWiki opts x@(OrderedList attribs items) = do
oldUseTags <- get >>= return . stUseTags
let useTags = oldUseTags || not (isSimpleList x)
if useTags
then do
modify $ \s -> s { stUseTags = True }
contents <- mapM (listItemToMediaWiki opts) items
modify $ \s -> s { stUseTags = oldUseTags }
return $ "<ol" ++ listAttribsToString attribs ++ ">\n" ++ vcat contents ++ "</ol>\n"
else do
modify $ \s -> s { stListLevel = stListLevel s ++ "#" }
contents <- mapM (listItemToMediaWiki opts) items
modify $ \s -> s { stListLevel = init (stListLevel s) }
return $ vcat contents
blockToMediaWiki opts x@(DefinitionList items) = do
oldUseTags <- get >>= return . stUseTags
let useTags = oldUseTags || not (isSimpleList x)
if useTags
then do
modify $ \s -> s { stUseTags = True }
contents <- mapM (definitionListItemToMediaWiki opts) items
modify $ \s -> s { stUseTags = oldUseTags }
return $ "<dl>\n" ++ vcat contents ++ "</dl>\n"
else do
modify $ \s -> s { stListLevel = stListLevel s ++ ";" }
contents <- mapM (definitionListItemToMediaWiki opts) items
modify $ \s -> s { stListLevel = init (stListLevel s) }
return $ vcat contents
-- Auxiliary functions for lists:
-- | Convert ordered list attributes to HTML attribute string
listAttribsToString :: ListAttributes -> String
listAttribsToString (startnum, numstyle, _) =
let numstyle' = camelCaseToHyphenated $ show numstyle
in (if startnum /= 1
then " start=\"" ++ show startnum ++ "\""
else "") ++
(if numstyle /= DefaultStyle
then " style=\"list-style-type: " ++ numstyle' ++ ";\""
else "")
-- | Convert bullet or ordered list item (list of blocks) to MediaWiki.
listItemToMediaWiki :: WriterOptions -> [Block] -> State WriterState String
listItemToMediaWiki opts items = do
contents <- blockListToMediaWiki opts items
useTags <- get >>= return . stUseTags
if useTags
then return $ "<li>" ++ contents ++ "</li>"
else do
marker <- get >>= return . stListLevel
return $ marker ++ " " ++ contents
-- | Convert definition list item (label, list of blocks) to MediaWiki.
definitionListItemToMediaWiki :: WriterOptions
-> ([Inline],[Block])
-> State WriterState String
definitionListItemToMediaWiki opts (label, items) = do
labelText <- inlineListToMediaWiki opts label
contents <- blockListToMediaWiki opts items
useTags <- get >>= return . stUseTags
if useTags
then return $ "<dt>" ++ labelText ++ "</dt>\n<dd>" ++ contents ++ "</dd>"
else do
marker <- get >>= return . stListLevel
return $ marker ++ " " ++ labelText ++ "\n" ++ (init marker ++ ": ") ++ contents
-- | True if the list can be handled by simple wiki markup, False if HTML tags will be needed.
isSimpleList :: Block -> Bool
isSimpleList x =
case x of
BulletList items -> all isSimpleListItem items
OrderedList (num, sty, _) items -> all isSimpleListItem items &&
num == 1 && sty `elem` [DefaultStyle, Decimal]
DefinitionList items -> all isSimpleListItem $ map snd items
_ -> False
-- | True if list item can be handled with the simple wiki syntax. False if
-- HTML tags will be needed.
isSimpleListItem :: [Block] -> Bool
isSimpleListItem [] = True
isSimpleListItem [x] =
case x of
Plain _ -> True
Para _ -> True
BulletList _ -> isSimpleList x
OrderedList _ _ -> isSimpleList x
DefinitionList _ -> isSimpleList x
_ -> False
isSimpleListItem [x, y] | isPlainOrPara x =
case y of
BulletList _ -> isSimpleList y
OrderedList _ _ -> isSimpleList y
DefinitionList _ -> isSimpleList y
_ -> False
isSimpleListItem _ = False
isPlainOrPara :: Block -> Bool
isPlainOrPara (Plain _) = True
isPlainOrPara (Para _) = True
isPlainOrPara _ = False
tr :: String -> String
tr x = "<tr>\n" ++ x ++ "\n</tr>"
-- | Concatenates strings with line breaks between them.
vcat :: [String] -> String
vcat [] = ""
vcat [x] = x
vcat (x:xs) = x ++ "\n" ++ vcat xs
-- Auxiliary functions for tables:
colHeadsToMediaWiki :: WriterOptions
-> [[Char]]
-> [Float]
-> [[Block]]
-> State WriterState String
colHeadsToMediaWiki opts alignStrings widths headers = do
heads <- sequence $ zipWith3
(\alignment columnwidth item -> tableItemToMediaWiki opts "th" alignment columnwidth item)
alignStrings widths headers
return $ tr $ vcat heads
alignmentToString :: Alignment -> [Char]
alignmentToString alignment = case alignment of
AlignLeft -> "left"
AlignRight -> "right"
AlignCenter -> "center"
AlignDefault -> "left"
tableRowToMediaWiki :: WriterOptions
-> [[Char]]
-> [[Block]]
-> State WriterState String
tableRowToMediaWiki opts aligns columns =
(sequence $ zipWith3 (tableItemToMediaWiki opts "td") aligns (repeat 0) columns) >>=
return . tr . vcat
tableItemToMediaWiki :: WriterOptions
-> [Char]
-> [Char]
-> Float
-> [Block]
-> State WriterState String
tableItemToMediaWiki opts tag' align' width' item = do
contents <- blockListToMediaWiki opts item
let attrib = " align=\"" ++ align' ++ "\"" ++
if width' /= 0
then " style=\"width: " ++ (show (truncate (100 * width') :: Integer)) ++ "%;\""
else ""
return $ "<" ++ tag' ++ attrib ++ ">" ++ contents ++ "<" ++ tag' ++ ">"
-- | Convert list of Pandoc block elements to MediaWiki.
blockListToMediaWiki :: WriterOptions -- ^ Options
-> [Block] -- ^ List of block elements
-> State WriterState String
blockListToMediaWiki opts blocks =
mapM (blockToMediaWiki opts) blocks >>= return . vcat
-- | Convert list of Pandoc inline elements to MediaWiki.
inlineListToMediaWiki :: WriterOptions -> [Inline] -> State WriterState String
inlineListToMediaWiki opts lst =
mapM (inlineToMediaWiki opts) lst >>= return . concat
-- | Convert Pandoc inline element to MediaWiki.
inlineToMediaWiki :: WriterOptions -> Inline -> State WriterState String
inlineToMediaWiki opts (Emph lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "''" ++ contents ++ "''"
inlineToMediaWiki opts (Strong lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "'''" ++ contents ++ "'''"
inlineToMediaWiki opts (Strikeout lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "<s>" ++ contents ++ "</s>"
inlineToMediaWiki opts (Superscript lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "<sup>" ++ contents ++ "</sup>"
inlineToMediaWiki opts (Subscript lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "<sub>" ++ contents ++ "</sub>"
inlineToMediaWiki opts (SmallCaps lst) = inlineListToMediaWiki opts lst
inlineToMediaWiki opts (Quoted SingleQuote lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "&lsquo;" ++ contents ++ "&rsquo;"
inlineToMediaWiki opts (Quoted DoubleQuote lst) = do
contents <- inlineListToMediaWiki opts lst
return $ "&ldquo;" ++ contents ++ "&rdquo;"
inlineToMediaWiki _ EmDash = return "&mdash;"
inlineToMediaWiki _ EnDash = return "&ndash;"
inlineToMediaWiki _ Apostrophe = return "&rsquo;"
inlineToMediaWiki _ Ellipses = return "&hellip;"
inlineToMediaWiki _ (Code str) =
return $ "<tt>" ++ (escapeString str) ++ "</tt>"
inlineToMediaWiki _ (Str str) = return $ escapeString str
inlineToMediaWiki _ (Math str) = return $ "<math>" ++ str ++ "</math>"
-- note: str should NOT be escaped
inlineToMediaWiki _ (TeX _) = return ""
inlineToMediaWiki _ (HtmlInline str) = return str
inlineToMediaWiki _ (LineBreak) = return "<br />\n"
inlineToMediaWiki _ Space = return " "
inlineToMediaWiki opts (Link txt (src, _)) = do
link <- inlineListToMediaWiki opts txt
let useAuto = txt == [Code src]
let src' = if isURI src
then src
else if take 1 src == "/"
then "http://{{SERVERNAME}}" ++ src
else "http://{{SERVERNAME}}/" ++ src
return $ if useAuto
then src'
else "[" ++ src' ++ " " ++ link ++ "]"
inlineToMediaWiki opts (Image alt (source, tit)) = do
alt' <- inlineListToMediaWiki opts alt
let txt = if (null tit)
then if null alt
then ""
else "|" ++ alt'
else "|" ++ tit
return $ "[[Image:" ++ source ++ txt ++ "]]"
inlineToMediaWiki opts (Note contents) = do
contents' <- blockListToMediaWiki opts contents
modify (\s -> s { stNotes = True })
return $ "<ref>" ++ contents' ++ "</ref>"
-- note - may not work for notes with multiple blocks

15
debian/control vendored
View file

@ -18,8 +18,9 @@ Description: general markup converter
Pandoc is a Haskell library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook,
OpenDocument XML, RTF, GNU Texinfo, groff man pages, and S5 HTML slide shows.
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook XML,
OpenDocument XML, RTF, GNU Texinfo, MediaWiki markup, groff man pages,
and S5 HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes, embedded LaTeX,
tables, definition lists, and other features. A compatibility mode is
@ -43,8 +44,9 @@ Description: general markup converter
Pandoc is a Haskell library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook,
OpenDocument XML, RTF, GNU Texinfo, groff man pages, and S5 HTML slide shows.
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook XML,
OpenDocument XML, RTF, GNU Texinfo, MediaWiki markup, groff man pages,
and S5 HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes, embedded LaTeX,
tables, definition lists, and other features. A compatibility mode is
@ -68,8 +70,9 @@ Description: general markup converter
Pandoc is a Haskell library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook,
OpenDocument XML, RTF, GNU Texinfo, groff man pages, and S5 HTML slide shows.
can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, DocBook XML,
OpenDocument XML, RTF, GNU Texinfo, MediaWiki markup, groff man pages,
and S5 HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes, embedded LaTeX,
tables, definition lists, and other features. A compatibility mode is

View file

@ -1,8 +1,8 @@
Pandoc is a command-line tool for converting from one markup format
to another. It can read markdown and (subsets of) reStructuredText,
HTML, and LaTeX, and it can write markdown, reStructuredText, HTML,
LaTeX, ConTeXt, DocBook, OpenDocument XML, RTF, GNU Texinfo, groff man
pages, and S5 HTML slide shows.
LaTeX, ConTeXt, DocBook XML, OpenDocument XML, RTF, GNU Texinfo,
MediaWiki markup, groff man pages, and S5 HTML slide shows.
Pandoc extends standard markdown syntax with footnotes, embedded LaTeX,
and other features. A compatibility mode is provided for those who

View file

@ -11,8 +11,8 @@ long_description \
Pandoc is a command-line tool for converting from one markup format \
to another. It can read markdown and (subsets of) reStructuredText, \
HTML, and LaTeX, and it can write markdown, reStructuredText, HTML, \
LaTeX, ConTeXt, DocBook, OpenDocument XML, RTF, Texinfo, groff man, \
and S5 HTML slide shows.
LaTeX, ConTeXt, DocBook XML, OpenDocument XML, RTF, Texinfo, \
MediaWiki markup, groff man, and S5 HTML slide shows.
homepage http://johnmacfarlane.net/pandoc/
platforms darwin

View file

@ -15,7 +15,8 @@ pandoc [*options*] [*input-file*]...
Pandoc converts files from one markup format to another. It can
read markdown and (subsets of) reStructuredText, HTML, and LaTeX, and
it can write markdown, reStructuredText, HTML, LaTeX, ConTeXt, Texinfo,
groff man, RTF, OpenDocument XML, DocBook XML, and S5 HTML slide shows.
groff man, MediaWiki markup, RTF, OpenDocument XML, DocBook XML,
and S5 HTML slide shows.
If no *input-file* is specified, input is read from STDIN.
Otherwise, the *input-files* are concatenated (with a blank
@ -67,9 +68,10 @@ to Pandoc. Or use `html2markdown`(1), a wrapper around `pandoc`.
: Specify output format. *FORMAT* can be `native` (native Haskell),
`markdown` (markdown or plain text), `rst` (reStructuredText),
`html` (HTML), `latex` (LaTeX), `context` (ConTeXt), `man` (groff man),
`texinfo` (GNU Texinfo), `docbook` (DocBook XML),
`opendocument` (OpenDocument XML), `s5` (S5 HTML and javascript slide
show), or `rtf` (rich text format).
`mediawiki` (MediaWiki markup), `texinfo` (GNU Texinfo),
`docbook` (DocBook XML), `opendocument` (OpenDocument XML),
`s5` (S5 HTML and javascript slide show),
or `rtf` (rich text format).
-s, \--standalone
: Produce output with an appropriate header and footer (e.g. a

View file

@ -17,7 +17,8 @@ Description: Pandoc is a Haskell library for converting from one markup
this library. It can read markdown and (subsets of)
reStructuredText, HTML, and LaTeX, and it can write
markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook,
OpenDocument, RTF, groff man pages, and S5 HTML slide shows.
OpenDocument, RTF, MediaWiki, groff man pages, and
S5 HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes,
embedded LaTeX, definition lists, tables, and other
@ -71,6 +72,7 @@ Library
Text.Pandoc.Writers.Man,
Text.Pandoc.Writers.Markdown,
Text.Pandoc.Writers.RST,
Text.Pandoc.Writers.MediaWiki,
Text.Pandoc.Writers.RTF,
Text.Pandoc.Writers.S5
Other-Modules: Text.Pandoc.XML

View file

@ -8,6 +8,7 @@
../pandoc -r native -s -w texinfo testsuite.native > writer.texinfo
../pandoc -r native -s -w rtf testsuite.native > writer.rtf
../pandoc -r native -s -w man testsuite.native > writer.man
../pandoc -r native -s -w mediawiki testsuite.native > writer.mediawiki
sed -e '/^, Header 1 \[Str "HTML",Space,Str "Blocks"\]/,/^, HorizontalRule/d' testsuite.native | ../pandoc -r native -w docbook -s > writer.docbook
sed -e '/^, Header 1 \[Str "LaTeX"\]/,/^, HorizontalRule/d' testsuite.native | ../pandoc -r native -w context -s > writer.context

View file

@ -14,7 +14,7 @@ unless (-x $script) { die "$script is not executable.\n"; }
print "Writer tests:\n";
my @writeformats = ("html", "latex", "texinfo", "rst", "rtf", "markdown", "man", "native"); # docbook, context, and s5 handled separately
my @writeformats = ("html", "latex", "texinfo", "rst", "rtf", "markdown", "mediawiki", "man", "native"); # docbook, context, and s5 handled separately
my $source = "testsuite.native";
sub test_results

123
tests/tables.mediawiki Normal file
View file

@ -0,0 +1,123 @@
Simple table with caption:
<table>
<caption>Demonstration of simple table syntax.</caption><tr>
<th align="right" style="width: 15%;">Right<th>
<th align="left" style="width: 8%;">Left<th>
<th align="center" style="width: 16%;">Center<th>
<th align="left" style="width: 12%;">Default<th>
</tr><tr>
<td align="right">12<td>
<td align="left">12<td>
<td align="center">12<td>
<td align="left">12<td>
</tr>
<tr>
<td align="right">123<td>
<td align="left">123<td>
<td align="center">123<td>
<td align="left">123<td>
</tr>
<tr>
<td align="right">1<td>
<td align="left">1<td>
<td align="center">1<td>
<td align="left">1<td>
</tr>
</table>
Simple table without caption:
<table>
<tr>
<th align="right" style="width: 15%;">Right<th>
<th align="left" style="width: 8%;">Left<th>
<th align="center" style="width: 16%;">Center<th>
<th align="left" style="width: 12%;">Default<th>
</tr><tr>
<td align="right">12<td>
<td align="left">12<td>
<td align="center">12<td>
<td align="left">12<td>
</tr>
<tr>
<td align="right">123<td>
<td align="left">123<td>
<td align="center">123<td>
<td align="left">123<td>
</tr>
<tr>
<td align="right">1<td>
<td align="left">1<td>
<td align="center">1<td>
<td align="left">1<td>
</tr>
</table>
Simple table indented two spaces:
<table>
<caption>Demonstration of simple table syntax.</caption><tr>
<th align="right" style="width: 15%;">Right<th>
<th align="left" style="width: 8%;">Left<th>
<th align="center" style="width: 16%;">Center<th>
<th align="left" style="width: 12%;">Default<th>
</tr><tr>
<td align="right">12<td>
<td align="left">12<td>
<td align="center">12<td>
<td align="left">12<td>
</tr>
<tr>
<td align="right">123<td>
<td align="left">123<td>
<td align="center">123<td>
<td align="left">123<td>
</tr>
<tr>
<td align="right">1<td>
<td align="left">1<td>
<td align="center">1<td>
<td align="left">1<td>
</tr>
</table>
Multiline table with caption:
<table>
<caption>Here's the caption. It may span multiple lines.</caption><tr>
<th align="center" style="width: 15%;">Centered Header<th>
<th align="left" style="width: 13%;">Left Aligned<th>
<th align="right" style="width: 16%;">Right Aligned<th>
<th align="left" style="width: 33%;">Default aligned<th>
</tr><tr>
<td align="center">First<td>
<td align="left">row<td>
<td align="right">12.0<td>
<td align="left">Example of a row that spans multiple lines.<td>
</tr>
<tr>
<td align="center">Second<td>
<td align="left">row<td>
<td align="right">5.0<td>
<td align="left">Here's another one. Note the blank line between rows.<td>
</tr>
</table>
Multiline table without caption:
<table>
<tr>
<th align="center" style="width: 15%;">Centered Header<th>
<th align="left" style="width: 13%;">Left Aligned<th>
<th align="right" style="width: 16%;">Right Aligned<th>
<th align="left" style="width: 33%;">Default aligned<th>
</tr><tr>
<td align="center">First<td>
<td align="left">row<td>
<td align="right">12.0<td>
<td align="left">Example of a row that spans multiple lines.<td>
</tr>
<tr>
<td align="center">Second<td>
<td align="left">row<td>
<td align="right">5.0<td>
<td align="left">Here's another one. Note the blank line between rows.<td>
</tr>
</table>

605
tests/writer.mediawiki Normal file
View file

@ -0,0 +1,605 @@
This is a set of tests for pandoc. Most of them are adapted from John Gruber&rsquo;s markdown test suite.
-----
== Headers ==
=== Level 2 with an [http://{{SERVERNAME}}/url embedded link] ===
==== Level 3 with ''emphasis'' ====
===== Level 4 =====
====== Level 5 ======
== Level 1 ==
=== Level 2 with ''emphasis'' ===
==== Level 3 ====
with no blank line
=== Level 2 ===
with no blank line
-----
== Paragraphs ==
Here&rsquo;s a regular paragraph.
In Markdown 1.0.0 and earlier. Version 8. This line turns into a list item. Because a hard-wrapped line in the middle of a paragraph looked like a list item.
Here&rsquo;s one with a bullet. * criminey.
There should be a hard line break<br />
here.
-----
== Block Quotes ==
E-mail style:
<blockquote>This is a block quote. It is pretty short.
</blockquote>
<blockquote>Code in a block quote:
<pre>sub status {
print &quot;working&quot;;
}</pre>
A list:
# item one
# item two
Nested block quotes:
<blockquote>nested
</blockquote>
<blockquote>nested
</blockquote></blockquote>
This should not be a block quote: 2 &gt; 1.
And a following paragraph.
-----
== Code Blocks ==
Code:
<pre>---- (should be four hyphens)
sub status {
print &quot;working&quot;;
}
this code block is indented by one tab</pre>
And:
<pre> this code block is indented by two tabs
These should not be escaped: \$ \\ \&gt; \[ \{</pre>
-----
== Lists ==
=== Unordered ===
Asterisks tight:
* asterisk 1
* asterisk 2
* asterisk 3
Asterisks loose:
* asterisk 1
* asterisk 2
* asterisk 3
Pluses tight:
* Plus 1
* Plus 2
* Plus 3
Pluses loose:
* Plus 1
* Plus 2
* Plus 3
Minuses tight:
* Minus 1
* Minus 2
* Minus 3
Minuses loose:
* Minus 1
* Minus 2
* Minus 3
=== Ordered ===
Tight:
# First
# Second
# Third
and:
# One
# Two
# Three
Loose using tabs:
# First
# Second
# Third
and using spaces:
# One
# Two
# Three
Multiple paragraphs:
<ol style="list-style-type: decimal;">
<li><p>Item 1, graf one.</p>
<p>Item 1. graf two. The quick brown fox jumped over the lazy dog&rsquo;s back.</p></li>
<li><p>Item 2.</p></li>
<li><p>Item 3.</p></li></ol>
=== Nested ===
* Tab
** Tab
*** Tab
Here&rsquo;s another:
# First
# Second:
#* Fee
#* Fie
#* Foe
# Third
Same thing but with paragraphs:
# First
# Second:
#* Fee
#* Fie
#* Foe
# Third
=== Tabs and spaces ===
* this is a list item indented with tabs
* this is a list item indented with spaces
** this is an example list item indented with tabs
** this is an example list item indented with spaces
=== Fancy list markers ===
<ol start="2" style="list-style-type: decimal;">
<li>begins with 2</li>
<li><p>and now 3</p>
<p>with a continuation</p>
<ol start="4" style="list-style-type: lower-roman;">
<li>sublist with roman numerals, starting with 4</li>
<li>more items
<ol style="list-style-type: upper-alpha;">
<li>a subsublist</li>
<li>a subsublist</li></ol>
</li></ol>
</li></ol>
Nesting:
<ol style="list-style-type: upper-alpha;">
<li>Upper Alpha
<ol style="list-style-type: upper-roman;">
<li>Upper Roman.
<ol start="6" style="list-style-type: decimal;">
<li>Decimal start with 6
<ol start="3" style="list-style-type: lower-alpha;">
<li>Lower alpha with paren</li></ol>
</li></ol>
</li></ol>
</li></ol>
Autonumbering:
# Autonumber.
# More.
## Nested.
Should not be a list item:
M.A.&#160;2007
B. Williams
-----
== Definition Lists ==
Tight using spaces:
; apple
: red fruit
; orange
: orange fruit
; banana
: yellow fruit
Tight using tabs:
; apple
: red fruit
; orange
: orange fruit
; banana
: yellow fruit
Loose:
; apple
: red fruit
; orange
: orange fruit
; banana
: yellow fruit
Multiple blocks with italics:
<dl>
<dt>''apple''</dt>
<dd><p>red fruit</p>
<p>contains seeds, crisp, pleasant to taste</p></dd>
<dt>''orange''</dt>
<dd><p>orange fruit</p>
<pre>{ orange code block }</pre>
<blockquote><p>orange block quote</p></blockquote></dd></dl>
== HTML Blocks ==
Simple block on one line:
<div>
foo
</div>
And nested without indentation:
<div>
<div>
<div>
foo
</div>
</div>
<div>
bar
</div>
</div>
Interpreted markdown in a table:
<table>
<tr>
<td>
This is ''emphasized''
</td>
<td>
And this is '''strong'''
</td>
</tr>
</table>
<script type="text/javascript">document.write('This *should not* be interpreted as markdown');</script>
Here&rsquo;s a simple block:
<div>
foo
</div>
This should be a code block, though:
<pre>&lt;div&gt;
foo
&lt;/div&gt;</pre>
As should this:
<pre>&lt;div&gt;foo&lt;/div&gt;</pre>
Now, nested:
<div>
<div>
<div>
foo
</div>
</div>
</div>
This should just be an HTML comment:
<!-- Comment -->
Multiline:
<!--
Blah
Blah
-->
<!--
This is another comment.
-->
Code block:
<pre>&lt;!-- Comment --&gt;</pre>
Just plain comment, with trailing spaces on the line:
<!-- foo -->
Code:
<pre>&lt;hr /&gt;</pre>
Hr&rsquo;s:
<hr>
<hr />
<hr />
<hr>
<hr />
<hr />
<hr class="foo" id="bar" />
<hr class="foo" id="bar" />
<hr class="foo" id="bar">
-----
== Inline Markup ==
This is ''emphasized'', and so ''is this''.
This is '''strong''', and so '''is this'''.
An ''[http://{{SERVERNAME}}/url emphasized link]''.
'''''This is strong and em.'''''
So is '''''this''''' word.
'''''This is strong and em.'''''
So is '''''this''''' word.
This is code: <tt>&gt;</tt>, <tt>$</tt>, <tt>\</tt>, <tt>\$</tt>, <tt>&lt;html&gt;</tt>.
<s>This is ''strikeout''.</s>
Superscripts: a<sup>bc</sup>d a<sup>''hello''</sup> a<sup>hello&#160;there</sup>.
Subscripts: H<sub>2</sub>O, H<sub>23</sub>O, H<sub>many&#160;of&#160;them</sub>O.
These should not be superscripts or subscripts, because of the unescaped spaces: a^b c^d, a~b c~d.
-----
== Smart quotes, ellipses, dashes ==
&ldquo;Hello,&rdquo; said the spider. &ldquo;&lsquo;Shelob&rsquo; is my name.&rdquo;
&lsquo;A&rsquo;, &lsquo;B&rsquo;, and &lsquo;C&rsquo; are letters.
&lsquo;Oak,&rsquo; &lsquo;elm,&rsquo; and &lsquo;beech&rsquo; are names of trees. So is &lsquo;pine.&rsquo;
&lsquo;He said, &ldquo;I want to go.&rdquo;&rsquo; Were you alive in the 70&rsquo;s?
Here is some quoted &lsquo;<tt>code</tt>&rsquo; and a &ldquo;[http://example.com/?foo=1&bar=2 quoted link]&rdquo;.
Some dashes: one&mdash;two &mdash; three&mdash;four &mdash; five.
Dashes between numbers: 5&ndash;7, 255&ndash;66, 1987&ndash;1999.
Ellipses&hellip;and&hellip;and&hellip;.
-----
== LaTeX ==
*
*
* <math>2+2=4</math>
* <math>x \in y</math>
* <math>\alpha \wedge \omega</math>
* <math>223</math>
* <math>p</math>-Tree
* <math>\frac{d}{dx}f(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}</math>
* Here&rsquo;s one that has a line break in it: <math>\alpha + \omega \times x^2</math>.
These shouldn&rsquo;t be math:
* To get the famous equation, write <tt>$e = mc^2$</tt>.
* $22,000 is a ''lot'' of money. So is $34,000. (It worked if &ldquo;lot&rdquo; is emphasized.)
* Shoes ($20) and socks ($5).
* Escaped <tt>$</tt>: $73 ''this should be emphasized'' 23$.
Here&rsquo;s a LaTeX table:
-----
== Special Characters ==
Here is some unicode:
* I hat: Î
* o umlaut: ö
* section: §
* set membership: ∈
* copyright: ©
AT&amp;T has an ampersand in their name.
AT&amp;T is another way to write it.
This &amp; that.
4 &lt; 5.
6 &gt; 5.
Backslash: \
Backtick: `
Asterisk: *
Underscore: _
Left brace: {
Right brace: }
Left bracket: [
Right bracket: ]
Left paren: (
Right paren: )
Greater-than: &gt;
Hash: #
Period: .
Bang: !
Plus: +
Minus: -
-----
== Links ==
=== Explicit ===
Just a [http://{{SERVERNAME}}/url/ URL].
[http://{{SERVERNAME}}/url/ URL and title].
[http://{{SERVERNAME}}/url/ URL and title].
[http://{{SERVERNAME}}/url/ URL and title].
[http://{{SERVERNAME}}/url/ URL and title]
[http://{{SERVERNAME}}/url/ URL and title]
[http://{{SERVERNAME}}/url/with_underscore with_underscore]
[mailto:nobody@nowhere.net Email link]
[http://{{SERVERNAME}}/ Empty].
=== Reference ===
Foo [http://{{SERVERNAME}}/url/ bar].
Foo [http://{{SERVERNAME}}/url/ bar].
Foo [http://{{SERVERNAME}}/url/ bar].
With [http://{{SERVERNAME}}/url/ embedded [brackets]].
[http://{{SERVERNAME}}/url/ b] by itself should be a link.
Indented [http://{{SERVERNAME}}/url once].
Indented [http://{{SERVERNAME}}/url twice].
Indented [http://{{SERVERNAME}}/url thrice].
This should [not][] be a link.
<pre>[not]: /url</pre>
Foo [http://{{SERVERNAME}}/url/ bar].
Foo [http://{{SERVERNAME}}/url/ biz].
=== With ampersands ===
Here&rsquo;s a [http://example.com/?foo=1&bar=2 link with an ampersand in the URL].
Here&rsquo;s a link with an amersand in the link text: [http://att.com/ AT&amp;T].
Here&rsquo;s an [http://{{SERVERNAME}}/script?foo=1&bar=2 inline link].
Here&rsquo;s an [http://{{SERVERNAME}}/script?foo=1&bar=2 inline link in pointy braces].
=== Autolinks ===
With an ampersand: http://example.com/?foo=1&bar=2
* In a list?
* http://example.com/
* It should.
An e-mail address: [mailto:nobody@nowhere.net <tt>nobody@nowhere.net</tt>]
<blockquote>Blockquoted: http://example.com/
</blockquote>
Auto-links should not occur here: <tt>&lt;http://example.com/&gt;</tt>
<pre>or here: &lt;http://example.com/&gt;</pre>
-----
== Images ==
From &ldquo;Voyage dans la Lune&rdquo; by Georges Melies (1902):
[[Image:lalune.jpg|Voyage dans la Lune]]
Here is a movie [[Image:movie.jpg|movie]] icon.
-----
== Footnotes ==
Here is a footnote reference,<ref>Here is the footnote. It can go anywhere after the footnote reference. It need not be placed at the end of the document.
</ref> and another.<ref>Here&rsquo;s the long note. This one contains multiple blocks.
Subsequent blocks are indented to show that they belong to the footnote (as with list items).
<pre> { &lt;code&gt; }</pre>
If you want, you can indent every line, but you can also be lazy and just indent the first line of each block.
</ref> This should ''not'' be a footnote reference, because it contains a space.[^my note] Here is an inline note.<ref>This is ''easier'' to type. Inline notes may contain [http://google.com links] and <tt>]</tt> verbatim characters, as well as [bracketed text].
</ref>
<blockquote>Notes can go in quotes.<ref>In quote.
</ref>
</blockquote>
# And in list items.<ref>In list.</ref>
This paragraph should not be part of the note, as it is not indented.
== Notes ==
<references />

View file

@ -102,3 +102,7 @@ click on the name of the output file:
@ markdown2odt @@README@@ -o @@example21.odt@@
22. MediaWiki markup:
@ pandoc -s -S -w mediawiki --toc @@README@@ -o @@example22.wiki@@

View file

@ -6,8 +6,8 @@ Pandoc is a [Haskell] library for converting from one markup format
to another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [reStructuredText], [HTML], and [LaTeX],
and it can write [markdown], [reStructuredText], [HTML], [LaTeX], [ConTeXt],
[RTF], [DocBook XML], [OpenDocument XML], [GNU Texinfo], [groff man]
pages, and [S5] HTML slide shows.
[RTF], [DocBook XML], [OpenDocument XML], [GNU Texinfo], [MediaWiki markup],
[groff man] pages, and [S5] HTML slide shows.
Pandoc features
@ -170,6 +170,7 @@ kind.
[RTF]: http://en.wikipedia.org/wiki/Rich_Text_Format
[DocBook XML]: http://www.docbook.org/
[OpenDocument XML]: http://opendocument.xml.org/
[MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting
[groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html
[GNU Texinfo]: http://www.gnu.org/software/texinfo/
[Haskell]: http://www.haskell.org/