Merged 'strict' branch from r324. This adds a '--strict'

option to pandoc, which forces it to stay as close as possible to official Markdown syntax. git-svn-id: https://pandoc.googlecode.com/svn/trunk@347 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-30 22:51:49 +00:00 · 2006-12-30 22:51:49 +00:00 · 4ea1b2bdc0
commit 4ea1b2bdc0
parent 7cd9db048b
23 changed files with 248 additions and 141 deletions
--- a/9
+++ b/9
@ -296,8 +296,13 @@ $(win_pkg_name): $(THIS).exe  $(win_docs)
 .PHONY: test test-markdown
 test: $(MAIN)
 	@cd $(TESTDIR) && perl runtests.pl -s $(PWD)/$(MAIN)
-test-markdown: $(MAIN)
-	@cd $(TESTDIR)/MarkdownTest_1.0.3 && perl MarkdownTest.pl -s $(PWD)/$(MAIN) -tidy
+strict:=$(MAIN)-strict
+cleanup_files+=$(strict)
+$(strict): $(MAIN)
+	echo "#!/bin/sh -e\n$(PWD)/$(MAIN) --strict \"\$$@\"" > $@; \
+	chmod +x $(strict)
+test-markdown: $(strict)
+	@cd $(TESTDIR)/MarkdownTest_1.0.3 && perl MarkdownTest.pl -s $(PWD)/$(strict) -tidy

 # Stolen and slightly improved from a GPLed Makefile.  Credits to John Meacham.
 src_all:=$(shell find $(SRCDIR) -type f -name '*hs' | egrep -v '^\./(_darcs|lib|test)/')
--- a/35
+++ b/35
@ -213,6 +213,9 @@ preserved, rather than converted to spaces (the default).

 `--tabstop` allows the user to set the tab stop (which defaults to 4).

+`--strict` specifies that strict markdown syntax is to be used, without
+pandoc's usual extensions and variants (described below).
+
 `-R` or `--parse-raw` causes the HTML and LaTeX readers to parse HTML
 codes and LaTeX environments that it can't translate as raw HTML or
 LaTeX.  Raw HTML can be printed in markdown, reStructuredText, HTML,
@ -293,7 +296,8 @@ Pandoc's markdown vs. standard markdown

 In parsing markdown, Pandoc departs from and extends [standard markdown]
 in a few respects.  (To run Pandoc on the official
-markdown test suite, type `make test-markdown`.)
+markdown test suite, type `make test-markdown`.)  These differences can
+be suppressed by specifying the `--strict` command-line option.

 [standard markdown]:  http://daringfireball.net/projects/markdown/syntax
  "Markdown syntax description"
@ -319,7 +323,10 @@ the blank space around "Third".  Pandoc follows a simple rule:
 if the text is followed by a blank line, it is treated as a
 paragraph.  Since "Second" is followed by a list, and not a blank
 line, it isn't treated as a paragraph.  The fact that the list
-is followed by a blank line is irrelevant. 
+is followed by a blank line is irrelevant.  (Note:  Pandoc works
+this way even when the `--strict` option is specified.  This
+behavior is consistent with the official markdown syntax
+description, even though it is different from that of `Markdown.pl`.)

 Unlike standard markdown, Pandoc allows ordered list items to be
 marked with single letters, instead of numbers.  So, for example,
@ -350,33 +357,17 @@ the example above:
        B)  Fie
    C)  Third

-Literal quotes in titles
------------------------
-
-Standard markdown allows unescaped literal quotes in titles, as
-in 
-
-	[foo]: "bar "embedded" baz"
-
-Pandoc requires all quotes within titles to be escaped:
-
-	[foo]: "bar \"embedded\" baz"
-
 Reference links
 ---------------

-Pandoc allows implicit reference links in either of two styles:
+Pandoc allows implicit reference links with just a single set of
+brackets.  So, the following links are equivalent:

 	1. Here's my [link]
 	2. Here's my [link][]

 	[link]: linky.com

-If there's no corresponding reference, the implicit reference link
-will appear as regular bracketed text.  Note: even `[link][]` will
-appear as `[link]` if there's no reference for `link`.  If you want
-`[link][]`, use a backslash escape: `\[link]\[]`.
-
 Footnotes
 ---------

@ -439,7 +430,7 @@ into
 		</tr>
 	</table>

-whereas Markdown 1.0 will preserve it as is.
+whereas `Markdown.pl` will preserve it as is.

 There is one exception to this rule:  text between `<script>` and
 `</script>` tags is not interpreted as markdown.
@ -527,7 +518,7 @@ Note, however, that material between the begin and end tags will
 be interpreted as raw LaTeX, not as markdown.

 Custom headers
--------------
+==============

 When run with the "standalone" option (`-s`), `pandoc` creates a
 standalone file, complete with an appropriate header.  To see the
--- a/man/man1/html2markdown.1
+++ b/man/man1/html2markdown.1
@ -32,6 +32,10 @@ Preserve tabs instead of converting them to spaces.
 .B \-\-tab-stop=\fITABSTOP\fB
 Specify tab stop (default is 4).
 .TP
+.B \-\-strict
+Use strict markdown syntax, with no extensions or variants.
+.TP
+.TP
 .B \-R, \-\-parse-raw
 Parse untranslatable HTML codes as raw HTML.
 .TP
--- a/man/man1/markdown2pdf.1
+++ b/man/man1/markdown2pdf.1
@ -36,6 +36,10 @@ Preserve tabs instead of converting them to spaces.
 .B \-\-tab-stop=\fITABSTOP\fB
 Specify tab stop (default is 4).
 .TP
+.B \-\-strict
+Use strict markdown syntax, with no extensions or variants.
+.TP
+.TP
 .B \-N, \-\-number-sections
 Number section headings in LaTeX output.  (Default is not to number them.)
 .TP
--- a/man/man1/pandoc.1
+++ b/man/man1/pandoc.1
@ -94,6 +94,9 @@ Preserve tabs instead of converting them to spaces.
 .B \-\-tab-stop=\fITABSTOP\fB
 Specify tab stop (default is 4).
 .TP
+.B \-\-strict
+Use strict markdown syntax, with no extensions or variants.
+.TP
 .B \-R, \-\-parse-raw
 Parse untranslatable HTML codes and LaTeX environments as raw HTML
 or LaTeX, instead of ignoring them.
--- a/src/Main.hs
+++ b/src/Main.hs
@ -91,13 +91,12 @@ writeDoc options = prettyPandoc

 -- | Data structure for command line options.
 data Opt = Opt
-    { optPreserveTabs      :: Bool    -- ^ If @False@, convert tabs to spaces
+    { optPreserveTabs      :: Bool    -- ^ Convert tabs to spaces
    , optTabStop           :: Int     -- ^ Number of spaces per tab
-    , optStandalone        :: Bool    -- ^ If @True@, include header, footer
+    , optStandalone        :: Bool    -- ^ Include header, footer
    , optReader            :: String  -- ^ Reader format
    , optWriter            :: String  -- ^ Writer format
-    , optParseRaw          :: Bool    -- ^ If @True@, parse unconvertable 
-                                      -- HTML and TeX
+    , optParseRaw          :: Bool    -- ^ Parse unconvertable HTML and TeX
    , optCSS               :: String  -- ^ CSS file to link to
    , optIncludeInHeader   :: String  -- ^ File to include in header
    , optIncludeBeforeBody :: String  -- ^ File to include at top of body
@ -105,11 +104,12 @@ data Opt = Opt
    , optCustomHeader      :: String  -- ^ Custom header to use, or "DEFAULT"
    , optTitlePrefix       :: String  -- ^ Optional prefix for HTML title
    , optOutputFile        :: String  -- ^ Name of output file
-    , optNumberSections    :: Bool    -- ^ If @True@, number sections in LaTeX
-    , optIncremental       :: Bool    -- ^ If @True@, incremental lists in S5
-    , optSmart             :: Bool    -- ^ If @True@, use smart typography
-    , optASCIIMathML       :: Bool    -- ^ If @True@, use ASCIIMathML in HTML
-    , optDebug             :: Bool    -- ^ If @True@, output debug messages 
+    , optNumberSections    :: Bool    -- ^ Number sections in LaTeX
+    , optIncremental       :: Bool    -- ^ Use incremental lists in S5
+    , optSmart             :: Bool    -- ^ Use smart typography
+    , optASCIIMathML       :: Bool    -- ^ Use ASCIIMathML in HTML
+    , optDebug             :: Bool    -- ^ Output debug messages 
+    , optStrict            :: Bool    -- ^ Use strict markdown syntax
    }

 -- | Defaults for command-line options.
@ -133,6 +133,7 @@ defaultOpts = Opt
    , optSmart             = False
    , optASCIIMathML       = False
    , optDebug             = False
+    , optStrict            = False
    }

 -- | A list of functions, each transforming the options data structure
@ -175,6 +176,11 @@ options =
                  "TABSTOP")
                 "Tab stop (default 4)"

+    , Option "" ["strict"]
+                 (NoArg
+                  (\opt -> return opt { optStrict = True } ))
+                 "Use strict markdown syntax with no extensions"
+
    , Option "R" ["parse-raw"]
                 (NoArg
                  (\opt -> return opt { optParseRaw = True }))
@ -364,6 +370,7 @@ main = do
              , optSmart             = smart
              , optASCIIMathML       = asciiMathML
 			  , optDebug             = debug
+              , optStrict            = strict
             } = opts

  -- assign reader and writer based on options and filenames
@ -399,7 +406,9 @@ main = do
  let filter = tabFilter . addBlank . removeCRs
  let startParserState = defaultParserState { stateParseRaw     = parseRaw,
                                              stateTabStop      = tabStop, 
-                                              stateStandalone   = standalone }
+                                              stateStandalone   = standalone &&
+                                                                  (not strict),
+                                              stateStrict       = strict }
  let csslink = if (css == "")
                   then "" 
                   else "<link rel=\"stylesheet\" href=\"" ++ css ++ 
@ -409,16 +418,19 @@ main = do
                   then defaultHeader
                   else customHeader) ++ 
               csslink ++ asciiMathMLLink ++ includeHeader
-  let writerOptions = WriterOptions { writerStandalone     = standalone, 
+  let writerOptions = WriterOptions { writerStandalone     = standalone &&
+                                                             (not strict), 
                                      writerHeader         = header, 
                                      writerTitlePrefix    = titlePrefix,
-                                      writerSmart          = smart, 
+                                      writerSmart          = smart && 
+                                                             (not strict), 
                                      writerTabStop        = tabStop, 
                                      writerS5             = (writerName=="s5"),
                                      writerIncremental    = incremental, 
                                      writerNumberSections = numberSections,
                                      writerIncludeBefore  = includeBefore, 
-                                      writerIncludeAfter   = includeAfter }
+                                      writerIncludeAfter   = includeAfter,
+                                      writerStrictMarkdown = strict }

  (readSources sources) >>= (hPutStr output . encodeUTF8 . 
                             (writer writerOptions) . 
--- a/src/Text/Pandoc/Readers/HTML.hs
+++ b/src/Text/Pandoc/Readers/HTML.hs
@ -32,7 +32,12 @@ module Text.Pandoc.Readers.HTML (
                                 rawHtmlInline, 
                                 rawHtmlBlock, 
                                 anyHtmlBlockTag, 
-                                 anyHtmlInlineTag  
+                                 anyHtmlInlineTag,  
+                                 anyHtmlTag,
+                                 anyHtmlEndTag,
+                                 htmlEndTag,
+                                 extractTagType,
+                                 htmlBlockElement 
                                ) where

 import Text.Regex ( matchRegex, mkRegex )
@ -78,17 +83,18 @@ inlinesTilEnd tag = try (do
  inlines <- manyTill inline (htmlEndTag tag)
  return inlines)

-- extract type from a tag:  e.g. br from <br>, < br >, </br>, etc.
+-- | Extract type from a tag:  e.g. 'br' from '<br>'
 extractTagType tag = 
    case (matchRegex (mkRegex  "<[[:space:]]*/?([A-Za-z0-9]+)") tag) of
          Just [match]   -> (map toLower match)
          Nothing        -> ""

+-- | Parse any HTML tag (closing or opening) and return text of tag
 anyHtmlTag = try (do
  char '<'
  spaces
  tag <- many1 alphaNum
-  attribs <- htmlAttributes  
+  attribs <- htmlAttributes
  spaces
  ender <- option "" (string "/")
  let ender' = if (null ender) then "" else " /"
@ -150,9 +156,10 @@ htmlRegularAttribute = try (do
                                  (do
                                     a <- many (alphaNum <|> (oneOf "-._:"))
                                     return (a,"")) ]
-  return (name, content, 
+  return (name, content,
          (" " ++ name ++ "=" ++ quoteStr ++ content ++ quoteStr)))

+-- | Parse an end tag of type 'tag'
 htmlEndTag tag = try (do
  char '<'   
  spaces
@ -174,20 +181,23 @@ anyHtmlInlineTag = try (do
  tag <- choice [ anyHtmlTag, anyHtmlEndTag ]
  if isInline tag then return tag else fail "not an inline tag")

-- scripts must be treated differently, because they can contain <> etc.
+-- | Parses material between script tags.
+-- Scripts must be treated differently, because they can contain '<>' etc.
 htmlScript = try (do
  open <- string "<script"
  rest <- manyTill anyChar (htmlEndTag "script")
  return (open ++ rest ++ "</script>"))

+htmlBlockElement = choice [ htmlScript, htmlComment, xmlDec, definition ]
+
 rawHtmlBlock = try (do
  notFollowedBy' (choice [htmlTag "/body", htmlTag "/html"])
-  body <- choice [htmlScript, anyHtmlBlockTag, htmlComment, xmlDec, 
-                  definition]
+  body <- htmlBlockElement <|> anyHtmlBlockTag
  sp <- (many space)
  state <- getState
  if stateParseRaw state then return (RawHtml (body ++ sp)) else return Null)

+-- | Parses an HTML comment.
 htmlComment = try (do
  string "<!--"
  comment <- manyTill anyChar (try (string "-->"))
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@ -36,8 +36,11 @@ import Text.ParserCombinators.Pandoc
 import Text.Pandoc.Definition
 import Text.Pandoc.Readers.LaTeX ( rawLaTeXInline, rawLaTeXEnvironment )
 import Text.Pandoc.Shared 
-import Text.Pandoc.Readers.HTML ( rawHtmlInline, rawHtmlBlock, 
-                                  anyHtmlBlockTag, anyHtmlInlineTag )
+import Text.Pandoc.Readers.HTML ( rawHtmlBlock, 
+                                  anyHtmlBlockTag, anyHtmlInlineTag,
+                                  anyHtmlTag, anyHtmlEndTag,
+                                  htmlEndTag, extractTagType,
+                                  htmlBlockElement )
 import Text.Pandoc.HtmlEntities ( decodeEntities )
 import Text.Regex ( matchRegex, mkRegex )
 import Text.ParserCombinators.Parsec
@ -107,6 +110,16 @@ skipNonindentSpaces = do
  let tabStop = stateTabStop state
  choice (map (\n -> (try (count n (char ' ')))) (reverse [0..(tabStop - 1)]))

+-- | Fail if reader is in strict markdown syntax mode.
+failIfStrict = do
+    state <- getState
+    if stateStrict state then fail "Strict markdown mode" else return ()
+
+-- | Fail unless we're at beginning of a line.
+failUnlessBeginningOfLine = do
+  pos <- getPosition
+  if sourceColumn pos == 1 then return () else fail "not beginning of line"
+
 --
 -- document structure
 --
@ -132,6 +145,7 @@ dateLine = try (do
  return (removeTrailingSpace date))

 titleBlock = try (do
+  failIfStrict
  title <- option [] titleLine
  author <- option [] authorsLine
  date <- option "" dateLine
@ -147,7 +161,14 @@ parseMarkdown = do
  updateState (\state -> state { stateParseRaw = True }) 
  -- need to parse raw HTML, since markdown allows it
  (title, author, date) <- option ([],[],"") titleBlock
-  blocks <- parseBlocks
+  oldState <- getState
+  oldInput <- getInput
+  parseBlocks -- go through once just to get list of reference keys
+  newState <- getState
+  let keysUsed = stateKeysUsed newState
+  setInput oldInput
+  setState (oldState { stateKeysUsed = keysUsed })
+  blocks <- parseBlocks  -- go through again, for real
  let blocks' = filter (/= Null) blocks
  state <- getState
  let keys = reverse $ stateKeyBlocks state
@ -165,7 +186,7 @@ parseBlocks = do
  return result

 block = choice [ codeBlock, note, referenceKey, header, hrule, list, 
-                 blockQuote, rawHtmlBlocks, rawLaTeXEnvironment, para,
+                 blockQuote, htmlBlock, rawLaTeXEnvironment', para,
                 plain, blankBlock, nullBlock ] <?> "block"

 --
@ -190,8 +211,7 @@ setextHeader = choice $
               map (\x -> setextH x) (enumFromTo 1 (length setextHChars))

 setextH n = try (do
-  txt <- many1 (do {notFollowedBy newline; inline})
-  endline
+  txt <- many1Till inline newline
  many1 (char (setextHChars !! (n-1)))
  skipSpaces
  newline
@ -256,6 +276,7 @@ rawLines = do
    return (concat lines)

 note = try (do
+  failIfStrict
  ref <- noteMarker
  char ':'
  skipSpaces
@ -280,6 +301,7 @@ note = try (do
 --

 emacsBoxQuote = try (do
+  failIfStrict
  string ",----"
  manyTill anyChar newline
  raw <- manyTill (try (do 
@ -336,8 +358,9 @@ bulletListStart = try (do
 orderedListStart = try (do
  option ' ' newline -- if preceded by a Plain block in a list context
  skipNonindentSpaces
-  many1 digit <|> count 1 letter
-  oneOf orderedListDelimiters
+  many1 digit <|> (do{failIfStrict; count 1 letter})
+  delim <- oneOf orderedListDelimiters
+  if delim /= '.' then failIfStrict else return ()
  oneOf spaceChars
  skipSpaces)

@ -410,10 +433,12 @@ bulletList = try (do
 para = try (do 
  result <- many1 inline
  newline
-  choice [ (do 
-              followedBy' (oneOfStrings [">", ",----"])
-              return "" ), 
-           blanklines ]  
+  st <- getState
+  if stateStrict st
+     then choice [followedBy' blockQuote, followedBy' header, 
+                  (do{blanklines; return ()})]
+     else choice [followedBy' emacsBoxQuote, 
+                  (do{blanklines; return ()})]
  let result' = normalizeSpaces result
  return (Para result'))

@ -426,6 +451,36 @@ plain = do
 -- raw html
 --

+htmlElement = choice [strictHtmlBlock,
+                      htmlBlockElement] <?> "html element"
+
+htmlBlock = do
+  st <- getState
+  if stateStrict st
+    then do
+           failUnlessBeginningOfLine
+           first <- htmlElement
+           finalSpace <- many (oneOf spaceChars)
+           finalNewlines <- many newline
+           return (RawHtml (first ++ finalSpace ++ finalNewlines))
+    else rawHtmlBlocks
+
+-- True if tag is self-closing
+selfClosing tag = case (matchRegex (mkRegex "\\/[[:space:]]*>$") tag) of
+                      Just _  -> True
+                      Nothing -> False
+
+strictHtmlBlock = try (do
+  tag <- anyHtmlBlockTag 
+  let tag' = extractTagType tag
+  if selfClosing tag || tag' == "hr" 
+     then return tag
+     else do
+            contents <- many (do{notFollowedBy' (htmlEndTag tag'); 
+                                 htmlElement <|> (count 1 anyChar)})
+            end <- htmlEndTag tag'
+            return $ tag ++ (concat contents) ++ end)
+
 rawHtmlBlocks = try (do
  htmlBlocks <- many1 rawHtmlBlock    
  let combined = concatMap (\(RawHtml str) -> str) htmlBlocks
@ -448,7 +503,18 @@ referenceKey = try (do
  option ' ' (char autoLinkEnd)
  tit <- option "" title 
  blanklines 
-  return (Key label (Src (removeTrailingSpace src) tit))) 
+  state <- getState
+  let keysUsed = stateKeysUsed state
+  updateState (\st -> st { stateKeysUsed = (label:keysUsed) })
+  return $ Key label (Src (removeTrailingSpace src) tit))
+
+--
+-- LaTeX
+--
+
+rawLaTeXEnvironment' = do
+  failIfStrict
+  rawLaTeXEnvironment

 -- 
 -- inline
@ -457,10 +523,10 @@ referenceKey = try (do
 text = choice [ math, strong, emph, code2, code1, str, linebreak, tabchar, 
                whitespace, endline ] <?> "text"

-inline = choice [ rawLaTeXInline, escapedChar, special, hyphens, text, 
+inline = choice [ rawLaTeXInline', escapedChar, special, hyphens, text, 
                  ltSign, symbol ] <?> "inline"

-special = choice [ noteRef, inlineNote, link, referenceLink, rawHtmlInline, 
+special = choice [ noteRef, inlineNote, link, referenceLink, rawHtmlInline', 
                   autoLink, image ] <?> "link, inline html, note, or image"

 escapedChar = escaped anyChar
@ -507,6 +573,7 @@ mathWord = many1 (choice [ (noneOf (" \t\n\\" ++ [mathEnd])),
                                   return c))])

 math = try (do
+  failIfStrict
  char mathStart
  notFollowedBy space
  words <- sepBy1 mathWord (many1 space)
@ -549,18 +616,17 @@ str = do
 -- an endline character that can be treated as a space, not a structural break
 endline = try (do
  newline
-  -- next line would allow block quotes without preceding blank line
-  -- Markdown.pl does allow this, but there's a chance of a wrapped
-  -- greater-than sign triggering a block quote by accident...
-  -- notFollowedBy' (choice [emailBlockQuoteStart, string ",----"])  
  notFollowedBy blankline
-  -- parse potential list-starts differently if in a list:
  st <- getState
+  if stateStrict st 
+    then do
+           notFollowedBy' emailBlockQuoteStart
+           notFollowedBy' header
+    else return () 
+  -- parse potential list-starts differently if in a list:
  if (stateParserContext st) == ListItemState
-     then do
-            notFollowedBy' orderedListStart
-            notFollowedBy' bulletListStart
-     else option () pzero
+     then notFollowedBy' (orderedListStart <|> bulletListStart)
+     else return ()
  return Space)

 --
@ -571,8 +637,12 @@ endline = try (do
 reference = do
  char labelStart
  notFollowedBy (char noteStart)
-  label <- manyTill inline (char labelEnd)
-  return (normalizeSpaces label)
+  -- allow for embedded brackets:
+  label <- manyTill ((do{res <- reference;
+                         return $ [Str "["] ++ res ++ [Str "]"]}) <|>
+                      count 1 inline)
+                    (char labelEnd)
+  return (normalizeSpaces (concat label))

 -- source for a link, with optional title
 source = try (do 
@ -590,8 +660,10 @@ titleWith startChar endChar = try (do
  skipEndline  -- a title can be on the next line from the source
  skipSpaces
  char startChar
-  tit <- manyTill (choice [ try (do {char '\\'; char endChar}), 
-                            (noneOf (endChar:endLineChars)) ]) (char endChar) 
+  tit <- manyTill anyChar (try (do
+                                  char endChar
+                                  skipSpaces
+                                  followedBy' (char ')' <|> newline)))
  let tit' = gsub "\"" "&quot;" tit
  return tit')

@ -608,19 +680,26 @@ explicitLink = try (do

 referenceLink = choice [referenceLinkDouble, referenceLinkSingle]

-- a link like [this][/url/]
+-- a link like [this][ref]
 referenceLinkDouble = try (do
  label <- reference
  skipSpaces
  skipEndline
  skipSpaces
  ref <- reference 
-  return (Link label (Ref ref))) 
+  let ref' = if null ref then label else ref
+  state <- getState
+  if ref' `elem` (stateKeysUsed state)
+     then return () else fail "no corresponding key"
+  return (Link label (Ref ref'))) 

 -- a link like [this]
 referenceLinkSingle = try (do
  label <- reference
-  return (Link label (Ref []))) 
+  state <- getState
+  if label `elem` (stateKeysUsed state)
+     then return () else fail "no corresponding key"
+  return (Link label (Ref label))) 

 -- a link <like.this.com>
 autoLink = try (do
@ -645,6 +724,7 @@ noteMarker = try (do
  manyTill (noneOf " \t\n") (char labelEnd))

 noteRef = try (do
+  failIfStrict
  ref <- noteMarker
  state <- getState
  let identifiers = (stateNoteIdentifiers state) ++ [ref] 
@ -652,6 +732,7 @@ noteRef = try (do
  return (NoteRef (show (length identifiers))))

 inlineNote = try (do
+  failIfStrict
  char noteStart
  char labelStart
  contents <- manyTill inline (char labelEnd)
@ -664,3 +745,14 @@ inlineNote = try (do
                               (Note ref [Para contents]):noteBlocks})
  return (NoteRef ref))

+rawLaTeXInline' = do
+  failIfStrict
+  rawLaTeXInline
+
+rawHtmlInline' = do
+  st <- getState
+  result <- if stateStrict st
+              then choice [htmlBlockElement, anyHtmlTag, anyHtmlEndTag] 
+              else choice [htmlBlockElement, anyHtmlInlineTag]
+  return (HtmlInline result)
+
--- a/src/Text/Pandoc/Shared.hs
+++ b/src/Text/Pandoc/Shared.hs
@ -114,6 +114,7 @@ data ParserState = ParserState
      stateTitle           :: [Inline],      -- ^ Title of document
      stateAuthors         :: [String],      -- ^ Authors of document
      stateDate            :: String,        -- ^ Date of document
+      stateStrict          :: Bool,          -- ^ Use strict markdown syntax
      stateHeaderTable     :: [HeaderType]   -- ^ List of header types used,
                                             -- in what order (rst only)
    }
@ -132,6 +133,7 @@ defaultParserState =
                  stateTitle           = [],
                  stateAuthors         = [],
                  stateDate            = [],
+                  stateStrict          = False,
                  stateHeaderTable     = [] }

 -- | Consolidate @Str@s and @Space@s in an inline list into one big @Str@.
@ -325,10 +327,11 @@ data WriterOptions = WriterOptions
    , writerHeader          :: String -- ^ Header for the document
    , writerIncludeBefore   :: String -- ^ String to include before the  body
    , writerIncludeAfter    :: String -- ^ String to include after the body
-    , writerSmart           :: Bool   -- ^ If @True@, use smart typography
-    , writerS5              :: Bool   -- ^ @True@ if we're writing S5 
-    , writerIncremental     :: Bool   -- ^ If @True@, inceremental S5 lists
-    , writerNumberSections  :: Bool   -- ^ If @True@, number sections in LaTeX
+    , writerSmart           :: Bool   -- ^ Use smart typography
+    , writerS5              :: Bool   -- ^ We're writing S5 
+    , writerIncremental     :: Bool   -- ^ Incremental S5 lists
+    , writerNumberSections  :: Bool   -- ^ Number sections in LaTeX
+    , writerStrictMarkdown  :: Bool   -- ^ Use strict markdown syntax
    , writerTabStop         :: Int    -- ^ Tabstop for conversion between 
                                      -- spaces and tabs
    } deriving Show
--- a/src/Text/Pandoc/Writers/HTML.hs
+++ b/src/Text/Pandoc/Writers/HTML.hs
@ -91,12 +91,15 @@ obfuscateLink options text src =
                        then name ++ " at " ++ domain'
                        else text' ++ " (" ++ name ++ " at " ++ 
                             domain' ++ ")" in 
-      "<script type=\"text/javascript\">\n<!--\nh='" ++ 
-      obfuscateString domain ++ "';a='" ++ at' ++ "';n='" ++ 
-      obfuscateString name ++ "';e=n+a+h;\n" ++
-      "document.write('<a h'+'ref'+'=\"ma'+'ilto'+':'+e+'\">'+" ++ 
-      linkText  ++ "+'<\\/'+'a'+'>');\n// -->\n</script><noscript>" ++ 
-      obfuscateString altText ++ "</noscript>"
+      if writerStrictMarkdown options
+        then "<a href=\"" ++ obfuscateString src' ++ "\">" ++ 
+             obfuscateString text' ++ "</a>"
+        else "<script type=\"text/javascript\">\n<!--\nh='" ++ 
+             obfuscateString domain ++ "';a='" ++ at' ++ "';n='" ++ 
+             obfuscateString name ++ "';e=n+a+h;\n" ++
+             "document.write('<a h'+'ref'+'=\"ma'+'ilto'+':'+e+'\">'+" ++ 
+             linkText  ++ "+'<\\/'+'a'+'>');\n// -->\n</script><noscript>" ++ 
+             obfuscateString altText ++ "</noscript>"
    _ -> "<a href=\"" ++ src ++ "\">" ++ text' ++ "</a>" -- malformed email

 -- | Obfuscate character as entity.
@ -264,8 +267,6 @@ inlineToHtml options (Link text (Src src tit)) =
     else "<a href=\"" ++ (codeStringToHtml src) ++ "\"" ++ 
          (if tit /= "" then " title=\"" ++ title  ++ "\">" else ">") ++ 
          (inlineListToHtml options text) ++ "</a>"
-inlineToHtml options (Link text (Ref [])) = 
-  "[" ++ (inlineListToHtml options text) ++ "]" 
 inlineToHtml options (Link text (Ref ref)) = 
  "[" ++ (inlineListToHtml options text) ++ "][" ++ 
  (inlineListToHtml options ref) ++ "]"  
@ -276,8 +277,6 @@ inlineToHtml options (Image alt (Src source tit)) =
  "<img src=\"" ++ source ++ "\"" ++ 
  (if tit /= "" then " title=\"" ++ title ++ "\"" else "") ++ 
  (if alternate /= "" then " alt=\"" ++ alternate ++ "\"" else "") ++ ">"
-inlineToHtml options (Image alternate (Ref [])) = 
-  "![" ++ (inlineListToHtml options alternate) ++ "]"
 inlineToHtml options (Image alternate (Ref ref)) = 
  "![" ++ (inlineListToHtml options alternate) ++ "][" ++ 
  (inlineListToHtml options ref) ++ "]"
--- a/src/Text/Pandoc/Writers/LaTeX.hs
+++ b/src/Text/Pandoc/Writers/LaTeX.hs
@ -180,15 +180,11 @@ inlineToLaTeX notes (LineBreak) = "\\\\\n"
 inlineToLaTeX notes Space = " "
 inlineToLaTeX notes (Link text (Src src tit)) = 
    "\\href{" ++ src ++ "}{" ++ (inlineListToLaTeX notes (deVerb text)) ++ "}"
-inlineToLaTeX notes (Link text (Ref [])) = "[" ++ 
-    (inlineListToLaTeX notes text) ++ "]"
 inlineToLaTeX notes (Link text (Ref ref)) = "[" ++ 
    (inlineListToLaTeX notes text) ++ "][" ++ (inlineListToLaTeX notes ref) ++
    "]"  -- this is what markdown does, for better or worse
 inlineToLaTeX notes (Image alternate (Src source tit)) = 
    "\\includegraphics{" ++ source ++ "}" 
-inlineToLaTeX notes (Image alternate (Ref [])) = 
-    "![" ++ (inlineListToLaTeX notes alternate) ++ "]" 
 inlineToLaTeX notes (Image alternate (Ref ref)) = 
    "![" ++ (inlineListToLaTeX notes alternate) ++ "][" ++ 
    (inlineListToLaTeX notes ref) ++ "]"
--- a/src/Text/Pandoc/Writers/Markdown.hs
+++ b/src/Text/Pandoc/Writers/Markdown.hs
@ -168,11 +168,12 @@ inlineToMarkdown (Link txt (Src src tit)) =
  (if tit /= ""
      then text (" \"" ++ (escapeLinkTitle tit) ++ "\"") 
      else empty) <> char ')'
-inlineToMarkdown (Link txt (Ref [])) = 
-  char '[' <> inlineListToMarkdown txt <> text "][]"
 inlineToMarkdown (Link txt (Ref ref)) = 
-  char '[' <> inlineListToMarkdown txt <> char ']' <> char '[' <> 
-  inlineListToMarkdown ref <> char ']'
+  let first = char '[' <> inlineListToMarkdown txt <> char ']'
+      second = if (txt == ref) 
+                 then empty
+                 else char '[' <> inlineListToMarkdown ref <> char ']' in
+      first <> second
 inlineToMarkdown (Image alternate (Src source tit)) = 
  let alt = if (null alternate) || (alternate == [Str ""])
               then text "image" 
@ -181,10 +182,7 @@ inlineToMarkdown (Image alternate (Src source tit)) =
  (if tit /= ""
      then text (" \"" ++ (escapeLinkTitle tit) ++ "\"") 
      else empty) <> char ')'
-inlineToMarkdown (Image alternate (Ref [])) = 
-  char '!' <> char '[' <> inlineListToMarkdown alternate <> char ']'
 inlineToMarkdown (Image alternate (Ref ref)) = 
-  char '!' <> char '[' <> inlineListToMarkdown alternate <> char ']' <> 
-  char '[' <> inlineListToMarkdown ref <> char ']'
+  char '!' <> inlineToMarkdown (Link alternate (Ref ref))
 inlineToMarkdown (NoteRef ref) = 
  text "[^" <> text (escapeString ref) <> char ']'
--- a/src/Text/Pandoc/Writers/RST.hs
+++ b/src/Text/Pandoc/Writers/RST.hs
@ -202,9 +202,6 @@ inlineToRST (Link txt (Src src tit)) =
                     else linktext' in
    let ref = text ".. _" <> text linktext'' <> text ": " <> text src  in
    (link, ref' $$ ref)
-inlineToRST (Link txt (Ref [])) = 
-  let (linktext, refs) = inlineListToRST txt in
-  (char '[' <> linktext <> char ']', refs)   
 inlineToRST (Link txt (Ref ref)) = 
  let (linktext, refs1) = inlineListToRST txt 
      (reftext, refs2) = inlineListToRST ref in
@ -216,9 +213,6 @@ inlineToRST (Image alternate (Src source tit)) =
  let link = char '|' <> alt <> char '|' in
  let ref = text ".. " <> link <> text " image:: " <> text source  in
  (link, ref' $$ ref)
-inlineToRST (Image alternate (Ref [])) = 
-  let (alttext, refs) = inlineListToRST alternate in
-  (char '|' <> alttext <> char '|', refs)
 -- The following case won't normally occur...
 inlineToRST (Image alternate (Ref ref)) = 
  let (alttext, refs1) = inlineListToRST alternate
--- a/src/Text/Pandoc/Writers/RTF.hs
+++ b/src/Text/Pandoc/Writers/RTF.hs
@ -220,15 +220,11 @@ inlineToRTF notes Space = " "
 inlineToRTF notes (Link text (Src src tit)) = 
  "{\\field{\\*\\fldinst{HYPERLINK \"" ++ (codeStringToRTF src) ++ 
  "\"}}{\\fldrslt{\\ul\n" ++ (inlineListToRTF notes text) ++ "\n}}}\n"
-inlineToRTF notes (Link text (Ref [])) = 
-  "[" ++ (inlineListToRTF notes text) ++ "]"
 inlineToRTF notes (Link text (Ref ref)) = 
  "[" ++ (inlineListToRTF notes text) ++ "][" ++ 
  (inlineListToRTF notes ref) ++ "]"  -- this is what markdown does
 inlineToRTF notes (Image alternate (Src source tit)) = 
  "{\\cf1 [image: " ++ source ++ "]\\cf0}" 
-inlineToRTF notes (Image alternate (Ref [])) = 
-  "![" ++ (inlineListToRTF notes alternate) ++ "]" 
 inlineToRTF notes (Image alternate (Ref ref)) = "![" ++ 
  (inlineListToRTF notes alternate) ++ "][" ++ 
  (inlineListToRTF notes ref) ++ "]"
--- a/tests/testsuite.native
+++ b/tests/testsuite.native
@ -277,18 +277,18 @@ Pandoc (Meta [Str "Pandoc",Space,Str "Test",Space,Str "Suite"] ["John MacFarlane
 , Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "a"]),Str "."]
 , Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "a"]),Str "."]
 , Key [Str "a"] (Src "/url/" "")
-, Para [Str "With",Space,Link [Str "embedded",Space,Link [Str "brackets"] (Ref [])] (Ref [Str "b"]),Str "."]
-, Para [Link [Str "b"] (Ref []),Space,Str "by",Space,Str "itself",Space,Str "should",Space,Str "be",Space,Str "a",Space,Str "link."]
-, Para [Str "Indented",Space,Link [Str "once"] (Ref []),Str "."]
-, Para [Str "Indented",Space,Link [Str "twice"] (Ref []),Str "."]
-, Para [Str "Indented",Space,Link [Str "thrice"] (Ref []),Str "."]
-, Para [Str "This",Space,Str "should",Space,Link [Str "not"] (Ref []),Space,Str "be",Space,Str "a",Space,Str "link."]
+, Para [Str "With",Space,Link [Str "embedded",Space,Str "[",Str "brackets",Str "]"] (Ref [Str "b"]),Str "."]
+, Para [Link [Str "b"] (Ref [Str "b"]),Space,Str "by",Space,Str "itself",Space,Str "should",Space,Str "be",Space,Str "a",Space,Str "link."]
+, Para [Str "Indented",Space,Link [Str "once"] (Ref [Str "once"]),Str "."]
+, Para [Str "Indented",Space,Link [Str "twice"] (Ref [Str "twice"]),Str "."]
+, Para [Str "Indented",Space,Link [Str "thrice"] (Ref [Str "thrice"]),Str "."]
+, Para [Str "This",Space,Str "should",Space,Str "[",Str "not",Str "]",Str "[",Str "]",Space,Str "be",Space,Str "a",Space,Str "link."]
 , Key [Str "once"] (Src "/url" "")
 , Key [Str "twice"] (Src "/url" "")
 , Key [Str "thrice"] (Src "/url" "")
 , CodeBlock "[not]: /url"
 , Key [Str "b"] (Src "/url/" "")
-, Para [Str "Foo",Space,Link [Str "bar"] (Ref []),Str "."]
+, Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "bar"]),Str "."]
 , Para [Str "Foo",Space,Link [Str "biz"] (Src "/url/" "Title with &quot;quote&quot; inside"),Str "."]
 , Key [Str "bar"] (Src "/url/" "Title with &quot;quotes&quot; inside")
 , Header 2 [Str "With",Space,Str "ampersands"]
@ -313,7 +313,7 @@ Pandoc (Meta [Str "Pandoc",Space,Str "Test",Space,Str "Suite"] ["John MacFarlane
 , HorizontalRule
 , Header 1 [Str "Images"]
 , Para [Str "From",Space,Str "\"Voyage",Space,Str "dans",Space,Str "la",Space,Str "Lune\"",Space,Str "by",Space,Str "Georges",Space,Str "Melies",Space,Str "(1902):"]
-, Para [Image [Str "lalune"] (Ref [])]
+, Para [Image [Str "lalune"] (Ref [Str "lalune"])]
 , Key [Str "lalune"] (Src "lalune.jpg" "Voyage dans la Lune")
 , Para [Str "Here",Space,Str "is",Space,Str "a",Space,Str "movie",Space,Image [Str "movie"] (Src "movie.jpg" ""),Space,Str "icon."]
 , HorizontalRule
--- a/tests/testsuite.txt
+++ b/tests/testsuite.txt
@ -498,7 +498,7 @@ Just a [URL](/url/).

 [URL and title](/url/	"title preceded by a tab").

-[URL and title](/url/ "title with \"quotes\" in it")
+[URL and title](/url/ "title with "quotes" in it")

 [URL and title](/url/ 'title with single quotes')

@ -541,9 +541,9 @@ This should [not][] be a link.

 Foo [bar][].

-Foo [biz](/url/ "Title with \"quote\" inside").
+Foo [biz](/url/ "Title with "quote" inside").

-  [bar]: /url/ "Title with \"quotes\" inside"
+  [bar]: /url/ "Title with "quotes" inside"

 ## With ampersands

--- a/tests/writer.html
+++ b/tests/writer.html
@ -407,7 +407,7 @@ document.write('<a h'+'ref'+'="ma'+'ilto'+':'+e+'">'+'Email link'+'<\/'+'a'+'>')
 <p>Indented <a href="/url">once</a>.</p>
 <p>Indented <a href="/url">twice</a>.</p>
 <p>Indented <a href="/url">thrice</a>.</p>
-<p>This should [not] be a link.</p>
+<p>This should [not][] be a link.</p>
 <pre><code>[not]: /url
 </code></pre>
 <p>Foo <a href="/url/" title="Title with &quot;quotes&quot; inside">bar</a>.</p>
--- a/tests/writer.latex
+++ b/tests/writer.latex
@ -514,7 +514,7 @@ Indented \href{/url}{twice}.

 Indented \href{/url}{thrice}.

-This should [not] be a link.
+This should [not][] be a link.

 \begin{verbatim}
 [not]: /url
--- a/tests/writer.markdown
+++ b/tests/writer.markdown
@ -529,15 +529,15 @@ Foo [bar][a].

  [a]: /url/

-With [embedded [brackets][]][b].
+With [embedded [brackets]][b].

-[b][] by itself should be a link.
+[b] by itself should be a link.

-Indented [once][].
+Indented [once].

-Indented [twice][].
+Indented [twice].

-Indented [thrice][].
+Indented [thrice].

 This should [not][] be a link.

@ -550,7 +550,7 @@ This should [not][] be a link.

  [b]: /url/

-Foo [bar][].
+Foo [bar].

 Foo [biz](/url/ "Title with &quot;quote&quot; inside").

--- a/tests/writer.native
+++ b/tests/writer.native
@ -277,18 +277,18 @@ Pandoc (Meta [Str "Pandoc",Space,Str "Test",Space,Str "Suite"] ["John MacFarlane
 , Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "a"]),Str "."]
 , Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "a"]),Str "."]
 , Key [Str "a"] (Src "/url/" "")
-, Para [Str "With",Space,Link [Str "embedded",Space,Link [Str "brackets"] (Ref [])] (Ref [Str "b"]),Str "."]
-, Para [Link [Str "b"] (Ref []),Space,Str "by",Space,Str "itself",Space,Str "should",Space,Str "be",Space,Str "a",Space,Str "link."]
-, Para [Str "Indented",Space,Link [Str "once"] (Ref []),Str "."]
-, Para [Str "Indented",Space,Link [Str "twice"] (Ref []),Str "."]
-, Para [Str "Indented",Space,Link [Str "thrice"] (Ref []),Str "."]
-, Para [Str "This",Space,Str "should",Space,Link [Str "not"] (Ref []),Space,Str "be",Space,Str "a",Space,Str "link."]
+, Para [Str "With",Space,Link [Str "embedded",Space,Str "[",Str "brackets",Str "]"] (Ref [Str "b"]),Str "."]
+, Para [Link [Str "b"] (Ref [Str "b"]),Space,Str "by",Space,Str "itself",Space,Str "should",Space,Str "be",Space,Str "a",Space,Str "link."]
+, Para [Str "Indented",Space,Link [Str "once"] (Ref [Str "once"]),Str "."]
+, Para [Str "Indented",Space,Link [Str "twice"] (Ref [Str "twice"]),Str "."]
+, Para [Str "Indented",Space,Link [Str "thrice"] (Ref [Str "thrice"]),Str "."]
+, Para [Str "This",Space,Str "should",Space,Str "[",Str "not",Str "]",Str "[",Str "]",Space,Str "be",Space,Str "a",Space,Str "link."]
 , Key [Str "once"] (Src "/url" "")
 , Key [Str "twice"] (Src "/url" "")
 , Key [Str "thrice"] (Src "/url" "")
 , CodeBlock "[not]: /url"
 , Key [Str "b"] (Src "/url/" "")
-, Para [Str "Foo",Space,Link [Str "bar"] (Ref []),Str "."]
+, Para [Str "Foo",Space,Link [Str "bar"] (Ref [Str "bar"]),Str "."]
 , Para [Str "Foo",Space,Link [Str "biz"] (Src "/url/" "Title with &quot;quote&quot; inside"),Str "."]
 , Key [Str "bar"] (Src "/url/" "Title with &quot;quotes&quot; inside")
 , Header 2 [Str "With",Space,Str "ampersands"]
@ -313,7 +313,7 @@ Pandoc (Meta [Str "Pandoc",Space,Str "Test",Space,Str "Suite"] ["John MacFarlane
 , HorizontalRule
 , Header 1 [Str "Images"]
 , Para [Str "From",Space,Str "\"Voyage",Space,Str "dans",Space,Str "la",Space,Str "Lune\"",Space,Str "by",Space,Str "Georges",Space,Str "Melies",Space,Str "(1902):"]
-, Para [Image [Str "lalune"] (Ref [])]
+, Para [Image [Str "lalune"] (Ref [Str "lalune"])]
 , Key [Str "lalune"] (Src "lalune.jpg" "Voyage dans la Lune")
 , Para [Str "Here",Space,Str "is",Space,Str "a",Space,Str "movie",Space,Image [Str "movie"] (Src "movie.jpg" ""),Space,Str "icon."]
 , HorizontalRule
--- a/tests/writer.rst
+++ b/tests/writer.rst
@ -638,7 +638,7 @@ Indented `twice`_.

 Indented `thrice`_.

-This should [not] be a link.
+This should [not][] be a link.

 ::
 
--- a/tests/writer.rtf
+++ b/tests/writer.rtf
@ -312,7 +312,7 @@ twice
 thrice
 }}}
 .\par}
-{\pard \f0 \sa180 \li0 \fi0 This should [not] be a link.\par}
+{\pard \f0 \sa180 \li0 \fi0 This should [not][] be a link.\par}
 {\pard \f0 \sa180 \li0 \fi0 \f1 [not]: /url\par}
 {\pard \f0 \sa180 \li0 \fi0 Foo {\field{\*\fldinst{HYPERLINK "/url/"}}{\fldrslt{\ul
 bar
--- a/tests/writer.smart.html
+++ b/tests/writer.smart.html
@ -407,7 +407,7 @@ document.write('<a h'+'ref'+'="ma'+'ilto'+':'+e+'">'+'Email link'+'<\/'+'a'+'>')
 <p>Indented <a href="/url">once</a>.</p>
 <p>Indented <a href="/url">twice</a>.</p>
 <p>Indented <a href="/url">thrice</a>.</p>
-<p>This should [not] be a link.</p>
+<p>This should [not][] be a link.</p>
 <pre><code>[not]: /url
 </code></pre>
 <p>Foo <a href="/url/" title="Title with &quot;quotes&quot; inside">bar</a>.</p>