Removed --strict, added extensions to writer/reader names.

* The `--strict` option has been removed. * Instead of using `--strict`, one can now use `strict` instead of `markdown` as an input or output format name. * The `--enable` and `--disable` optinos have been removed. * It is now possible to enable or disable specific extensions by appending them (with '+' or '-') to the writer or reader name. For example `pandoc -f markdown-footnotes+hard_line_breaks`. * The lhs extensions are now implemented this way, too; you can use either `+lhs` or `+literate_haskell`.
2012-08-09 20:19:06 -07:00 · 2012-08-09 20:19:06 -07:00 · 0cb7362f62
commit 0cb7362f62
parent 71e0c206c1
5 changed files with 73 additions and 78 deletions
--- a/37
+++ b/37
@ -26,8 +26,8 @@ tables, flexible ordered lists, definition lists, delimited code blocks,
 superscript, subscript, strikeout, title blocks, automatic tables of
 contents, embedded LaTeX math, citations, and markdown inside HTML block
 elements. (These enhancements, described below under
-[Pandoc's markdown](#pandocs-markdown), can be disabled using the `--strict`
-option.)
+[Pandoc's markdown](#pandocs-markdown), can be disabled using the `strict`
+input or output format.)

 In contrast to most existing tools for converting markdown to HTML, which
 use regex substitutions, Pandoc has a modular design: it consists of a
@ -117,10 +117,11 @@ and `xunicode` (if `xelatex` is used).

 A user who wants a drop-in replacement for `Markdown.pl` may create
 a symbolic link to the `pandoc` executable called `hsmarkdown`. When
-invoked under the name `hsmarkdown`, `pandoc` will behave as if the
-`--strict` flag had been selected, and no command-line options will be
-recognized. However, this approach does not work under Cygwin, due to
-problems with its simulation of symbolic links.
+invoked under the name `hsmarkdown`, `pandoc` will behave as if
+invoked with `-f strict --email-obfuscation=references`,
+and all command-line options will be treated as regular arguments.
+However, this approach does not work under Cygwin, due to problems with
+its simulation of symbolic links.

 [Cygwin]:  http://www.cygwin.com/
 [`iconv`]: http://www.gnu.org/software/libiconv/
@ -192,12 +193,6 @@ General options
 Reader options
 --------------

-`--strict`
-:   Use strict markdown syntax, with no pandoc extensions or variants.
-    When the input format is HTML, this means that constructs that have no
-    equivalents in standard markdown (e.g. definition lists or strikeout
-    text) will be parsed as raw HTML.
-
 `-R`, `--parse-raw`
 :   Parse untranslatable HTML codes and LaTeX environments as raw HTML
    or LaTeX, instead of ignoring them.  Affects only HTML and LaTeX
@ -403,8 +398,6 @@ Options affecting specific writers
    *none* leaves `mailto:` links as they are.  *javascript* obfuscates
    them using javascript. *references* obfuscates them by printing their
    letters as decimal or hexadecimal character references.
-    If `--strict` is specified, *references* is used regardless of the
-    presence of this option.

 `--id-prefix`=*STRING*
 :   Specify a prefix to be added to all automatically generated identifiers
@ -764,8 +757,8 @@ Pandoc's markdown
 Pandoc understands an extended and slightly revised version of
 John Gruber's [markdown] syntax.  This document explains the syntax,
 noting differences from standard markdown. Except where noted, these
-differences can be suppressed by specifying the `--strict` command-line
-option.
+differences can be suppressed by using the `strict` format instead
+of `markdown`.

 Philosophy
 ----------
@ -934,8 +927,8 @@ Standard markdown syntax does not require a blank line before a block
 quote.  Pandoc does require this (except, of course, at the beginning of the
 document). The reason for the requirement is that it is all too easy for a
 `>` to end up at the beginning of a line by accident (perhaps through line
-wrapping). So, unless `--strict` is used, the following does not produce
-a nested block quote in pandoc:
+wrapping). So, unless the `strict` format is used, the following does
+not produce a nested block quote in pandoc:

    > This is a block quote.
    >> Nested.
@ -1291,7 +1284,7 @@ around "Third". Pandoc follows a simple rule: if the text is followed by
 a blank line, it is treated as a paragraph. Since "Second" is followed
 by a list, and not a blank line, it isn't treated as a paragraph. The
 fact that the list is followed by a blank line is irrelevant. (Note:
-Pandoc works this way even when the `--strict` option is specified. This
+Pandoc works this way even when the `strict` format is specified. This
 behavior is consistent with the official markdown syntax description,
 even though it is different from that of `Markdown.pl`.)

@ -1612,8 +1605,8 @@ which allows only the following characters to be backslash-escaped:

    \`*_{}[]()>#+-.!

-(However, if the `--strict` option is supplied, the standard
-markdown rule will be used.)
+(However, if the `strict` format is used, the standard markdown rule
+will be used.)

 A backslash-escaped space is parsed as a nonbreaking space.  It will
 appear in TeX output as `~` and in HTML and XML as `\&#160;` or
@ -1847,7 +1840,7 @@ with blank lines, and start and end at the left margin.  Within
 these blocks, everything is interpreted as HTML, not markdown;
 so (for example), `*` does not signify emphasis.

-Pandoc behaves this way when `--strict` is specified; but by default,
+Pandoc behaves this way when the `strict` format is used; but by default,
 pandoc interprets material between HTML block tags as markdown.
 Thus, for example, Pandoc will turn

--- a/src/Text/Pandoc.hs
+++ b/src/Text/Pandoc.hs
@ -139,11 +139,13 @@ import Text.Pandoc.Templates
 import Text.Pandoc.Options
 import Text.Pandoc.Shared (safeRead)
 import Data.ByteString.Lazy (ByteString)
+import Data.List (intercalate)
 import Data.Version (showVersion)
 import Text.JSON.Generic
 import Data.Set (Set)
 import qualified Data.Set as Set
 import Text.Parsec
+import Text.Parsec.Error
 import Paths_pandoc (version)

 -- | Version number of pandoc library.
@ -160,10 +162,12 @@ parseFormatSpec = parse formatSpec ""
        formatName = many1 $ noneOf "-+"
        extMod = do
          polarity <- oneOf "-+"
-          name <- many1 $ noneOf "-+"
-          ext <- case safeRead name of
+          name <- many $ noneOf "-+"
+          ext <- case safeRead ("Ext_" ++ name) of
                       Just n  -> return n
-                       Nothing -> unexpected $ "Unknown extension: " ++ name
+                       Nothing
+                         | name == "lhs" -> return Ext_literate_haskell
+                         | otherwise -> fail $ "Unknown extension: " ++ name
          return $ case polarity of
                        '-'  -> Set.delete ext
                        _    -> Set.insert ext
@ -172,6 +176,8 @@ parseFormatSpec = parse formatSpec ""
 readers :: [(String, ReaderOptions -> String -> Pandoc)]
 readers = [("native"       , \_ -> readNative)
          ,("json"         , \_ -> decodeJSON)
+          ,("strict"       , \o -> readMarkdown
+                               o{ readerExtensions = strictExtensions } )
          ,("markdown"     , readMarkdown)
          ,("rst"          , readRST)
          ,("docbook"      , readDocBook)
@ -215,6 +221,8 @@ writers = [
  ,("texinfo"      , PureStringWriter writeTexinfo)
  ,("man"          , PureStringWriter writeMan)
  ,("markdown"     , PureStringWriter writeMarkdown)
+  ,("strict"       , PureStringWriter $ \o ->
+     writeMarkdown o{ writerExtensions = strictExtensions } )
  ,("plain"        , PureStringWriter writePlain)
  ,("rst"          , PureStringWriter writeRST)
  ,("mediawiki"    , PureStringWriter writeMediaWiki)
@ -224,10 +232,11 @@ writers = [
  ,("asciidoc"     , PureStringWriter writeAsciiDoc)
  ]

+-- | Retrieve reader based on formatSpec (format+extensions).
 getReader :: String -> Either String (ReaderOptions -> String -> Pandoc)
 getReader s =
  case parseFormatSpec s of
-       Left e  -> Left $ show e
+       Left e  -> Left $ intercalate "\n" $ [m | Message m <- errorMessages e]
       Right (readerName, setExts) ->
           case lookup readerName readers of
                   Nothing  -> Left $ "Unknown reader: " ++ readerName
@ -239,7 +248,7 @@ getReader s =
 getWriter :: String -> Either String Writer
 getWriter s =
  case parseFormatSpec s of
-       Left e  -> Left $ show e
+       Left e  -> Left $ intercalate "\n" $ [m | Message m <- errorMessages e]
       Right (writerName, setExts) ->
           case lookup writerName writers of
                   Nothing  -> Left $ "Unknown writer: " ++ writerName
--- a/src/Text/Pandoc/Shared.hs
+++ b/src/Text/Pandoc/Shared.hs
@ -498,5 +498,6 @@ warn msg = do

 safeRead :: (Monad m, Read a) => String -> m a
 safeRead s = case reads s of
-                  (d,[]):_  -> return d
+                  (d,x):_
+                    | all isSpace x -> return d
                  _                 -> fail $ "Could not read `" ++ s ++ "'"
--- a/src/Text/Pandoc/Templates.hs
+++ b/src/Text/Pandoc/Templates.hs
@ -86,15 +86,16 @@ import qualified Control.Exception.Extensible as E (try, IOException)
 getDefaultTemplate :: (Maybe FilePath) -- ^ User data directory to search first
                   -> String           -- ^ Name of writer
                   -> IO (Either E.IOException String)
-getDefaultTemplate _ "native" = return $ Right ""
-getDefaultTemplate _ "json"   = return $ Right ""
-getDefaultTemplate _ "docx"   = return $ Right ""
-getDefaultTemplate user "odt" = getDefaultTemplate user "opendocument"
-getDefaultTemplate user "epub" = getDefaultTemplate user "html"
 getDefaultTemplate user writer = do
-  let format = takeWhile (/='+') writer  -- strip off "+lhs" if present
-  let fname = "templates" </> "default" <.> format
-  E.try $ readDataFile user fname
+  let format = takeWhile (`notElem` "+-") writer  -- strip off extensions
+  case format of
+       "native" -> return $ Right ""
+       "json"   -> return $ Right ""
+       "docx"   -> return $ Right ""
+       "odt"    -> getDefaultTemplate user "opendocument"
+       "epub"   -> return $ Right ""
+       _        -> let fname = "templates" </> "default" <.> format
+                   in  E.try $ readDataFile user fname

 data TemplateState = TemplateState Int [(String,String)]

--- a/src/pandoc.hs
+++ b/src/pandoc.hs
@ -44,7 +44,7 @@ import System.Exit ( exitWith, ExitCode (..) )
 import System.FilePath
 import System.Console.GetOpt
 import Data.Char ( toLower )
-import Data.List ( intercalate, isSuffixOf, isPrefixOf )
+import Data.List ( intercalate, isPrefixOf )
 import System.Directory ( getAppUserDataDirectory, doesFileExist, findExecutable )
 import System.IO ( stdout )
 import System.IO.Error ( isDoesNotExistError )
@ -58,7 +58,6 @@ import Network.HTTP (simpleHTTP, mkRequest, getResponseBody, RequestMethod(..))
 import Network.URI (parseURI, isURI, URI(..))
 import qualified Data.ByteString.Lazy as B
 import Data.ByteString.Lazy.UTF8 (toString)
-import qualified Data.Set as Set
 import Text.CSL.Reference (Reference(..))
 #if MIN_VERSION_base(4,4,0)
 #else
@ -99,8 +98,8 @@ wrapWords indent c = wrap' (c - indent) (c - indent)
                                                       then ",\n" ++ replicate indent ' ' ++ x ++ wrap' cols (cols - length x) xs
                                                       else ", "  ++ x ++ wrap' cols (remaining - (length x + 2)) xs

-nonTextFormats :: [String]
-nonTextFormats = ["odt","docx","epub"]
+isTextFormat :: String -> Bool
+isTextFormat s = takeWhile (`notElem` "+-") s `notElem` ["odt","docx","epub"]

 -- | Data structure for command line options.
 data Opt = Opt
@ -133,7 +132,6 @@ data Opt = Opt
    , optEPUBFonts         :: [FilePath] -- ^ EPUB fonts to embed
    , optDumpArgs          :: Bool    -- ^ Output command-line arguments
    , optIgnoreArgs        :: Bool    -- ^ Ignore command-line arguments
-    , optStrict            :: Bool    -- ^ Use strict markdown syntax
    , optReferenceLinks    :: Bool    -- ^ Use reference links in writing markdown, rst
    , optWrapText          :: Bool    -- ^ Wrap text
    , optColumns           :: Int     -- ^ Line length in characters
@ -186,7 +184,6 @@ defaultOpts = Opt
    , optEPUBFonts         = []
    , optDumpArgs          = False
    , optIgnoreArgs        = False
-    , optStrict            = False
    , optReferenceLinks    = False
    , optWrapText          = True
    , optColumns           = 72
@ -237,7 +234,10 @@ options =

    , Option "" ["strict"]
                 (NoArg
-                  (\opt -> return opt { optStrict = True } ))
+                  (\opt -> do
+                      err 59 $ "The --strict option has been removed.\n" ++
+                               "Use `strict' input or output format instead."
+                      return opt ))
                 "" -- "Disable markdown syntax extensions"

    , Option "R" ["parse-raw"]
@ -702,7 +702,7 @@ usageMessage :: String -> [OptDescr (Opt -> IO Opt)] -> String
 usageMessage programName = usageInfo
  (programName ++ " [OPTIONS] [FILES]" ++ "\nInput formats:  " ++
  (wrapWords 16 78 $ readers'names) ++ "\nOutput formats: " ++
-  (wrapWords 16 78 $ writers'names ++ nonTextFormats) ++ "\nOptions:")
+  (wrapWords 16 78 $ writers'names) ++ "\nOptions:")
  where
    writers'names = map fst writers
    readers'names = map fst readers
@ -782,9 +782,10 @@ main = do
        ["Try " ++ prg ++ " --help for more information."]

  let defaultOpts' = if compatMode
-                       then defaultOpts { optReader = "markdown"
+                       then defaultOpts { optReader = "strict"
                                        , optWriter = "html"
-                                        , optStrict = True }
+                                        , optEmailObfuscation =
+                                           ReferenceObfuscation }
                       else defaultOpts

  -- thread option data structure through all supplied option actions
@ -819,7 +820,6 @@ main = do
              , optEPUBFonts         = epubFonts
              , optDumpArgs          = dumpArgs
              , optIgnoreArgs        = ignoreArgs
-              , optStrict            = strict
              , optReferenceLinks    = referenceLinks
              , optWrapText          = wrap
              , optColumns           = columns
@ -867,8 +867,8 @@ main = do

  let pdfOutput = map toLower (takeExtension outputFile) == ".pdf"

-  let laTeXOutput = writerName' == "latex" || writerName' == "beamer" ||
-         writerName' == "latex+lhs" || writerName' == "beamer+lhs"
+  let laTeXOutput = "latex" `isPrefixOf` writerName' ||
+                    "beamer" `isPrefixOf` writerName'

  when pdfOutput $ do
    -- make sure writer is latex or beamer
@ -882,11 +882,11 @@ main = do
           latexEngine ++ " is needed for pdf output."
         Just _   -> return ()

-  reader <- case (lookup readerName' readers) of
-     Just r  -> return r
-     Nothing -> err 7 ("Unknown reader: " ++ readerName')
+  reader <- case getReader readerName' of
+     Right r  -> return r
+     Left e   -> err 7 e

-  let standalone' = standalone || writerName' `elem` nonTextFormats || pdfOutput
+  let standalone' = standalone || not (isTextFormat writerName') || pdfOutput

  templ <- case templatePath of
                _ | not standalone' -> return ""
@ -896,8 +896,8 @@ main = do
                                 Left e   -> throwIO e
                                 Right t  -> return t
                Just tp -> do
-                           -- strip off "+lhs" if present
-                           let format = takeWhile (/='+') writerName'
+                           -- strip off extensions
+                           let format = takeWhile (`notElem` "+-") writerName'
                           let tp' = case takeExtension tp of
                                          ""   -> tp <.> format
                                          _    -> tp
@ -919,13 +919,13 @@ main = do
                         return $ ("mathml-script", s) : variables
                      _ -> return variables

-  variables'' <- case writerName' of
-                      "dzslides" -> do
+  variables'' <- if "dzslides" `isPrefixOf` writerName'
+                    then do
                        dztempl <- readDataFile datadir $ "dzslides" </> "template.html"
                        let dzcore = unlines $ dropWhile (not . isPrefixOf "<!-- {{{{ dzslides core")
                                             $ lines dztempl
                        return $ ("dzslides-core", dzcore) : variables'
-                      _         -> return variables'
+                    else return variables'

  -- unescape reference ids, which may contain XML entities, so
  -- that we can do lookups with regular string equality
@ -942,13 +942,8 @@ main = do
                     then "."
                     else takeDirectory (head sources)

-  let defaultExts = if strict
-                       then strictExtensions
-                       else pandocExtensions
-
-  let readerOpts = def{ readerExtensions = defaultExts
-                      , readerSmart = smart || (texLigatures &&
-                          (laTeXOutput || writerName' == "context"))
+  let readerOpts = def{ readerSmart = smart || (texLigatures &&
+                          (laTeXOutput || "context" `isPrefixOf` writerName'))
                      , readerStandalone = standalone'
                      , readerParseRaw = parseRaw
                      , readerColumns = columns
@ -972,13 +967,10 @@ main = do
                            writerIgnoreNotes      = False,
                            writerNumberSections   = numberSections,
                            writerSectionDivs      = sectionDivs,
-                            writerExtensions       = defaultExts,
                            writerReferenceLinks   = referenceLinks,
                            writerWrapText         = wrap,
                            writerColumns          = columns,
-                            writerEmailObfuscation = if strict
-                                                     then ReferenceObfuscation
-                                                     else obfuscationMethod,
+                            writerEmailObfuscation = obfuscationMethod,
                            writerIdentifierPrefix = idPrefix,
                            writerSourceDirectory  = sourceDir,
                            writerUserDataDir      = datadir,
@ -997,7 +989,7 @@ main = do
                            writerReferenceDocx    = referenceDocx
                          }

-  when (writerName' `elem` nonTextFormats&& outputFile == "-") $
+  when (not (isTextFormat writerName') && outputFile == "-") $
    err 5 $ "Cannot write " ++ writerName' ++ " output to stdout.\n" ++
            "Specify an output file using the -o option."

@ -1022,7 +1014,7 @@ main = do

  let doc0 = foldr ($) doc transforms

-  doc1 <- if writerName' == "rtf"
+  doc1 <- if "rtf" `isPrefixOf` writerName'
             then bottomUpM rtfEmbedImage doc0
             else return doc0

@ -1050,12 +1042,11 @@ main = do
      writerFn "-" = UTF8.putStr
      writerFn f   = UTF8.writeFile f

-  let mbwriter = lookup writerName' writers
-  case mbwriter of
-    Nothing -> err 9 ("Unknown writer: " ++ writerName')
-    Just (IOStringWriter f) -> f writerOptions doc2 >>= writerFn outputFile
-    Just (IOByteStringWriter f) -> f writerOptions doc2 >>= writeBinary
-    Just (PureStringWriter f)
+  case getWriter writerName' of
+    Left e -> err 9 e
+    Right (IOStringWriter f) -> f writerOptions doc2 >>= writerFn outputFile
+    Right (IOByteStringWriter f) -> f writerOptions doc2 >>= writeBinary
+    Right (PureStringWriter f)
      | pdfOutput -> do
              res <- tex2pdf latexEngine $ f writerOptions doc2
              case res of