Markdown reader: revised definition list syntax (closes #1429).

* This change brings pandoc's definition list syntax into alignment
  with that used in PHP markdown extra and multimarkdown (with the
  exception that pandoc is more flexible about the definition markers,
  allowing tildes as well as colons).

* Lazily wrapped definitions are now allowed; blank space is required
  between list items; and the space before definition is used to
  determine whether it is a paragraph or a "plain" element.

* For backwards compatibility, a new extension,
  `compact_definition_lists`, has been added that restores the behavior
  of pandoc 1.12.x, allowing tight definition lists with no blank space
  between items, and disallowing lazy wrapping.
This commit is contained in:
John MacFarlane 2014-07-20 16:33:59 -07:00
parent cdc4ecbe98
commit 4af8eed764
5 changed files with 147 additions and 35 deletions

66
README
View file

@ -1437,8 +1437,8 @@ If default list markers are desired, use `#.`:
**Extension: `definition_lists`** **Extension: `definition_lists`**
Pandoc supports definition lists, using a syntax inspired by Pandoc supports definition lists, using the syntax of
[PHP Markdown Extra] and [reStructuredText]:[^3] [PHP Markdown Extra] with some extensions.[^3]
Term 1 Term 1
@ -1455,25 +1455,41 @@ Pandoc supports definition lists, using a syntax inspired by
Each term must fit on one line, which may optionally be followed by Each term must fit on one line, which may optionally be followed by
a blank line, and must be followed by one or more definitions. a blank line, and must be followed by one or more definitions.
A definition begins with a colon or tilde, which may be indented one A definition begins with a colon or tilde, which may be indented one
or two spaces. The body of the definition (including the first line, or two spaces.
aside from the colon or tilde) should be indented four spaces. A term may have
multiple definitions, and each definition may consist of one or more block
elements (paragraph, code block, list, etc.), each indented four spaces or one
tab stop.
If you leave space after the definition (as in the example above), A term may have multiple definitions, and each definition may consist of one or
the blocks of the definitions will be considered paragraphs. In some more block elements (paragraph, code block, list, etc.), each indented four
spaces or one tab stop. The body of the definition (including the first line,
aside from the colon or tilde) should be indented four spaces. However,
as with other markdown lists, you can "lazily" omit indentation except
at the beginning of a paragraph or other block element:
Term 1
: Definition
with lazy continuation.
Second paragraph of the definition.
If you leave space before the definition (as in the example above),
the text of the definition will be treated as a paragraph. In some
output formats, this will mean greater spacing between term/definition output formats, this will mean greater spacing between term/definition
pairs. For a compact definition list, do not leave space between the pairs. For a more compact definition list, omit the space before the
definition and the next term: definition:
Term 1 Term 1
~ Definition 1 ~ Definition 1
Term 2 Term 2
~ Definition 2a ~ Definition 2a
~ Definition 2b ~ Definition 2b
[^3]: I have also been influenced by the suggestions of [David Wheeler](http://www.justatheory.com/computers/markup/modest-markdown-proposal.html). Note that space between items in a definition list is required.
(A variant that loosens this requirement, but disallows "lazy"
hard wrapping, can be activated with `compact_definition_lists`: see
[Non-pandoc extensions](#non-pandoc-extensions), below.)
[^3]: I have been influenced by the suggestions of [David Wheeler](http://www.justatheory.com/computers/markup/modest-markdown-proposal.html).
[PHP Markdown Extra]: http://www.michelf.com/projects/php-markdown/extra/ [PHP Markdown Extra]: http://www.michelf.com/projects/php-markdown/extra/
@ -2629,6 +2645,32 @@ these, so they are presently just ignored.
Parses multimarkdown style header identifiers (in square brackets, Parses multimarkdown style header identifiers (in square brackets,
after the header but before any trailing `#`s in an ATX header). after the header but before any trailing `#`s in an ATX header).
**Extension: `compact_definition_lists`**\
Activates the definition list syntax of pandoc 1.12.x and earlier.
This syntax differs from the one described [above](#definition-lists)
in several respects:
- No blank line is required between consecutive items of the
definition list.
- To get a "tight" or "compact" list, omit space between consecutive
items; the space between a term and its definition does not affect
anything.
- Lazy wrapping of paragraphs is not allowed: the entire definition must
be indented four spaces.[^6]
[^6]: To see why laziness is incompatible with relaxing the requirement
of a blank line between items, consider the following example:
bar
: definition
foo
: definition
Is this a single list item with two definitions of "bar," the first of
which is lazily wrapped, or two list items? To remove the ambiguity
we must either disallow lazy wrapping or require a blank line between
list items.
Markdown variants Markdown variants
----------------- -----------------

View file

@ -84,6 +84,8 @@ data Extension =
| Ext_lists_without_preceding_blankline -- ^ Allow lists without preceding blank | Ext_lists_without_preceding_blankline -- ^ Allow lists without preceding blank
| Ext_startnum -- ^ Make start number of ordered list significant | Ext_startnum -- ^ Make start number of ordered list significant
| Ext_definition_lists -- ^ Definition lists as in pandoc, mmd, php | Ext_definition_lists -- ^ Definition lists as in pandoc, mmd, php
| Ext_compact_definition_lists -- ^ Definition lists without
-- space between items, and disallow laziness
| Ext_example_lists -- ^ Markdown-style numbered examples | Ext_example_lists -- ^ Markdown-style numbered examples
| Ext_all_symbols_escapable -- ^ Make all non-alphanumerics escapable | Ext_all_symbols_escapable -- ^ Make all non-alphanumerics escapable
| Ext_intraword_underscores -- ^ Treat underscore inside word as literal | Ext_intraword_underscores -- ^ Treat underscore inside word as literal

View file

@ -846,38 +846,53 @@ defListMarker = do
else mzero else mzero
return () return ()
definitionListItem :: MarkdownParser (F (Inlines, [Blocks])) definitionListItem :: Bool -> MarkdownParser (F (Inlines, [Blocks]))
definitionListItem = try $ do definitionListItem compact = try $ do
-- first, see if this has any chance of being a definition list: rawLine' <- anyLine
lookAhead (anyLine >> optional blankline >> defListMarker) raw <- many1 $ defRawBlock compact
term <- trimInlinesF . mconcat <$> manyTill inline newline term <- parseFromString (trimInlinesF . mconcat <$> many inline) rawLine'
optional blankline
raw <- many1 defRawBlock
state <- getState
let oldContext = stateParserContext state
-- parse the extracted block, which may contain various block elements:
contents <- mapM (parseFromString parseBlocks) raw contents <- mapM (parseFromString parseBlocks) raw
updateState (\st -> st {stateParserContext = oldContext}) optional blanklines
return $ liftM2 (,) term (sequence contents) return $ liftM2 (,) term (sequence contents)
defRawBlock :: MarkdownParser String defRawBlock :: Bool -> MarkdownParser String
defRawBlock = try $ do defRawBlock compact = try $ do
hasBlank <- option False $ blankline >> return True
defListMarker defListMarker
firstline <- anyLine firstline <- anyLine
rawlines <- many (notFollowedBy blankline >> indentSpaces >> anyLine) let dline = try
trailing <- option "" blanklines ( do notFollowedBy blankline
cont <- liftM concat $ many $ do if compact -- laziness not compatible with compact
lns <- many1 $ notFollowedBy blankline >> indentSpaces >> anyLine then () <$ indentSpaces
trl <- option "" blanklines else (() <$ indentSpaces)
return $ unlines lns ++ trl <|> notFollowedBy defListMarker
return $ firstline ++ "\n" ++ unlines rawlines ++ trailing ++ cont anyLine )
rawlines <- many dline
cont <- liftM concat $ many $ try $ do
trailing <- option "" blanklines
ln <- indentSpaces >> notFollowedBy blankline >> anyLine
lns <- many dline
return $ trailing ++ unlines (ln:lns)
return $ trimr (firstline ++ "\n" ++ unlines rawlines ++ cont) ++
if hasBlank || not (null cont) then "\n\n" else ""
definitionList :: MarkdownParser (F Blocks) definitionList :: MarkdownParser (F Blocks)
definitionList = do definitionList = try $ do
guardEnabled Ext_definition_lists lookAhead (anyLine >> optional blankline >> defListMarker)
items <- fmap sequence $ many1 definitionListItem compactDefinitionList <|> normalDefinitionList
compactDefinitionList :: MarkdownParser (F Blocks)
compactDefinitionList = do
guardEnabled Ext_compact_definition_lists
items <- fmap sequence $ many1 $ definitionListItem True
return $ B.definitionList <$> fmap compactify'DL items return $ B.definitionList <$> fmap compactify'DL items
normalDefinitionList :: MarkdownParser (F Blocks)
normalDefinitionList = do
guardEnabled Ext_definition_lists
items <- fmap sequence $ many1 $ definitionListItem False
return $ B.definitionList <$> items
-- --
-- paragraph block -- paragraph block
-- --

View file

@ -16,6 +16,10 @@ markdown = readMarkdown def
markdownSmart :: String -> Pandoc markdownSmart :: String -> Pandoc
markdownSmart = readMarkdown def { readerSmart = True } markdownSmart = readMarkdown def { readerSmart = True }
markdownCDL :: String -> Pandoc
markdownCDL = readMarkdown def { readerExtensions = Set.insert
Ext_compact_definition_lists $ readerExtensions def }
infix 4 =: infix 4 =:
(=:) :: ToString c (=:) :: ToString c
=> String -> (String, c) -> Test => String -> (String, c) -> Test
@ -222,6 +226,43 @@ tests = [ testGroup "inline code"
-- , testGroup "round trip" -- , testGroup "round trip"
-- [ property "p_markdown_round_trip" p_markdown_round_trip -- [ property "p_markdown_round_trip" p_markdown_round_trip
-- ] -- ]
, testGroup "definition lists"
[ "no blank space" =:
"foo1\n : bar\n\nfoo2\n : bar2\n : bar3\n" =?>
definitionList [ (text "foo1", [plain (text "bar")])
, (text "foo2", [plain (text "bar2"),
plain (text "bar3")])
]
, "blank space before first def" =:
"foo1\n\n : bar\n\nfoo2\n\n : bar2\n : bar3\n" =?>
definitionList [ (text "foo1", [para (text "bar")])
, (text "foo2", [para (text "bar2"),
plain (text "bar3")])
]
, "blank space before second def" =:
"foo1\n : bar\n\nfoo2\n : bar2\n\n : bar3\n" =?>
definitionList [ (text "foo1", [plain (text "bar")])
, (text "foo2", [plain (text "bar2"),
para (text "bar3")])
]
, "laziness" =:
"foo1\n : bar\nbaz\n : bar2\n" =?>
definitionList [ (text "foo1", [plain (text "bar baz"),
plain (text "bar2")])
]
, "no blank space before first of two paragraphs" =:
"foo1\n : bar\n\n baz\n" =?>
definitionList [ (text "foo1", [para (text "bar") <>
para (text "baz")])
]
]
, testGroup "+compact_definition_lists"
[ test markdownCDL "basic compact list" $
"foo1\n: bar\n baz\nfoo2\n: bar2\n" =?>
definitionList [ (text "foo1", [plain (text "bar baz")])
, (text "foo2", [plain (text "bar2")])
]
]
, testGroup "lists" , testGroup "lists"
[ "issue #1154" =: [ "issue #1154" =:
" - <div>\n first div breaks\n </div>\n\n <button>if this button exists</button>\n\n <div>\n with this div too.\n </div>\n" " - <div>\n first div breaks\n </div>\n\n <button>if this button exists</button>\n\n <div>\n with this div too.\n </div>\n"

View file

@ -270,8 +270,10 @@ Tight using spaces:
apple apple
: red fruit : red fruit
orange orange
: orange fruit : orange fruit
banana banana
: yellow fruit : yellow fruit
@ -279,31 +281,38 @@ Tight using tabs:
apple apple
: red fruit : red fruit
orange orange
: orange fruit : orange fruit
banana banana
: yellow fruit : yellow fruit
Loose: Loose:
apple apple
: red fruit : red fruit
orange orange
: orange fruit : orange fruit
banana banana
: yellow fruit : yellow fruit
Multiple blocks with italics: Multiple blocks with italics:
*apple* *apple*
: red fruit : red fruit
contains seeds, contains seeds,
crisp, pleasant to taste crisp, pleasant to taste
*orange* *orange*
: orange fruit : orange fruit
{ orange code block } { orange code block }
@ -315,6 +324,7 @@ Multiple definitions, tight:
apple apple
: red fruit : red fruit
: computer : computer
orange orange
: orange fruit : orange fruit
: bank : bank
@ -322,11 +332,13 @@ orange
Multiple definitions, loose: Multiple definitions, loose:
apple apple
: red fruit : red fruit
: computer : computer
orange orange
: orange fruit : orange fruit
: bank : bank