Markdown reader: revised definition list syntax (closes #1429).

* This change brings pandoc's definition list syntax into alignment
  with that used in PHP markdown extra and multimarkdown (with the
  exception that pandoc is more flexible about the definition markers,
  allowing tildes as well as colons).

* Lazily wrapped definitions are now allowed; blank space is required
  between list items; and the space before definition is used to
  determine whether it is a paragraph or a "plain" element.

* For backwards compatibility, a new extension,
  `compact_definition_lists`, has been added that restores the behavior
  of pandoc 1.12.x, allowing tight definition lists with no blank space
  between items, and disallowing lazy wrapping.
This commit is contained in:
John MacFarlane 2014-07-20 16:33:59 -07:00
parent cdc4ecbe98
commit 4af8eed764
5 changed files with 147 additions and 35 deletions

66
README
View file

@ -1437,8 +1437,8 @@ If default list markers are desired, use `#.`:
**Extension: `definition_lists`**
Pandoc supports definition lists, using a syntax inspired by
[PHP Markdown Extra] and [reStructuredText]:[^3]
Pandoc supports definition lists, using the syntax of
[PHP Markdown Extra] with some extensions.[^3]
Term 1
@ -1455,25 +1455,41 @@ Pandoc supports definition lists, using a syntax inspired by
Each term must fit on one line, which may optionally be followed by
a blank line, and must be followed by one or more definitions.
A definition begins with a colon or tilde, which may be indented one
or two spaces. The body of the definition (including the first line,
aside from the colon or tilde) should be indented four spaces. A term may have
multiple definitions, and each definition may consist of one or more block
elements (paragraph, code block, list, etc.), each indented four spaces or one
tab stop.
or two spaces.
If you leave space after the definition (as in the example above),
the blocks of the definitions will be considered paragraphs. In some
A term may have multiple definitions, and each definition may consist of one or
more block elements (paragraph, code block, list, etc.), each indented four
spaces or one tab stop. The body of the definition (including the first line,
aside from the colon or tilde) should be indented four spaces. However,
as with other markdown lists, you can "lazily" omit indentation except
at the beginning of a paragraph or other block element:
Term 1
: Definition
with lazy continuation.
Second paragraph of the definition.
If you leave space before the definition (as in the example above),
the text of the definition will be treated as a paragraph. In some
output formats, this will mean greater spacing between term/definition
pairs. For a compact definition list, do not leave space between the
definition and the next term:
pairs. For a more compact definition list, omit the space before the
definition:
Term 1
~ Definition 1
Term 2
~ Definition 2a
~ Definition 2b
[^3]: I have also been influenced by the suggestions of [David Wheeler](http://www.justatheory.com/computers/markup/modest-markdown-proposal.html).
Note that space between items in a definition list is required.
(A variant that loosens this requirement, but disallows "lazy"
hard wrapping, can be activated with `compact_definition_lists`: see
[Non-pandoc extensions](#non-pandoc-extensions), below.)
[^3]: I have been influenced by the suggestions of [David Wheeler](http://www.justatheory.com/computers/markup/modest-markdown-proposal.html).
[PHP Markdown Extra]: http://www.michelf.com/projects/php-markdown/extra/
@ -2629,6 +2645,32 @@ these, so they are presently just ignored.
Parses multimarkdown style header identifiers (in square brackets,
after the header but before any trailing `#`s in an ATX header).
**Extension: `compact_definition_lists`**\
Activates the definition list syntax of pandoc 1.12.x and earlier.
This syntax differs from the one described [above](#definition-lists)
in several respects:
- No blank line is required between consecutive items of the
definition list.
- To get a "tight" or "compact" list, omit space between consecutive
items; the space between a term and its definition does not affect
anything.
- Lazy wrapping of paragraphs is not allowed: the entire definition must
be indented four spaces.[^6]
[^6]: To see why laziness is incompatible with relaxing the requirement
of a blank line between items, consider the following example:
bar
: definition
foo
: definition
Is this a single list item with two definitions of "bar," the first of
which is lazily wrapped, or two list items? To remove the ambiguity
we must either disallow lazy wrapping or require a blank line between
list items.
Markdown variants
-----------------

View file

@ -84,6 +84,8 @@ data Extension =
| Ext_lists_without_preceding_blankline -- ^ Allow lists without preceding blank
| Ext_startnum -- ^ Make start number of ordered list significant
| Ext_definition_lists -- ^ Definition lists as in pandoc, mmd, php
| Ext_compact_definition_lists -- ^ Definition lists without
-- space between items, and disallow laziness
| Ext_example_lists -- ^ Markdown-style numbered examples
| Ext_all_symbols_escapable -- ^ Make all non-alphanumerics escapable
| Ext_intraword_underscores -- ^ Treat underscore inside word as literal

View file

@ -846,38 +846,53 @@ defListMarker = do
else mzero
return ()
definitionListItem :: MarkdownParser (F (Inlines, [Blocks]))
definitionListItem = try $ do
-- first, see if this has any chance of being a definition list:
lookAhead (anyLine >> optional blankline >> defListMarker)
term <- trimInlinesF . mconcat <$> manyTill inline newline
optional blankline
raw <- many1 defRawBlock
state <- getState
let oldContext = stateParserContext state
-- parse the extracted block, which may contain various block elements:
definitionListItem :: Bool -> MarkdownParser (F (Inlines, [Blocks]))
definitionListItem compact = try $ do
rawLine' <- anyLine
raw <- many1 $ defRawBlock compact
term <- parseFromString (trimInlinesF . mconcat <$> many inline) rawLine'
contents <- mapM (parseFromString parseBlocks) raw
updateState (\st -> st {stateParserContext = oldContext})
optional blanklines
return $ liftM2 (,) term (sequence contents)
defRawBlock :: MarkdownParser String
defRawBlock = try $ do
defRawBlock :: Bool -> MarkdownParser String
defRawBlock compact = try $ do
hasBlank <- option False $ blankline >> return True
defListMarker
firstline <- anyLine
rawlines <- many (notFollowedBy blankline >> indentSpaces >> anyLine)
trailing <- option "" blanklines
cont <- liftM concat $ many $ do
lns <- many1 $ notFollowedBy blankline >> indentSpaces >> anyLine
trl <- option "" blanklines
return $ unlines lns ++ trl
return $ firstline ++ "\n" ++ unlines rawlines ++ trailing ++ cont
let dline = try
( do notFollowedBy blankline
if compact -- laziness not compatible with compact
then () <$ indentSpaces
else (() <$ indentSpaces)
<|> notFollowedBy defListMarker
anyLine )
rawlines <- many dline
cont <- liftM concat $ many $ try $ do
trailing <- option "" blanklines
ln <- indentSpaces >> notFollowedBy blankline >> anyLine
lns <- many dline
return $ trailing ++ unlines (ln:lns)
return $ trimr (firstline ++ "\n" ++ unlines rawlines ++ cont) ++
if hasBlank || not (null cont) then "\n\n" else ""
definitionList :: MarkdownParser (F Blocks)
definitionList = do
guardEnabled Ext_definition_lists
items <- fmap sequence $ many1 definitionListItem
definitionList = try $ do
lookAhead (anyLine >> optional blankline >> defListMarker)
compactDefinitionList <|> normalDefinitionList
compactDefinitionList :: MarkdownParser (F Blocks)
compactDefinitionList = do
guardEnabled Ext_compact_definition_lists
items <- fmap sequence $ many1 $ definitionListItem True
return $ B.definitionList <$> fmap compactify'DL items
normalDefinitionList :: MarkdownParser (F Blocks)
normalDefinitionList = do
guardEnabled Ext_definition_lists
items <- fmap sequence $ many1 $ definitionListItem False
return $ B.definitionList <$> items
--
-- paragraph block
--

View file

@ -16,6 +16,10 @@ markdown = readMarkdown def
markdownSmart :: String -> Pandoc
markdownSmart = readMarkdown def { readerSmart = True }
markdownCDL :: String -> Pandoc
markdownCDL = readMarkdown def { readerExtensions = Set.insert
Ext_compact_definition_lists $ readerExtensions def }
infix 4 =:
(=:) :: ToString c
=> String -> (String, c) -> Test
@ -222,6 +226,43 @@ tests = [ testGroup "inline code"
-- , testGroup "round trip"
-- [ property "p_markdown_round_trip" p_markdown_round_trip
-- ]
, testGroup "definition lists"
[ "no blank space" =:
"foo1\n : bar\n\nfoo2\n : bar2\n : bar3\n" =?>
definitionList [ (text "foo1", [plain (text "bar")])
, (text "foo2", [plain (text "bar2"),
plain (text "bar3")])
]
, "blank space before first def" =:
"foo1\n\n : bar\n\nfoo2\n\n : bar2\n : bar3\n" =?>
definitionList [ (text "foo1", [para (text "bar")])
, (text "foo2", [para (text "bar2"),
plain (text "bar3")])
]
, "blank space before second def" =:
"foo1\n : bar\n\nfoo2\n : bar2\n\n : bar3\n" =?>
definitionList [ (text "foo1", [plain (text "bar")])
, (text "foo2", [plain (text "bar2"),
para (text "bar3")])
]
, "laziness" =:
"foo1\n : bar\nbaz\n : bar2\n" =?>
definitionList [ (text "foo1", [plain (text "bar baz"),
plain (text "bar2")])
]
, "no blank space before first of two paragraphs" =:
"foo1\n : bar\n\n baz\n" =?>
definitionList [ (text "foo1", [para (text "bar") <>
para (text "baz")])
]
]
, testGroup "+compact_definition_lists"
[ test markdownCDL "basic compact list" $
"foo1\n: bar\n baz\nfoo2\n: bar2\n" =?>
definitionList [ (text "foo1", [plain (text "bar baz")])
, (text "foo2", [plain (text "bar2")])
]
]
, testGroup "lists"
[ "issue #1154" =:
" - <div>\n first div breaks\n </div>\n\n <button>if this button exists</button>\n\n <div>\n with this div too.\n </div>\n"

View file

@ -270,8 +270,10 @@ Tight using spaces:
apple
: red fruit
orange
: orange fruit
banana
: yellow fruit
@ -279,31 +281,38 @@ Tight using tabs:
apple
: red fruit
orange
: orange fruit
banana
: yellow fruit
Loose:
apple
: red fruit
orange
: orange fruit
banana
: yellow fruit
Multiple blocks with italics:
*apple*
: red fruit
contains seeds,
crisp, pleasant to taste
*orange*
: orange fruit
{ orange code block }
@ -315,6 +324,7 @@ Multiple definitions, tight:
apple
: red fruit
: computer
orange
: orange fruit
: bank
@ -322,11 +332,13 @@ orange
Multiple definitions, loose:
apple
: red fruit
: computer
orange
: orange fruit
: bank