From f11360f50e9104bb6ac203c14ec7f02c5b844a96 Mon Sep 17 00:00:00 2001
From: fiddlosopher
Date: Thu, 23 Aug 2007 04:25:09 +0000
Subject: [PATCH] Added new rule for enhanced markdown ordered lists: if the
 list marker is a capital letter followed by a period (including a
 single-letter capital roman numeral), then it must be followed by at least
 two spaces. The point of this is to avoid accidentally treating people's
 initials as list markers: a paragraph may begin: B. Russell was an English
 philosopher. and this shouldn't be treated as a list. Modified Markdown
 reader and README documentation. Added a test case.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@880 788f1e2b-df1e-0410-8736-df70ead52e1b
---
 README                              | 51 +++++++++++++++--------------
 src/Text/Pandoc/Readers/Markdown.hs | 15 +++++----
 tests/testsuite.native              |  1 +
 tests/testsuite.txt                 |  2 ++
 tests/writer.context                |  2 ++
 tests/writer.docbook                |  3 ++
 tests/writer.html                   |  2 ++
 tests/writer.latex                  |  2 ++
 tests/writer.man                    |  2 ++
 tests/writer.markdown               |  2 ++
 tests/writer.native                 |  1 +
 tests/writer.rst                    |  2 ++
 tests/writer.rtf                    |  1 +
 13 files changed, 56 insertions(+), 30 deletions(-)

diff --git a/README b/README
index 1dfab9914..cd59ea50a 100644
--- a/README
+++ b/README
@@ -452,11 +452,31 @@ Unlike standard markdown, Pandoc allows ordered list items to be marked
 with uppercase and lowercase letters and roman numerals, in addition to
 arabic numerals. (This behavior can be turned off using the `--strict`
 option.) List markers may be enclosed in parentheses or followed by a
-single right-parentheses or period. Pandoc also pays attention to the
-type of list marker used, and to the starting number, and both of these
-are preserved where possible in the output format. Thus, the following
-yields a list with numbers followed by a single parenthesis, starting
-with 9, and a sublist with lowercase roman numerals:
+single right-parentheses or period. They must be separated from the
+text that follows by at least one space, and, if the list marker is a
+capital letter with a period, by at least two spaces.[^2]
+
+[^2]: The point of this rule is to ensure that normal paragraphs
+    starting with people's initials, like
+
+        B. Russell was an English philosopher.
+
+    do not get treated as list items.
+
+    This rule will not prevent
+
+        (C) 2007 Joe Smith
+
+    from being interpreted as a list item. In this case, a backslash
+    escape can be used:
+
+        (C\) 2007 Joe Smith
+
+Pandoc also pays attention to the type of list marker used, and to the
+starting number, and both of these are preserved where possible in the
+output format. Thus, the following yields a list with numbers followed
+by a single parenthesis, starting with 9, and a sublist with lowercase
+roman numerals:
 
     9) Ninth
     10) Tenth
@@ -491,27 +511,10 @@ gets treated as if it were
 
     1. One
     2. Two
-        A. Sub
-        B. Sub
+        A.  Sub
+        B.  Sub
     3. Three
 
-Note that a list beginning with a single letter will be interpreted as
-an alphabetic list. So you are out of luck if you want a roman-numbered
-list starting with 100 (C).
-
-Note also that a paragraph starting with a capital letter and a period
-(for example, an initial) or a capital letter in parentheses
-(for example, `(C)`) will be interpreted as a list:
-
-    B. Russell was an English philosopher.
-
-    (C) 2007 Joe Smith
-
-To avoid this, use backslash escapes:
-
-    B\. Russell was an English philosopher.
-
-    \(C) 2007 Joe Smith
 
 Definition lists
 ----------------
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index 82a82925d..b4302e6e4 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -327,12 +327,15 @@ orderedListStart style delim = try $ do
   optional newline -- if preceded by a Plain block in a list context
   nonindentSpaces
   state <- getState
-  if stateStrict state
-     then do many1 digit
-             char '.'
-             return 1
-     else orderedListMarker style delim
-  spaceChar
+  num <- if stateStrict state
+            then do many1 digit
+                    char '.'
+                    return 1
+            else orderedListMarker style delim
+  if delim == Period && (style == UpperAlpha || (style == UpperRoman &&
+     num `elem` [1, 5, 10, 50, 100, 500, 1000]))
+     then char '\t' <|> (spaceChar >> spaceChar)
+     else spaceChar
   skipSpaces
 
 -- parse a line of a list item (start = parser for beginning of list item)
diff --git a/tests/testsuite.native b/tests/testsuite.native
index 9950341c4..3566c3ebf 100644
--- a/tests/testsuite.native
+++ b/tests/testsuite.native
@@ -180,6 +180,7 @@ Pandoc (Meta [Str "Pandoc",Space,Str "Test",Space,Str "Suite"] ["John MacFarlane
 ] ] ]
 , Para [Str "Should",Space,Str "not",Space,Str "be",Space,Str "a",Space,Str "list",Space,Str "item:"]
 , Para [Str "M",Str ".",Str "A",Str ".",Space,Str "2007"]
+, Para [Str "B",Str ".",Space,Str "Williams"]
 , HorizontalRule
 , Header 1 [Str "Definition",Space,Str "Lists"]
 , Para [Str "Tight",Space,Str "using",Space,Str "spaces:"]
diff --git a/tests/testsuite.txt b/tests/testsuite.txt
index f8d9221b9..ae2d0fe32 100644
--- a/tests/testsuite.txt
+++ b/tests/testsuite.txt
@@ -285,6 +285,8 @@ Should not be a list item:
 
 M.A. 2007
 
+B. Williams
+
 * * * * *
 
 # Definition Lists
diff --git a/tests/writer.context b/tests/writer.context
index c0b633117..bc216177c 100644
--- a/tests/writer.context
+++ b/tests/writer.context
@@ -403,6 +403,8 @@ Should not be a list item:
 
 M.A. 2007
 
+B. Williams
+
 \thinrule
 
 \section{Definition Lists}
diff --git a/tests/writer.docbook b/tests/writer.docbook
index 7b3f39f34..6235838ed 100644
--- a/tests/writer.docbook
+++ b/tests/writer.docbook
@@ -648,6 +648,9 @@ These should not be escaped: \$ \\ \> \[ \{
     M.A. 2007
+
+    B. Williams
+
diff --git a/tests/writer.html b/tests/writer.html
index 46a95cde7..e674170da 100644
--- a/tests/writer.html
+++ b/tests/writer.html
@@ -453,6 +453,8 @@ These should not be escaped: \$ \\ \> \[ \{
 >Should not be a list item:

M.A. 2007

B. Williams


Definition Lists

\\[ \\\{\par}
 {\pard \ql \f0 \sa0 \li720 \fi-360 a.\tx360\tab Nested.\sa180\sa180\par}
 {\pard \ql \f0 \sa180 \li0 \fi0 Should not be a list item:\par}
 {\pard \ql \f0 \sa180 \li0 \fi0 M.A. 2007\par}
+{\pard \ql \f0 \sa180 \li0 \fi0 B. Williams\par}
 {\pard \qc \f0 \sa180 \li0 \fi0 \emdash\emdash\emdash\emdash\emdash\par}
 {\pard \ql \f0 \sa180 \li0 \fi0 \b \fs36 Definition Lists\par}
 {\pard \ql \f0 \sa180 \li0 \fi0 Tight using spaces:\par}
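
The spacing rule added to orderedListStart in the Markdown.hs hunk above can be sketched as a pair of pure functions, without Parsec. This is only an illustration under stated assumptions: the two data types are simplified local stand-ins for Pandoc's list attribute types, and the names needsTwoSpaces and acceptsSeparator are hypothetical helpers that do not appear in the patch. The condition and the list of single-letter roman numeral values are copied from the patch.

```haskell
-- Simplified local stand-ins for Pandoc's list attribute types
-- (assumption: only the constructors this sketch needs).
data ListNumberStyle = Decimal | LowerAlpha | UpperAlpha | LowerRoman | UpperRoman
  deriving (Eq, Show)

data ListNumberDelim = Period | OneParen | TwoParens
  deriving (Eq, Show)

-- | Hypothetical helper: True when the marker is a capital letter plus a
-- period, or a single-letter capital roman numeral (I, V, X, L, C, D, M)
-- plus a period. Same condition as in the patched orderedListStart.
needsTwoSpaces :: ListNumberStyle -> ListNumberDelim -> Int -> Bool
needsTwoSpaces style delim num =
  delim == Period &&
    (style == UpperAlpha ||
     (style == UpperRoman && num `elem` [1, 5, 10, 50, 100, 500, 1000]))

-- | Hypothetical helper: does the text right after the marker separate it
-- from the item text? Mirrors
--   char '\t' <|> (spaceChar >> spaceChar)   versus   spaceChar
-- where spaceChar is taken to match a space or a tab.
acceptsSeparator :: ListNumberStyle -> ListNumberDelim -> Int -> String -> Bool
acceptsSeparator style delim num rest
  | needsTwoSpaces style delim num =
      case rest of
        ('\t' : _)  -> True                           -- a single tab is enough
        (a : b : _) -> isSpaceChar a && isSpaceChar b -- otherwise two space chars
        _           -> False
  | otherwise =
      case rest of
        (c : _) -> isSpaceChar c                      -- one space char is enough
        _       -> False
  where
    isSpaceChar c = c == ' ' || c == '\t'
```

With these definitions, acceptsSeparator UpperAlpha Period 2 " Russell was an English philosopher." is False, so the line stays a paragraph, while the same marker followed by two spaces is accepted. A marker in parentheses such as (C) uses TwoParens, so a single space still suffices and the line is read as a list item, which is the case the README footnote handles with a backslash escape.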