Creole reader (#4002)

* Basic skeleton for creole reader.

No real functionality besides preliminary bold and italics yet.

* Creole: add support for bold/italic with implicit end at paragraph end.

* Creole: add support for headings.

* Creole: add support for tilde escaped chars.

* Add a test suite for the creole parser

So far this covers only things the parser already supports.

* Added simple parsing of flat unordered lists.

* Added tests for unordered lists in creole.

* First, wrong(!) implementation of sublists.

Fails test, as sublists should not be embedded in a list item!

* Implementation of unordered sublists.

* Added support for ordered lists to creole reader.

* Added utility function to append parsers to Creole reader.

* Creole reader: Fixed list item end detection in sub lists.

* Tests for creole reader: added more tests for lists.

Covering ordered and unordered lists, even mixed.  Tests for
formatting in list items are still missing...

* Added "nowiki" blocks.  One exception rule is missing...

* Creole reader: nowiki: implemented exception for curly brackets.

* Creole reader: added inline nowiki.

* Creole reader: added horizontalRule.

* Creole reader: added auto linking of URIs.

* Creole reader: detect horizontalRule as para end.

Used the opportunity for a little refactoring.

* Creole reader: added forced line breaks.

Including test.

* Creole reader: implement wiki links.

* Creole reader: added image support.

* Creole reader: support images as links.

* Creole reader: implemented placeholders -- by simply dropping them.

* Creole reader: added tests for links.

After observing a regression, it was really time...  ;-)

* Creole reader: fixed links with names.

* Creole reader: allow space after first of enclosing tags.

A space after the opening formatting tag is allowed in Creole,
e.g. "there is // italic text // in here" is legal.

This problem was discovered using the creole1.0test.txt document from
http://www.wikicreole.org/wiki/Creole1.0TestCases

See l.57:
# // italic item 3 //

* Creole reader: fixed links without names.

* Creole reader: Tests, sorted into groups.

* Creole reader: implemented tables.

* Removed redundant import.

* Creole reader: add correct escaping of links.

* Creole reader: allow handling of e.g. links in parentheses and quotes.

* Creole reader: Modified disclaimer as most of the code is actually by me.

* Creole reader: Tests: added escaped links.

* Creole reader: preserve leading and trailing space in bold/italic.

* Creole reader: detect tables without a leading blank line.

* Creole Reader: added official creole1.0test.txt as "old" test.

The base document was downloaded from
http://www.wikicreole.org/wiki/Creole1.0TestCases.
The Wiki, and therefore the test document, is

Copyright (C) by the contributors.
Some rights reserved, license CC BY-SA.

http://creativecommons.org/licenses/by-sa/1.0/
Sascha Wilde 2017-10-29 18:28:50 +01:00 committed by John MacFarlane
parent d06bbeeaac
commit e8be4b0b6d
3 changed files with 236 additions and 0 deletions
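
The commit message notes that italics may start with a space after the opening tag, keep their leading and trailing space, and end implicitly at the paragraph end. The following is a minimal, self-contained sketch of that behaviour, not the reader's actual code: the simplified `Inline` type and the names `parseInlines`, `parseUntil` and `spanWord` are hypothetical, and the sketch deliberately ignores URLs, bold, escapes and every other construct the real reader handles.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical, simplified stand-in for pandoc's Inline type.
data Inline = Str String | Space | Emph [Inline]
  deriving (Eq, Show)

-- Parse the text of a single paragraph into inlines.
parseInlines :: String -> [Inline]
parseInlines = fst . parseUntil False

-- Parse until the closing "//" (when inside emphasis) or the end of the
-- paragraph; returns the inlines and the unconsumed input.
parseUntil :: Bool -> String -> ([Inline], String)
parseUntil _ [] = ([], [])  -- paragraph end closes emphasis implicitly
parseUntil inEmph s@(c:cs)
  | "//" `isPrefixOf` s =
      if inEmph
        then ([], drop 2 s)  -- closing tag ends the emphasised span
        else let (body, rest)  = parseUntil True (drop 2 s)
                 (more, rest') = parseUntil False rest
             in (Emph body : more, rest')
  | c == ' ' =
      -- A run of spaces becomes one Space, also right after "//".
      let (more, rest) = parseUntil inEmph (dropWhile (== ' ') cs)
      in (Space : more, rest)
  | otherwise =
      let (w, afterWord) = spanWord s
          (more, rest)   = parseUntil inEmph afterWord
      in (Str w : more, rest)

-- Take characters up to the next space or "//" tag.
spanWord :: String -> (String, String)
spanWord [] = ([], [])
spanWord s@(c:cs)
  | c == ' ' || "//" `isPrefixOf` s = ([], s)
  | otherwise = let (w, rest) = spanWord cs in (c : w, rest)

main :: IO ()
main = do
  print (parseInlines "there is // italic text // in here")
  print (parseInlines "unterminated //italic to the end")
```

For the first input the sketch yields `Emph [Space,Str "italic",Space,Str "text",Space]` between the surrounding words, which has the same shape as the `Emph` entry for "italic item 3" in test/creole-reader.native.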


@ -156,6 +156,10 @@ tests = [ testGroup "markdown"
, testGroup "ms"
[ testGroup "writer" $ writerTests "ms"
]
, testGroup "creole"
[ test "reader" ["-r", "creole", "-w", "native", "-s"]
"creole-reader.txt" "creole-reader.native"
]
]
-- makes sure file is fully closed after reading

test/creole-reader.native Normal file

@ -0,0 +1,95 @@
Pandoc (Meta {unMeta = fromList []})
[Header 1 ("",[],[]) [Str "Top-level heading (1)"]
,Header 2 ("",[],[]) [Str "This a test for creole 0.1 (2)"]
,Header 3 ("",[],[]) [Str "This is a Subheading (3)"]
,Header 4 ("",[],[]) [Str "Subsub (4)"]
,Header 5 ("",[],[]) [Str "Subsubsub (5)"]
,Para [Str "The",Space,Str "ending",Space,Str "equal",Space,Str "signs",Space,Str "should",Space,Str "not",Space,Str "be",Space,Str "displayed:"]
,Header 1 ("",[],[]) [Str "Top-level heading (1)"]
,Header 2 ("",[],[]) [Str "This a test for creole 0.1 (2)"]
,Header 3 ("",[],[]) [Str "This is a Subheading (3)"]
,Header 4 ("",[],[]) [Str "Subsub (4)"]
,Header 5 ("",[],[]) [Str "Subsubsub (5)"]
,Para [Str "You",Space,Str "can",Space,Str "make",Space,Str "things",Space,Strong [Str "bold"],Space,Str "or",Space,Emph [Str "italic"],Space,Str "or",Space,Strong [Emph [Str "both"]],Space,Str "or",Space,Emph [Strong [Str "both"]],Str "."]
,Para [Str "Character",Space,Str "formatting",Space,Str "extends",Space,Str "across",Space,Str "line",Space,Str "breaks:",Space,Strong [Str "bold,",Space,Str "this",Space,Str "is",Space,Str "still",Space,Str "bold.",Space,Str "This",Space,Str "line",Space,Str "deliberately",Space,Str "does",Space,Str "not",Space,Str "end",Space,Str "in",Space,Str "star-star."]]
,Para [Str "Not",Space,Str "bold.",Space,Str "Character",Space,Str "formatting",Space,Str "does",Space,Str "not",Space,Str "cross",Space,Str "paragraph",Space,Str "boundaries."]
,Para [Str "You",Space,Str "can",Space,Str "use",Space,Link ("",[],[]) [Str "internal links"] ("internal links",""),Space,Str "or",Space,Link ("",[],[]) [Str "external links"] ("http://www.wikicreole.org",""),Str ",",Space,Str "give",Space,Str "the",Space,Str "link",Space,Str "a",Space,Link ("",[],[]) [Str "different"] ("internal links",""),Space,Str "name."]
,Para [Str "Here's",Space,Str "another",Space,Str "sentence:",Space,Str "This",Space,Str "wisdom",Space,Str "is",Space,Str "taken",Space,Str "from",Space,Link ("",[],[]) [Str "Ward Cunningham's"] ("Ward Cunningham's",""),Space,Link ("",[],[]) [Str "Presentation at the Wikisym 06"] ("http://www.c2.com/doc/wikisym/WikiSym2006.pdf",""),Str "."]
,Para [Str "Here's",Space,Str "a",Space,Str "external",Space,Str "link",Space,Str "without",Space,Str "a",Space,Str "description:",Space,Link ("",[],[]) [Str "http://www.wikicreole.org"] ("http://www.wikicreole.org","")]
,Para [Str "Be",Space,Str "careful",Space,Str "that",Space,Str "italic",Space,Str "links",Space,Str "are",Space,Str "rendered",Space,Str "properly:",Space,Emph [Link ("",[],[]) [Str "My Book Title"] ("http://my.book.example/","")]]
,Para [Str "Free",Space,Str "links",Space,Str "without",Space,Str "braces",Space,Str "should",Space,Str "be",Space,Str "rendered",Space,Str "as",Space,Str "well,",Space,Str "like",Space,Link ("",[],[]) [Str "http://www.wikicreole.org/"] ("http://www.wikicreole.org/",""),Space,Str "and",Space,Link ("",[],[]) [Str "http://www.wikicreole.org/users/~example"] ("http://www.wikicreole.org/users/~example",""),Str "."]
,Para [Str "Creole1.0",Space,Str "specifies",Space,Str "that",Space,Link ("",[],[]) [Str "http://bar"] ("http://bar",""),Space,Str "and",Space,Link ("",[],[]) [Str "ftp://bar"] ("ftp://bar",""),Space,Str "should",Space,Str "not",Space,Str "render",Space,Str "italic,",Space,Str "something",Space,Str "like",Space,Str "foo:",Emph [Str "bar",Space,Str "should",Space,Str "render",Space,Str "as",Space,Str "italic."]]
,Para [Str "You",Space,Str "can",Space,Str "use",Space,Str "this",Space,Str "to",Space,Str "draw",Space,Str "a",Space,Str "line",Space,Str "to",Space,Str "separate",Space,Str "the",Space,Str "page:"]
,HorizontalRule
,Para [Str "You",Space,Str "can",Space,Str "use",Space,Str "lists,",Space,Str "start",Space,Str "it",Space,Str "at",Space,Str "the",Space,Str "first",Space,Str "column",Space,Str "for",Space,Str "now,",Space,Str "please..."]
,Para [Str "unnumbered",Space,Str "lists",Space,Str "are",Space,Str "like"]
,BulletList
[[Plain [Str "item",Space,Str "a"]]
,[Plain [Str "item",Space,Str "b"]]
,[Plain [Strong [Str "bold",Space,Str "item",Space,Str "c"]]]]
,Para [Str "blank",Space,Str "space",Space,Str "is",Space,Str "also",Space,Str "permitted",Space,Str "before",Space,Str "lists",Space,Str "like:"]
,BulletList
[[Plain [Str "item",Space,Str "a"]]
,[Plain [Str "item",Space,Str "b"]]
,[Plain [Str "item",Space,Str "c"]
,BulletList
[[Plain [Str "item",Space,Str "c.a"]]]]]
,Para [Str "or",Space,Str "you",Space,Str "can",Space,Str "number",Space,Str "them"]
,OrderedList (1,DefaultStyle,DefaultDelim)
[[Plain [Link ("",[],[]) [Str "item 1"] ("item 1","")]]
,[Plain [Str "item",Space,Str "2"]]
,[Plain [Emph [Space,Str "italic",Space,Str "item",Space,Str "3",Space]]
,OrderedList (1,DefaultStyle,DefaultDelim)
[[Plain [Str "item",Space,Str "3.1"]]
,[Plain [Str "item",Space,Str "3.2"]]]]]
,Para [Str "up",Space,Str "to",Space,Str "five",Space,Str "levels"]
,BulletList
[[Plain [Str "1"]
,BulletList
[[Plain [Str "2"]
,BulletList
[[Plain [Str "3"]
,BulletList
[[Plain [Str "4"]
,BulletList
[[Plain [Str "5"]]]]]]]]]]]
,BulletList
[[Plain [Str "You",Space,Str "can",Space,Str "have",Space,Str "multiline",Space,Str "list",Space,Str "items"]]
,[Plain [Str "this",Space,Str "is",Space,Str "a",Space,Str "second",Space,Str "multiline",Space,Str "list",Space,Str "item"]]]
,Para [Str "You",Space,Str "can",Space,Str "use",Space,Str "nowiki",Space,Str "syntax",Space,Str "if",Space,Str "you",Space,Str "would",Space,Str "like",Space,Str "do",Space,Str "stuff",Space,Str "like",Space,Str "this:"]
,CodeBlock ("",[],[]) "Guitar Chord C:\n\n||---|---|---|\n||-0-|---|---|\n||---|---|---|\n||---|-0-|---|\n||---|---|-0-|\n||---|---|---|"
,Para [Str "You",Space,Str "can",Space,Str "also",Space,Str "use",Space,Str "it",Space,Str "inline",Space,Str "nowiki",Space,Code ("",[],[]) " in a sentence ",Space,Str "like",Space,Str "this."]
,Header 1 ("",[],[]) [Str "Escapes"]
,Para [Str "Normal",Space,Str "Link:",Space,Link ("",[],[]) [Str "http://wikicreole.org/"] ("http://wikicreole.org/",""),Space,Str "-",Space,Str "now",Space,Str "same",Space,Str "link,",Space,Str "but",Space,Str "escaped:",Space,Str "http://wikicreole.org/"]
,Para [Str "Normal",Space,Str "asterisks:",Space,Str "**not",Space,Str "bold**"]
,Para [Str "a",Space,Str "tilde",Space,Str "alone:",Space,Str "~"]
,Para [Str "a",Space,Str "tilde",Space,Str "escapes",Space,Str "itself:",Space,Str "~xxx"]
,Header 3 ("",[],[]) [Str "Creole 0.2"]
,Para [Str "This",Space,Str "should",Space,Str "be",Space,Str "a",Space,Str "flower",Space,Str "with",Space,Str "the",Space,Str "ALT",Space,Str "text",Space,Str "\"this",Space,Str "is",Space,Str "a",Space,Str "flower\"",Space,Str "if",Space,Str "your",Space,Str "wiki",Space,Str "supports",Space,Str "ALT",Space,Str "text",Space,Str "on",Space,Str "images:"]
,Para [Image ("",[],[]) [Str "here is a red flower"] ("Red-Flower.jpg","")]
,Header 3 ("",[],[]) [Str "Creole 0.4"]
,Para [Str "Tables",Space,Str "are",Space,Str "done",Space,Str "like",Space,Str "this:"]
,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
[[Plain [Str "header",Space,Str "col1"]]
,[Plain [Str "header",Space,Str "col2"]]]
[[[Plain [Str "col1"]]
,[Plain [Str "col2"]]]
,[[Plain [Str "you"]]
,[Plain [Str "can"]]]
,[[Plain [Str "also"]]
,[Plain [Str "align",LineBreak,Str "it."]]]]
,Para [Str "You",Space,Str "can",Space,Str "format",Space,Str "an",Space,Str "address",Space,Str "by",Space,Str "simply",Space,Str "forcing",Space,Str "linebreaks:"]
,Para [Str "My",Space,Str "contact",Space,Str "dates:",LineBreak,Str "Pone:",Space,Str "xyz",LineBreak,Str "Fax:",Space,Str "+45",LineBreak,Str "Mobile:",Space,Str "abc"]
,Header 3 ("",[],[]) [Str "Creole 0.5"]
,Table [] [AlignDefault,AlignDefault] [0.0,0.0]
[[Plain [Str "Header",Space,Str "title"]]
,[Plain [Str "Another",Space,Str "header",Space,Str "title"]]]
[[[Plain [Code ("",[],[]) " //not italic text// "]]
,[Plain [Code ("",[],[]) " **not bold text** "]]]
,[[Plain [Emph [Str "italic",Space,Str "text"]]]
,[Plain [Strong [Space,Str "bold",Space,Str "text",Space]]]]]
,Header 3 ("",[],[]) [Str "Creole 1.0"]
,Para [Str "If",Space,Str "interwiki",Space,Str "links",Space,Str "are",Space,Str "setup",Space,Str "in",Space,Str "your",Space,Str "wiki,",Space,Str "this",Space,Str "links",Space,Str "to",Space,Str "the",Space,Str "WikiCreole",Space,Str "page",Space,Str "about",Space,Str "Creole",Space,Str "1.0",Space,Str "test",Space,Str "cases:",Space,Link ("",[],[]) [Str "WikiCreole:Creole1.0TestCases"] ("WikiCreole:Creole1.0TestCases",""),Str "."]
,HorizontalRule
,Para [Str "The",Space,Str "above",Space,Str "test",Space,Str "document",Space,Str "was",Space,Str "found",Space,Str "on",Space,Link ("",[],[]) [Str "http://www.wikicreole.org/wiki/Creole1.0TestCases"] ("http://www.wikicreole.org/wiki/Creole1.0TestCases",""),Space,Str "and",Space,Str "downloaded",Space,Str "from",Space,Link ("",[],[]) [Str "http://www.wikicreole.org/attach/Creole1.0TestCases/creole1.0test.txt"] ("http://www.wikicreole.org/attach/Creole1.0TestCases/creole1.0test.txt",""),Str "."]
,Para [Str "The",Space,Str "Creole",Space,Str "Wiki",Space,Str "is",Space,Str "licensed:",Space,Str "Copyright",Space,Str "(C)",Space,Str "by",Space,Str "the",Space,Str "contributors.",Space,Str "Some",Space,Str "rights",Space,Str "reserved,",Space,Str "license",Space,Link ("",[],[]) [Str "https://creativecommons.org/licenses/by-sa/1.0/"] ("BY-SA",""),Str "."]]

test/creole-reader.txt Normal file

@ -0,0 +1,137 @@
= Top-level heading (1)
== This a test for creole 0.1 (2)
=== This is a Subheading (3)
==== Subsub (4)
===== Subsubsub (5)
The ending equal signs should not be displayed:
= Top-level heading (1) =
== This a test for creole 0.1 (2) ==
=== This is a Subheading (3) ===
==== Subsub (4) ====
===== Subsubsub (5) =====
You can make things **bold** or //italic// or **//both//** or //**both**//.
Character formatting extends across line breaks: **bold,
this is still bold. This line deliberately does not end in star-star.
Not bold. Character formatting does not cross paragraph boundaries.
You can use [[internal links]] or [[http://www.wikicreole.org|external links]],
give the link a [[internal links|different]] name.
Here's another sentence: This wisdom is taken from [[Ward Cunningham's]]
[[http://www.c2.com/doc/wikisym/WikiSym2006.pdf|Presentation at the Wikisym 06]].
Here's a external link without a description: [[http://www.wikicreole.org]]
Be careful that italic links are rendered properly: //[[http://my.book.example/|My Book Title]]//
Free links without braces should be rendered as well, like http://www.wikicreole.org/ and http://www.wikicreole.org/users/~example.
Creole1.0 specifies that http://bar and ftp://bar should not render italic,
something like foo://bar should render as italic.
You can use this to draw a line to separate the page:
----
You can use lists, start it at the first column for now, please...
unnumbered lists are like
* item a
* item b
* **bold item c**
blank space is also permitted before lists like:
* item a
* item b
* item c
** item c.a
or you can number them
# [[item 1]]
# item 2
# // italic item 3 //
## item 3.1
## item 3.2
up to five levels
* 1
** 2
*** 3
**** 4
***** 5
* You can have
multiline list items
* this is a second multiline
list item
You can use nowiki syntax if you would like do stuff like this:
{{{
Guitar Chord C:
||---|---|---|
||-0-|---|---|
||---|---|---|
||---|-0-|---|
||---|---|-0-|
||---|---|---|
}}}
You can also use it inline nowiki {{{ in a sentence }}} like this.
= Escapes =
Normal Link: http://wikicreole.org/ - now same link, but escaped: ~http://wikicreole.org/
Normal asterisks: ~**not bold~**
a tilde alone: ~
a tilde escapes itself: ~~xxx
=== Creole 0.2 ===
This should be a flower with the ALT text "this is a flower" if your wiki supports ALT text on images:
{{Red-Flower.jpg|here is a red flower}}
=== Creole 0.4 ===
Tables are done like this:
|=header col1|=header col2|
|col1|col2|
|you |can |
|also |align\\ it. |
You can format an address by simply forcing linebreaks:
My contact dates:\\
Pone: xyz\\
Fax: +45\\
Mobile: abc
=== Creole 0.5 ===
|= Header title |= Another header title |
| {{{ //not italic text// }}} | {{{ **not bold text** }}} |
| //italic text// | ** bold text ** |
=== Creole 1.0 ===
If interwiki links are setup in your wiki, this links to the WikiCreole page about Creole 1.0 test cases: [[WikiCreole:Creole1.0TestCases]].
----
The above test document was found on
http://www.wikicreole.org/wiki/Creole1.0TestCases and downloaded from
http://www.wikicreole.org/attach/Creole1.0TestCases/creole1.0test.txt.
The Creole Wiki is licensed: Copyright (C) by the contributors. Some
rights reserved, license
[[BY-SA|https://creativecommons.org/licenses/by-sa/1.0/]].