Markdown writer: improved escaping.

`<` should not be escaped as `\<`, for compatibility with
original Markdown.  We now escape `<` and `>` with entities.
Also, we now backslash-escape square brackets.

Closes #2086.
John MacFarlane 2015-04-18 10:45:46 -07:00
parent 0b80073e7c
commit d20152e011
4 changed files with 20 additions and 19 deletions
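
The escaping rule described in the commit message can be sketched as a small standalone program. This is a simplified illustration, not the writer's actual code: `EscapeTable`, `backslashEscapesSketch`, `escapeWith`, and `escapeMarkdown` are hypothetical stand-ins defined here so the example does not depend on pandoc, and the extension-dependent characters (`^`, `~`, `$`) are left out.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for an escape table: each special
-- character is paired with the string that replaces it.
type EscapeTable = [(Char, String)]

-- Pair every listed character with its backslash-escaped form.
backslashEscapesSketch :: [Char] -> EscapeTable
backslashEscapesSketch = map (\c -> (c, ['\\', c]))

-- Replace each character that has an entry in the table;
-- leave all other characters unchanged.
escapeWith :: EscapeTable -> String -> String
escapeWith table = concatMap (\c -> fromMaybe [c] (lookup c table))

-- Assumed table after this commit: entities for '<' and '>',
-- backslash escapes for the rest (now including '[' and ']').
markdownEscapes :: EscapeTable
markdownEscapes =
  ('<', "&lt;") : ('>', "&gt;") : backslashEscapesSketch "\\`*_[]#"

escapeMarkdown :: String -> String
escapeMarkdown = escapeWith markdownEscapes

main :: IO ()
main = mapM_ (putStrLn . escapeMarkdown)
  [ "4 < 5."            -- prints: 4 &lt; 5.
  , "Left bracket: ["   -- prints: Left bracket: \[
  , "[bracketed text]"  -- prints: \[bracketed text\]
  ]
```

Because the entity entries for `<` and `>` are consed onto the front of the table and those characters no longer appear in the backslash list, each character has exactly one replacement.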

@@ -246,7 +246,8 @@ noteToMarkdown opts num blocks = do
 -- | Escape special characters for Markdown.
 escapeString :: WriterOptions -> String -> String
 escapeString opts = escapeStringUsing markdownEscapes
-where markdownEscapes = backslashEscapes specialChars
+where markdownEscapes = ('<', "&lt;") : ('>', "&gt;") :
+backslashEscapes specialChars
 specialChars =
 (if isEnabled Ext_superscript opts
 then ('^':)
@@ -257,7 +258,7 @@ escapeString opts = escapeStringUsing markdownEscapes
 (if isEnabled Ext_tex_math_dollars opts
 then ('$':)
 else id) $
-"\\`*_<>#"
+"\\`*_[]#"
 -- | Construct table of contents from list of header blocks.
 tableOfContents :: WriterOptions -> [Block] -> Doc

@@ -84,13 +84,13 @@ shortcutLinkRefsTests =
 ]
 , "Reference link is followed by text in brackets"
 =: (para ((link "/url" "" "link") <> "[text in brackets]"))
-=?> unlines [ "[link][][text in brackets]"
+=?> unlines [ "[link][]\\[text in brackets\\]"
 , ""
 , " [link]: /url"
 ]
 , "Reference link is followed by space and text in brackets"
 =: (para ((link "/url" "" "link") <> " [text in brackets]"))
-=?> unlines [ "[link][] [text in brackets]"
+=?> unlines [ "[link][] \\[text in brackets\\]"
 , ""
 , " [link]: /url"
 ]

@@ -80,7 +80,7 @@ E-mail style:
 >
 > > nested
-This should not be a block quote: 2 \> 1.
+This should not be a block quote: 2 &gt; 1.
 And a following paragraph.
@@ -581,9 +581,9 @@ AT&T is another way to write it.
 This & that.
-4 \< 5.
+4 &lt; 5.
-6 \> 5.
+6 &gt; 5.
 Backslash: \\
@@ -597,15 +597,15 @@ Left brace: {
 Right brace: }
-Left bracket: [
+Left bracket: \[
-Right bracket: ]
+Right bracket: \]
 Left paren: (
 Right paren: )
-Greater-than: \>
+Greater-than: &gt;
 Hash: \#
@@ -652,7 +652,7 @@ Foo [bar](/url/).
 Foo [bar](/url/).
-With [embedded [brackets]](/url/).
+With [embedded \[brackets\]](/url/).
 [b](/url/) by itself should be a link.
@@ -662,7 +662,7 @@ Indented [twice](/url).
 Indented [thrice](/url).
-This should [not][] be a link.
+This should \[not\]\[\] be a link.
 [not]: /url
@@ -716,8 +716,8 @@ Footnotes
 =========
 Here is a footnote reference,[^1] and another.[^2] This should *not* be a
-footnote reference, because it contains a space.[\^my note] Here is an inline
-note.[^3]
+footnote reference, because it contains a space.\[\^my note\] Here is an
+inline note.[^3]
 > Notes can go in quotes.[^4]
@@ -740,7 +740,7 @@ This paragraph should not be part of the note, as it is not indented.
 [^3]: This is *easier* to type. Inline notes may contain
 [links](http://google.com) and `]` verbatim characters, as well as
-[bracketed text].
+\[bracketed text\].
 [^4]: In quote.

@@ -26,7 +26,7 @@
</outline>
<outline text="Paragraphs" _note="Heres a regular paragraph.&#10;&#10;In Markdown 1.0.0 and earlier. Version 8. This line turns into a list&#10;item. Because a hard-wrapped line in the middle of a paragraph looked&#10;like a list item.&#10;&#10;Heres one with a bullet. \* criminey.&#10;&#10;There should be a hard line break\&#10;here.&#10;&#10;------------------------------------------------------------------------">
</outline>
-<outline text="Block Quotes" _note="E-mail style:&#10;&#10;&gt; This is a block quote. It is pretty short.&#10;&#10;&gt; Code in a block quote:&#10;&gt;&#10;&gt; sub status {&#10;&gt; print &quot;working&quot;;&#10;&gt; }&#10;&gt;&#10;&gt; A list:&#10;&gt;&#10;&gt; 1. item one&#10;&gt; 2. item two&#10;&gt;&#10;&gt; Nested block quotes:&#10;&gt;&#10;&gt; &gt; nested&#10;&gt;&#10;&gt; &gt; nested&#10;&#10;This should not be a block quote: 2 \&gt; 1.&#10;&#10;And a following paragraph.&#10;&#10;------------------------------------------------------------------------">
+<outline text="Block Quotes" _note="E-mail style:&#10;&#10;&gt; This is a block quote. It is pretty short.&#10;&#10;&gt; Code in a block quote:&#10;&gt;&#10;&gt; sub status {&#10;&gt; print &quot;working&quot;;&#10;&gt; }&#10;&gt;&#10;&gt; A list:&#10;&gt;&#10;&gt; 1. item one&#10;&gt; 2. item two&#10;&gt;&#10;&gt; Nested block quotes:&#10;&gt;&#10;&gt; &gt; nested&#10;&gt;&#10;&gt; &gt; nested&#10;&#10;This should not be a block quote: 2 &amp;gt; 1.&#10;&#10;And a following paragraph.&#10;&#10;------------------------------------------------------------------------">
</outline>
<outline text="Code Blocks" _note="Code:&#10;&#10; ---- (should be four hyphens)&#10;&#10; sub status {&#10; print &quot;working&quot;;&#10; }&#10;&#10; this code block is indented by one tab&#10;&#10;And:&#10;&#10; this code block is indented by two tabs&#10;&#10; These should not be escaped: \$ \\ \&gt; \[ \{&#10;&#10;------------------------------------------------------------------------">
</outline>
@@ -52,12 +52,12 @@
</outline>
<outline text="LaTeX" _note="- \cite[22-23]{smith.1899}&#10;- $2+2=4$&#10;- $x \in y$&#10;- $\alpha \wedge \omega$&#10;- $223$&#10;- $p$-Tree&#10;- Heres some display math:&#10; $$\frac{d}{dx}f(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}$$&#10;- Heres one that has a line break in it:&#10; $\alpha + \omega \times x^2$.&#10;&#10;These shouldnt be math:&#10;&#10;- To get the famous equation, write `$e = mc^2$`.&#10;- \$22,000 is a *lot* of money. So is \$34,000. (It worked if “lot”&#10; is emphasized.)&#10;- Shoes (\$20) and socks (\$5).&#10;- Escaped `$`: \$73 *this should be emphasized* 23\$.&#10;&#10;Heres a LaTeX table:&#10;&#10;\begin{tabular}{|l|l|}\hline&#10;Animal &amp; Number \\ \hline&#10;Dog &amp; 2 \\&#10;Cat &amp; 1 \\ \hline&#10;\end{tabular}&#10;&#10;------------------------------------------------------------------------">
</outline>
-<outline text="Special Characters" _note="Here is some unicode:&#10;&#10;- I hat: Î&#10;- o umlaut: ö&#10;- section: §&#10;- set membership: ∈&#10;- copyright: ©&#10;&#10;AT&amp;T has an ampersand in their name.&#10;&#10;AT&amp;T is another way to write it.&#10;&#10;This &amp; that.&#10;&#10;4 \&lt; 5.&#10;&#10;6 \&gt; 5.&#10;&#10;Backslash: \\&#10;&#10;Backtick: \`&#10;&#10;Asterisk: \*&#10;&#10;Underscore: \_&#10;&#10;Left brace: {&#10;&#10;Right brace: }&#10;&#10;Left bracket: [&#10;&#10;Right bracket: ]&#10;&#10;Left paren: (&#10;&#10;Right paren: )&#10;&#10;Greater-than: \&gt;&#10;&#10;Hash: \#&#10;&#10;Period: .&#10;&#10;Bang: !&#10;&#10;Plus: +&#10;&#10;Minus: -&#10;&#10;------------------------------------------------------------------------">
+<outline text="Special Characters" _note="Here is some unicode:&#10;&#10;- I hat: Î&#10;- o umlaut: ö&#10;- section: §&#10;- set membership: ∈&#10;- copyright: ©&#10;&#10;AT&amp;T has an ampersand in their name.&#10;&#10;AT&amp;T is another way to write it.&#10;&#10;This &amp; that.&#10;&#10;4 &amp;lt; 5.&#10;&#10;6 &amp;gt; 5.&#10;&#10;Backslash: \\&#10;&#10;Backtick: \`&#10;&#10;Asterisk: \*&#10;&#10;Underscore: \_&#10;&#10;Left brace: {&#10;&#10;Right brace: }&#10;&#10;Left bracket: \[&#10;&#10;Right bracket: \]&#10;&#10;Left paren: (&#10;&#10;Right paren: )&#10;&#10;Greater-than: &amp;gt;&#10;&#10;Hash: \#&#10;&#10;Period: .&#10;&#10;Bang: !&#10;&#10;Plus: +&#10;&#10;Minus: -&#10;&#10;------------------------------------------------------------------------">
</outline>
<outline text="Links">
<outline text="Explicit" _note="Just a [URL](/url/).&#10;&#10;[URL and title](/url/ &quot;title&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by two spaces&quot;).&#10;&#10;[URL and title](/url/ &quot;title preceded by a tab&quot;).&#10;&#10;[URL and title](/url/ &quot;title with &quot;quotes&quot; in it&quot;)&#10;&#10;[URL and title](/url/ &quot;title with single quotes&quot;)&#10;&#10;[with\_underscore](/url/with_underscore)&#10;&#10;[Email link](mailto:nobody@nowhere.net)&#10;&#10;[Empty]().">
</outline>
-<outline text="Reference" _note="Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;With [embedded [brackets]](/url/).&#10;&#10;[b](/url/) by itself should be a link.&#10;&#10;Indented [once](/url).&#10;&#10;Indented [twice](/url).&#10;&#10;Indented [thrice](/url).&#10;&#10;This should [not][] be a link.&#10;&#10; [not]: /url&#10;&#10;Foo [bar](/url/ &quot;Title with &quot;quotes&quot; inside&quot;).&#10;&#10;Foo [biz](/url/ &quot;Title with &quot;quote&quot; inside&quot;).">
+<outline text="Reference" _note="Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;Foo [bar](/url/).&#10;&#10;With [embedded \[brackets\]](/url/).&#10;&#10;[b](/url/) by itself should be a link.&#10;&#10;Indented [once](/url).&#10;&#10;Indented [twice](/url).&#10;&#10;Indented [thrice](/url).&#10;&#10;This should \[not\]\[\] be a link.&#10;&#10; [not]: /url&#10;&#10;Foo [bar](/url/ &quot;Title with &quot;quotes&quot; inside&quot;).&#10;&#10;Foo [biz](/url/ &quot;Title with &quot;quote&quot; inside&quot;).">
</outline>
<outline text="With ampersands" _note="Heres a [link with an ampersand in the&#10;URL](http://example.com/?foo=1&amp;bar=2).&#10;&#10;Heres a link with an amersand in the link text:&#10;[AT&amp;T](http://att.com/ &quot;AT&amp;T&quot;).&#10;&#10;Heres an [inline link](/script?foo=1&amp;bar=2).&#10;&#10;Heres an [inline link in pointy braces](/script?foo=1&amp;bar=2).">
</outline>
@@ -66,7 +66,7 @@
</outline>
<outline text="Images" _note="From “Voyage dans la Lune” by Georges Melies (1902):&#10;&#10;![lalune](lalune.jpg &quot;Voyage dans la Lune&quot;)&#10;&#10;Here is a movie ![movie](movie.jpg) icon.&#10;&#10;------------------------------------------------------------------------">
</outline>
-<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be&#10;a footnote reference, because it contains a space.[\^my note] Here is an&#10;inline note.[^3]&#10;&#10;&gt; Notes can go in quotes.[^4]&#10;&#10;1. And in list items.[^5]&#10;&#10;This paragraph should not be part of the note, as it is not indented.&#10;&#10;[^1]: Here is the footnote. It can go anywhere after the footnote&#10; reference. It need not be placed at the end of the document.&#10;&#10;[^2]: Heres the long note. This one contains multiple blocks.&#10;&#10; Subsequent blocks are indented to show that they belong to the&#10; footnote (as with list items).&#10;&#10; { &lt;code&gt; }&#10;&#10; If you want, you can indent every line, but you can also be lazy and&#10; just indent the first line of each block.&#10;&#10;[^3]: This is *easier* to type. Inline notes may contain&#10; [links](http://google.com) and `]` verbatim characters, as well as&#10; [bracketed text].&#10;&#10;[^4]: In quote.&#10;&#10;[^5]: In list.">
+<outline text="Footnotes" _note="Here is a footnote reference,[^1] and another.[^2] This should *not* be&#10;a footnote reference, because it contains a space.\[\^my note\] Here is&#10;an inline note.[^3]&#10;&#10;&gt; Notes can go in quotes.[^4]&#10;&#10;1. And in list items.[^5]&#10;&#10;This paragraph should not be part of the note, as it is not indented.&#10;&#10;[^1]: Here is the footnote. It can go anywhere after the footnote&#10; reference. It need not be placed at the end of the document.&#10;&#10;[^2]: Heres the long note. This one contains multiple blocks.&#10;&#10; Subsequent blocks are indented to show that they belong to the&#10; footnote (as with list items).&#10;&#10; { &lt;code&gt; }&#10;&#10; If you want, you can indent every line, but you can also be lazy and&#10; just indent the first line of each block.&#10;&#10;[^3]: This is *easier* to type. Inline notes may contain&#10; [links](http://google.com) and `]` verbatim characters, as well as&#10; \[bracketed text\].&#10;&#10;[^4]: In quote.&#10;&#10;[^5]: In list.">
</outline>
</body>
</opml>