Paragraphs can be followed by lists, even if there is no blank line
between the two blocks. However, this should only be true if the
paragraph is not within a list, were the preceding block should be
parsed as a plain instead of paragraph (to allow for compact lists).
Thanks to @rgaiacs for bringing this up.
This fixes#2464.
Tightened up the inline HTML parser so it disallows
TagWarnings.
This only affects the markdown reader when the `markdown_in_html_blocks`
option is disabled.
Closes#2469.
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude,
but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.
This allows us to remove many imports of Data.Monoid
and Control.Applicative, and remove Text.Pandoc.Compat.Monoid.
It should allow us to use -Wall again for ghc 7.10.
If we're producing a fragment, just skip normalization.
After all, the fragment might be somewhere in the middle
of the document. It's more important for fragments to
have consistency in rendering (so they can be pieced
together) than to normalize.
This closes#2394. It's simpler and more robust than
my earlier fix.
'xref' is used to create cross references to other parts of the
document. It is an empty element - the cross reference text depends on
various attributes. Quoting 'DocBook: The Definitive Guide':
1. If the endterm attribute is specified on xref, the content of the
element pointed to by endterm will be used as the text of the
cross-reference.
2. Otherwise, if the object pointed to has a specified XRefLabel, the
content of that attribute will be used as the cross-reference text.
The previous verse parsing code made the faulty assumption that empty
strings are valid (and empty) inlines. This isn't the case, so lines
are changed to contain at least a newline.
It would generally be nicer and faster to keep the newlines while
splitting the string. However, this would require more code, which
seems unjustified for a simple (and fairly rare) block as *verse*.
This fixes#2402.
Technically this isn't allowed in an HTML comment, but
we've always allowed it, and so do most other implementations.
It is handy if e.g. you want to put command line arguments
in HTML comments.
Previously we disallowed `-` at the end of an autolink,
and disallowed the combination `=-`.
This commit liberalizes the rules for allowing punctuation in
a bare URI.
Added test cases.
One potential drawback is that you can no longer put a bare
URI in em dashes like this
this uri---http://example.com---is an example.
But in this respect we now match github's treatment of bare URIs.
Closes#2299.
Org mode allows headers to be tagged:
``` org-mode
* Headline :TAG1:TAG2:
```
Instead of being interpreted as part of the headline, the tags are now
put into the attributes of empty spans. Spans without textual content
won't be visible by default, but they are detectable by filters. They
can also be styled using CSS when written as HTML.
This fixes#2160.
We only support the href attribute, as there's no place for
"target" in the Pandoc document model for links.
Added HTML reader test module, with tests for this feature.
Closes#1751.
Instead, use a forward-slash to join paths, regardless of the
platform. This matches the way MediaBag now works.
See
56e4ecab20 (commitcomment-10858449)
`<` should not be escaped as `\<`, for compatibility with
original Markdown. We now escape `<` and `>` with entities.
Also, we now backslash-escape square brackets.
Closes#2086.
Previously the body of the definition (after the `:` or `~` marker)
needed to be in column 4. This commit relaxes that requirement,
to better match the behavior of PHP Markdown Extra. So, now
this is a valid definition list:
foo
: bar
This patch also helps resolve a potentially ambiguity with table
captions:
foo
: bar
-----
table
-----
Is "bar" a definition, or the caption for the table? We'll count
it as a caption for the table.
Closes#2087.
If the tag parses as a comment, we check to see if the
input starts with `<!--`. If not, it's bogus comment mode
and we fail htmlTag.
Includes test case. Closes#1820.
Issue #1977
Most markdown processors support the [shortcut format] for reference links.
Pandoc's markdown reader parsed this shortcuts unoptionally.
Pandoc's markdown writer (with --reference-links option) never shortcutted links.
This commit adds an extension `shortcut_reference_links`. The extension is
enabled by default for those markdown flavors that support reading shortcut
reference links, namely:
- pandoc
- strict pandoc
- github flavoured
- PHPmarkdown
If extension is enabled, reader parses the shortcuts in the same way as
it preveously did. Otherwise it would parse them as normal text.
If extension is enabled, writer outputs shortcut reference links unless
doing so would cause problems (see test cases in `tests/Tests/Writers/Markdown.hs`).
The `tabular` environment allows non-empty column separators
with the "@{...}" syntax. Previously, pandoc would fail to
parse tables if a non-empty colsep was present. With this
commit, these separators are still ignored, but the table gets
parsed. A test case is included.
The `tabular` environment takes an optional parameter for
vertical alignment. Previously, pandoc would fail to parse
tables if this parameter was present. With this commit,
the parameter is still ignored, but the table gets
parsed. A test case is included.
Org links like `[[file:target][title]]` were not handled correctly,
parsing the link target verbatim. The org reader is changed such that
the leading `file:` is dropped from the link target.
This is related to issues #756 and #1812.
Move recursive role lookup from renderRole to addNewRole. The Attr value
will be the same for every occurance of this role, so there's no reason
to compute it every time. This allows simplifying the
stateRstCustomRoles map considerably.
We could go even further, and remove the fmt and attr arguments to
renderRole, which are null except for custom roles.
The class directive accepts one or more class names, and creates a Div
value with those classes. If the directive has an indented body, the
body is parsed as the children of the Div. If not, the first block
folowing the directive is made a child of the Div.
This differs from the behavior of rst2xml, which does not create a Div
element. Instead, the specified classes are applied to each child of
the directive. However, most Pandoc Block constructors to not take an
Attr argument, so we can't duplicate this behavior.
closes#65
RST quoted literal blocks are the same as indented literal blocks (which
pandoc already supports) except that the quote character is preserved in
each line.
This includes test cases for the quoted literal block, as well as
additional tests for line blocks and indented literal blocks, to verify
that these are unaffected by the changes.
While empty links are not allowed in Emacs org-mode, Pandoc org-mode
should support them: gitit relies on empty links as they are used to
create wiki links.
Fixesjgm/gitit#471
The org reader was to restrictive when parsing links, some relative
links and links to files given as absolute paths were not recognized
correctly. The org reader's link parsing function was amended to handle
such cases properly.
This fixes#1741
Document trees under a header starting with the word `COMMENT` are
comment trees and should not be exported. Those trees are dropped
silently.
This closes#1678.
Things like `/hello,/` or `/hi'/` were falsy recognized as emphasised
strings. This is wrong, as `,` and `'` are forbidden border chars and
may not occur on the inner border of emphasized text. This patch
enables the reader to matches the reference implementation in that it
reads the above strings as plain text.
Fixes issue with top-level bullet list parsing.
Previously we would use `many1 spaceChars` rather than respecting
the list's indent level. We also permitted `*` bullets on unindented
lists, which should unambiguously parse as `header 1`.
Combined, this meant headers at a different indent level were
being unwittingly slurped into preceding bullet lists, as per
Issue #1650.