Fixes issue with top-level bullet list parsing.
Previously we would use `many1 spaceChars` rather than respecting
the list's indent level. We also permitted `*` bullets on unindented
lists, which should unambiguously parse as `header 1`.
Combined, this meant headers at a different indent level were
being unwittingly slurped into preceding bullet lists, as per
Issue #1650.
Previously text that ended a div would be parsed as Plain
unless there was a blank line before the closing div tag.
Test case:
<div class="first">
This is a paragraph.
This is another paragraph.
</div>
Closes#1591.
We can now handle all different alignment types, for simple
tables only (no captions, no relative widths, cell contents just
plain inlines). Other tables are still handled using raw HTML.
Addresses #1585 as far as it can be addresssed, I believe.
Currently, pandoc has hard-coded the following in order to make horizontal
rules in LaTeX:
```hs
"\\begin{center}\\rule{3in}{0.4pt}\\end{center}"
```
Which is fine, but does not allow customizations. It also does not take into
consideration the current line width.
I'm proposing this change:
```diff
@@ In Writers/LaTeX.hs:
-"\\begin{center}\\rule{3in}{0.4pt}\\end{center}"
+"\\begin{center}\\rule{0.5\\linewidth}{\\linethickness}\\end{center}"
```
Renamed some tests, introducing subsidiary directories
for fb2, docx, epub.
Cleaned up tests in cabal file.
Combined dokuwiki-writer and dokuwiki_inline_formatting tests.
This was just too fragile and dependent on a changing Cabal API
(see #1526).
Instead of passing the bulid directory to the test program, we
now let the test program find itself (using executable-path)
and then find the pandoc executable relative to itself.
Indented code at the beginning of a list item must be indented eight
spaces from the margin (or from the edge of the container), or four
spaces past the list marker, whichever is farther.
Some examples in `tests/markdown-reader-more.txt`.
Closes#1513.
Lists can now start without an intervening blank line.
Also, html block-level tags that don't start a line are parsed
as RawInline and don't interrupt paragraphs, as in RedCloth.
Rewrote features test to remove all unimplemented features.
There are now all three examples of where an image can be included in
the test.
1. Cover image
2. As a spine elemnt
3. In the document
Tests have also been added to make sure that the mediabag contains all
these images after processing.
pandoc -t markdown-raw_html should not emit any raw HTML, even
span and div tags that go with pandoc Span and Div elements.
Cleaned up a bit of the logic with extensions and plain.
Renamed epub test files so they're identified more clearly as
epub: features.{epub,native} -> epub.features.{epub,native},
and similarly with formatting.{epub,native}.
Added epub test files to cabal file, so they'll be included in
the tarball.
Using `map toUpper` to capitalise text is wrong, as e.g.
“Straße” should be converted to “STRASSE”, which is 1 character
longer. This commit adds a `capitalize` function and replaces
2 identical implementations in different modules (`toCaps` and
`capitalize`) with it.
This will allow us to test the whole mediabag (making sure, for example,
that images are added with the correct keys) instead of just individual
extracted images. We compare each entry in the media bag to an image
extracted on the fly from the docx. As a result, we only need one file
to test with.
The image in the current tests was also replaced with a smaller one.
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
This will make paragraphs styled with `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract` into pandoc metavalues, rather than text. The
implementation only takes those elements from the beginning of the
document (ignoring empty paragraphs).
Multiple paragraphs in the `Author` style will be made into a metaList,
one paragraph per item. Hard linebreaks (shift-return) in the paragraph
will be maintained, and can be used for institution, email, etc.
Math now appears in unicode if possible, without the distracting
italics around identifiers.
Blank lines around headers are more consistent.
Footnotes appear in regular [n] style.
* This change brings pandoc's definition list syntax into alignment
with that used in PHP markdown extra and multimarkdown (with the
exception that pandoc is more flexible about the definition markers,
allowing tildes as well as colons).
* Lazily wrapped definitions are now allowed; blank space is required
between list items; and the space before definition is used to
determine whether it is a paragraph or a "plain" element.
* For backwards compatibility, a new extension,
`compact_definition_lists`, has been added that restores the behavior
of pandoc 1.12.x, allowing tight definition lists with no blank space
between items, and disallowing lazy wrapping.
This gives better results for tight lists. Closes#1437.
An alternative solution would be to use Para everywhere, and
never Plain. I am not sufficiently familiar with org to know
which is best. Thoughts, @tarleb?
Adds support to the org reader for conditionally exporting either the code block,
results block immediately following, both, or neither, depending on the value
of the `:exports` header argument. If no such argument is supplied, the default
org behavior (for most languages) of exporting code is used.
as in the mediawiki writer. The dokuwiki markup isn't able
to handle multiple block-level items within a list item, except
in a few special cases (e.g. code blocks, and these must be started
on the same line as the preceding paragraph). So we fall back to
raw HTML for these.
Perhaps there is a better solution. We can "fake" multiple
paragraphs within list items using hard line breaks (`\\`), but
we must keep everything on one line.
(#1398)