Markup features that treat lines as a distinctive part of the markup are read
into `LineBlock` elements. This currently covers line blocks in reStructuredText
and Markdown (the latter only if the `line_blocks` extension is enabled), the
`linegroup`/`line` combination from the DocBook 5.1 working draft, and Org-mode
`VERSE` blocks.
Added `stateHeaderKeys` to `ParserState`; this is a `KeyTable`
like `stateKeys`, but it only gets consulted if we don't find
a match in `stateKeys`, and if `Ext_implicit_header_references`
is enabled.
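The lookup order described above can be sketched as follows. This is only an illustration in Python, not pandoc's code: the function name, the plain dictionaries standing in for the two `KeyTable`s, and the boolean flag standing in for `Ext_implicit_header_references` are all assumptions.

```python
from typing import Dict, Optional


def lookup_reference(key: str,
                     state_keys: Dict[str, str],
                     state_header_keys: Dict[str, str],
                     implicit_header_references: bool) -> Optional[str]:
    """Resolve a reference key, preferring explicit keys over header keys."""
    # Explicitly defined references always win.
    if key in state_keys:
        return state_keys[key]
    # Header keys are consulted only as a fallback, and only when the
    # extension (modelled here as a boolean) is enabled.
    if implicit_header_references:
        return state_header_keys.get(key)
    return None
```

With this ordering, an explicit reference definition shadows a header of the same name, and disabling the extension hides header keys entirely.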
Closes #1606.
Indented code at the beginning of a list item must be indented eight
spaces from the margin (or from the edge of the container), or four
spaces past the list marker, whichever is farther.
Some examples are in `tests/markdown-reader-more.txt`.
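The indentation rule can be read as taking the larger of two thresholds. The sketch below assumes that "four spaces past the list marker" is measured from the end of the marker text itself, and that columns are counted from the edge of the enclosing container; `min_code_indent` and `marker_width` are hypothetical names used only for illustration.

```python
def min_code_indent(marker_width: int) -> int:
    """Smallest indent (in columns from the container edge) at which
    indented code may begin at the start of a list item.

    marker_width is the length of the list marker, e.g. 1 for "-" and
    2 for "1." (an assumption about how the marker is measured).
    """
    # Eight spaces from the margin, or four past the marker,
    # whichever is farther.
    return max(8, marker_width + 4)
```

For short markers such as `-` or `1.` the eight-space threshold applies; only unusually long markers push the requirement further right.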
The 1.10 code assumed that each table header cell contains
exactly one block. That failed for headerless tables (zero blocks)
and also for tables with multiple blocks in a header cell.
The code is fixed and tests provided. Thanks to Andrew Lee for
pointing out the bug.
* It no longer uses Network.URI's URI parser, which is too restrictive
(not allowing Unicode URIs unless encoded).
* It allows many more schemes.
* It better handles punctuation so as to avoid capturing trailing
punctuation in bare URLs.
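To illustrate the last point, here is a deliberately simple sketch of separating trailing punctuation from a bare URL. The punctuation set and the function name are assumptions made for the example; the rules pandoc actually applies are not spelled out here and may differ.

```python
from typing import Tuple

# Hypothetical set of characters treated as sentence punctuation
# when they appear at the very end of a bare URL.
TRAILING_PUNCTUATION = ".,;:!?"


def split_bare_url(candidate: str) -> Tuple[str, str]:
    """Split a matched candidate into (url, trailing punctuation)."""
    url = candidate.rstrip(TRAILING_PUNCTUATION)
    return url, candidate[len(url):]
```

Punctuation inside the URL, such as the `?` that starts a query string, is left alone because only the end of the candidate is trimmed.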
Previously header ids were autogenerated by the writers.
Now they are generated (unless supplied explicitly) in the
markdown parser, if the `header_identifiers` extension is
selected.
In addition, the textile reader now supports id attributes on
headers.
* Use Builder's Inlines/Blocks instead of lists.
* Return values in the reader monad, which are then
run (at the end of parsing) against the final
parser state. This allows links, notes, and
example numbers to be resolved without a second
parser pass.
* An effect of using Builder is that everything is
normalized automatically.
* New exports from Text.Pandoc.Parsing:
widthsFromIndices, NoteTable', KeyTable', Key', toKey',
withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart,
doubleQuoteEnd, ellipses, apostrophe, dash
* Updated opendocument tests.
* Don't derive Show for ParserState.
* Benchmarks: markdown reader takes 82% of the time it took before.
Markdown writer takes 92% of the time (here the speedup is probably
due to the fact that everything is normalized by default).
* The new reader is more robust, accurate, and extensible.
It is still quite incomplete, but it should be easier
now to add features.
* Text.Pandoc.Parsing: Added withRaw combinator.
* Markdown reader: do escapedChar before raw latex inline.
Otherwise we capture commands like \{.
* Fixed latex citation tests for new citeproc.
* Handle \include{} commands in latex.
This is done in pandoc.hs, not the (pure) latex reader.
But the reader exports the needed function, handleIncludes.
* Moved err and warn from pandoc.hs to Shared.
* Fixed tests - raw tex should sometimes have trailing space.
* Updated lhs-test for highlighting-kate changes.
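The idea noted earlier in this list, returning values that are run against the final parser state so that links can be resolved without a second pass, can be sketched with closures. This is a toy model in Python rather than the reader monad itself: the token format, the `State` dictionary, and the `parse` function are all hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, str]              # hypothetical: reference label -> URL
Deferred = Callable[[State], str]   # a value that still needs the final state


def parse(tokens: List[str]) -> str:
    state: State = {}
    pending: List[Deferred] = []
    for tok in tokens:
        if tok.startswith("[") and "]: " in tok:
            # A reference definition: record it in the state.
            label, url = tok[1:].split("]: ", 1)
            state[label] = url
        elif tok.startswith("[") and tok.endswith("]"):
            # A reference use: defer resolution until parsing is done.
            label = tok[1:-1]
            pending.append(lambda st, label=label: f"<{st.get(label, '?')}>")
        else:
            # Plain text ignores the state.
            pending.append(lambda st, tok=tok: tok)
    # Run every deferred value against the final state, in one go.
    return " ".join(run(state) for run in pending)
```

A reference used before its definition still resolves, because nothing is looked up until the whole input has been consumed.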
Pandoc previously behaved like Markdown.pl for consecutive
lists of different styles. Thus, the following would be parsed
as a single ordered list, rather than an ordered list followed
by an unordered list:
    1. one
    2. two
    - one
    - two
This patch makes pandoc behave more sensibly, parsing this as
two lists. Any change in list type (ordered/unordered) or in
list number style will trigger a new list. Thus, the following
will also be parsed as two lists:
    1. one
    2. two
    a. one
    b. two
Since we regard this as a bug in Markdown.pl, and not something
anyone would ever rely on, we do not preserve the old behavior
even when `--strict` is selected.
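The splitting rule can be sketched by grouping consecutive items on their marker style. The classification below is a simplification made for illustration (it recognises only bullets, decimal numbers, and single lower-case letters, and assumes every line is a non-empty list item); `marker_style` and `split_lists` are hypothetical names.

```python
import re
from itertools import groupby
from typing import List


def marker_style(line: str) -> str:
    """Classify a list item by its marker (simplified)."""
    marker = line.split()[0]
    if marker in ("-", "*", "+"):
        return "bullet"
    if re.fullmatch(r"\d+\.", marker):
        return "decimal"
    if re.fullmatch(r"[a-z]\.", marker):
        return "lower-alpha"
    return "other"


def split_lists(lines: List[str]) -> List[List[str]]:
    """Start a new list whenever the marker style changes."""
    return [list(group) for _, group in groupby(lines, key=marker_style)]
```

Applied to either of the examples above, `split_lists` yields two lists of two items each.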
escapeURI now only escapes space characters, leaving unicode characters
as they are, instead of converting them to octets and URL-encoding them,
as before. This gives more readable URIs. User agents now do the
percent-encoding themselves.
URIs are no longer unescaped at all on conversion to markdown, asciidoc,
rst, org.
Closes #349.
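The new escaping behaviour can be illustrated with a small Python analogue. It is a sketch, not the Haskell function: `escape_uri` is a hypothetical name, and treating every whitespace character (not just U+0020) as a "space character" is an assumption.

```python
def escape_uri(uri: str) -> str:
    """Percent-encode whitespace only; leave everything else untouched."""
    out = []
    for ch in uri:
        if ch.isspace():
            # Encode each UTF-8 byte of the whitespace character.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            # Unicode characters pass through unchanged.
            out.append(ch)
    return "".join(out)
```

A URI such as `http://example.com/my file` becomes `http://example.com/my%20file`, while non-ASCII letters stay readable in the output.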
Additional related changes:
* URLs in Code in autolinks now use class "url".
* Require highlighting-kate 0.2.8.2, which omits the final <br/> tag,
essential for inline code.
The old TeX, HtmlInline and RawHtml elements have been removed
and replaced by generic RawInline and RawBlock elements.
All modules updated to use the new raw elements.
Previously, curly quotes were just parsed literally, leading
to problems in some output formats. Now they are parsed as
Quoted inlines, if --smart is specified.
Resolves Issue #270.
Case-insensitive matching of reference keys broke when we added the Key type. We had assumed that
the custom case-insensitive Ord instance would ensure case-insensitive
matching, but that is not how Data.Map works.
* Added a test case for case-insensitivity in markdown-reader-more.
* Removed old refsMatch from Text.Pandoc.Parsing module.
* Hid the 'Key' constructor.
* Dropped the custom Ord and Eq instances, deriving instead.
* Added fromKey and toKey to convert between Keys and Inline lists.
* toKey ensures that keys are case-insensitive, since this is the
only way the API provides to construct a Key.
Resolves Issue #272.
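The design above, normalizing once at construction and then relying on ordinary derived equality, can be sketched in Python. This is an analogy only: Python cannot hide a constructor the way a Haskell module can, and the `Key` class and `to_key` function here work on plain strings rather than Inline lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    # By convention, build keys with to_key rather than directly, so
    # that the stored text is always normalized.
    normalized: str


def to_key(label: str) -> Key:
    """The one sanctioned way to build a Key: lower-case the label."""
    return Key(label.lower())


# Equality and hashing are the generated, structural ones; they behave
# case-insensitively only because every key was normalized on the way in.
refs = {to_key("Pandoc Home"): "http://example.com"}
```

Looking up `to_key("pandoc home")` in `refs` then finds the entry without any custom comparison logic.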
Previously some of the writers added spurious whitespace.
This has been removed, resolving Issue #232.
NOTE: If your application combines pandoc's output with other
text, for example in a template, you may need to add spacing.
For example, a pandoc-generated markdown file will not have
a blank line after the final block element. If you are inserting
it into another markdown file, you will need to make sure there
is a blank line between it and the next block element.
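For applications that combine fragments, one way to restore the spacing is to join them explicitly. The helper below is a hypothetical sketch; it assumes each fragment is a complete run of Markdown blocks and that a single blank line is the separator you want.

```python
def join_blocks(*fragments: str) -> str:
    """Join Markdown fragments with exactly one blank line between them."""
    # Drop empty fragments and trim surrounding newlines so the
    # separator is not doubled.
    parts = [f.strip("\n") for f in fragments if f.strip()]
    return "\n\n".join(parts) + "\n"
```

Joining `"para one\n"` and `"# Next\n"` this way gives `"para one\n\n# Next\n"`, so the header is not absorbed into the preceding paragraph.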
Based on a patch by Justin Bogner.
Titles may span multiple lines, provided continuation lines
begin with a space character.
Separate authors may be put on multiple lines, provided
each line after the first begins with a space character.
Each author must fit on one line. Multiple authors on
a single line may still be separated by a semicolon.
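The continuation rules can be sketched as a small parser. The `%` prefix that opens each field is pandoc's title-block syntax but is not mentioned above, so treat it as an assumption of this example, along with the function name and the choice to join multi-line titles with a single space.

```python
from typing import List, Tuple


def parse_title_block(text: str) -> Tuple[str, List[str]]:
    """Return (title, authors) from a title block at the top of text."""
    fields: List[List[str]] = []
    for line in text.splitlines():
        if line.startswith("%"):
            # A new field (title first, then authors).
            fields.append([line[1:].strip()])
        elif line.startswith(" ") and fields:
            # A continuation line must begin with a space.
            fields[-1].append(line.strip())
        else:
            break
    title = " ".join(fields[0]) if fields else ""
    authors: List[str] = []
    if len(fields) > 1:
        for line in fields[1]:
            # Each line holds one author, or several separated by semicolons.
            authors.extend(a.strip() for a in line.split(";") if a.strip())
    return title, authors
```

A block whose title spans two lines and whose authors are split across a semicolon and a continuation line yields one joined title and a flat list of three authors.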