* New module `Text.Pandoc.Docx`.
* New output format `docx`.
* Added reference.docx.
* New option `--reference-docx`.
The writer includes support for highlighted code blocks
and math (which is converted from TeX to OMML using
texmath's new OMML module).
Top line of table must not be followed by a blank line.
This bug caused slowdown on some files with hrules and tables,
and pandoc tried to interpret the hrules as the tops of
multiline tables.
This change also means that
[link with [link](/url)](/url)
will turn into
<p><a href="/url">link with link</a></p>
instead of
<p><a href="/url">link with [link](/url)</a></p>
Pandoc previously behaved like Markdown.pl for consecutive
lists of different styles. Thus, the following would be parsed
as a single ordered list, rather than an ordered list followed
by an unordered list:
1. one
2. two
- one
- two
This patch makes pandoc behave more sensibly, parsing this as
two lists. Any change in list type (ordered/unordered) or in
list number style will trigger a new list. Thus, the following
will also be parsed as two lists:
1. one
2. two
a. one
b. two
Since we regard this as a bug in Markdown.pl, and not something
anyone would ever rely on, we do not preserve the old behavior
even when `--strict` is selected.
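The splitting rule described above can be sketched as follows. This is a minimal illustration in Python, not pandoc's Haskell implementation; the names `marker_style` and `split_lists` are hypothetical, and only three marker styles are recognized.

```python
import re

def marker_style(line):
    """Classify a list marker as a (type, number style) pair, or None."""
    s = line.strip()
    if re.match(r"[-*+]\s", s):
        return ("bullet", None)
    if re.match(r"\d+\.\s", s):
        return ("ordered", "decimal")
    if re.match(r"[a-z]\.\s", s):
        return ("ordered", "lower-alpha")
    return None

def split_lists(lines):
    """Group consecutive list items, starting a new list on any style change."""
    lists = []
    previous = None
    for line in lines:
        style = marker_style(line)
        if style is None:
            # A non-list line ends the current list.
            previous = None
            continue
        if style != previous:
            lists.append((style, []))
        lists[-1][1].append(line.strip())
        previous = style
    return lists
```

Under these assumptions, both examples above yield two lists, because the marker type or number style changes after the second item.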
* `---` is always em-dash, `--` is always en-dash.
* pandoc no longer tries to guess when `-` should be en-dash.
* A new option, `--old-dashes`, is provided for legacy documents.
Rationale: The rules for en-dash are too complex and
language-dependent for a guesser to work reliably. This
change gives users greater control. The alternative of
using unicode isn't very good, since unicode em- and en-
dashes are barely distinguishable in a monospace font.
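As a rough illustration of the new fixed rule (a sketch in Python, not pandoc's own code; `smart_dashes` is a hypothetical name, and code spans and other contexts where dashes must stay literal are ignored):

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"

def smart_dashes(text):
    """Map `---` to an em-dash and `--` to an en-dash; leave single `-` alone."""
    # Replace the longer sequence first so `---` is not read as `--` plus `-`.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)
```

A single hyphen, as in `well-known`, is never touched, which is the point of dropping the guesser.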
Inline math uses the :math:`...` construct.
Display math uses
.. math:: ...
or, if multiline,
.. math::
...
These seem to be supported now by rst2latex.py.
Inline: :math:`E=mc^2`
Block:
.. math:: E = mc^2
.. math::
E = mc^2
a = b^2
(This latter will turn into a paragraph with two
display math elements.)
Closes #117.
* Added stateLastStrPos to ParserState. This lets us keep track
of whether we're parsing the position immediately after a 'str'.
If we encounter a ' in such a location, it must be an apostrophe,
and can't be a single quote start.
* Set this in the markdown, textile, html, and rst str parsers.
* Closes #360.
This solves a problem stemming from the fact that a parser
doesn't know what came *before* in the input stream.
Previously pandoc would parse
D'oh l'*aide*
as containing a single quoted "oh l", when both `'`s should
be apostrophes. (Issue #360.) There are two issues here.
(a) It is obvious that the first `'` is not an open quote,
because of the preceding `D`. This patch solves the problem.
(b) It is obvious to us that the second `'` is not an
open quote, because we see that *aide* is some text.
But getting a good algorithm that has good performance is
a bit tricky. You can't assume that `'` followed by `*`
is always an apostrophe:
*'this is quoted'*
This patch does not fix (b).
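The position-tracking idea behind (a) can be sketched as below. This is an assumption-laden Python illustration, not the actual parser: `can_open_quote` and `last_str_end` are hypothetical names standing in for the role `stateLastStrPos` plays, and a "str" is approximated as a run of letters or digits.

```python
def can_open_quote(text):
    """For each ' in text, report whether it could start a single quote.

    A ' that sits immediately after a run of letters or digits is treated
    as an apostrophe (or a closing quote) and can never open a quote.
    """
    results = []
    last_str_end = -1  # index just past the most recent letter or digit
    for i, ch in enumerate(text):
        if ch == "'":
            results.append(i != last_str_end)
        elif ch.isalnum():
            last_str_end = i + 1
    return results
```

With this rule the `'` in `D'oh` cannot open a quote, while the first `'` in `*'this is quoted'*` still can.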
Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere
else", or, with the new citeproc-hs, "and nowhere else".
Now it yields " and nowhere else".
The characters '.',':',';','$','<','>','~','#','-','_' can
be used only between two letters or digits in a citation key.
This means that '@item1.' will be parsed as a citation, 'item1',
followed by a period, instead of a citation 'item1.', as was the
case previously.
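One way to picture the rule is as a regular expression. The following Python sketch is an approximation built only from the description above; `CITE_KEY` and `find_keys` are hypothetical names, and "letters or digits" is assumed to mean ASCII alphanumerics.

```python
import re

# Punctuation that may appear only between two letters or digits.
INTERNAL = r".:;$<>~#\-_"

CITE_KEY = re.compile(
    r"@([A-Za-z0-9]+(?:[" + INTERNAL + r"][A-Za-z0-9]+)*)"
)

def find_keys(text):
    """Return the citation keys found in text, without the leading @."""
    return CITE_KEY.findall(text)
```

Because each punctuation character must be followed by at least one letter or digit, a trailing period or semicolon is left outside the key.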
Thanks to David Sanson for alerting us to the problem.
It was always possible to include raw DocBook tags in a markdown
document, but now pandoc will be able to distinguish block from
inline tags and behave accordingly. Thus, for example,
<sidebar>
hello
</sidebar>
will not be wrapped in `<para>` tags.
For example, in
Just a few glitches remaining.
<ul><li> In this situation, one loses the list.
</ul>
And in this, the preformatting.
<pre>Preformatted text not starting with its own blank line.
</pre>
Thanks to Dirk Laurie for noticing the issue.
* Skip spaces after <b>, <emph>, etc.
* Convert Plain elements into Para when they're in a list
item with Para, Pre, BlockQuote, CodeBlock.
An example of HTML that pandoc handles better now:
~~~~
<h4> Testing html to markdown </h4>
<ul>
<li>
<b> An item in a list </b>
<p> An introductory sentence.
<pre>
Some preformatted text
at this stage comes next.
But alas! much havoc
is wrought by Pandoc.
</pre>
</ul>
~~~~
Thanks to Dirk Laurie for reporting the issues.
These previously caused infinite looping and stack overflows.
For example:
[^1]
[^1]: See [^1]
Note references are allowed in reST notes, so this isn't a full
implementation of reST. That can come later. For now we need to
prevent the stack overflows.
Partially resolves Issue #297.
So, in RST, 'http://google.com.' should be parsed as a link
to 'http://google.com' followed by a period.
The parser is smart enough to recognize balanced parentheses,
as often occur in wikipedia links: 'http://foo.bar/baz_(bam)'.
Also added ()s to RST specialChars, so '(http://google.com)'
will be parsed as a link in parens.
Added test cases.
Resolves Issue #291.
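The trailing-punctuation and balanced-parenthesis behavior can be sketched like this. It is a Python illustration of the idea, not the RST reader's actual code; `split_url` is a hypothetical name, and the set of characters treated as trailing punctuation is an assumption.

```python
def split_url(candidate):
    """Split a run of non-space characters into (url, trailing_text)."""
    end = len(candidate)
    while end > 0:
        ch = candidate[end - 1]
        head = candidate[:end]
        if ch in ".,;:!?":
            # Sentence punctuation at the end is not part of the URL.
            end -= 1
        elif ch == ")" and head.count("(") < head.count(")"):
            # An unmatched closing paren belongs to the surrounding text.
            end -= 1
        else:
            break
    return candidate[:end], candidate[end:]
```

A closing parenthesis is kept when it balances an opening one inside the URL, so Wikipedia-style links survive while a wrapping `(...)` does not swallow the link.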
The point of the change is to allow html tags to be used freely
at the left margin of a markdown+lhs document.
Thanks to Conal Elliott for the suggestion.