* Depend on text.
* Expose Text.Pandoc.UTF8.
* Text.Pandoc.UTF8 now exports toString, fromString,
toStringLazy, fromStringLazy.
* These are used instead of the old utf8-string functions.
Now we insert anchors after each header, and use @ref
instead of @uref for links.
Commas are now escaped as @comma{} only when needed; previously
all commas were escaped. (This change is needed, in part, because @ref
commands must be followed by a real comma or period.)
Also insert a blank line in from of @verbatim environments.
Previously the parser would hang on input like this:
[[[[[[[[[[[[[[[[[[hi
We fixed this by making the link parser parser characters
between balanced brackets (skipping brackets in inline code spans),
then parsing the result as an inline list.
One change is that
[hi *there]* bud](/url)
is now no longer parsed as a link. But in this respect pandoc behaved
differently from most other implementations anyway, so that seems okay.
All current tests pass. Added test for this case.
Closes#620.
This allows the markdown reader to treat '\begin' (not followed
by an argument) as a raw string rather than erroring out when
it doesn't find a '{'.
Closes#622.
Unescaped -'s become hyphens, while \-'s are left as ascii
minus signs. That is preferable for use with command-line
options.
See http://lintian.debian.org/tags/hyphen-used-as-minus-sign.html.
Thanks to Andrea Bolognani for bringing the issue to our
attention.
- Removed writerLiterateHaskell from WriterOptions.
- Removed readerLiterateHaskell from ReaderOptions.
- Added Ext_literate_haskell to Extensions. Test for this
instead of the above.
- Removed failUnlessLHS from Shared.
Note: At this point, +lhs and .lhs extension no longer has any effect.
Need to fix.
* Use Builder's Inlines/Blocks instead of lists.
* Return values in the reader monad, which are then
run (at the end of parsing) against the final
parser state. This allows links, notes, and
example numbers to be resolved without a second
parser pass.
* An effect of using Builder is that everything is
normalized automatically.
* New exports from Text.Pandoc.Parsing:
widthsFromIndices, NoteTable', KeyTable', Key', toKey',
withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart,
doubleQuoteEnd, ellipses, apostrophe, dash
* Updated opendocument tests.
* Don't derive Show for ParserState.
* Benchmarks: markdown reader takes 82% of the time it took before.
Markdown writer takes 92% of the time (here the speedup is probably
due to the fact that everything is normalized by default).
To run tests, configure with --enable-tests, then 'cabal test'.
You can specify particular tests using --test-options='-t markdown'.
No output is shown unless tests fail. In the future, we can move
to the detailed-1.0 interface.
* All tables now require at least one body row.
* Renamed from 'extra' to 'pipe' tables.
* Moved functions from Parsing to Readers.Markdown.
* Cleaned up code; revised to parse in one pass rather than
parsing a raw string, splitting it, and parsing the components.
* Allow pipe tables without pipes on the ends (as PHP Markdown Extra
does).
* Use `:` form instead of `~`, for better compatibility with other
markdown implementations.
* Don't wrap the term, because it breaks definition lists.
This way you can still get the raw latex back, even if you don't
process with citeproc. Previously, cites were not visible at all
unless you specified --biblio on the command line and converted
them using citeproc, or used --natbib or --biblatex.
* The new reader is more robust, accurate, and extensible.
It is still quite incomplete, but it should be easier
now to add features.
* Text.Pandoc.Parsing: Added withRaw combinator.
* Markdown reader: do escapedChar before raw latex inline.
Otherwise we capture commands like \{.
* Fixed latex citation tests for new citeproc.
* Handle \include{} commands in latex.
This is done in pandoc.hs, not the (pure) latex reader.
But the reader exports the needed function, handleIncludes.
* Moved err and warn from pandoc.hs to Shared.
* Fixed tests - raw tex should sometimes have trailing space.
* Updated lhs-test for highlighting-kate changes.
* New module `Text.Pandoc.Docx`.
* New output format `docx`.
* Added reference.docx.
* New option `--reference-docx`.
The writer includes support for highlighted code blocks
and math (which is converted from TeX to OMML using
texmath's new OMML module).
Pandoc previously behaved like Markdown.pl for consecutive
lists of different styles. Thus, the following would be parsed
as a single ordered list, rather than an ordered list followed
by an unordered list:
1. one
2. two
- one
- two
This patch makes pandoc behave more sensibly, parsing this as
two lists. Any change in list type (ordered/unordered) or in
list number style will trigger a new list. Thus, the following
will also be parsed as two lists:
1. one
2. two
a. one
b. two
Since we regard this as a bug in Markdown.pl, and not something
anyone would ever rely on, we do not preserve the old behavior
even when `--strict` is selected.
* `---` is always em-dash, `--` is always en-dash.
* pandoc no longer tries to guess when `-` should be en-dash.
* A new option, `--old-dashes`, is provided for legacy documents.
Rationale: The rules for en-dash are too complex and
language-dependent for a guesser to work reliably. This
change gives users greater control. The alternative of
using unicode isn't very good, since unicode em- and en-
dashes are barely distinguishable in a monospace font.
Inline math uses the :math:`...` construct.
Display math uses
.. math:: ...
or if multilin
.. math::
...
These seem to be supported now by rst2latex.py.
Beamer output uses the default LaTeX template, with some
customizations via variables.
Added `writerBeamer` to `WriterOptions`.
Added `--beamer` option to `markdown2pdf`.
The container element will have the classes, id, and
key-value attributes you specified in the delimited code
block.
Previously these were stripped off.
escapeURI now only escapes space characters, leaving unicode characters
as they are, instead of converting them to octets and URL-encoding them,
as before. This gives more readable URIs. User agents now do the
percent-encoding themselves.
URIs are no longer unescaped at all on conversion to markdown, asciidoc,
rst, org.
Closes#349.
Still TODO:
- documentation in README
- add default.asciidoc to templates/
- lists
- tables
- proper escaping
- footnotes with blank lines - print separately at end?
currently they are just ignored.
- fix header (date gives weird result on pandoc README)
Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere
else", or, with the new citeproc-hs, "and nowhere else".
Now it yields " and nowhere else".
A horizontal rule now gets transformed into an empty H1 header
before 'hierarchicalize' is called.
If the document that does not begin with an H1 header, an
empty one is provided.
This avoids the need for kludgy raw HTML.
Also, the 'titleslide' class is added to any section containing
just a title:
----
----
For example, in
Just a few glitches remaining.
<ul><li> In this situation, one loses the list.
</ul>
And in this, the preformatting.
<pre>Preformatted text not starting with its own blank line.
</pre>
Thansk to Dirk Laurie for noticing the issue.
So, in RST, 'http://google.com.' should be parsed as a link
to 'http://google.com' followed by a period.
The parser is smart enough to recognize balanced parentheses,
as often occur in wikipedia links: 'http://foo.bar/baz_(bam)'.
Also added ()s to RST specialChars, so '(http://google.com)'
will be parsed as a link in parens.
Added test cases.
Resolves Issue #291.
"First paragraph" as opposed to "Text body." This allows
users to specify e.g. that only paragraphs after the first
paragraph of a section are to be indented.
Thanks to Andrea Rossato for the patch.
Closes github Issue #20.
Additional related changes:
* URLs in Code in autolinks now use class "url".
* Require highlighting-kate 0.2.8.2, which omits the final <br/> tag,
essential for inline code.
Field lists now work properly with block content.
(Thanks to Lachlan Musicman for pointing out the bug.)
In addition, definition list items are now always Para instead
of Plain -- which matches behavior of rst2xml.py.
Finally, in image blocks, the alt attribute is parsed properly
and used for the alt, not also the title.
The old TeX, HtmlInline and RawHtml elements have been removed
and replaced by generic RawInline and RawBlock elements.
All modules updated to use the new raw elements.