Note: The attributes go on the enclosing section or div
if `--section-divs` is specified.
Also fixed a regression (only now noticed) in html+lhs output.
Previously the bird tracks were being omitted.
The 1.10 code assumed that each table header cell contains
exactly one block. That failed for headerless tables (0) and also
for tables with multiple blocks in a header cell.
The code is fixed and tests provided. Thanks to Andrew Lee for
pointing out the bug.
* RTF writer: Export writeRTFWithEmbeddedImages instead of
rtfEmbedImage.
* Text.Pandoc: Use writeRTFWithEmbeddedImages for RTF.
* Moved code for embedding images in RTF out of pandoc.hs.
* It no longer uses Network.URIs URI parser, which is too restrictive
(not allowing unicode URIs unless encoded).
* It allows many more schemes.
* It better handles punctuation so as to avoid capturing trailing
punctuation in bare URLs.
* In markdown reader, add a '\1' character to the beginning
of the title of an image that is alone in its paragraph,
if implicit_figures extension is selected.
* In writers, check for Para [Image alt (src,'\1':tit)] and treat
it as a figure if possible.
* Updated tests.
This is a bit of a hack, but it allows us to make implicit_figures
an extension of the markdown reader, rather than the writers.
We now (a) use anonymous links for links with inline URLs, and
(b) use an inline link instead of a reference link if the
reference link would require a label that has already been
used for a different link.
Closes#511.
Previously header ids were autogenerated by the writers.
Now they are generated (unless supplied explicitly) in the
markdown parser, if the `header_identifiers` extension is
selected.
In addition, the textile reader now supports id attributes on
headers.
Taking into account new context/latex output, and fixing
some bugs in the test suite Tests.Helpers and Tests.Writers.ConTeXt.
(We had the wrong order of expected/actual in the diff output.)
Previously the textile reader and writer incorrectly implented
RST-style autolinks for URLs and email addresses.
This has been fixed. Now an autolink is done this way:
"$":http://myurl.com
Now pandoc correctly handles hard line breaks inside list items.
Previously they broke list parsing. Thanks to Pablo
Rodríguez for pointing out the problem.
* Depend on text.
* Expose Text.Pandoc.UTF8.
* Text.Pandoc.UTF8 now exports toString, fromString,
toStringLazy, fromStringLazy.
* These are used instead of the old utf8-string functions.
Now we insert anchors after each header, and use @ref
instead of @uref for links.
Commas are now escaped as @comma{} only when needed; previously
all commas were escaped. (This change is needed, in part, because @ref
commands must be followed by a real comma or period.)
Also insert a blank line in from of @verbatim environments.
Previously the parser would hang on input like this:
[[[[[[[[[[[[[[[[[[hi
We fixed this by making the link parser parser characters
between balanced brackets (skipping brackets in inline code spans),
then parsing the result as an inline list.
One change is that
[hi *there]* bud](/url)
is now no longer parsed as a link. But in this respect pandoc behaved
differently from most other implementations anyway, so that seems okay.
All current tests pass. Added test for this case.
Closes#620.
This allows the markdown reader to treat '\begin' (not followed
by an argument) as a raw string rather than erroring out when
it doesn't find a '{'.
Closes#622.
Unescaped -'s become hyphens, while \-'s are left as ascii
minus signs. That is preferable for use with command-line
options.
See http://lintian.debian.org/tags/hyphen-used-as-minus-sign.html.
Thanks to Andrea Bolognani for bringing the issue to our
attention.
- Removed writerLiterateHaskell from WriterOptions.
- Removed readerLiterateHaskell from ReaderOptions.
- Added Ext_literate_haskell to Extensions. Test for this
instead of the above.
- Removed failUnlessLHS from Shared.
Note: At this point, +lhs and .lhs extension no longer has any effect.
Need to fix.
* Use Builder's Inlines/Blocks instead of lists.
* Return values in the reader monad, which are then
run (at the end of parsing) against the final
parser state. This allows links, notes, and
example numbers to be resolved without a second
parser pass.
* An effect of using Builder is that everything is
normalized automatically.
* New exports from Text.Pandoc.Parsing:
widthsFromIndices, NoteTable', KeyTable', Key', toKey',
withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart,
doubleQuoteEnd, ellipses, apostrophe, dash
* Updated opendocument tests.
* Don't derive Show for ParserState.
* Benchmarks: markdown reader takes 82% of the time it took before.
Markdown writer takes 92% of the time (here the speedup is probably
due to the fact that everything is normalized by default).
To run tests, configure with --enable-tests, then 'cabal test'.
You can specify particular tests using --test-options='-t markdown'.
No output is shown unless tests fail. In the future, we can move
to the detailed-1.0 interface.
* All tables now require at least one body row.
* Renamed from 'extra' to 'pipe' tables.
* Moved functions from Parsing to Readers.Markdown.
* Cleaned up code; revised to parse in one pass rather than
parsing a raw string, splitting it, and parsing the components.
* Allow pipe tables without pipes on the ends (as PHP Markdown Extra
does).
* Use `:` form instead of `~`, for better compatibility with other
markdown implementations.
* Don't wrap the term, because it breaks definition lists.
This way you can still get the raw latex back, even if you don't
process with citeproc. Previously, cites were not visible at all
unless you specified --biblio on the command line and converted
them using citeproc, or used --natbib or --biblatex.
* The new reader is more robust, accurate, and extensible.
It is still quite incomplete, but it should be easier
now to add features.
* Text.Pandoc.Parsing: Added withRaw combinator.
* Markdown reader: do escapedChar before raw latex inline.
Otherwise we capture commands like \{.
* Fixed latex citation tests for new citeproc.
* Handle \include{} commands in latex.
This is done in pandoc.hs, not the (pure) latex reader.
But the reader exports the needed function, handleIncludes.
* Moved err and warn from pandoc.hs to Shared.
* Fixed tests - raw tex should sometimes have trailing space.
* Updated lhs-test for highlighting-kate changes.
* New module `Text.Pandoc.Docx`.
* New output format `docx`.
* Added reference.docx.
* New option `--reference-docx`.
The writer includes support for highlighted code blocks
and math (which is converted from TeX to OMML using
texmath's new OMML module).
Pandoc previously behaved like Markdown.pl for consecutive
lists of different styles. Thus, the following would be parsed
as a single ordered list, rather than an ordered list followed
by an unordered list:
1. one
2. two
- one
- two
This patch makes pandoc behave more sensibly, parsing this as
two lists. Any change in list type (ordered/unordered) or in
list number style will trigger a new list. Thus, the following
will also be parsed as two lists:
1. one
2. two
a. one
b. two
Since we regard this as a bug in Markdown.pl, and not something
anyone would ever rely on, we do not preserve the old behavior
even when `--strict` is selected.
* `---` is always em-dash, `--` is always en-dash.
* pandoc no longer tries to guess when `-` should be en-dash.
* A new option, `--old-dashes`, is provided for legacy documents.
Rationale: The rules for en-dash are too complex and
language-dependent for a guesser to work reliably. This
change gives users greater control. The alternative of
using unicode isn't very good, since unicode em- and en-
dashes are barely distinguishable in a monospace font.
Inline math uses the :math:`...` construct.
Display math uses
.. math:: ...
or if multilin
.. math::
...
These seem to be supported now by rst2latex.py.
Beamer output uses the default LaTeX template, with some
customizations via variables.
Added `writerBeamer` to `WriterOptions`.
Added `--beamer` option to `markdown2pdf`.
The container element will have the classes, id, and
key-value attributes you specified in the delimited code
block.
Previously these were stripped off.
escapeURI now only escapes space characters, leaving unicode characters
as they are, instead of converting them to octets and URL-encoding them,
as before. This gives more readable URIs. User agents now do the
percent-encoding themselves.
URIs are no longer unescaped at all on conversion to markdown, asciidoc,
rst, org.
Closes#349.
Still TODO:
- documentation in README
- add default.asciidoc to templates/
- lists
- tables
- proper escaping
- footnotes with blank lines - print separately at end?
currently they are just ignored.
- fix header (date gives weird result on pandoc README)
Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere
else", or, with the new citeproc-hs, "and nowhere else".
Now it yields " and nowhere else".