The call to toLower in ciMatch was very expensive (and very often
used), because toLower from Data.Char calls a fully unicode
aware function. This optimization avoids the call to toLower
for the most common, ASCII cases. This dramatically reduces the
speed penalty that comes from enabling the `autolink_bare_uris`
extension. The penalty is still substantial (in one test, from 0.33s
to 0.44s), but nowhere near what it used to be.
Now latex macro definitions are preserved when output is latex,
and applied when it is another format, as originally intended.
Partially addresses #730.
\providecommand is still not supported. For this we need changes
to texmath.
oneOfStrings will now take the longest match it can in a
list of strings, so if 'foo' and 'foobar' are both included,
'foobar' will match even if 'foo' is first in the list.
* It no longer uses Network.URIs URI parser, which is too restrictive
(not allowing unicode URIs unless encoded).
* It allows many more schemes.
* It better handles punctuation so as to avoid capturing trailing
punctuation in bare URLs.
This makes 's', 'l', etc. parse properly.
Formerly we had some English-centric heuristics, but they
are no longer needed now that we keep track of the last
'Str' position in state.
Closes#698.
Previously header ids were autogenerated by the writers.
Now they are generated (unless supplied explicitly) in the
markdown parser, if the `header_identifiers` extension is
selected.
In addition, the textile reader now supports id attributes on
headers.
Now by default pandoc will act as if link references have been defined
for all headers. So, you can do this:
# My header
Link to [My header].
Another link to [it][My header].
Closes#691.
Previously, UTF-8 was enforced for both input and output.
The new system:
* For input, UTF-8 is tried first; if an error is raised, the
locale encoding is tried.
* For output, the locale encoding is always used.
Note that system templates are stored as UTF8
and will still be read as such, even if the local encoding
is different. Text downloaded from URLs will also be treated
as UTF-8.
- Removed writerLiterateHaskell from WriterOptions.
- Removed readerLiterateHaskell from ReaderOptions.
- Added Ext_literate_haskell to Extensions. Test for this
instead of the above.
- Removed failUnlessLHS from Shared.
Note: At this point, +lhs and .lhs extension no longer has any effect.
Need to fix.
Now we just use the former Key' (string contents),
renamed Key. lookupKeySrc and fromKey are no longer
eport. Key', toKey' and KeyTable' have become Key,
toKey, and KeyTable.
* Use Builder's Inlines/Blocks instead of lists.
* Return values in the reader monad, which are then
run (at the end of parsing) against the final
parser state. This allows links, notes, and
example numbers to be resolved without a second
parser pass.
* An effect of using Builder is that everything is
normalized automatically.
* New exports from Text.Pandoc.Parsing:
widthsFromIndices, NoteTable', KeyTable', Key', toKey',
withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart,
doubleQuoteEnd, ellipses, apostrophe, dash
* Updated opendocument tests.
* Don't derive Show for ParserState.
* Benchmarks: markdown reader takes 82% of the time it took before.
Markdown writer takes 92% of the time (here the speedup is probably
due to the fact that everything is normalized by default).