* XML.toEntities: changed type to Text -> Text.
* Shared.tabFilter -- fixed so it strips out CRs as before.
* Modified writers to take Text.
* Updated tests, benchmarks, trypandoc.
[API change]
Closes#3731.
We also export the set of known `schemes`.
The new function replaces the function of the same name
from `Network.URI`, as the latter did not check whether a scheme is
well-known. E.g. MediaWiki wikis frequently feature pages with names
like `User:John`. These links were interpreted as URIs, thus turning
internal links into global links. This is prevented by also checking
whether the scheme of a URI is frequently used (i.e. is IANA registered
or an otherwise well-known scheme).
Fixes: #2713
Update set of well-known URIs from IANA list
All official IANA schemes (as of 2017-05-22) are included in the set of
known schemes. The four non-official schemes doi, isbn, javascript, and
pmid are kept.
Since PandocMonad is an instance of MonadError, this will allow us, in a
future commit, to change all invocations of `error` to `throwError`,
which will be preferable for the pure versions. At the moment, we're
disabling the lua custom writers (this is temporary).
This requires changing the type of the Writer in Text.Pandoc. Right now,
we run `runIOorExplode` in pandoc.hs, to make the conversion easier. We
can switch it to the safer `runIO` in the future.
Note that this required a change to Text.Pandoc.PDF as well. Since
running an external program is necessarily IO, we can be clearer about
using PandocIO.
Update all writers to take into account page breaks.
A straightforwad, far from complete, implementation of page
breaks in selected writers.
Readers will have to follow in the future as well.
Previously setting writerStandalone = True did nothing unless
a template was provided in writerTemplate. Now a fragment
will be generated if writerTemplate is Nothing; otherwise,
the specified template will be used and standalone output
generated. [API change]
The following markup features are used to output the lines of the `LineBlock`
element:
- AsciiDoc: a `[verse]` block,
- ConTeXt: text surrounded by `\startlines` and `\endlines`,
- HTML: `div` with an per-element style setting to interpret the content as
pre-wrapped,
- Markdown: line blocks if the `line_blocks` extension is enabled, a simple
paragraph with hard linebreaks otherwise,
- Org: VERSE block,
- RST: a line block, and
- all other formats: a paragraph, containing hard linebreaks between lines.
Custom lua writers should be updated to use the `LineBlock` element.