...when handling URL argument served with no charset in the mime type.
The assumption is that most pages that don't specify a charset
in the mime type are either UTF-8 or latin1. I think that's a good
assumption, though I'm not sure.
[API change] This affects `readFile`, `getContents`, `writeFileWith`,
`writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`.
`hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`.
This avoids the need to uselessly create a linked list of characters
when emiting output.
Also, remove exported class NamedTag(..) [API change].
This was just intended to smooth over the transition from String to Text
and is no longer needed.
The functions isInlineTag and isBlockTag are no longer
polymorphic.
+ Run writer benchmarks for binary formats too.
+ Alphabetize benchmarks.
+ Don't run benchmarks for bibliography formats
(yet; we need a special input for them).
As of now, the default style for ODT documents has a "First paragraph" style that inherits from "Standard" style and has no top or bottom margin. All subsequent paragraphs have "Text_20_body" style that inherits from "Standard" and add "0.0598in" margins on top and bottom. This makes the final document a bit ugly since the first paragraph has a small gap ("0.0598in") towards the second one, and all subsequent have double that.
The proposed fix makes "First paragraph" inherit from "Text_20_body" instead so that it also has a consistent margin.
Another approach would be to inherit "Text_20_body" and add a 0 margin on top.
With the new XML parser, we can avoid the expensive tree
normalization step we used to do.
This gives a significant speed boost in docbook and JATS
parsing (e.g. 9.7 to 6 ms).
This is to prevent accidental creation of ligatures like
`` ?` `` and `` !` `` (especially in languages with quotations
like German), and similar ligature issues.
See jgm/citeproc#54.
Use `\vadjust pre` so that the hypertarget takes you to the
beginning of the paragraph rather than one line down.
Closes#7078.
This makes a particular difference for links to citations
using `--citeproc` and `link-citations: true`.
The org-ref syntax allows to list multiple citations separated by comma.
This fixes a bug that accepted commas as part of the citation id, so all
citation lists were parsed as one single citation.
Fixes: #7101
..and add new definitions isomorphic to xml-light's, but with
Text instead of String. This allows us to keep most of the code in
existing readers that use xml-light, but avoid lots of unnecessary
allocation.
We also add versions of the functions from xml-light's
Text.XML.Light.Output and Text.XML.Light.Proc that operate
on our modified XML types, and functions that convert
xml-light types to our types (since some of our dependencies,
like texmath, use xml-light).
Update golden tests for docx and pptx.
OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.
Docx: Do a manual traversal to unwrap sdt and smartTag.
This is faster, and needed to pass the tests.
Benchmarks:
A = prior to 8ca191604d (Feb 8)
B = as of 8ca191604d (Feb 8)
C = this commit
| Reader | A | B | C |
| ------- | ----- | ------ | ----- |
| docbook | 18 ms | 12 ms | 10 ms |
| opml | 65 ms | 62 ms | 35 ms |
| jats | 15 ms | 11 ms | 9 ms |
| docx | 72 ms | 69 ms | 44 ms |
| odt | 78 ms | 41 ms | 28 ms |
| epub | 64 ms | 61 ms | 56 ms |
| fb2 | 14 ms | 5 ms | 4 ms |