Commit graph

1454 commits

Author SHA1 Message Date
John MacFarlane
aecbf8156e Remove Text.Pandoc.BCP47 module.
[API change]

Use Lang from UnicodeCollation.Lang instead.
This is a richer implementation of BCP 47.
2021-04-17 16:15:14 -07:00
John MacFarlane
7ba8c0d2a5 Move getLang from BCP47 -> T.P.Writers.Shared.
[API change]
2021-04-17 16:15:13 -07:00
John MacFarlane
2e7fee9c3c Use latest xml-conduit. 2021-04-15 14:30:33 -07:00
Roman Beránek
fd0873c907
Require text for trypandoc (#7193) 2021-03-31 13:26:09 -07:00
John MacFarlane
d2495d0c75 Allow attoparsec 0.14.x. 2021-03-24 14:36:29 -07:00
John MacFarlane
88d3d55909 Require latest skylighting (fixes a bug in XML syntax highlighting). 2021-03-22 14:28:03 -07:00
John MacFarlane
99c15f8e85 Bump to 2.13, update changelog 2021-03-20 00:39:38 -07:00
John MacFarlane
ceadf33246 Tests: Use getExecutablePath from base...
avoiding the need to depend on the executable-path package.
2021-03-19 23:35:47 -07:00
John MacFarlane
029de661f4 Narrow version bounds for skylighting, citeproc, and texmath.
This reduces the chance that tests will fail due to behavior
changes in one of these dependencies. (See e.g. #7163)
2021-03-19 14:13:53 -07:00
John MacFarlane
425c2e47b5 Use skylighting 0.10.5.
This fixes a bad regression in Haskell syntax highlighting.
2021-03-19 11:58:56 -07:00
John MacFarlane
f0e4b9cc3c Require safe >= 0.3.18 and remove cpp. 2021-03-18 21:37:56 -07:00
John MacFarlane
2d10f46de5 Don't bake in extra stack size to the executables.
I don't think this is necessary; stack overflows generally
indicate a code problem and should be fixed (and have been when
reported).
2021-03-18 17:18:31 -07:00
John MacFarlane
2e00ed3fde Bump to 2.12.1 and update changelog. 2021-03-18 15:50:13 -07:00
John MacFarlane
8235daf780 Use -A8m for default rtsopts for benchmark 2021-03-18 15:46:20 -07:00
John MacFarlane
c7230182a0 pandoc.cabal: bake in -A8m to rtsopts.
This reserves a larger allocation area and reduces GC,
speeding up execution.
2021-03-18 11:02:46 -07:00
Alexey Kuleshevich
613c070cbd
Update bounds for random (#7156) 2021-03-17 18:57:32 -07:00
John MacFarlane
c6e5cf2e74 Benchmark improvements.
* Build `+RTS -A256m -RTS` into default ghc-options for benchmark,
  so we don't have to specify this separately on the command line.
  This is necessary to get accurate benchmark results; otherwise
  we are largely measuring garbage collecting, some not related
  to the current benchmark.
* Switch back from gauge to tasty-bench.
* Allow specifying BASELINE file in 'make bench' for comparison
  (otherwise the latest is chosen by default).
* Remove obsolete reference to weigh-pandoc from CONTRIBUTING.md.
* Remove `-Rghc-timing` from 'make bench'.
2021-03-17 13:34:17 -07:00
John MacFarlane
1ef3534328 Increase heap space in runtime for benchmarks.
Otherwise we're essentially benchmarking garbage collecting,
which can give very inconsistent results.
2021-03-16 15:59:50 -07:00
John MacFarlane
ff0fcedcb3 Switch to gauge for now for benchmarks.
tasty-bench is displaying odd behavior, with different
timings depending on the `--pattern` specified.
2021-03-15 22:50:18 -07:00
John MacFarlane
5f94dd74f1 Require texmath 0.12.2 2021-03-15 15:36:57 -07:00
John MacFarlane
39934c8851 Require latest doclayout and skylighting. 2021-03-14 15:48:01 -07:00
John MacFarlane
3519d6f3b4 Use eciteproc >= 0.3.0.9 2021-03-13 11:27:05 -08:00
Albert Krewinkel
f8b49e77f8
Use jira-wiki-markup 1.3.4
Jira reader:

* Fixed parsing of autolinks (i.e., of bare URLs in the text).
  Previously an autolink would take up the rest of a line, as spaces
  were allowed characters in these items.

* Emoji character sequences no longer cause parsing failures. This was
  due to missing backtracking when emoji parsing fails.

Jira writer:

* Block quotes are only rendered as `bq.` if they do not contain a
  linebreak.
2021-03-13 14:53:58 +01:00
John MacFarlane
e17127dc28 Re-add a needed dependency for benchmark. 2021-03-09 13:40:24 -08:00
John MacFarlane
a8b2031bb4 Revert "Use -Wunused-packages on ghc >= 8.10."
This reverts commit 7a1d0f01e9.

This option gives confusing output when a build is interrupted,
suggesting that packages aren't required when we just didn't
get to the model that requires them.
2021-03-09 12:49:15 -08:00
John MacFarlane
a9a05110d0 Remove some unused packages from pandoc.cabal. 2021-03-09 12:34:36 -08:00
John MacFarlane
7a1d0f01e9 Use -Wunused-packages on ghc >= 8.10. 2021-03-09 12:34:36 -08:00
John MacFarlane
e649b69564 Bump version to 2.12 2021-03-04 08:57:27 -08:00
John MacFarlane
92ea8a0cb6 Revert "Add T.P.Readers.LaTeX.Include."
This reverts commit b569b0226d.

Memory usage improvement in compilation wasn't very significant.
2021-03-03 19:07:16 -08:00
John MacFarlane
b569b0226d Add T.P.Readers.LaTeX.Include. 2021-03-03 18:47:17 -08:00
John MacFarlane
33e4c8dd6c Remove T.P.Readers.LaTeX.Accent.
Incorporate accentCommands into T.P.Readers.LaTeX.Inline.
2021-03-03 18:21:32 -08:00
John MacFarlane
bbcc1501a5 Split out T.P.Readers.LaTeX.Inline. 2021-03-03 10:34:10 -08:00
John MacFarlane
e8e5ffe1f4 Split out T.P.Writers.LaTeX.Util. 2021-03-02 22:40:45 -08:00
John MacFarlane
fe483c653b Split out T.P.Writers.LaTeX.Citation. 2021-03-02 21:57:37 -08:00
John MacFarlane
827ecdd2de Split out T.P.Writers.LaTeX.Lang. 2021-03-02 21:33:58 -08:00
John MacFarlane
2097411e4f Split up T.P.Writers.Markdown...
with T.P.Writers.Markdown.Types and T.P.Writers.Markdown.Inline.
The module was difficult to compile on low-memory system.s
2021-03-02 21:08:13 -08:00
John MacFarlane
7f1b933aaa Make T.P.Readers.LaTeX.Types an unexported module.
[API change]

This is really an implementation detail that shouldn't be
exposed in the public API.
2021-03-01 09:46:43 -08:00
John MacFarlane
382f0e23d2 Factor out T.P.Readers.LaTeX.Macro. 2021-03-01 09:46:43 -08:00
John MacFarlane
d2bb0c7c8d Factor out T.P.Readers.LaTeX.Math. 2021-02-28 21:05:25 -08:00
John MacFarlane
7e83686d31 trypandoc: add 2 second timeout. 2021-02-28 09:24:37 -08:00
John MacFarlane
2faa57e8e9 Factor out T.P.Readers.LaTeX.Citation. 2021-02-28 09:12:09 -08:00
John MacFarlane
08231f5cdd Factor out T.P.Readers.LaTeX.Table. 2021-02-27 21:40:56 -08:00
John MacFarlane
925815bb33 Split off T.P.Readers.LaTeX.Accent.
To help reduce memory demands compiling the main LaTeX reader.
2021-02-27 17:02:44 -08:00
John MacFarlane
cbc3f034ad Use skylighting 0.10.4.
This version of skylighting uses xml-conduit rather than hxt.
This speeds up parsing of XML syntax definitions fourfold, and
removes four packages from pandoc's dependency graph:

hxt-charproperties
hxt-unicode
hxt-regex-xmlschema
hxt
2021-02-27 14:26:10 -08:00
John MacFarlane
9767386676 Use latest skylighting. 2021-02-22 22:25:10 -08:00
John MacFarlane
d7cfa0ef4c Remove weigh-pandoc.
It's not really useful any more, now that our regular
benchmarks include data on allocation.
2021-02-22 22:10:20 -08:00
John MacFarlane
b2b32d9bb2 'make bench': Create csv files for comparison. 2021-02-18 23:22:18 -08:00
Dmitrii Kovanikov
ef741f3842 Allow base64-bytestring-1.2.* 2021-02-18 18:07:23 +01:00
John MacFarlane
80a1d5c9b6 Revert "Add T.P.XML.Light.Cursor."
This reverts commit d8fc497186.
2021-02-16 19:18:01 -08:00
John MacFarlane
d8fc497186 Add T.P.XML.Light.Cursor. 2021-02-16 18:51:41 -08:00
John MacFarlane
d7a4996b1e Split up T.P.XML.Light into submodules. 2021-02-16 18:40:06 -08:00
John MacFarlane
967e7f5fb9 Rename Text.Pandoc.XMLParser -> Text.Pandoc.XML.Light...
..and add new definitions isomorphic to xml-light's, but with
Text instead of String.  This allows us to keep most of the code in
existing readers that use xml-light, but avoid lots of unnecessary
allocation.

We also add versions of the functions from xml-light's
Text.XML.Light.Output and Text.XML.Light.Proc that operate
on our modified XML types, and functions that convert
xml-light types to our types (since some of our dependencies,
like texmath, use xml-light).

Update golden tests for docx and pptx.

OOXML test: Use `showContent` instead of `ppContent` in `displayDiff`.

Docx: Do a manual traversal to unwrap sdt and smartTag.
This is faster, and needed to pass the tests.

Benchmarks:

A = prior to 8ca191604d (Feb 8)
B = as of 8ca191604d (Feb 8)
C = this commit

| Reader  |  A    | B      | C     |
| ------- | ----- | ------ | ----- |
| docbook | 18 ms | 12 ms  | 10 ms |
| opml    | 65 ms | 62 ms  | 35 ms |
| jats    | 15 ms | 11 ms  |  9 ms |
| docx    | 72 ms | 69 ms  | 44 ms |
| odt     | 78 ms | 41 ms  | 28 ms |
| epub    | 64 ms | 61 ms  | 56 ms |
| fb2     | 14 ms | 5  ms  | 4 ms  |
2021-02-16 16:55:20 -08:00
Albert Krewinkel
1942dc5611
Allow tasty 1.4.* 2021-02-14 14:43:32 +01:00
Albert Krewinkel
8ffd4159d6
Jira: require jira-wiki-markup 1.3.3
* Modified the Doc parser to skip leading blank lines. This fixes
  parsing of documents which start with multiple blank lines.
  (#7095)

* Prevent URLs within link aliases to be treated as autolinks.
  (#6944)

Fixes: #7095
Fixes: #6944
2021-02-12 17:15:12 +01:00
John MacFarlane
8ca191604d Add new unexported module T.P.XMLParser.
This exports functions that uses xml-conduit's parser to
produce an xml-light Element or [Content].  This allows
existing pandoc code to use a better parser without
much modification.

The new parser is used in all places where xml-light's
parser was previously used.  Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).

Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation.  It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.

The new parser also reports errors, which we report
when possible.

A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].

Closes #7091, which was the main stimulus.

These changes revealed the need for some changes
in the tests.  The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.

Add entity defs to docbook-reader.docbook.

Update golden tests for docx.
2021-02-10 22:04:11 -08:00
Albert Krewinkel
d202f7eb77
Avoid unnecessary use of NoImplicitPrelude pragma (#7089) 2021-02-07 10:02:35 -08:00
Albert Krewinkel
f7be8d0964
pandoc.cabal: use common stanza to reduce duplication (#7086) 2021-02-07 08:33:43 -08:00
Albert Krewinkel
ebb8f23b66 Use hslua-module-path 0.1.0 2021-02-02 21:04:30 -08:00
Albert Krewinkel
61b108d527 Lua: add module "pandoc.path"
The module allows to work with file paths in a convenient and
platform-independent manner.

Closes: #6001
Closes: #6565
2021-02-02 21:04:30 -08:00
John MacFarlane
02d3c71e72 BibTeX writer: use doclayout and doctemplate.
This change allows bibtex/biblatex output to wrap as other
formats do, depending on the settings of `--wrap` and `--columns`.

It also introduces default templates for bibtex and biblatex,
which allow for using the variables `header-include`, `include-before`
or `include-after` (or alternatively the command line options
`--include-in-header`, `--include-before-body`, `--include-after-body`)
to insert content into the generated bibtex/biblatex.

This change requires a change in the return type of the unexported
`T.P.Citeproc.writeBibTeXString` from `Text` to `Doc Text`.

Closes #7068.
2021-02-01 18:05:20 -08:00
John MacFarlane
b239c89a82 BibTeX writer fixes. Closes #7067.
+ Require citeproc 0.3.0.7, which correctly titlecases when titles
  contain non-ASCII characters.
+ Correctly handle 'pages' (= 'page' in CSL).
+ Correctly handle BibLaTeX 'langid' (= 'language' in CSL).
+ In BibTeX output, protect foreign titles since there's no language
  field.
2021-02-01 11:23:07 -08:00
John MacFarlane
a9adb29648 Require citeproc 0.3.0.6. 2021-01-30 19:09:08 -08:00
John MacFarlane
fe06437ba4 Use tasty-bench instead of criterion for benchmarks.
It is much lighter-weight.
2021-01-30 18:01:14 -08:00
John MacFarlane
f5e3c1dad6 Use citeproc 0.3.0.5. 2021-01-22 11:06:35 -08:00
John MacFarlane
83d7804b8f
Merge pull request #7042 from tarleb/jats-element-citations
JATS writer: use element citations
2021-01-22 10:39:58 -08:00
Albert Krewinkel
b4b3560191
JATS writer: allow to use element-citation 2021-01-22 19:35:08 +01:00
John MacFarlane
fa952c8dbe Add biblatex, bibtex as output formats (closes #7040).
* `biblatex` and `bibtex` are now supported as output
  as well as input formats.

* New module Text.Pandoc.Writers.BibTeX, exporting
  writeBibTeX and writeBibLaTeX. [API change]

* New unexported function `writeBibtexString` in
  Text.Pandoc.Citeproc.BibTeX.
2021-01-22 10:08:43 -08:00
John MacFarlane
07c98eae50 Use citeproc >= 0.3.0.4. 2021-01-15 14:17:17 -08:00
John MacFarlane
4a223e68f4 Use commonmark 0.1.1.3. 2021-01-11 12:23:55 -08:00
Albert Krewinkel
68fa437999
JATS writer: fix citations (#7018)
* JATS writer: keep code lines at 80 chars or below

* JATS writer: fix citations
2021-01-10 15:35:48 -08:00
John MacFarlane
c83811773e Bump to 2.11.4.
API change: export getReferences from T.P.Citeproc.
2021-01-10 10:16:15 -08:00
Albert Krewinkel
4f34345867
Update copyright notices for 2021 (#7012) 2021-01-08 09:38:20 -08:00
David Martschenko
385b6a3b21
Implement defaults file inheritance (#6924)
Allow defaults files to inherit options from other defaults files by
specifying them with the following syntax:
`defaults: [list of defaults files or single defaults file]`.
2021-01-05 10:15:59 -08:00
John MacFarlane
886faa3cbc Bump to 2.11.3.2, update changelog and man page 2020-12-29 12:48:55 -08:00
John MacFarlane
5d8b57444e Use citeproc 0.3.0.3.
Fixes an issue in author-only citations when both an
author and translator are present.
2020-12-29 10:43:50 -08:00
John MacFarlane
a7a162ea55 Update test for new citeproc and require it in cabal. 2020-12-28 14:40:23 -08:00
John MacFarlane
19d4e43605 Require texmath 0.12.1. 2020-12-27 22:57:14 -08:00
Albert Krewinkel
8f402beab9
LaTeX writer: support colspans and rowspans in tables. (#6950)
Note that the multirow package is needed for rowspans.
It is included in the latex template under a variable,
so that it won't be used unless needed for a table.
2020-12-20 18:04:54 -08:00
John MacFarlane
37ba5d5dfe Bump to 2.11.3.1 and update changelog and man page. 2020-12-18 15:29:57 -08:00
John MacFarlane
aa37970969 Use citeproc 0.3.0.1. 2020-12-18 15:08:23 -08:00
John MacFarlane
591bb2bace Add test/writer.asciidoctor, tables.asciidoctor to extra-source-files. 2020-12-18 11:27:41 -08:00
John MacFarlane
29e7fef729 Include missing jats test files in pandoc.cabal.
See #6961.
2020-12-18 08:02:36 -08:00
John MacFarlane
9ec3d6ee97 Use skylighting 0.10.2.
Cloess #6625.
2020-12-17 09:32:13 -08:00
John MacFarlane
914cf0b602 Fix citeproc regression with duplicate references.
- Use dev version of citeproc, which handles duplicate
  ids better, preferring the last one in the list
  and discarding the rest.
- Ensure that inline citations take priority over external
  ones.

See jgm/citeproc#36.

This restores the behavior of pandoc-citeproc.
2020-12-16 15:37:40 -08:00
John MacFarlane
b4b4e32307 Properly handle boolean values in writing YAML metadata.
(Markdown writer.)
This requires doctemplates >= 0.9.
Closes #6388.
2020-12-15 23:45:34 -08:00
John MacFarlane
2ce14997ad Require binary >= 0.7.
Needed for runGetOrFail.
2020-12-13 10:33:46 -08:00
Albert Krewinkel
ccd235e31f
LaTeX writer: extract table handling into separate module. 2020-12-12 16:48:28 +01:00
John MacFarlane
248a2a1db5 cabal: remove -Werror=missing-home-modules.
It causes problems using cabal repl.
2020-12-10 10:27:31 -08:00
John MacFarlane
1fd642dd30 Move executable to app directory.
Otherwise we have problems with cabal repl.
2020-12-10 10:08:24 -08:00
John MacFarlane
a3eb87b2ea Add sourcepos extension for commonmarke
* Add `Ext_sourcepos` constructor for `Extension`.
* Add `sourcepos` extension (only for commonmark).
* Bump to 2.11.3

With the `sourcepos` extension set set, `data-pos` attributes are added
to the AST by the commonmark reader. No other readers are affected.  The
`data-pos` attributes are put on elements that accept attributes; for
other elements, an enlosing Div or Span is added to hold the attributes.

Closes #4565.
2020-12-10 08:59:55 -08:00
John MacFarlane
0dd228593f Use latest citeproc release. 2020-12-09 09:34:15 -08:00
John MacFarlane
1489bb8414 Use skylighting 0.10.1. 2020-11-24 21:26:25 -08:00
Albert Krewinkel
41237fcc0e
HTML reader: extract table parsing into separate module 2020-11-24 14:17:35 +01:00
Albert Krewinkel
f9258371dd HTML reader: extract submodules
Reducing module size should reduce memory use during compilation.

This is preparatory work to tackle support for more table features.
2020-11-23 10:12:20 +01:00
John MacFarlane
3f278f580e Remove 'static' flag.
This isn't really necessary and can be misleading
(e.g. on macOS, where a fully static build isn't
possible). cabal's new option
`--enable-executable-static` does the same. On stack
you can add something like this to the options for your
executable in package.yaml:

    ld-options: -static -pthread
2020-11-18 21:08:24 -08:00
John MacFarlane
e17f970ed0 Use citeproc 0.2 2020-11-18 17:49:30 -08:00
John MacFarlane
46bbdad838 Don't allow macos builds with 'static' flag.
Closes #6771.
2020-11-18 15:41:48 -08:00
Albert Krewinkel
94c9028819 JATS writer: move Table handling to separate module
This makes it easier to split the module into smaller parts.
2020-11-17 09:46:30 +01:00
John MacFarlane
79907e5f17 Bump to 2.11.2 for next release (minor API change in Logging). 2020-11-15 08:34:45 -08:00
John MacFarlane
cfb017c76b Bump to 2.11.1.1 and update changelog. 2020-11-07 11:12:19 -08:00