Commit graph

14814 commits

Author SHA1 Message Date
John MacFarlane
418155aa95 Fix raw LaTeX injection issue (LaTeX writer).
Using a code block containing `\end{verbatim}`, one could
inject raw TeX into a LaTeX document even when `raw_tex`
is disabled.  Thanks to Augustin Laville for noticing the
bug.

Closes #7497.
2021-08-13 11:27:04 -07:00
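The injection fixed in 418155aa95 can be illustrated with a minimal Python sketch. This is not pandoc's Haskell code: `naive_verbatim` and `contains_verbatim_terminator` are hypothetical names, and the commit message does not say how the writer now handles such blocks, so the check below only shows how the problem might be detected.

```python
TERMINATOR = r"\end{verbatim}"

def naive_verbatim(code: str) -> str:
    """Wrap code in a verbatim environment without inspecting it (unsafe)."""
    return "\\begin{verbatim}\n" + code + "\n\\end{verbatim}"

def contains_verbatim_terminator(code: str) -> bool:
    """True if the code would close the verbatim environment early."""
    return TERMINATOR in code

# A code block whose content ends the environment, runs raw TeX,
# then reopens the environment so the writer's own closing line still matches.
malicious = "harmless\n\\end{verbatim}\n\\input{secret.tex}\n\\begin{verbatim}"
```

With `naive_verbatim(malicious)`, the `\input` line ends up outside any verbatim environment, which is the raw TeX injection the commit describes.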
John MacFarlane
e8d7d157fd LaTeX reader: proper implicit grouping around environment macros. 2021-08-13 10:41:36 -07:00
William Lupton
649133fe5e
Document use of the 'underline' class. (#7492)
Addresses a comment in #7484.
2021-08-12 10:18:22 -07:00
William Lupton
fc20672bb9
Various sample.lua editorial fixes. (#7493)
These address most of the items mentioned in #7487.
There's also a table caption fix (the caption wasn't escaped).
2021-08-12 10:16:34 -07:00
John MacFarlane
ebef4fb41d Bump base-compat version so we get compatibility with base 4.12. 2021-08-12 09:26:39 -07:00
John MacFarlane
3cfcfacd72 Use Prelude from base-compat for ghc 8.4 too.
We were having trouble building on ghc 8.4 because of
the lack of a Foldable instance for (Alt Maybe) in
base < 4.12.

Mystery: for some reason our builds were failing for gitit
but not in the pandoc CI.
2021-08-12 09:24:27 -07:00
Emily Bourke
86cce2b2eb
Add haskell-language-server to shell.nix (#7496)
I find it useful to have this installed via the nix shell when working
on pandoc, so I think others may also find it useful.

Co-authored-by: Emily Bourke <undergroundquizscene@protonmail.com>
2021-08-12 09:20:03 -07:00
John MacFarlane
ec34497bc1 Try fixing compile error on older ghcs.
See https://github.com/jgm/gitit/runs/3308381697
2021-08-11 23:14:43 -07:00
John MacFarlane
073895c340 Fix some lint issues. 2021-08-11 17:53:39 -07:00
John MacFarlane
dd1a956a8a LaTeX reader: Support \global before \def, \let, etc.
See #7494.
2021-08-11 16:28:53 -07:00
John MacFarlane
e3a263df46 Fix scope for LaTeX macros.
They should by default scope over the group in which they
are defined (except `\gdef` and `\xdef`, which are global).
In addition, environments must be treated as groups.

We handle this by making sMacros in the LaTeX parser state
a STACK of macro tables. Opening a group adds a table to
the stack, closing one removes one.  Only the top of the stack
is queried.

This commit adds a parameter for scope to the Macro constructor
(not exported).

Closes #7494.
2021-08-11 16:14:34 -07:00
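The stack of macro tables described in e3a263df46 might be sketched as follows. This is an illustration in Python, not the parser's Haskell code. The class and method names are hypothetical, and two details are assumptions the commit message does not spell out: a new group starts with a copy of the enclosing table (so that querying only the top still sees outer macros), and global definitions are written into every table on the stack.

```python
class MacroScopes:
    """A stack of macro tables; only the top table is ever queried."""

    def __init__(self):
        self._stack = [{}]  # bottom table = document-level scope

    def open_group(self):
        # Assumption: the new table inherits the enclosing definitions.
        self._stack.append(dict(self._stack[-1]))

    def close_group(self):
        if len(self._stack) == 1:
            raise ValueError("no open group to close")
        self._stack.pop()

    def define(self, name, body, is_global=False):
        if is_global:
            # \gdef / \xdef style: visible after every enclosing group closes.
            for table in self._stack:
                table[name] = body
        else:
            # \def style: lives only as long as the current group.
            self._stack[-1][name] = body

    def lookup(self, name):
        return self._stack[-1].get(name)
```

A local redefinition inside a group disappears when the group closes, while a global one survives:

```python
scopes = MacroScopes()
scopes.define("a", "outer")
scopes.open_group()
scopes.define("a", "inner")
scopes.define("g", "global", is_global=True)
scopes.close_group()
# scopes.lookup("a") is "outer" again; scopes.lookup("g") is still "global"
```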
William Lupton
f00f7ec63c
Clarify internal punctuation in citation keys. (#7491)
Addresses a comment in #5458.
2021-08-11 11:12:45 -07:00
John MacFarlane
a0e44b1ff6 LaTeX reader: improve handling of plain TeX macro primitives.
- Fixed semantics for `\let`.
- Implement `\edef`, `\gdef`, and `\xdef`.
- Add comment noting that currently `\def` and `\edef` set global
  macros (so are equivalent to `\gdef` and `\xdef`).  This should be
  fixed by scoping macro definitions to groups, in a future commit.

Closes #7474.
2021-08-11 10:32:52 -07:00
John MacFarlane
06d97131e5 Tests.Helpers: export testGolden and use it in RTF reader.
This gives a diff output on failure.
2021-08-10 22:07:48 -07:00
John MacFarlane
3a924d8f96 HTML reader: treat comments as blank when parsing.
This modifies pBlank.  Previously comments could sometimes
flummox the parser.

Closes #7482.
2021-08-10 12:50:23 -07:00
John MacFarlane
7ca4233793 Add test for #7488. 2021-08-10 11:11:33 -07:00
John MacFarlane
3d7120083a Fix RTF table parsing bug that created undesired nested tables.
Closes #7488.
2021-08-10 11:09:12 -07:00
John MacFarlane
6543b05116 Add RTF reader.
- `rtf` is now supported as an input format as well as output.
- New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change]

Closes #3982.
2021-08-10 10:48:55 -07:00
John MacFarlane
c0b68b2030 Allow --slide-level=0.
When the slide level is set to 0, headings won't be used at all
in splitting the document into slides. Horizontal rules must be
used to separate slides.

Closes #7476.
2021-08-08 11:20:26 -07:00
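The behaviour of `--slide-level=0` described in c0b68b2030 could be pictured with the Python sketch below. Blocks are modelled as plain strings and `split_slides` is a hypothetical helper, not a pandoc function. Dropping empty slides is an assumption made here for simplicity.

```python
def split_slides(blocks, rule="HorizontalRule"):
    """Split a flat list of blocks into slides at horizontal rules only.

    Headings are treated like any other block, mirroring slide level 0.
    """
    slides = [[]]
    for block in blocks:
        if block == rule:
            slides.append([])  # a rule starts a new slide
        else:
            slides[-1].append(block)
    # Assumption: slides with no content are discarded.
    return [slide for slide in slides if slide]
```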
John MacFarlane
6acb674e89 Remove obsolete and incorrect sentence in --slide-level docs. 2021-08-08 11:17:28 -07:00
John MacFarlane
dea1f0f080 RTF writer: emit \outlinelevel for section headings. 2021-08-04 16:37:20 -06:00
mt_caret
407de98b5e
Stop using the HTTP package. (#7456)
We only depend on the urlEncode function in the package, which is also
provided by http-types. The HTTP package also depends on the network
package, which has difficulty building on ghcjs.

Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03 15:53:05 -06:00
Peter Fabinski
8667ba2bcc
LaTeX table writer: Increase column width precision (#7466)
In some cases, the rounding performed by the LaTeX table
writer would introduce visible overrun outside the text
area.
This adds two more decimal places to the width values.
2021-08-03 15:34:39 -06:00
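The effect that 8667ba2bcc addresses can be shown with a small Python calculation. The commit only says that two more decimal places were added; the concrete precisions used here (2 before, 4 after) and the six equal columns are illustrative assumptions, and the real writer's width computation involves more than rounding.

```python
def rounded_total(fractions, places):
    """Sum of column width fractions after rounding each to `places` decimals."""
    return sum(round(f, places) for f in fractions)

# Six equal columns: each 1/6 rounds up, so the total exceeds the text width.
six_equal = [1 / 6] * 6
overrun_2 = rounded_total(six_equal, 2) - 1.0  # about 0.02 of the text width
overrun_4 = rounded_total(six_equal, 4) - 1.0  # about 0.0002 of the text width
```

In this example, extra precision does not remove the rounding error but shrinks it by about two orders of magnitude, which is enough to make the overrun invisible.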
John MacFarlane
f938378d00 RTF writer: omit \bin in \pict.
According to the spec, this is not needed or wanted when
the data is in hexadecimal format, as it is here.
2021-08-01 22:45:41 -06:00
John MacFarlane
c1ab55793b Add a faq about the "Cannot allocate memory" error on M1 macs. 2021-08-01 10:29:47 -06:00
John MacFarlane
ca12e198ba RTF template: specify font family for fixed-width font f1.
According to the spec, this is mandatory.
2021-08-01 09:45:09 -06:00
John MacFarlane
f145aea0f9 parseFromString: preserve at least the source directory.
Previously we just set the source name to "chunk" when parsing
from strings, to avoid misleading source positions.

This had the side effect that `rebase_relative_paths` would break
inside sections that were parsed as strings.

So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk".

Closes #7464.
2021-07-29 14:54:25 -06:00
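The reasoning in f145aea0f9 can be sketched in Python. The helper names are hypothetical, and reading `"ORIGINAL_SOURCE_PATH_chunk"` as "the original path with `_chunk` appended" is an interpretation of the commit message, not a confirmed detail of the implementation.

```python
import posixpath

def chunk_source_name(original_source_path: str) -> str:
    """Source name for a string-parsed chunk that keeps the original directory."""
    return original_source_path + "_chunk"

def source_directory(source_name: str) -> str:
    """Directory that relative paths would be rebased against."""
    return posixpath.dirname(source_name)
```

With the old bare name `"chunk"`, `source_directory` yields an empty string, so there is nothing to rebase against. With the new name, the directory of the original file is still recoverable.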
John MacFarlane
1f1a30bbf6 LaTeX writer: Use ulem for underline.
ulem is conditionally included already when the `strikeout`
variable is set, so we set this when there is underlined text,
and use `\uline` instead of `\underline`.

This fixes wrapping for underlined text.
Closes #7351.
2021-07-22 23:05:43 -07:00
John MacFarlane
832196fb17 MIME: use image/x-xcf instead of application/x-xcf.
Closes #7454.
2021-07-22 13:08:30 -07:00
Veratyr
26d883faa8
INSTALL.md: Add GitLab CI/CD example (#7448) 2021-07-19 09:19:41 -07:00
John MacFarlane
c909d16ccc Bump to 2.14.1, update changelog and man page. 2021-07-18 11:39:04 -07:00
John MacFarlane
92efc60571 Fix comment syntax in cabal.project 2021-07-18 11:10:45 -07:00
John MacFarlane
168b559c96 Use doctemplates 0.4.1 and citeproc 0.10. 2021-07-18 10:55:27 -07:00
John MacFarlane
cfa52e5824 Use skylighting 0.11. 2021-07-17 23:29:15 -07:00
John MacFarlane
31a5bccd57 LaTeX reader: avoid trailing hyphen in translating languages.
Previously `\foreignlanguage{english}` turned into `<span lang="en-">`.
The same issue affected Arabic.

Closes #7447.
2021-07-17 23:07:53 -07:00
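The trailing hyphen fixed in 31a5bccd57 is typical of joining tag parts unconditionally. The Python sketch below shows one plausible way such a bug arises and a fix. Both functions are hypothetical, and the commit message does not state the actual cause.

```python
def buggy_lang_tag(language: str, region: str = "") -> str:
    """Always inserts the separator, even when there is no region."""
    return language + "-" + region

def render_lang_tag(language: str, region: str = "") -> str:
    """Joins only the non-empty parts, so no dangling hyphen is produced."""
    return "-".join(part for part in (language, region) if part)
```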
John MacFarlane
46099e79de DocBook reader: handle images with imageobjectco elements.
Closes #7440.
2021-07-16 13:10:45 -07:00
John MacFarlane
493522c562 LaTeX reader: Support \cline in LaTeX tables.
Closes #7442.
2021-07-16 12:04:43 -07:00
John MacFarlane
18270c7a39 PDF: Fix svgIn path error.
We were duplicating the temp directory; this didn't show up
on macOS or linux because there we use absolute paths for
the temp directory.

Closes #7431.
2021-07-16 11:39:02 -07:00
Jan Tojnar
06408d08e5
DocBook reader: add support for citerefentry (#7437)
Originally intended for referring to UNIX manual pages, either part of the same DocBook document as a refentry element, or external – hence the manvolnum element.
These days, refentry is more general; for example, the element documentation pages linked below are each a refentry.

As per the *Processing expectations* section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document), but pandoc does not support the refentry tag at the moment, so that is moot.

https://tdg.docbook.org/tdg/5.1/citerefentry.html
https://tdg.docbook.org/tdg/5.1/manvolnum.html
https://tdg.docbook.org/tdg/5.1/refentry.html

This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage` but that does not fit DocBook parser.

https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage
2021-07-11 15:28:52 -07:00
John MacFarlane
ac0a9da6d8 Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).
We now use source positions from the token stream to tell us
how much of the text stream to consume.  Getting this to
work required a few other changes to make token source positions
accurate.

Closes #7434.
2021-07-11 13:50:28 -07:00
John MacFarlane
477a67061f Always use / when adding directory to image path with extractMedia.
Even on Windows.

May help with #7431.
2021-07-09 14:14:19 -07:00
John MacFarlane
ae22b1e977 RST reader: fix regression with code includes.
With the recent changes to include infrastructure,
included code blocks were getting an extra newline.

Closes #7436.  Added regression test.
2021-07-09 12:27:41 -07:00
Michael Hoffmann
565330033a
Don't incorporate externally linked images in EPUB documents (#7430)
Just like it is possible to avoid incorporating an image in EPUB by
passing `data-external="1"` to a raw HTML snippet, this makes the same
possible for native Images, by looking for an associated `external`
attribute.
2021-07-07 09:26:37 -07:00
Michael Hoffmann
e56e2b0e0b
Recognize data-external when reading HTML img tags (#7429)
Preserve all attributes in img tags.  If attributes have a `data-`
prefix, it will be stripped.  In particular, this preserves a
`data-external` attribute as an `external` attribute in the pandoc AST.
2021-07-06 16:06:29 -07:00
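The attribute handling described in e56e2b0e0b could look roughly like the Python sketch below. `strip_data_prefix` is a hypothetical name, and attributes are modelled as a plain dictionary rather than pandoc's AST types.

```python
def strip_data_prefix(attrs):
    """Keep every attribute, dropping a leading `data-` from its name."""
    prefix = "data-"
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in attrs.items()
    }
```

For example, `{"src": "a.png", "data-external": "1"}` becomes `{"src": "a.png", "external": "1"}`. If both `external` and `data-external` are present, this sketch lets the later one win; how the reader resolves that case is not stated in the commit message.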
John MacFarlane
e7f8cc5786 T.P.PDF, convertImage: normalize paths.
This will avoid paths on Windows with mixed path separators,
which may cause problems with SVG conversion.

See #7431.
2021-07-06 10:39:47 -07:00
John MacFarlane
f88ebf3ebf Markdown reader: don't try to read contents in self-closing HTML tag.
Previously we had problems parsing raw HTML with self-closing
tags like `<col/>`. The problem was that pandoc would look
for a closing tag to close the markdown contents, but the
closing tag had, in effect, already been parsed by `htmlTag`.

This fixes the issue described in
<https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
2021-07-06 10:22:07 -07:00
John MacFarlane
3ed37f0077 HTML reader: add col, colgroup to 'closes' definitions 2021-07-06 10:21:59 -07:00
John MacFarlane
3a31fe68ef Add command test for #7394.
And fix a small bug in handling of citations in notes, which
led to commas at the end of sentences in some cases.
2021-07-05 15:10:14 -07:00
John MacFarlane
77537b1765 Citeproc: cleanup and efficiency improvement in deNote. 2021-07-05 13:41:01 -07:00
John MacFarlane
ff26af59ac Revamp note citation handling.
Use latest citeproc, which uses a Span with a class rather
than a Note for notes.  This helps us distinguish between
user notes and citation notes.

Don't put citations at the beginning of a note in parentheses.
(Closes #7394.)
2021-07-05 13:19:33 -07:00