Commit graph

7458 commits

Author SHA1 Message Date
John MacFarlane
e8d7d157fd LaTeX reader: proper implicit grouping around environment macros. 2021-08-13 10:41:36 -07:00
John MacFarlane
3cfcfacd72 Use Prelude from base-compat for ghc 8.4 too.
We were having trouble building on ghc 8.4 because of
the lack of a Foldable instance for (Alt Maybe) in
base < 4.12.

Mystery: for some reason our builds were failing for gitit
but not in the pandoc CI.
2021-08-12 09:24:27 -07:00
John MacFarlane
ec34497bc1 Try fixing compile error on older ghcs.
See https://github.com/jgm/gitit/runs/3308381697
2021-08-11 23:14:43 -07:00
John MacFarlane
073895c340 Fix some lint issues. 2021-08-11 17:53:39 -07:00
John MacFarlane
dd1a956a8a LaTeX reader: Support \global before \def, \let, etc.
See .
2021-08-11 16:28:53 -07:00
John MacFarlane
e3a263df46 Fix scope for LaTeX macros.
They should by default scope over the group in which they
are defined (except `\gdef` and `\xdef`, which are global).
In addition, environments must be treated as groups.

We handle this by making sMacros in the LaTeX parser state
a STACK of macro tables. Opening a group adds a table to
the stack, closing one removes one.  Only the top of the stack
is queried.

This commit adds a parameter for scope to the Macro constructor
(not exported).

Closes .
2021-08-11 16:14:34 -07:00
John MacFarlane
a0e44b1ff6 LaTeX reader: improve handling of plain TeX macro primitives.
- Fixed semantics for `\let`.
- Implement `\edef`, `\gdef`, and `\xdef`.
- Add comment noting that currently `\def` and `\edef` set global
  macros (so are equivalent to `\gdef` and `\xdef`).  This should be
  fixed by scoping macro definitions to groups, in a future commit.

Closes .
2021-08-11 10:32:52 -07:00
John MacFarlane
3a924d8f96 HTML reader: treat commments as blank when parsing.
This modifies pBlank.  Previously comments could sometimes
flummox the parser.

Cloes .
2021-08-10 12:50:23 -07:00
John MacFarlane
3d7120083a Fix RTF table parsing bug that created undesired nested tables.
Closes .
2021-08-10 11:09:12 -07:00
John MacFarlane
6543b05116 Add RTF reader.
- `rtf` is now supported as an input format as well as output.
- New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change]

Closes .
2021-08-10 10:48:55 -07:00
John MacFarlane
c0b68b2030 Allow --slide-level=0.
When the slide level is set to 0, headings won't be used at all
in splitting the document into slides. Horizontal rules must be
used to separate slides.

Closes .
2021-08-08 11:20:26 -07:00
John MacFarlane
dea1f0f080 RTF writer: emit \outlinelevel for section headings. 2021-08-04 16:37:20 -06:00
mt_caret
407de98b5e
Stop using the HTTP package. ()
We only depend on the urlEncode function in the package, which is also
provided by http-types. The HTTP package also depends on the network
package, which has difficulty building on ghcjs.

Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03 15:53:05 -06:00
Peter Fabinski
8667ba2bcc
LaTeX table writer: Increase column width precision ()
In some cases, the rounding performed by the LaTeX table
writer would introduce visible overrun outside the text
area.
This adds two more decimal places to the width values.
2021-08-03 15:34:39 -06:00
John MacFarlane
f938378d00 RTF writer: omit \bin in \pict.
According to the spec, this is not needed or wanted when
the data is in hexadecimal format, as it is here.
2021-08-01 22:45:41 -06:00
John MacFarlane
f145aea0f9 parseFromString: preserve at least the source directory.
Previously we just set the source name to "chunk" when parsing
from strings, to avoid misleading source positions.

This had the side effect that `rebase_relative_paths` would break
inside sections that were parsed as strings.

So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk".

Closes .
2021-07-29 14:54:25 -06:00
John MacFarlane
1f1a30bbf6 LaTeX writer: Use ulem for underline.
ulem is conditionally included already when the `strikeout`
variable is set, so we set this when there is underlined text,
and use `\uline` instead of `\underline`.

This fixes wrapping for underlined text.
Closes .
2021-07-22 23:05:43 -07:00
John MacFarlane
832196fb17 MIME: use image/x-xcf instead of application/x-xcf.
Closes .
2021-07-22 13:08:30 -07:00
John MacFarlane
31a5bccd57 LaTeX reader: avoid trailing hyphen in translating languages.
Previously `\foreignlanguage{english}` turned into `<span lang="en-">`.
The same issue affected Arabic.

Closes .
2021-07-17 23:07:53 -07:00
John MacFarlane
46099e79de DocBook reader: handle images with imageobjectco elements.
Closes .
2021-07-16 13:10:45 -07:00
John MacFarlane
493522c562 LaTeX reader: Support \cline in LaTeX tables.
Closes .
2021-07-16 12:04:43 -07:00
John MacFarlane
18270c7a39 PDF: Fix svgIn path error.
We were duplicating the temp directory; this didn't show up
on macOS or linux because there we use absolute paths for
the temp directory.

Closes .
2021-07-16 11:39:02 -07:00
Jan Tojnar
06408d08e5
DocBook reader: add support for citerefentry ()
Originally intended for referring to UNIX manual pages, either part of the same DocBook document as refentry element, or external – hence the manvolnum element.
These days, refentry is more general, for example the element documentation pages linked below are each a refentry.

As per the *Processing expectations* section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document) but pandoc does not support refentry tag at the moment so that is moot.

https://tdg.docbook.org/tdg/5.1/citerefentry.html
https://tdg.docbook.org/tdg/5.1/manvolnum.html
https://tdg.docbook.org/tdg/5.1/refentry.html

This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage` but that does not fit DocBook parser.

https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage
2021-07-11 15:28:52 -07:00
John MacFarlane
ac0a9da6d8 Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).
We now use source positions from the token stream to tell us
how much of the text stream to consume.  Getting this to
work required a few other changes to make token source positions
accurate.

Closes .
2021-07-11 13:50:28 -07:00
John MacFarlane
477a67061f Always use / when adding directory to image path with extractMedia.
Even on Windows.

May help with .
2021-07-09 14:14:19 -07:00
John MacFarlane
ae22b1e977 RST reader: fix regression with code includes.
With the recent changes to include infrastructure,
included code blocks were getting an extra newline.

Closes .  Added regression test.
2021-07-09 12:27:41 -07:00
Michael Hoffmann
565330033a
Don't incorporate externally linked images in EPUB documents ()
Just like it is possible to avoid incorporating an image in EPUB by
passing `data-external="1"` to a raw HTML snippet, this makes the same
possible for native Images, by looking for an associated `external`
attribute.
2021-07-07 09:26:37 -07:00
Michael Hoffmann
e56e2b0e0b
Recognize data-external when reading HTML img tags ()
Preserve all attributes in img tags.  If attributes have a `data-`
prefix, it will be stripped.  In particular, this preserves a
`data-external` attribute as an `external` attribute in the pandoc AST.
2021-07-06 16:06:29 -07:00
John MacFarlane
e7f8cc5786 T.P.PDF, convertImage: normalize paths.
This will avoid paths on Windows with mixed path separators,
which may cause problems with SVG conversion.

See .
2021-07-06 10:39:47 -07:00
John MacFarlane
f88ebf3ebf Markdown reader: don't try to read contents in self-closing HTML tag.
Previously we had problems parsing raw HTML with self-closing
tags like `<col/>`. The problem was that pandoc would look
for a closing tag to close the markdown contents, but the
closing tag had, in effect, already been parsed by `htmlTag`.

This fixes the issue described in
<https://groups.google.com/d/msgid/pandoc-discuss/297bc662-7841-4423-bcbb-534e99bbba09n%40googlegroups.com>.
2021-07-06 10:22:07 -07:00
John MacFarlane
3ed37f0077 HTML reader: add col, colgroup to 'closes' definitions 2021-07-06 10:21:59 -07:00
John MacFarlane
3a31fe68ef Add command test for .
And fix a small bug in handling of citations in notes, which
led to commas at the end of sentences in some cases.
2021-07-05 15:10:14 -07:00
John MacFarlane
77537b1765 Citeproc: cleanup and efficiency improvement in deNote. 2021-07-05 13:41:01 -07:00
John MacFarlane
ff26af59ac Revamp note citation handling.
Use latest citeproc, which uses a Span with a class rather
than a Note for notes.  This helps us distinguish between
user notes and citation notes.

Don't put citations at the beginning of a note in parentheses.
(Closes #7394.)
2021-07-05 13:19:33 -07:00
Aner Lucero
cb038bb312 HTML5 writer, remove aria-hidden when explicit atl text is provided. 2021-07-02 13:02:52 -07:00
John MacFarlane
0948af9cc5 Docx writer: Add table numbering for captioned tables.
The numbers are added using fields, so that Word can
create a list of tables that will update automatically.
2021-06-29 11:15:40 -07:00
John MacFarlane
a01ba4463f Docx writer: Fixed a couple bugs in Figure numbering. 2021-06-29 11:15:13 -07:00
John MacFarlane
a3d745e485 Docx writer: support figure numbers.
These are set up in such a way that they will work with Word's
automatic table of figures.

Closes .
2021-06-29 09:56:21 -07:00
Aner Lucero
f4ef652a41 Remove duplicated alt text in HTML output. 2021-06-29 09:02:13 -07:00
John MacFarlane
851d037b3e Improve punctuation moving with --citeproc.
Previously, using `--citeproc` could cause punctuation to move in
quotes even when there aer no citations. This has been changed;
now, punctuation moving is limited to citations.

In addition, we only move footnotes around punctuation if the
style is a note style, even if `notes-after-punctuation` is `true`.
2021-06-28 22:41:14 -07:00
John MacFarlane
97b0aa667c Allow $ characters in bibtex keys.
Closes .
2021-06-28 13:34:12 -07:00
John MacFarlane
f045e59248 Text.Pandoc.Error: fix line calculations in reporting parsec errors.
Also remove a spurious initial newline in the error report.
2021-06-28 13:28:49 -07:00
John MacFarlane
4262898fe9 Set proper initial source name in parsing BibTeX.
(For better error messages.)
2021-06-28 13:28:02 -07:00
John MacFarlane
dd098d4e15 Markdown writer: put space between Plain and following fenced Div.
Closes .
2021-06-28 11:33:22 -07:00
John MacFarlane
4a7a0cff29 ImageSize: Add Tiff constructor for ImageType.
[Minor API change]

This allows pandoc to get size information from tiff images.
Closes .
2021-06-23 11:39:50 -07:00
John MacFarlane
235cdea629 reveal.js writer: Go back to setting boolean values for variables.
In a previous commit we used strings because boolean False
wouldn't render as `false`. This is changed in the dev
version ofdoctemplates, so we can go back to the more
straightforward approach.
2021-06-23 09:54:14 -07:00
John MacFarlane
1b07997f4a Fix regression with comment-only YAML metadata blocks.
Closes .
2021-06-22 09:55:50 -07:00
John MacFarlane
086790d986 Fix unneeded import 2021-06-22 09:49:24 -07:00
John MacFarlane
8eed5b90d0 LaTeX writer: add strut at end of minipage if it contains...
line breaks.  Without them, the last line is shorter
than it should be, at least in some cases.
2021-06-21 23:33:00 -07:00
John MacFarlane
9867231779 Revert "LaTeX writer: put a strut after a line break (\\)."
This reverts commit e2a7ecb5f7.
2021-06-21 23:19:40 -07:00