Commit graph

7153 commits

Author SHA1 Message Date
John MacFarlane
3cd21c5f6e Improve fix to #6983.
If we have a paragraph then a bookmarkEnd, we don't need to
insert the empty paragraph (and in fact it alters the spacing).

Closes #6983.
2020-12-29 08:44:43 -08:00
John MacFarlane
55f9b59af1 Docx writer: fix nested tables with captions.
Previously we got unreadable content, because docx seems
to want a `<w:p>` element (even an empty one) at the end of
every table cell.  Closes #6983.
2020-12-28 14:41:28 -08:00
Albert Krewinkel
e837ed772e
HTML reader: use renderTags' from Text.Pandoc.Shared.
The `renderTags'` function was duplicated when the reader used `Text` as
its string type. The duplication is no longer necessary.

A side effect of this change is that empty `<col>` elements are written
as self-closing tags in raw HTML blocks.
2020-12-28 14:48:55 +01:00
John MacFarlane
99e1b67b74 Use meta-description instead of description in templates.
Since this is an attribute value, we need to prepare it
in the writer.
2020-12-27 23:19:14 -08:00
timo-a
668596cc89
Add support for writing nested tables to asciidoc (#6972)
Added field to WriterState that denotes the current nesting level for traversing tables.
Depending on the value of that field nested tables are recognized and written.
Asciidoc supports one level of nesting. If deeper tables are to be written, they are
omitted and a warning is issued.
2020-12-27 18:42:28 -08:00
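
The nesting-level idea in commit 668596cc89 can be illustrated with a minimal sketch. This is not the pandoc implementation (which is written in Haskell and keeps the level in `WriterState`); the function name `write_table`, the list-of-rows table representation, and the simplified output are assumptions made for illustration only.

```python
def write_table(rows, level=0, warnings=None):
    """Render a table given as a list of rows; a cell is a string or a nested table.

    Hypothetical helper: level 0 is the outer table, level 1 a nested table.
    Anything deeper is omitted and a warning is recorded.
    """
    warnings = [] if warnings is None else warnings
    if level > 1:
        warnings.append("nested table omitted: only one level of nesting is supported")
        return "", warnings
    # Outer tables use '|' as the delimiter character, nested tables use '!'.
    sep = "|" if level == 0 else "!"
    lines = [sep + "==="]
    for row in rows:
        for cell in row:
            if isinstance(cell, list):
                inner, _ = write_table(cell, level + 1, warnings)
                # The 'a' prefix marks a cell whose content is block-level markup.
                lines.append("a" + sep)
                if inner:
                    lines.append(inner)
            else:
                lines.append(sep + " " + cell)
    lines.append(sep + "===")
    return "\n".join(lines), warnings
```

Passing the level down as an argument (rather than mutating shared state) keeps the sketch short; the commit itself stores it as a field in the writer state.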
Albert Krewinkel
dcd89413f3 Powerpoint writer: allow arbitrary OOXML in raw inline elements
The raw text is now included verbatim in the output. Previously it was parsed
into XML elements, which prevented the inclusion of partial XML snippets.
2020-12-27 23:18:54 +01:00
John MacFarlane
47f435276a Citeproc: fix handling of empty URL variables (DOI, etc.).
The `linkifyVariables` function was changing these to links
which then got treated as non-empty by citeproc, leading
to wrong results (e.g. ignoring nonempty URL when empty DOI is present).

Addresses part 2 of jgm/citeproc#41.
2020-12-24 09:56:20 -08:00
John MacFarlane
9cbbf18fe1 HTML writer: don't include p tags in CSL bibliography entries.
Fixes a regression in 2.11.3.
Closes #6966
2020-12-20 22:34:31 -08:00
Albert Krewinkel
8f402beab9
LaTeX writer: support colspans and rowspans in tables. (#6950)
Note that the multirow package is needed for rowspans.
It is included in the latex template under a variable,
so that it won't be used unless needed for a table.
2020-12-20 18:04:54 -08:00
John MacFarlane
914cf0b602 Fix citeproc regression with duplicate references.
- Use dev version of citeproc, which handles duplicate
  ids better, preferring the last one in the list
  and discarding the rest.
- Ensure that inline citations take priority over external
  ones.

See jgm/citeproc#36.

This restores the behavior of pandoc-citeproc.
2020-12-16 15:37:40 -08:00
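
The two rules in commit 914cf0b602 (the last duplicate id wins, and inline references take priority over external ones) can be sketched roughly as follows. This is an illustration only, not the citeproc code; `merge_references` and the dict-based reference records are hypothetical.

```python
def merge_references(external, inline):
    """Return one reference per id.

    Hypothetical helper: later entries replace earlier ones with the same id,
    and inline references are processed last so they override external ones.
    """
    merged = {}
    for ref in list(external) + list(inline):
        # Remove any earlier entry so the surviving one keeps its own position.
        merged.pop(ref["id"], None)
        merged[ref["id"]] = ref
    return list(merged.values())
```

For example, given external references with ids `a`, `b`, `a` and one inline reference with id `b`, the result holds the second `a` entry and the inline `b` entry.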
John MacFarlane
57241e201a Support Lua marshalling of doctemplates BoolVal.
This updates T.P.Lua.Marshaling.Context for doctemplates >= 0.9.
2020-12-16 07:56:07 -08:00
John MacFarlane
b4b4e32307 Properly handle boolean values in writing YAML metadata.
(Markdown writer.)
This requires doctemplates >= 0.9.
Closes #6388.
2020-12-15 23:45:34 -08:00
John MacFarlane
87033b2856 Use fetchItem to get external bibliography.
This means that:

- a URL may be provided, and pandoc will fetch the resource.
- Pandoc will search the resource path for the bibliography
  if it is not found relative to the working directory.

Closes #6940.
2020-12-15 09:09:51 -08:00
John MacFarlane
7d799bfcda Allow both inline and external references to be used
with `--citeproc`.  This fixes a regression, since pandoc-citeproc
allowed these to be combined.

Closes #6951.
2020-12-15 08:51:43 -08:00
John MacFarlane
39153ea6e2 ImageSize: use exif width and height when available.
After the move to JuicyPixels, we were getting incorrect
width and height information for some images (see #6936, test-3.jpg).

The correct information was encoded in Exif tags that
JuicyPixels seemed to ignore. So we check these first
before looking at the Width and Height identified by
JuicyPixels.

Closes #6936.
2020-12-14 09:39:07 -08:00
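
The lookup order described in commit 39153ea6e2 (Exif dimensions first, decoder-reported dimensions as a fallback) amounts to something like the sketch below. The function name and the tag keys `"ExifImageWidth"` and `"ExifImageHeight"` are placeholders chosen for illustration; they are not taken from pandoc or JuicyPixels.

```python
def image_dimensions(exif_tags, decoded_size):
    """Prefer the size recorded in Exif tags; otherwise use the decoder's size.

    exif_tags: dict of tag name -> value (hypothetical key names).
    decoded_size: (width, height) tuple reported by the image decoder.
    """
    width = exif_tags.get("ExifImageWidth")
    height = exif_tags.get("ExifImageHeight")
    # Only trust Exif when both dimensions are present.
    if width is not None and height is not None:
        return (width, height)
    return decoded_size
```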
John MacFarlane
c43e2dc0f4 RST writer: better image handling.
- An image alone in its paragraph (but not a figure) is now
  rendered as an independent image, with an `alt` attribute
  if a description is supplied.
- An inline image that is not alone in its paragraph will
  be rendered, as before, using a substitution.
  Such an image cannot have a "center", "left", or
  "right" alignment, so the classes `align-center`,
  `align-left`, or `align-right` are ignored.
  However, `align-top`, `align-middle`, `align-bottom`
  will generate a corresponding `align` attribute.

Closes #6948.
2020-12-13 15:25:46 -08:00
John MacFarlane
32902d0fad
Merge pull request #6941 from tarleb/docx-raw
Docx writer: keep raw openxml strings verbatim
2020-12-13 11:08:41 -08:00
John MacFarlane
c3aa90b57a ImageSize: use JuicyPixels to extract size...
...for png, jpeg, gif, instead of doing our own binary parsing.
See #6936.
2020-12-13 10:33:46 -08:00
John MacFarlane
ef62b70646 ImageSize: use JuicyPixels to determine png size. 2020-12-13 10:33:46 -08:00
Albert Krewinkel
00031fc809
Docx writer: keep raw openxml strings verbatim.
Closes: #6933
2020-12-13 14:09:59 +01:00
Albert Krewinkel
8cf58d96e0
Docx writer: use Content instead of Element. 2020-12-13 14:09:53 +01:00
John MacFarlane
3a7d97f02f
Merge pull request #6946 from mb21/icml-image-fit
ICML writer: fix image bounding box for custom widths/heights
2020-12-12 08:28:14 -08:00
Albert Krewinkel
ccd235e31f
LaTeX writer: extract table handling into separate module. 2020-12-12 16:48:28 +01:00
mb21
208cb96196 ICML writer: fix image bounding box for custom widths/heights
fixes #6936
2020-12-12 14:49:11 +01:00
John MacFarlane
fcd0658189 HTML reader: pay attention to lang attributes on body.
These (as well as lang attributes on html) should update
lang in metadata. See #6938.
2020-12-10 15:51:20 -08:00
John MacFarlane
0a502e5ff5 HTML reader: retain attribute prefixes and avoid duplicates.
Previously we stripped attribute prefixes, reading
`xml:lang` as `lang` for example. This resulted in
two duplicate `lang` attributes when `xml:lang` and
`lang` were both used.  This commit causes the prefixes
to be retained, and also avoids invalid duplicate
attributes.

Closes #6938.
2020-12-10 15:44:10 -08:00
John MacFarlane
a3eb87b2ea Add sourcepos extension for commonmark.
* Add `Ext_sourcepos` constructor for `Extension`.
* Add `sourcepos` extension (only for commonmark).
* Bump to 2.11.3

With the `sourcepos` extension set, `data-pos` attributes are added
to the AST by the commonmark reader. No other readers are affected.  The
`data-pos` attributes are put on elements that accept attributes; for
other elements, an enclosing Div or Span is added to hold the attributes.

Closes #4565.
2020-12-10 08:59:55 -08:00
John MacFarlane
8c9010864c Commonmark reader: refactor specFor, set input name to "". 2020-12-10 08:59:55 -08:00
John MacFarlane
5990cbb150 Parsing: Small code improvements. 2020-12-07 21:34:23 -08:00
John MacFarlane
0fa1023b9e Parsing: More minor performance improvements. 2020-12-07 18:57:09 -08:00
John MacFarlane
ce1791913d Small efficiency improvement in uri parser 2020-12-07 13:24:19 -08:00
John MacFarlane
2f9b684b3a Bibtex parser: avoid noneOf. 2020-12-07 13:01:30 -08:00
John MacFarlane
f2749ba6cd Parsing: in nonspaceChar use satisfy instead of oneOf.
For efficiency.
2020-12-07 12:56:03 -08:00
John MacFarlane
501ea7f0c4 Dokuwiki reader: handle unknown interwiki links better.
DokuWiki lets the user define his own Interwiki links.
Previously pandoc reacted to these by emitting a
google search link, which is not helpful. Instead,
we now just emit the full URL including the
wikilink prefix, e.g. `faquk>FAQ-mathml`.
This at least gives users the ability to
modify the links using filters.

Closes #6932.
2020-12-07 12:15:14 -08:00
John MacFarlane
810df00cf5
Merge pull request #6922 from jtojnar/db-writer-admonitions
Docbook writer: handle admonitions
2020-12-07 08:48:02 -08:00
Jan Tojnar
70c7c5703a
Docbook writer: Handle admonition titles from Markdown reader
Docbook reader produces a `Div` with `title` class for `<title>` element
within an “admonition” element. Markdown writer then turns this
into a fenced div with `title` class attribute. Since fenced divs
are block elements, their content is recognized as a paragraph
by the Markdown reader. This is an issue for Docbook writer because
it would produce an invalid DocBook document from such AST –
the `<title>` element can only contain “inline” elements.

Let’s handle this invalid special case separately by unwrapping
the paragraph before creating the `<title>` element.
2020-12-07 07:28:39 +01:00
Jan Tojnar
16ef877457
Docbook writer: Use correct id attribute consistently
DocBook5 should always use xml:id instead of id so let’s use it everywhere.
2020-12-07 06:23:25 +01:00
Jan Tojnar
dc6856530c
Docbook writer: handle admonitions
Similarly to d6fdfe6f2b,
we should handle admonitions.
2020-12-07 06:23:25 +01:00
Albert Krewinkel
acf932825b
Org reader: preserve targets of spurious links
Links with (internal) targets that the reader doesn't know about are
converted into emphasized text. Information on the link target is now
preserved by wrapping the text in a Span of class `spurious-link`, with
an attribute `target` set to the link's original target. This allows
broken or unknown links to be recovered and fixed with filters.

See: #6916
2020-12-05 22:37:48 +01:00
Nils Carlson
c161893f44
OpenDocument writer: Allow references for internal links (#6774)
This commit adds two extensions to the OpenDocument writer,
`xrefs_name` and `xrefs_number`.

Links to headings, figures and tables inside the document are
substituted with cross-references that will use the name or caption
of the referenced item for `xrefs_name` or the number for `xrefs_number`.

For `xrefs_number` to be useful, heading numbers must be enabled
in the generated document, and table and figure captions must be
enabled using, for example, the `native_numbering` extension.

In order for numbers and reference text to be updated the generated
document must be refreshed.

Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
2020-12-05 10:00:04 -08:00
John MacFarlane
ddb76cb356 LaTeX reader: don't apply theorem default styling to a figure inside.
If we put an image in italics, then when rendering to Markdown
we no longer get an implicit figure.

Closes #6925.
2020-12-05 09:53:39 -08:00
Jan Tojnar
6f35600204
Docbook writer: add XML namespaces to top-level elements (#6923)
Previously, we only added xmlns attributes to chapter elements,
even when running with --top-level-division=section.
Let’s add the namespaces to part and section elements too,
when they are the selected top-level divisions.

We do not need to add namespaces to documents produced with
--standalone flag, since those will already have xmlns attribute
on the root element in the template.
2020-12-04 21:00:21 -08:00
John MacFarlane
dc3ef5201f Markdown writer: ensure that a new csl-block begins on a new line.
This just looks better and doesn't affect the semantics.
See #6921.
2020-12-04 10:55:48 -08:00
John MacFarlane
68bcddeb21 LaTeX writer: Fix bug with nested csl- display Spans.
See #6921.
2020-12-04 10:14:19 -08:00
John MacFarlane
171d3db384 HTML writer: Fix handling of nested csl- display spans.
Previously inner Spans used to represent
CSL display attributes were not rendered as div tags.

See #6921.
2020-12-04 09:47:56 -08:00
John MacFarlane
7199d68ba0 EPUB writer: include title page in landmarks.
Closes #6919.

Note that the toc is also included if `--toc` is specified.
2020-12-03 21:39:44 -08:00
John MacFarlane
9c6cc79c11 EPUB writer: add frontmatter type on body element for nav.xhtml.
Closes #6918.
2020-12-03 21:24:27 -08:00
John MacFarlane
5bbd5a9e80 Docx writer: Support bold and italic in "complex script."
Previously bold and italics didn't work properly in RTL
text.  This commit causes the w:bCs and w:iCs attributes
to be used, in addition to w:b and w:i, for bold and
italics respectively.

Closes #6911.
2020-12-03 09:51:23 -08:00
John MacFarlane
7b11cdee49 Citeproc: ensure that BCP47 lang codes can be used.
We ignore the variants and just use the base lang code
and country code when passing off to citeproc.
2020-12-02 10:46:23 -08:00
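
Reducing a BCP 47 tag to its base language and country codes, as commit 7b11cdee49 describes, might look like the sketch below. It is a simplified illustration rather than the pandoc code: `base_lang_and_country` is a hypothetical name, and the subtag handling covers only the common cases (script and variant subtags are skipped, and scanning stops at an extension or private-use singleton).

```python
def base_lang_and_country(tag):
    """Return (language, country) from a BCP 47 tag, ignoring script and variants.

    Hypothetical helper. The country is None when the tag has no region subtag.
    """
    parts = tag.split("-")
    lang = parts[0].lower()
    country = None
    for sub in parts[1:]:
        if len(sub) == 1:
            # Singleton: start of an extension or private-use section.
            break
        # A region subtag is two letters or three digits.
        if (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            country = sub.upper()
            break
    return lang, country
```

With this sketch, `de-DE-1996` reduces to `("de", "DE")` and `zh-Hant-TW` to `("zh", "TW")`.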
John MacFarlane
bff9c129c3 LaTeX reader: don't parse \rule with width 0 as horizontal rule. 2020-11-29 10:35:20 -08:00