Commit graph

7406 commits

Author SHA1 Message Date
John MacFarlane
ed3974a254 LaTeX writer: always use a minipage for cells with line breaks...
if width information is available.  Otherwise the way we treat them can
lead to content that overflows a cell.

Closes #7393.
2021-06-21 18:25:36 -07:00
John MacFarlane
eee648447a LaTeX writer: Use \strut instead of ~ before \\ in empty line. 2021-06-21 18:25:07 -07:00
John MacFarlane
14b2eb2aeb reveal.js writer: better handling of options.
Previously it was impossible to specify false values for
options that default to true; setting the option to false
just caused the portion of the template setting the option
to be omitted.

Now we prepopulate all the variables with their default
values, including them unconditionally and allowing them
to be overridden.
2021-06-21 16:40:52 -07:00
John MacFarlane
82ad855f38 Markdown writer: Fix regression in code blocks with attributes.
Code blocks with a single class but nonempty attributes
were having attributes drop as a result of #7242.

Closes #7397.
2021-06-21 08:49:00 -07:00
John MacFarlane
3fb5499dd6 insertMediaBag: ensure we get sane mediaPath for URLs.
Long URLs cannot be treated as mediaPaths, but System.FilePath's
`isRelative` often returns True for them.  So we add a check
for an absolute URL.  We also ensure that extensions are derived
only from the path portion of URLs (previously a following query
was being included).

Closes #7391.
2021-06-18 13:19:24 -07:00
John MacFarlane
cfa26e3ca0 Docx reader: handle absolute URIs in Relationship Target.
Closes #7374.
2021-06-12 13:56:09 -07:00
John MacFarlane
ea53a1dc5c Markdown writer: allow pipe_tables to be disabled for commonmark...
(commonmark_x, gfm).  Closes #7375.
2021-06-12 10:20:19 -07:00
John MacFarlane
b0cd6c6224 Fix regression in citeproc processing.
If inline references are used (in the metadata `references` field),
we should still only include in the bibliography items that are
actually cited -- unless `nocite` is used.

Closes #7376.
2021-06-12 10:16:44 -07:00
John MacFarlane
3776e828a8 Fix MediaBag regressions.
With the 2.14 release `--extract-media` stopped working as before;
there could be mismatches between the paths in the rendered document and
the extracted media.

This patch makes several changes (while keeping the same API).

The `mediaPath` in 2.14 was always constructed from the SHA1 hash of
the media contents.  Now, we preserve the original path unless it's
an absolute path or contains `..` segments (in that case we use a path
based on the SHA1 hash of the contents).

When constructing a path from the SHA1 hash, we always use the
original extension, if there is one. Otherwise we look up an
appropriate extension for the mime type.

`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather
than the mediabag key, for the first component of the tuple.
This makes more sense, I think, and fits with the documentation
of these functions; eventually, though, we should rework the API so that
`mediaItems` returns both the keys and the MediaItems.

Rewriting of source paths in `extractMedia` has been fixed.

`fillMediaBag` has been modified so that it doesn't modify
image paths (that was part of the problem in #7345).

We now do path normalization (e.g. `\` separators on Windows) only
in writing the media; the paths are left unchanged in the image
links (sensibly, since they might be URLs and not file paths).

These changes should restore the original behavior from before 2.14.

Closes #7345.
2021-06-10 16:47:02 -07:00
John MacFarlane
aa79b3035c T.P.MIME, extensionFromMimeType: add a few special cases.
When we do a reverse lookup in the MIME table, we just get the
last match, so when the same mime type is associated with several
different extensions, we sometimes got weird results, e.g. `.vs`
for `text/plain`.  These special cases help us get the most standard
extensions for mime types like `text/plain`.
2021-06-10 16:36:54 -07:00
Albert Krewinkel
c7dd33d5aa
Docx writer: fix handling of empty table headers
A table header which does not contain any cells is now treated as an
empty header.

Fixes: #7369
2021-06-10 18:36:49 +02:00
Albert Krewinkel
55bcd4b4fb
Lua utils: fix handling of table headers in from_simple_table
Passing an empty list of header cells now results in an empty table
header.

Fixes: #7369
2021-06-10 18:36:49 +02:00
John MacFarlane
76e5f047b0 Citeproc: avoid duplicate classes and attributes on refs div. 2021-06-08 17:51:53 -07:00
John MacFarlane
21cc52abe3 LaTeX writer: Fix regression in table header position.
In recent versions the table headers were no longer bottom-aligned
(if more than one line).  This patch fixes that by using minipages
for table headers in non-simple tables.

Closes #7347.
2021-06-05 14:13:58 -06:00
Jan Tojnar
c550bf8482 CommonMark writer: do not use simple class for fenced-divs
In https://github.com/jgm/pandoc/pull/7242, we introduced a simple attribute style for for code blocks and fenced divs with a single class but turns out the CommonMark extension does not support it for fenced divs.

https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/fenced_divs.md
2021-06-05 13:51:18 -06:00
Jan Tojnar
7a3ee9d3d8 CommonMark writer: do not throw away attributes when Ext_attributes is enabled
Ext_attributes covers at least the following:

- Ext_fenced_code_attributes
- Ext_header_attributes
- Ext_inline_code_attributes
- Ext_link_attributes
2021-06-05 13:51:18 -06:00
Jan Tojnar
c6f8c38c49 Markdown writer: re-use functions from Inline
Instead of duplicating linkAttributes and attrsToMarkdown, let’s just use those from the Inline module.
2021-06-05 13:51:18 -06:00
Jan Tojnar
c8ab8bccf2 DocBook reader: Add support for danger element
Added in DocBook 5.2:

- https://github.com/docbook/docbook/pull/64
- https://tdg.docbook.org/tdg/5.2/danger.html
2021-06-05 08:02:21 -06:00
Jan Tojnar
af9de925de DocBook writer: Remove non-existent admonitions
attention, error and hint are actually just reStructuredText specific.
danger was too until introduced in DocBook 5.2: https://github.com/docbook/docbook/issues/55
2021-06-05 08:02:21 -06:00
John MacFarlane
b6c04383e4 T.P.Class.IO: normalise path in writeMedia.
This ensures that we get `\` separators on Windows.
2021-06-03 18:34:38 -06:00
John MacFarlane
311736fb0a Text.Pandoc.PDF: only print relevant part of environment on --verbose. 2021-06-02 15:21:13 -06:00
John MacFarlane
2b5dad9912 Fix regression in 2.14 for generation of PDFs with SVGs.
Closes #7344.
2021-06-02 10:42:22 -06:00
John MacFarlane
3b628f7664 HTML writer: Don't omit width attribute on div.
Closes #7342.
2021-06-01 21:57:49 -06:00
John MacFarlane
2e4ef14d91 Markdown reader: fix pipe table regression in 2.11.4.
Previously pipe tables with empty headers (that is, a header
line with all empty cells) would be rendered as headerless
tables.  This broke in 2.11.4.

The fix here is to produce an AST with an empty table head
when a pipe table has all empty header cells.

Closes #7343.
2021-06-01 21:44:55 -06:00
John MacFarlane
abb59bd582 LaTeX reader: don't allow optional * on symbol control sequences.
Generally we allow optional starred variants of LaTeX commands
(since many allow them, and if we don't accept these explicitly,
ignoring the star usually gives acceptable results).  But we
don't want to do this for `\(*\)` and similar cases.

Closes #7340.
2021-06-01 13:54:51 -06:00
John MacFarlane
62f46b3995 Fix regression with commonmark/gfm yaml metdata block parsing.
A regression in 2.14 led to the document body being omitted
after YAML metadata in some cases.  This is now fixed.

Closes #7339.
2021-05-31 21:34:51 -06:00
John MacFarlane
fc70f44ee2 HTML reader: fix column width regression.
Column widths specified with a style attribute were
off by a factor of 100 in 2.14.

Closes #7334.
2021-05-30 17:15:14 -07:00
John MacFarlane
cc206af392 Have LoadedResource use relative paths.
The immediate reason for this is to allow the test output of #3752
to work on both windows and linux.
2021-05-30 10:23:00 -07:00
John MacFarlane
c2f46e6df4 Docx writer: fix regression on captions.
The "Table Caption" style was no longer getting applied.
(It was overwritten by "Compact.")

Closes #7328.
2021-05-30 10:07:28 -07:00
John MacFarlane
cc6dcf0392 Markdown reader: in rebasePaths, check for both Windows and Posix
absolute paths.  Previously Windows pandoc was treating
`/foo/bar.jpg` as non-absolute.
2021-05-29 17:36:30 -07:00
John MacFarlane
0d7103de7e In rebasePath, check for absolute paths two ways.
isAbsolute from FilePath doesn't return True on Windows
for paths beginning with `/`, so we check that separately.
2021-05-29 14:41:28 -07:00
John MacFarlane
b6b2331fdc Support rebase_relative_paths for commonmark based formats.
(Including `gfm`.)
2021-05-28 13:58:44 -07:00
Emily Bourke
56b211120c
Docx reader: Support new table features.
* Column spans
* Row spans
  - The spec says that if the `val` attribute is ommitted, its value
    should be assumed to be `continue`, and that its values are
    restricted to {`restart`, `continue`}. If the value has any other
    value, I think it seems reasonable to default it to `continue`. It
    might cause problems if the spec is extended in the future by adding
    a third possible value, in which case this would probably give
    incorrect behaviour, and wouldn't error.
* Allow multiple header rows
* Include table description in simple caption
  - The table description element is like alt text for a table (along
    with the table caption element). It seems like we should include
    this somewhere, but I’m not 100% sure how – I’m pairing it with the
    simple caption for the moment. (Should it maybe go in the block
    caption instead?)
* Detect table captions
  - Check for caption paragraph style /and/ either the simple or
    complex table field. This means the caption detection fails for
    captions which don’t contain a field, as in an example doc I added
    as a test. However, I think it’s better to be too conservative: a
    missed table caption will still show up as a paragraph next to the
    table, whereas if I incorrectly classify something else as a table
    caption it could cause havoc by pairing it up with a table it’s
    not at all related to, or dropping it entirely.
* Update tests and add new ones

Partially fixes: #6316
2021-05-28 20:15:23 +02:00
Emily Bourke
44484d0dee
Docx reader: Read table column widths. 2021-05-28 20:15:23 +02:00
John MacFarlane
4842c5fb82 Two citeproc locator/suffix improvements:
- Recognize locators spelled with a capital letter.
  Closes #7323.
- Add a comma and a space in front of the suffix if it doesn't start
  with space or punctuation.  Closes #7324.
2021-05-27 18:28:52 -07:00
John MacFarlane
4b16d181e7 rebase_relative_paths: leave empty paths unchanged. 2021-05-27 14:16:37 -07:00
John MacFarlane
0661ce699f rebase_relative_paths extension: don't change fragment paths.
We don't want a pure fragment path to be rewritten, since
these are used for cross-referencing.
2021-05-27 13:53:26 -07:00
John MacFarlane
6972a7dc91 Modify rebase_reference_links treatment of reference links/images.
The directory is based on the file containing the link
reference, not the file containing the link, if these differ.
2021-05-27 11:26:38 -07:00
John MacFarlane
cbe16b2866 Citeproc: Don't detect math elements as locators.
Closes #7321.
2021-05-27 10:49:45 -07:00
John MacFarlane
834da53058 Add rebase_relative_paths extension.
- Add manual entry for (non-default) extension
  `rebase_relative_paths`.
- Add constructor `Ext_rebase_relative_paths` to `Extensions`
  in Text.Pandoc.Extensions [API change]. When enabled, this
  extension rewrites relative image and link paths by prepending
  the (relative) directory of the containing file.
- Make Markdown reader sensitive to the new extension.
- Add tests for #3752.

Closes #3752.

NB. currently the extension applies to markdown and associated
readers but not commonmark/gfm.
2021-05-27 10:38:25 -07:00
John MacFarlane
81eadfd99a LaTeX reader: improve \def and implement \newif.
- Improve parsing of `\def` macros.  We previously set "verbatim mode"
  even for parsing the initial `\def`; this caused problems for things
  like
  ```
  \def\foo{\def\bar{BAR}}
  \foo
  \bar
  ```
- Implement `\newif`.
- Add tests.
2021-05-27 09:15:04 -07:00
John MacFarlane
8d5014fdfc Logging: remove single quotes around paths in messages.
We weren't doing it consistently and it seems unnecessary.
2021-05-25 11:53:49 -07:00
Albert Krewinkel
105a50569b Allow compilation with base 4.15 2021-05-25 11:52:49 -07:00
Albert Krewinkel
bb2530caa4 Use haddock-library-1.10.0 2021-05-25 11:52:49 -07:00
John MacFarlane
f2c1b57469 PandocMonad: add info message in downloadOrRead...
indicating what path local resources have been loaded from.
2021-05-25 10:08:30 -07:00
John MacFarlane
fb40c8109d Logging: add LoadedResource constructor to LogMessage.
[API change]

This is for INFO-level messages telling where image data has been
loaded from.  (This can vary because of the resource path.)
2021-05-25 10:07:24 -07:00
Albert Krewinkel
d46ea7d7da
Jira: add support for "smart" links
Support has been added for the new
`[alias|https://example.com|smart-card]` syntax.
2021-05-25 16:54:42 +02:00
John MacFarlane
8511f6fdf6 MediaBag improvements.
In the current dev version, we will sometimes add
a version of an image with a hashed name, keeping
the original version with the original name, which
would leave to undesirable duplication.

This change separates the media's filename from the
media's canonical name (which is the path of the link
in the document itself).  Filenames are based on SHA1
hashes and assigned automatically.

In Text.Pandoc.MediaBag:

- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to
  a filename based on the SHA1 hash of the contents.
  This will be used when contents are extracted.

In Text.Pandoc.Class.PandocMonad:

- Remove `fetchMediaResource` [API change].

Lua MediaBag module has been changed minimally. In the future
it would be better, probably, to give Lua access to the full
MediaItem type.
2021-05-24 09:20:44 -07:00
Albert Krewinkel
58fbf56548
Jira writer: use {color} when span has a color attribute
Closes: tarleb/jira-wiki-markup#10
2021-05-24 09:56:02 +02:00
John MacFarlane
1af2cfb287 Handle relative lengths (e.g. 2*) in HTML column widths.
See <https://www.w3.org/TR/html4/types.html#h-6.6>.

"A relative length has the form "i*", where "i" is an integer. When
allotting space among elements competing for that space, user agents
allot pixel and percentage lengths first, then divide up remaining
available space among relative lengths. Each relative length receives a
portion of the available space that is proportional to the integer
preceding the "*". The value "*" is equivalent to "1*". Thus, if 60
pixels of space are available after the user agent allots pixel and
percentage space, and the competing relative lengths are 1*, 2*, and 3*,
the 1* will be alloted 10 pixels, the 2* will be alloted 20 pixels, and
the 3* will be alloted 30 pixels."

Closes #4063.
2021-05-22 22:03:54 -07:00