Commit graph

14637 commits

Author SHA1 Message Date
John MacFarlane
07d847a910 RST reader: Fix :literal: includes.
These should create code blocks, not insert raw RST.
Closes #7513.
2021-08-20 09:54:42 -07:00
John MacFarlane
9879953a4f Update changelog and AUTHORS. 2021-08-19 22:59:55 -07:00
John MacFarlane
ef4efa5373 Improve docx reader's robustness in extracting images.
The docx reader made a couple of assumptions about how docx
containers were laid out that were not always true, with
the result that some images in documents did not get
found/extracted.

Closes #7511.
2021-08-19 10:50:34 -07:00
William Lupton
5159d6653b Clarify that each YAML block is a separate YAML document 2021-08-19 10:37:24 -07:00
Emily Bourke
5616d00d09 pptx: Include image title in description
The image title (i.e. `![alt text](link "title")`) was previously
ignored when writing to pptx. This commit includes it in PowerPoint's
description of the image, along with the link (which was already
included).

Fixes #7352.
2021-08-18 10:10:55 -07:00
John MacFarlane
fd99fe4d7e Revise citeproc code to fit new citeproc 0.5 API.
Linkification of URLs in the bibliography is now done in
the citeproc library, depending on the setting of an option.
We set that option depending on the value of the metadata
field `link-bibliography` (defaulting to true, for consistency
with earlier behavior, though the new behavior includes the
CSL draft recommendation of hyperlinking the title or the whole
entry if a DOI, PMID, PMCID, or URL field is present but not
explicitly rendered).

These changes implement the following recommendations from the
draft CSL v1.0.2 spec (Appendix VI):

> The CSL syntax does not have support for configuration of links.
> However, processors should include links on bibliographic references,
> using the following rules:

> If the bibliography entry for an item renders any of the following
> identifiers, the identifier should be anchored as a link, with the
> target of the link as follows:

> - url: output as is
> - doi: prepend with "`https://doi.org/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"

> If the identifier is rendered as a URI, include rendered URI components
> (e.g. "`https://doi.org/`") in the link anchor. Do not include any other
> affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: ").
> If the bibliography entry for an item does not render any of
> the above identifiers, then set the anchor of the link as the item
> title. If title is not rendered, then set the anchor of the link as the
> full bibliography entry for the item. Set the target of the link as one
> of the following, in order of priority:
>
> - doi: prepend with "`https://doi.org/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - url: output as is
>
> If the item data does not include any of the above identifiers, do not
> include a link.
>
> Citation processors should include an option flag for calling
> applications to disable bibliography linking behavior.

Thanks to Benjamin Bray for getting this all working.
2021-08-17 15:34:23 -07:00
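The link rules quoted in the commit message above can be sketched roughly as follows. This is an illustrative Python sketch only, not citeproc's code: the function names, the dictionary-based item, and the "title"/"entry" anchor labels are all hypothetical, and it assumes identifiers are stored bare (without a URL prefix already attached).

```python
# Sketch of the link-target rules quoted from the draft CSL v1.0.2 spec.
# All names here are hypothetical; this is not how citeproc is written.

# Prefixes exactly as listed in the quoted spec text.
PREFIXES = {
    "url": "",
    "doi": "https://doi.org/",
    "pmid": "https://www.ncbi.nlm.nih.gov/pubmed/",
    "pmcid": "https://www.ncbi.nlm.nih.gov/pmc/articles/",
}

# Priority order used when the entry renders none of the identifiers.
FALLBACK_ORDER = ["doi", "pmcid", "pmid", "url"]


def identifier_target(field, value):
    """Link target for an identifier that the entry renders itself."""
    return PREFIXES[field] + value


def fallback_link(item, title_rendered):
    """Pick (anchor, target) when no identifier is rendered.

    The anchor is the title if it is rendered, otherwise the whole entry.
    Returns None when the item has none of the four identifiers.
    """
    for field in FALLBACK_ORDER:
        value = item.get(field)
        if value:
            anchor = "title" if title_rendered else "entry"
            return anchor, PREFIXES[field] + value
    return None
```

For example, `fallback_link({"pmid": "123", "url": "https://example.org"}, True)` picks the PMID over the URL and anchors the link on the title.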
John MacFarlane
10dfd81000 Document new link-bibliography metadata field.
This affects whether hyperlinks are added to the bibliography
by citeproc.
2021-08-17 15:33:49 -07:00
John MacFarlane
2e9a8935fb OOXML tests: silence warnings.
These can make the test output confusing, making people think
tests are failing when they're passing.
2021-08-17 15:33:10 -07:00
John MacFarlane
8451bce6de Rename TemplateWarning -> PowerpointTemplateWarning.
@undergroundquizscene - I think TemplateWarning
is apt to be confusing, since this actually doesn't have
anything to do with what we call 'templates' in pandoc.
Hence the change to a powerpoint-specific name.
2021-08-17 15:31:52 -07:00
John MacFarlane
4c6af94f1c Use released citeproc 0.5. 2021-08-17 14:31:00 -07:00
Emily Bourke
72823ad947 pptx: Select layouts from reference doc by name
Until now, users had to make sure that their reference doc contains
layouts in a specific order: the first four layouts in the file had to
have a specific structure, or else pandoc would error (or sometimes
successfully produce a pptx file, which PowerPoint would then fail to
open).

This commit changes the layout selection to use the layout names rather
than order: users must make sure their reference doc contains four
layouts with specific names, and if a layout with the right name isn’t
found pandoc will output a warning and use the corresponding layout from
the default reference doc as a fallback.

I believe the use of names rather than order will be clearer to users,
and the clearer errors will help them troubleshoot when things go wrong.

- Add tests for moved layouts
- Add tests for deleted layouts
- Add newly included layouts to slideMaster1.xml to fix tests
2021-08-17 09:35:25 -07:00
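The name-based selection with fallback described in the commit above might look something like this. It is a minimal Python sketch under stated assumptions: the four layout names are placeholders chosen for illustration (the commit message does not list them), and layouts are modelled as plain dictionary values rather than parsed XML.

```python
import warnings

# Hypothetical layout names, for illustration only; the commit message
# does not say which names pandoc expects.
REQUIRED_LAYOUTS = ["Title Slide", "Title and Content", "Section Header", "Two Content"]


def select_layouts(reference_layouts, default_layouts, required=REQUIRED_LAYOUTS):
    """Map each required name to a layout, falling back to the default doc.

    Both arguments are dicts keyed by layout name. Order in the reference
    doc is irrelevant; a missing name triggers a warning, not an error.
    """
    chosen = {}
    for name in required:
        if name in reference_layouts:
            chosen[name] = reference_layouts[name]
        else:
            warnings.warn(f"layout {name!r} not found in reference doc; using default")
            chosen[name] = default_layouts[name]
    return chosen
```

Because lookup is by key, reordering or adding layouts in the reference doc has no effect on the result, which is the behavior the tests for moved layouts are meant to cover.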
Emily Bourke
9204e5c9b1 Don’t compare cdLine in OOXML golden tests
The `cdLine` field gives the line of the file some CData was found on. I
don’t think this is a difference that should fail these golden tests, as
the XML should still be parsable if nothing else has changed.
2021-08-17 09:35:25 -07:00
Emily Bourke
8474d488a5 Provide more detailed XML diff in tests
I had some failing tests and couldn’t tell what was different in the
XML. Updating the comparison to return what’s different made it easier
to figure out what was wrong, and I think will be helpful for others in
future.
2021-08-17 09:35:25 -07:00
Emily Bourke
88d82203a1 Add TemplateWarning log message type [API change]
This is a general warning to use for messages about templates.
2021-08-17 09:35:25 -07:00
Emily Bourke
415f445fc1 Escape backslashes in haddock comments (#7505)
Any literal backslash needs to be escaped: these are currently showing
up as “‘r’” instead of “‘\r’”.

Co-authored-by: Emily Bourke <undergroundquizscene@protonmail.com>
2021-08-17 08:20:33 -07:00
John MacFarlane
abb35d8b0f Fix bug in last commit due to removal of take1WhileP. 2021-08-16 08:09:20 -07:00
OCzarnecki
e37cf4484d Multimarkdown sub- and superscripts (#5512) (#7188)
Added an extension `short_subsuperscripts` which modifies the behavior
of `subscript` and `superscript`, allowing subscripts or superscripts containing only
alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is
combustible`).  This improves support for multimarkdown. Closes #5512.

Add `Ext_short_subsuperscripts` constructor to `Extension` [API change].
This is enabled by default for `markdown_mmd`.
2021-08-15 21:57:57 -07:00
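As a rough illustration of the short form this commit describes, the following Python sketch finds markers followed by alphanumerics and terminated by a space. The regular expression and function name are hypothetical and much simpler than a real Markdown parser, which also has to handle the ordinary delimited forms and escapes.

```python
import re

# Hypothetical pattern: a marker (^ or ~), a run of ASCII alphanumerics,
# then a space that terminates the short form (checked, not consumed).
SHORT_FORM = re.compile(r"([\^~])([A-Za-z0-9]+)(?= )")


def find_short_scripts(text):
    """Return (kind, content) pairs for each short sub/superscript found."""
    kinds = {"^": "superscript", "~": "subscript"}
    return [(kinds[m.group(1)], m.group(2)) for m in SHORT_FORM.finditer(text)]
```

With this sketch, `find_short_scripts("x^2 = 4")` reports one superscript containing `2`, while a bare `x^2` at the end of the text is not matched because no space follows it.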
John MacFarlane
72447a563c Remove documentation for .ul shortcut for now.
See #7307.  Motivation: there is talk of removing this.
2021-08-15 21:44:46 -07:00
John MacFarlane
4340bd52c4 Make docx writer sensitive to native_numbering extension.
Figure and table numbers are now only included if `native_numbering`
is enabled.  (By default it is disabled.)  This is a behavior change
with respect to 2.14.1, but the behavior is that of previous versions.

The change was necessary to avoid incompatibilities between pandoc's
native numbering and third-party cross reference filters like
pandoc-crossref.

Closes #7499.
2021-08-15 15:05:54 -07:00
John MacFarlane
2c466a15af Remove misleading description from command/citeproc-87 test. 2021-08-15 09:26:30 -07:00
John MacFarlane
82638ad53b Convert Quoted in bib entries to special Spans...
before passing them off to citeproc.
This ensures that we get proper localization and flipflopping
if, e.g., quotes are used in titles.

Closes jgm/citeproc#87.
2021-08-13 19:25:29 -07:00
John MacFarlane
15683bb607 Citeproc: avoid odd handling of quotes.
citeproc changes allow us to ignore Quoted elements;
citeproc now uses its own method for representing quoted
things, and only localizes and flipflops quotes it adds itself.

See #87.

The one thing left to do is to convert Quoted elements in
bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])`
before passing them to citeproc, IF the localized quotes
for the quote type match the standard inverted commas.
2021-08-13 18:13:06 -07:00
John MacFarlane
05640f9a21 Removed quote localization from citeproc processing.
This is now done in citeproc itself.
2021-08-13 17:30:54 -07:00
John MacFarlane
418155aa95 Fix raw LaTeX injection issue (LaTeX writer).
Using a code block containing `\end{verbatim}`, one could
inject raw TeX into a LaTeX document even when `raw_tex`
is disabled.  Thanks to Augustin Laville for noticing the
bug.

Closes #7497.
2021-08-13 11:27:04 -07:00
John MacFarlane
e8d7d157fd LaTeX reader: proper implicit grouping around environment macros. 2021-08-13 10:41:36 -07:00
William Lupton
649133fe5e Document use of the 'underline' class. (#7492)
Addresses a comment in #7484.
2021-08-12 10:18:22 -07:00
William Lupton
fc20672bb9 Various sample.lua editorial fixes. (#7493)
These address most of the items mentioned in #7487.
There's also a table caption fix (the caption wasn't escaped).
2021-08-12 10:16:34 -07:00
John MacFarlane
ebef4fb41d Bump base-compat version so we get compatibility with base 4.12. 2021-08-12 09:26:39 -07:00
John MacFarlane
3cfcfacd72 Use Prelude from base-compat for ghc 8.4 too.
We were having trouble building on ghc 8.4 because of
the lack of a Foldable instance for (Alt Maybe) in
base < 4.12.

Mystery: for some reason our builds were failing for gitit
but not in the pandoc CI.
2021-08-12 09:24:27 -07:00
Emily Bourke
86cce2b2eb Add haskell-language-server to shell.nix (#7496)
I find it useful to have this installed via the nix shell when working
on pandoc, so I think others may also find it useful.

Co-authored-by: Emily Bourke <undergroundquizscene@protonmail.com>
2021-08-12 09:20:03 -07:00
John MacFarlane
ec34497bc1 Try fixing compile error on older ghcs.
See https://github.com/jgm/gitit/runs/3308381697
2021-08-11 23:14:43 -07:00
John MacFarlane
073895c340 Fix some lint issues. 2021-08-11 17:53:39 -07:00
John MacFarlane
dd1a956a8a LaTeX reader: Support \global before \def, \let, etc.
See #7494.
2021-08-11 16:28:53 -07:00
John MacFarlane
e3a263df46 Fix scope for LaTeX macros.
They should by default scope over the group in which they
are defined (except `\gdef` and `\xdef`, which are global).
In addition, environments must be treated as groups.

We handle this by making sMacros in the LaTeX parser state
a STACK of macro tables. Opening a group adds a table to
the stack, closing one removes one.  Only the top of the stack
is queried.

This commit adds a parameter for scope to the Macro constructor
(not exported).

Closes #7494.
2021-08-11 16:14:34 -07:00
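The stack-of-tables idea in the commit above can be sketched as follows. This is a minimal Python illustration, not pandoc's Haskell implementation: the class and method names are hypothetical, and it assumes that opening a group pushes a copy of the current top table, which is one way to keep outer definitions visible while only the top of the stack is queried.

```python
class MacroState:
    """Stack of macro tables; the last element is the innermost group."""

    def __init__(self):
        self.stack = [{}]

    def open_group(self):
        # Assumption: the new table starts as a copy of the current top,
        # so outer definitions stay visible when only the top is queried.
        self.stack.append(dict(self.stack[-1]))

    def close_group(self):
        # Leaving a group discards its local definitions.
        if len(self.stack) > 1:
            self.stack.pop()

    def define(self, name, body, is_global=False):
        if is_global:
            # \gdef and \xdef: record the macro in every enclosing group.
            for table in self.stack:
                table[name] = body
        else:
            self.stack[-1][name] = body

    def lookup(self, name):
        # Only the top of the stack is consulted.
        return self.stack[-1].get(name)
```

A local redefinition inside a group disappears when the group closes, whereas a definition made with `is_global=True` survives because it was written into every table on the stack.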
William Lupton
f00f7ec63c Clarify internal punctuation in citation keys. (#7491)
Addresses a comment in #5458.
2021-08-11 11:12:45 -07:00
John MacFarlane
a0e44b1ff6 LaTeX reader: improve handling of plain TeX macro primitives.
- Fixed semantics for `\let`.
- Implement `\edef`, `\gdef`, and `\xdef`.
- Add comment noting that currently `\def` and `\edef` set global
  macros (so are equivalent to `\gdef` and `\xdef`).  This should be
  fixed by scoping macro definitions to groups, in a future commit.

Closes #7474.
2021-08-11 10:32:52 -07:00
John MacFarlane
06d97131e5 Tests.Helpers: export testGolden and use it in RTF reader.
This gives a diff output on failure.
2021-08-10 22:07:48 -07:00
John MacFarlane
3a924d8f96 HTML reader: treat comments as blank when parsing.
This modifies pBlank.  Previously comments could sometimes
flummox the parser.

Closes #7482.
2021-08-10 12:50:23 -07:00
John MacFarlane
7ca4233793 Add test for #7488. 2021-08-10 11:11:33 -07:00
John MacFarlane
3d7120083a Fix RTF table parsing bug that created undesired nested tables.
Closes #7488.
2021-08-10 11:09:12 -07:00
John MacFarlane
6543b05116 Add RTF reader.
- `rtf` is now supported as an input format as well as output.
- New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change]

Closes #3982.
2021-08-10 10:48:55 -07:00
John MacFarlane
c0b68b2030 Allow --slide-level=0.
When the slide level is set to 0, headings won't be used at all
in splitting the document into slides. Horizontal rules must be
used to separate slides.

Closes #7476.
2021-08-08 11:20:26 -07:00
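With slide level 0, slide boundaries come only from horizontal rules. A minimal Python sketch of that splitting, assuming a flat list of blocks and a hypothetical `HRULE` marker standing in for a horizontal rule, could look like this:

```python
HRULE = "HorizontalRule"  # hypothetical marker for a horizontal rule block


def split_slides(blocks):
    """Split blocks into slides at horizontal rules; headings are ignored."""
    slides, current = [], []
    for block in blocks:
        if block == HRULE:
            slides.append(current)
            current = []
        else:
            current.append(block)
    slides.append(current)
    return slides
```

Note that headings of any level stay inside whichever slide they fall in; the sketch makes no attempt to drop empty slides produced by leading or consecutive rules.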
John MacFarlane
6acb674e89 Remove obsolete and incorrect sentence in --slide-level docs. 2021-08-08 11:17:28 -07:00
John MacFarlane
dea1f0f080 RTF writer: emit \outlinelevel for section headings. 2021-08-04 16:37:20 -06:00
mt_caret
407de98b5e Stop using the HTTP package. (#7456)
We only depend on the urlEncode function in the package, which is also
provided by http-types. The HTTP package also depends on the network
package, which has difficulty building on ghcjs.

Add internal module Text.Pandoc.Network.HTTP, exporting `urlEncode`.
2021-08-03 15:53:05 -06:00
Peter Fabinski
8667ba2bcc LaTeX table writer: Increase column width precision (#7466)
In some cases, the rounding performed by the LaTeX table
writer would introduce visible overrun outside the text
area.
This adds two more decimal places to the width values.
2021-08-03 15:34:39 -06:00
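A small worked example shows how rounding can push a table past the text width. The numbers below are assumptions for illustration: the commit does not state the original precision, so the Python sketch compares two and four decimal places for six equal-width columns.

```python
def rounded_total(fractions, places):
    """Sum of the column widths after formatting each to `places` decimals."""
    return sum(float(f"{w:.{places}f}") for w in fractions)


# Six equal columns, each one sixth of the text width.
six_equal = [1 / 6] * 6

two_places = rounded_total(six_equal, 2)   # each column written as 0.17
four_places = rounded_total(six_equal, 4)  # each column written as 0.1667
```

At two decimal places the columns add up to about 1.02 of the text width, a 2% overrun; at four places the total is about 1.0002, which is far less likely to be visible.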
John MacFarlane
f938378d00 RTF writer: omit \bin in \pict.
According to the spec, this is not needed or wanted when
the data is in hexadecimal format, as it is here.
2021-08-01 22:45:41 -06:00
John MacFarlane
c1ab55793b Add a faq about the "Cannot allocate memory" error on M1 macs. 2021-08-01 10:29:47 -06:00
John MacFarlane
ca12e198ba RTF template: specify font family for fixed-width font f1.
According to the spec, this is mandatory.
2021-08-01 09:45:09 -06:00
John MacFarlane
f145aea0f9 parseFromString: preserve at least the source directory.
Previously we just set the source name to "chunk" when parsing
from strings, to avoid misleading source positions.

This had the side effect that `rebase_relative_paths` would break
inside sections that were parsed as strings.

So, now we use "ORIGINAL_SOURCE_PATH_chunk" instead of just "chunk".

Closes #7464.
2021-07-29 14:54:25 -06:00
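One way to read the naming scheme in the commit above is that the chunk's source name is the original path with `_chunk` appended, so the directory part survives. The Python sketch below illustrates that reading only; the function names are hypothetical and the exact format pandoc uses is not confirmed here.

```python
import posixpath


def chunk_source_name(original_source_path):
    """Source name for a string chunk, keeping the original path as prefix."""
    return original_source_path + "_chunk"


def source_directory(source_name):
    """Directory that relative paths would be rebased against."""
    return posixpath.dirname(source_name)
```

With a plain `"chunk"` name the directory comes out empty, which matches the breakage described for `rebase_relative_paths`; with the prefixed name, `source_directory(chunk_source_name("docs/ch1/intro.md"))` still yields `docs/ch1`.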