Commit graph

743 commits

Author SHA1 Message Date
John MacFarlane
66636c89b0 Org writer: fix list items starting with a code block...
or other non-paragraph content.

Closes #7810.
2022-01-08 23:21:15 -08:00
John MacFarlane
a965111680 Fix parsing of footnotes in --metadata-file.
Closes #7813.
2022-01-07 15:58:26 -08:00
Jesse Hathaway
4dcb2b6fb6 MediaWiki writer: Remove redundant display text for wiki links
Prior to this commit the MediaWiki writer always added the display
text for a wiki link:

    * [[Help|Help]]
    * [[Bubbles|Everyone loves bubbles]]

However the display text in the first example is redundant since
MediaWiki uses the target as the default display text. The output is now:

    * [[Help]]
    * [[Bubbles|Everyone loves bubbles]]
2022-01-06 15:05:39 -08:00
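
The rule described in this commit can be sketched as follows. This is a minimal Python illustration, not the writer's actual Haskell code; the function name `wiki_link` is hypothetical.

```python
def wiki_link(target: str, text: str) -> str:
    """Render a MediaWiki internal link (hypothetical helper).

    The display text is dropped when it equals the target, because
    MediaWiki already uses the target as the default display text.
    """
    if text == target:
        return f"[[{target}]]"
    return f"[[{target}|{text}]]"


if __name__ == "__main__":
    print(wiki_link("Help", "Help"))
    print(wiki_link("Bubbles", "Everyone loves bubbles"))
```

Run on the two links from the commit message, the sketch gives `[[Help]]` and `[[Bubbles|Everyone loves bubbles]]`.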
John MacFarlane
ea74582288 AsciiDoc writer: improve detection of intraword emphasis.
Closes #7803.
2022-01-05 14:48:19 -08:00
John MacFarlane
7d56650e01 OpenDocument writer: fix vertical align bug with display math.
Previously some displayed formulas would be floated above
a preceding text line.  This is fixed by setting vertical-rel
to 'text' rather than 'paragraph-content'.

Closes #7777.
2021-12-28 16:06:25 -08:00
John MacFarlane
c4f6e6cb57 HTML writer: make line breaks more consistent.
- With `--wrap=none`, we now output line breaks between
  block-level elements. Previously they were omitted
  entirely, so the whole document was on one line, unless
  there were literal line breaks in pre sections.  This makes
  the HTML writer's behavior more consistent with that of
  other writers.

- Put newline after `<dd>`.

- Put newlines after block-level elements in footnote section.
2021-12-22 09:45:02 -08:00
John MacFarlane
7a9832166e Add text wrapping to HTML output.
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option.  With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.

A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.

In addition, we now add a line break between an `img` tag
and the associated `figcaption`.

Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings.  There's no real
need to wrap HTML inside a zipped container.

Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.

Closes #7764.
2021-12-22 09:45:02 -08:00
John MacFarlane
4b220d592c Citeproc: avoid adding comma before an author-in-text citation...
...in a note if it begins with a title (no author).

Closes #7761.
2021-12-18 12:13:06 -08:00
John MacFarlane
394fa9d072 Org reader: parse official org-cite citations.
We also support the older org-ref style as a fallback.
We no longer support the "markdown-style" citations.

See #7329.
2021-12-14 11:34:32 -08:00
John MacFarlane
9f089aa286 Org writer: add tests for org-cite citations, and improve support. 2021-12-13 12:11:58 -08:00
John MacFarlane
ccb9db34f8 Add test for #7738. 2021-12-07 23:59:59 -08:00
John MacFarlane
51f6f0e3a1 Improve Markdown writer escaping.
This fixes escaping for '#' in particular.
Closes #7726.
2021-12-03 17:52:47 -08:00
John MacFarlane
619dfa2a2a Markdown reader: don't allow ^ at beginning of link or image label.
This is reserved for footnotes.
Fixes a regression introduced by 0a93acf.

Closes #7723.
2021-11-30 12:53:54 -08:00
John MacFarlane
0d25232bbf LaTeX reader: Fix semantics of \ref.
We were including the ams environment type in addition
to the number. This is proper behavior for `\cref` but
not for `\ref`.  To support `\cref` we need to store
the environment label separately.
2021-11-24 19:48:56 -08:00
John MacFarlane
7726b69cd3 LaTeX reader: omit visible content for \label{...}.
Previously we included the text of the label in square brackets,
but this is undesirable in many cases.

See discussion in
<https://github.com/jgm/pandoc/issues/813#issuecomment-978232426>.
2021-11-24 14:47:00 -08:00
John MacFarlane
6072bdcec9 HTML reader: parse attributes on links and images.
Closes #6970.
2021-11-24 11:01:55 -08:00
John MacFarlane
79e6f8db13 Improve detection of pipe table line widths.
Fixed calculation of maximum column widths in pipe tables.
It is now based on the length of the markdown line, rather
than a "stringified" version of the parsed line.  This should
be more predictable for users. In addition, we take into account
double-wide characters such as emojis.

Closes #7713.
2021-11-23 13:29:25 -08:00
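
One way to measure the width of a raw markdown line while counting double-wide characters as two columns is sketched below. It is an illustration under assumptions, not pandoc's implementation: `display_width` is a hypothetical name, and using Python's `unicodedata.east_asian_width` to detect wide characters is a choice made for this sketch.

```python
import unicodedata


def display_width(line: str) -> int:
    """Approximate the number of terminal columns a line occupies.

    Assumption for this sketch: characters whose East Asian Width
    property is Wide ("W") or Fullwidth ("F") take two columns,
    combining marks take none, and everything else takes one.
    """
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue  # combining marks attach to the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


if __name__ == "__main__":
    print(display_width("| abc | def |"))
```

The point is that the measurement is taken on the source line itself, so a cell written as `**bold**` counts all eight characters rather than the four that survive parsing.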
Albert Krewinkel
96a4bbe264 Capture alt-text in JATS figures (#7703)
Co-authored-by: Aner Lucero <4rgento@gmail.com>
2021-11-20 09:48:01 -08:00
John MacFarlane
4f2eac88aa MediaWiki writer: fix code for generating spans for header IDs.
We need to generate a span when the header's ID doesn't match
the one MediaWiki would generate automatically.  But MediaWiki's
generation scheme is different from ours (it uses uppercase letters,
and `_` instead of `-`, for example).

This means that in going from markdown -> mediawiki, we'll now get
spans before almost every heading, unless explicit identifiers are
used that correspond to the ones MediaWiki auto-generates.
This is uglier output but it's necessary for internal links to
work properly.

See #7697.
2021-11-19 09:05:19 -08:00
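
The decision described above can be sketched roughly as follows. This is a deliberately simplified Python illustration: `mediawiki_anchor` and `needs_span` are hypothetical names, and the anchor rule shown (keep letter case, turn spaces into underscores) covers only the two differences the commit message mentions, not MediaWiki's full scheme.

```python
def mediawiki_anchor(heading: str) -> str:
    """Simplified guess at the anchor MediaWiki derives from a heading.

    Assumption: case is preserved and spaces become underscores.
    Real MediaWiki applies further normalization not modeled here.
    """
    return heading.strip().replace(" ", "_")


def needs_span(heading: str, ident: str) -> bool:
    """Return True when an explicit span is needed to carry the identifier."""
    return ident != mediawiki_anchor(heading)


if __name__ == "__main__":
    # A pandoc-style auto identifier (lowercase, hyphens) does not match.
    print(needs_span("My heading", "my-heading"))
    # An explicit identifier matching MediaWiki's own anchor does.
    print(needs_span("My heading", "My_heading"))
```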
John MacFarlane
df5ae1c186 HTML writer: Don't create invalid data- attribute...
for empty attribute key. (It would be better to make these
unrepresentable in the type system, but for now this is
an improvement.)

Closes #7546.
2021-11-19 08:50:18 -08:00
John MacFarlane
25bba0cc62 MediaWiki writer: use HTML spans for anchors when header has id.
Closes #7697.
2021-11-18 21:15:51 -08:00
John MacFarlane
c19f063420 Markdown writer: don't create autolinks when this loses information.
Previously we sometimes lost attributes when rendering links as autolinks.

Closes #7692.
2021-11-15 11:06:50 -08:00
Albert Krewinkel
ea268fd8a7 LaTeX reader: add rudimentary support for \autoref (#7693) 2021-11-15 09:40:50 -08:00
John MacFarlane
0a45f2600a Properly handle commented lines in BibTeX/BibLaTeX.
Closes #7668.
2021-11-08 10:15:53 -08:00
John MacFarlane
cc46667953 LaTeX reader: add 'uri' class when parsing \url.
Closes #7672.
2021-11-07 21:08:20 -08:00
John MacFarlane
d226a35c0a Switch back from HsYAML to yaml.
Reasons:

- Performance: HsYAML is around 20 times slower in parsing
  large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten
  any attention.  HsYAML seems borderline unmaintained; it hasn't
  had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves
  from C dependencies (#4535).  But I don't see a better alternative
  until a better pure Haskell parser is available.

Closes #6084.

Notes:

- We've removed the FromYAML instances for all types that had
  them, since this is a HsYAML-specific typeclass [API change].
  (The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml
  parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values.
  Users may need to quote these when they are meant to be
  interpreted as strings.  Similarly, 'null' is parsed as
  a YAML null value (and will be treated as an empty string
  by pandoc rather than the string 'null').  Quoting it will
  force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains
  escaping errors: instead of just falling back on treating
  the section as a table, it raises a YAML parsing error.
2021-10-27 12:50:51 -07:00
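
As a rough aid for the quoting advice in the notes above, the sketch below flags plain scalars that would need quotes to stay strings. It is an assumption-laden illustration in Python: the set contains only the values named in the commit message, case-insensitive matching is an assumption, and `quote_if_ambiguous` is a hypothetical helper, not part of pandoc or the yaml package.

```python
# Values the commit message says are no longer read as plain strings.
# Matching them case-insensitively is an assumption made for this sketch.
AMBIGUOUS = {"y", "n", "yes", "no", "on", "off", "null"}


def quote_if_ambiguous(value: str) -> str:
    """Wrap a scalar in double quotes if it could be read as a bool or null."""
    if value.lower() in AMBIGUOUS:
        return '"' + value + '"'
    return value


if __name__ == "__main__":
    for scalar in ["No", "null", "Norway"]:
        print(f"lang: {quote_if_ambiguous(scalar)}")
```

A metadata line such as `lang: No` would then be written as `lang: "No"` so that it is interpreted as a string.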
John MacFarlane
c712d13b67 Org reader: allow an initial :PROPERTIES: drawer to add to metadata.
Closes #7520.
2021-10-22 22:10:25 -07:00
John MacFarlane
0a93acf91a Markdown reader: don't parse links or bracketed spans as citations.
Previously pandoc would parse

    [link to (@a)](url)

as a citation; similarly

    [(@a)]{#ident}

This is undesirable.  One should be able to use example references
in citations, and even if `@a` is not defined as an example
reference, `[@a](url)` should be a link containing an author-in-text
citation rather than a normal citation followed by literal `(url)`.

Closes #7632.
2021-10-20 10:34:47 -07:00
John MacFarlane
49c4e1d014 Fix markdown parsing bug for math in bracketed spans and links.
This affects math with unbalanced brackets (e.g. `$(0,1]$`)
inside links, images, bracketed spans.

Closes #7623.
2021-10-13 08:59:37 -07:00
John MacFarlane
c72277e986 LaTeX reader: Properly handle \^ followed by group closing.
Closes #7615.
2021-10-10 11:24:28 -07:00
John MacFarlane
92abe45863 Further test updates for switch to pretty-show. 2021-09-29 08:28:54 -07:00
John MacFarlane
0bdcf415e4 Switch from pretty-simple to pretty-show for native output.
Update tests.

Reason:  it turns out that the native output generated by
pretty-simple isn't always readable by the native reader.
According to https://github.com/cdepillabout/pretty-simple/issues/99
it is not a design goal of the library that the rendered values
be readable using 'read'.  This makes it unsuitable for our
purposes.

pretty-show is a bit slower and it uses 4-space indents
(non-configurable), but it doesn't have this serious drawback.
2021-09-28 21:17:53 -07:00
John MacFarlane
665e6d3d94 BibTeX parser: fix expansion of special strings in series...
e.g. `newseries` or `library`.  Expansion should not happen
when these strings are protected in braces, or when they're
capitalized.

Closes #7591.
2021-09-23 22:21:05 -07:00
John MacFarlane
aa89f6be18 HTML reader: handle empty tbody element in table.
Closes #7589.
2021-09-23 09:25:37 -07:00
John MacFarlane
c266734448 Use pretty-simple to format native output.
Previously we used our own homespun formatting.  But this
produces over-long lines that aren't ideal for diffs in tests.
Easier to use something off-the-shelf and standard.

Closes #7580.

Performance is slower by about a factor of 10, but this isn't
really a problem because native isn't suitable as a serialization
format. (For serialization you should use json, because the reader
is so much faster than native.)
2021-09-21 12:37:42 -07:00
John MacFarlane
5f7e7f539a Add missing % on command tests.
This prevented `--accept` from working properly.
2021-09-21 10:42:24 -07:00
John MacFarlane
57d93cca56 Org writer: don't indent contents of code blocks.
We previously indented them by two spaces, following a
common convention.  Since the convention is fading, and
the indentation is inconvenient for copy/paste, we are
discontinuing this practice.

Closes #5440.
2021-09-17 09:41:34 -07:00
John MacFarlane
a07d955d6f Fix code blocks using --preserve-tabs.
Previously they did not behave as the equivalent input
with spaces would.  Closes #7573.
2021-09-16 20:46:05 -07:00
John MacFarlane
a3162d341b RST reader: handle escaped colons in reference definitions.
Closes #7568.
2021-09-13 22:57:08 -07:00
John MacFarlane
8beca46611 Fix command test for #7557. 2021-09-10 12:07:11 -07:00
John MacFarlane
0216a2f504 Org reader: don't parse a list as first item in a list item.
Closes #7557.
2021-09-10 09:50:05 -07:00
Francesco Mazzoli
99a4d1d0b0 Support --reference-location for HTML output (#7461)
The HTML writer now supports `EndOfBlock`, `EndOfSection`, and
`EndOfDocument` for reference locations.  EPUB and HTML slide
show formats are also affected by this change.

This works similarly to the markdown writer, but takes special care
to skip section divs when determining the block level.

The change also takes care to not modify the output if `EndOfDocument`
is used.
2021-09-10 09:30:05 -07:00
John MacFarlane
5dcd4610e2 Improve asciidoc escaping for -- in URLs. Closes #7529. 2021-08-29 10:12:20 -07:00
John MacFarlane
7ff06a8c43 Fix test for #7521. 2021-08-24 12:56:31 -07:00
John MacFarlane
3f9b7a10ad Markdown reader: fix interaction of --strip-comments and list
parsing.  Use of `--strip-comments` was causing tight lists
to be rendered as loose (as if the comment were a blank line).
Closes #7521.
2021-08-23 22:06:39 -07:00
Simon Schuster
591cdca38b LaTeX-parser: restrict \endinput to current file 2021-08-21 18:08:27 -07:00
John MacFarlane
07d847a910 RST reader: Fix :literal: includes.
These should create code blocks, not insert raw RST.
Closes #7513.
2021-08-20 09:54:42 -07:00
John MacFarlane
fd99fe4d7e Revise citeproc code to fit new citeproc 0.5 API.
Linkification of URLs in the bibliography is now done in
the citeproc library, depending on the setting of an option.
We set that option depending on the value of the metadata
field `link-bibliography` (defaulting to true, for consistency
with earlier behavior, though the new behavior includes the
CSL draft recommendation of hyperlinking the title or the whole
entry if a DOI, PMID, PMCID, or URL field is present but not
explicitly rendered).

These changes implement the following recommendations from the
draft CSL v1.0.2 spec (Appendix VI):

> The CSL syntax does not have support for configuration of links.
> However, processors should include links on bibliographic references,
> using the following rules:

> If the bibliography entry for an item renders any of the following
> identifiers, the identifier should be anchored as a link, with the
> target of the link as follows:

> - url: output as is
> - doi: prepend with "`https://doi.org/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"

> If the identifier is rendered as a URI, include rendered URI components
> (e.g. "`https://doi.org/`") in the link anchor. Do not include any other
> affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: ").
> If the bibliography entry for an item does not render any of
> the above identifiers, then set the anchor of the link as the item
> title. If title is not rendered, then set the anchor of the link as the
> full bibliography entry for the item. Set the target of the link as one
> of the following, in order of priority:
>
> - doi: prepend with "`https://doi.org/`"
> - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`"
> - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`"
> - url: output as is
>
> If the item data does not include any of the above identifiers, do not
> include a link.
>
> Citation processors should include an option flag for calling
> applications to disable bibliography linking behavior.

Thanks to Benjamin Bray for getting this all working.
2021-08-17 15:34:23 -07:00
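
The fallback rule quoted from the draft spec (used when no identifier is rendered in the entry) amounts to a priority lookup. The Python sketch below illustrates that rule only; it is not the citeproc library's code, and `fallback_link_target` and the dictionary keys are names assumed for the example.

```python
from typing import Optional

# URL prefixes as quoted in the draft CSL text above.
PREFIXES = {
    "doi": "https://doi.org/",
    "pmcid": "https://www.ncbi.nlm.nih.gov/pmc/articles/",
    "pmid": "https://www.ncbi.nlm.nih.gov/pubmed/",
    "url": "",  # output as is
}

# Order of priority given in the quoted recommendation.
PRIORITY = ("doi", "pmcid", "pmid", "url")


def fallback_link_target(item: dict) -> Optional[str]:
    """Pick the link target for an entry that renders no identifier.

    Returns None when the item has none of the identifiers, in which
    case no link should be included.
    """
    for field in PRIORITY:
        value = item.get(field)
        if value:
            return PREFIXES[field] + value
    return None


if __name__ == "__main__":
    item = {"url": "https://example.org/paper", "doi": "10.1000/xyz"}
    print(fallback_link_target(item))
```

Here the DOI wins over the URL because it comes first in the priority order; the item data is made up for the example.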
John MacFarlane
2c466a15af Remove misleading description from command/citeproc-87 test. 2021-08-15 09:26:30 -07:00
John MacFarlane
82638ad53b Convert Quoted in bib entries to special Spans...
before passing them off to citeproc.
This ensures that we get proper localization and flipflopping
if, e.g., quotes are used in titles.

Closes jgm/citeproc#87.
2021-08-13 19:25:29 -07:00