Commit graph

7584 commits

Author SHA1 Message Date
John MacFarlane
c712d13b67 Org reader: allow an initial :PROPERTIES: drawer to add to metadata.
Closes #7520.
2021-10-22 22:10:25 -07:00
Aner Lucero
921af30854 Use simpleFigure in Readers. 2021-10-22 15:14:23 -07:00
Albert Krewinkel
c07005a095 Lua: marshal Version values as userdata 2021-10-22 11:16:51 -07:00
Albert Krewinkel
6a03aca906 Lua: marshal Inline elements as userdata
This includes the following user-facing changes:

- Deprecated inline constructors are removed. These are `DoubleQuoted`,
  `SingleQuoted`, `DisplayMath`, and `InlineMath`.

- Attr values are no longer normalized when assigned to an Inline
  element property.

- It's no longer possible to access parts of Inline elements via
  numerical indexes. E.g., `pandoc.Span('test')[2]` used to give
  `pandoc.Str 'test'`, but yields `nil` now. This was undocumented
  behavior not intended to be used in user scripts. Use named properties
  instead.

- Accessing `.c` to get a JSON-like tuple of all components no longer
  works. This was undocumented behavior.

- Only known properties can be set on an element value. Trying to set a
  different property will now raise an error.
2021-10-22 11:16:51 -07:00
Albert Krewinkel
8523bb01b2 Lua: marshal Attr values as userdata
- Adds a new `pandoc.AttributeList()` constructor, which creates the
  associative attribute list that is used as the third component of
  `Attr` values. Values of this type can often be passed to constructors
  instead of `Attr` values.

- `AttributeList` values can no longer be indexed numerically.
2021-10-22 11:16:51 -07:00
Albert Krewinkel
e4287e6c95 Lua: marshal Pandoc values as userdata 2021-10-22 11:16:51 -07:00
Albert Krewinkel
9e74826ba9 Switch to hslua-2.0
The new HsLua version takes a somewhat different approach to marshalling
and unmarshalling, relying less on typeclasses and more on specialized
types. This allows for better performance and improved error messages.

Furthermore, new abstractions make it possible to document the code and
the exposed functions.
2021-10-22 11:16:51 -07:00
John MacFarlane
ee34252219 Move splitStrWhen to T.P.Citeproc.Util.
Previously there were two copies, in BibTeX and Locator.
2021-10-21 22:27:07 -07:00
John MacFarlane
ef7b769fa0 SelfContained: fix bug that caused everything to be made a data uri.
All the code we needed to put most styles and scripts into
inline style and script tags was there, but because of the
order of pattern matching, it was never being called.
Putting the catch-all clause at the end fixes the bug.

Closes #7635, closes #7367.  See also #3423.
2021-10-21 21:51:53 -07:00
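The ordering problem described in this commit can be illustrated with a minimal Python sketch. This is not pandoc's code: the rule table, tag names, and action labels are hypothetical, and only the first-match-wins behaviour mirrors the description.

```python
# Hypothetical first-match-wins dispatcher; all names are illustrative only.
def make_dispatcher(rules):
    """Return a function that applies the first rule whose predicate matches."""
    def dispatch(tag):
        for predicate, action in rules:
            if predicate(tag):
                return action
        raise ValueError(f"no rule for {tag!r}")
    return dispatch

catch_all = (lambda tag: True, "data-uri")
specific = [
    (lambda tag: tag == "script", "inline-script"),
    (lambda tag: tag == "style", "inline-style"),
]

# Buggy order: the catch-all comes first, so the specific rules are unreachable.
buggy = make_dispatcher([catch_all] + specific)
# Fixed order: the catch-all comes last.
fixed = make_dispatcher(specific + [catch_all])

print(buggy("script"))  # data-uri
print(fixed("script"))  # inline-script
print(fixed("img"))     # data-uri
```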
John MacFarlane
0a93acf91a Markdown reader: don't parse links or bracketed spans as citations.
Previously pandoc would parse

    [link to (@a)](url)

as a citation; similarly

    [(@a)]{#ident}

This is undesirable.  One should be able to use example references
in citations, and even if `@a` is not defined as an example
reference, `[@a](url)` should be a link containing an author-in-text
citation rather than a normal citation followed by literal `(url)`.

Closes #7632.
2021-10-20 10:34:47 -07:00
John MacFarlane
7754b7f2dd FormatHeuristics: remove .tei.xml extension for TEI.
As noted in #7630, this never worked, because `takeExtension`
only returns `.xml`.  So it won't be missed if we remove it.

Closes #7630.
2021-10-19 08:05:30 -07:00
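As an analogy only, Python's `os.path.splitext` also returns just the final extension, which shows why a lookup keyed on `.tei.xml` can never match. The file name and the helper name `last_extension` are hypothetical.

```python
import os.path

def last_extension(path):
    """Return only the final extension of a path."""
    return os.path.splitext(path)[1]

ext = last_extension("thesis.tei.xml")
print(ext)                # .xml
print(ext == ".tei.xml")  # False, so a ".tei.xml" rule is never reached
```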
Milan Bracke
465c28d28e Docx reader: fix handling of empty fields
Some fields only have an instrText and no content. Pandoc didn't
understand these, which caused other fields to be misunderstood because
it seemed like a field was still open when it wasn't.
2021-10-18 19:15:40 -07:00
Milan Bracke
6acc82c5d2 Docx parser: implement PAGEREF fields
These fields, often used in tables of contents, can be a hyperlink.
2021-10-18 19:15:40 -07:00
Milan Bracke
193f6bfeba Docx reader: fix handling of nested fields
Fields delimited by fldChar elements can contain other fields. Before,
the nested fields would be ignored, except for the end, which would be
considered the end of the parent field.

To fix this issue, fields needed to be treated as containing ParParts
rather than Runs, since a Run can't represent complex enough structures.
This also impacted Hyperlinks since they can originate from a field.
2021-10-18 19:15:40 -07:00
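A rough sketch of the idea behind the field fixes above, assuming a simplified token stream in place of the real fldChar and instrText elements. The token shapes, the `parse_fields` helper, and the field arguments are hypothetical and do not reflect pandoc's Haskell implementation. A stack lets an `end` marker close only the innermost field, and a field without content is simply left empty.

```python
# Hypothetical tokens standing in for fldChar begin/separate/end markers.
def parse_fields(tokens):
    """Build a tree of fields; each field keeps its instruction and its content."""
    root = {"instr": "", "content": []}
    stack = [root]
    for kind, value in tokens:
        if kind == "begin":
            field = {"instr": "", "content": []}
            stack[-1]["content"].append(field)
            stack.append(field)          # a nested field opens a new level
        elif kind == "instr":
            stack[-1]["instr"] += value
        elif kind == "text":
            stack[-1]["content"].append(value)
        elif kind == "end":
            if len(stack) == 1:
                raise ValueError("field end without matching begin")
            stack.pop()                  # closes only the innermost field
        # "separate" needs no action in this sketch
    if len(stack) != 1:
        raise ValueError("unclosed field")
    return root["content"]

tokens = [
    ("begin", None), ("instr", "HYPERLINK x"), ("separate", None),
    ("text", "see "),
    ("begin", None), ("instr", "PAGEREF y"), ("end", None),  # empty field
    ("text", " here"),
    ("end", None),
]
outer = parse_fields(tokens)[0]
print(outer["instr"])    # HYPERLINK x
print(outer["content"])
```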
Emily Bourke
8de261ba4e pptx: Line up continuation paragraphs
This commit changes the `marL` and `indent` values used for plain
paragraphs and numbered lists, and changes the spacing defined in the
reference doc master for bulleted lists.

For paragraphs, there is now a left-indent taken from the `otherStyle`
in the master. For numbered lists, the number is positioned where the
text would be if this were a plain paragraph, and the text is indented
to the next level. This means that continuation paragraphs line up
nicely with numbered lists.

It also /mostly/ matches the observed PowerPoint behaviour when
inserting paragraphs and numbered lists: the only difference is that
PowerPoint was using a different margin value for the first level
numbered lists – I’ve changed this to match the other levels, as I don’t
think it makes the spacing unappealing and it allows continuation
paragraphs at any level to line up.

With bulleted lists, I’m keeping the observed PowerPoint behaviour of
specifying only a level, letting `marL` and `indent` be automatically
taken from `bodyStyle`. To that end, this commit changes the `bodyStyle`
spacing in the master of the default reference doc, to:

- line up the text of the first paragraph in each bullet with any
  continuation paragraphs
- line up nested bullet markers in any continuation paragraphs with the
  first paragraph, matching lists and plain paragraphs

This does mean the continuation paragraphs still won’t line up for
anyone using their own reference doc where they haven’t matched the
`otherStyle` and `bodyStyle` indent levels, but I think people in that
situation will be able to troubleshoot.
2021-10-17 17:24:30 -07:00
Emily Bourke
8981872bca pptx: Remove outdated comment
I recently removed the field this comment refers to, but missed the
comment.
2021-10-17 17:24:30 -07:00
Emily Bourke
8af15ab345 pptx: Fix list level numbering
In PowerPoint, the content of a top-level list is at the same level as
the content of a top-level paragraph – the only difference is that a
list style has been applied.

At the moment, the pptx writer increments the paragraph level on each
list, turning what should be top-level lists into second-level lists.

This commit changes that logic, only incrementing the paragraph level on
continuation paragraphs of lists.

- Fixes https://github.com/jgm/pandoc/issues/4828
- Fixes https://github.com/jgm/pandoc/issues/4663
2021-10-17 17:24:30 -07:00
Samuel Tardieu
a41c1fe0bb asciidoc writer: translate numberLines attribute to linesnum switch
AsciiDoctor allows requesting line numbering on code blocks by
using a switch on the `source` block, such as in:

```
[source%linesnum,haskell]
----
some Haskell code here
----
```
2021-10-14 13:41:12 -07:00
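A minimal Python sketch of the translation this commit describes, using the switch spelling from the commit message. The function name `source_block` and its parameters are hypothetical, and the real writer is Haskell.

```python
def source_block(code, language, classes):
    """Render a source block, adding the line-numbering switch when requested."""
    switch = "%linesnum" if "numberLines" in classes else ""
    return f"[source{switch},{language}]\n----\n{code}\n----"

print(source_block("main = pure ()", "haskell", ["numberLines"]))
```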
Samuel Tardieu
628cde48cf DocBook reader: honor linenumbering attribute
The DocBook linenumbering="numbered" attribute on code blocks
maps to "numberLines" internally.
2021-10-14 09:04:56 -07:00
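The mapping can be sketched in Python with the standard XML parser. The helper name `code_block_classes` and the sample listing are assumptions for illustration, not the reader's actual code.

```python
import xml.etree.ElementTree as ET

def code_block_classes(xml_text):
    """Return the classes a code block would get from its DocBook attributes."""
    elem = ET.fromstring(xml_text)
    classes = []
    if elem.get("linenumbering") == "numbered":
        classes.append("numberLines")
    return classes

sample = '<programlisting linenumbering="numbered">x = 1</programlisting>'
print(code_block_classes(sample))  # ['numberLines']
```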
Samuel Tardieu
ed8877bd68 Remove redundant $
Found by hlint 3.3.1
2021-10-14 08:29:58 -07:00
John MacFarlane
49c4e1d014 Fix markdown parsing bug for math in bracketed spans and links.
This affects math with unbalanced brackets (e.g. `$(0,1]$`)
inside links, images, bracketed spans.

Closes #7623.
2021-10-13 08:59:37 -07:00
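To illustrate the kind of problem involved, here is a simplified Python sketch of bracket matching that ignores brackets inside `$...$` math. It is an assumption-laden toy and not pandoc's parser: the function `find_closing_bracket` is hypothetical and it does not handle escaped characters.

```python
def find_closing_bracket(text, start):
    """Return the index of the ']' matching the '[' at start, skipping $...$ math."""
    if text[start] != "[":
        raise ValueError("start must point at '['")
    depth = 0
    in_math = False
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "$":
            in_math = not in_math  # toggle on each dollar sign
        elif in_math:
            continue               # brackets inside math do not count
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return i
    return None

text = "[interval $(0,1]$ here](url)"
close = find_closing_bracket(text, 0)
print(text[: close + 1])  # [interval $(0,1]$ here]
```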
John MacFarlane
c636b5dd16 Revert "Depend on pandoc-types 1.23, remove Null constructor on Block."
This reverts commit fb0d6c7cb6.
2021-10-12 21:00:15 -07:00
John MacFarlane
906e6016bb T.P.Writers.Shared: remove 'breakable'...
which was introduced in the cherry-pick'd commit that
added splitSentences, but isn't needed here.
(It is for the nospace branch.)
2021-10-11 15:08:33 -07:00
John MacFarlane
5d17020a20 T.P.Writers.Shared: Export splitSentences as a Doc Text transform.
[API change]

Use this in man/ms.
2021-10-11 09:45:22 -07:00
John MacFarlane
2befeaa29f Remove splitSentences from T.P.Shared [API change].
We used to attempt automatic sentence splitting in man and ms
output, since sentence-ending periods need to be followed by
two spaces or a newline in these formats.

But it's difficult to do this reliably at the level of
`[Inline]`.
2021-10-11 09:35:50 -07:00
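A naive Python heuristic shows why automatic sentence splitting is hard to do reliably. The function `naive_split_sentences` and its regular expression are illustrative assumptions and not what pandoc used.

```python
import re

def naive_split_sentences(text):
    """Put each apparent sentence on its own line (naive heuristic)."""
    # Break after ., ? or ! when followed by spaces and a capital letter.
    return re.sub(r"([.?!]) +(?=[A-Z])", r"\1\n", text)

print(naive_split_sentences("See e.g. the manual. Then stop."))
# The abbreviation below is wrongly treated as a sentence end.
print(naive_split_sentences("Dr. Smith arrived."))
```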
John MacFarlane
63ea754b49 Fix warning 2021-10-11 09:24:31 -07:00
John MacFarlane
84d68b92a2 LaTeX reader: Implement siunitx v3 commands.
We support `\unit`, `\qty`, `\qtyrange`, and `\qtylist`
as synonyms of `\si`, `\SI`, `\SIrange`, and `\SIlist`.

Closes #7614.
2021-10-11 08:54:45 -07:00
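The synonym relationship listed in this commit can be written as a small lookup table. This Python sketch is illustrative only: the names `SIUNITX_V3_SYNONYMS` and `canonical_command` are hypothetical.

```python
# Mapping taken from the commit message: v3 command name -> older equivalent.
SIUNITX_V3_SYNONYMS = {
    "unit": "si",
    "qty": "SI",
    "qtyrange": "SIrange",
    "qtylist": "SIlist",
}

def canonical_command(name):
    """Map a siunitx v3 command name to its older equivalent, if there is one."""
    return SIUNITX_V3_SYNONYMS.get(name, name)

print(canonical_command("qty"))      # SI
print(canonical_command("textbf"))   # textbf
```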
Milan Bracke
0f98cbff4b Avoid blockquote when parent style has more indent
When a paragraph has an indentation different from the parent (named)
style, it used to be considered a blockquote. But this only makes sense
when the paragraph has more indentation. So this commit adds a check
for the indentation of the parent style.
2021-10-10 16:27:32 -07:00
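The added check amounts to a comparison between two indentation values. A minimal Python sketch, assuming both indents are available as plain numbers (the function name, parameters, and units are hypothetical):

```python
def is_blockquote(paragraph_indent, parent_style_indent):
    """Treat a paragraph as a blockquote only if it is indented more than its parent style."""
    if paragraph_indent is None:
        return False                      # no explicit indent on the paragraph
    parent = parent_style_indent or 0     # a missing parent indent counts as zero
    return paragraph_indent > parent

print(is_blockquote(720, 0))     # True
print(is_blockquote(360, 720))   # False: the parent style has more indent
```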
John MacFarlane
c72277e986 LaTeX reader: Properly handle \^ followed by group closing.
Closes #7615.
2021-10-10 11:24:28 -07:00
John MacFarlane
d80aaee42b Translations: don't depend on the fact that Aeson Object is...
implemented internally as a HashMap.  This is no longer
public as of aeson 2.0.0.0.
2021-10-10 09:36:33 -07:00
John MacFarlane
5a1bd52677 Don't prepend file:// to --syntax-definition on Windows.
This was a fix for a problem in skylighting, but this
problem doesn't exist now that we've moved from HXT to
xml-conduit.

Cf. #6374.
2021-10-06 12:33:22 -07:00
John MacFarlane
6c507a66cf Avoid bad wraps in markdown writer at the Doc Text level.
Previously we tried to do this at the Inline list level,
but it makes more sense to intervene on breaking spaces
at the Doc Text level.
2021-10-05 08:44:51 -07:00
John MacFarlane
b8d460eeab Powerpoint writer: consolidate text runs when possible.
This slims down the output files by avoiding unnecessary
text run elements.

Updated golden tests.
2021-10-04 12:24:12 -07:00
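Consolidating text runs can be sketched as merging adjacent runs that share the same properties. In this Python example the `(properties, text)` tuple representation and the function name `consolidate_runs` are assumptions. The writer's real data types differ.

```python
from itertools import groupby

def consolidate_runs(runs):
    """Merge adjacent (properties, text) runs that have identical properties."""
    merged = []
    for props, group in groupby(runs, key=lambda run: run[0]):
        merged.append((props, "".join(text for _, text in group)))
    return merged

runs = [("bold", "Hello"), ("bold", " "), ("bold", "world"), ("plain", "!")]
print(consolidate_runs(runs))  # [('bold', 'Hello world'), ('plain', '!')]
```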
John MacFarlane
82d587493d Revert "Powerpoint writer: consolidate text run nodes."
This reverts commit 62f83aa486.

This was already being done, it seems.
I misidentified the problem; it is really with `Str ""` nodes.
2021-10-04 11:50:32 -07:00
John MacFarlane
62f83aa486 Powerpoint writer: consolidate text run nodes.
This should reduce the size of the generated files.
2021-10-04 11:45:01 -07:00
John MacFarlane
fb0d6c7cb6 Depend on pandoc-types 1.23, remove Null constructor on Block. 2021-10-01 15:42:00 -07:00
nuew
021cdb543b epub: Add EPUB3 subject metadata (authority/term)
This adds the ability to specify EPUB 3 `authority` and `term` specific
refinements to the `subject` tag. Specifying a plain `subject` tag in
metadata will function as before.
2021-09-30 20:53:07 -07:00
John MacFarlane
75d551f65b Add footnotes to default gfm extensions.
Now that `gfm` supports footnotes.
https://github.blog/changelog/2021-09-30-footnotes-now-supported-in-markdown-fields/
2021-09-30 13:08:13 -07:00
Ezwal
472b33095e Docx reader: Add placeholder for word diagram 2021-09-30 12:44:44 -07:00
John MacFarlane
45db998b39 EPUB writer: treat epub:type "frontispiece" as front matter.
This allows you to include a frontispiece using

```

![](yourimage.jpg)

etc.
```

Closes #7600.
2021-09-29 08:29:14 -07:00
John MacFarlane
0bdcf415e4 Switch from pretty-simple to pretty-show for native output.
Update tests.

Reason:  it turns out that the native output generated by
pretty-simple isn't always readable by the native reader.
According to https://github.com/cdepillabout/pretty-simple/issues/99
it is not a design goal of the library that the rendered values
be readable using 'read'.  This makes it unsuitable for our
purposes.

pretty-show is a bit slower and it uses 4-space indents
(non-configurable), but it doesn't have this serious drawback.
2021-09-28 21:17:53 -07:00
John MacFarlane
8018179b3d Better implementation of splitStrWhen 2021-09-27 16:43:13 -07:00
John MacFarlane
df57d0930b RST writer: properly handle anchors to ids...
with spaces or leading underscore.

In these cases we need the quoted form, e.g.
```
.. _`foo bar`:

.. _`_foo`:
```

Side note: rST will "normalize" these identifiers anyway,
ignoring the underscore:
https://docutils.sourceforge.io/docs/ref/rst/directives.html#identifier-normalization

Closes #7593.
2021-09-26 21:56:21 -07:00
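The quoting rule from this commit can be sketched in a few lines of Python. The function name `rst_anchor` is hypothetical, and only the two cases named in the message (spaces and a leading underscore) are handled.

```python
def rst_anchor(identifier):
    """Render an rST anchor, quoting ids with spaces or a leading underscore."""
    needs_quotes = " " in identifier or identifier.startswith("_")
    name = f"`{identifier}`" if needs_quotes else identifier
    return f".. _{name}:"

print(rst_anchor("foo bar"))  # .. _`foo bar`:
print(rst_anchor("_foo"))     # .. _`_foo`:
print(rst_anchor("foo"))      # .. _foo:
```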
John MacFarlane
665e6d3d94 BibTeX parser: fix expansion of special strings in series...
e.g. `newseries` or `library`.  Expansion should not happen
when these strings are protected in braces, or when they're
capitalized.

Closes #7591.
2021-09-23 22:21:05 -07:00
John MacFarlane
aa89f6be18 HTML reader: handle empty tbody element in table.
Closes #7589.
2021-09-23 09:25:37 -07:00
John MacFarlane
f0a6eb913d HTML writer: render \ref and \eqref as inline math...
not display.  See #7589.
2021-09-23 08:49:52 -07:00
John MacFarlane
0afb48cd38 HTML writer: pass through \ref and \eqref...
if MathJax is used.

Closes #7587.
2021-09-22 22:33:41 -07:00
John MacFarlane
7ab2f4a61d HTML writer: pass through inline math environments with KaTeX. 2021-09-22 22:18:38 -07:00
John MacFarlane
c266734448 Use pretty-simple to format native output.
Previously we used our own homespun formatting.  But this
produces over-long lines that aren't ideal for diffs in tests.
Easier to use something off-the-shelf and standard.

Closes #7580.

Performance is slower by about a factor of 10, but this isn't
really a problem because native isn't suitable as a serialization
format. (For serialization you should use json, because the reader
is so much faster than native.)
2021-09-21 12:37:42 -07:00
John MacFarlane
c9ce6da1bb LaTeX reader: Recognize that \vadjust sometimes takes "pre".
Closes #7531.
2021-09-19 10:12:07 -07:00