Commit graph

1803 commits

Author SHA1 Message Date
Albert Krewinkel
a8638894ab
Lua: allow single elements as singleton MetaBlocks/MetaInlines
Single elements should always be treated as singleton lists in the Lua
subsystem.
2021-11-24 16:54:12 +01:00
John MacFarlane
79e6f8db13 Improve detection of pipe table line widths.
Fixed calculation of maximum column widths in pipe tables.
It is now based on the length of the markdown line, rather
than a "stringified" version of the parsed line.  This should
be more predictable for users. In addition, we take into account
double-wide characters such as emojis.

Closes #7713.
2021-11-23 13:29:25 -08:00
Albert Krewinkel
bffd74323c
Lua: add function pandoc.utils.text (#7710)
The function converts a string to `Inlines`, treating interword spaces
as `Space`s or `SoftBreak`s. If you want a `Str` with literal spaces,
use `pandoc.Str`.

Closes: #7709
2021-11-23 09:32:53 -08:00
Albert Krewinkel
0c0945b93c
Lua: split strings into words when treating them as Inline list (#7712)
Using a Lua string where a list of inlines is expected will cause the
string to be split into words, replacing spaces and tabs into
`pandoc.Space()` elements and newlines into `pandoc.SoftBreak()`.

The previous behavior was to treat the string `s` as `{pandoc.Str(s)}`.
The old behavior can be recovered by wrapping the string into a table
`{s}`.
2021-11-23 09:30:48 -08:00
Albert Krewinkel
96a4bbe264
Capture alt-text in JATS figures (#7703)
Co-authored-by: Aner Lucero <4rgento@gmail.com>
2021-11-20 09:48:01 -08:00
John MacFarlane
db9a73c842 Lua tests: reset path and cpath when testing 'require' fallback. 2021-11-19 21:55:14 -08:00
John MacFarlane
4f2eac88aa MediaWiki writer: fix code for generating spans for header IDs.
We need to generate a span when the header's ID doesn't match
the one MediaWiki would generate automatically.  But MediaWiki's
generation scheme is different from ours (it uses uppercase letters,
and `_` instead of `-`, for example).

This means that in going from markdown -> mediawiki, we'll now get
spans before almost every heading, unless explicit identifiers are
used that correspond to the ones MediaWiki auto-generates.
This is uglier output but it's necessary for internal links to
work properly.

See #7697.
2021-11-19 09:05:19 -08:00
John MacFarlane
df5ae1c186 HTML writer: Don't create invalid data- attribute...
for empty attribute key. (It would be better to make these
unrepresentable in the type system, but for now this is
an improvement.)

Closes #7546.
2021-11-19 08:50:18 -08:00
John MacFarlane
25bba0cc62 MediaWiki writer: use HTML spans for anchors when header has id.
Closes #7697.
2021-11-18 21:15:51 -08:00
willj-dev
005dc7ce56
RST reader: handle class attribute for for custom roles (#7700)
Previously the class attribute was ignored, and the name of the role used as the class.
Closes #7699.
2021-11-18 17:33:57 -08:00
Albert Krewinkel
cd91f72843
Lua: set lpeg, re as globals; allow shared lib access via require
The `lpeg` and `re` modules are loaded into globals of the respective
name, but they are not necessarily registered as loaded packages. This
ensures that

- the built-in library versions are preferred when setting the globals,
- a shared library is used if pandoc has been compiled without `lpeg`,
  and
- the `require` mechanism can be used to load the shared library if
  available, falling back to the internal version if possible and
  necessary.
2021-11-17 10:03:04 +01:00
John MacFarlane
c19f063420 Markdown writer: don't create autolinks when this loses information.
Previously we sometimes lost attributes when rendering links as autolinks.

Closes #7692.
2021-11-15 11:06:50 -08:00
Albert Krewinkel
ea268fd8a7
LaTeX reader: add rudimentary support for \autoref (#7693) 2021-11-15 09:40:50 -08:00
Albert Krewinkel
96a01451ef
JATS writer: ensure figures are wrapped with <p> in list items.
This prevents the generation of invalid output.
2021-11-12 13:29:08 +01:00
Christian Despres
abdfefebdf
Writers.Shared: Improve toLegacyTable.
Closes #7683.
(PR #7684)
2021-11-11 20:55:37 -08:00
Albert Krewinkel
d4c73d5e65
Lua: fix argument order in constructor pandoc.Cite.
This restores the old behavior; argument order had been switched
accidentally in pandoc 2.15.
2021-11-09 14:45:36 +01:00
John MacFarlane
0a45f2600a Properly handle commented lines in BibTeX/BibLaTeX.
Closes #7668.
2021-11-08 10:15:53 -08:00
Rowan Rodrik van der Molen
40aa74badc Add <titleabbr> support to DocBook reader 2021-11-08 07:30:20 -08:00
Albert Krewinkel
ab0fe676a8
Lua: ensure that 're' module is always available.
The module is shipped with LPeg.
2021-11-08 12:22:33 +01:00
John MacFarlane
cc46667953 LaTeX reader: add 'uri' class when parsing \url.
Closes #7672.
2021-11-07 21:08:20 -08:00
Albert Krewinkel
6b462e5933 Lua: allow to pass custom reader options to pandoc.read
Reader options can now be passed as an optional third argument to
`pandoc.read`. The object can either be a table or a ReaderOptions value
like `PANDOC_READER_OPTIONS`. Creating new ReaderOptions objects is
possible through the new constructor `pandoc.ReaderOptions`.

Closes: #7656
2021-11-06 09:04:29 -07:00
Rowan Rodrik van der Molen
7a70a46c03
Support for <indexterm>s when reading DocBook (#7607)
* Support for <indexterm>s when reading DocBook
* Update implementation status of `<n-ary>` tags
* Remove non-idiomatic parentheses
* More complete `<indexterm>` support, with tests

Co-authored-by: Rowan Rodrik van der Molen <rowan@ytec.nl>
2021-11-05 10:22:38 -07:00
John MacFarlane
938d557844 Docx reader: don't let first line indents trigger block quotes.
This fixes a regression introduced in pandoc 2.15 by PR #7606.
Closes #7655.
2021-11-02 14:04:38 -07:00
Albert Krewinkel
45bcd7d3f1
Lua: fix typo in SoftBreak constructor 2021-11-02 21:53:08 +01:00
Albert Krewinkel
dbc654e4a7
Lua tests: ensure Inline elements have all expected properties 2021-11-02 21:40:37 +01:00
Albert Krewinkel
421fd736d4
Lua: re-add content property to Strikeout elements
Fixes a regression introduced in 2.15.
2021-11-02 21:22:59 +01:00
Albert Krewinkel
cce49c5d4b
Lua: be more forgiving when retrieving the Image caption property
Fixes a regression introduced in 2.15.
2021-11-02 17:40:07 +01:00
Albert Krewinkel
b26f950cca
Lua: display Attr values using their native Haskell representation 2021-11-02 17:25:47 +01:00
Albert Krewinkel
c467f0fed1
Lua: allow omitting the 2nd parameter in pandoc.Code constructor
Fixes a regression introduced in 2.15 which required users to always
specify an Attr value when constructing a Code element.
2021-11-02 17:23:24 +01:00
Albert Krewinkel
210e4c98b0
Lua: allow to compare, show Citation values
Comparisons of Citation values are performed in Haskell; values are
equal if they represent the same Haskell value. Converting a Citation
value to a string now yields its native Haskell string representation.
2021-11-02 16:49:50 +01:00
Albert Krewinkel
3a0ac52f7b
Lua tests: ensure Block elements have expected properties 2021-11-02 10:23:14 +01:00
Albert Krewinkel
759aa50951
Lua: restore content property on Header elements 2021-11-01 15:43:51 +01:00
Albert Krewinkel
96e76d4cd4
Lua: restore List behavior of MetaList
Fixes a regression introduced in 2.16 which had MetaList elements loose
the `pandoc.List` properties.

Fixes #7650
2021-11-01 08:27:14 +01:00
Albert Krewinkel
3de8f4fdc5
Lua: re-add content property to Link elements
This was a regression introduced in version 2.15.

Fixes: #7647
2021-10-31 11:15:50 +01:00
Tristan Stenner
2444cbc668 Docx writer: add IDs to native_numbering test 2021-10-29 08:40:20 -07:00
Tristan Stenner
31d6a494de Update test golden master for docx native numbering 2021-10-29 08:40:20 -07:00
Albert Krewinkel
f4d9b443d8
Lua: use hslua module abstraction where possible
This will make it easier to generate module documentation in the future.
2021-10-29 17:08:30 +02:00
Albert Krewinkel
e1cf0ad1be
Lua: fix placement of tests for Block elements in pandoc module tests 2021-10-28 14:10:54 +02:00
Albert Krewinkel
7fcf1d6184
Lua: re-add t and tag property to Attr values
Removal of these properties from Attr values was a regression.
2021-10-27 22:32:19 +02:00
John MacFarlane
d226a35c0a Switch back from HsYAML to yaml.
Reasons:

- Performance: HsYAML is around 20 times slower in parsing
  large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten
  any attention.  HsYAML seems borderline unmaintained; it hasn't
  had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves
  from C dependencies (#4535).  But I don't see a better alternative
  until a better pure Haskell parser is available.

Closes #6084.

Notes:

- We've removed the FromYAML instances for all types that had
  them, since this is a HsYAML-specific typeclass [API change].
  (The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml
  parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values.
  Users may need to quote these when they are meant to be
  interpreted as strings.  Similarly, 'null' is parsed as
  a YAML null value (and will be treated as an empty string
  by pandoc rather than the string 'null').  Quoting it will
  force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains
  escaping errors: instead of just falling back on treating
  the section as a table, it raises a YAML parsing error.
2021-10-27 12:50:51 -07:00
Albert Krewinkel
b95e864ecf
Lua: marshal SimpleTable values as userdata objects 2021-10-26 21:45:16 +02:00
Albert Krewinkel
a493c7029c
Lua: marshal Block values as userdata objects
Properties of Block values are marshalled lazily, which generally
improves performance considerably. Script users may also notice the
following differences:

  - Block element properties can no longer be accessed by numerical
    indexing of the `.c` field. The `.c` property now serves as an alias
    for `.content`, so some filter that used this undocumented method
    for property access may continue to work, while others will need to
    be updated and use proper property names.

  - The marshalled Block elements now have a `show` method, and a
    `__tostring` metamethod. Both return the Haskell string
    representation of the element.

  - Block values now have the Lua type `userdata` instead of `table`.
2021-10-26 14:40:10 +02:00
Albert Krewinkel
230b133db5
Lua: marshal Citation values as userdata objects 2021-10-25 09:08:58 +02:00
John MacFarlane
113a66bd08 Fix more epub files in epub reader tests.
Closes #7586.
2021-10-24 15:24:59 -07:00
John MacFarlane
73a105f59c Clean up wasteland.epub and formatting.epub from reader tests.
Make them valid according to epubcheck.
2021-10-24 15:11:02 -07:00
John MacFarlane
c282451fb2 Fixed test/epub/img.epub and img_no_cover.epub...
so they're valid epubs.
2021-10-24 14:13:04 -07:00
John MacFarlane
6df9dc1262 Fix conformance errors in test/epub/features.epub and
test/epub/formatting.epub.  See #7586.
2021-10-23 23:23:22 -07:00
John MacFarlane
c712d13b67 Org reader: allow an initial :PROPERTIES: drawer to add to metadata.
Closes #7520.
2021-10-22 22:10:25 -07:00
Albert Krewinkel
8523bb01b2 Lua: marshal Attr values as userdata
- Adds a new `pandoc.AttributeList()` constructor, which creates the
  associative attribute list that is used as the third component of
  `Attr` values. Values of this type can often be passed to constructors
  instead of `Attr` values.

- `AttributeList` values can no longer be indexed numerically.
2021-10-22 11:16:51 -07:00
Albert Krewinkel
e4287e6c95 Lua: marshal Pandoc values as userdata 2021-10-22 11:16:51 -07:00