Fixed calculation of maximum column widths in pipe tables.
It is now based on the length of the markdown line, rather
than a "stringified" version of the parsed line. This should
be more predictable for users. In addition, we take into account
double-wide characters such as emojis.
Closes#7713.
The function converts a string to `Inlines`, treating interword spaces
as `Space`s or `SoftBreak`s. If you want a `Str` with literal spaces,
use `pandoc.Str`.
Closes: #7709
Using a Lua string where a list of inlines is expected will cause the
string to be split into words, replacing spaces and tabs into
`pandoc.Space()` elements and newlines into `pandoc.SoftBreak()`.
The previous behavior was to treat the string `s` as `{pandoc.Str(s)}`.
The old behavior can be recovered by wrapping the string into a table
`{s}`.
We need to generate a span when the header's ID doesn't match
the one MediaWiki would generate automatically. But MediaWiki's
generation scheme is different from ours (it uses uppercase letters,
and `_` instead of `-`, for example).
This means that in going from markdown -> mediawiki, we'll now get
spans before almost every heading, unless explicit identifiers are
used that correspond to the ones MediaWiki auto-generates.
This is uglier output but it's necessary for internal links to
work properly.
See #7697.
The `lpeg` and `re` modules are loaded into globals of the respective
name, but they are not necessarily registered as loaded packages. This
ensures that
- the built-in library versions are preferred when setting the globals,
- a shared library is used if pandoc has been compiled without `lpeg`,
and
- the `require` mechanism can be used to load the shared library if
available, falling back to the internal version if possible and
necessary.
Reader options can now be passed as an optional third argument to
`pandoc.read`. The object can either be a table or a ReaderOptions value
like `PANDOC_READER_OPTIONS`. Creating new ReaderOptions objects is
possible through the new constructor `pandoc.ReaderOptions`.
Closes: #7656
* Support for <indexterm>s when reading DocBook
* Update implementation status of `<n-ary>` tags
* Remove non-idiomatic parentheses
* More complete `<indexterm>` support, with tests
Co-authored-by: Rowan Rodrik van der Molen <rowan@ytec.nl>
Comparisons of Citation values are performed in Haskell; values are
equal if they represent the same Haskell value. Converting a Citation
value to a string now yields its native Haskell string representation.
Reasons:
- Performance: HsYAML is around 20 times slower in parsing
large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten
any attention. HsYAML seems borderline unmaintained; it hasn't
had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves
from C dependencies (#4535). But I don't see a better alternative
until a better pure Haskell parser is available.
Closes#6084.
Notes:
- We've removed the FromYAML instances for all types that had
them, since this is a HsYAML-specific typeclass [API change].
(The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml
parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values.
Users may need to quote these when they are meant to be
interpreted as strings. Similarly, 'null' is parsed as
a YAML null value (and will be treated as an empty string
by pandoc rather than the string 'null'). Quoting it will
force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains
escaping errors: instead of just falling back on treating
the section as a table, it raises a YAML parsing error.
Properties of Block values are marshalled lazily, which generally
improves performance considerably. Script users may also notice the
following differences:
- Block element properties can no longer be accessed by numerical
indexing of the `.c` field. The `.c` property now serves as an alias
for `.content`, so some filter that used this undocumented method
for property access may continue to work, while others will need to
be updated and use proper property names.
- The marshalled Block elements now have a `show` method, and a
`__tostring` metamethod. Both return the Haskell string
representation of the element.
- Block values now have the Lua type `userdata` instead of `table`.
- Adds a new `pandoc.AttributeList()` constructor, which creates the
associative attribute list that is used as the third component of
`Attr` values. Values of this type can often be passed to constructors
instead of `Attr` values.
- `AttributeList` values can no longer be indexed numerically.