API changes:
- The function T.P.Filter.applyFilters now takes a filter
environment of type `Environment`, instead of a ReaderOptions value.
The `Environment` type is exported from `T.P.Filter` and allows to
combine ReaderOptions and WriterOptions in a single value.
- Global, exported from T.P.Lua, has a new type constructor
`PANDOC_WRITER_OPTIONS`.
Closes: #5221
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option. With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.
A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.
In addition, we now add a line break between an `img` tag
and the associated `figcaption`.
Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings. There's no real
need to wrap HTML inside a zipped container.
Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.
Closes#7764.
Markua is a markdown variant used by Leanpub.
More information about Markua can be found at https://leanpub.com/markua/read.
Adds a new exported function `writeMarkua` from T.P.Writers.Markdown.
[API change]
Closes#1871.
Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
The traversal order of filters can now be selected by setting the key
`traverse` of the filter to either `'topdown'` or `'typewise'`; the
default remains `'typewise'`.
Topdown traversals can be cut short by returning `false` as a second
value from the filter function. No child-element of the returned element
is processed in that case.
Previously, both `fmt == f` case and Image have a rank of 1.
In the end, e.g. from ipynb to html conversion,
if both html and image exists, it actually prefers the image.
This commit changes this, so that fmt == f is always highest rank,
and rank never collides.
This is achieved by keeping fmt == f case having rank 1,
and every other rank increased by 1.
The first argument passed to Lua `Reader` functions is no longer a plain
string but a richer data structure. The structure can easily be
converted to a string by applying `tostring`, but is also a list with
elements that contain each the *text* and *name* of each input source as
a property of the respective name.
A small example is added to the custom reader documentation, showcasing
its use in a reader that creates a syntax-highlighted code block for
each source code file passed as input.
Existing readers must be updated.
- `walk` methods are added to `Block` and `Inline` values; the methods
are similar to `pandoc.utils.walk_block` and
`pandoc.utils.walk_inline`, but apply to filter also to the element
itself, and therefore return a list of element instead of a single
element.
- Functions of name `Doc` are no longer accepted as alternatives for
`Pandoc` filter functions. This functionality was undocumented.
The marshaling functions for pandoc's AST are extracted into a separate
package. The package comes with a number of changes:
- Pandoc's List module was rewritten in C, thereby improving error
messages.
- Lists of `Block` and `Inline` elements are marshaled using the new
list types `Blocks` and `Inlines`, respectively. These types
currently behave identical to the generic List type, but give better
error messages. This also opens up the possibility of adding
element-specific methods to these lists in the future.
- Elements of type `MetaValue` are no longer pushed as values which
have `.t` and `.tag` properties. This was already true for
`MetaString` and `MetaBool` values, which are still marshaled as Lua
strings and booleans, respectively. Affected values:
+ `MetaBlocks` values are marshaled as a `Blocks` list;
+ `MetaInlines` values are marshaled as a `Inlines` list;
+ `MetaList` values are marshaled as a generic pandoc `List`s.
+ `MetaMap` values are marshaled as plain tables and no longer
given any metatable.
- The test suite for marshaled objects and their constructors has
been extended and improved.
- A bug in Citation objects, where setting a citation's suffix
modified it's prefix, has been fixed.
* Drop old windows 32-bit constraints
- basement >= 0.0.10 was 0.0.12 on stackage-18.10
- foundation >= 0.0.23 was 0.0.26.1 on stackage-18.10
* Update cabal `tested-with` field to correspond to `ci.yml` matrix
* ghc: 8.10.{2,4} → 8.10.7
This allows us to get rid of the old custom prelude and
some crufty cpp. But the primary reason for this is that
conduit has bumped its base lower bound to 4.12, making it
impossible for us to support lower base versions.
This is better as an example. And it is faster than pandoc's
regular creole parser, which shows that high-performance readers
can be developed this way.
New module Text.Pandoc.Readers.Custom, exporting
readCustom [API change].
Users can now do `-f myreader.lua` and pandoc will treat the
script myreader.lua as a custom reader, which parses an input
string to a pandoc AST, using the pandoc module defined for
Lua filters.
A sample custom reader can be found in data/reader.lua.
Closes#7669.
Compiles the 'lpeg' library (Parsing Expression Grammars For Lua) into
the program.
Package maintainers may choose to rely on package dependencies to make
lpeg available, in which case they can compile the with the constraint
`lpeg +rely-on-shared-lpeg-library`.
This fixes issues with
- misleading error messages when a required function parameter is
omitted;
- absent properties still being listed in the output of `pairs`; and
- alias accessing leading to errors instead of returning `nil`, e.g.
with `(pandoc.Str '').identifier`.
Fixes: #7661
See also: #7657
Reasons:
- Performance: HsYAML is around 20 times slower in parsing
large YAML bibliographies (#6084).
- An issue was submitted to HsYAML, but it hasn't gotten
any attention. HsYAML seems borderline unmaintained; it hasn't
had a commit in over a year.
- Unfortunately this goes back on our attempts to free ourselves
from C dependencies (#4535). But I don't see a better alternative
until a better pure Haskell parser is available.
Closes#6084.
Notes:
- We've removed the FromYAML instances for all types that had
them, since this is a HsYAML-specific typeclass [API change].
(The yaml package just uses From/ToJSON.)
- Unlike HsYAML (in the configuration we were using), yaml
parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values.
Users may need to quote these when they are meant to be
interpreted as strings. Similarly, 'null' is parsed as
a YAML null value (and will be treated as an empty string
by pandoc rather than the string 'null'). Quoting it will
force it to be interpreted as a string.
- Some tests had to be adjusted accordingly.
- Pandoc now behaves better when the YAML metadata contains
escaping errors: instead of just falling back on treating
the section as a table, it raises a YAML parsing error.
- Adds a new `pandoc.AttributeList()` constructor, which creates the
associative attribute list that is used as the third component of
`Attr` values. Values of this type can often be passed to constructors
instead of `Attr` values.
- `AttributeList` values can no longer be indexed numerically.
The new HsLua version takes a somewhat different approach to marshalling
and unmarshalling, relying less on typeclasses and more on specialized
types. This allows for better performance and improved error messages.
Furthermore, new abstractions allow to document the code and exposed
functions.
In PowerPoint, the content of a top-level list is at the same level as
the content of a top-level paragraph – the only difference is that a
list style has been applied.
At the moment, the pptx writer increments the paragraph level on each
list, turning what should be top-level lists into second-level lists.
This commit changes that logic, only incrementing the paragraph level on
continuation paragraphs of lists.
- Fixes https://github.com/jgm/pandoc/issues/4828
- Fixes https://github.com/jgm/pandoc/issues/4663