Commit graph

7711 commits

Author SHA1 Message Date
John MacFarlane
6121de369c Use latest version of KaTeX. 2022-01-01 23:29:46 -08:00
Albert Krewinkel
1e60181ee3 Lua: provide global PANDOC_WRITER_OPTIONS [API change]
API changes:

- The function T.P.Filter.applyFilters now takes a filter
  environment of type `Environment`, instead of a ReaderOptions value.
  The `Environment` type is exported from `T.P.Filter` and allows to
  combine ReaderOptions and WriterOptions in a single value.

- Global, exported from T.P.Lua, has a new type constructor
  `PANDOC_WRITER_OPTIONS`.

Closes: #5221
2022-01-01 14:31:42 -08:00
Albert Krewinkel
b5da58e8b4
Apply some HLint suggestions 2022-01-01 20:22:24 +01:00
Albert Krewinkel
eae9be3a48
Org reader: allow trailing spaces after key/value pairs in directives
Ensures that spaces at the end of attribute directives like
`#+ATTR_HTML: :width 100%` (note the trailing spaces) are accepted.
2022-01-01 13:44:14 +01:00
Albert Krewinkel
e58a5ceed8
Lua: marshal ReaderOptions field extensions, track_changes via JSON
Extensions are now available as a list of strings; the track-changes
settings are given as the kebab-case representation used in JSON.
2022-01-01 13:44:13 +01:00
Albert Krewinkel
03054a33e8 Lua: use global state when parsing documents in pandoc.read
The function `pandoc.read` is updated to use the same state that was
used while parsing the main input files. This ensures that log messages
are preserved and that images embedded in the input are added to the
mediabag.
2021-12-31 17:35:52 -08:00
Albert Krewinkel
d6e66b1f1d
Lua: cleanup stack in peekReadOptionsTable
A ReaderOptions element was left on top of the stack when the
`peekReadOptionsTable` function was invoked.
2021-12-31 11:02:16 +01:00
John MacFarlane
7ff1b798c4 Docx reader: handle multiple pic elements inside a drawing.
Closes #7786.
2021-12-30 21:26:30 -08:00
John MacFarlane
cc30d646ca Docx reader: change elemToParPart to return [ParPart]
...instead of ParPart.

Also remove NullParPart constructor, as it is no longer
needed.

This will allow us to handle elements that contain multiple
ParParts, e.g. w:drawing elements with multiple pic:pic.

See #7786.
2021-12-30 21:26:30 -08:00
John MacFarlane
4ff997bf68 Fix ghc 9.2.1 warnings. 2021-12-30 21:26:30 -08:00
Albert Krewinkel
2dd1cde715 Lua: allow binary (byte string) readers to be used with pandoc.read 2021-12-30 22:41:15 +01:00
John MacFarlane
d960282b10 Use splitDirectories istead of splitPath.
We were using `splitPath` in two places in the code
where `splitDirectories` should have been used.

This led to a test for `..` in paths in `extractMedia`
failing, so that images with `..` in the path name
could be extracted outside the directory specified
by `extractMedia`.

It also led a test for `media` in resource paths to fail
in the docx reader.
2021-12-28 16:31:54 -08:00
John MacFarlane
7d56650e01 OpenDocument writer: fix vertical align bug with display math.
Previously some displayed formulas would be floated above
a preceding text line.  This is fixed by setting vertical-rel
to 'text' rather than 'paragraph-content'.

Closes #7777.
2021-12-28 16:06:25 -08:00
Albert Krewinkel
fbd2c8e376
Lua: improve handling of empty caption, body by from_simple_table
Create truly empty table caption and body when these are empty in the
simple table.

Fixes: #7776
2021-12-25 21:20:18 +01:00
John MacFarlane
811601aa8b RTF writer: properly handle images in data URIs.
See #7771.
2021-12-22 11:59:07 -08:00
John MacFarlane
c4f6e6cb57 HTML writer: make line breaks more consistent.
- With `--wrap=none`, we now output line breaks between
  block-level elements. Previously they were omitted
  entirely, so the whole document was on one line, unless
  there were literal line breaks in pre sections.  This makes
  the HTML writer's behavior more consistent with that of
  other writers.

- Put newline after `<dd>`.

- Put newlines after block-level elements in footnote section.
2021-12-22 09:45:02 -08:00
John MacFarlane
7a9832166e Add text wrapping to HTML output.
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option.  With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.

A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.

In addition, we now add a line break between an `img` tag
and the associated `figcaption`.

Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings.  There's no real
need to wrap HTML inside a zipped container.

Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.

Closes #7764.
2021-12-22 09:45:02 -08:00
Albert Krewinkel
0bdf373157
Lua: simplify code of pandoc.utils.stringify
Minor behavior change: plain strings nested in tables are now included
in the result string.
2021-12-21 21:50:13 +01:00
Albert Krewinkel
17a32a99a5
Lua: simplify and deprecate function pandoc.utils.equals
The function is no longer required for element comparisons; it is now an
alias for the `==` operator.
2021-12-21 19:01:11 +01:00
Albert Krewinkel
d7cab51982 Lua: add new library function pandoc.utils.type.
The function behaves like the default `type` function from Lua's
standard library, but is aware of pandoc userdata types. A typical
use-case would be to determine the type of a metadata value.
2021-12-21 09:24:21 -08:00
Albert Krewinkel
c90802d7d8
Lua: fix return types of blocks_to_inlines, make_sections
Ensures the returned lists have the correct type (`Inlines` and
`Blocks`, respectively).
2021-12-21 09:53:44 +01:00
Albert Krewinkel
cd2bffee1e
Lua: use more natural representation for Reference values
Omit `false` boolean values, push integers as numbers.
2021-12-20 09:41:03 +01:00
Albert Krewinkel
993222d2c9
Custom writer: assign default Pandoc object to global PANDOC_DOCUMENT
The default Pandoc object is now non-strict, i.e., only the parts of the
document that are accessed will be marshaled to Lua. A special type is
no longer necessary.

This change also makes it possible to use the global variable with
library functions such as `pandoc.utils.references`, or to inspect the
document contents with `walk()`.
2021-12-19 23:17:27 +01:00
binaarinen
0610f16f7f
Add a writer for Markua 0.10 (#7729)
Markua is a markdown variant used by Leanpub.
More information about Markua can be found at https://leanpub.com/markua/read.

Adds a new exported function `writeMarkua` from T.P.Writers.Markdown.
[API change]

Closes #1871.

Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
2021-12-19 12:10:41 -08:00
Albert Krewinkel
f8f03c2ffc JATS writer: keep quotes in element-citations
The JATS writer was losing quotes in element-citations, as it uses the
`T.P.Citeproc.getReferences` function to get references. That function
replaces `Quoted` elements with spans. That transformation is required
in `T.P.Citeproc.processCitations`, so it has been moved there.
2021-12-19 12:03:01 -08:00
Albert Krewinkel
dc3dcc2ccd
Lua: fixup, should have been part of previous commit 2021-12-19 14:31:52 +01:00
John MacFarlane
4b220d592c Citeproc: avoid adding comma before an author-in-text citation...
...in a note if it begins with a title (no author).

Closes #7761.
2021-12-18 12:13:06 -08:00
Albert Krewinkel
7a70b87fac Lua: add function pandoc.utils.references
List with all cited references of a document.

Closes: #7752
2021-12-17 14:45:27 -08:00
John MacFarlane
61ffa55835 T.P.Citeproc: do not export getStyle, getCiteprocLang.
This commit undoes the API changes noted in
ea77f2e6f6

They are no longer needed, and we should avoid unnecessary
API changes.
2021-12-15 16:05:15 -08:00
John MacFarlane
a527a2f345 Org writer: use the citation locator list from the org source code...
which is not localized, instead of getting locators from the
localized CSL stylesheet as we did before.
2021-12-14 20:30:55 -08:00
John MacFarlane
394fa9d072 Org reader: parse official org-cite citations.
We also support the older org-ref style as a fallback.
We no longer support the "markdown-style" citations.

See #7329.
2021-12-14 11:34:32 -08:00
John MacFarlane
be0e3f9794 Markdown writer: avoid extra space before citation suffix...
if it already starts with a space.
2021-12-14 11:34:32 -08:00
John MacFarlane
d393f2f158 Markdown writer: ensure semicolon btw locator and next citation...
when an author-in-text citation has a locator and following
citations.
2021-12-14 11:34:32 -08:00
John MacFarlane
5817e86491 Org reader: remove support for "Berkeley style" citations.
See #7329.
2021-12-14 09:20:26 -08:00
John MacFarlane
9f089aa286 Org writer: add tests for org-cite citations, and improve support. 2021-12-13 12:11:58 -08:00
John MacFarlane
2015c9070d Markdown reader: fix parsing of "bare locators"...
...after author-in-text citations.

Previously `@item [p. 12; @item2]` was incorrectly parsed as
three citations rather than two.  This is now fixed by ensuring
that `prefix` doesn't gobble any semicolons.
2021-12-13 12:11:58 -08:00
John MacFarlane
ea77f2e6f6 Citeproc changes:
T.P.Citeproc exports `getCiteprocLang` and `getStyle` [API change].

T.P.Citeproc.Locator now exports `toLocatorMap`, `LocatorInfo`,
and `LocatorMap`.  The type of `parseLocator` has changed, so
it now takes a `LocatorMap` rather than a `Locale` as parameter,
and returns a `LocatorInfo` instead of a tuple.
2021-12-13 12:11:58 -08:00
John MacFarlane
0679620f92 Org writer: preliminary support for new org-cite syntax.
See #7329.

This could use some tests.
2021-12-12 23:42:13 -08:00
Kolen Cheung
a9a9a2c62a fix(IpynbOutput)!: rank always favors output format
Previously, both `fmt == f` case and Image have a rank of 1.
In the end, e.g. from ipynb to html conversion,
if both html and image exists, it actually prefers the image.
This commit changes this, so that fmt == f is always highest rank,
and rank never collides.
This is achieved by keeping fmt == f case having rank 1,
and every other rank increased by 1.
2021-12-11 09:42:30 -08:00
Albert Krewinkel
e88224621d Custom reader: ensure old Readers continue to work
Retry conversion by passing a string instead of sources when the
`Reader` fails with a message that hints at an outdated function. A
deprecation notice is reported in that case.
2021-12-11 08:59:11 -08:00
Albert Krewinkel
83b5b79c0e Custom reader: pass list of sources instead of concatenated text
The first argument passed to Lua `Reader` functions is no longer a plain
string but a richer data structure. The structure can easily be
converted to a string by applying `tostring`, but is also a list with
elements that contain each the *text* and *name* of each input source as
a property of the respective name.

A small example is added to the custom reader documentation, showcasing
its use in a reader that creates a syntax-highlighted code block for
each source code file passed as input.

Existing readers must be updated.
2021-12-11 08:59:11 -08:00
Albert Krewinkel
3e7b46af64
Switch to released pandoc-lua-marshal-0.1.2
Cell values are now marshaled as userdata objects; a constructor
function for table cells is provided as `pandoc.Cell`.
2021-12-10 17:24:50 +01:00
Kolen Cheung
20eb8ac7fd
ipynb writer: handle cell output with raw block of markdown (#7563)
Write RawBlock of markdown in code-cell output.

#7561 makes the ipynb reader reads code-cell output with mime
"text/markdown" to a RawBlock of markdown

This commit makes the ipynb writer writes this RawBlock of markdown
back inside a code-cell output with the same mime, preserving this
information in round-trip

Add tests of ipynb reader (#7561) and ipynb writer (#7563)'s ability to
handle a "text/markdown" mime type in a code-cell output
2021-12-09 20:36:56 -08:00
Albert Krewinkel
fa643ba6d7 Lua: update to latest pandoc-lua-marshal (0.1.1)
- `walk` methods are added to `Block` and `Inline` values; the methods
  are similar to `pandoc.utils.walk_block` and
  `pandoc.utils.walk_inline`, but apply to filter also to the element
  itself, and therefore return a list of element instead of a single
  element.

- Functions of name `Doc` are no longer accepted as alternatives for
  `Pandoc` filter functions. This functionality was undocumented.
2021-12-09 09:22:29 -08:00
John MacFarlane
9cbea695c4 Ipynb writer: ensure deterministic order of keys. 2021-12-08 23:21:39 -08:00
John MacFarlane
45e51ecd65 Revert "Markdown reader: Improve inlinesInBalancedBrackets."
This reverts commit fa83246d7d.
2021-12-07 23:52:29 -08:00
John MacFarlane
51142c6803 Ipynb reader & writer: properly handle cell "id".
This is passed through if it exists (in Nb4); otherwise
the writer will add a random one so that cells all have
an "id".

Closes #7728.
2021-12-06 23:40:51 -08:00
John MacFarlane
23b2617bf7 Ms writer: properly encode strings for PDF contents.
Closes #7731.
2021-12-06 12:00:08 -08:00
John MacFarlane
36807db531 Commonmark writer: allow ')' delimiters on ordered lists. 2021-12-05 11:26:01 -08:00
John MacFarlane
51f6f0e3a1 Improve Markdown writer escaping.
This fixes escaping for '#' in particular.
Closes #7726.
2021-12-03 17:52:47 -08:00