7744 commits

Author SHA1 Message Date
Lucas Viana
fb91a91615 Org reader: support alphabetical (fancy) lists
This adds support for alphabetical lists in org by enabling the
extension Ext_fancy_lists, mimicking the behaviour of Org Mode when
org-list-allow-alphabetical is enabled.

Enabling Ext_fancy_lists will also make Pandoc differentiate between the
delimiters of ordered lists (periods or closing parentheses). Org does
this differentiation by default when exporting to some formats (e.g.
plain text) but does not in others (e.g. html and latex), so I decided
to copy Pandoc's markdown reader behaviour.
2022-01-09 09:39:27 -08:00
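To illustrate the delimiter distinction described in the commit above, here is a minimal Python sketch (not the Haskell code in the Org reader). The style and delimiter names follow pandoc's `ListNumberStyle` and `ListNumberDelim` constructors; the function name `classify_marker` and the regular expression are assumptions made for this example, and only decimal and single-letter markers are covered.

```python
import re

# One decimal number or a single letter, followed by "." or ")".
MARKER = re.compile(r"^(\d+|[A-Za-z])([.)])$")

def classify_marker(marker):
    """Return (style, delimiter) for an ordered-list marker, or None."""
    match = MARKER.match(marker)
    if match is None:
        return None
    body, delim = match.groups()
    if body.isdigit():
        style = "Decimal"
    elif body.islower():
        style = "LowerAlpha"
    else:
        style = "UpperAlpha"
    # With fancy lists enabled, "1." and "1)" are kept distinct.
    delimiter = "Period" if delim == "." else "OneParen"
    return style, delimiter

print(classify_marker("a)"))
print(classify_marker("1."))
```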
John MacFarlane
66636c89b0 Org writer: fix list items starting with a code block...
or other non-paragraph content.

Closes #7810.
2022-01-08 23:21:15 -08:00
John MacFarlane
61968047e4 Avoid blank lines after tight sublists in org, haddock.
T.P.Writers.Shared `endsWithPlain` now returns True if
the list ends with a list which ends with a Plain.

See #7810.
2022-01-08 23:11:08 -08:00
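The recursive check described in the commit above can be sketched in Python as follows. This is an illustration only: the tuple-based block representation and the name `ends_with_plain` are assumptions for the example, not pandoc's AST or the actual `endsWithPlain` definition.

```python
# Toy block representation (an assumption for illustration):
#   ("Plain", text), ("Para", text), ("BulletList", [item, ...])
# where each item is itself a list of blocks.

def ends_with_plain(blocks):
    """True if the last block is a Plain, or a list whose last item
    itself ends with a Plain (checked recursively)."""
    if not blocks:
        return False
    kind, payload = blocks[-1]
    if kind == "Plain":
        return True
    if kind == "BulletList":
        # Look into the final item of the trailing sublist.
        return bool(payload) and ends_with_plain(payload[-1])
    return False

tight_nested = [("Plain", "item"), ("BulletList", [[("Plain", "sub")]])]
loose_nested = [("Plain", "item"), ("BulletList", [[("Para", "sub")]])]
print(ends_with_plain(tight_nested), ends_with_plain(loose_nested))
```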
John MacFarlane
1b7bdb1016 RST writer: avoid extra blank line after empty list item.
See #7810 (2).
2022-01-08 21:24:41 -08:00
John MacFarlane
8736fe11ee Org writer: fix extra blank line inserted after empty list item.
Addresses issue 2 from #7810.
2022-01-08 21:22:21 -08:00
John MacFarlane
252211bd27 Org writer: don't add blank line before lists.
The code to do this was apparently copied over from the RST
writer, but these blank lines aren't necessary or desirable
in org.  See #7810 comment 3.
2022-01-08 19:39:26 -08:00
John MacFarlane
a6741bd555 writeMedia: unescape percent-encoding in creating file path.
Closes #7819 (problem with spaces in image filenames when creating
PDFs).
2022-01-08 19:10:46 -08:00
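As a rough illustration of the fix above, the following Python sketch decodes percent-escapes before building a file path. The function name `media_file_path` and the directory layout are hypothetical; pandoc's `writeMedia` is Haskell and may differ in detail.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote

def media_file_path(media_dir, resource_name):
    """Build the path for a media item, decoding escapes such as %20
    so the file on disk has a literal space in its name."""
    return str(PurePosixPath(media_dir) / unquote(resource_name))

print(media_file_path("media", "my%20image.png"))
```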
John MacFarlane
2b51f54e19 toLocatorMap: store keys as lowercase.
We want to do a case-insensitive comparison when parsing
locators, so that e.g. both `Chap.` and `chap.` work.

Previously we lowercased terms when doing the lookup,
but they weren't lowercased in the map itself, which
led to locator-detection breaking for German (where the
terms have uppercase letters).

See
https://groups.google.com/d/msgid/pandoc-discuss/1dd44886-7b79-4e5f-97ec-57b91113df36n%40googlegroups.com
2022-01-08 16:57:59 -08:00
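The idea behind the commit above is that both sides of the comparison must be lowercased. A minimal Python sketch, assuming a hypothetical list of (term, label) pairs rather than pandoc's real locale data:

```python
def to_locator_map(terms):
    """Store every locale term under its lowercase form."""
    return {term.lower(): label for term, label in terms}

def lookup_locator(locator_map, word):
    """Lowercase the word too, so the comparison is case-insensitive."""
    return locator_map.get(word.lower())

# Hypothetical German terms, which contain uppercase letters.
german = to_locator_map([("Kap.", "chapter"), ("S.", "page")])
print(lookup_locator(german, "Kap."), lookup_locator(german, "kap."))
```

If the map kept `Kap.` as its key while the lookup lowercased the input to `kap.`, the lookup would miss, which is the failure the commit describes.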
John MacFarlane
2986a06aaa T.P.Readers.LaTeX.SIunitx: explicit imports. 2022-01-07 18:00:57 -08:00
John MacFarlane
a965111680 Fix parsing of footnotes in --metadata-file.
Closes #7813.
2022-01-07 15:58:26 -08:00
Lucas Viana
45e2e0d018 Org writer: support starting number cookies
This complements #7806 by supporting writing Org ordered lists that
start at a specific number.
2022-01-07 10:48:28 -08:00
John MacFarlane
d562de5039 Add LaTeX babel mappings for Gujarati (gu) and Oriya (or).
Closes #7815.
2022-01-07 10:25:34 -08:00
John MacFarlane
90e74c2b76 Fix typo panjabi -> punjabi.
This affects the mapping to Babel language names in the
LaTeX reader and writer.  Closes #7814.
2022-01-07 10:08:41 -08:00
Jesse Hathaway
4dcb2b6fb6 MediaWiki writer: Remove redundant display text for wiki links
Prior to this commit the MediaWiki writer always added the display
text for a wiki link:

    * [[Help|Help]]
    * [[Bubbles|Everyone loves bubbles]]

However the display text in the first example is redundant since
MediaWiki uses the target as the default display text. The result being:

    * [[Help]]
    * [[Bubbles|Everyone loves bubbles]]
2022-01-06 15:05:39 -08:00
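The rule in the commit above amounts to one comparison. A small Python sketch, with `wiki_link` as a hypothetical helper name (the MediaWiki writer itself is written in Haskell):

```python
def wiki_link(target, display):
    """Render a MediaWiki link, dropping the display text when it
    merely repeats the target."""
    if display == target:
        return f"[[{target}]]"
    return f"[[{target}|{display}]]"

print(wiki_link("Help", "Help"))
print(wiki_link("Bubbles", "Everyone loves bubbles"))
```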
John MacFarlane
0d99a131b1 reveal.js: Make sure images with r-stretch are not in p tags.
They must be direct children of the section.
There was previously code to make this work with the older
class name `stretch`.
See https://github.com/jgm/pandoc/issues/5965#issuecomment-1006623836
2022-01-06 10:56:33 -08:00
John MacFarlane
517d7a9cd3 reveal.js: don't add r-fit-text class to section.
It must go on header only.  See
https://github.com/jgm/pandoc/issues/5965#issuecomment-1006623836
2022-01-06 10:45:15 -08:00
Lucas Viana
4be41e3bb5 Org reader: support counter cookies in lists
This adds support for counter cookies in org lists. Such cookies are
used to override the item counter in ordered lists. In org it is
possible to set the counter at any list item, but since Pandoc AST does
not support this, we restrict the usage to setting an offset for the
entire ordered list, by using the cookie in the first list item.

Note that even though unordered lists do not have counters, Org Mode
still parses such cookies in unordered lists and suppresses them in the
output, so we do the same.

Also, even though org-list-allow-alphabetical is disabled in Emacs by
default, for some reason alphabetical cookies are always parsed and used
in Org Mode regardless of whether this option is enabled or the list
style is decimal, so we do the same.

E.g.
 2. test
 3. test
Is parsed as an ordered list starting at 1, as before. This also
conforms to Org Mode behaviour.

 1. [@2] test
 2. test
Is now parsed as an ordered list starting at 2, so that it conforms to
Org Mode behaviour.

Note that when parsing
 1. [@2] test
 2. [@9] test
the second cookie is silenced and the entire list starts at 2. This is
because the current Pandoc AST does not support expressing a change in
the counter at a specific item.
2022-01-06 19:33:13 +01:00
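The cookie handling described in the commit above can be sketched in Python. This is a simplified illustration, not the Org reader's parser: it assumes the list markers have already been removed from each item, handles only decimal and single-letter cookies, and the names `parse_ordered_list` and `cookie_value` are made up for the example.

```python
import re

# An Org counter cookie such as "[@2]" or "[@b]" at the start of an item.
COOKIE = re.compile(r"^\[@(\d+|[A-Za-z])\]\s*")

def cookie_value(token):
    """Convert a cookie payload to a number; letters count a=1, b=2, ..."""
    if token.isdigit():
        return int(token)
    return ord(token.lower()) - ord("a") + 1

def parse_ordered_list(items):
    """Return (start, texts) for a list of item texts.

    Only a cookie on the first item sets the start number. Cookies on
    later items are stripped and ignored, since one start number applies
    to the whole list.
    """
    start = 1
    texts = []
    for index, item in enumerate(items):
        match = COOKIE.match(item)
        if match:
            if index == 0:
                start = cookie_value(match.group(1))
            item = item[match.end():]
        texts.append(item)
    return start, texts

print(parse_ordered_list(["[@2] test", "[@9] test"]))
```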
John MacFarlane
ea74582288 AsciiDoc writer: improve detection of intraword emphasis.
Closes #7803.
2022-01-05 14:48:19 -08:00
Albert Krewinkel
1f8638fb54 Lua: add pandoc.template module
The module provides a `compile` function to use strings as templates.
2022-01-04 11:55:59 -08:00
Albert Krewinkel
974a9d353a Lua: marshal templates as opaque userdata values 2022-01-04 11:55:59 -08:00
Albert Krewinkel
6a5ac90bf1 Lua: add pandoc.WriterOptions constructor 2022-01-04 11:55:59 -08:00
Albert Krewinkel
0d1d52f0a0 Lua: add function pandoc.write 2022-01-04 11:55:59 -08:00
Albert Krewinkel
5c53837259 Stop exporting writeCustom from module T.P.Writers [API change]
This ensures that all writers exported in T.P.Writers are parameterized
and work with any `PandocMonad` type. This is consistent with
T.P.Readers, as `readCustom` is not exported from that module either.
2022-01-04 11:55:59 -08:00
John MacFarlane
322364ff66 Markdown writer: fix indentation issue in footnotes.
Closes #7801.
2022-01-03 18:50:44 -08:00
John MacFarlane
53699f2ab3 DocBook reader: be sensitive to spacing="compact" in lists.
When spacing="compact" is set, Para elements are turned
into Plain, so we get a "tight" list.

Closes #7799.
2022-01-03 14:19:53 -08:00
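A minimal Python sketch of the Para-to-Plain conversion described above, using a toy (kind, text) tuple for blocks. The representation and the name `list_item_blocks` are assumptions for illustration and do not reflect the DocBook reader's actual code.

```python
def list_item_blocks(blocks, spacing=None):
    """Turn Para blocks into Plain when the list is marked compact,
    which is what makes the resulting list "tight"."""
    if spacing != "compact":
        return blocks
    return [("Plain", text) if kind == "Para" else (kind, text)
            for kind, text in blocks]

print(list_item_blocks([("Para", "one")], spacing="compact"))
```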
John MacFarlane
ca7a3ed5ed Issue error with --list-extensions for invalid formats.
Closes #7797.
2022-01-03 11:08:14 -08:00
John MacFarlane
cdfdfae4dd parseFormatSpec: cleaner error message for invalid extensions. 2022-01-03 10:32:57 -08:00
John MacFarlane
0abfe4fdab Minor code improvement. 2022-01-03 10:26:18 -08:00
John MacFarlane
75c5218d4f Don't read sources until in/out format are verified.
Partially addresses #7797.
2022-01-03 10:16:07 -08:00
Tuong Nguyen Manh
32297d5677 Odt: Add list-header
The list-header is a type of list-item.
Therefore, it will be treated exactly like one.
2022-01-02 15:05:09 -08:00
Albert Krewinkel
b7a44f9d19 Copyright notices: update for 2022 2022-01-02 11:59:22 -08:00
Albert Krewinkel
efdba79ad1 Lua writer: allow variables to be set via second return value of Doc
New template variables can be added by giving variable-value pairs as a
second return value of the global function `Doc`.

Example:

    function Doc (body, meta, vars)
      vars.date = vars.date or os.date '%B %e, %Y'
      return body, vars
    end

Closes: #6731
2022-01-02 11:55:02 -08:00
Albert Krewinkel
85334eb6c4 Lua writer: provide global PANDOC_WRITER_OPTIONS
Closes: #6731
2022-01-02 13:57:01 +01:00
John MacFarlane
6121de369c Use latest version of KaTeX. 2022-01-01 23:29:46 -08:00
Albert Krewinkel
1e60181ee3 Lua: provide global PANDOC_WRITER_OPTIONS [API change]
API changes:

- The function T.P.Filter.applyFilters now takes a filter
  environment of type `Environment`, instead of a ReaderOptions value.
  The `Environment` type is exported from `T.P.Filter` and allows
  ReaderOptions and WriterOptions to be combined in a single value.

- Global, exported from T.P.Lua, has a new type constructor
  `PANDOC_WRITER_OPTIONS`.

Closes: #5221
2022-01-01 14:31:42 -08:00
Albert Krewinkel
b5da58e8b4 Apply some HLint suggestions 2022-01-01 20:22:24 +01:00
Albert Krewinkel
eae9be3a48 Org reader: allow trailing spaces after key/value pairs in directives
Ensures that spaces at the end of attribute directives like
`#+ATTR_HTML: :width 100%` (note the trailing spaces) are accepted.
2022-01-01 13:44:14 +01:00
Albert Krewinkel
e58a5ceed8 Lua: marshal ReaderOptions field extensions, track_changes via JSON
Extensions are now available as a list of strings; the track-changes
settings are given as the kebab-case representation used in JSON.
2022-01-01 13:44:13 +01:00
Albert Krewinkel
03054a33e8 Lua: use global state when parsing documents in pandoc.read
The function `pandoc.read` is updated to use the same state that was
used while parsing the main input files. This ensures that log messages
are preserved and that images embedded in the input are added to the
mediabag.
2021-12-31 17:35:52 -08:00
Albert Krewinkel
d6e66b1f1d Lua: cleanup stack in peekReadOptionsTable
A ReaderOptions element was left on top of the stack when the
`peekReadOptionsTable` function was invoked.
2021-12-31 11:02:16 +01:00
John MacFarlane
7ff1b798c4 Docx reader: handle multiple pic elements inside a drawing.
Closes #7786.
2021-12-30 21:26:30 -08:00
John MacFarlane
cc30d646ca Docx reader: change elemToParPart to return [ParPart]
...instead of ParPart.

Also remove NullParPart constructor, as it is no longer
needed.

This will allow us to handle elements that contain multiple
ParParts, e.g. w:drawing elements with multiple pic:pic.

See #7786.
2021-12-30 21:26:30 -08:00
John MacFarlane
4ff997bf68 Fix ghc 9.2.1 warnings. 2021-12-30 21:26:30 -08:00
Albert Krewinkel
2dd1cde715 Lua: allow binary (byte string) readers to be used with pandoc.read 2021-12-30 22:41:15 +01:00
John MacFarlane
d960282b10 Use splitDirectories instead of splitPath.
We were using `splitPath` in two places in the code
where `splitDirectories` should have been used.

This led to a test for `..` in paths in `extractMedia`
failing, so that images with `..` in the path name
could be extracted outside the directory specified
by `extractMedia`.

It also caused a test for `media` in resource paths to fail
in the docx reader.
2021-12-28 16:31:54 -08:00
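The difference between the two functions explains why the `..` test failed. The Python sketch below approximates the behaviour of Haskell's `System.FilePath` `splitPath` and `splitDirectories` for simple relative POSIX-style paths only; the Python function names and the sample path are assumptions for this example.

```python
def split_path(path):
    """Approximate splitPath: each piece keeps its trailing '/'."""
    pieces = []
    current = ""
    for ch in path:
        current += ch
        if ch == "/":
            pieces.append(current)
            current = ""
    if current:
        pieces.append(current)
    return pieces

def split_directories(path):
    """Approximate splitDirectories: pieces carry no separator."""
    return [piece for piece in path.split("/") if piece]

def has_parent_reference(components):
    """The kind of check that guards against escaping the target directory."""
    return ".." in components

path = "media/../../secret.png"
print(split_path(path))         # components look like "../", so ".." is not found
print(split_directories(path))  # components look like "..", so the check works
```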
John MacFarlane
7d56650e01 OpenDocument writer: fix vertical align bug with display math.
Previously some displayed formulas would be floated above
a preceding text line.  This is fixed by setting vertical-rel
to 'text' rather than 'paragraph-content'.

Closes #7777.
2021-12-28 16:06:25 -08:00
Albert Krewinkel
fbd2c8e376 Lua: improve handling of empty caption, body by from_simple_table
Create truly empty table caption and body when these are empty in the
simple table.

Fixes: #7776
2021-12-25 21:20:18 +01:00
John MacFarlane
811601aa8b RTF writer: properly handle images in data URIs.
See #7771.
2021-12-22 11:59:07 -08:00
John MacFarlane
c4f6e6cb57 HTML writer: make line breaks more consistent.
- With `--wrap=none`, we now output line breaks between
  block-level elements. Previously they were omitted
  entirely, so the whole document was on one line, unless
  there were literal line breaks in pre sections.  This makes
  the HTML writer's behavior more consistent with that of
  other writers.

- Put newline after `<dd>`.

- Put newlines after block-level elements in footnote section.
2021-12-22 09:45:02 -08:00
John MacFarlane
7a9832166e Add text wrapping to HTML output.
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option.  With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.

A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.

In addition, we now add a line break between an `img` tag
and the associated `figcaption`.

Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings.  There's no real
need to wrap HTML inside a zipped container.

Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.

Closes #7764.
2021-12-22 09:45:02 -08:00