Commit graph

599 commits

Author SHA1 Message Date
Albert Krewinkel
b7a44f9d19 Copyright notices: update for 2022 2022-01-02 11:59:22 -08:00
John MacFarlane
7a9832166e Add text wrapping to HTML output.
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option.  With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.

A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.

In addition, we now add a line break between an `img` tag
and the associated `figcaption`.

Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings.  There's no real
need to wrap HTML inside a zipped container.

Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.

Closes #7764.
2021-12-22 09:45:02 -08:00
binaarinen
0610f16f7f
Add a writer for Markua 0.10 (#7729)
Markua is a markdown variant used by Leanpub.
More information about Markua can be found at https://leanpub.com/markua/read.

Adds a new exported function `writeMarkua` from T.P.Writers.Markdown.
[API change]

Closes #1871.

Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
2021-12-19 12:10:41 -08:00
John MacFarlane
426342f00d Fix typo. Closes jgm/pandoc-website#54. 2021-11-30 08:25:15 -08:00
John MacFarlane
79e6f8db13 Improve detection of pipe table line widths.
Fixed calculation of maximum column widths in pipe tables.
It is now based on the length of the markdown line, rather
than a "stringified" version of the parsed line.  This should
be more predictable for users. In addition, we take into account
double-wide characters such as emojis.

Closes #7713.
2021-11-23 13:29:25 -08:00
John MacFarlane
4857cd1769 Update man page. 2021-11-20 23:19:22 -08:00
John MacFarlane
946b959751 Fix misspelled extension name in manual.
`short_subsuperscript` -> `short_subsuperscripts`.
Closes #7690.
2021-11-14 20:55:39 -08:00
John MacFarlane
881b45209e Replace old sample custom reader with a full-featured reader for creole.
This is better as an example.  And it is faster than pandoc's
regular creole parser, which shows that high-performance readers
can be developed this way.
2021-11-07 14:34:56 -08:00
John MacFarlane
213913f025 Pass ReaderOptions to custom readers as second parameter. 2021-11-06 16:47:13 -07:00
John MacFarlane
ee2f0021f9
Add interface for custom readers written in Lua. (#7671)
New module Text.Pandoc.Readers.Custom, exporting
readCustom [API change].

Users can now do `-f myreader.lua` and pandoc will treat the
script myreader.lua as a custom reader, which parses an input
string to a pandoc AST, using the pandoc module defined for
Lua filters.

A sample custom reader can be found in data/reader.lua.

Closes #7669.
2021-11-05 22:10:29 -07:00
Albert Krewinkel
5750f60442
MANUAL.txt: update table of exit codes and corresponding errors 2021-11-05 13:21:24 +01:00
John MacFarlane
d1789dda75 Bump to 2.16.1, update changelog. 2021-11-02 16:16:21 -07:00
John MacFarlane
40655b54a2 Bump to 2.16, update changelog. 2021-10-31 00:01:03 -07:00
John MacFarlane
51e1a8601f Bump to 2.15, updaet man page. 2021-10-22 22:15:37 -07:00
nuew
021cdb543b epub: Add EPUB3 subject metadata (authority/term)
This adds the ability to specify EPUB 3 `authority` and `term` specific
refinements to the `subject` tag. Specifying a plain `subject` tag in
metadata will function as before.
2021-09-30 20:53:07 -07:00
John MacFarlane
45db998b39 EPUB writer: treat epub:type "frontispiece" as front matter.
This allows you to include a frontispiece using

```

![](yourimage.jpg)

etc.
```

Closes #7600.
2021-09-29 08:29:14 -07:00
John MacFarlane
9b23b738cb Update documentation for definition_list extension.
In 2015, we relaxed indentation requirements for the first
line of a definition (see commit d3544dc and issue #2087), but
the documnentation wasn't updated to reflect the change.

Closes #7594.
2021-09-26 21:20:27 -07:00
John MacFarlane
dd7b83ac91 Use babel, not polyglossia, with xelatex.
Previously polyglossia worked better with xelatex, but
that is no longer the case, so we simplify the code so that
babel is used with all latex engines.

This involves a change to the default LaTeX template.
2021-09-19 09:40:59 -07:00
Christophe Dervieux
44ec800b07 mention revealjs support of title-toc in MANUAL
Following addition in #7171
2021-09-17 09:21:39 -07:00
Emily Bourke
7c22c0202e pptx: Support specifying slide background images
In the reveal-js output, it’s possible to use reveal’s
`data-background-image` class on a slide’s title to specify a background
image for the slide.

With this commit, it’s possible to use `background-image` in the same
way for pptx output. Only the “stretch” mode is supported, and the
background image is centred around the slide in the image’s larger axis,
matching the observed default behaviour of PowerPoint.

- Support `background-image` per slide.
- Add tests.
- Update manual.
2021-09-16 19:45:53 -07:00
Emily Bourke
0fb6474a55 pptx: Add support for incremental lists
- Support -i option
- Support incremental/noincremental divs
- Support older block quote syntax
- Add tests

One thing not clear from the manual is what should happen when the input
uses a combination of these things. For example, what should the
following produce?

```md
::: {.incremental .nonincremental}
- are
- these
- incremental?
:::

::: incremental
::::: nonincremental
- or
- these?
:::::
:::

::: nonincremental
> - how
> - about
> - these?
:::
```

In this commit I’ve taken the following approach, matching the observed
behaviour for beamer and reveal.js output:

- if a div with both classes, incremental wins
- the innermost incremental/nonincremental div is the one which takes
  effect
- a block quote containing a list as its first element inverts whether
  the list is incremental, whether or not the quote is inside an
  incremental/non-incremental div

I’ve added some tests to verify this behaviour.

This commit closes issue #5689
(https://github.com/jgm/pandoc/issues/5689).
2021-09-15 09:13:05 -07:00
John MacFarlane
d43f9cf414 Add note to Security section that commonmark is better...
than markdown as far as pathological performance goes.
2021-09-12 11:10:05 -07:00
John MacFarlane
12b3ee3787 MANUAL: Document formats affected by --reference-location. 2021-09-10 09:32:51 -07:00
Francesco Mazzoli
99a4d1d0b0
Support --reference-location for HTML output (#7461)
The HTML writer now supports `EndOfBlock`, `EndOfSection`, and
`EndOfDocument` for reference locations.  EPUB and HTML slide
show formats are also affected by this change.

This works similarly to the markdown writer, but with special care
taken to skipping section divs with what regards to the block level.

The change also takes care to not modify the output if `EndOfDocument`
is used.
2021-09-10 09:30:05 -07:00
Emily Bourke
b82a01b688 pptx: Add support for more layouts
Until now, the pptx writer only supported four slide layouts: “Title
Slide” (used for the automatically generated metadata slide), “Section
Header” (used for headings above the slide level), “Two Column” (used
when there’s a columns div containing at least two column divs), and
“Title and Content” (used for all other slides).

This commit adds support for three more layouts: Comparison, Content
with Caption, and Blank.

- Support “Comparison” slide layout

  This layout is used when a slide contains at least two columns, at
  least one of which contains some text followed by some non-text (e.g.
  an image or table). The text in each column is inserted into the
  “body” placeholder for that column, and the non-text is inserted into
  the ObjType placeholder. Any extra content after the non-text is
  overlaid on top of the preceding content, rather than dropping it
  completely (as currently happens for the two-column layout).

  + Accept straightforward test changes

    Adding the new layout means the “-deleted-layouts” tests have an
    additional layout added to the master and master rels.

  + Add new tests for the comparison layout
  + Add new tests to pandoc.cabal

- Support “Content with Caption” slide layout

  This layout is used when a slide’s body contains some text, followed by
  non-text (e.g. and image or a table). Before now, in this case the image
  or table would break onto a new slide: to get that output again, users
  can add a horizontal rule before the image or table.

  + Accept straightforward tests

    The “-deleted-layouts” tests all have an extra layout and relationship
    in the master for the Content with Caption layout.

  + Accept remove-empty-slides test

    Empty slides are still removed, but the Content with Caption layout is
    now used.

  + Change slide-level-0/h1-h2-with-text description

    This test now triggers the content with caption layout, giving a
    different (but still correct) result.

  + Add new tests for the new layout
  + Add new tests to the cabal file

- Support “Blank” slide layout

  This layout is used when a slide contains only blank content (e.g.
  non-breaking spaces). No content is inserted into any placeholders in
  the layout.

  Fixes #5097.

  + Accept straightforward test changes

    Blank layout now copied over from reference doc as well, when
    layouts have been deleted.

  + Add some new tests

    A slide should use the blank layout if:

    - It contains only speaker notes
    - It contains only an empty heading with a body of nbsps
    - It contains only a heading containing only nbsps

- Change ContentType -> Placeholder

  This type was starting to have a constructor for each placeholder on
  each slide (e.g. `ComparisonUpperLeftContent`). I’ve changed it
  instead to identify a placeholder by type and index, as I think that’s
  clearer and less redundant.

- Describe layout-choosing logic in manual
2021-09-01 07:16:17 -07:00
John MacFarlane
6180d42434 Add more potential threats to security section of manual. 2021-08-28 22:31:42 -07:00
John MacFarlane
d6d7c9620a Add --sandbox option.
+ Add sandbox feature for readers.  When this option is used,
  readers and writers only have access to input files (and
  other files specified directly on command line).  This restriction
  is enforced in the type system.
+ Filters, PDF production, custom writers are unaffected.  This
  feature only insulates the actual readers and writers, not
  the pipeline around them in Text.Pandoc.App.
+ Note that when `--sandboxed` is specified, readers won't have
  access to the resource path, nor will anything have access to
  the user data directory.
+ Add module Text.Pandoc.Class.Sandbox, defining
  `sandbox`.  Exported via Text.Pandoc.Class. [API change]

Closes #5045.
2021-08-28 22:31:42 -07:00
William Lupton
af9d464cee Clarify 'attributes' extension support 2021-08-27 09:14:27 -07:00
John MacFarlane
aabcee6002 MANUAL: document error code 25 2021-08-22 08:45:36 -07:00
Salim B
03b50c0f78 Add some more info regarding --slide-level=0 2021-08-22 08:25:29 -07:00
Salim B
607d716624 Harmonize spelling of 'slide show' 2021-08-22 08:20:21 -07:00
John MacFarlane
9b8a691782 Update manual date and man page. 2021-08-20 21:56:32 -07:00
John MacFarlane
13cf02acfd MANUAL.txt/security: add a note on security risks of include directives. 2021-08-20 21:43:22 -07:00
William Lupton
5159d6653b Clarify that each YAML block is a separate YAML document 2021-08-19 10:37:24 -07:00
John MacFarlane
10dfd81000 Document new link-bibliography metadata field.
This affects whether hyperlinks are added to the bibliography
by citeproc.
2021-08-17 15:33:49 -07:00
Emily Bourke
72823ad947 pptx: Select layouts from reference doc by name
Until now, users had to make sure that their reference doc contains
layouts in a specific order: the first four layouts in the file had to
have a specific structure, or else pandoc would error (or sometimes
successfully produce a pptx file, which PowerPoint would then fail to
open).

This commit changes the layout selection to use the layout names rather
than order: users must make sure their reference doc contains four
layouts with specific names, and if a layout with the right name isn’t
found pandoc will output a warning and use the corresponding layout from
the default reference doc as a fallback.

I believe the use of names rather than order will be clearer to users,
and the clearer errors will help them troubleshoot when things go wrong.

- Add tests for moved layouts
- Add tests for deleted layouts
- Add newly included layouts to slideMaster1.xml to fix tests
2021-08-17 09:35:25 -07:00
OCzarnecki
e37cf4484d
Multimarkdown sub- and superscripts (#5512) (#7188)
Added an extension `short_subsuperscripts` which modifies the behavior
of `subscript` and `superscript`, allowing subscripts or superscripts containing only
alphanumerics to end with a space character (eg. `x^2 = 4` or `H~2 is
combustible`).  This improves support for multimarkdown. Closes #5512.

Add `Ext_short_subsuperscripts` constructor to `Extension` [API change].
This is enabled by default for `markdown_mmd`.
2021-08-15 21:57:57 -07:00
John MacFarlane
72447a563c Remove documentation for .ul shortcut for now.
See #7307.  Motivation: there is talk of removing this.
2021-08-15 21:44:46 -07:00
John MacFarlane
4340bd52c4 Make docx writer sensitive to native_numbering extension.
Figure and table numbers are now only included if `native_numbering`
is enabled.  (By default it is disabled.)  This is a behavior change
with respect to 2.14.1, but the behavior is that of previous versions.

The change was necessary to avoid incompatibilities between pandoc's
native numbering and third-party cross reference filters like
pandoc-crossref.

Closes #7499.
2021-08-15 15:05:54 -07:00
William Lupton
649133fe5e
Document use of the 'underline' class. (#7492)
Addresses a comment in #7484.
2021-08-12 10:18:22 -07:00
William Lupton
f00f7ec63c
Clarify internal punctuation in citation keys. (#7491)
Addresses a comment in #5458.
2021-08-11 11:12:45 -07:00
John MacFarlane
6543b05116 Add RTF reader.
- `rtf` is now supported as an input format as well as output.
- New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change]

Closes #3982.
2021-08-10 10:48:55 -07:00
John MacFarlane
c0b68b2030 Allow --slide-level=0.
When the slide level is set to 0, headings won't be used at all
in splitting the document into slides. Horizontal rules must be
used to separate slides.

Closes #7476.
2021-08-08 11:20:26 -07:00
John MacFarlane
6acb674e89 Remove obsolete and incorrect sentence in --slide-level docs. 2021-08-08 11:17:28 -07:00
John MacFarlane
c909d16ccc Bump to 2.14.1, update changelog and man page. 2021-07-18 11:39:04 -07:00
Michael Hoffmann
565330033a
Don't incorporate externally linked images in EPUB documents (#7430)
Just like it is possible to avoid incorporating an image in EPUB by
passing `data-external="1"` to a raw HTML snippet, this makes the same
possible for native Images, by looking for an associated `external`
attribute.
2021-07-07 09:26:37 -07:00
John MacFarlane
0c4ed0045a Bump to 2.14.0.3, update changelog, require latest skylighting. 2021-06-20 19:25:43 -07:00
John MacFarlane
961268446c Rephrase section on unsafe HTML in manual. 2021-06-14 12:36:05 -07:00
John MacFarlane
67b3c36a93 Bump to 2.14.0.2, update chaneglog and manual. 2021-06-10 22:47:53 -07:00
John MacFarlane
3776e828a8 Fix MediaBag regressions.
With the 2.14 release `--extract-media` stopped working as before;
there could be mismatches between the paths in the rendered document and
the extracted media.

This patch makes several changes (while keeping the same API).

The `mediaPath` in 2.14 was always constructed from the SHA1 hash of
the media contents.  Now, we preserve the original path unless it's
an absolute path or contains `..` segments (in that case we use a path
based on the SHA1 hash of the contents).

When constructing a path from the SHA1 hash, we always use the
original extension, if there is one. Otherwise we look up an
appropriate extension for the mime type.

`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather
than the mediabag key, for the first component of the tuple.
This makes more sense, I think, and fits with the documentation
of these functions; eventually, though, we should rework the API so that
`mediaItems` returns both the keys and the MediaItems.

Rewriting of source paths in `extractMedia` has been fixed.

`fillMediaBag` has been modified so that it doesn't modify
image paths (that was part of the problem in #7345).

We now do path normalization (e.g. `\` separators on Windows) only
in writing the media; the paths are left unchanged in the image
links (sensibly, since they might be URLs and not file paths).

These changes should restore the original behavior from before 2.14.

Closes #7345.
2021-06-10 16:47:02 -07:00