With the 2.14 release `--extract-media` stopped working as before;
there could be mismatches between the paths in the rendered document and
the extracted media.
This patch makes several changes (while keeping the same API).
The `mediaPath` in 2.14 was always constructed from the SHA1 hash of
the media contents. Now, we preserve the original path unless it's
an absolute path or contains `..` segments (in that case we use a path
based on the SHA1 hash of the contents).
When constructing a path from the SHA1 hash, we always use the
original extension, if there is one. Otherwise we look up an
appropriate extension for the mime type.
`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather
than the mediabag key, for the first component of the tuple.
This makes more sense, I think, and fits with the documentation
of these functions; eventually, though, we should rework the API so that
`mediaItems` returns both the keys and the MediaItems.
Rewriting of source paths in `extractMedia` has been fixed.
`fillMediaBag` has been modified so that it doesn't modify
image paths (that was part of the problem in #7345).
We now do path normalization (e.g. `\` separators on Windows) only
in writing the media; the paths are left unchanged in the image
links (sensibly, since they might be URLs and not file paths).
These changes should restore the original behavior from before 2.14.
Closes#7345.
- Add manual entry for (non-default) extension
`rebase_relative_paths`.
- Add constructor `Ext_rebase_relative_paths` to `Extensions`
in Text.Pandoc.Extensions [API change]. When enabled, this
extension rewrites relative image and link paths by prepending
the (relative) directory of the containing file.
- Make Markdown reader sensitive to the new extension.
- Add tests for #3752.
Closes#3752.
NB. currently the extension applies to markdown and associated
readers but not commonmark/gfm.
The change provides a way to use citation keys that contain
special characters not usable with the standard citation
key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}`.
Closes#6026.
The change requires adding a new parameter to the `citeKey`
parser from Text.Pandoc.Parsing [API change].
Markdown reader: recognize @{..} syntax for citatinos.
Markdown writer: use @{..} syntax for citations when needed.
Update manual with curly-brace syntax for citations.
Closes#6026.
This is a bit more limited than with markdown, as documented
in the manual:
- The YAML block must be the first thing in the input.
- The leaf notes are parsed in isolation from the rest of
the document. So, for example, you can't use reference
links if the references are defined later in the document.
Closes#6537.
Previously, if `--resource-path` were used multiple times, the last
resource path would replace the others.
With this change, each time `--resource-path` is used, it prepends
the specified path components to the existing resource path.
Similarly, when `resource-path` is specified in a defaults file,
the paths provided will be prepended to the existing resource
path.
This change also allows one to avoid using the OS-specific path
separator; instead, one can simply use `--resource-path`
a number of times with single paths. This form of command
will not have an OS-dependent behavior.
This change facilitates the use of multiple, small defaults
files: each can specify a directory containing its own
resources without clobbering the resource paths set by
the others.
Closes#6152.
to refer to the directory where the default file is.
This will make it possible to create moveable
"packages" of resources in a directory.
Closes#5871.
This allows the syntax `${HOME}` to be used, in fields that expect
file paths only. Any environment variable may be interpolated
in this way. A warning will be raised for undefined variables.
The special variable `USERDATA` is automatically set to the
user data directory in force when the defaults file is parsed.
(Note: it may be different from the eventual user data directory,
if the defaults file or further command line options change that.)
Closes#5982.
Closes#5977.
Closes#6108 (path not taken).
Rationale: the manual says that the XDG data directory will
be used if it exists, otherwise the legacy data directory.
So we should just determine this and use this directory,
rather than having a search path which could cause some
things to be taken from one data directory and others from
others.
[API change]
This exports functions that uses xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.
The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).
Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
The new parser also reports errors, which we report
when possible.
A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].
Closes#7091, which was the main stimulus.
These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.
Add entity defs to docbook-reader.docbook.
Update golden tests for docx.
Load the iftex package directly rather than via the ifxetex and ifluatex compatibility
wrappers, which have been merged into a single package that is part of the LaTeX core.
The capitalization of the commands has been changed for compatibility with older
versions of TeX Live that have the version of iftex by the Persian TeX Group. This had
been removed in
<2845794c0c>
for compatibility with BasicTeX, but that is no longer an issue.
* `biblatex` and `bibtex` are now supported as output
as well as input formats.
* New module Text.Pandoc.Writers.BibTeX, exporting
writeBibTeX and writeBibLaTeX. [API change]
* New unexported function `writeBibtexString` in
Text.Pandoc.Citeproc.BibTeX.
Allow defaults files to inherit options from other defaults files by
specifying them with the following syntax:
`defaults: [list of defaults files or single defaults file]`.
* Add `Ext_sourcepos` constructor for `Extension`.
* Add `sourcepos` extension (only for commonmark).
* Bump to 2.11.3
With the `sourcepos` extension set set, `data-pos` attributes are added
to the AST by the commonmark reader. No other readers are affected. The
`data-pos` attributes are put on elements that accept attributes; for
other elements, an enlosing Div or Span is added to hold the attributes.
Closes#4565.
This commit adds two extensions to the OpenDocument writer,
`xrefs_name` and `xrefs_number`.
Links to headings, figures and tables inside the document are
substituted with cross-references that will use the name or caption
of the referenced item for `xrefs_name` or the number for `xrefs_number`.
For the `xrefs_number` to be useful heading numbers must be enabled
in the generated document and table and figure captions must be enabled using for example the `native_numbering` extension.
In order for numbers and reference text to be updated the generated
document must be refreshed.
Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
Previously we used Setext (underlined) headings by default.
The default is now ATX (`##` style).
* Add the `--markdown-headings=atx|setext` option.
* Deprecate `--atx-headers`.
* Add constructor 'ATXHeadingInLHS` constructor to `LogMessage` [API change].
* Support `markdown-headings` in defaults files.
* Document new options in MANUAL.
Closes#6662.
- Fix margin before codeblock
- Add `monobackgroundcolor` variable, making the background color
and padding of code optional.
- Ensure that backgrounds from highlighting styles take precedence over
monobackgroundcolor
- Remove list markers from TOC
- Add margin-bottom where needed
- Remove italics from blockquote styling
- Change borders and spacing in tables to be more consistent with other
output formats
- Style h5, h6
- Decrease root font-size to 18px
- Update tests for styles.html changes
- Add CSS example to MANUAL
This deprecates the use of the external pandoc-citeproc
filter; citation processing is now built in to pandoc.
* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported
modules under Text.Pandoc.Citeproc). Exports `processCitations`.
[API change]
* Add data files needed for Text.Pandoc.Citeproc: default.csl
in the data directory, and a citeproc directory that is just
used at compile-time. Note that we've added file-embed as a mandatory
rather than a conditional depedency, because of the biblatex
localization files. We might eventually want to use readDataFile
for this, but it would take some code reorganization.
* Text.Pandoc.Loging: Add `CiteprocWarning` to `LogMessage` and use it
in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including
some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and
release binary build instructions. We will no longer distribute
pandoc-citeproc.
* Markdown reader: tweak abbreviation support. Don't insert a
nonbreaking space after a potential abbreviation if it comes right before
a note or citation. This messes up several things, including citeproc's
moving of note citations.
* Add `csljson` as and input and output format. This allows pandoc
to convert between `csljson` and other bibliography formats,
and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API
change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API
change]
* Added `bibtex`, `biblatex` as input formats. This allows pandoc
to convert between BibLaTeX and BibTeX and other bibliography formats,
and to generated formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
`readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format.
This is needed because pandoc readers for bibliography formats put
the bibliographic information in the `references` field of metadata;
and unless standalone is specified, metadata gets ignored.
(TODO: This needs improvement. We should trigger standalone for the
reader when the input format is bibliographic, and for the writer
when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
[API change] This runs the processCitations transformation.
We need to treat it like a filter so it can be placed
in the sequence of filter runs (after some, before others).
In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`,
so this special filter may be specified either way in a defaults file
(or by `citeproc: true`, though this gives no control of positioning
relative to other filters). TODO: we need to add something to the
manual section on defaults files for this.
* Add deprecation warning if `upandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing.
This behaves like a filter and will be positioned
relative to filters as they appear on the command line.
* Rewrote the manual on citatations, adding a dedicated Citations
section which also includes some information formerly found in
the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data
directory. This changes the old pandoc-citeproc behavior, which looked
in `~/.csl`. Users can simply symlink `~/.csl` to the `csl`
subdirectory of their pandoc user data directory if they want
the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML,
Ms writers. Added CSL-related CSS to styles.html.
This gives a rule that has been been superseded by
commit 47537d26db.
The section is concerned to explain a discrepancy with
original Markdown.pl and its test suite. In the case
under consideration, Markdown.pl gave strange results
which pandoc corrected. I think it's no longer worth
wasting space on this, as its behavior seems clearly wrong.
If we are going to comment on every edge case with
Markdown.pl, the manual will get too long.
Babelmark 2 shows that some of the older implementations
follow Markdown.pl -- PHP Markdown, Python Markdown,
redcarpet, discount.
https://johnmacfarlane.net/babelmark2/?normalize=1&text=%2B+++First%0A%2B+++Second%0A++++-+a%0A++++-+b%0A%0A%2B+Third%0ACloses#6684.
Specifying `-f ipynb+raw_markdown` will cause Markdown cells
to be represented as raw Markdown blocks, instead of being
parsed. This is not what you want when going from `ipynb`
to other formats, but it may be useful when going from `ipynb`
to Markdown or to `ipynb`, to avoid semantically insignificant
changes in the contents of the Markdown cells that might
otherwise be introduced.
Closes#5408.
Instead rely on the markdown writer with appropriate extensions.
Export writeCommonMark variant from Markdown writer.
This changes a few small things in rendering markdown,
e.g. w/r/t requiring backslashes before spaces inside
super/subscripts.
This allows attributes to be added to any block or inline
element, in principle. (Though in many cases this will be
done by adding a Div or Span container, since pandoc's
AST doesn't have a slot for attributes for most elements.)
Currently this is only possible with the commonmark and gfm
readers.
Add `Ext_attributes` constructor for `Extension` [API change].
This commit adds the option `--no-check-certificate`, which disables certificate
checking when resources are fetched by HTTP.
Co-authored-by: Cécile Chemin <cecile.chemin@insee.fr>
Co-authored-by: Juliette Fourcot <juliette.fourcot@insee.fr>
This reverts commit 3493d6afaa.
This might be worth considering in the future, but let's not do
it yet...the additional complexity needs a better justification.
This is experimental. Normally metadata values are interpreted
as markdown, but if the !!literal tag is used they will be interpreted
as plain strings.
We need to consider whether this can still be implemented if
we switch back from HsYAML to yaml for performance reasons.
New formats:
- `jats_archiving` for the "Archiving and Interchange Tag Set",
- `jats_publishing` for the "Journal Publishing Tag Set", and
- `jats_articleauthoring` for the "Article Authoring Tag Set."
The "jats" output format is now an alias for "jats_archiving".
Closes: #6014
This follows the advise on the Lua
website (https://www.lua.org/about.html#name):
> […] "Lua" is a name, the name of the Earth's moon and the name of the
> language. Like most names, it should be written in lower case with an
> initial capital, that is, "Lua".
With positive heading shifts, starting in 2.8 this option caused
metadata titles to be removed and changed to regular headings.
This behavior is incompatible with the old behavior of
`--base-header-level` and breaks old workflows, so with this
commit we are rolling back this change.
Now, there is an asymmetry in positive and negative heading
level shifts:
+ With positive shifts, the metadata title stays the same and
does not get changed to a heading in the body.
+ With negative shifts, a heading can be converted into the
metadata title.
I think this is a desirable combination of features, despite
the asymmetry. One might, e.g., want to have a document
with level-1 section headigs, but render it to HTML with
level-2 headings, retaining the metadata title (which pandoc
will render as a level-1 heading with the default template).
Closes#5957.
Revises #5615.
Certain options (`--self-contained`, `--include-in-header`, etc.)
imply `--standalone`. We now handle this after option parsing
so that it affects options specified in defaults files too.
Behavior change: `--title-prefix` no longer implies `--standalone`.
Certain command-line arguments can be repeated:
`--metadata-file`, `--css`, `--include-in-header`,
`--include-before-body`, `--include-after-body`, `--variable`,
`--metadata`, `--syntax-definition`. In these cases, values
specified in default files should be added to the list rather
than replacing values specified earlier on the command line
(perhaps in other default files).
So, for example, if one does
pandoc --variable foo=3 --defaults d1 --defaults d2
and `d1` sets the variable `bar` and `d2` sets `baz`,
all three variables will be set.
Closes#5894.
Superscripts and subscripts cannot contain spaces,
but newlines were previously allowed (unintentionally).
This led to bad interactions in some cases with footnotes.
E.g.
```
foo^[note]
bar^[note]
```
With this change newlines are also not allowed inside
super/subscripts.
Closes#5878.
Previously, if a document contained two YAML metadata blocks
that set the same field, the conflict would be resolved in favor
of the first. Now it is resolved in favor of the second (due to
a change in pandoc-types).
This makes the behavior more uniform with other things in pandoc
(such as reference links and `--metadata-file`).