if they come before csl-right-inline. This ensures that
the citation number or label will be separated from the
rest by a space, even in formats (like plain) that don't yet have
special handling for the display spans.
This deprecates the use of the external pandoc-citeproc
filter; citation processing is now built in to pandoc.
* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported
modules under Text.Pandoc.Citeproc). Exports `processCitations`.
[API change]
* Add data files needed for Text.Pandoc.Citeproc: default.csl
in the data directory, and a citeproc directory that is just
used at compile-time. Note that we've added file-embed as a mandatory
rather than a conditional depedency, because of the biblatex
localization files. We might eventually want to use readDataFile
for this, but it would take some code reorganization.
* Text.Pandoc.Loging: Add `CiteprocWarning` to `LogMessage` and use it
in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including
some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and
release binary build instructions. We will no longer distribute
pandoc-citeproc.
* Markdown reader: tweak abbreviation support. Don't insert a
nonbreaking space after a potential abbreviation if it comes right before
a note or citation. This messes up several things, including citeproc's
moving of note citations.
* Add `csljson` as and input and output format. This allows pandoc
to convert between `csljson` and other bibliography formats,
and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API
change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API
change]
* Added `bibtex`, `biblatex` as input formats. This allows pandoc
to convert between BibLaTeX and BibTeX and other bibliography formats,
and to generated formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
`readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format.
This is needed because pandoc readers for bibliography formats put
the bibliographic information in the `references` field of metadata;
and unless standalone is specified, metadata gets ignored.
(TODO: This needs improvement. We should trigger standalone for the
reader when the input format is bibliographic, and for the writer
when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
[API change] This runs the processCitations transformation.
We need to treat it like a filter so it can be placed
in the sequence of filter runs (after some, before others).
In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`,
so this special filter may be specified either way in a defaults file
(or by `citeproc: true`, though this gives no control of positioning
relative to other filters). TODO: we need to add something to the
manual section on defaults files for this.
* Add deprecation warning if `upandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing.
This behaves like a filter and will be positioned
relative to filters as they appear on the command line.
* Rewrote the manual on citatations, adding a dedicated Citations
section which also includes some information formerly found in
the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data
directory. This changes the old pandoc-citeproc behavior, which looked
in `~/.csl`. Users can simply symlink `~/.csl` to the `csl`
subdirectory of their pandoc user data directory if they want
the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML,
Ms writers. Added CSL-related CSS to styles.html.
When a list occurs at the beginning of a definition list definition,
it can start on the same line as the label, which looks bad.
Fix that by starting such lists with an `\item[]`.
Instead rely on the markdown writer with appropriate extensions.
Export writeCommonMark variant from Markdown writer.
This changes a few small things in rendering markdown,
e.g. w/r/t requiring backslashes before spaces inside
super/subscripts.
Screen readers read an image's `alt` attribute and the figure caption,
both of which come from the same source in pandoc. The figure caption is
hidden from screen readers with the `aria-hidden` attribute. This
improves accessibility.
For HTML4, where `aria-hidden` is not allowed, pandoc still uses an
empty `alt` attribute to avoid duplicate contents.
Closes: #6491
Closes#6385. (The summary element needs to be the first
child of details and should not be enclosed by p tags.)
NOTE: you need to include a blank line before the closing
`</details>`, if you want the last part of the content to
be parsed as a paragraph.
Some CSS to ensure that display math is
displayed centered and on a new line is now included
in the default HTML-based templates; this may be
overridden if the user wants a different behavior.
This reverts a change in the last release; the Div is
no longer needed, because we can now put the id right in
the Table's attributes. However, writers may still need
to be modified to do something with the id in a Table
(e.g. create an anchor), so in the short term we may lose
the ability to link to tables in some writers.
The Builder.simpleTable now only adds a row to the TableHead when the
given header row is not null. This uncovered an inconsistency in the
readers: some would unconditionally emit a header filled with empty
cells, even if the header was not present. Now every reader has the
conditional behaviour. Only the XWiki writer depended on the header
row being always present; it now pads its head as necessary.
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not
a Para block.
- The toLegacyTable function now lives in Writers.Shared.
- SmallCaps instead of Span for the part after the initial capital.
- Ensure that both arguments are parsed, so that in Markdown both
are treated as raw LateX. (Closes #6258.)
This reverts commit 3493d6afaa.
This might be worth considering in the future, but let's not do
it yet...the additional complexity needs a better justification.
This is experimental. Normally metadata values are interpreted
as markdown, but if the !!literal tag is used they will be interpreted
as plain strings.
We need to consider whether this can still be implemented if
we switch back from HsYAML to yaml for performance reasons.
Closes#5849. Previously biblatex citations were only grouped if
there was no prefix. This patch allows them to be grouped in
subgroups split by prefixes and suffixes, which allows better citation
sorting.
If parsing fails in a raw environment (e.g. due to special
characters like unescaped `_`), try again as a verbatim
environment, which is less sensitive to special characters.
This allows us to capture special environments that change
catcodes as raw tex when `-f latex+raw_tex` is used.
Closes#6034.
The fix to #6030 actually changed behavior, so that the
2D nesting occurred at slide level N-1 and N, instead of
at the top-level section. This commit restores the 2.7.3 behavior.
If there are more than 2 levels, the top level is horizontal
and the rest are collapsed to vertical.
Closes#6032.
With positive heading shifts, starting in 2.8 this option caused
metadata titles to be removed and changed to regular headings.
This behavior is incompatible with the old behavior of
`--base-header-level` and breaks old workflows, so with this
commit we are rolling back this change.
Now, there is an asymmetry in positive and negative heading
level shifts:
+ With positive shifts, the metadata title stays the same and
does not get changed to a heading in the body.
+ With negative shifts, a heading can be converted into the
metadata title.
I think this is a desirable combination of features, despite
the asymmetry. One might, e.g., want to have a document
with level-1 section headigs, but render it to HTML with
level-2 headings, retaining the metadata title (which pandoc
will render as a level-1 heading with the default template).
Closes#5957.
Revises #5615.
If a simple table would be too wide, we use a grid table.
The code for generating grid tables has been adjusted to
give more intelligent column widths when widths aren't
given. (This also affects the markdown writer.)
Closes#5899.
`Nothing` means: nothing specified.
`Just []` means: an empty list specified (e.g. in defaults).
Potentially these could lead to different behavior: see #5888.
Superscripts and subscripts cannot contain spaces,
but newlines were previously allowed (unintentionally).
This led to bad interactions in some cases with footnotes.
E.g.
```
foo^[note]
bar^[note]
```
With this change newlines are also not allowed inside
super/subscripts.
Closes#5878.
* Add HTML Reader support for `<dfn>`, parsing this as a Span with class `dfn`.
* Change `htmlSpanLikeElements` implementation to retain classes,
attributes and inline content.
Previously, if a document contained two YAML metadata blocks
that set the same field, the conflict would be resolved in favor
of the first. Now it is resolved in favor of the second (due to
a change in pandoc-types).
This makes the behavior more uniform with other things in pandoc
(such as reference links and `--metadata-file`).
When a div surrounds multiple sections at the same level,
or a section of highre level followed by one of lower level,
then we just leave it as a div and create a new div for the
section.
Closes#5846, closes#5761.
Parse <mark> elements from HTML as HTML span like elements, with a
single class matching the tag name `mark`. Mark elements are rendered to
HTML using the native <mark> element.
Fixes https://github.com/jgm/pandoc/issues/5797.
* Text.Pandoc.Shared: export `htmlSpanLikeElements` [API change]
This commit also introduces a mapping of HTML span like elements that
are internally represented as a Span with a single class, but that are
converted back to the original element by the html writer. As of now,
only the kbd element is handled this way. Ideally these elements should
be handled as plain AST values, but since that would be a breaking
change with a large impact, we revert to this stop-gap solution.
Fixes https://github.com/jgm/pandoc/issues/5796.
...even when we must lose width information.
All in all this seems to be people's preferred behavior, even though it
is slightly lossier.
Closes#2608.
Closes#4497.
on conflicting fields. This changes earlier behavior (but not in
a release), where first took precedence.
Note that this may seem inconsistent with the behavior of
multiple YAML blocks within a document, where the first takes
precedence. Still, it is convenient to be able to override
defaults with options later on the command line.
If this is present on a heading with the 'unnumbered' class,
the heading won't appear in the TOC. This class has no
effect if 'unnumbered' is not also specified.
This affects HTML-based writers (including slide shows
and epub), LateX (including beamer), RTF, and PowerPoint.
Other writers do not yet support `unlisted`.
Closes#1762.
We were requiring consistent indentation, but this
isn't required by RST, as long as each nonblank
line of the block has *some* indentation.
Closes#5753.
Previously we used the following Project Gutenberg conventions
for plain output:
- extra space before and after level 1 and 2 headings
- all-caps for strong emphasis `LIKE THIS`
- underscores surrounding regular emphasis `_like this_`
This commit makes `plain` output plainer. Strong and Emph
inlines are rendered without special formatting. Headings
are also rendered without special formatting, and with only
one blank line following.
To restore the former behavior, use `-t plain+gutenberg`.
API change: Add `Ext_gutenberg` constructor to `Extension`.
See #5741.
Deprecate --base-heading-level.
The new option does everything the old one does, but also
allows negative shifts. It also promotes the document
metadata (if not null) to a level-1 heading with a +1 shift,
and demotes an initial level-1 heading to document metadata
with a -1 shift. This supports converting documents that
use an initial level-1 heading for the document title.
Closes#5615.
* Org reader: allow the `-i` switch to ignore leading spaces.
* Org reader: handle awkwardly-aligned code blocks within lists.
Code blocks in Org lists must have their #+BEGIN_ aligned in a
reasonable way, but their other components can be positioned otherwise.
Text.Pandoc.Shared:
+ Remove `Element` type [API change]
+ Remove `makeHierarchicalize` [API change]
+ Add `makeSections` [API change]
+ Export `deLink` [API change]
Now that we have Divs, we can use them to represent the structure
of sections, and we don't need a special Element type.
`makeSections` reorganizes a block list, adding Divs with
class `section` around sections, and adding numbering
if needed.
This change also fixes some longstanding issues recognizing
section structure when the document contains Divs.
Closes#3057, see also #997.
All writers have been changed to use `makeSections`.
Note that in the process we have reverted the change
c1d058aeb1
made in response to #5168, which I'm not completely
sure was a good idea.
Lua modules have also been adjusted accordingly.
Existing lua filters that use `hierarchicalize` will
need to be rewritten to use `make_sections`.
Revert "hierarchicalize: ensure that sections get ids..."
This reverts commit 212406a61d.
Revert "Improve detection of headings in Divs by hierarchicalize."
This reverts commit 6e2cfd6c97.
Revert "Shared.hierarchicalize: improve handling of div and section structure."
This reverts commit 345b33762e.
The structure
```
<h1>one</h1>
<div>
<h1>two</h1>
</div>
```
should create two coordinate sections, not a section with
a subsection. Now it does.
Extends #3057.
Previously Divs were opaque to hierarchicalize, so headings
inside divs didn't get into the table of contents, for
example (#3057).
Now hierarchicalize treats Divs as sections when appropriate.
For example, these structures both yield a section and a
subsection:
``` html
<div>
<h1>one</h1>
<div>
<h2>two</h2>
</div>
</div>
```
``` html
<div>
<h1>one</h1>
<div>
<h1>two</h1>
</div>
</div>
```
Note that
``` html
<h1>one</h1>
<div>
<h2>two</h2>
</div>
<h1>three</h1>
```
gets parsed as the structure
one
two
three
which may not always be desirable.
Closes#3057.
Previously we used named character references with html5 output.
But these aren't valid XML, and we aim to produce html5 that is
also valid XHTML (polyglot markup). (This is also needed for
epub3.)
Closes#5718.
+ Remove Text.Pandoc.Pretty; use doclayout instead. [API change]
+ Text.Pandoc.Writers.Shared: remove metaToJSON, metaToJSON'
[API change].
+ Text.Pandoc.Writers.Shared: modify `addVariablesToContext`,
`defField`, `setField`, `getField`, `resetField` to work with
Context rather than JSON values. [API change]
+ Text.Pandoc.Writers.Shared: export new function `endsWithPlain` [API
change].
+ Use new templates and doclayout in writers.
+ Use Doc-based templates in all writers.
+ Adjust three tests for minor template rendering differences.
+ Added indentation to body in docbook4, docbook5 templates.
The main impact of this change is better reflowing of content
interpolated into templates. Previously, interpolated variables
were rendered independently and intepolated as strings, which could lead
to overly long lines. Now the templates interpolated as Doc values
which may include breaking spaces, and reflowing occurs
after template interpolation rather than before.
Changed optMetadataFile from `Maybe FilePath` to `[FilePath]`. This allows
for multiple YAML metadata files to be added. The new default value has
been changed from `Nothing` to `[]`.
To account for this change in `Text.Pandoc.App`, `metaDataFromFile` now
operates on two `mapM` calls (for `readFileLazy` and `yamlToMeta`) and a fold.
Added a test (command/5700.md) which tests this functionality and
updated MANUAL.txt, as per the contributing guidelines.
With the current behavior, using `foldr1 (<>)`, values within files
specified first will be used over those in later files. (If the reverse
of this behavior would be preferred, it should be fixed by changing
foldr1 to foldl1.)
the token string is modified by a parser (e.g. accent when
it only takes part of a Word token).
Closes#5686. Still not ideal, because we get the whole
`\t0BAR` and not just `\t0` as a raw latex inline command.
But I'm willing to let this be an edge case, since you
can easily work around this by inserting a space, braces,
or raw attribute. The important thing is that we no longer
drop the rest of the document after a raw latex inline
command that gobbles only part of a Word token!
The web service passed in to `--webtex` may render formulas using inline
or display style by default. Prefixing formulas with the appropriate
command ensures they are rendered correctly.
This is a followup to the discussion in #5656.
Previously we just omitted these. Now we render them
using `\hfill\break` instead of `\\`. This is a revision
of a PR by @sabine (#5591) who should be credited with the
idea.
Closes#3324.
The `raw_attribute` will be used to mark raw bits, even HTML
and LaTeX, and even when `raw_html` and `raw_tex` are enabled,
as they are by default.
To get the old behavior, disable `raw_attribute` in the writer.
Closes#4311.
Now we boldface code but not other things. This matches the
most common style in man pages (particularly option lists).
Also, remove a regression in the last commit in which 'nowrap'
was removed.