- Recognize locators spelled with a capital letter.
Closes#7323.
- Add a comma and a space in front of the suffix if it doesn't start
with space or punctuation. Closes#7324.
- Add manual entry for (non-default) extension
`rebase_relative_paths`.
- Add constructor `Ext_rebase_relative_paths` to `Extensions`
in Text.Pandoc.Extensions [API change]. When enabled, this
extension rewrites relative image and link paths by prepending
the (relative) directory of the containing file.
- Make Markdown reader sensitive to the new extension.
- Add tests for #3752.
Closes#3752.
NB. currently the extension applies to markdown and associated
readers but not commonmark/gfm.
- Improve parsing of `\def` macros. We previously set "verbatim mode"
even for parsing the initial `\def`; this caused problems for things
like
```
\def\foo{\def\bar{BAR}}
\foo
\bar
```
- Implement `\newif`.
- Add tests.
The error message to stderr was appearing in test output
and confusing some users, who thought it indicated a failing
test rather than expected output.
See <https://www.w3.org/TR/html4/types.html#h-6.6>.
"A relative length has the form "i*", where "i" is an integer. When
allotting space among elements competing for that space, user agents
allot pixel and percentage lengths first, then divide up remaining
available space among relative lengths. Each relative length receives a
portion of the available space that is proportional to the integer
preceding the "*". The value "*" is equivalent to "1*". Thus, if 60
pixels of space are available after the user agent allots pixel and
percentage space, and the competing relative lengths are 1*, 2*, and 3*,
the 1* will be alloted 10 pixels, the 2* will be alloted 20 pixels, and
the 3* will be alloted 30 pixels."
Closes#4063.
There's still one slight divergence from the siunitx behavior:
we get 'kg m/A/s' instead of 'kg m/(A s)'. At the moment I'm
not going to worry about that.
Successive quote characters are separated with a thin space to improve
readability and to prevent unwanted ligatures. Detection of these quotes
sometimes had failed if the second quote was nested in a span element.
Closes: #6958
The change provides a way to use citation keys that contain
special characters not usable with the standard citation
key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}`.
Closes#6026.
The change requires adding a new parameter to the `citeKey`
parser from Text.Pandoc.Parsing [API change].
Markdown reader: recognize @{..} syntax for citatinos.
Markdown writer: use @{..} syntax for citations when needed.
Update manual with curly-brace syntax for citations.
Closes#6026.
Treat a leading " with no closing " as a left curly quote.
This supports the practice, in fiction, of continuing
paragraphs quoting the same speaker without an end quote.
It also helps with quotes that break over lines in line
blocks.
Closes#7216.
When a block only has a single class and no other attributes,
it is not necessary to wrap the class attribute in curly braces –
the class name can be placed after the opening mark as is.
This will result in bit cleaner output when pandoc is used
as a markdown pretty-printer.
This fixes a bug, which caused the writer to look at the LAST
rather than the FIRST character in determining whether quotes
were needed. So we got spurious quotes in some cases and
didn't get necessary quotes in others.
Closes#7245. Updated a number of test cases accordingly.
The `<p>` element is used for wrapping in cases were the contents would
otherwise not be allowed in a certain context. Unnecessary wrapping is
avoided, especially around quotes (`<disp-quote>` elements).
Closes: #7227
This prevents emitting invalid HTML.
Ultimately it would be good to prevent this in the types
themselves, but this is better for now.
T.P.Logging: Add DuplicateAttribute constructor to LogMessage.
[API change]
- If src is empty, we simply skip the iframe.
- If src is invalid or cannot be fetched, we issue a warning
and skip instead of failing with an error.
- Closes#7099.
This exports functions that uses xml-conduit's parser to
produce an xml-light Element or [Content]. This allows
existing pandoc code to use a better parser without
much modification.
The new parser is used in all places where xml-light's
parser was previously used. Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).
Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation. It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
The new parser also reports errors, which we report
when possible.
A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].
Closes#7091, which was the main stimulus.
These changes revealed the need for some changes
in the tests. The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.
Add entity defs to docbook-reader.docbook.
Update golden tests for docx.
This change allows bibtex/biblatex output to wrap as other
formats do, depending on the settings of `--wrap` and `--columns`.
It also introduces default templates for bibtex and biblatex,
which allow for using the variables `header-include`, `include-before`
or `include-after` (or alternatively the command line options
`--include-in-header`, `--include-before-body`, `--include-after-body`)
to insert content into the generated bibtex/biblatex.
This change requires a change in the return type of the unexported
`T.P.Citeproc.writeBibTeXString` from `Text` to `Doc Text`.
Closes#7068.
Previously there was a messy code path that gave strange
results in some cases, not passing through raw tex but
trying to extract a string content. This was an artefact
of trying to handle some special bibtex-specific commands
in the BibTeX reader. Now we just handle these in the
LaTeX reader and simplify parsing in the BibTeX reader.
This does mean that more raw tex will be passed through
(and currently this is not sensitive to the `raw_tex`
extension; this should be fixed).
Closes#7049.
If the table lacks a header, the header row should be an empty
list. Previously we got a list of empty cells, which caused
an empty header to be emitted instead of no header. In LaTeX/PDF
output that meant we got a double top line with space between.
@tarleb @despres - please let me know if this is problematic
for some reason I'm not grasping.
Allow defaults files to inherit options from other defaults files by
specifying them with the following syntax:
`defaults: [list of defaults files or single defaults file]`.
In 2.11.3 we started adding `\addlinespace`, which produced less
dense tables. This wasn't an intentional change; I misunderstood
a comment in the discussion leading up to the change. This commit
restores the earlier default table appearance.
Note that if you want a less dense table, you can use something like
`\def\arraystretch{1.5}` in your header.
Closes#6996.
Added field to WriterState that denotes the current nesting level for traversing tables.
Depending on the value of that field nested tables are recognized and written.
Asciidoc supports one level of nesting. If deeper tables are to be written, they are
omitted and a warning is issued.
- An image alone in its paragraph (but not a figure) is now
rendered as an independent image, with an `alt` attribute
if a description is supplied.
- An inline image that is not alone in its paragraph will
be rendered, as before, using a substitution.
Such an image cannot have a "center", "left", or
"right" alignment, so the classes `align-center`,
`align-left`, or `align-right` are ignored.
However, `align-top`, `align-middle`, `align-bottom`
will generate a corresponding `align` attribute.
Closes#6948.
Previously we stripped attribute prefixes, reading
`xml:lang` as `lang` for example. This resulted in
two duplicate `lang` attributes when `xml:lang` and
`lang` were both used. This commit causes the prefixes
to be retained, and also avoids invald duplicate
attributes.
Closes#6938.
This commit adds two extensions to the OpenDocument writer,
`xrefs_name` and `xrefs_number`.
Links to headings, figures and tables inside the document are
substituted with cross-references that will use the name or caption
of the referenced item for `xrefs_name` or the number for `xrefs_number`.
For the `xrefs_number` to be useful heading numbers must be enabled
in the generated document and table and figure captions must be enabled using for example the `native_numbering` extension.
In order for numbers and reference text to be updated the generated
document must be refreshed.
Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
This affected author-in-text citations in footnotes.
It didn't cause problems for the printed output, but for
filters that expected the citation id and other information.
Closes#6890.
+ Remove the `\strut` that was added at the end of minipage
environments in cells.
+ Replace `\tabularnewline` with `\\ \addlinespace`.
Closes#6842, closes#6860.
Table width in relation to text width is not natively supported
by docbook but is by the docbook fo stylesheets through an XML
processing instruction, <?dbfo table-width="50%"?> .
Implement support for this instruction in the DocBook reader.
in cases where we run into trouble parsing inlines til the
closing `]`, e.g. quotes, we return a plain string with the
option contents. Previously we mistakenly included the brackets
in this string.
Closes#6869.
`commonmark_x` never actually supported `auto_identifiers` (it
didn't do anything), because the underlying library implements
gfm-style identifiers only.
Attempts to add the `autolink_identifiers` extension to
`commonmark` will now fail with an error.
Closes#6863.
We now better handle `.IP` when it is used with non-bullet,
non-numbered lists, creating a definition list.
We also skip blank lines like groff itself.
Closes#6858.
Previously we used Setext (underlined) headings by default.
The default is now ATX (`##` style).
* Add the `--markdown-headings=atx|setext` option.
* Deprecate `--atx-headers`.
* Add constructor 'ATXHeadingInLHS` constructor to `LogMessage` [API change].
* Support `markdown-headings` in defaults files.
* Document new options in MANUAL.
Closes#6662.
Background: syntactically, references to example list items
can't be distinguished from citations; we only know which they
are after we've parsed the whole document (and this is resolved
in the `runF` stage).
This means that pandoc's calculation of `citationNoteNum`
can sometimes be wrong when there are example list references.
This commit partially addresses #6836, but only for the case
where the example list references refer to list items defined
previously in the document.
If `\cL` is defined as `\mathcal{L}`, and `\til` as `\tilde{#1}`,
then `\til\cL` should expand to `\tilde{\mathcal{L}}`, but pandoc
was expanding it to `\tilde\mathcal{L}`. This is fixed by
parsing the arguments in "verbatim mode" when the macro expands
arguments at the point of use.
Closes#6796.
When an author-in-text citation like `@foo` occurs in a footnote,
we now render it with: `AUTHOR NAME + COMMA + SPACE + REST`.
Previously we rendered: `AUTHOR NAME + SPACE + "(" + REST + ")"`.
This gives better results. Note that normal citations are still
rendered in parentheses.
We now have LaTeX do the calculation, using `\tabcolsep`.
So we should now have accurate relative column widths no
matter what the text width.
The default template has been modified to load the calc
package if tables are used.
Starting with 2.10.1, fenced divs no longer render with
HTML div tags in commonmark output. This is a regression
due to our transition from cmark-gfm. This commit fixes it.
Closes#6768.
This fixes a problem with author-in-text citations for references
including both an author and an editor. Previously, both were
included in the text, but only the author should be.
Closes#6765. Added a test.
This fixes the citation number issue with ieee.csl and other
styles that do not explicitly sort bibliographies. (Pandoc
was numbering them by their order in the bibliography file,
rather than the order cited, as required by the CSL spec.)
Closes#6741.
The renderCslJson function escapes `<`, `>`, and `&` as entities.
This is appropriate when generating HTML, but in CSL JSON
these are supposed to appear unescaped.
Closesjgm/citeproc#17.
if they come before csl-right-inline. This ensures that
the citation number or label will be separated from the
rest by a space, even in formats (like plain) that don't yet have
special handling for the display spans.
This deprecates the use of the external pandoc-citeproc
filter; citation processing is now built in to pandoc.
* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported
modules under Text.Pandoc.Citeproc). Exports `processCitations`.
[API change]
* Add data files needed for Text.Pandoc.Citeproc: default.csl
in the data directory, and a citeproc directory that is just
used at compile-time. Note that we've added file-embed as a mandatory
rather than a conditional depedency, because of the biblatex
localization files. We might eventually want to use readDataFile
for this, but it would take some code reorganization.
* Text.Pandoc.Loging: Add `CiteprocWarning` to `LogMessage` and use it
in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including
some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and
release binary build instructions. We will no longer distribute
pandoc-citeproc.
* Markdown reader: tweak abbreviation support. Don't insert a
nonbreaking space after a potential abbreviation if it comes right before
a note or citation. This messes up several things, including citeproc's
moving of note citations.
* Add `csljson` as and input and output format. This allows pandoc
to convert between `csljson` and other bibliography formats,
and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API
change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API
change]
* Added `bibtex`, `biblatex` as input formats. This allows pandoc
to convert between BibLaTeX and BibTeX and other bibliography formats,
and to generated formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
`readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format.
This is needed because pandoc readers for bibliography formats put
the bibliographic information in the `references` field of metadata;
and unless standalone is specified, metadata gets ignored.
(TODO: This needs improvement. We should trigger standalone for the
reader when the input format is bibliographic, and for the writer
when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`. This was just
ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
[API change] This runs the processCitations transformation.
We need to treat it like a filter so it can be placed
in the sequence of filter runs (after some, before others).
In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`,
so this special filter may be specified either way in a defaults file
(or by `citeproc: true`, though this gives no control of positioning
relative to other filters). TODO: we need to add something to the
manual section on defaults files for this.
* Add deprecation warning if `upandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing.
This behaves like a filter and will be positioned
relative to filters as they appear on the command line.
* Rewrote the manual on citatations, adding a dedicated Citations
section which also includes some information formerly found in
the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data
directory. This changes the old pandoc-citeproc behavior, which looked
in `~/.csl`. Users can simply symlink `~/.csl` to the `csl`
subdirectory of their pandoc user data directory if they want
the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML,
Ms writers. Added CSL-related CSS to styles.html.
When a list occurs at the beginning of a definition list definition,
it can start on the same line as the label, which looks bad.
Fix that by starting such lists with an `\item[]`.
Instead rely on the markdown writer with appropriate extensions.
Export writeCommonMark variant from Markdown writer.
This changes a few small things in rendering markdown,
e.g. w/r/t requiring backslashes before spaces inside
super/subscripts.
Screen readers read an image's `alt` attribute and the figure caption,
both of which come from the same source in pandoc. The figure caption is
hidden from screen readers with the `aria-hidden` attribute. This
improves accessibility.
For HTML4, where `aria-hidden` is not allowed, pandoc still uses an
empty `alt` attribute to avoid duplicate contents.
Closes: #6491
Closes#6385. (The summary element needs to be the first
child of details and should not be enclosed by p tags.)
NOTE: you need to include a blank line before the closing
`</details>`, if you want the last part of the content to
be parsed as a paragraph.
Some CSS to ensure that display math is
displayed centered and on a new line is now included
in the default HTML-based templates; this may be
overridden if the user wants a different behavior.
This reverts a change in the last release; the Div is
no longer needed, because we can now put the id right in
the Table's attributes. However, writers may still need
to be modified to do something with the id in a Table
(e.g. create an anchor), so in the short term we may lose
the ability to link to tables in some writers.
The Builder.simpleTable now only adds a row to the TableHead when the
given header row is not null. This uncovered an inconsistency in the
readers: some would unconditionally emit a header filled with empty
cells, even if the header was not present. Now every reader has the
conditional behaviour. Only the XWiki writer depended on the header
row being always present; it now pads its head as necessary.
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not
a Para block.
- The toLegacyTable function now lives in Writers.Shared.
- SmallCaps instead of Span for the part after the initial capital.
- Ensure that both arguments are parsed, so that in Markdown both
are treated as raw LateX. (Closes #6258.)
This reverts commit 3493d6afaa.
This might be worth considering in the future, but let's not do
it yet...the additional complexity needs a better justification.
This is experimental. Normally metadata values are interpreted
as markdown, but if the !!literal tag is used they will be interpreted
as plain strings.
We need to consider whether this can still be implemented if
we switch back from HsYAML to yaml for performance reasons.
Closes#5849. Previously biblatex citations were only grouped if
there was no prefix. This patch allows them to be grouped in
subgroups split by prefixes and suffixes, which allows better citation
sorting.
If parsing fails in a raw environment (e.g. due to special
characters like unescaped `_`), try again as a verbatim
environment, which is less sensitive to special characters.
This allows us to capture special environments that change
catcodes as raw tex when `-f latex+raw_tex` is used.
Closes#6034.
The fix to #6030 actually changed behavior, so that the
2D nesting occurred at slide level N-1 and N, instead of
at the top-level section. This commit restores the 2.7.3 behavior.
If there are more than 2 levels, the top level is horizontal
and the rest are collapsed to vertical.
Closes#6032.