This change allows bibtex/biblatex output to wrap as other
formats do, depending on the settings of `--wrap` and `--columns`.
It also introduces default templates for bibtex and biblatex,
which allow for using the variables `header-include`, `include-before`
or `include-after` (or alternatively the command line options
`--include-in-header`, `--include-before-body`, `--include-after-body`)
to insert content into the generated bibtex/biblatex.
This change requires a change in the return type of the unexported
`T.P.Citeproc.writeBibTeXString` from `Text` to `Doc Text`.
Closes #7068.
Previously there was a messy code path that gave strange
results in some cases: instead of passing raw tex through,
it tried to extract the string content. This was an artefact
of trying to handle some special bibtex-specific commands
in the BibTeX reader. Now we just handle these in the
LaTeX reader and simplify parsing in the BibTeX reader.
This does mean that more raw tex will be passed through
(and currently this is not sensitive to the `raw_tex`
extension; this should be fixed).
Closes #7049.
The Lua modules `pandoc` and `pandoc.List` are now always loaded from the
system's default data directory. Loading from a different directory by
overriding the default path, e.g. via `--data-dir`, is no longer supported to
avoid unexpected behavior and to address security concerns.
* `biblatex` and `bibtex` are now supported as output
as well as input formats.
* New module `Text.Pandoc.Writers.BibTeX`, exporting
`writeBibTeX` and `writeBibLaTeX`. [API change]
* New unexported function `writeBibtexString` in
`Text.Pandoc.Citeproc.BibTeX`.
We were losing content from inside spans with a class,
due to logic that is meant to avoid nested inline
structures that can't be represented in RST.
The logic was a bit stricter than necessary. This
commit fixes the issue.
This reverts commit 6efd3460a7.
Since this extension is designed to be used with
GitHub markdown (gfm), we need to implement the parser
as a commonmark extension (commonmark-extensions),
rather than in pandoc's markdown reader. When that is
done, we can add it here.
Instead of hard-coding the border and header cell vertical alignment,
we now let this be determined by the Table style, making use of
Word's "conditional formatting" for the table's first row.
For headerless tables, we use the tblLook element to tell Word
not to apply conditional first-row formatting.
Closes #7008.
Due to a bug in code added to avoid overwriting the cover image
if it had the form `fileX.YYY`, pandoc made an endless sequence
of HTTP requests when writing epub with input from a URL.
Closes #7013.
Additional pipe chars, used to separate "action" state from "no further
action" states, are ignored. E.g., for the following sequence, both
`DONE` and `FINISHED` are states with no further action required.
    #+TODO: UNFINISHED | DONE | FINISHED
Previously, parsing of the todo sequence failed if multiple pipe chars
were included.
Closes: #7014
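The rule described above can be sketched in Python as follows. This is only an illustration of the stated behavior, not the Org reader's code: the function name `parse_todo_sequence` is hypothetical, and the sketch does not cover sequences that contain no pipe at all.

```python
def parse_todo_sequence(spec):
    """Split an Org TODO sequence into (action, done) state lists."""
    action, done = [], []
    seen_pipe = False
    for token in spec.split():
        if token == "|":
            # Only the first pipe switches sides; later pipes are ignored.
            seen_pipe = True
        elif seen_pipe:
            done.append(token)
        else:
            action.append(token)
    return action, done


print(parse_todo_sequence("UNFINISHED | DONE | FINISHED"))
```

Every state after the first pipe lands in the second list, so `DONE` and `FINISHED` are both treated as requiring no further action.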
If the table lacks a header, the header row should be an empty
list. Previously we got a list of empty cells, which caused
an empty header to be emitted instead of no header. In LaTeX/PDF
output that meant we got a double top line with space between.
@tarleb @despres - please let me know if this is problematic
for some reason I'm not grasping.
Allow defaults files to inherit options from other defaults files by
specifying them with the following syntax:
`defaults: [list of defaults files or single defaults file]`.
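One way to picture the inheritance is the Python sketch below. It is a minimal model under stated assumptions, not pandoc's implementation: defaults files are represented as in-memory dicts, `resolve_defaults` is a hypothetical name, and the merge order (the inheriting file wins on conflicts) is an assumption that may differ from pandoc's actual rules, for example for list-valued options.

```python
def resolve_defaults(name, files, _stack=()):
    """Return the options of `name` merged over those of its parents."""
    if name in _stack:
        raise ValueError("circular defaults: " + " -> ".join(_stack + (name,)))
    options = dict(files[name])
    parents = options.pop("defaults", [])
    if isinstance(parents, str):  # a single file is allowed as well as a list
        parents = [parents]
    merged = {}
    for parent in parents:
        merged.update(resolve_defaults(parent, files, _stack + (name,)))
    merged.update(options)  # assumption: the inheriting file wins on conflicts
    return merged


# Hypothetical file names and options, for illustration only.
files = {
    "base.yaml": {"standalone": True, "toc": False},
    "article.yaml": {"defaults": "base.yaml", "toc": True},
}
print(resolve_defaults("article.yaml", files))
```

The `_stack` argument guards against two defaults files that inherit from each other.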
* Replace org-mode’s verbatim from code to codeWith.
This adds the `"verbatim"` class so that exporters can apply a specific
style on it. For instance, it will be possible for HTML to add a CSS
rule for code + verbatim class.
* Alter test for org-mode’s verbatim change.
See previous commit for further detail on the new implementation.
when `raw_tex` is not enabled. (When `raw_tex` is enabled,
the whole environment is parsed as a raw block.)
The class name is the name of the environment.
Previously, we just included the contents without the
surrounding Div, but having a record of the environment's
boundaries and name can be useful.
Closes #6997.
In 2.11.3 we started adding `\addlinespace`, which produced less
dense tables. This wasn't an intentional change; I misunderstood
a comment in the discussion leading up to the change. This commit
restores the earlier default table appearance.
Note that if you want a less dense table, you can use something like
`\def\arraystretch{1.5}` in your header.
Closes #6996.
The Div wrapper of code blocks with captions now has the class
"captioned-content". The caption itself is added as a Plain block
inside a Div of class "caption". This makes it easier to write filters
which match on captioned code blocks. Existing filters will need to be
updated.
Closes: #6977
The `renderTags'` function was duplicated when the reader used `Text` as
its string type. The duplication is no longer necessary.
A side effect of this change is that empty `<col>` elements are written
as self-closing tags in raw HTML blocks.
Added a field to WriterState that records the current nesting level while traversing tables.
Depending on the value of that field, nested tables are recognized and written.
AsciiDoc supports one level of nesting. Tables nested more deeply are
omitted and a warning is issued.
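The depth-tracking idea can be sketched in Python as below. This is an illustration only: the table representation (rows of cells, where a cell is either a string or a nested table) and the bracketed output are invented for the example and are not AsciiDoc syntax or the writer's data structures.

```python
import warnings

MAX_NESTING = 1  # per the text: AsciiDoc supports one level of nesting


def write_table(table, level=0):
    """Render rows of cells; a cell is a str or a nested table (a list)."""
    lines = []
    for row in table:
        cells = []
        for cell in row:
            if isinstance(cell, list):
                if level >= MAX_NESTING:
                    # Too deep to represent: omit the table and warn.
                    warnings.warn("table nested too deeply; omitted")
                    cells.append("")
                else:
                    cells.append("[" + write_table(cell, level + 1) + "]")
            else:
                cells.append(cell)
        lines.append(" | ".join(cells))
    return "; ".join(lines)
```

Passing the level as an argument plays the role that the state field plays in the writer: each recursive call knows how deep it is and can decide whether to write or omit.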
The raw text is now included verbatim in the output. Previously it was parsed
into XML elements, which prevented the inclusion of partial XML snippets.
The `linkifyVariables` function was changing these to links
which then got treated as non-empty by citeproc, leading
to wrong results (e.g. ignoring nonempty URL when empty DOI is present).
Addresses part 2 of jgm/citeproc#41.
Note that the multirow package is needed for rowspans.
It is included in the latex template under a variable,
so that it won't be used unless needed for a table.
- Use dev version of citeproc, which handles duplicate
ids better, preferring the last one in the list
and discarding the rest.
- Ensure that inline citations take priority over external
ones.
See jgm/citeproc#36.
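The "prefer the last duplicate" rule can be illustrated with a short Python sketch. It is not citeproc's code: `dedupe_keep_last` is a hypothetical helper, the sample entries are invented, and the idea that inline references gain priority by being listed last is an assumption made for the example.

```python
def dedupe_keep_last(references):
    """Drop earlier entries whose id is repeated later in the list."""
    by_id = {}
    for ref in references:
        # Remove any earlier entry so the survivor takes the later position.
        by_id.pop(ref["id"], None)
        by_id[ref["id"]] = ref
    return list(by_id.values())


external = [{"id": "doe2020", "title": "From the .bib file"}]
inline = [{"id": "doe2020", "title": "From the document metadata"}]
# Assumption: listing inline references last is what gives them priority.
print(dedupe_keep_last(external + inline))
```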
This restores the behavior of pandoc-citeproc.
This means that:
- a URL may be provided, and pandoc will fetch the resource.
- Pandoc will search the resource path for the bibliography
if it is not found relative to the working directory.
Closes #6940.
After the move to JuicyPixels, we were getting incorrect
width and height information for some images (see #6936, test-3.jpg).
The correct information was encoded in Exif tags that
JuicyPixels seemed to ignore. So we check these first
before looking at the Width and Height identified by
JuicyPixels.
Closes #6936.
- An image alone in its paragraph (but not a figure) is now
rendered as an independent image, with an `alt` attribute
if a description is supplied.
- An inline image that is not alone in its paragraph will
be rendered, as before, using a substitution.
Such an image cannot have a "center", "left", or
"right" alignment, so the classes `align-center`,
`align-left`, or `align-right` are ignored.
However, `align-top`, `align-middle`, `align-bottom`
will generate a corresponding `align` attribute.
Closes #6948.
Previously we stripped attribute prefixes, reading
`xml:lang` as `lang` for example. This resulted in
two duplicate `lang` attributes when `xml:lang` and
`lang` were both used. This commit causes the prefixes
to be retained, and also avoids invalid duplicate
attributes.
Closes #6938.
* Add `Ext_sourcepos` constructor for `Extension`.
* Add `sourcepos` extension (only for commonmark).
* Bump to 2.11.3
With the `sourcepos` extension set, `data-pos` attributes are added
to the AST by the commonmark reader. No other readers are affected. The
`data-pos` attributes are put on elements that accept attributes; for
other elements, an enclosing Div or Span is added to hold the attributes.
Closes #4565.
DokuWiki lets users define their own Interwiki links.
Previously pandoc reacted to these by emitting a
google search link, which is not helpful. Instead,
we now just emit the full URL including the
wikilink prefix, e.g. `faquk>FAQ-mathml`.
This at least gives users the ability to
modify the links using filters.
Closes #6932.
Docbook reader produces a `Div` with `title` class for `<title>` element
within an “admonition” element. Markdown writer then turns this
into a fenced div with `title` class attribute. Since fenced divs
are block elements, their content is recognized as a paragraph
by the Markdown reader. This is an issue for the DocBook writer because
it would produce an invalid DocBook document from such an AST:
the `<title>` element can only contain “inline” elements.
Let’s handle this invalid special case separately by unwrapping
the paragraph before creating the `<title>` element.
Links with (internal) targets that the reader doesn't know about are
converted into emphasized text. Information on the link target is now
preserved by wrapping the text in a Span of class `spurious-link`, with
an attribute `target` set to the link's original target. This makes it
possible to recover and fix broken or unknown links with filters.
See: #6916
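A filter along these lines could be sketched in Python, working directly on pandoc's JSON representation of the AST. This is a minimal sketch under assumptions: `fix_spurious_links` and the `known_targets` mapping (original target to a URL you supply) are hypothetical names, and only the Span and Link JSON shapes are taken from pandoc's format.

```python
def fix_spurious_links(node, known_targets):
    """Replace `spurious-link` Spans with Links when the target is known."""
    if isinstance(node, list):
        return [fix_spurious_links(item, known_targets) for item in node]
    if not isinstance(node, dict):
        return node
    if node.get("t") == "Span":
        (ident, classes, attrs), inlines = node["c"]
        target = dict(attrs).get("target")
        if "spurious-link" in classes and target in known_targets:
            kept = [c for c in classes if c != "spurious-link"]
            # Link payload: attributes, link text, then [url, title].
            return {"t": "Link",
                    "c": [[ident, kept, []], inlines, [known_targets[target], ""]]}
    # Anything else: rebuild the node, descending into its contents.
    return {key: fix_spurious_links(value, known_targets)
            for key, value in node.items()}
```

To use it as a JSON filter, one would read the document with `json.load(sys.stdin)`, apply the function, and write the result with `json.dump`. Spans whose target is not in the mapping are left untouched.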
This commit adds two extensions to the OpenDocument writer,
`xrefs_name` and `xrefs_number`.
Links to headings, figures and tables inside the document are
substituted with cross-references that will use the name or caption
of the referenced item for `xrefs_name` or the number for `xrefs_number`.
For `xrefs_number` to be useful, heading numbers must be enabled
in the generated document, and table and figure captions must be
enabled using, for example, the `native_numbering` extension.
In order for numbers and reference text to be updated the generated
document must be refreshed.
Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
Previously, we only added xmlns attributes to chapter elements,
even when running with --top-level-division=section.
Let’s add the namespaces to part and section elements too,
when they are the selected top-level divisions.
We do not need to add namespaces to documents produced with
--standalone flag, since those will already have xmlns attribute
on the root element in the template.
Previously bold and italics didn't work properly in LTR
text. This commit causes the w:bCs and w:iCs attributes
to be used, in addition to w:b and w:i, for bold and
italics respectively.
Closes #6911.
Fix appearance of bullets/numbered lists (the first level is slightly
indented to the right instead of right on the margin).
New golden files have been tested using Word 2010 on Windows 10.
- `<tfoot>` elements are no longer added to the table body but used as
table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>`
elements are preserved.
This affected author-in-text citations in footnotes.
It didn't cause problems for the printed output, but it did for
filters that expected the citation id and other information.
Closes #6890.
This shouldn't happen, in general, but it can happen with
JPEGs that don't conform to the spec. Having a DPI of 0
will blow up size calculations (division by 0).
Closes #6880.
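The failure mode and one possible guard can be shown with a small Python sketch. It is an illustration, not pandoc's code: `size_in_inches` is a hypothetical helper, and falling back to 72 DPI is an assumption made for the example.

```python
DEFAULT_DPI = 72  # assumption: fallback resolution used by this sketch


def size_in_inches(px_width, px_height, dpi_x, dpi_y):
    """Convert pixel dimensions to inches, tolerating a bogus DPI of 0."""
    dpi_x = dpi_x if dpi_x > 0 else DEFAULT_DPI
    dpi_y = dpi_y if dpi_y > 0 else DEFAULT_DPI
    return px_width / dpi_x, px_height / dpi_y


# Without the guard, a DPI of 0 would raise ZeroDivisionError here.
print(size_in_inches(144, 72, 0, 0))
```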
+ Remove the `\strut` that was added at the end of minipage
environments in cells.
+ Replace `\tabularnewline` with `\\ \addlinespace`.
Closes #6842, closes #6860.
which was made (for reasons forgotten) when transferring
this code from pandoc-citeproc. The change led to `--` in
URLs being interpreted as en-dashes, which is unwanted.
Closes #6874.
Table width in relation to text width is not natively supported
by docbook but is by the docbook fo stylesheets through an XML
processing instruction, `<?dbfo table-width="50%"?>`.
Implement support for this instruction in the DocBook reader.
in cases where we run into trouble parsing inlines until the
closing `]`, e.g. quotes, we return a plain string with the
option contents. Previously we mistakenly included the brackets
in this string.
Closes #6869.
`commonmark_x` never actually supported `auto_identifiers` (it
didn't do anything), because the underlying library implements
gfm-style identifiers only.
Attempts to add the `auto_identifiers` extension to
`commonmark` will now fail with an error.
Closes #6863.
Previously we only self-contained attributes for
certain tag names (`img`, `embed`, `video`, `input`, `audio`,
`source`, `track`, `section`). Now we self-contain any
occurrence of `src`, `data-src`, `poster`, or `data-background-image`,
on any tag; and also `href` on `link` tags.
Closes #6854 (which specifically asked about
`asciinema-player` tags).
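The new rule can be stated as a small predicate. The Python sketch below only restates the rule from the text; `should_embed` and `EMBED_ANYWHERE` are hypothetical names, not part of pandoc.

```python
EMBED_ANYWHERE = {"src", "data-src", "poster", "data-background-image"}


def should_embed(tag, attribute):
    """Decide whether an attribute's resource is made self-contained."""
    if attribute in EMBED_ANYWHERE:
        return True  # these attributes qualify on any tag
    return tag == "link" and attribute == "href"


print(should_embed("asciinema-player", "src"))
print(should_embed("a", "href"))
```

Because the check no longer depends on a fixed list of tag names, custom elements such as `asciinema-player` are covered without special handling.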
We now better handle `.IP` when it is used with non-bullet,
non-numbered lists, creating a definition list.
We also skip blank lines like groff itself.
Closes #6858.
Previously a path beginning with a drive, like
`C:\foo\bar`, was translated to `C:\/foo/bar`, which
caused problems.
With this fix, the backslashes are removed.
Closes #6173.
Previously we used Setext (underlined) headings by default.
The default is now ATX (`##` style).
* Add the `--markdown-headings=atx|setext` option.
* Deprecate `--atx-headers`.
* Add `ATXHeadingInLHS` constructor to `LogMessage` [API change].
* Support `markdown-headings` in defaults files.
* Document new options in MANUAL.
Closes #6662.
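For readers unfamiliar with the two styles, the difference can be sketched in Python. This is an illustration of the Markdown syntax only, not the writer's code; `render_heading` is a hypothetical helper, and falling back to ATX for levels 3 and up reflects the fact that Setext syntax exists only for levels 1 and 2.

```python
def render_heading(text, level, style="atx"):
    """Render a Markdown heading in ATX or Setext style."""
    if style == "setext" and level <= 2:
        underline = "=" if level == 1 else "-"
        return text + "\n" + underline * len(text)
    # Setext only exists for levels 1 and 2; everything else is ATX.
    return "#" * level + " " + text


print(render_heading("Title", 2))
print(render_heading("Title", 1, style="setext"))
```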
Background: syntactically, references to example list items
can't be distinguished from citations; we only know which they
are after we've parsed the whole document (and this is resolved
in the `runF` stage).
This means that pandoc's calculation of `citationNoteNum`
can sometimes be wrong when there are example list references.
This commit partially addresses #6836, but only for the case
where the example list references refer to list items defined
previously in the document.