Commit graph

7248 commits

Author SHA1 Message Date
Igor Pashev
630b1bff2b
LaTeX reader: preserve center environment (#6852)
The contents of the `center` environment are put in a `Div`
with class `center`.
2020-11-26 12:04:31 -08:00
Albert Krewinkel
07919e1b22
HTML reader: improve support for table headers, footer, attributes
- `<tfoot>` elements are no longer added to the table body but used as
  table footer.
- Separate `<tbody>` elements are no longer combined into one.
- Attributes on `<thead>`, `<tbody>`, `<th>`/`<td>`, and `<tfoot>`
  elements are preserved.
2020-11-26 07:22:01 +01:00
Albert Krewinkel
3e01ae405f
HTML reader: allow finer grained options for tag omission 2020-11-26 07:22:01 +01:00
John MacFarlane
7c4d7db9c7 LaTeX writer: improve longtable output.
- Don't create minipages for regular paragraphs.
- Put width and alignment information in the longtable column
  descriptors.
- Closes #6883.
2020-11-25 15:42:44 -08:00
John MacFarlane
b50ac3a95b LaTeX tables: Fix calculation of column spacing.
See #6883.
2020-11-25 14:41:28 -08:00
John MacFarlane
815976d537 Fix truncation of [Citation] list in Cite inside footnotes...
This affected author-in-text citations in footnotes.
It didn't cause problems for the printed output, but for
filters that expected the citation id and other information.

Closes #6890.
2020-11-25 09:10:10 -08:00
Albert Krewinkel
c6f2663a23
HTML reader: simplify list attribute handling
This removes the `foldOrElse` function from the internal Text.Pandoc.CSS
module.
2020-11-25 17:55:42 +01:00
Albert Krewinkel
c9f98e2bf5
HTML reader: support row or column-spanning table cells 2020-11-24 14:17:35 +01:00
Albert Krewinkel
446ef27a3f
HTML reader: support blocks in caption 2020-11-24 14:17:35 +01:00
Albert Krewinkel
41237fcc0e
HTML reader: extract table parsing into separate module 2020-11-24 14:17:35 +01:00
John MacFarlane
2f110265ff ImageSize: default to DPI 72 if the format specifies DPI of 0.
This shouldn't happen, in general, but it can happen with
JPEGs that don't conform to the spec.  Having a DPI of 0
will blow up size calculations (division by 0).

Closes #6880.
2020-11-23 09:39:48 -08:00
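A minimal Python sketch of the DPI fallback described in 2f110265ff. It is not pandoc's Haskell code; the names `effective_dpi` and `size_in_inches` are hypothetical and chosen only for illustration.

```python
DEFAULT_DPI = 72  # fallback value named in the commit message


def effective_dpi(reported_dpi: int) -> int:
    """Return a usable DPI, falling back to 72 when the file reports 0."""
    return DEFAULT_DPI if reported_dpi == 0 else reported_dpi


def size_in_inches(pixels: int, reported_dpi: int) -> float:
    """Convert a pixel dimension to inches without dividing by zero."""
    return pixels / effective_dpi(reported_dpi)
```

With the fallback in place, a 144-pixel-wide JPEG that reports a DPI of 0 is treated as 2 inches wide instead of raising a division error.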
Albert Krewinkel
f9258371dd HTML reader: extract submodules
Reducing module size should reduce memory use during compilation.

This is preparatory work to tackle support for more table features.
2020-11-23 10:12:20 +01:00
Nils Carlson
75c881e2d9
OpenDocument Writer: Implement Div and Span ident support (#6755)
Spans and Divs containing an ident in the Attr will become bookmarks
or sections with idents in OpenDocument format.
2020-11-22 22:23:30 -08:00
John MacFarlane
b5b5ef92cb LaTeX writer: Improve table spacing.
+ Remove the `\strut` that was added at the end of minipage
  environments in cells.

+ Replace `\tabularnewline` with `\\ \addlinespace`.

Closes #6842, closes #6860.
2020-11-22 10:54:42 -08:00
Albert Krewinkel
5344dab8eb
Org reader: parse #+LANGUAGE into lang metadata field
Fixes: #6845
2020-11-22 12:53:05 +01:00
Nils Carlson
ae52918faa
OpenDocument writer: Table text width support (#6792)
Support for table width as a percentage of text width by summing
width of columns and verifying that the sum is > 0 and <= 1.
2020-11-21 12:42:43 -08:00
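The width check described in ae52918faa can be sketched in Python as follows. This is an illustration under assumptions, not the writer's actual code: `table_width_fraction` is a hypothetical name, and returning `None` for an unusable sum is an assumed convention.

```python
from typing import Optional, Sequence


def table_width_fraction(column_widths: Sequence[float]) -> Optional[float]:
    """Sum relative column widths; return the sum only if it lies in (0, 1]."""
    total = sum(column_widths)
    if 0 < total <= 1:
        return total
    return None  # assumed convention: no usable percentage of text width
```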
John MacFarlane
7db2cf5d2f LaTeX reader: more robust parsing of bracketed options.
Improves on 9a40976.  Closes #6873.
2020-11-21 12:24:37 -08:00
John MacFarlane
fec8223d3a Citeproc BibTeX parser: revert change in getRawField...
which was made (for reasons forgotten) when transferring
this code from pandoc-citeproc.  The change led to `--` in
URLs being interpreted as en-dashes, which is unwanted.

Closes #6874.
2020-11-21 12:07:28 -08:00
Nils Carlson
56ceaf49dc
DocBook reader: Table text width support (#6791)
Table width in relation to text width is not natively supported
by docbook but is by the docbook fo stylesheets through an XML
processing instruction, <?dbfo table-width="50%"?> .
Implement support for this instruction in the DocBook reader.
2020-11-20 16:05:56 -08:00
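A rough Python sketch of reading the processing instruction mentioned in 56ceaf49dc. The real reader works on parsed XML rather than on a regular expression; `dbfo_table_width` is a hypothetical helper, and only percentage values are handled here.

```python
import re
from typing import Optional

# Matches the processing instruction shown in the commit message, e.g.
# <?dbfo table-width="50%"?>
_DBFO_WIDTH = re.compile(r'<\?dbfo\s+table-width="(\d+(?:\.\d+)?)%"\s*\?>')


def dbfo_table_width(xml_fragment: str) -> Optional[float]:
    """Return the table width as a fraction of text width, or None."""
    match = _DBFO_WIDTH.search(xml_fragment)
    if match is None:
        return None
    return float(match.group(1)) / 100.0
```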
John MacFarlane
9a4097640f Improve LaTeX option parsing...
in cases where we run into trouble parsing inlines til the
closing `]`, e.g. quotes, we return a plain string with the
option contents. Previously we mistakenly included the brackets
in this string.

Closes #6869.
2020-11-20 13:40:26 -08:00
John MacFarlane
c647948ff1 commonmark_x: replace auto_identifiers with gfm_auto_identifiers.
`commonmark_x` never actually supported `auto_identifiers` (it
didn't do anything), because the underlying library implements
gfm-style identifiers only.

Attempts to add the `auto_identifiers` extension to
`commonmark` will now fail with an error.

Closes #6863.
2020-11-20 09:17:14 -08:00
Albert Krewinkel
d286242131 JATS writer: support advanced table features 2020-11-19 22:09:52 +01:00
John MacFarlane
c1fbe7b91a --self-contained: increase coverage.
Previously we only self-contained attributes for
certain tag names (`img`, `embed`, `video`, `input`, `audio`,
`source`, `track`, `section`).  Now we self-contain any
occurrence of `src`, `data-src`, `poster`, or `data-background-image`,
on any tag; and also `href` on `link` tags.

Closes #6854 (which specifically asked about
`asciinema-player` tags).
2020-11-19 10:08:43 -08:00
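The broader rule in c1fbe7b91a amounts to a predicate on tag and attribute names. A Python sketch, assuming a hypothetical helper `should_embed` (pandoc's own implementation is in Haskell and is not reproduced here):

```python
# Attribute names taken from the commit message.
EMBEDDABLE_ATTRS = {"src", "data-src", "poster", "data-background-image"}


def should_embed(tag: str, attr: str) -> bool:
    """Decide whether an attribute's URL is a candidate for inlining."""
    if attr in EMBEDDABLE_ATTRS:
        return True  # applies to any tag, including custom elements
    return tag == "link" and attr == "href"
```

Under this rule a `src` attribute on an `asciinema-player` tag qualifies, while `href` on an ordinary `a` tag does not.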
John MacFarlane
e16df8d271 DocBook reader: drop period in formalpara title...
...and put it in a div with class `formalpara-title`, so that
people can reformat with filters.

Closes #6562.

Thanks to rdmuller.
2020-11-19 09:33:29 -08:00
John MacFarlane
0962b30d84 Man reader: improve handling of .IP.
We now better handle `.IP` when it is used with non-bullet,
non-numbered lists, creating a definition list.

We also skip blank lines like groff itself.

Closes #6858.
2020-11-18 22:44:32 -08:00
Albert Krewinkel
023468ea2d
JATS writer: wrap all tables
All `<table>` elements are put inside `<table-wrap>` elements, as the
former are not valid as immediate child elements of `<body>`.
2020-11-18 18:10:17 +01:00
TEC
0306eec5fa Replace org #+KEYWORDS with #+keywords
As of ~2 years ago, lower case keywords became the standard (though they
are handled case-insensitively, as always):
13424336a6

Upper case keywords are exclusive to the manual:
- https://orgmode.org/list/871s50zn6p.fsf@nicolasgoaziou.fr/
- https://orgmode.org/list/87tuuw3n15.fsf@nicolasgoaziou.fr/
2020-11-18 14:48:56 +01:00
TEC
224a501b29 Update org supported languages and identifiers
according to the current list contained in
https://orgmode.org/worg/org-contrib/babel/languages/index.html
2020-11-18 14:48:56 +01:00
John MacFarlane
efa34a8de6 Bibtex reader: fall back on en-US if locale for LANG not found.
This reproduces earlier pandoc-citeproc behavior.

Closes jgm/citeproc#26.
2020-11-17 23:12:32 -08:00
John MacFarlane
bf3fea0a8c Markdown reader: fix regression with example list references.
This affects example list references followed by dashes.
Introduced by commit b8d17f7.
Closes #6855.
2020-11-17 20:36:59 -08:00
Albert Krewinkel
94c9028819 JATS writer: move Table handling to separate module
This makes it easier to split the module into smaller parts.
2020-11-17 09:46:30 +01:00
John MacFarlane
c9ada73cac Move getNextNumber from Readers.LaTeX to Readers.LaTeX.Parsing. 2020-11-16 22:36:10 -08:00
John MacFarlane
ee34c4fef8 Only use filterIpynbOutput if input format is ipynb.
Closes #6841.
2020-11-16 18:21:30 -08:00
John MacFarlane
98bedd7631 When checking reader/writer name, check base name...
now that we permit extensions on formats other
than markdown.
2020-11-16 17:49:23 -08:00
John MacFarlane
5271c6b3fb Improve fix to siunitx numbers with minus.
- use real minus sign
- use tests contributed by Igor Pashev.
2020-11-16 16:36:16 -08:00
John MacFarlane
734b4c26a9 LaTeX reader: Fix negative numbers in siunitx commands.
The commit a157e1a broke negative numbers, e.g.
`\SI{-33}{\celsius}` or `\num{-3}`. This fixes the regression.
2020-11-16 14:08:29 -08:00
John MacFarlane
d7f905fb63 Markdown reader: fix detection of locators following in-text citations.
Previously, if we had `@foo [p. 33; @bar]`, the `p. 33` would be
incorrectly parsed as a prefix of `@bar` rather than a suffix
of `@foo`.
2020-11-15 17:51:03 -08:00
John MacFarlane
f8225140a5 Text.Pandoc.PDF: Fix changePathSeparators for Windows.
Previously a path beginning with a drive, like
`C:\foo\bar`, was translated to `C:\/foo/bar`, which
caused problems.

With this fix, the backslashes are removed.

Closes #6173.
2020-11-15 10:43:43 -08:00
Albert Krewinkel
26f946af20
Remove redundant bracket in App.Opt 2020-11-15 12:08:15 +01:00
John MacFarlane
b5d066f167 Revise deprecation warning for --atx-headers. 2020-11-14 21:41:50 -08:00
Aner Lucero
f63b76e169 Markdown writer: default to using ATX headings.
Previously we used Setext (underlined) headings by default.
The default is now ATX (`##` style).

* Add the `--markdown-headings=atx|setext` option.
* Deprecate `--atx-headers`.
* Add `ATXHeadingInLHS` constructor to `LogMessage` [API change].
* Support `markdown-headings` in defaults files.
* Document new options in MANUAL.

Closes #6662.
2020-11-14 21:33:32 -08:00
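The two heading styles that `--markdown-headings=atx|setext` chooses between can be illustrated with a short Python sketch. `render_heading` is a hypothetical function, not part of pandoc; it relies only on the fact that Setext syntax exists for levels 1 and 2.

```python
def render_heading(level: int, text: str, style: str = "atx") -> str:
    """Render a Markdown heading in ATX or Setext style."""
    if style == "setext" and level in (1, 2):
        underline = ("=" if level == 1 else "-") * len(text)
        return f"{text}\n{underline}"
    # Setext has no syntax for levels 3 and up, so those fall back to ATX.
    return f"{'#' * level} {text}"
```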
John MacFarlane
b8d17f7ae8 Markdown reader: don't increment stateNoteNumber for example refs.
Background:  syntactically, references to example list items
can't be distinguished from citations; we only know which they
are after we've parsed the whole document (and this is resolved
in the `runF` stage).

This means that pandoc's calculation of `citationNoteNum`
can sometimes be wrong when there are example list references.

This commit partially addresses #6836, but only for the case
where the example list references refer to list items defined
previously in the document.
2020-11-14 15:00:17 -08:00
John MacFarlane
68b298ed9a Improve period suppression algorithm for citations in notes...
in note citation styles.  See #6835.
2020-11-13 10:52:21 -08:00
gison93
fec695c77a
Fix error when extension output is doc (#6834) 2020-11-13 09:07:31 -08:00
John MacFarlane
7d298d13d9 Remove redundant bracket. 2020-11-10 10:34:46 -08:00
John MacFarlane
7d01887dda Fix corner case in YAML metadata parsing.
Previously YAML metadata would sometimes not get recognized if a
field ended with a newline followed by spaces.  Closes #6823.
2020-11-10 09:47:24 -08:00
John MacFarlane
08ce3addde Hlint suggestions. 2020-11-07 10:53:07 -08:00
Albert Krewinkel
527346cc7e
Lint code in PRs and when committing to master (#6790)
* Remove unused LANGUAGE pragmata

* Apply HLint suggestions

* Configure HLint to ignore some warnings

* Lint code when committing to master
2020-11-07 10:38:03 -08:00
Albert Krewinkel
0ed3436588
doc/filters.md: describe technical details of filter invocations (#6815) 2020-11-06 15:37:24 -08:00
John MacFarlane
535bd607de Support nocase spans for csljson output 2020-11-06 09:16:24 -08:00
John MacFarlane
06d3071090 LaTeX reader: better handling of \\ inside math in table cells.
Previously this confused the table parser.  Closes #6811.
2020-11-05 16:13:35 -08:00
John MacFarlane
090b0877bc Citeproc: improve punctuation in in-text note citations.
Previously in-text note citations inside a footnote
would sometimes have the final period stripped, even
if it was needed (e.g. on the end of 'ibid').

See #6813.
2020-11-05 11:15:23 -08:00
John MacFarlane
efe74746d8 DokuWiki writer: translate language names for code elements...
...and improve whitespace.  Closes #6807.
2020-11-04 22:38:53 -08:00
John MacFarlane
08134388ad MediaWiki writer: use syntaxhighlight tag...
instead of deprecated source, for highlighted code.

Also support `startFrom` attribute and `numberLines`.

Closes #6810.
2020-11-04 21:20:41 -08:00
John MacFarlane
0bd6fb4745 Simplified idpred in citeproc. 2020-11-04 11:10:49 -08:00
John MacFarlane
8f75a53542 Properly support optional cite argument for \blockquote.
(LaTeX reader)

Closes #6802.
2020-11-03 10:25:56 -08:00
John MacFarlane
6cbe5efd56 LaTeX reader: fix bug parsing macro arguments.
If `\cL` is defined as `\mathcal{L}`, and `\til` as `\tilde{#1}`,
then `\til\cL` should expand to `\tilde{\mathcal{L}}`, but pandoc
was expanding it to `\tilde\mathcal{L}`.  This is fixed by
parsing the arguments in "verbatim mode" when the macro expands
arguments at the point of use.

Closes #6796.
2020-11-02 15:04:16 -08:00
Albert Krewinkel
1175b0a008
T.P.Filter: allow shorter YAML representation of Citeproc
The map-based YAML representation of filters expects `type` and `path`
fields. The path field had to be present for all filter types, but is
not used for citeproc filters. The field can now be omitted when type
is "citeproc", as described in the MANUAL.
2020-11-02 15:14:19 +01:00
John MacFarlane
6051c751ce Citeproc: use comma for in-text citations inside footnotes.
When an author-in-text citation like `@foo` occurs in a footnote,
we now render it with:  `AUTHOR NAME + COMMA + SPACE + REST`.

Previously we rendered: `AUTHOR NAME + SPACE + "(" + REST + ")"`.

This gives better results.  Note that normal citations are still
rendered in parentheses.
2020-11-01 10:48:47 -08:00
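The old and new renderings described in 6051c751ce, as a small Python sketch. The function name and the `legacy` flag are hypothetical; the real logic sits in pandoc's citeproc integration.

```python
def author_in_text_in_footnote(author: str, rest: str, legacy: bool = False) -> str:
    """Format an author-in-text citation (like `@foo`) that occurs in a footnote."""
    if legacy:
        # Previous rendering: AUTHOR NAME + SPACE + "(" + REST + ")"
        return f"{author} ({rest})"
    # Current rendering: AUTHOR NAME + COMMA + SPACE + REST
    return f"{author}, {rest}"
```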
John MacFarlane
01f2d81168 Improve deNote. 2020-11-01 10:48:47 -08:00
Andy Morris
f1f2728259 Fix duplicate "class" attribute in HTML writer 2020-10-30 16:38:59 +01:00
John MacFarlane
3e6d009c6b Use new citeproc; do note capitalization here, not in citeproc. 2020-10-29 21:53:02 -07:00
John MacFarlane
bc3f16b0c1 Allow citation-abbreviations in defaults file. 2020-10-29 15:54:50 -07:00
John MacFarlane
bd7c9eb32b LaTeX writer: Improved calculation of table column widths.
We now have LaTeX do the calculation, using `\tabcolsep`.
So we should now have accurate relative column widths no
matter what the text width.

The default template has been modified to load the calc
package if tables are used.
2020-10-29 12:10:05 -07:00
John MacFarlane
95c9f3da63 Remove obsolete comment 2020-10-27 21:05:59 -07:00
John MacFarlane
3190ce95c2 Citeproc: properly handle csl field with data: URI.
This is used with the JATS writer, so this fixes a regression
in pandoc 2.11 with JATS output and citeproc.

Closes #6783.
2020-10-27 21:04:24 -07:00
John MacFarlane
3d93414e5d Add PandocBibliographyError and use it in parsing bibliographies.
This ensures that bibliography parsing errors generate messages
that include the bibliography file name -- otherwise it can be
quite mysterious where it is coming from.

[API change] New PandocBibliographyError constructor on
PandocError type.
2020-10-26 14:46:53 -07:00
Nils Carlson
dd3d920ba0
DocBook Reader: fix duplicate bibliography bug (#6773)
Also add unit test to ensure the behavior stays consistent.
2020-10-26 12:49:03 -07:00
John MacFarlane
9ab04a92f8 HTML reader: Parse contents of iframes.
See #6770.
2020-10-23 23:31:36 -07:00
John MacFarlane
4bf171e11d HTML reader: parse inline svg as image...
...unless `raw_html` is set in the reader (in which case
the svg is passed through as raw HTML).

Closes #6770.
2020-10-23 22:09:39 -07:00
John MacFarlane
efc6994c8a Commonmark writer: fix regression with fenced divs.
Starting with 2.10.1, fenced divs no longer render with
HTML div tags in commonmark output.  This is a regression
due to our transition from cmark-gfm.  This commit fixes it.

Closes #6768.
2020-10-23 09:25:07 -07:00
John MacFarlane
f9c6167ad1 citeproc - improved removal of final period...
...in citations inside notes in note-based styles.
These citations are put in parentheses, but the final
period must be removed.

See jgm/citeproc#20
2020-10-21 22:23:21 -07:00
John MacFarlane
76315d99ca More refinements to --version output.
Add ipynb version.  Put user data directory on same line as
heading "User data directory" (dropping "default").
2020-10-19 17:12:36 -07:00
John MacFarlane
1a2f8733b6 Normalize rewritten image paths with --extract-media.
This change will avoid mixed paths like this one when
`--extract-media` is used with a Word file:
`![](C:\Git\TIJ4\Markdown/media/image30.wmf)`

Instead we'll get
`![](C:\Git\TIJ4\Markdown\media\image30.wmf)`.

Closes #6761.
2020-10-19 16:32:39 -07:00
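The effect of the normalization in 1a2f8733b6 can be reproduced with Python's standard `ntpath` module, which applies Windows path rules on any platform. This only illustrates the outcome; it is not how pandoc implements it, and `normalize_windows_path` is a hypothetical name.

```python
import ntpath  # Windows path rules, importable on any platform


def normalize_windows_path(path: str) -> str:
    """Rewrite a mixed-separator Windows path to use backslashes only."""
    return ntpath.normpath(path)
```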
John MacFarlane
9ecea0bc62 Modify --version output.
Use space more efficiently and report the citeproc version along
with skylighting, texmath, and pandoc-types.
2020-10-19 16:32:39 -07:00
Nils Carlson
2332a08f1e
DocBook reader: bibliomisc and anchor support (#6754)
Also do some minor refactoring - bibliodiv without
a title no longer results in an empty Header.
2020-10-16 23:52:19 -07:00
John MacFarlane
eb3307da4e Fix handling of xdata in bibtex/biblatex bibliographies.
Closes #6752.
2020-10-15 17:41:45 -07:00
Michael Hoffmann
988d381aad
Fix some small typos in the API documentation (#6751)
While reading the docs I found a couple of small typos.
2020-10-15 17:09:29 -07:00
Albert Krewinkel
90af138443
Fix typos in comments, doc strings, error messages, and tests
Typos reported by
https://fossies.org/linux/test/pandoc-master.tar.gz/codespell.html

See: #6738
2020-10-14 22:26:51 +02:00
John MacFarlane
0b3b77415f Modify fix to #6742 to use stringToLaTeX. 2020-10-14 10:22:15 -07:00
John MacFarlane
e0da02623e LaTeX reader: support more acronym commands.
`\acl`, `\aclp`, and capitalized versions of already
supported commands.

Closes #6746.
2020-10-13 21:00:02 -07:00
John MacFarlane
a55fb5f29d LaTeX writer: escape option values in lstlistings environment.
Closes #6742.
2020-10-13 20:53:39 -07:00
John MacFarlane
ef6627f645 LaTeX writer: fix handling of pt-BR.
For polyglossia we now use
`\setmainlanguage[variant=brazilian]{portuguese}`
and for babel
`\usepackage[shorthands=off,main=brazilian]{babel}`.

Closes #2953.
2020-10-12 21:35:36 -07:00
John MacFarlane
12ff835a8a Commonmark reader: add pipe_table extension after defaults.
Otherwise we get bad results for non-table, non-paragraph
lines containing pipe characters.

Closes #6739.

See also jgm/commonmark-hs#52.
2020-10-12 21:24:26 -07:00
John MacFarlane
2007cff203 Markdown writer: Fix autolinks rendering for gfm.
Previously, autolinks rendered as raw HTML, due to the
`class="uri"` added by pandoc's markdown reader.

Closes #6740.
2020-10-12 18:57:04 -07:00
John MacFarlane
0b5e2601f5 LaTeX reader: allow blank lines inside \author. 2020-10-10 16:28:52 -07:00
Kolen Cheung
0166b8f857
Options.hs: defaultMathJaxURL: use tex-chtml-full instead of tex-mml-chtml (#6600)
Closes #6599

c.f. https://docs.mathjax.org/en/latest/web/components/combined.html

Note that while this uses the full variant of the js, it drops the MathML support.
That should be okay, because pandoc renders math in HTML as TeX when using
mathjax.

This change reduces latency.
2020-10-10 11:11:58 -07:00
John MacFarlane
e7adc2917b In fetching parent of dependent CSL style, first...
look locally, and only do an HTTP request if it's not
found locally.
2020-10-09 13:35:53 -07:00
John MacFarlane
9a6c42590f LaTeX reader: Fix parsing of "show name" in newtheorem.
Previously we were just treating it as a string and
ignoring  accents and formatting.  See #6734.
2020-10-08 22:47:32 -07:00
John MacFarlane
2d4214fa31 Extend fix to #6719 to JATS reader 2020-10-08 21:36:08 -07:00
John MacFarlane
f19286cf12 DocBook reader: don't squelch space at end of emphasis element.
Instead, include it after the emphasis.  Closes #6719.

Same fix was made for other inline elements, e.g. strikethrough.
2020-10-08 21:27:52 -07:00
John MacFarlane
dd3c4000ff Small improvements to BibTeX parser. 2020-10-08 20:48:19 -07:00
John MacFarlane
480f582664 Export ParseError from T.P.Parsing. 2020-10-08 18:55:20 -07:00
John MacFarlane
641849b70a Be less aggressive about using quotes for YAML values.
We need quotes if `[` or `{` or `'` is at the beginning of
the line, but not otherwise.
2020-10-08 10:54:53 -07:00
John MacFarlane
2566ff6624 Qualify some uses of fail to avoid ambiguity. 2020-10-08 09:12:29 -07:00
John MacFarlane
1be0f0fba8 Use double quotes for YAML metadata.
Closes #6727.
2020-10-07 23:05:51 -07:00
John MacFarlane
6f2019ac08 Remove redundant import. 2020-10-07 16:01:30 -07:00
John MacFarlane
5b4a606265 Remove redundant import. 2020-10-07 16:01:10 -07:00
John MacFarlane
428f8b4d20 Raise informative errors when YAML metadata parsing fails.
Closes #6730.

Previously the command would succeed, returning empty metadata,
with no errors or warnings.

API changes:

- Remove now unused CouldNotParseYamlMetadata constructor for
  LogMessage (T.P.Logging).

- Add 'Maybe FilePath' parameter to yamlToMeta in T.P.Readers.Markdown.
2020-10-07 13:12:32 -07:00
John MacFarlane
69b030c7df Cleaner solution to #6723. 2020-10-07 11:29:05 -07:00
John MacFarlane
1742821c3e Fix URL prefixes in citations also when they occur in notes.
Update chicago-fullnote-bibliography.csl and adjust tests.

Closes #6723.
2020-10-07 11:23:28 -07:00
John MacFarlane
d2e4a83dc6 Use latest citeproc.
Better solution to the problem of entities in CSL JSON output.
2020-10-07 09:31:44 -07:00
John MacFarlane
fd3809c33f Unescape entities in writing CSL JSON.
The renderCslJson function escapes `<`, `>`, and `&` as entities.
This is appropriate when generating HTML, but in CSL JSON
these are supposed to appear unescaped.

Closes jgm/citeproc#17.
2020-10-06 22:29:25 -07:00
Diego Balseiro
eda5540719
DOCX reader: Allow empty dates in comments and tracked changes (#6726)
For security reasons, some legal firms delete the date from comments and
tracked changes.

* Make date optional (Maybe) in tracked changes and comments datatypes
* Add tests
2020-10-06 21:03:00 -07:00
John MacFarlane
a27a0b5419 Incorporate https://doi.org/ prefix added by CSL style...
...into linked DOI, and similarly for other URLs linked in the
bibliography.  We want to avoid having a URL in which only the latter
part is linked.  Closes #6723.
2020-10-06 19:20:00 -07:00
John MacFarlane
a78fd5dbc0 Fix URL for "short DOIs" in citations. See #6723.
Short DOIs begin 10/abcd and should be links to
`https://doi.org/abcd` (omitting the `10/`).
2020-10-06 17:33:25 -07:00
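The short-DOI rule from a78fd5dbc0, sketched in Python. `doi_to_url` is a hypothetical helper; the handling of regular DOIs is an assumption added for contrast.

```python
def doi_to_url(doi: str) -> str:
    """Build a doi.org link, dropping the `10/` prefix of short DOIs."""
    if doi.startswith("10/"):
        return "https://doi.org/" + doi[len("10/"):]
    # Assumption: ordinary DOIs are appended unchanged.
    return "https://doi.org/" + doi
```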
John MacFarlane
97695a2bcc Fixed regression in last commit.
Parsing of YAML bibliographies was broken; this fixes it.
2020-10-05 23:57:38 -07:00
John MacFarlane
3e7ca707c9 Removed the idpred from metaValueToReference.
This isn't really necessary; we do filtering at other points now.
2020-10-05 21:15:20 -07:00
John MacFarlane
6a32ea71ea Add yamlToRefs, yamlBsToRefs.
T.P.Readers.Markdown now exports yamlToRefs. [API change]

T.P.Readers.Metadata exports yamlBsToRefs. [API change]

These allow specifying an id filter so we parse only references
that are used in the document.  Improves timing with a 3M
yaml references file from 36s to 17s.
2020-10-05 21:07:47 -07:00
John MacFarlane
89e4f1bf9a Improve searching for CSL files...
...and CSL abbreviation files.  Use resource path to search
in both USERDATADIR/csl and USERDATADIR/csl/dependent.

Also, add .csl or .json extension as needed, so you can just
do --csl zoology.
2020-10-05 17:23:50 -07:00
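A Python sketch of the lookup described in 89e4f1bf9a, for the `.csl` case only. The function name, the search order, and trying the working directory first are all assumptions made for illustration; the commit message only states which directories are searched and that the extension is added as needed.

```python
import posixpath
from typing import List


def csl_candidates(name: str, user_data_dir: str) -> List[str]:
    """List paths to try for `--csl NAME` (assumed search order)."""
    # Add the extension only when the user left it off.
    filename = name if name.endswith(".csl") else name + ".csl"
    return [
        filename,  # assumption: the path as given is tried first
        posixpath.join(user_data_dir, "csl", filename),
        posixpath.join(user_data_dir, "csl", "dependent", filename),
    ]
```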
John MacFarlane
4dac62ef3a Use yamlToMeta for yaml bibliography
This speeds up parsing of external yaml bibliographies considerably
(in one test 36s -> 17s).
2020-10-05 16:58:58 -07:00
John MacFarlane
128991d4a4 Add filtering to metaValueToReference, and check other-ids field too. 2020-10-05 16:35:51 -07:00
Albert Krewinkel
01a6b071fa
Sort languages in --list-highlight-languages output (#6718)
Languages appear to be sorted by their long name, which leads to
unexpected results: e.g., the long name of *m4* is *GNU m4*, so it is
listed between *gnuassembler* and *go*.
2020-10-04 08:03:42 -07:00
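The ordering problem in 01a6b071fa is easy to reproduce in Python. The long name of `m4` comes from the commit message; the other long names are assumptions for the sake of the example.

```python
# (short name, long name) pairs; only "GNU m4" is taken from the commit message.
LANGUAGES = [
    ("go", "Go"),
    ("m4", "GNU m4"),
    ("gnuassembler", "GNU Assembler"),
]

# Sorting by long name places m4 between gnuassembler and go.
by_long_name = [short for short, _ in sorted(LANGUAGES, key=lambda p: p[1].lower())]

# Sorting by the short name that users actually type gives the expected order.
by_short_name = sorted(short for short, _ in LANGUAGES)
```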
Michael Hoffmann
74bd5a4f47
Docx writer: better handle list items whose contents are lists (#6522)
If the first element of a bulleted or ordered list is another list,
then that first item will disappear if the target format is docx. This
changes the docx writer so that it prepends an empty string for those
cases. With this, no items will disappear.

Closes #5948.
2020-10-02 09:30:05 -07:00
John MacFarlane
27b4c21f72 Update to latest citeproc 2020-10-01 22:07:55 -07:00
niszet
7d97bf7a8c
Syntax highlight for inline code of OpenDocument (#6711)
To implement syntax highlighting for OpenDocument, `inlineToOpenDocument` in the OpenDocument writer is updated, based on the Docx writer.
This commit covers only inline Code, because updating CodeBlock needs a structural change to the output document.
Currently, styles are not generated automatically in styles.xml; implementing that needs an additional commit for the ODT writer.
Although the styles are not included in styles.xml, the output file can be opened in LibreOffice (7.0.0.3), where the code is shown like normal characters.
2020-10-01 09:55:16 -07:00
John MacFarlane
5e70f774ec Fix redundant import warning. 2020-09-27 23:32:45 -07:00
John MacFarlane
eff6b8f27d Use latest citeproc. 2020-09-27 16:03:31 -07:00
Nils Carlson
ae4dcc0d4a
OpenDocument Writer: Implement table cell alignment (#6700)
Co-authored-by: Mauro Bieg <mb21@users.noreply.github.com>
2020-09-27 11:21:53 -07:00
John MacFarlane
a822067903 Fix short-title.
We were getting null short-titles generated, and that
was creating wrong citations in some cases.

Closes #6702.
2020-09-26 14:28:28 -07:00
John MacFarlane
5a388ab2f5 Allow gfm_auto_identifiers, ascii_identifiers extensions for docx. 2020-09-25 09:53:56 -07:00
John MacFarlane
188c444990 RST reader: apply .. class:: directly to following Header.
rather than creating a surrounding Div.

Closes #6699.
2020-09-25 09:06:15 -07:00
Albert Krewinkel
6119125a8b
Org reader: fix HLint warnings 2020-09-25 09:44:00 +02:00
Nils Carlson
1ad7a047d5
DocBook reader: Implement table cell alignment (#6698) 2020-09-24 17:43:43 -07:00
John MacFarlane
a331c69b49 Slight improvement to last commit.
We now add a space only if there isn't already one.
(Some styles add a space at the end of the left-margin
div.)
2020-09-24 10:03:04 -07:00
John MacFarlane
810ea6fdf8 Citeproc: Insert space after csl-left-margin span contents...
if they come before csl-right-inline.  This ensures that
the citation number or label will be separated from the
rest by a space, even in formats (like plain) that don't yet have
special handling for the display spans.
2020-09-24 09:57:55 -07:00
Nils Carlson
4f13c0e25e
OpenDocument writer: New table cell support with row and column spans (#6682)
Unit tests only verify column spans at this point.

Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
2020-09-24 09:31:47 -07:00
niszet
1f707da40f
Support toc-depth option for ODT writer (#6697)
To support `--toc-depth` option for ODT, writer and template are
updated.  Closes #6696.
2020-09-24 09:28:38 -07:00
John MacFarlane
e0984a43a9 Add built-in citation support using new citeproc library.
This deprecates the use of the external pandoc-citeproc
filter; citation processing is now built in to pandoc.

* Add dependency on citeproc library.
* Add Text.Pandoc.Citeproc module (and some associated unexported
  modules under Text.Pandoc.Citeproc).  Exports `processCitations`.
  [API change]
* Add data files needed for Text.Pandoc.Citeproc:  default.csl
  in the data directory, and a citeproc directory that is just
  used at compile-time.  Note that we've added file-embed as a mandatory
  rather than a conditional dependency, because of the biblatex
  localization files. We might eventually want to use readDataFile
  for this, but it would take some code reorganization.
* Text.Pandoc.Logging: Add `CiteprocWarning` to `LogMessage` and use it
  in `processCitations`. [API change]
* Add tests from the pandoc-citeproc package as command tests (including
  some tests pandoc-citeproc did not pass).
* Remove instructions for building pandoc-citeproc from CI and
  release binary build instructions.  We will no longer distribute
  pandoc-citeproc.
* Markdown reader: tweak abbreviation support.  Don't insert a
  nonbreaking space after a potential abbreviation if it comes right before
  a note or citation.  This messes up several things, including citeproc's
  moving of note citations.
* Add `csljson` as an input and output format. This allows pandoc
  to convert between `csljson` and other bibliography formats,
  and to generate formatted versions of CSL JSON bibliographies.
* Add module Text.Pandoc.Writers.CslJson, exporting `writeCslJson`. [API
  change]
* Add module Text.Pandoc.Readers.CslJson, exporting `readCslJson`. [API
  change]
* Added `bibtex`, `biblatex` as input formats.  This allows pandoc
  to convert between BibLaTeX and BibTeX and other bibliography formats,
  and to generate formatted versions of BibTeX/BibLaTeX bibliographies.
* Add module Text.Pandoc.Readers.BibTeX, exporting `readBibTeX` and
  `readBibLaTeX`. [API change]
* Make "standalone" implicit if output format is a bibliography format.
  This is needed because pandoc readers for bibliography formats put
  the bibliographic information in the `references` field of metadata;
  and unless standalone is specified, metadata gets ignored.
  (TODO: This needs improvement. We should trigger standalone for the
  reader when the input format is bibliographic, and for the writer
  when the output format is markdown.)
* Carry over `citationNoteNum` to `citationNoteNumber`.  This was just
  ignored in pandoc-citeproc.
* Text.Pandoc.Filter: Add `CiteprocFilter` constructor to Filter.
  [API change] This runs the processCitations transformation.
  We need to treat it like a filter so it can be placed
  in the sequence of filter runs (after some, before others).
  In FromYAML, this is parsed from `citeproc` or `{type: citeproc}`,
  so this special filter may be specified either way in a defaults file
  (or by `citeproc: true`, though this gives no control of positioning
  relative to other filters).  TODO: we need to add something to the
  manual section on defaults files for this.
* Add deprecation warning if `pandoc-citeproc` filter is used.
* Add `--citeproc/-C` option to trigger citation processing.
  This behaves like a filter and will be positioned
  relative to filters as they appear on the command line.
* Rewrote the manual on citations, adding a dedicated Citations
  section which also includes some information formerly found in
  the pandoc-citeproc man page.
* Look for CSL styles in the `csl` subdirectory of the pandoc user data
  directory.  This changes the old pandoc-citeproc behavior, which looked
  in `~/.csl`.  Users can simply symlink `~/.csl` to the `csl`
  subdirectory of their pandoc user data directory if they want
  the old behavior.
* Add support for CSL bibliography entry formatting to LaTeX, HTML,
  Ms writers.  Added CSL-related CSS to styles.html.
2020-09-21 10:15:50 -07:00
John MacFarlane
a59ae96062 Markdown reader: Set citationNoteNum accurately in citations.
This also changes stateLastNoteNumber -> stateNoteNumber.
2020-09-21 10:10:37 -07:00
John MacFarlane
b2f3074988 Parsing: add stateInNote and stateLastNoteNumber to ParserState.
These will be used to populate note numbers for citations.
2020-09-21 10:10:30 -07:00
John MacFarlane
39f357027a Sort YAML metadata keys in Markdown output case-insensitive.
Use caseFold.
2020-09-21 10:10:12 -07:00
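Case-insensitive key ordering of the kind described in 39f357027a has a direct Python analogue in `str.casefold`. The sketch below only illustrates the ordering; `sorted_metadata_keys` is a hypothetical name.

```python
from typing import Iterable, List


def sorted_metadata_keys(keys: Iterable[str]) -> List[str]:
    """Sort metadata keys without regard to letter case."""
    return sorted(keys, key=str.casefold)
```

A plain code-point sort would put `Zebra` before `author`; with case folding, `author` comes first.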
John MacFarlane
045dd212a7 Remove duplicate tshow definition. 2020-09-21 10:09:59 -07:00
Albert Krewinkel
acbea6b8c6
Lua filters: add SimpleTable for backwards compatibility (#6575)
A new type `SimpleTable` is made available to Lua filters. It is
similar to the `Table` type in pandoc versions before 2.10;
conversion functions from and to the new Table type are provided.

Old filters using tables now require minimal changes and can use,
e.g.,

    if PANDOC_VERSION > {2,10,1} then
      pandoc.Table = pandoc.SimpleTable
    end

and

    function Table (tbl)
      tbl = pandoc.utils.to_simple_table(tbl)
      …
      return pandoc.utils.from_simple_table(tbl)
    end

to work with the current pandoc version.
2020-09-20 15:48:31 -07:00
John MacFarlane
26ed7fb4f9 Command line options: use normalizePath in more places.
See #5127.  It is now used everywhere a file argument can be used.

Closes #5127.
2020-09-19 22:35:50 -07:00
argent0
ba9bedef23
Asciidoctor images (#6671)
Support `Asciidoctor`'s block figures.

Closes #6538.
2020-09-19 18:22:52 -07:00
Mauro Bieg
caa225ad82
Add CSS to default HTML template (#6601) 2020-09-19 16:13:50 -07:00
John MacFarlane
d5a7abd47f Change deprecated Builder.isNull to null. 2020-09-19 16:00:22 -07:00
John MacFarlane
a26ec96d89 LaTeX writer: fix spacing issue with list in definition list.
When a list occurs at the beginning of a definition list definition,
it can start on the same line as the label, which looks bad.

Fix that by starting such lists with an `\item[]`.
2020-09-15 17:59:03 -07:00
Christian Despres
a2d343420f
LaTeX reader: fix improper empty cell filtering (#6689) 2020-09-15 13:36:11 -07:00
Albert Krewinkel
34151e8da8
HTML writer: support intermediate table headers
Closes: #6314
2020-09-13 23:23:11 +02:00
Albert Krewinkel
8711640512
HTML writer: support attributes on all table elements
Add attributes to tbody and tr elements.
2020-09-13 20:26:06 +02:00
Christian Despres
cae155b095
Fix hlint suggestions, update hlint.yaml (#6680)
Most suggestions were redundant brackets. Some required
LambdaCase.

The .hlint.yaml file had a small typo, and didn't ignore camelCase
suggestions in certain modules.
2020-09-13 07:48:14 -07:00
Albert Krewinkel
a400d0dc62
HTML writer: render table footers if present
Part of: #6314
2020-09-12 21:49:01 +02:00
Christian Despres
22babd5382
[API change] Rename Writers.Tables and its contents (#6679)
Writers.Tables is now Writers.AnnotatedTable. All of the types and
functions in it have had the "Ann" removed from them. Now it is
expected that the module be imported qualified.
2020-09-12 08:50:36 -07:00
Joseph C. Sible
6fda8cfa28
Use the original tail instead of deconstructing and reconstructing it (#6678) 2020-09-11 13:49:01 -07:00
Leonard Rosenthol
55e5ad2d8f
Changed default link state to invisible (#6676) 2020-09-10 22:58:53 -07:00
John MacFarlane
623ce89e0e Improved uncertainty handling in siunitx. 2020-09-10 14:48:35 -07:00
John MacFarlane
a03160fb0d LaTeX reader: support parenthesized uncertainties in siunitx. 2020-09-10 13:07:31 -07:00
Albert Krewinkel
9423b4b7d9
Support colspans and rowspans in HTML tables (#6644)
* HTML writer: add support for row headers, colspans, rowspans
* Add planet table tests

See #6312
2020-09-10 09:47:40 -07:00
Leonard Rosenthol
ef4f514359
Implement support for internal document links in ICML (#6606)
Closes #5541.
2020-09-10 09:40:35 -07:00
Nils Carlson
96a0f3c7af
docbook reader: Implement column span support for tables (#6492)
Implement column span support for tables in the DocBook reader.

Co-authored-by: Nils Carlson <nils.carlson@ludd.ltu.se>
2020-09-10 09:11:52 -07:00
Albert Krewinkel
9cad5499c4
Reader.LaTeX.hs: remove trailing whitespace 2020-09-08 10:33:49 +02:00
Christian Despres
10c6c411f9
Add Writers.Tables helper functions and types, add tests for those (#6655)
The Writers.Tables module contains an AnnTable type that is a pandoc
Table with added inferred information that should be enough for
writers (in particular the HTML writer) to operate on without having
to lay out the table themselves.

The toAnnTable and fromAnnTable functions in that module convert
between AnnTable and Table. In addition to producing an AnnTable with
coherent and well-formed annotations, the toAnnTable function also
normalizes its input Table like the table builder does.

Various tests ensure that toAnnTable normalizes tables exactly like
the table builder, and that its annotations are coherent.
2020-09-05 14:36:51 -07:00
John MacFarlane
a157e1a6e0 Support numrange, numlist for siunitx.
See #6658.
2020-09-02 17:00:13 -07:00
John MacFarlane
83f0acab47 Support some missing siunitx commands. 2020-09-02 16:49:10 -07:00
John MacFarlane
321658fe4d LaTeX reader: Support siunitx \ang.
See #6658.
2020-09-02 16:16:06 -07:00
John MacFarlane
16c44cd2a9 Skip opts for \si. 2020-09-02 16:01:32 -07:00
John MacFarlane
e3e66ba47f LaTeX reader: support \si and improve other siunitx commands. 2020-09-02 15:44:36 -07:00
John MacFarlane
0a98648c1a LaTeX reader: support \num from siunitx. 2020-09-02 13:05:34 -07:00
John MacFarlane
529eb696dc LaTeX reader: Support squared, cubed, tothe in siunitx.
Closes #6657.
2020-09-02 11:06:26 -07:00
John MacFarlane
81fe8ebf36 LaTeX reader: Factored out siunitx stuff into separate module. 2020-09-02 10:10:55 -07:00
John MacFarlane
93e3d463fd Docx writer: separate adjacent tables.
Word combines adjacent tables, so to prevent this we insert
an empty paragraph between two adjacent tables.
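
The rule can be sketched in Python as follows. This is an illustration only: blocks are modelled as plain strings, and `separate_adjacent_tables` and `EMPTY_PARA` are hypothetical names, not part of the Docx writer.

```python
EMPTY_PARA = "EmptyPara"  # stand-in for an empty paragraph block


def separate_adjacent_tables(blocks):
    """Insert an empty paragraph between two directly adjacent tables."""
    out = []
    for block in blocks:
        # Only a table that immediately follows another table needs a separator.
        if out and out[-1] == "Table" and block == "Table":
            out.append(EMPTY_PARA)
        out.append(block)
    return out


print(separate_adjacent_tables(["Table", "Table", "Para", "Table"]))
# -> ['Table', 'EmptyPara', 'Table', 'Para', 'Table']
```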

Closes #4315.
2020-08-24 09:31:39 -07:00
John MacFarlane
49e810b4ed HTML writer: Fix addition of doc-biblioentry role. 2020-08-21 12:22:33 -07:00
Laurent P. René de Cotret
482a2e5079
[Latex Reader] Fixing issues with \multirow and \multicolumn table cells (#6608)
* Added test to replicate (#6596)

* Table cell reader not consuming spaces correctly (#6596)

* Prevented wrong nesting of \multicolumn and \multirow table cells (#6603)

* Parse empty table cells (#6603)

* Support full prototype for multirow macro (#6603)

Closes #6603
2020-08-15 11:40:10 -07:00
Emerson Harkin
6cfb31bbe2
Change SIRange to SIrange (#6617) 2020-08-14 11:30:17 -07:00
John MacFarlane
5d4932d7ef DocBook reader: Update list of block level tags.
This fixes #6610.
2020-08-11 09:45:12 -07:00
John MacFarlane
a9da64cc3a Remove fenced_code_blocks and backtick_code_blocks from...
commonmark/gfm extensions.  These shouldn't really be counted
as extensions, because they can't be disabled in commonmark.

Adjust markdown writer to check for commonmark variant in addition
to extensions.
2020-08-09 11:12:59 -07:00
Laurent P. René de Cotret
499fc11fca
[Latex Reader] Table cell parser not consuming spaces correctly (#6597)
* Added test to replicate (#6596)

* Table cell reader not consuming spaces correctly (#6596)
2020-08-07 22:45:47 -07:00
John MacFarlane
d8ad766d17 Options: Add /tex-mml-chtml.js to defaultMathJaxURL.
Previously we added this in processing command line options,
but not in processing defaults files, which was inconsistent.

Closes #6593.
2020-08-06 14:42:26 -07:00
Albert Krewinkel
6b08a37bbd
Org writer: don't force blank line after headers
Closes: #6554
2020-07-31 16:04:18 +02:00
John MacFarlane
2077739a35 Add extensions to gfm and commonmark:
`fenced_code_blocks`, `backtick_code_blocks`, `fenced_code_attributes`.

These can't really be disabled in the reader, but they need to
be enabled in the writer or we just get indented code.
2020-07-29 18:13:15 -07:00
Albert Krewinkel
aff582d9b5
Writers/Shared: add missing function docs
Ensure that all functions in the module have a haddock comment.
2020-07-29 18:33:08 +02:00
Albert Krewinkel
44c4660a31
Lua filters: make attr argument optional in Table constructor
This changes the Lua API. It is highly unlikely for this change to
affect existing filters, since the documentation for the new Table
constructor (and type) was incomplete and partly wrong before.

The Lua API is now more consistent, as all constructors for elements
with attributes now take attributes as the last parameter.
2020-07-25 20:37:57 +02:00
John MacFarlane
3f2bb78f6b Make sure proper set of extensions is recognized for commonmark_x. 2020-07-24 21:52:11 -07:00
John MacFarlane
8c8d3bacb8 Markdown writer: use numerical labels for refs...
...that are longer than 999 characters or contain
square brackets. For conformity with commonmark.
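
A minimal Python sketch of the decision described above, assuming the writer keeps a running counter for numerical labels. `reference_label` and `MAX_LABEL_LENGTH` are hypothetical names; the real logic lives in the Haskell Markdown writer.

```python
MAX_LABEL_LENGTH = 999  # limit stated in the commit message


def reference_label(text, counter):
    """Use the link text as the reference label unless it is too long or contains brackets."""
    if len(text) > MAX_LABEL_LENGTH or "[" in text or "]" in text:
        return str(counter)  # fall back to a numerical label
    return text


print(reference_label("pandoc", 1))      # -> pandoc
print(reference_label("see [note]", 2))  # -> 2
```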

Closes #6560
2020-07-23 18:26:29 -07:00
John MacFarlane
48fb6d947d Add raw_markdown extension affecting ipynb reader.
Specifying `-f ipynb+raw_markdown` will cause Markdown cells
to be represented as raw Markdown blocks, instead of being
parsed.  This is not what you want when going from `ipynb`
to other formats, but it may be useful when going from `ipynb`
to Markdown or to `ipynb`, to avoid semantically insignificant
changes in the contents of the Markdown cells that might
otherwise be introduced.

Closes #5408.
2020-07-23 17:47:02 -07:00
Emerson Harkin
1b8f161198
Minimal support for SIRange in LaTeX reader (#6418)
Add support for `\SIRange{firstnumber}{secondnumber}{unit}` provided by siunitx.

An en-dash is used instead of localized "to".
2020-07-23 16:47:32 -07:00
Laurent P. René de Cotret
8c3b5dd3ae
Col-span and row-span in LaTeX reader (#6470)
Add multirow and multicolumn support in LaTeX reader.
Partially addresses #6311.
2020-07-23 11:23:21 -07:00
John MacFarlane
a0e3172a0b Further improvements to ams theorem support, and a test.
See #1608.
2020-07-23 11:11:28 -07:00
John MacFarlane
98c922ad43 LaTeX reader: Add identifier in divs for ams theorem environments. 2020-07-23 10:44:05 -07:00
John MacFarlane
ff0e130560 LaTeX reader: Support ams \theoremstyle. 2020-07-22 23:52:28 -07:00
John MacFarlane
cdaaaa3f63 Implement first optional argument for \newtheorem.
This allows groups of theorem environments to be
put in the same numbering sequence.
2020-07-22 23:30:41 -07:00
John MacFarlane
f014d71a2f LaTeX reader: Don't boldface alt title in theorems. 2020-07-22 22:59:28 -07:00
John MacFarlane
9d07d180f0 LaTeX reader: support theorem environments and \newtheorem.
Includes numbering and labels and refs.

Note that numbering support is not complete; we don't
reset numbers with sections for example.
2020-07-22 16:30:10 -07:00
John MacFarlane
65865b3186 LaTeX reader: support ams proof environment. 2020-07-22 14:23:26 -07:00
John MacFarlane
7faa9d9064 Moved more from LaTeX reader to LaTeX.Parsing. 2020-07-22 12:05:35 -07:00
John MacFarlane
1e84178431 Docx writer: support --number-sections.
Closes #1413.
2020-07-22 11:53:31 -07:00
John MacFarlane
942e3ee1f9 RST reader: fix csv tables with multiline cells.
Closes #6549.
2020-07-21 10:20:15 -07:00
John MacFarlane
fe315a8290 Move some code from T.P.R.LaTeX to T.P.R.LaTeX.Parsing.
We need to reduce the size of the LaTeX reader to ease
compilation on resource-limited systems.  More can be done
in this vein.
2020-07-20 23:36:54 -07:00
John MacFarlane
490f34dee5 Markdown writer: move asciify out of escapeString.
Otherwise unsmartify doesn't catch quotes that have
already been turned to entities.
2020-07-19 22:51:59 -07:00
John MacFarlane
d6b7b1dc77 Remove use of cmark-gfm for commonmark/gfm rendering.
Instead rely on the markdown writer with appropriate extensions.

Export writeCommonMark variant from Markdown writer.
This changes a few small things in rendering markdown,
e.g. w/r/t requiring backslashes before spaces inside
super/subscripts.
2020-07-19 22:51:59 -07:00
John MacFarlane
a63105ffff Markdown writer: use unicode super/subscript characters...
when possible if the superscript or subscript extension or
raw_html aren't available.
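
The "when possible" condition can be illustrated with a small Python sketch: the conversion succeeds only if every character has a Unicode superscript equivalent. The character table below covers digits and a few symbols and is an assumption for illustration; pandoc's actual table may differ. `to_unicode_superscript` is a hypothetical name.

```python
# Maps "0123456789+-=()" to the corresponding Unicode superscript characters.
SUPERSCRIPTS = str.maketrans(
    "0123456789+-=()",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
    "\u207a\u207b\u207c\u207d\u207e",
)


def to_unicode_superscript(text):
    """Return text in Unicode superscripts, or None if any character has no equivalent."""
    if not all(ord(ch) in SUPERSCRIPTS for ch in text):
        return None  # caller must fall back to another rendering
    return text.translate(SUPERSCRIPTS)


print(to_unicode_superscript("10"))  # superscript one followed by superscript zero
print(to_unicode_superscript("th"))  # None: no mapping for letters in this table
```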
2020-07-19 22:51:59 -07:00
John MacFarlane
8d0676ec4d Markdown writer: render caption as following paragraph...
when `Ext_table_caption` not enabled.
2020-07-19 22:51:59 -07:00
John MacFarlane
0aed9dd589 Add commonmark_x output format...
commonmark with a number of useful extensions (more than gfm).
2020-07-19 22:51:59 -07:00
John MacFarlane
3a22fbd11b Trim down githubMarkdownExtensions.
Previously it included all of the following, which make
sense for the legacy markdown_github but not for gfm,
since they are part of base commonmark and thus
can't be turned off in gfm:

- `Ext_all_symbols_escapable`
- `Ext_backtick_code_blocks`
- `Ext_fenced_code_blocks`
- `Ext_space_in_atx_header`
- `Ext_intraword_underscores`
- `Ext_lists_without_preceding_blankline`
- `Ext_shortcut_reference_links`

These have been removed from `githubMarkdownExtensions`, though
they're still turned on for legacy `markdown_github`.
2020-07-19 22:51:59 -07:00
John MacFarlane
8d523d80d4 Add generic attributes extension.
This allows attributes to be added to any block or inline
element, in principle.  (Though in many cases this will be
done by adding a Div or Span container, since pandoc's
AST doesn't have a slot for attributes for most elements.)

Currently this is only possible with the commonmark and gfm
readers.

Add `Ext_attributes` constructor for `Extension` [API change].
2020-07-19 22:51:59 -07:00
John MacFarlane
0db4702042 Use commonmark-hs to parse commonmark/gfm...
...instead of cmark-gfm (a wrapper around a C library).

We can now support many more pandoc extensions for
commonmark and gfm.

Add fenced_code_attributes to gfm/commonmark extensions.
2020-07-19 22:51:59 -07:00
John MacFarlane
8ede05161f
Merge pull request #6495 from tarleb/html5-figure-accessiblity
HTML writer: improve alt-text/caption handling for HTML5
2020-07-19 11:24:54 -07:00
John MacFarlane
3b563cfe8f DocBook reader: parse releaseinfo as metadata.
Closes #6542.

Note that you'll need to put releaseinfo somewhere in your
template if you want this to be part of the converted output.
2020-07-18 12:32:31 -07:00
Albert Krewinkel
36fede2b02
Jira writer: keep image caption as alt attribute
Fixes #6529
2020-07-17 16:02:40 +02:00
John MacFarlane
302543af08 Docbook reader: remove misleading comment...
suggesting that releaseinfo is handled. It isn't.
2020-07-14 10:10:43 -07:00
John MacFarlane
2cc0d68ca0
Merge pull request #6527 from lierdakil/fix-6514
[Docx Reader] Only use bCs/iCs on runs with rtl or cs property
2020-07-13 10:57:04 -07:00
Nikolay Yakimov
22c373370c [Docx Reader] Only use bCs/iCs on runs with rtl or cs property
Fixes #6514
2020-07-13 19:50:06 +03:00
John MacFarlane
c3b170be1c
Merge pull request #6513 from brisad/master
Escape starting periods in ms writer code blocks
2020-07-12 17:02:06 -07:00
John MacFarlane
7be86b148e
Merge pull request #6509 from lierdakil/docx-smush-inlines-refactor
[Docx Reader] Refactor/update Text.Pandoc.Readers.Docx.Combine.smushInlines
2020-07-12 16:55:35 -07:00
John MacFarlane
9ae792b0d4 Ms writer: fix code highlighting with blank lines.
Previously blank lines were simply omitted from highlighted code.
2020-07-12 14:51:21 -07:00
John MacFarlane
e3217c3862 RST reader: fix spurious newlines in some attributes from directives. 2020-07-12 14:42:41 -07:00
John MacFarlane
37e68a818b RST reader: avoid extra newline in included code blocks. 2020-07-12 13:53:10 -07:00
Michael Hoffmann
09ea10e2b1 Escape starting periods in ms writer code blocks
If a line of ms code block output starts with a period (.), it should
be prepended by '\&' so that it is not interpreted as a roff command.
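
A minimal Python sketch of this escaping rule, not the ms writer's Haskell code; `escape_leading_periods` is a hypothetical name.

```python
def escape_leading_periods(code):
    """Prefix every line that starts with '.' with the roff escape \\&."""
    return "\n".join(
        "\\&" + line if line.startswith(".") else line
        for line in code.split("\n")
    )


print(escape_leading_periods(".TH title\nplain line"))
# first line becomes \&.TH title, second line is unchanged
```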

Fixes #6505
2020-07-08 23:52:28 +02:00
Nikolay Yakimov
f09e18753b [Docx Reader] Use null instead of isEmpty in Readers.Docx.Combine 2020-07-07 14:50:26 +03:00
Nikolay Yakimov
5a1e1db526 [Docx Reader] Remove unused LANGUAGE from Readers.Docx.Combine 2020-07-07 13:23:14 +03:00
Nikolay Yakimov
1ae4d76d42 [Docx Reader] Remove no-op stack/unstackInlines in Readers.Docx.Combine 2020-07-07 12:28:38 +03:00
Nikolay Yakimov
27465638a1 [Docx Reader] Get rid of unused NullModifier in Readers.Docx.Combine 2020-07-07 11:32:17 +03:00
Nikolay Yakimov
48cef91d18 [Docx Reader] Refactor/update smushInlines 2020-07-07 09:04:38 +03:00
John MacFarlane
804e8eeed2 Revert "Ipynb: allow lossless round-tripping of markdown cell content."
This reverts commit efbc205031.
2020-07-02 09:03:56 -07:00
John MacFarlane
9afa192c3a Revert "Ipynb reader: fix duplication of 'source' attribute."
This reverts commit 2d009366ce.
2020-07-02 09:03:26 -07:00
John MacFarlane
2d009366ce Ipynb reader: fix duplication of 'source' attribute.
See #5408.
2020-07-02 09:01:06 -07:00
Albert Krewinkel
b894de6426
HTML writer: improve alt-text/caption handling for HTML5
Screen readers read an image's `alt` attribute and the figure caption,
both of which come from the same source in pandoc. The figure caption is
hidden from screen readers with the `aria-hidden` attribute. This
improves accessibility.

For HTML4, where `aria-hidden` is not allowed, pandoc still uses an
empty `alt` attribute to avoid duplicate contents.

Closes: #6491
2020-07-01 14:54:52 +02:00
Albert Krewinkel
ccf9889c2c
Org reader: respect tables-excluding export setting
Tables can be removed from the final document with the
`#+OPTIONS: |:nil` export setting.
2020-07-01 09:28:24 +02:00
Albert Krewinkel
d6711bd7d9
Org reader: respect export setting disabling footnotes
Footnotes can be removed from the final document with the
`#+OPTIONS: f:nil` export setting.
2020-06-30 22:30:15 +02:00
John MacFarlane
efbc205031 Ipynb: allow lossless round-tripping of markdown cell content.
The reader now parses the contents of the markdown cell to a Pandoc
structure, but *also* stores the raw markdown in a `source`
attribute on the cell Div.  When we convert back to markdown,
this attribute is stripped off and the original source is used.
When we convert to other formats, the attribute is usually
ignored (though it will come through in HTML as a `data-source`
attribute, not unhelpfully).

I'll note some potential drawbacks of this approach:

- It makes it impossible to use pandoc to clean up or
  change the contents of markdown cells, e.g.
  going from `+smart` to `-smart`.

- There may be formats where the addition of the `source`
  attribute is problematic.  I can't think of any, though.

Closes #5408.
2020-06-30 12:32:44 -07:00
Albert Krewinkel
7c207c3051
Org reader: respect export setting which disables entities
MathML-like entities, e.g., `\alpha`, can be disabled with the
`#+OPTIONS: e:nil` export setting.
2020-06-30 11:39:32 +02:00
John MacFarlane
f1a5295082
Merge pull request #6328 from lierdakil/defaults-meta-parse
Unify defaults metadata and markdown metadata parsers
2020-06-29 12:38:49 -07:00
Albert Krewinkel
5ef315cc6d
Org reader: keep unknown keyword lines as raw org
The lines of unknown keywords, like `#+SOMEWORD: value` are no longer
read as metadata, but kept as raw `org` blocks. This ensures that more
information is retained when round-tripping org-mode files;
additionally, this change makes it possible to support non-standard org
extensions via filters.
2020-06-29 21:19:34 +02:00
Albert Krewinkel
90ac70c79c
Org reader: unify keyword handling
Handling of export settings and other keywords (like `#+LINK`) has been
combined and unified.
2020-06-29 20:53:25 +02:00
Albert Krewinkel
1480606174
Org reader: support LATEX_HEADER_EXTRA and HTML_HEAD_EXTRA settings
These export settings are treated like their non-extra counterparts,
i.e., the values are added to the `header-includes` metadata list.
2020-06-29 17:04:29 +02:00
Albert Krewinkel
d17b257c89
Org reader: allow multiple #+SUBTITLE export settings
The values of all lines are read as inlines and collected in the
`subtitle` metadata field.
2020-06-29 17:03:33 +02:00
Nikolay Yakimov
42e7f1e976 Clean up T.P.R.Metadata 2020-06-29 17:07:12 +03:00
Nikolay Yakimov
34e54d3020 Handle errors in yamlToMeta 2020-06-29 17:06:29 +03:00
Nikolay Yakimov
f26923b9e4 Unify defaults and markdown metadata parsers 2020-06-29 17:06:29 +03:00
Nikolay Yakimov
11dc9f84f5
Remove obsolete RelaxedPolyRec extension (#6487) 2020-06-28 22:35:33 -07:00
John MacFarlane
8a1690dec1 PDF: all verbose output now goes to stderr, not stdout.
Closes #6483.
2020-06-28 12:11:23 -07:00
Albert Krewinkel
19175af811
JATS reader: parse abstract element into metadata field of same name (#6482)
Closes: #6480
2020-06-28 10:35:50 -07:00
Albert Krewinkel
d2d5eb8a99
Org reader: read #+INSTITUTE values as text with markup
The value is stored in the `institute` metadata field and used in the
default beamer presentation template.
2020-06-28 19:25:57 +02:00
Albert Krewinkel
e3a6d651e1
Org reader: update behavior of author, keywords export settings
The behavior of the `#+AUTHOR` and `#+KEYWORDS` export settings has
changed: Org now allows multiple such lines and adds a space between the
contents of each line. Pandoc now always parses these settings as meta
inlines; setting values are no longer treated as comma-separated lists.
Note that a Lua filter can be used to restore the previous behavior.
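
The difference between the two behaviors can be sketched in Python. This only models the string handling; the actual reader produces meta inlines, and both function names are hypothetical.

```python
def join_setting_lines(values):
    """New behavior: contents of repeated lines are joined with a single space."""
    return " ".join(value.strip() for value in values)


def split_comma_list(value):
    """Previous behavior: one value treated as a comma-separated list."""
    return [item.strip() for item in value.split(",") if item.strip()]


print(join_setting_lines(["pandoc, org", "export"]))  # -> pandoc, org export
print(split_comma_list("pandoc, org"))                # -> ['pandoc', 'org']
```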
2020-06-28 18:01:30 +02:00
Albert Krewinkel
54f6faa10f
Org reader: refactor export setting handling 2020-06-28 15:41:56 +02:00
Albert Krewinkel
8dce28d949
Org reader: read description lines as inlines
`#+DESCRIPTION` lines are treated as text with markup. If multiple such
lines are given, then all lines are read and separated by soft
linebreaks.

Closes: #6485
2020-06-27 09:11:00 +02:00
Albert Krewinkel
9e6e9a7221
Org reader: honor tex export option
The `tex` export option can be set with `#+OPTIONS: tex:nil` and allows
three settings:

 - `t` causes LaTeX fragments to be parsed as TeX or added as raw TeX,
 - `nil` removes all LaTeX fragments from the document, and
 - `verbatim` treats LaTeX as text.

The default is `t`.
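
The three settings amount to a small dispatch, sketched here in Python. The return values are simplified stand-ins (a tagged tuple or `None`), and `handle_latex_fragment` is a hypothetical name; the reader itself works on pandoc AST elements.

```python
def handle_latex_fragment(fragment, tex_setting="t"):
    """Decide what happens to a LaTeX fragment under the `tex` export option."""
    if tex_setting == "t":
        return ("tex", fragment)   # parsed as TeX or kept as raw TeX
    if tex_setting == "nil":
        return None                # fragment is removed from the document
    if tex_setting == "verbatim":
        return ("text", fragment)  # fragment is kept as plain text
    raise ValueError(f"unknown tex setting: {tex_setting!r}")


print(handle_latex_fragment(r"\alpha"))              # -> ('tex', '\\alpha')
print(handle_latex_fragment(r"\alpha", "nil"))       # -> None
print(handle_latex_fragment(r"\alpha", "verbatim"))  # -> ('text', '\\alpha')
```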

Closes: #4070
2020-06-25 20:31:33 +02:00
John MacFarlane
52ac585967 Remove redundant pattern match in pptx writer. 2020-06-23 13:04:42 -07:00
John MacFarlane
9b7282bb0f LaTeX reader: Retain the Div around tables with attributes.
We'll need this to store table attributes until all writers
are adjusted to react to attributes on the Table element.
2020-06-23 11:12:40 -07:00
John MacFarlane
ee782ccfec Markdown reader: Don't require blank line after grid table.
This fixes #6481, allowing grid tables to be enclosed
in fenced divs with no intervening blank lines.
2020-06-23 08:24:45 -07:00
John MacFarlane
7f8105159c Handle native Underline in Powerpoint writer.
(Instead of old Span with underline class.
Spans with `underline` will no longer be rendered
as underlined text.)
2020-06-22 17:56:28 -07:00
John MacFarlane
b1561d8e47 Use native Underline instead of Span in Jira 2020-06-22 17:55:57 -07:00
John MacFarlane
76fc51f2ba Use --enable-local-file-access in invoking wkhtmltopdf.
wkhtmltopdf changed in recent versions to require this for
access to local files.

This fixes PDF via HTML5 with `--css`.

Closes #6474.
2020-06-22 16:33:20 -07:00
Albert Krewinkel
f5d7d41cbd
Recognize images with uppercase extensions
Fixes: #6472
2020-06-20 18:14:18 +02:00
John MacFarlane
9d0506e404 LaTeX writer: escape ^ specially for listings.
Closes #6460.
2020-06-17 10:12:55 -07:00
John MacFarlane
a8b3117e04 RST reader: pass arbitrary attributes through in code blocks.
Exceptions: name (which becomes the id), class (which becomes the
classes), and number-lines (which is treated specially to fit
with pandoc highlighting).
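
A rough Python sketch of this mapping, assuming directive options arrive as a dictionary of strings. `code_block_attr` is a hypothetical name, and the special treatment of `number-lines` is deliberately left out; the sketch only excludes it from the pass-through attributes.

```python
SPECIAL_OPTIONS = {"name", "class", "number-lines"}


def code_block_attr(options):
    """Split directive options into (id, classes, other key-value attributes)."""
    ident = options.get("name", "")
    classes = options.get("class", "").split()
    # Everything that is not one of the exceptions is passed through unchanged.
    keyvals = [(k, v) for k, v in options.items() if k not in SPECIAL_OPTIONS]
    return ident, classes, keyvals


print(code_block_attr({"name": "ex1", "class": "python demo", "caption": "Example"}))
# -> ('ex1', ['python', 'demo'], [('caption', 'Example')])
```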

Closes #6465.
2020-06-17 09:57:56 -07:00
Michael Reed
bf95282436
Fix MIME type for TrueType fonts in EPUBs (#6464)
Per the EPUB 3.2 spec, "application/x-font-truetype" is no longer a
valid identifier for TrueType (.ttf) fonts [1]. This fixes warnings when
validating pandoc-generated EPUBs using `epubcheck` [2].

References [3].

[1]: https://www.w3.org/publishing/epub3/epub-spec.html#sec-core-media-types
[2]: https://github.com/w3c/epubcheck
2020-06-17 09:15:50 -07:00
Mathieu Boespflug
bbf04df900
Docbook reader: implement <procedure> (#6442)
A `<procedure>` contains a sequence of `<step>`'s, or `<substeps>`
that themselves contain `<step>`'s.
2020-06-14 10:45:52 -07:00