Commit graph

1553 commits

Author SHA1 Message Date
Albert Krewinkel
7c207c3051
Org reader: respect export setting which disables entities
MathML-like entities, e.g., `\alpha`, can be disabled with the
`#+OPTIONS: e:nil` export setting.
2020-06-30 11:39:32 +02:00
Albert Krewinkel
5ef315cc6d
Org reader: keep unknown keyword lines as raw org
The lines of unknown keywords, like `#+SOMEWORD: value` are no longer
read as metadata, but kept as raw `org` blocks. This ensures that more
information is retained when round-tripping org-mode files;
additionally, this change makes it possible to support non-standard org
extensions via filters.
2020-06-29 21:19:34 +02:00
Albert Krewinkel
90ac70c79c
Org reader: unify keyword handling
Handling of export settings and other keywords (like `#+LINK`) has been
combined and unified.
2020-06-29 20:53:25 +02:00
Albert Krewinkel
1480606174
Org reader: support LATEX_HEADER_EXTRA and HTML_HEAD_EXTRA settings
These export settings are treated like their non-extra counterparts,
i.e., the values are added to the `header-includes` metadata list.
2020-06-29 17:04:29 +02:00
Albert Krewinkel
d17b257c89
Org reader: allow multiple #+SUBTITLE export settings
The values of all lines are read as inlines and collected in the
`subtitle` metadata field.
2020-06-29 17:03:33 +02:00
Albert Krewinkel
19175af811
JATS reader: parse abstract element into metadata field of same name (#6482)
Closes: #6480
2020-06-28 10:35:50 -07:00
Albert Krewinkel
d2d5eb8a99
Org reader: read #+INSTITUTE values as text with markup
The value is stored in the `institute` metadata field and used in the
default beamer presentation template.
2020-06-28 19:25:57 +02:00
Albert Krewinkel
b7a8620b43
Org tests: group export settings test for Org reader 2020-06-28 19:25:57 +02:00
Albert Krewinkel
e3a6d651e1
Org reader: update behavior of author, keywords export settings
The behavior of the `#+AUTHOR` and `#+KEYWORDS` export settings has
changed: Org now allows multiple such lines and adds a space between the
contents of each line. Pandoc now always parses these settings as meta
inlines; setting values are no longer treated as comma-separated lists.
Note that a Lua filter can be used to restore the previous behavior.
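
A rough illustration of the difference, written as plain Python rather than pandoc code. Both function names are hypothetical, and plain strings stand in for pandoc's meta inlines:

```python
def join_setting_lines(values):
    """New behavior (sketch): the contents of all lines are joined by a space."""
    return " ".join(value.strip() for value in values)


def split_comma_list(value):
    """Assumed model of the previous behavior: one value split on commas."""
    return [part.strip() for part in value.split(",") if part.strip()]


# Two `#+AUTHOR` lines now yield one value; commas are kept as ordinary text.
print(join_setting_lines(["Jane Doe,", "John Roe"]))  # Jane Doe, John Roe
# The old reading would have produced a list of two entries instead.
print(split_comma_list("Jane Doe, John Roe"))  # ['Jane Doe', 'John Roe']
```

A filter restoring the old behavior would do, in essence, what `split_comma_list` does here.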
2020-06-28 18:01:30 +02:00
Albert Krewinkel
8dce28d949
Org reader: read description lines as inlines
`#+DESCRIPTION` lines are treated as text with markup. If multiple such
lines are given, then all lines are read and separated by soft
linebreaks.

Closes: #6485
2020-06-27 09:11:00 +02:00
Albert Krewinkel
9e6e9a7221
Org reader: honor tex export option
The `tex` export option can be set with `#+OPTIONS: tex:nil` and allows
three settings:

 - `t` causes LaTeX fragments to be parsed as TeX or added as raw TeX,
 - `nil` removes all LaTeX fragments from the document, and
 - `verbatim` treats LaTeX as text.

The default is `t`.
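
The three settings can be pictured with a small Python model. This is only a sketch: the function name and the `(kind, text)` result pairs are made up for illustration, and the `t` case is simplified to always produce raw TeX.

```python
def handle_latex_fragment(fragment, tex_option="t"):
    """Return a (kind, text) pair, or None if the fragment is dropped."""
    if tex_option == "t":
        # Simplification: the real reader may also parse the fragment as TeX.
        return ("raw_tex", fragment)
    if tex_option == "nil":
        # LaTeX fragments are removed from the document.
        return None
    if tex_option == "verbatim":
        # The fragment is kept as ordinary text.
        return ("text", fragment)
    raise ValueError(f"unknown tex option: {tex_option!r}")


print(handle_latex_fragment(r"\alpha"))              # default is `t`
print(handle_latex_fragment(r"\alpha", "nil"))       # None
print(handle_latex_fragment(r"\alpha", "verbatim"))
```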

Closes: #4070
2020-06-25 20:31:33 +02:00
John MacFarlane
9b7282bb0f LaTeX reader: Retain the Div around tables with attributes.
We'll need this to store table attributes until all writers
are adjusted to react to attributes on the Table element.
2020-06-23 11:12:40 -07:00
John MacFarlane
90b2c5a5e4 Add test for #6481. 2020-06-23 08:27:19 -07:00
John MacFarlane
7f8105159c Handle native Underline in Powerpoint writer.
(Instead of old Span with underline class.
Spans with `underline` will no longer be rendered
as underlined text.)
2020-06-22 17:56:28 -07:00
John MacFarlane
b1561d8e47 Use native Underline instead of Span in Jira 2020-06-22 17:55:57 -07:00
Albert Krewinkel
064303e2c9
Jira writer: always escape braces
Braces are now always escaped, even within words or when surrounded by
whitespace. Jira and Confluence treat braces specially.

Package jira-wiki-markup must be version 1.3.2 or later.

Fixes: #6478
2020-06-22 16:30:11 +02:00
Albert Krewinkel
f5d7d41cbd
Recognize images with uppercase extensions
Fixes: #6472
2020-06-20 18:14:18 +02:00
Mathieu Boespflug
bbf04df900
Docbook reader: implement <procedure> (#6442)
A `<procedure>` contains a sequence of `<step>`'s, or `<substeps>`
that themselves contain `<step>`'s.
2020-06-14 10:45:52 -07:00
Mathieu Boespflug
12a35dd0d0
Docbook: map <simplesect> to unnumbered section (#6436)
A <simplesect> is a section like any other, except that it never
contains a subsection, and is typically rendered unnumbered.
2020-06-14 10:40:00 -07:00
John MacFarlane
f6dfacf9d6 Add "summary" to list of block-level HTML tags.
Closes #6385.  (The summary element needs to be the first
child of details and should not be enclosed by p tags.)

NOTE:  you need to include a blank line before the closing
`</details>`, if you want the last part of the content to
be parsed as a paragraph.
2020-05-20 07:45:14 -07:00
Lila
c04800305e
Propagate (DY)LD_LIBRARY_PATH in tests (#6376) 2020-05-18 22:46:14 -07:00
Lila
f4185fcef0
Use CSS in favor of <br> for display math (#6372)
Some CSS to ensure that display math is
displayed centered and on a new line is now included
in the default HTML-based templates; this may be
overridden if the user wants a different behavior.
2020-05-18 22:45:44 -07:00
John MacFarlane
5a20cc07dd Docx writer: enable column and row bands for tables.
This change will not have any effect with the default style.
However, it enables users to use a style (via a reference.docx)
that turns on row and/or column bands.

Closes #6371.
2020-05-16 15:50:59 -07:00
John MacFarlane
be9e93d4ae LaTeX writer: create hypertarget for links with identifier.
Closes #6360.
2020-05-12 14:37:07 -07:00
John MacFarlane
46179d5b3e Use latest skylighting.
This adds `aria-hidden="true"` to the empty a elements, which
helps people who use screen readers.
2020-05-12 14:37:07 -07:00
Albert Krewinkel
9c76c52e9b
Lua: fix regression in package searcher
This caused `require 'module'` to fail for third party packages.

Fixes: #6361
2020-05-12 17:10:30 +02:00
andrebauer
97fe2ea16c
LaTeX Writer: Add support for customizable alignment of columns in beamer (#6331)
Add support for customizable alignment of columns in beamer.
Closes #4805, closes #4150.
2020-05-02 17:08:16 -07:00
Vaibhav Sagar
9c2b659eeb
Support new Underline element in readers and writers (#6277)
Deprecate `underlineSpan` in Shared in favor of `Text.Pandoc.Builder.underline`.
2020-04-28 07:53:06 -07:00
John MacFarlane
f268ae3035 RST writer: properly handle images with same alt text.
Previously we created duplicate references for these
in rendering RST.  Closes #6194.
2020-04-24 16:54:52 -07:00
John MacFarlane
6baacb51bb AsciiDoc writer: add blank line after Div.
Closes #6308.
2020-04-22 23:04:43 -07:00
Joe Hermaszewski
fd5994cc5e Haddock Writer: Support Haddock tables
See this PR on Haddock for details on the table format:
https://github.com/haskell/haddock/pull/718
2020-04-20 13:57:36 +08:00
John MacFarlane
aff2500d46 More fixes for round-trip tests of HTML reader.
We exclude tables that have default widths but non-simple
content, as these can't really round-trip.
2020-04-19 17:21:19 -07:00
John MacFarlane
573214a06a Fixed round-trip HTML tests.
Exclude tables with cells with line breaks because they don't
currently round-trip.  (Table goes from being simple to having
explicit widths.)
2020-04-18 20:57:28 -07:00
John MacFarlane
9a809d4d01 Markdown writer: avoid unnecessary escapes before intraword _
when `intraword_underscores` extension is enabled.
Closes #6296.
2020-04-17 22:42:21 -07:00
John MacFarlane
0d2b8e3fe1
Merge pull request #6211 from tarleb/lua-pandocerror
API change: create PandocLua type, use PandocError for exceptions
2020-04-17 18:02:25 -07:00
John MacFarlane
8f40b4ba14 LaTeX reader: don't put surrounding Div around Table.
This reverts a change in the last release; the Div is
no longer needed, because we can now put the id right in
the Table's attributes.  However, writers may still need
to be modified to do something with the id in a Table
(e.g. create an anchor), so in the short term we may lose
the ability to link to tables in some writers.
2020-04-17 13:04:15 -07:00
Albert Krewinkel
fb54f3d679
API change: use PandocError for exceptions in Lua subsystem
The PandocError type is used throughout the Lua subsystem, all Lua
functions throw an exception of this type if an error occurs. The
`LuaException` type is removed and no longer exported from
`Text.Pandoc.Lua`. In its place, a new constructor `PandocLuaError` is
added to PandocError.
2020-04-17 21:52:48 +02:00
despresc
2fc11f3b1e Modify toLegacyTable to cut up cells, add tests
Now a cell with dimension (h, w) will be cut up into h*w cells of
dimension (1,1), all in the same grid position, with the upper-left
holding the original cell contents and the rest being empty.
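
A minimal Python sketch of the cutting step, assuming cells are modelled as plain strings; `cut_up_cell` is a hypothetical name, not the Haskell function:

```python
def cut_up_cell(contents, height, width, empty=""):
    """Cut a cell of dimension (height, width) into height*width unit cells.

    Only the upper-left unit cell keeps the original contents.
    """
    return [
        [contents if (row, col) == (0, 0) else empty for col in range(width)]
        for row in range(height)
    ]


# A cell spanning 2 rows and 3 columns becomes six (1,1) cells.
for row in cut_up_cell("x", 2, 3):
    print(row)
```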
2020-04-15 23:03:22 -04:00
despresc
c7814f31e1 Use the new builders, modify readers to preserve empty headers
The Builder.simpleTable now only adds a row to the TableHead when the
given header row is not null. This uncovered an inconsistency in the
readers: some would unconditionally emit a header filled with empty
cells, even if the header was not present. Now every reader has the
conditional behaviour. Only the XWiki writer depended on the header
row being always present; it now pads its head as necessary.
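
The conditional behaviour can be sketched in Python as follows. The dictionary layout and the name `simple_table` are illustrative only and do not mirror the Haskell types:

```python
def simple_table(header, rows):
    """Build a table whose head gets a row only if a header row was given."""
    head = [header] if header else []  # an empty header row adds nothing
    return {"head": head, "body": rows}


print(simple_table(["a", "b"], [["1", "2"]])["head"])  # [['a', 'b']]
print(simple_table([], [["1", "2"]])["head"])          # []
```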
2020-04-15 23:03:22 -04:00
despresc
d368536a4e Adapt to the removal of the RowSpan, ColSpan, RowHeadColumns accessors 2020-04-15 23:03:22 -04:00
despresc
4e34d366df Adapt to the newest Table type, fix some previous adaptation issues
- Writers.Native is now adapted to the new Table type.

- Inline captions should now be conditionally wrapped in a Plain, not
  a Para block.

- The toLegacyTable function now lives in Writers.Shared.
2020-04-15 23:03:22 -04:00
despresc
7254a2ae0b Implement the new Table type 2020-04-15 23:03:22 -04:00
Nikolay Yakimov
83c1ce1d77
Markdown Reader: Fix inline code in lists (#6284)
Closes #6284.

Previously inline code containing list markers was sometimes parsed incorrectly.
2020-04-15 16:20:01 -07:00
John MacFarlane
71c4857464 JATS reader: handle "label" element in section title.
Closes #6288.
2020-04-15 09:23:04 -07:00
John MacFarlane
9187b4bca9 LaTeX writer: ensure that -M csquotes works even in fragment mode.
Closes #6265.
2020-04-11 10:40:59 -07:00
Tristan de Cacqueray
dd06d63540
HTML reader: support <bdo> (#6271)
See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdo

Closes #5794
2020-04-11 09:57:59 -07:00
Albert Krewinkel
c09a3448d1
Jira reader: improve icon conversion
Icons are now converted as follows: `(/)` to ✔, `(x)` to , `(!)` to
, `(+)` to , `(-)` to , `(off)` to 🌙, and `(*)` to ☆. The new
icons render well in most fonts. Furthermore, the UTF-8 characters all
fit into 4 bytes.

Closes: #6264
2020-04-09 16:21:45 +02:00
John MacFarlane
11df2a3c0f LaTeX reader: better handling of \lettrine.
- SmallCaps instead of Span for the part after the initial capital.
- Ensure that both arguments are parsed, so that in Markdown both
  are treated as raw LaTeX. (Closes #6258.)
2020-04-07 09:25:52 -07:00
Vlad Hanciuta
8dbd4938f2
Vimwiki reader: Add nested syntax highlighting (#6257)
Nested syntaxes are specified like this:
{{{sql
SELECT * FROM table
}}}

The preformatted code block parser has been extended to check if the
first attribute of the block is not a `key=value` pair, and in that case
it will be considered as a class.
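
The check can be approximated in Python like this. It is a sketch, not the parser itself: `parse_block_header` is a hypothetical helper that receives the text following `{{{`, and attribute values are assumed to contain no spaces.

```python
def parse_block_header(header):
    """Split the text after `{{{` into (classes, key-value attributes)."""
    tokens = header.split()
    classes, attrs = [], {}
    # A first token that is not a key=value pair is taken as a class.
    if tokens and "=" not in tokens[0]:
        classes.append(tokens.pop(0))
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            attrs[key] = value.strip('"')
    return classes, attrs


print(parse_block_header("sql"))
print(parse_block_header('class="python"'))
```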

Closes #6256.
2020-04-06 16:41:28 -07:00
Albert Krewinkel
663a5a9b7f
test/writer.jira: fix links, skip alias if it equals the target 2020-04-04 15:03:13 +02:00
Albert Krewinkel
c3f539364a
Jira: support citations, attachment links, and user links
Closes: #6231
Closes: #6238
Closes: #6239
2020-04-04 14:27:27 +02:00
Albert Krewinkel
d867cac8ca
Jira reader: resolve parsing issues of blockquote, color
Parsing problems occurring with block quotes and colored text have been
resolved.

Fixes: #6233
Fixes: #6235
2020-04-03 13:25:52 +02:00
John MacFarlane
92e0801daa Add test fixes for docbook writer changes. 2020-04-01 23:09:14 -07:00
Albert Krewinkel
7df0710094
Jira reader: use span with class underline for inserted text
Jira text which is marked as `+inserted+` is converted into pandoc's
default representation for underlined text: a span with class
`underline`. Previously, the span was marked with the non-standard class
`inserted`.

Closes: #6237
2020-03-31 10:04:55 +02:00
Albert Krewinkel
ff9be6b384
Jira writer: convert spans with class underline to inserted text
Spans with class `underline` are converted into Jira text marked as
`+inserted+`, i.e. surrounded by plus-signs.
2020-03-31 09:57:59 +02:00
Albert Krewinkel
9a42bec7fc
Jira writer tests: update image in test/writer.jira 2020-03-31 08:18:41 +02:00
Albert Krewinkel
69a3fa5708
Jira reader: retain image attributes
Jira image attributes as in `!image.jpg|align=right!` are retained as
key-value pairs. Thumbnail images, such as `!example.gif|thumbnail!`,
are marked by a `thumbnail` class in their attributes.

Related to #6234.
2020-03-30 22:03:52 +02:00
Joseph C. Sible
7233a7a932
More cleanup (#6209)
* Simplify by collapsing a do block into a single <$>
* Remove an unnecessary variable: `all` takes any Foldable, so only blocksToInlines needs toList.
2020-03-28 22:48:47 -07:00
Albert Krewinkel
44f8c2725e
Jira reader: fix parsing of tables without preceding blankline
A bug was fixed which caused faulty parsing if a table was not preceded
by a newline and the first table cell had no space after the initial `|`
characters.

Fixes: #6198
2020-03-19 21:27:35 +01:00
Albert Krewinkel
81d46435f6
Jira reader: fix parsing of strikeout, emphasis
A bug was fixed which caused non-emphasized text containing digits and/or
non-special symbols (like dots) to sometimes be parsed incorrectly.

Fixes: #6196
2020-03-18 21:32:05 +01:00
Albert Krewinkel
11b5f1e40b
Update copyright year (#6186)
* Update copyright year

* Copyright: add notes for Lua and Jira modules
2020-03-13 09:52:47 -07:00
Albert Krewinkel
7eb9914841
Jira reader: support colored inline text, indented lists
* Support for colored inlines has been added.
* Lists are now allowed to be indented; i.e., lists are still recognized
  if list markers are preceded by spaces.

Closes: #6183, #6184
2020-03-13 09:52:28 +01:00
John MacFarlane
87875763c8 Ms writer: fix definition lists so indent even when...
paragraph indent is set to 0 (as is the default).

Also ensure indent for display math that falls back
to TeX.
2020-03-07 21:54:29 -08:00
John MacFarlane
2ab83a56e6 Ms writer: use .QS/.QE instead of .RS/.RE for block quotes. 2020-03-06 09:45:46 -08:00
John MacFarlane
aaf296508a Fix man reader test for previous change. 2020-03-05 19:34:17 -08:00
John MacFarlane
0edc084c50 Revert "Allow specifying string value in metadata using !!literal tag."
This reverts commit 3493d6afaa.

This might be worth considering in the future, but let's not do
it yet...the additional complexity needs a better justification.
2020-02-17 15:58:21 -08:00
John MacFarlane
3493d6afaa Allow specifying string value in metadata using !!literal tag.
This is experimental.  Normally metadata values are interpreted
as markdown, but if the !!literal tag is used they will be interpreted
as plain strings.

We need to consider whether this can still be implemented if
we switch back from HsYAML to yaml for performance reasons.
2020-02-17 09:53:36 -08:00
Ethan Riley
daf770c1e9
Fixes: group biblatex citations even with prefix and suffix (#6058)
Closes #5849.  Previously biblatex citations were only grouped if
there was no prefix.  This patch allows them to be grouped in
subgroups split by prefixes and suffixes, which allows better citation
sorting.
2020-02-14 08:44:40 -08:00
Lucas Escot
29c2670da2
Add highlight directive to the rST reader (#6140) 2020-02-13 10:27:34 -08:00
Albert Krewinkel
f5ea5f0aad
Introduce new format variants for JATS (#6067)
New formats:

- `jats_archiving` for the "Archiving and Interchange Tag Set",
- `jats_publishing` for the "Journal Publishing Tag Set", and
- `jats_articleauthoring` for the "Article Authoring Tag Set."

The "jats" output format is now an alias for "jats_archiving".

Closes: #6014
2020-02-12 20:36:02 -08:00
John MacFarlane
3a79f37d88 LaTeX reader: improve caption and label parsing.
- Don't emit empty Span elements for labels.
- Put tables with labels in a surrounding Div.
2020-02-12 17:43:55 -08:00
John MacFarlane
3fbee8c6ed LaTeX reader: resolve \ref to table numbers.
Closes #6137.
2020-02-11 22:28:06 -08:00
John MacFarlane
114d77c2ab Fix spurious dots in markdown_mmd metadata output
Closes #6133 (regression).
2020-02-10 09:00:21 -08:00
John MacFarlane
5f0bd52221 reveal.js: ensure that pauses work even in title slides.
Closes #5819.
2020-02-08 09:38:07 -08:00
Joseph C. Sible
12c75701be
Use <$> instead of >>= and return (#6128) 2020-02-08 09:12:01 -08:00
John MacFarlane
4c3db9273f Apply linter suggestions. Add fix_spacing to lint target in Makefile. 2020-02-07 09:08:22 -08:00
Joseph C. Sible
a5a3ac9946
Various minor cleanups and refactoring (#6117)
* Use concatMap instead of reimplementing it

* Replace an unnecessary multi-way if with a regular if

* Use sortOn instead of sortBy and comparing

* Use guards instead of lots of indents for if and else

* Remove redundant do blocks

* Extract common functions from both branches of maybe

Whenever both the Nothing and the Just branch of maybe do the same
function, do that function on the result of maybe instead.

* Use fmap instead of reimplementing it from maybe

* Use negative forms instead of negating the positive forms

* Use mapMaybe instead of mapping and then using catMaybes

* Use zipWith instead of mapping over the result of zip

* Use unwords instead of reimplementing it

* Use <$ instead of <$> and const

* Replace case of Bool with if and else

* Use find instead of listToMaybe and filter

* Use zipWithM instead of mapM and zip

* Inline lambda wrappers into the real functions

* We get zipWithM from Text.Pandoc.Writers.Shared

* Use maybe instead of fromMaybe and fmap

I'm not sure how this one slipped past me.

* Increase a bit of indentation
2020-02-07 08:38:24 +01:00
John MacFarlane
0a4f49d370 MediaWiki writer: prevent triple [[[.
This confuses mediawiki's parser.  So we insert a `<nowiki/>`
no-op between a literal `[` and a link.  Closes #6119.
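
A toy Python rendering of the idea, assuming inlines are `(kind, text)` pairs and links use the `[[...]]` form; the function is hypothetical and far simpler than the writer:

```python
def render_inlines(inlines):
    """Render (kind, text) pairs, separating a literal `[` from a following link."""
    out = []
    for kind, text in inlines:
        if kind == "link":
            if out and out[-1].endswith("["):
                out.append("<nowiki/>")  # no-op that avoids a `[[[` sequence
            out.append("[[" + text + "]]")
        else:
            out.append(text)
    return "".join(out)


print(render_inlines([("str", "["), ("link", "Page"), ("str", "]")]))
# [<nowiki/>[[Page]]]
```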
2020-02-05 10:08:18 -08:00
John MacFarlane
9c4dc8b49b LaTeX reader: skip comments in more places where this is needed.
Closes #6114.
2020-02-05 09:48:42 -08:00
John MacFarlane
d9b1776336 Fix duplicate frame classes in LaTeX/Beamer output.
Close #6107.
2020-02-03 08:52:07 -08:00
John MacFarlane
fb3df6cf19 csv reader: allow empty cells. 2020-01-31 21:43:19 -08:00
John MacFarlane
f9514ccb9e Add Text.Pandoc.Readers.CSV (readCSV).
This adds csv as an input format.
The CSV table is converted into a pandoc simple table.

Closes #6100.
2020-01-31 21:14:21 -08:00
John MacFarlane
05a217091f Fix test suite for new skylighting.
Closes #6086.
2020-01-27 08:25:25 -08:00
Albert Krewinkel
672a4bdd1d Lua filters: allow filtering of element lists (#6040)
Lists of Inline and Block elements can now be filtered via `Inlines` and
`Blocks` functions, respectively. This is helpful if a filter conversion
depends on the order of elements rather than a single element.

For example, the following filter can be used to remove all spaces
before a citation:

    function isSpaceBeforeCite (spc, cite)
      return spc and spc.t == 'Space'
       and cite and cite.t == 'Cite'
    end

    function Inlines (inlines)
      for i = #inlines-1,1,-1 do
        if isSpaceBeforeCite(inlines[i], inlines[i+1]) then
          inlines:remove(i)
        end
      end
      return inlines
    end

Closes: #6038
2020-01-15 14:26:00 -08:00
John MacFarlane
157936c927 HTML writer: fix duplicate attributes on headings.
Another regression from 2.7.x. Closes #6062.
2020-01-12 15:18:37 -08:00
Albert Krewinkel
4ede11a75f
Lua: add methods insert, remove, and sort to pandoc.List
The functions `table.insert`, `table.remove`, and `table.sort` are added
to pandoc.List elements.  They can be used as methods, e.g.

    local numbers = pandoc.List {2, 3, 1}
    numbers:sort()     -- numbers is now {1, 2, 3}
2020-01-11 21:31:20 +01:00
Albert Krewinkel
e98921ae57
pandoc.List.lua: make pandoc.List a callable constructor
It is now possible to construct a new List via `pandoc.List()` instead of
`pandoc.List:new()`.
2020-01-11 19:38:27 +01:00
Albert Krewinkel
57637f8aae
Add tests for pandoc.List module 2020-01-11 17:42:32 +01:00
John MacFarlane
42b915e656 LaTeX reader: allow beamer overlays for all commands in all raw tex.
This affects parsing of raw tex in LaTeX and in Markdown and
other formats.

Closes #6043.
2020-01-10 08:26:33 -08:00
John MacFarlane
fc78be1140 LaTeX reader: improve parsing of raw environments.
If parsing fails in a raw environment (e.g. due to special
characters like unescaped `_`), try again as a verbatim
environment, which is less sensitive to special characters.

This allows us to capture special environments that change
catcodes as raw tex when `-f latex+raw_tex` is used.

Closes #6034.
2020-01-08 08:43:51 -08:00
Albert Krewinkel
bef0133fe0
Jira writer: fix output of table headers
Headers were missing from tables.

Fixes: #6035
2020-01-07 22:11:51 +01:00
John MacFarlane
ad5f7ecfce Fix regression in handling of columns in beamer slides.
Columns in title slides were causing problems with
slide division.  Closes #6033.
2020-01-07 11:11:23 -08:00
John MacFarlane
2a67b7aea9 Reveal.js writer: restore old behavior for 2D nesting.
The fix to #6030 actually changed behavior, so that the
2D nesting occurred at slide level N-1 and N, instead of
at the top-level section.  This commit restores the 2.7.3 behavior.
If there are more than 2 levels, the top level is horizontal
and the rest are collapsed to vertical.

Closes #6032.
2020-01-07 10:11:46 -08:00
John MacFarlane
66bca029c7 Fix regression in beamer slide structure with certain slide levels.
Closes #6030.
2020-01-05 11:10:07 -08:00
John MacFarlane
0dff9b3cd2 Fix revealjs slide structure regression with certain slide levels.
Partially addresses #6030.
2020-01-05 09:42:35 -08:00
Albert Krewinkel
4914855275
Fix test/tables.org
Editor usage mistake caused a broken reference file.
2020-01-03 08:38:58 +01:00
Albert Krewinkel
c3a00c1fdc
Org writer: remove extra spaces from table cells
Closes: #6024
2020-01-03 00:00:57 +01:00
John MacFarlane
2c1a911bc1 LaTeX writer: properly handle unnumbered headings level 4+.
Closes #6018.

Previously the `\paragraph` command was used instead of
`\paragraph*` for unnumbered level 4 headings.
2020-01-01 22:32:27 -07:00
Arfon Smith
5847624124 Update JATS dtd (#6020)
The current DTD for the JATS writer template is for Journal Publishing (JATS-journalpublishing1.dtd), which does not permit ext-link as a valid child (https://jats.nlm.nih.gov/publishing/tag-library/1.1/element/publisher-name.html).

This update modifies the default output template to be the less restrictive JATS archiving and interchange DTD which systems like PubMed use internally to represent their articles.
2019-12-30 22:18:55 -08:00
John MacFarlane
82f44592ae Fix parsing bug affecting indented code after raw HTML.
Closes #6009, #5360.
2019-12-27 13:08:22 -08:00
John MacFarlane
80a0e56a5d HTML reader tests: modify round-trip tests...
to avoid a special failure case involving makeSections.
2019-12-21 12:15:35 -08:00
Albert Krewinkel
2c13773be8
Jira writer: use jira-wiki-markup renderer
Pandoc's AST is translated into the Jira AST, which is then rendered by
the dedicated Jira printer.

The following improvements are included in this change:

- non-jira raw blocks are fully discarded instead of showing as blank
  lines;
- table cells can contain multiple blocks;
- unnecessary blank lines are removed from the output;
- markup chars within words are properly surrounded by braces;
- preserving soft linebreaks via `--wrap=preserve` is supported.

Note that backslashes are rendered as HTML entities, as there appears no
alternative to produce a plain backslash if it is followed by markup.
This may cause problems when used with confluence, where rendering seems
to fail in this case.

Closes: #5926
2019-12-20 17:12:46 +01:00
Albert Krewinkel
0a3cc7260c
Org reader: fix parsing problem for colons in headline
Fixed a problem where words surrounded by colons could cause parse
failures in some cases when they occurred in headers.

Fixes: #5993
2019-12-19 20:38:51 +01:00
Albert Krewinkel
5bd2d28b19
Org reader: wrap named table in div, using name as id
Closes: #5984
2019-12-18 22:28:26 +01:00
Albert Krewinkel
96c80b156d Add jira reader (#5913)
Closes #5556
2019-12-17 21:07:46 -08:00
John MacFarlane
5029233293 Adjust test to work with Windows (I hope). 2019-12-17 14:13:05 -08:00
John MacFarlane
d0918627ca Improved --toc generation. 2019-12-17 11:59:52 -08:00
John MacFarlane
20cf4e47b0 Improved makeSections so we don't get doubled attributes.
Closes #5986.
2019-12-17 11:59:52 -08:00
John MacFarlane
b0c5ecbb1a Added test for #5986. 2019-12-17 11:59:52 -08:00
John MacFarlane
204c7bb943 Add section-divs command test (failing). 2019-12-17 11:59:52 -08:00
Albert Krewinkel
75dc013036
Org reader: add table labels to caption if both are present
The table `#+NAME:` or `#+LABEL:` is added to the table's caption in the
form of an empty span with the label set as the span's ID.

Closes: #5984
2019-12-13 16:22:04 +01:00
Denis Maier
4e7ac069b9 ConTeXt template: Adjustments to title formatting (#5949)
Added `\setupinterlinespace` to `title`, `subtitle`, `date` and `author` elements.
Otherwise longer titles that run over multiple lines will look squashed as
`\tfd` etc. won't adapt the line spacing to the font size.
2019-12-11 08:28:11 -08:00
John MacFarlane
0b54d6282b Fix --toc-depth regression in 2.8.
Closes #5967.
2019-12-07 14:20:41 -08:00
Yihui Xie
ee44d44f20 Keep the \author{} command even if author is not specified (#5961)
Otherwise there will be a LaTeX warning "No \author given" when the .tex file is compiled.
This does not affect spacing in the title block.
2019-12-05 10:22:31 -08:00
John MacFarlane
992f77c17c Roll back part of --shift-heading-level-by change.
With positive heading shifts, starting in 2.8 this option caused
metadata titles to be removed and changed to regular headings.
This behavior is incompatible with the old behavior of
`--base-header-level` and breaks old workflows, so with this
commit we are rolling back this change.

Now, there is an asymmetry in positive and negative heading
level shifts:

+ With positive shifts, the metadata title stays the same and
  does not get changed to a heading in the body.
+ With negative shifts, a heading can be converted into the
  metadata title.

I think this is a desirable combination of features, despite
the asymmetry.  One might, e.g., want to have a document
with level-1 section headings, but render it to HTML with
level-2 headings, retaining the metadata title (which pandoc
will render as a level-1 heading with the default template).

Closes #5957.
Revises #5615.
2019-12-05 09:59:50 -08:00
John MacFarlane
64f78b3d5f HTML-based templates: Add CSS to suppress bullet on unordered task lists. 2019-12-05 09:36:34 -08:00
John MacFarlane
79a10388da HTML writer: add task-list class to ul if all elements are task list items.
This will allow styling unordered task lists in a way that omits
the bullet.
2019-12-05 09:32:40 -08:00
John MacFarlane
4489283b03 Fix makeSections so it doesn't turn column divs into sections. 2019-12-05 09:17:28 -08:00
John MacFarlane
ce0a4f8c47 RST writers: Use grid tables for 1-column tables.
With simple tables, we have a clash with heading syntax.
Closes #5936.
2019-11-25 07:31:28 -08:00
Albert Krewinkel
d948e13fea Jira writer: improve escaping of special chars (#5925)
Backslash-escaping is used instead of HTML entities, as escaped
characters are easier to read this way. Furthermore, Confluence, which
seems to use a subset of Jira markup, seems to get confused by HTML
entities.
2019-11-22 07:57:24 -08:00
John MacFarlane
c1b51b1282 Improve markdown escaping in list items.
Closes #5918.
2019-11-19 22:33:14 -08:00
Alexander Krotov
33d2a1a84f DokuWiki reader: parse markup inside monospace ('') (#5917)
Fixes #5916
2019-11-18 15:05:19 -08:00
John MacFarlane
16618f38fb Ms template: Use Palatino for default font.
This is less ugly than Times.
2019-11-16 19:32:59 -08:00
John MacFarlane
e9aea67032 Update ms writer test. 2019-11-16 19:22:51 -08:00
John MacFarlane
1bf17a7c87 Ms writer: boldface definition terms in DefinitionLists.
Like LaTeX, ConTeXt.
2019-11-16 19:01:53 -08:00
John MacFarlane
56543ec397 ms template: default to page numbers on bottom, no paragraph indent.
To be more like the default LaTeX output.
2019-11-16 18:51:22 -08:00
John MacFarlane
1b42e05bd9 ConTeXt template: add a saner default for page numbers.
Previously they appeared centered at the top of the page;
now we put them centered at the bottom, unless the `pagenumbering`
variable is set (this gives users full control over page
number format and position,
https://wiki.contextgarden.net/Command/setuppagenumbering)
2019-11-16 18:38:05 -08:00
John MacFarlane
8683cb596d ConTeXt writer: use braces, not start/stop, for inline language tags.
This prevents unwanted gobbling of spaces.
2019-11-16 17:31:47 -08:00
John MacFarlane
41d1ae0fdd Change styles in reference.docx.
All headings now have a uniform color.

Level-1 headings no longer set `w:themeShade="B5"`.

Level-2 headings are now 14 point rather than 16 point.

Level-3 headings are now 12 point rather than 14 point.

Level-4 headings are italic rather than bold.

Closes #5820.
2019-11-16 09:48:05 -08:00
John MacFarlane
1f69162ffd RST writer: Improve spacing for tables with no width information.
If a simple table would be too wide, we use a grid table.
The code for generating grid tables has been adjusted to
give more intelligent column widths when widths aren't
given. (This also affects the markdown writer.)

Closes #5899.
2019-11-15 23:09:53 -08:00
John MacFarlane
e8de53ce4a Change reference.docx to use more normal block quotes.
Indented left and right, same font and size.
Previously it was unindented, smaller font and different
typeface.

See #5820.
2019-11-14 22:20:58 -08:00
John MacFarlane
81fae63a54 Change optInputFiles to a Maybe [FilePath].
`Nothing` means: nothing specified.
`Just []` means: an empty list specified (e.g. in defaults).
Potentially these could lead to different behavior: see #5888.
2019-11-14 18:42:55 -08:00
John MacFarlane
a60eb60a3d Allow combining -Vheader-includes and --include-in-header.
Closes #5904.
2019-11-14 07:48:19 -08:00
John MacFarlane
3645f9babe Fixed some test locations and put test data files in extra-source-files. 2019-11-14 06:21:00 -08:00
John MacFarlane
a1f69b1c7d Fix regression preventing header-includes from being set using -V.
See #5904.
2019-11-14 05:48:20 -08:00
Albert Krewinkel
e43c2e75a1 RST writer: fix backslash escaping after strings
The check whether a complex inline element following a string must be
escaped, now depends on the last character of the string instead of the
first.

Fixes: #5906
2019-11-14 14:46:32 +01:00
John MacFarlane
cbcaf19174 Add test for #5881. 2019-11-13 17:07:44 -08:00
John MacFarlane
5c0b3743be Ensure there's a blank line before RST tables.
Closes #5898.
2019-11-13 10:10:55 -08:00
despresc
90e436d496 Switch to new pandoc-types and use Text instead of String [API change].
PR #5884.

+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings
  were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`,
  `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`,
  `manyUntilChar` have been added: these are like their
  unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems
  to be intended as an interface to Glob, which uses strings.
  It seems to be used only once in the package, in the EPUB writer,
  so that is not hard to change.
2019-11-12 16:03:45 -08:00
John MacFarlane
741b1f7fb4 Markdown reader: fix small super/subscript issue.
Superscripts and subscripts cannot contain spaces,
but newlines were previously allowed (unintentionally).
This led to bad interactions in some cases with footnotes.
E.g.

```
foo^[note]
bar^[note]
```

With this change newlines are also not allowed inside
super/subscripts.

Closes #5878.
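
The constraint can be illustrated with a small Python sketch. The regular
expression is an assumption made for illustration and is not pandoc's parser;
it only encodes the rule that a superscript body may contain neither spaces
nor newlines.

```python
import re

# Hypothetical pattern: a caret-delimited span whose body contains no
# whitespace (so neither spaces nor newlines) and no further carets.
SUPERSCRIPT = re.compile(r"\^([^\s^]+)\^")

def find_superscripts(text):
    """Return the bodies of all superscript-like spans in `text`."""
    return SUPERSCRIPT.findall(text)
```

With this pattern the two-footnote example above yields no superscript,
because the only text between the two carets spans a newline.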
2019-11-11 09:08:52 -08:00
Florian Beeres
bf2eb4f288 Change the implementation of htmlSpanLikeElements and implement <dfn> (#5882)
* Add HTML Reader support for `<dfn>`, parsing this as a Span with class `dfn`.
* Change `htmlSpanLikeElements` implementation to retain classes,
  attributes and inline content.
2019-11-11 08:55:58 -08:00
John MacFarlane
3bf5362898 DocBook reader: Fix bug with entities in mathphrase element.
Closes #5885.
2019-11-07 23:08:05 -08:00
John MacFarlane
9c7f75afb5 Change merge behavior for metadata.
Previously, if a document contained two YAML metadata blocks
that set the same field, the conflict would be resolved in favor
of the first. Now it is resolved in favor of the second (due to
a change in pandoc-types).

This makes the behavior more uniform with other things in pandoc
(such as reference links and `--metadata-file`).
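
A minimal Python sketch of the new merge rule, assuming metadata blocks can be
modelled as plain dictionaries (the function name `merge_metadata` is
hypothetical):

```python
def merge_metadata(blocks):
    """Merge metadata mappings in document order; later blocks win on conflict."""
    merged = {}
    for block in blocks:
        merged.update(block)  # a later value replaces an earlier one
    return merged
```

Fields that appear in only one block are kept regardless of order.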
2019-11-07 10:48:38 -08:00
Amogh Rathore
bd2bd9b19d HTML Reader/Writer - Add support for <var> and <samp> (#5861)
Closes #5799
2019-11-04 08:42:30 -08:00
John MacFarlane
530bfe5f5a Docx reader: fix list number resumption for sublists. Closes #4324.
The first list item of a sublist should not resume numbering
from the number of the last sublist item of the same level,
if that sublist was a sublist of a different list item.

That is, we should not get:

```
1. one
   1. sub one
   2. sub two
2. two
   3. sub one
```
2019-11-03 12:54:42 -08:00
Dmitry Pogodin
270ffe6ab5 Place caption before table in OpenDocument format. (#5869)
Closes #5681.
2019-11-03 07:17:05 -08:00
John MacFarlane
65593043c3 LaTeX reader: Fixed dollar-math parsing...
...to ensure that space is left between a control seq and
a following word that would otherwise change its meaning.

Closes #5836.
2019-11-02 12:44:48 -07:00
John MacFarlane
f39c44f0ba Add test for #5836. 2019-11-02 12:27:15 -07:00
John MacFarlane
6c9a20b2d3 Test for macro definitions in LaTeX preamble. 2019-11-02 11:08:26 -07:00
John MacFarlane
724fd655e7 Add test case for #5845. 2019-11-02 09:40:08 -07:00
Albert Krewinkel
4e902b1d60
Jira writer: remove extraneous newline after single-line block quotes
See #5858
2019-10-31 22:01:12 +01:00
John MacFarlane
4f1224dc0b Use latest doclayout.
Closes #5863.
2019-10-30 21:26:21 -07:00
Florian Klink
c6e936dec2 docbook reader: fix nesting of chapters and sections (#5864)
* Set dbBook to true when traversing a chapter too.
  Currently, a `<title/>` in a chapter and in a `<section/>` below that
  chapter have the same level if they're not inside a `<book/>`.

  This can happen in a multi-file book project. Also see the example at
  https://tdg.docbook.org/tdg/4.5/chapter.html

  Co-authored-by: Félix Baylac-Jacqué <felix@alternativebit.fr>

* Add docbook-chapter test

  This tests nested `<section/>` and makes sure `<title/>` in the first
  `<section/>` below `<chapter/>` is one level deeper than the `<chapter/>`'s
  `<title/>`, also when not inside a `<book/>`.

  Co-authored-by: Félix Baylac-Jacqué <felix@alternativebit.fr>
2019-10-30 08:51:33 -07:00
John MacFarlane
63cfd45406 T.P.W.Shared: Changed gridTables so it does better at...
...keeping the widths of columns.  See #4320.
Adjust test case for #4320.
2019-10-29 22:21:35 -07:00
John MacFarlane
1fe9742263 Changes to build with new doctemplates/doclayout.
The new version of doctemplates adds many features to pandoc's
templating system, while remaining backwards-compatible.
New features include partials and filters.  Using template filters,
one can lay out data in enumerated lists and tables.

Templates are now layout-sensitive: so, for example, if a
text with soft line breaks is interpolated near the end of
a line, the text will break and wrap naturally.  This makes
the templating system much more suitable for programmatically
generating markdown or other plain-text files from metadata.
2019-10-29 22:21:35 -07:00
John MacFarlane
4d5fd9e2fe Remove include of grffile from default latex template.
This package is needed for proper handling of image filenames
containing periods (in addition to the period before the
extension).

Unfortunately, grffile breaks in the latest texlive update.
Until a fix is released (see ho-tex/oberdiek#73) it seems best
to remove this from the default template.

This may cause problems if you have filenames with periods.
The workaround is to put `\usepackage{grffile}` in header-includes,
and be sure you're using an older version of texlive packages.

See #5848. We will leave that issue open to remind us to
check upstream, and restore grffile when it's possible to
do so.
2019-10-29 21:44:08 -07:00
John MacFarlane
a796b655cd Shared.makeSections: better behavior in some corner cases.
When a div surrounds multiple sections at the same level,
or a section of higher level followed by one of lower level,
then we just leave it as a div and create a new div for the
section.

Closes #5846, closes #5761.
2019-10-29 21:24:58 -07:00
John MacFarlane
47566817c5 Shared: improve isTight.
If a list has an empty item, this should not count against
its being a tight list.

Closes #5857.
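
A rough Python sketch of the rule. It assumes that a list item is a list of
`(kind, content)` blocks and that tightness is decided by the first block of
each item being `Plain`; both the representation and that criterion are
assumptions made for illustration.

```python
def is_tight(items):
    """A list is tight if every non-empty item starts with a Plain block."""
    def item_ok(blocks):
        if not blocks:
            return True  # an empty item does not count against tightness
        return blocks[0][0] == "Plain"
    return all(item_ok(blocks) for blocks in items)
```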
2019-10-28 21:14:01 -07:00
Albert Krewinkel
909083090a
Org reader: fix parsing of empty comment lines
Comment lines in Org-mode can be completely empty; both of these lines
should produce no output:

    # a comment
    #

The reader used to produce a wrong result for the latter, but ignores
that line as well now.

Fixes: #5856
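
The accepted line shapes can be sketched with a regular expression in Python.
The pattern is an assumption for illustration, not the reader's actual parser:
a comment is a `#` that is either alone on the line or followed by a space.

```python
import re

# Hypothetical pattern: optional indentation, '#', then either end of line
# or a space followed by arbitrary comment text.
ORG_COMMENT = re.compile(r"^[ \t]*#(?: .*)?$")

def is_comment_line(line):
    """True for '# text' and for a bare '#', false for keywords like '#+TITLE:'."""
    return ORG_COMMENT.match(line) is not None
```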
2019-10-27 23:00:30 +01:00
Ole Martin Ruud
45479114e8 HTML reader/writer: Better handling of <q> with cite attribute (#5837)
* HTML reader: Handle cite attribute for quotes.  If a `<q>` tag has a `cite` attribute, we interpret it as a Quoted element with an inner Span.  Closes #5798

* Refactor url canonicalization into a helper function

* Modify HTML writer to handle quote with cite.

[0]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q
2019-10-24 22:27:49 -07:00
Amogh Rathore
d50f46d26d Add Reader support for HTML <samp> element (#5843)
The `<samp>` element is parsed as a Span with class `sample`.
Closes #5792.
2019-10-23 08:44:24 -07:00
Florian B
a933ad30fb Add support for reading & writing <mark> elements
Parse <mark> elements from HTML as HTML span like elements, with a
single class matching the tag name `mark`. Mark elements are rendered to
HTML using the native <mark> element.

Fixes https://github.com/jgm/pandoc/issues/5797.
2019-10-16 09:08:12 -07:00
Daniele D'Orazio
1425bf9a65 Add support for reading and writing <kbd> elements
* Text.Pandoc.Shared: export `htmlSpanLikeElements` [API change]

This commit also introduces a mapping of HTML span like elements that
are internally represented as a Span with a single class, but that are
converted back to the original element by the html writer. As of now,
only the kbd element is handled this way. Ideally these elements should
be handled as plain AST values, but since that would be a breaking
change with a large impact, we revert to this stop-gap solution.

Fixes https://github.com/jgm/pandoc/issues/5796.
2019-10-15 09:49:09 -07:00
Alexander Krotov
a1977dd2d6 Muse reader: do not allow closing asterisks to be followed by "*" 2019-10-15 16:36:05 +03:00
John MacFarlane
a0aeb135b3 Minor template & test changes for latest dev doctemplates. 2019-10-14 23:42:29 -07:00
Alexander Krotov
d5c13dd438 Muse reader: do not split series of asterisks into symbols and emphasis
Fixes #5821
2019-10-15 01:55:32 +03:00
Alexander Krotov
13e0ac1104 Muse reader: do not terminate emphasis on "*" not followed by space 2019-10-15 01:02:54 +03:00
John MacFarlane
d9db76dcf4 LaTeX writer: fix horizontal rule.
We change to use 0.5pt rather than `\linethickness`, which
apparently only ever worked "by accident" and no longer works
with recent updates to texlive.

Closes #5801.
2019-10-12 23:20:57 -07:00
John MacFarlane
9e3e195dd4 Fix gfm_auto_identifiers behavior with emojis.
Closes #5813.

Note that we also now use emoji names for emojis
when `ascii_identifiers` is enabled.
2019-10-11 10:00:33 -07:00
John MacFarlane
7c2dd0359b Markdown writer: prefer pipe_tables to raw html...
...even when we must lose width information.

All in all this seems to be people's preferred behavior, even though it
is slightly lossier.

Closes #2608.
Closes #4497.
2019-10-11 09:46:58 -07:00
John MacFarlane
a3cd74c29b --metadata-file: when multiple files specified, second takes precedence...
on conflicting fields.  This changes earlier behavior (but not in
a release), where first took precedence.

Note that this may seem inconsistent with the behavior of
multiple YAML blocks within a document, where the first takes
precedence.  Still, it is convenient to be able to override
defaults with options later on the command line.
2019-10-10 10:00:45 -07:00
John MacFarlane
68b09a6d81 Make some writers sensitive to 'unlisted' class on headings.
If this is present on a heading with the 'unnumbered' class,
the heading won't appear in the TOC.  This class has no
effect if 'unnumbered' is not also specified.

This affects HTML-based writers (including slide shows
and epub), LaTeX (including beamer), RTF, and PowerPoint.
Other writers do not yet support `unlisted`.

Closes #1762.
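
The interaction of the two classes, sketched in Python (the function name
`appears_in_toc` is hypothetical):

```python
def appears_in_toc(classes):
    """Return False only when a heading is both 'unnumbered' and 'unlisted'."""
    return not ("unnumbered" in classes and "unlisted" in classes)
```

A heading that carries `unlisted` alone is therefore still listed.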
2019-10-10 09:15:40 -07:00
John MacFarlane
a3729ef2da RST writer: proper handling of :align: on figures, images.
When the image has the `align-right` (etc.) class, we now use
an `:align:` attribute.

Closes #4420.
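
A sketch in Python of how such a class might be mapped to an `:align:` value.
The set of accepted values is an assumption taken from the alignment names
docutils documents for images; the writer itself may accept a different set.

```python
# Assumed set of alignment values (docutils image directive).
ALIGN_VALUES = {"left", "center", "right", "top", "middle", "bottom"}
ALIGN_PREFIX = "align-"

def rst_image_options(classes):
    """Split image classes into an optional :align: value and the other classes."""
    align = None
    remaining = []
    for cls in classes:
        value = cls[len(ALIGN_PREFIX):] if cls.startswith(ALIGN_PREFIX) else None
        if align is None and value in ALIGN_VALUES:
            align = value
        else:
            remaining.append(cls)
    return align, remaining
```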
2019-10-09 15:05:22 -07:00
John MacFarlane
5ec9044288 Update s5 test for mathjax 3 change. 2019-10-09 14:32:30 -07:00
John MacFarlane
aceee9ca48 Options.WriterOptions: Change type of writerVariables to Context Text.
This will allow structured values.

[API change]
2019-10-09 11:01:33 -07:00
Alexander Krotov
6a9cafc67a hlint Muse reader tests 2019-10-04 18:28:53 +03:00
John MacFarlane
7caaa3d5d6 Minor ghc 8.8 fixups. 2019-10-03 22:41:24 -07:00
Nils Carlson
8028de3322 odt: Add external option for native numbering
This adds an external option, `+native_numbering`, to the
ODT writer enabling enumeration of figures and tables in
ODT output.
2019-09-24 15:23:59 -07:00
John MacFarlane
f223196c35 Man writer: suppress non-absolute link URLs.
Motivation: in a man page there's not much use for relative URLs,
which you can't follow.  Absolute URLs are still useful.  We previously
suppressed relative URLs starting with '#' (purely internal links),
but it makes sense to go a bit farther.

Closes #5770.
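
One way to express the absolute/relative distinction, sketched in Python. It
treats a URL as absolute when it has a scheme, which is an assumption about
the check and not a description of the writer's code.

```python
from urllib.parse import urlparse

def keep_link_url(url):
    """Keep a URL only if it is absolute, i.e. it has a scheme."""
    return bool(urlparse(url).scheme)
```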
2019-09-23 17:46:39 -07:00
John MacFarlane
e99050283e ConTeXt unit tests - tweak code property.
Inline code will never have two consecutive newlines.
We get a counterexample in this case

https://pipelines.actions.githubusercontent.com/bMXCpShstkkHbFPgw9hBRMWw2w9plyzdVM8r7CRPFBHFvidaAG/5cf52d2d-3804-412d-ae65-4f8c059b0fb7/_apis/pipelines/1/runs/116/signedlogcontent/39?urlExpires=2019-09-23T17%3A38%3A05.8358735Z&urlSigningMethod=HMACV1&urlSignature=Qtd6vnzqgSwXpAkIyp9DJY4Kn7GJzYMR8UDkLR%2FsMQY%3D

so for simplicity we just weed out code with newlines.
2019-09-23 15:03:26 -07:00
John MacFarlane
ba14649945 Improve test #5753 2019-09-22 22:00:20 -07:00
John MacFarlane
9abed45879 RST reader: Fixed parsing of indented blocks.
We were requiring consistent indentation, but this
isn't required by RST, as long as each nonblank
line of the block has *some* indentation.

Closes #5753.
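
The relaxed condition can be sketched in Python as follows; `is_indented_block`
is a hypothetical helper, not part of the reader.

```python
def is_indented_block(lines):
    """True if there is a nonblank line and each one starts with a space or tab."""
    nonblank = [line for line in lines if line.strip()]
    return bool(nonblank) and all(line[0] in " \t" for line in nonblank)
```

Lines indented by different amounts are accepted; only an unindented nonblank
line ends the block.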
2019-09-22 12:01:45 -07:00
John MacFarlane
d247e9f72e Make plain output plainer.
Previously we used the following Project Gutenberg conventions
for plain output:

- extra space before and after level 1 and 2 headings
- all-caps for strong emphasis `LIKE THIS`
- underscores surrounding regular emphasis `_like this_`

This commit makes `plain` output plainer. Strong and Emph
inlines are rendered without special formatting.  Headings
are also rendered without special formatting, and with only
one blank line following.

To restore the former behavior, use `-t plain+gutenberg`.

API change: Add `Ext_gutenberg` constructor to `Extension`.

See #5741.
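
The two inline conventions listed above can be sketched in Python. The
function and its `gutenberg` flag are hypothetical stand-ins for the effect of
the `gutenberg` extension; heading spacing is left out.

```python
def render_inline(kind, text, gutenberg=False):
    """Render a Strong or Emph inline for plain output."""
    if not gutenberg:
        return text              # plainer default: no special formatting
    if kind == "Strong":
        return text.upper()      # LIKE THIS
    if kind == "Emph":
        return "_" + text + "_"  # _like this_
    return text
```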
2019-09-22 11:33:09 -07:00
Nikolay Yakimov
5c5d1a65d9 [Docx Reader] Update tests
Notice this commit updates lists.docx. The old test file contained
references to "ListParagraph" style, which should never leak
outside of pandoc, so I'm not sure what that was supposed to test
for exactly.
2019-09-21 11:37:21 -07:00
Nikolay Yakimov
c113ca6717 [Docx Reader] Use style names, not ids, for assigning semantic meaning
Motivating issues: #5523, #5052, #5074

Style name comparisons are case-insensitive, since those are
case-insensitive in Word.

w:styleId will be used as style name if w:name is missing (this should
only happen for malformed docx and is kept as a fallback to avoid
failing altogether on malformed documents)

Block quote detection code moved from Docx.Parser to Readers.Docx

Code styles, i.e. "Source Code" and "Verbatim Char" now honor style
inheritance

Docx Reader now honours "Compact" style (used in Pandoc-generated docx).
The side-effect is that "Compact" style no longer shows up in
docx+styles output. Styles inherited from "Compact" will still
show up.

Removed obsolete list-item style from divsToKeep. That didn't
really do anything for a while now.

Add newtypes to differentiate between style names, ids, and
different style types (that is, paragraph and character styles)

Since docx style names can have spaces in them, and pandoc-markdown
classes can't, wherever a style name is used as a class name,
spaces are replaced with ASCII dashes `-`.

Get rid of extraneous intermediate types, carrying styleId information.
Instead, styleId is saved with other style data.

Use RunStyle for inline style definitions only (lacking styleId and styleName);
for Character Styles use CharStyle type (which is basically RunStyle with styleId
and StyleName bolted onto it).
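
Two of the rules above, case-insensitive style-name comparison and the
space-to-dash replacement for class names, can be sketched in Python. Both
helper names are hypothetical.

```python
def style_class(style_name):
    """Turn a docx style name into a class name by replacing spaces with '-'."""
    return style_name.replace(" ", "-")

def same_style(a, b):
    """Compare two style names case-insensitively."""
    return a.casefold() == b.casefold()
```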
2019-09-21 11:18:15 -07:00
Ben Steinberg
7389919bb4 Preserve built-in styles in DOCX with custom style (#5670)
This commit prevents custom styles on divs and spans from overriding
styles on certain elements inside them, like headings, blockquotes,
and links. On those elements, the "native" style is required for the
element to display correctly. This change also allows nesting of
custom styles; in order to do so, it removes the default "Compact"
style applied to Plain blocks, except when inside a table.
2019-09-20 22:13:29 -07:00
John MacFarlane
5a85789185 Remove admonition-title remnants.
Completes 8e01ccb41d
2019-09-19 16:09:38 -07:00
Albert Krewinkel
d0261d7387 Lua filters: allow passing of HTML-like tables instead of Attr (#5750)
Attr values can now be given as normal Lua tables; this can be used as a
convenient alternative to define Attr values, instead of constructing
values with `pandoc.Attr`. Identifiers are taken from the *id* field,
classes must be given as space separated words in the *class* field. All
remaining fields are included as misc attributes.

With this change, the following lines now create equal elements:

    pandoc.Span('test', {id = 'test', class = 'a b', check = 1})
    pandoc.Span('test', pandoc.Attr('test', {'a','b'}, {check = 1}))

This also works when using the *attr* setter:

    local span = pandoc.Span 'text'
    span.attr = {id = 'test', class = 'a b', check = 1}

Furthermore, the *attributes* field of AST elements can now be a plain
key-value table even when using the `attributes` accessor:

    local span = pandoc.Span 'test'
    span.attributes = {check = 1}   -- works as expected now

Closes: #5744
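
The normalization described above, sketched in Python for illustration only.
The triple layout mirrors `pandoc.Attr(identifier, classes, attributes)`;
converting the remaining values to strings is an assumption.

```python
def attr_from_table(table):
    """Normalize an HTML-like table into (identifier, classes, attributes)."""
    fields = dict(table)                       # copy so the input is not mutated
    identifier = fields.pop("id", "")
    classes = fields.pop("class", "").split()  # space-separated words
    attributes = {key: str(value) for key, value in fields.items()}
    return identifier, classes, attributes
```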
2019-09-15 12:11:58 -07:00
John MacFarlane
45b7636307 Revert "FB2 reader test: better diagnostics on failure."
This reverts commit c65af7d1a2.
2019-09-15 10:27:19 -07:00
John MacFarlane
c65af7d1a2 FB2 reader test: better diagnostics on failure. 2019-09-15 09:06:38 -07:00
John MacFarlane
88a0327579 FB2 reader test: Another attempt to fix test failure on GitHub CI. 2019-09-14 10:37:19 -07:00
John MacFarlane
7ecae69e27 Revert "FB2 reader test: filter CRs."
This reverts commit e35147d715.
2019-09-13 22:08:42 -07:00
John MacFarlane
e35147d715 FB2 reader test: filter CRs.
This may help with the test failure on GitHub CI.

b59e6d0376/checks
2019-09-13 16:50:00 -07:00
John MacFarlane
88dc6fac5d Add --shift-heading-level-by option.
Deprecate --base-heading-level.

The new option does everything the old one does, but also
allows negative shifts.  It also promotes the document
metadata (if not null) to a level-1 heading with a +1 shift,
and demotes an initial level-1 heading to document metadata
with a -1 shift. This supports converting documents that
use an initial level-1 heading for the document title.

Closes #5615.
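
A simplified Python sketch of the behavior described above. The block tuples
and the function name are hypothetical, and turning a heading that would fall
below level 1 into a paragraph is an assumption that the text above does not
state.

```python
def shift_headings(title, blocks, shift):
    """Return (title, blocks) after shifting heading levels by `shift`.

    Blocks are tuples: ("Header", level, text) or ("Para", text).
    """
    blocks = list(blocks)
    promoted = None
    if shift == 1 and title:
        # Promote the document title to a level-1 heading.
        promoted, title = ("Header", 1, title), None
    elif shift == -1 and blocks and blocks[0][:2] == ("Header", 1):
        # Demote an initial level-1 heading to the document title.
        title, blocks = blocks[0][2], blocks[1:]

    def shift_block(block):
        if block[0] != "Header":
            return block
        level = block[1] + shift
        # Assumption: a heading pushed below level 1 becomes a paragraph.
        return ("Header", level, block[2]) if level >= 1 else ("Para", block[2])

    shifted = [shift_block(block) for block in blocks]
    return title, ([promoted] if promoted else []) + shifted
```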
2019-09-10 23:16:13 -07:00
John MacFarlane
4778d03473 LaTeX reader: Fix parsing of optional arguments that contain braced text.
Closes #5740.
2019-09-09 21:33:16 -07:00
Brian Leung
0558ea9836 Org reader: modify handling of example blocks. (#5717)
* Org reader: allow the `-i` switch to ignore leading spaces.

* Org reader: handle awkwardly-aligned code blocks within lists.

Code blocks in Org lists must have their #+BEGIN_ aligned in a
reasonable way, but their other components can be positioned otherwise.
2019-09-08 22:34:10 -07:00
John MacFarlane
9f984ff26a Replace Element and makeHierarchical with makeSections.
Text.Pandoc.Shared:

+ Remove `Element` type [API change]
+ Remove `makeHierarchicalize` [API change]
+ Add `makeSections` [API change]
+ Export `deLink` [API change]

Now that we have Divs, we can use them to represent the structure
of sections, and we don't need a special Element type.
`makeSections` reorganizes a block list, adding Divs with
class `section` around sections, and adding numbering
if needed.

This change also fixes some longstanding issues recognizing
section structure when the document contains Divs.
Closes #3057, see also #997.

All writers have been changed to use `makeSections`.
Note that in the process we have reverted the change
c1d058aeb1
made in response to #5168, which I'm not completely
sure was a good idea.

Lua modules have also been adjusted accordingly.
Existing lua filters that use `hierarchicalize` will
need to be rewritten to use `make_sections`.
2019-09-08 22:20:19 -07:00
John MacFarlane
1ccff3339d Revert changes to hierarchicalizeWithIds.
Revert "hierarchicalize: ensure that sections get ids..."
This reverts commit 212406a61d.

Revert "Improve detection of headings in Divs by hierarchicalize."
This reverts commit 6e2cfd6c97.

Revert "Shared.hierarchicalize: improve handling of div and section structure."
This reverts commit 345b33762e.
2019-09-08 21:56:42 -07:00
John MacFarlane
212406a61d hierarchicalize: ensure that sections get ids...
even if they're in divs.  Improves #3057.
2019-09-06 09:05:52 -07:00
John MacFarlane
6e2cfd6c97 Improve detection of headings in Divs by hierarchicalize.
The structure

```
<h1>one</h1>
<div>
<h1>two</h1>
</div>
```

should create two coordinate sections, not a section with
a subsection.  Now it does.

Extends #3057.
2019-09-06 08:44:59 -07:00
John MacFarlane
345b33762e Shared.hierarchicalize: improve handling of div and section structure.
Previously Divs were opaque to hierarchicalize, so headings
inside divs didn't get into the table of contents, for
example (#3057).

Now hierarchicalize treats Divs as sections when appropriate.
For example, these structures both yield a section and a
subsection:

``` html
<div>
<h1>one</h1>
<div>
<h2>two</h2>
</div>
</div>
```
``` html
<div>
<h1>one</h1>
<div>
<h1>two</h1>
</div>
</div>
```

Note that

``` html
<h1>one</h1>
<div>
<h2>two</h2>
</div>
<h1>three</h1>
```

gets parsed as the structure

    one
      two
    three

which may not always be desirable.

Closes #3057.
2019-09-05 22:37:13 -07:00
John MacFarlane
381654a704 Add div.hanging-indent CSS to HTML templates. 2019-09-05 12:42:23 -07:00
John MacFarlane
bb362fd76c Add partial styles.html in HTML5 template.
Avoid duplication in HTML templates by using styles.html partial.
Change indentation of styles in template.
2019-09-05 12:39:50 -07:00
John MacFarlane
0e31483d43 asciidoc writer: don't include + in code blocks for regular asciidoc.
This is asciidoctor-specific.

Amends 98ee6ca289.
2019-09-04 14:57:22 -07:00
John MacFarlane
e4cca4cf67 Roff readers: better parsing of groups.
We now allow groups where the closing `\\}` isn't at the
beginning of a line.

Closes #5410.
2019-09-04 09:24:42 -07:00
John MacFarlane
513058a24e XML: change toEntities to emit numerical hex character references.
Previously decimal references were used.
But Polyglot Markup prefers hex.  See #5718.

This affects the output of pandoc with `--ascii`.
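
What a hexadecimal character reference looks like can be sketched in Python.
The function name echoes `toEntities`, but the implementation, including the
use of uppercase hex digits, is an assumption.

```python
def to_entities(text):
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:X};".format(ord(ch))
        for ch in text
    )
```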
2019-09-03 11:28:20 -07:00
John MacFarlane
6b286a1d74 LaTeX reader: don't try to parse includes if raw_tex is set.
When the `raw_tex` extension is set, we just carry through
`\usepackage`, `\input`, etc. verbatim as raw LaTeX.

Closes #5673.
2019-09-02 21:03:05 -07:00
John MacFarlane
d79242796b HTML writer: use numeric character references with --ascii.
Previously we used named character references with html5 output.
But these aren't valid XML, and we aim to produce html5 that is
also valid XHTML (polyglot markup).  (This is also needed for
epub3.)

Closes #5718.
2019-09-02 20:36:57 -07:00
John MacFarlane
5e708eb8ce LaTeX reader: properly handle optional arguments for macros.
Closes #5682.
2019-09-02 18:48:37 -07:00
John MacFarlane
fba1296fd1 LaTeX reader: fix \\ in \parbox inside a table cell.
Closes #5711.
2019-08-27 10:48:02 -07:00
John MacFarlane
167fc4bc87 Markdown reader: Headers: don't parse content over newline boundary.
Closes #5714.
2019-08-27 10:15:00 -07:00
Jesse Rosenthal
4a7dad18b1 PowerPoint writer: Start numbering at appropriate numbers.
Starting numbers for ordered lists were previously ignored. Now we
specify the number if it is something other than 1.

Closes: #5709
2019-08-27 01:24:41 -04:00
John MacFarlane
180f534d21 Add test for issue #5708. 2019-08-26 15:20:22 -07:00
John MacFarlane
1ee6e0e087 Use new doctemplates, doclayout.
+ Remove Text.Pandoc.Pretty; use doclayout instead. [API change]
+ Text.Pandoc.Writers.Shared: remove metaToJSON, metaToJSON'
  [API change].
+ Text.Pandoc.Writers.Shared: modify `addVariablesToContext`,
  `defField`, `setField`, `getField`, `resetField` to work with
  Context rather than JSON values. [API change]
+ Text.Pandoc.Writers.Shared: export new function `endsWithPlain` [API
  change].
+ Use new templates and doclayout in writers.
+ Use Doc-based templates in all writers.
+ Adjust three tests for minor template rendering differences.
+ Added indentation to body in docbook4, docbook5 templates.

The main impact of this change is better reflowing of content
interpolated into templates.  Previously, interpolated variables
were rendered independently and interpolated as strings, which could lead
to overly long lines.  Now the templates are interpolated as Doc values
which may include breaking spaces, and reflowing occurs
after template interpolation rather than before.
2019-08-25 14:24:31 -07:00
Owen McGrath
92debe4b9e Change optMetadataFile type from Maybe to List (#5702)
Changed optMetadataFile from `Maybe FilePath` to `[FilePath]`. This allows
for multiple YAML metadata files to be added. The new default value has
been changed from `Nothing` to `[]`.

To account for this change in `Text.Pandoc.App`, `metaDataFromFile` now
operates on two `mapM` calls (for `readFileLazy` and `yamlToMeta`) and a fold.

Added a test (command/5700.md) which tests this functionality and
updated MANUAL.txt, as per the contributing guidelines.

With the current behavior, using `foldr1 (<>)`, values within files
specified first will be used over those in later files. (If the reverse
of this behavior would be preferred, it should be fixed by changing
foldr1 to foldl1.)
2019-08-24 09:41:25 -07:00
John MacFarlane
9d581428f9 Add test for #5690. 2019-08-23 10:15:42 -07:00
John MacFarlane
1c71bd1ff5 Ensure proper nesting when we have long ordered list markers.
Closes #5705.
2019-08-23 09:16:54 -07:00
Albert Krewinkel
2712d3e869
Lua: traverse nested blocks and inlines in correct order
Traversal methods are updated to use the new Walk module such that
sequences with nested Inline (or Block) elements are traversed in the
order in which they appear in the linearized document.

Fixes: #5667
2019-08-16 20:52:15 +02:00
John MacFarlane
79a3449eeb LaTeX reader: improve withRaw so it can handle cases where...
the token string is modified by a parser (e.g. accent when
it only takes part of a Word token).

Closes #5686.  Still not ideal, because we get the whole
`\t0BAR` and not just `\t0` as a raw latex inline command.
But I'm willing to let this be an edge case, since you
can easily work around this by inserting a space, braces,
or raw attribute.  The important thing is that we no longer
drop the rest of the document after a raw latex inline
command that gobbles only part of a Word token!
2019-08-14 14:34:44 -07:00
John MacFarlane
eb23527121 Rename test for 5685 -> 5684 (typo in last commit).
Closes #5684. (Note that #5685 is NOT closed by previous commit.)
2019-08-14 11:13:18 -07:00
John MacFarlane
0b2fb9b8f9 Add thin space when needed in LaTeX quote ligatures.
Closes #5685.
2019-08-14 11:07:02 -07:00
Jan-Otto Kröpke
a0a41c7a8e JIRA writer: Remove escapeStringForJira for code blocks 2019-08-11 21:57:12 +02:00
John MacFarlane
1afea63308 Update muse template to handle multiple authors better. 2019-07-28 19:25:44 -07:00
John MacFarlane
b35fae6511 Use doctemplates 0.3, change type of writerTemplate.
* Require recent doctemplates.  It is more flexible and
  supports partials.
* Changed type of writerTemplate to Maybe Template instead
  of Maybe String.
* Remove code from the LaTeX, Docbook, and JATS writers that looked in
  the template for strings to determine whether it is a book or an
  article, or whether csquotes is used. This was always kludgy and
  unreliable.  To use csquotes for LaTeX, set `csquotes` in your
  variables or metadata. It is no longer sufficient to put
  `\usepackage{csquotes}` in your template or header includes.
  To specify a book style, use the `documentclass` variable or
  `--top-level-division`.
* Change template code to use new API for doctemplates.
2019-07-28 19:25:45 -07:00
Philip Pesca
01a52e2300 HTML writer: ensure TeX formulas are rendered correctly (#5658)
The web service passed in to `--webtex` may render formulas using inline
or display style by default. Prefixing formulas with the appropriate
command ensures they are rendered correctly.

This is a followup to the discussion in #5656.
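
A Python sketch of the prefixing idea. `\textstyle` for inline formulas comes
from #5656; using `\displaystyle` for display formulas, the helper name, and
the service URL used in the example are assumptions.

```python
from urllib.parse import quote

def webtex_url(base_url, formula, display):
    """Build an image URL, forcing inline or display style explicitly."""
    prefix = r"\displaystyle " if display else r"\textstyle "
    return base_url + quote(prefix + formula)
```

For example, `webtex_url("https://example.org/tex?", "x^2", False)` gives
`https://example.org/tex?%5Ctextstyle%20x%5E2`.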
2019-07-24 10:10:36 -07:00
Philip Pesca
bc508534a6 HTML writer: render inline formulas correctly with --webtex (#5656)
We add `\textstyle` to the beginning of the formula to ensure it will be rendered in inline style.
Closes #5655.
2019-07-23 12:21:31 -07:00
John MacFarlane
db5f6dd4fe Fix error introduced in change to test for 4669. 2019-07-22 15:32:49 -07:00
John MacFarlane
d9960244d8 LaTeX reader: support tex \tt command.
Closes #5654.
2019-07-22 15:29:07 -07:00
Albert Krewinkel
63c65c89da
Org reader: accept ATTR_LATEX in block attributes
Attributes for LaTeX output are accepted as valid block attributes;
however, their values are ignored.

Fixes: #5648
2019-07-22 08:12:22 +02:00
John MacFarlane
91d4283263 LaTeX writer: fix line breaks at start of paragraph.
Previously we just omitted these. Now we render them
using `\hfill\break` instead of `\\`.  This is a revision
of a PR by @sabine (#5591) who should be credited with the
idea.

Closes #3324.
2019-07-20 17:12:53 -07:00
John MacFarlane
465eeece6b LaTeX reader: search for image with list of extensions...
like latex does, if an extension is not provided.

Closes #4933.
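
The lookup can be sketched in Python. The extension list and its order are
hypothetical, and `exists` is an injected predicate (for example
`os.path.exists`), so the sketch does not touch the file system by itself.

```python
import os.path

def resolve_image(path, exists, extensions=(".pdf", ".png", ".jpg")):
    """Return `path`, or the first `path + ext` for which `exists` is true."""
    if os.path.splitext(path)[1]:
        return path              # an extension was given: use it as is
    for ext in extensions:
        candidate = path + ext
        if exists(candidate):
            return candidate
    return path                  # nothing found: leave the name unchanged
```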
2019-07-20 10:17:49 -07:00
John MacFarlane
339392bf54 Markdown: Ensure that expanded latex macros end with space if original did.
Closes #4442.
2019-07-19 10:32:59 -07:00
Agustín Martín Barbero
bd69218451 Change order of ilvl and numId in document.xml (#5647)
Workaround for Word Online shortcoming. Fixes #5645

Also, make list para properties go first.

This reordering of properties shouldn't be necessary but
it seems Word Online does not understand the docx correctly otherwise.
2019-07-19 09:32:43 -07:00
John MacFarlane
28cad16517 Markdown writer: prefer using raw_attribute when enabled.
The `raw_attribute` will be used to mark raw bits, even HTML
and LaTeX, and even when `raw_html` and `raw_tex` are enabled,
as they are by default.

To get the old behavior, disable `raw_attribute` in the writer.

Closes #4311.
2019-07-18 22:31:03 -07:00
John MacFarlane
5c655e86d5 HTML writer: ensure that line numbers in code blocks get id-prefix.
Closes #5650.
2019-07-18 22:08:37 -07:00
John MacFarlane
0d72237e27 Dokuwiki writer: handle mixed lists without HTML fallback.
Closes #5107.
2019-07-16 13:14:37 -07:00
Karl Pettersson
5303791bc4 Customizable type of PDF/A for the ConTeXt writer (issue #5608) (#5610)
* Let the user choose type of PDF/A generated with ConTeXt (closes #5608)
* Updated ConTeXt test documents for changes in tagging
* Updated color profile settings in accordance with ConTeXt wiki
* Made ICC profile and output intent for PDF/A customizable
* Read pdfa variable from meta (and updated manual)
2019-07-15 11:55:04 -07:00
John MacFarlane
968d2046a3 Update test for new skylighting. 2019-07-14 10:48:14 -07:00
Alexander Krotov
0713cb65bc Muse: add RTL support
Closes #5551
2019-07-14 18:22:52 +03:00
Vasily Alferov
f6c92c7523 Fix #4499: add mbox and hbox handling to LaTeX reader (#5586)
When `+raw_tex` is enabled, these are passed through literally.
Otherwise, they are handled in a way that emulates LaTeX's behavior.
2019-07-13 16:55:41 -07:00
John MacFarlane
7bc9eab846
Merge pull request #5589 from blmage/fix-3992
Add support for EPUB2 covers (fix #3992)
2019-07-13 16:48:09 -07:00
John MacFarlane
4a5e727c8c Man writer: Improved definition list term output.
Now we boldface code but not other things. This matches the
most common style in man pages (particularly option lists).

Also, remove a regression in the last commit in which 'nowrap'
was removed.
2019-07-13 16:41:43 -07:00
John MacFarlane
d0bf7efe95 Man writer: fixed boldfacing of definition terms.
Previously the bold-facing would be interrupted by
other formatting, because we used `.B`.

Closes #5620.
2019-07-13 16:12:28 -07:00
John MacFarlane
a16311c225
Merge pull request #5606 from blmage/odt-frames
Improve the parsing of frames in ODT documents
2019-07-13 15:53:58 -07:00
John MacFarlane
1784161946 LaTeX reader: Properly handle \providecommand and environment...
They are now ignored if the corresponding command or environment
is already defined.

Closes #5635.
2019-07-13 15:51:33 -07:00
mb21
6cf5c3f6ac fix filename and issue reference of previous commit 2019-07-13 12:03:45 +02:00
John MacFarlane
6d30d3e0b3 Pass through aria- attributes to HTML5.
Also document addition of data- prefix to unknown attributes.

Closes #5646.
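
The naming rule can be sketched in Python. The set of known attributes is a
small hypothetical subset chosen for the example; the writer's real list is
not given here.

```python
# Hypothetical subset of attribute names that are passed through unchanged.
KNOWN_ATTRIBUTES = {"id", "class", "title", "lang", "href", "src", "alt"}

def html5_attribute_name(name):
    """Pass through known, data- and aria- attributes; prefix others with data-."""
    if name in KNOWN_ATTRIBUTES or name.startswith(("data-", "aria-")):
        return name
    return "data-" + name
```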
2019-07-12 17:03:01 -07:00
Brian Leung
1d9ff85b45 RST reader: keep name property in imgAttr. (#5637)
Closes #5619.
2019-07-10 18:35:01 -07:00
Arfon Smith
020e2a06d5 Updating JATS template to v1.1dtd (#5632)
* Updating JATS template to v1.1dtd

* Update writer.jats
2019-07-06 23:31:02 +02:00
Brian Leung
9c4ba81357 Markdown reader: handle inline code more eagerly within lists. (#5628)
Closes #5627.
2019-07-06 23:14:21 +02:00