Commit graph

372 commits

Author SHA1 Message Date
John MacFarlane
8a5b9ac868 Add test for relative file: URI to #5517. 2019-05-28 09:05:28 -07:00
Mauro Bieg
214da7217b Fix handling of file: URL scheme in downloadOrRead (#5522)
Move up the pattern match to be reachable, closes #5517.

Previously `file:/` URLs were handled wrongly and pandoc attempted
to make HTTP requests, which failed.
2019-05-28 11:51:21 -04:00
Alexander Krotov
7514277454 HTML reader: trim definition list terms 2019-05-25 18:36:56 +03:00
John MacFarlane
aef71894ce Markdown writer: Ensure the code fence is long enough.
Previously too few backticks were used when the code block
contained an indented line of backticks.  (Ditto tildes.)

Cloess #5519.
2019-05-22 15:21:15 -07:00
Jesse Rosenthal
ed73bd28e5 Markdown writer: Handle labels with integer names
Previously if labels had integer names, it could produce a conflict
with auto-labeled reference links. Now we test for a conflict and find
the next available integer.

Note that this involves adding a new state variable `stPrevRefs` to
keep track of refs used in other document parts when using
`--reference-location=block|section`

Closes #5495
2019-05-21 12:19:59 -04:00
Albert Krewinkel
da9638e6a3
Org writer: always indent src blocks content by 2 spaces
Emacs always uses two spaces when indenting the content of src blocks,
e.g., when exiting a `C-c '` edit-buffer. Pandoc used to indent contents
by the space-equivalent of one tab, but now always uses two spaces, too.

Closes: #5440
2019-05-12 14:49:52 +02:00
John MacFarlane
a20323033e Fix footnote in image caption.
Regression!  The fix for #4683 broke this case.
2019-05-01 16:56:37 -07:00
John MacFarlane
f11d0c9dc8 HTML: prevent gratuitious emojification on iOS.
iOS chooses to render a number of Unicode entities,
including '↩', as big colorful emoji.  This can be
defeated by appending Unicode
VARIATION SELECTOR-15'/'VARIATION SELECTOR-16'.
So we now append this character when escaping
strings, for both '↩' and '↔'.

If other characters prove problematic, they can
simply be added to needsVariationSelector.

Closes #5469.
2019-04-30 22:32:52 -07:00
John MacFarlane
e409509a68 RST writer: treat Span as transparent.
Previously an Emph inside a Span was being treated as
nested markup and ignored.  With this patch, the Span
is just ignored.

Closes #5446.
2019-04-15 09:48:11 -07:00
John MacFarlane
23df94e30a Update command test #5416 to make it windows friendly 2019-04-02 17:59:47 -07:00
Mauro Bieg
0fa6951dc1 Dokuwiki Reader fix: parse single curly brace (#5417)
fixes #5416
2019-04-01 11:36:47 -06:00
John MacFarlane
93ee73e1dc LaTeX writer: Avoid inadvertently creating ? or ! ligatures.
These are upside down ? and !, resp.

Closes #5407.
2019-03-29 10:04:22 -07:00
John MacFarlane
40865958ce Markdown reader: fenced div takes priority over setext header.
For

    ::: {.cell}
    ---
    :::
2019-03-28 17:39:22 -07:00
John MacFarlane
1e60776226 LaTeX writer: Fix footnotes in table caption and cells.
This fixes a bug wherein footnotes appeared in the wrong
order, and with duplicate numbers, when in table captions
and cells.

We now use regular `\footnote` commands, even in the table
caption and the minipages containing cells. Apparently
longtable knows how to handle this.

Closes #5367.
2019-03-22 11:55:41 -07:00
John MacFarlane
6be8f4e953 Improved fix to #5340 and added test. 2019-03-18 16:53:36 -07:00
John MacFarlane
3880a23de9 Properly escape attributes in Markdown writer.
Closes #5369.
2019-03-17 18:15:47 -07:00
John MacFarlane
ebd7035a2a Add test case for #5368. 2019-03-17 18:02:59 -07:00
John MacFarlane
0bed0ab5a3 Use XDG data directory for user data directory.
Instead of `$HOME/.pandoc`, the default user data directory is
now `$XDG_DATA_HOME/pandoc`, where `XDG_DATA_HOME` defaults to
`$HOME/.local/share` but can be overridden by setting the environment
variable.

If this directory is missing, then `$HOME/.pandoc` is searched
instead, for backwards compatibility.  However, we recommend
moving local pandoc data files from `$HOME/.pandoc` to
`$HOME/.local/share/pandoc`.

On Windows the default user data directory remains the same.

Closes #3582.
2019-03-02 15:03:59 -08:00
John MacFarlane
ba05e1ea02 Shared.compactify: Avoid mixed lists.
This improves on the original fix to #5285 by preventing
other mixed lists (lists with a mix of Plain and Para
elements) that were allowed given the original fix.
2019-02-25 17:33:54 -08:00
John MacFarlane
38c028bd50 JATS reader: fix parsing of figures.
This ensures that a figure containing a single image
is parsed as a pandoc "implicit figure" (i.e., a
Para with a single Image whose title attribute begins
with `fig:`).  More complex figures will still be parsed
as divs.

Closes #5321.
2019-02-23 15:40:06 -07:00
John MacFarlane
d7d1c9c8e4 Markdown reader: fix bug parsing fenced code blocks.
Previously parsing would break if the code block
contained a string of backticks of sufficient length
followed by something other than end of line.

Closes #5304.
2019-02-15 22:34:32 -08:00
John MacFarlane
47537d26db Improve tight/loose list handling.
Closes #5285. Previously the algorithm allowed list items
with a mix of Para and Plain, which is never wanted.

compactify in T.P.Shared has been modified so that, if
a list's items contain (at the top level) Para elements
(aside from perhaps at the very end), ALL Plains are
converted to Paras.
2019-02-08 23:16:01 -08:00
John MacFarlane
ccf4e23ee1 Markdown reader: add newline when parsing blocks in YAML.
Otherwise last block gets parsed as a Plain rather than
a Para.

This is a regression in pandoc 2.x.  This patch restores
pandoc 1.19 behavior.

Closes #5271.
2019-02-04 10:22:02 -08:00
John MacFarlane
b74267406b Update test for last commit. 2019-02-02 16:20:06 -08:00
John MacFarlane
633a9ecfec LaTeX writer: avoid {} after control sequences when escaping.
`\ldots{}.` doesn't behave as well as `\ldots.` with the latex
ellipsis package.  This patch causes pandoc to avoid emitting
the `{}` when it is not necessary.  Now `\ldots` and other
control sequences used in escaping will be followed by either
a `{}`, a space, or nothing, depending on context.

Thanks to Elliott Slaughter for the suggestion.
2019-02-01 21:17:46 -08:00
John MacFarlane
e752669e50 LaTeX reader: don't let \egroup match {.
`braced` now actually requires nested braces.
Otherwise some legitimate command and environment
definitions can break (see test/command/tex-group.md).
2019-01-31 22:50:51 -08:00
John MacFarlane
5ddd7b121e LaTeX reader: support \endinput. Closes #5233. 2019-01-22 21:39:26 -08:00
John MacFarlane
f86ac89383 HTML and markdown: treat textarea as a verbatim environment.
We don't want to parse its contents as Markdown or HTML.

Closes #5241.
2019-01-21 20:54:12 -08:00
Brian Leung
35971495ab RST reader: change treatment of number-lines directives. (#5207)
Directives of this type without numeric inputs should not have a
`startFrom` attribute; with a blank value, the writers can produce
extra whitespace.
2019-01-09 22:19:26 -08:00
John MacFarlane
8673eb079b Removed superfluous sourceCode class on code blocks.
* These were added by the RST reader and, for literate Haskell,
  by the Markdown and LaTeX readers.  There is no point to
  this class, and it is not applied consistently by all readers.
  See #5047.

* Reverse order of `literate` and `haskell` classes on code blocks
  when parsing literate Haskell. Better if `haskell` comes first.
2019-01-08 11:36:33 -08:00
Mauro Bieg
f1d83aea12 Implement task lists (#5139)
Closes #3051
2019-01-02 11:36:37 -08:00
John MacFarlane
ea8af33dab Commonmark writer: fix handling of SoftBreak with hard_line_breaks.
This should be rendered as a space.
Closes #5195.
2019-01-02 10:31:13 -08:00
John MacFarlane
ffc2192caf Simplify/fix reading of --metadata values on command line.
Previously we used HsYAML's decodeStrict to recognize
boolean values (treating everything else as a string).
This caused problems relating to hvr/HsYAML#7.

We now just check for the recognized boolean values
`true|True|TRUE|false|False|FALSE`, and avoid using
HsYAML.

Closes #5177.
2018-12-31 21:20:56 -08:00
leungbk
c998b937c1 Org writer: preserve line-numbering for example and code blocks. 2018-12-28 15:07:05 +01:00
John MacFarlane
d5e68d43be RST writer: don't wrap simple table header lines.
Closes #5128.
2018-12-05 17:10:33 -08:00
John MacFarlane
38200c0291 Strip out illegal XML characters in escapeXMLString.
Closes #5119.
2018-12-04 09:24:15 -08:00
John MacFarlane
4060df6891 Markdown writer: include needed whitespace after HTML figure.
We use HTML for a figure in markdown dialects that can't
represent it natively.

Closes #5121.
2018-12-03 15:10:13 -08:00
John MacFarlane
83c0789205 Added test for #5053.
Note that the fix for #5099 also fixes #5053, a pandoc 2.4
regression in parsing underscore emphasis after symbols.
2018-11-25 22:50:16 -08:00
John MacFarlane
edc651059e Fix parsing of citations and quotes after parentheses.
Starting with pandoc 2.4, citations and quoted inlines
were no longer recognized after parentheses.  This is
because of commit 9b0bd4ec6f,
which is reverted here.

The point of that commit was to allow relocation of
soft line breaks to before an abbreviation, so that
a nonbreaking space could be added after the
abbreviation.  Now we simply leave the soft line
break in place, even though this means that
we won't get a nonbreaking space after "Mr."
at the end of a line (and in LaTeX this may
result in a longer intersentential space).
Those who care about this issue should take care
not to end lines with an abbreviation, or to
insert nonbreaking spaces manually.

Closes #5099.
2018-11-25 22:29:54 -08:00
John MacFarlane
d532eb14eb HTML reader: allow tfoot before body rows.
Closes #5079.
2018-11-16 11:29:15 -08:00
John MacFarlane
e61f632531 HTML reader: parse <small> as a Span with class "small".
Closes #5080.
2018-11-15 22:36:01 -08:00
John MacFarlane
e61d1d0da9 Asciidoc writer: Render Spans using [#id .class]#contents#.
See #5080.
2018-11-15 22:29:15 -08:00
John MacFarlane
1a102c11a9 Fix test case for #5014. 2018-11-13 14:50:26 -08:00
John MacFarlane
1cfdd3662f HTML reader: allow thead containing a row with td rather than th.
See #5014.

Note that this doesn't address the original issue in #5014,
only an unrelated side-issue.
2018-11-13 14:49:12 -08:00
John MacFarlane
52a57a5362 LaTeX writer: don't emit [<+->] unless beamer output,
even if `writerIncremental` is True.

See #5072.
2018-11-12 09:43:12 -08:00
John MacFarlane
5bc38a741b Exactly match GitHub's identifier generating algorithm.
See #5057.
2018-11-11 20:45:38 -08:00
John MacFarlane
a36d202e86 Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier.
The parameter is Extensions. This allows these functions to
be sensitive to the settings of `Ext_gfm_auto_identifiers` and
`Ext_ascii_identifiers`.

This allows us to use `uniqueIdent` in the CommonMark reader,
replacing some custom code.

It also means that `gfm_auto_identifiers` can now be used
in all formats.

Semantically, `gfm_auto_identifiers` is now a modifier of
`auto_identifiers`; for identifiers to be set, `auto_identifiers`
must be turned on, and then the type of identifier produced
depends on `gfm_auto_identifiers` and `ascii_identifiers` are set.

Closes #5057.
2018-11-11 13:46:23 -08:00
John MacFarlane
5f030f3c2c Add command test for #5050. 2018-11-06 22:57:11 -08:00
quasicomputational
a747268823 CommonMark writer: respect --ascii (#5043) 2018-11-05 09:33:10 -08:00
John MacFarlane
511d647290 XML: toHtml5Entities: prefer shorter entities...
when there are several choices for a particular character.
2018-11-04 22:15:53 -08:00