Commit graph

290 commits

Author SHA1 Message Date
John MacFarlane
b9602766d8 Textile reader: fixed tables with no body rows.
Previously these raised an exception.

Closes #4513.
2018-03-30 14:56:36 -07:00
John MacFarlane
5a79948e0c Mediawiki reader: improve table parsing.
This fixes detection of table attributes and also
handles `!` characters in cells.

Closes #4508.
2018-03-28 08:59:34 -07:00
Marc Schreiber
155a2ac039 Add support to parse unit string of \SI command (closes #4296). 2018-03-17 20:59:20 -07:00
Francesco Occhipinti
e5845f33ad Don't wrap lines in grid tables when --wrap=none (#4320)
* Annotate gridTable code with comments and abstract small functions
* Don't wrap lines in tables when `--wrap=none`.  Instead, expand cells, even if
   it results in cells that don't respect relative widths or surpass page column width.
* This change affects RST, Markdown, and Haddock writers.
2018-03-17 20:31:43 -07:00
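As an illustration of the `--wrap=none` behavior described in e5845f33ad, here is a minimal sketch of a grid table whose columns expand to fit their longest cell instead of wrapping. It is not pandoc's writer code: `grid_table` is a hypothetical helper, and it omits header rows, multi-line cells, and relative widths.

```python
def grid_table(rows):
    """Render rows of strings as a grid table without wrapping any cell.

    Hypothetical helper for illustration only: every column is widened to
    its longest cell, so no line breaks are introduced inside cells.
    """
    ncols = len(rows[0])
    # Column width = longest cell in that column (no page-width limit).
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    sep = "+" + "+".join("-" * (w + 2) for w in widths) + "+"
    lines = [sep]
    for row in rows:
        cells = (" " + cell.ljust(w) + " " for cell, w in zip(row, widths))
        lines.append("|" + "|".join(cells) + "|")
        lines.append(sep)
    return "\n".join(lines)
```

Because widths come only from cell contents, the resulting table may exceed the page column width, which is the trade-off the commit message describes.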
John MacFarlane
b76c0e6a4a RST reader: Allow unicode bullet characters.
Closes #4454.
2018-03-14 17:33:00 -07:00
John MacFarlane
17725a0661 Beamer: put hyperlink after \begin{frame}.
and not in the title.  If it's in the title, then we get
a titlebar on slides with the `plain` attribute, when
the id is non-null.  This fixes a regression from 1.9.x.

Closes #4307.
2018-03-13 10:03:51 -07:00
John MacFarlane
adefd86cd4 LaTeX reader: Fix regression in package options including underscore.
Closes #4424.
2018-03-02 09:33:18 -08:00
John MacFarlane
377640402f LaTeX reader: Fixed comments inside citations. Closes #4374. 2018-02-17 23:06:54 -08:00
Henri Menke
751b5ad010 ConTeXt writer: new section syntax and --section-divs (#4295)
Fixes #2609.

This PR introduces the new-style section headings: `\section[my-header]{My Header}` -> `\section[title={My Header},reference={my-header}]`.

On top of this, the ConTeXt writer now supports the `--section-divs` option to write sections in the fenced style, with `\startsection` and `\stopsection`.
2018-01-25 11:56:28 -08:00
John MacFarlane
ac08a887cf Markdown reader: Fix parsing bug with nested fenced divs.
Closes #4281.

Previously we allowed "nonindent spaces" before the
opening and closing `:::`, but this interfered with
list parsing, so now we require the fences to be
flush with the margin of the containing block.
2018-01-20 14:44:08 -08:00
John MacFarlane
957c0e110d RST reader: fix parsing of headers with trailing space.
This was a regression in pandoc 2.0.

Closes #4280.
2018-01-20 11:10:09 -08:00
John MacFarlane
ca8cd38bdc Markdown reader: don't coalesce adjacent raw LaTeX blocks...
if they are separated by a blank line.

See lierdakil/pandoc-crossref#160 for motivation.
2018-01-17 09:22:35 -08:00
John MacFarlane
615a99c2c2 RST reader: add aligned environment when needed in math.
rst2latex.py uses an align* environment for math in
`.. math::` blocks, so this math may contain line breaks.
If it does, we put the math in an `aligned` environment
to simulate rst2latex.py's behavior.

Closes #4254.
2018-01-14 15:11:11 -08:00
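The rule in 615a99c2c2 can be sketched roughly as follows. This is an assumption-laden illustration rather than the reader's actual code: `wrap_math` is a hypothetical name, and "line break" is taken here to mean a LaTeX row separator (two backslashes).

```python
def wrap_math(tex: str) -> str:
    """Wrap math in an `aligned` environment only when it has line breaks.

    Assumption: a line break is a LaTeX row separator (two backslashes).
    """
    if "\\\\" in tex:
        return "\\begin{aligned}\n" + tex + "\n\\end{aligned}"
    # Single-line math is passed through unchanged.
    return tex
```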
John MacFarlane
d9584d73f9 Markdown reader: Improved inlinesInBalancedBrackets.
The change both improves performance and fixes a
regression whereby normal citations inside inline notes
were not parsed correctly.

Closes jgm/pandoc-citeproc#315.
2018-01-14 12:24:21 -08:00
John MacFarlane
e7d95cadf5 LaTeX reader: pass through macro defs in rawLaTeXBlock...
even if the `latex_macros` extension is set.
This reverts to earlier behavior and is probably safer
on the whole, since some macros only modify things in
included packages, which pandoc's macro expansion can't
modify.

Closes #4246.
2018-01-13 22:12:32 -08:00
John MacFarlane
dca0032b0e LaTeX reader: allow macro definitions inside macros.
Previously we went into an infinite loop with

```
\newcommand{\noop}[1]{#1}
\noop{\newcommand{\foo}[1]{#1}}
\foo{hi}
```

See #4253.
2018-01-13 12:22:25 -08:00
John MacFarlane
49007ded7b RST reader: better handling for headers with an anchor.
Instead of creating a div containing the header, we put
the id directly on the header. This way header promotion
will work properly. Closes #4240.
2018-01-10 12:07:33 -08:00
John MacFarlane
13f7c2cf83 Fixed a test case so it works on windows too. 2018-01-09 17:45:02 -08:00
John MacFarlane
e3f01235e9 HTML writer: Fixed footnote backlinks with --id-prefix.
Closes #4235.
2018-01-09 15:29:27 -08:00
John MacFarlane
3c93ac5cf0 LaTeX reader: be more tolerant of & character.
This allows us to parse unknown tabular environments
as raw LaTeX.  Closes #4208.
2017-12-28 08:41:53 -08:00
John MacFarlane
acfa846aab Merge pull request #4184 from mb21/html-reader-figcaption
HTML Reader: be more forgiving about figcaption
2017-12-27 13:33:00 -07:00
John MacFarlane
a888083ee1 HTML reader: parse div with class line-block as LineBlock.
See #4162.
2017-12-27 12:26:15 -08:00
John MacFarlane
9e1d86638c LaTeX reader: support \foreignlanguage from babel. 2017-12-26 10:57:57 -08:00
John MacFarlane
ee5fe9bf2c RST reader: allow empty list items (as docutils does).
Closes #4193.
2017-12-24 13:02:18 -08:00
mb21
9b54b94612 HTML Reader: be more forgiving about figcaption
fixes #4183
2017-12-23 09:42:04 +01:00
John MacFarlane
28b736bf95 latex_macros extension changes.
Don't pass through macro definitions themselves when `latex_macros`
is set.  The macros have already been applied.

If `latex_macros` is enabled, then `rawLaTeXBlock` in
Text.Pandoc.Readers.LaTeX will succeed in parsing a macro definition,
and will update pandoc's internal macro map accordingly, but the
empty string will be returned.

Together with earlier changes, this closes #4179.
2017-12-22 18:03:51 -08:00
John MacFarlane
4a07977715 Markdown reader: improved raw tex parsing.
+ Preserve original whitespace between blocks.
+ Recognize `\placeformula` as context.
2017-12-22 18:03:51 -08:00
John MacFarlane
9758720a24 RST writer: fix anchors for headers.
We were missing an `_`.
See #4188.
2017-12-22 10:36:37 -08:00
Alexander Krotov
d035689a06 Org writer: do not wrap "-" to avoid accidental bullet lists
Also add TODO for ordered lists.
2017-12-21 16:36:29 +03:00
Alexander Krotov
1e21cfb251 Muse writer: don't wrap note references to the next line
Closes #4172.
2017-12-19 13:30:48 +03:00
Alexander Krotov
ef8430e702 Fix for #4171 fix: don't wrap note references after SoftBreak 2017-12-19 13:30:48 +03:00
John MacFarlane
c0cc9270cb Org writer: don't allow fn refs to wrap to beginning of line.
Otherwise they can be interpreted as footnote definitions.

Closes #4171.
2017-12-18 16:33:52 -08:00
John MacFarlane
808f6d3fa1 OPML reader: enable raw HTML and other extensions by default for notes.
This fixes a regression in 2.0.

Note that extensions can now be individually disabled, e.g.
`-f opml-smart-raw_html`.

Closes #4164.
2017-12-17 09:52:53 -08:00
John MacFarlane
044d58bb24 Fixed regression in LaTeX tokenization.
This mainly affects the Markdown reader when parsing
raw LaTeX with escaped spaces.  Closes #4159.
2017-12-15 09:45:29 -08:00
John MacFarlane
b94f1e2045 RST reader: more accurate parsing of references.
Previously we erroneously included the enclosing
backticks in a reference ID (closes #4156).

This change also disables interpretation of
syntax inside references, as in docutils.
So, there is no emphasis in

    `my *link*`_
2017-12-14 12:48:43 -08:00
John MacFarlane
7093a3b44c Markdown: Improved computation of relative cell widths in pipe tables. 2017-12-12 15:36:29 -08:00
John MacFarlane
67b6abc806 LaTeX reader: fix \ before newline.
This should be a nonbreaking space, as long as it's not
followed by a blank line. This has been fixed at the tokenizer
level.

Closes #4134.
2017-12-08 16:34:15 -08:00
John MacFarlane
f6007e7146 Markdown reader: accept processing instructions as raw HTML.
Closes #4125.
2017-12-06 16:05:50 -08:00
John MacFarlane
6a2562efb5 Rewrite empty_paragraphs test so it will run on Windows. 2017-12-04 15:41:09 -08:00
John MacFarlane
fac3953abf Markdown reader: Don't parse native div as table caption.
Closes #4119.
2017-12-04 15:04:47 -08:00
John MacFarlane
ae60e0196c Add empty_paragraphs extension.
* Deprecate `--strip-empty-paragraphs` option.  Instead we now
  use an `empty_paragraphs` extension that can be enabled on
  the reader or writer.  By default, disabled.

* Add `Ext_empty_paragraphs` constructor to `Extension`.

* Revert "Docx reader: don't strip out empty paragraphs."
  This reverts commit d6c58eb836.

* Implement `empty_paragraphs` extension in docx reader and writer,
  opendocument writer, html reader and writer.

* Add tests for `empty_paragraphs` extension.
2017-12-04 14:56:57 -08:00
John MacFarlane
03496d1810 Test for #4113.
Closes #4113.
2017-12-03 20:15:40 -08:00
John MacFarlane
1193c1a505 LaTeX writer: allow specifying just width or height for image size.
Previously both needed to be specified (unless the image was
being resized to be smaller than its original size).

If height but not width is specified, we now set width to
textwidth (and similarly if width but not height is specified).
Since we have keepaspectratio, this yields the desired result.
2017-12-01 21:18:29 -08:00
John MacFarlane
b2a190546d Revert "LaTeX writer: Add keepaspectratio to includegraphics..."
This reverts commit 171187a452.
2017-12-01 13:51:33 -08:00
John MacFarlane
171187a452 LaTeX writer: Add keepaspectratio to includegraphics...
...if only one of height/width is given.
2017-11-30 16:03:28 -08:00
John MacFarlane
03ddac451e Support beamer \alert in LaTeX reader. Closes #4091. 2017-11-29 21:30:13 -08:00
John MacFarlane
508aab0bd5 Text.Pandoc.Parsing.uri: allow & and = as word characters.
This fixes a bug where pandoc would stop parsing a URI with an
empty attribute:  for example, `&a=&b=` would stop at `a`.
(The uri parser tries to guess which punctuation characters
are part of the URI and which might be punctuation after it.)

Closes #4068.
2017-11-14 22:08:14 -08:00
John MacFarlane
51897937cd LaTeX reader: allow optional arguments on \footnote.
Closes #4062.
2017-11-13 21:19:38 -08:00
John MacFarlane
8d6e0e516a Markdown writer: fix bug with doubled footnotes in grid tables.
Closes #4061.
2017-11-13 21:12:04 -08:00
John MacFarlane
eeaa3b048c LaTeX reader: support column specs like *{2}{r}.
This is equivalent to `rr`.  We now expand it like a macro.

Closes #4056.
2017-11-12 14:46:29 -08:00
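The macro-like expansion mentioned in eeaa3b048c can be pictured with a short sketch. It is not the LaTeX reader's implementation: `expand_colspec` is a hypothetical helper that rewrites `*{n}{spec}` into `n` copies of `spec` using a regular expression.

```python
import re

# Matches *{n}{spec} where spec contains no further braces (innermost first).
_REPEAT = re.compile(r"\*\{(\d+)\}\{([^{}]*)\}")


def expand_colspec(spec: str) -> str:
    """Expand repeated column specs, e.g. `*{2}{r}` becomes `rr`.

    Hypothetical helper for illustration; nested repeats are expanded
    inside-out by re-running the substitution until nothing changes.
    """
    while True:
        expanded = _REPEAT.sub(lambda m: m.group(2) * int(m.group(1)), spec)
        if expanded == spec:
            return expanded
        spec = expanded
```

For example, `expand_colspec("l*{3}{c}r")` yields `lcccr`.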
John MacFarlane
7ba0ae8b4d LaTeX reader: allow optional args for parbox.
See #4056.
2017-11-12 14:19:58 -08:00
John MacFarlane
fb5ba1bb00 Fixed YAML metadata with "chomp" (|-).
Previously if a YAML block under `|-` contained
a blank line, pandoc would not parse it as metadata.
2017-11-11 10:17:53 -05:00
John MacFarlane
1592d38821 Allow fenced code blocks to be indented 1-3 spaces.
This brings our handling of them into alignment with
CommonMark's.

Closes #??.
2017-11-09 23:22:44 -05:00
John MacFarlane
fef5770591 Fix regression with --metadata.
It should replace a metadata value set in the document
itself, rather than creating a list including a new value.

Closes #4054.
2017-11-08 21:54:23 -08:00
John MacFarlane
8e53489cbc Fix strikethrough in gfm writer.
Previously we got a crash, because we were trying to print
a native cmark STRIKETHROUGH node, and the commonmark writer
in cmark-github doesn't support this.  Work around this by
using a raw node to add the strikethrough delimiters.

Closes #4038.
2017-11-04 10:35:52 -07:00
John MacFarlane
642d603666 Improved support for columns in HTML.
* Move as much as possible to the CSS in the template.
* Ensure that all the HTML-based templates (including epub)
  contain the CSS for columns.
* Columns default to 50% width unless they are given a width
  attribute.

Closes #4028.
2017-11-02 20:57:05 -07:00
John MacFarlane
6d00e6e8c3 Fixed revealjs slide column width issues.
* Remove "width" attribute which is not allowed on div.
* Remove space between `<div class="column">` elements,
  since this prevents columns whose widths sum to 100%
  (the space takes up space).

Closes #4028.
2017-11-02 10:23:04 -07:00
John MacFarlane
ed3d466384 Really fix #3989.
The previous fix only worked in certain cases.
Other cases with `>` in an HTML attribute broke.
2017-11-01 09:27:51 -07:00
John MacFarlane
f1ebdb8145 Updated command test for #3989.
We didn't fix it completely before.
2017-11-01 09:15:15 -07:00
John MacFarlane
fb6e5812bc Fixed regression in parsing of HTML comments in markdown...
and other non-HTML formats (`Text.Pandoc.Readers.HTML.htmlTag`).
The parser stopped at the first `>` character, even if it wasn't
the end of the comment.

Closes #4019.
2017-10-31 21:14:38 -07:00
John MacFarlane
0e57b8b85d Add Millimeter constructor to Dimension in ImageSize.
Minor API change.

Now sizes given in 'mm' are no longer converted to 'cm'.

Closes #4012.
2017-10-31 11:58:43 -07:00
John MacFarlane
5f9f458df3 LaTeX reader: handle % comment right after command.
For example

    \emph%
    {hi}
2017-10-31 11:31:35 -07:00
John MacFarlane
556c6c2c6d Markdown reader: make sure fenced div closers work in lists.
Previously the following failed:

    ::: {.class}
    1. one
    2. two
    :::

and you needed a blank line before the closing `:::`.
2017-10-31 10:57:20 -07:00
John MacFarlane
81610144f9 Make fenced_divs affect the Markdown writer.
If `fenced_divs` is enabled, fenced divs will be used.
2017-10-31 10:57:20 -07:00
John MacFarlane
244b42dbaf Added failing command test for #4007. 2017-10-30 11:04:40 -07:00
John MacFarlane
513b16a71b Fenced divs: ensure that paragraph at end doesn't become Plain.
Added test case.
2017-10-24 09:53:29 -07:00
John MacFarlane
ecb5475a2a Back to using [WARNING] and [INFO] to mark messages. 2017-10-23 23:01:37 -07:00
John MacFarlane
fda0c0119f Implemented fenced Divs.
+ Added Ext_fenced_divs to Extensions (default for pandoc Markdown).
+ Document fenced_divs extension in manual.
+ Implemented fenced code divs in Markdown reader.
+ Added test.

Closes #168.
2017-10-23 22:45:28 -07:00
John MacFarlane
896803b0d5 HTML reader: htmlTag improvements.
We previously failed on cases where an attribute contained a `>`
character. This patch fixes the bug.

Closes #3989.
2017-10-23 17:29:32 -07:00
John MacFarlane
1a82ecbb68 More pleasing presentation of warnings and info messages.
!! warning
-- info
2017-10-23 15:00:11 -07:00
John MacFarlane
cecf02e326 Fixed test for change in log level. 2017-10-23 11:20:22 -07:00
mb21
e2123a4033 LaTeX Reader: support \lettrine 2017-10-22 20:33:30 +02:00
John MacFarlane
28bb5d610d LaTeX reader: support \expandafter.
Closes #3983.
2017-10-19 13:23:50 -07:00
John MacFarlane
61641f996f Revised command test 3971 to work with Windows. 2017-10-16 22:51:25 -07:00
John MacFarlane
c40857b389 Improved handling of include files in LaTeX reader.
Previously `\include` wouldn't work if the included file
contained, e.g., a begin without a matching end.

We've changed the Tok type so that it stores a full SourcePos,
rather than just a line and column.  So tokens keep track
of the file they came from. This allows us to use a simpler
method for includes, which doesn't require parsing the included
document as a whole.

Closes #3971.
2017-10-16 22:05:34 -07:00
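The idea behind c40857b389 (tokens that remember their source file, so an included file can be spliced in as tokens rather than parsed as a whole document) can be sketched like this. The names `SourcePos`, `Tok`, `tokenize`, and `splice_include` here are illustrative Python stand-ins, not pandoc's Haskell definitions, and the tokenizer simply splits on spaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePos:
    file: str
    line: int
    column: int


@dataclass(frozen=True)
class Tok:
    pos: SourcePos  # full position, including the originating file
    text: str


def tokenize(file, source):
    """Toy tokenizer: split each line on spaces, recording positions."""
    toks = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        col = 0
        for word in line.split(" "):
            if word:
                toks.append(Tok(SourcePos(file, lineno, col + 1), word))
            col += len(word) + 1
    return toks


def splice_include(tokens, index, included):
    """Replace the include token at `index` with the included file's tokens."""
    return tokens[:index] + included + tokens[index + 1:]
```

Since the included tokens are never parsed on their own, an included file containing an unmatched `\begin` is not a problem at this stage; only the combined stream has to balance.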
John MacFarlane
9cf9a64923 RST writer: correctly handle inline code containing backticks.
(Use a :literal: role.)

Closes #3974.
2017-10-16 20:54:43 -07:00
John MacFarlane
cba18c19a6 RST writer: don't backslash-escape word-internal punctuation.
Closes #3978.
2017-10-16 20:39:19 -07:00
John MacFarlane
75d8c99c73 ConTeXt writer: Use identifiers for chapters.
Closes #3968.
2017-10-11 20:21:55 -07:00
John MacFarlane
8cd1e00bbc Add test - closes #3958. 2017-10-08 21:57:26 -07:00
John MacFarlane
492f496842 Markdown reader: Fixed bug with indented code following raw LaTeX.
Closes #3947.
2017-10-02 21:28:14 -07:00
John MacFarlane
2314534d4d RST writer: add header anchors when header has non-standard id.
Closes #3937.
2017-09-27 20:42:04 -07:00
John MacFarlane
b1ee747a24 Added --strip-comments option, readerStripComments in ReaderOptions.
* Options:  Added readerStripComments to ReaderOptions.
* Added `--strip-comments` command-line option.
* Made `htmlTag` from the HTML reader sensitive to this feature.

This affects Markdown and Textile input.

Closes #2552.
2017-09-17 13:01:27 -07:00
John MacFarlane
4177ee8626 Textile reader: allow 'pre' code in list item.
Closes #3916.
2017-09-12 08:58:47 -07:00
John MacFarlane
5fc4980216 Markdown writer: Escape pipe characters when pipe_tables enabled.
Closes #3887.
2017-09-07 22:10:13 -07:00
Albert Krewinkel
6a6c3858b4 Org writer: stop using raw HTML to wrap divs
Divs are difficult to translate into org syntax, as there are multiple
div-like structures (drawers, special blocks, greater blocks) which all
have their advantages and disadvantages.  Previously pandoc would
use raw HTML to preserve the full div information; this was rarely
useful and resulted in visual clutter.  Div-rendering was changed to
discard the div's classes and key-value pairs if there is no natural way
to translate the div into an org structure.

Closes: #3771
2017-09-01 00:08:12 +02:00
John MacFarlane
8fcf66453c RST reader: Fixed ..include:: directive.
Closes #3880.
2017-08-27 17:09:55 -07:00
John MacFarlane
1b3431a165 LaTeX reader: improved support for \hyperlink, \hypertarget.
Closes #2549.
2017-08-25 22:04:57 -07:00
John MacFarlane
d70b89c0d9 Use pandoc-types 1.17.1. Tests updated for new simpleTable behavior...
with empty headers.
2017-08-20 23:24:51 -07:00
John MacFarlane
9cc128b579 LaTeX reader: Set identifiers on Spans used for \label. 2017-08-20 16:52:03 -07:00
John MacFarlane
a31241a08b Markdown reader: use CommonMark rules for list item nesting.
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs,
sublists, and other block level content had to be indented 4
spaces.  Now the indentation required is determined by the
first line of the list item:  to be included in the list item,
blocks must be indented to the level of the first non-space
content after the list marker. Exception: if there are 5 or more spaces
after the list marker, then the content is interpreted as an
indented code block, and continuation paragraphs must be indented
two spaces beyond the end of the list marker.  See the CommonMark
spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases,
be parsed the same way by the new rules.  Here are some examples
of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space
rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it
is only indented six spaces from the margin, because it is four spaces
past the point where a continuation paragraph could begin.  With the
four-space rule, this would be a regular paragraph rather than a code
block.

    - a

            code

Here the code block will start with two spaces, whereas under
the four-space rule, it would start with `code`.  With the four-space
rule, indented code under a list item always must be indented eight
spaces from the margin, while the new rules require only that it
be indented four spaces from the beginning of the first non-space
text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people
who expected lists to work differently (#3125, #2367, #2575, #2210,
 #1990, #1137, #744, #172, #137, #128) and by the growing prevalence
of CommonMark (now used by GitHub, for example).

Users who want to use the old rules can select the `four_space_rule`
extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type
  of `gobbleSpaces` has been changed so that a `ReaderOptions`
  parameter is not needed.
2017-08-19 15:45:01 -07:00
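The main rule from a31241a08b (blocks belong to a list item if they are indented to the first non-space content after the list marker) can be sketched as below. This is a simplified illustration, not the Markdown reader's code: `content_indent` and `belongs_to_item` are hypothetical helpers, tabs are ignored, and the 5-or-more-spaces exception is deliberately left unmodeled (the function returns `None` for it).

```python
import re

# Leading spaces, a bullet or ordered marker, then the gap before content.
_MARKER = re.compile(r"^( *)([-+*]|\d+[.)])( +)(?=\S)")


def content_indent(first_line):
    """Return the column where the item's content starts, or None.

    None means the line is not a list item start, or it falls under the
    indented-code exception (5+ spaces after the marker), not modeled here.
    """
    m = _MARKER.match(first_line)
    if m is None:
        return None
    lead, marker, gap = m.groups()
    if len(gap) >= 5:
        return None
    return len(lead) + len(marker) + len(gap)


def belongs_to_item(first_line, later_line):
    """True if later_line is indented far enough to continue the item."""
    need = content_indent(first_line)
    if need is None:
        return False
    indent = len(later_line) - len(later_line.lstrip(" "))
    return indent >= need
```

With the first example from the commit message, `belongs_to_item("- a", "  - b")` is true, so `- b` becomes a sublist; the old four-space rule would have required four spaces.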
John MacFarlane
5ab1162def Markdown reader: fixed parsing of fenced code after list...
...when there is no intervening blank line.

Closes #3733.
2017-08-18 21:46:55 -07:00
John MacFarlane
bfbdfa646a LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false
from etoolbox.

Closes #3853.
2017-08-18 10:13:41 -07:00
John MacFarlane
d1444b4ecd RST reader/writer: support unknown interpreted text roles...
...by parsing them as Span with "role" attributes.
This way they can be manipulated in the AST.

Closes #3407.
2017-08-17 16:01:44 -07:00
John MacFarlane
b1f6fb4af5 HTML reader: support column alignments.
These can be set either with an `align` attribute or
with `text-align` in a `style` attribute.

Closes #1881.
2017-08-17 12:08:32 -07:00
John MacFarlane
db715ca847 LaTeX reader: use Link instead of Span for \ref.
This makes more sense semantically and avoids unnecessary
Span [Link] nestings when references are resolved.
2017-08-16 10:56:12 -07:00
schrieveslaach
cf4b40162d LaTeX reader: add Support for glossaries and acronym package (#3589)
Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.
2017-08-16 10:24:46 -07:00
John MacFarlane
68434957d6 Fixed command test #2994 on Windows. 2017-08-16 09:47:25 -07:00
John MacFarlane
892a4edeb1 Implement multicolumn support for slide formats.
The structure expected is:

    <div class="columns">
      <div class="column" width="40%">
        contents...
      </div>
      <div class="column" width="60%">
        contents...
      </div>
    </div>

Support has been added for beamer and all HTML slide formats.

Closes #1710.

Note:  later we could add a more elegant way to create
this structure in Markdown than to use raw HTML div elements.
This would come for free with a "native div syntax" (#168).
Or we could devise something specific to slides
2017-08-14 23:17:44 -07:00
John MacFarlane
319d7ed6ff Changed command test for #2994 so it actually tests the writer. 2017-08-14 00:00:50 -07:00
schrieveslaach
2845ab5976 Put content of \ref, \label commands into span… (#3639)
* Put content of `\ref` and `\label` commands into Span elements so they can be used in filters.
* Add support for `\eqref`
2017-08-13 10:58:45 -07:00