Commit graph

259 commits

Author SHA1 Message Date
John MacFarlane
c0cc9270cb Org writer: don't allow fn refs to wrap to beginning of line.
Otherwise they can be interpreted as footnote definitions.

Closes #4171.
2017-12-18 16:33:52 -08:00
John MacFarlane
808f6d3fa1 OPML reader: enable raw HTML and other extensions by default for notes.
This fixes a regression in 2.0.

Note that extensions can now be individually disabled, e.g.
`-f opml-smart-raw_html`.

Closes #4164.
2017-12-17 09:52:53 -08:00
John MacFarlane
044d58bb24 Fixed regression in LateX tokenization.
This mainly affects the Markdown reader when parsing
raw LaTeX with escaped spaces.  Closes #4159.
2017-12-15 09:45:29 -08:00
John MacFarlane
b94f1e2045 RST reader: more accurate parsing of references.
Previously we erroneously included the enclosing
backticks in a reference ID (closes #4156).

This change also disables interpretation of
syntax inside references, as in docutils.
So, there is no emphasis in

    `my *link*`_
2017-12-14 12:48:43 -08:00
John MacFarlane
7093a3b44c Markdown: Improved computation of relative cell widths in pipe tables. 2017-12-12 15:36:29 -08:00
John MacFarlane
67b6abc806 LaTeX reader: fix \ before newline.
This should be a nonbreaking space, as long as it's not
followed by a blank line. This has been fixed at the tokenizer
level.

Closes #4134.
2017-12-08 16:34:15 -08:00
John MacFarlane
f6007e7146 Markdown reader: accept processing instructions as raw HTML.
Closes #4125.
2017-12-06 16:05:50 -08:00
John MacFarlane
6a2562efb5 Rewrite empty_paragraphs test so it will run on Windows. 2017-12-04 15:41:09 -08:00
John MacFarlane
fac3953abf Markdown reader: Don't parse native div as table caption.
Closes #4119.
2017-12-04 15:04:47 -08:00
John MacFarlane
ae60e0196c Add empty_paragraphs extension.
* Deprecate `--strip-empty-paragraphs` option.  Instead we now
  use an `empty_paragraphs` extension that can be enabled on
  the reader or writer.  By default, disabled.

* Add `Ext_empty_paragraphs` constructor to `Extension`.

* Revert "Docx reader: don't strip out empty paragraphs."
  This reverts commit d6c58eb836.

* Implement `empty_paragraphs` extension in docx reader and writer,
  opendocument writer, html reader and writer.

* Add tests for `empty_paragraphs` extension.
2017-12-04 14:56:57 -08:00
John MacFarlane
03496d1810 Test for #4113.
Closes #4113.
2017-12-03 20:15:40 -08:00
John MacFarlane
1193c1a505 LaTeX writer: allow specifying just width or height for image size.
Previously both needed to be specified (unless the image was
being resized to be smaller than its original size).

If height but not width is specified, we now set width to
textwidth (and similarly if width but not height is specified).
Since we have keepaspectratio, this yields the desired result.
2017-12-01 21:18:29 -08:00
John MacFarlane
b2a190546d Revert "LaTeX writer: Add keepaspectratio to includegraphics..."
This reverts commit 171187a452.
2017-12-01 13:51:33 -08:00
John MacFarlane
171187a452 LaTeX writer: Add keepaspectratio to includegraphics...
...if only one of height/width is given.
2017-11-30 16:03:28 -08:00
John MacFarlane
03ddac451e Support beamer \alert in LaTeX reader. Closes #4091. 2017-11-29 21:30:13 -08:00
John MacFarlane
508aab0bd5 Text.Pandoc.Parsing.uri: allow & and = as word characters.
This fixes a bug where pandoc would stop parsing a URI with an
empty attribute:  for example, `&a=&b=` wolud stop at `a`.
(The uri parser tries to guess which punctuation characters
are part of the URI and which might be punctuation after it.)

Closes #4068.
2017-11-14 22:08:14 -08:00
John MacFarlane
51897937cd LaTeX reader: allow optional arguments on \footnote.
Closes #4062.
2017-11-13 21:19:38 -08:00
John MacFarlane
8d6e0e516a Markdown writer: fix bug with doubled footnotes in grid tables.
Closes #4061.
2017-11-13 21:12:04 -08:00
John MacFarlane
eeaa3b048c LaTeX reader: support column specs like *{2}{r}.
This is equivalent to `rr`.  We now expand it like a macro.

Closes #4056.
2017-11-12 14:46:29 -08:00
John MacFarlane
7ba0ae8b4d LaTeX reader: allow optional args for parbox.
See #4056.
2017-11-12 14:19:58 -08:00
John MacFarlane
fb5ba1bb00 Fixed YAML metadata with "chomp" (|-).
Previously if a YAML block under `|-` contained
a blank line, pandoc would not parse it as metadata.
2017-11-11 10:17:53 -05:00
John MacFarlane
1592d38821 Allow fenced code blocks to be indented 1-3 spaces.
This brings our handling of them into alignment with
CommonMark's.

Closes #??.
2017-11-09 23:22:44 -05:00
John MacFarlane
fef5770591 Fix regression with --metadata.
It should replace a metadata value set in the document
itself, rather than creating a list including a new value.

Closes #4054.
2017-11-08 21:54:23 -08:00
John MacFarlane
8e53489cbc Fix strikethrough in gfm writer.
Previously we got a crash, because we were trying to print
a native cmark STRIKETHROUGH node, and the commonmark writer
in cmark-github doesn't support this.  Work around this by
using a raw node to add the strikethrough delimiters.

Closes #4038.
2017-11-04 10:35:52 -07:00
John MacFarlane
642d603666 Improved support for columns in HTML.
* Move as much as possible to the CSS in the template.
* Ensure that all the HTML-based templates (including epub)
  contain the CSS for columns.
* Columns default to 50% width unless they are given a width
  attribute.

Closes #4028.
2017-11-02 20:57:05 -07:00
John MacFarlane
6d00e6e8c3 Fixed revealjs slide column width issues.
* Remove "width" attribute which is not allowed on div.
* Remove space between `<div class="column">` elements,
  since this prevents columns whose widths sum to 100%
  (the space takes up space).

Closes #4028.
2017-11-02 10:23:04 -07:00
John MacFarlane
ed3d466384 Really fix #3989.
The previous fix only worked in certain cases.
Other cases with `>` in an HTML attribute broke.
2017-11-01 09:27:51 -07:00
John MacFarlane
f1ebdb8145 Updated command test for #3989.
We didn't fix it completely before.
2017-11-01 09:15:15 -07:00
John MacFarlane
fb6e5812bc Fixed regression in parsing of HTML comments in markdown...
and other non-HTML formats (`Text.Pandoc.Readers.HTML.htmlTag`).
The parser stopped at the first `>` character, even if it wasn't
the end of the comment.

Closes #4019.
2017-10-31 21:14:38 -07:00
John MacFarlane
0e57b8b85d Add Millimeter constructor to Dimension in ImageSize.
Minor API change.

Now sizes given in 'mm' are no longer converted to 'cm'.

Closes #4012.
2017-10-31 11:58:43 -07:00
John MacFarlane
5f9f458df3 LaTeX reader: handle % comment right after command.
For example

    \emph%
    {hi}
2017-10-31 11:31:35 -07:00
John MacFarlane
556c6c2c6d Markdown reader: make sure fenced div closers work in lists.
Previously the following failed:

    ::: {.class}
    1. one
    2. two
    :::

and you needed a blank line before the closing `:::`.
2017-10-31 10:57:20 -07:00
John MacFarlane
81610144f9 Make fenced_divs affect the Markdown writer.
If `fenced_divs` is enabled, fenced divs will be used.
2017-10-31 10:57:20 -07:00
John MacFarlane
244b42dbaf Added failing command test for #4007. 2017-10-30 11:04:40 -07:00
John MacFarlane
513b16a71b Fenced divs: ensure that paragraph at end doesn't become Plain.
Added test case.
2017-10-24 09:53:29 -07:00
John MacFarlane
ecb5475a2a Back to using [WARNING] and [INFO] to mark messages. 2017-10-23 23:01:37 -07:00
John MacFarlane
fda0c0119f Implemented fenced Divs.
+ Added Ext_fenced_divs to Extensions (default for pandoc Markdown).
+ Document fenced_divs extension in manual.
+ Implemented fenced code divs in Markdown reader.
+ Added test.

Closes #168.
2017-10-23 22:45:28 -07:00
John MacFarlane
896803b0d5 HTML reader: htmlTag improvements.
We previously failed on cases where an attribute contained a `>`
character. This patch fixes the bug.

Closes #3989.
2017-10-23 17:29:32 -07:00
John MacFarlane
1a82ecbb68 More pleasing presentation of warnings and info messages.
!! warning
-- info
2017-10-23 15:00:11 -07:00
John MacFarlane
cecf02e326 Fixed test for change in log level. 2017-10-23 11:20:22 -07:00
mb21
e2123a4033 LaTeX Reader: support \lettrine 2017-10-22 20:33:30 +02:00
John MacFarlane
28bb5d610d LaTeX reader: support \expandafter.
Closes #3983.
2017-10-19 13:23:50 -07:00
John MacFarlane
61641f996f Revised command test 3971 to work with Windows. 2017-10-16 22:51:25 -07:00
John MacFarlane
c40857b389 Improved handling of include files in LaTeX reader.
Previously `\include` wouldn't work if the included file
contained, e.g., a begin without a matching end.

We've changed the Tok type so that it stores a full SourcePos,
rather than just a line and column.  So tokens keeep track
of the file they came from. This allows us to use a simpler
method for includes, which doesn't require parsing the included
document as a whole.

Closes #3971.
2017-10-16 22:05:34 -07:00
John MacFarlane
9cf9a64923 RST writer: correctly handle inline code containing backticks.
(Use a :literal: role.)

Closes #3974.
2017-10-16 20:54:43 -07:00
John MacFarlane
cba18c19a6 RST writer: don't backslash-escape word-internal punctuation.
Closes #3978.
2017-10-16 20:39:19 -07:00
John MacFarlane
75d8c99c73 ConTeXt writer: Use identifiers for chapters.
Closes #3968.
2017-10-11 20:21:55 -07:00
John MacFarlane
8cd1e00bbc Add test - closes #3958. 2017-10-08 21:57:26 -07:00
John MacFarlane
492f496842 Markdown reader: Fixed bug with indented code following raw LaTeX.
Closes #3947.
2017-10-02 21:28:14 -07:00
John MacFarlane
2314534d4d RST writer: add header anchors when header has non-standard id.
Closes #3937.
2017-09-27 20:42:04 -07:00
John MacFarlane
b1ee747a24 Added --strip-comments option, readerStripComments in ReaderOptions.
* Options:  Added readerStripComments to ReaderOptions.
* Added `--strip-comments` command-line option.
* Made `htmlTag` from the HTML reader sensitive to this feature.

This affects Markdown and Textile input.

Closes #2552.
2017-09-17 13:01:27 -07:00
John MacFarlane
4177ee8626 Textile reader: allow 'pre' code in list item.
Closes #3916.
2017-09-12 08:58:47 -07:00
John MacFarlane
5fc4980216 Markdown writer: Escape pipe characters when pipe_tables enabled.
Closes #3887.
2017-09-07 22:10:13 -07:00
Albert Krewinkel
6a6c3858b4
Org writer: stop using raw HTML to wrap divs
Div's are difficult to translate into org syntax, as there are multiple
div-like structures (drawers, special blocks, greater blocks) which all
have their advantages and disadvantages.  Previously pandoc would
use raw HTML to preserve the full div information; this was rarely
useful and resulted in visual clutter.  Div-rendering was changed to
discard the div's classes and key-value pairs if there is no natural way
to translate the div into an org structure.

Closes: #3771
2017-09-01 00:08:12 +02:00
John MacFarlane
8fcf66453c RST reader: Fixed ..include:: directive.
Closes #3880.
2017-08-27 17:09:55 -07:00
John MacFarlane
1b3431a165 LaTeX reader: improved support for \hyperlink, \hypertarget.
Closes #2549.
2017-08-25 22:04:57 -07:00
John MacFarlane
d70b89c0d9 Use pandoc-types 1.17.1. Tests updated for new simpleTable behavior...
with empty headers.
2017-08-20 23:24:51 -07:00
John MacFarlane
9cc128b579 LaTeX reader: Set identifiers on Spans used for \label. 2017-08-20 16:52:03 -07:00
John MacFarlane
a31241a08b Markdown reader: use CommonMark rules for list item nesting.
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs,
sublists, and other block level content had to be indented 4
spaces.  Now the indentation required is determined by the
first line of the list item:  to be included in the list item,
blocks must be indented to the level of the first non-space
content after the list marker. Exception: if are 5 or more spaces
after the list marker, then the content is interpreted as an
indented code block, and continuation paragraphs must be indented
two spaces beyond the end of the list marker.  See the CommonMark
spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases,
be parsed the same way by the new rules.  Here are some examples
of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space
rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it
is only indented six spaces from the margin, because it is four spaces
past the point where a continuation paragraph could begin.  With the
four-space rule, this would be a regular paragraph rather than a code
block.

    - a

            code

Here the code block will start with two spaces, whereas under
the four-space rule, it would start with `code`.  With the four-space
rule, indented code under a list item always must be indented eight
spaces from the margin, while the new rules require only that it
be indented four spaces from the beginning of the first non-space
text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people
who expected lists to work differently (#3125, #2367, #2575, #2210,
 #1990, #1137, #744, #172, #137, #128) and by the growing prevalance
of CommonMark (now used by GitHub, for example).

Users who want to use the old rules can select the `four_space_rule`
extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type
  of `gobbleSpaces` has been changed so that a `ReaderOptions`
  parameter is not needed.
2017-08-19 15:45:01 -07:00
John MacFarlane
5ab1162def Markdown reader: fixed parsing of fenced code after list...
...when there is no intervening blank line.

Closes #3733.
2017-08-18 21:46:55 -07:00
John MacFarlane
bfbdfa646a LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false
from etoolbox.

Closes #3853.
2017-08-18 10:13:41 -07:00
John MacFarlane
d1444b4ecd RST reader/writer: support unknown interpreted text roles...
...by parsing them as Span with "role" attributes.
This way they can be manipulated in the AST.

Closes #3407.
2017-08-17 16:01:44 -07:00
John MacFarlane
b1f6fb4af5 HTML reader: support column alignments.
These can be set either with a `width` attribute or
with `text-width` in a `style` attribute.

Closes #1881.
2017-08-17 12:08:32 -07:00
John MacFarlane
db715ca847 LaTeX reader: use Link instead of Span for \ref.
This makes more sense semantically and avoids unnecessary
Span [Link] nestings when references are resolved.
2017-08-16 10:56:12 -07:00
schrieveslaach
cf4b40162d LaTeX reader: add Support for glossaries and acronym package (#3589)
Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.
2017-08-16 10:24:46 -07:00
John MacFarlane
68434957d6 Fixed command test #2994 on Windows. 2017-08-16 09:47:25 -07:00
John MacFarlane
892a4edeb1 Implement multicolumn support for slide formats.
The structure expected is:

    <div class="columns">
      <div class="column" width="40%">
        contents...
      </div>
      <div class="column" width="60%">
        contents...
      </div>
    </div>

Support has been added for beamer and all HTML slide formats.

Closes #1710.

Note:  later we could add a more elegant way to create
this structure in Markdown than to use raw HTML div elements.
This would come for free with a "native div syntax" (#168).
Or we could devise something specific to slides
2017-08-14 23:17:44 -07:00
John MacFarlane
319d7ed6ff Changed command test for #2994 so it actually tests the writer. 2017-08-14 00:00:50 -07:00
schrieveslaach
2845ab5976 Put content of \ref, \label commands into span… (#3639)
* Put content of `\ref` and `\label` commands into Span elements so they can be used in filters.
* Add support for `\eqref`
2017-08-13 10:58:45 -07:00
John MacFarlane
8f65590ce9 CommonMark writer: prefer pipe tables to HTML tables...
...even if it means losing relative column width information.
See #3734.
2017-08-13 10:43:43 -07:00
John MacFarlane
506866ef73 Markdown writer: Use pipe tables if raw_html disabled...
and `pipe_tables` enabled, even if the table has relative
width information.

Closes #3734.
2017-08-13 10:37:24 -07:00
John MacFarlane
418bda8128 Docx writer: pass through comments.
We assume that comments are defined as parsed by the
docx reader:

    I want <span class="comment-start" id="0" author="Jesse Rosenthal"
    date="2016-05-09T16:13:00Z">I left a comment.</span>some text to
    have a comment <span class="comment-end" id="0"></span>on it.

We assume also that the id attributes are unique and properly
matched between comment-start and comment-end.

Closes #2994.
2017-08-12 22:59:53 -07:00
John MacFarlane
be9957bddc Escape MetaString values (as added with --metadata flag).
Previously they would be transmitted to the template without
any escaping.

Note that `--M title='*foo*'` yields a different result from

    ---
    title: *foo*
    ---

In the latter case, we have emphasis; in the former case, just
a string with literal asterisks (which will be escaped
in formats, like Markdown, that require it).

Closes #3792.
2017-08-12 20:27:42 -07:00
John MacFarlane
0ab8670a0e LaTeX reader: Fixed space after \figurename etc. 2017-08-12 13:40:28 -07:00
John MacFarlane
467ca2a1ad Fixed data-dir on translations tests. 2017-08-12 10:39:25 -07:00
John MacFarlane
dbb81f513c More translation tests. 2017-08-11 23:59:27 -07:00
John MacFarlane
9abb688f29 Added simple test for translations. 2017-08-11 23:57:28 -07:00
John MacFarlane
dee4cbc854 RST reader: implement csv-table directive.
Most attributes are supported, including `:file:` and `:url:`.
A (probably insufficient) test case has been added.

Closes #3533.
2017-08-10 15:01:14 -07:00
John MacFarlane
09b7df472d LaTeX reader: Use label instead of data-label for label in caption.
See d441e656db, #3639.
2017-08-09 09:15:50 -07:00
John MacFarlane
1ad9679dc9 CommonMark writer: avoid excess blank lines at end of output. 2017-08-08 14:00:13 -07:00
John MacFarlane
3752298d91 Thread options through CommonMark reader.
This is more efficient than doing AST traversals for
emojis and hard breaks.

Also make behavior sensitive to `raw_html` extension.
2017-08-08 13:55:19 -07:00
John MacFarlane
b6f7c4930b CommonMark writer: support hard_line_breaks, smart.
Add tests.
2017-08-08 13:18:27 -07:00
John MacFarlane
2c0e989f9d Markdown reader: fixed spurious parsing as citation as reference def.
We now disallow reference keys starting with `@` if the
`citations` extension is enabled.  Closes #3840.
2017-08-07 21:00:57 -07:00
John MacFarlane
c806ef1b15 LaTeX reader: Support simple \def macros.
Note that we still don't support macros with fancy parameter
delimiters, like

    \def\foo#1..#2{...}
2017-08-07 16:06:19 -07:00
John MacFarlane
9e6b9cdc5f LaTeX reader: Support \let.
Also, fix regular macros so they're expanded at the
point of use, and NOT also the point of definition.
`\let` macros, by contrast, are expanded at the
point of definition.  Added an `ExpansionPoint`
field to `Macro` to track this difference.
2017-08-07 13:38:15 -07:00
John MacFarlane
ced834076d DokuWiki reader: better handling for code block in list item.
Closes #3824.
2017-08-02 10:33:08 -07:00
John MacFarlane
303d10d07b Small tweak in test (add --wrap=preserve). 2017-07-26 12:55:15 +02:00
John MacFarlane
e0ab09611a HTML writer: render raw inline environments when --mathjax used.
We previously did this only with raw blocks, on the assumption
that math environments would always be raw blocks. This has changed
since we now parse them as inline environments.

Closes #3816.
2017-07-26 12:50:36 +02:00
John MacFarlane
d441e656db HTML writer: insert data- in front of unsupported attributes.
Thus, a span with attribute 'foo' gets written to HTML5
with 'data-foo', so it is valid HTML5.

HTML4 is not affected.

This will allow us to use custom attributes in pandoc without
producing invalid HTML.
2017-07-25 13:13:24 +02:00
John MacFarlane
2b039acb4e Merge branch 'textcolor-support' of https://github.com/schrieveslaach/pandoc into schrieveslaach-textcolor-support 2017-07-25 11:42:10 +02:00
John MacFarlane
329b61ff5c LaTeX reader: support etoolbox's ifstrequal. 2017-07-24 11:20:59 +02:00
John MacFarlane
439ffc2e7f Added a test case with markdown-latex_macros. 2017-07-24 00:02:55 +02:00
John MacFarlane
be14e2b501 LaTeX reader: some improvements in macro parsing.
Fixed applyMacros so that it operates on the whole
string, not just the first token!

Don't remove macro definitions from the output,
even if Ext_latex_macros is set, so that macros will
be applied.  Since they're only applied to math in
Markdown, removing the macros can have bad effects.
Even for math macros, keeping them should be harmless.
2017-07-24 00:02:55 +02:00
Mauro Bieg
7d9b782f73 HTML Reader: parse figure and figcaption (#3813) 2017-07-22 19:22:56 +02:00
John MacFarlane
7191fe1f29 LaTeX reader: handle optional args in raw \titleformat.
Closes #3804.
2017-07-21 09:28:36 +02:00
John MacFarlane
56f63af3f6 LaTeX reader: fixed regression with starred environment names.
Closes #3803.
2017-07-19 17:30:22 +02:00
schrieveslaach
911b63dfc3 Add LaTeX xspace support (#3797) 2017-07-13 20:56:59 +02:00
Marc Schreiber
f93d7d06f6 Merge branch 'master' of https://github.com/jgm/pandoc into textcolor-support 2017-07-13 11:51:40 +02:00
John MacFarlane
013fd1c6b6 Make sure \write18 is parsed as raw LaTeX.
The change is in the LaTeX reader's treatment of raw commands,
but it also affects the Markdown reader.
2017-07-12 14:50:49 +02:00
John MacFarlane
41209ea676 HTML reader: Ensure that paragraphs are closed properly...
when the parent block element closes, even without `</p>`.

Closes #3794.
2017-07-11 15:52:38 +02:00