Commit graph

369 commits

Author SHA1 Message Date
John MacFarlane
202d848fb3 Fixed tests. 2015-12-12 16:22:31 -08:00
John MacFarlane
d6a4c70ef8 Test fixes. 2015-12-12 12:58:31 -08:00
John MacFarlane
536b6bf538 Implemented SoftBreak and new --wrap option.
Added threefold wrapping option.

* Command line option: deprecated `--no-wrap`, added
  `--wrap=[auto|none|preserve]`
* Added WrapOption, exported from Text.Pandoc.Options
* Changed type of writerWrapText in WriterOptions from
  Bool to WrapOption.
* Modified Text.Pandoc.Shared functions for SoftBreak.
* Supported SoftBreak in writers.
* Updated tests.
* Updated README.

Closes #1701.
2015-12-11 23:55:08 -08:00
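
The three `--wrap` modes in commit 536b6bf538 can be illustrated with a small sketch. This is not pandoc's implementation: the `SOFT_BREAK` marker, the `render_paragraph` function, and the `columns` parameter are hypothetical names chosen for illustration, and the sketch assumes a soft break stands for a line ending inside a source paragraph.

```python
import textwrap

# Hypothetical marker for a line ending that occurred inside a source paragraph.
SOFT_BREAK = object()


def render_paragraph(inlines, wrap="auto", columns=72):
    """Render words, spaces and SOFT_BREAK markers under one wrap mode."""
    if wrap == "preserve":
        # Keep the line breaks exactly where the source had them.
        return "".join("\n" if x is SOFT_BREAK else x for x in inlines)
    # In the other two modes a soft break is treated as ordinary space.
    text = "".join(" " if x is SOFT_BREAK else x for x in inlines)
    if wrap == "none":
        # No wrapping at all: one long line.
        return text
    if wrap == "auto":
        # Refill the text to the requested column width.
        return textwrap.fill(text, width=columns)
    raise ValueError(f"unknown wrap mode: {wrap}")
```

With `["one", " ", "two", SOFT_BREAK, "three"]` as input, `preserve` keeps the break after "two", while `none` joins everything onto a single line.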
John MacFarlane
244cd5644b Merge branch 'new-image-attributes' of https://github.com/mb21/pandoc into mb21-new-image-attributes
* Bumped version to 1.16.
* Added Attr field to Link and Image.
* Added `common_link_attributes` extension.
* Updated readers for link attributes.
* Updated writers for link attributes.
* Updated tests
* Updated stack.yaml to build against unreleased versions of
  pandoc-types and texmath.
* Fixed various compiler warnings.

Closes #261.

TODO:

* Relative (percentage) image widths in docx writer.
* ODT/OpenDocument writer (untested, same issue about percentage widths).
* Update pandoc-citeproc.
2015-11-19 23:14:23 -08:00
Jesse Rosenthal
186a955bd0 Docx reader: Add test cases for dummy list items. 2015-11-18 14:03:10 -05:00
Jesse Rosenthal
78c90bd749 Added test case for links in notes. 2015-11-14 13:49:02 -05:00
John MacFarlane
73e6333fae Merge pull request #2526 from tarleb/org-definition-lists-fix
Org reader: Require whitespace around def list markers
2015-11-13 14:07:56 -08:00
Albert Krewinkel
67cb2809fd Org reader: Require whitespace around def list markers
Definition list markers (i.e. double colons `::`) must be surrounded by
whitespace to start a definition item.  This rule was not checked
before, resulting in bugs with footnotes and some link types.

Thanks to @conklech for noticing and reporting this issue.

This fixes #2518.
2015-11-13 22:04:17 +01:00
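
The whitespace rule from commit 67cb2809fd can be sketched with a regular expression. This is an illustration only, not the Org reader's parser: `split_definition` is a hypothetical helper that takes the text of a list item (bullet already removed), and it assumes that end of line counts as whitespace after the marker.

```python
import re

# "::" must have whitespace before it, and whitespace or end of line after it.
_DEF_ITEM = re.compile(r"^(?P<term>.*?\S)\s+::(?:\s+(?P<body>.*))?$")


def split_definition(item_text):
    """Return (term, definition) if the item is a definition, else None."""
    m = _DEF_ITEM.match(item_text)
    if m is None:
        return None
    return m.group("term"), m.group("body") or ""
```

An inline footnote such as `[fn::inline note]` has no whitespace around its double colon, so it is no longer mistaken for a definition item.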
John MacFarlane
028a605bf8 Merge pull request #2525 from tarleb/org-smart-fixes
Org reader: Fix emphasis rules for smart parsing
2015-11-13 12:27:23 -08:00
John MacFarlane
0a6aaf5e1b Added emoji extension to Markdown.
This is enabled by default in `markdown_github`.
Added `Ext_emoji` to `Extension` in `Text.Pandoc.Options` (API change).

Closes #2523.
2015-11-13 12:14:24 -08:00
Albert Krewinkel
220f3d12b8 Org reader: Fix emphasis rules for smart parsing
Smart quotes, ellipses, and dashes should behave like normal quotes,
single dashes, and dots with respect to text markup parsing.  The parser
state was not updated properly in all cases, which has been fixed.

Thanks to @conklech for reporting this issue.

This fixes #2513.
2015-11-13 20:43:46 +01:00
John MacFarlane
23b693c029 Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."
This reverts commit c423dbb5a3.
2015-11-09 10:08:22 -08:00
John MacFarlane
237c10992e Merge pull request #2505 from tarleb/org-header-markup-fix
Org reader: fix markup parsing in headers
2015-11-08 17:17:19 -08:00
John MacFarlane
c423dbb5a3 Use -XNoImplicitPrelude and 'import Prelude' explicitly.
This is needed for ghci to work with pandoc, given that we
now use a custom prelude.

Closes #2503.
2015-11-08 16:56:59 -08:00
Albert Krewinkel
e3b4844bd7 Org reader: fix markup parsing in headers
Markup as the very first item in a header wasn't recognized.  This was
caused by an incorrect parser state: positions at which inline markup
can start need to be marked explicitly by changing the parser state.
This wasn't done for headers.  The proper function to update the state
is now called at the beginning of the header parser, fixing this issue.

This fixes #2504.
2015-11-08 22:32:00 +01:00
John MacFarlane
1d53d452c3 LaTeX writer: add \protect to \hyperlink.
Thanks to Hadrien Mary for the problem and solution.

Closes #2490.
2015-10-28 09:37:05 -07:00
John MacFarlane
d5efa9b35c LaTeX writer: Use \hypertarget and \hyperlink for links.
This works correctly to link to Div or Span elements.
We now don't bother defining `\label` for Div or Span
elements.

Closes jgm/pandoc-citeproc#174.
2015-10-27 14:08:35 -07:00
John MacFarlane
45df87cf15 Merge pull request #2477 from tarleb/org-toggling-header-args
Org reader: allow toggling header args
2015-10-25 12:10:07 -07:00
Albert Krewinkel
27a8603278 Org reader: allow toggling header args
Org-mode allows the argument of a code block header argument to be skipped if
it's toggling a value.  Argument-less headers are now recognized,
avoiding weird parsing errors.

The fixes are not exactly pretty, but neither is the code that was
fixed.  So I guess it's about par for the course.  However, a rewrite of
the header parsing code wouldn't hurt in the long run.

Thanks to @jo-tham for filing the bug report.

This fixes #2269.
2015-10-25 08:54:00 +01:00
Albert Krewinkel
b27366780f Org reader: fix paragraph/list interaction
Paragraphs can be followed by lists, even if there is no blank line
between the two blocks.  However, this should only be true if the
paragraph is not within a list, where the preceding block should be
parsed as a plain instead of paragraph (to allow for compact lists).

Thanks to @rgaiacs for bringing this up.

This fixes #2464.
2015-10-24 19:05:56 +02:00
John MacFarlane
a7150bb6b6 Fixed over-eager raw HTML inline parsing.
Tightened up the inline HTML parser so it disallows
TagWarnings.

This only affects the markdown reader when the `markdown_in_html_blocks`
option is disabled.

Closes #2469.
2015-10-22 21:18:06 -07:00
John MacFarlane
b49ab06e96 Merge pull request #2458 from mb21/lang-inlines
LaTeX and ConTeXt writers: support lang attribute on divs and spans
2015-10-19 23:02:08 -07:00
John MacFarlane
448f5b151e Tests: Unset pandoc-version so we don't get the comment...
in the man writer test.  Otherwise this needs updating
every version bump.
2015-10-18 11:52:32 -07:00
mb21
fa2b26ddcb Added writers-lang-and-dir test, fixed ConTeXt writer test
The writers-lang-and-dir testGroup tests LaTeX and ConTeXt
writers' language and directionality output
2015-10-18 17:01:38 +02:00
John MacFarlane
82b3e0ab97 Use custom Prelude to avoid compiler warnings.
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude,
  but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.

This allows us to remove many imports of Data.Monoid
and Control.Applicative, and remove Text.Pandoc.Compat.Monoid.

It should allow us to use -Wall again for ghc 7.10.
2015-10-14 09:09:10 -07:00
John MacFarlane
24f68654e9 RST writer: do header normalization only in "standalone" mode.
If we're producing a fragment, just skip normalization.
After all, the fragment might be somewhere in the middle
of the document.  It's more important for fragments to
have consistency in rendering (so they can be pieced
together) than to normalize.

This closes #2394.  It's simpler and more robust than
my earlier fix.
2015-10-12 23:00:27 -07:00
John MacFarlane
1e8a25ad69 Percent-encode more special characters in URLs.
HTML, LaTeX writers adjusted.
The special characters are '<','>','|','"','{','}','[',']','^', '`'.

Closes #1640, #2377.
2015-10-11 17:12:50 -07:00
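
A minimal sketch of the escaping described in commit 1e8a25ad69, assuming only the ten characters listed there are affected. The name `escape_uri` is hypothetical and this is not the code used by the HTML or LaTeX writers.

```python
# The special characters listed in the commit message.
SPECIAL_CHARS = '<>|"{}[]^`'


def escape_uri(uri):
    """Percent-encode the listed special characters, leaving the rest alone."""
    return "".join(
        "%{:02X}".format(ord(c)) if c in SPECIAL_CHARS else c for c in uri
    )
```

For example, `escape_uri("http://example.com/a|b")` yields `http://example.com/a%7Cb`.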
John MacFarlane
72b038d201 Merge pull request #2412 from frerich/reader/docbook/xref_support
Added support for <xref> tag in DocBook reader
2015-10-10 14:18:28 -07:00
Ophir Lifshitz
dfd06467ea Docx Reader: Create special punctuation test 2015-10-04 06:11:07 -04:00
John MacFarlane
6532950b26 MediaBag: ensure that / is always used as path separator. 2015-09-26 22:40:58 -07:00
Frerich Raabe
35f12b5095 Added proper support for DocBook 'xref' elements
'xref' is used to create cross references to other parts of the
document. It is an empty element - the cross reference text depends on
various attributes. Quoting 'DocBook: The Definitive Guide':

  1. If the endterm attribute is specified on xref, the content of the
  element pointed to by endterm will be used as the text of the
  cross-reference.

  2. Otherwise, if the object pointed to has a specified XRefLabel, the
  content of that attribute will be used as the cross-reference text.
2015-09-24 18:26:55 +02:00
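
The two quoted rules from commit 35f12b5095 amount to a fallback chain, sketched below. The data layout is an assumption made for illustration: `elements` is a hypothetical dictionary from element id to a record with `content` and an optional `xreflabel`, and the target is looked up through the `linkend` attribute of `xref`. The sketch stops where the quotation stops and returns `None` for any case the two rules do not cover.

```python
def xref_text(xref_attrs, elements):
    """Pick cross-reference text using the two rules quoted in the commit."""
    # Rule 1: an endterm attribute points at the element whose content is used.
    endterm = xref_attrs.get("endterm")
    if endterm is not None and endterm in elements:
        return elements[endterm]["content"]
    # Rule 2: otherwise use the XRefLabel of the referenced object, if it has one.
    target = elements.get(xref_attrs.get("linkend"))
    if target and target.get("xreflabel"):
        return target["xreflabel"]
    # The quoted rules end here; later fallbacks are not modelled.
    return None
```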
John MacFarlane
9b033672e4 Merge pull request #2406 from tarleb/org-verse-fix
Make sure verse blocks can contain empty lines
2015-09-20 13:02:47 -07:00
Albert Krewinkel
8007dd97b5 Make sure verse blocks can contain empty lines
The previous verse parsing code made the faulty assumption that empty
strings are valid (and empty) inlines.  This isn't the case, so lines
are changed to contain at least a newline.

It would generally be nicer and faster to keep the newlines while
splitting the string.  However, this would require more code, which
seems unjustified for a simple (and fairly rare) block as *verse*.

This fixes #2402.
2015-09-19 22:02:43 +02:00
Nikolay Yakimov
5788f62ef5 [RST Writer] Don't normalize heading levels below input minimum 2015-09-19 17:45:54 +03:00
John MacFarlane
6dcfa7e07b Tests: docx writer tests now use "../data" for data directory.
This allows tests  to be run without installing first.
2015-09-09 10:39:20 -07:00
Juliusz Gonera
f1c87ed164 Org reader: add auto identifiers if not present on headers
Refs #2354

This should also fix the table of contents (--toc) when generating an HTML file
from Org input
2015-08-15 07:57:48 +02:00
mb21
fe2907ca72 Updated Arbitrary instance for link attribute 2015-08-07 12:38:37 +02:00
John MacFarlane
564b0c43b0 Updated Arbitrary instance for new image attribute parameter. 2015-08-07 12:37:12 +02:00
MarLinn
f068093555 Added odt reader
Fully implemented features:

* Paragraphs
* Headers
* Basic styling
* Unordered lists
* Ordered lists
* External Links
* Internal Links
* Footnotes, Endnotes
* Blockquotes

Partly implemented features:

* Citations
  Very basic, but pandoc can't do much more
* Tables
  No headers, no sizing, limited styling
2015-07-23 15:37:01 -07:00
John MacFarlane
8390d935d8 Updated tests and removed a skipSpaces....
we no longer need it with the change to toKey, and it
is expensive to skip spaces after every inline.
2015-07-23 15:35:18 -07:00
Ophir Lifshitz
53cb926232 Markdown Reader: Add basic tests for each header style 2015-07-23 02:31:24 -04:00
Ophir Lifshitz
0c7d0757d6 Markdown Reader: Add implicit header ref tests for headers with spaces 2015-07-23 02:31:03 -04:00
John MacFarlane
fa2c008ae5 Fix regression: allow HTML comments containing --.
Technically this isn't allowed in an HTML comment, but
we've always allowed it, and so do most other implementations.
It is handy if e.g. you want to put command line arguments
in HTML comments.
2015-07-21 22:44:18 -07:00
John MacFarlane
9e0fb844a9 Markdown reader: don't allow bare URI links or autolinks in link label.
Added test cases.

Closes #2300.
2015-07-14 13:16:40 -07:00
John MacFarlane
9cdfd4f649 Improved bare autolink detection.
Previously we disallowed `-` at the end of an autolink,
and disallowed the combination `=-`.

This commit liberalizes the rules for allowing punctuation in
a bare URI.

Added test cases.

One potential drawback is that you can no longer put a bare
URI in em dashes like this

    this uri---http://example.com---is an example.

But in this respect we now match github's treatment of bare URIs.

Closes #2299.
2015-07-14 10:24:39 -07:00
John MacFarlane
653a7bbe21 Removed tabs from source. 2015-07-10 10:35:58 -07:00
John MacFarlane
99fe8594d9 Avoid parsing partial URLs as HTML tags.
Closes #2277.
2015-07-10 10:33:27 -07:00
Lars-Dominik Braun
d9e17cb3f7 Tests.Readers.RST: Test metadata with inline markup too 2015-07-03 16:57:30 +02:00
Lars-Dominik Braun
8577007d9b Tests.Readers.RST: Group field list tests 2015-07-03 16:44:02 +02:00
Lars-Dominik Braun
68b6b9f652 Readers.RST: Parse field list name
“Inline markup is parsed in field names.” [1]

[1] http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#field-lists
2015-07-03 16:41:28 +02:00
Lars-Dominik Braun
3b2c50ed93 Fix RST reference names with special characters 2015-06-29 18:34:45 +02:00
Albert Krewinkel
385dcf5b99 Org reader: drop trees with a :noexport: tag
Trees having a `:noexport:` tag set are not exported.  This mirrors
default Emacs Org-Mode behavior.
2015-05-23 14:23:16 +02:00
Albert Krewinkel
d8e4a8bc10 Org reader: put header tags into empty spans
Org mode allows headers to be tagged:

``` org-mode
* Headline         :TAG1:TAG2:
```

Instead of being interpreted as part of the headline, the tags are now
put into the attributes of empty spans.  Spans without textual content
won't be visible by default, but they are detectable by filters.  They
can also be styled using CSS when written as HTML.

This fixes #2160.
2015-05-23 14:06:32 +02:00
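
The tag handling in commit d8e4a8bc10 starts by separating the trailing tags from the headline text. A rough sketch of that split follows; it is not the Org reader's code. `split_headline` is a hypothetical helper that receives the headline without its leading stars, and the set of characters allowed in a tag is an assumption.

```python
import re

# Trailing ":TAG1:TAG2:" group, preceded by whitespace.
_TAGS = re.compile(r"^(?P<title>.*?)\s+(?P<tags>:(?:[\w@#%]+:)+)\s*$")


def split_headline(text):
    """Return (title, [tags]) for a headline given without its leading stars."""
    m = _TAGS.match(text)
    if m is None:
        return text.strip(), []
    return m.group("title"), m.group("tags").strip(":").split(":")
```

Each returned tag could then be attached to its own empty span, which is what makes the tags visible to filters without showing them in the rendered headline.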
John MacFarlane
e06810499e HTML reader: Support base tag.
We only support the href attribute, as there's no place for
"target" in the Pandoc document model for links.

Added HTML reader test module, with tests for this feature.

Closes #1751.
2015-05-13 20:53:19 -07:00
John MacFarlane
64b1394fe2 Make sure a closing </div> doesn't get included in a defn list item.
Closes #2127.
2015-05-03 15:06:40 -07:00
John MacFarlane
d9d88e58e1 Fixed regression with lists inside definition lists.
This fixes a regression (not in any released version) on
things like

    hi
    :   - there

Closes #2098.
2015-04-26 11:27:47 -07:00
John MacFarlane
095b05e4f9 EPUB tests: don't use joinPath, which varies across platforms.
Instead, use a forward-slash to join paths, regardless of the
platform. This matches the way MediaBag now works.

See
56e4ecab20 (commitcomment-10858449)
2015-04-22 17:38:59 -07:00
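
The point of commit 095b05e4f9 is that a platform-dependent join produces different separators on different systems, whereas a fixed forward slash does not. The same idea in a short sketch, where `media_path` is a hypothetical name and the file names are made up:

```python
import posixpath


def media_path(*parts):
    """Join path components with '/', regardless of the host platform."""
    return posixpath.join(*parts)
```

`media_path("media", "image1.png")` gives `media/image1.png` on every platform, so expected test output does not need per-platform variants.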
Nikolay Yakimov
c1ff165154 MD Reader: Tests for links/footnotes after citations
In-text citation suffix clashes with links and footnotes
2015-04-20 01:31:45 +03:00
John MacFarlane
343b6051da Added test case for #2062. 2015-04-18 19:00:18 -07:00
John MacFarlane
d20152e011 Markdown writer: improved escaping.
`<` should not be escaped as `\<`, for compatibility with
original Markdown.  We now escape `<` and `>` with entities.
Also, we now backslash-escape square brackets.

Closes #2086.
2015-04-18 10:58:50 -07:00
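
A small sketch of the escaping rules named in commit d20152e011, covering only the four characters the message mentions. `escape_markdown_text` is a hypothetical name; the real writer escapes other characters as well, which this sketch ignores.

```python
# Angle brackets become entities; square brackets get a backslash.
_ESCAPES = {"<": "&lt;", ">": "&gt;", "[": "\\[", "]": "\\]"}


def escape_markdown_text(text):
    """Escape the characters mentioned in the commit message."""
    return "".join(_ESCAPES.get(c, c) for c in text)
```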
John MacFarlane
d3544dc6f7 Markdown definition lists: don't require indent for first line.
Previously the body of the definition (after the `:` or `~` marker)
needed to be in column 4.  This commit relaxes that requirement,
to better match the behavior of PHP Markdown Extra.  So, now
this is a valid definition list:

    foo
    : bar

This patch also helps resolve a potential ambiguity with table
captions:

    foo

      : bar

      -----
      table
      -----

Is "bar" a definition, or the caption for the table?  We'll count
it as a caption for the table.

Closes #2087.
2015-04-18 10:13:32 -07:00
John MacFarlane
10e28ef750 More principled fix for #1820.
If the tag parses as a comment, we check to see if the
input starts with `<!--`. If not, it's bogus comment mode
and we fail htmlTag.

Includes test case.  Closes #1820.
2015-04-17 22:56:33 -07:00
John MacFarlane
764f677530 Merge branch 'latex-tightlist' of https://github.com/jlduran/pandoc into jlduran-latex-tightlist
Conflicts:
	data/templates
2015-04-17 19:23:13 -07:00
John MacFarlane
28ca8566ab Merge pull request #1954 from mcmtroffaes/feature/citekey-firstchar-alphanum
Allow digit as first character of a citation key.
2015-04-17 19:10:37 -07:00
John MacFarlane
44fcc5f96e Merge pull request #2079 from lierdakil/rst-normalize-headings
RST Writer: Normalize headings to sequential levels
2015-04-17 19:06:25 -07:00
Nikolay Yakimov
94e4a5ec44 MD Reader: Test for smart ' after inline math 2015-04-18 00:53:20 +03:00
Nikolay Yakimov
4b7ddeb63f RST Writer: Tests for rubrics and heading normalization 2015-04-16 19:27:33 +03:00
Nikolay Yakimov
251ce0738d LaTeX Reader: Test for ^^ character escapes 2015-04-13 03:22:39 +03:00
John MacFarlane
2d2e4c9ab2 Merge branch 'master' of https://github.com/rootzlevel/pandoc into rootzlevel-master
Conflicts:
	src/Text/Pandoc/Readers/Org.hs
2015-03-28 21:09:38 -07:00
John MacFarlane
6a3a04c428 Merge branch 'errortype' of https://github.com/mpickering/pandoc into mpickering-errortype
Conflicts:
	benchmark/benchmark-pandoc.hs
	src/Text/Pandoc/Readers/Markdown.hs
	src/Text/Pandoc/Readers/Org.hs
	src/Text/Pandoc/Readers/RST.hs
	tests/Tests/Readers/LaTeX.hs
2015-03-28 12:12:48 -07:00
John MacFarlane
619b2e8ca2 Merge pull request #1968 from lierdakil/issue1607
Fixes for multiple docx writer style bugs.
2015-03-16 12:02:40 -07:00
John MacFarlane
0deb7c507d Merge pull request #1989 from zudov/shortcut_ref_link_pr
Support shortcut reference links in markdown writer
2015-03-15 11:58:30 -07:00
Konstantin Zudov
b9f77ed03d Support shortcut reference links in markdown writer
Issue #1977

Most markdown processors support the [shortcut format] for reference links.
Pandoc's markdown reader parsed these shortcuts unconditionally.
Pandoc's markdown writer (with --reference-links option) never shortcutted links.

This commit adds an extension `shortcut_reference_links`. The extension is
enabled by default for those markdown flavors that support reading shortcut
reference links, namely:

    - pandoc
    - strict pandoc
    - github flavoured
    - PHPmarkdown

If the extension is enabled, the reader parses the shortcuts in the same way as
it previously did. Otherwise it parses them as normal text.

If extension is enabled, writer outputs shortcut reference links unless
doing so would cause problems (see test cases in `tests/Tests/Writers/Markdown.hs`).
2015-03-10 20:32:24 +02:00
Craig S. Bosma
513221f822 Org reader: add support for smart punctuation 2015-03-09 07:11:53 -05:00
Mathias Schenner
12bf0ff3e5 LaTeX reader: allow non-empty colsep in tables
The `tabular` environment allows non-empty column separators
with the "@{...}" syntax. Previously, pandoc would fail to
parse tables if a non-empty colsep was present. With this
commit, these separators are still ignored, but the table gets
parsed. A test case is included.
2015-03-08 15:47:39 +01:00
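
The behaviour described in commit 12bf0ff3e5 (separators ignored, table still parsed) can be sketched on the column specification alone. This is a simplified illustration, not the LaTeX reader: `column_alignments` is a hypothetical helper, and it assumes the spec contains only `l`, `c`, `r`, `|`, and `@{...}` groups without nested braces.

```python
import re


def column_alignments(spec):
    """Return the alignment letters of a tabular spec, ignoring @{...} groups."""
    # Drop every @{...} column separator (no nested braces assumed).
    without_colsep = re.sub(r"@\{[^{}]*\}", "", spec)
    # Keep only the alignment letters; vertical rules are skipped.
    return [c for c in without_colsep if c in "lcr"]
```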
Mathias Schenner
1e3ef0e36f LaTeX reader: allow valign argument in tables
The `tabular` environment takes an optional parameter for
vertical alignment. Previously, pandoc would fail to parse
tables if this parameter was present. With this commit,
the parameter is still ignored, but the table gets
parsed. A test case is included.
2015-03-08 15:39:18 +01:00
Mathias Schenner
4f9a10619f LaTeX reader: add some test cases for simple tables 2015-03-08 15:17:09 +01:00
Nikolay Yakimov
59c4d28d8c Docx Writer: Tables test 2015-03-08 04:42:50 +03:00
Nikolay Yakimov
a82dedf1ff Lists test 2015-03-08 03:59:48 +03:00
Nikolay Yakimov
ae07d5ed49 Initial tests for writer 2015-03-03 14:37:02 +03:00
Hans-Peter Deifel
5871955169 Org reader: Add test for image links
Tests for image links with non-image targets, as introduced in
commit 2ca5101.
2015-02-26 13:11:50 +01:00
Jesse Rosenthal
9654514e8a Docx reader: add test for verbatim in sub/superscript. 2015-02-21 08:45:38 -05:00
Jesse Rosenthal
2995526772 Docx reader: Add tests for new list style parsing. 2015-02-19 00:24:04 -05:00
Matthew Pickering
1a7a99161a Update tests 2015-02-18 21:09:07 +00:00
Matthias C. M. Troffaes
dccd408a9c Allow digit as first character of a citation key.
* Update parser to recognize citation keys starting with a digit.
* Update documentation accordingly.
* Test case added.

See https://github.com/jgm/pandoc-citeproc/issues/97
2015-02-18 15:30:17 +00:00
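
The change in commit dccd408a9c concerns the first character of a citation key. A hedged sketch with a regular expression follows: the first-character class reflects the commit (letters, underscore, and now digits), while the set of internal punctuation characters is an assumption and `citation_keys` is a hypothetical helper, not pandoc's parser.

```python
import re

# First character: letter, digit or underscore. Punctuation is only accepted
# when followed by another key character, so trailing punctuation is excluded.
_CITE_KEY = re.compile(
    r"@([A-Za-z0-9_](?:[A-Za-z0-9_]|[:.#$%&\-+?<>~/](?=[A-Za-z0-9_]))*)"
)


def citation_keys(text):
    """Return the citation keys found in a piece of text."""
    return _CITE_KEY.findall(text)
```

A key such as `@2015smith` is now recognized, and the sentence-final period after `@doe:2014.` is not swallowed into the key.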
Jesse Rosenthal
616e211f36 Docx reader: test lists in table cells. 2015-02-13 09:08:07 -05:00
Jesse Rosenthal
e88119f2d1 Docx Reader: Add test for VML images.
Since images are often visually (not structurally) placed on the page,
people might not always get the results they're looking for here.
2015-01-21 13:41:16 -05:00
John MacFarlane
47c360e079 Improved texorpdfstring patch #1148.
* Make LaTeX reader recognize texorpdfstring.
* Don't use texorpdfstring unless it's actually needed.
* Fix tests.
2014-12-15 10:06:03 -08:00
John MacFarlane
544f3e5b45 Merge branch 'use-texorpdfstring' of https://github.com/wilx/pandoc into wilx-use-texorpdfstring
Conflicts:
	src/Text/Pandoc/Writers/LaTeX.hs
	tests/Tests/Writers/LaTeX.hs
2014-12-15 10:01:50 -08:00
John MacFarlane
a864e9a348 Merge pull request #1805 from bergey/rst
RST Reader - Improved Role Support
2014-12-15 09:06:45 -08:00
John MacFarlane
1d3ca088f2 Merge pull request #1813 from tarleb/file-links
Org reader: properly handle links to `file:target`
2014-12-14 13:36:34 -08:00
Albert Krewinkel
4d85b17fc5 Org reader: properly handle links to file:target
Org links like `[[file:target][title]]` were not handled correctly,
parsing the link target verbatim.  The org reader is changed such that
the leading `file:` is dropped from the link target.

This is related to issues #756 and #1812.
2014-12-14 21:30:10 +01:00
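
The fix in commit 4d85b17fc5 boils down to dropping a leading `file:` from the link target. A minimal sketch, with `link_target` as a hypothetical name and made-up example paths:

```python
def link_target(raw):
    """Strip a leading 'file:' from an Org link target, if present."""
    prefix = "file:"
    return raw[len(prefix):] if raw.startswith(prefix) else raw
```

So `[[file:notes.org][title]]` links to `notes.org`, while targets such as `http://example.com` are left unchanged.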
John MacFarlane
2b08e32a90 Fixed autolinks with following punctuation.
Closes #1811.
The price of this is that autolinked bare URIs can no longer
contain `>` characters, but this is not a big issue.
2014-12-14 12:20:33 -08:00
Daniel Bergey
689fb112bf RST Reader: compute Attrs when role is defined
Move recursive role lookup from renderRole to addNewRole.  The Attr value
will be the same for every occurrence of this role, so there's no reason
to compute it every time.  This allows simplifying the
stateRstCustomRoles map considerably.

We could go even further, and remove the fmt and attr arguments to
renderRole, which are null except for custom roles.
2014-12-12 14:45:45 +00:00
Daniel Bergey
4e040160e0 WIP: tests for RST roles 2014-12-12 14:45:45 +00:00
Matthew Pickering
48e2586ec8 Merge pull request #1746 from shelf/dw-ext-images
DokuWiki writer: fix external images
2014-12-08 23:55:36 +00:00
Daniel Bergey
74c1b547c2 parse RST class directives
The class directive accepts one or more class names, and creates a Div
value with those classes.  If the directive has an indented body, the
body is parsed as the children of the Div.  If not, the first block
following the directive is made a child of the Div.

This differs from the behavior of rst2xml, which does not create a Div
element.  Instead, the specified classes are applied to each child of
the directive.  However, most Pandoc Block constructors do not take an
Attr argument, so we can't duplicate this behavior.
2014-12-01 18:22:03 +00:00
Daniel Bergey
2cdfa5eb20 parse RST quoted literal blocks
closes #65
RST quoted literal blocks are the same as indented literal blocks (which
pandoc already supports) except that the quote character is preserved in
each line.

This includes test cases for the quoted literal block, as well as
additional tests for line blocks and indented literal blocks, to verify
that these are unaffected by the changes.
2014-12-01 18:22:03 +00:00
John MacFarlane
46d343f474 Fixed bug in org with bulleted lists:
- a
   - b
   * c

was being parsed as a list, even though an unindented `*`
should make a heading.  See
<http://orgmode.org/manual/Plain-lists.html#fn-1>.
2014-11-13 23:40:18 -08:00
John MacFarlane
43c1978fae Merge pull request #1645 from neongreen/issue1636
Fix 'Ext_lists_without_preceding_blankline' bug.
2014-11-12 09:05:29 -08:00