Commit graph

1053 commits

Author SHA1 Message Date
Bjorn Buckwalter
7960013cd4 Escape spaces. Fixes jgm/pandoc#1694. 2014-10-15 21:06:41 +02:00
Timothy Humphries
4f4b0f031d Respect indent when parsing Org bullet lists
Fixes issue with top-level bullet list parsing.
Previously we would use `many1 spaceChars` rather than respecting
the list's indent level. We also permitted `*` bullets on unindented
lists, which should unambiguously parse as `header 1`.
Combined, this meant headers at a different indent level were
being unwittingly slurped into preceding bullet lists, as per
Issue #1650.
2014-10-12 03:18:36 -04:00
John MacFarlane
fe6d43b3e0 Merge pull request #1601 from jkr/windowsfix
Fix path-slashes inside archive for windows
2014-09-27 16:21:17 -07:00
Matthew Pickering
fa2d11c954 Update tests for #1649 2014-09-27 22:40:25 +01:00
Artyom
bc115ffc2d Fix 'Ext_lists_without_preceding_blankline' bug.
* Fixes #1636.
  * Adds a test.
2014-09-26 13:32:08 +04:00
mpickering
c0b9ad4c5d EPUB Tests: Seperating image testing from other features 2014-09-25 13:33:25 +01:00
mpickering
cc07d0c6bf Shared: Make collapseFilePath OS-agnostic 2014-09-25 12:42:53 +01:00
John MacFarlane
0de27f3579 Updated tests for #1616 change. 2014-09-09 10:09:05 -07:00
Jesse Rosenthal
8d22bf26ab LaTeX writer: Test for protecting images in header. 2014-09-09 11:05:47 -04:00
Jesse Rosenthal
f56e0e958a Docx reader: Add test for polyglot headers.
Only Danish at the moment.
2014-09-05 22:07:06 -04:00
Jesse Rosenthal
313355e373 Org reader: Update Tests
Test for markup after blank line.
2014-09-04 19:55:53 -04:00
Jesse Rosenthal
08359c44e4 Docx Reader: Add tests for numbered headers. 2014-09-04 19:39:49 -04:00
Jesse Rosenthal
a6eead7f26 Docx reader: Modify mediabag test accordingly. 2014-09-02 14:05:54 -04:00
John MacFarlane
3533218d6d Merge pull request #1594 from jkr/itemFix
Item fix
2014-08-31 19:31:38 -07:00
Jesse Rosenthal
dfc8ab5a6a LaTeX writer: Add tests for header-in-item. 2014-08-31 16:05:20 -04:00
John MacFarlane
598d3ee23b Markdown reader: better handling of paragraph in div.
Previously text that ended a div would be parsed as Plain
unless there was a blank line before the closing div tag.

Test case:

    <div class="first">
    This is a paragraph.

    This is another paragraph.
    </div>

Closes #1591.
2014-08-31 12:55:47 -07:00
John MacFarlane
374bb3c147 DokuWiki writer: Make tables prettier by aligning columns.
Also cleaned up crufty code and added tests.
2014-08-30 21:24:33 -07:00
John MacFarlane
a8273009ba Textile reader: Improved table support.
We can now handle all different alignment types, for simple
tables only (no captions, no relative widths, cell contents just
plain inlines).  Other tables are still handled using raw HTML.

Addresses #1585 as far as it can be addresssed, I believe.
2014-08-30 20:34:42 -07:00
Jesse Rosenthal
4eb9769a4c Dokuwiki writer: Add a test for multiblock table cells.
We have to add a new file, because the original table tests don't look
for this.
2014-08-30 15:34:40 -04:00
John MacFarlane
218633548f Merge pull request #1574 from jlduran/latex-horizontal-rule
LaTeX writer: Make Horizontal Rules more flexible
2014-08-29 21:40:55 -07:00
John MacFarlane
017d44af1d Merge branch 'ugly-tables' of https://github.com/jlduran/pandoc into jlduran-ugly-tables 2014-08-29 21:24:36 -07:00
Jose Luis Duran
4c684561ee LaTeX writer: Add \strut to fix multiline tables
See: http://tex.stackexchange.com/questions/34971
2014-08-29 13:54:08 +00:00
Jesse Rosenthal
3b5a868533 Docx reader: update tests for new table behavior. 2014-08-28 14:39:04 -04:00
Jose Luis Duran
1fc665c07d LaTeX writer: Make Horizontal Rules more flexible
Currently, pandoc has hard-coded the following in order to make horizontal
rules in LaTeX:

```hs
"\\begin{center}\\rule{3in}{0.4pt}\\end{center}"
```

Which is fine, but does not allow customizations.  It also does not take into
consideration the current line width.

I'm proposing this change:

```diff
@@ In Writers/LaTeX.hs:
-"\\begin{center}\\rule{3in}{0.4pt}\\end{center}"
+"\\begin{center}\\rule{0.5\\linewidth}{\\linethickness}\\end{center}"
```
2014-08-28 03:12:37 +00:00
Jose Luis Duran
f1d330b7b5 LaTeX writer: Fix tables
- [x] Fix a bug introduced in 66378062b6, which
  causes the table caption to repeat across all pages
- [x] Address the issues discussed
  [here](https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J)
  regarding the extra vertical space.
  - [ ] NOTE: This will cause multiline table cells to appear unpadded. See
    http://tex.stackexchange.com/questions/34971
  - [x] Use [`\tabularnewline`](http://tex.stackexchange.com/questions/78796)
    instead of `\\`.
2014-08-28 02:02:20 +00:00
John MacFarlane
17f71002ae Merge pull request #1553 from mpickering/master
Txt2Tags reader: Header is now parsed only if standalone flag is set
2014-08-21 16:42:28 -07:00
mpickering
2a7319541d Txt2Tags Reader: Parse Meta information
The header is now parsed as meta information. The first line is the
`title`, the second is the `author` and third line is the `date`.
2014-08-21 17:09:40 +01:00
mpickering
2cd049a1bf Txt2Tags reader: Header is now parsed only if standalone flag is set 2014-08-20 18:11:37 +01:00
John MacFarlane
13469af843 More test updates. 2014-08-20 09:47:47 -07:00
John MacFarlane
7e69f4a4fa Updated tests. 2014-08-20 09:23:19 -07:00
John MacFarlane
9d52ecdd42 HTML reader: Parse appropriately styled span as SmallCaps. 2014-08-16 22:57:00 -07:00
Jesse Rosenthal
180f5cbe63 Docx reader: Test for character styles. 2014-08-16 14:05:56 -04:00
John MacFarlane
8bf39cf6d6 Markdown reader: Better handle quote characters in inline links.
This was previously failing to be recognized as a link:

    [Test](http://en.wikipedia.org/wiki/Ward's_method)

Closes #1534.
2014-08-14 10:59:27 -07:00
John MacFarlane
e917bcc124 Make raw_tex extension non-default for textile reader, writer.
Enable `raw_tex` extension in textile writer.

Closes #1532.
2014-08-14 09:49:31 -07:00
John MacFarlane
76d14bcc11 Old tests: better path for test program. 2014-08-13 12:20:25 -07:00
John MacFarlane
fc17b3bd41 Make options work with test-pandoc. 2014-08-13 12:19:42 -07:00
John MacFarlane
40e67b8737 Revised tests directory.
Renamed some tests, introducing subsidiary directories
for fb2, docx, epub.

Cleaned up tests in cabal file.

Combined dokuwiki-writer and dokuwiki_inline_formatting tests.
2014-08-13 11:16:50 -07:00
John MacFarlane
1d6e1cf9f3 Removed special testHook from Setup.
This was just too fragile and dependent on a changing Cabal API
(see #1526).

Instead of passing the bulid directory to the test program, we
now let the test program find itself (using executable-path)
and then find the pandoc executable relative to itself.
2014-08-13 08:12:07 -07:00
John MacFarlane
7684b24959 Merge pull request #1528 from mpickering/epubtitlepage
EPUB Reader: Ignores titlepage attribute
2014-08-12 16:56:38 -07:00
Matthew Pickering
063ba81622 EPUB Tests: Added wasteland test
This epub contains many epub:type elements including footnotes and
titlepage.
2014-08-13 00:25:18 +01:00
John MacFarlane
81157c7cc6 HTML writer: use 'uri' or 'email' class for autolinks.
This allows them to be styled specially.

Closes #1501.
2014-08-12 15:49:43 -07:00
John MacFarlane
3653018a70 Added mathml tests for docbook reader. 2014-08-12 14:10:52 -07:00
John MacFarlane
f97ec6db2c Markdown reader: Improved parsing of indented code in list items.
Indented code at the beginning of a list item must be indented eight
spaces from the margin (or from the edge of the container), or four
spaces past the list marker, whichever is farther.

Some examples in `tests/markdown-reader-more.txt`.
2014-08-12 11:10:48 -07:00
Jesse Rosenthal
45ec035e93 Docx reader: combine inlines properly in dropcaps.
Make sure that adjacent inlines are combined properly in dropcaps. This
updates the test results as well.
2014-08-11 23:31:16 -04:00
Jesse Rosenthal
0808449547 Docx: Add dropcap tests. 2014-08-11 23:10:50 -04:00
John MacFarlane
31811657fa Merge pull request #1521 from jkr/emptyEmph
Discard empty formatters
2014-08-11 14:44:08 -07:00
Jesse Rosenthal
241ef57bb2 Docx reader test: Add an emphasized space to normalize test.
This should be ignored, so the output should be the same as the previous
test.
2014-08-11 15:11:28 -04:00
John MacFarlane
6fae136cbb Textile reader: list and HTML block parsing improvements.
Closes #1513.

Lists can now start without an intervening blank line.
Also, html block-level tags that don't start a line are parsed
as RawInline and don't interrupt paragraphs, as in RedCloth.
2014-08-11 11:22:39 -07:00
Matthew Pickering
853830d12b Docx Parse: Updated tests 2014-08-11 10:30:32 -04:00
Matthew Pickering
f33ae631f3 Improved EPUB Tests
Rewrote features test to remove all unimplemented features.

There are now all three examples of where an image can be included in
the test.
  1. Cover image
  2. As a spine elemnt
  3. In the document

Tests have also been added to make sure that the mediabag contains all
these images after processing.
2014-08-10 14:58:53 +01:00
John MacFarlane
7ec8dd956f Removed OMath module, depend on texmath >= 0.8. 2014-08-10 06:19:41 -07:00
Matthew Pickering
ecb42cc002 Docx Tests: Updated for reading sym element 2014-08-09 22:37:12 -04:00
Matthew Pickering
5bedaba080 Added test for sym element 2014-08-09 22:37:12 -04:00
Matthew Pickering
d0fbe5b751 Updated EPUB tests 2014-08-09 23:06:16 +01:00
John MacFarlane
bc06ef0edb Merge branch 'newbranch' of https://github.com/mpickering/pandoc into mpickering-newbranch
Conflicts:
	src/Text/Pandoc/Readers/EPUB.hs
2014-08-08 22:22:55 -07:00
John MacFarlane
a4a6b6f28c Plain writer: Use ALL CAPS for level 1 headers. 2014-08-08 15:20:29 -07:00
Matthew Pickering
5b5e53024d Added tests for collapseFilePath 2014-08-08 22:31:02 +01:00
John MacFarlane
f723a0575d Markdown writer: Respect -raw_html.
pandoc -t markdown-raw_html should not emit any raw HTML, even
span and div tags that go with pandoc Span and Div elements.

Cleaned up a bit of the logic with extensions and plain.
2014-08-08 13:34:57 -07:00
John MacFarlane
7b47042ae6 Textile reader: fixed list parsing bug. Closes #1500. 2014-08-08 12:18:47 -07:00
John MacFarlane
10b662c120 EPUB test renaming.
Renamed epub test files so they're identified more clearly as
epub:  features.{epub,native} -> epub.features.{epub,native},
and similarly with formatting.{epub,native}.

Added epub test files to cabal file, so they'll be included in
the tarball.
2014-08-07 22:25:06 -07:00
John MacFarlane
94466c0060 HTML reader: Really ignore DOCTYPE and xml declarations.
This actually does what d71b013841
said it did.

Revised epub tests to remove the repeated DOCTYPE and xml tags.
2014-08-07 22:12:44 -07:00
John MacFarlane
3c4079edc8 Merge pull request #1488 from mpickering/epubfixes
EPUB Reader: Improved image extraction
2014-08-07 19:00:32 -07:00
Matthew Pickering
090b2da267 EPUB tests: Updated test file 2014-08-07 22:56:30 +01:00
Jesse Rosenthal
98d14b2b2a Docx reader: Test inline image code. 2014-08-07 15:34:49 -04:00
John MacFarlane
4630cff2a6 Merge branch 'epubend' of https://github.com/mpickering/pandoc into mpickering-epubend
Conflicts:
	pandoc.cabal
2014-08-04 07:36:18 -07:00
Artyom Kazak
ec88d47f23 Correctly implement capitalisation.
Using `map toUpper` to capitalise text is wrong, as e.g.
“Straße” should be converted to “STRASSE”, which is 1 character
longer. This commit adds a `capitalize` function and replaces
2 identical implementations in different modules (`toCaps` and
`capitalize`) with it.
2014-08-03 17:37:37 +04:00
John MacFarlane
1b9371cfdf Merge branch 'underline-option' of https://github.com/jkr/pandoc 2014-07-31 16:40:20 -07:00
Jesse Rosenthal
43ca784d1c Update docx test to interpret single underline as emph. 2014-07-31 18:46:43 -04:00
Matthew Pickering
0ae2c1f146 EPUB Reader: Added tests 2014-07-31 21:39:50 +01:00
Jesse Rosenthal
96ad37536e Remove now unneeded JSON test file. 2014-07-31 15:47:45 -04:00
Jesse Rosenthal
ed71e9b31d Docx tests: rewrite mediabag tests.
This will allow us to test the whole mediabag (making sure, for example,
that images are added with the correct keys) instead of just individual
extracted images. We compare each entry in the media bag to an image
extracted on the fly from the docx. As a result, we only need one file
to test with.

The image in the current tests was also replaced with a smaller one.
2014-07-31 15:47:45 -04:00
John MacFarlane
6dd2418476 New module, Text.Pandoc.MediaBag.
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
2014-07-31 12:00:21 -07:00
John MacFarlane
00662faefb Made MediaBag a newtype, and added mime type information to media.
Shared now exports functions for interacting with a MediaBag:

- `emptyMediaBag`
- `lookuMedia`
- `insertMedia`
- `mediaDirectory`
- `extractMediaBag`
2014-07-31 11:05:35 -07:00
Jesse Rosenthal
4d1d8a4b6f Docx test: Test image from media bag. 2014-07-30 22:32:55 -04:00
Jesse Rosenthal
b24b328906 Docx tests: Add test image.
This is the cow image extracted from `docx.image.docx`.
2014-07-30 22:32:27 -04:00
Jesse Rosenthal
16f88edb3b Docx tests: Added media test comparison function.
Also tell pandoc.cabal that we'll be needing base64, since we want to
compare strings here.
2014-07-30 22:31:38 -04:00
John MacFarlane
3e26fb517d Updated RTF writer tests. 2014-07-30 15:30:51 -07:00
Jesse Rosenthal
941df1b0de Docx reader: change tests to make use of media bag. 2014-07-30 12:46:53 -04:00
John MacFarlane
8c2ed54e2e LaTeX writer: use \(..\) instead of $..$ for inline math.
Closes #1464.
2014-07-29 20:45:49 -07:00
John MacFarlane
8d4eebaff4 Merge pull request #1463 from jkr/metadata
Make metadata out of styled pars
2014-07-29 11:15:34 -07:00
Jesse Rosenthal
54708da371 Add and update docx tests in pandoc.cabal. 2014-07-29 13:05:19 -04:00
Jesse Rosenthal
840108a9c1 Docx reader: Make metavalues out of styled paragraphs.
This will make paragraphs styled with `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract` into pandoc metavalues, rather than text. The
implementation only takes those elements from the beginning of the
document (ignoring empty paragraphs).

Multiple paragraphs in the `Author` style will be made into a metaList,
one paragraph per item. Hard linebreaks (shift-return) in the paragraph
will be maintained, and can be used for institution, email, etc.
2014-07-29 13:03:01 -04:00
John MacFarlane
c302ab3133 Markdown writer: More improvements to 'plain' output, updated tests.
Math now appears in unicode if possible, without the distracting
italics around identifiers.

Blank lines around headers are more consistent.

Footnotes appear in regular [n] style.
2014-07-27 07:57:23 -07:00
Matthew Pickering
e340a7da02 Txt2Tags Reader: Added tests 2014-07-27 00:12:57 +01:00
John MacFarlane
18f4490482 Fixed runtime error with compactify'DL on certain lists.
Closes #1452.  Added test.
2014-07-25 10:53:04 -07:00
John MacFarlane
9c3f7688ee DocBook reader: Better handle elements inside code environments.
Of course, we can't include structure in the code block, but
this way we at least preserve the text.  Closes #1449.
2014-07-23 10:06:36 -07:00
John MacFarlane
98c7ada061 HTML reader: parse Div and Span elements even without --parse-raw.
Closes #1434.
2014-07-20 21:43:54 -07:00
John MacFarlane
4af8eed764 Markdown reader: revised definition list syntax (closes #1429).
* This change brings pandoc's definition list syntax into alignment
  with that used in PHP markdown extra and multimarkdown (with the
  exception that pandoc is more flexible about the definition markers,
  allowing tildes as well as colons).

* Lazily wrapped definitions are now allowed; blank space is required
  between list items; and the space before definition is used to
  determine whether it is a paragraph or a "plain" element.

* For backwards compatibility, a new extension,
  `compact_definition_lists`, has been added that restores the behavior
  of pandoc 1.12.x, allowing tight definition lists with no blank space
  between items, and disallowing lazy wrapping.
2014-07-20 16:33:59 -07:00
John MacFarlane
87096c64f8 Org reader: text adjacent to a list yields a Plain, not Para.
This gives better results for tight lists.  Closes #1437.

An alternative solution would be to use Para everywhere, and
never Plain.  I am not sufficiently familiar with org to know
which is best.  Thoughts, @tarleb?
2014-07-20 12:56:01 -07:00
John MacFarlane
0f01421f81 AsciiDoc writer: Double markers in intraword emphasis.
Closes #1441.
2014-07-20 12:24:53 -07:00
Craig S. Bosma
1bb4f0c497 Org reader: Respect :exports header arguments on code blocks
Adds support to the org reader for conditionally exporting either the code block,
results block immediately following, both, or neither, depending on the value
of the `:exports` header argument. If no such argument is supplied, the default
org behavior (for most languages) of exporting code is used.
2014-07-17 10:23:22 -05:00
John MacFarlane
047f9b3714 Merge pull request #1430 from jkr/anchor-fix-2
Fix auto identified headers when already auto-id'ed
2014-07-15 20:27:28 -07:00
John MacFarlane
c24ab14918 LaTeX writer: Use \nolinkurl in email autolinks.
This allows them to be styled using `\urlstyle{tt}`.

Thanks to Ulrike Fischer for the solution.
2014-07-15 16:42:39 -07:00
Jesse Rosenthal
643435f1de Docx reader: Add test
Test auto ident header anchors with pandoc-generated pandoc.
2014-07-15 18:32:19 +01:00
John MacFarlane
3e95fd586d Use raw HTML for complex block quotes.
As far as I can see, dokuwiki markup is pretty limited in what
can go in a `>` block quote:  just a single line of paragraph
text.  (#1398)
2014-07-13 16:15:45 -07:00
John MacFarlane
81088281de DokuWiki writer: Use raw HTML for complex lists...
as in the mediawiki writer.  The dokuwiki markup isn't able
to handle multiple block-level items within a list item, except
in a few special cases (e.g. code blocks, and these must be started
on the same line as the preceding paragraph).  So we fall back to
raw HTML for these.

Perhaps there is a better solution.  We can "fake" multiple
paragraphs within list items using hard line breaks (`\\`), but
we must keep everything on one line.

(#1398)
2014-07-13 16:04:29 -07:00
John MacFarlane
0ba2f0b8f9 DokuWiki writer: Normalize to collapse adjacent raw HTML blocks. 2014-07-13 15:48:01 -07:00
John MacFarlane
15956fcac7 DokuWiki writer: Updated tests. 2014-07-13 15:45:59 -07:00
John MacFarlane
5df812f7eb Merge branch 'claremacrae-dokuwiki'.
Use removeFormatting from Shared instead of the custom unfancy
function.
2014-07-13 14:44:53 -07:00
John MacFarlane
ff86702a95 Added failing test for issue #1121. 2014-07-10 14:23:20 -07:00