Commit graph

191 commits

Author SHA1 Message Date
danse
e6ff7f7986 Docx reader: Pick table width from the longest row or header
This change is intended to preserve as much of the table content as
possible

Closes #4360
2018-02-15 15:06:01 -05:00
Alexander Krotov
82a0ceaf18 Muse reader: fix directive parsing
This fixes bugs introduced in commit 4bfab8f04c.
2018-02-15 18:17:24 +03:00
Alexander Krotov
42e39fbd26 Muse reader: parse definition lists with multiple descriptions 2018-02-13 14:34:45 +03:00
Alexander Krotov
8aed3652c2 Muse reader: refactor to avoid reparsing
Lists are parsed in linear instead of exponential time now.

Contents of block tags, such as <quote>, is parsed directly,
without storing it in a string and parsing with parseFromString.

Fixed a bug: headers did not terminate lists.
2018-02-12 17:30:57 +03:00
Alexander Krotov
3480a8acc2 Muse reader: paragraph indentation does not indicate nested quote
Muse allows indentation to indicate quotation or alignment,
but only on the top level, not within a <quote> or list.

This patch also simplifies the code by removing museInQuote
and museInList fields from the state structure.
Headers and indented paragraphs are attempted to be parsed
only at the topmost level, instead of aborting parsing with guards.
2018-02-12 04:57:56 +03:00
Alexander Krotov
450a200637 Muse reader: test empty quote tag 2018-02-11 19:45:16 +03:00
Alexander Krotov
1dfda7e204 Muse reader: require that block tags are on separate lines
Text::Amuse already explicitly requires it anyway.
Supporting block tags on the same line as contents makes
it hard to combine closing tag parsers with indentation parsers.
Being able to combine parsers is required for no-reparsing refactoring
of Muse reader.
2018-02-11 19:35:58 +03:00
Alexander Krotov
6c45f8c8f6 Muse reader: test that two blank lines after verse can separate list items
Unlike paragraph and <quote> tag parsers, verse parser consumes newline.
For this reason only three or more blank lines can separate list items.
2018-02-05 01:39:38 +03:00
Alexander Krotov
3b667c54ea Muse reader: test that lists can be separated with two blanklines after blockquote
Existing tests only checked this for paragraphs.
2018-02-05 00:25:31 +03:00
Alexander Krotov
248f6076bc Muse reader: fix parsing of trailing whitespace
Newline after whitespace now results in softbreak
instead of space.
2018-01-28 03:18:29 +03:00
Alexander Krotov
6337539e32 Muse reader: fix matching of closing inline tags 2018-01-24 14:16:56 +03:00
Alexander Krotov
98f0e2053e Muse reader: remove multiple descriptions during round-trip tests 2018-01-20 18:34:42 +03:00
Alexander Krotov
e1cc9d9abc Muse reader: enable definition lists in round-trip test 2018-01-20 14:09:44 +03:00
John MacFarlane
b8ffd834cf hlint code improvements. 2018-01-19 21:25:24 -08:00
Alexander Krotov
22b69b557e Muse reader: fix parsing of nested definition lists 2018-01-20 02:14:27 +03:00
Alexander Krotov
7680e9b964 Muse reader: require only one space for nested definition list indentation 2018-01-19 14:16:20 +03:00
Alexander Krotov
19d2576223 Muse reader: parse definition list terms without parseFromString 2018-01-19 01:50:17 +03:00
Alexander Krotov
a516198d47 Muse reader: fix parsing of code at the beginning of paragraph 2018-01-18 15:35:43 +03:00
Alexander Krotov
5f57094a47 Muse reader: refactor definition list parsing
Test with wrong indentation is removed,
because now it is parsed as nested lists.
Emacs Muse and Text::Amuse don't have the same
behavior anyway.
2018-01-18 14:55:07 +03:00
Alexander Krotov
9986ccb333 Muse reader: parse "~~" as non-breaking space in Text::Amuse mode
Latest Text::Amuse supports "~~"
2018-01-18 02:46:02 +03:00
Alexander Krotov
ab85143e8a Muse reader: refactor list parsing
Now list item contents is parsed as blocks,
without resorting to parseFromString.

Only the first line of paragraph has to
be indented now, just like in Emacs Muse
and Text::Amuse.

Definition lists are not refactored yet.

See also: issue #3865.
2018-01-18 02:17:26 +03:00
Jesse Rosenthal
004f60bf26 Docx reader: Add test for hyperlinks in instrText tag
This is difficult to recreate with a modern version of Word, so I'm
using the file submitted with the bug report. It would be preferable
to find a smaller example with Latin characters, though, so as not to
confuse the issue being tested.
2018-01-16 13:22:02 -05:00
John MacFarlane
d9584d73f9 Markdown reader: Improved inlinesInBalancedBrackets.
The change both improves performance and fixes a
regression whereby normal citations inside inline notes
were not parsed correctly.

Closes jgm/pandoc-citeproc#315.
2018-01-14 12:24:21 -08:00
John MacFarlane
e5abee82f2 Shorten unbalanced brackets test.
It was taking a lot of time.
2018-01-14 12:24:21 -08:00
Jesse Rosenthal
a5b71a3c7f Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
Jesse Rosenthal
3f30455b49 Docx reader: tests for overlapping targets (anchor spans). 2017-12-31 09:36:42 -05:00
Jesse Rosenthal
475b0dcb66 Docx reader: tests for removing unused anchors. 2017-12-30 22:43:33 -05:00
Alexander Krotov
551aec7b01 Muse reader: enable round trip test
Closes #4107
2017-12-30 20:32:16 +03:00
John MacFarlane
ddd6a89247 Text.Pandoc.Class: add insertInFileTree (API change).
This gives a pure way to insert an ersatz file into a FileTree.

In addition, we normalize paths both on insertion and on
lookup, so that "foo" and "./foo" will be judged equivalent.
2017-12-28 10:23:09 -08:00
Albert Krewinkel
c6b5d65161
Org smart test: drop superfluous import
Keeps GHC 7.8 and GHC 7.10 happy.
2017-12-28 14:51:03 +01:00
Albert Krewinkel
e5c8b65004
Org reader: support minlevel option for includes
The level of headers in included files can be shifted to a higher level
by specifying a minimum header level via the `:minlevel` parameter. E.g.
`#+include: "tour.org" :minlevel 1` will shift the headers in tour.org
such that the topmost headers become level 1 headers.

Fixes: #4154
2017-12-28 14:16:04 +01:00
Albert Krewinkel
2d443ecb07
Break-up org reader test file
The org reader test file had grown large, to the point that editor
performance was negatively affected in some cases. The tests are spread
over multiple submodules, and re-combined into a tasty TestTree in the
main org reader test file.
2017-12-28 14:15:58 +01:00
Jesse Rosenthal
d71165c8e2 Docx reader: add tests for structured document tags unwrapping. 2017-12-27 10:03:00 -05:00
John MacFarlane
af04881655
Merge pull request #4177 from stencila/jats-xml-reader
Add Basic JATS reader based on DocBook reader
2017-12-21 23:16:03 -07:00
Hamish Mackenzie
d853571397 Improve support for code language in JATS 2017-12-22 15:24:54 +13:00
Alexander Krotov
0405c5b461 Muse reader: parse anchors immediately after headings as IDs 2017-12-21 15:52:10 +03:00
Alexander Krotov
b5e62a5c09 Muse reader: require that note references does not start with 0 2017-12-20 14:00:30 +03:00
Hamish Mackenzie
5d3c9e5646 Add Basic JATS reader based on DocBook reader 2017-12-20 13:54:02 +13:00
Alexander Krotov
f6abf15832 Muse reader: parse empty comments correctly 2017-12-19 04:23:32 +03:00
John MacFarlane
79c3f57c47 Added tests of latex tokenizer.
This should help prevent regressions like #4159.
2017-12-15 10:13:43 -08:00
Jesse Rosenthal
440533643e Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
Alexander Krotov
e536c4d9c9 hlint Muse reader and tests 2017-12-06 19:38:25 +03:00
Alexander Krotov
6fd3cdac46 Muse reader: add test for #disable-tables directive in Emacs mode 2017-12-06 19:35:01 +03:00
Alexander Krotov
3ae359721d Muse reader: don't allow emphasis to be preceded by letter 2017-12-06 19:04:35 +03:00
John MacFarlane
ae60e0196c Add empty_paragraphs extension.
* Deprecate `--strip-empty-paragraphs` option.  Instead we now
  use an `empty_paragraphs` extension that can be enabled on
  the reader or writer.  By default, disabled.

* Add `Ext_empty_paragraphs` constructor to `Extension`.

* Revert "Docx reader: don't strip out empty paragraphs."
  This reverts commit d6c58eb836.

* Implement `empty_paragraphs` extension in docx reader and writer,
  opendocument writer, html reader and writer.

* Add tests for `empty_paragraphs` extension.
2017-12-04 14:56:57 -08:00
Alexander Krotov
ed261e5832 Muse reader: add underline support in Emacs Muse mode 2017-12-04 15:59:26 +03:00
John MacFarlane
d6c58eb836 Docx reader: don't strip out empty paragraphs.
We now have the `--strip-empty-paragraphs` option for that,
if you want it.  Closes #2252.

Updated docx reader tests.

We use stripEmptyParagraphs to avoid changing too
many tests.  We should add new tests for empty paragraphs.
2017-12-02 16:51:31 -08:00
Alexander Krotov
7751391fce Muse reader: correctly remove indentation from notes
Exactly one space is required and considered to be part of the marker.
2017-11-29 05:12:25 +03:00
John MacFarlane
5a225aa603 Temporarily disable round-trip block test for muse reader.
See #4107.
2017-11-28 16:13:01 -08:00
Alexander Krotov
c2993a6fc6 Muse reader: parse "~~" as non-breaking space in Emacs mode 2017-11-27 12:25:06 +03:00
Alexander Krotov
00004f042c Muse reader: make code blocks round trip 2017-11-27 04:54:23 +03:00
Alexander Krotov
bdad8c1d69 Muse reader: drop common space prefix from list items 2017-11-26 22:14:18 +03:00
Alexander Krotov
a8ac673285 Muse reader: Add partial round trip test 2017-11-26 02:01:39 +03:00
Alexander Krotov
ea2ea455b3 Muse reader: don't interpret XML entities 2017-11-25 22:46:25 +03:00
Alexander Krotov
77af25b4c3 Muse reader: parse markup in definition list terms 2017-11-24 14:02:43 +03:00
Alexander Krotov
137c7c2a65 Muse reader: allow definition to end with EOF 2017-11-24 13:16:09 +03:00
Alexander Krotov
0cfd764d27 Muse: move inline list normalization to writer 2017-11-24 12:17:20 +03:00
Albert Krewinkel
cd85c73ded
Org reader: allow empty list items
Fixes: #4090
2017-11-22 22:53:24 +01:00
Alexander Krotov
75e2a1104c Muse reader: allow list items to be empty 2017-11-22 18:49:07 +03:00
Alexander Krotov
0b63ac2db1 Muse reader: add ordered list test 2017-11-22 18:48:45 +03:00
Alexander Krotov
c8ab4789b6 Muse reader: add more multiline definition tests 2017-11-22 15:23:09 +03:00
Alexander Krotov
351765d4ad Muse reader: concatenate inlines of the same type 2017-11-22 01:22:43 +03:00
Alexander Krotov
df3a80cc97 Muse writer: escape only </code> inside code tag
Additional <verbatim> is not needed as <code> is verbatim already.
2017-11-22 01:22:43 +03:00
Alexander Krotov
6c17117ef2 Muse reader: add inline <literal> support 2017-11-21 19:53:55 +03:00
Alexander Krotov
59f537c31f Muse reader: test <literal> blocks 2017-11-21 19:01:53 +03:00
Alexander Krotov
82bcda80c6 Muse reader: count only one space as part of list item marker 2017-11-19 04:40:00 +03:00
Alexander Krotov
163af3fdee Muse reader: produce SoftBreaks on newlines
Now wrapping can be preserved with --wrap=preserve
2017-11-19 02:37:52 +03:00
Alexander Krotov
6018a2324d Muse reader: Add Text::Amuse footnote extensions
Footnote end is indicated by indentation,
so footnotes can be placed anywhere in the text,
not just at the end of it.
2017-11-18 23:43:02 +03:00
Alexander Krotov
3a83b3843d Replace "emacs" extension with "amuse" extension
It makes clear that extension is related to Muse markup.
2017-11-13 18:41:49 +03:00
Alexander Krotov
df4cb20f29 Muse reader: accept Emacs Muse definition lists
Emacs Muse does not require indentation.
2017-11-12 18:08:41 +03:00
Sascha Wilde
03361f0a68 Creole reader: additional test on nowiki-block after para. 2017-10-31 22:26:35 +01:00
Sascha Wilde
fa67d6e86f Creole reader: fixed lists with trailing white space. 2017-10-31 18:55:27 +01:00
John MacFarlane
6a1476e7e2 Export all of Text.Pandoc.Class from Text.Pandoc. 2017-10-29 15:00:49 -07:00
John MacFarlane
ff16db1aa3 Automatic reformating by stylish-haskell. 2017-10-27 20:28:29 -07:00
John MacFarlane
a2a14f9029 Removed old adjacent_links test for docx reader.
See #2270 for background -- this test blocked the consistent
underline change and was hard to revise, so for now we are
removing it.
2017-10-27 16:09:44 -07:00
hftf
7f8a3c6cb7 Consistent underline for Readers (#2270)
* Added underlineSpan builder function.  This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.

* Docx Reader: Use underlineSpan and update test

* Org Reader: Use underlineSpan and add test

* Textile Reader: Use underlineSpan and add test case

* Txt2Tags Reader: Use underlineSpan and update test

* HTML Reader: Use underlineSpan and add test case
2017-10-27 18:45:00 -04:00
Sascha Wilde
66fd3247ea Creole reader (#3994)
This is feature complete but not very thoroughly tested yet.
2017-10-26 19:19:28 -04:00
Ben Firshman
9046dbadb1
Latex reader: Skip spaces in image options 2017-10-17 16:42:11 +03:00
Ben Firshman
d73fdbf895
Add tests for existing \includegraphics behaviour 2017-10-17 15:09:55 +03:00
Albert Krewinkel
f176ad6f21
Org reader: end footnotes after two blank lines
Footnotes can not only be terminated by the start of a new footnote or a
header, but also by two consecutive blank lines.
2017-10-08 14:17:26 +02:00
bucklereed
c359bdd9b1 LaTeX reader: read polyglossia/babel \text($LANG){...}. 2017-10-06 12:17:50 +01:00
Albert Krewinkel
514662e544
Org reader: support \n export option
The `\n` export option turns all newlines in the text into hard
linebreaks.

Closes #3950
2017-10-02 23:11:58 +02:00
John MacFarlane
f3a80034ff Removed writerSourceURL, add source URL to common state.
Removed `writerSourceURL` from `WriterOptions` (API change).
Added `stSourceURL` to `CommonState`.
It is set automatically by `setInputFiles`.

Text.Pandoc.Class now exports `setInputFiles`, `setOutputFile`.

The type of `getInputFiles` has changed; it now returns `[FilePath]`
instead of `Maybe [FilePath]`.

Functions in Class that formerly took the source URL as a parameter
now have one fewer parameter (`fetchItem`, `downloadOrRead`,
`setMediaResource`, `fillMediaBag`).

Removed `WriterOptions` parameter from `makeSelfContained` in
`SelfContained`.
2017-09-30 16:11:20 -05:00
Alexander Krotov
b5d064e8f0 Muse reader: parse anchors 2017-09-28 14:57:24 +03:00
Alexander Krotov
2cdb8fe2e6 Muse reader: test metadata parsing 2017-09-26 19:31:10 +03:00
Albert Krewinkel
3a7663281a
Org reader: update emphasis border chars
The org reader was updated to match current org-mode behavior: the set
of characters which are acceptable to occur as the first or last
character in an org emphasis have been changed and now allows all
non-whitespace chars at the inner border of emphasized text (see
`org-emphasis-regexp-components`).

Fixes: #3933
2017-09-25 09:31:29 +02:00
John MacFarlane
ddecd72783 Merge pull request #3911 from labdsf/muse-reader-braces
Muse reader: parse {{{ }}} example syntax
2017-09-11 14:01:05 -07:00
Alexander Krotov
8e4ee66563 Muse reader: allow inline markup to be followed by punctuation
Previously code was not allowed to be followed by comma,
and emphasis was allowed to be followed by letter.
2017-09-11 18:34:32 +03:00
Alexander Krotov
508c3a64d8 Muse reader: parse {{{ }}} example syntax 2017-09-11 18:17:28 +03:00
Alexander Krotov
27cccfac84 Muse reader: parse verbatim tag 2017-09-11 12:13:09 +03:00
Alexander Krotov
afedb41b17 Muse reader: trim newlines from <example>s 2017-09-10 12:42:24 +03:00
Alexander Krotov
2230371304 Muse reader: debug inline code markup 2017-09-09 16:39:06 +03:00
Alexander
743413a5b5 Muse reader: Allow finishing header with EOF (#3897) 2017-09-06 08:48:06 -07:00
Alexander
350c282f20 Muse reader: require at least one space char after * in header (#3895) 2017-09-05 09:41:27 -07:00
Alexander
c09b586147 Muse reader: parse <div> tag (#3888) 2017-09-04 21:22:40 -07:00
Alexander
14f813c3f2 Muse reader: parse verse markup (#3882) 2017-08-29 12:40:34 -07:00
Alexander
05bb8ef4aa RST reader: handle blank lines correctly in line blocks (#3881)
Previously pandoc would sometimes combine two line blocks separated by blanks, and ignore trailing blank lines within the line block.

Test is checked to be consisted with http://rst.ninjs.org/
2017-08-28 07:48:46 -07:00
Alexander
e6f767b581 Muse reader: parse <verse> tag (#3872) 2017-08-25 07:09:28 -07:00
bucklereed
c80e26f888 LaTeX reader: RN and Rn, from biblatex (#3854) 2017-08-24 09:45:58 -07:00
Alexander
5d74932578 Muse reader: avoid crashes on multiparagraph inline tags (#3866)
Test checks that behavior is consistent with Amusewiki
2017-08-22 23:12:34 -07:00