Commit graph

235 commits

Author SHA1 Message Date
Yan Pashkovsky
a337685fe0
Merge branch 'master' into groff_reader 2018-05-09 19:48:34 +03:00
Yan Pas
c1617565fc basic manfile parsing 2018-05-09 03:24:45 +03:00
Alexander
1927bc9aac Add FB2 reader (#4539) 2018-04-26 12:33:18 -07:00
Alexander Krotov
4d89a1db7f Muse reader: allow nested footnotes 2018-04-26 12:38:17 +03:00
Alexander Krotov
ce4326a4f1 Muse reader: allow "-" in anchors 2018-04-19 14:17:59 +03:00
Jesse Rosenthal
c5d8fab058 Docx reader tests: Test for combining adjacent code blocks. 2018-04-17 09:29:54 -04:00
Alexander Krotov
3443df6068 Markdown reader: add regression test for previous commit 2018-04-17 11:55:37 +03:00
Alexander Krotov
a8122987fc Muse reader: allow verse to be indented
Muse writer indents verse blocks in definition list more than necessary, so Muse reader should parse them.
2018-04-16 15:08:34 +03:00
Alexander Krotov
ce7301de02 Fix a typo in Muse reader testsuite comment 2018-04-16 12:12:25 +03:00
Alexander Krotov
01f5ed14e6 Muse reader: don't allow footnote references inside links 2018-04-15 17:53:53 +03:00
Alexander Krotov
9cc2bf0295 Muse reader: allow URL to be empty
Muse writer can write links with empty URLs, so Muse reader should read them.
2018-04-15 14:50:46 +03:00
Alexander Krotov
6be0139145 Muse reader: require that comment semicolons are in the first column
Fixes #4551
2018-04-15 12:17:33 +03:00
John MacFarlane
d5b98c8c6e Man writer: Don't escape U+2019 as '.
Closes #4550.
2018-04-14 10:42:05 -07:00
Alexander Krotov
17b0499516 Muse reader: add support for Text:Amuse multiline headings 2018-04-09 02:05:57 +03:00
Alexander Krotov
ebbd441d06 Muse reader: add support for <biblio> and <play> tags 2018-04-07 18:31:06 +03:00
Alexander Krotov
2380845206 Muse reader: add <math> tag support 2018-04-02 17:19:26 +03:00
Alexander Krotov
ca78d93b40 Muse writer: place header IDs before header
See https://github.com/melmothx/text-amuse/issues/39
2018-04-02 15:58:37 +03:00
Alexander Krotov
aa929e462d Muse reader: enable round-trip test
Close #4468
2018-03-25 23:04:54 +03:00
Alexander Krotov
79592db66c Muse reader: allow links to have empty descriptions 2018-03-25 22:16:45 +03:00
Alexander Krotov
c6232d0f7d Muse reader: require block <literal> tags to be on separate lines 2018-03-25 18:31:28 +03:00
John MacFarlane
0ef56657ed Comment out Muse reader round-trip test.
It fails too often.  Perhaps a separate test program should
be used to hunt for round-trip bugs.
2018-03-18 12:43:36 -07:00
John MacFarlane
7e389cb3db Use NoImplicitPrelude and explicitly import Prelude.
This seems to be necessary if we are to use our custom Prelude
with ghci.

Closes #4464.
2018-03-18 10:46:28 -07:00
John MacFarlane
dfa1dc164a hlint fixes. 2018-03-17 22:00:55 -07:00
Jesse Rosenthal
85a65c6a51 Docx reader: add tests for nested smart tags. 2018-03-13 22:16:54 -04:00
Alexander Krotov
3ee45a7357 Muse reader: compare first rewrite to the second in round-trip test 2018-03-12 15:09:27 +03:00
Alexander Krotov
c3fbc492c8 Muse reader: require closing tag to have the same indentation as opening 2018-03-12 14:24:50 +03:00
Alexander Krotov
f0a029ac51 Muse reader: do not reparse blocks inside unclosed block tag
Fixes #4425
2018-03-12 13:44:27 +03:00
Alexander Krotov
9bcd090848 Muse reader: parse <class> tag
<class> tag is supported by Emacs Muse
2018-03-10 07:27:41 +03:00
Alexander Krotov
4d2bf177fc Muse reader: do not produce empty Str element for unindented verse lines 2018-03-07 14:24:16 +03:00
Alexander Krotov
a71a1fec69 Muse reader: fix indentation requirements for footnote continuations 2018-03-03 03:33:02 +03:00
Alexander Krotov
a01573692a Muse reader: enable <literal> tags even if amuse extension is enabled
Amusewiki disables <literal> tags for security reasons.
If user wants similar behavior in pandoc, RawBlocks and RawInlines
can be removed or replaced with filters.
2018-03-02 12:52:39 +03:00
Alexander Krotov
177c5120a5 Muse reader: do not consume whitespace while looking for closing end tag
Fix for a bug caught by round-trip test.
2018-03-02 01:01:50 +03:00
Alexander Krotov
55c4b9982c Muse reader: convert alphabetical list markers to decimal in round-trip test
Alphabetical lists are an addition of Text::Amuse.
They are not present in Emacs Muse and can be ambiguous
when list starts with "i.", "c." etc.
2018-03-02 00:33:16 +03:00
Jesse Rosenthal
7d3e7a5a6d Docx reader: Handle nested sdt tags.
Previously we had only unwrapped one level of sdt tags. Now we recurse
if we find them.

Closes: #4415
2018-02-28 16:32:20 -05:00
Alexander Krotov
cc34771928 Muse reader: add test for verse tag with one empty line 2018-02-28 14:43:36 +03:00
Alexander Krotov
a7ac590b08 Muse reader: allow <quote> and other tags to be indented 2018-02-28 12:11:56 +03:00
Albert Krewinkel
6ed7926bb4
Org reader tests: move citation tests to separate module 2018-02-26 21:18:13 +01:00
Yan Pas
fd3676a568 initial 2018-02-25 03:34:17 +03:00
Alexander Krotov
39dd7c794b Muse reader: allow single colon in definition list term 2018-02-24 02:38:10 +03:00
Alexander Krotov
2eab8f4654 Muse reader: improve verse parsing
Now verse marked up with ">" (in contrast to <verse> tag) can be placed
inside lists.
2018-02-23 18:02:04 +03:00
Jesse Rosenthal
ffcecfacb1 Docx reader tests: test custom style extension. 2018-02-22 13:05:44 -05:00
Albert Krewinkel
00d20ccd09
Org reader: allow changing emphasis syntax
The characters allowed before and after emphasis can be configured via
`#+pandoc-emphasis-pre` and `#+pandoc-emphasis-post`, respectively. This
allows to change which strings are recognized as emphasized text on a
per-document or even per-paragraph basis. The allowed characters must be
given as (Haskell) string.

    #+pandoc-emphasis-pre: "-\t ('\"{"
    #+pandoc-emphasis-post: "-\t\n .,:!?;'\")}["

If the argument cannot be read as a string, the default value is
restored.

Closes: #4378
2018-02-21 22:43:18 +01:00
Alexander Krotov
5a9d7d20dd Move manyUntil to Text.Pandoc.Parsing and use it in Txt2Tags reader 2018-02-19 19:23:30 +03:00
Alexander Krotov
0e4b8ae362 Muse reader: prioritize lists with roman numerals over alphabetical lists
This is to make sure "i." starts a roman numbered list,
instead of a list with letter "i" (followed by "j", "k", ...").
2018-02-16 12:53:41 +03:00
danse
e6ff7f7986 Docx reader: Pick table width from the longest row or header
This change is intended to preserve as much of the table content as
possible

Closes #4360
2018-02-15 15:06:01 -05:00
Alexander Krotov
82a0ceaf18 Muse reader: fix directive parsing
This fixes bugs introduced in commit 4bfab8f04c.
2018-02-15 18:17:24 +03:00
Alexander Krotov
42e39fbd26 Muse reader: parse definition lists with multiple descriptions 2018-02-13 14:34:45 +03:00
Alexander Krotov
8aed3652c2 Muse reader: refactor to avoid reparsing
Lists are parsed in linear instead of exponential time now.

Contents of block tags, such as <quote>, is parsed directly,
without storing it in a string and parsing with parseFromString.

Fixed a bug: headers did not terminate lists.
2018-02-12 17:30:57 +03:00
Alexander Krotov
3480a8acc2 Muse reader: paragraph indentation does not indicate nested quote
Muse allows indentation to indicate quotation or alignment,
but only on the top level, not within a <quote> or list.

This patch also simplifies the code by removing museInQuote
and museInList fields from the state structure.
Headers and indented paragraphs are attempted to be parsed
only at the topmost level, instead of aborting parsing with guards.
2018-02-12 04:57:56 +03:00
Alexander Krotov
450a200637 Muse reader: test empty quote tag 2018-02-11 19:45:16 +03:00
Alexander Krotov
1dfda7e204 Muse reader: require that block tags are on separate lines
Text::Amuse already explicitly requires it anyway.
Supporting block tags on the same line as contents makes
it hard to combine closing tag parsers with indentation parsers.
Being able to combine parsers is required for no-reparsing refactoring
of Muse reader.
2018-02-11 19:35:58 +03:00
Alexander Krotov
6c45f8c8f6 Muse reader: test that two blank lines after verse can separate list items
Unlike paragraph and <quote> tag parsers, verse parser consumes newline.
For this reason only three or more blank lines can separate list items.
2018-02-05 01:39:38 +03:00
Alexander Krotov
3b667c54ea Muse reader: test that lists can be separated with two blanklines after blockquote
Existing tests only checked this for paragraphs.
2018-02-05 00:25:31 +03:00
Alexander Krotov
248f6076bc Muse reader: fix parsing of trailing whitespace
Newline after whitespace now results in softbreak
instead of space.
2018-01-28 03:18:29 +03:00
Alexander Krotov
6337539e32 Muse reader: fix matching of closing inline tags 2018-01-24 14:16:56 +03:00
Alexander Krotov
98f0e2053e Muse reader: remove multiple descriptions during round-trip tests 2018-01-20 18:34:42 +03:00
Alexander Krotov
e1cc9d9abc Muse reader: enable definition lists in round-trip test 2018-01-20 14:09:44 +03:00
John MacFarlane
b8ffd834cf hlint code improvements. 2018-01-19 21:25:24 -08:00
Alexander Krotov
22b69b557e Muse reader: fix parsing of nested definition lists 2018-01-20 02:14:27 +03:00
Alexander Krotov
7680e9b964 Muse reader: require only one space for nested definition list indentation 2018-01-19 14:16:20 +03:00
Alexander Krotov
19d2576223 Muse reader: parse definition list terms without parseFromString 2018-01-19 01:50:17 +03:00
Alexander Krotov
a516198d47 Muse reader: fix parsing of code at the beginning of paragraph 2018-01-18 15:35:43 +03:00
Alexander Krotov
5f57094a47 Muse reader: refactor definition list parsing
Test with wrong indentation is removed,
because now it is parsed as nested lists.
Emacs Muse and Text::Amuse don't have the same
behavior anyway.
2018-01-18 14:55:07 +03:00
Alexander Krotov
9986ccb333 Muse reader: parse "~~" as non-breaking space in Text::Amuse mode
Latest Text::Amuse supports "~~"
2018-01-18 02:46:02 +03:00
Alexander Krotov
ab85143e8a Muse reader: refactor list parsing
Now list item contents is parsed as blocks,
without resorting to parseFromString.

Only the first line of paragraph has to
be indented now, just like in Emacs Muse
and Text::Amuse.

Definition lists are not refactored yet.

See also: issue #3865.
2018-01-18 02:17:26 +03:00
Jesse Rosenthal
004f60bf26 Docx reader: Add test for hyperlinks in instrText tag
This is difficult to recreate with a modern version of Word, so I'm
using the file submitted with the bug report. It would be preferable
to find a smaller example with Latin characters, though, so as not to
confuse the issue being tested.
2018-01-16 13:22:02 -05:00
John MacFarlane
d9584d73f9 Markdown reader: Improved inlinesInBalancedBrackets.
The change both improves performance and fixes a
regression whereby normal citations inside inline notes
were not parsed correctly.

Closes jgm/pandoc-citeproc#315.
2018-01-14 12:24:21 -08:00
John MacFarlane
e5abee82f2 Shorten unbalanced brackets test.
It was taking a lot of time.
2018-01-14 12:24:21 -08:00
Jesse Rosenthal
a5b71a3c7f Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
Jesse Rosenthal
3f30455b49 Docx reader: tests for overlapping targets (anchor spans). 2017-12-31 09:36:42 -05:00
Jesse Rosenthal
475b0dcb66 Docx reader: tests for removing unused anchors. 2017-12-30 22:43:33 -05:00
Alexander Krotov
551aec7b01 Muse reader: enable round trip test
Closes #4107
2017-12-30 20:32:16 +03:00
John MacFarlane
ddd6a89247 Text.Pandoc.Class: add insertInFileTree (API change).
This gives a pure way to insert an ersatz file into a FileTree.

In addition, we normalize paths both on insertion and on
lookup, so that "foo" and "./foo" will be judged equivalent.
2017-12-28 10:23:09 -08:00
Albert Krewinkel
c6b5d65161
Org smart test: drop superfluous import
Keeps GHC 7.8 and GHC 7.10 happy.
2017-12-28 14:51:03 +01:00
Albert Krewinkel
e5c8b65004
Org reader: support minlevel option for includes
The level of headers in included files can be shifted to a higher level
by specifying a minimum header level via the `:minlevel` parameter. E.g.
`#+include: "tour.org" :minlevel 1` will shift the headers in tour.org
such that the topmost headers become level 1 headers.

Fixes: #4154
2017-12-28 14:16:04 +01:00
Albert Krewinkel
2d443ecb07
Break-up org reader test file
The org reader test file had grown large, to the point that editor
performance was negatively affected in some cases. The tests are spread
over multiple submodules, and re-combined into a tasty TestTree in the
main org reader test file.
2017-12-28 14:15:58 +01:00
Jesse Rosenthal
d71165c8e2 Docx reader: add tests for structured document tags unwrapping. 2017-12-27 10:03:00 -05:00
John MacFarlane
af04881655
Merge pull request #4177 from stencila/jats-xml-reader
Add Basic JATS reader based on DocBook reader
2017-12-21 23:16:03 -07:00
Hamish Mackenzie
d853571397 Improve support for code language in JATS 2017-12-22 15:24:54 +13:00
Alexander Krotov
0405c5b461 Muse reader: parse anchors immediately after headings as IDs 2017-12-21 15:52:10 +03:00
Alexander Krotov
b5e62a5c09 Muse reader: require that note references does not start with 0 2017-12-20 14:00:30 +03:00
Hamish Mackenzie
5d3c9e5646 Add Basic JATS reader based on DocBook reader 2017-12-20 13:54:02 +13:00
Alexander Krotov
f6abf15832 Muse reader: parse empty comments correctly 2017-12-19 04:23:32 +03:00
John MacFarlane
79c3f57c47 Added tests of latex tokenizer.
This should help prevent regressions like #4159.
2017-12-15 10:13:43 -08:00
Jesse Rosenthal
440533643e Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
Alexander Krotov
e536c4d9c9 hlint Muse reader and tests 2017-12-06 19:38:25 +03:00
Alexander Krotov
6fd3cdac46 Muse reader: add test for #disable-tables directive in Emacs mode 2017-12-06 19:35:01 +03:00
Alexander Krotov
3ae359721d Muse reader: don't allow emphasis to be preceded by letter 2017-12-06 19:04:35 +03:00
John MacFarlane
ae60e0196c Add empty_paragraphs extension.
* Deprecate `--strip-empty-paragraphs` option.  Instead we now
  use an `empty_paragraphs` extension that can be enabled on
  the reader or writer.  By default, disabled.

* Add `Ext_empty_paragraphs` constructor to `Extension`.

* Revert "Docx reader: don't strip out empty paragraphs."
  This reverts commit d6c58eb836.

* Implement `empty_paragraphs` extension in docx reader and writer,
  opendocument writer, html reader and writer.

* Add tests for `empty_paragraphs` extension.
2017-12-04 14:56:57 -08:00
Alexander Krotov
ed261e5832 Muse reader: add underline support in Emacs Muse mode 2017-12-04 15:59:26 +03:00
John MacFarlane
d6c58eb836 Docx reader: don't strip out empty paragraphs.
We now have the `--strip-empty-paragraphs` option for that,
if you want it.  Closes #2252.

Updated docx reader tests.

We use stripEmptyParagraphs to avoid changing too
many tests.  We should add new tests for empty paragraphs.
2017-12-02 16:51:31 -08:00
Alexander Krotov
7751391fce Muse reader: correctly remove indentation from notes
Exactly one space is required and considered to be part of the marker.
2017-11-29 05:12:25 +03:00
John MacFarlane
5a225aa603 Temporarily disable round-trip block test for muse reader.
See #4107.
2017-11-28 16:13:01 -08:00
Alexander Krotov
c2993a6fc6 Muse reader: parse "~~" as non-breaking space in Emacs mode 2017-11-27 12:25:06 +03:00
Alexander Krotov
00004f042c Muse reader: make code blocks round trip 2017-11-27 04:54:23 +03:00
Alexander Krotov
bdad8c1d69 Muse reader: drop common space prefix from list items 2017-11-26 22:14:18 +03:00
Alexander Krotov
a8ac673285 Muse reader: Add partial round trip test 2017-11-26 02:01:39 +03:00
Alexander Krotov
ea2ea455b3 Muse reader: don't interpret XML entities 2017-11-25 22:46:25 +03:00
Alexander Krotov
77af25b4c3 Muse reader: parse markup in definition list terms 2017-11-24 14:02:43 +03:00
Alexander Krotov
137c7c2a65 Muse reader: allow definition to end with EOF 2017-11-24 13:16:09 +03:00