Commit graph

220 commits

Author SHA1 Message Date
Albert Krewinkel
2c66a42ab8
Lua modules: add function pandoc.utils.normalize_date
The function parses a date and converts it (if possible) to "YYYY-MM-DD"
format.
2017-12-23 13:43:22 +01:00
Albert Krewinkel
35f0567a8f
Lua modules: add function pandoc.utils.to_roman_numeral
The function allows conversion of numbers below 4000 into roman
numerals.
2017-12-23 13:42:35 +01:00
Albert Krewinkel
23edb958db
Lua modules: add stringify function to pandoc.utils
The new function `pandoc.utils.stringify` converts any AST element to a
string with formatting removed.
2017-12-22 20:09:37 +01:00
John MacFarlane
af04881655
Merge pull request #4177 from stencila/jats-xml-reader
Add Basic JATS reader based on DocBook reader
2017-12-21 23:16:03 -07:00
Hamish Mackenzie
d853571397 Improve support for code language in JATS 2017-12-22 15:24:54 +13:00
Alexander Krotov
0405c5b461 Muse reader: parse anchors immediately after headings as IDs 2017-12-21 15:52:10 +03:00
Albert Krewinkel
299e452463
Test more pandoc Lua module functions
The functions `sha1`, `read`, and `pipe` are now tested.

Change: minor
2017-12-20 21:36:41 +01:00
Alexander Krotov
b5e62a5c09 Muse reader: require that note references does not start with 0 2017-12-20 14:00:30 +03:00
Hamish Mackenzie
5d3c9e5646 Add Basic JATS reader based on DocBook reader 2017-12-20 13:54:02 +13:00
Alexander Krotov
f6abf15832 Muse reader: parse empty comments correctly 2017-12-19 04:23:32 +03:00
John MacFarlane
79c3f57c47 Added tests of latex tokenizer.
This should help prevent regressions like #4159.
2017-12-15 10:13:43 -08:00
John MacFarlane
3361f85f8e
Merge pull request #4148 from stencila/jats-figures
fig, table-wrap & caption Divs for JATS writer
2017-12-14 13:45:23 -07:00
John MacFarlane
52a8116e71
Merge pull request #4153 from tarleb/unify-lua-init
Unify lua initalization
2017-12-13 21:42:06 -07:00
Jesse Rosenthal
440533643e Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
Albert Krewinkel
4c64af4407
Custom writer: use init file to setup Lua interpreter
The same init file (`data/init`) that is used to setup the Lua
interpreter for Lua filters is also used to setup the interpreter of
custom writers.lua.
2017-12-13 21:15:41 +01:00
Hamish Mackenzie
ec1693505c fig, table-wrap & caption Divs for JATS writer
Support writing <fig> and <table-wrap> elements with <title> and
<caption> inside them by using Divs with class set to on of
fig, table-wrap or cation.  The title is included as a Heading
so the constraint on where Heading can occur is also relaxed.

Also leaves out empty alt attributes on links.
2017-12-13 12:06:22 +13:00
Albert Krewinkel
d5b1c7b767
Lua filters: refactor lua module handling
The integration with Lua's package/module system is improved: A
pandoc-specific package searcher is prepended to the searchers in
`package.searchers`. The modules `pandoc` and `pandoc.mediabag` can now
be loaded via `require`.
2017-12-02 23:07:29 +01:00
Alexander Krotov
e536c4d9c9 hlint Muse reader and tests 2017-12-06 19:38:25 +03:00
Alexander Krotov
6fd3cdac46 Muse reader: add test for #disable-tables directive in Emacs mode 2017-12-06 19:35:01 +03:00
Alexander Krotov
3ae359721d Muse reader: don't allow emphasis to be preceded by letter 2017-12-06 19:04:35 +03:00
John MacFarlane
ae60e0196c Add empty_paragraphs extension.
* Deprecate `--strip-empty-paragraphs` option.  Instead we now
  use an `empty_paragraphs` extension that can be enabled on
  the reader or writer.  By default, disabled.

* Add `Ext_empty_paragraphs` constructor to `Extension`.

* Revert "Docx reader: don't strip out empty paragraphs."
  This reverts commit d6c58eb836.

* Implement `empty_paragraphs` extension in docx reader and writer,
  opendocument writer, html reader and writer.

* Add tests for `empty_paragraphs` extension.
2017-12-04 14:56:57 -08:00
Alexander Krotov
ed261e5832 Muse reader: add underline support in Emacs Muse mode 2017-12-04 15:59:26 +03:00
John MacFarlane
d6c58eb836 Docx reader: don't strip out empty paragraphs.
We now have the `--strip-empty-paragraphs` option for that,
if you want it.  Closes #2252.

Updated docx reader tests.

We use stripEmptyParagraphs to avoid changing too
many tests.  We should add new tests for empty paragraphs.
2017-12-02 16:51:31 -08:00
Alexander Krotov
7751391fce Muse reader: correctly remove indentation from notes
Exactly one space is required and considered to be part of the marker.
2017-11-29 05:12:25 +03:00
John MacFarlane
5a225aa603 Temporarily disable round-trip block test for muse reader.
See #4107.
2017-11-28 16:13:01 -08:00
Alexander Krotov
c2993a6fc6 Muse reader: parse "~~" as non-breaking space in Emacs mode 2017-11-27 12:25:06 +03:00
Alexander Krotov
00004f042c Muse reader: make code blocks round trip 2017-11-27 04:54:23 +03:00
Alexander Krotov
bdad8c1d69 Muse reader: drop common space prefix from list items 2017-11-26 22:14:18 +03:00
Alexander Krotov
a8ac673285 Muse reader: Add partial round trip test 2017-11-26 02:01:39 +03:00
Alexander Krotov
ea2ea455b3 Muse reader: don't interpret XML entities 2017-11-25 22:46:25 +03:00
Alexander Krotov
77af25b4c3 Muse reader: parse markup in definition list terms 2017-11-24 14:02:43 +03:00
Alexander Krotov
137c7c2a65 Muse reader: allow definition to end with EOF 2017-11-24 13:16:09 +03:00
Alexander Krotov
fe74436540 Muse writer: test that inline math conversion result is normalized
Without normalization this test produced
<em>a</em><em>b</em><em>c</em>
2017-11-24 12:35:25 +03:00
Alexander Krotov
0cfd764d27 Muse: move inline list normalization to writer 2017-11-24 12:17:20 +03:00
Albert Krewinkel
cd85c73ded
Org reader: allow empty list items
Fixes: #4090
2017-11-22 22:53:24 +01:00
Alexander Krotov
75e2a1104c Muse reader: allow list items to be empty 2017-11-22 18:49:07 +03:00
Alexander Krotov
0b63ac2db1 Muse reader: add ordered list test 2017-11-22 18:48:45 +03:00
Alexander Krotov
454062eccd Muse writer: escape hash symbol 2017-11-22 16:17:30 +03:00
Alexander Krotov
c8ab4789b6 Muse reader: add more multiline definition tests 2017-11-22 15:23:09 +03:00
Alexander Krotov
7e42857ed8 Muse writer: escape "----" to avoid accidental horizontal rules 2017-11-22 01:39:20 +03:00
Alexander Krotov
351765d4ad Muse reader: concatenate inlines of the same type 2017-11-22 01:22:43 +03:00
Alexander Krotov
df3a80cc97 Muse writer: escape only </code> inside code tag
Additional <verbatim> is not needed as <code> is verbatim already.
2017-11-22 01:22:43 +03:00
Alexander Krotov
6c17117ef2 Muse reader: add inline <literal> support 2017-11-21 19:53:55 +03:00
Alexander Krotov
59f537c31f Muse reader: test <literal> blocks 2017-11-21 19:01:53 +03:00
Albert Krewinkel
849900c516 data/pandoc.lua: enable table-like behavior of attributes (#4080)
Attribute lists are represented as associative lists in Lua. Pure
associative lists are awkward to work with. A metatable is attached to
attribute lists, allowing to access and use the associative list as if
the attributes were stored in as normal key-value pair in table.

Note that this changes the way `pairs` works on attribute lists. Instead
of producing integer keys and two-element tables, the resulting iterator
function now returns the key and value of those pairs.  Use `ipairs` to
get the old behavior.

Warning: the new iteration mechanism only works if pandoc has been
compiled with Lua 5.2 or later (current default: 5.3).

The `pandoc.Attr` function is altered to allow passing attributes as
key-values in a normal table. This is more convenient than having to
construct the associative list which is used internally.

Closes #4071
2017-11-20 09:37:40 -08:00
Alexander Krotov
82bcda80c6 Muse reader: count only one space as part of list item marker 2017-11-19 04:40:00 +03:00
Alexander Krotov
163af3fdee Muse reader: produce SoftBreaks on newlines
Now wrapping can be preserved with --wrap=preserve
2017-11-19 02:37:52 +03:00
Albert Krewinkel
53aafd6643 Lua filters: preload text module (#4077)
The `text` module is preloaded in lua. The module contains some UTF-8
aware string functions, implemented in Haskell.  The module is loaded on
request only, e.g.:

    text = require 'text'
    function Str (s)
      s.text = text.upper(s.text)
      return s
    end
2017-11-18 13:24:06 -08:00
Alexander Krotov
6018a2324d Muse reader: Add Text::Amuse footnote extensions
Footnote end is indicated by indentation,
so footnotes can be placed anywhere in the text,
not just at the end of it.
2017-11-18 23:43:02 +03:00
Alexander Krotov
3a83b3843d Replace "emacs" extension with "amuse" extension
It makes clear that extension is related to Muse markup.
2017-11-13 18:41:49 +03:00
Alexander Krotov
df4cb20f29 Muse reader: accept Emacs Muse definition lists
Emacs Muse does not require indentation.
2017-11-12 18:08:41 +03:00
Alexander Krotov
3cee9c8976 FB2 writer: Add "unrecognised" genre to <title-info>
XML schema requires at least one genre.
2017-11-01 13:31:16 +03:00
Alexander Krotov
8a5541dca8 FB2 writer: remove <annotation> from <body>
<annotation> is not allowed inside <body> according to FictionBook2 XML schema. Besides that, the same information is already placed inside <description>.

Related bug: #2424
2017-11-01 13:08:52 +03:00
John MacFarlane
1f393f1a8b
Merge pull request #4001 from labdsf/fb2-tests
Add new style FB2 tests
2017-11-01 00:37:29 -04:00
Sascha Wilde
03361f0a68 Creole reader: additional test on nowiki-block after para. 2017-10-31 22:26:35 +01:00
Sascha Wilde
fa67d6e86f Creole reader: fixed lists with trailing white space. 2017-10-31 18:55:27 +01:00
John MacFarlane
6a1476e7e2 Export all of Text.Pandoc.Class from Text.Pandoc. 2017-10-29 15:00:49 -07:00
Sascha Wilde
e8be4b0b6d Creole reader (#4002)
* Basic skeleton for creole reader.

No real functionality besides preliminary bold and italics yet.

* Creole: add support for bold/italic with implicit end at paragraph end.

* Creole: add support for headings.

* Creole: add support for tilde escaped chars.

* Basic skeleton for creole reader.

No real functionality besides preliminary bold and italics yet.

* Creole: add support for bold/italic with implicit end at paragraph end.

* Creole: add support for headings.

* Creole: add support for tilde escaped chars.

* Add a test suite for the creole parser

So far this covers only things the parser already supports.

* Added simple parsing of flat unordered lists.

* Added tests for unordered lists in creole.

* First, wrong(!) implementation of sublists.

Fails test, as sublists should not be embedded in a list item!

* Implementation of unordered sublists.

* Added support for ordered lists to creole reader.

* Added utility function to append parsers to Creole reader.

* Creole reader: Fixed list item end detection in sub lists.

* Tests for creole reader: added more tests for lists.

Covering ordered and unordered tests, even mixed.  Tests for
formatting in list items still missing...

* Added "nowiki" blocks.  One exception rule is missing...

* Creole reader: nowiki: implemented exception for curly brackets.

* Creole reader: added inline nowiki.

* Creole reader: added horizontalRule.

* Creole reader: added auto linking of URIs.

* Creole reader: detect horizontalRule as para end.

Used the opportunity for a little refactoring.

* Creole reader: added forced line breaks.

Including test.

* Creole reader: implement wiki links.

* Creole reader: added image support.

* Creole reader: support images as links.

* Creole reader: implemented placeholder -- by simply dropping them.

* Creole reader: added tests for links.

After observing a regression, it was really time...  ;-)

* Creole reader: fixed links with names.

* Creole reader: allow space after first of enclosing tags.

Space after the start of formatting tags are allowed with creole,
e.g. "there is // italic text // in here" is legal.

This problem was discovered using the creole1.0test.txt document from
http://www.wikicreole.org/wiki/Creole1.0TestCases

See l.57:
# // italic item 3 //

* Creole reader: fixed links without names.

* Creole reader: Tests, sorted into groups.

* Creole reader: implemented tables.

* Removed redundant import.

* Creole reader: add correct escaping of links.

* Creole reader: allow handling of e.g. links in parenthesis and quotes.

* Creole reader: Modified disclaimer as most of the code is actually by me.

* Creole reader: Tests: added escaped links.

* Creole reader: preserve leading and trailing space in bold/italic.

* Creole reader: detect tables without a leading blank line.

* Creole Reader: added official creole1.0test.txt as "old" test.

The base document was downloaded from
http://www.wikicreole.org/wiki/Creole1.0TestCases.
The Wiki, and therefore the test document is

Copyright (C) by the contributors.
Some rights reserved, license CC BY-SA.

http://creativecommons.org/licenses/by-sa/1.0/
2017-10-29 13:28:50 -04:00
Alexander Krotov
d99776ea04 Add new style FB2 tests 2017-10-28 21:08:48 +03:00
John MacFarlane
ff16db1aa3 Automatic reformating by stylish-haskell. 2017-10-27 20:28:29 -07:00
John MacFarlane
a2a14f9029 Removed old adjacent_links test for docx reader.
See #2270 for background -- this test blocked the consistent
underline change and was hard to revise, so for now we are
removing it.
2017-10-27 16:09:44 -07:00
hftf
7f8a3c6cb7 Consistent underline for Readers (#2270)
* Added underlineSpan builder function.  This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.

* Docx Reader: Use underlineSpan and update test

* Org Reader: Use underlineSpan and add test

* Textile Reader: Use underlineSpan and add test case

* Txt2Tags Reader: Use underlineSpan and update test

* HTML Reader: Use underlineSpan and add test case
2017-10-27 18:45:00 -04:00
Sascha Wilde
66fd3247ea Creole reader (#3994)
This is feature complete but not very thoroughly tested yet.
2017-10-26 19:19:28 -04:00
Ben Firshman
9046dbadb1
Latex reader: Skip spaces in image options 2017-10-17 16:42:11 +03:00
Ben Firshman
d73fdbf895
Add tests for existing \includegraphics behaviour 2017-10-17 15:09:55 +03:00
Albert Krewinkel
f176ad6f21
Org reader: end footnotes after two blank lines
Footnotes can not only be terminated by the start of a new footnote or a
header, but also by two consecutive blank lines.
2017-10-08 14:17:26 +02:00
bucklereed
c359bdd9b1 LaTeX reader: read polyglossia/babel \text($LANG){...}. 2017-10-06 12:17:50 +01:00
Albert Krewinkel
514662e544
Org reader: support \n export option
The `\n` export option turns all newlines in the text into hard
linebreaks.

Closes #3950
2017-10-02 23:11:58 +02:00
John MacFarlane
f3a80034ff Removed writerSourceURL, add source URL to common state.
Removed `writerSourceURL` from `WriterOptions` (API change).
Added `stSourceURL` to `CommonState`.
It is set automatically by `setInputFiles`.

Text.Pandoc.Class now exports `setInputFiles`, `setOutputFile`.

The type of `getInputFiles` has changed; it now returns `[FilePath]`
instead of `Maybe [FilePath]`.

Functions in Class that formerly took the source URL as a parameter
now have one fewer parameter (`fetchItem`, `downloadOrRead`,
`setMediaResource`, `fillMediaBag`).

Removed `WriterOptions` parameter from `makeSelfContained` in
`SelfContained`.
2017-09-30 16:11:20 -05:00
Albert Krewinkel
2f47e04206
Text.Pandoc.Lua: add mediabag submodule 2017-09-30 09:57:03 +02:00
Alexander Krotov
b5d064e8f0 Muse reader: parse anchors 2017-09-28 14:57:24 +03:00
Alexander Krotov
2cdb8fe2e6 Muse reader: test metadata parsing 2017-09-26 19:31:10 +03:00
Albert Krewinkel
3a7663281a
Org reader: update emphasis border chars
The org reader was updated to match current org-mode behavior: the set
of characters which are acceptable to occur as the first or last
character in an org emphasis have been changed and now allows all
non-whitespace chars at the inner border of emphasized text (see
`org-emphasis-regexp-components`).

Fixes: #3933
2017-09-25 09:31:29 +02:00
John MacFarlane
ddecd72783 Merge pull request #3911 from labdsf/muse-reader-braces
Muse reader: parse {{{ }}} example syntax
2017-09-11 14:01:05 -07:00
Alexander Krotov
8e4ee66563 Muse reader: allow inline markup to be followed by punctuation
Previously code was not allowed to be followed by comma,
and emphasis was allowed to be followed by letter.
2017-09-11 18:34:32 +03:00
Alexander Krotov
508c3a64d8 Muse reader: parse {{{ }}} example syntax 2017-09-11 18:17:28 +03:00
Alexander Krotov
27cccfac84 Muse reader: parse verbatim tag 2017-09-11 12:13:09 +03:00
Alexander Krotov
afedb41b17 Muse reader: trim newlines from <example>s 2017-09-10 12:42:24 +03:00
Alexander Krotov
2230371304 Muse reader: debug inline code markup 2017-09-09 16:39:06 +03:00
Alexander
743413a5b5 Muse reader: Allow finishing header with EOF (#3897) 2017-09-06 08:48:06 -07:00
Alexander
350c282f20 Muse reader: require at least one space char after * in header (#3895) 2017-09-05 09:41:27 -07:00
Alexander
c09b586147 Muse reader: parse <div> tag (#3888) 2017-09-04 21:22:40 -07:00
Alexander
14f813c3f2 Muse reader: parse verse markup (#3882) 2017-08-29 12:40:34 -07:00
Alexander
05bb8ef4aa RST reader: handle blank lines correctly in line blocks (#3881)
Previously pandoc would sometimes combine two line blocks separated by blanks, and ignore trailing blank lines within the line block.

Test is checked to be consisted with http://rst.ninjs.org/
2017-08-28 07:48:46 -07:00
Alexander
e6f767b581 Muse reader: parse <verse> tag (#3872) 2017-08-25 07:09:28 -07:00
bucklereed
c80e26f888 LaTeX reader: RN and Rn, from biblatex (#3854) 2017-08-24 09:45:58 -07:00
Alexander
5d74932578 Muse reader: avoid crashes on multiparagraph inline tags (#3866)
Test checks that behavior is consistent with Amusewiki
2017-08-22 23:12:34 -07:00
Alexander
c7d4fd8cf1 Muse reader: do not allow closing tags with EOF (#3863)
This behavior is compatible to Amusewiki
2017-08-22 16:34:18 -07:00
Albert Krewinkel
41baaff327
Text.Pandoc.Lua: support Inline and Block catch-alls
Try function `Inline`/`Block` if no other filter function of the
respective type matches an element.

Closes: #3859
2017-08-22 23:30:48 +02:00
Albert Krewinkel
56fb854ad8
Text.Pandoc.Lua: respect metatable when getting filters
This change makes it possible to define a catch-all function using lua's
metatable lookup functionality.

    function catch_all(el)
      …
    end

    return {
      setmetatable({}, {__index = function(_) return catch_all end})
    }

A further effect of this change is that the map with filter functions
now only contains functions corresponding to AST element constructors.
2017-08-22 22:56:51 +02:00
Alexander
0a839cbdc9 Muse reader: add definition list support (#3860) 2017-08-21 21:08:44 -07:00
John MacFarlane
d1444b4ecd RST reader/writer: support unknown interpreted text roles...
...by parsing them as Span with "role" attributes.
This way they can be manipulated in the AST.

Closes #3407.
2017-08-17 16:01:44 -07:00
John MacFarlane
38578ad06c Test fixes so we can find data files.
In old tests & command tests, we now set the environment variable
pandoc_datadir.

In lua tests, we set the datadir explicitly.
2017-08-14 13:03:26 -07:00
John MacFarlane
c7cbd3ddc2 Fixed command tests to set local path.
Previously we just tacked on a directory to the command
line, but that didn't work when we e.g. used a pipe for round tripping,
with two invocations of pandoc.
2017-08-13 23:59:38 -07:00
Albert Krewinkel
2dc3dbd68b Use hslua >= 0.7, update Lua code 2017-08-13 14:23:54 +02:00
John MacFarlane
7892dcd353 Command tests; print stderr when a test fails. 2017-08-11 22:09:15 -07:00
John MacFarlane
83d856ee6c Fixed writer tests not to use writerUserDataDir. 2017-08-10 23:51:42 -07:00
John MacFarlane
ac18ff90b2 Org reader: use org-language attribute rather than data-org-language. 2017-08-09 09:45:17 -07:00
John MacFarlane
96933c6043 Org reader: use tag-name attribute instead of data-tag-name. 2017-08-09 09:26:57 -07:00
bucklereed
db55f7c1b2 HTML reader: parse <main> like <div role=main>. (#3791)
* HTML reader: parse <main> like <div role=main>.

* <main> closes <p> and behaves like a block element generally
2017-08-09 09:10:12 -07:00