Commit graph

420 commits

Author SHA1 Message Date
Sascha Wilde
fa67d6e86f Creole reader: fixed lists with trailing white space. 2017-10-31 18:55:27 +01:00
Alexander Krotov
a496979c6d FB2 writer: write blocks outside of <p> in definitions 2017-10-31 20:19:00 +03:00
mb21
8d7ce0fdf0 HTML Writer: consistently use dashed class-names
see #3556
2017-10-31 10:40:16 +01:00
Alexander Krotov
94d02a6efa FB2 writer: make bullet lists consistent with ordered lists
Previously bullet lists interacted in odd way with ordered lists.
For example, bullet lists nested in ordered list had incorrect
indentation. Besides that, indentation with spaces is not rendered
by FBReader and fbless. To avoid this problem, bullet lists are
indented by appending bullets to marker just the same way it is
done for ordered lists.
2017-10-31 11:35:47 +03:00
John MacFarlane
244b42dbaf Added failing command test for #4007. 2017-10-30 11:04:40 -07:00
John MacFarlane
6a1476e7e2 Export all of Text.Pandoc.Class from Text.Pandoc. 2017-10-29 15:00:49 -07:00
Alexander
3263d0d7c4 Write FB2 lists without nesting blocks inside <p> (#4004)
According to FB2 XML schema <empty-line /> cannot be placed inside
<p>. Linux FBReader can't display such paragraphs, e.g. any "loose"
lists produced by pandoc prior to this commit.  Besides that,
FB2 writer placed <p> inside <p> when writing nested lists,
this commit fixes the bug.

Also this commit removes leading non-breaking space from ordered
lists for consistency with bullet lists.

Definition lists are not affected at all.
2017-10-29 14:46:44 -04:00
Sascha Wilde
e8be4b0b6d Creole reader (#4002)
* Basic skeleton for creole reader.

No real functionality besides preliminary bold and italics yet.

* Creole: add support for bold/italic with implicit end at paragraph end.

* Creole: add support for headings.

* Creole: add support for tilde escaped chars.

* Basic skeleton for creole reader.

No real functionality besides preliminary bold and italics yet.

* Creole: add support for bold/italic with implicit end at paragraph end.

* Creole: add support for headings.

* Creole: add support for tilde escaped chars.

* Add a test suite for the creole parser

So far this covers only things the parser already supports.

* Added simple parsing of flat unordered lists.

* Added tests for unordered lists in creole.

* First, wrong(!) implementation of sublists.

Fails test, as sublists should not be embedded in a list item!

* Implementation of unordered sublists.

* Added support for ordered lists to creole reader.

* Added utility function to append parsers to Creole reader.

* Creole reader: Fixed list item end detection in sub lists.

* Tests for creole reader: added more tests for lists.

Covering ordered and unordered tests, even mixed.  Tests for
formatting in list items still missing...

* Added "nowiki" blocks.  One exception rule is missing...

* Creole reader: nowiki: implemented exception for curly brackets.

* Creole reader: added inline nowiki.

* Creole reader: added horizontalRule.

* Creole reader: added auto linking of URIs.

* Creole reader: detect horizontalRule as para end.

Used the opportunity for a little refactoring.

* Creole reader: added forced line breaks.

Including test.

* Creole reader: implement wiki links.

* Creole reader: added image support.

* Creole reader: support images as links.

* Creole reader: implemented placeholder -- by simply dropping them.

* Creole reader: added tests for links.

After observing a regression, it was really time...  ;-)

* Creole reader: fixed links with names.

* Creole reader: allow space after first of enclosing tags.

Space after the start of formatting tags are allowed with creole,
e.g. "there is // italic text // in here" is legal.

This problem was discovered using the creole1.0test.txt document from
http://www.wikicreole.org/wiki/Creole1.0TestCases

See l.57:
# // italic item 3 //

* Creole reader: fixed links without names.

* Creole reader: Tests, sorted into groups.

* Creole reader: implemented tables.

* Removed redundant import.

* Creole reader: add correct escaping of links.

* Creole reader: allow handling of e.g. links in parenthesis and quotes.

* Creole reader: Modified disclaimer as most of the code is actually by me.

* Creole reader: Tests: added escaped links.

* Creole reader: preserve leading and trailing space in bold/italic.

* Creole reader: detect tables without a leading blank line.

* Creole Reader: added official creole1.0test.txt as "old" test.

The base document was downloaded from
http://www.wikicreole.org/wiki/Creole1.0TestCases.
The Wiki, and therefore the test document is

Copyright (C) by the contributors.
Some rights reserved, license CC BY-SA.

http://creativecommons.org/licenses/by-sa/1.0/
2017-10-29 13:28:50 -04:00
Alexander Krotov
d99776ea04 Add new style FB2 tests 2017-10-28 21:08:48 +03:00
John MacFarlane
ff16db1aa3 Automatic reformating by stylish-haskell. 2017-10-27 20:28:29 -07:00
John MacFarlane
a2a14f9029 Removed old adjacent_links test for docx reader.
See #2270 for background -- this test blocked the consistent
underline change and was hard to revise, so for now we are
removing it.
2017-10-27 16:09:44 -07:00
hftf
7f8a3c6cb7 Consistent underline for Readers (#2270)
* Added underlineSpan builder function.  This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.

* Docx Reader: Use underlineSpan and update test

* Org Reader: Use underlineSpan and add test

* Textile Reader: Use underlineSpan and add test case

* Txt2Tags Reader: Use underlineSpan and update test

* HTML Reader: Use underlineSpan and add test case
2017-10-27 18:45:00 -04:00
Sascha Wilde
66fd3247ea Creole reader (#3994)
This is feature complete but not very thoroughly tested yet.
2017-10-26 19:19:28 -04:00
John MacFarlane
76886678a6 Use skylighting 0.4.2.
This prevents the problem with extra space around highlighted
code blocks (closes #3996).

Note that we no longer put an enclosing div around highlighted
code blocks.  The pre is the outer element, just as for unhighlighted
blocks.
2017-10-26 15:57:55 -07:00
John MacFarlane
513b16a71b Fenced divs: ensure that paragraph at end doesn't become Plain.
Added test case.
2017-10-24 09:53:29 -07:00
John MacFarlane
ecb5475a2a Back to using [WARNING] and [INFO] to mark messages. 2017-10-23 23:01:37 -07:00
John MacFarlane
fda0c0119f Implemented fenced Divs.
+ Added Ext_fenced_divs to Extensions (default for pandoc Markdown).
+ Document fenced_divs extension in manual.
+ Implemented fenced code divs in Markdown reader.
+ Added test.

Closes #168.
2017-10-23 22:45:28 -07:00
John MacFarlane
896803b0d5 HTML reader: htmlTag improvements.
We previously failed on cases where an attribute contained a `>`
character. This patch fixes the bug.

Closes #3989.
2017-10-23 17:29:32 -07:00
John MacFarlane
1a82ecbb68 More pleasing presentation of warnings and info messages.
!! warning
-- info
2017-10-23 15:00:11 -07:00
John MacFarlane
cecf02e326 Fixed test for change in log level. 2017-10-23 11:20:22 -07:00
mb21
e2123a4033 LaTeX Reader: support \lettrine 2017-10-22 20:33:30 +02:00
John MacFarlane
28bb5d610d LaTeX reader: support \expandafter.
Closes #3983.
2017-10-19 13:23:50 -07:00
Ben Firshman
9046dbadb1
Latex reader: Skip spaces in image options 2017-10-17 16:42:11 +03:00
Ben Firshman
d73fdbf895
Add tests for existing \includegraphics behaviour 2017-10-17 15:09:55 +03:00
John MacFarlane
61641f996f Revised command test 3971 to work with Windows. 2017-10-16 22:51:25 -07:00
John MacFarlane
c40857b389 Improved handling of include files in LaTeX reader.
Previously `\include` wouldn't work if the included file
contained, e.g., a begin without a matching end.

We've changed the Tok type so that it stores a full SourcePos,
rather than just a line and column.  So tokens keeep track
of the file they came from. This allows us to use a simpler
method for includes, which doesn't require parsing the included
document as a whole.

Closes #3971.
2017-10-16 22:05:34 -07:00
John MacFarlane
9cf9a64923 RST writer: correctly handle inline code containing backticks.
(Use a :literal: role.)

Closes #3974.
2017-10-16 20:54:43 -07:00
John MacFarlane
cba18c19a6 RST writer: don't backslash-escape word-internal punctuation.
Closes #3978.
2017-10-16 20:39:19 -07:00
John MacFarlane
75d8c99c73 ConTeXt writer: Use identifiers for chapters.
Closes #3968.
2017-10-11 20:21:55 -07:00
John MacFarlane
8cd1e00bbc Add test - closes #3958. 2017-10-08 21:57:26 -07:00
Albert Krewinkel
f176ad6f21
Org reader: end footnotes after two blank lines
Footnotes can not only be terminated by the start of a new footnote or a
header, but also by two consecutive blank lines.
2017-10-08 14:17:26 +02:00
bucklereed
c359bdd9b1 LaTeX reader: read polyglossia/babel \text($LANG){...}. 2017-10-06 12:17:50 +01:00
John MacFarlane
5307868de5 Removed spuriously added test/pandoc.tix. 2017-10-02 21:29:00 -07:00
John MacFarlane
492f496842 Markdown reader: Fixed bug with indented code following raw LaTeX.
Closes #3947.
2017-10-02 21:28:14 -07:00
Albert Krewinkel
514662e544
Org reader: support \n export option
The `\n` export option turns all newlines in the text into hard
linebreaks.

Closes #3950
2017-10-02 23:11:58 +02:00
John MacFarlane
f3a80034ff Removed writerSourceURL, add source URL to common state.
Removed `writerSourceURL` from `WriterOptions` (API change).
Added `stSourceURL` to `CommonState`.
It is set automatically by `setInputFiles`.

Text.Pandoc.Class now exports `setInputFiles`, `setOutputFile`.

The type of `getInputFiles` has changed; it now returns `[FilePath]`
instead of `Maybe [FilePath]`.

Functions in Class that formerly took the source URL as a parameter
now have one fewer parameter (`fetchItem`, `downloadOrRead`,
`setMediaResource`, `fillMediaBag`).

Removed `WriterOptions` parameter from `makeSelfContained` in
`SelfContained`.
2017-09-30 16:11:20 -05:00
Albert Krewinkel
2f47e04206
Text.Pandoc.Lua: add mediabag submodule 2017-09-30 09:57:03 +02:00
Alexander Krotov
b5d064e8f0 Muse reader: parse anchors 2017-09-28 14:57:24 +03:00
John MacFarlane
2314534d4d RST writer: add header anchors when header has non-standard id.
Closes #3937.
2017-09-27 20:42:04 -07:00
Alexander Krotov
2cdb8fe2e6 Muse reader: test metadata parsing 2017-09-26 19:31:10 +03:00
Albert Krewinkel
3a7663281a
Org reader: update emphasis border chars
The org reader was updated to match current org-mode behavior: the set
of characters which are acceptable to occur as the first or last
character in an org emphasis have been changed and now allows all
non-whitespace chars at the inner border of emphasized text (see
`org-emphasis-regexp-components`).

Fixes: #3933
2017-09-25 09:31:29 +02:00
Albert Krewinkel
71f69cd086 Allow lua filters to return lists of elements
Closes: #3918
2017-09-24 12:04:15 -07:00
John MacFarlane
b1ee747a24 Added --strip-comments option, readerStripComments in ReaderOptions.
* Options:  Added readerStripComments to ReaderOptions.
* Added `--strip-comments` command-line option.
* Made `htmlTag` from the HTML reader sensitive to this feature.

This affects Markdown and Textile input.

Closes #2552.
2017-09-17 13:01:27 -07:00
John MacFarlane
4177ee8626 Textile reader: allow 'pre' code in list item.
Closes #3916.
2017-09-12 08:58:47 -07:00
John MacFarlane
ddecd72783 Merge pull request #3911 from labdsf/muse-reader-braces
Muse reader: parse {{{ }}} example syntax
2017-09-11 14:01:05 -07:00
Alexander Krotov
8e4ee66563 Muse reader: allow inline markup to be followed by punctuation
Previously code was not allowed to be followed by comma,
and emphasis was allowed to be followed by letter.
2017-09-11 18:34:32 +03:00
Alexander Krotov
508c3a64d8 Muse reader: parse {{{ }}} example syntax 2017-09-11 18:17:28 +03:00
Alexander Krotov
27cccfac84 Muse reader: parse verbatim tag 2017-09-11 12:13:09 +03:00
Alexander Krotov
afedb41b17 Muse reader: trim newlines from <example>s 2017-09-10 12:42:24 +03:00
John MacFarlane
a1c11b048a Updated lhs-test for new skylighting. 2017-09-09 21:05:31 -07:00
Alexander Krotov
2230371304 Muse reader: debug inline code markup 2017-09-09 16:39:06 +03:00
John MacFarlane
2358229876 Adjusted some tests for last commit. 2017-09-08 16:34:33 -07:00
Andrew Dunning
621e43e0ec Write euro symbol directly in LaTeX
The textcomp package allows pdfLaTeX to parse `€` directly, making the \euro command unneeded. Closes #3801.
2017-09-08 22:26:32 +01:00
John MacFarlane
732005456e LaTeX template: load polyglossia after header-includes.
It needs to be loaded as late as possible.

Closes #3898.
2017-09-07 22:16:23 -07:00
John MacFarlane
5fc4980216 Markdown writer: Escape pipe characters when pipe_tables enabled.
Closes #3887.
2017-09-07 22:10:13 -07:00
Alexander
743413a5b5 Muse reader: Allow finishing header with EOF (#3897) 2017-09-06 08:48:06 -07:00
Alexander
350c282f20 Muse reader: require at least one space char after * in header (#3895) 2017-09-05 09:41:27 -07:00
Alexander
c09b586147 Muse reader: parse <div> tag (#3888) 2017-09-04 21:22:40 -07:00
Albert Krewinkel
6a6c3858b4
Org writer: stop using raw HTML to wrap divs
Div's are difficult to translate into org syntax, as there are multiple
div-like structures (drawers, special blocks, greater blocks) which all
have their advantages and disadvantages.  Previously pandoc would
use raw HTML to preserve the full div information; this was rarely
useful and resulted in visual clutter.  Div-rendering was changed to
discard the div's classes and key-value pairs if there is no natural way
to translate the div into an org structure.

Closes: #3771
2017-09-01 00:08:12 +02:00
Alexander
14f813c3f2 Muse reader: parse verse markup (#3882) 2017-08-29 12:40:34 -07:00
Alexander
05bb8ef4aa RST reader: handle blank lines correctly in line blocks (#3881)
Previously pandoc would sometimes combine two line blocks separated by blanks, and ignore trailing blank lines within the line block.

Test is checked to be consisted with http://rst.ninjs.org/
2017-08-28 07:48:46 -07:00
John MacFarlane
8fcf66453c RST reader: Fixed ..include:: directive.
Closes #3880.
2017-08-27 17:09:55 -07:00
John MacFarlane
1b3431a165 LaTeX reader: improved support for \hyperlink, \hypertarget.
Closes #2549.
2017-08-25 22:04:57 -07:00
Alexander
e6f767b581 Muse reader: parse <verse> tag (#3872) 2017-08-25 07:09:28 -07:00
bucklereed
c80e26f888 LaTeX reader: RN and Rn, from biblatex (#3854) 2017-08-24 09:45:58 -07:00
Alexander
5d74932578 Muse reader: avoid crashes on multiparagraph inline tags (#3866)
Test checks that behavior is consistent with Amusewiki
2017-08-22 23:12:34 -07:00
Alexander
c7d4fd8cf1 Muse reader: do not allow closing tags with EOF (#3863)
This behavior is compatible to Amusewiki
2017-08-22 16:34:18 -07:00
Albert Krewinkel
41baaff327
Text.Pandoc.Lua: support Inline and Block catch-alls
Try function `Inline`/`Block` if no other filter function of the
respective type matches an element.

Closes: #3859
2017-08-22 23:30:48 +02:00
Albert Krewinkel
56fb854ad8
Text.Pandoc.Lua: respect metatable when getting filters
This change makes it possible to define a catch-all function using lua's
metatable lookup functionality.

    function catch_all(el)
      …
    end

    return {
      setmetatable({}, {__index = function(_) return catch_all end})
    }

A further effect of this change is that the map with filter functions
now only contains functions corresponding to AST element constructors.
2017-08-22 22:56:51 +02:00
John MacFarlane
6cba21b4d3 Small improvement to #3855 - move lang attribute up.
So we don't have a dangling line with the closing `>` when
`lang` is not set.
2017-08-21 21:16:55 -07:00
Jens Getreu
9375e50b96 docbook5 template: use lang and subtitle variables (#3855) 2017-08-21 21:10:41 -07:00
Alexander
0a839cbdc9 Muse reader: add definition list support (#3860) 2017-08-21 21:08:44 -07:00
John MacFarlane
d70b89c0d9 Use pandoc-types 1.17.1. Tests updated for new simpleTable behavior...
with empty headers.
2017-08-20 23:24:51 -07:00
John MacFarlane
9cc128b579 LaTeX reader: Set identifiers on Spans used for \label. 2017-08-20 16:52:03 -07:00
John MacFarlane
a31241a08b Markdown reader: use CommonMark rules for list item nesting.
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs,
sublists, and other block level content had to be indented 4
spaces.  Now the indentation required is determined by the
first line of the list item:  to be included in the list item,
blocks must be indented to the level of the first non-space
content after the list marker. Exception: if are 5 or more spaces
after the list marker, then the content is interpreted as an
indented code block, and continuation paragraphs must be indented
two spaces beyond the end of the list marker.  See the CommonMark
spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases,
be parsed the same way by the new rules.  Here are some examples
of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space
rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it
is only indented six spaces from the margin, because it is four spaces
past the point where a continuation paragraph could begin.  With the
four-space rule, this would be a regular paragraph rather than a code
block.

    - a

            code

Here the code block will start with two spaces, whereas under
the four-space rule, it would start with `code`.  With the four-space
rule, indented code under a list item always must be indented eight
spaces from the margin, while the new rules require only that it
be indented four spaces from the beginning of the first non-space
text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people
who expected lists to work differently (#3125, #2367, #2575, #2210,
 #1990, #1137, #744, #172, #137, #128) and by the growing prevalance
of CommonMark (now used by GitHub, for example).

Users who want to use the old rules can select the `four_space_rule`
extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type
  of `gobbleSpaces` has been changed so that a `ReaderOptions`
  parameter is not needed.
2017-08-19 15:45:01 -07:00
John MacFarlane
5ab1162def Markdown reader: fixed parsing of fenced code after list...
...when there is no intervening blank line.

Closes #3733.
2017-08-18 21:46:55 -07:00
John MacFarlane
bfbdfa646a LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false
from etoolbox.

Closes #3853.
2017-08-18 10:13:41 -07:00
John MacFarlane
d1444b4ecd RST reader/writer: support unknown interpreted text roles...
...by parsing them as Span with "role" attributes.
This way they can be manipulated in the AST.

Closes #3407.
2017-08-17 16:01:44 -07:00
John MacFarlane
b1f6fb4af5 HTML reader: support column alignments.
These can be set either with a `width` attribute or
with `text-width` in a `style` attribute.

Closes #1881.
2017-08-17 12:08:32 -07:00
John MacFarlane
db715ca847 LaTeX reader: use Link instead of Span for \ref.
This makes more sense semantically and avoids unnecessary
Span [Link] nestings when references are resolved.
2017-08-16 10:56:12 -07:00
schrieveslaach
cf4b40162d LaTeX reader: add Support for glossaries and acronym package (#3589)
Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.
2017-08-16 10:24:46 -07:00
John MacFarlane
68434957d6 Fixed command test #2994 on Windows. 2017-08-16 09:47:25 -07:00
John MacFarlane
892a4edeb1 Implement multicolumn support for slide formats.
The structure expected is:

    <div class="columns">
      <div class="column" width="40%">
        contents...
      </div>
      <div class="column" width="60%">
        contents...
      </div>
    </div>

Support has been added for beamer and all HTML slide formats.

Closes #1710.

Note:  later we could add a more elegant way to create
this structure in Markdown than to use raw HTML div elements.
This would come for free with a "native div syntax" (#168).
Or we could devise something specific to slides
2017-08-14 23:17:44 -07:00
John MacFarlane
9c577b17b7 Update tests for changes to LaTeX template. 2017-08-14 13:19:54 -07:00
John MacFarlane
38578ad06c Test fixes so we can find data files.
In old tests & command tests, we now set the environment variable
pandoc_datadir.

In lua tests, we set the datadir explicitly.
2017-08-14 13:03:26 -07:00
John MacFarlane
319d7ed6ff Changed command test for #2994 so it actually tests the writer. 2017-08-14 00:00:50 -07:00
John MacFarlane
c7cbd3ddc2 Fixed command tests to set local path.
Previously we just tacked on a directory to the command
line, but that didn't work when we e.g. used a pipe for round tripping,
with two invocations of pandoc.
2017-08-13 23:59:38 -07:00
schrieveslaach
2845ab5976 Put content of \ref, \label commands into span… (#3639)
* Put content of `\ref` and `\label` commands into Span elements so they can be used in filters.
* Add support for `\eqref`
2017-08-13 10:58:45 -07:00
John MacFarlane
8f65590ce9 CommonMark writer: prefer pipe tables to HTML tables...
...even if it means losing relative column width information.
See #3734.
2017-08-13 10:43:43 -07:00
John MacFarlane
506866ef73 Markdown writer: Use pipe tables if raw_html disabled...
and `pipe_tables` enabled, even if the table has relative
width information.

Closes #3734.
2017-08-13 10:37:24 -07:00
Albert Krewinkel
2dc3dbd68b Use hslua >= 0.7, update Lua code 2017-08-13 14:23:54 +02:00
John MacFarlane
418bda8128 Docx writer: pass through comments.
We assume that comments are defined as parsed by the
docx reader:

    I want <span class="comment-start" id="0" author="Jesse Rosenthal"
    date="2016-05-09T16:13:00Z">I left a comment.</span>some text to
    have a comment <span class="comment-end" id="0"></span>on it.

We assume also that the id attributes are unique and properly
matched between comment-start and comment-end.

Closes #2994.
2017-08-12 22:59:53 -07:00
John MacFarlane
be9957bddc Escape MetaString values (as added with --metadata flag).
Previously they would be transmitted to the template without
any escaping.

Note that `--M title='*foo*'` yields a different result from

    ---
    title: *foo*
    ---

In the latter case, we have emphasis; in the former case, just
a string with literal asterisks (which will be escaped
in formats, like Markdown, that require it).

Closes #3792.
2017-08-12 20:27:42 -07:00
John MacFarlane
0ab8670a0e LaTeX reader: Fixed space after \figurename etc. 2017-08-12 13:40:28 -07:00
John MacFarlane
467ca2a1ad Fixed data-dir on translations tests. 2017-08-12 10:39:25 -07:00
John MacFarlane
dbb81f513c More translation tests. 2017-08-11 23:59:27 -07:00
John MacFarlane
9abb688f29 Added simple test for translations. 2017-08-11 23:57:28 -07:00
John MacFarlane
7892dcd353 Command tests; print stderr when a test fails. 2017-08-11 22:09:15 -07:00
John MacFarlane
83d856ee6c Fixed writer tests not to use writerUserDataDir. 2017-08-10 23:51:42 -07:00
John MacFarlane
dee4cbc854 RST reader: implement csv-table directive.
Most attributes are supported, including `:file:` and `:url:`.
A (probably insufficient) test case has been added.

Closes #3533.
2017-08-10 15:01:14 -07:00