Commit graph

5117 commits

Author SHA1 Message Date
John MacFarlane
23f3c2d7b4 Changed "extracting..." warning to a regular log message.
This makes it sensitive to proper verbosity settings.
(It is now treated as INFO rather than WARNING, so one
doesn't get these messages for creation of tmp images
while making a pdf.)

API changes:

* Removed extractMediaBag from Text.Pandoc.MediaBag.
* Added Extracting as constructor for LogMessage.
2017-06-12 15:28:39 +02:00
John MacFarlane
8a000e3ecc Markdown writer: don't allow soft break in header.
Closes #3736.
2017-06-12 09:23:30 +02:00
John MacFarlane
b466152d61 Don't allow backslash + newline to affect block structure.
Note that as a result of this change, the following,
which formerly produced a header with two lines separated
by a line break, will now produce a header followed by a
paragraph:

    # Hi\
    there

This may affect some existing documents that relied on
this undocumented and unintended behavior.

This change makes pandoc more consistent with other
Markdown implementations, and with itself (since the two-space
version of a line break doesn't work inside ATX headers, and
neither version works inside Setext headers).

Closes #3730.
2017-06-11 22:24:20 +02:00
John MacFarlane
d1da54a4c3 Properly decode source from stdin.
This should fix the appveyor failures.
2017-06-11 21:22:44 +02:00
John MacFarlane
49b738de4e Rewrote HTML reader to use Text throughout.
- Export new NamedTag class from HTML reader.
- Effect on memory usage is modest (< 10%).
2017-06-11 21:15:58 +02:00
schrieveslaach
f36de77a25 Support for \faCheck and \faClose (#3727) 2017-06-11 07:47:42 +02:00
John MacFarlane
fa719d0264 Switched Writer types to use Text.
* XML.toEntities: changed type to Text -> Text.
* Shared.tabFilter -- fixed so it strips out CRs as before.
* Modified writers to take Text.
* Updated tests, benchmarks, trypandoc.

[API change]

Closes #3731.
2017-06-11 00:46:31 +02:00
John MacFarlane
0c2a509dfb Writers.Shared: metaToJSON, generalized type so it can take a Text.
Previously a String was needed as argument; now any ToJSON
instance will do.

API change.
2017-06-10 23:10:33 +02:00
John MacFarlane
e8cc9faa41 Writers: changed StringWriter -> TextWriter. 2017-06-10 21:54:26 +02:00
John MacFarlane
94b3dacb4e Changed all readers to take Text instead of String.
Readers: Renamed StringReader -> TextReader.

Updated tests.

API change.
2017-06-10 18:26:44 +02:00
John MacFarlane
d6822157e7 Readers: Changed StringReader -> TextReader. 2017-06-10 16:06:15 +02:00
John MacFarlane
d1e78d96b6 UTF8: export fromText, fromTextLazy. 2017-06-10 16:05:56 +02:00
John MacFarlane
627e27fc1e App: change readSource(s) to use Text instead of String. 2017-06-10 15:55:18 +02:00
John MacFarlane
c691b97506 UTF8: export toText, toTextLazy.
Define toString, toStringLazy in terms of them.
2017-06-10 15:54:35 +02:00
John MacFarlane
72b45f05ed Rewrote convertTabs to use Text not String. 2017-06-10 15:22:25 +02:00
Albert Krewinkel
55d679e382
Improve code style in lua and org modules 2017-06-03 13:35:19 +02:00
Albert Krewinkel
d55f01c65f
Org reader: apply hlint suggestions 2017-06-03 00:05:20 +02:00
John MacFarlane
b61a51ee15 hlint suggestions. 2017-06-02 15:25:39 +02:00
John MacFarlane
18f86a0c02 Fixed keywords in docx writer.
(See #3719)
2017-06-02 10:17:19 +02:00
John MacFarlane
a063ba58fd Merge pull request #3719 from iandol/patch-2
Add keywords metadata to docx core.xml document properties
2017-06-02 10:11:37 +02:00
John MacFarlane
e43ea03410 Fixed HTML reader. 2017-06-02 10:10:31 +02:00
Ian
b2fe1015d9 Add keywords metadata to docx document properties
Hi, I don't know haskell so possibly this is wrong, but DOCX stores keywords in cp:keywords in core.xml, and this should be easy to add from the pandoc metadata (I copy and paste the author code). As far as I can tell (no clear documentation, just a few refs), keywords should be separated with a comma.
2017-06-02 10:47:30 +08:00
John MacFarlane
eb6fb62e55 HTML reader: Use sets instead of lists for block tag lookup. 2017-06-01 19:15:00 +02:00
John MacFarlane
0f07404daf HTML reader: Removed "button" from block tag list.
It is already in the eitherBlockOrInlineTag list, and
should be both places.

Closes #3717.

Note: the result of this change is that there will be
p tags around the whole paragraph.  That is the right
result, because the `button` tags are treated as inline
HTML here, and the whole chunk of text is a Markdown
paragraph.
2017-06-01 18:59:21 +02:00
John MacFarlane
8218bdb95c HTML writer: Avoid two class attributes when adding 'uri' class.
Closes #3716.
2017-06-01 18:41:54 +02:00
John MacFarlane
0cf6511f16 Some hlint refactoring. 2017-06-01 15:09:38 +02:00
John MacFarlane
b1a9b567aa Trivial reformatting. 2017-06-01 14:19:43 +02:00
John MacFarlane
c2eb7d0857 Use isNothing. 2017-06-01 14:16:17 +02:00
John MacFarlane
00d8585d8f Trivial renaming. 2017-06-01 14:14:42 +02:00
John MacFarlane
c366fab2cb Markdown writer: Avoid inline surround-marking with empty content.
E.g. we don't want `<strong></strong>` to become `****`.
Similarly for emphasis, super/subscript, strikeout.

Closes #3715.
2017-06-01 12:30:58 +02:00
John MacFarlane
9396f1fb67 LaTeX reader: handle some width specifiers on table columns.
Currently we only handle the form `0.9\linewidth`.
Anything else would have to be converted to a percentage,
using some kind arbitrary assumptions about line widths.

See #3709.
2017-06-01 12:08:28 +02:00
John MacFarlane
af6e8414c7 LaTeX reader: more table refactoring. 2017-06-01 11:56:59 +02:00
John MacFarlane
58cfac84f0 LaTeX reader: Small refactoring of table parsing code.
This makes room for doing something with widths.
2017-06-01 11:35:03 +02:00
John MacFarlane
1e7ba5ccd7 LaTeX reader: Handle block structure inside table cells.
minipage is no longer required.

Closes #3709.
2017-06-01 11:16:28 +02:00
John MacFarlane
a61dce88e8 Merge pull request #3714 from tarleb/odt-reader-cleanup
Odt reader: remove dead code
2017-06-01 10:26:25 +02:00
Marc Schreiber
181c56d400 Add \colorbox support 2017-06-01 09:50:51 +02:00
Albert Krewinkel
e1a0666689
Org reader: respect export option for tags
Tags are appended to headlines by default, but will be omitted when the
`tags` export option is set to nil.

Closes: #3713
2017-05-31 21:26:07 +02:00
Albert Krewinkel
33a1e4ae1a
Org reader: include tags in headlines
The Emacs default is to include tags in the headline when exporting.
Instead of just empty spans, which contain the tag name as attribute,
tags are rendered as small caps and wrapped in those spans.
Non-breaking spaces serve as separators for multiple tags.
2017-05-31 20:43:30 +02:00
Albert Krewinkel
7852cd5603
Org reader: recognize babel result blocks with attributes
Babel result blocks can have block attributes like captions and names.
Result blocks with attributes were not recognized and were parsed as
normal blocks without attributes.

Fixes: #3706
2017-05-31 20:01:04 +02:00
Albert Krewinkel
4b98d0459a
Org reader: fix module names in haddock comments
Copy-pasting had lead to haddock module descriptions containing the
wrong module names.
2017-05-31 20:01:04 +02:00
Albert Krewinkel
f955af58e6
Odt reader: remove dead code
The ODT reader contained a lot of general code useful for working with
arrows. However, many of these utils weren't used and are hence removed.
2017-05-31 19:59:34 +02:00
John MacFarlane
774075c3e2 Added eastAsianLineBreakFilter to Shared.
This used to live in the Markdown reader.
2017-05-30 10:22:48 +02:00
John MacFarlane
5ec384eb60 LaTeX reader: handle escaped & inside table cell.
Closes #3708.
2017-05-29 22:47:04 +02:00
John MacFarlane
230a1b89e8 LaTeX reader: don't crash on empty enumerate environment.
Closes #3707.
2017-05-29 15:09:24 +02:00
John MacFarlane
d461b29d9d Merge pull request #3704 from labdsf/anylinenewline
Markdown reader: use anyLineNewline
2017-05-29 09:25:53 +02:00
Alexander Krotov
efc069de5d Markdown reader: use anyLineNewline 2017-05-28 22:52:35 +03:00
Herwig Stuetz
bfd5c6b172 Org reader: Fix cite parsing behaviour
Until now, org-ref cite keys included special characters also at the
end. This caused problems when citations occur right before colons or
at the end of a sentence.

With this change, all non alphanumeric characters at the end of a cite
key are ignored.

This also adds `,` to the list of special characters that are legal
in cite keys to better mirror the behaviour of org-export.
2017-05-28 18:08:11 +02:00
Herwig Stuetz
5a71632d11 Parsing: many1Till: Check for the end condition before parsing
By not checking for the end condition before the first parse, the
parser was applied too often, consuming too much of the input.

This fixes the behaviour of

  `testStringWith (many1Till (oneOf "ab") (string "aa")) "aaa"`

which before incorrectly returned `Right "a"`. With this change, it
instead correctly fails with `Left (PandocParsecError ...)` because it
is not able to parse at least one occurence of `oneOf "ab"` that is
not `"aa"`.

Note that this only affects `many1Till p end` where `p` matches on a
prefix of `end`.
2017-05-28 18:08:11 +02:00
Alexander Krotov
c38d5966ed RST reader: use anyLineNewline in rawListItem (#3702) 2017-05-28 09:29:37 +02:00
John MacFarlane
8614902234 Markdown writer: changes to --reference-links.
With `--reference-location` of `section` or `block`, pandoc
will now repeat references that have been used in earlier
sections.

The Markdown reader has also been modified, so that *exactly*
repeated references do not generate a warning, only
references with the same label but different targets.

The idea is that, with references after every block,
one  might want to repeat references sometimes.

Closes #3701.
2017-05-27 23:18:45 +02:00