Commit graph

4361 commits

Author SHA1 Message Date
John MacFarlane
d1e78d96b6 UTF8: export fromText, fromTextLazy. 2017-06-10 16:05:56 +02:00
John MacFarlane
627e27fc1e App: change readSource(s) to use Text instead of String. 2017-06-10 15:55:18 +02:00
John MacFarlane
c691b97506 UTF8: export toText, toTextLazy.
Define toString, toStringLazy in terms of them.
2017-06-10 15:54:35 +02:00
John MacFarlane
72b45f05ed Rewrote convertTabs to use Text not String. 2017-06-10 15:22:25 +02:00
Albert Krewinkel
55d679e382 Improve code style in lua and org modules 2017-06-03 13:35:19 +02:00
Albert Krewinkel
d55f01c65f Org reader: apply hlint suggestions 2017-06-03 00:05:20 +02:00
John MacFarlane
b61a51ee15 hlint suggestions. 2017-06-02 15:25:39 +02:00
John MacFarlane
18f86a0c02 Fixed keywords in docx writer.
(See #3719)
2017-06-02 10:17:19 +02:00
John MacFarlane
a063ba58fd Merge pull request #3719 from iandol/patch-2
Add keywords metadata to docx core.xml document properties
2017-06-02 10:11:37 +02:00
John MacFarlane
e43ea03410 Fixed HTML reader. 2017-06-02 10:10:31 +02:00
Ian
b2fe1015d9 Add keywords metadata to docx document properties
Hi, I don't know Haskell, so this may be wrong, but DOCX stores keywords in cp:keywords in core.xml, and this should be easy to add from the pandoc metadata (I copied and pasted the author code). As far as I can tell (no clear documentation, just a few references), keywords should be separated with a comma.
2017-06-02 10:47:30 +08:00
John MacFarlane
eb6fb62e55 HTML reader: Use sets instead of lists for block tag lookup. 2017-06-01 19:15:00 +02:00
John MacFarlane
0f07404daf HTML reader: Removed "button" from block tag list.
It is already in the eitherBlockOrInlineTag list, and
should not be in both places.

Closes #3717.

Note: the result of this change is that there will be
p tags around the whole paragraph.  That is the right
result, because the `button` tags are treated as inline
HTML here, and the whole chunk of text is a Markdown
paragraph.
2017-06-01 18:59:21 +02:00
John MacFarlane
8218bdb95c HTML writer: Avoid two class attributes when adding 'uri' class.
Closes #3716.
2017-06-01 18:41:54 +02:00
John MacFarlane
0cf6511f16 Some hlint refactoring. 2017-06-01 15:09:38 +02:00
John MacFarlane
b1a9b567aa Trivial reformatting. 2017-06-01 14:19:43 +02:00
John MacFarlane
c2eb7d0857 Use isNothing. 2017-06-01 14:16:17 +02:00
John MacFarlane
00d8585d8f Trivial renaming. 2017-06-01 14:14:42 +02:00
John MacFarlane
c366fab2cb Markdown writer: Avoid inline surround-marking with empty content.
E.g. we don't want `<strong></strong>` to become `****`.
Similarly for emphasis, super/subscript, strikeout.

Closes #3715.
2017-06-01 12:30:58 +02:00
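
The rule described in this commit can be illustrated with a minimal sketch. The `surround` helper below is hypothetical and is not pandoc's writer code; it only shows the idea of emitting nothing when the content is empty instead of producing back-to-back markers.

```haskell
-- Hypothetical helper for illustration; not taken from pandoc's Markdown writer.
-- Wraps content in a marker, but emits nothing when the content is empty,
-- so an empty <strong></strong> does not turn into "****".
surround :: String -> String -> String
surround marker content
  | null content = ""
  | otherwise    = marker ++ content ++ marker

main :: IO ()
main = do
  putStrLn (surround "**" "bold")  -- prints **bold**
  print (surround "**" "")         -- prints ""
```

The same guard would apply to any marker pair (emphasis, super/subscript, strikeout), since the check depends only on the content.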
John MacFarlane
9396f1fb67 LaTeX reader: handle some width specifiers on table columns.
Currently we only handle the form `0.9\linewidth`.
Anything else would have to be converted to a percentage,
using some kind of arbitrary assumptions about line widths.

See #3709.
2017-06-01 12:08:28 +02:00
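
As a rough sketch of what "only the form `0.9\linewidth`" means, the hypothetical function below accepts a number immediately followed by `\linewidth` and rejects everything else. The name `linewidthFraction` and the use of `reads` are assumptions for illustration; the actual reader works on parsed LaTeX tokens, not on raw strings.

```haskell
-- Hypothetical helper for illustration; not pandoc's LaTeX reader code.
-- Accepts specifiers of the form "<number>\linewidth" and returns the number
-- as a fraction of the line width. Any other unit is rejected, because it
-- could only be converted by assuming some concrete line width.
linewidthFraction :: String -> Maybe Double
linewidthFraction spec =
  case reads spec of
    [(n, "\\linewidth")] -> Just n
    _                    -> Nothing

main :: IO ()
main = do
  print (linewidthFraction "0.9\\linewidth")
  print (linewidthFraction "3cm")
```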
John MacFarlane
af6e8414c7 LaTeX reader: more table refactoring. 2017-06-01 11:56:59 +02:00
John MacFarlane
58cfac84f0 LaTeX reader: Small refactoring of table parsing code.
This makes room for doing something with widths.
2017-06-01 11:35:03 +02:00
John MacFarlane
1e7ba5ccd7 LaTeX reader: Handle block structure inside table cells.
minipage is no longer required.

Closes #3709.
2017-06-01 11:16:28 +02:00
John MacFarlane
a61dce88e8 Merge pull request #3714 from tarleb/odt-reader-cleanup
Odt reader: remove dead code
2017-06-01 10:26:25 +02:00
Albert Krewinkel
e1a0666689 Org reader: respect export option for tags
Tags are appended to headlines by default, but will be omitted when the
`tags` export option is set to nil.

Closes: #3713
2017-05-31 21:26:07 +02:00
Albert Krewinkel
33a1e4ae1a Org reader: include tags in headlines
The Emacs default is to include tags in the headline when exporting.
Instead of just empty spans, which contain the tag name as attribute,
tags are rendered as small caps and wrapped in those spans.
Non-breaking spaces serve as separators for multiple tags.
2017-05-31 20:43:30 +02:00
Albert Krewinkel
7852cd5603 Org reader: recognize babel result blocks with attributes
Babel result blocks can have block attributes like captions and names.
Result blocks with attributes were not recognized and were parsed as
normal blocks without attributes.

Fixes: #3706
2017-05-31 20:01:04 +02:00
Albert Krewinkel
4b98d0459a Org reader: fix module names in haddock comments
Copy-pasting had led to haddock module descriptions containing the
wrong module names.
2017-05-31 20:01:04 +02:00
Albert Krewinkel
f955af58e6 Odt reader: remove dead code
The ODT reader contained a lot of general code useful for working with
arrows. However, many of these utils weren't used and are hence removed.
2017-05-31 19:59:34 +02:00
John MacFarlane
774075c3e2 Added eastAsianLineBreakFilter to Shared.
This used to live in the Markdown reader.
2017-05-30 10:22:48 +02:00
John MacFarlane
5ec384eb60 LaTeX reader: handle escaped & inside table cell.
Closes #3708.
2017-05-29 22:47:04 +02:00
John MacFarlane
230a1b89e8 LaTeX reader: don't crash on empty enumerate environment.
Closes #3707.
2017-05-29 15:09:24 +02:00
John MacFarlane
d461b29d9d Merge pull request #3704 from labdsf/anylinenewline
Markdown reader: use anyLineNewline
2017-05-29 09:25:53 +02:00
Alexander Krotov
efc069de5d Markdown reader: use anyLineNewline 2017-05-28 22:52:35 +03:00
Herwig Stuetz
bfd5c6b172 Org reader: Fix cite parsing behaviour
Until now, org-ref cite keys included special characters also at the
end. This caused problems when citations occur right before colons or
at the end of a sentence.

With this change, all non-alphanumeric characters at the end of a cite
key are ignored.

This also adds `,` to the list of special characters that are legal
in cite keys to better mirror the behaviour of org-export.
2017-05-28 18:08:11 +02:00
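
The trailing-character rule can be sketched as follows. `trimCiteKey` is a hypothetical name, and the sketch works on an already extracted key string rather than inside the Org parser, so it illustrates the behaviour only, not the implementation.

```haskell
import Data.Char (isAlphaNum)
import Data.List (dropWhileEnd)

-- Hypothetical helper for illustration; not pandoc's Org reader code.
-- Drops every non-alphanumeric character at the end of a cite key, while
-- leaving special characters inside the key (such as ',') untouched.
trimCiteKey :: String -> String
trimCiteKey = dropWhileEnd (not . isAlphaNum)

main :: IO ()
main = do
  putStrLn (trimCiteKey "doe2017:")   -- prints doe2017
  putStrLn (trimCiteKey "doe,2017.")  -- prints doe,2017
```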
Herwig Stuetz
5a71632d11 Parsing: many1Till: Check for the end condition before parsing
By not checking for the end condition before the first parse, the
parser was applied too often, consuming too much of the input.

This fixes the behaviour of

  `testStringWith (many1Till (oneOf "ab") (string "aa")) "aaa"`

which before incorrectly returned `Right "a"`. With this change, it
instead correctly fails with `Left (PandocParsecError ...)` because it
is not able to parse at least one occurrence of `oneOf "ab"` that is
not `"aa"`.

Note that this only affects `many1Till p end` where `p` matches on a
prefix of `end`.
2017-05-28 18:08:11 +02:00
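
The difference can be reproduced with plain Parsec. The sketch below is an approximation under stated assumptions: it uses `Text.Parsec` directly with `String` input instead of pandoc's own parser types, the names `many1TillOld` and `many1TillNew` are invented for the comparison, and `notFollowedBy` stands in for whatever check the real code uses.

```haskell
import Text.Parsec (Parsec, manyTill, notFollowedBy, oneOf, parse, string, try)

-- Pre-fix shape (assumed): the first item is parsed before the end
-- condition is ever checked.
many1TillOld :: Parsec String () a -> Parsec String () e -> Parsec String () [a]
many1TillOld p end = do
  first <- p
  rest  <- manyTill p end
  return (first : rest)

-- Post-fix shape (assumed): fail immediately if the input already starts
-- with the end condition, then parse at least one item.
many1TillNew :: Show e => Parsec String () a -> Parsec String () e -> Parsec String () [a]
many1TillNew p end = do
  notFollowedBy end
  first <- p
  rest  <- manyTill p end
  return (first : rest)

item :: Parsec String () Char
item = oneOf "ab"

stop :: Parsec String () String
stop = try (string "aa")

main :: IO ()
main = do
  -- Old behaviour: consumes one 'a', then sees "aa" and stops.
  print (parse (many1TillOld item stop) "" "aaa")  -- prints Right "a"
  -- New behaviour: the input starts with the end condition, so parsing fails.
  putStrLn (either (const "Left <parse error>") show
                   (parse (many1TillNew item stop) "" "aaa"))
```

Inputs where `p` cannot match a prefix of `end` behave the same under both versions, which is the caveat the commit message closes with.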
Alexander Krotov
c38d5966ed RST reader: use anyLineNewline in rawListItem (#3702) 2017-05-28 09:29:37 +02:00
John MacFarlane
8614902234 Markdown writer: changes to --reference-links.
With `--reference-location` of `section` or `block`, pandoc
will now repeat references that have been used in earlier
sections.

The Markdown reader has also been modified, so that *exactly*
repeated references do not generate a warning, only
references with the same label but different targets.

The idea is that, with references after every block,
one might want to repeat references sometimes.

Closes #3701.
2017-05-27 23:18:45 +02:00
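
The warning rule on the reader side can be sketched as a fold over reference definitions. `conflictingLabels` and the `(label, target)` pair representation are assumptions for illustration; pandoc's parser state and warning machinery are not modelled here.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical helper for illustration; not pandoc's Markdown reader code.
-- Given reference definitions as (label, target) pairs in document order,
-- returns the labels that were redefined with a different target.
-- Exact repeats of an earlier definition are accepted silently.
conflictingLabels :: [(String, String)] -> [String]
conflictingLabels = reverse . snd . foldl step (Map.empty, [])
  where
    step (seen, warns) (label, target) =
      case Map.lookup label seen of
        Nothing -> (Map.insert label target seen, warns)
        Just old
          | old == target -> (seen, warns)          -- exact repeat: no warning
          | otherwise     -> (seen, label : warns)  -- same label, new target

main :: IO ()
main =
  print (conflictingLabels
           [("a", "u1"), ("b", "u2"), ("a", "u1"), ("b", "u3")])  -- prints ["b"]
```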
John MacFarlane
4dabcc27f6 Pretty: Eq instance for Doc. 2017-05-27 23:18:45 +02:00
Albert Krewinkel
bf93c07267 Org reader: subject full doc tree to headline transformations
Emacs parses org documents into a tree structure, which is then
post-processed during exporting. The reader is changed to do the same,
turning the document into a single tree of headlines starting at
level 0.

Fixes: #3695
2017-05-27 15:38:08 +02:00
John MacFarlane
8ec03cfc87 HTML writer: Removed unused parameter in dimensionsToAttributeList. 2017-05-26 10:21:55 +02:00
John MacFarlane
cb7b0a6985 Allow em for image height/width in HTML, LaTeX.
- Export `inEm` from ImageSize [API change].
- Change `showFl` and `show` instance for `Dimension` so
  extra decimal places are omitted.
- Added `Em` as a constructor of `Dimension` [API change].
- Allow `em`, `cm`, `in` to pass through without conversion
  in HTML, LaTeX.

Closes #3450.
2017-05-25 22:48:27 +02:00
John MacFarlane
708973a33a Added spaced_reference_links extension.
This is now the default for pandoc's Markdown.
It allows whitespace between the two parts of a
reference link:  e.g.

    [a] [b]

    [b]: url

This is now forbidden by default.

Closes #2602.
2017-05-25 12:57:31 +02:00
John MacFarlane
650e1ac1fd Docx writer: Use Table rather than "Table Normal" for table style.
"Table Normal" is the default table style and can't be modified.

Closes #3275, further testing welcome.
2017-05-25 12:11:46 +02:00
John MacFarlane
8f2c803f97 Markdown reader: warn for notes defined but not used.
Closes #1718.

Parsing.ParserState: Make stateNotes' a Map, add stateNoteRefs.
2017-05-25 11:34:51 +02:00
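
Keeping the notes in a map and the references in a separate collection makes the "defined but not used" check a simple lookup. The sketch below assumes a `Map` from note label to note text and a `Set` of referenced labels; the name `unusedNotes` and these exact types are illustrative and are not the fields of `ParserState`.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical helper for illustration; not pandoc's ParserState code.
-- Returns the labels of notes that were defined but never referenced.
unusedNotes :: Map.Map String String -> Set.Set String -> [String]
unusedNotes notes refs = filter (`Set.notMember` refs) (Map.keys notes)

main :: IO ()
main = do
  let notes = Map.fromList [("1", "first note"), ("2", "second note")]
      refs  = Set.fromList ["1"]
  print (unusedNotes notes refs)  -- prints ["2"]
```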
John MacFarlane
41db9e826e MediaWiki reader: don't do curly quotes inside <tt> contexts.
Even if `+smart`.

See #3585.
2017-05-25 09:35:25 +02:00
John MacFarlane
e6f4636a2c MediaWiki reader: Make smart double quotes depend on smart extension.
Closes #3585.
2017-05-25 09:19:34 +02:00
John MacFarlane
b9a30ef959 Markdown reader: fixed smart quotes after emphasis.
E.g. in

    *foo*'s 'foo'

Closes #2228.
2017-05-24 23:23:08 +02:00
John MacFarlane
8f718b0883 LaTeX reader: Fixed failures on \ref{}, \label{} with +raw_tex.
Now these commands are parsed as raw if `+raw_tex`;
otherwise, their argument is parsed as a bracketed string.
2017-05-24 23:04:49 +02:00
John MacFarlane
bc6aac7b47 Parsing: Provide parseFromString'.
This is a version of parseFromString specialized to
ParserState, which resets stateLastStrPos at the end.
This is almost always what we want.

This fixes a bug where `_hi_` wasn't treated as emphasis in
the following, because pandoc got confused about the
position of the last word:

    - [o] _hi_

Closes #3690.
2017-05-24 22:41:47 +02:00