Commit graph

1900 commits

Author SHA1 Message Date
Albert Krewinkel
4b98d0459a
Org reader: fix module names in haddock comments
Copy-pasting had lead to haddock module descriptions containing the
wrong module names.
2017-05-31 20:01:04 +02:00
John MacFarlane
774075c3e2 Added eastAsianLineBreakFilter to Shared.
This used to live in the Markdown reader.
2017-05-30 10:22:48 +02:00
John MacFarlane
5ec384eb60 LaTeX reader: handle escaped & inside table cell.
Closes #3708.
2017-05-29 22:47:04 +02:00
John MacFarlane
230a1b89e8 LaTeX reader: don't crash on empty enumerate environment.
Closes #3707.
2017-05-29 15:09:24 +02:00
John MacFarlane
d461b29d9d Merge pull request #3704 from labdsf/anylinenewline
Markdown reader: use anyLineNewline
2017-05-29 09:25:53 +02:00
Alexander Krotov
efc069de5d Markdown reader: use anyLineNewline 2017-05-28 22:52:35 +03:00
Herwig Stuetz
bfd5c6b172 Org reader: Fix cite parsing behaviour
Until now, org-ref cite keys included special characters also at the
end. This caused problems when citations occur right before colons or
at the end of a sentence.

With this change, all non alphanumeric characters at the end of a cite
key are ignored.

This also adds `,` to the list of special characters that are legal
in cite keys to better mirror the behaviour of org-export.
2017-05-28 18:08:11 +02:00
Herwig Stuetz
5a71632d11 Parsing: many1Till: Check for the end condition before parsing
By not checking for the end condition before the first parse, the
parser was applied too often, consuming too much of the input.

This fixes the behaviour of

  `testStringWith (many1Till (oneOf "ab") (string "aa")) "aaa"`

which before incorrectly returned `Right "a"`. With this change, it
instead correctly fails with `Left (PandocParsecError ...)` because it
is not able to parse at least one occurence of `oneOf "ab"` that is
not `"aa"`.

Note that this only affects `many1Till p end` where `p` matches on a
prefix of `end`.
2017-05-28 18:08:11 +02:00
Alexander Krotov
c38d5966ed RST reader: use anyLineNewline in rawListItem (#3702) 2017-05-28 09:29:37 +02:00
John MacFarlane
8614902234 Markdown writer: changes to --reference-links.
With `--reference-location` of `section` or `block`, pandoc
will now repeat references that have been used in earlier
sections.

The Markdown reader has also been modified, so that *exactly*
repeated references do not generate a warning, only
references with the same label but different targets.

The idea is that, with references after every block,
one  might want to repeat references sometimes.

Closes #3701.
2017-05-27 23:18:45 +02:00
Albert Krewinkel
bf93c07267
Org reader: subject full doc tree to headline transformations
Emacs parses org documents into a tree structure, which is then
post-processed during exporting. The reader is changed to do the same,
turning the document into a single tree of headlines starting at
level 0.

Fixes: #3695
2017-05-27 15:38:08 +02:00
John MacFarlane
708973a33a Added spaced_reference_links extension.
This is now the default for pandoc's Markdown.
It allows whitespace between the two parts of a
reference link:  e.g.

    [a] [b]

    [b]: url

This is now forbidden by default.

Closes #2602.
2017-05-25 12:57:31 +02:00
John MacFarlane
8f2c803f97 Markdown reader: warn for notes defined but not used.
Closes #1718.

Parsing.ParserState: Make stateNotes' a Map, add stateNoteRefs.
2017-05-25 11:34:51 +02:00
John MacFarlane
41db9e826e MediaWiki reader: don't do curly quotes inside <tt> contexts.
Even if `+smart`.

See #3585.
2017-05-25 09:35:25 +02:00
John MacFarlane
e6f4636a2c MediaWiki reader: Make smart double quotes depend on smart extension.
Closes #3585.
2017-05-25 09:19:34 +02:00
John MacFarlane
b9a30ef959 Markdown reader: fixed smart quotes after emphasis.
E.g. in

    *foo*'s 'foo'

Closes #2228.
2017-05-24 23:23:08 +02:00
John MacFarlane
8f718b0883 LaTeX reader: Fixed failures on \ref{}, \label{} with +raw_tex.
Now these commands are parsed as raw if `+raw_tex`;
otherwise, their argument is parsed as a bracketed string.
2017-05-24 23:04:49 +02:00
John MacFarlane
bc6aac7b47 Parsing: Provide parseFromString'.
This is a verison of parseFromString specialied to
ParserState, which resets stateLastStrPos at the end.
This is almost always what we want.

This fixes a bug where `_hi_` wasn't treated as emphasis in
the following, because pandoc got confused about the
position of the last word:

    - [o] _hi_

Closes #3690.
2017-05-24 22:41:47 +02:00
John MacFarlane
1288a50380 LaTeX reader: parse tikzpicture as raw verbatim environment...
if `raw_tex` extension is selected.
Otherwise skip with a warning.

This is better than trying to parse it as text!

Closes #3692.
2017-05-24 21:46:53 +02:00
John MacFarlane
7174776c19 HTML reader: Add details tag to list of block tags.
Closes #3694.
2017-05-24 12:11:12 +02:00
John MacFarlane
5844af67b4 RST reader: reformatting (code line length). 2017-05-23 21:00:51 +02:00
keiichiro shikano
c0c54b7906 RST Reader: parse list table directive (#3688)
Closes #3432.
2017-05-23 20:53:04 +02:00
Albert Krewinkel
5debb0da0f Shared: Provide custom isURI that rejects unknown schemes [isURI]
We also export the set of known `schemes`.

The new function replaces the function of the same name
from `Network.URI`, as the latter did not check whether a scheme is
well-known.  E.g. MediaWiki wikis frequently feature pages with names
like `User:John`. These links were interpreted as URIs, thus turning
internal links into global links. This is prevented by also checking
whether the scheme of a URI is frequently used (i.e. is IANA registered
or an otherwise well-known scheme).

Fixes: #2713

Update set of well-known URIs from IANA list
All official IANA schemes (as of 2017-05-22) are included in the set of
known schemes.  The four non-official schemes doi, isbn, javascript, and
pmid are kept.
2017-05-23 09:48:11 +02:00
Alexander Krotov
30a3deadcc Move indentWith to Text.Pandoc.Parsing (#3687) 2017-05-22 10:10:15 +02:00
John MacFarlane
8c1b81bbef Finished implemtation of --resource-path.
* Default is just working directory.
* Working directory must be explicitly specifide if
  `--resource-path` option is used.
2017-05-21 09:02:01 +02:00
Alexander Krotov
753d5811e2 RST reader: make use of anyLineNewline (#3686) 2017-05-20 23:14:08 +02:00
Albert Krewinkel
7a09b7b21d
Org reader: fix smart parsing behavior
Parsing of smart quotes and special characters can either be enabled via
the `smart` language extension or the `'` and `-` export options. Smart
parsing is active if either the extension or export option is enabled.
Only smart parsing of special characters (like ellipses and en and em
dashes) is enabled by default, while smart quotes are disabled.

This means that all smart parsing features will be enabled by adding the
`smart` language extension. Fine-grained control is possible by leaving
the language extension disabled. In that case, smart parsing is
controlled via the aforementioned export OPTIONS only.

Previously, all smart parsing was disabled unless the language extension
was enabled.
2017-05-18 23:25:11 +02:00
John MacFarlane
818d5c2f35 Markdown: allow attributes in reference links to start on next line.
This addresses a subsidiary issue in #3674.
2017-05-18 13:20:32 +02:00
John MacFarlane
61e965b117 Merge pull request #3676 from labdsf/space-char
Txt2Tags parser: newline is not indentation
2017-05-17 12:47:27 +02:00
John MacFarlane
377733e08f Merge pull request #3677 from labdsf/anylinenewline
Move anyLineNewline to Parsing.hs
2017-05-17 12:47:03 +02:00
Alexander Krotov
55ce47d050 Move anyLineNewline to Parsing.hs 2017-05-17 11:02:38 +03:00
Alexander Krotov
e74bd06cc8 Txt2Tags parser: newline is not indentation
space parses '\n', while spaceChar parses only ' ' and '\t'
2017-05-17 02:12:24 +03:00
Albert Krewinkel
602cd6a327
Org reader: replace sequence . map with mapM 2017-05-16 22:49:52 +02:00
Albert Krewinkel
a27e2e8a4e
Org reader: put tree parsing code into dedicated module 2017-05-16 22:42:34 +02:00
John MacFarlane
fbce4228a5 Merge pull request #3671 from WUUUGI/horizont-spacing
Added support for horizontal spacing in LaTeX
2017-05-16 09:18:57 +02:00
John MacFarlane
37189667cc Textile reader: fix bug for certain links in table cells.
Closes #3667.
2017-05-15 20:36:11 +02:00
Henri Werth
2de5208311 Added support for horizontal spacing in LaTeX: parse \, to \8198 (six-per-em space) 2017-05-15 16:37:08 +02:00
Albert Krewinkel
af4bf91c59
Org reader: add basic file inclusion mechanism
Support for the `#+INCLUDE:` file inclusion mechanism was added.
Recognized include types are *example*, *export*, *src*, and normal org
file inclusion.  Advanced features like line numbers and level selection
are not implemented yet.

Closes: #3510
2017-05-14 12:45:31 +02:00
Albert Krewinkel
965f1ddd4a
Update dates in copyright notices
This follows the suggestions given by the FSF for GPL licensed software.
<https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
2017-05-13 23:30:13 +02:00
Alexander Krotov
2a291e437a Replace repeat' and take' with `replicate' once more 2017-05-12 16:31:57 +02:00
Albert Krewinkel
4b9fb7a128 Combine grid table parsers
The grid table parsers for markdown and rst was combined into one single
parser, slightly changing parsing behavior of both parsers:

- The markdown parser now compactifies block content cell-wise: pure
  text blocks in cells are now treated as paragraphs only if the cell
  contains multiple paragraphs, and as plain blocks otherwise. Before,
  this was true only for single-column tables.

- The rst parser now accepts newlines and multiple blocks in header
  cells.

Closes: #3638
2017-05-11 00:17:56 +02:00
John MacFarlane
82cc7fb0d4 Markdown reader: improved parsing of indented raw HTML blocks.
Previously we inadvertently interpreted indented HTML as
code blocks.  This was a regression.

We now seek to determine the indentation level of the contents
of an HTML block, and (optionally) skip that much indentation.

As a side effect, indentation may be stripped off of raw
HTML blocks, if `markdown_in_html_blocks` is used. This
is better than having things interpreted as indented code
blocks.

Closes #1841.
2017-05-06 22:56:16 +02:00
John MacFarlane
f20c89e243 LaTeX reader: Better handling of comments inside math environments.
This solves a problem with commented out `\end{eqnarray}` inside
an eqnarray (among other things).

Closes #3113.
2017-05-06 22:16:43 +02:00
schrieveslaach
ddf2524477 Fix keyval funtion: pandoc did not parse options in braces correctly.… (#3642)
* Fix keyval funtion: pandoc did not parse options in braces correctly. Additionally, dot, dash, and colon were no valid characters

* Add | as possible option value

* Improved code
2017-05-06 15:09:29 +02:00
Albert Krewinkel
bf44b88522
Drop redundant import of sort
This was left in accidentally.
2017-05-06 11:32:38 +02:00
Albert Krewinkel
da8c153a68
Org reader: support macros
Closes: #3401
2017-05-06 11:00:32 +02:00
Albert Krewinkel
57cba3f1d5
Org reader: support table.el tables
Closes #3314
2017-05-03 22:43:34 +02:00
Albert Krewinkel
df23d96c89
Generalize tableWith, gridTableWith
The parsing functions `tableWith` and `gridTableWith` are generalized to
work with more parsers. The parser state only has to be an instance of
the `HasOptions` class instead of requiring a concrete type. Block
parsers are required to return blocks wrapped into a monad, as this
makes it possible to use parsers returning results wrapped in `Future`s.
2017-05-02 23:41:45 +02:00
schrieveslaach
6e55e6837a LaTeX reader: Add support for tabularx environment (#3632) 2017-05-03 12:16:48 +02:00
Albert Krewinkel
31caa616a9 Provide shared F monad functions for Markdown and Org readers
The `F` monads used for delayed evaluation of certain values in the
Markdown and Org readers are based on a shared data type capturing the
common pattern of both `F` types.
2017-04-30 10:59:20 +02:00