2520 commits

Author SHA1 Message Date
Albert Krewinkel
2f724aaaa4 Org reader: Read anchors as empty spans
Anchors (like <<this>>) are parsed as empty spans.
2014-04-24 17:57:06 +02:00
Albert Krewinkel
c128daba9d Org reader: Recognize plain and angle links
This adds support for plain links (like http://zeitlens.com) and angle
links (like <http://moltkeplatz.de>).
2014-04-24 17:55:24 +02:00
Albert Krewinkel
ec24f9761c RST reader: Remove duplicate 'http' in PEP links
The generated link to PEPs had a duplicate 'http://' in its URL.
2014-04-24 17:55:22 +02:00
John MacFarlane
6a2361c457 Merge pull request #1256 from tarleb/org-reader-improvements
Org reader improvements
2014-04-19 20:35:41 -07:00
Albert Krewinkel
8276449520 Org reader: Allow for compact definition lists
Use `Text.Pandoc.Shared.compactify'DL` to allow for compact definition
lists.
2014-04-19 15:13:16 +02:00
Albert Krewinkel
efebade38b Move compactify'DL from Markdown reader into Shared
The function `compactify'DL`, used to change the final definition item of a
definition list into a `Plain` iff all other items are `Plain`s as well, is
useful in many parsers and hence moved into Text.Pandoc.Shared.
2014-04-19 15:03:33 +02:00
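
The rule described in efebade38b can be sketched outside of pandoc. The Python below is only an illustration under a simplified model, not the Haskell code in `Text.Pandoc.Shared`: blocks are hypothetical `(kind, text)` tuples, each list item is a `(term, blocks)` pair, and `compactify_dl` is a made-up name.

```python
def compactify_dl(items):
    """Turn the final Para of a definition list into a Plain
    when no other block in the list is a Para.

    items: list of (term, blocks); blocks: list of (kind, text) tuples,
    where kind is "Plain" or "Para" (a simplified stand-in for pandoc blocks).
    """
    if not items or not items[-1][1]:
        return items
    all_blocks = [block for _, blocks in items for block in blocks]
    # Only a trailing Para is a candidate for conversion.
    if all_blocks[-1][0] != "Para":
        return items
    # Any other Para means the list is genuinely loose; leave it alone.
    if any(kind == "Para" for kind, _ in all_blocks[:-1]):
        return items
    term, blocks = items[-1]
    compacted = blocks[:-1] + [("Plain", blocks[-1][1])]
    return items[:-1] + [(term, compacted)]
```

A list whose only `Para` is the trailing one comes back fully compact, while a list with a `Para` anywhere else is returned unchanged.
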
Albert Krewinkel
8e91d362a3 Org reader: Fix parsing of footnotes
Footnotes can consist of multiple blocks and end only at a header or at
the beginning of another footnote.  This fixes the previous behavior,
which restricted notes to a single paragraph.
2014-04-19 14:40:46 +02:00
Albert Krewinkel
a69416091b Org reader: Fix distinction of images and normal links
Fixed a false assumption about the precedence of (&&) vs (||).
2014-04-19 11:25:39 +02:00
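
The commit message for a69416091b does not show the faulty expression, so the predicate below is hypothetical. It only illustrates the class of bug, using Python, where `and` binds tighter than `or` just as `(&&)` binds tighter than `(||)` in Haskell. The function names, URL prefixes and extension list are assumptions made for the example.

```python
IMAGE_EXTENSIONS = (".png", ".jpg", ".gif")  # illustrative subset only


def is_image_buggy(target):
    # Reads as: remote OR (local AND image extension), so every remote
    # target counts as an image regardless of its extension.
    return (target.startswith("http://")
            or target.startswith("file:") and target.endswith(IMAGE_EXTENSIONS))


def is_image(target):
    # Explicit parentheses apply the extension test to both kinds of target.
    return ((target.startswith("http://") or target.startswith("file:"))
            and target.endswith(IMAGE_EXTENSIONS))
```

With the unparenthesised version, a link to an ordinary web page is classified as an image; the parenthesised version treats it as a normal link.
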
Albert Krewinkel
6ded3d41d9 Org reader: Apply captions to code blocks and tables
The `Table` blocktype already takes the caption as an argument, while code
blocks are wrapped in a `Div` block together with a labelling `Span`.
2014-04-19 10:41:45 +02:00
Albert Krewinkel
09441b65a8 Org reader: Add support for plain LaTeX fragments
This adds support for LaTeX fragments like the following:

```
\begin{equation}
\int fg \mathrm{d}x
\end{equation}
```
2014-04-18 10:22:54 +02:00
Albert Krewinkel
f19d7233d8 Org reader: Fix parsing of loose lists
Loose lists (i.e. lists whose items are separated by blank lines) were
parsed as multiple lists, each containing a single item.  This patch
fixes this issue.
2014-04-18 08:34:06 +02:00
Albert Krewinkel
6d6724cf2c Org reader: Support more types of '#+BEGIN_<type>' blocks
Support for standard org-blocks is improved.  The parser now handles
"HTML", "LATEX", "ASCII", "EXAMPLE", "QUOTE" and "VERSE" blocks in a
sensible fashion.
2014-04-17 18:33:39 +02:00
Albert Krewinkel
0672f58a44 Org reader: Support footnotes
2014-04-17 13:23:14 +02:00
Albert Krewinkel
92582c6272 Org reader: introduce Reader environment around Blocks/Inlines
This introduces a Reader environment in the style of
Text.Pandoc.Parsing.F, but adapted to the Org reader parser.
2014-04-16 13:38:50 +02:00
Albert Krewinkel
5fc252270c Org reader: Fix code for subexpression parsing
2014-04-16 13:26:32 +02:00
Albert Krewinkel
346bcea713 Org reader: Better module description, minor style changes
Use module description analogous to the markdown reader's.
Use (<$) where it makes sense.
2014-04-16 13:23:30 +02:00
John MacFarlane
7f036c0b57 Shared: Fixed bug in toRomanNumeral.
9 and numbers ending in 9 would end with "IXIV."
Thanks to Jesse Rosenthal.  Closes #1249.
2014-04-15 19:53:58 -07:00
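
For reference, a table-driven conversion avoids the kind of duplicated suffix described in 7f036c0b57, because the subtractive pairs (`IX`, `IV` and so on) are consumed like any other symbol. This is a minimal Python sketch and not pandoc's `toRomanNumeral`; the name `to_roman_numeral` and the choice to raise an error outside 1..3999 are assumptions of the example.

```python
ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman_numeral(number):
    """Convert an integer in 1..3999 to a Roman numeral string."""
    if not 0 < number < 4000:
        raise ValueError("only 1..3999 can be written with these symbols")
    digits = []
    for value, symbol in ROMAN_VALUES:
        # Take as many copies of this symbol as fit, then carry on
        # with the remainder; 9 is used up by "IX" before "IV" is tried.
        count, number = divmod(number, value)
        digits.append(symbol * count)
    return "".join(digits)
```
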
John MacFarlane
857fcff7d6 Merge pull request #1240 from neilmayhew/master
Docbook output of Line Blocks
2014-04-13 14:37:28 -07:00
John MacFarlane
86b4da9dec Merge pull request #1239 from tarleb/org-linebreak
Org linebreaks
2014-04-13 14:04:48 -07:00
John MacFarlane
d5d4227ea5 Merge pull request #1238 from tarleb/org-figures
Org reader: Add support for figures
2014-04-13 14:03:15 -07:00
John MacFarlane
d4c1cd456c Org reader: Removed ANN pragma.
This relies on Template Haskell, which causes problems in Windows
due to libraries with C dependencies.  We need to avoid using TH
in pandoc code.
2014-04-12 21:44:54 -07:00
Neil Mayhew
464d7a8e49 Improve handling of hard line breaks in Docbook writer
* Use a <literallayout> for the entire paragraph, not just for the
  newline character
* Don't let LineBreaks inside footnotes influence the enclosing
  paragraph
2014-04-12 09:16:07 -06:00
Albert Krewinkel
82d4160bdc Org reader: Read linebreaks
Linebreaks are marked by the string `\\` at the end of a line.
2014-04-12 11:07:38 +02:00
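
The line-break convention from 82d4160bdc can be illustrated with a few lines of Python. This is a rough sketch that works on plain strings rather than on pandoc's parser; `split_hard_breaks` is a hypothetical helper, and joining soft-wrapped lines with a single space is an assumption of the example.

```python
def split_hard_breaks(paragraph):
    """Split a paragraph into segments at lines ending in two backslashes."""
    segments = []
    current = []
    for line in paragraph.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\\\"):
            # Drop the marker and close the current segment here.
            current.append(stripped[:-2].rstrip())
            segments.append(" ".join(current))
            current = []
        else:
            # An ordinary line end is a soft break: keep accumulating.
            current.append(stripped)
    if current:
        segments.append(" ".join(current))
    return segments
```
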
Albert Krewinkel
36066699c3 Org writer: Fix output for linebreaks
Hard linebreaks in Org mode are represented by the string "\\" as the
last characters in a line.  Adds this feature to the Org-mode writer.
2014-04-12 10:47:49 +02:00
Albert Krewinkel
ae4280fba5 Org reader: Add support for figures
Support for figures (images with name and caption) is added.
2014-04-12 10:31:45 +02:00
John MacFarlane
8699071ec2 HTML reader: Treat processing instructions & declarations as block.
Previously these were treated as inline, and included in paragraph
tags in HTML or DocBook output, which is generally not what is wanted.

Closes #1233.
2014-04-11 10:10:54 -07:00
Albert Krewinkel
6f19be7d40 Org reader: Fix parsing of sub-/superscript expressions
This fixes the org-reader's handling of sub- and superscript
expressions.  Simple expressions (like `2^+10`), expressions in
parentheses (`a_(n+1)`) and nested sexp (like `a_(nested()parens)`) are
now read correctly.
2014-04-11 11:05:42 +02:00
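
The three cases named in 6f19be7d40 can be sketched as a small scanner. The Python below is a simplification and not the Org reader's parser: `read_script_expr` is a hypothetical name, it only covers signed alphanumeric runs and balanced parentheses, and keeping the parentheses in the result is a choice made for the example.

```python
def read_script_expr(text):
    """Read the expression that follows '_' or '^'.

    Returns (expression, rest), or None if nothing usable is found.
    """
    if text.startswith("("):
        depth = 0
        for index, char in enumerate(text):
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
                if depth == 0:
                    # The outermost group is closed; nested groups stay inside.
                    return text[: index + 1], text[index + 1:]
        return None  # unbalanced parentheses
    start = 1 if text[:1] in ("+", "-") else 0
    end = start
    while end < len(text) and text[end].isalnum():
        end += 1
    if end == start:
        return None  # a bare sign or no expression at all
    return text[:end], text[end:]
```
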
John MacFarlane
ca40acea5b MediaWiki reader: Handle table rows containing just an HTML comment.
Closes #1230.
2014-04-10 16:52:30 -07:00
Albert Krewinkel
ace8837cd6 Org reader: Improve code by following HLint recommendations
HLint's recommendations for better code are applied to the Org-mode
reader code.
2014-04-10 19:17:58 +02:00
Albert Krewinkel
1715d7cee0 Org reader: Support more inline/display math variants
Support all of the following variants as valid ways to define inline or
display math inlines:

  - `\[..\]` (display)
  - `$$..$$` (display)
  - `\(..\)` (inline)
  - `$..$`   (inline)

This closes #1223.  Again.
2014-04-10 15:32:02 +02:00
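
The four delimiter pairs listed in 1715d7cee0 map onto two kinds of math. A minimal Python sketch of that mapping follows; it classifies a string that is already isolated, whereas the real reader has to find the fragment inside running text. `MATH_DELIMITERS` and `classify_math` are names invented for the example.

```python
# Order matters: "$$" has to be tried before "$".
MATH_DELIMITERS = [
    ("\\[", "\\]", "display"),
    ("$$", "$$", "display"),
    ("\\(", "\\)", "inline"),
    ("$", "$", "inline"),
]


def classify_math(source):
    """Return (kind, body) for a delimited math fragment, else None."""
    for opener, closer, kind in MATH_DELIMITERS:
        long_enough = len(source) > len(opener) + len(closer)
        if long_enough and source.startswith(opener) and source.endswith(closer):
            return kind, source[len(opener):len(source) - len(closer)]
    return None
```
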
John MacFarlane
54e33a132b Merge pull request #1226 from tarleb/org-emphasis-reader
Org reader: Precise rules for the recognition of markup
2014-04-09 09:34:44 -07:00
Albert Krewinkel
030020236c Org reader: Precise rules for the recognition of markup
The inline parsers have been rewritten using the org source code as a
reference. This fixes a couple of bugs related to erroneous markup
recognition.
2014-04-09 15:26:06 +02:00
John MacFarlane
e555a5703d Textile reader: Improved link parsing.
In particular we now pick up on attributes. Since pandoc links
can't have attributes, we enclose the whole link in a span
if there are attributes.

Closes #1008.
2014-04-07 21:23:39 -07:00
John MacFarlane
bfd598e1e9 Merge pull request #1224 from tarleb/org-math
Org reader: Read inline math, recognize definition lists
2014-04-07 07:24:30 -07:00
Albert Krewinkel
c47bd8404f Org reader: Support inline math (like $E=mc^2$)
Closes #1223.
2014-04-07 11:47:36 +02:00
John MacFarlane
e352ec5a0e LaTeX writer: Workaround for level 4-5 headers in quotes.
These previously produced invalid LaTeX: `\paragraph` or
`\subparagraph` in a `quote` environment.  This adds an
`\mbox{}` in these contexts to work around the problem.
See http://tex.stackexchange.com/a/169833/22451.

Closes #1221.
2014-04-06 16:32:53 -07:00
John MacFarlane
fcddd0e4bd LaTeX reader: handle @{} and p{length} in tabular.
The length is not actually recorded, but at least we get a table.

Closes #1180.
2014-04-06 15:11:49 -07:00
Albert Krewinkel
480b33b710 Org reader: Add support for definition lists
2014-04-06 20:39:10 +02:00
Albert Krewinkel
4ebf6f6ebf Org reader: Minor code clean-up
2014-04-06 20:39:05 +02:00
John MacFarlane
971e4c4364 HTML reader: Updated closes with rules from HTML5 spec.
2014-04-05 23:05:11 -07:00
John MacFarlane
24f438aa5f Textile reader: Better support for attributes.
Instead of being ignored, attributes are now parsed and
included in Span inlines.

The output will be a bit different from stock textile:
e.g. for `*(foo)hi*`, we'll get `<em><span class="foo">hi</span></em>`
instead of `<em class="foo">hi</em>`.  But at least the data is
not lost.
2014-04-05 21:02:12 -07:00
John MacFarlane
060a76a38e Textile reader: Improved treatment of HTML spans (%).
Closes #1115.
2014-04-05 20:41:38 -07:00
John MacFarlane
75dbe87a99 Removed whitespace at ends of lines.
2014-04-05 20:31:27 -07:00
John MacFarlane
d4054444c0 Text.Pandoc.PDF: Ensure that temp directories deleted on Windows.
The PDF is now read as a strict bytestring, ensuring that process
ownership will be terminated, so the temp directory can be deleted.
Closes #1192.
2014-04-05 19:57:42 -07:00
John MacFarlane
f2deb9d86d Org reader: Added type signature.
2014-04-05 15:50:46 -07:00
John MacFarlane
971dca588e Merge pull request #1219 from tarleb/org-images
Org-reader: support inline images, clean-up code, fix bugs
2014-04-05 15:12:40 -07:00
John MacFarlane
c0309a60bc Shared.openURL: Set proxy with value of http_proxy env variable.
Note:  proxies with non-root paths are not supported,
because of limitations in http-conduit.

Closes #1211.
2014-04-05 10:58:32 -07:00
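
The behaviour noted in c0309a60bc, taking the proxy from `http_proxy` and not supporting non-root paths, might look roughly like the Python below. This is an assumption-laden sketch and not the code in `Shared.openURL`: `proxy_from_env` is a hypothetical helper, the default port of 80 is a guess, and raising an error for a non-root path is one possible reaction (the commit message does not say what pandoc does in that case).

```python
from urllib.parse import urlsplit


def proxy_from_env(environ):
    """Return (host, port) from the http_proxy variable, or None if unset."""
    raw = environ.get("http_proxy")
    if not raw:
        return None
    parts = urlsplit(raw)
    # Only a bare host (optionally followed by "/") is accepted.
    if parts.path not in ("", "/"):
        raise ValueError("proxy URLs with a non-root path are not handled: " + raw)
    return parts.hostname, parts.port or 80
```

Passing the environment as an argument (for example `proxy_from_env(os.environ)`) keeps the helper easy to exercise without touching the real process environment.
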
Albert Krewinkel
652c781e37 Org reader: Support inline images
2014-04-05 16:15:53 +02:00
Albert Krewinkel
d76d2b707b Org reader: Provide more language identifier translations
Org-mode and Pandoc use different language identifiers, marking source
code as being written in a certain programming language.  This adds more
translations from identifiers as used in Org to identifiers used in
Pandoc.

The full list of identifiers used in Org and Pandoc is available through
http://orgmode.org/manual/Languages.html and `pandoc -v`, respectively.
2014-04-05 16:15:50 +02:00
Albert Krewinkel
fd98532784 Org reader: Fix parsing of nested inlines
Text such as /*this*/ was not correctly parsed as a strong, emphasised
word.  This was due to the end-of-word recognition being too strict as it
did not accept markup chars as part of a word.  The fix involves an
additional parser state field, listing the markup chars which might be
parsed as part of a word.
2014-04-05 16:14:40 +02:00