997 commits

Author SHA1 Message Date
Albert Krewinkel
6ded3d41d9 Org reader: Apply captions to code blocks and tables
The `Table` blocktype already takes the caption as an argument, while code
blocks are wrapped in a `Div` block together with a labelling `Span`.
2014-04-19 10:41:45 +02:00
Albert Krewinkel
09441b65a8 Org reader: Add support for plain LaTeX fragments
This adds support for LaTeX fragments like the following:

```
\begin{equation}
\int fg \mathrm{d}x
\end{equation}
```
2014-04-18 10:22:54 +02:00
Albert Krewinkel
f19d7233d8 Org reader: Fix parsing of loose lists
Loose lists (i.e. lists whose items are separated by blank lines) were
parsed as multiple lists, each containing a single item.  This patch
fixes this issue.
2014-04-18 08:34:06 +02:00
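
Illustration (not part of the commit): a minimal Python sketch of the loose-list behaviour described above, not the reader's Haskell code. The function name `parse_lists` is hypothetical, only `- ` bullets are handled, and the rule that two consecutive blank lines end a list is an assumption made for this sketch.

```python
def parse_lists(lines: list[str]) -> list[list[str]]:
    """Group "- " items into lists; one blank line between items is allowed."""
    lists, current, pending_blank = [], [], False
    for line in lines:
        if line.startswith("- "):
            # A new item continues the current list, even after one blank line.
            current.append(line[2:])
            pending_blank = False
        elif line.strip() == "" and current and not pending_blank:
            # A single blank line keeps the list open (loose list).
            pending_blank = True
        else:
            # Any other line, or a second blank line, closes the list.
            if current:
                lists.append(current)
                current = []
            pending_blank = False
    if current:
        lists.append(current)
    return lists
```

With this grouping, `["- a", "", "- b", "", "- c"]` yields one list of three items rather than three one-item lists.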
Albert Krewinkel
6d6724cf2c Org reader: Support more types of '#+BEGIN_<type>' blocks
Support for standard org-blocks is improved.  The parser now handles
"HTML", "LATEX", "ASCII", "EXAMPLE", "QUOTE" and "VERSE" blocks in a
sensible fashion.
2014-04-17 18:33:39 +02:00
Albert Krewinkel
0672f58a44 Org reader: Support footnotes 2014-04-17 13:23:14 +02:00
Albert Krewinkel
92582c6272 Org reader: introduce Reader environment around Blocks/Inlines
This introduces a Reader environment in the style of
Text.Pandoc.Parsing.F, but adapted to the Org reader parser.
2014-04-16 13:38:50 +02:00
Albert Krewinkel
5fc252270c Org reader: Fix code for subexpression parsing 2014-04-16 13:26:32 +02:00
Albert Krewinkel
346bcea713 Org reader: Better module description, minor style changes
Use module description analogous to the markdown reader's.
Use (<$) where it makes sense.
2014-04-16 13:23:30 +02:00
John MacFarlane
86b4da9dec Merge pull request #1239 from tarleb/org-linebreak
Org linebreaks
2014-04-13 14:04:48 -07:00
John MacFarlane
d5d4227ea5 Merge pull request #1238 from tarleb/org-figures
Org reader: Add support for figures
2014-04-13 14:03:15 -07:00
John MacFarlane
d4c1cd456c Org reader: Removed ANN pragma.
This relies on Template Haskell, which causes problems in Windows
due to libraries with C dependencies.  We need to avoid using TH
in pandoc code.
2014-04-12 21:44:54 -07:00
Albert Krewinkel
82d4160bdc Org reader: Read linebreaks
Linebreaks are marked by the string `\\` at the end of a line.
2014-04-12 11:07:38 +02:00
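
Illustration (not part of the commit): a minimal Python sketch of the rule above, in which a line ending in `\\` produces a hard line break. The function name `split_org_linebreaks` and the "hard"/"soft" labels are hypothetical and do not come from pandoc.

```python
def split_org_linebreaks(paragraph: str) -> list[tuple[str, str]]:
    """Return (text, break_kind) pairs, one per line of the paragraph.

    break_kind is "hard" when the line ends with two backslashes and
    "soft" otherwise. Illustrative only; not pandoc's implementation.
    """
    result = []
    for line in paragraph.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\\\"):
            # Drop the two-backslash marker and any space before it.
            result.append((stripped[:-2].rstrip(), "hard"))
        else:
            result.append((stripped, "soft"))
    return result
```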
Albert Krewinkel
ae4280fba5 Org reader: Add support for figures
Support for figures (images with name and caption) is added.
2014-04-12 10:31:45 +02:00
John MacFarlane
8699071ec2 HTML reader: Treat processing instructions & declarations as block.
Previously these were treated as inline, and included in paragraph
tags in HTML or DocBook output, which is generally not what is wanted.

Closes #1233.
2014-04-11 10:10:54 -07:00
Albert Krewinkel
6f19be7d40 Org reader: Fix parsing of sub-/superscript expressions
This fixes the org-reader's handling of sub- and superscript
expressions.  Simple expressions (like `2^+10`), expressions in
parentheses (`a_(n+1)`) and nested sexp (like `a_(nested()parens)`) are
now read correctly.
2014-04-11 11:05:42 +02:00
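
Illustration (not part of the commit): a minimal Python sketch of how the three expression forms named above could be recognised after a `_` or `^`. The function name `read_script_expr` is hypothetical, and the grammar (optional sign plus alphanumerics, or a balanced parenthesised group) is a simplification, not the reader's actual rules.

```python
from typing import Optional


def read_script_expr(s: str) -> Optional[str]:
    """Return the sub-/superscript expression at the start of s, or None."""
    if s.startswith("("):
        # Parenthesised form: scan to the matching close, allowing nesting.
        depth = 0
        for i, ch in enumerate(s):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    return s[: i + 1]
        return None  # unbalanced parentheses
    # Simple form: an optional sign followed by a run of alphanumerics.
    i = 1 if s[:1] in ("+", "-") else 0
    start = i
    while i < len(s) and s[i].isalnum():
        i += 1
    return s[:i] if i > start else None
```

The returned string includes the surrounding parentheses; whether they are kept in the output is a separate decision that this sketch does not model.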
John MacFarlane
ca40acea5b MediaWiki reader: Handle table rows containing just an HTML comment.
Closes #1230.
2014-04-10 16:52:30 -07:00
Albert Krewinkel
ace8837cd6 Org reader: Improve code by following HLint recommendations
HLint's recommendations for better code are applied to the Org-mode
reader code.
2014-04-10 19:17:58 +02:00
Albert Krewinkel
1715d7cee0 Org reader: Support more inline/display math variants
Support all of the following variants as valid ways to define inline or
display math inlines:

  - `\[..\]` (display)
  - `$$..$$` (display)
  - `\(..\)` (inline)
  - `$..$`   (inline)

This closes #1223.  Again.
2014-04-10 15:32:02 +02:00
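
Illustration (not part of the commit): a minimal Python sketch that maps the four delimiter pairs listed above to inline or display math. The names `MATH_DELIMITERS` and `classify_math` are hypothetical, and the sketch only classifies a string that consists of exactly one fragment; it does not scan running text.

```python
# (opening delimiter, closing delimiter, kind). Display forms come first
# so that `$$` is tried before `$`.
MATH_DELIMITERS = [
    ("\\[", "\\]", "display"),
    ("$$", "$$", "display"),
    ("\\(", "\\)", "inline"),
    ("$", "$", "inline"),
]


def classify_math(s: str):
    """Return (kind, body) if s is one complete math fragment, else None."""
    for opener, closer, kind in MATH_DELIMITERS:
        if (s.startswith(opener) and s.endswith(closer)
                and len(s) > len(opener) + len(closer)):
            # Strip the delimiters and keep the math source unchanged.
            return kind, s[len(opener):len(s) - len(closer)]
    return None
```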
John MacFarlane
54e33a132b Merge pull request #1226 from tarleb/org-emphasis-reader
Org reader: Precise rules for the recognition of markup
2014-04-09 09:34:44 -07:00
Albert Krewinkel
030020236c Org reader: Precise rules for the recognition of markup
The inline parsers have been rewritten using the org source code as a
reference. This fixes a couple of bugs related to erroneous markup
recognition.
2014-04-09 15:26:06 +02:00
John MacFarlane
e555a5703d Textile reader: Improved link parsing.
In particular we now pick up on attributes. Since pandoc links
can't have attributes, we enclose the whole link in a span
if there are attributes.

Closes #1008.
2014-04-07 21:23:39 -07:00
John MacFarlane
bfd598e1e9 Merge pull request #1224 from tarleb/org-math
Org reader: Read inline math, recognize definition lists
2014-04-07 07:24:30 -07:00
Albert Krewinkel
c47bd8404f Org reader: Support inline math (like $E=mc^2$)
Closes #1223.
2014-04-07 11:47:36 +02:00
John MacFarlane
fcddd0e4bd LaTeX reader: handle @{} and p{length} in tabular.
The length is not actually recorded, but at least we get a table.

Closes #1180.
2014-04-06 15:11:49 -07:00
Albert Krewinkel
480b33b710 Org reader: Add support for definition lists 2014-04-06 20:39:10 +02:00
Albert Krewinkel
4ebf6f6ebf Org reader: Minor code clean-up 2014-04-06 20:39:05 +02:00
John MacFarlane
971e4c4364 HTML reader: Updated closes with rules from HTML5 spec. 2014-04-05 23:05:11 -07:00
John MacFarlane
24f438aa5f Textile reader: Better support for attributes.
Instead of being ignored, attributes are now parsed and
included in Span inlines.

The output will be a bit different from stock textile:
e.g. for `*(foo)hi*`, we'll get `<em><span class="foo">hi</span></em>`
instead of `<em class="foo">hi</em>`.  But at least the data is
not lost.
2014-04-05 21:02:12 -07:00
John MacFarlane
060a76a38e Textile reader: Improved treatment of HTML spans (%).
Closes #1115.
2014-04-05 20:41:38 -07:00
John MacFarlane
75dbe87a99 Removed whitespace at ends of lines. 2014-04-05 20:31:27 -07:00
John MacFarlane
f2deb9d86d Org reader: Added type signature. 2014-04-05 15:50:46 -07:00
John MacFarlane
971dca588e Merge pull request #1219 from tarleb/org-images
Org-reader: support inline images, clean-up code, fix bugs
2014-04-05 15:12:40 -07:00
Albert Krewinkel
652c781e37 Org reader: Support inline images 2014-04-05 16:15:53 +02:00
Albert Krewinkel
d76d2b707b Org reader: Provide more language identifier translations
Org-mode and Pandoc use different language identifiers, marking source
code as being written in a certain programming language.  This adds more
translations from identifiers as used in Org to identifiers used in
Pandoc.

The full list of identifiers used in Org and Pandoc is available through
http://orgmode.org/manual/Languages.html and `pandoc -v`, respectively.
2014-04-05 16:15:50 +02:00
Albert Krewinkel
fd98532784 Org reader: Fix parsing of nested inlines
Text such as /*this*/ was not correctly parsed as a strong, emphasised
word.  This was due to the end-of-word recognition being too strict as it
did not accept markup chars as part of a word.  The fix involves an
additional parser state field, listing the markup chars which might be
parsed as part of a word.
2014-04-05 16:14:40 +02:00
Albert Krewinkel
d43c3e8101 Org reader: Use specialized org parser state
The default pandoc ParserState is replaced with `OrgParserState`.  This
is done to simplify the introduction of new state fields required for
efficient Org parsing.
2014-04-05 16:14:09 +02:00
Albert Krewinkel
7cf7e45e4c Org reader: Slight cleaning of table parsing code 2014-04-05 16:13:18 +02:00
John MacFarlane
ee2e769cd7 DocBook reader: Better treatment of formalpara.
We now emit the title (if present) as a separate paragraph
with boldface text.

Closes #1215.
2014-04-04 22:06:25 -07:00
John MacFarlane
e06aa97bb9 DocBook reader: set metadata "author" not "authors" 2014-04-04 21:39:08 -07:00
John MacFarlane
c5a0c7a819 Removed trailing whitespace. 2014-04-04 21:34:12 -07:00
John MacFarlane
c99ecf6cc5 DocBook reader: set "author" not "authors". 2014-04-04 21:33:16 -07:00
Matthew Pickering
1b930a9670 Added recognition of authorgroup element and releaseinfo element to DocBook reader.
Closes #1214
2014-04-04 21:00:49 -07:00
Matthew Pickering
b1c865ccc6 Converted current meta information parsing in DocBook to a more extensible version which is aware of the more recent meta representation. 2014-04-04 21:00:29 -07:00
John MacFarlane
4ee92dce0c MediaWiki reader: Fixed bug in certain nested lists.
The bug: If a level 2 list was followed by a level 1 list, the first
item of the level 1 list would be lost.

Closes #1213.
2014-04-01 10:36:23 -07:00
John MacFarlane
1cadba16eb HTML reader: idiomatic rewriting for clarity. 2014-04-01 10:10:46 -07:00
Matthew Pickering
5a51a67abd Changed the smart punctuation parser to return Inlines rather than an Inline element and updated files accordingly 2014-04-01 13:53:34 +01:00
Matthew Pickering
9b5d474e79 Converted HTML reader to use builder. Fixes #1162. 2014-04-01 13:44:19 +01:00
Matthew Pickering
0ccca94b4c Bugfix for #1175 and convert textile reader to use builder.
The reader did not correctly parse inline markup. The behaviour is now as follows.

(a) The markup must start at the start of a line, be inside previous
inline markup or be preceded by whitespace.
(b) The markup cannot span across paragraphs (delimited by \n\n).
(c) The markup cannot be followed by an alphanumeric character.
(d) Square brackets can be placed around the markup to avoid having
to have white space before it.

In order to make these changes it was necessary either to convert the parser to return a list of inlines or to convert the whole reader to use the builder. The latter approach, whilst more work, makes a bit more sense, as it becomes easy to arbitrarily append and prepend elements without changing the type.

Tests are accordingly updated in a later commit to reflect the different normalisation behaviour specified by the builder monoid.
2014-04-01 13:44:06 +01:00
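
Illustration (not part of the commit): a minimal Python sketch of rules (a) to (d) as boundary checks, not the Textile reader's Haskell parser. The function names `can_open_markup` and `can_close_markup` are hypothetical, and rule (d) is simplified to accepting a `[` directly before the opening delimiter.

```python
def can_open_markup(text: str, i: int, inside_markup: bool = False) -> bool:
    """Rules (a) and (d): may inline markup start at index i of text?"""
    if i == 0 or inside_markup:
        return True
    prev = text[i - 1]
    # Whitespace includes "\n", which covers the start-of-line case.
    return prev.isspace() or prev == "["


def can_close_markup(text: str, start: int, end: int) -> bool:
    """Rules (b) and (c): text[start:end] is the span including delimiters."""
    if "\n\n" in text[start:end]:
        return False  # the span would cross a paragraph boundary
    # The character after the closing delimiter must not be alphanumeric.
    return end >= len(text) or not text[end].isalnum()
```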
John MacFarlane
69a7c9f634 LaTeX reader: Better handling of figure and table with caption.
We now look for a \caption inside the environment; if one is
found, it is attached to the graphic or tabular found there.

Closes #1204.
2014-03-25 23:10:43 -07:00
John MacFarlane
994597f071 Revert "LaTeX reader: Added LPState."
This reverts commit 82ddec698e.
2014-03-25 22:40:18 -07:00