Commit graph

2514 commits

Author SHA1 Message Date
John MacFarlane
0092606476 LaTeX reader: Don't error on "%foo" with no newline. 2014-05-10 23:26:32 -07:00
Albert Krewinkel
c5fd631b55 Org reader: Fix block parameter reader, relax constraints
The reader produced wrong results for blocks containing non-letter chars
in their parameter arguments.  This patch relaxes constraints in that it
allows block header arguments to contain any non-space character (except
for ']' for inline blocks).

Thanks to Xiao Hanyu for noticing this.
2014-05-10 11:35:54 +02:00
John MacFarlane
884693fea8 Merge pull request #1288 from tarleb/update-copyright
Update copyright notices for 2014, add missing notices
2014-05-09 09:53:06 -07:00
Albert Krewinkel
07694b3018 Org reader: Fix parsing of blank lines within blocks
Blank lines were parsed as two newlines instead of just one.
Thanks to Xiao Hanyu (@xiaohanyu) for pointing this out.
2014-05-09 18:23:23 +02:00
Albert Krewinkel
757c4f68f3 Org reader: Support arguments for code blocks
The general form of source block headers
(`#+BEGIN_SRC <language> <switches> <header arguments>`) was not
recognized by the reader.  This patch adds support for the above form,
adds header arguments to the block's key-value pairs and marks the block
as a rundoc block if header arguments are present.

This closes #1286.
2014-05-09 18:08:30 +02:00
Albert Krewinkel
7760504bb2 Org reader: refactor #+BEGIN..#+END block parsing code 2014-05-09 10:53:08 +02:00
Albert Krewinkel
8fdbef841d Update copyright notices for 2014, add missing notices 2014-05-09 00:46:08 +02:00
mpickering
f0f88111e6 Small improvement to textile reader fix. Removed 'try'. 2014-05-07 09:48:48 -07:00
mpickering
0050b50905 Fix textile reader hanging.
Textile reader hung on

    pandoc -f textile http://johnmacfarlane.net/pandoc/demo/example25.textile

The reader no longer hangs.
2014-05-07 09:32:25 -07:00
John MacFarlane
84f2336a7d Textile reader: Rearranged inline parsers for performance.
This is possible because of the rewrite of simpleInline.
Also removed a redundant parser for grouped inlines.
2014-05-06 23:41:56 -07:00
John MacFarlane
442eecc15c Textile reader: Rewrote simpleInline for clarity and efficiency.
This way we only look once for the opening `[`.
2014-05-06 23:27:16 -07:00
John MacFarlane
ea4e947bd0 Textile reader: Disallow blank lines in inline contexts.
@hi

    there@

should not be a single code span.
2014-05-06 23:16:47 -07:00
John MacFarlane
d6a9ba1cdc Make --trace work with textile reader. 2014-05-06 22:28:11 -07:00
John MacFarlane
10644607e3 Textile reader: Rewrote some inline parsing code for clarity.
(It seems clearer to put the whitespace parsing in the grouped
parser.  This also uses stateLastStrPos to determine when the
border is adjacent to an alphanumeric.)
2014-05-06 22:14:35 -07:00
Albert Krewinkel
71bd4fb2b3 Org reader: Read inline code blocks
Org's inline code blocks take forms like `src_haskell(print "hi")` and
are frequently used to include results from computations called from
within the document.  The blocks are read as inline code and marked with
the special class `rundoc-block`.  Proper handling and execution of
these blocks is the subject of a separate library, rundoc, which is
work in progress.

This closes #1278.
2014-05-06 13:21:26 +02:00
John MacFarlane
dbd6c1540f Fixed the fix to #1154.
We need to strip off up to 4 spaces, not up to 3.
2014-05-04 16:21:18 -07:00
John MacFarlane
51aa304834 LaTeX writer: Fixed inconsistencies with reference escaping.
- toLabel is now monadic, and it does the needed string escaping.
- Closes #1130.
2014-05-04 14:43:05 -07:00
John MacFarlane
0c7e084342 Docx writer: Fall back on distribution reference.docx.
* Undid changes to parseXml in last commit.
* Instead of a string fallback, we have parseXml fall back
  on the reference.docx that comes with pandoc if the user's
  reference.docx does not contain a needed file.
* Closes #1185.
2014-05-04 10:54:45 -07:00
John MacFarlane
d728715981 Docx writer: Added ability to give fallback in parseXml. 2014-05-04 10:45:20 -07:00
John MacFarlane
3e42f08e87 Markdown reader: Fixed bug with unwanted code in lists.
Closes #1154.

When reading a raw list item, we now strip off nonindent
spaces.
2014-05-04 08:07:17 -07:00
John MacFarlane
96c0c950ca AsciiDoc writer: Handle multiblock table cells.
Closes #1246.
2014-05-03 21:31:53 -07:00
John MacFarlane
fde52c25a6 AsciiDoc writer: Correctly handle empty table cells.
Closes #1245.
2014-05-03 21:08:45 -07:00
John MacFarlane
abd3a039b9 DocBook writer: Small tweaks to last commit.
* Use isTightList from Shared.
* Adjust writer test, since isTightList is a bit different from what
  was used before.

Closes #1250.
2014-05-03 20:45:38 -07:00
Neil Mayhew
ccbf4fc9c2 Distinguish tight and loose lists in Docbook output
Determined by the first block of the first item being Plain.
2014-05-03 18:37:02 -07:00
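The rule described in this commit (a list is tight when the first block of its first item is a `Plain`) can be sketched as follows. The `Block` type here is a simplified stand-in, and `isTight` is a hypothetical name; this is not the DocBook writer's actual code.

```haskell
-- Simplified stand-in for pandoc's Block type (assumption for this sketch).
data Block = Plain String | Para String
  deriving (Eq, Show)

-- A list is a list of items; each item is a list of blocks.
-- The list counts as tight when the first block of the first item is Plain.
isTight :: [[Block]] -> Bool
isTight ((Plain _ : _) : _) = True
isTight _                   = False
```

Under this rule `isTight [[Plain "a"], [Para "b"]]` is `True`, because only the first item is inspected.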
John MacFarlane
2ba7873086 LaTeX reader: Fixed regression introduced with last commit.
Tests now pass again.
2014-05-03 18:34:23 -07:00
John MacFarlane
743dac493f LaTeX reader: Better error messages with include files.
Closes #1274.

Rewrote handleIncludes.

We now report the actual source file and position where the error
occurs, even if it is included.  We do this by inserting special
commands, `\PandocStartInclude` and `\PandocEndInclude`, that encode
this information in the preprocessing phase.

Also generalized the types of a couple functions from
`Text.Pandoc.Parsing`.
2014-05-03 17:37:54 -07:00
John MacFarlane
4c43824203 Fixed empty reference links. Closes #1186.
Includes test.
2014-05-02 22:58:47 -07:00
John MacFarlane
007eb96e06 Markdown reader: Make one-column pipe tables work.
Closes #1218.
2014-05-01 09:23:21 -07:00
John MacFarlane
b306405caa Merge pull request #1272 from tarleb/link-types
Org reader: add support for custom link types
2014-05-01 08:44:05 -07:00
Albert Krewinkel
8726eebcd3 Org reader: Add support for custom link types
Org allows users to define their own custom link types.  E.g., in a
document with a lot of links to Wikipedia articles, one can define a
custom wikipedia link-type via

    #+LINK: wp https://en.wikipedia.org/wiki/

This allows writing [[wp:Org_mode][Org-mode]] instead of the
equivalent [[https://en.wikipedia.org/wiki/Org_mode][Org-mode]].
2014-05-01 11:50:32 +02:00
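A minimal sketch of how such an abbreviation could be expanded, assuming the prefix is simply prepended to the part after the colon. The names `parseLinkLine` and `expandLink` are hypothetical and are not taken from the Org reader.

```haskell
-- Hypothetical table built from "#+LINK: <abbrev> <url-prefix>" lines.
type LinkAbbrevs = [(String, String)]

-- Parse one "#+LINK:" line into an (abbreviation, prefix) pair.
parseLinkLine :: String -> Maybe (String, String)
parseLinkLine line = case words line of
  ["#+LINK:", abbrev, prefix] -> Just (abbrev, prefix)
  _                           -> Nothing

-- Expand "wp:Org_mode" to the full URL if "wp" is a known abbreviation;
-- any other target is returned unchanged.
expandLink :: LinkAbbrevs -> String -> String
expandLink abbrevs target =
  case break (== ':') target of
    (abbrev, ':' : rest)
      | Just prefix <- lookup abbrev abbrevs -> prefix ++ rest
    _ -> target
```

With the table `[("wp", "https://en.wikipedia.org/wiki/")]`, the target `wp:Org_mode` expands to the full Wikipedia URL, while an ordinary `https://...` target is left alone because `https` is not a defined abbreviation.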
John MacFarlane
eaba340b93 RST reader: Some fixes to last change, and use "author" not "authors".
(in metadata)
2014-04-30 11:28:18 -07:00
John MacFarlane
81bf82c258 RST reader: Better handling of directives.
* We now correctly handle field lists that are indented more than
  3 spaces.
* We treat an "aafig" directive as a code block with attributes,
  so it can be processed in a filter.  (Closes #1212.)
2014-04-30 09:59:36 -07:00
John MacFarlane
093229dc35 ConTeXt writer: Improved autolinks.
Closes #1270.
2014-04-30 08:58:10 -07:00
John MacFarlane
c8f97d3d41 Fix #1267.
We now check the writerName for a lua script in pandoc.hs, so that
lowercasing and format parsing aren't done.  Note this behavior
change: getWriter in Text.Pandoc no longer returns a custom writer on
input "foo.lua".
2014-04-27 20:56:50 -07:00
John MacFarlane
22e36e1040 LaTeX reader: Made \nocite work.
This adds nocite citations to a metadata field, `nocite`.
These will appear in the bibliography but not in the text
(unless you use a `$nocite$` variable in your template, of
course).
2014-04-26 12:14:42 -07:00
John MacFarlane
35ea8de369 HTML writer: improved detection of image links.
Previously image links with queries were not recognized,
leading to use of an embed tag rather than an img tag.
2014-04-26 12:04:08 -07:00
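One way to recognize image links that carry a query string is to drop the query before looking at the extension. The sketch below assumes that approach; `stripQuery`, `isImagePath` and the extension list are hypothetical and not the HTML writer's actual code.

```haskell
import Data.Char (toLower)
import Data.List (isSuffixOf)

-- Drop a query string or fragment, e.g. "a.png?size=2" becomes "a.png".
stripQuery :: String -> String
stripQuery = takeWhile (`notElem` "?#")

-- Hypothetical check: does the URL path end in a known image extension?
isImagePath :: String -> Bool
isImagePath url = any (`isSuffixOf` path) [".png", ".jpg", ".jpeg", ".gif", ".svg"]
  where path = map toLower (stripQuery url)
```

Because the query is removed first, an image extension that appears only inside the query (as in `page.html?img=x.png`) does not count.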
John MacFarlane
60297089f6 Merge pull request #1265 from tarleb/org-links
Improved handling of internal links
2014-04-25 08:08:00 -07:00
Albert Krewinkel
b09412d852 LaTeX writer: Mark span contents with label if span has an ID
Prepend `\label{span-id}` to span contents iff `span-id` is defined.
2014-04-25 16:17:24 +02:00
Albert Krewinkel
2eec20d92f Org reader: Enable internal links
Internal links in Org are possible by using an anchor-name as the target
of a link:

[[some-anchor][This]] is an internal link.

It links <<some-anchor>> here.
2014-04-25 15:29:28 +02:00
John MacFarlane
cbeb3bb213 EPUB writer: Fixed some idrefs to match changes in ids. 2014-04-24 17:37:10 -07:00
John MacFarlane
e6333a9d7c Markdown writer: Use proper escapes to avoid unwanted lists.
Previously we used 0-width spaces, an ugly hack.

Closes #980.
2014-04-24 16:44:49 -07:00
John MacFarlane
d16775e1c7 Render numbers in YAML metadata without decimals when possible.
The change to aeson > 0.7 caused numbers to be rendered with
decimals.  This change causes them to be rendered without decimals
when possible.
2014-04-24 11:09:07 -07:00
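The idea of rendering a number without decimals when possible can be sketched on plain `Double` values. This is an illustration only: `renderNumber` is a hypothetical helper, not pandoc's code, and it assumes finite input.

```haskell
-- Render an integral value without a decimal point, anything else as is.
-- Hypothetical helper for illustration; assumes the input is finite.
renderNumber :: Double -> String
renderNumber x
  | x == fromInteger n = show n   -- integral value: print as an integer
  | otherwise          = show x   -- keep the fractional part
  where
    n = round x :: Integer
```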
Albert Krewinkel
2f724aaaa4 Org reader: Read anchors as empty spans
Anchors (like <<this>>) are parsed as empty spans.
2014-04-24 17:57:06 +02:00
Albert Krewinkel
c128daba9d Org reader: Recognize plain and angle links
This adds support for plain links (like http://zeitlens.com) and angle
links (like <http://moltkeplatz.de>).
2014-04-24 17:55:24 +02:00
Albert Krewinkel
ec24f9761c RST reader: Remove duplicate 'http' in PEP links
The generated link to PEPs had a duplicate 'http://' in its URL.
2014-04-24 17:55:22 +02:00
John MacFarlane
e0688711fd EPUB writer: include extension in epub ids.
This fixes a problem with duplicate ids for fonts and
images with the same base name but different extensions.

Closes #1254.
2014-04-23 10:23:02 -07:00
John MacFarlane
6a2361c457 Merge pull request #1256 from tarleb/org-reader-improvements
Org reader improvements
2014-04-19 20:35:41 -07:00
Albert Krewinkel
8276449520 Org reader: Allow for compact definition lists
Use `Text.Pandoc.Shared.compactify'DL` to allow for compact definition
lists.
2014-04-19 15:13:16 +02:00
Albert Krewinkel
efebade38b Move compactify'DL from Markdown reader into Shared
The function `compactify'DL`, used to change the final definition item of a
definition list into a `Plain` iff all other items are `Plain`s as well, is
useful in many parsers and hence moved into Text.Pandoc.Shared.
2014-04-19 15:03:33 +02:00
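The behavior described above can be sketched on simplified types. The `Block` and `DefItem` types below are stand-ins, and `compactifyDL` and `mapLast` are hypothetical names for this sketch; this is not the implementation in `Text.Pandoc.Shared`.

```haskell
-- Simplified stand-ins for pandoc's types (assumptions for this sketch).
data Block = Plain String | Para String
  deriving (Eq, Show)

-- A definition item: a term with its definitions, each a list of blocks.
type DefItem = (String, [[Block]])

isPlain :: Block -> Bool
isPlain (Plain _) = True
isPlain _         = False

-- If every item but the last holds only Plain blocks, turn a trailing
-- Para in the last item's final definition into a Plain as well.
compactifyDL :: [DefItem] -> [DefItem]
compactifyDL [] = []
compactifyDL items
  | all itemIsPlain (init items) = init items ++ [relax (last items)]
  | otherwise                    = items
  where
    itemIsPlain (_, defs) = all (all isPlain) defs
    relax (term, defs)    = (term, mapLast (mapLast toPlain) defs)
    toPlain (Para s)      = Plain s
    toPlain b             = b

-- Apply a function to the last element of a list, if there is one.
mapLast :: (a -> a) -> [a] -> [a]
mapLast _ []       = []
mapLast f [x]      = [f x]
mapLast f (x : xs) = x : mapLast f xs
```

A parser typically sees a blank line after the final item and so reads it as a `Para`; a pass like this makes the last item consistent with the rest of a compact list.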
Albert Krewinkel
8e91d362a3 Org reader: Fix parsing of footnotes
Footnotes can consist of multiple blocks and end only at a header or at
the beginning of another footnote.  This fixes the previous behavior,
which restricted notes to a single paragraph.
2014-04-19 14:40:46 +02:00
Albert Krewinkel
a69416091b Org reader: Fix distinction of images and normal links
Fixed a false assumption about the precedence of (&&) vs (||).
2014-04-19 11:25:39 +02:00
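In Haskell, `(&&)` is declared `infixr 3` and `(||)` is declared `infixr 2`, so `a || b && c` parses as `a || (b && c)`. The sketch below shows how that can misclassify links. The predicates are hypothetical and are not the Org reader's actual conditions.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf, isSuffixOf)

-- Hypothetical predicates, for illustration only.
isLocal, isRemote, hasImageExt :: String -> Bool
isLocal s     = "file:" `isPrefixOf` s || "./" `isPrefixOf` s
isRemote s    = "http://" `isPrefixOf` s || "https://" `isPrefixOf` s
hasImageExt s = any (`isSuffixOf` map toLower s) [".png", ".jpg", ".gif"]

-- Parsed as  isLocal s || (isRemote s && hasImageExt s):
-- every local path counts as an image, whatever its extension.
looksLikeImageWrong :: String -> Bool
looksLikeImageWrong s = isLocal s || isRemote s && hasImageExt s

-- Explicit parentheses give the intended grouping.
looksLikeImage :: String -> Bool
looksLikeImage s = (isLocal s || isRemote s) && hasImageExt s
```

For `./notes.org` the first version returns `True` and the second returns `False`; both agree on `http://example.com/a.png`.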
Albert Krewinkel
6ded3d41d9 Org reader: Apply captions to code blocks and tables
The `Table` blocktype already takes the caption as an argument, while code
blocks are wrapped in a `Div` block together with a labelling `Span`.
2014-04-19 10:41:45 +02:00
Albert Krewinkel
09441b65a8 Org reader: Add support for plain LaTeX fragments
This adds support for LaTeX fragments like the following:

```
\begin{equation}
\int fg \mathrm{d}x
\end{equation}
```
2014-04-18 10:22:54 +02:00
Albert Krewinkel
f19d7233d8 Org reader: Fix parsing of loose lists
Loose lists (i.e. lists with blank-line-separated items) were parsed as
multiple lists, each containing a single item.  This patch fixes this
issue.
2014-04-18 08:34:06 +02:00
Albert Krewinkel
6d6724cf2c Org reader: Support more types of '#+BEGIN_<type>' blocks
Support for standard org-blocks is improved.  The parser now handles
"HTML", "LATEX", "ASCII", "EXAMPLE", "QUOTE" and "VERSE" blocks in a
sensible fashion.
2014-04-17 18:33:39 +02:00
Albert Krewinkel
0672f58a44 Org reader: Support footnotes 2014-04-17 13:23:14 +02:00
Albert Krewinkel
92582c6272 Org reader: introduce Reader environment around Blocks/Inlines
This introduces a Reader environment in the style of
Text.Pandoc.Parsing.F, but adapted to the Org reader parser.
2014-04-16 13:38:50 +02:00
Albert Krewinkel
5fc252270c Org reader: Fix code for subexpression parsing 2014-04-16 13:26:32 +02:00
Albert Krewinkel
346bcea713 Org reader: Better module description, minor style changes
Use module description analogous to the markdown reader's.
Use (<$) where it makes sense.
2014-04-16 13:23:30 +02:00
John MacFarlane
7f036c0b57 Shared: Fixed bug in toRomanNumeral.
9 and numbers ending in 9 would end with "IXIV."
Thanks to Jesse Rosenthal.  Closes #1249.
2014-04-15 19:53:58 -07:00
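For reference, a minimal sketch of a greedy conversion in which 9 is consumed as a single `IX` step, so no stray `IV` can follow it. `toRoman` and its table are assumptions made for this example; this is not pandoc's `toRomanNumeral`.

```haskell
-- Hypothetical helper, not pandoc's toRomanNumeral.
-- Greedy conversion: repeatedly subtract the largest value that fits.
toRoman :: Int -> String
toRoman n
  | n <= 0 || n >= 4000 = "?"   -- out of range for this sketch
  | otherwise           = go n numerals
  where
    numerals =
      [ (1000, "M"), (900, "CM"), (500, "D"), (400, "CD")
      , (100, "C"), (90, "XC"), (50, "L"), (40, "XL")
      , (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I") ]
    go 0 _  = ""
    go _ [] = ""
    go k ps@((v, s) : rest)
      | k >= v    = s ++ go (k - v) ps   -- value fits: emit it and retry
      | otherwise = go k rest            -- too large: move to the next value
```

Here `toRoman 9` gives `"IX"` and `toRoman 19` gives `"XIX"`.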
John MacFarlane
857fcff7d6 Merge pull request #1240 from neilmayhew/master
Docbook output of Line Blocks
2014-04-13 14:37:28 -07:00
John MacFarlane
86b4da9dec Merge pull request #1239 from tarleb/org-linebreak
Org linebreaks
2014-04-13 14:04:48 -07:00
John MacFarlane
d5d4227ea5 Merge pull request #1238 from tarleb/org-figures
Org reader: Add support for figures
2014-04-13 14:03:15 -07:00
John MacFarlane
d4c1cd456c Org reader: Removed ANN pragma.
This relies on Template Haskell, which causes problems in Windows
due to libraries with C dependencies.  We need to avoid using TH
in pandoc code.
2014-04-12 21:44:54 -07:00
Neil Mayhew
464d7a8e49 Improve handling of hard line breaks in Docbook writer
* Use a <literallayout> for the entire paragraph, not just for the
  newline character
* Don't let LineBreaks inside footnotes influence the enclosing
  paragraph
2014-04-12 09:16:07 -06:00
Albert Krewinkel
82d4160bdc Org reader: Read linebreaks
Linebreaks are marked by the string `\\` at the end of a line.
2014-04-12 11:07:38 +02:00
Albert Krewinkel
36066699c3 Org writer: Fix output for linebreaks
Hard linebreaks in Org mode are represented by the string "\\" as the
last characters in a line.  Adds this feature to the Org-mode writer.
2014-04-12 10:47:49 +02:00
Albert Krewinkel
ae4280fba5 Org reader: Add support for figures
Support for figures (images with name and caption) is added.
2014-04-12 10:31:45 +02:00
John MacFarlane
8699071ec2 HTML reader: Treat processing instructions & declarations as block.
Previously these were treated as inline, and included in paragraph
tags in HTML or DocBook output, which is generally not what is wanted.

Closes #1233.
2014-04-11 10:10:54 -07:00
Albert Krewinkel
6f19be7d40 Org reader: Fix parsing of sub-/superscript expressions
This fixes the org-reader's handling of sub- and superscript
expressions.  Simple expressions (like `2^+10`), expressions in
parentheses (`a_(n+1)`) and nested sexp (like `a_(nested()parens)`) are
now read correctly.
2014-04-11 11:05:42 +02:00
John MacFarlane
ca40acea5b MediaWiki reader: Handle table rows containing just an HTML comment.
Closes #1230.
2014-04-10 16:52:30 -07:00
Albert Krewinkel
ace8837cd6 Org reader: Improve code by following HLint recommendations
HLint's recommendations for better code are applied to the Org-mode
reader code.
2014-04-10 19:17:58 +02:00
Albert Krewinkel
1715d7cee0 Org reader: Support more inline/display math variants
Support all of the following variants as valid ways to define inline or
display math inlines:

  - `\[..\]` (display)
  - `$$..$$` (display)
  - `\(..\)` (inline)
  - `$..$`   (inline)

This closes #1223.  Again.
2014-04-10 15:32:02 +02:00
John MacFarlane
54e33a132b Merge pull request #1226 from tarleb/org-emphasis-reader
Org reader: Precise rules for the recognition of markup
2014-04-09 09:34:44 -07:00
Albert Krewinkel
030020236c Org reader: Precise rules for the recognition of markup
The inline parsers have been rewritten using the org source code as a
reference. This fixes a couple of bugs related to erroneous markup
recognition.
2014-04-09 15:26:06 +02:00
John MacFarlane
e555a5703d Textile reader: Improved link parsing.
In particular we now pick up on attributes. Since pandoc links
can't have attributes, we enclose the whole link in a span
if there are attributes.

Closes #1008.
2014-04-07 21:23:39 -07:00
John MacFarlane
bfd598e1e9 Merge pull request #1224 from tarleb/org-math
Org reader: Read inline math, recognize definition lists
2014-04-07 07:24:30 -07:00
Albert Krewinkel
c47bd8404f Org reader: Support inline math (like $E=mc^2$)
Closes #1223.
2014-04-07 11:47:36 +02:00
John MacFarlane
e352ec5a0e LaTeX writer: Workaround for level 4-5 headers in quotes.
These previously produced invalid LaTeX: `\paragraph` or
`\subparagraph` in a `quote` environment.  This adds an
`\mbox{}` in these contexts to work around the problem.
See http://tex.stackexchange.com/a/169833/22451.

Closes #1221.
2014-04-06 16:32:53 -07:00
John MacFarlane
fcddd0e4bd LaTeX reader: handle @{} and p{length} in tabular.
The length is not actually recorded, but at least we get a table.

Closes #1180.
2014-04-06 15:11:49 -07:00
Albert Krewinkel
480b33b710 Org reader: Add support for definition lists 2014-04-06 20:39:10 +02:00
Albert Krewinkel
4ebf6f6ebf Org reader: Minor code clean-up 2014-04-06 20:39:05 +02:00
John MacFarlane
971e4c4364 HTML reader: Updated closes with rules from HTML5 spec. 2014-04-05 23:05:11 -07:00
John MacFarlane
24f438aa5f Textile reader: Better support for attributes.
Instead of being ignored, attributes are now parsed and
included in Span inlines.

The output will be a bit different from stock textile:
e.g. for `*(foo)hi*`, we'll get `<em><span class="foo">hi</span></em>`
instead of `<em class="foo">hi</em>`.  But at least the data is
not lost.
2014-04-05 21:02:12 -07:00
John MacFarlane
060a76a38e Textile reader: Improved treatment of HTML spans (%).
Closes #1115.
2014-04-05 20:41:38 -07:00
John MacFarlane
75dbe87a99 Removed whitespace at ends of lines. 2014-04-05 20:31:27 -07:00
John MacFarlane
d4054444c0 Text.Pandoc.PDF: Ensure that temp directories deleted on Windows.
The PDF is now read as a strict bytestring, ensuring that process
ownership will be terminated, so the temp directory can be deleted.
Closes #1192.
2014-04-05 19:57:42 -07:00
John MacFarlane
f2deb9d86d Org reader: Added type signature. 2014-04-05 15:50:46 -07:00
John MacFarlane
971dca588e Merge pull request #1219 from tarleb/org-images
Org-reader: support inline images, clean-up code, fix bugs
2014-04-05 15:12:40 -07:00
John MacFarlane
c0309a60bc Shared.openURL: Set proxy with value of http_proxy env variable.
Note:  proxies with non-root paths are not supported,
because of limitations in http-conduit.

Closes #1211.
2014-04-05 10:58:32 -07:00
Albert Krewinkel
652c781e37 Org reader: Support inline images 2014-04-05 16:15:53 +02:00
Albert Krewinkel
d76d2b707b Org reader: Provide more language identifier translations
Org-mode and Pandoc use different language identifiers, marking source
code as being written in a certain programming language.  This adds more
translations from identifiers as used in Org to identifiers used in
Pandoc.

The full list of identifiers used in Org and Pandoc is available through
http://orgmode.org/manual/Languages.html and `pandoc -v`, respectively.
2014-04-05 16:15:50 +02:00
Albert Krewinkel
fd98532784 Org reader: Fix parsing of nested inlines
Text such as /*this*/ was not correctly parsed as a strong, emphasised
word.  This was due to the end-of-word recognition being too strict, as it
did not accept markup chars as part of a word.  The fix involves an
additional parser state field, listing the markup chars which might be
parsed as part of a word.
2014-04-05 16:14:40 +02:00
Albert Krewinkel
d43c3e8101 Org reader: Use specialized org parser state
The default pandoc ParserState is replaced with `OrgParserState`.  This
is done to simplify the introduction of new state fields required for
efficient Org parsing.
2014-04-05 16:14:09 +02:00
Albert Krewinkel
7cf7e45e4c Org reader: Slight cleaning of table parsing code 2014-04-05 16:13:18 +02:00
John MacFarlane
ee2e769cd7 DocBook reader: Better treatment of formalpara.
We now emit the title (if present) as a separate paragraph
with boldface text.

Closes #1215.
2014-04-04 22:06:25 -07:00
John MacFarlane
e06aa97bb9 DocBook reader: set metadata "author" not "authors" 2014-04-04 21:39:08 -07:00
John MacFarlane
c5a0c7a819 Removed trailing whitespace. 2014-04-04 21:34:12 -07:00
John MacFarlane
c99ecf6cc5 DocBook reader: set "author" not "authors". 2014-04-04 21:33:16 -07:00
Matthew Pickering
1b930a9670 Added recognition of authorgroup element and releaseinfo element to DocBook reader.
Closes #1214
2014-04-04 21:00:49 -07:00
Matthew Pickering
b1c865ccc6 Converted current meta information parsing in DocBook to a more extensible version which is aware of the more recent meta representation. 2014-04-04 21:00:29 -07:00
John MacFarlane
98658b55fc LaTeX writer: handle line breaks in simple table cells.
Closes #1217.
2014-04-04 17:39:56 -07:00
John MacFarlane
036576c5a2 Correctly handle UTF-8 in custom lua scripts. Closes #1189. 2014-04-04 15:43:47 -07:00
John MacFarlane
fa0f73aef9 Custom writer: read lua script as UTF-8.
This should fix #1189.
2014-04-04 10:07:56 -07:00
John MacFarlane
4ee92dce0c MediaWiki reader: Fixed bug in certain nested lists.
The bug: If a level 2 list was followed by a level 1 list, the first
item of the level 1 list would be lost.

Closes #1213.
2014-04-01 10:36:23 -07:00
John MacFarlane
1cadba16eb HTML reader: idiomatic rewriting for clarity. 2014-04-01 10:10:46 -07:00
Matthew Pickering
5a51a67abd Changed the smart punctuation parser to return Inlines rather than an Inline element and updated files accordingly 2014-04-01 13:53:34 +01:00
Matthew Pickering
9b5d474e79 Converted HTML reader to use builder. Fixes #1162. 2014-04-01 13:44:19 +01:00
Matthew Pickering
0ccca94b4c Bugfix for #1175 and convert textile reader to use builder.
The reader did not correctly parse inline markup. The behavior is now as follows.

(a) The markup must start at the start of a line, be inside previous
inline markup or be preceded by whitespace.
(b) The markup can not span across paragraphs (delimited by \n\n)
(c) The markup can not be followed by an alphanumeric character.
(d) Square brackets can be placed around the markup to avoid having
to have white space before it.

In order to make these changes it was either necessary to convert the parser to return a list of inlines or to convert the whole reader to use the builder. The latter approach, whilst more work, makes a bit more sense, as it becomes easy to arbitrarily append and prepend elements without changing the type.

Tests are accordingly updated in a later commit to reflect the different normalisation behavior specified by the builder monoid.
2014-04-01 13:44:06 +01:00
John MacFarlane
99f4f636df Make --toc-depth work well with books in latex/pdf output.
Closes #1210.
2014-03-31 11:08:10 -07:00
John MacFarlane
361167deff Markdown writer: Use longer backtick fences if needed.
If the content contains a backtick fence and there are
attributes, make sure longer fences are used to delimit the code.

Note:  This works well in pandoc, but github markdown is more
limited, and will interpret the first string of three or more
backticks as ending the code block.

Closes #1206.
2014-03-30 15:50:01 -07:00
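Choosing a fence that cannot be closed by the content can be sketched as: find the longest run of backticks in the code, then use one more, with a minimum of three. The names `longestBacktickRun` and `fenceFor` are hypothetical, and this sketch lengthens the fence for any backtick run, not only when attributes are present.

```haskell
import Data.List (group)

-- Length of the longest run of consecutive backticks in the text.
longestBacktickRun :: String -> Int
longestBacktickRun =
  maximum . (0 :) . map length . filter (all (== '`')) . group

-- A fence one backtick longer than any run in the content, at least three.
fenceFor :: String -> String
fenceFor code = replicate (max 3 (longestBacktickRun code + 1)) '`'
```

Content without backticks gets the usual three-backtick fence; content that itself contains a three-backtick fence gets four.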
John MacFarlane
69a7c9f634 LaTeX reader: Better handling of figure and table with caption.
We now look for a \caption inside the environment; if one is
found, it is attached to the graphic or tabular found there.

Closes #1204.
2014-03-25 23:10:43 -07:00
John MacFarlane
0934c4430a Parsing: Added stateCaption.
This is primarily for use in the LaTeX reader, so far.
2014-03-25 22:44:16 -07:00
John MacFarlane
994597f071 Revert "LaTeX reader: Added LPState."
This reverts commit 82ddec698e.
2014-03-25 22:40:18 -07:00
John MacFarlane
82ddec698e LaTeX reader: Added LPState.
Plan is to use this instead of ParserState in LP.
2014-03-25 15:38:30 -07:00
John MacFarlane
6992050161 Parsing: Added HasMacros, simplified other typeclasses.
Removed updateHeaderMap, setHeaderMap, getHeaderMap,
updateIdentifierList, setIdentifierList, getIdentifierList.
2014-03-25 14:55:18 -07:00
John MacFarlane
6ec3ee3a67 Whitespace change, and note:
Contrary to the previous commit message, there was no API
change, since Text.Pandoc.Parsing is not an exposed module.
2014-03-25 13:51:55 -07:00
John MacFarlane
08d1404b31 API changes to HasReaderOptions, HasHeaderMap, HasIdentifierList.
Previously these were typeclasses of monads.  They've been changed
to be typeclasses of states.  This simplifies the instance definitions
and provides more flexibility.

This is an API change!  However, it should be backwards compatible
unless you're defining instances of HasReaderOptions, HasHeaderMap,
or HasIdentifierList.  The old getOption function should work as
before (albeit with a more general type).

The function askReaderOption has been removed.
extractReaderOptions has been added.
getOption has been given a default definition.

In HasHeaderMap, extractHeaderMap and updateHeaderMap have been added.
Default definitions have been given for getHeaderMap, putHeaderMap,
and modifyHeaderMap.

In HasIdentifierList, extractIdentifierList and updateIdentifierList
have been added.  Default definitions have been given for
getIdentifierList, putIdentifierList, and modifyIdentifierList.

The ultimate goal here is to allow different parsers to use their
own, tailored parser states (instead of ParserState) while still
using shared functions.
2014-03-25 13:43:34 -07:00
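The difference between a typeclass of states and a typeclass of monads can be sketched with toy types. Only the names `HasReaderOptions` and `extractReaderOptions` come from the message above; the record fields, `MyState` and `getOptionFrom` are hypothetical, and the real `getOption` runs inside the parser monad rather than on a bare state value.

```haskell
-- Toy stand-in for pandoc's ReaderOptions (fields are assumptions).
data ReaderOptions = ReaderOptions
  { optSmart   :: Bool
  , optColumns :: Int
  }

-- A class of *states*: an instance only says where the options live.
class HasReaderOptions st where
  extractReaderOptions :: st -> ReaderOptions

-- A tailored parser state for some hypothetical reader.
data MyState = MyState
  { myOptions   :: ReaderOptions
  , myNoteCount :: Int
  }

instance HasReaderOptions MyState where
  extractReaderOptions = myOptions

-- Shared helper that works for any state with an instance
-- (a pure simplification of a getOption-style accessor).
getOptionFrom :: HasReaderOptions st => (ReaderOptions -> a) -> st -> a
getOptionFrom f = f . extractReaderOptions
```

An instance is a single field accessor, which is what lets each reader define its own state type while still using shared functions.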
John MacFarlane
5e69f845d5 LaTeX reader: Better handling of "table" environment.
Positioning options no longer rendered verbatim.
Partially addresses #1204.
2014-03-25 12:04:25 -07:00
John MacFarlane
d7fbc40dff RTF writer: Fixed tables cells containing paragraphs.
This moves \intbl after \pard.
2014-03-24 15:12:32 -07:00
John MacFarlane
b9cc29e15a Merge pull request #1068 from jaimeMF/mw-images-langs
MediaWiki reader: Accept image links in more languages
2014-03-24 10:39:49 -07:00
John MacFarlane
3fa38db80b Parsing: Make F an instance of Applicative. Closes #1138. 2014-03-24 10:29:24 -07:00
John MacFarlane
dd058b38b0 Markdown reader: Fixed regression on line breaks in strict mode.
Closes #1203.
2014-03-24 09:56:16 -07:00
John MacFarlane
3df75bc160 PDF: Changes to error reporting, to handle non-UTF8 error output. 2014-03-19 11:09:36 -07:00
John MacFarlane
44f58e7e38 EPUB writer: Handle files linked in raw img tags.
See #1170.
2014-03-14 15:41:28 -07:00
John MacFarlane
91696c62c4 EPUB writer: Handle media in audio source tags.
This should resolve the rest of #1170, but it needs
extensive testing.

Note that we now use a 'media' directory rather than 'images'.
2014-03-14 15:30:11 -07:00
John MacFarlane
f6141aa241 EPUB writer: Incorporate files linked in <video> tags.
src and poster will both be incorporated into content.opf
and the epub container.

This partially addresses #1170.
Still need to do something similar for <audio>.
2014-03-14 15:18:43 -07:00
John MacFarlane
814af2002e RST writer: Avoid stack overflow with certain tables.
Closes #1197.

Note that there are still problems with the formatting of
the tables inside tables with output produced from the input
file in the original bug report.  But this fixes the stack
overflow problem.
2014-03-14 14:03:15 -07:00
John MacFarlane
76ef65f0b3 Man writer: Ensure that terms in definition lists aren't line wrapped.
Closes #1195.
2014-03-12 10:23:45 -07:00
Tim Lin
1aed9208f8 PDF: Use / as path separators in latex input only
Fixes compile error on Windows for 5040f3e
Reverted back to canonical file separators </> in all places except for
arguments to the LaTeX builder and in TEXINPUTS

See #1151.

Note: Temporary directories still fail to be removed in Windows due to
call of ByteString.Lazy.readFile creating process ownership of the
compiled pdf file.
2014-03-10 16:23:57 -07:00
John MacFarlane
5040f3ede0 PDF: Use / as path separators in tempdir on Windows.
This is needed for texlive.
Note that the / is used only in the body of withTempDir,
so when the directory is deleted, the original separators will
be used.

See #1151.
2014-03-10 11:16:50 -07:00
John MacFarlane
c026c16fa6 PDF: Use / as path separators even on Windows.
This seems to be necessary for texlive.
Closes #1151 (again!).
2014-03-09 21:26:25 -07:00
John MacFarlane
f3c9d37885 HTML writer: Add colgroup around col tags.
Also affects EPUB writer.
Closes #877.
2014-03-05 13:01:23 -08:00
John MacFarlane
6fda361977 SelfContained: Handle "poster" attribute in "video" tags.
Closes #1188.
2014-03-05 09:10:09 -08:00
John MacFarlane
3126b00f11 Templates: YAML objects resolve to "true" in conditionals.
Closes #1133.

Note:  If address is a YAML object and you just have $address$
in your template, the word "true" will appear, which may be
unexpected.  (Previously nothing would appear.)
2014-03-05 08:47:20 -08:00
John MacFarlane
ae86e24ff6 Merge branch 'master' of https://github.com/mb21/pandoc into mb21-master 2014-03-04 10:15:43 -08:00
Albert Krewinkel
24b2ac43b0 Add a simple Emacs Org-mode reader
The basic structure of org-mode documents is recognized; however,
org-mode features like todo markers, tags etc. are not supported yet.
2014-03-04 10:40:40 +01:00
mb21
80511f1b34 InDesign ICML Writer 2014-02-28 13:35:35 +01:00
John MacFarlane
4d0bf3c5d6 Markdown reader: Improved parsing of nested divs.
Formerly a closing div tag would be missed if it came right
after other block-level tags.
2014-02-26 22:53:12 -08:00
John MacFarlane
a208a972c3 Markdown parser: avoid backtracking when closing </div> not found. 2014-02-26 22:46:38 -08:00
John MacFarlane
581075a0ca Markdown reader: small efficiency improvement.
Switched `notFollowedBy' rawHtmlBlocks` ->
`notFollowedBy' (htmlTag isBlockTag)`, which is more
efficient.
2014-02-26 22:24:50 -08:00
John MacFarlane
69f7b1dbf3 Added readerTrace to ReaderOptions, --trace command line opt.
This is to debug backtracking-related parsing bugs.
So far it is only implemented for markdown, but it would
be good to extend it to latex and html readers.
2014-02-25 22:43:58 -08:00
John MacFarlane
19b127b898 PDF: Use ; for TEXINPUTS separator on Windows.
Closes #1151, I hope.  Testing needed.
2014-02-23 20:36:21 -08:00
John MacFarlane
a826d3936d Fixed bug in reference link parsing in markdown_mmd.
The bug was triggered by:

Link to [Google][]. Link to [twitter][].

[Google]: http://google.com
[twitter]: http://twitter.com
2014-02-21 17:32:02 -08:00
John MacFarlane
f3a062d5f9 Make rst figures true figures. Closes #1168.
Thanks to CasperVector.
2014-02-19 09:11:15 -08:00
John MacFarlane
c5dc801498 Merge pull request #1145 from wilx/en-dash-ligature-avoidance
Use \/ to avoid en-dash ligature instead of -{}-.
2014-02-17 16:03:27 -08:00
John MacFarlane
f6a020a906 HTML writer: Fixed bug with unnumbered section headings.
Unnumbered section headings (with class 'unnumbered') were getting
numbers.  This commit fixes the bug.
2014-02-17 15:18:52 -08:00
Merijn Verstraaten
66fd9bf759 Clarified field values in RstCustomRoles. 2014-02-15 17:57:08 +01:00
Merijn Verstraaten
fe246ce01c Enhanced Pandoc's support for rST roles.
rST parser now supports:
    - All built-in rST roles
    - New role definition
    - Role inheritance

Issues/TODO:
    - Silently ignores illegal fields on roles
    - Silently drops class annotations for roles
    - Only supports :format: fields with a single format for :raw: roles,
      requires a change to Text.Pandoc.Definition.Format to support multiple
      formats.
    - Allows direct use of :raw: role, rST only allows indirect (i.e.,
      inherited use of :raw:).
2014-02-15 17:51:33 +01:00
Vaclav Zeman
1ba8066f67 Merge remote-tracking branch 'origin/master' into en-dash-ligature-avoidance. 2014-02-09 13:16:39 +01:00
Vaclav Zeman
3f0fe345f9 Use \/ to avoid en-dash ligature instead of -{}-.
This is to fix LuaLaTeX output. The -{}- sequence does not avoid the
ligature with LuaLaTeX but \/ does.
2014-02-08 13:40:04 +01:00
Merijn Verstraaten
286781f801 Removed RenderState datatype context.
Reasoning:
    - It's not Haskell2010
    - It breaks some tools
    - Doesn't actually do anything
    - RenderState doesn't even have a Monoid instance
2014-02-06 23:10:59 +01:00
John MacFarlane
3127ab2b5e Slight code reorganization in endline. 2014-02-04 10:05:52 -08:00
John MacFarlane
a333d9788e ImageSize: Avoid use of lookAhead, which is not in binary >= 0.6.
Closes #1124.
2014-01-24 16:00:53 -08:00
John MacFarlane
9f3b2f6f5d Fixed mediawiki ordered list parsing.
Closes #1122.
2014-01-22 22:07:13 -08:00
John MacFarlane
6c59f060a7 HTML reader: Fixed bug reading inline math with $$.
See #225.
2014-01-20 11:09:44 -08:00
John MacFarlane
1bc9cb4105 Merge pull request #974 from merijn/master
Added support for LaTeX style literate Haskell code blocks in rST.
2014-01-16 12:32:44 -08:00
John MacFarlane
a1abb3eeea Allow binary 0.5. Version bump to 1.12.3.1. 2014-01-14 10:13:08 -08:00
John MacFarlane
b4b16d5786 Minor improvement to exif parser. 2014-01-09 22:51:23 -08:00
John MacFarlane
b9b1546ed2 Markdown parser: be more permissive about citation keys.
Keys may now start with an underscore as well as a letter.
Underscores do not count as internal punctuation, but are
treated like alphanumerics, so "key:_2008" will work, as
it did not before.  (This change was necessary to use keys
generated by zotero.)

Closes #1111, closes #1011.
2014-01-09 11:25:24 -08:00
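
A rough sketch of the key rule described in this commit, written as a standalone validator rather than a parser. The names `isCitationKey`, `isKeyChar`, and `isInternalPunct` are hypothetical, and the set of internal punctuation characters is an assumption, since the commit message does not list it.

```haskell
import Data.Char (isAlphaNum, isLetter)

-- Underscores are treated like alphanumerics inside a key.
isKeyChar :: Char -> Bool
isKeyChar c = isAlphaNum c || c == '_'

-- Assumed set of internal punctuation; not stated in the commit message.
isInternalPunct :: Char -> Bool
isInternalPunct c = c `elem` ":.#$%&-+?<>~/"

-- A key starts with a letter or underscore. Each internal punctuation
-- character must be followed directly by an alphanumeric or underscore.
isCitationKey :: String -> Bool
isCitationKey []       = False
isCitationKey (c : cs) = (isLetter c || c == '_') && go cs
  where
    go [] = True
    go (x : xs)
      | isKeyChar x       = go xs
      | isInternalPunct x = case xs of
          (y : _) | isKeyChar y -> go xs
          _                     -> False
      | otherwise         = False
```

Under these assumptions `isCitationKey "key:_2008"` holds because the underscore after the colon counts as an alphanumeric, while a key ending in punctuation such as `"key:"` is rejected.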
John MacFarlane
5c8c380a79 Better exif parsing, including image resolution.
This introduces a dependency on binary >= 0.6, but we depend
on binary >= 0.5 via zip-archive anyway.

Closes #976.
2014-01-09 11:16:55 -08:00
John MacFarlane
3bf8012bf6 Text.Pandoc.ImageSize: Parse EXIF format JPGs.
Note:  For now we just assign them all 72 dpi.  It wasn't
clear to me how to extract the resolution information.
At least the aspect ratio will be right, and 72 dpi is
the most common setting.

Closes #976.
2014-01-08 19:33:14 -08:00
John MacFarlane
aada7b495b fetchItem: Handle image URLs beginning with '//'. 2014-01-08 12:04:08 -08:00
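
One way to handle such URLs is sketched below. The commit message does not say how `fetchItem` resolves them, so this is an assumption: the hypothetical helper `resolveSchemeRelative` simply prepends `http:` when no scheme is present.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper, not taken from the pandoc source.
-- Assumes a scheme-relative URL ("//host/path") should default to http.
resolveSchemeRelative :: String -> String
resolveSchemeRelative url
  | "//" `isPrefixOf` url = "http:" ++ url
  | otherwise             = url
```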
John MacFarlane
d9eff99f27 Markdown reader: Allow hard line breaks in table cells.
The \-newline form must be used; the two-space+newline form
won't work, since in a table cell nearly every line ends with
two spaces.
2014-01-07 23:39:49 -08:00
John MacFarlane
2c7bf41d26 Added wmf and emf mime types. 2014-01-07 10:32:47 -08:00
John MacFarlane
002c8ce14c Fixed small regression in docx writer. 2014-01-07 09:01:32 -08:00
John MacFarlane
d97b1fd14c EPUB writer: Strip out footnotes from toc entries. 2014-01-06 21:51:11 -08:00
John MacFarlane
ba6a26b258 EPUB writer: Avoid duplicate notes when headings contain notes.
This arose because the headings are copied into the metadata
"title" field, and the note gets rendered twice.  We strip the
note now before putting the heading in "title".
2014-01-06 12:12:21 -08:00
John MacFarlane
2dd6d892fa HTML writer: Omit footnotes from TOC entries.
Otherwise we get doubled footnotes when headers have notes!
2014-01-06 10:17:31 -08:00
John MacFarlane
e7e76dbdd8 RST writer: Ensure no blank line after def in definition list.
Closes #992.
2014-01-02 21:10:14 -08:00
John MacFarlane
452a140d0c Pretty: Added nestle. API change, minor version bump to 1.12.3. 2014-01-02 21:09:39 -08:00
John MacFarlane
4e7aadb903 HTML writer: With --toc, headers no longer link to themselves.
Closes #1081.
2014-01-02 19:59:33 -08:00
John MacFarlane
b2db6979fe Use isHeaderBlock from Shared rather than defining it anew... 2014-01-02 19:32:13 -08:00
John MacFarlane
33955fd2ef ODT writer: Use mathml for proper rendering of formulas.
Note:  LibreOffice's support for this seems a bit buggy.  But
it should be better than what we had before.
2014-01-02 15:23:40 -08:00
John MacFarlane
ac100f2724 OpenDocument writer: Fixed RawInline, RawBlock so they don't escape. 2014-01-02 15:23:16 -08:00
John MacFarlane
e3d48da627 Moved fixDisplayMath from Docx writer to Writer.Shared. 2014-01-02 15:22:50 -08:00
John MacFarlane
f3ee82373b HTML reader: Parse name/content pairs from meta tags as metadata.
Closes #1106.
2014-01-01 09:22:37 -08:00
John MacFarlane
d6ec6cf9cf Docx writer: Fixed problem with some modified reference docx files.
Include `word/_rels/settings.xml.rels` if it exists, as well as other
`rels` files besides the ones pandoc generates explicitly.
2014-01-01 09:01:05 -08:00
Henry de Valence
3d70059a48 HLint: use fromMaybe
Replace uses of `maybe x id` with `fromMaybe x`.
2013-12-19 21:07:09 -05:00
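
A small illustration of this rewrite. The function names and the default value `"Untitled"` are made up for the example and do not come from the pandoc source.

```haskell
import Data.Maybe (fromMaybe)

-- Before: 'maybe' with 'id' as the handler for the Just case.
titleBefore :: Maybe String -> String
titleBefore = maybe "Untitled" id

-- After: 'fromMaybe' expresses the same default-value lookup directly.
titleAfter :: Maybe String -> String
titleAfter = fromMaybe "Untitled"
```

Both functions return the wrapped value for `Just` and the default for `Nothing`.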
Henry de Valence
c8fc0a0374 HLint: use /= 2013-12-19 20:46:11 -05:00
Henry de Valence
f6d151889c HLint: redundant parens
Remove parens enclosing a single element.
2013-12-19 20:43:25 -05:00
Henry de Valence
c35f5ba42d HLint: Remove lambdas. 2013-12-19 20:28:53 -05:00
Henry de Valence
0c5e7cf8cb HLint: use elem and notElem
Replaces long conditional chains with calls to `elem` and `notElem`.
2013-12-19 20:19:24 -05:00
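
The kind of change this commit makes, shown on an invented predicate. The names and the marker characters are hypothetical and serve only as an illustration.

```haskell
-- Before: a chain of equality tests joined with (||).
isMarkerBefore :: Char -> Bool
isMarkerBefore c = c == '*' || c == '+' || c == '-'

-- After: membership in a list, using 'elem'.
isMarkerAfter :: Char -> Bool
isMarkerAfter c = c `elem` "*+-"

-- Before: a chain of inequality tests joined with (&&).
notMarkerBefore :: Char -> Bool
notMarkerBefore c = c /= '*' && c /= '+' && c /= '-'

-- After: the negated membership test, using 'notElem'.
notMarkerAfter :: Char -> Bool
notMarkerAfter c = c `notElem` "*+-"
```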
Henry de Valence
1ed2c467c9 HLint: Use all
Replace `and . map` with `all`.
2013-12-19 17:06:27 -05:00
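
For illustration, the same rewrite on an invented function. `isBlankBefore` and `isBlankAfter` are hypothetical names, not taken from the pandoc source.

```haskell
import Data.Char (isSpace)

-- Before: map the predicate over the list, then fold with 'and'.
isBlankBefore :: String -> Bool
isBlankBefore = and . map isSpace

-- After: 'all' applies the predicate and combines the results in one step.
isBlankAfter :: String -> Bool
isBlankAfter = all isSpace
```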
John MacFarlane
8053ba2123 LaTeX writer: Better treatment of footnotes in tables.
Notes now appear in the regular sequence, rather than in the
table cell.  (This was a regression in 1.10.)
2013-12-17 20:53:59 -08:00
John MacFarlane
a3f6f2827c LaTeX writer: Factored out function for table cell creation. 2013-12-17 20:10:09 -08:00
John MacFarlane
0132f6fcb7 LaTeX reader: Support babel-style quoting: `` "`..."' ``. 2013-12-17 16:03:43 -08:00
John MacFarlane
826443926f Docbook reader: Avoid failure if tbody contains no tr or row elements. 2013-12-16 13:58:54 -08:00
John MacFarlane
2f00f5c7c2 Properly handle script blocks in strict mode.
(That is, markdown-markdown_in_html_blocks.)
Previously a spurious `<p>` tag was being added.

Closes #1093.
2013-12-15 12:27:29 -08:00
Jeff Arnold
5adbe7b365 LaTeX reader: add support for Verb macro 2013-12-13 19:16:04 -05:00
John MacFarlane
ca3c292f30 EPUB writer: Fixed bug with --epub-stylesheet.
Now the contents of `writerEpubStylesheet` (set by `--epub-stylesheet`)
should again work, and take precedence over a stylesheet specified
in the metadata.
2013-12-13 11:10:04 -08:00
John MacFarlane
6d0cd9203c Markdown reader: Fixed regression in title blocks.
If author field was empty, date was being ignored.  Closes #1089.
2013-12-12 22:34:56 -08:00
John MacFarlane
142f81889b Added withSocketsDo around http conduit code in openURL.
This should address #1080, but further testing on Windows is needed
before we can close the bug.
2013-12-09 22:35:57 -08:00
John MacFarlane
f966295770 Don't use tilde code blocks with braced attributes in gfm output.
A consequence of this change is that the backtick form will be
preferred in general if both are enabled.  I think that is good,
as it is much more widespread than the tilde form.

Closes #1084.
2013-12-09 20:31:47 -08:00
John MacFarlane
8e255fad98 Another small performance improvement. 2013-12-07 19:56:54 -08:00
John MacFarlane
e2c4156c20 Small performance improvement in list parsing. 2013-12-07 19:41:42 -08:00
John MacFarlane
e5a7c31a32 Markdown reader: Fixed bug with literal </div> in lists.
Closes #1078.
2013-12-07 17:12:52 -08:00
John MacFarlane
4a14467055 Text.Pandoc: Don't default to pandocExtensions for all writers.
In particular, we don't want to default to math parsing for the
HTML writer.
2013-12-06 17:31:47 -08:00
John MacFarlane
def05d3504 HTML reader: Parse LaTeX math if appropriate options are set.
* Moved inlineMath, displayMath from Markdown reader to Parsing.
* Export them from Parsing.  (API change.)
* Generalize their types.
2013-12-06 17:15:13 -08:00
John MacFarlane
5314df51f3 Stop parsing "list lines" when we hit a block tag.
This fixes exponential slowdown in certain input, e.g.
a series of lists followed by `</div>`.
2013-12-04 10:18:05 -08:00