Commit graph

4077 commits

Author SHA1 Message Date
John MacFarlane
9742c48647 Removed two superfluous lines. 2016-03-25 09:05:38 -07:00
John MacFarlane
f47b369f37 LaTeX writer: better positioning for hypertarget in figures.
Closes #2813.
2016-03-24 16:44:33 -07:00
John MacFarlane
bb6897a13e LaTeX writer: Fixed position of label in figures.
Partially addresses #2813.

This isn't perfect, because now the hypertarget is in the
wrong place -- when you link to the figure, the screen
is positioned with the caption at the top, and most of
the figure off screen.

So this needs a bit more tweaking.
2016-03-24 09:41:45 -07:00
John MacFarlane
499985c1a3 Updated copyright dates to include 2016. 2016-03-22 17:20:39 -07:00
John MacFarlane
b1ffdf3b01 Fixed bug in Markdown raw HTML parsing.
This was a regression, with the rewrite of `htmlInBalanced`
(from `Text.Pandoc.Readers.HTML`) in 1.17.

It caused newlines to be omitted in raw HTML blocks.

Closes #2804.
2016-03-22 16:56:10 -07:00
Mauro Bieg
44f95484a4 LaTeX Writer: fix polyglossia to babel env mapping
allow for optional argument in square brackets, closes #2728
2016-03-20 19:17:30 +01:00
John MacFarlane
3af753de47 Merge pull request #2637 from mb21/latex-figure-label
LaTeX writer: figure label
2016-03-19 13:56:14 -07:00
John MacFarlane
976e7e2054 ConTeXt writer: fix whitespace at line beginning in line blocks.
Add a `\strut` after `\crlf` before space.
Closes #2744, #2745.  Thanks to @c-forster.
This uses the fix suggested by @c-forster.

Mid-line spaces are still not supported, because of limitations
of the Markdown parser.
2016-03-18 16:36:56 -07:00
John MacFarlane
e821b05125 LaTeX writer: Avoid double toprule in headerless table with caption.
Closes #2742.
2016-03-18 16:16:18 -07:00
Jesse Rosenthal
28c7617f19 Docx reader: Handle alternate content
Some word functions -- especially graphics -- give various choices for
content so there can be backwards compatibility. This follows the
largely undocumented feature by working through the choices until we
find one that works.

Note that we had to split out the processing of child elems of runs into
a separate function so we can recurse properly. Any processing of an
element *within* a run (other than a plain run) should go into
`childElemToRun`.
2016-03-18 09:38:26 -04:00
Jesse Rosenthal
855c8b43f0 Docx reader: Don't make numbered heads into lists.
Word uses list numbering styles to number its headings. We only call
something a numbered list if it does not also have a heading style.
2016-03-16 12:50:32 -04:00
Jesse Rosenthal
5c055b4cf3 Introduce file-scope parsing (parse-before-combine)
Traditionally pandoc operates on multiple files by first concatenating
them (with extra line breaks in between) and then processing the joined file. So
it only parses a multi-file document at the document scope. This has the
benefit that footnotes and links can be in different files, but it also
introduces a couple of difficulties:

  - it is difficult to join files with footnotes without some sort of
    preprocessing, which makes it difficult to write academic documents
    in small pieces.

  - it makes it impossible to process multiple binary input files, which
    can't be catted.

  - it makes it impossible to process files from different input
    formats.

This commit introduces an alternative method. Instead of catting the files
first, it parses the files first, and then combines the parsed
output. This makes it impossible to have links across multiple files,
and auto-identified headers won't work correctly if headers in multiple
files have the same name. On the other hand, footnotes across multiple
files will work correctly, and there is more freedom in the choice of
input formats.

Since ByteStringReaders can currently only read one binary file, and
will ignore subsequent files, we also change the behavior to
automatically parse before combining if using the ByteStringReader. If
we use one file, it will work as normal. If there is more than one file
it will combine them after parsing (assuming that the format is the
same).

Note that this is intended to be an optional method, defaulting to
off. Turn it on with `--file-scope`.
2016-03-15 12:52:51 -04:00
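
The difference between the two strategies in the commit above can be sketched with a toy model. This is only an illustration in Python, not pandoc's Haskell implementation: the `parse` function, the footnote regex, and the per-file label renaming are all assumptions made for the example.

```python
import re

# Toy footnote definition syntax: "[^label]: text" on its own line (assumption).
FOOTNOTE_DEF = re.compile(r"^\[\^(\w+)\]:[ \t]*(.*)$", re.MULTILINE)

def parse(text):
    """Toy parser: return (body, footnotes) for one input string."""
    # Later definitions with the same label overwrite earlier ones.
    notes = dict(FOOTNOTE_DEF.findall(text))
    body = FOOTNOTE_DEF.sub("", text).strip()
    return body, notes

def concat_then_parse(files):
    """Document scope: join the sources first, then parse once."""
    return parse("\n\n".join(files))

def parse_then_combine(files):
    """File scope: parse each source alone, then merge the results."""
    bodies, notes = [], {}
    for index, text in enumerate(files):
        body, file_notes = parse(text)
        for label, note in file_notes.items():
            unique = f"{index}-{label}"  # hypothetical renaming scheme
            body = body.replace(f"[^{label}]", f"[^{unique}]")
            notes[unique] = note
        bodies.append(body)
    return "\n\n".join(bodies), notes

files = ["A[^1]\n\n[^1]: first", "B[^1]\n\n[^1]: second"]
_, joined_notes = concat_then_parse(files)
combined_body, combined_notes = parse_then_combine(files)
```

In this toy model the document-scope path keeps only one definition for the shared label `1`, while the file-scope path keeps both notes. The price is the one the commit message names: a reference in one file can no longer resolve to a definition in another.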
Jesse Rosenthal
68fd333ec4 Add a general ByteStringReader with warnings.
Have docx reader use it.
2016-03-12 17:08:20 -05:00
Jesse Rosenthal
ee03e954d0 Add readDocxWithWarnings
The regular readDocx just becomes a special case.
2016-03-12 17:08:20 -05:00
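
The "special case" relationship described above can be sketched in a few lines. This is a hypothetical Python analogue, not the Haskell API: the function names, the list-of-lines input, and the `!unsupported` marker are invented for illustration.

```python
def read_with_warnings(lines):
    """Return (blocks, warnings) for a list of input lines."""
    blocks, warnings = [], []
    for number, line in enumerate(lines, start=1):
        # "!unsupported" is a made-up marker standing in for content
        # the reader cannot convert.
        if line.startswith("!unsupported"):
            warnings.append(f"line {number}: skipped unsupported content")
        else:
            blocks.append(line)
    return blocks, warnings

def read(lines):
    """The plain reader is the warning-aware reader with warnings dropped."""
    blocks, _warnings = read_with_warnings(lines)
    return blocks

sample = ["para one", "!unsupported chart", "para two"]
doc, warns = read_with_warnings(sample)
```

Keeping a single code path means the two entry points cannot drift apart; callers that do not care about warnings simply ignore the second component.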
Jesse Rosenthal
102ba9ecb8 Docx Reader: Add state to the parser, for warnings
In order to be able to collect warnings during parsing, we add a state
monad transformer to the D monad. At the moment, this only includes a
list of warning strings (nothing currently triggers them, however). We
use StateT instead of WriterT to correspond more closely with the
warnings behavior in T.P.Parsing.
2016-03-12 17:08:20 -05:00
John MacFarlane
a485c42d78 Fixed behavior of base tag.
+ If the base path does not end with slash, the last component
  will be replaced.  E.g. base = `http://example.com/foo`
  combines with `bar.html` to give `http://example.com/bar.html`.
+ If the href begins with a slash, the whole path of the base
  is replaced.  E.g. base = `http://example.com/foo/` combines
  with `/bar.html` to give `http://example.com/bar.html`.

Closes #2777.
2016-03-10 19:59:55 -08:00
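
The two cases in the commit above match standard relative-URL resolution. Python's `urllib.parse.urljoin` applies the same rules, so it can be used to check the examples; this shows the URL semantics only and says nothing about how pandoc implements them.

```python
from urllib.parse import urljoin

# Base path without a trailing slash: the last component is replaced.
no_slash = urljoin("http://example.com/foo", "bar.html")

# href beginning with a slash: the whole path of the base is replaced.
absolute_path = urljoin("http://example.com/foo/", "/bar.html")

# For contrast: base path with a trailing slash and a relative href.
with_slash = urljoin("http://example.com/foo/", "bar.html")

print(no_slash)       # http://example.com/bar.html
print(absolute_path)  # http://example.com/bar.html
print(with_slash)     # http://example.com/foo/bar.html
```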
mb21
139fa54d48 Docx Writer: handle image alt text
closes #2754
2016-03-10 08:56:08 +01:00
John MacFarlane
2b55b76ebe Markdown reader: Improved pipe table parsing.
Fixes #2765.
Added test case.
2016-03-09 11:46:00 -08:00
John MacFarlane
54a68616d7 Markdown reader: Clean up pipe table parsing. 2016-03-09 10:11:32 -08:00
John MacFarlane
6e950a8eb5 Markdown reader: allow + separators in pipe table cells.
We already allowed them in the header, but not in the body
rows, for some reason.  This gives compatibility with org-mode
tables.
2016-03-09 08:44:31 -08:00
John MacFarlane
4ed64835cb Markdown reader: don't cross line boundary parsing pipe table row.
Previously an emph element could be parsed across the newline
at the end of the pipe table row.

I thought this would help with #2765, but it doesn't.
2016-03-09 08:33:13 -08:00
John MacFarlane
6bfaa5ad15 DokuWiki writer: use $$ for display math. 2016-03-08 10:08:14 -08:00
Jesse Rosenthal
0b9c54d9f3 Docx reader: update feature checklist.
The feature checklist in the source code was out of date. Update.
2016-03-08 00:36:13 -05:00
John MacFarlane
7c6a3c0f69 LaTeX reader: handle interior $ characters in math.
e.g. `$$\hbox{$i$}$$`.

Partially addresses #2743.
2016-02-28 11:14:03 -08:00
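
One way to see why interior `$` characters are awkward: a scanner that stops at the first `$` would cut `$$\hbox{$i$}$$` short. The sketch below is an assumption about how such input could be handled (track brace depth and only accept the closing `$$` at depth zero); it is written in Python for illustration and is not taken from the LaTeX reader.

```python
def display_math_body(source):
    """Return the text between the opening and closing $$, or None.

    A closing $$ is only recognized outside braces, so dollar signs
    nested inside {...} are treated as part of the math content.
    """
    if not source.startswith("$$"):
        return None
    depth = 0
    i = 2
    while i < len(source):
        char = source[i]
        if char == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
        elif depth == 0 and source.startswith("$$", i):
            return source[2:i]
        i += 1
    return None

body = display_math_body(r"$$\hbox{$i$}$$")
unterminated = display_math_body("$$x")
```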
Jesse Rosenthal
a7a0b452a5 Docx Reader: Get rid of Modifiable typeclass.
The docx reader used to use a Modifiable typeclass to combine both
Blocks and Inlines. But all the work was in the inlines. So most of the
generality was wasted, at the expense of making the code harder to
understand. This gets rid of the generality, and adds functions for
Blocks and Inlines. It should be a bit easier to work with going forward.
2016-02-26 08:57:53 -05:00
John MacFarlane
f2bd6fd37c Make protocol-relative URIs work again.
Closes #2737.
2016-02-23 21:58:10 -08:00
John MacFarlane
04d1e40f37 Markdown reader: use htmlInBalanced for rawVerbatimBlock.
This should give better performance.

See #2730.
2016-02-21 07:56:41 -08:00
John MacFarlane
9693de7f59 Fixed some linter warnings. 2016-02-20 22:16:39 -08:00
John MacFarlane
29706ee02d Merge pull request #2646 from tarleb/org-figure-with-no-name
Prefix even empty figure names with "fig:"
2016-02-20 21:44:39 -08:00
John MacFarlane
649cfb61b8 Merge pull request #2668 from monofon/fix/yaml-metadata-block-bottom-line
Markdown writer: Use hyphens for yaml metadata block bottom line
2016-02-20 21:43:15 -08:00
John MacFarlane
e369e60fb4 Merge pull request #2691 from tarleb/org-image-file-links
Org reader: Refactor link-target processing
2016-02-20 21:42:12 -08:00
John MacFarlane
1534052dd9 HTML reader: rewrote htmlInBalanced.
This version avoids an exponential performance problem with `<script>` tags,
and it should be faster in general.

Closes #2730.
2016-02-20 15:00:31 -08:00
Jesse Rosenthal
4438ff17fb LaTeX writer: clean up options parser.
Make sure that we require the closing bracket.
2016-02-18 23:35:38 -05:00
Jesse Rosenthal
4112b321cd LaTeX writer: treat memoir template with article opt as article
We currently treat all memoir templates as books. This means that pandoc
will infer the `--chapters` argument, even if the `article` option is
set for memoir.

This commit makes pandoc treat the document as an article if there is
an article option (i.e., `\documentclass[12pt,article]{memoir}`).

Note that this refactors out the parsec parsers for document class and
options, to make it a little clearer what's going on.
2016-02-18 22:32:38 -05:00
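
The rule described above can be sketched as follows. The commit uses parsec parsers; this Python version swaps in a regular expression purely for illustration, and `BOOK_CLASSES` is a hypothetical set rather than pandoc's actual list.

```python
import re

# \documentclass with an optional [options] group and a required {class}.
DOCCLASS = re.compile(r"\\documentclass\s*(?:\[([^\]]*)\])?\s*\{([^}]*)\}")

# Illustrative only: classes assumed to be book-like by default.
BOOK_CLASSES = {"book", "report", "memoir"}

def document_class(template):
    """Return (class name, list of options), or (None, []) if absent."""
    match = DOCCLASS.search(template)
    if match is None:
        return None, []
    raw_options = match.group(1) or ""
    options = [opt.strip() for opt in raw_options.split(",") if opt.strip()]
    return match.group(2).strip(), options

def treat_as_book(template):
    """memoir counts as a book unless the article option is set."""
    name, options = document_class(template)
    if name == "memoir" and "article" in options:
        return False
    return name in BOOK_CLASSES

memoir_article = treat_as_book(r"\documentclass[12pt,article]{memoir}")
memoir_plain = treat_as_book(r"\documentclass{memoir}")
```

Because the pattern requires `\]` whenever `\[` is present, an options list with a missing closing bracket does not match at all.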
John MacFarlane
b8dadc608a HTML reader: properly handle an empty cell in a simple table.
Closes #2718.
2016-02-16 11:05:51 -08:00
John MacFarlane
bbc67dee36 Removed tex_math_single_backslash from markdown_github options.
Closes #2707.
2016-02-09 22:30:52 -08:00
John MacFarlane
a692bd2872 Custom writer: Pass attributes parameter to CaptionedImage.
Closes #2697.
2016-02-05 16:49:27 -08:00
John MacFarlane
6cb4991f6b Markdown reader: Fixed bug with smart quotes around tex math.
Previously smart quotes were incorrect in the following:

    '$\neg(x \in x)$'.

(because of the following period).  This commit fixes the problem,
which was introduced by commit 4229cf2d92.
2016-02-04 12:09:26 -08:00
John MacFarlane
93a05dffd3 HTML writer: don't include alignment attribute for default table columns.
Previously these were given "left" alignment.  Better to leave off
alignment attributes altogether.

Closes #2694.
2016-02-03 13:31:21 -08:00
Jesse Rosenthal
2ee7752d14 Docx reader: Add a "Link" modifier to Reducible
We want to make sure that links have their spaces removed, and are
appropriately smushed together.

This closes #2689
2016-02-02 14:40:09 -05:00
Albert Krewinkel
92e6ae47f6 Org reader: Refactor link-target processing
Cleanup of the code for link target handling.  Most notably, the
canonicalization of a link is handled by a separate function.

This fixes #2684.
2016-01-31 23:23:09 +01:00
John MacFarlane
18745585c1 LaTeX reader: inlineCommand now gobbles an empty {} after any command.
This gives better results when people write e.g. `\TeX{}` in Markdown.

    \TeX{} and \LaTeX{}

now works as expected with `pandoc -f markdown -t latex`.

Closes #2687.
2016-01-31 10:52:46 -08:00
John MacFarlane
a02c26d9f4 HTML reader: handle multiple meta tags with same name.
Put them in a list in the metadata so they are all
preserved, rather than (as before) throwing out all
but one.
2016-01-29 11:51:01 -08:00
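
The idea of collecting repeated names into a list can be sketched with Python's standard `html.parser`. The `MetaCollector` class and the choice to keep a plain string for a name that occurs once are assumptions for the example, not a description of the HTML reader's data structures.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs, keeping every value."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is None or content is None:
            return
        if name not in self.meta:
            self.meta[name] = content            # first occurrence: plain value
        elif isinstance(self.meta[name], list):
            self.meta[name].append(content)      # third and later occurrences
        else:
            self.meta[name] = [self.meta[name], content]  # promote to a list

collector = MetaCollector()
collector.feed(
    '<meta name="author" content="A">'
    '<meta name="author" content="B">'
    '<meta name="date" content="2016">'
)
```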
John MacFarlane
76983c31f2 Properly handle LaTeX "math" environment as inline math.
See #2171.
2016-01-29 10:11:45 -08:00
John MacFarlane
a1021bdda6 Textile reader: Support >, <, =, <> text alignment attributes.
Closes #2674.
2016-01-25 09:34:49 -08:00
John MacFarlane
11c5831a1f Make language extensions trigger highlighting.
For example, `py` will now work as well as `python`.
Closes jgm/highlighting-kate#83.
2016-01-24 14:15:06 -08:00
John MacFarlane
20170c328f Changed type of Shared.uniqueIdent argument from [String] to Set String.
This avoids performance problems in documents with many identically
named headers.

Closes #2671.
2016-01-22 10:16:47 -08:00
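
The performance point can be illustrated with a small sketch. It is a Python analogue, not `Shared.uniqueIdent` itself: the suffix scheme (`-1`, `-2`, ...) and the function name are assumptions for the example.

```python
def unique_ident(base, used):
    """Return base, or base-N for the smallest N not already in `used`.

    `used` is a set, so each membership test does not have to scan
    every identifier seen so far, as it would with a list.
    """
    if base not in used:
        return base
    suffix = 1
    while f"{base}-{suffix}" in used:
        suffix += 1
    return f"{base}-{suffix}"

used = set()
idents = []
for title in ["intro", "intro", "intro"]:
    ident = unique_ident(title, used)
    used.add(ident)
    idents.append(ident)

print(idents)  # ['intro', 'intro-1', 'intro-2']
```

With a list, every `in` check walks the whole collection, so a document with many identically named headers pays that cost on each probe; a set keeps each probe cheap.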
John MacFarlane
3b39b16a4b Merge pull request #2638 from c-forster/teiwriter
Add TEI Writer.
2016-01-21 15:23:50 -08:00
Henrik Tramberend
556d0c1810 Markdown writer: Use hyphens for yaml metadata block bottom line 2016-01-21 12:44:16 +01:00
Henrik Tramberend
7a18879a36 LaTeX writer: Allow more flexible table alignment 2016-01-20 13:21:26 +01:00