Commit graph

3943 commits

Author SHA1 Message Date
Ivo Clarysse
fd36e6b64a Docbook5 writer: Properly handle ulink/link 2016-04-29 16:06:55 -07:00
Ivo Clarysse
987ec3a752 Write out Docbook 5 namespace 2016-04-29 15:43:15 -07:00
John MacFarlane
aa4a1d527a HTML writer: ensure mathjax link is added when math appears in footnote.
Previously if a document only had math in a footnote,
the MathJax link would not be added.

Closes #2881.
2016-04-29 14:54:54 -07:00
Ivo Clarysse
271cb4d845 Add docbook5 writer support 2016-04-29 14:00:46 -07:00
John MacFarlane
32f1b0a5f1 Revert "LaTeX writer: Add \strut to fix multiline tables"
This reverts commit 4c684561ee.

See
https://groups.google.com/d/msg/pandoc-discuss/u6J-_aCProU/UufN3IYRAgAJ

This should fix uneven spacing issues in multiline tables.
2016-04-27 17:25:45 -07:00
John MacFarlane
ece215ed7d Merge pull request #2735 from mb21/patch-1
LaTeX Writer: fix polyglossia to babel env mapping
2016-04-26 23:09:02 -07:00
John MacFarlane
cc0527bf31 Merge pull request #2829 from adunning/patch-1
LaTeX writer: Add missing languages.
2016-04-26 23:08:15 -07:00
John MacFarlane
cc82851a6a Merge pull request #2876 from shosti/org-code-indent
Ignore leading space in org code blocks
2016-04-26 23:07:29 -07:00
John MacFarlane
1985164816 LaTeX writer: ignore --incremental unless -t beamer.
Closes #2843.
2016-04-26 21:50:37 -07:00
Emanuel Evans
1bfe39e24c
Ignore leading space in org code blocks
Fixes #2862

Also fix up tab handling for leading whitespace in code blocks.
2016-04-26 10:29:59 -07:00
Jesse Rosenthal
a385ee1d4f Docx Reader: parse moveTo and moveFrom
`moveTo` and `moveFrom` are track-changes tags that are used when a
block of text is moved in the document. We now recognize these tags and
treat them the same as `insert` and `delete`, respectively. So,
`--track-changes=accept` will show the moved version, while
`--track-changes=reject` will show the original version.
2016-04-15 14:09:18 -04:00
John MacFarlane
4b49f923cb Markdown reader: Fix pandoc title blocks with lines ending in 2 spaces.
Closes #2799.

Also added -s to markdown-reader-more test.
2016-04-10 09:13:53 -07:00
John MacFarlane
773bbb8fc7 Markdown + HTML readers: be more forgiving about unescaped &.
We are now more forgiving about parsing invalid HTML with
unescaped `&` as raw HTML.  (Previously any unescaped `&`
would cause pandoc not to recognize the string as raw HTML.)

Closes #2410.
2016-04-10 07:39:36 -07:00
Andrew Dunning
9765ef2ce6 LaTeX writer: Add missing languages.
Updates the list from the hyphenation files at <http://mirror.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/loadhyph/>.
2016-04-01 16:47:33 +01:00
Andrew Dunning
0c37a7c488 Recognize la-x-classic as Classical Latin.
This allows one to access the hyphenation patterns at <http://mirrors.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-la-x-classic.tex>, using its private language tag.
2016-03-30 14:15:47 +01:00
John MacFarlane
f74498cb47 EPUB writer: set 'navpage' variable on nav page.
This allows templates to treat it differently.
2016-03-26 13:14:50 -07:00
John MacFarlane
9742c48647 Removed two superfluous lines. 2016-03-25 09:05:38 -07:00
John MacFarlane
f47b369f37 LaTeX writer: better positioning for hypertarget in figures.
Closes #2813.
2016-03-24 16:44:33 -07:00
John MacFarlane
bb6897a13e LaTeX writer: Fixed position of label in figures.
Partially addresses #2813.

This isn't perfect, because now the hypertarget is in the
wrong place -- when you link to the figure, the screen
is positioned with the caption at the top, and most of
the figure off screen.

So this needs a bit more tweaking.
2016-03-24 09:41:45 -07:00
John MacFarlane
499985c1a3 Updated copyright dates to include 2016. 2016-03-22 17:20:39 -07:00
John MacFarlane
b1ffdf3b01 Fixed bug in Markdown raw HTML parsing.
This was a regression, with the rewrite of `htmlInBalanced`
(from `Text.Pandoc.Readers.HTML`) in 1.17.

It caused newlines to be omitted in raw HTML blocks.

Closes #2804.
2016-03-22 16:56:10 -07:00
Mauro Bieg
44f95484a4 LaTeX Writer: fix polyglossia to babel env mapping
allow for optional argument in square brackets, closes #2728
2016-03-20 19:17:30 +01:00
John MacFarlane
3af753de47 Merge pull request #2637 from mb21/latex-figure-label
LaTeX writer: figure label
2016-03-19 13:56:14 -07:00
John MacFarlane
976e7e2054 ConTeXt writer: fix whitespace at line beginning in line blocks.
Add a `\strut` after `\crlf` before space.
Closes #2744, #2745.  Thanks to @c-foster.
This uses the fix suggested by @c-foster.

Mid-line spaces are still not supported, because of limitations
of the Markdown parser.
2016-03-18 16:36:56 -07:00
John MacFarlane
e821b05125 LaTeX writer: Avoid double toprule in headerless table with caption.
Closes #2742.
2016-03-18 16:16:18 -07:00
Jesse Rosenthal
28c7617f19 Docx reader: Handle alternate content
Some word functions -- especially graphics -- give various choices for
content so there can be backwards compatibility. This follows the
largely undocumented feature by working through the choices until we
find one that works.

Note that we had to split out the processing of child elems of runs into
a separate function so we can recurse properly. Any processing of an
element *within* a run (other than a plain run) should go into
`childElemToRun`.
2016-03-18 09:38:26 -04:00
Jesse Rosenthal
855c8b43f0 Docx reader: Don't make numbered heads into lists.
Word uses list numbering styles to number its headings. We only call
something a numbered list if it does not also heave a heading style.
2016-03-16 12:50:32 -04:00
Jesse Rosenthal
5c055b4cf3 Introduce file-scope parsing (parse-before-combine)
Traditionally pandoc operates on multiple files by first concetenating
them (around extra line breaks) and then processing the joined file. So
it only parses a multi-file document at the document scope. This has the
benefit that footnotes and links can be in different files, but it also
introduces a couple of difficulties:

  - it is difficult to join files with footnotes without some sort of
    preprocessing, which makes it difficult to write academic documents
    in small pieces.

  - it makes it impossible to process multiple binary input files, which
    can't be catted.

  - it makes it impossible to process files from different input
    formats.

This commit introduces alternative method. Instead of catting the files
first, it parses the files first, and then combines the parsed
output. This makes it impossible to have links across multiple files,
and auto-identified headers won't work correctly if headers in multiple
files have the same name. On the other hand, footnotes across multiple
files will work correctly and will allow more freedom for input formats.

Since ByteStringReaders can currently only read one binary file, and
will ignore subsequent files, we also changes the behavior to
automatically parse before combining if using the ByteStringReader. If
we use one file, it will work as normal. If there is more than one file
it will combine them after parsing (assuming that the format is the
same).

Note that this is intended to be an optional method, defaulting to
off. Turn it on with `--file-scope`.
2016-03-15 12:52:51 -04:00
Jesse Rosenthal
68fd333ec4 Add a general ByteStringReader with warnings.
Have docx reader use it.
2016-03-12 17:08:20 -05:00
Jesse Rosenthal
ee03e954d0 Add readDocxWithWarnings
The regular readDocx just becomes a special case.
2016-03-12 17:08:20 -05:00
Jesse Rosenthal
102ba9ecb8 Docx Reader: Add state to the parser, for warnings
In order to be able to collect warnings during parsing, we add a state
monad transformer to the D monad. At the moment, this only includes a
list of warning strings (nothing currently triggers them, however). We
use StateT instead of WriterT to correspond more closely with the
warnings behavior in T.P.Parsing.
2016-03-12 17:08:20 -05:00
John MacFarlane
a485c42d78 Fixed behavior of base tag.
+ If the base path does not end with slash, the last component
  will be replaced.  E.g. base = `http://example.com/foo`
  combines with `bar.html` to give `http://example.com/bar.html`.
+ If the href begins with a slash, the whole path of the base
  is replaced.  E.g. base = `http://example.com/foo/` combines
  with `/bar.html` to give `http://example.com/bar.html`.

Closes #2777.
2016-03-10 19:59:55 -08:00
mb21
139fa54d48 Docx Writer: handle image alt text
closes #2754
2016-03-10 08:56:08 +01:00
John MacFarlane
2b55b76ebe Markdown reader: Improved pipe table parsing.
Fixes #2765.
Added test case.
2016-03-09 11:46:00 -08:00
John MacFarlane
54a68616d7 Markdown reader: Clean up pipe table parsing. 2016-03-09 10:11:32 -08:00
John MacFarlane
6e950a8eb5 Markdown reader: allow + separators in pipe table cells.
We already allowed them in the header, but not in the body
rows, for some reason.  This gives compatibility with org-mode
tables.
2016-03-09 08:44:31 -08:00
John MacFarlane
4ed64835cb Markdown reader: don't cross line boundary parsing pipe table row.
Previously an emph element could be parsed across the newline
at the end of the pipe table row.

I thought this would help with #2765, but it doesn't.
2016-03-09 08:33:13 -08:00
John MacFarlane
6bfaa5ad15 DokuWiki writer: use $$ for display math. 2016-03-08 10:08:14 -08:00
Jesse Rosenthal
0b9c54d9f3 Docx reader: update feature checklist.
The feature checklist in the source code was out of date. Update.
2016-03-08 00:36:13 -05:00
John MacFarlane
7c6a3c0f69 LaTeX reader: handle interior $ characters in math.
e.g. `$$\hbox{$i$}$$`.

Partially addresses #2743.
2016-02-28 11:14:03 -08:00
Jesse Rosenthal
a7a0b452a5 Docx Reader: Get rid of Modifiable typeclass.
The docx reader used to use a Modifiable typeclass to combine both
Blocks and Inlines. But all the work was in the inlines. So most of the
generality was wasted, at the expense of making the code harder to
understand. This gets rid of the generality, and adds functions for
Blocks and Inlines. It should be a bit easier to work with going forward.
2016-02-26 08:57:53 -05:00
John MacFarlane
f2bd6fd37c Make protocol-relative URIs work again.
Closes #2737.
2016-02-23 21:58:10 -08:00
John MacFarlane
04d1e40f37 Markdown reader: use htmlInBalanced for rawVerbatimBlock.
This should give better performance.

See #2730.
2016-02-21 07:56:41 -08:00
John MacFarlane
9693de7f59 Fixed some linter warnings. 2016-02-20 22:16:39 -08:00
John MacFarlane
29706ee02d Merge pull request #2646 from tarleb/org-figure-with-no-name
Prefix even empty figure names with "fig:"
2016-02-20 21:44:39 -08:00
John MacFarlane
649cfb61b8 Merge pull request #2668 from monofon/fix/yaml-metadata-block-bottom-line
Markdown writer: Use hyphens for yaml metadata block bottom line
2016-02-20 21:43:15 -08:00
John MacFarlane
e369e60fb4 Merge pull request #2691 from tarleb/org-image-file-links
Org reader: Refactor link-target processing
2016-02-20 21:42:12 -08:00
John MacFarlane
1534052dd9 HTML reader: rewrote htmlInBalanced.
This version avoids an exponential performance problem with `<script>` tags,
and it should be faster in general.

Closes #2730.
2016-02-20 15:00:31 -08:00
Jesse Rosenthal
4438ff17fb LaTeX writer: clean up options parser.
Make sure that we require the closing bracket.
2016-02-18 23:35:38 -05:00
Jesse Rosenthal
4112b321cd LaTeX writer: treat memoir template with article opt as article
We currently treat all memoir templates as books. This means that pandoc
will infer the `--chapters` argument, even if the `article` iption is
set for memoir.

This commit makes pandoc treats the document as an article if there is
an article option (i.e., `\documentclass[12pt,article]{memoir}`).

Note that this refactors out the parsec parsers for document class and
options, to make it a little clearer what's going on.
2016-02-18 22:32:38 -05:00