Commit graph

6882 commits

Author SHA1 Message Date
John MacFarlane
e821b05125 LaTeX writer: Avoid double toprule in headerless table with caption.
Closes #2742.
2016-03-18 16:16:18 -07:00
John MacFarlane
8d1c01809e README: document that --toc works with docx.
Closes #2787.
2016-03-18 14:54:09 -07:00
Jesse Rosenthal
28c7617f19 Docx reader: Handle alternate content
Some Word functions -- especially graphics -- give various choices for
content so there can be backwards compatibility. This follows the
largely undocumented feature by working through the choices until we
find one that works.

Note that we had to split out the processing of child elems of runs into
a separate function so we can recurse properly. Any processing of an
element *within* a run (other than a plain run) should go into
`childElemToRun`.
2016-03-18 09:38:26 -04:00
Jesse Rosenthal
7f4a40474c Docx reader: Add test for enumerated headers.
We don't want them to turn into a list.
2016-03-16 12:56:17 -04:00
Jesse Rosenthal
855c8b43f0 Docx reader: Don't make numbered heads into lists.
Word uses list numbering styles to number its headings. We only call
something a numbered list if it does not also have a heading style.
2016-03-16 12:50:32 -04:00
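
The rule described in this commit can be sketched in a few lines. This is a toy model in Python, not the Haskell code of the docx reader: the `Paragraph` type, its fields, and the name-prefix test for heading styles are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paragraph:
    """Hypothetical, simplified view of a docx paragraph."""
    text: str
    style: Optional[str] = None   # paragraph style name, e.g. "Heading 1"
    num_id: Optional[str] = None  # set when Word attaches list numbering


def is_heading(par: Paragraph) -> bool:
    # Assumption: heading styles are recognised by a name prefix.
    return par.style is not None and par.style.lower().startswith("heading")


def classify(par: Paragraph) -> str:
    # The heading check comes first, so a numbered heading
    # is never turned into a list item.
    if is_heading(par):
        return "heading"
    if par.num_id is not None:
        return "list item"
    return "paragraph"


print(classify(Paragraph("Introduction", style="Heading 1", num_id="3")))  # heading
print(classify(Paragraph("First point", num_id="3")))                      # list item
```

The only point of the sketch is the order of the checks: numbering alone does not make a list.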
Jesse Rosenthal
09b4f294bf pandoc.hs: Also use filescope for json files.
JSON files have metadata and list structure, so they can't be simply
catted, but they're useful as intermediate build files in large projects.
2016-03-15 13:08:34 -04:00
Jesse Rosenthal
c7c4ee46f8 README: Add description of --file-scope option. 2016-03-15 12:52:51 -04:00
Jesse Rosenthal
5c055b4cf3 Introduce file-scope parsing (parse-before-combine)
Traditionally pandoc operates on multiple files by first concatenating
them (with extra line breaks in between) and then processing the joined file. So
it only parses a multi-file document at the document scope. This has the
benefit that footnotes and links can be in different files, but it also
introduces a couple of difficulties:

  - it is difficult to join files with footnotes without some sort of
    preprocessing, which makes it difficult to write academic documents
    in small pieces.

  - it makes it impossible to process multiple binary input files, which
    can't be catted.

  - it makes it impossible to process files from different input
    formats.

This commit introduces an alternative method. Instead of catting the files
first, it parses the files first, and then combines the parsed
output. This makes it impossible to have links across multiple files,
and auto-identified headers won't work correctly if headers in multiple
files have the same name. On the other hand, footnotes across multiple
files will work correctly, and there is more freedom in the choice of input formats.

Since ByteStringReaders can currently only read one binary file, and
will ignore subsequent files, we also change the behavior to
automatically parse before combining if using the ByteStringReader. If
we use one file, it will work as normal. If there is more than one file
it will combine them after parsing (assuming that the format is the
same).

Note that this is intended to be an optional method, defaulting to
off. Turn it on with `--file-scope`.
2016-03-15 12:52:51 -04:00
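
The difference between the two strategies can be shown with a toy model. The sketch below is Python and does not reflect pandoc's actual parser: the footnote regexes, the `parse` function, and the "first definition wins" rule for duplicate labels are assumptions chosen only to make the clash visible.

```python
import re

# Toy footnote syntax: "[^label]" is a reference, "[^label]: text" a definition.
NOTE_DEF = re.compile(r"^\[\^(\w+)\]:\s*(.*)$", re.MULTILINE)
NOTE_REF = re.compile(r"\[\^(\w+)\]")


def parse(text):
    """Return the note text that each reference in `text` resolves to."""
    defs = {}
    for label, body in NOTE_DEF.findall(text):
        defs.setdefault(label, body)  # assumption: first definition wins
    without_defs = NOTE_DEF.sub("", text)
    return [defs.get(label) for label in NOTE_REF.findall(without_defs)]


def document_scope(files):
    # Traditional behaviour: concatenate first, then parse once.
    return parse("\n\n".join(files))


def file_scope(files):
    # Parse-before-combine: parse each file on its own, then join the results.
    notes = []
    for text in files:
        notes.extend(parse(text))
    return notes


ch1 = "Intro.[^1]\n\n[^1]: Note from chapter one."
ch2 = "More.[^1]\n\n[^1]: Note from chapter two."

print(document_scope([ch1, ch2]))
print(file_scope([ch1, ch2]))
```

In the document-scope run both references resolve to the note from the first file, because the two files reuse the label `1`. In the file-scope run each reference resolves inside its own file. The trade-off named in the commit message also shows up here: a reference in one file can no longer see a definition in another.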
Jesse Rosenthal
68fd333ec4 Add a general ByteStringReader with warnings.
Have docx reader use it.
2016-03-12 17:08:20 -05:00
Jesse Rosenthal
ee03e954d0 Add readDocxWithWarnings
The regular readDocx just becomes a special case.
2016-03-12 17:08:20 -05:00
Jesse Rosenthal
102ba9ecb8 Docx Reader: Add state to the parser, for warnings
In order to be able to collect warnings during parsing, we add a state
monad transformer to the D monad. At the moment, this only includes a
list of warning strings (nothing currently triggers them, however). We
use StateT instead of WriterT to correspond more closely with the
warnings behavior in T.P.Parsing.
2016-03-12 17:08:20 -05:00
John MacFarlane
a485c42d78 Fixed behavior of base tag.
+ If the base path does not end with slash, the last component
  will be replaced.  E.g. base = `http://example.com/foo`
  combines with `bar.html` to give `http://example.com/bar.html`.
+ If the href begins with a slash, the whole path of the base
  is replaced.  E.g. base = `http://example.com/foo/` combines
  with `/bar.html` to give `http://example.com/bar.html`.

Closes #2777.
2016-03-10 19:59:55 -08:00
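
Both cases in this commit match standard relative-reference resolution. As an illustration only, Python's `urllib.parse.urljoin` applies the same rules; the `resolve` wrapper is a hypothetical helper, and pandoc's own implementation is in Haskell and is not shown here.

```python
from urllib.parse import urljoin


def resolve(base, href):
    """Resolve `href` against `base` the way a browser resolves a link."""
    return urljoin(base, href)


# Base without trailing slash: the last path component is replaced.
print(resolve("http://example.com/foo", "bar.html"))    # http://example.com/bar.html

# href starting with a slash: the whole base path is replaced.
print(resolve("http://example.com/foo/", "/bar.html"))  # http://example.com/bar.html

# For contrast, a base ending in a slash keeps its path.
print(resolve("http://example.com/foo/", "bar.html"))   # http://example.com/foo/bar.html
```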
John MacFarlane
06a57b27a1 Merge pull request #2771 from mb21/docx-alt-text
Docx Writer: handle image alt text
2016-03-10 08:57:31 -08:00
mb21
139fa54d48 Docx Writer: handle image alt text
closes #2754
2016-03-10 08:56:08 +01:00
John MacFarlane
2b55b76ebe Markdown reader: Improved pipe table parsing.
Fixes #2765.
Added test case.
2016-03-09 11:46:00 -08:00
John MacFarlane
54a68616d7 Markdown reader: Clean up pipe table parsing. 2016-03-09 10:11:32 -08:00
John MacFarlane
6e950a8eb5 Markdown reader: allow + separators in pipe table cells.
We already allowed them in the header, but not in the body
rows, for some reason.  This gives compatibility with org-mode
tables.
2016-03-09 08:44:31 -08:00
John MacFarlane
4ed64835cb Markdown reader: don't cross line boundary parsing pipe table row.
Previously an emph element could be parsed across the newline
at the end of the pipe table row.

I thought this would help with #2765, but it doesn't.
2016-03-09 08:33:13 -08:00
John MacFarlane
6bfaa5ad15 DokuWiki writer: use $$ for display math. 2016-03-08 10:08:14 -08:00
Jesse Rosenthal
0b9c54d9f3 Docx reader: update feature checklist.
The feature checklist in the source code was out of date. Update.
2016-03-08 00:36:13 -05:00
John MacFarlane
928a05073f Stack-based appveyor setup. 2016-03-07 09:16:39 -08:00
John MacFarlane
0510396a9b Merge pull request #2760 from ickc/master
Very Minor update on the documentation
2016-03-06 20:43:21 -08:00
ickc
846fa87046 Update README 2016-03-06 20:24:02 -08:00
ickc
b411e0ffd0 Update pandoc.1 2016-03-06 20:23:12 -08:00
John MacFarlane
7c6a3c0f69 LaTeX reader: handle interior $ characters in math.
e.g. `$$\hbox{$i$}$$`.

Partially addresses #2743.
2016-02-28 11:14:03 -08:00
John MacFarlane
ea70495fac Merge pull request #2739 from mb21/patch-2
Add relocatable stack build
2016-02-26 13:32:25 -08:00
Mauro Bieg
752be50ea5 Add relocatable stack build 2016-02-26 20:55:55 +01:00
Jesse Rosenthal
a7a0b452a5 Docx Reader: Get rid of Modifiable typeclass.
The docx reader used to use a Modifiable typeclass to combine both
Blocks and Inlines. But all the work was in the inlines. So most of the
generality was wasted, at the expense of making the code harder to
understand. This gets rid of the generality, and adds functions for
Blocks and Inlines. It should be a bit easier to work with going forward.
2016-02-26 08:57:53 -05:00
John MacFarlane
38bd4162fe Allow zip-archive 0.3. 2016-02-24 20:42:28 -08:00
John MacFarlane
f2bd6fd37c Make protocol-relative URIs work again.
Closes #2737.
2016-02-23 21:58:10 -08:00
John MacFarlane
0180807a6c Raise tagsoup lower bound to 0.13.7.
This fixes entity-related problems.

Closes #2734.
2016-02-22 09:59:11 -08:00
John MacFarlane
04d1e40f37 Markdown reader: use htmlInBalanced for rawVerbatimBlock.
This should give better performance.

See #2730.
2016-02-21 07:56:41 -08:00
Jesse Rosenthal
f1c59b271f Update README to reflect 4112b32.
We don't infer `--chapters` if `article` document option is set. For
example: `\documentclass[article]{memoir}`.
2016-02-21 06:34:38 -05:00
John MacFarlane
9693de7f59 Fixed some linter warnings. 2016-02-20 22:16:39 -08:00
John MacFarlane
29706ee02d Merge pull request #2646 from tarleb/org-figure-with-no-name
Prefix even empty figure names with "fig:"
2016-02-20 21:44:39 -08:00
John MacFarlane
649cfb61b8 Merge pull request #2668 from monofon/fix/yaml-metadata-block-bottom-line
Markdown writer: Use hyphens for yaml metadata block bottom line
2016-02-20 21:43:15 -08:00
John MacFarlane
e369e60fb4 Merge pull request #2691 from tarleb/org-image-file-links
Org reader: Refactor link-target processing
2016-02-20 21:42:12 -08:00
John MacFarlane
1534052dd9 HTML reader: rewrote htmlInBalanced.
This version avoids an exponential performance problem with `<script>` tags,
and it should be faster in general.

Closes #2730.
2016-02-20 15:00:31 -08:00
John MacFarlane
d45fcf9f6d Merge pull request #2732 from pra85/patch-2
Fix typos in Readme
2016-02-20 12:48:22 -08:00
Prayag Verma
8a114e9417 Fix typos in Readme
Remove extra `be`
`overriden` → `overridden`
2016-02-21 01:03:48 +05:30
Jesse Rosenthal
4438ff17fb LaTeX writer: clean up options parser.
Make sure that we require the closing bracket.
2016-02-18 23:35:38 -05:00
Jesse Rosenthal
4112b321cd LaTeX writer: treat memoir template with article opt as article
We currently treat all memoir templates as books. This means that pandoc
will infer the `--chapters` argument, even if the `article` option is
set for memoir.

This commit makes pandoc treat the document as an article if there is
an article option (i.e., `\documentclass[12pt,article]{memoir}`).

Note that this refactors out the parsec parsers for document class and
options, to make it a little clearer what's going on.
2016-02-18 22:32:38 -05:00
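
The decision described above can be sketched as follows. This is a Python approximation using a regular expression, whereas the commit uses parsec parsers in Haskell. The function names and the set of "book-like" classes are assumptions for illustration, not pandoc's actual list.

```python
import re

# Matches \documentclass[opt1,opt2]{class}; the options part is optional,
# but when present its closing bracket is required.
DOCCLASS = re.compile(r"\\documentclass\s*(?:\[([^\]]*)\])?\s*\{([^}]*)\}")

# Assumption: a hypothetical set of classes treated as books.
BOOK_CLASSES = {"book", "memoir"}


def parse_documentclass(line):
    """Return (class name, list of options), or None if nothing matches."""
    m = DOCCLASS.search(line)
    if m is None:
        return None
    options = [o.strip() for o in (m.group(1) or "").split(",") if o.strip()]
    return m.group(2).strip(), options


def infers_chapters(line):
    parsed = parse_documentclass(line)
    if parsed is None:
        return False
    cls, options = parsed
    # memoir with the `article` option is treated as an article.
    if cls == "memoir" and "article" in options:
        return False
    return cls in BOOK_CLASSES


print(infers_chapters(r"\documentclass{memoir}"))                # True
print(infers_chapters(r"\documentclass[12pt,article]{memoir}"))  # False
```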
John MacFarlane
5848416852 Merge pull request #2725 from adunning/patch-1
Remove stray line from stack.full.yaml
2016-02-18 16:33:44 -08:00
Andrew Dunning
4dfe3733e5 Remove stray line from stack.full.yaml
The line causes an error with stack 1.0.2:

```
Could not parse '/pandoc-build/pandoc/stack.full.yaml':
AesonException "Error in $.extra-deps: failed to parse field 'extra-deps': failed to parse field extra-deps: expected [a], encountered Null"
See http://docs.haskellstack.org/en/stable/yaml_configuration.html.
```
2016-02-18 15:08:06 +00:00
John MacFarlane
44bcc88d57 Don't build with lts-2 or lts-3. 2016-02-17 11:42:04 -08:00
John MacFarlane
9e3f739f11 Travis: don't build with lts-3.
It doesn't have recent enough dependencies.
2016-02-17 11:39:43 -08:00
John MacFarlane
dda7c27378 Travis fixes.
cabal sdist has problems on cabal 1.16, because of our
Text.Pandoc.Data module.  So we don't test it.
2016-02-17 11:13:34 -08:00
John MacFarlane
134a5e52a1 Fixed stack.yaml. 2016-02-17 11:10:12 -08:00
John MacFarlane
1a87794762 Try new travis stack+cabal script. 2016-02-17 10:13:29 -08:00
John MacFarlane
b8dadc608a HTML reader: properly handle an empty cell in a simple table.
Closes #2718.
2016-02-16 11:05:51 -08:00