Commit graph

3203 commits

Author SHA1 Message Date
John MacFarlane
fe625e053d New method for producing man pages.
This change adds `--man1` and `--man5` options to pandoc, so
pandoc can generate its own man pages.

It removes the old overly complex method of building a separate
executable (but not installing it) just to create the man pages.

The man pages are no longer automatically created in the build
process.

The man/ directory has been removed.  The man page templates
have been moved to data/.

New unexported module:  Text.Pandoc.ManPages.

Text.Pandoc.Data now exports readmeFile, and `readDataFile`
knows how to find README.

Closes #2190.
2015-06-28 14:39:17 -07:00
John MacFarlane
ed9a118b54 Fixed regression in CSS parsing with --self-contained.
In 1b44acf0c5 we replaced some
hackish CSS parsing with css-text, which I thought was a complete
CSS parser.  It turns out that it is very buggy, which results
in lots of things being silently dropped from CSS when
`--self-contained` is used (#2224).

This commit replaces the use of css-text with a small but
more principled css preprocessor, which only removes whitespace
and replaces URLs with base 64 data when possible.

Closes #2224.
2015-06-28 11:54:18 -07:00
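As a rough illustration of the URL-replacement half of the preprocessor described in ed9a118b54 (whitespace removal is left out), the following Python sketch rewrites `url(...)` references as base 64 data URIs. It is not pandoc's Haskell code: `inline_css_urls`, `URL_RE`, and the `fetch` callback are hypothetical names, and the skipping of `#` fragments mirrors the test described in 9b2f645e2a.

```python
import base64
import re

# Matches url(...) with an optional matching pair of quotes (simplified).
URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def inline_css_urls(css, fetch):
    """Replace url(...) references with data URIs when possible.

    `fetch` is a hypothetical callback returning (mime_type, bytes),
    or None when the resource cannot be retrieved.
    """
    def replace(match):
        url = match.group(2)
        # Leave fragment references and existing data URIs untouched.
        if url.startswith("#") or url.startswith("data:"):
            return match.group(0)
        fetched = fetch(url)
        if fetched is None:
            return match.group(0)
        mime, payload = fetched
        encoded = base64.b64encode(payload).decode("ascii")
        return f"url(data:{mime};base64,{encoded})"

    return URL_RE.sub(replace, css)


def fake_fetch(url):
    # Stand-in for reading a file or fetching over HTTP.
    return ("image/png", b"hi") if url == "a.png" else None


print(inline_css_urls('body{background:url("a.png")}', fake_fetch))
```

Anything the callback cannot resolve is passed through unchanged, which avoids the silent dropping of CSS that the commit message describes.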
John MacFarlane
2b1fd52403 Removed unused import. 2015-06-27 20:41:00 -07:00
John MacFarlane
7df4d86006 Textile writer: escape + and - as entities.
Closes #2225.
2015-06-27 20:30:20 -07:00
John MacFarlane
fce3ebb8e0 Plain writer: don't use symbols for super/subscript.
Simplified code by using plainExtensions from Options.

Closes #2237.
2015-06-27 20:19:04 -07:00
John MacFarlane
177533d3f8 Options: Export plainExtensions.
These are the extensions used in `plain` output.
2015-06-27 20:18:14 -07:00
mb21
82e363a727 DocBook reader mediaobjects and figures, closes #2184 2015-06-21 18:36:47 +02:00
gohai
f51757bd16 Fix InDesign crash with URLs containing more than one colon character
Colons are valid characters in URLs, and used e.g. by the Internet Archive's Wayback Machine - a popular resource amongst researchers. When InDesign encounters a HyperlinkURLDestination with more than one colon character in it, it crashes when placing the ICML. (This was tested against CS6.) The IDML specification hints at this requirement in section 6.4.1: "The colon appears in the Name attribute of the style, but is encoded as %3a when it appears in the Self attribute". Follow this example for all colon characters in URLs.
2015-06-09 15:46:23 +02:00
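The encoding described in f51757bd16 can be pictured with a short Python sketch. This is only an illustration of the idea, not the ICML writer's Haskell code; `escape_colons` is a hypothetical helper, and it assumes that every colon is encoded, as the message states.

```python
def escape_colons(url: str) -> str:
    """Encode every colon as %3a (hypothetical helper)."""
    return url.replace(":", "%3a")


# A Wayback Machine style URL contains a second colon in the archived address.
wayback = "http://web.archive.org/web/2015/http://example.com/"
print(escape_colons(wayback))
```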
John MacFarlane
7b4f077652 DokuWiki writer: Use proper <code> tags for code blocks.
Closes #2213.
2015-06-07 11:29:47 -07:00
John MacFarlane
cb7dd4469d HTML reader: allow <body> to close <head>. 2015-06-04 10:23:38 +02:00
John MacFarlane
3534ad3666 Custom writer: fixed some compiler warnings for ghc < 7.10. 2015-05-31 21:18:26 +02:00
John MacFarlane
c327b283c1 Allow building with hslua 0.4. 2015-05-31 13:57:14 +02:00
John MacFarlane
b241472a90 Better fix for #2187.
* Reverted kludgy change to make-windows-installer.bat.
* Removed make-reference-files.hs.
* Moved the individual ingredients of reference.docx and
  reference.odt to the data directory.
* Removed reference.docx and reference.odt from data directory.
* We now build the reference archives from their ingredient pieces
  in the docx and odt writers, instead of having a reference.docx
  or reference.odt intermediary.

This should fix #2187.

It also simplifies the build procedure.

The one thing users may notice is different is that you can
no longer get the reference.docx or reference.odt using
`--print-default-data-file`.  Instead, simply generate a
docx or odt using pandoc with a blank or minimal input,
and use that (or a customized version) with `--reference-docx`
or `--reference-odt`.
2015-05-28 18:15:01 -07:00
John MacFarlane
c1f6d5e31f ConTeXt writer: Add reference anchors to Div with ids.
This is useful for pandoc-citeproc linked citations.
2015-05-28 17:46:26 -07:00
John MacFarlane
e54c8613e8 Removed tab chars in Textile reader source. 2015-05-28 13:07:52 -07:00
John MacFarlane
4112fb8358 Texinfo writer: Removed tabs from source. 2015-05-28 13:01:43 -07:00
John MacFarlane
4a9aaf6fd6 LaTeX/beamer: added setotherlanguages in polyglossia.
This uses an `otherlang` variable that takes a list of languages.

As requested in #2174.
2015-05-27 12:15:50 -07:00
John MacFarlane
c841b15d25 LaTeX writer: Make mainlang work when lang is in metadata.
Closes #2174.
2015-05-27 12:01:49 -07:00
John MacFarlane
adfb217622 Fixed svg handling in EPUB writer.
This is a crude workaround for #2183.
A correct fix would require having openURL and fetchItem return
a content encoding as well as a content type.
2015-05-27 11:46:02 -07:00
John MacFarlane
c99e057d8e Fixed compiler warning. 2015-05-27 11:22:39 -07:00
John MacFarlane
734b0bc2fb Revealjs: allow 'center' to be set to false. 2015-05-27 11:04:38 -07:00
John MacFarlane
dbf2a63669 EPUB writer: Improved chapter splitting and internal link rewriting.
Closes #1887.
Closes #2163.
Closes #2162.
2015-05-27 09:39:05 -07:00
John MacFarlane
24bfc8274e Merge pull request #2170 from tarleb/org-generalize-result-block
Org generalize result block
2015-05-26 17:14:42 -07:00
John MacFarlane
fe66122b61 Merge pull request #2169 from tarleb/org-header-tags
Org reader: put header tags into empty spans
2015-05-26 17:14:29 -07:00
John MacFarlane
c3cb27f2f2 Merge pull request #2141 from DigitalPublishingToolkit/icml-images
Fix image URIs in ICML output
2015-05-26 17:14:10 -07:00
Albert Krewinkel
385dcf5b99 Org reader: drop trees with a :noexport: tag
Trees having a `:noexport:` tag set are not exported.  This mirrors
default Emacs Org-Mode behavior.
2015-05-23 14:23:16 +02:00
Albert Krewinkel
d8e4a8bc10 Org reader: put header tags into empty spans
Org mode allows headers to be tagged:

``` org-mode
* Headline         :TAG1:TAG2:
```

Instead of being interpreted as part of the headline, the tags are now
put into the attributes of empty spans.  Spans without textual content
won't be visible by default, but they are detectable by filters.  They
can also be styled using CSS when written as HTML.

This fixes #2160.
2015-05-23 14:06:32 +02:00
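To make the headline syntax in d8e4a8bc10 concrete, here is a small Python sketch that separates trailing tags from the headline text. It is not the Org reader's parser: `split_headline` and `TAGS_RE` are hypothetical, the set of characters allowed in tags is an assumption, and how the tags end up in span attributes is not covered because the message does not say.

```python
import re

# A trailing group such as ":TAG1:TAG2:" preceded by whitespace.
TAGS_RE = re.compile(r"^(?P<title>.*?)\s+:(?P<tags>[\w@#%:]+):\s*$")


def split_headline(text):
    """Return (title, tags) for the text after the leading stars."""
    match = TAGS_RE.match(text)
    if match is None:
        return text.strip(), []
    tags = [tag for tag in match.group("tags").split(":") if tag]
    return match.group("title"), tags


print(split_headline("Headline         :TAG1:TAG2:"))
```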
Albert Krewinkel
b61355cecd Org reader: generalize code block result parsing
Code blocks can be followed by optional result blocks, representing the
output generated by running the code in the code block.  It is possible
to choose whether one wants to export the code, the result, both or
none.

This patch allows any kind of `Block` as the result.  Previously, only
example code blocks were recognized.
2015-05-23 13:22:07 +02:00
Albert Krewinkel
40fb102417 Reorder block arguments parsing code
Group code used to parse block arguments together in one place.  This
seems better than having part of the code mixed between unrelated
parsing state changing functions.
2015-05-23 13:17:10 +02:00
John MacFarlane
d5f367d04b EPUB writer: Split references into separate chapter.
Previously the div-enclosed reference section produced
by pandoc-citeproc would not be split into its own chapter,
which caused various problems.

See #2162, #2163.

I'm not sure this is a complete fix.  I note that the bibliography
doesn't appear in nav or toc, which seems bad.
2015-05-21 00:45:57 -07:00
John MacFarlane
57e16e3287 PDF writer: Print temp dir on --verbose.
This might help diagnose #777.
2015-05-20 15:43:42 -07:00
John MacFarlane
f532dd69c9 DocBook writer: add id to para if in Div with id element.
This makes the writer work properly with linked bibliographic
items with pandoc-citeproc.

Closes jgm/pandoc-citeproc#132.
2015-05-20 10:55:06 -07:00
John MacFarlane
24ee1ab4f7 Markdown reader: Made implicit header references case-insensitive.
Added `stateHeaderKeys` to `ParserState`; this is a `KeyTable`
like `stateKeys`, but it only gets consulted if we don't find
a match in `stateKeys`, and if `Ext_implicit_header_references`
is enabled.

Closes #1606.
2015-05-13 23:12:58 -07:00
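The two-table lookup described in 24ee1ab4f7 might be sketched in Python as follows. The dictionaries stand in for `stateKeys` and `stateHeaderKeys`, and the boolean for `Ext_implicit_header_references`. The function names and the normalization (case folding plus whitespace collapsing) are assumptions made for illustration.

```python
def normalize(label):
    """Case-insensitive, whitespace-collapsed form of a reference label."""
    return " ".join(label.split()).casefold()


def lookup_reference(label, explicit_keys, header_keys,
                     implicit_header_references=True):
    """Consult explicit reference keys first, then header keys as a fallback."""
    key = normalize(label)
    if key in explicit_keys:
        return explicit_keys[key]
    if implicit_header_references:
        return header_keys.get(key)
    return None


explicit = {normalize("Foo"): "http://example.com"}
headers = {normalize("My Header"): "#my-header", normalize("foo"): "#foo"}

print(lookup_reference("my header", explicit, headers))
print(lookup_reference("FOO", explicit, headers))
```

Because the header table is consulted only after the explicit table misses, an explicitly defined reference always wins over a header with the same name.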
John MacFarlane
e06810499e HTML reader: Support base tag.
We only support the href attribute, as there's no place for
"target" in the Pandoc document model for links.

Added HTML reader test module, with tests for this feature.

Closes #1751.
2015-05-13 20:53:19 -07:00
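The effect of honoring a base `href`, as in e06810499e, can be illustrated with Python's standard `urljoin`. This is only a sketch of the general idea; the HTML reader's own resolution rules may differ, and `resolve_link` is a hypothetical helper.

```python
from urllib.parse import urljoin


def resolve_link(base_href, href):
    """Resolve a link target against the document's base href, if any."""
    if not base_href:
        return href
    return urljoin(base_href, href)


base = "http://example.com/docs/"
print(resolve_link(base, "page.html"))
print(resolve_link(base, "http://other.org/x"))
```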
John MacFarlane
75cfa7b462 Beamer: mark slide as [fragile] if header has fragile class.
Closes #2119.
2015-05-13 20:10:54 -07:00
John MacFarlane
0fa753b999 EPUB writer: Properly handle image URLs without an extension.
We now look at the mime type from the server and attach an
appropriate extension.

Closes #1855.
2015-05-13 14:52:51 -07:00
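A minimal Python sketch of the idea in 0fa753b999: when an image URL has no extension, choose one from the MIME type the server reports. The table and the `filename_for` helper are hypothetical and cover only a few image types. They do not reflect pandoc's actual MIME table.

```python
import posixpath
from urllib.parse import urlsplit

# Small illustrative table; a real one would cover many more types.
MIME_EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "image/svg+xml": ".svg",
}


def filename_for(url, mime_type):
    """Pick a file name for a fetched image, adding an extension if missing."""
    name = posixpath.basename(urlsplit(url).path) or "image"
    _, ext = posixpath.splitext(name)
    if ext:
        return name
    # Drop parameters such as "; charset=utf-8" before the lookup.
    bare_type = mime_type.split(";")[0].strip().lower()
    return name + MIME_EXTENSIONS.get(bare_type, "")


print(filename_for("http://example.com/img/123", "image/png"))
```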
John MacFarlane
c9cb313a47 Fixed regression in charsInBalancedBrackets.
Introduced by e9d7504.

This regression caused link and image references containing
raw tex not to parse correctly.

Added test.

Closes #2150.
2015-05-13 10:16:06 -07:00
John MacFarlane
4560447041 Don't use sup element for epub footnotes.
Instead, just use an a element with class `footnoteRef`.
This allows more styling options, and provides better results
in some readers (e.g. iBooks, where anything inside the a
tag breaks popup footnotes).

Closes #1995.
2015-05-11 21:58:01 -07:00
John MacFarlane
9857aa866a HTML reader: Fixed detection of self-closing tags.
Earlier versions had a bug and would wrongly think
opening tags containing attributes with slashes in them
were self-closing.

Closes #2146.
2015-05-11 16:17:20 -07:00
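The bug fixed in 9857aa866a can be pictured by contrasting two checks in Python. Both functions are hypothetical simplifications working on the raw tag text. They are not the HTML reader's code, and unquoted attribute values ending in a slash are ignored.

```python
def naive_is_self_closing(tag: str) -> bool:
    """Buggy idea: any slash inside an opening tag counts."""
    return not tag.startswith("</") and "/" in tag


def is_self_closing(tag: str) -> bool:
    """Only a slash immediately before the closing '>' counts."""
    stripped = tag.strip()
    return (stripped.startswith("<")
            and not stripped.startswith("</")
            and stripped.endswith("/>"))


link = '<a href="/foo/bar">'
print(naive_is_self_closing(link), is_self_closing(link))
```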
gohai
8af168a7fe Fix image URIs in ICML output (v2)
InDesign expects LinkResourceURI to start with "file:" for local filenames, and won't render/link the image without.
2015-05-11 15:49:36 +02:00
John MacFarlane
4b251e93b4 ImageSize: fixed some exif parsing bugs.
Closes #1834.  The image originally supplied works fine now
with pandoc.
2015-05-10 16:52:37 -07:00
John MacFarlane
60bf4a8bfb Improved warnings when image size can't be determined.
Closes #1834.
2015-05-09 23:56:53 -07:00
John MacFarlane
31b3f2ef88 ImageSize: Use runGetOrFail with binary 0.7+. 2015-05-09 22:28:49 -07:00
John MacFarlane
a60c65c4e9 ImageSize: make jpeg header parsing routines return Either.
See #1834.
2015-05-09 21:55:19 -07:00
John MacFarlane
6fe243abbd ImageSize: make imageSize return an Either, not a Maybe.
This will give us better error reporting options.
This is part of a fix for #1834.
2015-05-09 21:32:31 -07:00
John MacFarlane
7920a1a469 Revert "EPUB writer: stylesheet changes. Closes #2040."
This reverts commit 1c2951dfd9.

See #2040.

The semantics was too squishy.  `--css` takes a URL, but
for EPUB we need files that we can read.  I prefer keeping
the old system for now, with `--epub-stylesheet`.
2015-05-09 00:07:27 -07:00
John MacFarlane
1c2951dfd9 EPUB writer: stylesheet changes. Closes #2040.
* Allow `--css` to be used to specify stylesheets.
* Deprecated `--epub-stylesheet` and made it a synonym of
  `--css`.
* If a code block with class "css" is given as contents of the
  `stylesheet` metadata field, use its literal code as contents of
  the epub stylesheet.  Otherwise, treat it as a filename and
  read the file.
* Note: `--css` and `stylesheet` in metadata are not compatible.
  `stylesheet` takes precedence.
2015-05-08 23:47:50 -07:00
John MacFarlane
472c1424ba Deal with deprecation warning in Custom. 2015-05-05 12:46:20 -07:00
John MacFarlane
d19a347fd5 UTF8: Better handling of bare CRs in input files.
Previously we just stripped them out; now we convert
other line ending styles to LF line endings.

Closes #2132.
2015-05-05 12:42:50 -07:00
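The conversion described in d19a347fd5 amounts to something like the following Python sketch, where `normalize_newlines` is a hypothetical name. CRLF pairs are replaced first so that they do not turn into two line breaks.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(repr(normalize_newlines("a\r\nb\rc\nd")))
```

Simply deleting every CR, as the earlier behavior is described, would join `b` and `c` in this input into one line.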
John MacFarlane
1b44acf0c5 SelfContained: properly handle data URIs in css urls.
Also use a proper css parser (adds dependency on css-text).

Closes #2129.
2015-05-04 16:00:28 -07:00
John MacFarlane
64b1394fe2 Make sure a closing </div> doesn't get included in a defn list item.
Closes #2127.
2015-05-03 15:06:40 -07:00
John MacFarlane
8a77eb4c9c LaTeX writer: Add a \label in \hyperdef for Div, Span.
Otherwise links don't work.
2015-05-02 17:58:16 -07:00
John MacFarlane
f1aaad9e86 EPUB writer: Use plain writer for metadata dc: fields.
This gives better results when we have, e.g. multiple paragraphs.
Note that tags aren't allowed in these fields.

Closes #2121.
2015-05-01 22:36:38 -07:00
John MacFarlane
9b2f645e2a SelfContained: cssURLs no longer tries to fetch fragment URLs.
The current test is: does the URL start with a `#`?
Closes #2121.
2015-05-01 22:15:43 -07:00
Alfred Wechselberger
7031748a43 Added woff2 to MIME types 2015-04-29 14:10:30 -07:00
John MacFarlane
55b7afc674 HTML reader: Allow multiple colgroups in table.
Closes #2122.
2015-04-29 12:05:38 -07:00
John MacFarlane
7b27cc6758 EPUB writer: Remove linear=no from cover itemref.
Closes #1609.
2015-04-26 15:43:58 -07:00
John MacFarlane
d9d88e58e1 Fixed regression with lists inside definition lists.
This fixes a regression (not in any released version) on
things like

    hi
    :   - there

Closes #2098.
2015-04-26 11:27:47 -07:00
John MacFarlane
2793d986dc Merge pull request #2112 from lierdakil/issue2101
Custom Writer: Set foreign encoding to UTF-8
2015-04-26 11:16:50 -07:00
John MacFarlane
1868cb5e42 Updated copyright notices to -2015. Closes #2111. 2015-04-26 10:18:29 -07:00
Nikolay Yakimov
a0ec3e85ad Custom Writer: Set foreign encoding to UTF-8
Closes #2101, #1634

Also factored out ByteString, since it's only used as an intermediate
representation.
2015-04-26 08:44:57 +03:00
John MacFarlane
e1d6be4e30 LaTeX reader: recognize \newpage as a block command. 2015-04-22 08:48:25 -07:00
John MacFarlane
2bca018201 Custom writer: use UTF-8 aware bytestring conversion.
This might help with #2101.
2015-04-21 22:50:58 -07:00
John MacFarlane
e9d7504bea Rewrote charsInBalancedBrackets.
This version should be a bit more efficient.

This doesn't help with #1735, however.
2015-04-19 17:04:33 -07:00
Nikolay Yakimov
e83968412e MD Reader: Fix links/footnotes after citations
Footnotes: check if '^' follows '['
Links: check if '[' or '(' follows ']'
Shorthand links: attempt to lazily parse suffix as referenceLink
2015-04-20 01:47:02 +03:00
John MacFarlane
1a69896d8f Revert "Merge pull request #1947 from mpickering/Fmonad"
Closes #2062.

This reverts commit c302bdcdbe, reversing
changes made to b983adf0d0.

Conflicts:
	src/Text/Pandoc/Parsing.hs
	src/Text/Pandoc/Readers/Markdown.hs
	src/Text/Pandoc/Readers/Org.hs
	src/Text/Pandoc/Readers/RST.hs
2015-04-18 19:00:32 -07:00
John MacFarlane
d20152e011 Markdown writer: improved escaping.
`<` should not be escaped as `\<`, for compatibility with
original Markdown.  We now escape `<` and `>` with entities.
Also, we now backslash-escape square brackets.

Closes #2086.
2015-04-18 10:58:50 -07:00
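The escaping rules named in d20152e011 could look roughly like this in Python. Only the four characters mentioned in the message are handled. The Markdown writer escapes other characters as well and is sensitive to context, so `escape_markdown` is a hypothetical, deliberately partial sketch.

```python
# Angle brackets become entities; square brackets get a backslash.
ESCAPES = {
    "<": "&lt;",
    ">": "&gt;",
    "[": "\\[",
    "]": "\\]",
}


def escape_markdown(text: str) -> str:
    """Escape the characters named in the commit message."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


print(escape_markdown("a < b [c]"))
```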
John MacFarlane
d3544dc6f7 Markdown definition lists: don't require indent for first line.
Previously the body of the definition (after the `:` or `~` marker)
needed to be in column 4.  This commit relaxes that requirement,
to better match the behavior of PHP Markdown Extra.  So, now
this is a valid definition list:

    foo
    : bar

This patch also helps resolve a potential ambiguity with table
captions:

    foo

      : bar

      -----
      table
      -----

Is "bar" a definition, or the caption for the table?  We'll count
it as a caption for the table.

Closes #2087.
2015-04-18 10:13:32 -07:00
John MacFarlane
10e28ef750 More principled fix for #1820.
If the tag parses as a comment, we check to see if the
input starts with `<!--`. If not, it's bogus comment mode
and we fail htmlTag.

Includes test case.  Closes #1820.
2015-04-17 22:56:33 -07:00
John MacFarlane
764f677530 Merge branch 'latex-tightlist' of https://github.com/jlduran/pandoc into jlduran-latex-tightlist
Conflicts:
	data/templates
2015-04-17 19:23:13 -07:00
John MacFarlane
28ca8566ab Merge pull request #1954 from mcmtroffaes/feature/citekey-firstchar-alphanum
Allow digit as first character of a citation key.
2015-04-17 19:10:37 -07:00
John MacFarlane
44fcc5f96e Merge pull request #2079 from lierdakil/rst-normalize-headings
RST Writer: Normalize headings to sequential levels
2015-04-17 19:06:25 -07:00
John MacFarlane
fb143be038 Merge pull request #2092 from lierdakil/issue1909
MD Reader: Smart apostrophe after inline math
2015-04-17 18:55:35 -07:00
John MacFarlane
13b230a1b5 Fixed htmlTag in HTML reader.
Require that `<!` or `<?` be followed by nonspace.
This prevents `</ div>` from being parsed as a comment.

Closes #1820.
2015-04-17 18:35:49 -07:00
Nikolay Yakimov
4229cf2d92 MD Reader: Smart ' after inline math
Closes #1909.

Adds new parser combinator to Parsing.hs

`a <+?> b`

:   if a succeeds, applies b and mappends its
    output (if any) to the result of a. If b fails,
    the result is just a; if a fails, the whole expression fails.
2015-04-18 01:23:41 +03:00
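The behavior of the `<+?>` combinator from 4229cf2d92 can be modeled with toy parsers in Python. Here a parser is a function from input text to `(output, rest)`, or `None` on failure, and string concatenation stands in for `mappend`. The names `literal` and `then_maybe` are hypothetical, and backtracking subtleties of the real Parsec-based combinator are not modeled.

```python
def literal(expected):
    """Parser that matches a fixed string."""
    def parse(text):
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None
    return parse


def then_maybe(a, b):
    """Model of `a <+?> b`: run a, then optionally append b's output."""
    def parse(text):
        first = a(text)
        if first is None:
            return None               # a fails: the whole expression fails
        out_a, rest = first
        second = b(rest)
        if second is None:
            return out_a, rest        # b fails: the result is just a
        out_b, rest_b = second
        return out_a + out_b, rest_b  # both succeed: outputs are combined
    return parse


math_then_apostrophe = then_maybe(literal("$x$"), literal("'"))
print(math_then_apostrophe("$x$'s"))
```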
Nikolay Yakimov
3f5d5a0a76 RST Writer: treat headings in block quotes, etc as rubrics 2015-04-16 12:12:00 +03:00
Nikolay Yakimov
2337ef68fc Docx Writer: Take TOC title from toc-title metadata field 2015-04-14 13:16:19 +03:00
Nikolay Yakimov
deb95d380e RST Writer: Normalize headings to sequential levels
This is pretty much required by docutils.
2015-04-13 20:45:40 +03:00
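One way to read "sequential levels" in deb95d380e is that a heading may sit at most one level below its enclosing heading. The Python sketch below renumbers a list of heading levels under that assumption, using a stack of the original levels that are still open. `normalize_levels` is a hypothetical helper and may not match the RST writer's exact behavior.

```python
def normalize_levels(levels):
    """Renumber heading levels so that no level is skipped."""
    stack = []    # original levels of the currently open sections
    result = []
    for level in levels:
        # Close every open section at the same or a deeper level.
        while stack and stack[-1] >= level:
            stack.pop()
        stack.append(level)
        result.append(len(stack))
    return result


print(normalize_levels([2, 4, 4, 7, 2]))
```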
John MacFarlane
5ae48b7eaf Fixed warning. 2015-04-12 22:06:44 -07:00
John MacFarlane
0439f6f964 Fixed toc depth in RST writer.
Previously the depth was being rendered as a floating point
number with a decimal point.  Thanks to Nick Yakimov for
noticing this.
2015-04-12 22:06:44 -07:00
John MacFarlane
fee04fbee0 Merge pull request #2072 from lierdakil/latex-reader-cleanup
LaTeX Reader: Code cleanup
2015-04-12 21:39:08 -07:00
John MacFarlane
a9628d0745 Text.Pandoc.PDF: more comprehensible errors on image conversion.
Closes #2067.
EPS can't be supported without shelling out to something like
ImageMagick, but at least we can avoid mysterious error messages.

We now get:

    pandoc: Unable to convert `circle.eps' for use with pdflatex.
    ! Package pdftex.def Error: File `circle-eps-converted-to.pdf' not found.

which seems more straightforward.
2015-04-12 21:18:21 -07:00
Nikolay Yakimov
b92d49092f LaTeX Reader: Code cleanup 2015-04-12 14:50:38 +03:00
Nikolay Yakimov
b2ba922638 ODT Writer: Figure captions
Works pretty much the same as Word writer.

Following styles are used for figures:

Figure -- for figure with empty caption
FigureWithCaption (based on Figure) -- for figure with caption
FigureCaption (based on Caption) -- for figure captions

Also, TableCaption (based on Caption) is used for table captions.

We need FigureWithCaption to set keepWithNext, in order to keep caption
with figure.
2015-04-12 00:34:03 +03:00
John MacFarlane
5a33032560 Removed redundant import. 2015-04-07 23:26:20 -07:00
John MacFarlane
250fbef94d DocBook reader: look inside "info" elements for section titles.
Closes #1931.
2015-04-07 22:15:20 -07:00
John MacFarlane
28497d484e RST writer: better handling of raw latex inline.
We use `` :raw-latex:`...` `` and add a definition for this
role to the template.

Closes #1961.
2015-04-07 22:07:38 -07:00
Julien Cretel
b28c846018 Markdown Reader: eliminate common subexpressions 2015-04-07 13:46:32 +01:00
John MacFarlane
b29a8a5516 EPUB writer: Take TOC title from toc-title metadata field. 2015-04-02 21:28:55 -07:00
John MacFarlane
8d39d03d05 Added "noProof" to docx syntax highlighting SourceCode style. 2015-04-01 15:21:55 -07:00
John MacFarlane
9a79538ac9 Merge pull request #2042 from lierdakil/issue1866
LaTeX Reader: check for block-level newcommand aliases in blockCommand
2015-03-31 09:44:15 -07:00
Nikolay Yakimov
f1eb1ab9cf Latex Reader: Block commands code cleanup 2015-03-31 14:32:42 +03:00
John MacFarlane
2b2f7fe15e Merge pull request #2035 from lierdakil/issue2031
Docx Writer/Reference: Add keepNext to objects w/ captions
2015-03-30 20:30:15 -07:00
John MacFarlane
ccb828894b Added CommonMark writer.
Added `Text.Pandoc.Writers.CommonMark`, exporting
`writeCommonMark`.
2015-03-29 23:42:42 -07:00
Nikolay Yakimov
6a0d500f99 Latex Reader: Guard against para starting with inline macro 2015-03-30 06:42:15 +03:00
Nikolay Yakimov
f3e8274d04 LaTeX Reader: check for block-level newcommand aliases in blockCommand 2015-03-30 05:37:00 +03:00
John MacFarlane
34c6ff1f60 Merge pull request #2038 from lierdakil/docx-hyphen-settings
Docx Writer: Copy hyphenation settings from reference.docx
2015-03-29 09:58:45 -07:00
John MacFarlane
91128aac99 Merge pull request #2037 from lierdakil/issue458
Docx Writer: support for --toc option
2015-03-29 09:47:49 -07:00
Matthew Pickering
f3aa03ee86 Docx Writer: Filter out illegal XML characters
Fixes #1992
2015-03-29 13:38:52 +01:00
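For f3aa03ee86, the kind of filtering involved can be sketched in Python using the `Char` production of XML 1.0. Whether the Docx writer uses exactly these ranges is an assumption, and the function names are hypothetical.

```python
def is_legal_xml_char(ch: str) -> bool:
    """True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def strip_illegal_xml(text: str) -> str:
    """Drop characters that cannot appear in an XML 1.0 document."""
    return "".join(ch for ch in text if is_legal_xml_char(ch))


print(repr(strip_illegal_xml("a\x00b\x0bc\td")))
```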
Nikolay Yakimov
4d1e85a09e Docx Writer: Place toc after abstract, rather than before 2015-03-29 13:46:34 +03:00