Commit graph

4237 commits

Author SHA1 Message Date
John MacFarlane
33b4bc8371 Pretty: Added afterBreak.
This makes it possible to insert escape codes for content
that needs escaping at the beginning of a line.
2016-12-05 00:49:53 +01:00
Albert Krewinkel
bfa734c402 LaTeX writer: Fix unnumbered headers when used with --top-level
Fix interaction of top-level divisions `part` or `chapter` with
unnumbered headers when emitting LaTeX.  Headers are ensured to be
written using starred commands (like `\subsection*{}`).

Fixes: #3272
2016-12-04 21:15:52 +01:00
John MacFarlane
7ace7dd66b Markdown writer: Fixed incorrect word wrapping.
Previously pandoc would sometimes wrap lines too early
due to this bug.

Closes #3277.
2016-12-04 17:13:06 +01:00
John MacFarlane
fb8a2540bd Options: Removed writerStandalone, made writerTemplate a Maybe.
Previously setting writerStandalone = True did nothing unless
a template was provided in writerTemplate.  Now a fragment
will be generated if writerTemplate is Nothing; otherwise,
the specified template will be used and standalone output
generated.  [API change]
2016-11-30 15:34:58 +01:00
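A minimal sketch of the behaviour described in fb8a2540bd, assuming a cut-down `WriterOptions` with only the `writerTemplate` field. The `render` function and the toy `$body$` substitution are hypothetical illustrations, not pandoc's template engine.

```haskell
-- Simplified stand-in for pandoc's WriterOptions (assumption: only the
-- field discussed in the commit is modelled here).
data WriterOptions = WriterOptions
  { writerTemplate :: Maybe String  -- Nothing means "emit a fragment"
  }

-- Hypothetical renderer: returns the body unchanged when no template is
-- set, otherwise replaces each "$body$" placeholder in the template.
render :: WriterOptions -> String -> String
render opts body =
  case writerTemplate opts of
    Nothing   -> body
    Just tmpl -> substitute tmpl
  where
    substitute [] = []
    substitute s@(c:cs)
      | take 6 s == "$body$" = body ++ substitute (drop 6 s)
      | otherwise            = c : substitute cs
```

With this shape there is no separate standalone flag to keep in sync: the presence of a template is the only switch.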
John MacFarlane
ac83d4b806 Use new module from texmath to lookup MS font codepoints.
+ Removed Text.Pandoc.Readers.Docx.Fonts
+ Moved its code to texmath; we now use (from texmath 0.9)
  Text.TeXMath.Unicode.Fonts
+ Use texmath 0.9 (currently from git).
+ Updated epub tests because texmath now handles more mathml.
2016-11-30 00:43:55 +01:00
John MacFarlane
e2a452ba4a Shared.fetchItem: Better handling of protocol-relative URL.
If URL starts with `//` and there is no "base URL" (as there
would be if a URL were used on the command line), then default
to http:.

Closes #2635.
2016-11-27 21:19:26 +01:00
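The rule in e2a452ba4a could be sketched as follows. `resolveProtocolRelative` is a hypothetical helper, and passing the base URL's scheme as a `Maybe String` is an assumption made for illustration; the real `fetchItem` works with a full base URL.

```haskell
import Data.List (isPrefixOf)
import Data.Maybe (fromMaybe)

-- Hypothetical helper, not pandoc's actual code.
-- The first argument is the scheme of the base URL, if any (e.g. "https:").
-- URLs that do not start with "//" are returned unchanged.
resolveProtocolRelative :: Maybe String -> String -> String
resolveProtocolRelative baseScheme url
  | "//" `isPrefixOf` url = fromMaybe "http:" baseScheme ++ url
  | otherwise             = url
```

For example, `resolveProtocolRelative Nothing "//example.com/a.png"` falls back to the `http:` scheme because no base is available.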
John MacFarlane
ea916432ac Updated renderHtml import in HTML writer to avoid deprecated function.
2016-11-27 21:18:58 +01:00
Albert Krewinkel
1fc07ff4da Refactor top-level division selection (#3261)
The "default" option is no longer represented as `Nothing` but via a new
type constructor, making the `Maybe` wrapper superfluous.

The default behavior of using heuristics can now be enabled explicitly
by setting `--top-level-division=default`.

API change (`Text.Pandoc.Options`): The `Division` type was renamed to
`TopLevelDivision`. The `Section`, `Chapter`, and `Part` constructors
were renamed to `TopLevelSection`, `TopLevelChapter`, and
`TopLevelPart`, respectively. An additional `TopLevelDefault`
constructor was added, which is now also the new default value of the
`writerTopLevelDivision` field in `WriterOptions`.
2016-11-27 20:31:04 +01:00
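The type described in 1fc07ff4da might look roughly like this. The constructor names come from the commit message; `parseTopLevelDivision` and the accepted strings other than `default` are assumptions for illustration.

```haskell
-- Constructors as named in the commit message.
data TopLevelDivision
  = TopLevelPart
  | TopLevelChapter
  | TopLevelSection
  | TopLevelDefault   -- use heuristics; the new default value
  deriving (Show, Eq)

-- Hypothetical parser for the --top-level-division argument.
-- Unknown values yield Nothing so the caller can report an error.
parseTopLevelDivision :: String -> Maybe TopLevelDivision
parseTopLevelDivision s =
  case s of
    "part"    -> Just TopLevelPart
    "chapter" -> Just TopLevelChapter
    "section" -> Just TopLevelSection
    "default" -> Just TopLevelDefault
    _         -> Nothing
```

Making the default an explicit constructor means every choice is a value of the same type, so no `Maybe` wrapper is needed in `WriterOptions`.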
John MacFarlane
5222572033 HTML reader: improved table parsing.
We now check explicitly for non-1 rowspan or colspan
attributes, and fail when we encounter them. Previously
we checked that each row had the same number of cells,
but that could be true even with rowspans/colspans.
And there are cases where it isn't true in tables that
we can handle fine -- e.g. when a tr element is empty.
So now we just pad rows with empty cells when needed.

Closes #3027.
2016-11-26 22:28:28 +01:00
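The row padding mentioned in 5222572033 can be sketched like this. Cells are modelled as plain strings and `padRows` is a hypothetical name; the real reader works on pandoc's block types.

```haskell
-- Pad every row with empty cells so that all rows are as long as the
-- longest one. An empty string stands in for an empty cell (assumption).
padRows :: [[String]] -> [[String]]
padRows rows = map pad rows
  where
    width   = maximum (0 : map length rows)  -- 0 guards against no rows
    pad row = row ++ replicate (width - length row) ""
```

An empty `tr` element then simply becomes a row of empty cells instead of causing the whole table to be rejected.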
John MacFarlane
7b4a12a532 Revert "Open Document writer: set first level of blockquotes to not use indent (#2757)"
This reverts commit fee0b913c5.

The previous commit did not provide a good way to get increased
indentation for nested block quotes.

Rolling it back for now. @jjsheets feel free to submit something
that handles multiple levels of block quote smoothly, if you like.
2016-11-26 22:03:57 +01:00
Jeff Sheets
fee0b913c5 Open Document writer: set first level of blockquotes to not use indent (#2757)
* Open Document writer: set first level of blockquotes to not use indent
Nested blockquotes start using indents like before. Quotation style is
still in use, so the style's indent settings take effect on the first
level of blockquotes.

* Removed list construction to improve pull request to fix #2747
2016-11-26 21:50:20 +01:00
hubertp-lshift
5219599a77 [Tex] Remove invalid inlines in sections (#3218)
LaTeX doesn't like it when hypertargets or images are
put in the options list of the section. They are not
lost since they were actually duplicated and present
also in the second argument list.

Note on the implementation:
I had to inline the definition of 'foldMap' since it is
not implemented in every version of Haskell that Pandoc
supports.
2016-11-26 21:47:51 +01:00
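For lists, inlining `foldMap` as mentioned in 5219599a77 amounts to the following. This is a generic sketch with a hypothetical name, not the code from the commit.

```haskell
-- List-specialised equivalent of foldMap: map each element to a monoid
-- value and combine the results.
foldMapList :: Monoid m => (a -> m) -> [a] -> m
foldMapList f = mconcat . map f
```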
hubertp-lshift
015dead0bb [odt] Infer table's caption from the paragraph (#3224)
ODT's reader always put empty captions for the parsed
tables. This commit
1) checks paragraphs that follow the table definition
2) treats specially a paragraph with a style named 'Table'
3) does some postprocessing of the paragraphs that combines
 tables followed immediately by captions

The ODT writer used 'TableCaption' style name for the caption
paragraph. This commit follows the OpenOffice approach, which
allows appending captions to tables but uses a built-in style
named 'Table' instead of 'TableCaption'. Any users of odt format
(both writer and reader) are therefore required to change the
style's name to 'Table', if necessary.
2016-11-26 21:45:56 +01:00
Albert Krewinkel
baa25362a4 Allow to overwrite top-level division type heuristics (#3258)
Pandoc uses heuristics to determine the most reasonable top-level
division type when emitting LaTeX or Docbook markup.  It is now possible
to overwrite this implicitly set top-level division via the
`top-level-division` command line parameter.

API change (`Text.Pandoc.Options`): the type of the
`writerTopLevelDivision` field of the `WriterOptions` data type is
altered from `Division` to `Maybe Division`. The field's default value
is changed from `Section` to `Nothing`.

Closes: #3197
2016-11-26 21:43:46 +01:00
John MacFarlane
2873cd8288 LaTeX reader: don't treat \vspace and \hspace as block commands.
Fixed an error which came up, for example, with `\vspace`
inside a caption.  (Captions expect inlines.)

Closes #3256.
2016-11-26 21:27:56 +01:00
Albert Krewinkel
f4a8f12387 Org reader: respect column width settings
Table column properties can optionally specify a column's width with
which it is displayed in the buffer. Some exporters, notably the ODT
exporter in org-mode v9.0, use these values to calculate relative column
widths. The org reader now implements the same behavior.

Note that the org-mode LaTeX and HTML exporters in Emacs don't support
this feature yet, which should be kept in mind by users who use the
column widths parameters.

Closes: #3246
2016-11-24 20:07:39 +01:00
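One way to turn the column width settings from f4a8f12387 into relative widths is sketched below. `relativeWidths` is a hypothetical function, and falling back to zero widths when any column lacks a setting is an assumption, not necessarily what the Org reader does.

```haskell
-- Each column carries an optional width taken from its column property.
-- If every column has a width and the total is positive, return each
-- width as a fraction of the total; otherwise return 0 for all columns
-- (assumed fallback, meaning "no explicit width").
relativeWidths :: [Maybe Int] -> [Double]
relativeWidths specs =
  case sequence specs of
    Just ws
      | sum ws > 0 -> [fromIntegral w / fromIntegral (sum ws) | w <- ws]
    _ -> map (const 0) specs
```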
John MacFarlane
d7fb9db295 LaTeX writer: use \autocites* when "suppress-author" citation used.
2016-11-24 10:12:13 +01:00
John MacFarlane
03788eb164 Fixed some bugs in Pretty that caused blank lines in tables.
The bugs caused spurious blank lines in grid tables
when we had things like

    blankline $$ blankline

Closes #3251.
2016-11-23 15:18:03 +01:00
John MacFarlane
5449e4a226 Docx writer: Give full detail when there are errors converting tex math.
2016-11-22 22:21:07 +01:00
John MacFarlane
77912ddc56 Put 'warn' in MonadIO. Add warnings for math conversions in docx.
2016-11-22 10:56:59 +01:00
John MacFarlane
8d7ecc27a1 Allow beamer-style <...> options in raw LaTeX (also in Markdown).
This allows use of things like `\only<2,3>{my content}` in
Markdown that is going to be converted to beamer.

Closes #3184.
2016-11-20 21:17:41 +01:00
John MacFarlane
bd19176026 LaTeX writer: ensure that simple tables have simple cells.
If cells contain more than a single Plain or Para, then
we need to set nonzero widths and put contents into minipages.

Closes #2666.
2016-11-20 17:01:51 +01:00
Björn Peemöller
2761fecd57 Fix for calculation of column widths for aligned multiline tables
This also fixes excessive CPU and memory usage for tables
when --columns is set in such a way that cells must be very
tiny.

Now cells are guaranteed to be big enough so that single
words don't need to line break, even if this pushes the
line length above the column width.

Closes #1911.
2016-11-19 23:14:35 +01:00
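The guarantee described in 2761fecd57 can be illustrated with a small sketch. Both function names are hypothetical, and width is counted in characters here, which ignores wide or combining characters.

```haskell
-- Smallest width at which no single word has to be broken:
-- the length of the longest word (0 for empty content).
minCellWidth :: String -> Int
minCellWidth = maximum . (0 :) . map length . words

-- Never let a requested column width drop below that minimum, even if
-- the resulting line ends up longer than the requested total.
clampWidth :: Int -> String -> Int
clampWidth requested contents = max requested (minCellWidth contents)
```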
Björn Peemöller
18f5d25abe Added function to compute the minimal width of a document
2016-11-19 23:04:29 +01:00
Björn Peemöller
6246d6e213 Added error message for illegal call to Pretty.block
2016-11-19 23:03:53 +01:00
John MacFarlane
d905551b12 LaTeX reader: improved table handling.
We can now parse all of the tables emitted by pandoc in
our tests.

The only thing we don't get yet are alignments and
column widths in more complex tables.

See #2669.
2016-11-19 22:46:32 +01:00
John MacFarlane
f255625ad7 LaTeX reader: limited support for minipage.
2016-11-19 22:46:32 +01:00
Albert Krewinkel
64413b1ce2 Un-break Travis build
Remove whitespace before function documentation. The extra spaces cause
problems with documentation tools, and Travis tests are failing because
of this.
2016-11-19 22:30:02 +01:00
John MacFarlane
5a1796e650 LaTeX reader: improved parsing of tables.
Reader can now parse simple LaTeX tables such as those
generated by pandoc itself.

We still can't handle pandoc multiline tables which involve
minipages and column widths.

Partially addresses #2669.
2016-11-19 21:36:16 +01:00
John MacFarlane
e4798a6726 Fixed xref lookup in DocBook reader. Closes #3243.
It previously only worked when the qnames lacked the docbook
namespace URI.
2016-11-19 10:30:20 +01:00
Albert Krewinkel
1a8af5fc44 Org reader: Ensure images in paragraphs are not parsed as figures
This fixes a regression introduced in 7e5220b57c.
2016-11-19 01:17:04 +01:00
John MacFarlane
f9df62c29f Export Text.Pandoc.getDefaultExtensions.
See #3178.
2016-11-18 17:01:09 +01:00
John MacFarlane
a729dd8ad3 Docx writer: fixed XML markup for empty cells.
Closes #3238.

Previously the Compact style wasn't being applied properly
to empty cells.
2016-11-18 16:47:23 +01:00
John MacFarlane
31076adf09 Markdown writer: Use bracketed form for native spans...
...when `bracketed_spans` enabled.

Closes #3229.
2016-11-18 11:58:56 +01:00
ickc
e8ce21d614 Small caps in Bracketed Spans (#3191)
* Markdown reader: modify bracketedSpan to check small caps

* MANUAL.txt: add description on the use of `bracketed_spans` in small cap

* Improve markdown readers: bracketedSpan function EXACTLY as spanHtml
2016-11-16 11:53:51 +01:00
John MacFarlane
0dfcedad7e Adjust widths in Markdown grid tables so that they match on round-trip.
2016-11-15 16:48:24 +01:00
John MacFarlane
298e6f38f9 Allow alignments to be specified in Markdown grid tables.
2016-11-15 16:41:54 +01:00
John MacFarlane
064e3f8c55 Markdown writer: fixed inconsistent spacing issue.
Previously a tight bullet sublist got rendered with
a blank line after, while a tight ordered sublist did
not.  Now we don't get the blank line in either case.

Closes #3232.
2016-11-15 10:32:16 +01:00
John MacFarlane
50f0cfcc1a HTML reader: only treat "a" element as link if it has href.
Otherwise treat as span.

Closes #3226.
2016-11-13 22:41:11 +01:00
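The decision made in 50f0cfcc1a can be sketched with a toy inline type. `Inline`, `Link`, `Span`, and `anchorToInline` below are simplified, hypothetical stand-ins and not pandoc's AST.

```haskell
-- Toy inline type: a link carries its text and target, a span only text.
data Inline
  = Link String String
  | Span String
  deriving (Show, Eq)

-- Treat an "a" element as a link only if it has an href attribute.
-- Attributes are modelled as a simple association list (assumption).
anchorToInline :: [(String, String)] -> String -> Inline
anchorToInline attrs txt =
  case lookup "href" attrs of
    Just url -> Link txt url
    Nothing  -> Span txt
```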
John MacFarlane
3de6b97b9f Use correct mime types for woff and woff2.
Closes #3228.
2016-11-12 23:22:34 +01:00
John MacFarlane
b6a916974e Markdown writer: Fix escaping of spaces in super/subscript.
Previously two backslashes were inserted, which gave a
literal backslash.

Closes #3225.
2016-11-12 23:19:15 +01:00
Jesse Rosenthal
eea4d14f60 Docx reader: add a placeholder value for CHART.
We wrap `[CHART]` in a `<span class="chart">`. Note that it maps to
inlines because, in docx, anything in a drawing tag can be part of a
larger paragraph.
2016-11-10 13:19:27 -05:00
Jesse Rosenthal
7539de0287 Docx reader: Be more specific in parsing images
We don't want to match only "w:drawing", because that could also include
charts. Now we specify "w:drawing"//"pic:pic". This shouldn't change
behavior at all, but it's a first step toward allowing other sorts of
drawing data as well.
2016-11-10 13:19:27 -05:00
Albert Krewinkel
7e5220b57c Org reader: allow HTML attribs on non-figure images
Images which are the only element in a paragraph can still be given HTML
attributes, even if the image does not have a caption and is hence not a figure.
The following will set the `width` attribute of the image to `50%`:

    #+ATTR_HTML: :width 50%
    [[file:image.jpg]]

Closes: #3222
2016-11-09 22:49:20 +01:00
Hubert Plociniczak
13bc573e7f Inline code when text has a special style
When a piece of text has the style 'Source_Text', then
we assume that this is a piece of the document
that represents code that needs to be inlined.
Adapted the ODT writer to also reflect that change;
previously it was just writing a 'preformatted' text using
a non-distinguishable font style.

Code blocks are still not recognized by the ODT reader.
That's a separate issue.
2016-11-08 09:29:46 -05:00
John MacFarlane
eced02d70e Markdown reader: Allow reference link labels starting with @...
...if citations extension disabled.  Example:  in

    [link text][@a]

    [@a]: url

`link text` isn't hyperlinked because `[@a]` is parsed as a citation.
Previously this happened whether or not the `citations` extension was
enabled. Now it happens only if the `citations` extension is enabled.

Closes #3209.
2016-11-05 21:14:20 +01:00
Jesse Rosenthal
4a99e142ec Docx Reader: abstract out function to avoid code repetition.
2016-11-02 12:28:56 -04:00
Jesse Rosenthal
378603c770 Docx writer: Handle title text in images.
We already handled alt text. This just puts the image "title" into the
docx "title" attr.
2016-11-02 12:10:45 -04:00
Jesse Rosenthal
effc348965 Docx reader: Handle Alt text and titles in images.
We use the "description" field as alt text and the "title" field as
title. These can be accessed through the "Format Picture" dialog in
Word.
2016-11-02 12:10:45 -04:00
Jesse Rosenthal
1138ae6656 Docx reader utils: handle empty namespace in elemName
Previously, if given an empty namespace:

    (elemName ns "" "foo")

`elemName` would output a QName with a `Just ""` namespace. This is
never what we want. Now we output a `Nothing`. If someone *does* want a
`Just ""` in the namespace, they can enter the QName value explicitly.
2016-11-02 12:10:45 -04:00
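A sketch of the behaviour described in 1138ae6656, using a local `QName` record modelled on the one in the `xml` package. The `NameSpaces` alias and the choice to store the cleaned-up value in the prefix field are assumptions; the commit message only says that `Just ""` is replaced by `Nothing`.

```haskell
-- Local stand-in for Text.XML.Light's QName (assumption: same field layout).
data QName = QName
  { qName   :: String
  , qURI    :: Maybe String
  , qPrefix :: Maybe String
  } deriving (Show, Eq)

-- Mapping from prefix to namespace URI (hypothetical alias).
type NameSpaces = [(String, String)]

-- Build a qualified name; an empty prefix becomes Nothing, never Just "".
elemName :: NameSpaces -> String -> String -> QName
elemName ns prefix name =
  QName name (lookup prefix ns) (if null prefix then Nothing else Just prefix)
```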