Commit graph

4089 commits

Author SHA1 Message Date
Jesse Rosenthal
effc348965 Docx reader: Handle Alt text and titles in images.
We use the "description" field as alt text and the "title" field as
title. These can be accessed through the "Format Picture" dialog in
Word.
2016-11-02 12:10:45 -04:00
Jesse Rosenthal
1138ae6656 Docx reader utils: handle empty namespace in elemName
Previously, if given an empty namespace:

    (elemName ns "" "foo")

`elemName` would output a QName with a `Just ""` namespace. This is
never what we want. Now we output a `Nothing`. If someone *does* want a
`Just ""` in the namespace, they can enter the QName value explicitly.
2016-11-02 12:10:45 -04:00
John MacFarlane
bdda4b185b HTML reader: treat <math> as MathML by default...
unless something else is explicitly specified in xmlns.
Provided it parses as MathML, of course.

Also fixed default which should be to inline math if no
display attribute is used.
2016-11-02 16:43:36 +01:00
John MacFarlane
705df61198 LaTeX reader: Handle BVerbatim from fancyvrb. Fixes #3203. 2016-11-02 12:05:56 +01:00
John MacFarlane
eb5cb0f304 Handle hungarumlaut in LaTeX reader. Closes #3201. 2016-11-01 10:17:15 +01:00
hubertp-lshift
01a21dd43f [odt] Infer tables' header props from rows (#3199)
ODT reader simply provided an empty header list
which meant that the contents of the whole table,
even if not empty, was simply ignored.
While we still do not infer headers we at least have
to provide default properties of columns.
2016-11-01 10:07:39 +01:00
John MacFarlane
5d02e478d0 LaTeX reader: allow for []s inside LaTeX optional args.
Fixes cases like:

    \begin{center}
    \begin{tikzpicture}[baseline={([yshift=+-.5ex]current bounding box.center)}, level distance=24pt]
    \Tree [.{S} [.NP John\index{i} ] [.VP [.V likes ] [.NP himself\index{i,*j} ]]]
    \end{tikzpicture}
    \end{center}
2016-10-31 22:04:22 +01:00
Albert Krewinkel
4f06e6c445
Org reader: support ATTR_HTML for special blocks
Special blocks (i.e. blocks with unrecognized names) can be prefixed
with an `ATTR_HTML` block attribute.  The attributes defined in that
meta-directive are added to the `Div` which is used to represent the
special block.

Closes: #3182
2016-10-30 20:23:53 +01:00
Albert Krewinkel
63bdc5d08f
Org reader: support the todo export option
The `todo` export option allows to toggle the inclusion of TODO keywords
in the output.  Setting this to `nil` causes TODO keywords to be dropped
from headlines.  The default is to include the keywords.
2016-10-30 13:20:25 +01:00
Albert Krewinkel
d5182778c4
Org reader: add support for todo-markers
Headlines can have optional todo-markers which can be controlled via the
`#+TODO`, `#+SEQ_TODO`, or `#+TYP_TODO` meta directive.  Multiple such
directives can be given, each adding a new set of recognized todo-markers.
If no custom todo-markers are defined, the default `TODO` and `DONE`
markers are used.

Todo-markers are conceptually separate from headline text and are hence
excluded when autogenerating headline IDs.

The markers are rendered as spans and labelled with two classes: One
class is the markers name, the other signals the todo-state of the
marker (either `todo` or `done`).
2016-10-30 10:27:47 +01:00
Daniele D'Orazio
78e4fbda51 Markdown Reader: add attributes for autolink (#3183) 2016-10-26 12:18:58 +02:00
John MacFarlane
c46ad7c8db reveal.js: don't change slide title to level 1 header.
This also affects other HTML slide show formats.

Closes #2221.
2016-10-25 17:07:51 +02:00
John MacFarlane
d51e475bad Export Text.Pandoc.Error in Text.Pandoc.
[API change]
2016-10-24 22:31:36 +02:00
John MacFarlane
bf72a482eb Tighten up parsing of raw email addresses.
Technically `**@user` is a valid email address, but if we
allow things like this, we get bad results in markdown flavors
that autolink raw email addresses. (See #2940.)
So we exclude a few valid email addresses in order to
avoid these more common bad cases.

Closes #2940.
2016-10-23 23:12:36 +02:00
Thomas Weißschuh
96fa1daa6e fix example in documentation (#3176)
Errors are encountered while reading, not writing
2016-10-23 21:48:22 +02:00
Mauro Bieg
9dd18ebf34 ICML writer: replace partial function (!!) in table handling (#3175) 2016-10-23 21:42:33 +02:00
John MacFarlane
696dfbc993 Added angle_brackets_escapable extension.
This is needed because github flavored Markdown has a slightly
different set of escapable symbols than original Markdown;
it includes angle brackets.

Closes #2846.
2016-10-22 23:41:55 +02:00
John MacFarlane
d2a6533d6e EPUB reader: don't add root path to data: URIs.
Closes #3150.

Thanks to @lep for the bug report and patch.
2016-10-22 23:07:15 +02:00
John MacFarlane
1da40d63b1 Merge pull request #3108 from tarleb/part
Add command line option allowing to set type of top-level divisions
2016-10-19 14:07:44 +02:00
Albert Krewinkel
595a171407
Add option for top-level division type
The `--chapters` option is replaced with `--top-level-division` which allows
users to specify the type as which top-level headers should be output. Possible
values are `section` (the default), `chapter`, or `part`.

The formats LaTeX, ConTeXt, and Docbook allow `part` as top-level division, TEI
only allows to set the `type` attribute on `div` containers.  The writers are
altered to respect this option in a sensible way.
2016-10-19 13:12:57 +02:00
Hubert Plociniczak
dd799df0ae Image with a caption needs special formatting
Latex Writer only handles captions if the image's title
is prefixed with 'fig:'.
2016-10-19 11:40:44 +02:00
John MacFarlane
4a2a7a21e5 Merge pull request #3166 from hubertp-lshift/bug/3134
Issue 3143: Don't duplicate text for anchors
2016-10-18 22:03:45 +02:00
John MacFarlane
0cd11b3e54 Merge pull request #3165 from hubertp-lshift/feature/odt-image
[odt] images parser
2016-10-18 22:00:58 +02:00
John MacFarlane
8264ae2abe Better fix for the problem with ghc 7.8. 2016-10-18 16:25:13 +02:00
John MacFarlane
3747abf029 Try to fix build error on ghc 7.8.
@tarleb this is an interesting one, see the build log in
https://travis-ci.org/jgm/pandoc/jobs/168612017

It only failed on ghc 7.8; I think this must have to do with
the change making Monad a superclass of Applicative, hence
this change.
2016-10-18 16:03:34 +02:00
Hubert Plociniczak
c74c5fdd97 Issue 3143: Don't duplicate text for anchors
When creating an anchor element we were adding its representation
as well as the original content, leading to text duplication.
2016-10-18 10:50:37 +02:00
Albert Krewinkel
1266b210ac
Org writer: drop space before footnote markers
The writer no longer adds an extra space before footnote markers.

Fixes: #3162
2016-10-17 22:11:03 +02:00
Hubert Plociniczak
4417e33ea9 Use bind function instead of pattern matching 2016-10-17 16:58:53 +02:00
Hubert Plociniczak
7234321e8f Minor refactoring 2016-10-17 16:50:03 +02:00
Hubert Plociniczak
a02f276ff1 Infer caption from the text following the img
Frame can contain other frames with the text boxes.
This is something that has not been considered before
and meant that the whole construction of images was
broken in those cases. Also the captions were fixed/ignored.
2016-10-17 16:35:13 +02:00
Jesse Rosenthal
e666c92bc9 RST reader: skip whitespace before note.
RST requires a space before a footnote marker. We discard those spaces
so that footnotes will be adjacent to the text that comes before
it. This is in line with what rst2latex does. rst2html does not discard
the space, but its html output is different than pandoc's, so this seems
the most semantically correct approach.

Closes #3163
2016-10-17 09:54:59 -04:00
Albert Krewinkel
462c140eb6
Org reader: allow figure with empty caption
A `#+CAPTION` attribute before an image is enough to turn an image into a
figure. This wasn't the case because the `parseFromString` function, which
processes the caption value, would fail on empty values. Adding a newline
character to the caption value fixes this.

Fixes: #3161
2016-10-14 23:16:51 +02:00
John MacFarlane
9bd1da122a Merge pull request #3146 from hubertp-lshift/feature/odt-list-start-value
[ODT Parser] Include list's starting value
2016-10-14 15:15:50 +02:00
Hubert Plociniczak
9282fadc6b Added tests and a corner case for starting number
Review revealed that we didn't handle the case
when the starting point is an empty string. While
this is not a valid .odt file, we simply added
a special case to deal with it.

Also added tests for the new feature.
2016-10-14 13:56:24 +02:00
Jesse Rosenthal
cd1427876e Markdown writer: Abstract out note/ref function.
We do basically the same thing every time we insert notes, so let's cut
down on code duplication.
2016-10-13 11:04:35 -04:00
John MacFarlane
6d13567ac5 Allow http-client 0.4.30, which is the version in stackage lts.
Previously we required 0.5.
Remove CPP conditionals for earlier versions.
2016-10-13 13:01:49 +02:00
John MacFarlane
4a1ef0b51d Revert "Remove http-client CPP conditionals."
This reverts commit 3f82471355.

We might want to revert the requirement of http-client 0.5,
as this is not yet in Stackage and that is starting to
cause problems.  I can't recall why it is there.
2016-10-13 12:35:58 +02:00
Albert Krewinkel
3e60ed9c03
Allow empty lines when parsing line blocks
Line blocks are allowed to contain empty lines and should be parsed as a
single block in that case.  Previously an empty (line block) line would
have terminated parsing of the line block element.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
c9460e7013
Parse line-oriented markup as LineBlock
Markup-features focusing on lines as distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_block` extension is enabled), the
`linegroup`/`line` combination from the Docbook 5.1 working draft, and Org-mode
`VERSE` blocks.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
22cb9e3327
Add support for the LineBlock element to writers
The following markup features are used to output the lines of the `LineBlock`
element:

  - AsciiDoc: a `[verse]` block,
  - ConTeXt: text surrounded by `\startlines` and `\endlines`,
  - HTML: `div` with an per-element style setting to interpret the content as
    pre-wrapped,
  - Markdown: line blocks if the `line_blocks` extension is enabled, a simple
    paragraph with hard linebreaks otherwise,
  - Org: VERSE block,
  - RST: a line block, and
  - all other formats: a paragraph, containing hard linebreaks between lines.

Custom lua writers should be updated to use the `LineBlock` element.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
64b77cc2c5
Shared: add function combining lines using LineBreak
The `linesToBlock` function takes a list of lines and combines them by appending
a hard `LineBreak` to each line and concatenating the result, putting the result
it into a `Para`. This is most useful when dealing when converting `LineBlock`
elements.
2016-10-13 08:46:38 +02:00
Hubert Plociniczak
edc951ee7d [ODT Parser] Include list's starting value
Previously the starting value of the lists' items has been
hardcoded to 1. In reality ODT's list style definition can
provide a new starting value in one of its attributes.

Writers already handle the modified start value so no need
to change anything in that area.
2016-10-12 18:36:40 +02:00
Hubert Plociniczak
c924611de5 Basic support for images in ODT documents
Highly influenced by the docx support, refactored
some code to avoid DRY.
2016-10-12 17:50:35 +02:00
John MacFarlane
901045b0bb Merge pull request #3159 from jkr/refs
Specify location for footnotes (and reference links) in MD output
2016-10-12 11:11:06 +02:00
Jesse Rosenthal
ca50deeeee Markdown writer: Allow footnotes/refs at the end of blocks, sections
This allows footnotes and refs to be placed at the end of blocks and
sections. Note that we only place them at the end of blocks that are at
the top level and before headers that are the top level. We add an
environment variable to keep track of this. Because we clear the
footnotes and refs when we use them, we also add a state variable to
keep track of the starting number.

Finally, note that we still add any remaining footnotes at the end. This
takes care of the final section, if we are placing at the end of a
section, and will always come after a final block as well.
2016-10-11 15:17:01 -04:00
Jesse Rosenthal
6914808139 Add ReaderT monad for environment variables.
This will make it easier to keep track of what level of block we are at.
2016-10-11 13:53:47 -04:00
Jesse Rosenthal
27113bda1f Options: Add references location.
This will be used by the markdown writer for deciding where to put links
and footnotes.
2016-10-11 13:30:01 -04:00
Albert Krewinkel
bb48f2edc4
Org reader: trim verse lines properly
An empty verse line should not result in `Str ""` but in `mempty`.
2016-10-10 21:12:47 +02:00
John MacFarlane
f4b7ab125e More checks for Ext_raw_html when rendering HTML in Markdown.
Previously we'd emit raw HTML tables even if the `raw_html`
extension was disabled.

Now we just emit `[TABLE]` if no table formats are enabled
and raw HTML is not enabled.

We also check for the `raw_html` extension before emiting
a raw HTML block.

Closes #3154.
2016-10-10 11:23:04 +02:00
KolenCheung
46be319ca9 removed mmd raw_tex in src/Text/Pandoc/Options.hs 2016-10-09 21:30:03 +02:00