Commit graph

1927 commits

Author SHA1 Message Date
Albert Krewinkel
d5182778c4
Org reader: add support for todo-markers
Headlines can have optional todo-markers which can be controlled via the
`#+TODO`, `#+SEQ_TODO`, or `#+TYP_TODO` meta directive.  Multiple such
directives can be given, each adding a new set of recognized todo-markers.
If no custom todo-markers are defined, the default `TODO` and `DONE`
markers are used.

Todo-markers are conceptually separate from headline text and are hence
excluded when autogenerating headline IDs.

The markers are rendered as spans and labelled with two classes: one
class is the marker's name, the other signals the todo-state of the
marker (either `todo` or `done`).
2016-10-30 10:27:47 +01:00
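The todo-marker handling described in this commit can be sketched roughly as follows. This is an illustrative Python sketch, not pandoc's Haskell implementation: the function names are hypothetical, and the rule that a directive without a `|` treats its last keyword as the done state is an assumption borrowed from Org-mode convention.

```python
import re

# Defaults used when no custom directive is present (per the commit message).
DEFAULT_MARKERS = {"TODO": "todo", "DONE": "done"}
DIRECTIVE = re.compile(r"^#\+(?:TODO|SEQ_TODO|TYP_TODO):\s*(.*)$")


def parse_directive(line):
    """Return {marker: state} for one directive line, or {} if it is not one."""
    match = DIRECTIVE.match(line.strip())
    if not match:
        return {}
    body = match.group(1)
    if "|" in body:
        todo_part, done_part = body.split("|", 1)
        todo, done = todo_part.split(), done_part.split()
    else:
        # Assumption: without a bar, the last keyword is the done state.
        words = body.split()
        todo, done = words[:-1], words[-1:]
    markers = {name: "todo" for name in todo}
    markers.update({name: "done" for name in done})
    return markers


def collect_markers(lines):
    """Merge all directives; fall back to the defaults if none are given."""
    markers = {}
    for line in lines:
        markers.update(parse_directive(line))
    return markers or dict(DEFAULT_MARKERS)


def split_headline(headline, markers):
    """Split '* TODO text' into (marker, state, text without the marker)."""
    _stars, _, rest = headline.partition(" ")
    first, _, text = rest.partition(" ")
    if first in markers:
        return first, markers[first], text.strip()
    return None, None, rest.strip()
```

The text returned by `split_headline` excludes the marker, which mirrors the point that markers are kept out of autogenerated headline IDs.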
Daniele D'Orazio
78e4fbda51 Markdown Reader: add attributes for autolink (#3183) 2016-10-26 12:18:58 +02:00
John MacFarlane
d51e475bad Export Text.Pandoc.Error in Text.Pandoc.
[API change]
2016-10-24 22:31:36 +02:00
John MacFarlane
696dfbc993 Added angle_brackets_escapable extension.
This is needed because GitHub-flavored Markdown has a slightly
different set of escapable symbols than original Markdown;
it includes angle brackets.

Closes #2846.
2016-10-22 23:41:55 +02:00
John MacFarlane
d2a6533d6e EPUB reader: don't add root path to data: URIs.
Closes #3150.

Thanks to @lep for the bug report and patch.
2016-10-22 23:07:15 +02:00
Hubert Plociniczak
dd799df0ae Image with a caption needs special formatting
The LaTeX writer only handles captions if the image's title
is prefixed with 'fig:'.
2016-10-19 11:40:44 +02:00
John MacFarlane
4a2a7a21e5 Merge pull request #3166 from hubertp-lshift/bug/3134
Issue 3143: Don't duplicate text for anchors
2016-10-18 22:03:45 +02:00
John MacFarlane
0cd11b3e54 Merge pull request #3165 from hubertp-lshift/feature/odt-image
[odt] images parser
2016-10-18 22:00:58 +02:00
John MacFarlane
8264ae2abe Better fix for the problem with ghc 7.8. 2016-10-18 16:25:13 +02:00
John MacFarlane
3747abf029 Try to fix build error on ghc 7.8.
@tarleb this is an interesting one, see the build log in
https://travis-ci.org/jgm/pandoc/jobs/168612017

It only failed on ghc 7.8; I think this must have to do with
the change making Applicative a superclass of Monad, hence
this change.
2016-10-18 16:03:34 +02:00
Hubert Plociniczak
c74c5fdd97 Issue 3143: Don't duplicate text for anchors
When creating an anchor element we were adding its representation
as well as the original content, leading to text duplication.
2016-10-18 10:50:37 +02:00
Hubert Plociniczak
7234321e8f Minor refactoring 2016-10-17 16:50:03 +02:00
Hubert Plociniczak
a02f276ff1 Infer caption from the text following the img
A frame can contain other frames with text boxes.
This had not been considered before, which meant
that the whole construction of images was
broken in those cases. Also the captions were fixed/ignored.
2016-10-17 16:35:13 +02:00
Jesse Rosenthal
e666c92bc9 RST reader: skip whitespace before note.
RST requires a space before a footnote marker. We discard those spaces
so that footnotes will be adjacent to the text that comes before
them. This is in line with what rst2latex does. rst2html does not discard
the space, but its html output is different from pandoc's, so this seems
the most semantically correct approach.

Closes #3163
2016-10-17 09:54:59 -04:00
Albert Krewinkel
462c140eb6
Org reader: allow figure with empty caption
A `#+CAPTION` attribute before an image is enough to turn an image into a
figure. This wasn't the case because the `parseFromString` function, which
processes the caption value, would fail on empty values. Adding a newline
character to the caption value fixes this.

Fixes: #3161
2016-10-14 23:16:51 +02:00
John MacFarlane
9bd1da122a Merge pull request #3146 from hubertp-lshift/feature/odt-list-start-value
[ODT Parser] Include list's starting value
2016-10-14 15:15:50 +02:00
Hubert Plociniczak
9282fadc6b Added tests and a corner case for starting number
Review revealed that we didn't handle the case
when the starting point is an empty string. While
this is not a valid .odt file, we simply added
a special case to deal with it.

Also added tests for the new feature.
2016-10-14 13:56:24 +02:00
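The corner case described above can be illustrated with a small sketch. It is written in Python rather than pandoc's Haskell, `list_start_value` is a hypothetical name, and falling back to 1 for a missing or empty value is an assumption; the commit message only says that a special case was added.

```python
def list_start_value(raw):
    """Return a list's starting number from a raw attribute value.

    `raw` is None when the attribute is absent. Falling back to 1 for a
    missing or empty value is an assumption, not confirmed pandoc behavior.
    """
    if raw is None or raw.strip() == "":
        return 1
    return int(raw)
```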
Albert Krewinkel
c9460e7013
Parse line-oriented markup as LineBlock
Markup features that treat lines as a distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_block` extension is enabled), the
`linegroup`/`line` combination from the DocBook 5.1 working draft, and Org-mode
`VERSE` blocks.
2016-10-13 08:46:44 +02:00
Hubert Plociniczak
edc951ee7d [ODT Parser] Include list's starting value
Previously the starting value of a list's items was
hardcoded to 1. In reality ODT's list style definition can
provide a new starting value in one of its attributes.

Writers already handle the modified start value so no need
to change anything in that area.
2016-10-12 18:36:40 +02:00
Hubert Plociniczak
c924611de5 Basic support for images in ODT documents
Highly influenced by the docx support, refactored
some code to avoid repetition (DRY).
2016-10-12 17:50:35 +02:00
Albert Krewinkel
bb48f2edc4
Org reader: trim verse lines properly
An empty verse line should not result in `Str ""` but in `mempty`.
2016-10-10 21:12:47 +02:00
John MacFarlane
1435906f09 MediaWiki writer: transform filename with underscores in images.
`foo bar.jpg` becomes `foo_bar.jpg`. This was already done
for internal links, but it also needs to happen for images.

Closes #3052.
2016-10-02 22:09:20 +02:00
John MacFarlane
e95047ed85 Markdown reader: added bracket syntax for native spans.
See #168.

Text.Pandoc.Options.Extension has a new constructor `Ext_bracketed_spans`,
which is enabled by default in pandoc's Markdown.
2016-09-28 12:33:05 +02:00
Jesse Rosenthal
3f8d3d844f Remove TagSoup compat
We already lower-bound tagsoup at 0.13.7, which means we were always
running the compatibility layer (it was conditional on min value
0.13). Better to just use `lookupEntity` from the library directly, and
convert a string to a char if need be.
2016-09-02 12:28:53 -04:00
Jesse Rosenthal
f72e3b58e8 Remove directory compat
directory 1.1 depends on base 4.5 (ghc 7.4) which we are no longer
supporting. So we don't have to use a compatibility layer for it.
2016-09-02 09:18:09 -04:00
Jesse Rosenthal
7f676b534a Remove Text.Pandoc.Compat.Except 2016-09-02 09:18:09 -04:00
Jesse Rosenthal
26c3705da4 Fix grouping of imports.
Some source files keep imports in tidy groups. Changing
`Text.Pandoc.Compat.Monoid` to `Data.Monoid` could upset that. This
restores tidiness.
2016-09-02 09:18:09 -04:00
Jesse Rosenthal
45c7108b4f Remove Compat.Monoid
This was only necessary for GHC versions with base below 4.5
(i.e., ghc < 7.4).
2016-09-02 09:18:08 -04:00
Albert Krewinkel
21cd76c201
Org reader: respect unnumbered header property
Sections with the `unnumbered` property should, as the name implies, be
excluded from the automatic numbering of sections provided by some output
formats.  The Pandoc convention for this is to add an "unnumbered" class
to the header.  The reader treats properties as key-value pairs by
default, so a special case is added to translate the above property to a
class instead.

Closes #3095.
2016-08-30 18:10:24 +02:00
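The special case described above can be sketched as follows. This is an illustrative Python sketch, not the reader's Haskell code: `header_attributes` is a hypothetical name, and treating every value except `nil` as "on" is an assumption.

```python
def header_attributes(properties):
    """Split Org property pairs into (classes, key_value_pairs)."""
    classes, pairs = [], []
    for key, value in properties:
        if key.lower() == "unnumbered":
            # Assumption: any value other than "nil" turns the property on.
            if value.strip().lower() != "nil":
                classes.append("unnumbered")
        else:
            # Every other property stays a plain key-value pair.
            pairs.append((key, value))
    return classes, pairs
```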
Jesse Rosenthal
abc4bca46b Docx reader: make all compilers happy with traversable.
The last attempt to make 7.8 happy made 7.10 unhappy. So we need some
conditional logic to appease all versions.
2016-08-29 23:16:40 -04:00
Jesse Rosenthal
773adc7804 Docx reader: Import traverse for ghc 7.8
The GHC 7.8 build was erroring without it.
2016-08-29 21:34:26 -04:00
Jesse Rosenthal
7ae9621a87 Docx reader: clean up function with traverse 2016-08-29 18:56:24 -04:00
Albert Krewinkel
a3a3e3fdbf
Merge branch 'org-meta-handling' 2016-08-29 14:42:23 +02:00
Albert Krewinkel
88313c0b93
Org reader: respect creator export option
The `creator` option controls whether the creator meta-field should be
included in the final markup.  Setting `#+OPTIONS: creator:nil` will
drop the creator field from the final meta-data output.

Org-mode recognizes the special value `comment` for this field, causing
the creator to be included in a comment.  This is difficult to translate
to Pandoc internals and is hence interpreted the same as other truish
values (i.e. the meta field is kept if it's present).
2016-08-29 14:35:16 +02:00
Albert Krewinkel
0568aa5cad
Org reader: respect email export option
The `email` option controls whether the email meta-field should be
included in the final markup. Setting `#+OPTIONS: email:nil` will drop
the email field from the final meta-data output.
2016-08-29 14:34:39 +02:00
Albert Krewinkel
117d3f4d92
Org reader: respect author export option
The `author` option controls whether the author should be included in
the final markup.  Setting `#+OPTIONS: author:nil` will drop the author
from the final meta-data output.
2016-08-29 14:33:18 +02:00
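The `author`, `email`, and `creator` export options from the three commits above follow the same pattern, which can be sketched as below. This is a hedged Python illustration, not pandoc's implementation; both function names are hypothetical, and the option parsing is deliberately simplified to whitespace-separated `key:value` items.

```python
def parse_options(line):
    """Parse '#+OPTIONS: key:value ...' into a dict; other lines give {}."""
    prefix = "#+OPTIONS:"
    if not line.startswith(prefix):
        return {}
    options = {}
    for item in line[len(prefix):].split():
        key, sep, value = item.partition(":")
        if sep:
            options[key] = value
    return options


def apply_export_options(meta, options):
    """Drop author, email and creator fields whose option is 'nil'."""
    result = dict(meta)
    for field in ("author", "email", "creator"):
        # Any other value, including 'comment' for creator, keeps the field.
        if options.get(field) == "nil":
            result.pop(field, None)
    return result
```

For example, `#+OPTIONS: author:nil creator:comment` would drop the author field but keep the creator field, since `comment` is treated like any other truish value.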
Albert Krewinkel
28d17ea70f
Org reader: read HTML_head as header-includes
HTML-specific head content can be defined in `#+HTML_head` lines.  They
are parsed as format-specific inlines to ensure that they will only show
up in HTML output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
d164ead379
Org reader: set classoption meta from LaTeX_class_options 2016-08-29 14:10:57 +02:00
Albert Krewinkel
825ce8ca73
Org reader: set documentclass meta from LaTeX_class 2016-08-29 14:10:57 +02:00
Albert Krewinkel
a257488343
Org reader: read LaTeX_header as header-includes
LaTeX-specific header commands can be defined in `#+LaTeX_header` lines.
They are parsed as format-specific inlines to ensure that they will only
show up in LaTeX output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
75df104215
Org reader: give precedence to later meta lines
The last meta-line of any given type is the significant line.
Previously the value of the first line was kept, even if more lines of
the same type were encountered.
2016-08-29 14:10:57 +02:00
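The precedence rule described above amounts to "last write wins". A minimal Python sketch, assuming a simplified `#+KEY: value` line format and a hypothetical `collect_meta` helper (this is not the reader's actual code):

```python
def collect_meta(lines):
    """Collect '#+KEY: value' lines; a later line overrides an earlier one."""
    meta = {}
    for line in lines:
        if line.startswith("#+") and ":" in line:
            key, _, value = line[2:].partition(":")
            # Plain assignment means the last line of a given type wins.
            meta[key.strip().lower()] = value.strip()
    return meta
```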
Albert Krewinkel
2ca2585b35
Org reader: allow multiple, comma-separated authors
Multiple authors can be specified in the `#+AUTHOR` meta line if they
are given as a comma-separated list.
2016-08-29 14:10:57 +02:00
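Splitting a comma-separated `#+AUTHOR` value, as described above, can be sketched like this. The Python function `parse_authors` is hypothetical and the names in the usage note are made up for illustration; dropping empty entries is an assumption.

```python
def parse_authors(value):
    """Split a comma-separated author string into a list of trimmed names."""
    # Assumption: empty entries (e.g. from a trailing comma) are dropped.
    return [name.strip() for name in value.split(",") if name.strip()]
```

With this sketch, `parse_authors("Jane Doe, John Smith")` yields two separate author entries.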
Albert Krewinkel
153970bef5
Org reader: read markup only for special meta keys
Most meta-keys should be read as normal string values, only a few are
interpreted as marked-up text.
2016-08-29 14:10:56 +02:00
Albert Krewinkel
bed5f700ce
Org reader: extract meta parsing code to module
Parsing of meta-data is well separable from other block parsing tasks.
Moving it into a new module gives smaller files and more clearly arranged code.
2016-08-29 14:10:51 +02:00
Jesse Rosenthal
95734b2951 Docx reader: update copyright. 2016-08-28 22:53:19 -04:00
Jesse Rosenthal
9f6fd6139f Docx reader: use all anchor spans for header ids.
Previously we only used the first anchor span to affect header ids. This
allows us to use all the anchor spans in a header, whether they're
nested or not.

Along with 62882f97, this closes #3088.
2016-08-28 18:12:02 -04:00
Jesse Rosenthal
2893b0055a Docx reader: Let headers use existing id.
Previously we always generated an id for headers (since they wouldn't
bring one from Docx). Now we let it use an existing one if
possible. This should allow us to recurse through anchor spans.
2016-08-28 18:09:27 -04:00
Jesse Rosenthal
62882f976d Docx reader: Handle anchor spans with content in headers.
Previously, we would only be able to figure out internal links to a
header in a docx if the anchor span was empty. We change that to read
the inlines out of the first anchor span in a header.

This still leaves another problem: what to do if there are multiple
anchor spans in a header. That will be addressed in a future commit.
2016-08-28 18:03:09 -04:00
Jesse Rosenthal
9999db2e6c StyleMap: export functions on StyleMap instances
We're going to want `getMap` in the Docx Writer.
2016-08-15 12:04:20 -04:00
Jesse Rosenthal
347d716826 Docx parser: Use xml convenience functions
The functions `isElem` and `elemName` (defined in Docx/Util.hs) make the
code a lot cleaner than the original XML.Light functions, but they had
been used inconsistently. This puts them in wherever applicable.
2016-08-13 13:46:21 -04:00