Commit graph

68 commits

Author SHA1 Message Date
John MacFarlane
47a16065c4 Removed --parse-raw and readerParseRaw.
These were confusing.

Now we rely on the +raw_tex or +raw_html extension with latex
or html input.

Thus, instead of

    --parse-raw -f latex

we use

    -f latex+raw_tex

and instead of

     --parse-raw -f html

we use

    -f html+raw_html
2017-02-06 23:33:23 +01:00
John MacFarlane
5156a4fe3c Shared: rename compactify', compactify'DL -> compactify, compactifyDL. 2017-01-27 21:36:45 +01:00
John MacFarlane
65b8570e0e Cleanups for rebase. 2017-01-25 17:07:43 +01:00
John MacFarlane
6f8b967d98 Removed readerSmart and the --smart option; added Ext_smart extension.
Now you will need to do

    -f markdown+smart

instead of

    -f markdown --smart

This change opens the way for writers, in addition to readers,
to be sensitive to +smart, but this change hasn't yet been made.

API change. Command-line option change.

Updated manual.
2017-01-25 17:07:42 +01:00
Jesse Rosenthal
b53ebcdf8e Working on readers. 2017-01-25 17:07:40 +01:00
Albert Krewinkel
5729f1f2ea
Org reader: allow short hand for single-line raw blocks
Single-line raw blocks can be given via `#+FORMAT: raw line`, where
`FORMAT` must be one of `latex`, `beamer`, `html`, or `texinfo`.

Closes: #3366
2017-01-19 20:33:05 +01:00
Albert Krewinkel
4da41bdb8e
Remove pipe char irking the haddock coverage tool
Haddock documentation strings must be associated with functions. Remove
pipe char from a comment that was moved into a `do` block in
`Readers/Org/Inlines.hs`.
2017-01-06 18:59:07 +01:00
Albert Krewinkel
4ca420e937
Org reader: accept org-ref citations followed by commas
Bugfix for an issue which, whenever the citation was immediately followed by a
comma, prevented correct parsing of org-ref citations.
2017-01-06 18:22:19 +01:00
Albert Krewinkel
21e6ca1976 Org reader: ensure emphasis markup can be nested
Nested emphasis markup (e.g. `/*strong and emphasized*/`) was
interpreted incorrectly in that the inner markup was not recognized.
2017-01-05 23:30:46 +01:00
Albert Krewinkel
f4a8f12387
Org reader: respect column width settings
Table column properties can optionally specify a column's width with
which it is displayed in the buffer. Some exporters, notably the ODT
exporter in org-mode v9.0, use these values to calculate relative column
widths. The org reader now implements the same behavior.

Note that the org-mode LaTeX and HTML exporters in Emacs don't support
this feature yet, which should be kept in mind by users who use the
column widths parameters.

Closes: #3246
2016-11-24 20:07:39 +01:00
Albert Krewinkel
64413b1ce2
Un-break Travis build
Remove whitespace before function documentation The extra spaced cause
problems with documentation tools and Travis tests are failing because
of this.
2016-11-19 22:30:02 +01:00
Albert Krewinkel
1a8af5fc44
Org reader: Ensure images in paragraphs are not parsed as figures
This fixes a regression introduced in
7e5220b57c.
2016-11-19 01:17:04 +01:00
Albert Krewinkel
7e5220b57c
Org reader: allow HTML attribs on non-figure images
Images which are the only element in a paragraph can still be given HTML
attributes, even if the image does not have a caption and is hence not a figure.
The following will add set the `width` attribute of the image to `50%`:

    #+ATTR_HTML: :width 50%
    [[file:image.jpg]]

Closes: #3222
2016-11-09 22:49:20 +01:00
Albert Krewinkel
4f06e6c445
Org reader: support ATTR_HTML for special blocks
Special blocks (i.e. blocks with unrecognized names) can be prefixed
with an `ATTR_HTML` block attribute.  The attributes defined in that
meta-directive are added to the `Div` which is used to represent the
special block.

Closes: #3182
2016-10-30 20:23:53 +01:00
Albert Krewinkel
63bdc5d08f
Org reader: support the todo export option
The `todo` export option allows to toggle the inclusion of TODO keywords
in the output.  Setting this to `nil` causes TODO keywords to be dropped
from headlines.  The default is to include the keywords.
2016-10-30 13:20:25 +01:00
Albert Krewinkel
d5182778c4
Org reader: add support for todo-markers
Headlines can have optional todo-markers which can be controlled via the
`#+TODO`, `#+SEQ_TODO`, or `#+TYP_TODO` meta directive.  Multiple such
directives can be given, each adding a new set of recognized todo-markers.
If no custom todo-markers are defined, the default `TODO` and `DONE`
markers are used.

Todo-markers are conceptually separate from headline text and are hence
excluded when autogenerating headline IDs.

The markers are rendered as spans and labelled with two classes: One
class is the markers name, the other signals the todo-state of the
marker (either `todo` or `done`).
2016-10-30 10:27:47 +01:00
John MacFarlane
8264ae2abe Better fix for the problem with ghc 7.8. 2016-10-18 16:25:13 +02:00
John MacFarlane
3747abf029 Try to fix build error on ghc 7.8.
@tarleb this is an interesting one, see the build log in
https://travis-ci.org/jgm/pandoc/jobs/168612017

It only failed on ghc 7.8; I think this must have to do with
the change making Monad a superclass of Applicative, hence
this change.
2016-10-18 16:03:34 +02:00
Albert Krewinkel
462c140eb6
Org reader: allow figure with empty caption
A `#+CAPTION` attribute before an image is enough to turn an image into a
figure. This wasn't the case because the `parseFromString` function, which
processes the caption value, would fail on empty values. Adding a newline
character to the caption value fixes this.

Fixes: #3161
2016-10-14 23:16:51 +02:00
Albert Krewinkel
c9460e7013
Parse line-oriented markup as LineBlock
Markup-features focusing on lines as distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_block` extension is enabled), the
`linegroup`/`line` combination from the Docbook 5.1 working draft, and Org-mode
`VERSE` blocks.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
bb48f2edc4
Org reader: trim verse lines properly
An empty verse line should not result in `Str ""` but in `mempty`.
2016-10-10 21:12:47 +02:00
Jesse Rosenthal
26c3705da4 Fix grouping of imports.
Some source files keep imports in tidy groups. Changing
`Text.Pandoc.Compat.Monoid` to `Data.Monoid` could upset that. This
restores tidiness.
2016-09-02 09:18:09 -04:00
Jesse Rosenthal
45c7108b4f Remove Compat.Monoid
This was only necessary for GHC versions with base below 4.5
(i.e., ghc < 7.4).
2016-09-02 09:18:08 -04:00
Albert Krewinkel
21cd76c201
Org reader: respect unnumbered header property
Sections the `unnumbered` property should, as the name implies, be
excluded from the automatic numbering of section provided by some output
formats.  The Pandoc convention for this is to add an "unnumbered" class
to the header.  The reader treats properties as key-value pairs per
default, so a special case is added to translate the above property to a
class instead.

Closes #3095.
2016-08-30 18:10:24 +02:00
Albert Krewinkel
88313c0b93
Org reader: respect creator export option
The `creator` option controls whether the creator meta-field should be
included in the final markup.  Setting `#+OPTIONS: creator:nil` will
drop the creator field from the final meta-data output.

Org-mode recognizes the special value `comment` for this field, causing
the creator to be included in a comment.  This is difficult to translate
to Pandoc internals and is hence interpreted the same as other truish
values (i.e. the meta field is kept if it's present).
2016-08-29 14:35:16 +02:00
Albert Krewinkel
0568aa5cad
Org reader: respect email export option
The `email` option controls whether the email meta-field should be
included in the final markup. Setting `#+OPTIONS: email:nil` will drop
the email field from the final meta-data output.
2016-08-29 14:34:39 +02:00
Albert Krewinkel
117d3f4d92
Org reader: respect author export option
The `author` option controls whether the author should be included in
the final markup.  Setting `#+OPTIONS: author:nil` will drop the author
from the final meta-data output.
2016-08-29 14:33:18 +02:00
Albert Krewinkel
28d17ea70f
Org reader: read HTML_head as header-includes
HTML-specific head content can be defined in `#+HTML_head` lines.  They
are parsed as format-specific inlines to ensure that they will only show
up in HTML output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
d164ead379
Org reader: set classoption meta from LaTeX_class_options 2016-08-29 14:10:57 +02:00
Albert Krewinkel
825ce8ca73
Org reader: set documentclass meta from LaTeX_class 2016-08-29 14:10:57 +02:00
Albert Krewinkel
a257488343
Org reader: read LaTeX_header as header-includes
LaTeX-specific header commands can be defined in `#+LaTeX_header` lines.
They are parsed as format-specific inlines to ensure that they will only
show up in LaTeX output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
75df104215
Org reader: give precedence to later meta lines
The last meta-line of any given type is the significant line.
Previously the value of the first line was kept, even if more lines of
the same type were encounterd.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
2ca2585b35
Org reader: allow multiple, comma-separated authors
Multiple authors can be specified in the `#+AUTHOR` meta line if they
are given as a comma-separated list.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
153970bef5
Org reader: read markup only for special meta keys
Most meta-keys should be read as normal string values, only a few are
interpreted as marked-up text.
2016-08-29 14:10:56 +02:00
Albert Krewinkel
bed5f700ce
Org reader: extract meta parsing code to module
Parsing of meta-data is well separable from other block parsing tasks.
Moving into new module to get small files and clearly arranged code.
2016-08-29 14:10:51 +02:00
John MacFarlane
13424a2bd7 Merge pull request #3065 from tarleb/org-verse-indent
Org reader: preserve indentation of verse lines
2016-08-09 21:33:24 +02:00
Albert Krewinkel
ba5b426ded Org reader: ensure image sources are proper links
Image sources as those in plain images, image links, or figures, must be
proper URIs or relative file paths to be recognized as images.  This
restriction is now enforced for all image sources.

This also fixes the reader's usage of uncleaned image sources, leading
to `file:` prefixes not being deleted from figure
images (e.g. `[[file:image.jpg]]` leading to a broken image `<img
src="file:image.jpg"/>)

Thanks to @bsag for noticing this bug.
2016-08-09 20:27:08 +02:00
Albert Krewinkel
13280a8112 Org reader: preserve indentation of verse lines
Leading spaces in verse lines are converted to non-breaking spaces, so
indentation is preserved.

This fixes #3064.
2016-08-08 09:40:50 +02:00
Albert Krewinkel
529146decf Org reader: fix parsing of verbatim inlines
Org rules for allowed characters before or after markup chars were not
checked for verbatim text.  This resultet in wrong parsing outcomes of
if the verbatim text contained e.g. space enclosed markup characters as
part of the text (`=is_substr = True=`).  Forcing the parser to update
the positions of allowed/forbidden markup border characters fixes this.

This fixes #3016.
2016-07-14 13:33:25 +02:00
Albert Krewinkel
f417fecf5f
Org reader: replace ugly code with view pattern
Some less-than-smart code required a pragma switching of overlapping
pattern warnings in order to compile seamlessly.  Using view patterns
makes the code easier to read and also doesn't require overlapping
pattern checks to be disabled.
2016-07-04 11:20:05 +02:00
Albert Krewinkel
5ffa4abf72
Org reader: support headline levels export setting
The depths of headlines can be modified using the `H` option.  Deeper
headlines will be converted to lists.
2016-07-03 23:28:45 +02:00
Albert Krewinkel
c1f6bd2640
Org reader: put export setting parser into module
Export option parsing is distinct enough from general block parsing to
justify putting it into a separate module.
2016-07-02 13:14:09 +02:00
Albert Krewinkel
c4cf6d237f
Org reader: support archived trees export options
Handling of archived trees can be modified using the `arch` option.
Archived trees are either dropped, exported completely, or collapsed to
include just the header when the `arch` option is nil, non-nil, or
`headline`, respectively.
2016-07-01 23:05:33 +02:00
Albert Krewinkel
1ebaf6de11
Org reader: refactor comment tree handling
Comment trees were handled after parsing, as pattern matching on lists
is easier than matching on sequences.  The new method of reading
documents as trees allows for more elegant subtree removal.
2016-07-01 23:05:32 +02:00
Albert Krewinkel
17484ed01a
Org reader: parse as headlines, convert to blocks
Emacs org-mode is based on outline-mode, which treats documents as trees
with headlines are nodes.  The reader is refactored to parse into a
similar tree structure.  This simplifies transformations acting on
document (sub-)trees.
2016-07-01 23:05:32 +02:00
Albert Krewinkel
2f8d6755f4
Org reader: improve tag and properties type safety
Specific newtype definitions are used to replace stringly typing of tags
and properties.  Type safety is increased while readability is improved.
2016-07-01 23:05:32 +02:00
Albert Krewinkel
0f3f5ce1a1 Org reader: support figure labels
Figure labels given as `#+LABEL: thelabel` are used as the ID of the
respective image.  This allows e.g. the LaTeX to add proper `\label`
markup.

This fixes half of #2496 and #2999.
2016-06-26 20:42:22 +02:00
Albert Krewinkel
7df656089f Org reader: remove partial functions
Partial functions like `head` lead to avoidable errors and should be
avoided.  They are replaced with total functions.

This fixes #2991.
2016-06-21 23:51:15 +02:00
Albert Krewinkel
29552eff3e Org reader: support arbitrary raw inlines
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.

Support for this features is added to the Org reader.
2016-06-13 23:53:14 +02:00
Albert Krewinkel
8a9f5915ab Org reader: add support for "Berkeley-style" cites
A specification for an official Org-mode citation syntax was drafted by
Richard Lawrence and enhanced with the help of others on the orgmode
mailing list.  Basic support for this citation style is added to the
reader.

This closes #1978.
2016-06-05 11:28:57 +02:00