Commit graph

388 commits

Author SHA1 Message Date
Jesse Rosenthal
49b0b67b11 Remove Tests.Arbitrary
Use exported Arbitrary instances from pandoc-types instead.
2016-10-14 09:22:29 -04:00
Albert Krewinkel
c9460e7013
Parse line-oriented markup as LineBlock
Markup-features focusing on lines as distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_block` extension is enabled), the
`linegroup`/`line` combination from the Docbook 5.1 working draft, and Org-mode
`VERSE` blocks.
2016-10-13 08:46:44 +02:00
Jesse Rosenthal
4b0dbdc118 Markdown writer: add test for note placement. 2016-10-11 22:26:03 -04:00
Jesse Rosenthal
581fc0130b LaTeX writer: change braced backtick to \textasciigrave{}
Backticks in verbatim environments are converted to
open-single-quotes. This change makes them appear as backticks. This
corresponds to how we treat `'' in verbatim environments (with
\textquotesingle{}).
2016-09-20 09:44:35 -04:00
Jesse Rosenthal
9c1467e848 Add test for backtick in verbatim. 2016-09-19 16:34:28 -04:00
Albert Krewinkel
21cd76c201
Org reader: respect unnumbered header property
Sections the `unnumbered` property should, as the name implies, be
excluded from the automatic numbering of section provided by some output
formats.  The Pandoc convention for this is to add an "unnumbered" class
to the header.  The reader treats properties as key-value pairs per
default, so a special case is added to translate the above property to a
class instead.

Closes #3095.
2016-08-30 18:10:24 +02:00
Albert Krewinkel
a3a3e3fdbf
Merge branch 'org-meta-handling' 2016-08-29 14:42:23 +02:00
Jesse Rosenthal
fcbb37c8f3 Docx reader: test for nested anchor spans in header
This ensures that anchor spans in header with content (or with other
anchor spans inside) will resolve to links to a header id properly.
2016-08-29 08:35:59 -04:00
Albert Krewinkel
88313c0b93
Org reader: respect creator export option
The `creator` option controls whether the creator meta-field should be
included in the final markup.  Setting `#+OPTIONS: creator:nil` will
drop the creator field from the final meta-data output.

Org-mode recognizes the special value `comment` for this field, causing
the creator to be included in a comment.  This is difficult to translate
to Pandoc internals and is hence interpreted the same as other truish
values (i.e. the meta field is kept if it's present).
2016-08-29 14:35:16 +02:00
Albert Krewinkel
0568aa5cad
Org reader: respect email export option
The `email` option controls whether the email meta-field should be
included in the final markup. Setting `#+OPTIONS: email:nil` will drop
the email field from the final meta-data output.
2016-08-29 14:34:39 +02:00
Albert Krewinkel
117d3f4d92
Org reader: respect author export option
The `author` option controls whether the author should be included in
the final markup.  Setting `#+OPTIONS: author:nil` will drop the author
from the final meta-data output.
2016-08-29 14:33:18 +02:00
Albert Krewinkel
ad625782b1
Put Org reader export option tests into test group
Using a separate test group instead of prefixing the test subject should
be clearer than the current approach.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
28d17ea70f
Org reader: read HTML_head as header-includes
HTML-specific head content can be defined in `#+HTML_head` lines.  They
are parsed as format-specific inlines to ensure that they will only show
up in HTML output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
d164ead379
Org reader: set classoption meta from LaTeX_class_options 2016-08-29 14:10:57 +02:00
Albert Krewinkel
825ce8ca73
Org reader: set documentclass meta from LaTeX_class 2016-08-29 14:10:57 +02:00
Albert Krewinkel
a257488343
Org reader: read LaTeX_header as header-includes
LaTeX-specific header commands can be defined in `#+LaTeX_header` lines.
They are parsed as format-specific inlines to ensure that they will only
show up in LaTeX output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
75df104215
Org reader: give precedence to later meta lines
The last meta-line of any given type is the significant line.
Previously the value of the first line was kept, even if more lines of
the same type were encounterd.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
2ca2585b35
Org reader: allow multiple, comma-separated authors
Multiple authors can be specified in the `#+AUTHOR` meta line if they
are given as a comma-separated list.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
153970bef5
Org reader: read markup only for special meta keys
Most meta-keys should be read as normal string values, only a few are
interpreted as marked-up text.
2016-08-29 14:10:56 +02:00
Jesse Rosenthal
972286c034 Docx writer test: comment out function to make compiler happy. 2016-08-15 15:39:34 -04:00
Jesse Rosenthal
d416f62410 Docx writer: test for custom styles.
This just tests whether a custom style with a recognizable
style (italic etc, defined in a reference.docx) will roundtrip back to
that format (i.e., whether `<span custom-style="Emphasized">` will
roundtrip to `Emph`). The custom styles are defined in the
`custom-style-reference.docx` included in the docx dir.
2016-08-15 15:33:06 -04:00
Jesse Rosenthal
663f689fa4 Docx writer tests: allow for altered round trip
Sometimes we will want to get back something different than we started
with in a round-trip test. This allows for that, and makes the perfect
roundtrip a special case.
2016-08-15 15:23:25 -04:00
John MacFarlane
13424a2bd7 Merge pull request #3065 from tarleb/org-verse-indent
Org reader: preserve indentation of verse lines
2016-08-09 21:33:24 +02:00
Albert Krewinkel
ba5b426ded Org reader: ensure image sources are proper links
Image sources as those in plain images, image links, or figures, must be
proper URIs or relative file paths to be recognized as images.  This
restriction is now enforced for all image sources.

This also fixes the reader's usage of uncleaned image sources, leading
to `file:` prefixes not being deleted from figure
images (e.g. `[[file:image.jpg]]` leading to a broken image `<img
src="file:image.jpg"/>)

Thanks to @bsag for noticing this bug.
2016-08-09 20:27:08 +02:00
Albert Krewinkel
13280a8112 Org reader: preserve indentation of verse lines
Leading spaces in verse lines are converted to non-breaking spaces, so
indentation is preserved.

This fixes #3064.
2016-08-08 09:40:50 +02:00
John MacFarlane
0b0a0e730f Removed some redundant class constraints. 2016-07-14 08:54:06 -07:00
Albert Krewinkel
529146decf Org reader: fix parsing of verbatim inlines
Org rules for allowed characters before or after markup chars were not
checked for verbatim text.  This resultet in wrong parsing outcomes of
if the verbatim text contained e.g. space enclosed markup characters as
part of the text (`=is_substr = True=`).  Forcing the parser to update
the positions of allowed/forbidden markup border characters fixes this.

This fixes #3016.
2016-07-14 13:33:25 +02:00
Albert Krewinkel
5ffa4abf72
Org reader: support headline levels export setting
The depths of headlines can be modified using the `H` option.  Deeper
headlines will be converted to lists.
2016-07-03 23:28:45 +02:00
Albert Krewinkel
c4cf6d237f
Org reader: support archived trees export options
Handling of archived trees can be modified using the `arch` option.
Archived trees are either dropped, exported completely, or collapsed to
include just the header when the `arch` option is nil, non-nil, or
`headline`, respectively.
2016-07-01 23:05:33 +02:00
Alex Ivkin
a73c95f61d Added Zim Wiki writer, template and tests. 2016-06-30 23:59:43 -07:00
John MacFarlane
3429fa6438 LaTeX reader: fixed \cite so it is a NormalCitation not AuthorInText. 2016-06-29 07:59:00 -07:00
Albert Krewinkel
0f3f5ce1a1 Org reader: support figure labels
Figure labels given as `#+LABEL: thelabel` are used as the ID of the
respective image.  This allows e.g. the LaTeX to add proper `\label`
markup.

This fixes half of #2496 and #2999.
2016-06-26 20:42:22 +02:00
Jesse Rosenthal
7980631a0b Docx reader: add tests for comments
We test for comments, using all track-changes options. Note that we
should only output comments if `--track-changes=all`. We also test for
emitting warnings if there is complicated formatting.
2016-06-23 10:50:46 -04:00
Jesse Rosenthal
5d48a62b74 Docx reader tests: Add tests for warnings.
We test to see if we emit any warnings.
2016-06-23 10:50:46 -04:00
Albert Krewinkel
29552eff3e Org reader: support arbitrary raw inlines
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.

Support for this features is added to the Org reader.
2016-06-13 23:53:14 +02:00
Albert Krewinkel
8a9f5915ab Org reader: add support for "Berkeley-style" cites
A specification for an official Org-mode citation syntax was drafted by
Richard Lawrence and enhanced with the help of others on the orgmode
mailing list.  Basic support for this citation style is added to the
reader.

This closes #1978.
2016-06-05 11:28:57 +02:00
John MacFarlane
061bc60f70 Merge pull request #2950 from tarleb/org-ref-support
Org reader: support org-ref style citations
2016-05-31 12:44:29 -07:00
Albert Krewinkel
c17c62a2c7 Org reader: support new syntax for export blocks
Org-mode version 9 usees a new syntax for export blocks.  Instead of
`#+BEGIN_<FORMAT>`, where `<FORMAT>` is the format of the block's
content, the new format uses `#+BEGIN_export <FORMAT>` instead.  Both
types are supported.
2016-05-29 21:08:50 +02:00
Albert Krewinkel
f226cb88b0 Org reader: support org-ref style citations
The *org-ref* package is an org-mode extension commonly used to manage
citations in org documents.  Basic support for the `cite:citeKey` and
`[[cite:citeKey][prefix text::suffix text]]` syntax is added.
2016-05-27 21:19:28 +02:00
Albert Krewinkel
a4717c2fc5 Org reader: respect drawer export setting
The `d` export option can be used to control which drawers are exported
and which are discarded.  Basic support for this option is added here.
2016-05-23 09:44:37 +02:00
Albert Krewinkel
f3d27e4c80 Org reader/writer: use CUSTOM_ID in properties
The `ID` property is reserved for internal use by Org-mode and should
not be used.  The `CUSTOM_ID` property is to be used instead, it is
converted to the `ID` property for certain export format.

The reader and writer erroneously used `ID`.  This is corrected by using
`CUSTOM_ID` where appropriate.
2016-05-22 23:01:47 +02:00
Albert Krewinkel
68d388f833 Org reader: add :PROPERTIES: drawer support
Headers can have optional `:PROPERTIES:` drawers associated with them.
These drawers contain key/value pairs like the header's `id`.  The
reader adds all listed pairs to the header's attributes; `id` and
`class` attributes are handled specially to match the way `Attr` are
defined.

This also changes behavior of how drawers of unknown type are handled.
Instead of including all unknown drawers, those are not read/exported,
thereby matching current Emacs behavior.

This closes #1877.
2016-05-20 17:01:26 +02:00
Albert Krewinkel
16e233475a Org reader: add support for ATTR_HTML attributes
Arbitrary key-value pairs can be added to some block types using a
`#+ATTR_HTML` line before the block.  Emacs Org-mode only includes these
when exporting to HTML, but since we cannot make this distinction here,
the attributes are always added.

The functionality is now supported for figures.

This closes #1906.
2016-05-19 09:55:12 +02:00
John MacFarlane
344412cba8 Merge pull request #2894 from sid-kap/rst-code-class
Add class option for code block in RST reader
2016-05-12 00:03:14 -07:00
Albert Krewinkel
76143de97e Org reader: add support for sub/superscript export options
Org-mode allows to specify export settings via `#+OPTIONS` lines.
Disabling simple sub- and superscripts is one of these export options,
this options is now supported.
2016-05-11 19:13:43 +02:00
Albert Krewinkel
10a809f126 Org reader: fix inline-LaTeX regression
The last fix for whitespace handling of inline LaTeX commands was
incorrect, preventing correct recognition of inline LaTeX commands which
contain spaces.  This fix ensures that only trailing whitespace is cut
off.
2016-05-09 19:06:04 +02:00
John MacFarlane
21d1a3b57c Merge pull request #2898 from tarleb/org-table-refactoring
Org reader: table parsing code refactoring and fixes
2016-05-05 16:22:56 -07:00
Albert Krewinkel
405c3e9c36 Org reader: fix spacing after LaTeX-style symbols
The org-reader was droping space after unescaped LaTeX-style symbol
commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä`
instead.  This seems to be because the LaTeX-reader treats the
command-terminating space as part of the command.  Dropping the trailing
space from the symbol-command fixes this issue.
2016-05-04 23:16:23 +02:00
Albert Krewinkel
2d825603c6 Org reader: fix handling of empty table cells, rows
This fixes Org mode parsing of some corner cases regarding empty cells
and rows.  Empty cells weren't parsed correctly, e.g. `|||` should be
two empty cells, but would be parsed as a single cell containing a pipe
character.  Empty rows where parsed as alignment rows and dropped from
the output.

This fixes #2616.
2016-05-04 16:02:03 +02:00
Albert Krewinkel
d5e4bc179c Org reader: stop padding short table rows
Emacs Org-mode doesn't add any padding to table rows.  The first
row (header or first body row) is used to determine the column count, no
other magic is performed.

The org reader was padding rows to the length of the longest table row.
This was done due to a misunderstanding of how Org handles tables.  This
feature reflected how Org-mode handles tables when pressing <TAB>.  The
Org exporter however, which is what the reader should implement, doesn't
do any of this.  So this was a mis-feature that made the reader more
complex and reduced comparability.  It was hence removed.
2016-05-04 15:48:07 +02:00