Commit graph

248 commits

Author SHA1 Message Date
Albert Krewinkel
88313c0b93
Org reader: respect creator export option
The `creator` option controls whether the creator meta-field should be
included in the final markup.  Setting `#+OPTIONS: creator:nil` will
drop the creator field from the final meta-data output.

Org-mode recognizes the special value `comment` for this field, causing
the creator to be included in a comment.  This is difficult to translate
to Pandoc internals and is hence interpreted the same as other truish
values (i.e. the meta field is kept if it's present).
2016-08-29 14:35:16 +02:00
Albert Krewinkel
0568aa5cad
Org reader: respect email export option
The `email` option controls whether the email meta-field should be
included in the final markup. Setting `#+OPTIONS: email:nil` will drop
the email field from the final meta-data output.
2016-08-29 14:34:39 +02:00
Albert Krewinkel
117d3f4d92
Org reader: respect author export option
The `author` option controls whether the author should be included in
the final markup.  Setting `#+OPTIONS: author:nil` will drop the author
from the final meta-data output.
2016-08-29 14:33:18 +02:00
Albert Krewinkel
ad625782b1
Put Org reader export option tests into test group
Using a separate test group instead of prefixing the test subject should
be clearer than the current approach.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
28d17ea70f
Org reader: read HTML_head as header-includes
HTML-specific head content can be defined in `#+HTML_head` lines.  They
are parsed as format-specific inlines to ensure that they will only show
up in HTML output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
d164ead379
Org reader: set classoption meta from LaTeX_class_options 2016-08-29 14:10:57 +02:00
Albert Krewinkel
825ce8ca73
Org reader: set documentclass meta from LaTeX_class 2016-08-29 14:10:57 +02:00
Albert Krewinkel
a257488343
Org reader: read LaTeX_header as header-includes
LaTeX-specific header commands can be defined in `#+LaTeX_header` lines.
They are parsed as format-specific inlines to ensure that they will only
show up in LaTeX output.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
75df104215
Org reader: give precedence to later meta lines
The last meta-line of any given type is the significant line.
Previously the value of the first line was kept, even if more lines of
the same type were encounterd.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
2ca2585b35
Org reader: allow multiple, comma-separated authors
Multiple authors can be specified in the `#+AUTHOR` meta line if they
are given as a comma-separated list.
2016-08-29 14:10:57 +02:00
Albert Krewinkel
153970bef5
Org reader: read markup only for special meta keys
Most meta-keys should be read as normal string values, only a few are
interpreted as marked-up text.
2016-08-29 14:10:56 +02:00
John MacFarlane
13424a2bd7 Merge pull request #3065 from tarleb/org-verse-indent
Org reader: preserve indentation of verse lines
2016-08-09 21:33:24 +02:00
Albert Krewinkel
ba5b426ded Org reader: ensure image sources are proper links
Image sources as those in plain images, image links, or figures, must be
proper URIs or relative file paths to be recognized as images.  This
restriction is now enforced for all image sources.

This also fixes the reader's usage of uncleaned image sources, leading
to `file:` prefixes not being deleted from figure
images (e.g. `[[file:image.jpg]]` leading to a broken image `<img
src="file:image.jpg"/>)

Thanks to @bsag for noticing this bug.
2016-08-09 20:27:08 +02:00
Albert Krewinkel
13280a8112 Org reader: preserve indentation of verse lines
Leading spaces in verse lines are converted to non-breaking spaces, so
indentation is preserved.

This fixes #3064.
2016-08-08 09:40:50 +02:00
Albert Krewinkel
529146decf Org reader: fix parsing of verbatim inlines
Org rules for allowed characters before or after markup chars were not
checked for verbatim text.  This resultet in wrong parsing outcomes of
if the verbatim text contained e.g. space enclosed markup characters as
part of the text (`=is_substr = True=`).  Forcing the parser to update
the positions of allowed/forbidden markup border characters fixes this.

This fixes #3016.
2016-07-14 13:33:25 +02:00
Albert Krewinkel
5ffa4abf72
Org reader: support headline levels export setting
The depths of headlines can be modified using the `H` option.  Deeper
headlines will be converted to lists.
2016-07-03 23:28:45 +02:00
Albert Krewinkel
c4cf6d237f
Org reader: support archived trees export options
Handling of archived trees can be modified using the `arch` option.
Archived trees are either dropped, exported completely, or collapsed to
include just the header when the `arch` option is nil, non-nil, or
`headline`, respectively.
2016-07-01 23:05:33 +02:00
John MacFarlane
3429fa6438 LaTeX reader: fixed \cite so it is a NormalCitation not AuthorInText. 2016-06-29 07:59:00 -07:00
Albert Krewinkel
0f3f5ce1a1 Org reader: support figure labels
Figure labels given as `#+LABEL: thelabel` are used as the ID of the
respective image.  This allows e.g. the LaTeX to add proper `\label`
markup.

This fixes half of #2496 and #2999.
2016-06-26 20:42:22 +02:00
Jesse Rosenthal
7980631a0b Docx reader: add tests for comments
We test for comments, using all track-changes options. Note that we
should only output comments if `--track-changes=all`. We also test for
emitting warnings if there is complicated formatting.
2016-06-23 10:50:46 -04:00
Jesse Rosenthal
5d48a62b74 Docx reader tests: Add tests for warnings.
We test to see if we emit any warnings.
2016-06-23 10:50:46 -04:00
Albert Krewinkel
29552eff3e Org reader: support arbitrary raw inlines
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.

Support for this features is added to the Org reader.
2016-06-13 23:53:14 +02:00
Albert Krewinkel
8a9f5915ab Org reader: add support for "Berkeley-style" cites
A specification for an official Org-mode citation syntax was drafted by
Richard Lawrence and enhanced with the help of others on the orgmode
mailing list.  Basic support for this citation style is added to the
reader.

This closes #1978.
2016-06-05 11:28:57 +02:00
John MacFarlane
061bc60f70 Merge pull request #2950 from tarleb/org-ref-support
Org reader: support org-ref style citations
2016-05-31 12:44:29 -07:00
Albert Krewinkel
c17c62a2c7 Org reader: support new syntax for export blocks
Org-mode version 9 usees a new syntax for export blocks.  Instead of
`#+BEGIN_<FORMAT>`, where `<FORMAT>` is the format of the block's
content, the new format uses `#+BEGIN_export <FORMAT>` instead.  Both
types are supported.
2016-05-29 21:08:50 +02:00
Albert Krewinkel
f226cb88b0 Org reader: support org-ref style citations
The *org-ref* package is an org-mode extension commonly used to manage
citations in org documents.  Basic support for the `cite:citeKey` and
`[[cite:citeKey][prefix text::suffix text]]` syntax is added.
2016-05-27 21:19:28 +02:00
Albert Krewinkel
a4717c2fc5 Org reader: respect drawer export setting
The `d` export option can be used to control which drawers are exported
and which are discarded.  Basic support for this option is added here.
2016-05-23 09:44:37 +02:00
Albert Krewinkel
f3d27e4c80 Org reader/writer: use CUSTOM_ID in properties
The `ID` property is reserved for internal use by Org-mode and should
not be used.  The `CUSTOM_ID` property is to be used instead, it is
converted to the `ID` property for certain export format.

The reader and writer erroneously used `ID`.  This is corrected by using
`CUSTOM_ID` where appropriate.
2016-05-22 23:01:47 +02:00
Albert Krewinkel
68d388f833 Org reader: add :PROPERTIES: drawer support
Headers can have optional `:PROPERTIES:` drawers associated with them.
These drawers contain key/value pairs like the header's `id`.  The
reader adds all listed pairs to the header's attributes; `id` and
`class` attributes are handled specially to match the way `Attr` are
defined.

This also changes behavior of how drawers of unknown type are handled.
Instead of including all unknown drawers, those are not read/exported,
thereby matching current Emacs behavior.

This closes #1877.
2016-05-20 17:01:26 +02:00
Albert Krewinkel
16e233475a Org reader: add support for ATTR_HTML attributes
Arbitrary key-value pairs can be added to some block types using a
`#+ATTR_HTML` line before the block.  Emacs Org-mode only includes these
when exporting to HTML, but since we cannot make this distinction here,
the attributes are always added.

The functionality is now supported for figures.

This closes #1906.
2016-05-19 09:55:12 +02:00
John MacFarlane
344412cba8 Merge pull request #2894 from sid-kap/rst-code-class
Add class option for code block in RST reader
2016-05-12 00:03:14 -07:00
Albert Krewinkel
76143de97e Org reader: add support for sub/superscript export options
Org-mode allows to specify export settings via `#+OPTIONS` lines.
Disabling simple sub- and superscripts is one of these export options,
this options is now supported.
2016-05-11 19:13:43 +02:00
Albert Krewinkel
10a809f126 Org reader: fix inline-LaTeX regression
The last fix for whitespace handling of inline LaTeX commands was
incorrect, preventing correct recognition of inline LaTeX commands which
contain spaces.  This fix ensures that only trailing whitespace is cut
off.
2016-05-09 19:06:04 +02:00
John MacFarlane
21d1a3b57c Merge pull request #2898 from tarleb/org-table-refactoring
Org reader: table parsing code refactoring and fixes
2016-05-05 16:22:56 -07:00
Albert Krewinkel
405c3e9c36 Org reader: fix spacing after LaTeX-style symbols
The org-reader was droping space after unescaped LaTeX-style symbol
commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä`
instead.  This seems to be because the LaTeX-reader treats the
command-terminating space as part of the command.  Dropping the trailing
space from the symbol-command fixes this issue.
2016-05-04 23:16:23 +02:00
Albert Krewinkel
2d825603c6 Org reader: fix handling of empty table cells, rows
This fixes Org mode parsing of some corner cases regarding empty cells
and rows.  Empty cells weren't parsed correctly, e.g. `|||` should be
two empty cells, but would be parsed as a single cell containing a pipe
character.  Empty rows where parsed as alignment rows and dropped from
the output.

This fixes #2616.
2016-05-04 16:02:03 +02:00
Albert Krewinkel
d5e4bc179c Org reader: stop padding short table rows
Emacs Org-mode doesn't add any padding to table rows.  The first
row (header or first body row) is used to determine the column count, no
other magic is performed.

The org reader was padding rows to the length of the longest table row.
This was done due to a misunderstanding of how Org handles tables.  This
feature reflected how Org-mode handles tables when pressing <TAB>.  The
Org exporter however, which is what the reader should implement, doesn't
do any of this.  So this was a mis-feature that made the reader more
complex and reduced comparability.  It was hence removed.
2016-05-04 15:48:07 +02:00
Sidharth Kapur
ff489a59f4 Add one more test 2016-05-01 22:36:19 -05:00
Sidharth Kapur
d7bc8c0632 Use codeBlockWith 2016-05-01 22:32:26 -05:00
Sidharth Kapur
72d1900b30 Add test for RST code directive class 2016-05-01 22:23:33 -05:00
Emanuel Evans
1bfe39e24c
Ignore leading space in org code blocks
Fixes #2862

Also fix up tab handling for leading whitespace in code blocks.
2016-04-26 10:29:59 -07:00
Jesse Rosenthal
5c24e66c16 Docx Reader: Tests for track-changes moving 2016-04-16 08:39:25 -04:00
Jesse Rosenthal
7f4a40474c Docx reader: Add test for enumerated headers.
We don't want them to turn into a list.
2016-03-16 12:56:17 -04:00
John MacFarlane
a485c42d78 Fixed behavior of base tag.
+ If the base path does not end with slash, the last component
  will be replaced.  E.g. base = `http://example.com/foo`
  combines with `bar.html` to give `http://example.com/bar.html`.
+ If the href begins with a slash, the whole path of the base
  is replaced.  E.g. base = `http://example.com/foo/` combines
  with `/bar.html` to give `http://example.com/bar.html`.

Closes #2777.
2016-03-10 19:59:55 -08:00
John MacFarlane
29706ee02d Merge pull request #2646 from tarleb/org-figure-with-no-name
Prefix even empty figure names with "fig:"
2016-02-20 21:44:39 -08:00
John MacFarlane
e369e60fb4 Merge pull request #2691 from tarleb/org-image-file-links
Org reader: Refactor link-target processing
2016-02-20 21:42:12 -08:00
Jesse Rosenthal
7a10507dc8 Docx reader: Add tests for adjacent hyperlinks. 2016-02-02 14:53:01 -05:00
Albert Krewinkel
92e6ae47f6 Org reader: Refactor link-target processing
Cleanup of the code for link target handling.  Most notably, the
canonicalization of a link is handled by a separate function.

This fixes #2684.
2016-01-31 23:23:09 +01:00
Albert Krewinkel
fabbd1aa79 Prefix even empty figure names with "fig:"
The convention used by pandoc for figures is to mark them by prefixing
the name with "fig:".  The org reader failed to do this if a figure had
no name.  The test for this was broken as well.

This fixes #2643.
2016-01-11 22:23:59 +01:00
John MacFarlane
d5f67829dc Added some entity tests in Markdown reader tests. 2016-01-08 17:33:37 -08:00