Commit graph

86 commits

Author SHA1 Message Date
John MacFarlane
3429fa6438 LaTeX reader: fixed \cite so it is a NormalCitation not AuthorInText. 2016-06-29 07:59:00 -07:00
Albert Krewinkel
0f3f5ce1a1 Org reader: support figure labels
Figure labels given as `#+LABEL: thelabel` are used as the ID of the
respective image.  This allows e.g. the LaTeX writer to add proper `\label`
markup.

This fixes half of #2496 and #2999.
2016-06-26 20:42:22 +02:00
Albert Krewinkel
29552eff3e Org reader: support arbitrary raw inlines
Org mode allows arbitrary raw inlines ("export snippets" in Emacs
parlance) to be included as `@@format:raw foreign format text@@`.

Support for this feature is added to the Org reader.
2016-06-13 23:53:14 +02:00
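
A minimal sketch of how the `@@format:raw foreign format text@@` syntax described above could be recognized. This is an illustration in Python, not the reader's actual Haskell parser; the name `find_export_snippets` and the regular expression are assumptions made for this example.

```python
import re

# Assumed shape of an export snippet: "@@", a format name, ":", raw text, "@@".
# The raw text is matched non-greedily so adjacent snippets stay separate.
EXPORT_SNIPPET = re.compile(r"@@([\w-]+):(.*?)@@", re.DOTALL)


def find_export_snippets(text):
    """Return (format, raw_text) pairs for every export snippet in text."""
    return [(m.group(1), m.group(2)) for m in EXPORT_SNIPPET.finditer(text)]
```

For instance, `find_export_snippets("a @@html:<b>@@bold@@html:</b>@@")` yields one pair for each of the two HTML snippets.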
Albert Krewinkel
8a9f5915ab Org reader: add support for "Berkeley-style" cites
A specification for an official Org-mode citation syntax was drafted by
Richard Lawrence and enhanced with the help of others on the orgmode
mailing list.  Basic support for this citation style is added to the
reader.

This closes #1978.
2016-06-05 11:28:57 +02:00
John MacFarlane
061bc60f70 Merge pull request #2950 from tarleb/org-ref-support
Org reader: support org-ref style citations
2016-05-31 12:44:29 -07:00
Albert Krewinkel
c17c62a2c7 Org reader: support new syntax for export blocks
Org-mode version 9 uses a new syntax for export blocks.  Instead of
`#+BEGIN_<FORMAT>`, where `<FORMAT>` is the format of the block's
content, the new syntax uses `#+BEGIN_export <FORMAT>`.  Both
types are supported.
2016-05-29 21:08:50 +02:00
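
The two opening-line forms named in this commit can be told apart roughly as follows. This is a hedged Python sketch, not pandoc's implementation; `BEGIN_EXPORT` and `export_format` are hypothetical names, and the sketch does not filter out other block types such as source or quote blocks, which a real reader would have to handle first.

```python
import re

# Hypothetical pattern: the new form "#+BEGIN_export <FORMAT>" is tried first,
# then the old form "#+BEGIN_<FORMAT>".
BEGIN_EXPORT = re.compile(
    r"^\s*#\+BEGIN_(?:export\s+(?P<new>\w+)|(?P<old>\w+))\s*$",
    re.IGNORECASE,
)


def export_format(line):
    """Return the lower-cased format named by a block opener, or None."""
    m = BEGIN_EXPORT.match(line)
    if m is None:
        return None
    return (m.group("new") or m.group("old")).lower()
```

Both `export_format("#+BEGIN_export html")` and `export_format("#+BEGIN_HTML")` give the same format name, which is the point of supporting both syntaxes.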
Albert Krewinkel
f226cb88b0 Org reader: support org-ref style citations
The *org-ref* package is an org-mode extension commonly used to manage
citations in org documents.  Basic support for the `cite:citeKey` and
`[[cite:citeKey][prefix text::suffix text]]` syntax is added.
2016-05-27 21:19:28 +02:00
Albert Krewinkel
a4717c2fc5 Org reader: respect drawer export setting
The `d` export option can be used to control which drawers are exported
and which are discarded.  Basic support for this option is added here.
2016-05-23 09:44:37 +02:00
Albert Krewinkel
f3d27e4c80 Org reader/writer: use CUSTOM_ID in properties
The `ID` property is reserved for internal use by Org-mode and should
not be used.  The `CUSTOM_ID` property is to be used instead; it is
converted to the `ID` property for certain export formats.

The reader and writer erroneously used `ID`.  This is corrected by using
`CUSTOM_ID` where appropriate.
2016-05-22 23:01:47 +02:00
Albert Krewinkel
68d388f833 Org reader: add :PROPERTIES: drawer support
Headers can have optional `:PROPERTIES:` drawers associated with them.
These drawers contain key/value pairs like the header's `id`.  The
reader adds all listed pairs to the header's attributes; `id` and
`class` attributes are handled specially to match the way `Attr` is
defined.

This also changes behavior of how drawers of unknown type are handled.
Instead of including all unknown drawers, those are not read/exported,
thereby matching current Emacs behavior.

This closes #1877.
2016-05-20 17:01:26 +02:00
Albert Krewinkel
16e233475a Org reader: add support for ATTR_HTML attributes
Arbitrary key-value pairs can be added to some block types using a
`#+ATTR_HTML` line before the block.  Emacs Org-mode only includes these
when exporting to HTML, but since we cannot make this distinction here,
the attributes are always added.

The functionality is now supported for figures.

This closes #1906.
2016-05-19 09:55:12 +02:00
Albert Krewinkel
76143de97e Org reader: add support for sub/superscript export options
Org-mode allows export settings to be specified via `#+OPTIONS` lines.
Disabling simple sub- and superscripts is one of these export options;
this option is now supported.
2016-05-11 19:13:43 +02:00
Albert Krewinkel
10a809f126 Org reader: fix inline-LaTeX regression
The last fix for whitespace handling of inline LaTeX commands was
incorrect, preventing correct recognition of inline LaTeX commands which
contain spaces.  This fix ensures that only trailing whitespace is cut
off.
2016-05-09 19:06:04 +02:00
John MacFarlane
21d1a3b57c Merge pull request #2898 from tarleb/org-table-refactoring
Org reader: table parsing code refactoring and fixes
2016-05-05 16:22:56 -07:00
Albert Krewinkel
405c3e9c36 Org reader: fix spacing after LaTeX-style symbols
The org-reader was dropping space after unescaped LaTeX-style symbol
commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä`
instead.  This seems to be because the LaTeX-reader treats the
command-terminating space as part of the command.  Dropping the trailing
space from the symbol-command fixes this issue.
2016-05-04 23:16:23 +02:00
Albert Krewinkel
2d825603c6 Org reader: fix handling of empty table cells, rows
This fixes Org mode parsing of some corner cases regarding empty cells
and rows.  Empty cells weren't parsed correctly, e.g. `|||` should be
two empty cells, but would be parsed as a single cell containing a pipe
character.  Empty rows were parsed as alignment rows and dropped from
the output.

This fixes #2616.
2016-05-04 16:02:03 +02:00
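
The empty-cell case from this commit (`|||` is two empty cells) can be illustrated with a small Python sketch. The function name `split_org_row` is hypothetical and the logic is a simplification: it ignores escaped pipes and alignment rows, and is not the reader's actual code.

```python
def split_org_row(line):
    """Split one Org table row into a list of stripped cell strings."""
    body = line.strip()
    if not body.startswith("|"):
        raise ValueError("not an Org table row")
    # Drop the opening pipe, and the closing pipe if there is one, so that
    # every remaining pipe is a separator between two cells.
    body = body[1:]
    if body.endswith("|"):
        body = body[:-1]
    return [cell.strip() for cell in body.split("|")]
```

With this approach `split_org_row("|||")` gives two empty strings rather than a single cell containing a pipe character.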
Albert Krewinkel
d5e4bc179c Org reader: stop padding short table rows
Emacs Org-mode doesn't add any padding to table rows.  The first
row (header or first body row) is used to determine the column count, no
other magic is performed.

The org reader was padding rows to the length of the longest table row.
This was done due to a misunderstanding of how Org handles tables.  This
feature reflected how Org-mode handles tables when pressing <TAB>.  The
Org exporter however, which is what the reader should implement, doesn't
do any of this.  So this was a mis-feature that made the reader more
complex and reduced comparability.  It was hence removed.
2016-05-04 15:48:07 +02:00
Emanuel Evans
1bfe39e24c Ignore leading space in org code blocks
Fixes #2862

Also fix up tab handling for leading whitespace in code blocks.
2016-04-26 10:29:59 -07:00
John MacFarlane
29706ee02d Merge pull request #2646 from tarleb/org-figure-with-no-name
Prefix even empty figure names with "fig:"
2016-02-20 21:44:39 -08:00
Albert Krewinkel
92e6ae47f6 Org reader: Refactor link-target processing
Cleanup of the code for link target handling.  Most notably, the
canonicalization of a link is handled by a separate function.

This fixes #2684.
2016-01-31 23:23:09 +01:00
Albert Krewinkel
fabbd1aa79 Prefix even empty figure names with "fig:"
The convention used by pandoc for figures is to mark them by prefixing
the name with "fig:".  The org reader failed to do this if a figure had
no name.  The test for this was broken as well.

This fixes #2643.
2016-01-11 22:23:59 +01:00
Albert Krewinkel
b3b00da43d Fix function dropping subtrees tagged :noexport:
Continue scanning for comment subtrees beyond only the first block.

Note to self: when writing a recursive function, don't forget to, you
know, actually recurse.

Shout-out to @mrvdb for noticing this.

This fixes #2628.
2016-01-07 19:56:44 +01:00
John MacFarlane
d6a4c70ef8 Test fixes. 2015-12-12 12:58:31 -08:00
John MacFarlane
73e6333fae Merge pull request #2526 from tarleb/org-definition-lists-fix
Org reader: Require whitespace around def list markers
2015-11-13 14:07:56 -08:00
Albert Krewinkel
67cb2809fd Org reader: Require whitespace around def list markers
Definition list markers (i.e. double colons `::`) must be surrounded by
whitespace to start a definition item.  This rule was not checked
before, resulting in bugs with footnotes and some link types.

Thanks to @conklech for noticing and reporting this issue.

This fixes #2518.
2015-11-13 22:04:17 +01:00
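
The whitespace rule for `::` stated in this commit can be sketched in Python as below. This is an illustration only: `DEF_MARKER` and `split_definition` are hypothetical names, and the sketch works on the text of a list item after its bullet rather than on a full document.

```python
import re

# Assumed rule, taken from the commit message: "::" starts a definition only
# when it has whitespace before it and whitespace (or end of line) after it.
DEF_MARKER = re.compile(r"\s::(?:\s|$)")


def split_definition(item_text):
    """Return (term, definition) if the item has a marker, otherwise None."""
    m = DEF_MARKER.search(item_text)
    if m is None:
        return None
    return item_text[:m.start()].strip(), item_text[m.end():].strip()
```

A double colon embedded in other text, as in `[[cite:key][pre::post]]`, has no surrounding whitespace and is therefore not treated as a marker.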
Albert Krewinkel
220f3d12b8 Org reader: Fix emphasis rules for smart parsing
Smart quotes, ellipses, and dashes should behave like normal quotes,
single dashes, and dots with respect to text markup parsing.  The parser
state was not updated properly in all cases, which has been fixed.

Thanks to @conklech for reporting this issue.

This fixes #2513.
2015-11-13 20:43:46 +01:00
John MacFarlane
23b693c029 Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."
This reverts commit c423dbb5a3.
2015-11-09 10:08:22 -08:00
John MacFarlane
237c10992e Merge pull request #2505 from tarleb/org-header-markup-fix
Org reader: fix markup parsing in headers
2015-11-08 17:17:19 -08:00
John MacFarlane
c423dbb5a3 Use -XNoImplicitPrelude and 'import Prelude' explicitly.
This is needed for ghci to work with pandoc, given that we
now use a custom prelude.

Closes #2503.
2015-11-08 16:56:59 -08:00
Albert Krewinkel
e3b4844bd7 Org reader: fix markup parsing in headers
Markup as the very first item in a header wasn't recognized.  This was
caused by an incorrect parser state: positions at which inline markup
can start need to be marked explicitly by changing the parser state.
This wasn't done for headers.  The proper function to update the state
is now called at the beginning of the header parser, fixing this issue.

This fixes #2504.
2015-11-08 22:32:00 +01:00
John MacFarlane
45df87cf15 Merge pull request #2477 from tarleb/org-toggling-header-args
Org reader: allow toggling header args
2015-10-25 12:10:07 -07:00
Albert Krewinkel
27a8603278 Org reader: allow toggling header args
Org-mode allows skipping the argument of a code block header argument if
it's toggling a value.  Argument-less headers are now recognized,
avoiding weird parsing errors.

The fixes are not exactly pretty, but neither is the code that was
fixed.  So I guess it's about par for the course.  However, a rewrite of
the header parsing code wouldn't hurt in the long run.

Thanks to @jo-tham for filing the bug report.

This fixes #2269.
2015-10-25 08:54:00 +01:00
Albert Krewinkel
b27366780f Org reader: fix paragraph/list interaction
Paragraphs can be followed by lists, even if there is no blank line
between the two blocks.  However, this should only be true if the
paragraph is not within a list, where the preceding block should be
parsed as a plain instead of a paragraph (to allow for compact lists).

Thanks to @rgaiacs for bringing this up.

This fixes #2464.
2015-10-24 19:05:56 +02:00
John MacFarlane
82b3e0ab97 Use custom Prelude to avoid compiler warnings.
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude,
  but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.

This allows us to remove many imports of Data.Monoid
and Control.Applicative, and remove Text.Pandoc.Compat.Monoid.

It should allow us to use -Wall again for ghc 7.10.
2015-10-14 09:09:10 -07:00
Albert Krewinkel
8007dd97b5 Make sure verse blocks can contain empty lines
The previous verse parsing code made the faulty assumption that empty
strings are valid (and empty) inlines.  This isn't the case, so lines
are changed to contain at least a newline.

It would generally be nicer and faster to keep the newlines while
splitting the string.  However, this would require more code, which
seems unjustified for a block as simple (and fairly rare) as *verse*.

This fixes #2402.
2015-09-19 22:02:43 +02:00
Juliusz Gonera
f1c87ed164 Org reader: add auto identifiers if not present on headers
Refs #2354

This should also fix the table of contents (--toc) when generating an HTML file
from org input.
2015-08-15 07:57:48 +02:00
Albert Krewinkel
385dcf5b99 Org reader: drop trees with a :noexport: tag
Trees having a `:noexport:` tag set are not exported.  This mirrors
default Emacs Org-Mode behavior.
2015-05-23 14:23:16 +02:00
Albert Krewinkel
d8e4a8bc10 Org reader: put header tags into empty spans
Org mode allows headers to be tagged:

``` org-mode
* Headline         :TAG1:TAG2:
```

Instead of being interpreted as part of the headline, the tags are now
put into the attributes of empty spans.  Spans without textual content
won't be visible by default, but they are detectable by filters.  They
can also be styled using CSS when written as HTML.

This fixes #2160.
2015-05-23 14:06:32 +02:00
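
Separating the tags from the headline text, as in the `:TAG1:TAG2:` example above, might look like the following Python sketch. The helper `split_headline` and the set of characters allowed in a tag are assumptions for illustration; the sketch takes the headline text after the leading stars and does not show how the reader builds the empty spans.

```python
import re

# Assumed tag syntax: a trailing ":tag:tag:" group preceded by whitespace,
# with tags made of word characters plus "@", "#" and "%".
TAGS = re.compile(r"\s+:((?:[\w@#%]+:)+)\s*$")


def split_headline(text):
    """Return (title, tags) for headline text given without its stars."""
    m = TAGS.search(text)
    if m is None:
        return text.strip(), []
    tags = m.group(1).rstrip(":").split(":")
    return text[:m.start()].strip(), tags
```

A headline whose colons are not a trailing tag group, such as a time of day in the title, is returned unchanged with an empty tag list.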
John MacFarlane
2d2e4c9ab2 Merge branch 'master' of https://github.com/rootzlevel/pandoc into rootzlevel-master
Conflicts:
	src/Text/Pandoc/Readers/Org.hs
2015-03-28 21:09:38 -07:00
John MacFarlane
6a3a04c428 Merge branch 'errortype' of https://github.com/mpickering/pandoc into mpickering-errortype
Conflicts:
	benchmark/benchmark-pandoc.hs
	src/Text/Pandoc/Readers/Markdown.hs
	src/Text/Pandoc/Readers/Org.hs
	src/Text/Pandoc/Readers/RST.hs
	tests/Tests/Readers/LaTeX.hs
2015-03-28 12:12:48 -07:00
Craig S. Bosma
513221f822 Org reader: add support for smart punctuation 2015-03-09 07:11:53 -05:00
Hans-Peter Deifel
5871955169 Org reader: Add test for image links
Tests for image links with non-image targets, as introduced in
commit 2ca5101.
2015-02-26 13:11:50 +01:00
Matthew Pickering
1a7a99161a Update tests 2015-02-18 21:09:07 +00:00
Albert Krewinkel
4d85b17fc5 Org reader: properly handle links to file:target
Org links like `[[file:target][title]]` were not handled correctly,
parsing the link target verbatim.  The org reader is changed such that
the leading `file:` is dropped from the link target.

This is related to issues #756 and #1812.
2014-12-14 21:30:10 +01:00
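
The change described in this commit amounts to dropping the leading `file:` from a link target. A minimal Python sketch under that reading follows; `clean_link_target` is a hypothetical name, and the reader's real handling of link targets covers more cases than this.

```python
def clean_link_target(target):
    """Drop a leading 'file:' from an Org link target, if present."""
    prefix = "file:"
    if target.startswith(prefix):
        return target[len(prefix):]
    # Any other target is returned untouched.
    return target
```

So the target of `[[file:target][title]]` becomes `target`, while targets with other schemes pass through as they are.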
John MacFarlane
46d343f474 Fixed bug in org with bulleted lists:
- a
   - b
   * c

was being parsed as a list, even though an unindented `*`
should make a heading.  See
<http://orgmode.org/manual/Plain-lists.html#fn-1>.
2014-11-13 23:40:18 -08:00
Albert Krewinkel
e6cd8c9077 Org reader: allow empty links for gitit interop
While empty links are not allowed in Emacs org-mode,  Pandoc org-mode
should support them: gitit relies on empty links as they are used to
create wiki links.

Fixes jgm/gitit#471
2014-11-05 23:15:28 +01:00
Albert Krewinkel
daaf635806 Org reader: absolute, relative paths in links
The org reader was too restrictive when parsing links; some relative
links and links to files given as absolute paths were not recognized
correctly.  The org reader's link parsing function was amended to handle
such cases properly.

This fixes #1741
2014-11-05 22:27:25 +01:00
Albert Krewinkel
a5eb02f6a7 Org reader: parse LaTeX-style MathML entities
Org supports special symbols which can be included using LaTeX syntax,
but are actually MathML entities.  Examples of this are
`\nbsp` (non-breaking space), `\Aacute` (the letter A with accent acute)
or `\copy` (the copyright sign ©).

This fixes #1657.
2014-10-20 22:57:36 +02:00
John MacFarlane
84f6b1e41a Merge pull request #1680 from shelf/master
Respect indent when parsing Org bullet lists
2014-10-18 13:20:27 -07:00
John MacFarlane
31713d572a Merge pull request #1700 from tarleb/org-emphasis-fix
Org reader: fix rules for emphasis recognition
2014-10-18 13:19:42 -07:00