Commit graph

1604 commits

Author SHA1 Message Date
John MacFarlane
dcb0b02aa3 Markdown reader: handle 'id' and 'class' in parsing key/value attrs.
    # Header {id="myid" class="foo bar"}

is now equivalent to

    # Header {#myid .foo .bar}

Closes #2396.
2015-09-25 23:01:34 -07:00
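
The equivalence described in this commit can be illustrated with a small sketch. It is written in Python for illustration only and is not the Markdown reader's code; the function name `normalize_attrs` and the `(identifier, classes, key/value pairs)` return shape are assumptions modelled on how the commit message describes attributes.

```python
def normalize_attrs(pairs):
    """Fold 'id' and 'class' key/value pairs into the identifier and class list.

    `pairs` is a list of (key, value) tuples as they might come out of
    parsing `{id="myid" class="foo bar"}` (hypothetical input shape).
    """
    identifier = ""
    classes = []
    rest = []
    for key, value in pairs:
        if key == "id":
            identifier = value
        elif key == "class":
            # A class value may hold several space-separated class names.
            classes.extend(value.split())
        else:
            rest.append((key, value))
    return identifier, classes, rest


# Both spellings from the commit message end up as the same triple.
print(normalize_attrs([("id", "myid"), ("class", "foo bar")]))
```

Any key other than `id` and `class` is passed through untouched in this sketch.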
Frerich Raabe
eee992520c Improve text generated for <xref> by employing docbook-xsl heuristics
docbook-xsl, a set of XSLT scripts to generate HTML out of DocBook,
tries harder to generate a nice xref text. Depending on the element
being linked to, it looks at the title or other descriptive child
elements. Let's do that, too.
2015-09-24 18:28:51 +02:00
Frerich Raabe
35f12b5095 Added proper support for DocBook 'xref' elements
'xref' is used to create cross references to other parts of the
document. It is an empty element - the cross reference text depends on
various attributes. Quoting 'DocBook: The Definitive Guide':

  1. If the endterm attribute is specified on xref, the content of the
  element pointed to by endterm will be used as the text of the
  cross-reference.

  2. Otherwise, if the object pointed to has a specified XRefLabel, the
  content of that attribute will be used as the cross-reference text.
2015-09-24 18:26:55 +02:00
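
The two quoted rules can be sketched as follows. This is a Python illustration using the standard library's `xml.etree.ElementTree`, not the DocBook reader's implementation; the function name `xref_text` and the choice to return `None` when neither rule applies are assumptions.

```python
import xml.etree.ElementTree as ET


def xref_text(root, xref):
    """Return cross-reference text for an <xref> element, or None.

    Rule 1: if `endterm` is given, use the content of the element it names.
    Rule 2: otherwise, use the `xreflabel` of the element named by `linkend`.
    """
    # Index every element in the document by its 'id' attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

    endterm = xref.get("endterm")
    if endterm in by_id:
        return "".join(by_id[endterm].itertext()).strip()

    target = by_id.get(xref.get("linkend"))
    if target is not None and target.get("xreflabel"):
        return target.get("xreflabel")
    return None  # no rule applied; a caller would need its own fallback


doc = ET.fromstring(
    '<article><section id="s1" xreflabel="Intro">'
    '<title id="t1">Introduction</title></section>'
    '<para><xref linkend="s1" endterm="t1"/><xref linkend="s1"/></para>'
    '</article>'
)
for x in doc.iter("xref"):
    print(xref_text(doc, x))
```

Building the `id` index requires the whole parsed document, which is why access to the full tree matters for this kind of lookup.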
Frerich Raabe
f6538144f0 Pass the parsed DocBook content along the state of readDocBook
Having access to the entire document will be needed when handling
elements which refer to other elements. This is needed for e.g. <xref>
or <link>, both of which reference other elements (by the 'id'
attribute) for the label text.

I suppose that in practice, the [Content] returned by parseXML always
only contains one 'Elem' value -- the document element. However, I'm not
totally sure about it, so let's just pass all the Content along.
2015-09-23 19:31:25 +02:00
Frerich Raabe
3564cd82ca Minor refactoring to readDocBook
I plan to use the parsed and normalized XML tree read in readDocBook in
other places - prepare that commit by factoring this code out into a
separate, shared, definition.
2015-09-23 19:25:58 +02:00
John MacFarlane
72e71a1dad LaTeX reader: support longtable.
Closes #2411.
2015-09-23 08:34:37 -07:00
Albert Krewinkel
8007dd97b5 Make sure verse blocks can contain empty lines
The previous verse parsing code made the faulty assumption that empty
strings are valid (and empty) inlines.  This isn't the case, so lines
are changed to contain at least a newline.

It would generally be nicer and faster to keep the newlines while
splitting the string.  However, this would require more code, which
seems unjustified for a block as simple (and fairly rare) as *verse*.

This fixes #2402.
2015-09-19 22:02:43 +02:00
John MacFarlane
761e1edc30 Merge pull request #2364 from gbataille/bugDoc
[BUG] Haddock : * and ^ to be escaped in docs
2015-08-17 12:15:32 -07:00
Grégory Bataille
0dff30271c [BUG] Haddock : * and ^ to be escaped in docs
2015-08-17 09:03:33 +02:00
John MacFarlane
1f00a5395f RST reader: better handling of indirect roles.
Previously the parser failed on this kind of case

    .. role:: indirect(code)

    .. role:: py(indirect)
       :language: python

    :py:`hi`

Now it correctly recognizes `:py:` as a code role.

The previous test for this didn't work, because the
name of the indirect role was the same as the language
defined in its parent, so it didn't really test for this
behavior.  Updated test.
2015-08-15 10:22:47 -07:00
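
The role chain in the example above (`py` inherits from `indirect`, which inherits from `code`) can be resolved with a short loop. The following Python sketch is an illustration only; the `roles` table layout and the name `resolve_role` are assumptions, not the RST reader's data structures.

```python
def resolve_role(name, roles):
    """Follow custom-role definitions down to a base role.

    `roles` maps a custom role name to (parent role, attribute dict).
    Returns (base role name, merged attributes); attributes set closer
    to the role being used win over those set on its ancestors.
    """
    attrs = {}
    seen = set()
    while name in roles and name not in seen:
        seen.add(name)  # guard against cyclic definitions
        parent, own = roles[name]
        for key, value in own.items():
            attrs.setdefault(key, value)
        name = parent
    return name, attrs


roles = {
    "indirect": ("code", {}),
    "py": ("indirect", {"language": "python"}),
}
print(resolve_role("py", roles))
```

Here the role name (`py`) differs from the language (`python`), so the lookup genuinely exercises the indirection.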
John MacFarlane
8c579a5daa Merge pull request #2360 from jg/issue-2354
Org reader: add auto identifiers if not present on headers
2015-08-15 09:47:56 -07:00
Juliusz Gonera
f1c87ed164 Org reader: add auto identifiers if not present on headers
Refs #2354

This should also fix the table of contents (--toc) when generating an HTML file
from Org input
2015-08-15 07:57:48 +02:00
John MacFarlane
7b8c005d07 EPUB reader: stop mangling external URLs.
Closes #2284.

Note the changes to the test suite. In each case, a mangled
external link has been fixed, so these are all positive.
2015-08-10 16:35:43 -07:00
John MacFarlane
467e3be700 MediaWiki reader: handle unquoted table attributes.
Closes #2355.
2015-08-08 20:55:00 -07:00
John MacFarlane
2eec8cf61b HTML reader: add auto identifiers if not present on headers.
This makes TOC linking work properly.

The same thing needs to be done to the org reader to fix #2354;
in addition, `Ext_auto_identifiers` should be added to the list
of default extensions for org in Text.Pandoc.
2015-08-08 11:20:15 -07:00
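
For readers unfamiliar with automatic identifiers, here is a simplified Python sketch of the idea: derive an identifier from the header text and make it unique within the document. It approximates the rules documented for pandoc's `auto_identifiers` extension and is not the reader's own code; the function name and the `used` set are assumptions.

```python
import re


def auto_identifier(text, used):
    """Derive a unique identifier from header text (simplified sketch).

    `used` is the set of identifiers already taken in the document;
    the chosen identifier is added to it.
    """
    # Drop punctuation other than underscores, hyphens and periods.
    ident = re.sub(r"[^\w\s.-]", "", text)
    # Replace runs of whitespace with a hyphen and lowercase the result.
    ident = re.sub(r"\s+", "-", ident.strip()).lower()
    # Remove everything before the first letter.
    while ident and not ident[0].isalpha():
        ident = ident[1:]
    base = ident or "section"

    # Disambiguate repeated headers with a numeric suffix.
    candidate, n = base, 0
    while candidate in used:
        n += 1
        candidate = f"{base}-{n}"
    used.add(candidate)
    return candidate


used = set()
print(auto_identifier("Header identifiers in HTML", used))
print(auto_identifier("3. Applications", used))
print(auto_identifier("3. Applications", used))
```

Table-of-contents links depend on every header having such an identifier, which is why a reader that omits them breaks TOC linking.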
John MacFarlane
eaccef1491 DocBook reader: handle informalexample.
It is parsed into a Div with class `informalexample`.

Closes #2319.
2015-08-08 10:43:25 -07:00
John MacFarlane
bf7d858a9e LaTeX reader: Implement \Cite.
See #2335.
2015-08-08 10:32:03 -07:00
John MacFarlane
74c31abb1a Merge pull request #2327 from hftf/list-style
HTML Reader: Correctly parse inline list-style(-type) for <ol>
2015-08-07 11:08:53 -07:00
mb21
a010b83a75 Updated readers, writers and README for link attribute
2015-08-07 12:38:37 +02:00
John MacFarlane
92d48fa65b Updated readers and writers for new image attribute parameter.
(mb21)
2015-08-07 12:37:12 +02:00
Ophir Lifshitz
18b1b21a6a HTML Reader: Detect font-variant with pickStyleAttrProps
2015-07-27 20:08:04 -04:00
John MacFarlane
0baaa1080a Pipe tables: allow indented columns.
Previously the left-hand column could not start with 4 or
more spaces indent.  This was inconvenient for right-aligned
left columns.

Note that the first (header) column must still have 3 or fewer
spaces indentation, or the table will be treated as an indented
code block.
2015-07-27 10:24:06 -07:00
John MacFarlane
defcb5b6b1 Merge pull request #1689 from kuribas/master
Use '=' instead of '#' for atx-style headers in markdown+lhs.
2015-07-25 10:21:02 -07:00
Ophir Lifshitz
7ef8700734 HTML Reader: Parse <ol> type, class, and inline list-style(-type) CSS
2015-07-24 02:53:17 -04:00
MarLinn
f068093555 Added odt reader
Fully implemented features:

* Paragraphs
* Headers
* Basic styling
* Unordered lists
* Ordered lists
* External Links
* Internal Links
* Footnotes, Endnotes
* Blockquotes

Partly implemented features:

* Citations
  Very basic, but pandoc can't do much more
* Tables
  No headers, no sizing, limited styling
2015-07-23 15:37:01 -07:00
John MacFarlane
8390d935d8 Updated tests and removed a skipSpaces....
we no longer need it with the change to toKey, and it
is expensive to skip spaces after every inline.
2015-07-23 15:35:18 -07:00
John MacFarlane
5db4787330 Merge pull request #2323 from hftf/implicit-header-refs
Fix implicit header refs for headers with extra spaces
2015-07-23 14:46:38 -07:00
John MacFarlane
66a72b8eec LaTeX reader: support abstract environment.
The abstract populates an "abstract" metadata field.
2015-07-23 09:31:46 -07:00
Ophir Lifshitz
42c139d302 Markdown Reader: Skip spaces in headers
2015-07-23 02:29:37 -04:00
John MacFarlane
fa2c008ae5 Fix regression: allow HTML comments containing --.
Technically this isn't allowed in an HTML comment, but
we've always allowed it, and so do most other implementations.
It is handy if e.g. you want to put command line arguments
in HTML comments.
2015-07-21 22:44:18 -07:00
John MacFarlane
da0842b5b5 HTML reader: handle type attribute on ol.
E.g. `<ol type="i">`.

Closes #2313.
2015-07-21 13:07:52 -07:00
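
As an illustration of what handling the `type` attribute involves, the sketch below maps the five HTML `<ol type>` values to list number styles. It is Python, not the HTML reader's code; the style names follow pandoc's `ListNumberStyle` constructors, while the table and function names are assumptions.

```python
# HTML <ol type="..."> values and the numbering style each one denotes.
OL_TYPE_TO_STYLE = {
    "1": "Decimal",
    "a": "LowerAlpha",
    "A": "UpperAlpha",
    "i": "LowerRoman",
    "I": "UpperRoman",
}


def list_style(type_attr):
    """Return the number style for an <ol type> value, or a default."""
    return OL_TYPE_TO_STYLE.get(type_attr, "DefaultStyle")


print(list_style("i"))
print(list_style(None))
```

The lookup is case-sensitive because `a`/`A` and `i`/`I` denote different styles.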
John MacFarlane
f6ad9e263f LaTeX reader: properly handle booktabs lines.
Lines aren't part of the pandoc table model, but we can just
ignore them.

Closes #2307.
2015-07-21 10:26:29 -07:00
John MacFarlane
9e0fb844a9 Markdown reader: don't allow bare URI links or autolinks in link label.
Added test cases.

Closes #2300.
2015-07-14 13:16:40 -07:00
John MacFarlane
99fe8594d9 Avoid parsing partial URLs as HTML tags.
Closes #2277.
2015-07-10 10:33:27 -07:00
Lars-Dominik Braun
b2adf44e75 Readers.RST: Factor out inline markup string parsing
2015-07-03 16:42:51 +02:00
Lars-Dominik Braun
68b6b9f652 Readers.RST: Parse field list name
“Inline markup is parsed in field names.” [1]

[1] http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#field-lists
2015-07-03 16:41:28 +02:00
John MacFarlane
226a5cd6a9 Merge pull request #2250 from PromyLOPh/rsttarget
Fix RST reference names with special characters
2015-06-29 10:21:40 -07:00
John MacFarlane
754d1cef7b LaTeX reader: Allow _ and ^ as regular inline text.
Normally these will cause an error in LaTeX, but there
are contexts (e.g. `alltt` environments) where they are
okay.  Now that we aren't treating them as super/subscript
outside of math mode, it seems okay to parse them as regular
text.
2015-06-29 10:20:08 -07:00
John MacFarlane
457fbebabc LaTeX reader: don't parse _,^ as super/sub outside math mode.
2015-06-29 09:46:57 -07:00
Lars-Dominik Braun
3b2c50ed93 Fix RST reference names with special characters
2015-06-29 18:34:45 +02:00
mb21
82e363a727 DocBook reader mediaobjects and figures, closes #2184
2015-06-21 18:36:47 +02:00
John MacFarlane
cb7dd4469d HTML reader: allow <body> to close <head>.
2015-06-04 10:23:38 +02:00
John MacFarlane
e54c8613e8 Removed tab chars in Textile reader source.
2015-05-28 13:07:52 -07:00
John MacFarlane
24bfc8274e Merge pull request #2170 from tarleb/org-generalize-result-block
Org generalize result block
2015-05-26 17:14:42 -07:00
Albert Krewinkel
385dcf5b99 Org reader: drop trees with a :noexport: tag
Trees having a `:noexport:` tag set are not exported.  This mirrors
default Emacs Org-Mode behavior.
2015-05-23 14:23:16 +02:00
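
The pruning behaviour can be sketched on plain lines of Org text. This Python example is a line-based approximation for illustration, not the Org reader's parser; the regular expression, the function name `drop_noexport`, and the sample document are all assumptions.

```python
import re

# A headline: leading stars, a title, and optional trailing :tag1:tag2: tags.
HEADLINE = re.compile(r"^(\*+)\s+(.*?)(?:\s+:([\w@#%:]+):)?\s*$")


def drop_noexport(lines):
    """Remove every tree whose headline carries a :noexport: tag."""
    out = []
    skip_level = None  # level of the noexport headline currently being skipped
    for line in lines:
        m = HEADLINE.match(line)
        if m:
            level = len(m.group(1))
            tags = m.group(3).split(":") if m.group(3) else []
            if skip_level is not None and level > skip_level:
                continue  # sub-headline inside a dropped tree
            skip_level = level if "noexport" in tags else None
            if skip_level is not None:
                continue  # the tagged headline itself
        elif skip_level is not None:
            continue  # body text inside a dropped tree
        out.append(line)
    return out


doc = [
    "* Intro",
    "text",
    "* Secret :noexport:",
    "hidden",
    "** Child",
    "more",
    "* Public",
    "shown",
]
print(drop_noexport(doc))
```

A tree ends at the next headline of the same or a shallower level, so `* Public` is kept even though it follows the dropped tree.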
Albert Krewinkel
d8e4a8bc10 Org reader: put header tags into empty spans
Org mode allows headers to be tagged:

``` org-mode
* Headline         :TAG1:TAG2:
```

Instead of being interpreted as part of the headline, the tags are now
put into the attributes of empty spans.  Spans without textual content
won't be visible by default, but they are detectable by filters.  They
can also be styled using CSS when written as HTML.

This fixes #2160.
2015-05-23 14:06:32 +02:00
Albert Krewinkel
b61355cecd Org reader: generalize code block result parsing
Code blocks can be followed by optional result blocks, representing the
output generated by running the code in the code block.  It is possible
to choose whether one wants to export the code, the result, both or
none.

This patch allows any kind of `Block` as the result.  Previously, only
example code blocks were recognized.
2015-05-23 13:22:07 +02:00
Albert Krewinkel
40fb102417 Reorder block arguments parsing code
Group code used to parse block arguments together in one place.  This
seems better than having part of the code mixed between unrelated
parsing state changing functions.
2015-05-23 13:17:10 +02:00
John MacFarlane
24ee1ab4f7 Markdown reader: Made implicit header references case-insensitive.
Added `stateHeaderKeys` to `ParserState`; this is a `KeyTable`
like `stateKeys`, but it only gets consulted if we don't find
a match in `stateKeys`, and if `Ext_implicit_header_references`
is enabled.

Closes #1606.
2015-05-13 23:12:58 -07:00
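
The two-table lookup described in this commit can be sketched as follows. This is a Python illustration under stated assumptions: `normalize_key` stands in for whatever key normalization the parser applies, and the dictionaries and function names are hypothetical rather than the `ParserState` fields themselves.

```python
def normalize_key(label):
    """Collapse whitespace and ignore case so lookups are case-insensitive."""
    return " ".join(label.split()).lower()


def lookup_reference(label, keys, header_keys, implicit_header_references=True):
    """Resolve a reference label to a link target, or None.

    Explicit reference definitions (`keys`) always win; header-derived
    keys are consulted only as a fallback, and only if the extension is on.
    """
    key = normalize_key(label)
    if key in keys:
        return keys[key]
    if implicit_header_references:
        return header_keys.get(key)
    return None


keys = {normalize_key("Foo"): "http://example.com/foo"}
header_keys = {
    normalize_key("My Header"): "#my-header",
    normalize_key("Foo"): "#foo",
}
print(lookup_reference("my   HEADER", keys, header_keys))
print(lookup_reference("foo", keys, header_keys))
```

Keeping header keys in a separate table means an explicit definition is never shadowed by a header that happens to share its text.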
John MacFarlane
e06810499e HTML reader: Support base tag.
We only support the href attribute, as there's no place for
"target" in the Pandoc document model for links.

Added HTML reader test module, with tests for this feature.

Closes #1751.
2015-05-13 20:53:19 -07:00
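
The effect of honouring a `<base href>` can be illustrated with Python's standard `urllib.parse.urljoin`. This is a sketch of the general technique, not the HTML reader's implementation; the function name `resolve_href` and the example URLs are assumptions.

```python
from urllib.parse import urljoin


def resolve_href(base_href, href):
    """Resolve a link target against the document's <base href>, if any."""
    if not base_href:
        return href
    return urljoin(base_href, href)


base = "http://example.com/docs/"
print(resolve_href(base, "page.html"))           # relative link is rebased
print(resolve_href(base, "http://other.org/x"))  # absolute link is unchanged
print(resolve_href(None, "page.html"))           # no <base>: left as written
```

Only `href` is used here; as the commit notes, `target` has no counterpart in the link model.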