Commit graph

2964 commits

Author SHA1 Message Date
John MacFarlane
c302ab3133 Markdown writer: More improvements to 'plain' output, updated tests.
Math now appears in unicode if possible, without the distracting
italics around identifiers.

Blank lines around headers are more consistent.

Footnotes appear in regular [n] style.
2014-07-27 07:57:23 -07:00
John MacFarlane
cc24a1f3e5 Text.Pandoc.Pretty: added blanklines.
This ensures a certain number of blanklines (and no more) in output.
2014-07-27 07:56:55 -07:00
John MacFarlane
3eff3782c1 Markdown writer: Better 'plain' output.
We now largely follow the style of Project Gutenberg.

Emphasis is rendered with `_underscores_`, strong with ALL CAPS.

The appearance of horizontal rules has changed (even in regular
markdown) to a line across the whole page.

Headings are rendered differently, using space to set them off.
2014-07-27 00:17:15 -07:00
John MacFarlane
18d72a0768 Markdown writer: Update definition lists.
They now behave like the new reader does. The old behavior
can be activated with the `compact_definition_lists` extension.
2014-07-27 00:15:18 -07:00
John MacFarlane
b104db4fb4 Docx writer: Added missing case from last commit. 2014-07-26 23:32:13 -07:00
John MacFarlane
2610de0159 Docx writer: include abstract with Abstract style.
Addresses docx part of #1451.
2014-07-26 22:55:45 -07:00
John MacFarlane
d9751f91c4 Merge pull request #1457 from mpickering/generalstate
Generalised more in Parsing.hs to enable the use of custom state
2014-07-26 19:01:26 -07:00
Matthew Pickering
9e4604fa0b Added compatability layer to support directory-1.1 2014-07-27 00:36:23 +01:00
Matthew Pickering
f0a32197c8 Txt2Tags Reader: Added copyright information 2014-07-27 00:12:57 +01:00
Matthew Pickering
43304d6bd6 Txt2Tags Reader: Added recognition of macros 2014-07-27 00:12:56 +01:00
Matthew Pickering
ab3589ff0b Txt2Tags Reader: Integrated into pandoc 2014-07-27 00:12:56 +01:00
Matthew Pickering
7d04d383a6 Added txt2tags reader
http://txt2tags.org/

There are two points which currently do not match the official
implementation.

1. In the official implementation lists can not be nested like the
following but the reader would interpret this as a bullet list with the
first item being a numbered list.

```
  - + This is not a list
```

2. The specification describes how URIs automatically becomes links.
Unfortunately as is often the case, their definitiong of URI is not
clear. I tried three solutions but was unsure about which to adopt.

* Using isURI from Network.URI, this matches far too many strings and is
therefore unsuitable
* Using uri from Text.Pandoc.Shared, this doesn't match all strings that
the reference implementation matches
* Try to simulate the regex which is used in the native code

I went with the third approach but it is not perfect, for example
trailing punctuation is captured in Urls.
2014-07-27 00:12:56 +01:00
Matthew Pickering
5e2d22a27e Generalised more in Parsing.hs to enable the use of custom state 2014-07-26 21:23:49 +01:00
John MacFarlane
18f4490482 Fixed runtime error with compactify'DL on certain lists.
Closes #1452.  Added test.
2014-07-25 10:53:04 -07:00
John MacFarlane
9c3f7688ee DocBook reader: Better handle elements inside code environments.
Of course, we can't include structure in the code block, but
this way we at least preserve the text.  Closes #1449.
2014-07-23 10:06:36 -07:00
Matthew Pickering
83028c5982 Exported runParserT and Stream 2014-07-22 16:37:41 +01:00
Matthew Pickering
e045b1d5f2 Generalised readWith to readWithM 2014-07-22 15:13:54 +01:00
John MacFarlane
5debb492ef Revert "Shared.hierarchicalize: Don't number subsections of unnumbered sections."
This reverts commit 2a46042661.
2014-07-21 20:47:18 -07:00
John MacFarlane
2a46042661 Shared.hierarchicalize: Don't number subsections of unnumbered sections.
They were previously numbered, starting from the previous numbered
section, which was wrong.
2014-07-21 12:35:11 -07:00
John MacFarlane
3670dda17c Markdown writer: Avoid wrapping that might start a list.
Or a blockquote or header.  Closes #1013.
2014-07-21 00:03:51 -07:00
John MacFarlane
0f970ed95b EPUB writer: Avoid excess whitespace in nav.xhtml.
This should improve TOC view in iBooks.  Closes #1392.
2014-07-20 23:28:44 -07:00
John MacFarlane
98c7ada061 HTML reader: parse Div and Span elements even without --parse-raw.
Closes #1434.
2014-07-20 21:43:54 -07:00
John MacFarlane
b6c769084e Fix behavior of markdown_attribute extension.
It now works as in PHP markdown extra.  Setting `markdown="1"` on
an outer tag affects all contained tags until it is reversed with
`markdown="0"`.  Closes #1378.

Added `stateMarkdownAttribute` to `ParserState`.
2014-07-20 17:44:28 -07:00
John MacFarlane
a243afb551 Markdown reader: Fixed small bug in HTML parsing with markdown_attribute.
Test case:

    <aside markdown="1">
    *hi*
    </aside>

Previously gave:

    <article markdown="1">
    <p><em>hi</em> </article></p>
2014-07-20 17:22:29 -07:00
John MacFarlane
4af8eed764 Markdown reader: revised definition list syntax (closes #1429).
* This change brings pandoc's definition list syntax into alignment
  with that used in PHP markdown extra and multimarkdown (with the
  exception that pandoc is more flexible about the definition markers,
  allowing tildes as well as colons).

* Lazily wrapped definitions are now allowed; blank space is required
  between list items; and the space before definition is used to
  determine whether it is a paragraph or a "plain" element.

* For backwards compatibility, a new extension,
  `compact_definition_lists`, has been added that restores the behavior
  of pandoc 1.12.x, allowing tight definition lists with no blank space
  between items, and disallowing lazy wrapping.
2014-07-20 16:33:59 -07:00
John MacFarlane
cdc4ecbe98 readWith: reverted generalization from f201bdcb.
We need input to be a string so we can print the offending line
on an error.
2014-07-20 13:51:03 -07:00
John MacFarlane
87096c64f8 Org reader: text adjacent to a list yields a Plain, not Para.
This gives better results for tight lists.  Closes #1437.

An alternative solution would be to use Para everywhere, and
never Plain.  I am not sufficiently familiar with org to know
which is best.  Thoughts, @tarleb?
2014-07-20 12:56:01 -07:00
John MacFarlane
0f01421f81 AsciiDoc writer: Double markers in intraword emphasis.
Closes #1441.
2014-07-20 12:24:53 -07:00
Matthew Pickering
e7d8039969 Renamed readTeXMath' to avoid name conflict with texmath 0.6.7
Also removed deprecated readTeXMath.
2014-07-19 18:10:59 +01:00
Craig S. Bosma
1bb4f0c497 Org reader: Respect :exports header arguments on code blocks
Adds support to the org reader for conditionally exporting either the code block,
results block immediately following, both, or neither, depending on the value
of the `:exports` header argument. If no such argument is supplied, the default
org behavior (for most languages) of exporting code is used.
2014-07-17 10:23:22 -05:00
John MacFarlane
e053562280 Remove unused import. 2014-07-16 23:56:24 -07:00
John MacFarlane
0e9d3db244 Custom writers now work with --template.
Removed HTML header scaffolding from data/sample.lua.
2014-07-16 15:17:08 -07:00
John MacFarlane
2a881541a0 Made Citation information available in lua custom writer. 2014-07-16 09:32:41 -07:00
John MacFarlane
1bff443ac9 Removed redundant clause in markdown parser.
Thanks @dubiousjim.  Close #1431.
2014-07-16 07:55:39 -07:00
John MacFarlane
047f9b3714 Merge pull request #1430 from jkr/anchor-fix-2
Fix auto identified headers when already auto-id'ed
2014-07-15 20:27:28 -07:00
Jesse Rosenthal
4b2d07a642 Docx Reader: Fix hdr auto-id when already auto-id.
If header anchors (bookmarks in a header paragraph) already have an
auto-id, which will happen if they're generated by pandoc, we don't want
to rename it twice, and thus end up with an unnecessary number at the
end. So we add a state value to check if we're in a header. If we are,
we don't rename the bookmark -- wait until we rename it in our header
handling.
2014-07-16 03:50:38 +01:00
Jesse Rosenthal
a4671afd64 Docx Reader: Change state handling.
We don't need `updateDState` -- the built-in `modify` works just
fine. And we redefine `withDState` to use modify.
2014-07-16 03:43:14 +01:00
John MacFarlane
897c52880f HTML writer: Removed useless clause. 2014-07-15 16:49:48 -07:00
John MacFarlane
c24ab14918 LaTeX writer: Use \nolinkurl in email autolinks.
This allows them to be styled using `\urlstyle{tt}`.

Thanks to Ulrike Fischer for the solution.
2014-07-15 16:42:39 -07:00
John MacFarlane
73b0630217 EPUB writer: Keep newlines between block elements.
This allows easier diff-ability.  Closes #1424.
2014-07-15 15:41:42 -07:00
John MacFarlane
b80577b395 Shared.fetchItem: unescape URI encoding before reading local file.
Close #1427.
2014-07-15 12:17:45 -07:00
John MacFarlane
5883899625 RTF writer: Avoid extra paragraph tags in metadata.
Closes #1421.
2014-07-13 16:40:07 -07:00
John MacFarlane
3e95fd586d Use raw HTML for complex block quotes.
As far as I can see, dokuwiki markup is pretty limited in what
can go in a `>` block quote:  just a single line of paragraph
text.  (#1398)
2014-07-13 16:15:45 -07:00
John MacFarlane
81088281de DokuWiki writer: Use raw HTML for complex lists...
as in the mediawiki writer.  The dokuwiki markup isn't able
to handle multiple block-level items within a list item, except
in a few special cases (e.g. code blocks, and these must be started
on the same line as the preceding paragraph).  So we fall back to
raw HTML for these.

Perhaps there is a better solution.  We can "fake" multiple
paragraphs within list items using hard line breaks (`\\`), but
we must keep everything on one line.

(#1398)
2014-07-13 16:04:29 -07:00
John MacFarlane
0ba2f0b8f9 DokuWiki writer: Normalize to collapse adjacent raw HTML blocks. 2014-07-13 15:48:01 -07:00
John MacFarlane
71da4f8c55 DokuWiki writer: More tweaks to email links. (#1398) 2014-07-13 15:45:45 -07:00
John MacFarlane
80467d1b18 DokuWiki writer: Use pointy brackets for email links.
(#1398)
2014-07-13 15:36:14 -07:00
John MacFarlane
fbf7cbfdc8 Dokuwiki writer: More idiomatic code for escaping. 2014-07-13 15:32:16 -07:00
John MacFarlane
798c57d9b2 DokuWiki writer: More raw HTML fixes. (#1398)
* Use uppercase HTML tags for block-level content, lowercase for
  inline.
* Newline before closing HTML tag.
2014-07-13 15:28:47 -07:00
John MacFarlane
ce0960ba5a DokuWiki writer: Fix raw inlines and blocks.
* mediawiki > dokuwiki
* ignore raw content other than html or dokuwiki.

(#1398)
2014-07-13 15:25:25 -07:00