Commit graph

2573 commits

Author SHA1 Message Date
Nathan Gass
c2d3796439 Added support for latex cite commands in latex reader. 2010-12-13 20:48:19 -08:00
Nathan Gass
c81495a07a Added option to write citation markup in markdown writer. 2010-12-13 20:42:58 -08:00
Nathan Gass
48600fd547 Added support to write natbib or biblatex citations in latex output. 2010-12-13 20:41:37 -08:00
John MacFarlane
1a4a0d0283 Markdown reader: Further fix to abbrevs. 2010-12-13 20:05:50 -08:00
John MacFarlane
7b4d3c77ec Markdown reader: Fixed abbrev handler to allow abbrev at end of line.
E.g., Mr.
Frank.
2010-12-13 20:04:11 -08:00
John MacFarlane
3822d6c440 Markdown reader: Fixed referenceKey parser to allow space after newline. 2010-12-13 20:03:59 -08:00
John MacFarlane
dfbb4d3994 Fixed inlineListToIdentifier to treat '\160' as ' '. 2010-12-13 20:03:52 -08:00
John MacFarlane
71e0557e61 Markdown reader: Fixed regression in reference key parser.
* The recent change allowing spaces and newlines in the URL
  caused problems when reference keys are stacked up without
  blank lines between. This is now fixed.
* Added test.
2010-12-13 20:03:12 -08:00
John MacFarlane
3748dfeb91 Markdown reader: fix superscripts with links.
Moved inlineNote parser after superscript parser,
so ^[link](/foo)^ gets recognized as a superscripted
link, not an inline note followed by garbage.

Thanks to Conal Elliott for pointing out the problem.
2010-12-12 20:30:55 -08:00
John MacFarlane
f5c2082304 Added JSON reader and writer.
The JSON reader is about 20x faster than the native reader.
So this can be a good way to serialize a pandoc document.
2010-12-11 00:06:03 -08:00
John MacFarlane
2dfb45950e LaTeX reader: Improved parsing of preamble.
Previously you'd get unexpected behavior on a document that
contained '\begin{document}' in, say, a verbatim block.
2010-12-10 23:21:24 -08:00
John MacFarlane
9602f73f2a Moved 'readers' and 'writers' to Text.Pandoc.
This allows library users to avoid repetitive case statements...
2010-12-10 17:30:32 -08:00
John MacFarlane
de6452c0d1 Markdown reader: small cosmetic code improvements. 2010-12-10 16:26:35 -08:00
John MacFarlane
5770ceca36 Removed HTML sanitization.
This is better done on the resulting HTML; use the xss-sanitize library
for this.  xss-sanitize is based on pandoc's sanitization, but improves
it.

- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
2010-12-10 12:26:03 -08:00
John MacFarlane
17d48cf4af Markdown reader: Allow linebreaks in URLs (treat as spaces).
Also, a string of consecutive spaces or tabs is now parsed
as a single space. If you have multiple spaces in your URL,
use %20%20.
2010-12-10 12:14:51 -08:00
John MacFarlane
ee0a0953de Markdown reader: Rewrote para parser for better efficiency.
This change avoids repeated parsing of inline lists for 'plain'
blocks.
2010-12-10 10:47:46 -08:00
paul.rivier
bb609a85e3 textile redcloth definition lists 2010-12-09 09:25:46 -08:00
John MacFarlane
88a40685b8 Textile reader: better treatment of acronyms.
We now parse PBS(Public Broadcasting System) as if it were
"PBS (Public Broadcasting System)".
2010-12-09 08:52:09 -08:00
John MacFarlane
9ead748cc9 RST reader: Added footnote suppport.
Resolves issue #258.

Note that there are some differences in how docutils and
pandoc treat footnotes.  Currently pandoc ignores the numeral
or symbol used in the note; footnotes are put in an auto-numbered
ordered list.
2010-12-08 08:39:50 -08:00
John MacFarlane
91978d2201 Markdown reader: minor footnote changes.
Don't skipNonindentSpaces in noteMarker, since it's also
used in the inline note parser.
2010-12-08 08:17:16 -08:00
John MacFarlane
f02080b62d Textile reader: Implemented footnotes. 2010-12-08 00:44:46 -08:00
John MacFarlane
200ea33641 Made --smart work with RST reader. 2010-12-07 21:49:10 -08:00
John MacFarlane
5e35eb309f Make --smart work in HTML reader. 2010-12-07 21:24:35 -08:00
John MacFarlane
33ba35da9f Smart punctuation: recognize entities.
Now “Hi” gets parsed as a Quoted DoubleQuote inline.
2010-12-07 20:44:43 -08:00
John MacFarlane
3a5fceeef9 Rewrote normalizeSpaces (mostly aesthetic reasons). 2010-12-07 20:10:21 -08:00
John MacFarlane
e20052a1ba Markdown reader: Moved smartPunctuation parser, for slight speed bump. 2010-12-07 20:09:40 -08:00
John MacFarlane
ace3b80f1e Smart punctuation: don't alllow ellipses containing spaces.
Previously we allowed '. . .', ' . . . ', etc.  This caused
too many complications, and removed author's flexibility in
combining ellipses with spaces and periods.
2010-12-07 20:08:14 -08:00
John MacFarlane
50ca61ef49 Moved smartPunctuation from Markdown to Parsing.
+ Parameterized smartPunctuation on an inline parser.
+ Handle smartPunctuation in Textile reader.
2010-12-07 19:03:08 -08:00
John MacFarlane
f917b46500 Textile reader: implemented acronyms, (tm), (r), (c). 2010-12-07 18:28:36 -08:00
John MacFarlane
c66921f2ac Markdown reader: better handling of intraword _.
The 'str' parser now reads internal _'s as part of the string.
This prevents pandoc from getting started looking for an emphasized
block, which can cause exponential slowdowns in some cases.

Resolves Issue #182.
2010-12-06 22:12:18 -08:00
John MacFarlane
7864f30717 Markdown reader: handle curly quotes better.
Previously, curly quotes were just parsed literally, leading
to problems in some output formats.  Now they are parsed as
Quoted inlines, if --smart is specified.

Resolves Issue #270.
2010-12-06 20:36:58 -08:00
John MacFarlane
5a4609584c Fix regression: markdown references should be case-insensitive.
This broke when we added the Key type.  We had assumed that
the custom case-insensitive Ord instance would ensure case-insensitive
matching, but that is not how Data.Map works.

* Added a test case for case-insensitivity in markdown-reader-more
* Removed old refsMatch from Text.Pandoc.Parsing module;
* hid the 'Key' constructor;
* dropped the custom Ord and Eq instances, deriving instead;
* added fromKey and toKey to convert between Keys and Inline lists;
* toKey ensures that keys are case-insensitive, since this is the
  only way the API provides to construct a Key.

Resolves Issue #272.
2010-12-05 19:27:00 -08:00
John MacFarlane
d52a01a926 Org writer: Minor changes to documentation header. 2010-12-05 09:48:54 -08:00
Puneeth Chaganti
4d48abcb12 Added tests.
+ Added tables.org and writer.org to tests.
    + Added org.template to templates.
    + Changed RunTests.hs as required.
    + Minor changes to Org writer.
2010-12-04 23:49:53 +05:30
Puneeth Chaganti
921e2b6e67 Added Org-mode writer
+ Added Text/Pandoc/Writers/Org.hs
    + Added to pandoc.cabal
    + Added to pandoc.hs and Text/Pandoc.hs exports.
2010-12-04 15:57:39 +05:30
John MacFarlane
357b965b44 Merge branch 'citeproc' into master.
Conflicts:
	src/Text/Pandoc/Definition.hs
2010-12-03 23:43:47 -08:00
John MacFarlane
bea62bcab8 Textile reader: temporarily removed smartPunctuation.
The smartPuncutation parser from the markdown parser
was being used, but this creates two problems:

* smart punctuation rules are slightly different in textile,
  for example, a single dash wish space around becomes an
  En dash.
* the following gets parsed as a double quoted string followed
  by a colon, rather than as a link:

  "emphasized text":http://my.url.com

This needs rethinking.
2010-12-03 23:10:52 -08:00
John MacFarlane
d4e512776d Textile reader: added hrule parser. 2010-12-03 23:10:52 -08:00
John MacFarlane
4bf9d362d2 Textile reader: Turn on smart punctuation by default. 2010-12-03 23:10:52 -08:00
John MacFarlane
0356ad4de6 Textile reader: drop leading, trailing newline in pre block.
This is consistent with how the other readers work.
2010-12-03 23:10:52 -08:00
John MacFarlane
36d4aa4a09 Textile reader: modified str to handle acronyms, hyphens.
* A single hyphen between two word characters is no longer a
  potential strikeout-starter.
* Acronym explanations are dropped.
2010-12-03 23:10:52 -08:00
John MacFarlane
f415e9e119 Textile reader: parse raw by default.
It's part of the textile spec to allow raw HTML,
just as with markdown.
-R is no longer needed in test suite.
2010-12-03 23:10:52 -08:00
paul.rivier
c3866f3c66 punctuation handling, and more html-specific handling 2010-12-03 23:10:52 -08:00
Paul Rivier
d724c6b568 html inlines and html blocks handling in textile reader 2010-12-03 23:10:51 -08:00
Paul Rivier
fa0866886b textile reader now ignores html/css attributes 2010-12-03 23:10:51 -08:00
Paul Rivier
e6dde36622 removed support for textile Inserted construct 2010-12-03 23:10:51 -08:00
Paul Rivier
593b4f6c94 fix autolink by promoting it in the parser list, fix table parabreak 2010-12-03 23:10:51 -08:00
Paul Rivier
a7da0672dc more support for Textile reader (explicit links, images), tests and cabal entries 2010-12-03 23:10:51 -08:00
paul.rivier
cfc70863a3 simpler table cell handling 2010-12-03 23:10:51 -08:00
paul.rivier
d917db5e42 preliminary material toward table support 2010-12-03 23:10:51 -08:00