Commit graph

1918 commits

Author SHA1 Message Date
John MacFarlane
f5c2082304 Added JSON reader and writer.
The JSON reader is about 20x faster than the native reader.
So this can be a good way to serialize a pandoc document.
2010-12-11 00:06:03 -08:00
John MacFarlane
dab645440a Added Benchmark.hs to extra-source-files. 2010-12-10 23:58:02 -08:00
John MacFarlane
4c7f7853a7 Added Benchmark.hs, testing all readers + writers using criterion. 2010-12-10 23:35:31 -08:00
John MacFarlane
2dfb45950e LaTeX reader: Improved parsing of preamble.
Previously you'd get unexpected behavior on a document that
contained '\begin{document}' in, say, a verbatim block.
2010-12-10 23:21:24 -08:00
John MacFarlane
9602f73f2a Moved 'readers' and 'writers' to Text.Pandoc.
This allows library users to avoid repetitive case statements...
2010-12-10 17:30:32 -08:00
John MacFarlane
de6452c0d1 Markdown reader: small cosmetic code improvements. 2010-12-10 16:26:35 -08:00
John MacFarlane
5770ceca36 Removed HTML sanitization.
This is better done on the resulting HTML; use the xss-sanitize library
for this.  xss-sanitize is based on pandoc's sanitization, but improves
it.

- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
2010-12-10 12:26:03 -08:00
John MacFarlane
17d48cf4af Markdown reader: Allow linebreaks in URLs (treat as spaces).
Also, a string of consecutive spaces or tabs is now parsed
as a single space. If you have multiple spaces in your URL,
use %20%20.
2010-12-10 12:14:51 -08:00
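
The whitespace rule described in this commit can be illustrated at the string level. This is only a minimal sketch in Python, not pandoc's parser code: the function name `normalize_url_whitespace` is hypothetical, and it assumes that a line break is first treated as a space and that the resulting run of spaces or tabs then collapses to one space.

```python
import re


def normalize_url_whitespace(raw: str) -> str:
    """Approximate the rule from the commit message (hypothetical helper)."""
    # Treat line breaks inside the URL as spaces.
    with_spaces = raw.replace("\r\n", " ").replace("\n", " ")
    # Collapse any run of spaces or tabs into a single space.
    return re.sub(r"[ \t]+", " ", with_spaces)


print(normalize_url_whitespace("http://example.com/my\nfile"))
print(normalize_url_whitespace("http://example.com/my%20%20file"))
```

Percent-encoded spaces are ordinary characters to this function, which is why the commit recommends `%20%20` when a URL really needs two spaces.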
John MacFarlane
ee0a0953de Markdown reader: Rewrote para parser for better efficiency.
This change avoids repeated parsing of inline lists for 'plain'
blocks.
2010-12-10 10:47:46 -08:00
John MacFarlane
167eeef6cb Added json format for reading and writing.
This is faster to parse than native.
2010-12-09 10:40:31 -08:00
paul.rivier
bb609a85e3 textile redcloth definition lists 2010-12-09 09:25:46 -08:00
John MacFarlane
88a40685b8 Textile reader: better treatment of acronyms.
We now parse PBS(Public Broadcasting System) as if it were
"PBS (Public Broadcasting System)".
2010-12-09 08:52:09 -08:00
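
The acronym handling in this commit can be pictured as a text rewrite. The sketch below is a string-level illustration in Python, not the Textile reader itself, which is a parser. The name `space_out_acronyms` is hypothetical, and the definition of an acronym used here (two or more capital letters immediately followed by a parenthesized expansion) is an assumption.

```python
import re

# Assumption: an acronym is a run of two or more capital letters that is
# directly followed by a parenthesized expansion with no nested parentheses.
ACRONYM = re.compile(r"\b([A-Z]{2,})\(([^()]+)\)")


def space_out_acronyms(text: str) -> str:
    """Insert a space between an acronym and its expansion (hypothetical helper)."""
    return ACRONYM.sub(r"\1 (\2)", text)


print(space_out_acronyms("PBS(Public Broadcasting System)"))
```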
John MacFarlane
9ead748cc9 RST reader: Added footnote support.
Resolves issue #258.

Note that there are some differences in how docutils and
pandoc treat footnotes.  Currently pandoc ignores the numeral
or symbol used in the note; footnotes are put in an auto-numbered
ordered list.
2010-12-08 08:39:50 -08:00
John MacFarlane
91978d2201 Markdown reader: minor footnote changes.
Don't skipNonindentSpaces in noteMarker, since it's also
used in the inline note parser.
2010-12-08 08:17:16 -08:00
John MacFarlane
f02080b62d Textile reader: Implemented footnotes. 2010-12-08 00:44:46 -08:00
John MacFarlane
200ea33641 Made --smart work with RST reader. 2010-12-07 21:49:10 -08:00
John MacFarlane
5e35eb309f Make --smart work in HTML reader. 2010-12-07 21:24:35 -08:00
John MacFarlane
33ba35da9f Smart punctuation: recognize entities.
Now &ldquo;Hi&rdquo; gets parsed as a Quoted DoubleQuote inline.
2010-12-07 20:44:43 -08:00
John MacFarlane
3a5fceeef9 Rewrote normalizeSpaces (mostly aesthetic reasons). 2010-12-07 20:10:21 -08:00
John MacFarlane
e20052a1ba Markdown reader: Moved smartPunctuation parser, for slight speed bump. 2010-12-07 20:09:40 -08:00
John MacFarlane
ace3b80f1e Smart punctuation: don't allow ellipses containing spaces.
Previously we allowed '. . .', ' . . . ', etc.  This caused
too many complications, and removed author's flexibility in
combining ellipses with spaces and periods.
2010-12-07 20:08:14 -08:00
John MacFarlane
50ca61ef49 Moved smartPunctuation from Markdown to Parsing.
+ Parameterized smartPunctuation on an inline parser.
+ Handle smartPunctuation in Textile reader.
2010-12-07 19:03:08 -08:00
John MacFarlane
f917b46500 Textile reader: implemented acronyms, (tm), (r), (c). 2010-12-07 18:28:36 -08:00
John MacFarlane
6ef8a363dc Narrowed a long line in README. 2010-12-07 12:31:51 -08:00
John MacFarlane
3b3387b4a3 Improved process to create man page from README.
Previously it relied on pandoc already being installed.
Now it uses dist/package.conf.inplace.
2010-12-07 12:29:43 -08:00
John MacFarlane
581f8f77d5 Added Paulo Tanimoto to AUTHORS in markdown2pdf man page. 2010-12-07 12:09:52 -08:00
John MacFarlane
315e236d6a Use same options documentation in README and man page.
Later we will generate the man page from the README.
2010-12-07 09:36:03 -08:00
John MacFarlane
67445e7685 Fixed bugs in ieee.csl (Andrea Rossato). 2010-12-07 08:28:18 -08:00
John MacFarlane
5f283e4d47 Updated ieee citation test for punctuation-in-quote. 2010-12-07 08:14:26 -08:00
John MacFarlane
c66921f2ac Markdown reader: better handling of intraword _.
The 'str' parser now reads internal _'s as part of the string.
This prevents pandoc from getting started looking for an emphasized
block, which can cause exponential slowdowns in some cases.

Resolves Issue #182.
2010-12-06 22:12:18 -08:00
John MacFarlane
7864f30717 Markdown reader: handle curly quotes better.
Previously, curly quotes were just parsed literally, leading
to problems in some output formats.  Now they are parsed as
Quoted inlines, if --smart is specified.

Resolves Issue #270.
2010-12-06 20:36:58 -08:00
John MacFarlane
5a4609584c Fix regression: markdown references should be case-insensitive.
This broke when we added the Key type.  We had assumed that
the custom case-insensitive Ord instance would ensure case-insensitive
matching, but that is not how Data.Map works.

* Added a test case for case-insensitivity in markdown-reader-more
* Removed old refsMatch from Text.Pandoc.Parsing module;
* hid the 'Key' constructor;
* dropped the custom Ord and Eq instances, deriving instead;
* added fromKey and toKey to convert between Keys and Inline lists;
* toKey ensures that keys are case-insensitive, since this is the
  only way the API provides to construct a Key.

Resolves Issue #272.
2010-12-05 19:27:00 -08:00
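
The design described in this commit (normalize case once when a key is built, then rely on derived equality) can be sketched outside Haskell. The following is a rough Python analogue under stated assumptions, not pandoc's code: `Key`, `to_key`, and `from_key` are hypothetical stand-ins for the `Key` type, `toKey`, and `fromKey`, and plain strings replace Inline lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Reference label; equality and hashing are derived from the stored text."""
    text: str


def to_key(label: str) -> Key:
    # Normalize case at construction, so two keys compare equal exactly
    # when their labels differ only by case.
    return Key(label.lower())


def from_key(key: Key) -> str:
    # Return the normalized label held by the key.
    return key.text


# A reference table indexed by normalized keys.
refs = {to_key("Pandoc Home"): "http://example.com/"}
print(refs[to_key("PANDOC home")])
```

Python cannot hide the `Key` constructor the way the commit does in Haskell, so in this sketch calling `to_key` everywhere is a convention rather than a guarantee.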
John MacFarlane
37dc5d8c5d Documented citations in README. 2010-12-05 12:57:44 -08:00
John MacFarlane
fabd9e6c22 Documented fact that you can specify --bibliography repeatedly. 2010-12-05 12:57:27 -08:00
John MacFarlane
529f75adec README: Updated list of code contributors. 2010-12-05 09:52:28 -08:00
John MacFarlane
d52a01a926 Org writer: Minor changes to documentation header. 2010-12-05 09:48:54 -08:00
John MacFarlane
05b8017679 Documented org-mode writer in README, cabal, man pages. 2010-12-05 09:45:55 -08:00
John MacFarlane
e8e491bdbf Merge branch 'punchagan-master' 2010-12-05 09:34:40 -08:00
John MacFarlane
c277d29902 Documented all the formats citeproc/bibutils can handle. 2010-12-05 09:29:14 -08:00
Puneeth Chaganti
85263ecda9 Added templates/org.template to pandoc.cabal. 2010-12-05 11:18:02 +05:30
Puneeth Chaganti
4d48abcb12 Added tests.
+ Added tables.org and writer.org to tests.
    + Added org.template to templates.
    + Changed RunTests.hs as required.
    + Minor changes to Org writer.
2010-12-04 23:49:53 +05:30
Puneeth Chaganti
921e2b6e67 Added Org-mode writer
+ Added Text/Pandoc/Writers/Org.hs
    + Added to pandoc.cabal
    + Added to pandoc.hs and Text/Pandoc.hs exports.
2010-12-04 15:57:39 +05:30
John MacFarlane
5171de66c5 Updated README and pandoc man page with textile reader. 2010-12-03 23:50:03 -08:00
John MacFarlane
357b965b44 Merge branch 'citeproc' into master.
Conflicts:
	src/Text/Pandoc/Definition.hs
2010-12-03 23:43:47 -08:00
John MacFarlane
bea62bcab8 Textile reader: temporarily removed smartPunctuation.
The smartPunctuation parser from the markdown parser
was being used, but this creates two problems:

* smart punctuation rules are slightly different in textile,
  for example, a single dash with space around becomes an
  En dash.
* the following gets parsed as a double quoted string followed
  by a colon, rather than as a link:

  "emphasized text":http://my.url.com

This needs rethinking.
2010-12-03 23:10:52 -08:00
John MacFarlane
d4e512776d Textile reader: added hrule parser. 2010-12-03 23:10:52 -08:00
John MacFarlane
4bf9d362d2 Textile reader: Turn on smart punctuation by default. 2010-12-03 23:10:52 -08:00
John MacFarlane
0356ad4de6 Textile reader: drop leading, trailing newline in pre block.
This is consistent with how the other readers work.
2010-12-03 23:10:52 -08:00
John MacFarlane
36d4aa4a09 Textile reader: modified str to handle acronyms, hyphens.
* A single hyphen between two word characters is no longer a
  potential strikeout-starter.
* Acronym explanations are dropped.
2010-12-03 23:10:52 -08:00
John MacFarlane
55e43c4991 Use textile reader by default for .textile extension. 2010-12-03 23:10:52 -08:00