Commit graph

4833 commits

Author SHA1 Message Date
John MacFarlane
fe1152985c Shared: Made splitBy take a test instead of an element. 2010-12-21 08:41:24 -08:00
John MacFarlane
4e446358d1 XML: Replaced escapeStringAsXML with a faster version.
Benchmarked with criterion, it's about 8x faster than
the old version.  This speeds up docbook, opendocument,
and html writers.
2010-12-21 08:23:48 -08:00
John MacFarlane
78cea94f45 Markdown writer: use \ for newline instead of two spaces at eol.
(Unless --strict.)
2010-12-20 19:36:40 -08:00
John MacFarlane
8889ae8b5b Markdown writer: Use delimited code block if there are attributes.
(Unless in strict mode.)
2010-12-20 19:36:40 -08:00
John MacFarlane
0086329c36 Plain writer: set stateStrictMarkdown automatically. 2010-12-20 19:36:40 -08:00
John MacFarlane
2587543457 ConTeXt writer: Updated to use Text.Pandoc.Pretty. 2010-12-20 19:36:35 -08:00
John MacFarlane
112717de4e Renamed 'enclosed' to 'inside'.
This avoids conflict with 'enclosed' in Text.Pandoc.Parsing.
2010-12-20 19:09:01 -08:00
John MacFarlane
2fe271d163 Pretty: Fixed parens. 2010-12-19 17:20:18 -08:00
John MacFarlane
9120514998 Pretty: Added enclosed, parens. 2010-12-19 12:39:49 -08:00
John MacFarlane
59cc27c10b LaTeX writer: A bit of code polish. 2010-12-19 10:21:16 -08:00
John MacFarlane
99a58e51f5 LaTeX writer: Modified to use Pretty.
Improved footnote formatting, removed spurious blank lines.
2010-12-19 10:14:12 -08:00
John MacFarlane
09aec9f3e3 Shared: Use stringify to simplify inlineListToIdentifier. 2010-12-19 10:13:36 -08:00
John MacFarlane
6aa5010617 Pretty: Added braces and brackets. 2010-12-19 10:13:11 -08:00
John MacFarlane
89bf312765 LaTeX writer: Use \paragraph, \subparagraph for level 4,5 headers. 2010-12-18 15:05:21 -08:00
John MacFarlane
543aa28c38 Added new prettyprinting module.
* Added Text.Pandoc.Pretty.
  This is better suited for pandoc than the 'pretty' package.
  One advantage is that we now get proper wrapping; Emph [Inline]
  is no longer treated as a big unwrappable unit. Previously
  we only got breaks for spaces at the "outer level." We can also
  more easily avoid doubled blank lines.  Performance is
  significantly better as well.

* Removed Text.Pandoc.Blocks.
  Text.Pandoc.Pretty allows you to define blocks and concatenate
  them.

* Modified markdown, RST, org readers to use Text.Pandoc.Pretty
  instead of Text.PrettyPrint.HughesPJ.

* Text.Pandoc.Shared:  Added writerColumns to WriterOptions.

* Markdown, RST, Org writers now break text at writerColumns.

* Added --columns command-line option, which sets stColumns
  and writerColumns.

* Table parsing:  If the size of the header > stColumns,
  use the header size as 100% for purposes of calculating
  relative widths of columns.
2010-12-17 13:39:17 -08:00
John MacFarlane
63cf37a9ca HTML reader: allow : in tags.
Resolves Issue #274.
2010-12-15 14:15:53 -08:00
Nathan Gass
a312d2a8ae Use top-level header at end as bibliography title for natbib and biblatex output. 2010-12-15 10:21:56 -08:00
Nathan Gass
8f60176511 Remove punctuation at start of suffix for natbib and biblatex output.
This is necessary as the latex citation commands include there own
punctuation, which resulted in doubled commas for markdown documents
where citeproc output works correctly.
2010-12-15 10:21:53 -08:00
Nathan Gass
43fee5e7f7 Support multiple bibliography files with natbib and biblatex output. 2010-12-15 10:21:47 -08:00
John MacFarlane
63d5e0c5f9 Added 'normalize' to Text.Pandoc.Shared. 2010-12-14 20:04:37 -08:00
John MacFarlane
3ac6f72f98 Fixed preamble parsing in LaTeX reader. 2010-12-14 19:34:28 -08:00
John MacFarlane
128cf46089 Fixed regression in parsing _emph_
There was a bug in parsing '_emph_, ...':  when followed by
a comma, underscore emphasis did not register.  (Thanks to
gwern for pointing this out.)

This bug was introduced by the change in
c66921f2ac
2010-12-14 18:23:26 -08:00
Nathan Gass
2e728df756 Moved special handling of punctuation in suffix out of markdown reader.
This allows different writers to handle punctuation in the suffix
differently.
2010-12-13 20:50:29 -08:00
Nathan Gass
c2d3796439 Added support for latex cite commands in latex reader. 2010-12-13 20:48:19 -08:00
Nathan Gass
c81495a07a Added option to write citation markup in markdown writer. 2010-12-13 20:42:58 -08:00
Nathan Gass
48600fd547 Added support to write natbib or biblatex citations in latex output. 2010-12-13 20:41:37 -08:00
John MacFarlane
1a4a0d0283 Markdown reader: Further fix to abbrevs. 2010-12-13 20:05:50 -08:00
John MacFarlane
7b4d3c77ec Markdown reader: Fixed abbrev handler to allow abbrev at end of line.
E.g., Mr.
Frank.
2010-12-13 20:04:11 -08:00
John MacFarlane
3822d6c440 Markdown reader: Fixed referenceKey parser to allow space after newline. 2010-12-13 20:03:59 -08:00
John MacFarlane
dfbb4d3994 Fixed inlineListToIdentifier to treat '\160' as ' '. 2010-12-13 20:03:52 -08:00
John MacFarlane
71e0557e61 Markdown reader: Fixed regression in reference key parser.
* The recent change allowing spaces and newlines in the URL
  caused problems when reference keys are stacked up without
  blank lines between. This is now fixed.
* Added test.
2010-12-13 20:03:12 -08:00
John MacFarlane
3748dfeb91 Markdown reader: fix superscripts with links.
Moved inlineNote parser after superscript parser,
so ^[link](/foo)^ gets recognized as a superscripted
link, not an inline note followed by garbage.

Thanks to Conal Elliott for pointing out the problem.
2010-12-12 20:30:55 -08:00
John MacFarlane
2dfb45950e LaTeX reader: Improved parsing of preamble.
Previously you'd get unexpected behavior on a document that
contained '\begin{document}' in, say, a verbatim block.
2010-12-10 23:21:24 -08:00
John MacFarlane
de6452c0d1 Markdown reader: small cosmetic code improvements. 2010-12-10 16:26:35 -08:00
John MacFarlane
5770ceca36 Removed HTML sanitization.
This is better done on the resulting HTML; use the xss-sanitize library
for this.  xss-sanitize is based on pandoc's sanitization, but improves
it.

- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
2010-12-10 12:26:03 -08:00
John MacFarlane
17d48cf4af Markdown reader: Allow linebreaks in URLs (treat as spaces).
Also, a string of consecutive spaces or tabs is now parsed
as a single space. If you have multiple spaces in your URL,
use %20%20.
2010-12-10 12:14:51 -08:00
John MacFarlane
ee0a0953de Markdown reader: Rewrote para parser for better efficiency.
This change avoids repeated parsing of inline lists for 'plain'
blocks.
2010-12-10 10:47:46 -08:00
paul.rivier
bb609a85e3 textile redcloth definition lists 2010-12-09 09:25:46 -08:00
John MacFarlane
88a40685b8 Textile reader: better treatment of acronyms.
We now parse PBS(Public Broadcasting System) as if it were
"PBS (Public Broadcasting System)".
2010-12-09 08:52:09 -08:00
John MacFarlane
9ead748cc9 RST reader: Added footnote suppport.
Resolves issue #258.

Note that there are some differences in how docutils and
pandoc treat footnotes.  Currently pandoc ignores the numeral
or symbol used in the note; footnotes are put in an auto-numbered
ordered list.
2010-12-08 08:39:50 -08:00
John MacFarlane
91978d2201 Markdown reader: minor footnote changes.
Don't skipNonindentSpaces in noteMarker, since it's also
used in the inline note parser.
2010-12-08 08:17:16 -08:00
John MacFarlane
f02080b62d Textile reader: Implemented footnotes. 2010-12-08 00:44:46 -08:00
John MacFarlane
200ea33641 Made --smart work with RST reader. 2010-12-07 21:49:10 -08:00
John MacFarlane
5e35eb309f Make --smart work in HTML reader. 2010-12-07 21:24:35 -08:00
John MacFarlane
33ba35da9f Smart punctuation: recognize entities.
Now “Hi” gets parsed as a Quoted DoubleQuote inline.
2010-12-07 20:44:43 -08:00
John MacFarlane
3a5fceeef9 Rewrote normalizeSpaces (mostly aesthetic reasons). 2010-12-07 20:10:21 -08:00
John MacFarlane
e20052a1ba Markdown reader: Moved smartPunctuation parser, for slight speed bump. 2010-12-07 20:09:40 -08:00
John MacFarlane
ace3b80f1e Smart punctuation: don't alllow ellipses containing spaces.
Previously we allowed '. . .', ' . . . ', etc.  This caused
too many complications, and removed author's flexibility in
combining ellipses with spaces and periods.
2010-12-07 20:08:14 -08:00
John MacFarlane
50ca61ef49 Moved smartPunctuation from Markdown to Parsing.
+ Parameterized smartPunctuation on an inline parser.
+ Handle smartPunctuation in Textile reader.
2010-12-07 19:03:08 -08:00
John MacFarlane
f917b46500 Textile reader: implemented acronyms, (tm), (r), (c). 2010-12-07 18:28:36 -08:00