pandoc

Author	SHA1	Message	Date
Puneeth Chaganti	e4dedad1c0	Added support for listings package code blocks and inline code.	2010-12-30 14:37:51 -08:00
John MacFarlane	f49e60a8b8	Textile reader: Slight speed improvement.	2010-12-30 14:33:11 -08:00
John MacFarlane	904050fa36	New HTML reader using tagsoup as a lexer. * The new reader is faster and more accurate. * API changes for Text.Pandoc.Readers.HTML: - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag * tagsoup is a new dependency. * Text.Pandoc.Parsing: Generalized type on readWith. * Benchmark.hs: Added length calculation to force full evaluation. * Updated HTML reader tests. * Updated markdown and textile readers to use the functions from the HTML reader. * Note: The markdown reader now correctly handles some cases it did not before. For example: <hr/> is reproduced without adding a space. <script> a = '<b>'; </script> is parsed correctly.	2010-12-30 13:55:40 -08:00
John MacFarlane	10d85f8b0b	Use functions from Text.Pandoc.Generic instead of processWith(M).	2010-12-24 13:39:27 -08:00
John MacFarlane	c08ca6fa6d	HTML reader: Simplified parsing of <script> sections. I had previously assumed that we needed to ignore </script> occurring in a string literal or javascript comment. It turns out, though, that browsers aren't that smart.	2010-12-22 19:20:27 -08:00
John MacFarlane	4bfe140ed1	Made --smart work with HTML reader. It did not work before, because - and quotes were gobbled up by the str parser.	2010-12-22 17:05:17 -08:00
John MacFarlane	63bf227e04	RST reader: Added unicode quote characters to specialChars. (So they can trigger Quoted environments.)	2010-12-22 17:04:56 -08:00
John MacFarlane	bbad129066	RST reader: recouped speed loss due to addition of --smart. This was achieved by rearranging the parsers in inline. Benchmarks went from 500ms to 307ms -- not quite back to the 279ms we had in 1.6, before supporting smart punctuation and footnotes, but close.	2010-12-22 15:10:21 -08:00
John MacFarlane	fe1152985c	Shared: Made splitBy take a test instead of an element.	2010-12-21 08:41:24 -08:00
John MacFarlane	63cf37a9ca	HTML reader: allow : in tags. Resolves Issue #274.	2010-12-15 14:15:53 -08:00
John MacFarlane	3ac6f72f98	Fixed preamble parsing in LaTeX reader.	2010-12-14 19:34:28 -08:00
John MacFarlane	128cf46089	Fixed regression in parsing _emph_. There was a bug in parsing '_emph_, ...': when followed by a comma, underscore emphasis did not register. (Thanks to gwern for pointing this out.) This bug was introduced by the change in `c66921f2ac`.	2010-12-14 18:23:26 -08:00
Nathan Gass	2e728df756	Moved special handling of punctuation in suffix out of markdown reader. This allows different writers to handle punctuation in the suffix differently.	2010-12-13 20:50:29 -08:00
Nathan Gass	c2d3796439	Added support for latex cite commands in latex reader.	2010-12-13 20:48:19 -08:00
John MacFarlane	1a4a0d0283	Markdown reader: Further fix to abbrevs.	2010-12-13 20:05:50 -08:00
John MacFarlane	7b4d3c77ec	Markdown reader: Fixed abbrev handler to allow abbrev at end of line. E.g., Mr. Frank.	2010-12-13 20:04:11 -08:00
John MacFarlane	3822d6c440	Markdown reader: Fixed referenceKey parser to allow space after newline.	2010-12-13 20:03:59 -08:00
John MacFarlane	71e0557e61	Markdown reader: Fixed regression in reference key parser. * The recent change allowing spaces and newlines in the URL caused problems when reference keys are stacked up without blank lines between. This is now fixed. * Added test.	2010-12-13 20:03:12 -08:00
John MacFarlane	3748dfeb91	Markdown reader: fix superscripts with links. Moved inlineNote parser after superscript parser, so ^[link](/foo)^ gets recognized as a superscripted link, not an inline note followed by garbage. Thanks to Conal Elliott for pointing out the problem.	2010-12-12 20:30:55 -08:00
John MacFarlane	2dfb45950e	LaTeX reader: Improved parsing of preamble. Previously you'd get unexpected behavior on a document that contained '\begin{document}' in, say, a verbatim block.	2010-12-10 23:21:24 -08:00
John MacFarlane	de6452c0d1	Markdown reader: small cosmetic code improvements.	2010-12-10 16:26:35 -08:00
John MacFarlane	5770ceca36	Removed HTML sanitization. This is better done on the resulting HTML; use the xss-sanitize library for this. xss-sanitize is based on pandoc's sanitization, but improves it. - Removed stateSanitize from ParserState. - Removed --sanitize-html option.	2010-12-10 12:26:03 -08:00
John MacFarlane	17d48cf4af	Markdown reader: Allow linebreaks in URLs (treat as spaces). Also, a string of consecutive spaces or tabs is now parsed as a single space. If you have multiple spaces in your URL, use %20%20.	2010-12-10 12:14:51 -08:00
John MacFarlane	ee0a0953de	Markdown reader: Rewrote para parser for better efficiency. This change avoids repeated parsing of inline lists for 'plain' blocks.	2010-12-10 10:47:46 -08:00
paul.rivier	bb609a85e3	textile redcloth definition lists	2010-12-09 09:25:46 -08:00
John MacFarlane	88a40685b8	Textile reader: better treatment of acronyms. We now parse PBS(Public Broadcasting System) as if it were "PBS (Public Broadcasting System)".	2010-12-09 08:52:09 -08:00
John MacFarlane	9ead748cc9	RST reader: Added footnote support. Resolves issue #258. Note that there are some differences in how docutils and pandoc treat footnotes. Currently pandoc ignores the numeral or symbol used in the note; footnotes are put in an auto-numbered ordered list.	2010-12-08 08:39:50 -08:00
John MacFarlane	91978d2201	Markdown reader: minor footnote changes. Don't skipNonindentSpaces in noteMarker, since it's also used in the inline note parser.	2010-12-08 08:17:16 -08:00
John MacFarlane	f02080b62d	Textile reader: Implemented footnotes.	2010-12-08 00:44:46 -08:00
John MacFarlane	200ea33641	Made --smart work with RST reader.	2010-12-07 21:49:10 -08:00
John MacFarlane	5e35eb309f	Make --smart work in HTML reader.	2010-12-07 21:24:35 -08:00
John MacFarlane	33ba35da9f	Smart punctuation: recognize entities. Now “Hi” gets parsed as a Quoted DoubleQuote inline.	2010-12-07 20:44:43 -08:00
John MacFarlane	e20052a1ba	Markdown reader: Moved smartPunctuation parser, for slight speed bump.	2010-12-07 20:09:40 -08:00
John MacFarlane	50ca61ef49	Moved smartPunctuation from Markdown to Parsing. + Parameterized smartPunctuation on an inline parser. + Handle smartPunctuation in Textile reader.	2010-12-07 19:03:08 -08:00
John MacFarlane	f917b46500	Textile reader: implemented acronyms, (tm), (r), (c).	2010-12-07 18:28:36 -08:00
John MacFarlane	c66921f2ac	Markdown reader: better handling of intraword _. The 'str' parser now reads internal _'s as part of the string. This prevents pandoc from getting started looking for an emphasized block, which can cause exponential slowdowns in some cases. Resolves Issue #182.	2010-12-06 22:12:18 -08:00
John MacFarlane	7864f30717	Markdown reader: handle curly quotes better. Previously, curly quotes were just parsed literally, leading to problems in some output formats. Now they are parsed as Quoted inlines, if --smart is specified. Resolves Issue #270.	2010-12-06 20:36:58 -08:00
John MacFarlane	5a4609584c	Fix regression: markdown references should be case-insensitive. This broke when we added the Key type. We had assumed that the custom case-insensitive Ord instance would ensure case-insensitive matching, but that is not how Data.Map works. * Added a test case for case-insensitivity in markdown-reader-more * Removed old refsMatch from Text.Pandoc.Parsing module; * hid the 'Key' constructor; * dropped the custom Ord and Eq instances, deriving instead; * added fromKey and toKey to convert between Keys and Inline lists; * toKey ensures that keys are case-insensitive, since this is the only way the API provides to construct a Key. Resolves Issue #272.	2010-12-05 19:27:00 -08:00
John MacFarlane	357b965b44	Merge branch 'citeproc' into master. Conflicts: src/Text/Pandoc/Definition.hs	2010-12-03 23:43:47 -08:00
John MacFarlane	bea62bcab8	Textile reader: temporarily removed smartPunctuation. The smartPunctuation parser from the markdown parser was being used, but this creates two problems: * smart punctuation rules are slightly different in textile, for example, a single dash with space around becomes an En dash. * the following gets parsed as a double quoted string followed by a colon, rather than as a link: "emphasized text":http://my.url.com This needs rethinking.	2010-12-03 23:10:52 -08:00
John MacFarlane	d4e512776d	Textile reader: added hrule parser.	2010-12-03 23:10:52 -08:00
John MacFarlane	4bf9d362d2	Textile reader: Turn on smart punctuation by default.	2010-12-03 23:10:52 -08:00
John MacFarlane	0356ad4de6	Textile reader: drop leading, trailing newline in pre block. This is consistent with how the other readers work.	2010-12-03 23:10:52 -08:00
John MacFarlane	36d4aa4a09	Textile reader: modified str to handle acronyms, hyphens. * A single hyphen between two word characters is no longer a potential strikeout-starter. * Acronym explanations are dropped.	2010-12-03 23:10:52 -08:00
John MacFarlane	f415e9e119	Textile reader: parse raw by default. It's part of the textile spec to allow raw HTML, just as with markdown. -R is no longer needed in test suite.	2010-12-03 23:10:52 -08:00
paul.rivier	c3866f3c66	punctuation handling, and more html-specific handling	2010-12-03 23:10:52 -08:00
Paul Rivier	d724c6b568	html inlines and html blocks handling in textile reader	2010-12-03 23:10:51 -08:00
Paul Rivier	fa0866886b	textile reader now ignores html/css attributes	2010-12-03 23:10:51 -08:00
Paul Rivier	e6dde36622	removed support for textile Inserted construct	2010-12-03 23:10:51 -08:00
Paul Rivier	593b4f6c94	fix autolink by promoting it in the parser list, fix table parabreak	2010-12-03 23:10:51 -08:00

1927 commits