pandoc

Author	SHA1	Message	Date
John MacFarlane	e1f3c6058e	Added Text.Pandoc.Readers.Native (readNative). readNative can now read full pandoc documents, block lists, blocks, inline lists, or inlines. It will interpret Str "hi" as if it were Pandoc (Meta [] [] []) [Plain [Str "hi"]]. This should make testing easier.	2011-01-19 18:36:27 -08:00
John MacFarlane	e647f761ed	Use spaceChar instead of oneOf " \t" in rst reader.	2011-01-19 15:17:51 -08:00
John MacFarlane	1b8a9711b8	Replaced more noneOf/oneOf parsers.	2011-01-19 15:14:23 -08:00
John MacFarlane	a400cfe10f	Replaced uses of oneOf with more efficient parsers. This speeds up the markdown reader.	2011-01-19 15:06:56 -08:00
John MacFarlane	a5cbcdfe3a	HTML reader: parse simple tables. Resolves Issue #106. Thanks to Rodja Trappe for the idea and some sample code.	2011-01-14 20:48:10 -08:00
John MacFarlane	c31d3cc306	HTML reader: parse location tags in pSatisfy. This avoids the need for manual parsing all over the place.	2011-01-14 20:47:32 -08:00
John MacFarlane	d891b2c29d	LaTeX reader: Support simple tables.	2011-01-07 10:15:48 -08:00
John MacFarlane	303ce8a9e5	LaTeX reader: allow spaces between \begin or \end and {.	2011-01-06 09:34:24 -08:00
John MacFarlane	81ea1a59b4	LaTeX reader: Removed unnecessary 'spaces'.	2011-01-06 09:24:56 -08:00
John MacFarlane	1be2ca6c78	HTML reader: Fixed bug in htmlTag for comments.	2011-01-06 00:21:19 -08:00
John MacFarlane	b63a7f7c48	LaTeX reader: Apply macros to non-math; handle ensuremath.	2011-01-05 16:55:26 -08:00
John MacFarlane	18e7a7a495	LaTeX reader: Don't handle \label and \ref specially. Put labels in {} instead of ().	2011-01-05 15:24:20 -08:00
John MacFarlane	1415b6831e	LaTeX reader: Support \L \l accents.	2011-01-05 14:57:06 -08:00
John MacFarlane	23aae79b01	Updated for texmath 0.5.	2011-01-05 14:44:26 -08:00
John MacFarlane	e126ab9efc	LaTeX reader: Parse inside arguments when ignoring commands.	2011-01-05 12:25:47 -08:00
John MacFarlane	c3071ff6e9	LaTeX reader: Don't handle \index separately. Instead, just put it in list of commands to ignore.	2011-01-05 12:05:04 -08:00
John MacFarlane	b26247a4a8	LaTeX reader: Added "index" to ignorable commands.	2011-01-05 11:56:37 -08:00
John MacFarlane	cf6cd15c27	LaTeX reader: skip space before option or argument.	2011-01-05 11:54:40 -08:00
John MacFarlane	d033fc9d3e	LaTeX reader: Skip \index commands.	2011-01-05 10:11:24 -08:00
John MacFarlane	c949530815	LaTeX reader: Removed \group (we want to parse inside {}).	2011-01-05 10:06:51 -08:00
John MacFarlane	3dab6c574c	LaTeX reader: Better handling of preamble, inc. parsing macros.	2011-01-05 09:04:03 -08:00
John MacFarlane	85bfd26b78	LaTeX reader: Parse bracketed {parts} as raw TeX.	2011-01-04 22:20:35 -08:00
John MacFarlane	22b2c02aeb	Markdown reader: Removed unneeded definitions. specialChars, strChar, specialCharsMinusLt.	2011-01-04 22:11:56 -08:00
John MacFarlane	dac2e9156f	LaTeX reader: parse macros and apply to math.	2011-01-04 19:18:20 -08:00
John MacFarlane	fcbe1e95eb	Moved 'macro' and 'applyMacros'' from markdown reader to Parsing.	2011-01-04 19:12:33 -08:00
John MacFarlane	3e61333af0	Fixed regression in markdown reader. '(_hi_)' was being parsed with literal underscores (no emphasis). The fix: the 'str' parser now only parses alphanumerics and embedded underscores. All other symbols are handled by the 'symbol' parser. This has a slight effect on the AST, since you'll get [Str "hi",Str ":"] instead of [Str "hi:"]. But there should not be a visible effect in any of the writers. Thanks to gwern for pointing out the regression.	2011-01-01 22:46:30 -08:00
John MacFarlane	b05e739c6d	LaTeX reader: Allow ignored comments after \end{document}.	2010-12-30 22:05:19 -08:00
John MacFarlane	d6f28af9cb	HTML reader: Fixed some parsing bugs.	2010-12-30 19:33:37 -08:00
Puneeth Chaganti	e4dedad1c0	Added support for listings package code blocks and inline code.	2010-12-30 14:37:51 -08:00
John MacFarlane	f49e60a8b8	Textile reader: Slight speed improvement.	2010-12-30 14:33:11 -08:00
John MacFarlane	904050fa36	New HTML reader using tagsoup as a lexer. * The new reader is faster and more accurate. * API changes for Text.Pandoc.Readers.HTML: - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag * tagsoup is a new dependency. * Text.Pandoc.Parsing: Generalized type on readWith. * Benchmark.hs: Added length calculation to force full evaluation. * Updated HTML reader tests. * Updated markdown and textile readers to use the functions from the HTML reader. * Note: The markdown reader now correctly handles some cases it did not before. For example: <hr/> is reproduced without adding a space. <script> a = '<b>'; </script> is parsed correctly.	2010-12-30 13:55:40 -08:00
John MacFarlane	10d85f8b0b	Use functions from Text.Pandoc.Generic instead of processWith(M).	2010-12-24 13:39:27 -08:00
John MacFarlane	c08ca6fa6d	HTML reader: Simplified parsing of <script> sections. I had previously assumed that we needed to ignore </script> occurring in a string literal or javascript comment. It turns out, though, that browsers aren't that smart.	2010-12-22 19:20:27 -08:00
John MacFarlane	4bfe140ed1	Made --smart work with HTML reader. It did not work before, because - and quotes were gobbled up by the str parser.	2010-12-22 17:05:17 -08:00
John MacFarlane	63bf227e04	RST reader: Added unicode quote characters to specialChars. (So they can trigger Quoted environments.)	2010-12-22 17:04:56 -08:00
John MacFarlane	bbad129066	RST reader: recouped speed loss due to addition of --smart. This was achieved by rearranging the parsers in inline. Benchmarks went from 500ms to 307ms -- not quite back to the 279ms we had in 1.6, before supporting smart punctuation and footnotes, but close.	2010-12-22 15:10:21 -08:00
John MacFarlane	fe1152985c	Shared: Made splitBy take a test instead of an element.	2010-12-21 08:41:24 -08:00
John MacFarlane	63cf37a9ca	HTML reader: allow : in tags. Resolves Issue #274.	2010-12-15 14:15:53 -08:00
John MacFarlane	3ac6f72f98	Fixed preamble parsing in LaTeX reader.	2010-12-14 19:34:28 -08:00
John MacFarlane	128cf46089	Fixed regression in parsing _emph_. There was a bug in parsing '_emph_, ...': when followed by a comma, underscore emphasis did not register. (Thanks to gwern for pointing this out.) This bug was introduced by the change in `c66921f2ac`.	2010-12-14 18:23:26 -08:00
Nathan Gass	2e728df756	Moved special handling of punctuation in suffix out of markdown reader. This allows different writers to handle punctuation in the suffix differently.	2010-12-13 20:50:29 -08:00
Nathan Gass	c2d3796439	Added support for latex cite commands in latex reader.	2010-12-13 20:48:19 -08:00
John MacFarlane	1a4a0d0283	Markdown reader: Further fix to abbrevs.	2010-12-13 20:05:50 -08:00
John MacFarlane	7b4d3c77ec	Markdown reader: Fixed abbrev handler to allow abbrev at end of line. E.g., Mr. Frank.	2010-12-13 20:04:11 -08:00
John MacFarlane	3822d6c440	Markdown reader: Fixed referenceKey parser to allow space after newline.	2010-12-13 20:03:59 -08:00
John MacFarlane	71e0557e61	Markdown reader: Fixed regression in reference key parser. * The recent change allowing spaces and newlines in the URL caused problems when reference keys are stacked up without blank lines between. This is now fixed. * Added test.	2010-12-13 20:03:12 -08:00
John MacFarlane	3748dfeb91	Markdown reader: fix superscripts with links. Moved inlineNote parser after superscript parser, so ^[link](/foo)^ gets recognized as a superscripted link, not an inline note followed by garbage. Thanks to Conal Elliott for pointing out the problem.	2010-12-12 20:30:55 -08:00
John MacFarlane	2dfb45950e	LaTeX reader: Improved parsing of preamble. Previously you'd get unexpected behavior on a document that contained '\begin{document}' in, say, a verbatim block.	2010-12-10 23:21:24 -08:00
John MacFarlane	de6452c0d1	Markdown reader: small cosmetic code improvements.	2010-12-10 16:26:35 -08:00
John MacFarlane	5770ceca36	Removed HTML sanitization. This is better done on the resulting HTML; use the xss-sanitize library for this. xss-sanitize is based on pandoc's sanitization, but improves it. - Removed stateSanitize from ParserState. - Removed --sanitize-html option.	2010-12-10 12:26:03 -08:00

355 commits