pandoc

Author	SHA1	Message	Date
fiddlosopher	0c475bfd47	Refactored setext header parsing in Markdown reader for greater speed. git-svn-id: https://pandoc.googlecode.com/svn/trunk@933 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-28 06:58:20 +00:00
fiddlosopher	18b379c1ca	More rearranging in definition of inline. git-svn-id: https://pandoc.googlecode.com/svn/trunk@932 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-28 06:44:47 +00:00
fiddlosopher	b6ebe75656	More intelligent rearranging of 'inline' for speed boosts in Text.Pandoc.Readers.Markdown. git-svn-id: https://pandoc.googlecode.com/svn/trunk@931 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-28 06:38:38 +00:00
fiddlosopher	f04df62db1	Changed definition of 'enclosed' in Text.Pandoc.Shared so that 'try' is not automatically applied to the 'end' parser. Added 'try' in calls to 'enclosed' where needed. Slight speed increase. git-svn-id: https://pandoc.googlecode.com/svn/trunk@926 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-28 05:58:21 +00:00
fiddlosopher	1d8fe2653a	Performance improvements: + Rearranged parsers in definition of 'inline' so that the most frequently used would (by and large) be tried first. + Removed some unneeded 'try's. + Removed tabchar parser, as whitespace handles tabs anyway. + All in all, these changes, together with the last two commits, cut almost in half the time it takes pandoc to parse a large test file. git-svn-id: https://pandoc.googlecode.com/svn/trunk@924 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-28 02:33:53 +00:00
fiddlosopher	1163e622ab	Don't count p. 27 at the beginning of a line as an ordered list start, since it's most likely a page number. git-svn-id: https://pandoc.googlecode.com/svn/trunk@900 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-26 03:17:40 +00:00
fiddlosopher	f11360f50e	Added new rule for enhanced markdown ordered lists: if the list marker is a capital letter followed by a period (including a single-letter capital roman numeral), then it must be followed by at least two spaces. The point of this is to avoid accidentally treating people's initials as list markers: a paragraph may begin: B. Russell was an English philosopher. and this shouldn't be treated as a list. Modified Markdown reader and README documentation. Added a test case. git-svn-id: https://pandoc.googlecode.com/svn/trunk@880 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-23 04:25:09 +00:00
fiddlosopher	4a0e02dab7	Added a needed 'try' to listItem in Markdown reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@878 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-22 20:19:37 +00:00
fiddlosopher	cc0460f952	Code cleanup in Markdown reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@877 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-20 19:21:18 +00:00
fiddlosopher	84aa875bdb	Changes to Markdown reader for better conformity to the Markdown test suite under --strict: + Removed check for a following setext header in endline. A full test is too inefficient (doubles benchmark time), and the substitute we had before is not 100% accurate. + Don't use Code elements for autolinks if --strict specified. git-svn-id: https://pandoc.googlecode.com/svn/trunk@876 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-20 18:52:49 +00:00
fiddlosopher	81eba062f2	Refactor RST and Markdown readers using parseFromString. git-svn-id: https://pandoc.googlecode.com/svn/trunk@864 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-19 00:18:46 +00:00
fiddlosopher	4e149f898a	Added a necessary "try" in definition of "para" (HTML reader). git-svn-id: https://pandoc.googlecode.com/svn/trunk@863 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-19 00:06:03 +00:00
fiddlosopher	4399db4fd2	Bug fixes in readers: + LaTeX reader: skip anything after \end{document} + HTML reader: fixed bug skipping material after </html> -- previously, stuff at the end was skipped even if no </html> was present, which meant only part of the file would be parsed and no error issued + HTML reader: added new constant eitherBlockOrInline with elements that may count either as block-level or inline + Modified isInline and isBlock to take this into account + modified rawHtmlBlock to accept any tag (even an inline tag); this is innocuous, because rawHtmlBlock is tried only if a regular inline element can't be parsed. git-svn-id: https://pandoc.googlecode.com/svn/trunk@862 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-18 23:44:26 +00:00
fiddlosopher	e48f046aa0	+ Fixed bug in markdown ordered list parsing. The problem was that anyOrderedListStart did not check for a space following the ordered list marker. So, 'A.B. 2007' would be parsed as a list item, then fail because of the lack of space after 'A.' (required by orderedListStart). Resolves Issue #22. + Fixed a similar problem in RST reader. + Added regression test. git-svn-id: https://pandoc.googlecode.com/svn/trunk@861 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-18 15:26:29 +00:00
fiddlosopher	3904078e39	Cosmetic changes. git-svn-id: https://pandoc.googlecode.com/svn/trunk@851 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-15 21:20:02 +00:00
fiddlosopher	1a4489ef30	LaTeX reader: parse \texttt{} as code, as long as there's nothing fancy inside. git-svn-id: https://pandoc.googlecode.com/svn/trunk@846 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-15 17:30:37 +00:00
fiddlosopher	6cc5f6b199	Allow htmlComments as rawHtmlInline in HTML reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@844 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-15 17:28:21 +00:00
fiddlosopher	a8e2199034	Major code cleanup in all modules. (Removed unneeded imports, reformatted, etc.) More major changes are documented below: + Removed Text.Pandoc.ParserCombinators and moved all its definitions to Text.Pandoc.Shared. + In Text.Pandoc.Shared: - Removed unneeded 'try' in blanklines. - Removed endsWith function and rewrote functions to use isSuffixOf instead. - Added >>~ combinator. - Rewrote stripTrailingNewlines, removeLeadingSpaces. + Moved Text.Pandoc.Entities -> Text.Pandoc.CharacterReferences. - Removed unneeded functions charToEntity, charToNumericalEntity. - Renamed functions using proper terminology (character references, not entities). decodeEntities -> decodeCharacterReferences, characterEntity -> characterReference. - Moved escapeStringToXML to Docbook writer, which is the only thing that uses it. - Removed old entity parser in HTML and Markdown readers; replaced with new charRef parser in Text.Pandoc.Shared. + Fixed accent bug in Text.Pandoc.Readers.LaTeX: \^{} now correctly parses as a '^' character. + Text.Pandoc.ASCIIMathML is no longer an exported module. git-svn-id: https://pandoc.googlecode.com/svn/trunk@835 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-15 06:00:58 +00:00
fiddlosopher	e814a3f6d2	Major change in the way ordered lists are handled: + The changes are documented in README, under Lists. + The OrderedList block element now stores information about list number style, list number delimiter, and starting number. + The readers parse this information, when possible. + The writers use this information to style ordered lists. + Test suites have been changed accordingly. Motivation: It's often useful to start lists with numbers other than 1, and to have control over the style of the list. Added to Text.Pandoc.Shared: + camelCaseToHyphenated + toRomanNumeral + anyOrderedListMarker + orderedListMarker + orderedListMarkers Added to Text.Pandoc.ParserCombinators: + charsInBalanced' + withHorizDisplacement + romanNumeral RST writer: + Force blank line before lists, so that sublists will be handled correctly. LaTeX reader: + Fixed bug in parsing of footnotes containing multiple paragraphs, introduced by use of charsInBalanced. Fix: use charsInBalanced' instead. LaTeX header: + use mathletters option in ucs package, so that basic unicode Greek letters will work properly. git-svn-id: https://pandoc.googlecode.com/svn/trunk@834 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-08 02:43:15 +00:00
fiddlosopher	22a6538557	Added parsing for \url to LaTeX reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@833 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-08-03 22:44:47 +00:00
fiddlosopher	44b136740b	Changes to Markdown reader: + added try to def of indentSpaces + in def of 'reference', check to make sure it's not a note reference first. git-svn-id: https://pandoc.googlecode.com/svn/trunk@827 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-28 19:14:50 +00:00
fiddlosopher	5e13f8c320	Removed examplep specific stuff in LaTeX reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@826 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-28 18:23:13 +00:00
fiddlosopher	d3404581d4	Changes in LaTeX reader to accommodate Pandoc's own use of examplep. git-svn-id: https://pandoc.googlecode.com/svn/trunk@818 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-28 05:45:27 +00:00
fiddlosopher	9939b0f07e	Make URLs and emails in autolinks appear as Code. git-svn-id: https://pandoc.googlecode.com/svn/trunk@810 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-28 02:12:40 +00:00
fiddlosopher	d8762eb436	In HTML reader, filter Nulls in lists of blocks. (These can be caused by raw HTML when the parse-raw option isn't selected.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@787 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 02:50:29 +00:00
fiddlosopher	9f871ab1ec	Fixed bug in spanStrikeout: case was not exhaustive. git-svn-id: https://pandoc.googlecode.com/svn/trunk@786 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 02:28:15 +00:00
fiddlosopher	9a410e1635	README: Removed the statement that the RST reader doesn't parse definition lists. HTML reader: Added failIfStrict to the definitionList parser, so definition lists will be passed through as raw HTML if --strict specified. git-svn-id: https://pandoc.googlecode.com/svn/trunk@783 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 01:41:37 +00:00
fiddlosopher	6a6b1a5842	Added support for definition lists to RST reader. Added a relevant test to the test suite. git-svn-id: https://pandoc.googlecode.com/svn/trunk@782 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 01:35:03 +00:00
fiddlosopher	f69d9efc83	Added definition list support to HTML reader. Added a test for definition lists to the html-reader test suite. git-svn-id: https://pandoc.googlecode.com/svn/trunk@781 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 01:21:21 +00:00
fiddlosopher	1296273c85	Added support for definition lists to LaTeX reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@780 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 00:19:37 +00:00
fiddlosopher	caef362065	Renamed parseFromStr -> parseFromString. git-svn-id: https://pandoc.googlecode.com/svn/trunk@779 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-23 00:19:00 +00:00
fiddlosopher	4beb0b9130	Superscript and Subscript support for RST reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@777 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 21:26:22 +00:00
fiddlosopher	c1f397622d	Added a "try" to the end parser in enclosed (Text.Pandoc.ParserCombinators). This makes errors in its use less likely. Removed some now-unneeded try's in calling code. git-svn-id: https://pandoc.googlecode.com/svn/trunk@776 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 21:01:17 +00:00
fiddlosopher	3ca9932ca6	Added subscript and superscript support to LaTeX reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@775 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 20:30:07 +00:00
fiddlosopher	4905ebadb3	LaTeX reader: Added clauses for tilde and caret. Tilde is \ensuremath{\sim}, and caret is \^{}, not \^ as before. git-svn-id: https://pandoc.googlecode.com/svn/trunk@764 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 18:26:10 +00:00
fiddlosopher	b19c36970e	Removed an extra occurrence of escapedChar in definition of inline. git-svn-id: https://pandoc.googlecode.com/svn/trunk@762 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 17:51:15 +00:00
fiddlosopher	d03ec5a4a2	Added support for strikeout (\sout) to latex reader. (Thanks to Bradley Sif for the patch.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@754 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-22 05:14:43 +00:00
fiddlosopher	a2194f23db	Added support for Strikeout, Superscript, and Subscript to HTML reader. Thanks to Bradley Sif for the patch for Strikeout (Issue #18). git-svn-id: https://pandoc.googlecode.com/svn/trunk@753 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-21 22:54:40 +00:00
fiddlosopher	44b11214ba	+ Added support for Strikeout, Superscript, and Subscript in markdown reader. + Also replaced constants like emphStart with literals. git-svn-id: https://pandoc.googlecode.com/svn/trunk@752 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-21 22:52:07 +00:00
fiddlosopher	48d67d0d1f	Simplified inlinesInBalanced, using lookAhead. git-svn-id: https://pandoc.googlecode.com/svn/trunk@732 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-16 07:22:17 +00:00
fiddlosopher	7e1370aa87	Markdown reader: Added inlinesInBalanced parser combinator to unify treatment of embedded brackets in links and inline footnotes. Note that the solution adopted here causes one of John Gruber's markdown tests to fail: [with_underscore](/url/with_underscore) Here the whole phrase "underscore](/url/with" is treated as emphasized. The previous version of the markdown reader handled this the way Gruber's script handles it, but it ran into trouble on the following: [link with verbatim `]`](/url) where the inner ] was treated as the end of the reference link label. I don't see any good way to handle both cases in the framework of pandoc, so I choose to require an escape in the first example: [with\_underscore](/url/with_underscore) git-svn-id: https://pandoc.googlecode.com/svn/trunk@729 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-15 23:53:22 +00:00
fiddlosopher	e962c76f05	HTML reader: haddock comment fix. git-svn-id: https://pandoc.googlecode.com/svn/trunk@690 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-12 08:31:39 +00:00
fiddlosopher	1780228b7c	Markdown reader: Parse bracketed text in inline footnotes. Previously, "test^[my [note] contains brackets]" would yield a note with contents "my [note". Now it yields a note with contents "my [note] contains brackets". New function: inlinesInBrackets. Resolves Issue 14. git-svn-id: https://pandoc.googlecode.com/svn/trunk@665 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-09 06:51:06 +00:00
fiddlosopher	d58dca502e	RST reader: Allow hyperlink target URIs to be split over multiple lines, and to start on the line after the reference. Resolves Issue 7. git-svn-id: https://pandoc.googlecode.com/svn/trunk@664 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-09 06:23:59 +00:00
fiddlosopher	655363da51	Moved Text.ParserCombinators.Pandoc -> Text.Pandoc.ParserCombinators. This way, all the Pandoc modules are in one place. git-svn-id: https://pandoc.googlecode.com/svn/trunk@663 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-09 03:39:25 +00:00
fiddlosopher	a414ec9d63	Adjusted copyright notices to 2006-7; use real email address instead of lamely attempting to obfuscate. git-svn-id: https://pandoc.googlecode.com/svn/trunk@640 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-07 22:51:55 +00:00
fiddlosopher	f9c988e703	Fixed bug in Markdown reader: links in footnotes were not being processed. Solution: three-stage parse. First, get all the reference keys and add information to state. Next, get all the notes and add information to state. (Reference keys may be needed at this stage.) Finally, parse everything else. git-svn-id: https://pandoc.googlecode.com/svn/trunk@625 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-07-06 06:46:31 +00:00
fiddlosopher	61024d93ed	Require blankspace (but not multiple lines) between URL and title in links and reference keys. (Markdown reader.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@599 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-05-10 22:06:13 +00:00
fiddlosopher	cf081435ff	Changed definition list syntax in markdown reader and simplified the parsing code. A colon is now required before every block in a definition. This fixes a problem with the old syntax, in which the last block in the following was ambiguous between a regular paragraph in the definition and a code block following the definition list: term : definition is this code or more definition? git-svn-id: https://pandoc.googlecode.com/svn/trunk@589 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-05-03 14:42:40 +00:00
fiddlosopher	485fa81559	Resolved issue #10 : instead of adding "\n\n" to the end of strings in Main, do it in readMarkdown and readRST. (Note: the point of this is to ensure that a block at the end of the file gets treated as if it has blank space after it, which is generally what is wanted.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@588 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-04-22 04:38:05 +00:00
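
Illustration for commits f11360f50e and e48f046aa0: the ordered-list rules described there (a marker must be followed by a space, and a single capital letter plus period must be followed by at least two spaces) can be sketched roughly as below. This is not pandoc's code, which is a Haskell parser; it is a simplified Python approximation. The function name is_ordered_list_start and the marker grammar (decimal digits or one letter, then a period) are assumptions made for the example. It ignores multi-letter roman numerals, other delimiters, and the "p. 27" page-number exception from commit 1163e622ab.

```python
import re

# Hypothetical, simplified marker grammar (an assumption, not pandoc's):
# decimal digits or a single letter, then a period, then optional spaces.
_MARKER = re.compile(r"^(?P<marker>[0-9]+|[A-Za-z])\.(?P<spaces> *)")


def is_ordered_list_start(line: str) -> bool:
    """Return True if `line` begins with an ordered-list marker under the sketched rules."""
    m = _MARKER.match(line)
    if m is None:
        return False
    marker = m.group("marker")
    spaces = len(m.group("spaces"))
    # A single capital letter followed by a period could be someone's
    # initial, so it needs at least two spaces; other markers need one.
    required = 2 if marker.isupper() else 1
    return spaces >= required


if __name__ == "__main__":
    for sample in ["B. Russell was an English philosopher.",
                   "B.  Russell as a list item",
                   "1. first item",
                   "A.B. 2007"]:
        print(repr(sample), is_ordered_list_start(sample))
```

Under these assumptions, "B. Russell was an English philosopher." is treated as paragraph text, "B.  Russell" (two spaces) as a list item, and "A.B. 2007" is rejected because no space follows "A.".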

119 commits