pandoc

Author	SHA1	Message	Date
fiddlosopher	7ad17fe5cf	Markdown reader: Ignore blank line after ~~~~~~~~ in delimited code blocks. Rationale: these are useful for literate haskell, but lhs requires a blank line before the haskell code, and we don't want spurious blank lines in the output. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1454 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-09-16 01:39:05 +00:00
fiddlosopher	943c2f353d	Changed list parser so that only the starting list marker matters: 1. one - two (b) three produces an ordered list with 1., 2., 3. This is the behavior of Markdown.pl. Modified README to document the new behavior. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1438 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-09-12 00:05:32 +00:00
fiddlosopher	000b89c718	Use Data.List's 'intercalate' instead of custom 'joinWithSep'. + Removed joinWithSep definition from Text.Pandoc.Shared. + Replaced joinWithSep with intercalate + Depend on base >= 3, since in base < 3 intercalate is not included. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1428 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-09-08 06:36:28 +00:00
fiddlosopher	2127b7d513	Changed Float to Double in definition of Table element. (Double is more efficient in GHC.) Truncate width in opendocument output to 2 decimal places. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1418 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-09-06 02:51:44 +00:00
fiddlosopher	f53fb554fe	Support for display math; changed ASCIIMathML -> LaTeXMathML: Resolves Issue #47. + Added a DisplayMath/InlineMath selector to Math inlines. + Markdown parser yields DisplayMath for $$...$$. + LaTeX parser yields DisplayMath when appropriate. Removed mathBlock parsers, since the same effect is achieved by the math inline parsers, now that they handle display math. + Writers handle DisplayMath as appropriate for the format. + Changed -m option to use LaTeXMathML rather than ASCIIMathML. LaTeXMathML is closer to LaTeX in its display of math, and supports many non-math LaTeX environments. + Modified HTML writer to print raw TeX when LaTeXMathML is being used instead of suppressing it. + Removed ASCIIMathML files from data/ and added LaTeXMathML. + Replaced ASCIIMathML with LaTeXMathML in source files. + Modified README and pandoc man page source. + Modified web page. + Added --latexmathml option (kept --asciimathml as a synonym for backwards compatibility) + Modified tests accordingly; added new tests for display math. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1409 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-13 03:02:42 +00:00
fiddlosopher	c8a56a2864	Parse raw ConTeXt environments as TeX in markdown reader. Resolves Issue #73. Also made some structural changes to parsing of raw LaTeX environments. Previously there was a special block parser for LaTeX environments. It returned a Para element containing the raw TeX inline. This has been removed, and the raw LaTeX environment parser is now used in the rawLaTeXInline parser. The effect is exactly the same, except that we can now handle consecutive LaTeX and ConTeXt environments not separated by spaces. This new flexibility is required by the example in Issue #73: \placeformula \startformula L_{1} = L_{2} \stopformula API change: The LaTeX reader now exports rawLaTeXEnvironment' (which returns a string) rather than rawLaTeXEnvironment (which returns a block element). This is more likely to be useful in other applications. Added test cases for raw ConTeXt environments to markdown-reader-more.txt. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1405 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-11 07:04:36 +00:00
fiddlosopher	dd2b77d590	Allow newline before URL in markdown link references. Resolves Issue #81. Added tests for this issue in new "markdown-reader-more" tests. Changed RunTests.hs to run these tests. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1401 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-10 23:26:32 +00:00
fiddlosopher	05b366a0b2	Small improvements to citation parsing in markdown reader. (Don't allow blank lines inside citations.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@1382 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-06 03:34:06 +00:00
fiddlosopher	abf2dc78ac	Allow parsing of multiline citations. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1381 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-05 23:18:52 +00:00
fiddlosopher	1bfe1b84a8	Added support for Cite to Markdown reader, and conditional support for citeproc module. + The citeproc cabal configuration option sets the _CITEPROC macro, which conditionally includes code for handling citations. + Added Text.Pandoc.Biblio module. + Made highlighting option default to False. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1376 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-08-04 03:15:34 +00:00
fiddlosopher	06983c9ba5	Markdown reader: Parse setext headers before atx headers. Test case: # hi ==== parsed by Markdown.pl as an H1 header with contents "# hi". git-svn-id: https://pandoc.googlecode.com/svn/trunk@1334 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-23 23:10:05 +00:00
fiddlosopher	7c35c0bc25	Fixed bug in Markdown parser: regular $s triggering math mode. For example: "shoes ($20) and socks ($5)." The fix consists in two new restrictions: + the $ that ends a math span may not be directly followed by a digit. + no blank lines may be included within a math span. Thanks to Joseph Reagle for noticing the bug. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1326 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-15 20:41:27 +00:00
fiddlosopher	752adcd45a	Added type signatures and fixed other -Wall warnings in Markdown reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1301 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-11 16:33:21 +00:00
fiddlosopher	45044ff536	Added a few more recognized abbreviations to 'abbrev' parser. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1300 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-11 03:00:35 +00:00
fiddlosopher	824bb2d22e	In smart mode, use nonbreaking spaces after abbreviations in markdown parser. Thus, for example, "Mr. Brown" comes out as "Mr.~Brown" in LaTeX, and does not produce a sentence-separating space. Resolves Issue #75. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1298 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-11 02:14:57 +00:00
fiddlosopher	8ed710bc9d	Treat '\ ' in (extended) markdown as nonbreaking space. Print nonbreaking space appropriately in each writer (e.g. ~ in LaTeX). git-svn-id: https://pandoc.googlecode.com/svn/trunk@1297 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-07-11 01:24:15 +00:00
fiddlosopher	cd38d4ae79	Markdown smart typography: Em dashes no longer eat surrounding whitespace. Resolves Issue #69. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1279 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-06-08 03:20:15 +00:00
fiddlosopher	8624ed9bd3	The '--sanitize-html' option now examines URIs in markdown links and images, and in HTML href and src attributes. If the URI scheme is not on a whitelist of safe schemes, it is rejected. The main point is to prevent cross-site scripting attacks using 'javascript:' URIs. See http://www.mail-archive.com/markdown-discuss@six.pairlist.net/msg01186.html and http://ha.ckers.org/xss.html. Resolves Issue #62. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1262 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-03-22 20:41:56 +00:00
fiddlosopher	06b544360e	Factored codeBlock into separate codeBlockIndented and codeBlockDelimited. Do not use codeBlockDelimited in strict mode. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1211 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:20:02 +00:00
fiddlosopher	614547b38e	Use generic attributes type, not a string, for CodeBlocks. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1209 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:19:43 +00:00
fiddlosopher	2e683e8b53	Fixed delimited code blocks: eat blank lines afterwards, and allow end line to contain more tildes than beginning line. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1206 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:19:17 +00:00
fiddlosopher	24f22ee7ac	Added a needed try to {} attribute parser. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1205 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:19:10 +00:00
fiddlosopher	046c6b0d0d	Added support for multiple classes in delimited code block. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1204 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:19:01 +00:00
fiddlosopher	b06ddad4bc	Initial support for delimited code blocks in markdown reader. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1203 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:18:54 +00:00
fiddlosopher	9f7a14c210	Modified readers for new parameter in CodeBlock. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1199 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-02-09 03:18:03 +00:00
fiddlosopher	8fca649d05	Changed copyright dates where appropriate to include 2008. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1181 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-01-08 17:26:16 +00:00
fiddlosopher	d271473044	Fixed markdown reader to handle "hi there*" as a strong nested in an emph. (A '*' is only recognized as the end of the emphasis if it's not the beginning of a strong emphasis.) git-svn-id: https://pandoc.googlecode.com/svn/trunk@1172 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-01-06 19:46:31 +00:00
fiddlosopher	0e94609e18	Markdown reader: Moved blockQuote parser before list parsers. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1171 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-01-06 19:46:24 +00:00
fiddlosopher	5df912b162	Added optional HTML sanitization using a whitelist. When this option is specified (--sanitize-html on the command line), unsafe HTML tags will be replaced by HTML comments, and unsafe HTML attributes will be removed. This option should be especially useful for those who want to use pandoc libraries in web applications, where users will provide the input. + Main.hs: Added --sanitize-html option. + Text.Pandoc.Shared: Added stateSanitizeHTML to ParserState. + Text.Pandoc.Readers.HTML: - Added whitelists of sanitaryTags and sanitaryAttributes. - Added parsers to check these lists (and state) to see if a given tag or attribute should be counted unsafe. - Modified anyHtmlTag and anyHtmlEndTag to replace unsafe tags with comments. - Modified htmlAttribute to remove unsafe attributes. - Modified htmlScript and htmlStyle to remove these elements if unsafe. - Modified rawHtmlBlock to use anyHtmlBlockTag instead of anyHtmlTag and anyHtmlEndTag. This fixes a bug in markdown parsing, where inline tags would be included in raw HTML blocks. - Modified anyHtmlBlockTag to test for (not inline) rather than directly for block. This allows us to handle e.g. docbook in the markdown reader. - Minor tweaks in nonTitleNonHead and parseTitle. + Text.Pandoc.Readers.Markdown: - In non-strict mode use rawHtmlBlocks instead of htmlBlock. Simplified htmlBlock, since we know it's only called in strict mode. + Modified README and man pages to document new option. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1166 788f1e2b-df1e-0410-8736-df70ead52e1b	2008-01-03 21:32:32 +00:00
fiddlosopher	e37df6db69	Fixed bug in the markdown reader: HTML preceding a code block could cause it to be parsed as a paragraph. (The problem is that the HTML parser used to eat all blank space after an HTML block, including the indentation of the code block.) Resolves Issue #39. + In Text.Pandoc.Readers.HTML, removed parsing of following space from rawHtmlBlock. + In Text.Pandoc.Readers.Markdown, modified rawHtmlBlocks so that indentation is eaten only on the first line after the HTML block. This means that in <div> foo <div> the foo won't be treated as a code block, but in <div> foo </div> it will. This seems the right approach for least surprise. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1164 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-31 01:02:44 +00:00
fiddlosopher	f74dda27b6	Markdown reader: Make 'block' conditional on strictness state, instead of using failIfStrict in block parsers. Use a different ordering of parsers in strict mode: raw HTML block before paragraph. This recovers performance that was lost in strict mode with r1154. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1157 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-29 06:30:12 +00:00
fiddlosopher	05cbdf04fd	Markdown: better handling of parentheses in URLs and quotation marks in titles. + source parser first tries to parse URL with balanced parentheses; if that doesn't work, it tries to parse everything beginning with '(' and ending with ')'. + source parser now uses an auxiliary function source'. + linkTitle parser simplified and improved, under assumption that it will be called in context of source'. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1156 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-24 04:22:41 +00:00
fiddlosopher	ee6f06ec05	Modified markdown reader to disallow links within links. (Resolves Issue #35.) + Replaced inlinesInBalanced with inlinesInBalancedBrackets, which instead of hard-coding the inline parser takes an inline parser as a parameter. + Modified reference and inlineNote to use inlinesInBalancedBrackets. + Removed unneeded inlineString function. + Added inlineNonLink parser, which is now used in the definition of reference. + Added inlineParsers list and redefined inline and inlineNonLink parsers in terms of it. + Added failIfLink parser. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1155 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-24 04:22:31 +00:00
fiddlosopher	97992e6f7b	Improved handling of raw HTML in Markdown reader. (Resolves Issue #36.) Tags that can be either block or inline (e.g. <ins>) should be treated as block when appropriate and as inline when appropriate. Thus, for example, <ins>hi</ins> should be treated as a paragraph with inline <ins> tags, while <ins> hi </ins> should be treated as a paragraph within <ins> tags. + Moved htmlBlock after para in list of block parsers. This ensures that tags that can be either block or inline get parsed as inline when appropriate. + Modified rawHtmlInline' so that block elements aren't treated as inline. + Modified para parser so that paragraphs containing only HTML tags and blank space are not allowed. Treat these as raw HTML blocks instead. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1154 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-24 04:22:20 +00:00
fiddlosopher	fbecb49790	Modified 'source' parser in Markdown reader to allow backslash escapes in URLs. So, for example, [my](/url\(1\)) yields a link to /url(1). Resolves Issue #34. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1151 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-21 22:31:20 +00:00
fiddlosopher	0681d1d3e7	Fixed handling of email addresses in markdown and reStructuredText. Consolidated uri and email address parsers. (Resolves Issue #37.) + New emailAddress and uri parsers in Text.Pandoc.Shared. uri parser uses parseURI from Network.URI. emailAddress parser properly handles email addresses with periods in them. + Removed uri and emailAddress parsers from Text.Pandoc.Readers.RST. + Removed uri and emailAddress parsers from Text.Pandoc.Readers.Markdown. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1149 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-21 16:13:10 +00:00
fiddlosopher	aea6f6802b	Removed support for "box-style" block quotes in markdown. This adds unneeded complexity and makes pandoc diverge further than necessary from other markdown extensions. Brought documentation, tests, and debian/changelog up to date. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1141 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-12-08 19:32:18 +00:00
fiddlosopher	576ddc1b99	Modified markdown reader for new Math block. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1115 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-11-29 08:09:17 +00:00
fiddlosopher	9a67a486c2	Moved everything from src into the top-level directory. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1104 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-11-29 02:02:34 +00:00
fiddlosopher	47a4a3ab89	Removed Text directory. This is a remnant of an experiment moving the contents of src/ to the top level, and should have been deleted long ago. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1097 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-11-28 21:01:17 +00:00
fiddlosopher	4a841bfc54	Use template haskell to avoid the need for templates: + Added library Text.Pandoc.Include, with a template haskell function $(includeStrFrom fname) to include a file as a string constant at compile time. + This removes the need for the 'templates' directory or Makefile target. These have been removed. + The base source directory has been changed from src to . + A new 'data' directory has been added, containing the ASCIIMathML.js script, writer headers, and S5 files. + The src/wrappers directory has been moved to 'wrappers'. + The Text.Pandoc.ASCIIMathML library is no longer needed, since Text.Pandoc.Writers.HTML can use includeStrFrom to include the ASCIIMathML.js code directly. It has been removed. git-svn-id: https://pandoc.googlecode.com/svn/trunk@1063 788f1e2b-df1e-0410-8736-df70ead52e1b	2007-11-03 22:14:03 +00:00

41 commits