Commit graph

170 commits

Author SHA1 Message Date
fiddlosopher
235e41f246 Commented out some unneeded code in HTML reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1325 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 00:56:53 +00:00
fiddlosopher
7701a87a1a Code cleanup - RST reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1324 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 00:54:43 +00:00
fiddlosopher
e9668bc988 Added type declarations to avoid -Wall 'defaulting' warnings in writers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1323 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 00:32:21 +00:00
fiddlosopher
5be53bbd3f LaTeX reader - Code cleanup.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1322 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 00:14:58 +00:00
fiddlosopher
3f04b1d275 OpenDocument writer: Fixed typo in inline styles fix (super/sup).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1321 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-14 21:43:52 +00:00
fiddlosopher
3e2afa7a49 Code cleanup in LaTeX reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1320 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-14 16:19:42 +00:00
fiddlosopher
8f41289306 OpenDocument writer: support nested inline styles.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1319 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-14 16:19:28 +00:00
fiddlosopher
d5c73ac42a Code cleanup in TexMath reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1318 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:58:35 +00:00
fiddlosopher
43ebdafc07 Code cleanup in ConTeXt writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1317 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:53:21 +00:00
fiddlosopher
048aeabebe Code cleanup in Texinfo writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1316 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:49:32 +00:00
fiddlosopher
9f14bf7d0c Code cleanup in Man writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1315 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:39:52 +00:00
fiddlosopher
b325a5d490 Code cleanup in DocBook writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1314 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:33:27 +00:00
fiddlosopher
45a734878e Code cleanup in RTF writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1313 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:25:52 +00:00
fiddlosopher
5fb67cab88 Code cleanup in RST writer to eliminate -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1311 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:16:44 +00:00
fiddlosopher
ea31c82dbd Code cleanup in markdown writer to eliminate -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1310 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 23:10:58 +00:00
fiddlosopher
e9dc5aa02e Code cleanup in Text.Pandoc.Blocks to eliminate -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1309 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 22:50:57 +00:00
fiddlosopher
b973bc9e3e Code cleanup in HTML writer to eliminate -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1308 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 17:08:55 +00:00
fiddlosopher
a76b920f03 Code cleanup in LaTeX writer to eliminate -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1307 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:53:06 +00:00
fiddlosopher
1b3328ca38 OpenDocument writer: Indented bulleted lists as we do enumerated lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1306 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:32:14 +00:00
fiddlosopher
be719b2a44 Added distinction between tight and loose lists in OpenDocument writer.
(For bullet and enumerated lists only.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1305 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:31:56 +00:00
fiddlosopher
6bbe5d435d Fixed bugs in OpenDocument writer affecting nested block quotes.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1304 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:31:44 +00:00
fiddlosopher
7447ecc255 Escape '\160' as "&#160;", not "&nbsp;" in XML.
"nbsp" isn't a predefined XML entity.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1303 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:31:34 +00:00
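
The escaping rule described in the commit above can be illustrated with a minimal sketch. It is not pandoc's Haskell code; the function name `escape_xml` and the replacement table are assumptions made for illustration. XML predefines only five named entities (amp, lt, gt, quot, apos), so a nonbreaking space (U+00A0, written '\160' in Haskell) has to be emitted as a numeric character reference.

```python
# Hypothetical helper, not pandoc's implementation.
XML_REPLACEMENTS = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "\u00a0": "&#160;",  # nonbreaking space: numeric reference, since &nbsp; is not predefined in XML
}


def escape_xml(text: str) -> str:
    """Escape characters that are unsafe or non-portable in XML text."""
    return "".join(XML_REPLACEMENTS.get(ch, ch) for ch in text)


print(escape_xml("Mr.\u00a0Brown & Co."))
```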
fiddlosopher
752adcd45a Added type signatures and fixed other -Wall warnings in Markdown reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1301 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 16:33:21 +00:00
fiddlosopher
45044ff536 Added a few more recognized abbreviations to 'abbrev' parser.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1300 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 03:00:35 +00:00
fiddlosopher
a85dfb83bb Print unicode \160 literally in markdown writer, rather than as &nbsp;.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1299 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 02:15:10 +00:00
fiddlosopher
824bb2d22e In smart mode, use nonbreaking spaces after abbreviations in markdown parser.
Thus, for example, "Mr. Brown" comes out as "Mr.~Brown" in LaTeX, and does
not produce a sentence-separating space.  Resolves Issue #75.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1298 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 02:14:57 +00:00
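
A rough sketch of the behaviour described in the commit above, under stated assumptions: the abbreviation list, `protect_abbreviations`, and `to_latex` are hypothetical names, and pandoc's actual parser works on its own document model rather than on raw strings. The idea is to replace the space after a known abbreviation with a nonbreaking space, which a LaTeX writer can then render as `~`.

```python
import re

# Hypothetical, deliberately short list; the real 'abbrev' parser knows more.
ABBREVIATIONS = ("Mr.", "Mrs.", "Dr.")

_ABBREV_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r") "
)


def protect_abbreviations(text: str) -> str:
    """Replace the space after a known abbreviation with U+00A0."""
    return _ABBREV_PATTERN.sub(lambda m: m.group(1) + "\u00a0", text)


def to_latex(text: str) -> str:
    """Render nonbreaking spaces the way a LaTeX writer would, as '~'."""
    return text.replace("\u00a0", "~")


print(to_latex(protect_abbreviations("Mr. Brown arrived.")))
```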
fiddlosopher
8ed710bc9d Treat '\ ' in (extended) markdown as nonbreaking space.
Print nonbreaking space appropriately in each writer (e.g. ~ in LaTeX).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1297 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 01:24:15 +00:00
fiddlosopher
2df112bdf6 Changed inDefinition and indentPara to stInDefinition and stIndentPara
(for consistency).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1295 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-10 23:26:12 +00:00
fiddlosopher
6b73389328 Added type signatures, etc., to eliminate -Wall warnings.
(except for two warnings about unneeded functions, which might
come in handy some day...)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1291 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-17 22:15:39 +00:00
fiddlosopher
24c1d71a37 Cleaned up Text/Pandoc/CharacterReferences.hs to avoid -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1289 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-17 19:07:17 +00:00
fiddlosopher
18e44501a7 Cleaned up Text/Pandoc/Shared.hs to avoid -Wall warnings.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1288 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-17 19:07:11 +00:00
fiddlosopher
5eb204c86c OpenDocument Writer: Fixed handling of spaces and tabs in preformatted blocks.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1286 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-12 17:14:36 +00:00
fiddlosopher
d07e5825d3 Small fix to indentation of code blocks inside defn lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1285 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-09 21:15:24 +00:00
fiddlosopher
94e2d373bc OpenDocument writer: Fixed indentation for verbatim blocks inside defn lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1284 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-09 21:13:25 +00:00
fiddlosopher
dd454eb5ed OpenDocument writer: Use different bullets for different list levels.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1283 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-09 21:08:06 +00:00
fiddlosopher
2f88263833 OpenDocument writer: Return empty Doc in title block for null author, title, date.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1282 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-09 21:07:59 +00:00
fiddlosopher
8add1cf821 OpenDocument writer: don't convert 4 spaces to tab in verbatim block.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1281 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-09 21:07:52 +00:00
fiddlosopher
3337b46e30 Use \textsubscr instead of \textsubscript for LaTeX subscript macro.
\textsubscript conflicts with a definition in the memoir class.
Resolves Issue #65.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1280 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-08 03:29:09 +00:00
fiddlosopher
cd38d4ae79 Markdown smart typography: Em dashes no longer eat surrounding whitespace.
Resolves Issue #69.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1279 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-08 03:20:15 +00:00
fiddlosopher
69c2bde7d9 HTML writer: In code blocks, change leading newlines to <br /> tags.
(Some browsers ignore them.)  Resolves Issue #71.
See http://six.pairlist.net/pipermail/markdown-discuss/2008-May/001297.html


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1277 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-05-15 00:54:33 +00:00
fiddlosopher
6a46ffc0ad Count anything that isn't a known block (HTML) tag as an inline tag
(rather than the other way around).  Added "html", "head", and
"body" to list of block tags.  Resolves Issue #66, allowing
<lj> to count as an inline tag.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1276 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-04-20 03:12:42 +00:00
fiddlosopher
2fdcd1701a Fixed haddock documentation error
(affects latest version of haddock only).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1275 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-31 20:20:02 +00:00
fiddlosopher
447efa0f9f Fixed bug in RTF writer:
Extra spaces were being printed after emphasized, boldface, and
other inline elements.  Resolves Issue #64.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1274 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-26 21:46:12 +00:00
fiddlosopher
8624ed9bd3 The '--sanitize-html' option now examines URIs in markdown links
and images, and in HTML href and src attributes.  If the URI scheme
is not on a whitelist of safe schemes, it is rejected.  The main point
is to prevent cross-site scripting attacks using 'javascript:' URIs.
See http://www.mail-archive.com/markdown-discuss@six.pairlist.net/msg01186.html
and http://ha.ckers.org/xss.html.  Resolves Issue #62.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1262 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-22 20:41:56 +00:00
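
The scheme check described in the commit above might look roughly like the following sketch. It is not pandoc's code: the set `SAFE_SCHEMES`, the function `is_safe_uri`, and the decision to accept scheme-less (relative) URIs are all assumptions for illustration, and a production sanitizer would need more care than this.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist; the commit does not list the schemes pandoc accepts.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}


def is_safe_uri(uri: str) -> bool:
    """Return True if the URI has no scheme or a whitelisted scheme."""
    # Drop whitespace and control characters first, since some browsers
    # ignore them inside the scheme (e.g. "java\tscript:").
    cleaned = "".join(ch for ch in uri if ch > " ")
    scheme = urlsplit(cleaned).scheme.lower()
    # Assumption: relative URIs (empty scheme) are treated as safe.
    return scheme == "" or scheme in SAFE_SCHEMES


for candidate in ("http://example.com/", "javascript:alert(1)", "/local/page.html"):
    print(candidate, is_safe_uri(candidate))
```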
fiddlosopher
4988441f3c Fixed handling of Quoted inline elements to use unicode left & right quotes.
Added inQuotes auxiliary function.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1261 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-21 15:45:54 +00:00
fiddlosopher
97a3b2234a OpenDocument: use the new Definition_20_Term and
Definition_20_Definition styles and some other minor cleanup.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1256 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-19 18:46:27 +00:00
fiddlosopher
d2643c25e2 Code cleanup only.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1255 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-19 18:46:18 +00:00
fiddlosopher
ee644ddda0 OpenDocument writer: Don't print raw HTML.
(Note: For the DocBook writer, it makes sense to pass through
HTML raw, since the "HTML" might be DocBook XML.  But this isn't
desirable for the OpenDocument writer, it seems to me.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1254 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-19 18:46:09 +00:00
fiddlosopher
dcb37d32d1 Moved XML-formatting functions to new unexported module Text.Pandoc.XML.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1253 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-19 18:46:01 +00:00
fiddlosopher
2baf6d09ee Andrea Rossato's patch for OpenDocument support.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1252 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-03-19 18:45:51 +00:00
fiddlosopher
94dcbab55f Modified disallowedInNode in Texinfo writer to correct list of disallowed characters.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1246 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-24 18:21:19 +00:00
fiddlosopher
839f77d81e Use style attributes rather than css classes for strikethrough and ordered list styles.
This works better when fragments, rather than standalone documents, are generated.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1245 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-24 18:15:36 +00:00
fiddlosopher
858269dd20 Changes to Texinfo writer:
+ No space between paragraph and following @verbatim (provides more
  pleasing appearance in text formats)
+ Blank line consistently after list environments.
+ Removed deVerb.
+ Use @code instead of @verb for inline code (this solves the character
  escaping problem for texi2dvi and texi2pdf).
+ Modified test suite accordingly.
+ Added Peter Wang to copyright statement (for Texinfo.hs).
+ Added news of Texinfo writer to README.
+ Added Texinfo to list of formats in man page, and removed extra 'groff'.
+ Updated macports with Texinfo format.
+ Updated FreeBSD pkg-descr with Texinfo format.
+ Updated web page with Texinfo writer.
+ Added demos for Texinfo writer.
+ Added Texinfo to package description in debian/control.
+ Added texi & texinfo extensions to Main.hs, and fixed bug in determining
  default output extension.
+ Changed from texinfo to texi extension in web demo.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1244 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-24 05:48:59 +00:00
fiddlosopher
49e0e507b7 Committed novalazy's initial patch for texinfo output,
including tests for texinfo writer.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1243 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-24 05:48:41 +00:00
fiddlosopher
270eb7bed4 Moved BlockWrapper and wrappedBlocksToDoc from ConTeXt writer to Shared.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1242 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-24 05:48:31 +00:00
fiddlosopher
9a7ca4d9aa Minor changes due to changes in highlighting-kate API.
defaultHighlightingCss now imported rather than duplicated.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1235 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-10 18:59:54 +00:00
fiddlosopher
742a465980 Support for startFrom="nn" to select starting line number in syntax highlighting.
Changed argument of highlightHtml to Attr, not [String], for generality.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1232 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-10 18:59:34 +00:00
fiddlosopher
04b32451be Added build option for syntax highlighting, with *optional* dependency on highlighting-kate.
+ pandoc.cabal includes a flag, 'highlighting', that causes a dependency
  on highlighting-kate.
+ if Setup.hs detects this dependency, it copies templates/Highlighting.yes.hs
  to Text/Pandoc/Highlighting.hs.  Otherwise, it copies templates/Highlighting.no.hs.
+ The HTML writer imports this new module instead of Text.Highlighting.Kate.
  The new module exports highlightHtml, which either uses highlighting-kate to
  perform syntax highlighting or automatically returns a failure code, depending
  on whether highlighting support was selected.
+ --version now prints information about whether syntax highlighting support is compiled in.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1221 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:21:19 +00:00
fiddlosopher
ba6981a4a1 Removed Text.Regex dependencies by rewriting using plain Haskell:
+ from Text.Pandoc.Writers.RTF
+ from Text.Pandoc.Writers.HTML
+ from Main
+ from pandoc.cabal


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1219 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:21:04 +00:00
fiddlosopher
087739ad7a Moved Text.Pandoc.Writers.DefaultHeaders -> Text.Pandoc.DefaultHeaders.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1218 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:20:55 +00:00
fiddlosopher
c150e84d3c Add default CSS to document header if syntax highlighting used.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1215 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:20:25 +00:00
fiddlosopher
b3709022ef Added preliminary support for syntax highlighting to HTML writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1213 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:20:13 +00:00
fiddlosopher
06b544360e Factored codeBlock into separate codeBlockIndented and codeBlockDelimited.
Do not use codeBlockDelimited in strict mode.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1211 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:20:02 +00:00
fiddlosopher
614547b38e Use generic attributes type, not a string, for CodeBlocks.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1209 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:19:43 +00:00
fiddlosopher
f2354590f9 Put language class information in pre tag, not code tag, in HTML code blocks.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1207 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:19:24 +00:00
fiddlosopher
2e683e8b53 Fixed delimited code blocks: eat blank lines afterwards, and allow end line
to contain more tildes than beginning line.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1206 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:19:17 +00:00
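
The two rules in the commit above (a closing line may carry more tildes than the opening line, and blank lines after the block are consumed) can be sketched as follows. This is a line-based illustration, not pandoc's Parsec parser; the function names and the minimum fence length of three tildes are assumptions.

```python
def leading_tildes(line: str) -> int:
    """Count the tildes at the start of a line."""
    return len(line) - len(line.lstrip("~"))


def parse_delimited_code_block(lines):
    """Return (code_lines, remaining_lines), or None if no block starts here."""
    if not lines:
        return None
    fence_len = leading_tildes(lines[0])
    if fence_len < 3:  # assumed minimum fence length
        return None
    for i in range(1, len(lines)):
        candidate = lines[i].strip()
        # The closing line is all tildes and at least as long as the opener.
        if candidate and set(candidate) == {"~"} and len(candidate) >= fence_len:
            rest = lines[i + 1:]
            while rest and not rest[0].strip():  # eat blank lines afterwards
                rest = rest[1:]
            return lines[1:i], rest
    return None


print(parse_delimited_code_block(["~~~", "x = 1", "~~~~~", "", "next paragraph"]))
```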
fiddlosopher
24f22ee7ac Added a needed try to {} attribute parser.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1205 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:19:10 +00:00
fiddlosopher
046c6b0d0d Added support for multiple classes in delimited code block.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1204 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:19:01 +00:00
fiddlosopher
b06ddad4bc Initial support for delimited code blocks in markdown reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1203 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:18:54 +00:00
fiddlosopher
769e2f3cf5 HTML writer: if language specified for code block, print as <code> class.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1202 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:18:47 +00:00
fiddlosopher
2c13782fdf Modified writers for new argument place in CodeBlock.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1200 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:18:22 +00:00
fiddlosopher
9f7a14c210 Modified readers for new parameter in CodeBlock.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1199 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:18:03 +00:00
fiddlosopher
6a921f0966 Added parameter for class to CodeBlock (for syntax highlighting).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1198 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-02-09 03:17:49 +00:00
fiddlosopher
42359e63c9 Fixed bug in RST reader, which would choke on: "p. one\ntwo\n".
Added some try's in ordered list parsers.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1191 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-17 02:14:20 +00:00
fiddlosopher
d474852f56 Removed unnecessary imports.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1189 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-16 02:18:23 +00:00
fiddlosopher
6795ffd93f Bumped version to 0.47.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1187 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 20:25:46 +00:00
fiddlosopher
8fca649d05 Changed copyright dates where appropriate to include 2008.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1181 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 17:26:16 +00:00
fiddlosopher
2df432dc60 Changed comment used to replace unsafe HTML if sanitize-html option
selected.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1178 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 04:53:01 +00:00
fiddlosopher
b9e30ca8b7 RST reader: Fixed bug in parsing explicit links (resolves Issue #44).
The problem was that we were looking for inlines until a '<' character
signaled the start of the URL.  So if you hit a reference-style link,
it would keep looking til the end of the document.  Fix:  change
inline => (notFollowedBy (char '`') >> inline).  Note that this won't
allow code inlines in links, but these aren't allowed in reST anyway.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1175 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:55 +00:00
fiddlosopher
85657add6a RST reader: cleaned up parsing of reference names in key blocks and links.
Allow nonquoted reference links to contain isolated '.', '-', '_', so
that strings like 'a_b_' count as links.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1174 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:43 +00:00
fiddlosopher
e4837c140c RST reader: Removed unnecessary check for following link in str.
This is unnecessary now that link is above str in the definition of
'inline'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1173 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:38 +00:00
fiddlosopher
d271473044 Fixed markdown reader to handle "*hi **there***" as a strong nested in an emph.
(A '*' is only recognized as the end of the emphasis if it's not the beginning
of a strong emphasis.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1172 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:31 +00:00
fiddlosopher
0e94609e18 Markdown reader: Moved blockQuote parser before list parsers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1171 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:24 +00:00
fiddlosopher
0921704d92 Use an interpreted text role to render math in restructuredText.
See http://www.american.edu/econ/itex2mml/mathhack.rst for the
strategy.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1168 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-04 18:59:00 +00:00
fiddlosopher
ec3f6b649f Refactored RST writer to use a record instead of a tuple for state,
and to include options in state so it doesn't need to be passed as
a parameter.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1167 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-04 18:58:50 +00:00
fiddlosopher
5df912b162 Added optional HTML sanitization using a whitelist.
When this option is specified (--sanitize-html on the command line),
unsafe HTML tags will be replaced by HTML comments, and unsafe HTML
attributes will be removed.  This option should be especially useful
for those who want to use pandoc libraries in web applications, where
users will provide the input.

+ Main.hs:  Added --sanitize-html option.
+ Text.Pandoc.Shared:  Added stateSanitizeHTML to ParserState.
+ Text.Pandoc.Readers.HTML:
  - Added whitelists of sanitaryTags and sanitaryAttributes.
  - Added parsers to check these lists (and state) to see if a given
    tag or attribute should be counted unsafe.
  - Modified anyHtmlTag and anyHtmlEndTag to replace unsafe tags
    with comments.
  - Modified htmlAttribute to remove unsafe attributes.
  - Modified htmlScript and htmlStyle to remove these elements if
    unsafe.
  - Modified rawHtmlBlock to use anyHtmlBlockTag instead of anyHtmlTag
    and anyHtmlEndTag.  This fixes a bug in markdown parsing, where
    inline tags would be included in raw HTML blocks.
  - Modified anyHtmlBlockTag to test for (not inline) rather than
    directly for block.  This allows us to handle e.g. docbook in
    the markdown reader.
  - Minor tweaks in nonTitleNonHead  and parseTitle.
+ Text.Pandoc.Readers.Markdown:
  - In non-strict mode use rawHtmlBlocks instead of htmlBlock.
    Simplified htmlBlock, since we know it's only called in strict
    mode.
+ Modified README and man pages to document new option.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1166 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-03 21:32:32 +00:00
fiddlosopher
e37df6db69 Fixed bug in the markdown reader: HTML preceding a code block
could cause it to be parsed as a paragraph.  (The problem is that
the HTML parser used to eat all blank space after an HTML block,
including the indentation of the code block.)  Resolves Issue #39.
+ In Text.Pandoc.Readers.HTML, removed parsing of following space
  from rawHtmlBlock.
+ In Text.Pandoc.Readers.Markdown, modified rawHtmlBlocks so that
  indentation is eaten *only* on the first line after the HTML
  block.  This means that in
  <div>
       foo
  </div>
  the foo won't be treated as a code block, but in
  <div>

      foo

  </div>
  it will.  This seems the right approach for least surprise.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1164 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 01:02:44 +00:00
fiddlosopher
ad5cbb78d0 HTML reader: Finished fixing Issue #40.
Contents of script tags were still being treated as markdown when
the script tags were parsed as inline.  Fixed by moving "script"
from the list of tags that can be either block or inline to the
list of block tags.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1163 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 00:45:54 +00:00
fiddlosopher
d989a78b3b HTML reader: Don't interpret contents of style tags as markdown.
Resolves Issue #40.
+ Added htmlStyle, analogous to htmlScript.
+ Use htmlStyle in htmlBlockElement and rawHtmlInline.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1162 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 00:05:03 +00:00
fiddlosopher
6e1a652429 Fixed bug in HTML reader: it was looking for <IT> tag, not <I>.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1161 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-30 02:21:01 +00:00
fiddlosopher
6c1f9c8e39 Made LaTeX reader properly recognize --parse-raw in rawLaTeXInline.
Updated LaTeX reader test to use --parse-raw.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1160 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-30 02:07:55 +00:00
fiddlosopher
a465c4f659 Changed handling of titles in HTML writer so you don't get "titleprefix - "
followed by nothing.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1159 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 21:11:33 +00:00
fiddlosopher
5e65598b9e Use wrappers around Doc elements to ensure proper spacing in ConTeXt writer.
Each block element is wrapped with either Pad or Reg.  Pad'ed elements are
guaranteed to have a blank line in between.  Updated ConTeXt tests.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1158 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 09:31:45 +00:00
fiddlosopher
f74dda27b6 Markdown reader: Make 'block' conditional on strictness state,
instead of using failIfStrict in block parsers. Use a different
ordering of parsers in strict mode: raw HTML block before paragraph.
This recovers performance that was lost in strict mode with r1154.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1157 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 06:30:12 +00:00
fiddlosopher
05cbdf04fd Markdown: better handling of parentheses in URLs and quotation marks in titles.
+ source parser first tries to parse URL with balanced parentheses;
  if that doesn't work, it tries to parse everything beginning with
  '(' and ending with ')'.
+ source parser now uses an auxiliary function source'.
+ linkTitle parser simplified and improved, under assumption that it
  will be called in context of source'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1156 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:41 +00:00
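
The two-stage strategy in the commit above (try balanced parentheses first, then fall back) can be sketched like this. It is an illustration only: `parse_source` is a hypothetical name, link titles are ignored, and reading the fallback as "stop at the first ')'" is an assumption about what the commit means.

```python
def parse_source(text: str):
    """Parse '(url)' at the start of text; return (url, rest) or None."""
    if not text.startswith("("):
        return None
    # First attempt: find the ')' that balances the opening '('.
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return text[1:i], text[i + 1:]
    # Fallback (assumed): parentheses are unbalanced, so take everything
    # between the opening '(' and the first ')'.
    end = text.find(")")
    if end == -1:
        return None
    return text[1:end], text[end + 1:]


print(parse_source("(http://en.wikipedia.org/wiki/Foo_(bar)) and more"))
```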
fiddlosopher
ee6f06ec05 Modified markdown reader to disallow links within links. (Resolves Issue #35.)
+ Replaced inlinesInBalanced with inlinesInBalancedBrackets, which instead
  of hard-coding the inline parser takes an inline parser as a parameter.
+ Modified reference and inlineNote to use inlinesInBalancedBrackets.
+ Removed unneeded inlineString function.
+ Added inlineNonLink parser, which is now used in the definition of
  reference.
+ Added inlineParsers list and redefined inline and inlineNonLink parsers
  in terms of it.
+ Added failIfLink parser.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1155 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:31 +00:00
fiddlosopher
97992e6f7b Improved handling of raw HTML in Markdown reader. (Resolves Issue #36.)
Tags that can be either block or inline (e.g. <ins>) should be treated
as block when appropriate and as inline when appropriate.  Thus, for
example,
  <ins>hi</ins>
should be treated as a paragraph with inline <ins> tags, while
  <ins>

  hi

  </ins>
should be treated as a paragraph within <ins> tags.
+ Moved htmlBlock after para in list of block parsers.  This ensures
  that tags that can be either block or inline get parsed as inline
  when appropriate.
+ Modified rawHtmlInline' so that block elements aren't treated as inline.
+ Modified para parser so that paragraphs containing only HTML tags and
  blank space are not allowed.  Treat these as raw HTML blocks instead.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1154 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:20 +00:00
fiddlosopher
dad8e16330 Changed failure message in anyHtmlBlockTag (minor change).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1153 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:08 +00:00
fiddlosopher
6802d287cf Modified rawHtmlBlock in HTML reader so it parses </html> and </body> tags.
This allows these tags to be handled correctly in Markdown.
HTML reader now uses rawHtmlBlock', which excludes </html> and </body>,
since these are handled in parseHtml.  (Resolves Issue #38.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1152 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-23 03:46:12 +00:00
fiddlosopher
fbecb49790 Modified 'source' parser in Markdown reader to allow backslash escapes in URLs.
So, for example, [my](/url\(1\)) yields a link to /url(1).  Resolves Issue #34.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1151 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-21 22:31:20 +00:00
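
The backslash-escape handling from the commit above can be sketched with a small scanner. This is not pandoc's parser: `read_url` is a hypothetical helper, it expects the text immediately after the opening '(', and it treats a backslash before any character as an escape, which is a simplifying assumption.

```python
def read_url(text: str):
    """Read a URL up to the first unescaped ')'; return (url, rest) or None."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            out.append(text[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == ")":
            return "".join(out), text[i + 1:]
        else:
            out.append(ch)
            i += 1
    return None  # no closing parenthesis found


# The example from the commit message: [my](/url\(1\))
print(read_url(r"/url\(1\)) rest"))
```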