(Note: For the DocBook writer, it makes sense to pass through
HTML raw, since the "HTML" might be DocBook XML. But this isn't
desirable for the OpenDocument writer, it seems to me.)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1254 788f1e2b-df1e-0410-8736-df70ead52e1b
This works better when fragments, rather than standalone documents, are generated.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1245 788f1e2b-df1e-0410-8736-df70ead52e1b
+ No space between paragraph and following @verbatim (provides more
pleasing appearance in text formats)
+ Blank line consistently after list environments.
+ Removed deVerb.
+ Use @code instead of @verb for inline code (this solves the character
escaping problem for texi2dvi and texi2pdf).
+ Modified test suite accordingly.
+ Added Peter Wang to copyright statement (for Texinfo.hs).
+ Added news of Texinfo writer to README.
+ Added Texinfo to list of formats in man page, and removed extra 'groff'.
+ Updated macports with Texinfo format.
+ Updated FreeBSD pkg-descr with Texinfo format.
+ Updated web page with Texinfo writer.
+ Added demos for Texinfo writer.
+ Added Texinfo to package description in debian/control.
+ Added texi & texinfo extensions to Main.hs, and fixed bug in determining
default output extension.
+ Changed from texinfo to texi extension in web demo.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1244 788f1e2b-df1e-0410-8736-df70ead52e1b
+ pandoc.cabal includes a flag, 'highlighting', that causes a dependency
on highlighting-kate.
+ if Setup.hs detects this dependency, it copies templates/Highlighting.yes.hs
to Text/Pandoc/Highlighting.hs. Otherwise, it copies templates/Highlighting.no.hs.
+ The HTML writer imports this new module instead of Text.Highlighting.Kate.
The new module exports highlightHtml, which either uses highlighting-kate to
perform syntax highlighting or automatically returns a failure code, depending
on whether highlighting support was selected.
+ --version now prints information about whether syntax highlighting support is compiled in.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1221 788f1e2b-df1e-0410-8736-df70ead52e1b
+ from Text.Pandoc.Writers.RTF
+ from Text.Pandoc.Writers.HTML
+ from Main
+ from pandoc.cabal
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1219 788f1e2b-df1e-0410-8736-df70ead52e1b
The problem was that we were looking for inlines until a '<' character
signaled the start of the URL. So if you hit a reference-style link,
it would keep looking til the end of the document. Fix: change
inline => (notFollowedBy (char '`') >> inline). Note that this won't
allow code inlines in links, but these aren't allowed in resT anyway.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1175 788f1e2b-df1e-0410-8736-df70ead52e1b
Allow nonquoted reference links to contain isolated '.', '-', '_', so
so that strings like 'a_b_' count as links.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1174 788f1e2b-df1e-0410-8736-df70ead52e1b
(A '*' is only recognized as the end of the emphasis if it's not the beginning
of a strong emphasis.)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1172 788f1e2b-df1e-0410-8736-df70ead52e1b
When this option is specified (--sanitize-html on the command line),
unsafe HTML tags will be replaced by HTML comments, and unsafe HTML
attributes will be removed. This option should be especially useful
for those who want to use pandoc libraries in web applications, where
users will provide the input.
+ Main.hs: Added --sanitize-html option.
+ Text.Pandoc.Shared: Added stateSanitizeHTML to ParserState.
+ Text.Pandoc.Readers.HTML:
- Added whitelists of sanitaryTags and sanitaryAttributes.
- Added parsers to check these lists (and state) to see if a given
tag or attribute should be counted unsafe.
- Modified anyHtmlTag and anyHtmlEndTag to replace unsafe tags
with comments.
- Modified htmlAttribute to remove unsafe attributes.
- Modified htmlScript and htmlStyle to remove these elements if
unsafe.
- Modified rawHtmlBlock to use anyHtmlBlockTag instead of anyHtmlTag
and anyHtmlEndTag. This fixes a bug in markdown parsing, where
inline tags would be included in raw HTML blocks.
- Modified anyHtmlBlockTag to test for (not inline) rather than
directly for block. This allows us to handle e.g. docbook in
the markdown reader.
- Minor tweaks in nonTitleNonHead and parseTitle.
+ Text.Pandoc.Readers.Markdown:
- In non-strict mode use rawHtmlBlocks instead of htmlBlock.
Simplified htmlBlock, since we know it's only called in strict
mode.
+ Modified README and man pages to document new option.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1166 788f1e2b-df1e-0410-8736-df70ead52e1b
could cause it to be parsed as a paragraph. (The problem is that
the HTML parser used to eat all blank space after an HTML block,
including the indentation of the code block.) Resolves Issue #39.
+ In Text.Pandoc.Readers.HTML, removed parsing of following space
from rawHtmlBlock.
+ In Text.Pandoc.Readers.Markdown, modified rawHtmlBlocks so that
indentation is eaten *only* on the first line after the HTML
block. This means that in
<div>
foo
<div>
the foo won't be treated as a code block, but in
<div>
foo
</div>
it will. This seems the right approach for least suprise.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1164 788f1e2b-df1e-0410-8736-df70ead52e1b
Contents of script tags were still being treated as markdown when
the script tags were parsed as inline. Fixed by moving "script"
from the list of tags that can be either block or inline to the
list of block tags.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1163 788f1e2b-df1e-0410-8736-df70ead52e1b
Resolves Issue #40.
+ Added htmlStyle, analagous to htmlScript.
+ Use htmlStyle in htmlBlockElement and rawHtmlInline.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1162 788f1e2b-df1e-0410-8736-df70ead52e1b
Each block element is wrapped with either Pad or Reg. Pad'ed elements are
guaranteed to have a blank line in between. Updated ConTeXt tests.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1158 788f1e2b-df1e-0410-8736-df70ead52e1b
instead of using failIfStrict in block parsers. Use a different
ordering of parsers in strict mode: raw HTML block before paragraph.
This recovers performance that was lost in strict mode with r1154.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1157 788f1e2b-df1e-0410-8736-df70ead52e1b
+ source parser first tries to parse URL with balanced parentheses;
if that doesn't work, it tries to parse everything beginning with
'(' and ending with ')'.
+ source parser now uses an auxiliary function source'.
+ linkTitle parser simplified and improved, under assumption that it
will be called in context of source'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1156 788f1e2b-df1e-0410-8736-df70ead52e1b
+ Replaced inlinesInBalanced with inlinesInBalancedBrackets, which instead
of hard-coding the inline parser takes an inline parser as a parameter.
+ Modified reference and inlineNote to use inlinesInBalancedBrackets.
+ Removed unneeded inlineString function.
+ Added inlineNonLink parser, which is now used in the definition of
reference.
+ Added inlineParsers list and redefined inline and inlineNonLink parsers
in terms of it.
+ Added failIfLink parser.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1155 788f1e2b-df1e-0410-8736-df70ead52e1b
Tags that can be either block or inline (e.g. <ins>) should be treated
as block when appropriate and as inline when appropriate. Thus, for
example,
<ins>hi</ins>
should be treated as a paragraph with inline <ins> tags, while
<ins>
hi
</ins>
should be treated as a paragraph within <ins> tags.
+ Moved htmlBlock after para in list of block parsers. This ensures
that tags that can be either block or inline get parsed as inline
when appropriate.
+ Modified rawHtmlInline' so that block elements aren't treated as inline.
+ Modified para parser so that paragraphs containing only HTML tags and
blank space are not allowed. Treat these as raw HTML blocks instead.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1154 788f1e2b-df1e-0410-8736-df70ead52e1b