Removed advice to pipe through tidy before HTML reader.

This is obsolete, now that we have a forgiving HTML parser.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1827 788f1e2b-df1e-0410-8736-df70ead52e1b
fiddlosopher 2010-02-02 16:39:15 +00:00
parent 9fee73d2a3
commit 183ea8d839
2 changed files with 1 addition and 8 deletions

5
README

@@ -96,10 +96,7 @@ Supported input formats include `markdown`, `html`, `latex`, and `rst`.
 Note that the `rst` reader only parses a subset of reStructuredText
 syntax. For example, it doesn't handle tables, option lists, or
 footnotes. But for simple documents it should be adequate. The `latex`
-and `html` readers are also limited in what they can do. Because the
-`html` reader is picky about the HTML it parses, it is recommended that
-you pipe HTML through [HTML Tidy] before sending it to `pandoc`, or use
-the `html2markdown` script described below.
+and `html` readers are also limited in what they can do.
 If you don't specify a reader or writer explicitly, `pandoc` will
 try to determine the input and output format from the extensions of


@@ -60,10 +60,6 @@ should pipe input and output through `iconv`:
 iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
-Pandoc's HTML parser is not very forgiving. If your input is
-HTML, consider running it through `tidy`(1) before passing it
-to Pandoc. Or use `html2markdown`(1), a wrapper around `pandoc`.
 # OPTIONS
 -f *FORMAT*, -r *FORMAT*, \--from=*FORMAT*, \--read=*FORMAT*