Allow absolute URI as parameter (in this case, content is downloaded).

+ Adds dependency on HTTP.
+ If a parameter is an absolute URI, pandoc will try to get the content via HTTP.
+ So, you can do: pandoc -r html -w markdown http://www.fsf.org

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1826 788f1e2b-df1e-0410-8736-df70ead52e1b
parent 19b0c72dd1
commit 9fee73d2a3
4 changed files with 28 additions and 7 deletions
README (11 lines changed)
@@ -62,12 +62,17 @@ Note that you can specify multiple input files on the command line.
 `pandoc` will concatenate them all (with blank lines between them)
 before parsing:
 
-    pandoc -s ch1.txt ch2.txt refs.txt > book.html
+    pandoc -s ch1.txt ch2.txt refs.txt > book.html
 
 (The `-s` option here tells `pandoc` to produce a standalone HTML file,
 with a proper header, rather than a fragment. For more details on this
 and many other command-line options, see below.)
 
+Instead of a filename, you can specify an absolute URI. In this
+case pandoc will attempt to download the content via HTTP:
+
+    pandoc -f html -t markdown http://www.fsf.org
+
 The format of the input and output can be specified explicitly using
 command-line options. The input format can be specified using the
 `-r/--read` or `-f/--from` options, the output format using the
@@ -113,7 +118,9 @@ Character encodings
 -------------------
 
 All input is assumed to be in the UTF-8 encoding, and all output
-is in UTF-8. If your local character encoding is not UTF-8 and you use
+is in UTF-8 (unless your version of pandoc was compiled using
+GHC 6.12 or higher, in which case the local encoding will be used).
+If your local character encoding is not UTF-8 and you use
 accented or foreign characters, you should pipe the input and output
 through [`iconv`]. For example,
 

@@ -26,6 +26,11 @@ format). For output to a file, use the `-o` option:
 
     pandoc -o output.html input.txt
 
+Instead of a file, an absolute URI may be given. In this case
+pandoc will fetch the content using HTTP:
+
+    pandoc -f html -t markdown http://www.fsf.org
+
 The input and output formats may be specified using command-line options
 (see **OPTIONS**, below, for details). If these formats are not
 specified explicitly, Pandoc will attempt to determine them
@@ -48,9 +53,10 @@ markdown: the differences are described in the *README* file in
 the user documentation. If standard markdown syntax is desired, the
 `--strict` option may be used.
 
-Pandoc uses the UTF-8 character encoding for both input and output.
-If your local character encoding is not UTF-8, you should pipe input
-and output through `iconv`:
+Pandoc uses the UTF-8 character encoding for both input and output
+(unless compiled with GHC 6.12 or higher, in which case it uses
+the local encoding). If your local character encoding is not UTF-8, you
+should pipe input and output through `iconv`:
 
     iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
 

@@ -145,7 +145,8 @@ Library
                  mtl >= 1.1, network >= 2, filepath >= 1.1,
                  process >= 1, directory >= 1,
                  bytestring >= 0.9, zip-archive >= 0.1.1.4,
-                 utf8-string >= 0.3, old-time >= 1
+                 utf8-string >= 0.3, old-time >= 1,
+                 HTTP >= 4000.0
   if impl(ghc >= 6.10)
     Build-depends: base >= 4 && < 5, syb
   else

@@ -59,6 +59,9 @@ import Text.CSL
 import Text.Pandoc.Biblio
 #endif
 import Control.Monad (when, unless)
+import Network.HTTP
+import Network.URI (parseURI)
+import Data.ByteString.Lazy.UTF8 (toString)
 
 copyrightMessage :: String
 copyrightMessage = "\nCopyright (C) 2006-8 John MacFarlane\n" ++
@@ -731,7 +734,11 @@ main = do
   let readSources [] = mapM readSource ["-"]
       readSources srcs = mapM readSource srcs
       readSource "-" = getContents
-      readSource src = readFile src
+      readSource src = case parseURI src of
+                            Just u -> readURI u
+                            Nothing -> readFile src
+      readURI uri = simpleHTTP (mkRequest GET uri) >>= getResponseBody >>=
+                      return . toString -- treat all as UTF8
 
   let convertTabs = tabFilter (if preserveTabs then 0 else tabStop)
 
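
The way `readSource` chooses between standard input, a URI, and a local file can be tried in isolation. What follows is a minimal sketch, not the code from this commit: the `Source` type and the `classifySource` function are hypothetical names introduced for illustration. It assumes `Network.URI` is available (shipped in the `network` package at the time of this commit, and in the separate `network-uri` package in later releases). It only classifies arguments and downloads nothing.

```haskell
import Network.URI (parseURI)

-- Hypothetical type for illustration: where an input argument points.
data Source
  = StdIn              -- the "-" argument
  | Remote String      -- anything parseURI accepts as an absolute URI
  | LocalFile FilePath -- everything else is treated as a file path
  deriving (Eq, Show)

-- Same clause order as readSource in the diff:
-- "-" first, then an absolute URI, then a plain file.
classifySource :: String -> Source
classifySource "-" = StdIn
classifySource src =
  case parseURI src of
    Just _  -> Remote src
    Nothing -> LocalFile src

main :: IO ()
main = mapM_ (print . classifySource) ["-", "ch1.txt", "http://www.fsf.org"]
```

Note that `parseURI` accepts any absolute URI, not only `http:` ones, so an argument such as `file:///tmp/a.txt` would also take the download branch. Inspecting the parsed URI's `uriScheme` field (which includes the trailing colon, as in `"http:"`) would be one way to narrow the check.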