Added Text.Pandoc.UTF8 as a backup for when utf8-string is not present.

+ Added Text.Pandoc.UTF8
+ Changed flag name from utf8 to utf8-string
+ Changed CPP MACRO from _UTF8 to _UTF8STRING
+ Import IO functions from Text.Pandoc.UTF8 when utf8-string not available.
+ Removed utf8-string dependency from debian/control.
+ Removed pandoc.cabal.ghc66; we no longer support GHC 6.6
+ Modified INSTALL instructions


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1383 788f1e2b-df1e-0410-8736-df70ead52e1b
fiddlosopher 2008-08-08 00:11:58 +00:00
parent 05b366a0b2
commit 80715bd126
11 changed files with 191 additions and 156 deletions

110
INSTALL

@ -1,43 +1,32 @@
% Installing pandoc
Installing from Source
======================
Installing pandoc from Source
=============================
This method will work on all architectures, but requires that you first
install the GHC compiler and some build tools: [GNU `make`], `sed`,
`bash`, and `perl`. These are standard on unix systems (including MacOS
X). If you're using Windows, you can install [Cygwin].
[Cygwin]: http://www.cygwin.com/
[GNU `make`]: http://www.gnu.org/software/make/
This method will work on all architectures for which the GHC compiler
is available.
Installing GHC
--------------
To compile Pandoc, you'll need [GHC] version 6.6 or greater and [Cabal]
version 1.2 or greater. If you don't have GHC already, you can get it
from the [GHC Download] page. If you're compiling GHC from source, be
sure to get the `extralibs` in addition to the base tarball. GHC comes
with a version of Cabal. Note, however, that the version of Cabal that
comes with GHC 6.6 is not recent enough. So if you are using GHC 6.6,
you'll probably need to [install Cabal] separately. You can check your
Cabal version using `ghc-pkg list`.
To compile Pandoc, you'll need [GHC] version 6.8 or greater. If you
don't have GHC already, you can get it from the [GHC Download] page.
If you're compiling GHC from source, be sure to get the `extralibs`
in addition to the base tarball. Pandoc requires Cabal version 1.2 or
greater. If your GHC comes with an older version of Cabal, you'll need
to [install Cabal] separately. You can check your Cabal version using
`ghc-pkg list`.
If you're running MacOS X, it may be more convenient to install GHC
using [MacPorts] or [Fink].
If you're running MacOS X, you can also install GHC using [MacPorts] or [Fink].
If you're on a [debian]-based linux system (such as [Ubuntu]), you can get
GHC and the required libraries using `apt-get`:
sudo apt-get install ghc6 libghc6-xhtml-dev libghc6-mtl-dev libghc6-network-dev libghc6-utf8-string-dev
sudo apt-get install ghc6 libghc6-xhtml-dev libghc6-mtl-dev libghc6-network-dev
Otherwise, you should already have all the libraries you need except for
`utf8-string`. Download the [utf8-string] tarball from HackageDB, extract it, and
switch to that directory. Then:
runghc Setup.lhs configure
runghc Setup.lhs build
sudo runghc Setup.lhs install
Pandoc will use the [utf8-string] library if it is installed; otherwise, it
will use its own internal module for UTF-8 I/O. The utf8-string library is
not a required dependency, but it may improve performance slightly.
[GHC]: http://www.haskell.org/ghc/
[GHC Download]: http://www.haskell.org/ghc/download.html
@ -52,15 +41,23 @@ switch to that directory. Then:
Getting the source
------------------
Download the source tarball from pandoc's [pandoc's google code site].
Download the source tarball from [pandoc's google code site].
Extract the contents into a subdirectory:
tar xvzf pandoc-0.xy.tar.gz
tar xvzf pandoc-x.y.tar.gz
[pandoc's google code site]: http://pandoc.googlecode.com
Installing Pandoc
-----------------
Now choose one of the following methods for compiling and installing
pandoc. If you are on a linux or unix-based system, you can [install
pandoc using Make]. If not, you should [install pandoc using Cabal].
[install pandoc using Make]: #installing-pandoc-using-make
[install pandoc using Cabal]: #installing-pandoc-using-cabal
[build options]: #build-options
Installing Pandoc using Make
----------------------------
1. Change to the directory containing the Pandoc distribution.
@ -109,20 +106,26 @@ Installing Pandoc
Note that building the library documentation requires [haddock].
6. If you decide you don't want pandoc on your system, each of the
installation steps described above can be reversed:
sudo make uninstall
PREFIX=~ make uninstall-exec
sudo make uninstall-all
[haddock]: http://www.haskell.org/haddock/
Installing pandoc using Cabal
-----------------------------
Pandoc can also be installed using the standard Haskell packaging tool,
[Cabal](http://www.haskell.org/cabal/). You'll need GHC 6.6 or greater
and Cabal 1.2 or greater (see [Installing GHC](#installing-ghc), above).
Just download the source tarball, unpack, and type:
Change to the directory containing the pandoc source, and type:
runhaskell Setup.hs configure
runhaskell Setup.hs build
runhaskell Setup.hs haddock # optional, to build library documentation
runhaskell Setup.hs install # this one as root
runhaskell Setup.hs install # this one as root or sudo
This will install the pandoc executable and the Haskell libraries,
but not the shell scripts, man pages, or other documentation.
@ -134,19 +137,17 @@ flag):
runhaskell Setup.hs configure -f-library # just the executable
runhaskell Setup.hs configure -f-executable # just the libraries
Note: If you are using GHC 6.6.*, you will need to start by
replacing `pandoc.cabal` with a version suitable for GHC 6.6:
You can also specify the directory tree into which pandoc will be
installed using the `--prefix=` option to `configure`. For more details,
see the [Cabal User's Guide].
cp pandoc.cabal pandoc.cabal.orig
cp pandoc.cabal.ghc66 pandoc.cabal
[Cabal User's Guide]: http://www.haskell.org/cabal/release/latest/doc/users-guide/builders.html#setup-configure-paths
Optional syntax highlighting support
------------------------------------
Pandoc can optionally be compiled with support for syntax highlighting of
delimited code blocks. This feature requires the [`highlighting-kate` library].
It also requires Cabal version 1.2 or greater.
If you are using Cabal to compile pandoc, specify the `highlighting` flag in
the configure step:
@ -161,17 +162,6 @@ If you have already built pandoc, you may need to do a `make clean` or
[`highlighting-kate` library]: http://johnmacfarlane.net/highlighting-kate
Removing Pandoc
---------------
Each of the installation steps described above can be reversed:
sudo make uninstall
PREFIX=~ make uninstall-exec
sudo make uninstall-all
Other targets
-------------
@ -180,11 +170,7 @@ but are documented here for packagers and developers:
### Building and installing
* `configure`: Performs the needed preprocessing to create a proper
Cabal package for Pandoc:
- Builds `ASCIIMathML.hs`, `DefaultHeaders.hs`, and `S5.hs`
from templates in `src/templates` and data in `src/ASCIIMathML.js`,
`src/ui`, and `src/headers`.
* `configure`:
- Stores values of relevant environment variables in `vars` for
persistence.
- Runs Cabal's "configure" command.
@ -204,7 +190,7 @@ but are documented here for packagers and developers:
* `test`: Runs Pandoc's test suite. (All tests should pass.)
* `test-markdown`: Runs the Markdown regression test suite, using
`pandoc --strict`. (Three of the tests will fail.)
`pandoc --strict`. (One test will fail.)
### Cleaning
@ -245,14 +231,12 @@ Extract the files from the archive, and put `pandoc.exe` somewhere
in your PATH.
Note that the Windows binary distribution does not include the shell
scripts `markdown2pdf`, `html2markdown`, or `hsmarkdown`. If you need
these, compile from source.
scripts `markdown2pdf`, `html2markdown`, or `hsmarkdown`.
Installing pandoc on Debian
===========================
Pandoc is now in the debian unstable archive, and can be installed
using `apt-get` (as root):
Pandoc is now in the debian archives, and can be installed using `apt-get` (as root):
apt-get install pandoc # the program, shell scripts, and docs
apt-get install libghc6-pandoc-dev # the libraries

12
Main.hs

@ -41,11 +41,11 @@ import System.Console.GetOpt
import Data.Maybe ( fromMaybe )
import Data.Char ( toLower )
import Prelude hiding ( putStrLn, writeFile, readFile, getContents )
#ifdef _UTF8
import System.IO.UTF8
import System.IO ( stdout, stderr )
#ifdef _UTF8STRING
import System.IO.UTF8
#else
import System.IO
import Text.Pandoc.UTF8
#endif
#ifdef _CITEPROC
import Text.CSL
@ -60,10 +60,10 @@ copyrightMessage = "\nCopyright (C) 2006-7 John MacFarlane\n" ++
compileInfo :: String
compileInfo =
#ifdef _UTF8
" +utf8" ++
#ifdef _UTF8STRING
" +utf8-string" ++
#else
" -utf8" ++
" -utf8-string" ++
#endif
#ifdef _CITEPROC
" +citeproc" ++
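
The renamed macro can be tried in isolation. The following is a minimal sketch, not part of the commit: a standalone file that reuses only the `compileInfo` fragment from the hunk above and terminates the `++` chain with an empty string (the terminator is an assumption made so the fragment compiles on its own).

```haskell
{-# LANGUAGE CPP #-}
-- Sketch only: a standalone copy of the compileInfo pattern from Main.hs.
-- The trailing "" is an assumption added so the expression is complete here.

-- | Reports whether the utf8-string code path was selected at compile time.
compileInfo :: String
compileInfo =
#ifdef _UTF8STRING
  " +utf8-string" ++   -- macro defined: System.IO.UTF8 would be imported
#else
  " -utf8-string" ++   -- macro undefined: Text.Pandoc.UTF8 would be imported
#endif
  ""
```

Compiling with `-D_UTF8STRING` on the GHC command line selects the first branch; in the cabal file this is what `cpp-options: -D_UTF8STRING` does when the `utf8-string` flag is on.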


@ -120,13 +120,6 @@ cleanup_files+=$(WRAPPERS)
$(WRAPPERS): %: $(SRCDIR)/wrappers/%.in $(SRCDIR)/wrappers/*.sh
@$(generate-shell-script)
CABAL_BACKUP=$(CABAL).orig
$(CABAL_BACKUP):
cp $(CABAL) $(CABAL_BACKUP) ; \
if echo $(GHC_VERSION) | grep -q '^6.6'; then \
cp $(CABAL).ghc66 $(CABAL); \
fi
.PHONY: configure
cleanup_files+=Setup.hi Setup.o $(BUILDCMD) $(BUILDVARS)
ifdef GHC_PKG


@ -43,11 +43,11 @@ import Network.URI ( isURI )
import qualified Data.ByteString as B ( writeFile, pack )
import Data.ByteString.Internal ( c2w )
import Prelude hiding ( writeFile, readFile )
#ifdef _UTF8
import System.IO.UTF8
import System.IO ( stderr )
#ifdef _UTF8STRING
import System.IO.UTF8
#else
import System.IO
import Text.Pandoc.UTF8
#endif
-- | Produce an ODT file from OpenDocument XML.


@ -116,11 +116,11 @@ import Network.URI ( parseURI, URI (..), isAllowedInURI )
import System.FilePath ( (</>), (<.>) )
import System.IO.Error ( catch, ioError, isAlreadyExistsError )
import System.Directory
import Prelude hiding ( putStrLn )
#ifdef _UTF8
import Prelude hiding ( putStrLn, writeFile, readFile, getContents )
#ifdef _UTF8STRING
import System.IO.UTF8
#else
import System.IO
import Text.Pandoc.UTF8
#endif
--


@ -39,10 +39,10 @@ import Language.Haskell.TH.Syntax (Lift (..))
import qualified Data.ByteString as B
import Data.ByteString.Internal ( w2c )
import Prelude hiding ( readFile )
#ifdef _UTF8
#ifdef _UTF8STRING
import System.IO.UTF8
#else
import System.IO
import Text.Pandoc.UTF8
#endif
-- | Insert contents of text file into a template.

76
Text/Pandoc/UTF8.hs

@ -0,0 +1,76 @@
-- | Functions for IO using UTF-8 encoding.
--
-- The basic encoding and decoding functions are taken from
-- <http://www.cse.ogi.edu/~hallgren/Talks/LHiH/base/lib/UTF8.hs>.
-- (c) 2003, OGI School of Science & Engineering, Oregon Health and
-- Science University.
--
-- From the Char module supplied with HBC.
-- Modified by Martin Norbaeck to pass illegal UTF-8 sequences unchanged.
-- Modified by John MacFarlane to use [Word8] and export IO functions.
module Text.Pandoc.UTF8 (
putStrLn
, putStr
, hPutStrLn
, hPutStr
, getContents
, readFile
, writeFile
) where
import Data.Word
import System.IO ( Handle )
import qualified Data.ByteString.Lazy as BS
import Prelude hiding ( putStrLn, putStr, getContents, readFile, writeFile )
putStrLn :: String -> IO ()
putStrLn = BS.putStrLn . BS.pack . toUTF8
putStr :: String -> IO ()
putStr = BS.putStr . BS.pack . toUTF8
hPutStrLn :: Handle -> String -> IO ()
hPutStrLn h = BS.hPut h . BS.pack . toUTF8 . (++ "\n")
hPutStr :: Handle -> String -> IO ()
hPutStr h = BS.hPut h . BS.pack . toUTF8
readFile :: FilePath -> IO String
readFile p = BS.readFile p >>= return . fromUTF8 . BS.unpack
writeFile :: FilePath -> String -> IO ()
writeFile p = BS.writeFile p . BS.pack . toUTF8
getContents :: IO String
getContents = BS.getContents >>= return . fromUTF8 . BS.unpack
-- | Take a list of bytes in UTF-8 encoding and decode it into a Unicode string.
fromUTF8 :: [Word8] -> String
fromUTF8 [] = ""
fromUTF8 (0xef : 0xbb : 0xbf :cs) = fromUTF8 cs -- skip BOM (byte order marker)
fromUTF8 (c:c':cs) | 0xc0 <= c && c <= 0xdf &&
0x80 <= c' && c' <= 0xbf =
toEnum ((fromEnum c `mod` 0x20) * 0x40 + fromEnum c' `mod` 0x40) : fromUTF8 cs
fromUTF8 (c:c':c'':cs) | 0xe0 <= c && c <= 0xef &&
0x80 <= c' && c' <= 0xbf &&
0x80 <= c'' && c'' <= 0xbf =
toEnum ((fromEnum c `mod` 0x10 * 0x1000) + (fromEnum c' `mod` 0x40) * 0x40 + fromEnum c'' `mod` 0x40) : fromUTF8 cs
fromUTF8 (c:cs) = toEnum (fromEnum c) : fromUTF8 cs
-- | Take a Unicode string and encode it as a list of bytes in UTF-8 encoding.
toUTF8 :: String -> [Word8]
toUTF8 "" = []
toUTF8 (c:cs) =
if c > '\x0000' && c < '\x0080' then
toEnum (fromEnum c) : toUTF8 cs
else if c < toEnum 0x0800 then
let i = fromEnum c
in toEnum (0xc0 + i `div` 0x40) :
toEnum (0x80 + i `mod` 0x40) :
toUTF8 cs
else
let i = fromEnum c
in toEnum (0xe0 + i `div` 0x1000) :
toEnum (0x80 + (i `mod` 0x1000) `div` 0x40) :
toEnum (0x80 + i `mod` 0x40) :
toUTF8 cs
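
Note that `toUTF8` and `fromUTF8` above only cover one-, two-, and three-byte sequences, so code points above U+FFFF are outside their range. The following is a rough sketch, not the commit's code, of how the same encode/decode pair could be extended to four-byte sequences. The names `encodeChar`, `encodeString`, and `decodeString` are hypothetical, and the sketch assumes the same policy as the module above of passing malformed bytes through unchanged. It does not skip a BOM or reject overlong encodings.

```haskell
import Data.Bits ( shiftL, shiftR, (.&.), (.|.) )
import Data.Char ( chr, ord )
import Data.Word ( Word8 )

-- | Encode one character as 1 to 4 UTF-8 bytes (hypothetical helper).
encodeChar :: Char -> [Word8]
encodeChar c
  | i < 0x80    = [ fromIntegral i ]
  | i < 0x800   = [ 0xc0 .|. top 6,  cont 0 ]
  | i < 0x10000 = [ 0xe0 .|. top 12, cont 6,  cont 0 ]
  | otherwise   = [ 0xf0 .|. top 18, cont 12, cont 6, cont 0 ]
  where
    i = ord c
    -- Payload bits carried by the lead byte.
    top :: Int -> Word8
    top n = fromIntegral (i `shiftR` n)
    -- Continuation byte: 10xxxxxx with six payload bits.
    cont :: Int -> Word8
    cont n = fromIntegral (0x80 .|. ((i `shiftR` n) .&. 0x3f))

-- | Encode a whole string.
encodeString :: String -> [Word8]
encodeString = concatMap encodeChar

-- | Decode UTF-8 bytes; malformed bytes are passed through one at a time.
decodeString :: [Word8] -> String
decodeString [] = []
decodeString (b0:rest)
  | b0 < 0x80               = passThrough
  | b0 >= 0xc0 && b0 < 0xe0 = multi 1 (fromIntegral b0 .&. 0x1f)
  | b0 >= 0xe0 && b0 < 0xf0 = multi 2 (fromIntegral b0 .&. 0x0f)
  | b0 >= 0xf0 && b0 < 0xf8 = multi 3 (fromIntegral b0 .&. 0x07)
  | otherwise               = passThrough
  where
    passThrough = chr (fromIntegral b0) : decodeString rest
    isCont b = b >= 0x80 && b < 0xc0
    -- Consume n continuation bytes and fold their payload into the lead bits.
    multi :: Int -> Int -> String
    multi n lead =
      let (conts, rest') = splitAt n rest
          step acc b = (acc `shiftL` 6) .|. (fromIntegral b .&. 0x3f)
          code = foldl step lead conts
      in if length conts == n && all isCont conts && code <= 0x10ffff
           then chr code : decodeString rest'
           else passThrough
```

Loaded in GHCi, `encodeChar '\x20ac'` gives `[226,130,172]`, and `decodeString . encodeString` is the identity on strings of ordinary characters, including those above U+FFFF.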

4
debian/control

@ -2,7 +2,7 @@ Source: pandoc
Section: text
Priority: optional
Maintainer: Recai Oktaş <roktas@debian.org>
Build-Depends: debhelper (>= 4.0.0), haskell-devscripts (>=0.5.12), ghc6 (>= 6.8.2-1), libghc6-xhtml-dev, libghc6-mtl-dev, libghc6-network-dev, libghc6-utf8-string-dev
Build-Depends: debhelper (>= 4.0.0), haskell-devscripts (>=0.5.12), ghc6 (>= 6.8.2-1), libghc6-xhtml-dev, libghc6-mtl-dev, libghc6-network-dev
Build-Depends-Indep: haddock
Standards-Version: 3.7.3
Homepage: http://johnmacfarlane.net/pandoc/
@ -38,7 +38,7 @@ Description: general markup converter
Package: libghc6-pandoc-dev
Section: libdevel
Architecture: any
Depends: ${haskell:Depends}, libghc6-xhtml-dev, libghc6-mtl-dev, libghc6-network-dev, libghc6-utf8-string-dev
Depends: ${haskell:Depends}, libghc6-xhtml-dev, libghc6-mtl-dev, libghc6-network-dev
Suggests: pandoc-doc
Description: general markup converter
Pandoc is a Haskell library for converting from one markup format to

41
debian/copyright

@ -37,6 +37,12 @@ Copyright (C) 2008 John MacFarlane and Peter Wang
Released under the GPL.
----------------------------------------------------------------------
Text/Pandoc/Writers/OpenDocument.hs
Copyright (C) 2008 Andrea Rossato
Released under the GPL.
----------------------------------------------------------------------
ASCIIMathML.js
Copyright 2005, Peter Jipsen, Chapman University
@ -51,6 +57,41 @@ by Eric A. Meyer
Released under an explicit Public Domain License
----------------------------------------------------------------------
UTF8.hs
Copyright (c) 2003, OGI School of Science & Engineering, Oregon Health &
Science University, All rights reserved.
Modified by Martin Norbäck, to pass illegal utf-8 sequences through unchanged.
Modified 2006-8 John MacFarlane.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
- Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
- Neither the name of OGI or OHSU nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
----------------------------------------------------------------------
Text/XML/Light/*
(c) 2007 Galois Inc.


@ -64,8 +64,8 @@ Flag executable
Flag library
Description: Build the pandoc library.
Default: True
Flag utf8
Description: Compile in support for UTF-8 input and output.
Flag utf8-string
Description: Use utf8-string library for UTF-8 I/O.
Default: True
Flag citeproc
Description: Compile in support for citeproc-hs bibliographic formatting.
@ -79,9 +79,11 @@ Library
if flag(highlighting)
Build-depends: highlighting-kate
cpp-options: -D_HIGHLIGHTING
if flag(utf8)
if flag(utf8-string)
Build-depends: utf8-string
cpp-options: -D_UTF8
cpp-options: -D_UTF8STRING
else
Other-Modules: Text.Pandoc.UTF8
if flag(citeproc)
Build-depends: citeproc-hs
cpp-options: -D_CITEPROC
@ -141,8 +143,11 @@ Executable pandoc
if flag(highlighting)
cpp-options: -D_HIGHLIGHTING
if flag(utf8)
cpp-options: -D_UTF8
if flag(utf8-string)
Build-depends: utf8-string
cpp-options: -D_UTF8STRING
else
Other-Modules: Text.Pandoc.UTF8
if flag(citeproc)
Build-depends: citeproc-hs
cpp-options: -D_CITEPROC


@ -1,64 +0,0 @@
Name: pandoc
Version: 0.47
License: GPL
License-File: COPYING
Copyright: (c) 2006-2007 John MacFarlane
Author: John MacFarlane <jgm@berkeley.edu>
Maintainer: John MacFarlane <jgm@berkeley.edu>
Stability: alpha
Homepage: http://johnmacfarlane.net/pandoc
Package-URL: http://pandoc.googlecode.com/files/pandoc-0.47.tar.gz
Category: Text
Tested-With: GHC
Synopsis: Conversion between markup formats
Description: Pandoc is a Haskell library for converting from one markup
format to another, and a command-line tool that uses
this library. It can read markdown and (subsets of)
reStructuredText, HTML, and LaTeX, and it can write
markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook,
RTF, groff man pages, and S5 HTML slide shows.
.
Pandoc extends standard markdown syntax with footnotes,
embedded LaTeX, definition lists, tables, and other
features. A compatibility mode is provided for those
who need a drop-in replacement for Markdown.pl.
.
In contrast to existing tools for converting markdown
to HTML, which use regex substitutions, pandoc has
a modular design: it consists of a set of readers,
which parse text in a given format and produce a native
representation of the document, and a set of writers,
which convert this native representation into a target
format. Thus, adding an input or output format requires
only adding a reader or writer.
Build-Depends: base, parsec, xhtml, mtl, regex-compat, network
Hs-Source-Dirs: .
Exposed-Modules: Text.Pandoc,
Text.Pandoc.Blocks,
Text.Pandoc.Definition,
Text.Pandoc.CharacterReferences,
Text.Pandoc.Shared,
Text.Pandoc.UTF8,
Text.Pandoc.ASCIIMathML,
Text.Pandoc.DefaultHeaders,
Text.Pandoc.Highlighting,
Text.Pandoc.Readers.HTML,
Text.Pandoc.Readers.LaTeX,
Text.Pandoc.Readers.Markdown,
Text.Pandoc.Readers.RST,
Text.Pandoc.Readers.TeXMath,
Text.Pandoc.Writers.Docbook,
Text.Pandoc.Writers.HTML,
Text.Pandoc.Writers.LaTeX,
Text.Pandoc.Writers.ConTeXt,
Text.Pandoc.Writers.Man,
Text.Pandoc.Writers.Markdown,
Text.Pandoc.Writers.RST,
Text.Pandoc.Writers.RTF,
Text.Pandoc.Writers.S5
Ghc-Options: -O0
Executable: pandoc
Hs-Source-Dirs: .
Main-Is: Main.hs
Ghc-Options: -O0