2006-12-20 20:54:23 +00:00
|
|
|
{-
|
2007-07-07 22:51:55 +00:00
|
|
|
Copyright (C) 2006-7 John MacFarlane <jgm@berkeley.edu>
|
2006-12-20 20:54:23 +00:00
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
|
|
it under the terms of the GNU General Public License as published by
|
|
|
|
the Free Software Foundation; either version 2 of the License, or
|
|
|
|
(at your option) any later version.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program; if not, write to the Free Software
|
|
|
|
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
|
|
|
-}
|
|
|
|
|
2006-12-20 06:50:14 +00:00
|
|
|
{- |
|
|
|
|
Module : Text.Pandoc.Writers.HTML
|
2007-07-07 22:51:55 +00:00
|
|
|
Copyright : Copyright (C) 2006-7 John MacFarlane
|
2006-12-20 06:50:14 +00:00
|
|
|
License : GNU GPL, version 2 or above
|
|
|
|
|
2007-07-07 22:51:55 +00:00
|
|
|
Maintainer : John MacFarlane <jgm@berkeley.edu>
|
2006-12-20 20:20:10 +00:00
|
|
|
Stability : alpha
|
2006-12-20 06:50:14 +00:00
|
|
|
Portability : portable
|
|
|
|
|
|
|
|
Conversion of 'Pandoc' documents to HTML.
|
|
|
|
-}
|
2007-08-15 06:00:58 +00:00
|
|
|
module Text.Pandoc.Writers.HTML ( writeHtml , writeHtmlString ) where
|
2006-10-17 14:22:29 +00:00
|
|
|
import Text.Pandoc.Definition
|
2007-07-26 02:19:20 +00:00
|
|
|
import Text.Pandoc.ASCIIMathML
|
2007-08-15 06:00:58 +00:00
|
|
|
import Text.Pandoc.CharacterReferences ( decodeCharacterReferences )
|
2006-10-17 14:22:29 +00:00
|
|
|
import Text.Pandoc.Shared
|
2006-12-20 18:16:07 +00:00
|
|
|
import Text.Regex ( mkRegex, matchRegex )
|
2006-10-17 14:22:29 +00:00
|
|
|
import Numeric ( showHex )
|
2006-12-20 18:16:07 +00:00
|
|
|
import Data.Char ( ord, toLower )
|
2007-08-15 06:00:58 +00:00
|
|
|
import Data.List ( isPrefixOf, intersperse )
|
2007-08-08 02:43:15 +00:00
|
|
|
import qualified Data.Set as S
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
import Control.Monad.State
|
2007-08-08 02:43:15 +00:00
|
|
|
import Text.XHtml.Transitional
|
2006-10-17 14:22:29 +00:00
|
|
|
|
2007-07-22 20:00:49 +00:00
|
|
|
data WriterState = WriterState
|
2007-08-08 02:43:15 +00:00
|
|
|
{ stNotes :: [Html] -- ^ List of notes
|
|
|
|
, stIds :: [String] -- ^ List of header identifiers
|
|
|
|
, stMath :: Bool -- ^ Math is used in document
|
|
|
|
, stCSS :: S.Set String -- ^ CSS to include in header
|
2007-07-22 20:00:49 +00:00
|
|
|
} deriving Show
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
|
2007-08-08 02:43:15 +00:00
|
|
|
defaultWriterState :: WriterState
|
|
|
|
defaultWriterState = WriterState {stNotes= [], stIds = [],
|
|
|
|
stMath = False, stCSS = S.empty}
|
|
|
|
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
-- | Convert Pandoc document to Html string.
|
|
|
|
writeHtmlString :: WriterOptions -> Pandoc -> String
|
|
|
|
writeHtmlString opts =
|
|
|
|
if writerStandalone opts
|
2007-08-15 06:00:58 +00:00
|
|
|
then renderHtml . writeHtml opts
|
|
|
|
else renderHtmlFragment . writeHtml opts
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
|
|
|
|
-- | Convert Pandoc document to Html structure.
|
|
|
|
writeHtml :: WriterOptions -> Pandoc -> Html
|
|
|
|
writeHtml opts (Pandoc (Meta tit authors date) blocks) =
|
|
|
|
let titlePrefix = writerTitlePrefix opts
|
2007-08-08 02:43:15 +00:00
|
|
|
topTitle = evalState (inlineListToHtml opts tit) defaultWriterState
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
topTitle' = if null titlePrefix
|
|
|
|
then topTitle
|
|
|
|
else titlePrefix +++ " - " +++ topTitle
|
2007-07-22 19:32:39 +00:00
|
|
|
metadata = thetitle topTitle' +++
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
meta ! [httpequiv "Content-Type",
|
|
|
|
content "text/html; charset=UTF-8"] +++
|
|
|
|
meta ! [name "generator", content "pandoc"] +++
|
|
|
|
(toHtmlFromList $
|
|
|
|
map (\a -> meta ! [name "author", content a]) authors) +++
|
|
|
|
(if null date
|
|
|
|
then noHtml
|
2007-08-15 06:00:58 +00:00
|
|
|
else meta ! [name "date", content date])
|
|
|
|
titleHeader = if writerStandalone opts && not (null tit) &&
|
|
|
|
not (writerS5 opts)
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
then h1 ! [theclass "title"] $ topTitle
|
|
|
|
else noHtml
|
2007-07-07 03:52:10 +00:00
|
|
|
headerBlocks = filter isHeaderBlock blocks
|
2007-08-15 06:00:58 +00:00
|
|
|
ids = uniqueIdentifiers $
|
|
|
|
map (\(Header _ lst) -> lst) headerBlocks
|
|
|
|
toc = if writerTableOfContents opts
|
|
|
|
then tableOfContents opts headerBlocks ids
|
|
|
|
else noHtml
|
2007-07-22 20:00:49 +00:00
|
|
|
(blocks', newstate) =
|
2007-08-15 06:00:58 +00:00
|
|
|
runState (blockListToHtml opts blocks)
|
|
|
|
(defaultWriterState {stIds = ids})
|
|
|
|
cssLines = stCSS newstate
|
|
|
|
css = if S.null cssLines
|
|
|
|
then noHtml
|
|
|
|
else style ! [thetype "text/css"] $ primHtml $
|
|
|
|
'\n':(unlines $ S.toList cssLines)
|
|
|
|
math = if stMath newstate
|
|
|
|
then case writerASCIIMathMLURL opts of
|
|
|
|
Just path -> script ! [src path,
|
|
|
|
thetype "text/javascript"] $
|
|
|
|
noHtml
|
|
|
|
Nothing -> primHtml asciiMathMLScript
|
|
|
|
else noHtml
|
|
|
|
head = header $ metadata +++ math +++ css +++
|
|
|
|
primHtml (writerHeader opts)
|
|
|
|
notes = reverse (stNotes newstate)
|
|
|
|
before = primHtml $ writerIncludeBefore opts
|
|
|
|
after = primHtml $ writerIncludeAfter opts
|
|
|
|
thebody = before +++ titleHeader +++ toc +++ blocks' +++
|
|
|
|
footnoteSection opts notes +++ after
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
in if writerStandalone opts
|
2007-08-15 06:00:58 +00:00
|
|
|
then head +++ body thebody
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
else thebody
|
2006-10-17 14:22:29 +00:00
|
|
|
|
2007-07-07 19:08:11 +00:00
|
|
|
-- | Construct table of contents from list of header blocks and identifiers.
|
|
|
|
-- Assumes there are as many identifiers as header blocks.
|
2007-07-22 20:00:49 +00:00
|
|
|
tableOfContents :: WriterOptions -> [Block] -> [String] -> Html
|
2007-07-07 19:08:11 +00:00
|
|
|
tableOfContents opts headers ids =
|
2007-08-15 06:00:58 +00:00
|
|
|
let opts' = opts { writerIgnoreNotes = True }
|
2007-07-07 19:08:11 +00:00
|
|
|
contentsTree = hierarchicalize headers
|
2007-08-15 06:00:58 +00:00
|
|
|
contents = evalState (mapM (elementToListItem opts') contentsTree)
|
|
|
|
(defaultWriterState {stIds = ids})
|
2007-07-07 21:49:46 +00:00
|
|
|
in thediv ! [identifier "toc"] $ unordList contents
|
2007-07-07 19:08:11 +00:00
|
|
|
|
|
|
|
-- | Converts an Element to a list item for a table of contents,
|
|
|
|
-- retrieving the appropriate identifier from state.
|
|
|
|
elementToListItem :: WriterOptions -> Element -> State WriterState Html
|
|
|
|
elementToListItem opts (Blk _) = return noHtml
|
|
|
|
elementToListItem opts (Sec headerText subsecs) = do
|
2007-07-22 20:00:49 +00:00
|
|
|
st <- get
|
|
|
|
let ids = stIds st
|
2007-07-07 19:08:11 +00:00
|
|
|
let (id, rest) = if null ids
|
2007-07-22 20:00:49 +00:00
|
|
|
then ("", [])
|
2007-07-07 19:08:11 +00:00
|
|
|
else (head ids, tail ids)
|
2007-07-22 20:00:49 +00:00
|
|
|
put $ st {stIds = rest}
|
2007-07-07 19:08:11 +00:00
|
|
|
txt <- inlineListToHtml opts headerText
|
|
|
|
subHeads <- mapM (elementToListItem opts) subsecs
|
|
|
|
let subList = if null subHeads
|
|
|
|
then noHtml
|
|
|
|
else unordList subHeads
|
2007-08-15 06:00:58 +00:00
|
|
|
return $ (anchor ! [href ("#" ++ id), identifier ("TOC-" ++ id)] $ txt) +++
|
|
|
|
subList
|
2007-07-07 19:08:11 +00:00
|
|
|
|
2006-12-20 06:50:14 +00:00
|
|
|
-- | Convert list of Note blocks to a footnote <div>.
|
|
|
|
-- Assumes notes are sorted.
|
2007-07-22 20:00:49 +00:00
|
|
|
footnoteSection :: WriterOptions -> [Html] -> Html
|
2007-01-06 09:54:58 +00:00
|
|
|
footnoteSection opts notes =
|
2006-12-20 06:50:14 +00:00
|
|
|
if null notes
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
then noHtml
|
2007-08-15 06:00:58 +00:00
|
|
|
else thediv ! [theclass "footnotes"] $ hr +++ (olist << notes)
|
2006-12-19 23:13:03 +00:00
|
|
|
|
2006-10-17 14:22:29 +00:00
|
|
|
-- | Obfuscate a "mailto:" link using Javascript.
|
2007-07-28 06:38:49 +00:00
|
|
|
obfuscateLink :: WriterOptions -> String -> String -> Html
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
obfuscateLink opts text src =
|
2007-07-28 06:38:49 +00:00
|
|
|
let emailRegex = mkRegex "^mailto:([^@]*)@(.*)$"
|
2007-08-15 06:00:58 +00:00
|
|
|
src' = map toLower src
|
|
|
|
in case (matchRegex emailRegex src') of
|
|
|
|
(Just [name, domain]) ->
|
|
|
|
let domain' = substitute "." " dot " domain
|
|
|
|
at' = obfuscateChar '@'
|
|
|
|
(linkText, altText) =
|
|
|
|
if text == drop 7 src' -- autolink
|
|
|
|
then ("'<code>'+e+'</code>'", name ++ " at " ++ domain')
|
|
|
|
else ("'" ++ text ++ "'", text ++ " (" ++ name ++ " at " ++
|
|
|
|
domain' ++ ")")
|
|
|
|
in if writerStrictMarkdown opts
|
|
|
|
then -- need to use primHtml or &'s are escaped to & in URL
|
|
|
|
primHtml $ "<a href=\"" ++ (obfuscateString src')
|
|
|
|
++ "\">" ++ (obfuscateString text) ++ "</a>"
|
|
|
|
else (script ! [thetype "text/javascript"] $
|
|
|
|
primHtml ("\n<!--\nh='" ++
|
|
|
|
obfuscateString domain ++ "';a='" ++ at' ++ "';n='" ++
|
|
|
|
obfuscateString name ++ "';e=n+a+h;\n" ++
|
|
|
|
"document.write('<a h'+'ref'+'=\"ma'+'ilto'+':'+e+'\">'+" ++
|
|
|
|
linkText ++ "+'<\\/'+'a'+'>');\n// -->\n")) +++
|
|
|
|
noscript (primHtml $ obfuscateString altText)
|
|
|
|
_ -> anchor ! [href src] $ primHtml text -- malformed email
|
2006-10-17 14:22:29 +00:00
|
|
|
|
|
|
|
-- | Obfuscate character as entity.
|
|
|
|
obfuscateChar :: Char -> String
|
2006-12-20 06:50:14 +00:00
|
|
|
obfuscateChar char =
|
2007-08-15 06:00:58 +00:00
|
|
|
let num = ord char
|
|
|
|
numstr = if even num then show num else "x" ++ showHex num ""
|
|
|
|
in "&#" ++ numstr ++ ";"
|
2006-10-17 14:22:29 +00:00
|
|
|
|
2006-12-20 18:16:07 +00:00
|
|
|
-- | Obfuscate string using entities.
|
|
|
|
obfuscateString :: String -> String
|
2007-08-15 06:00:58 +00:00
|
|
|
obfuscateString = concatMap obfuscateChar . decodeCharacterReferences
|
2006-12-20 18:16:07 +00:00
|
|
|
|
2007-07-08 17:33:03 +00:00
|
|
|
-- | True if character is a punctuation character (unicode).
|
|
|
|
isPunctuation :: Char -> Bool
|
|
|
|
isPunctuation c =
|
2007-08-15 06:00:58 +00:00
|
|
|
let c' = ord c
|
|
|
|
in if c `elem` "!\"'()*,-./:;<>?[\\]`{|}~" || c' >= 0x2000 && c' <= 0x206F ||
|
|
|
|
c' >= 0xE000 && c' <= 0xE0FF
|
|
|
|
then True
|
|
|
|
else False
|
2007-07-08 17:33:03 +00:00
|
|
|
|
2007-08-08 02:43:15 +00:00
|
|
|
-- | Add CSS for document header.
|
|
|
|
addToCSS :: String -> State WriterState ()
|
|
|
|
addToCSS item = do
|
2007-07-22 20:00:49 +00:00
|
|
|
st <- get
|
2007-08-08 02:43:15 +00:00
|
|
|
let current = stCSS st
|
2007-08-15 06:00:58 +00:00
|
|
|
put $ st {stCSS = S.insert item current}
|
2007-07-22 20:00:49 +00:00
|
|
|
|
2007-07-07 03:52:10 +00:00
|
|
|
-- | Convert Pandoc inline list to plain text identifier.
|
|
|
|
inlineListToIdentifier :: [Inline] -> String
|
|
|
|
inlineListToIdentifier [] = ""
|
|
|
|
inlineListToIdentifier (x:xs) =
|
|
|
|
xAsText ++ inlineListToIdentifier xs
|
|
|
|
where xAsText = case x of
|
2007-08-15 06:00:58 +00:00
|
|
|
Str s -> filter (\c -> c == '-' || not (isPunctuation c)) $
|
|
|
|
concat $ intersperse "-" $ words $ map toLower s
|
|
|
|
Emph lst -> inlineListToIdentifier lst
|
|
|
|
Strikeout lst -> inlineListToIdentifier lst
|
|
|
|
Superscript lst -> inlineListToIdentifier lst
|
|
|
|
Subscript lst -> inlineListToIdentifier lst
|
|
|
|
Strong lst -> inlineListToIdentifier lst
|
|
|
|
Quoted _ lst -> inlineListToIdentifier lst
|
|
|
|
Code s -> s
|
|
|
|
Space -> "-"
|
|
|
|
EmDash -> "-"
|
|
|
|
EnDash -> "-"
|
|
|
|
Apostrophe -> ""
|
|
|
|
Ellipses -> ""
|
|
|
|
LineBreak -> "-"
|
|
|
|
TeX _ -> ""
|
|
|
|
HtmlInline _ -> ""
|
|
|
|
Link lst _ -> inlineListToIdentifier lst
|
|
|
|
Image lst _ -> inlineListToIdentifier lst
|
|
|
|
Note _ -> ""
|
2007-07-07 03:52:10 +00:00
|
|
|
|
|
|
|
-- | Return unique identifiers for list of inline lists.
|
|
|
|
uniqueIdentifiers :: [[Inline]] -> [String]
|
|
|
|
uniqueIdentifiers ls =
|
2007-07-07 19:08:11 +00:00
|
|
|
let addIdentifier (nonuniqueIds, uniqueIds) l =
|
2007-07-07 03:52:10 +00:00
|
|
|
let new = inlineListToIdentifier l
|
2007-07-07 19:08:11 +00:00
|
|
|
matches = length $ filter (== new) nonuniqueIds
|
2007-07-08 03:40:22 +00:00
|
|
|
new' = new ++ if matches > 0 then ("-" ++ show matches) else ""
|
2007-07-07 19:08:11 +00:00
|
|
|
in (new:nonuniqueIds, new':uniqueIds)
|
2007-08-15 06:00:58 +00:00
|
|
|
in reverse $ snd $ foldl addIdentifier ([],[]) ls
|
2007-07-07 03:52:10 +00:00
|
|
|
|
2006-10-17 14:22:29 +00:00
|
|
|
-- | Convert Pandoc block element to HTML.
|
2007-07-07 03:52:10 +00:00
|
|
|
blockToHtml :: WriterOptions -> Block -> State WriterState Html
|
2007-08-15 06:00:58 +00:00
|
|
|
blockToHtml opts Null = return $ noHtml
|
|
|
|
blockToHtml opts (Plain lst) = inlineListToHtml opts lst
|
|
|
|
blockToHtml opts (Para lst) = inlineListToHtml opts lst >>= (return . paragraph)
|
|
|
|
blockToHtml opts (RawHtml str) = return $ primHtml str
|
|
|
|
blockToHtml opts (HorizontalRule) = return $ hr
|
|
|
|
blockToHtml opts (CodeBlock str) = return $ pre $ thecode << (str ++ "\n")
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
-- the final \n for consistency with Markdown.pl
|
2007-08-15 06:00:58 +00:00
|
|
|
blockToHtml opts (BlockQuote blocks) =
|
|
|
|
-- in S5, treat list in blockquote specially
|
|
|
|
-- if default is incremental, make it nonincremental;
|
|
|
|
-- otherwise incremental
|
|
|
|
if writerS5 opts
|
|
|
|
then let inc = not (writerIncremental opts) in
|
|
|
|
case blocks of
|
|
|
|
[BulletList lst] -> blockToHtml (opts {writerIncremental = inc})
|
|
|
|
(BulletList lst)
|
|
|
|
[OrderedList attribs lst] ->
|
|
|
|
blockToHtml (opts {writerIncremental = inc})
|
|
|
|
(OrderedList attribs lst)
|
|
|
|
otherwise -> blockListToHtml opts blocks >>=
|
|
|
|
(return . blockquote)
|
|
|
|
else blockListToHtml opts blocks >>= (return . blockquote)
|
|
|
|
blockToHtml opts (Header level lst) = do
|
|
|
|
contents <- inlineListToHtml opts lst
|
|
|
|
st <- get
|
|
|
|
let ids = stIds st
|
|
|
|
let (id, rest) = if null ids
|
|
|
|
then ("", [])
|
|
|
|
else (head ids, tail ids)
|
|
|
|
put $ st {stIds = rest}
|
2007-08-20 18:42:09 +00:00
|
|
|
let attribs = if writerStrictMarkdown opts && not (writerTableOfContents opts)
|
|
|
|
then []
|
|
|
|
else [identifier id]
|
2007-08-15 06:00:58 +00:00
|
|
|
let headerHtml = case level of
|
|
|
|
1 -> h1 contents ! attribs
|
|
|
|
2 -> h2 contents ! attribs
|
|
|
|
3 -> h3 contents ! attribs
|
|
|
|
4 -> h4 contents ! attribs
|
|
|
|
5 -> h5 contents ! attribs
|
|
|
|
6 -> h6 contents ! attribs
|
|
|
|
_ -> paragraph contents ! attribs
|
|
|
|
return $ if writerTableOfContents opts
|
|
|
|
then anchor ! [href ("#TOC-" ++ id)] $ headerHtml
|
|
|
|
else headerHtml
|
|
|
|
blockToHtml opts (BulletList lst) = do
|
|
|
|
contents <- mapM (blockListToHtml opts) lst
|
|
|
|
let attribs = if writerIncremental opts
|
|
|
|
then [theclass "incremental"]
|
|
|
|
else []
|
|
|
|
return $ unordList ! attribs $ contents
|
|
|
|
blockToHtml opts (OrderedList (startnum, numstyle, _) lst) = do
|
|
|
|
contents <- mapM (blockListToHtml opts) lst
|
|
|
|
let numstyle' = camelCaseToHyphenated $ show numstyle
|
|
|
|
let attribs = (if writerIncremental opts
|
|
|
|
then [theclass "incremental"]
|
|
|
|
else []) ++
|
|
|
|
(if startnum /= 1
|
|
|
|
then [start startnum]
|
|
|
|
else []) ++
|
|
|
|
(if numstyle /= DefaultStyle
|
|
|
|
then [theclass numstyle']
|
|
|
|
else [])
|
|
|
|
if numstyle /= DefaultStyle
|
|
|
|
then addToCSS $ "ol." ++ numstyle' ++
|
|
|
|
" { list-style-type: " ++
|
|
|
|
numstyle' ++ "; }"
|
|
|
|
else return ()
|
|
|
|
return $ ordList ! attribs $ contents
|
|
|
|
blockToHtml opts (DefinitionList lst) = do
|
|
|
|
contents <- mapM (\(term, def) -> do term' <- inlineListToHtml opts term
|
|
|
|
def' <- blockListToHtml opts def
|
|
|
|
return $ (term', def')) lst
|
|
|
|
let attribs = if writerIncremental opts
|
|
|
|
then [theclass "incremental"]
|
|
|
|
else []
|
|
|
|
return $ defList ! attribs $ contents
|
|
|
|
blockToHtml opts (Table capt aligns widths headers rows) = do
|
|
|
|
let alignStrings = map alignmentToString aligns
|
|
|
|
captionDoc <- if null capt
|
|
|
|
then return noHtml
|
|
|
|
else inlineListToHtml opts capt >>= return . caption
|
|
|
|
colHeads <- colHeadsToHtml opts alignStrings
|
|
|
|
widths headers
|
|
|
|
rows' <- mapM (tableRowToHtml opts alignStrings) rows
|
|
|
|
return $ table $ captionDoc +++ colHeads +++ rows'
|
2007-01-15 19:52:42 +00:00
|
|
|
|
2007-08-15 06:00:58 +00:00
|
|
|
colHeadsToHtml opts alignStrings widths headers = do
|
|
|
|
heads <- sequence $ zipWith3
|
|
|
|
(\align width item -> tableItemToHtml opts th align width item)
|
|
|
|
alignStrings widths headers
|
|
|
|
return $ tr $ toHtmlFromList heads
|
2007-01-15 19:52:42 +00:00
|
|
|
|
|
|
|
alignmentToString alignment = case alignment of
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
AlignLeft -> "left"
|
|
|
|
AlignRight -> "right"
|
|
|
|
AlignCenter -> "center"
|
2007-01-15 19:52:42 +00:00
|
|
|
AlignDefault -> "left"
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
|
2007-08-15 06:00:58 +00:00
|
|
|
tableRowToHtml opts aligns cols =
|
|
|
|
(sequence $ zipWith3 (tableItemToHtml opts td) aligns (repeat 0) cols) >>=
|
|
|
|
return . tr . toHtmlFromList
|
2007-01-15 19:52:42 +00:00
|
|
|
|
2007-08-15 06:00:58 +00:00
|
|
|
tableItemToHtml opts tag align' width item = do
|
|
|
|
contents <- blockListToHtml opts item
|
|
|
|
let attrib = [align align'] ++
|
|
|
|
if width /= 0
|
|
|
|
then [thestyle ("{width: " ++ show (truncate (100*width)) ++
|
|
|
|
"%;}")]
|
|
|
|
else []
|
|
|
|
return $ tag ! attrib $ contents
|
2007-01-06 09:54:58 +00:00
|
|
|
|
2007-07-07 03:52:10 +00:00
|
|
|
blockListToHtml :: WriterOptions -> [Block] -> State WriterState Html
|
2007-08-15 06:00:58 +00:00
|
|
|
blockListToHtml opts lst =
|
|
|
|
mapM (blockToHtml opts) lst >>= return . toHtmlFromList
|
2006-10-17 14:22:29 +00:00
|
|
|
|
|
|
|
-- | Convert list of Pandoc inline elements to HTML.
|
2007-07-07 03:52:10 +00:00
|
|
|
inlineListToHtml :: WriterOptions -> [Inline] -> State WriterState Html
|
2007-08-15 06:00:58 +00:00
|
|
|
inlineListToHtml opts lst =
|
|
|
|
mapM (inlineToHtml opts) lst >>= return . toHtmlFromList
|
2006-10-17 14:22:29 +00:00
|
|
|
|
|
|
|
-- | Convert Pandoc inline element to HTML.
|
2007-07-07 03:52:10 +00:00
|
|
|
inlineToHtml :: WriterOptions -> Inline -> State WriterState Html
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
inlineToHtml opts inline =
|
|
|
|
case inline of
|
|
|
|
(Str str) -> return $ stringToHtml str
|
|
|
|
(Space) -> return $ stringToHtml " "
|
|
|
|
(LineBreak) -> return $ br
|
|
|
|
(EmDash) -> return $ primHtmlChar "mdash"
|
|
|
|
(EnDash) -> return $ primHtmlChar "ndash"
|
|
|
|
(Ellipses) -> return $ primHtmlChar "hellip"
|
|
|
|
(Apostrophe) -> return $ primHtmlChar "rsquo"
|
2007-08-15 06:00:58 +00:00
|
|
|
(Emph lst) -> inlineListToHtml opts lst >>= return . emphasize
|
|
|
|
(Strong lst) -> inlineListToHtml opts lst >>= return . strong
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
(Code str) -> return $ thecode << str
|
2007-08-15 06:00:58 +00:00
|
|
|
(Strikeout lst) -> addToCSS
|
|
|
|
".strikeout { text-decoration: line-through; }" >>
|
2007-07-22 19:32:39 +00:00
|
|
|
inlineListToHtml opts lst >>=
|
2007-08-15 06:00:58 +00:00
|
|
|
return . (thespan ! [theclass "strikeout"])
|
|
|
|
(Superscript lst) -> inlineListToHtml opts lst >>= return . sup
|
|
|
|
(Subscript lst) -> inlineListToHtml opts lst >>= return . sub
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
(Quoted quoteType lst) ->
|
|
|
|
let (leftQuote, rightQuote) = case quoteType of
|
|
|
|
SingleQuote -> (primHtmlChar "lsquo",
|
|
|
|
primHtmlChar "rsquo")
|
|
|
|
DoubleQuote -> (primHtmlChar "ldquo",
|
2007-08-15 06:00:58 +00:00
|
|
|
primHtmlChar "rdquo")
|
|
|
|
in do contents <- inlineListToHtml opts lst
|
|
|
|
return $ leftQuote +++ contents +++ rightQuote
|
|
|
|
(TeX str) -> (if writerUseASCIIMathML opts
|
|
|
|
then modify (\st -> st {stMath = True})
|
|
|
|
else return ()) >> return (stringToHtml str)
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
(HtmlInline str) -> return $ primHtml str
|
2007-07-28 06:38:49 +00:00
|
|
|
(Link [Code str] (src,tit)) | "mailto:" `isPrefixOf` src ->
|
2007-08-15 06:00:58 +00:00
|
|
|
return $ obfuscateLink opts str src
|
|
|
|
(Link txt (src,tit)) | "mailto:" `isPrefixOf` src -> do
|
|
|
|
linkText <- inlineListToHtml opts txt
|
|
|
|
return $ obfuscateLink opts (show linkText) src
|
|
|
|
(Link txt (src,tit)) -> do
|
|
|
|
linkText <- inlineListToHtml opts txt
|
|
|
|
return $ anchor ! ([href src] ++
|
|
|
|
if null tit then [] else [title tit]) $
|
|
|
|
linkText
|
|
|
|
(Image txt (source,tit)) -> do
|
|
|
|
alternate <- inlineListToHtml opts txt
|
|
|
|
let alternate' = renderHtmlFragment alternate
|
|
|
|
let attributes = [src source, title tit] ++
|
|
|
|
if null txt
|
|
|
|
then []
|
|
|
|
else [alt alternate']
|
|
|
|
return $ image ! attributes
|
|
|
|
-- note: null title included, as in Markdown.pl
|
|
|
|
(Note contents) -> do
|
|
|
|
st <- get
|
|
|
|
let notes = stNotes st
|
|
|
|
let number = (length notes) + 1
|
|
|
|
let ref = show number
|
|
|
|
htmlContents <- blockListToNote opts ref contents
|
|
|
|
-- push contents onto front of notes
|
|
|
|
put $ st {stNotes = (htmlContents:notes)}
|
|
|
|
return $ anchor ! [href ("#fn" ++ ref),
|
|
|
|
theclass "footnoteRef",
|
|
|
|
identifier ("fnref" ++ ref)] <<
|
|
|
|
sup << ref
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
|
2007-07-07 03:52:10 +00:00
|
|
|
blockListToNote :: WriterOptions -> String -> [Block] -> State WriterState Html
|
Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required. This rewrite yields a small speed
boost and considerably cleaner code.
* Text/Pandoc/Definition.hs:
+ blocks: removed Key and Note
+ inlines: removed NoteRef, added Note
+ modified Target: there is no longer a 'Ref' target; all targets
are explicit URL, title pairs
* Text/Pandoc/Shared.hs:
+ Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
used in some of the readers.
+ Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
'replaceRefLinksBlockList', along with some auxiliary functions
used only by them. These are no longer needed, since
reference links are resolved in the Markdown and RST readers.
+ Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
to the Docbook writer, since that is now the only module that uses
them.
+ Changed name of 'escapeSGMLString' to 'escapeStringForXML'
+ Added KeyTable and NoteTable types
+ Removed fields from ParserState; 'stateKeyBlocks', 'stateKeysUsed',
'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'.
Added 'stateKeys' and 'stateNotes'.
+ Added clause for Note to 'prettyBlock'.
+ Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.
* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'
* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
line block up to and including following blank lines.
* Main.hs: Replaced --inline-links with --reference-links.
* README:
+ Documented --reference-links and removed description of --inline-links.
+ Added note that footnotes may occur anywhere in the document, but must
be at the outer level, not embedded in block elements.
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
option, added --reference-links option
* Markdown and RST readers:
+ Rewrote to fit new Pandoc definition. Since there are no longer
Note or Key blocks, all note and key blocks are parsed on a first pass
through the document. Once tables of notes and keys have been constructed,
the remaining parts of the document are reassembled and parsed.
+ Refactored link parsers.
* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
there are no longer Note or Key blocks, notes and references can be
parsed in a single pass through the document.
* RST, Markdown, and HTML writers: Rewrote using state monad new Pandoc
and definition. State is used to hold lists of references footnotes to
and be printed at the end of the document.
* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
of the different treatment of footnotes, the "notes" parameter is no
longer needed in the block and inline conversion functions.)
* Docbook writer:
+ Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
they are now used only by the Docbook writer.
+ Rewrote using new Pandoc definition. (Because of the different
treatment of footnotes, the "notes" parameter is no longer needed
in the block and inline conversion functions.)
* Updated test suite
* Throughout: old haskell98 module names replaced by hierarchical module
names, e.g. List by Data.List.
* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
in "Build-Depends."
* cabalize:
+ Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
module names are being used throughout)
+ Added mtl to BASE_DEPENDS (needed for state monad)
+ Removed html from GHC66_DEPENDS (not needed since xhtml is now used)
git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
|
|
|
blockListToNote opts ref blocks =
|
2007-07-14 03:37:41 +00:00
|
|
|
-- If last block is Para or Plain, include the backlink at the end of
|
|
|
|
-- that block. Otherwise, insert a new Plain block with the backlink.
|
|
|
|
let backlink = [HtmlInline $ " <a href=\"#fnref" ++ ref ++
|
|
|
|
"\" class=\"footnoteBackLink\"" ++
|
|
|
|
" title=\"Jump back to footnote " ++ ref ++ "\">↩</a>"]
|
|
|
|
blocks' = if null blocks
|
|
|
|
then []
|
|
|
|
else let lastBlock = last blocks
|
|
|
|
otherBlocks = init blocks
|
|
|
|
in case lastBlock of
|
|
|
|
(Para lst) -> otherBlocks ++
|
|
|
|
[Para (lst ++ backlink)]
|
|
|
|
(Plain lst) -> otherBlocks ++
|
|
|
|
[Plain (lst ++ backlink)]
|
|
|
|
_ -> otherBlocks ++ [lastBlock,
|
|
|
|
Plain backlink]
|
2007-08-15 06:00:58 +00:00
|
|
|
in do contents <- blockListToHtml opts blocks'
|
|
|
|
return $ li ! [identifier ("fn" ++ ref)] $ contents
|
Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml. Two functions
are now provided, writeHtml and writeHtmlString: the former outputs
an Html structure, the latter a rendered string. The S5 writer is
also changed, in parallel ways (writeS5, writeS5String). The Html
header is now written programmatically, so it has been removed from
the 'headers' directory. The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programatically. The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
|
|
|
|