1053 commits

Author SHA1 Message Date
fiddlosopher
8f8888df2e Updated INSTALL with instructions for getting GHC + libraries using apt-get.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1190 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-16 18:22:18 +00:00
fiddlosopher
d474852f56 Removed unnecessary imports.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1189 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-16 02:18:23 +00:00
fiddlosopher
155ef838d7 Added Arch linux instructions to INSTALL.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1188 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-14 05:12:16 +00:00
fiddlosopher
6795ffd93f Bumped version to 0.47.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1187 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 20:25:46 +00:00
fiddlosopher
c453e2f7d5 Made -c/--css option repeatable on the command line (like -H, -A, -B).
Documented repeatability of these options in README.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1186 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 20:21:28 +00:00
fiddlosopher
936465ade0 Updated website with news of version 0.46 release.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1185 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 19:57:27 +00:00
fiddlosopher
690677e443 Added some details to RELEASE-CHECKLIST.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1184 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 19:42:09 +00:00
fiddlosopher
93079002dc Removed redundant DISTNAME field from freebsd Makefile template.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1183 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 19:42:06 +00:00
fiddlosopher
8fca649d05 Changed copyright dates where appropriate to include 2008.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1181 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 17:26:16 +00:00
fiddlosopher
a7a519e04c Changed dates on documentation.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1180 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 17:25:57 +00:00
fiddlosopher
3c7498455e Removed unneeded link reference from website index.txt.in.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1179 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 17:09:14 +00:00
fiddlosopher
2df432dc60 Changed comment used to replace unsafe HTML if sanitize-html option
selected.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1178 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 04:53:01 +00:00
roktas
71e508f98a * Debian packaging changes:
  + Remove the empty 'include' directory in -dev package, which lintian
    complains about.
  + Bump Standards-Version to 3.7.3.
  + Use new 'Homepage:' field to specify the upstream URL on suggestion of
    lintian.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1177 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-08 04:00:35 +00:00
fiddlosopher
65e118ec58 Updated changelog.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1176 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-07 18:44:52 +00:00
fiddlosopher
b9e30ca8b7 RST reader: Fixed bug in parsing explicit links (resolves Issue #44).
The problem was that we were looking for inlines until a '<' character
signaled the start of the URL.  So if you hit a reference-style link,
it would keep looking til the end of the document.  Fix:  change
inline => (notFollowedBy (char '`') >> inline).  Note that this won't
allow code inlines in links, but these aren't allowed in reST anyway.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1175 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:55 +00:00
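The lookahead guard described in commit b9e30ca8b7 can be illustrated outside of Parsec. The following is a minimal Python sketch, not pandoc's Haskell code: the regular expressions and the function name parse_explicit_link are hypothetical stand-ins, and the negative lookahead (?!`) plays the role of notFollowedBy (char '`').

```python
import re

# Unguarded label: any run of characters up to '<'. On a reference-style
# link this scans far past the closing backtick, which is the bug described.
UNGUARDED = re.compile(r"`(?P<label>[^<]+)<(?P<url>[^>]+)>`_")

# Guarded label: each character must not be a backtick, mirroring
# (notFollowedBy (char '`') >> inline) in the commit message.
GUARDED = re.compile(r"`(?P<label>(?:(?!`)[^<])+)<(?P<url>[^>`]+)>`_")


def parse_explicit_link(src, pos=0):
    """Return (label, url) for an explicit link `label <url>`_ at pos, else None."""
    m = GUARDED.match(src, pos)
    if m is None:
        return None
    return m.group("label").strip(), m.group("url")


text = "`ref`_ then `Site <http://x.org>`_"
print(parse_explicit_link("`Pandoc <http://example.com>`_"))
print(parse_explicit_link(text))            # None: `ref`_ is not an explicit link
print(UNGUARDED.match(text).group("label"))  # runs on into the next link
```

The unguarded pattern swallows the text between two separate links, while the guarded one fails fast at the first backtick so that another parser can handle the reference-style link.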
fiddlosopher
85657add6a RST reader: cleaned up parsing of reference names in key blocks and links.
Allow nonquoted reference links to contain isolated '.', '-', '_', so
that strings like 'a_b_' count as links.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1174 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:43 +00:00
fiddlosopher
e4837c140c RST reader: Removed unnecessary check for following link in str.
This is unnecessary now that link is above str in the definition of
'inline'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1173 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:38 +00:00
fiddlosopher
d271473044 Fixed markdown reader to handle "*hi **there***" as a strong nested in an emph.
(A '*' is only recognized as the end of the emphasis if it's not the beginning
of a strong emphasis.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1172 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:31 +00:00
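The rule from commit d271473044 (a '*' ends an emphasis only when it does not begin a strong emphasis) can be sketched as a small recursive parser. This is an illustrative Python sketch under simplifying assumptions, not the markdown reader itself: only '*' delimiters are handled, and the names starts_strong, parse_span and parse are hypothetical.

```python
def starts_strong(src, i):
    """True if src[i:] opens a strong span: '**' followed by a non-space, non-'*' char."""
    return src.startswith("**", i) and i + 2 < len(src) and src[i + 2] not in " *"


def parse_span(src, i, closer):
    """Parse inlines from index i until closer ('*', '**', or None at top level).

    Returns (nodes, index after closer), or None if the closer is never found.
    """
    nodes, text = [], []

    def flush():
        if text:
            nodes.append("".join(text))
            text.clear()

    while i < len(src):
        if closer == "**" and src.startswith("**", i):
            flush()
            return nodes, i + 2
        # Key rule: '*' closes the emphasis only if it does not begin a strong span.
        if closer == "*" and src[i] == "*" and not starts_strong(src, i):
            flush()
            return nodes, i + 1
        if starts_strong(src, i):
            inner = parse_span(src, i + 2, "**")
            if inner is not None:
                flush()
                nodes.append(("strong", inner[0]))
                i = inner[1]
                continue
        elif src[i] == "*" and closer != "*" and i + 1 < len(src) and src[i + 1] not in " *":
            inner = parse_span(src, i + 1, "*")
            if inner is not None:
                flush()
                nodes.append(("emph", inner[0]))
                i = inner[1]
                continue
        text.append(src[i])  # ordinary character, or a delimiter that failed to open a span
        i += 1
    flush()
    return (nodes, i) if closer is None else None


def parse(src):
    return parse_span(src, 0, None)[0]


print(parse("*hi **there***"))
# [('emph', ['hi ', ('strong', ['there'])])]
```

Without the starts_strong check in the closing rule, the first '*' of '**there' would end the emphasis early and leave the remaining asterisks unmatched.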
fiddlosopher
0e94609e18 Markdown reader: Moved blockQuote parser before list parsers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1171 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:24 +00:00
fiddlosopher
0b3ffbf339 Added RELEASE-CHECKLIST.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1170 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-06 19:46:18 +00:00
fiddlosopher
0921704d92 Use an interpreted text role to render math in restructuredText.
See http://www.american.edu/econ/itex2mml/mathhack.rst for the
strategy.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1168 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-04 18:59:00 +00:00
fiddlosopher
ec3f6b649f Refactored RST writer to use a record instead of a tuple for state,
and to include options in state so it doesn't need to be passed as
a parameter.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1167 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-04 18:58:50 +00:00
fiddlosopher
5df912b162 Added optional HTML sanitization using a whitelist.
When this option is specified (--sanitize-html on the command line),
unsafe HTML tags will be replaced by HTML comments, and unsafe HTML
attributes will be removed.  This option should be especially useful
for those who want to use pandoc libraries in web applications, where
users will provide the input.

+ Main.hs:  Added --sanitize-html option.
+ Text.Pandoc.Shared:  Added stateSanitizeHTML to ParserState.
+ Text.Pandoc.Readers.HTML:
  - Added whitelists of sanitaryTags and sanitaryAttributes.
  - Added parsers to check these lists (and state) to see if a given
    tag or attribute should be counted unsafe.
  - Modified anyHtmlTag and anyHtmlEndTag to replace unsafe tags
    with comments.
  - Modified htmlAttribute to remove unsafe attributes.
  - Modified htmlScript and htmlStyle to remove these elements if
    unsafe.
  - Modified rawHtmlBlock to use anyHtmlBlockTag instead of anyHtmlTag
    and anyHtmlEndTag.  This fixes a bug in markdown parsing, where
    inline tags would be included in raw HTML blocks.
  - Modified anyHtmlBlockTag to test for (not inline) rather than
    directly for block.  This allows us to handle e.g. docbook in
    the markdown reader.
  - Minor tweaks in nonTitleNonHead  and parseTitle.
+ Text.Pandoc.Readers.Markdown:
  - In non-strict mode use rawHtmlBlocks instead of htmlBlock.
    Simplified htmlBlock, since we know it's only called in strict
    mode.
+ Modified README and man pages to document new option.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1166 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-03 21:32:32 +00:00
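The whitelist approach described in commit 5df912b162 (unsafe tags replaced by HTML comments, unsafe attributes removed) can be sketched roughly as follows. This Python sketch is an assumption-laden illustration, not pandoc's implementation: the two whitelists are tiny hypothetical samples rather than the real sanitaryTags and sanitaryAttributes, the replacement comment text is made up, and removal of script and style contents is not modeled. It is not suitable as a real security filter.

```python
import re

# Hypothetical sample whitelists (the real lists are much longer).
SAFE_TAGS = {"a", "b", "em", "i", "p", "strong"}
SAFE_ATTRIBUTES = {"href", "title"}

# An opening or closing tag: optional '/', a tag name, then anything up to '>'.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")
# A quoted attribute such as href="..." or title='...'.
ATTRIBUTE = re.compile(r"""([A-Za-z-]+)\s*=\s*("[^"]*"|'[^']*')""")


def sanitize(html):
    """Replace non-whitelisted tags with a comment and drop non-whitelisted attributes."""

    def fix_tag(match):
        slash, name, attrs = match.groups()
        if name.lower() not in SAFE_TAGS:
            return "<!-- unsafe HTML removed -->"  # assumed placeholder text
        kept = [f"{k}={v}" for k, v in ATTRIBUTE.findall(attrs)
                if k.lower() in SAFE_ATTRIBUTES]
        return "<" + slash + name + "".join(" " + a for a in kept) + ">"

    return TAG.sub(fix_tag, html)


print(sanitize('<a href="/x" onclick="evil()">link</a>'))
print(sanitize("<blink>hi</blink>"))
```

A whitelist is the conservative choice for user-supplied input: anything not explicitly known to be safe is dropped, so newly invented tags or attributes are rejected by default.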
fiddlosopher
a505f70f0b Made -H, -A, and -B cumulative: if they are specified multiple times,
multiple files will be included.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1165 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-01-01 20:43:48 +00:00
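The cumulative behaviour from commit a505f70f0b can be illustrated with Python's argparse, where the "append" action collects every occurrence of a flag. This is only an analogy for the behaviour, not pandoc's Main.hs option handling; the program name and the dest names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser whose -H, -A and -B flags accumulate across repeats."""
    parser = argparse.ArgumentParser(prog="demo")
    for flag, dest in (("-H", "header"), ("-A", "after"), ("-B", "before")):
        # action="append" adds each value to a list instead of overwriting it.
        parser.add_argument(flag, dest=dest, action="append", default=[])
    return parser


args = build_parser().parse_args(["-H", "a.html", "-H", "b.html", "-B", "top.html"])
print(args.header)  # ['a.html', 'b.html']
print(args.before)  # ['top.html']
print(args.after)   # []
```

Files are kept in command-line order, so a caller can include them in the order given.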
fiddlosopher
e37df6db69 Fixed bug in the markdown reader: HTML preceding a code block
could cause it to be parsed as a paragraph.  (The problem is that
the HTML parser used to eat all blank space after an HTML block,
including the indentation of the code block.)  Resolves Issue #39.
+ In Text.Pandoc.Readers.HTML, removed parsing of following space
  from rawHtmlBlock.
+ In Text.Pandoc.Readers.Markdown, modified rawHtmlBlocks so that
  indentation is eaten *only* on the first line after the HTML
  block.  This means that in
  <div>
       foo
  </div>
  the foo won't be treated as a code block, but in
  <div>

      foo

  </div>
  it will.  This seems the right approach for least surprise.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1164 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 01:02:44 +00:00
fiddlosopher
ad5cbb78d0 HTML reader: Finished fixing Issue #40.
Contents of script tags were still being treated as markdown when
the script tags were parsed as inline.  Fixed by moving "script"
from the list of tags that can be either block or inline to the
list of block tags.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1163 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 00:45:54 +00:00
fiddlosopher
d989a78b3b HTML reader: Don't interpret contents of style tags as markdown.
Resolves Issue #40.
+ Added htmlStyle, analogous to htmlScript.
+ Use htmlStyle in htmlBlockElement and rawHtmlInline.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1162 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-31 00:05:03 +00:00
fiddlosopher
6e1a652429 Fixed bug in HTML reader: it was looking for <IT> tag, not <I>.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1161 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-30 02:21:01 +00:00
fiddlosopher
6c1f9c8e39 Made LaTeX reader properly recognize --parse-raw in rawLaTeXInline.
Updated LaTeX reader test to use --parse-raw.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1160 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-30 02:07:55 +00:00
fiddlosopher
a465c4f659 Changed handling of titles in HTML writer so you don't get "titleprefix - "
followed by nothing.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1159 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 21:11:33 +00:00
fiddlosopher
5e65598b9e Use wrappers around Doc elements to ensure proper spacing in ConTeXt writer.
Each block element is wrapped with either Pad or Reg.  Pad'ed elements are
guaranteed to have a blank line in between.  Updated ConTeXt tests.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1158 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 09:31:45 +00:00
fiddlosopher
f74dda27b6 Markdown reader: Make 'block' conditional on strictness state,
instead of using failIfStrict in block parsers. Use a different
ordering of parsers in strict mode: raw HTML block before paragraph.
This recovers performance that was lost in strict mode with r1154.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1157 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-29 06:30:12 +00:00
fiddlosopher
05cbdf04fd Markdown: better handling of parentheses in URLs and quotation marks in titles.
+ source parser first tries to parse URL with balanced parentheses;
  if that doesn't work, it tries to parse everything beginning with
  '(' and ending with ')'.
+ source parser now uses an auxiliary function source'.
+ linkTitle parser simplified and improved, under assumption that it
  will be called in context of source'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1156 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:41 +00:00
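The two-stage strategy from commit 05cbdf04fd (first try a URL with balanced parentheses, then fall back to everything between '(' and ')') might look like this in outline. This is a hypothetical Python sketch, not the source parser itself: the function name parse_source is borrowed only for readability, and link titles are ignored.

```python
def parse_source(src, pos):
    """Parse '(url)' starting at src[pos]; return (url, index after ')') or None."""
    if pos >= len(src) or src[pos] != "(":
        return None
    # First attempt: find the ')' that balances the opening '('.
    depth = 0
    for i in range(pos, len(src)):
        if src[i] == "(":
            depth += 1
        elif src[i] == ")":
            depth -= 1
            if depth == 0:
                return src[pos + 1:i], i + 1
    # Fallback: the parentheses never balance, so stop at the first ')'.
    end = src.find(")", pos)
    if end == -1:
        return None
    return src[pos + 1:end], end + 1


print(parse_source("(http://en.wikipedia.org/wiki/Foo_(bar))", 0)[0])
# http://en.wikipedia.org/wiki/Foo_(bar)
print(parse_source("(/url(1)", 0)[0])
# /url(1
```

Trying the balanced form first keeps URLs such as Wikipedia disambiguation links intact, while the fallback still accepts unbalanced input.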
fiddlosopher
ee6f06ec05 Modified markdown reader to disallow links within links. (Resolves Issue #35.)
+ Replaced inlinesInBalanced with inlinesInBalancedBrackets, which instead
  of hard-coding the inline parser takes an inline parser as a parameter.
+ Modified reference and inlineNote to use inlinesInBalancedBrackets.
+ Removed unneeded inlineString function.
+ Added inlineNonLink parser, which is now used in the definition of
  reference.
+ Added inlineParsers list and redefined inline and inlineNonLink parsers
  in terms of it.
+ Added failIfLink parser.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1155 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:31 +00:00
fiddlosopher
97992e6f7b Improved handling of raw HTML in Markdown reader. (Resolves Issue #36.)
Tags that can be either block or inline (e.g. <ins>) should be treated
as block when appropriate and as inline when appropriate.  Thus, for
example,
  <ins>hi</ins>
should be treated as a paragraph with inline <ins> tags, while
  <ins>

  hi

  </ins>
should be treated as a paragraph within <ins> tags.
+ Moved htmlBlock after para in list of block parsers.  This ensures
  that tags that can be either block or inline get parsed as inline
  when appropriate.
+ Modified rawHtmlInline' so that block elements aren't treated as inline.
+ Modified para parser so that paragraphs containing only HTML tags and
  blank space are not allowed.  Treat these as raw HTML blocks instead.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1154 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:20 +00:00
fiddlosopher
dad8e16330 Changed failure message in anyHtmlBlockTag (minor change).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1153 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-24 04:22:08 +00:00
fiddlosopher
6802d287cf Modified rawHtmlBlock in HTML reader so it parses </html> and </body> tags.
This allows these tags to be handled correctly in Markdown.
HTML reader now uses rawHtmlBlock', which excludes </html> and </body>,
since these are handled in parseHtml.  (Resolves Issue #38.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1152 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-23 03:46:12 +00:00
fiddlosopher
fbecb49790 Modified 'source' parser in Markdown reader to allow backslash escapes in URLs.
So, for example, [my](/url\(1\)) yields a link to /url(1).  Resolves Issue #34.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1151 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-21 22:31:20 +00:00
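The escape handling from commit fbecb49790 can be sketched as a scanner that copies the character after each backslash literally. This is a simplified Python illustration with a hypothetical function name; unlike a real markdown reader, it treats a backslash before any character as an escape.

```python
def parse_escaped_source(src, pos):
    """Parse '(url)' at src[pos], honoring backslash escapes inside the URL.

    Returns (url, index after the closing ')'), or None if there is no match.
    """
    if pos >= len(src) or src[pos] != "(":
        return None
    out = []
    i = pos + 1
    while i < len(src):
        ch = src[i]
        if ch == "\\" and i + 1 < len(src):
            out.append(src[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == ")":
            return "".join(out), i + 1  # unescaped ')' ends the URL
        else:
            out.append(ch)
            i += 1
    return None


print(parse_escaped_source(r"(/url\(1\))", 0))
# ('/url(1)', 11)
```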
fiddlosopher
48f2cc5600 Modified rules for HTML header identifiers to ensure legal identifiers.
+ Modified htmlListToIdentifier and uniqueIdentifier in HTML writer
  to ensure that identifiers begin with an alphabetic character.
+ The new rules are described in README.
+ Resolves Issue #33.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1150 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-21 19:25:54 +00:00
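The requirement from commit 48f2cc5600 that identifiers begin with an alphabetic character can be illustrated as below. The exact rules live in the README and are not reproduced in this log, so everything here is an assumption made for the sake of the example: the punctuation filter, the lowercase conversion, the "section" fallback, and the numeric suffix for duplicates are hypothetical choices, and only ASCII letters are considered.

```python
import re


def to_identifier(title):
    """Derive an identifier from a header title; the result starts with a letter."""
    ident = re.sub(r"[^\w\s.-]", "", title).lower()  # drop most punctuation
    ident = re.sub(r"\s+", "-", ident.strip())       # spaces become hyphens
    ident = re.sub(r"^[^a-z]+", "", ident)           # strip everything before the first letter
    return ident or "section"                        # assumed fallback for empty results


def unique_identifier(title, used):
    """Return an identifier not yet in the set `used`, and record it there."""
    base = to_identifier(title)
    candidate, n = base, 0
    while candidate in used:
        n += 1
        candidate = f"{base}-{n}"
    used.add(candidate)
    return candidate


seen = set()
print(to_identifier("3. Applications"))    # applications
print(unique_identifier("Intro", seen))    # intro
print(unique_identifier("Intro", seen))    # intro-1
```

Starting with a letter matters because HTML 4 ID tokens must begin with a letter, so a header such as "3. Applications" cannot use its leading digit.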
fiddlosopher
0681d1d3e7 Fixed handling of email addresses in markdown and reStructuredText.
Consolidated uri and email address parsers.  (Resolves Issue #37.)
+ New emailAddress and uri parsers in Text.Pandoc.Shared.
  uri parser uses parseURI from Network.URI.  emailAddress
  parser properly handles email addresses with periods in them.
+ Removed uri and emailAddress parsers from Text.Pandoc.Readers.RST.
+ Removed uri and emailAddress parsers from Text.Pandoc.Readers.Markdown.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1149 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-21 16:13:10 +00:00
fiddlosopher
246e5f9ea3 Bumped version number to 0.46.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1147 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-10 21:01:59 +00:00
fiddlosopher
46a2d9269c Added date to release note on website.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1145 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-09 20:15:52 +00:00
fiddlosopher
7a68e64162 Use \{0,1\} instead of \? in sed, so it works on BSD/Mac OSX too.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1144 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 23:00:40 +00:00
fiddlosopher
0bf553142c Updated website.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1143 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 23:00:35 +00:00
fiddlosopher
fa029799e6 Makefile website target: create changelog.txt, not changelog.
This ensures that browsers will treat it as text.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1142 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 23:00:28 +00:00
fiddlosopher
aea6f6802b Removed support for "box-style" block quotes in markdown.
This adds unneeded complexity and makes pandoc diverge further
than necessary from other markdown extensions.
Brought documentation, tests, and debian/changelog up to date.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1141 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 19:32:18 +00:00
fiddlosopher
1fa54ab190 Updated debian/changelog.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1140 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-06 00:36:44 +00:00
fiddlosopher
702bb47450 Added new HTML math demos to website. Revamped mkdemos.pl for simplicity.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1139 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-04 04:14:38 +00:00
fiddlosopher
6cfe6c250a Adjusted test suite for footnote changes in LaTeX and ConTeXt writers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1138 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-04 04:14:27 +00:00
fiddlosopher
13b1b87efc Improvements to LaTeX and ConTeXt footnote handling:
+ ConTeXt:  } at end of footnote now occurs on a new line when the
  footnote ends with \stoptyping, just as in LaTeX writer.
+ LaTeX and ConTeXt:  \footnote is no longer indented.  The indentation
  causes problems with wrapping, as it makes the line too long.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1137 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-04 04:14:16 +00:00