997 commits

Author SHA1 Message Date
John MacFarlane
2253c8ef65 Require texmath >= 0.3, adjusted for new elements. 2010-07-22 15:06:46 -07:00
John MacFarlane
68e3f83545 HTML reader: code cleanup + parse <tt> as Code.
Partially resolves Issue #247.
2010-07-14 09:39:48 -07:00
John MacFarlane
71b4700669 Made latex \section, \chapter parsers more forgiving of whitespace. 2010-07-13 19:22:41 -07:00
John MacFarlane
0b23956d48 Parse \chapter{} in latex.
+ Added stateHasChapters to ParserState.
+ If a \chapter command is encountered, this is set to True
  and subsequent \section commands (etc.) will be bumped up
  one level.
2010-07-13 19:18:58 -07:00
John MacFarlane
afe18e53f1 Modified example refs so they can occur before or after target.
The refs are now replaced by numbers at the final stage, using
processWith.
2010-07-12 23:05:46 -07:00
John MacFarlane
0181e66250 Merge branch 'atlists'. Added auto-numbered example lists. 2010-07-11 22:47:52 -07:00
John MacFarlane
73b4cc0897 Minor comment change. 2010-07-06 21:23:25 -07:00
John MacFarlane
7d687684aa Allow language-neutral table captions.
+ Captions may now begin simply with ':', instead of 'Table:'
+ Captions may now appear either above or below the table.
+ Resolves Issue #227.
2010-07-06 21:02:26 -07:00
John MacFarlane
6a8fa53f6c More refactoring of grid table code. 2010-07-05 23:43:07 -07:00
John MacFarlane
869946114e Moved generic grid table functions from RST reader -> Parsing.
Here they can be used by the Markdown reader as well.
2010-07-05 14:34:48 -07:00
John MacFarlane
998fd098d0 Moved parsing functions from Text.Pandoc.Shared to new module.
+ Text.Pandoc.Parsing
2010-07-05 00:06:27 -07:00
John MacFarlane
b5bda7569e Made KeyTable a map instead of an association list.
* This affects the RST and Markdown readers.
* The type for stateKeys in ParserState has also changed.
* Pandoc, Meta, Inline, and Block have been given Ord instances.
* Reference keys now have a type of their own (Key), with its
  own Ord instance for case-insensitive comparison.
2010-05-08 10:29:40 -07:00
John MacFarlane
d253955a7e Changed rawLaTeXInline to accept '\section', '\begin', etc.
Use new rawLaTeXInline' in LaTeX reader, and export rawLaTeXInline
for use in markdown reader.

Fixes bug wherein '\section{foo}' was not recognized as raw TeX
in markdown document.
2010-04-26 23:17:34 -07:00
John MacFarlane
c243e5b67b Use texmath's parser in TexMath module.
* This replaces a lot of custom parser code, and expands
  the tex -> unicode conversion.
* The behavior has also changed: if the whole formula can't
  be converted, the whole formula is left in raw TeX.
  Previously, pandoc converted parts of the formula to unicode
  and left other parts in raw TeX.
* Added (but not yet exported) readTeXMath', which returns a Maybe.
* Updated tests
2010-04-25 20:30:27 -07:00
John MacFarlane
5d9d7f32ca In parsing smart quotes, leave unicode curly quotes alone.
Resolves Issue #143.
2010-04-10 12:05:26 -07:00
John MacFarlane
6972c0b5b0 Implemented @ for sequentially numbered examples.
Also implemented (@label) for example labels and references.
2010-03-27 10:24:18 -07:00
John MacFarlane
c87d52223a Properly escape URIs in all readers. 2010-03-23 15:07:48 -07:00
John MacFarlane
1aeb7d23ad Updated copyright notices. 2010-03-23 13:31:09 -07:00
John MacFarlane
71eac37ac5 Fixed treatment of unicode characters in URIs.
* Added stringToURI to Shared.  This is used in the HTML
  writer for all URIs.  It properly URI-encodes high
  characters (> 127), leaving everything else (including
  symbols and spaces) the same.

* Modified unsanitaryURI to allow UTF8 characters in a URI.
  (First, we convert the URI to URI-encoded octets, then we
  pass through parseURIReference.)
  This resolves gitit Issue #99. Previously
  '[abc](http://gitit.net/测试)' would not be rendered as
  a link when --sanitize was selected.
2010-03-23 00:33:50 -07:00
fiddlosopher
075f958c6a Markdown(+lhs) reader: handle "inverse bird tracks"
Inverse bird tracks (<) are used for haskell example code that is not
part of the literate Haskell program.

Resolves Issue #211.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1888 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-14 23:23:20 +00:00
fiddlosopher
800b03ba50 LaTeX reader: ignore \section, \pdfannot, \pdfstringdef.
Resolves Issue #202.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1887 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-14 23:23:14 +00:00
fiddlosopher
139b2ed6d1 LaTeX reader: Ignore alt title in section headers.
Partially resolves Issue #202.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1886 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-14 23:23:07 +00:00
fiddlosopher
36a19e0f2e LaTeX reader: don't treat \section as inline LaTeX.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1885 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-13 07:03:26 +00:00
fiddlosopher
df6274e3d7 LaTeX reader: recognize nonbreaking space ~.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1884 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-13 04:30:27 +00:00
fiddlosopher
1f3b48c193 Markdown reader: Added p., pp., sec., ch., as abbreviations.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1861 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-06 04:08:34 +00:00
fiddlosopher
76e6c071d0 Disallow blank lines in inline code span.
Also added additional test cases for markdown code spans.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1860 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-06 02:42:15 +00:00
fiddlosopher
f5e00c50b8 Markdown reader: Allow footnotes to be indented < 4 spaces.
This fixes a regression.  A test case has been added in testsuite.txt.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1859 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-01 22:37:06 +00:00
fiddlosopher
77ba3429e2 Allow multi-line titles and authors in meta block.
Based on a patch by Justin Bogner.

Titles may span multiple lines, provided continuation lines
begin with a space character.

Separate authors may be put on multiple lines, provided
each line after the first begins with a space character.
Each author must fit on one line. Multiple authors on
a single line may still be separated by a semicolon.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1854 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-28 11:21:19 +00:00
fiddlosopher
76ab88807e RST reader: Improved grid tables.
+ Table cells can now contain multiple block elements, such
  as lists or paragraphs.
+ Table parser is now forgiving of spaces at ends of lines.
+ Added test cases.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1852 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-27 20:39:21 +00:00
fiddlosopher
ddcde4d543 Markdown reader: Use simpler approach for URLs - just escape spaces.
Markdown.pl doesn't URI-escape anything, so we won't do that either,
except for spaces, which can cause problems if not escaped.

Resolves Issue #220 and partially reverts r1847.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1851 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-27 04:59:34 +00:00
fiddlosopher
07ae5bc264 Markdown reader: properly escape URIs.
+ Resolves Issue #220.
+ Added escapeURI function to Markdown reader. This escapes
  links in a way that makes sense for markdown.  If they've
  used URI escapes like %20 in their link, these will be preserved.
  But if they've used a special character or space without escaping
  it, it will be escaped. This should make sense in most cases.
+ Previously pandoc collapsed adjacent spaces and replaced these
  sequences of spaces with + characters.  That isn't correct for
  a URI path (+ is to be used only in the query part).  We've also
  removed the space-collapsing behavior.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1847 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-27 03:06:39 +00:00
fiddlosopher
d3f1ddf57e LaTeX reader: handle \ (interword space).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1846 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-27 02:35:46 +00:00
fiddlosopher
4ded477409 LaTeX reader: allow any special character to be escaped.
Resolves Issue #221.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1845 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-26 23:57:41 +00:00
fiddlosopher
c6b34574bf Incomplete support for RST tables (simple and grid).
Thanks to Eric Kow.
Note TODO for future improvement in RST reader code comments.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1840 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-20 08:30:34 +00:00
fiddlosopher
07f25fb13c LaTeX reader: treat \paragraph and \subparagraph as level 4, 5 headers.
Resolves Issue #207.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1838 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-12 03:03:23 +00:00
fiddlosopher
53ede0de5d HTML reader: handle spaces before <html>.
Resolves Issue #216.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1837 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-12 02:52:38 +00:00
fiddlosopher
0c21e4342c HTML reader: Be forgiving in parsing a bare list within a list.
The following is not valid xhtml, but the intent is clear:
<ol>
<li>one</li>
<ol><li>sub</li></ol>
<li>two</li>
</ol>

We'll treat the <ol> as if it's in a <li>.

Resolves Issue #215.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1836 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-12 02:47:24 +00:00
fiddlosopher
6f0d4e49d1 Require two spaces after capital letter + period for list item.
Otherwise "E. coli" starts a list.  This might change the semantics
of some existing documents, since previously the two-space requirement
was only enforced when the second word started with a capital letter.
But it is consistent with the existing documentation and follows the
principle of least surprise.

Resolves Issue #212.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1829 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-03 05:48:46 +00:00
fiddlosopher
19b0c72dd1 Made HTML reader much more forgiving.
+ Incorporated idea (from HXT) that an element can be closed
  by an open tag for another element.
+ Javascript is partially parsed to make sure that a <script>
  section is not closed by a </script> in a comment or string.
+ More lenient non-quoted attribute values.
  Now we accept anything but a space character, quote, or <>.
  This helps in parsing e.g. www.google.com!
+ Bare & signs are now parsed as a string.  This is a common
  HTML mistake.
+ Skip a bare < in malformed HTML.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1825 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-02-02 07:36:55 +00:00
fiddlosopher
ca2bbafbb9 Removed redundant imports (found by ghc 6.12).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1750 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 16:48:36 +00:00
fiddlosopher
1b3d5896c7 Removed unneeded LANGUAGE pragmas.
(CPP is enabled globally in the cabal file.)

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1747 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 16:48:14 +00:00
fiddlosopher
998fb9820e LaTeX reader: use \\ to separate multiple authors.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1727 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 01:16:34 +00:00
fiddlosopher
54dda0ff9e Markdown reader: use ; as separator between authors.
This allows you to use ',' within author names:
e.g. "John Jones, Jr."

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1726 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 01:16:27 +00:00
fiddlosopher
3ec8772daf Changed Meta author and date types to Inline lists instead of Strings.
Meta [Inline] [[Inline]] [Inline] rather than
Meta [Inline] [String] String.

This is a breaking change for libraries that use pandoc and
manipulate the metadata.

Changed .native files in test suite for new Meta format.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1699 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 01:12:44 +00:00
fiddlosopher
5df3ec11c0 RST reader: Allow :: before lhs code block.
The RST spec requires the :: before verbatim blocks.
This :: should not be treated as literal colons.
Resolves Issue #189.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1668 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-22 07:27:14 +00:00
fiddlosopher
5082b5411b Improved syntax for markdown definition lists.
Definition lists are now more compatible with PHP Markdown Extra.
Resolves Issue #24.

+ You can have multiple definitions for a term (but still not
  multiple terms).
+ Multi-block definitions no longer need a
  colon before each block (indeed, this will now cause
  multiple definitions).
+ The marker no longer needs to be flush with the left margin,
  but can be indented one or two spaces.  Also, ~ as well as :
  can be used as the marker (this suggestion due to David
  Wheeler.)
+ There can now be a blank line between the term and
  the definitions.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1656 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-07 08:26:53 +00:00
fiddlosopher
ad5450266c Allow markdown tables without headers.
Resolves Issue #50. The new syntax is described in README.
Also allow optional line of dashes at bottom of simple tables.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1652 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-05 21:34:46 +00:00
fiddlosopher
61f7a4f869 Markdown reader: Compensate for width of final table column.
Resolves Issue #144.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1649 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-05 07:47:09 +00:00
fiddlosopher
8671bc5a1b Markdown reader: Treat a backslash followed by a newline as hard linebreak.
Resolves Issue #154.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1646 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-05 05:33:24 +00:00
fiddlosopher
94841b7602 Added "head" to list of HTML block-level tags.
Resolves Issue #108.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1645 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-05 04:47:08 +00:00