839 commits

Author SHA1 Message Date
John MacFarlane
c31d3cc306 HTML reader: parse location tags in pSatisfy.
This avoids the need for manual parsing all over the place.
2011-01-14 20:47:32 -08:00
John MacFarlane
9305114b9f LaTeX writer: Escape strings in \href{..}.
Previously strings weren't escaped, so %5D would be interpreted
as a LaTeX comment!
2011-01-14 18:59:50 -08:00
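The escaping problem described in the commit above can be illustrated with a minimal sketch. This is not pandoc's code (pandoc is written in Haskell); the names `LATEX_SPECIALS`, `escape_latex`, and `href` are hypothetical, and the set of escaped characters is an illustrative assumption rather than a complete list.

```python
# Hypothetical helper, not pandoc's implementation: escape a few characters
# that LaTeX treats specially before placing text inside \href{...}{...}.
# The character set below is an illustrative assumption, not exhaustive.
LATEX_SPECIALS = {
    "%": r"\%",   # an unescaped % starts a LaTeX comment
    "#": r"\#",
    "&": r"\&",
    "_": r"\_",
    "$": r"\$",
}


def escape_latex(text: str) -> str:
    """Replace each special character with its escaped form."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def href(url: str, label: str) -> str:
    """Build an \\href command with both arguments escaped."""
    return r"\href{" + escape_latex(url) + "}{" + escape_latex(label) + "}"


print(href("http://example.com/a%5Db", "link"))
```

Without the escaping step, everything after `%5D` on that line would be read by LaTeX as a comment, which is the failure the commit message describes.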
John MacFarlane
5131589be0 Simplified Text.Pandoc.CharacterReferences by using TagSoup entity lookup 2011-01-14 18:28:54 -08:00
John MacFarlane
81403b8d80 LaTeX writer: In nonsimple tables, put cells in \parbox.
Otherwise we can get problems with linebreaks, and cell spacing
isn't right.

Thanks to Jef Allbright for pointing out the problem.
2011-01-14 14:45:04 -08:00
John MacFarlane
ba1d0d3070 Parsing: Fixed bug in grid table parser.
Spaces at end of line were not being stripped properly,
resulting in unintended LineBreaks.
2011-01-14 14:16:27 -08:00
John MacFarlane
91510a109f Improvements to --html5 support:
+ <nav> for TOC, <figure> for figures, type attribute in <ol>.
+ Don't add math javascript in html5.
+ Use style attributes instead of deprecated width, align.
+ html template: move <title> after <meta>.
  Note: charset needs to be declared before title.
+ slidy and s5 templates: move <title> after <meta>.
+ html template: Added link to html5 shim for IE.
+ Make --html5 have an effect only for 'html' writer (not s5, slidy, epub).
2011-01-11 23:15:30 -08:00
John MacFarlane
e8ad4ba43c Preliminary support for HTML5.
+ Added writerHtml5 writer option.
+ Added --html5 option.
+ Added support for lang in html tag (so you can do
  'pandoc -s -V lang=en', for example).
+ Updated html template with conditionals for HTML5.
+ When HTML5 selected, use <header> tag around title in document,
  and use <section> tags instead of <div>s if --section-divs
  specified.
2011-01-11 21:18:46 -08:00
John MacFarlane
d891b2c29d LaTeX reader: Support simple tables. 2011-01-07 10:15:48 -08:00
John MacFarlane
c4c336460b RST writer: blank line after literate Haskell code block. 2011-01-06 21:03:08 -08:00
John MacFarlane
aea93977f5 Markdown writer: blank line after delimited code block. 2011-01-06 16:53:21 -08:00
John MacFarlane
303ce8a9e5 LaTeX reader: allow spaces btw \\begin or \\end and {. 2011-01-06 09:34:24 -08:00
John MacFarlane
81ea1a59b4 LaTeX reader: Removed unnecessary 'spaces'. 2011-01-06 09:24:56 -08:00
John MacFarlane
1be2ca6c78 HTML reader: Fixed bug in htmlTag for comments. 2011-01-06 00:21:19 -08:00
John MacFarlane
b63a7f7c48 LaTeX reader: Apply macros to non-math; handle ensuremath. 2011-01-05 16:55:26 -08:00
John MacFarlane
18e7a7a495 LaTeX reader: Don't handle \label and \ref specially.
Put labels in {} instead of ().
2011-01-05 15:24:20 -08:00
John MacFarlane
1415b6831e LaTeX reader: Support \L \l accents. 2011-01-05 14:57:06 -08:00
John MacFarlane
23aae79b01 Updated for texmath 0.5. 2011-01-05 14:44:26 -08:00
John MacFarlane
eb83f0e5e4 Fixed macro parsing. 2011-01-05 14:42:47 -08:00
John MacFarlane
e126ab9efc LaTeX reader: Parse inside arguments when ignoring commands. 2011-01-05 12:25:47 -08:00
John MacFarlane
c3071ff6e9 LaTeX reader: Don't handle \index separately.
Instead, just put it in list of commands to ignore.
2011-01-05 12:05:04 -08:00
John MacFarlane
b26247a4a8 LaTeX reader: Added "index" to ignorable commands. 2011-01-05 11:56:37 -08:00
John MacFarlane
cf6cd15c27 LaTeX reader: skip space before option or argument. 2011-01-05 11:54:40 -08:00
John MacFarlane
d033fc9d3e LaTeX reader: Skip \index commands. 2011-01-05 10:11:24 -08:00
John MacFarlane
c949530815 LaTeX reader: Removed \group (we want to parse inside {}). 2011-01-05 10:06:51 -08:00
John MacFarlane
3dab6c574c LaTeX reader: Better handling of preamble, inc. parsing macros. 2011-01-05 09:04:03 -08:00
John MacFarlane
85bfd26b78 LaTeX reader: Parse bracketed {parts} as raw TeX. 2011-01-04 22:20:35 -08:00
John MacFarlane
22b2c02aeb Markdown reader: Removed unneeded definitions.
specialChars, strChar, specialCharsMinusLt.
2011-01-04 22:11:56 -08:00
John MacFarlane
dac2e9156f LaTeX reader: parse macros and apply to math. 2011-01-04 19:18:20 -08:00
John MacFarlane
fcbe1e95eb Moved 'macro' and 'applyMacros'' from markdown reader to Parsing. 2011-01-04 19:12:33 -08:00
John MacFarlane
3e61333af0 Fixed regression in markdown reader.
'(_hi_)' was being parsed with literal underscores (no emphasis).
The fix:  the 'str' parser now only parses alphanumerics and
embedded underscores.  All other symbols are handled by the
'symbol' parser.  This has a slight effect on the AST, since
you'll get [Str "hi",Str ":"] instead of [Str "hi:"].  But there
should not be a visible effect in any of the writers.

Thanks to gwern for pointing out the regression.
2011-01-01 22:46:30 -08:00
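The split between the 'str' and 'symbol' parsers described in the commit above can be sketched roughly as follows. This is an illustration in Python, not the Haskell parser from the markdown reader; `STR` and `tokenize` are hypothetical names, and treating every non-matching character (including spaces) as a symbol is a simplifying assumption.

```python
import re

# Assumption for this sketch: a "str" token is a run of alphanumerics that
# may contain embedded underscores, but never starts or ends with one.
STR = re.compile(r"[^\W_]+(?:_+[^\W_]+)*")


def tokenize(text):
    """Split text into ("Str", ...) and ("Symbol", ...) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        m = STR.match(text, i)
        if m:
            tokens.append(("Str", m.group()))
            i = m.end()
        else:
            # Any other character is handed to the symbol case one at a time.
            tokens.append(("Symbol", text[i]))
            i += 1
    return tokens


print(tokenize("hi:"))
print(tokenize("(_hi_)"))
print(tokenize("snake_case"))
```

Because the leading and trailing underscores in `(_hi_)` come out as separate symbol tokens, a later emphasis rule gets a chance to see them, while the underscore embedded in `snake_case` stays inside a single string token.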
John MacFarlane
0411f51433 Updated copyright notices. 2011-01-01 10:26:10 -08:00
John MacFarlane
b05e739c6d LaTeX reader: Allow ignored comments after \end{document}. 2010-12-30 22:05:19 -08:00
John MacFarlane
d6f28af9cb HTML reader: Fixed some parsing bugs. 2010-12-30 19:33:37 -08:00
Puneeth Chaganti
e4dedad1c0 Added support for listings package code blocks and inline code. 2010-12-30 14:37:51 -08:00
John MacFarlane
f49e60a8b8 Textile reader: Slight speed improvement. 2010-12-30 14:33:11 -08:00
John MacFarlane
904050fa36 New HTML reader using tagsoup as a lexer.
* The new reader is faster and more accurate.

* API changes for Text.Pandoc.Readers.HTML:
   - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
     anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
     htmlBlockElement, htmlComment
   - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag

* tagsoup is a new dependency.

* Text.Pandoc.Parsing: Generalized type on readWith.

* Benchmark.hs: Added length calculation to force full evaluation.

* Updated HTML reader tests.

* Updated markdown and textile readers to use the functions from
  the HTML reader.

* Note: The markdown reader now correctly handles some cases it did not
  before. For example:

    <hr/>

  is reproduced without adding a space.

    <script>
      a = '<b>';
    </script>

  is parsed correctly.
2010-12-30 13:55:40 -08:00
John MacFarlane
220fe5fab8 normalize: Don't reduce [Space] to []. 2010-12-26 12:01:33 -08:00
John MacFarlane
c912288eda Improved 'normalize'.
Now normalizeInlines is split into consolidateInlines
and removeEmptyInlines.  We need to remove empties before
consolidating.
2010-12-26 10:24:15 -08:00
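Why empties need to be removed before consolidating can be shown with a small sketch. It is written in Python over a toy inline representation, not pandoc's Haskell AST; the function names mirror the ones in the commit message, but their bodies here are assumptions made for illustration only.

```python
# Toy inlines: ("Str", text) or ("Emph", [inlines]). This is an assumed
# stand-in for illustration, not pandoc's actual data types.

def remove_empty_inlines(inlines):
    """Drop empty strings and Emph nodes that end up with no content."""
    out = []
    for kind, payload in inlines:
        if kind == "Emph":
            payload = remove_empty_inlines(payload)
            if payload:
                out.append((kind, payload))
        elif not (kind == "Str" and payload == ""):
            out.append((kind, payload))
    return out


def consolidate_inlines(inlines):
    """Merge adjacent Str nodes and adjacent Emph nodes."""
    out = []
    for kind, payload in inlines:
        if out and out[-1][0] == kind == "Str":
            out[-1] = ("Str", out[-1][1] + payload)
        elif out and out[-1][0] == kind == "Emph":
            out[-1] = ("Emph", consolidate_inlines(out[-1][1] + payload))
        elif kind == "Emph":
            out.append((kind, consolidate_inlines(payload)))
        else:
            out.append((kind, payload))
    return out


def normalize(inlines):
    """Remove empties first so that newly adjacent nodes can be merged."""
    return consolidate_inlines(remove_empty_inlines(inlines))


doc = [("Str", "a"), ("Emph", []), ("Str", "b")]
print(consolidate_inlines(doc))  # the empty Emph keeps the strings apart
print(normalize(doc))            # the strings merge once the Emph is gone
```

In this toy model, consolidating first would leave the two strings separated by the empty emphasis node, so a second pass would be needed after removing it.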
John MacFarlane
249aa9e044 Markdown writer: Fixed bug in Image.
URI was getting unescaped twice!
2010-12-26 10:23:20 -08:00
John MacFarlane
82903cfaf3 Improved normalize. 2010-12-25 14:03:43 -08:00
John MacFarlane
10d85f8b0b Use functions from Text.Pandoc.Generic instead of processWith(M). 2010-12-24 13:39:27 -08:00
John MacFarlane
c08ca6fa6d HTML reader: Simplified parsing of <script> sections.
I had previously assumed that we needed to ignore
</script> occurring in a string literal or javascript
comment.  It turns out, though, that browsers aren't
that smart.
2010-12-22 19:20:27 -08:00
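The simplified approach in the commit above amounts to ending the script at the first closing tag, whatever surrounds it. A minimal sketch of that idea, in Python rather than the reader's Haskell: `CLOSE` and `split_script` are hypothetical names, and the regular expression is only an approximation of how browsers recognize the closing tag.

```python
import re

# Assumption: the closing tag is matched case-insensitively and may carry
# whitespace before '>'. No attempt is made to track JavaScript strings
# or comments.
CLOSE = re.compile(r"</script\s*>", re.IGNORECASE)


def split_script(rest: str):
    """Given the text following an opening <script> tag, return
    (script_body, remaining_html), cutting at the first closing tag."""
    m = CLOSE.search(rest)
    if m is None:
        # No closing tag: treat everything as script content.
        return rest, ""
    return rest[:m.start()], rest[m.end():]


body, remaining = split_script("a = '</script>'; b")
print(repr(body), repr(remaining))
```

Here the closing tag inside the string literal ends the script, so the body is cut short, which matches the browser behavior the commit message refers to.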
John MacFarlane
4bfe140ed1 Made --smart work with HTML reader.
It did not work before, because - and quotes were gobbled
up by the str parser.
2010-12-22 17:05:17 -08:00
John MacFarlane
63bf227e04 RST reader: Added unicode quote characters to specialChars.
(So they can trigger Quoted environments.)
2010-12-22 17:04:56 -08:00
John MacFarlane
bbad129066 RST reader: recouped speed loss due to addition of --smart.
This was achieved by rearranging the parsers in inline.

Benchmarks went from 500ms to 307ms -- not quite back to the
279ms we had in 1.6, before supporting smart punctuation and
footnotes, but close.
2010-12-22 15:10:21 -08:00
John MacFarlane
4ba3afbb4d ODT writer: Don't wrap text in opendocument. 2010-12-22 14:55:59 -08:00
John MacFarlane
dc597a8a68 Removed all dependencies on 'pretty' package. 2010-12-22 11:48:08 -08:00
John MacFarlane
8e9c490b0a Texinfo writer: Updated to use Pretty. 2010-12-22 11:43:43 -08:00
John MacFarlane
f15d479fc2 Shared: Removed unneeded prettyprinting functions:
wrapped, wrapIfNeeded, wrappedTeX, wrapTeXIfNeeded, hang'.
2010-12-22 00:34:36 -08:00
John MacFarlane
21d2d918ac Shared: Removed BlockWrapper, wrappedBlocksToDoc.
These are no longer needed with the new Pretty module.
2010-12-22 00:28:20 -08:00