Commit graph

1163 commits

Author SHA1 Message Date
John MacFarlane
ce4f9a3787 HTML writer: Spacing adjustments for Plain and RawHtml blocks. 2011-02-04 21:05:48 -08:00
John MacFarlane
99cb6076f8 Improved new HTML format; restored original --no-wrap behavior. 2011-02-04 20:12:17 -08:00
John MacFarlane
f9dcea6655 HTML writer: More normal line breaks.
Also removes any distinction between --no-wrap and default HTML
output.

Resolves Issue #134.
2011-02-04 18:37:02 -08:00
John MacFarlane
1a19f96a5b Native writer test: in block list test, limit to list < 20 blocks. 2011-02-04 18:33:32 -08:00
John MacFarlane
c686010a92 Added commented-out round-trip property in markdown reader test. 2011-02-04 18:33:08 -08:00
John MacFarlane
221177c272 Shared: Minor refactoring. 2011-02-04 18:32:54 -08:00
John MacFarlane
714303b210 Improved Arbitrary instance. 2011-02-04 18:32:30 -08:00
John MacFarlane
70405ef98a normalize: Normalize spaces too.
In normal form, Space elements only occur to separate two non-Space
elements.  So, we never have [Space], or [, ..., Space].
2011-02-04 13:43:38 -08:00
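The normal form described in 70405ef98a can be sketched as follows. This is only an illustration under stated assumptions: the `Inline` type here is a two-constructor toy, not pandoc's real type, and `normalizeSpaces` is a hypothetical stand-in rather than the code the commit changed.

```haskell
-- Toy stand-in for pandoc's Inline type (assumption: only two constructors).
data Inline = Str String | Space
  deriving (Eq, Show)

-- Drop leading and trailing Space elements and collapse runs of Space,
-- so that a Space only ever separates two non-Space elements.
normalizeSpaces :: [Inline] -> [Inline]
normalizeSpaces = go . dropWhile (== Space)
  where
    go [] = []
    go (Space : rest) =
      case dropWhile (== Space) rest of
        []    -> []                -- trailing spaces disappear
        rest' -> Space : go rest'  -- a run collapses to a single Space
    go (x : rest) = x : go rest
```

Under this sketch, a list consisting only of Space elements normalizes to the empty list.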
John MacFarlane
8e81437fd1 markdown2pdf: Fixed bug with output file extensions.
Previously 'markdown2pdf test.txt -o test.en.pdf' would produce
'test.pdf', not 'test.en.pdf'.

Thanks to Paolo Tanimoto for the fix.
2011-02-04 11:27:20 -08:00
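One plausible way a bug like the one in 8e81437fd1 can arise is replacing the extension of a base name that itself contains a dot. The sketch below uses `System.FilePath` from the filepath package; the helper names are hypothetical, and this is not necessarily the actual cause or the actual fix in markdown2pdf.

```haskell
import System.FilePath (addExtension, dropExtension, replaceExtension)

-- Hypothetical lossy variant: ".en" in "test.en" is mistaken for an
-- extension and replaced.
pdfNameLossy :: FilePath -> FilePath
pdfNameLossy base = replaceExtension base "pdf"

-- Hypothetical safe variant: append ".pdf" and leave the base name untouched.
pdfNameKeeping :: FilePath -> FilePath
pdfNameKeeping base = addExtension base "pdf"
```

With the base name `dropExtension "test.en.pdf"`, which is `"test.en"`, the first helper yields `"test.pdf"` and the second yields `"test.en.pdf"`.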
John MacFarlane
d4b71a6423 Markdown reader: Simplified and corrected footnote block parser. 2011-02-01 22:35:27 -08:00
John MacFarlane
1edeea5d60 Added a (failing) test for footnotes. 2011-02-01 07:37:22 -08:00
John MacFarlane
e898f0abef Improved fix to markdown noteBlock parser.
The last patch did not handle cases with > 4 spaces.

Also added a more general test case.
2011-01-31 20:42:49 -08:00
John MacFarlane
f282b462bb Markdown reader: Fixed whitespace footnote bug (Jesse Rosenthal).
The problem was in input like this:

[^1]:  note

not in note.

Also added a test case for this.
2011-01-31 20:05:11 -08:00
John MacFarlane
cf0a843239 UTF8: Use #if instead of #ifdef. 2011-01-30 17:01:50 -08:00
John MacFarlane
b1b6d0f859 UTF8 module: Use base 4.2 IO if available.
This gives us proper line endings on windows, and some speed
improvements.

We fall back to the old functions if base < 4.2.

hGetContents is now exported.
2011-01-30 16:01:31 -08:00
John MacFarlane
afe90390ea pandoc.hs: Simplified code for writing result. 2011-01-30 14:02:16 -08:00
John MacFarlane
71ca44db6e LaTeX reader: Fixed bug with whitespace at beginning of file.
Previously a file beginning "   hi" would cause a parse error.
Also cleaned up comment parsing.
2011-01-30 08:21:48 -08:00
John MacFarlane
8a3fd7606f Markdown reader tables: Fixed bug in alignments.
Previously pandoc got confused by blank rows in the header.
2011-01-29 23:09:07 -08:00
John MacFarlane
caa091e810 Highlighting: Fixed non-highlighting-kate version of highlightHtml. 2011-01-29 23:08:29 -08:00
John MacFarlane
3b5dbe6fdb Added HTML writer tests for inline code. 2011-01-29 16:26:00 -08:00
John MacFarlane
22969c1b9c HTML writer: avoid doubled <code> tag for highlighted inline code. 2011-01-29 16:11:16 -08:00
John MacFarlane
9f28acba9d Fixed highlighting for inline code.
highlightHtml in Highlighting now has a boolean argument that
selects between inline and block content.

Revised tests for new highlighting-kate.
2011-01-29 16:04:07 -08:00
John MacFarlane
570d8ff08c Moved tests to src. 2011-01-29 11:24:16 -08:00
John MacFarlane
387a2b365e Shared: Fixed bug in normalize revealed by tests! 2011-01-29 10:03:31 -08:00
John MacFarlane
d28daf0e89 Support --listings in markdown2pdf (Etienne Millon). 2011-01-28 21:21:09 -08:00
Josef Svenningsson
d8d0f46c4c Add possibility to use listings package for code blocks and
inline code in the LaTeX writer.
2011-01-28 21:09:38 -08:00
John MacFarlane
7f5130709b RST reader: skip blanklines at beginning, not all leading spaces.
If you skip all spaces, it becomes impossible to start with
a blockquote.
2011-01-28 12:30:33 -08:00
John MacFarlane
5ba5373ec6 Shared: Make 'normalize' more generic.
Now it can transform an Inline, [Inline], Block, [Block], or Pandoc.
2011-01-28 09:35:43 -08:00
John MacFarlane
f90e18b955 RST reader: Skip blank space at beginning.
Resolves Debian Bug #611328.
2011-01-28 08:52:54 -08:00
John MacFarlane
382564ed9e RTF writer: Embed images when possible.
* Resolves Issue #275.
* PNG and JPEG supported.
* Export rtfEmbedImage.
2011-01-28 08:42:04 -08:00
John MacFarlane
f8dca6ccbc Add support for attributes in inline Code.
Additional related changes:

* URLs in Code in autolinks now use class "url".
* Require highlighting-kate 0.2.8.2, which omits the final <br/> tag,
  essential for inline code.
2011-01-26 20:44:25 -08:00
John MacFarlane
703c421c9e RST reader: Improved field lists.
Field lists now work properly with block content.
(Thanks to Lachlan Musicman for pointing out the bug.)

In addition, definition list items are now always Para instead
of Plain -- which matches behavior of rst2xml.py.

Finally, in image blocks, the alt attribute is parsed properly
and used for the alt, not also the title.
2011-01-26 17:23:57 -08:00
John MacFarlane
80f5a89a0b LaTeX reader: Fixed an incomplete pattern match. 2011-01-26 17:23:56 -08:00
John MacFarlane
422bba202e RST reader: Include line breaks in raw field list parser output.
Note: field list items can have lists, etc. as values.
2011-01-26 17:23:56 -08:00
John MacFarlane
e66cc6728c RST reader: Allow spaces in field list names. 2011-01-26 17:23:56 -08:00
John MacFarlane
0cc7625d98 Adjusted writers to use "tex". 2011-01-26 17:23:56 -08:00
John MacFarlane
d3667f9aac Markdown reader: Don't parse latex/context environments as inline. 2011-01-26 17:23:56 -08:00
John MacFarlane
eb26fa6f54 Distinguish latex & context environments; blank line after in writers. 2011-01-26 17:23:56 -08:00
John MacFarlane
bd43c0f4c9 Bumped version to 1.8; depend on pandoc-types 1.8.
The old TeX, HtmlInline and RawHtml elements have been removed
and replaced by generic RawInline and RawBlock elements.

All modules updated to use the new raw elements.
2011-01-26 17:22:53 -08:00
John MacFarlane
65a015e74b Added needed space after .bc and .bq.
Otherwise these can trap a </dd>, for example.

Better solution to try next: rewrite using Pretty.
2011-01-23 10:08:11 -08:00
John MacFarlane
16d4366431 Textile writer: Don't escape code in bc. block. 2011-01-23 09:44:28 -08:00
John MacFarlane
38013de857 Textile writer: Don't HTML-escape between @'s. 2011-01-23 09:12:50 -08:00
John MacFarlane
628a1ef815 Textile reader: Fixed bug (swallowed p at beginning of paragraph).
The problem was a missing 'try' in the maybeExplicitBlock parser.
Test case, a paragraph beginning with 'p', has been added.
2011-01-23 08:59:35 -08:00
John MacFarlane
7234a79104 Textile writer: Use <pre> instead of bc.. for code with blank lines.
This has fewer interaction effects.
2011-01-23 08:49:55 -08:00
John MacFarlane
1d683be414 Textile reader: Support <tt> for inline code. 2011-01-23 00:25:05 -08:00
John MacFarlane
50d08ec2c3 Textile reader: Added code blocks with bc. 2011-01-23 00:05:35 -08:00
John MacFarlane
9f99c39caf Default to textile writer on .textile extension. 2011-01-23 00:05:10 -08:00
John MacFarlane
d562efa39d ConTeXt writer: Ensure cr after \stoptyping. 2011-01-22 23:26:44 -08:00
John MacFarlane
ea5cd35004 Text.Pandoc: Added jsonFilter for easy construction of scripts.
Here's an example of its use:

-- removelinks.hs - removes links from document
import Text.Pandoc

main = interact $ jsonFilter $ bottomUp removeLink

removeLink :: Inline -> Inline
removeLink (Link xs _) = Emph xs
removeLink x = x
2011-01-22 17:53:16 -08:00
John MacFarlane
a74010a051 Markdown reader: slight speedup by moving whitespace parser. 2011-01-22 16:04:32 -08:00
John MacFarlane
a6d7d88b0f RST reader: Big speed improvement (300->260ms).
Moved whitespace parser to top of inline parsers.
2011-01-22 16:01:42 -08:00
John MacFarlane
8dcc67a993 ConTeXt writer: Don't add cr at end of inline footnote. 2011-01-22 12:17:39 -08:00
John MacFarlane
5c35be1362 Make sure native output ends in newline with --standalone. 2011-01-21 09:58:23 -08:00
John MacFarlane
bfbc289871 Haddock comment improvements. 2011-01-21 09:00:05 -08:00
John MacFarlane
9bb5b54102 Added --normalize option. 2011-01-20 22:48:20 -08:00
John MacFarlane
8894b1a030 Markdown writer: Avoid printing excess spaces at end if no notes/refs. 2011-01-20 22:36:08 -08:00
John MacFarlane
8011d079c8 Native writer: eliminated empty spaces in brackets. 2011-01-20 20:48:06 -08:00
John MacFarlane
6b50778b2a Export readNative in Text.Pandoc.Shared. 2011-01-20 08:52:59 -08:00
John MacFarlane
810e3336dc Improved native writer using Pretty.
2-3X speed improvement and more consistent layout.
2011-01-20 08:43:13 -08:00
John MacFarlane
978d949526 Made writeNative sensitive to writerStandalone.
The Pandoc (Meta ...) is not written unless standalone is set.
2011-01-19 18:57:25 -08:00
John MacFarlane
e1f3c6058e Added Text.Pandoc.Readers.Native (readNative).
readNative can now read full pandoc documents, block lists, blocks,
inline lists, or inlines.  It will interpret

Str "hi"

as if it were

Pandoc (Meta [] [] []) [Plain [Str "hi"]]

This should make testing easier.
2011-01-19 18:36:27 -08:00
John MacFarlane
e647f761ed Use spaceChar instead of oneOf " \t" in rst reader. 2011-01-19 15:17:51 -08:00
John MacFarlane
1b8a9711b8 Replaced more noneOf/oneOf parsers. 2011-01-19 15:14:23 -08:00
John MacFarlane
a400cfe10f Replaced uses of oneOf with more efficient parsers.
This speeds up the markdown reader.
2011-01-19 15:06:56 -08:00
John MacFarlane
c09518eefd More small parser rewrites for small performance gains. 2011-01-19 14:59:59 -08:00
John MacFarlane
61f3db612c Parsing: Rewrote spaceChar for significant speedup in readers. 2011-01-19 14:45:15 -08:00
John MacFarlane
adaae082fc Fixed problem with inline code in ConTeXt writer.
Previously `}` would be rendered '\type{}}'.
Now we check the string for '}' and '{'. If it contains neither,
use \type{}; otherwise use \mono{} with an escaped version of the
string.

Note:  There are some issues using the \type!str! form, including
differences btw mkii and mkiv. For now this is a conservative fix.
Perhaps in the future we can use \type!str!.  See the discussion on
pandoc-discuss s.v. "Bug in context writer".
2011-01-19 11:53:00 -08:00
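The rule described in adaae082fc (use \type{} when the string contains neither brace, otherwise \mono{} with escaping) can be sketched like this. The function name is hypothetical and the escaping is deliberately minimal: it handles only braces, whereas a real ConTeXt escaper would need to cover other special characters as well.

```haskell
-- Hypothetical helper, not pandoc's actual writer code.
inlineCodeConTeXt :: String -> String
inlineCodeConTeXt s
  | any (`elem` "{}") s = "\\mono{" ++ concatMap escapeBrace s ++ "}"
  | otherwise           = "\\type{" ++ s ++ "}"
  where
    -- Assumption: only braces are escaped in this sketch.
    escapeBrace '{' = "\\{"
    escapeBrace '}' = "\\}"
    escapeBrace c   = [c]
```

For the input `}` this produces `\mono{\}}` instead of the broken `\type{}}`.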
John MacFarlane
8f7c119c0f Removed '--no-citeproc' as alias for '--natbib'.
This was confusing, I think, as no-citeproc could be either
natbib or biblatex.
2011-01-16 11:08:56 -08:00
John MacFarlane
281b36470f Minor code formatting. 2011-01-16 11:08:20 -08:00
John MacFarlane
b6d1f4bc9e Moved --chapters to before --number-sections in option list. 2011-01-16 09:34:26 -08:00
John MacFarlane
ab20da4be5 Support --chapters for ConTeXt output as well. 2011-01-16 09:08:19 -08:00
John MacFarlane
ece098b9e0 Use <chapter> for top docbook header if template has <book>.
Resolves Issue #265.
2011-01-16 08:59:53 -08:00
John MacFarlane
9721b87c26 Added --chapters option affecting docbook and latex.
* Added writerChapters to WriterOptions.
* Added --chapters command-line option.
* --chapters causes top-level headers to be "chapter" instead of
  "section" in LaTeX and DocBook.
* Resolves Issue #225.
2011-01-16 08:58:29 -08:00
John MacFarlane
53eb2c4828 HTML writer: Add ids to <section> tags. 2011-01-15 22:35:25 -08:00
John MacFarlane
a0e19ba8aa Merge branch 'tests' 2011-01-15 09:25:01 -08:00
John MacFarlane
fd79417825 Fixed the parser for rst+lhs - set stateLiterateHaskell. 2011-01-14 22:38:02 -08:00
John MacFarlane
a5cbcdfe3a HTML reader: parse simple tables.
Resolves Issue #106.  Thanks to Rodja Trappe for the idea
and some sample code.
2011-01-14 20:48:10 -08:00
John MacFarlane
c31d3cc306 HTML reader: parse location tags in pSatisfy.
This avoids the need for manual parsing all over the place.
2011-01-14 20:47:32 -08:00
John MacFarlane
9305114b9f LaTeX writer: Escape strings in \href{..}.
Previously strings weren't escaped, so %5D would be interpreted
as a LaTeX comment!
2011-01-14 18:59:50 -08:00
John MacFarlane
5131589be0 Simplified Text.Pandoc.CharacterReferences by using TagSoup entity lookup 2011-01-14 18:28:54 -08:00
John MacFarlane
09e9a86db9 Merge branch 'master' of github.com:jgm/pandoc into tests 2011-01-14 14:46:48 -08:00
John MacFarlane
81403b8d80 LaTeX writer: In nonsimple tables, put cells in \parbox.
Otherwise we can get problems with linebreaks, and cell spacing
isn't right.

Thanks to Jef Allbright for pointing out the problem.
2011-01-14 14:45:04 -08:00
John MacFarlane
ba1d0d3070 Parsing: Fixed bug in grid table parser.
Spaces at end of line were not being stripped properly,
resulting in unintended LineBreaks.
2011-01-14 14:16:27 -08:00
John MacFarlane
5da2d1e66c Merge branch 'master' into tests 2011-01-12 08:13:11 -08:00
John MacFarlane
91510a109f Improvements to --html5 support:
+ <nav> for TOC, <figure> for figures, type attribute in <ol>.
+ Don't add math javascript in html5.
+ Use style attributes instead of deprecated width, align.
+ html template: move <title> after <meta>.
  Note: charset needs to be declared before title.
+ slidy and s5 templates: move <title> after <meta>.
+ html template: Added link to html5 shim for IE.
+ Make --html5 have an effect only for 'html' writer (not s5, slidy, epub).
2011-01-11 23:15:30 -08:00
John MacFarlane
e8ad4ba43c Preliminary support for HTML5.
+ Added writerHtml5 writer option.
+ Added --html5 option.
+ Added support for lang in html tag (so you can do
  'pandoc -s -V lang=en', for example).
+ Updated html template with conditionals for HTML5.
+ When HTML5 selected, use <header> tag around title in document,
  and use <section> tags instead of <div>s if --section-divs
  specified.
2011-01-11 21:18:46 -08:00
John MacFarlane
33ff2fed21 Text.Pandoc: Improved readers, writers lists for lhs variants.
Now the lhs variants set the needed literate Haskell flag in
parser state and writer options.
2011-01-11 20:23:43 -08:00
Nathan Gass
e8fa72c6a7 Moved test-pandoc.hs to tests directory. 2011-01-11 21:49:49 +01:00
Nathan Gass
f3ee73607f Removed run prefix from all test functions. 2011-01-11 21:30:19 +01:00
Nathan Gass
a2153acfff Include lhs tests in existing testGroup structure. 2011-01-11 21:10:36 +01:00
Nathan Gass
e06899ef1f Add reader groups for markdown and rst reader tests. 2011-01-11 20:41:34 +01:00
Nathan Gass
c0700987ba Changed test-pandoc to use test-framework and HUnit. 2011-01-10 00:37:46 +01:00
John MacFarlane
3317e9dea8 pandoc: Test standalone' rather than standalone for final newline. 2011-01-07 18:12:20 -08:00
John MacFarlane
d891b2c29d LaTeX reader: Support simple tables. 2011-01-07 10:15:48 -08:00
John MacFarlane
93c3e27731 pandoc: Add newline to output unless standalone.
This avoids output that does not end with a newline, which
is inconvenient when working with many tools.

Updated tests accordingly.
2011-01-06 21:05:28 -08:00
John MacFarlane
c4c336460b RST writer: blank line after literate Haskell code block. 2011-01-06 21:03:08 -08:00
John MacFarlane
438f32cdfa test-pandoc: Wrap to 78 columns in lhs writer tests. 2011-01-06 16:54:15 -08:00
John MacFarlane
aea93977f5 Markdown writer: blank line after delimited code block. 2011-01-06 16:53:21 -08:00
John MacFarlane
303ce8a9e5 LaTeX reader: allow spaces btw \\begin or \\end and {. 2011-01-06 09:34:24 -08:00
John MacFarlane
81ea1a59b4 LaTeX reader: Removed unnecessary 'spaces'. 2011-01-06 09:24:56 -08:00
John MacFarlane
1be2ca6c78 HTML reader: Fixed bug in htmlTag for comments. 2011-01-06 00:21:19 -08:00
John MacFarlane
b63a7f7c48 LaTeX reader: Apply macros to non-math; handle ensuremath. 2011-01-05 16:55:26 -08:00
John MacFarlane
18e7a7a495 LaTeX reader: Don't handle \label and \ref specially.
Put labels in {} instead of ().
2011-01-05 15:24:20 -08:00
John MacFarlane
1415b6831e LaTeX reader: Support \L \l accents. 2011-01-05 14:57:06 -08:00
John MacFarlane
23aae79b01 Updated for texmath 0.5. 2011-01-05 14:44:26 -08:00
John MacFarlane
eb83f0e5e4 Fixed macro parsing. 2011-01-05 14:42:47 -08:00
John MacFarlane
e126ab9efc LaTeX reader: Parse inside arguments when ignoring commands. 2011-01-05 12:25:47 -08:00
John MacFarlane
c3071ff6e9 LaTeX reader: Don't handle \index separately.
Instead, just put it in list of commands to ignore.
2011-01-05 12:05:04 -08:00
John MacFarlane
b26247a4a8 LaTeX reader: Added "index" to ignorable commands. 2011-01-05 11:56:37 -08:00
John MacFarlane
cf6cd15c27 LaTeX reader: skip space before option or argument. 2011-01-05 11:54:40 -08:00
John MacFarlane
d033fc9d3e LaTeX reader: Skip \index commands. 2011-01-05 10:11:24 -08:00
John MacFarlane
c949530815 LaTeX reader: Removed \group (we want to parse inside {}). 2011-01-05 10:06:51 -08:00
John MacFarlane
3dab6c574c LaTeX reader: Better handling of preamble, inc. parsing macros. 2011-01-05 09:04:03 -08:00
John MacFarlane
85bfd26b78 LaTeX reader: Parse bracketed {parts} as raw TeX. 2011-01-04 22:20:35 -08:00
John MacFarlane
22b2c02aeb Markdown reader: Removed unneeded definitions.
specialChars, strChar, specialCharsMinusLt.
2011-01-04 22:11:56 -08:00
John MacFarlane
dac2e9156f LaTeX reader: parse macros and apply to math. 2011-01-04 19:18:20 -08:00
John MacFarlane
fcbe1e95eb Moved 'macro' and 'applyMacros'' from markdown reader to Parsing. 2011-01-04 19:12:33 -08:00
John MacFarlane
3e61333af0 Fixed regression in markdown reader.
'(_hi_)' was being parsed with literal underscores (no emphasis).
The fix:  the 'str' parser now only parses alphanumerics and
embedded underscores.  All other symbols are handled by the
'symbol' parser.  This has a slight effect on the AST, since
you'll get [Str "hi",Str ":"] instead of [Str "hi:"].  But there
should not be a visible effect in any of the writers.

Thanks to gwern for pointing out the regression.
2011-01-01 22:46:30 -08:00
John MacFarlane
0411f51433 Updated copyright notices. 2011-01-01 10:26:10 -08:00
John MacFarlane
b05e739c6d LaTeX reader: Allow ignored comments after \end{document}. 2010-12-30 22:05:19 -08:00
John MacFarlane
d6f28af9cb HTML reader: Fixed some parsing bugs. 2010-12-30 19:33:37 -08:00
Puneeth Chaganti
e4dedad1c0 Added support for listings package code blocks and inline code. 2010-12-30 14:37:51 -08:00
John MacFarlane
f49e60a8b8 Textile reader: Slight speed improvement. 2010-12-30 14:33:11 -08:00
John MacFarlane
904050fa36 New HTML reader using tagsoup as a lexer.
* The new reader is faster and more accurate.

* API changes for Text.Pandoc.Readers.HTML:
   - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
     anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
     htmlBlockElement, htmlComment
   - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag

* tagsoup is a new dependency.

* Text.Pandoc.Parsing: Generalized type on readWith.

* Benchmark.hs: Added length calculation to force full evaluation.

* Updated HTML reader tests.

* Updated markdown and textile readers to use the functions from
  the HTML reader.

* Note: The markdown reader now correctly handles some cases it did not
  before. For example:

    <hr/>

  is reproduced without adding a space.

    <script>
      a = '<b>';
    </script>

  is parsed correctly.
2010-12-30 13:55:40 -08:00
John MacFarlane
220fe5fab8 normalize: Don't reduce [Space] to []. 2010-12-26 12:01:33 -08:00
John MacFarlane
c912288eda Improved 'normalize'.
Now normalizeInlines is split into consolidateInlines
and removeEmptyInlines.  We need to remove empties before
consolidating.
2010-12-26 10:24:15 -08:00
John MacFarlane
249aa9e044 Markdown writer: Fixed bug in Image.
URI was getting unescaped twice!
2010-12-26 10:23:20 -08:00
John MacFarlane
82903cfaf3 Improved normalize. 2010-12-25 14:03:43 -08:00
John MacFarlane
10d85f8b0b Use functions from Text.Pandoc.Generic instead of processWith(M). 2010-12-24 13:39:27 -08:00
John MacFarlane
c08ca6fa6d HTML reader: Simplified parsing of <script> sections.
I had previously assumed that we needed to ignore
</script> occurring in a string literal or javascript
comment.  It turns out, though, that browsers aren't
that smart.
2010-12-22 19:20:27 -08:00
John MacFarlane
4bfe140ed1 Made --smart work with HTML reader.
It did not work before, because - and quotes were gobbled
up by the str parser.
2010-12-22 17:05:17 -08:00
John MacFarlane
63bf227e04 RST reader: Added unicode quote characters to specialChars.
(So they can trigger Quoted environments.)
2010-12-22 17:04:56 -08:00
John MacFarlane
bbad129066 RST reader: recouped speed loss due to addition of --smart.
This was achieved by rearranging the parsers in inline.

Benchmarks went from 500ms to 307ms -- not quite back to the
279ms we had in 1.6, before supporting smart punctuation and
footnotes, but close.
2010-12-22 15:10:21 -08:00
John MacFarlane
4ba3afbb4d ODT writer: Don't wrap text in opendocument. 2010-12-22 14:55:59 -08:00
John MacFarlane
dc597a8a68 Removed all dependencies on 'pretty' package. 2010-12-22 11:48:08 -08:00
John MacFarlane
8e9c490b0a Texinfo writer: Updated to use Pretty. 2010-12-22 11:43:43 -08:00
John MacFarlane
f15d479fc2 Shared: Removed unneeded prettyprinting functions:
wrapped, wrapIfNeeded, wrappedTeX, wrapTeXIfNeeded, hang'.
2010-12-22 00:34:36 -08:00
John MacFarlane
21d2d918ac Shared: Removed BlockWrapper, wrappedBlocksToDoc.
These are no longer needed with the new Pretty module.
2010-12-22 00:28:20 -08:00
John MacFarlane
369502bbb4 Pretty: Added quote, doubleQuote. 2010-12-22 00:22:28 -08:00
John MacFarlane
fd07db16e9 Man writer: updated to use Pretty. 2010-12-22 00:22:13 -08:00
John MacFarlane
c904024944 OpenDocument writer: Updated to use Pretty. 2010-12-21 16:59:17 -08:00
John MacFarlane
e2548a1317 XML: don't use breaking spaces in attribute lists. 2010-12-21 16:46:21 -08:00
John MacFarlane
ebdbb06f94 Docbook writer: Updated to use Pretty. 2010-12-21 16:45:43 -08:00
John MacFarlane
ce533ffd90 Pretty: don't print a breaking space before a newline. 2010-12-21 16:45:13 -08:00
John MacFarlane
fe1152985c Shared: Made splitBy take a test instead of an element. 2010-12-21 08:41:24 -08:00
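A splitting function that takes a test rather than a separator element, as fe1152985c describes, might have the following shape. This is a sketch only; `splitByPred` is a hypothetical name, and the actual definition in Text.Pandoc.Shared may differ, for instance in how it treats consecutive separators.

```haskell
-- Split a list at every run of elements satisfying the predicate.
-- Assumption: consecutive separators produce no empty pieces between them.
splitByPred :: (a -> Bool) -> [a] -> [[a]]
splitByPred _ [] = []
splitByPred isSep xs =
  let (piece, rest) = break isSep xs
  in  piece : splitByPred isSep (dropWhile isSep rest)
```

The old element-based behavior is recovered by passing an equality test, for example `splitByPred (== ',')`.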
John MacFarlane
4e446358d1 XML: Replaced escapeStringAsXML with a faster version.
Benchmarked with criterion, it's about 8x faster than
the old version.  This speeds up docbook, opendocument,
and html writers.
2010-12-21 08:23:48 -08:00
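For orientation, a simple single-pass XML escaper looks like the sketch below. It is not claimed to be the faster version introduced in 4e446358d1, and the set of escaped characters is an assumption.

```haskell
-- Escape one character for use in XML text or attribute values.
escapeCharForXML :: Char -> String
escapeCharForXML c = case c of
  '&' -> "&amp;"
  '<' -> "&lt;"
  '>' -> "&gt;"
  '"' -> "&quot;"
  _   -> [c]

-- Escape a whole string in one left-to-right pass.
escapeStringForXML :: String -> String
escapeStringForXML = concatMap escapeCharForXML
```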
John MacFarlane
78cea94f45 Markdown writer: use \ for newline instead of two spaces at eol.
(Unless --strict.)
2010-12-20 19:36:40 -08:00
John MacFarlane
8889ae8b5b Markdown writer: Use delimited code block if there are attributes.
(Unless in strict mode.)
2010-12-20 19:36:40 -08:00
John MacFarlane
0086329c36 Plain writer: set stateStrictMarkdown automatically. 2010-12-20 19:36:40 -08:00
John MacFarlane
2587543457 ConTeXt writer: Updated to use Text.Pandoc.Pretty. 2010-12-20 19:36:35 -08:00
John MacFarlane
112717de4e Renamed 'enclosed' to 'inside'.
This avoids conflict with 'enclosed' in Text.Pandoc.Parsing.
2010-12-20 19:09:01 -08:00
John MacFarlane
2fe271d163 Pretty: Fixed parens. 2010-12-19 17:20:18 -08:00
John MacFarlane
9120514998 Pretty: Added enclosed, parens. 2010-12-19 12:39:49 -08:00
John MacFarlane
59cc27c10b LaTeX writer: A bit of code polish. 2010-12-19 10:21:16 -08:00
John MacFarlane
99a58e51f5 LaTeX writer: Modified to use Pretty.
Improved footnote formatting, removed spurious blank lines.
2010-12-19 10:14:12 -08:00
John MacFarlane
09aec9f3e3 Shared: Use stringify to simplify inlineListToIdentifier. 2010-12-19 10:13:36 -08:00
John MacFarlane
6aa5010617 Pretty: Added braces and brackets. 2010-12-19 10:13:11 -08:00
John MacFarlane
89bf312765 LaTeX writer: Use \paragraph, \subparagraph for level 4,5 headers. 2010-12-18 15:05:21 -08:00
John MacFarlane
543aa28c38 Added new prettyprinting module.
* Added Text.Pandoc.Pretty.
  This is better suited for pandoc than the 'pretty' package.
  One advantage is that we now get proper wrapping; Emph [Inline]
  is no longer treated as a big unwrappable unit. Previously
  we only got breaks for spaces at the "outer level." We can also
  more easily avoid doubled blank lines.  Performance is
  significantly better as well.

* Removed Text.Pandoc.Blocks.
  Text.Pandoc.Pretty allows you to define blocks and concatenate
  them.

* Modified markdown, RST, org writers to use Text.Pandoc.Pretty
  instead of Text.PrettyPrint.HughesPJ.

* Text.Pandoc.Shared:  Added writerColumns to WriterOptions.

* Markdown, RST, Org writers now break text at writerColumns.

* Added --columns command-line option, which sets stColumns
  and writerColumns.

* Table parsing:  If the size of the header > stColumns,
  use the header size as 100% for purposes of calculating
  relative widths of columns.
2010-12-17 13:39:17 -08:00
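The table-width rule in the last bullet of 543aa28c38 can be sketched as below. The function name and argument layout are hypothetical, and the behavior when the header fits within the column limit (divide by the column limit) is an assumption not stated in the commit message.

```haskell
-- Relative column widths as fractions of the full width.
-- columns:    the configured line width (stColumns in the commit message)
-- colLengths: the width of each column as measured from the header line
relativeWidths :: Int -> [Int] -> [Double]
relativeWidths columns colLengths =
  map (\len -> fromIntegral len / fromIntegral denominator) colLengths
  where
    headerSize  = sum colLengths
    -- If the header is wider than the column limit, the header counts as 100%.
    denominator = max headerSize columns
```

With a limit of 80, two columns of width 60 each get half the width, since the header (120) exceeds the limit.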
John MacFarlane
2a075e9d7a test-pandoc: removed need to depend on MissingH. 2010-12-15 18:07:36 -08:00
John MacFarlane
605648cbbf Added 'tests' Cabal flag.
+ This ensures that test-pandoc gets built.
+ 'cabal test' now runs this.
+ The old tests/RunTests.hs has been removed, and
  src/test-pandoc.hs added.
2010-12-15 17:54:51 -08:00
John MacFarlane
63cf37a9ca HTML reader: allow : in tags.
Resolves Issue #274.
2010-12-15 14:15:53 -08:00
Nathan Gass
a312d2a8ae Use top-level header at end as bibliography title for natbib and biblatex output. 2010-12-15 10:21:56 -08:00
Nathan Gass
8f60176511 Remove punctuation at start of suffix for natbib and biblatex output.
This is necessary as the latex citation commands include their own
punctuation, which resulted in doubled commas for markdown documents
where citeproc output works correctly.
2010-12-15 10:21:53 -08:00
Nathan Gass
43fee5e7f7 Support multiple bibliography files with natbib and biblatex output. 2010-12-15 10:21:47 -08:00
John MacFarlane
63d5e0c5f9 Added 'normalize' to Text.Pandoc.Shared. 2010-12-14 20:04:37 -08:00
John MacFarlane
3ac6f72f98 Fixed preamble parsing in LaTeX reader. 2010-12-14 19:34:28 -08:00
John MacFarlane
128cf46089 Fixed regression in parsing _emph_
There was a bug in parsing '_emph_, ...':  when followed by
a comma, underscore emphasis did not register.  (Thanks to
gwern for pointing this out.)

This bug was introduced by the change in
c66921f2ac
2010-12-14 18:23:26 -08:00
Nathan Gass
2e728df756 Moved special handling of punctuation in suffix out of markdown reader.
This allows different writers to handle punctuation in the suffix
differently.
2010-12-13 20:50:29 -08:00
Nathan Gass
c2d3796439 Added support for latex cite commands in latex reader. 2010-12-13 20:48:19 -08:00
Nathan Gass
c81495a07a Added option to write citation markup in markdown writer. 2010-12-13 20:42:58 -08:00
Nathan Gass
48600fd547 Added support to write natbib or biblatex citations in latex output. 2010-12-13 20:41:37 -08:00
John MacFarlane
1a4a0d0283 Markdown reader: Further fix to abbrevs. 2010-12-13 20:05:50 -08:00
John MacFarlane
7b4d3c77ec Markdown reader: Fixed abbrev handler to allow abbrev at end of line.
E.g., Mr.
Frank.
2010-12-13 20:04:11 -08:00
John MacFarlane
3822d6c440 Markdown reader: Fixed referenceKey parser to allow space after newline. 2010-12-13 20:03:59 -08:00
John MacFarlane
dfbb4d3994 Fixed inlineListToIdentifier to treat '\160' as ' '. 2010-12-13 20:03:52 -08:00
John MacFarlane
71e0557e61 Markdown reader: Fixed regression in reference key parser.
* The recent change allowing spaces and newlines in the URL
  caused problems when reference keys are stacked up without
  blank lines between. This is now fixed.
* Added test.
2010-12-13 20:03:12 -08:00
John MacFarlane
3748dfeb91 Markdown reader: fix superscripts with links.
Moved inlineNote parser after superscript parser,
so ^[link](/foo)^ gets recognized as a superscripted
link, not an inline note followed by garbage.

Thanks to Conal Elliott for pointing out the problem.
2010-12-12 20:30:55 -08:00
John MacFarlane
250aa20250 Recognize .json extension as json reader/writer. 2010-12-12 20:30:26 -08:00
John MacFarlane
c6b79d794e Removed deprecated -C/--custom-header option.
Use --template instead.
2010-12-11 00:22:34 -08:00
John MacFarlane
f5c2082304 Added JSON reader and writer.
The JSON reader is about 20x faster than the native reader.
So this can be a good way to serialize a pandoc document.
2010-12-11 00:06:03 -08:00
John MacFarlane
2dfb45950e LaTeX reader: Improved parsing of preamble.
Previously you'd get unexpected behavior on a document that
contained '\begin{document}' in, say, a verbatim block.
2010-12-10 23:21:24 -08:00
John MacFarlane
9602f73f2a Moved 'readers' and 'writers' to Text.Pandoc.
This allows library users to avoid repetitive case statements...
2010-12-10 17:30:32 -08:00
John MacFarlane
de6452c0d1 Markdown reader: small cosmetic code improvements. 2010-12-10 16:26:35 -08:00
John MacFarlane
5770ceca36 Removed HTML sanitization.
This is better done on the resulting HTML; use the xss-sanitize library
for this.  xss-sanitize is based on pandoc's sanitization, but improves
it.

- Removed stateSanitize from ParserState.
- Removed --sanitize-html option.
2010-12-10 12:26:03 -08:00
John MacFarlane
17d48cf4af Markdown reader: Allow linebreaks in URLs (treat as spaces).
Also, a string of consecutive spaces or tabs is now parsed
as a single space. If you have multiple spaces in your URL,
use %20%20.
2010-12-10 12:14:51 -08:00
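The whitespace handling described in 17d48cf4af (line breaks treated as spaces, runs of spaces or tabs collapsed to one space) can be approximated with standard Prelude functions. This is a sketch with a hypothetical name, not the reader's parser code; note that it also trims leading and trailing whitespace, which the commit message does not mention.

```haskell
-- Collapse every run of spaces, tabs, and newlines into a single space.
collapseUrlWhitespace :: String -> String
collapseUrlWhitespace = unwords . words
```

Because runs are collapsed, two literal spaces cannot survive, which is why the commit message recommends writing %20%20 in that case.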
John MacFarlane
ee0a0953de Markdown reader: Rewrote para parser for better efficiency.
This change avoids repeated parsing of inline lists for 'plain'
blocks.
2010-12-10 10:47:46 -08:00
John MacFarlane
167eeef6cb Added json format for reading and writing.
This is faster to parse than native.
2010-12-09 10:40:31 -08:00
paul.rivier
bb609a85e3 textile redcloth definition lists 2010-12-09 09:25:46 -08:00
John MacFarlane
88a40685b8 Textile reader: better treatment of acronyms.
We now parse PBS(Public Broadcasting System) as if it were
"PBS (Public Broadcasting System)".
2010-12-09 08:52:09 -08:00
John MacFarlane
9ead748cc9 RST reader: Added footnote support.
Resolves issue #258.

Note that there are some differences in how docutils and
pandoc treat footnotes.  Currently pandoc ignores the numeral
or symbol used in the note; footnotes are put in an auto-numbered
ordered list.
2010-12-08 08:39:50 -08:00
John MacFarlane
91978d2201 Markdown reader: minor footnote changes.
Don't skipNonindentSpaces in noteMarker, since it's also
used in the inline note parser.
2010-12-08 08:17:16 -08:00
John MacFarlane
f02080b62d Textile reader: Implemented footnotes. 2010-12-08 00:44:46 -08:00
John MacFarlane
200ea33641 Made --smart work with RST reader. 2010-12-07 21:49:10 -08:00
John MacFarlane
5e35eb309f Make --smart work in HTML reader. 2010-12-07 21:24:35 -08:00
John MacFarlane
33ba35da9f Smart punctuation: recognize entities.
Now &ldquo;Hi&rdquo; gets parsed as a Quoted DoubleQuote inline.
2010-12-07 20:44:43 -08:00
John MacFarlane
3a5fceeef9 Rewrote normalizeSpaces (mostly aesthetic reasons). 2010-12-07 20:10:21 -08:00
John MacFarlane
e20052a1ba Markdown reader: Moved smartPunctuation parser, for slight speed bump. 2010-12-07 20:09:40 -08:00
John MacFarlane
ace3b80f1e Smart punctuation: don't allow ellipses containing spaces.
Previously we allowed '. . .', ' . . . ', etc.  This caused
too many complications, and removed author's flexibility in
combining ellipses with spaces and periods.
2010-12-07 20:08:14 -08:00
John MacFarlane
50ca61ef49 Moved smartPunctuation from Markdown to Parsing.
+ Parameterized smartPunctuation on an inline parser.
+ Handle smartPunctuation in Textile reader.
2010-12-07 19:03:08 -08:00