Commit graph

5844 commits

Author SHA1 Message Date
fiddlosopher
451b426fd6 Removed unneeded try's in RST reader; also minor code cleanup.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@959 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 20:39:31 +00:00
fiddlosopher
2a37d8d30a Efficiency improvements to RST reader (more than doubled
speed):
+ removed tabchar
+ rearranged parsers in inline


git-svn-id: https://pandoc.googlecode.com/svn/trunk@958 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 20:22:24 +00:00
fiddlosopher
015644b60e Purely stylistic change.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@957 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 20:03:28 +00:00
fiddlosopher
7f8ec9577e Removed unneeded 'try' in 'ellipses'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@956 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 19:59:08 +00:00
fiddlosopher
60f700997c + Fixed bug introduced into referenceTitle by previous changes.
Now it works as before.
+ Improved Markdown.pl-compatibility in referenceLink:  the two
  parts of a reference-style link may be separated by one space,
  but not more... [a] [link], [not]   [a link].


git-svn-id: https://pandoc.googlecode.com/svn/trunk@955 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 19:57:01 +00:00
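Illustrative sketch of the referenceLink spacing rule from 60f700997c, written in Python rather than pandoc's Haskell. The names `REFERENCE_LINK` and `match_reference_link` are hypothetical, and the pattern assumes labels contain no nested brackets; it shows only the "at most one space between the two parts" behaviour described in the commit message.

```python
import re
from typing import Optional, Tuple

# Two bracketed parts separated by zero or one space (never more).
# Assumption: labels contain no nested or escaped brackets.
REFERENCE_LINK = re.compile(r"\[([^\]]*)\] ?\[([^\]]*)\]")


def match_reference_link(text: str) -> Optional[Tuple[str, str]]:
    """Return (link text, reference label) if text starts with a reference link."""
    m = REFERENCE_LINK.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2)
```

With this sketch, `[a] [link]` and `[a][link]` both match, while `[not]   [a link]` does not.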
fiddlosopher
845a658aff Fixed markdown inline code parsing so it better accords with
Markdown.pl:  the marker for the end of the code section is
a clump of the same number of `'s with which the section began,
followed by a non-` character.  So, for example,

   ` h     ```    i ` -> <code>h     ```    i</code>.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@954 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 16:38:41 +00:00
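Illustrative sketch of the inline code rule from 845a658aff, in Python rather than pandoc's Haskell. `parse_inline_code` is a hypothetical name. Two points are assumptions not stated in the commit: the closing clump may also sit at the very end of the input, and surrounding whitespace is trimmed from the code (suggested by the commit's own example).

```python
import re
from typing import Optional, Tuple


def parse_inline_code(text: str) -> Optional[Tuple[str, str]]:
    """Return (code, rest) if text starts with an inline code span, else None."""
    opener = re.match(r"`+", text)
    if opener is None:
        return None
    n = opener.end()              # number of backticks that opened the span
    i = n
    while i < len(text):
        if text[i] != "`":
            i += 1
            continue
        run = re.match(r"`+", text[i:]).end()   # length of this backtick clump
        if run == n:              # same-sized clump closes the span
            return text[n:i].strip(), text[i + run:]
        i += run                  # a longer or shorter clump is ordinary content
    return None                   # no matching closer
```

Applied to the example in the commit message, the three-backtick clump in the middle is kept as content because it does not match the single opening backtick.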
fiddlosopher
77f63605f5 Small change to referenceTitle: should end with line-end, not ')'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@953 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 01:53:50 +00:00
fiddlosopher
21a2acaac9 Split 'title' into 'linkTitle' and 'referenceTitle', since the
rules are slightly different.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@952 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 01:51:03 +00:00
fiddlosopher
86cc8c8bf2 Rewrote charsInBalanced and charsInBalanced'.
- Documented restriction: open and close must be distinct characters.
- Rearranged options for greater efficiency.
- Changed inner call to charsInBalanced inside charsInBalanced' to
  charsInBalanced'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@951 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 01:50:36 +00:00
fiddlosopher
64023d8ba8 Removed unneeded 'try' from noteMarker.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@950 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 00:16:50 +00:00
fiddlosopher
40b870375d Minor reformatting.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@949 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 00:11:37 +00:00
fiddlosopher
3edf5834e8 Rewrote 'para' for greater efficiency.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@948 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 00:08:18 +00:00
fiddlosopher
902d6c7115 Fixed bug in LaTeX writer: autolinks would not cause
'\usepackage{url}' to be put in the document header.
Also, changes to state in enumerated list items would be
overwritten.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@947 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-29 00:03:45 +00:00
fiddlosopher
5bafe2c9fb Minor reformatting.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@946 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 22:44:01 +00:00
fiddlosopher
fcb91e8e51 Rewrote link parsers for greater efficiency.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@945 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 22:41:05 +00:00
fiddlosopher
e407279b35 anyLine now requires that the line end with a newline (not eof).
This is a harmless assumption, since we always add
newlines to the end of a block before parsing with anyLine.
Yields a 10% speed boost!


git-svn-id: https://pandoc.googlecode.com/svn/trunk@944 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 21:17:08 +00:00
fiddlosopher
8fdf8c1d4c + Removed tabsToSpaces and tabsInLine from Text.Pandoc.Shared.
(They were used only in Main.)
+ Wrote new tabsToSpacesInLine function in Main that changes tabs
  to spaces and removes DOS line-endings in one pass, for a slight
  speed improvement.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@942 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 19:33:47 +00:00
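Illustrative sketch of the one-pass idea behind tabsToSpacesInLine in 8fdf8c1d4c, in Python rather than pandoc's Haskell. The function name, the `tab_stop` parameter, and its default of 4 are hypothetical; the sketch also assumes that a tab advances to the next tab stop and that every carriage return is dropped.

```python
def tabs_to_spaces_in_line(line: str, tab_stop: int = 4) -> str:
    """Expand tabs and drop carriage returns in a single pass over the line."""
    out = []
    col = 0                                   # current output column
    for ch in line:
        if ch == "\t":
            pad = tab_stop - (col % tab_stop)  # distance to the next tab stop
            out.append(" " * pad)
            col += pad
        elif ch == "\r":
            continue                          # DOS line-ending residue
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

Doing both jobs in one traversal avoids walking the input twice, which is the kind of saving the commit message describes.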
fiddlosopher
cda7e7ac21 Removed redundant 'referenceLink' in definition of inline
(it's already in 'link').


git-svn-id: https://pandoc.googlecode.com/svn/trunk@940 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:31:18 +00:00
fiddlosopher
6906584f47 Refactored escapeChar so it doesn't need 'try'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@939 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:24:56 +00:00
fiddlosopher
06a5a0e235 Removed unneeded 'try' in multilineRow.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@938 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:22:21 +00:00
fiddlosopher
ce800f7121 Removed unneeded 'try' in dashedLine.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@937 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:20:47 +00:00
fiddlosopher
3e337eb9c8 Removed unneeded try in rawHtmlBlocks (Markdown parser).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@936 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:19:07 +00:00
fiddlosopher
74dc62e730 Refactored hrule for performance in Markdown reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@935 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 07:04:47 +00:00
fiddlosopher
8cf4971821 Minor reformatting.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@934 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:59:57 +00:00
fiddlosopher
0c475bfd47 Refactored setext header parsing in Markdown reader for greater
speed.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@933 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:58:20 +00:00
fiddlosopher
18b379c1ca More rearranging in definition of inline.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@932 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:44:47 +00:00
fiddlosopher
b6ebe75656 More intelligent rearranging of 'inline' for speed boosts
in Text.Pandoc.Readers.Markdown.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@931 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:38:38 +00:00
fiddlosopher
b94f541477 Removed unneeded 'try' from romanNumeral parser.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@930 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:07:06 +00:00
fiddlosopher
6765fadd12 Use notFollowedBy instead of notFollowedBy' in charsInBalanced.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@929 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:04:57 +00:00
fiddlosopher
f10ac4359c Removed unneeded 'try' in 'parseFromString'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@928 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:02:12 +00:00
fiddlosopher
da9271c258 Removed unneeded 'try' from stringAnyCase. (Now it behaves
like 'string'.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@927 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 06:00:51 +00:00
fiddlosopher
f04df62db1 Changed definition of 'enclosed' in Text.Pandoc.Shared so that
'try' is not automatically applied to the 'end' parser.
Added 'try' in calls to 'enclosed' where needed.  Slight speed
increase.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@926 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 05:58:21 +00:00
fiddlosopher
a6da87f484 Minor code cleanup in Text.Pandoc.Shared.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@925 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 03:06:05 +00:00
fiddlosopher
1d8fe2653a Performance improvements:
+ Rearranged parsers in definition of 'inline' so that the most
frequently used would (by and large) be tried first.
+ Removed some unneeded 'try's.
+ Removed tabchar parser, as whitespace handles tabs anyway.
+ All in all, these changes, together with the last two commits,
  cut almost in half the time it takes pandoc to parse a large test file.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@924 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 02:33:53 +00:00
fiddlosopher
08fbfa37cc Removed unnecessary 'try' in 'anyLine' (Text.Pandoc.Shared).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@923 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 02:31:17 +00:00
fiddlosopher
583536a138 Refactored Text.Pandoc.CharacterReferences.
Removed unnecessary 'try's for a speed improvement.
Removed unnecessary '&' and ';' from the entity table.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@922 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-28 02:30:38 +00:00
fiddlosopher
1163e622ab Don't count "p. 27" at the beginning of a line as an ordered list
start, since it's most likely a page number.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@900 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-26 03:17:40 +00:00
fiddlosopher
f609755cb6 Fixed bug in LaTeX writer. When a footnote ends with a Verbatim
environment, the close } of the footnote cannot occur on the same
line or an error occurs.  Fixed by adding a newline before the close }
in every footnote.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@897 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-25 18:17:01 +00:00
fiddlosopher
3de1f19b43 Removed incorrect "{}" around style information in
HTML tables.  Adjusted test suite accordingly.  Column
widths now work properly in HTML.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@882 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-23 23:20:54 +00:00
fiddlosopher
f11360f50e Added new rule for enhanced markdown ordered lists: if the list marker
is a capital letter followed by a period (including a single-letter
capital roman numeral), then it must be followed by at least two spaces.
The point of this is to avoid accidentally treating people's initials as
list markers: a paragraph may begin:

    B. Russell was an English philosopher.

and this shouldn't be treated as a list.

Modified Markdown reader and README documentation.
Added a test case.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@880 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-23 04:25:09 +00:00
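Illustrative sketch of the capital-letter rule from f11360f50e, in Python rather than pandoc's Haskell. `MARKER` and `starts_ordered_list` are hypothetical names, and the sketch handles only single-letter and decimal markers followed by a period; multi-letter roman numerals and other delimiters are out of scope here.

```python
import re

# A single letter or a decimal number, then a period, then one or more spaces.
MARKER = re.compile(r"([A-Za-z]|[0-9]+)\.( +)")


def starts_ordered_list(line: str) -> bool:
    """Decide whether a line begins with an ordered list marker."""
    m = MARKER.match(line)
    if m is None:
        return False
    marker, spaces = m.group(1), m.group(2)
    if marker.isupper():
        # Capital letter (including a one-letter roman numeral such as "I"):
        # require two spaces so that initials are not mistaken for markers.
        return len(spaces) >= 2
    return True
```

Under this rule `B. Russell was an English philosopher.` stays a paragraph, while `B.  Second point` (two spaces) starts a list item.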
fiddlosopher
4a0e02dab7 Added a needed 'try' to listItem in Markdown reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@878 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-22 20:19:37 +00:00
fiddlosopher
cc0460f952 Code cleanup in Markdown reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@877 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-20 19:21:18 +00:00
fiddlosopher
84aa875bdb Changes to Markdown reader for better conformity to the
Markdown test suite under --strict:
+ Removed check for a following setext header in endline.
  A full test is too inefficient (doubles benchmark time), and
  the substitute we had before is not 100% accurate.
+ Don't use Code elements for autolinks if --strict specified.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@876 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-20 18:52:49 +00:00
fiddlosopher
57d52c39ec If --strict and not --toc, don't include identifiers in headers.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@875 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-20 18:42:09 +00:00
fiddlosopher
81eba062f2 Refactor RST and Markdown readers using parseFromString.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@864 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-19 00:18:46 +00:00
fiddlosopher
4e149f898a Added a necessary "try" in definition of "para"
(HTML reader).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@863 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-19 00:06:03 +00:00
fiddlosopher
4399db4fd2 Bug fixes in readers:
+ LaTeX reader:  skip anything after \end{document}
+ HTML reader: fixed bug skipping material after </html> -- previously,
  stuff at the end was skipped even if no </html> was present, which
  meant only part of the file would be parsed and no error issued
+ HTML reader: added new constant eitherBlockOrInline with elements that
  may count either as block-level or inline
+ Modified isInline and isBlock to take this into account
+ modified rawHtmlBlock to accept any tag (even an inline tag);
  this is innocuous, because rawHtmlBlock is tried only if a regular
  inline element can't be parsed.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@862 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-18 23:44:26 +00:00
fiddlosopher
e48f046aa0 + Fixed bug in markdown ordered list parsing. The problem was
that anyOrderedListStart did not check for a space following the
  ordered list marker. So, 'A.B. 2007' would be parsed as a list item,
  then fail because of the lack of space after 'A.' (required by
  orderedListStart). Resolves Issue #22.
+ Fixed a similar problem in RST reader.
+ Added regression test.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@861 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-18 15:26:29 +00:00
fiddlosopher
a6f3dd3755 Fixed block quote output in markdown writer: previously,
block quotes in notes would be indented only in the first line.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@859 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-17 18:48:16 +00:00
fiddlosopher
3904078e39 Cosmetic changes.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@851 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-15 21:20:02 +00:00
fiddlosopher
1a4489ef30 LaTeX reader: parse \texttt{} as code, as long as there's
nothing fancy inside.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@846 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-15 17:30:37 +00:00
fiddlosopher
ec0e6e9941 Fixed bug in normalizeSpaces: Space:Str "":Space
should compress to Space.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@845 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-15 17:29:20 +00:00
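Illustrative sketch of the case fixed in ec0e6e9941, in Python rather than pandoc's Haskell. The tuple encoding of `Space` and `Str` and the name `normalize_spaces` are hypothetical, and the sketch covers only what the commit message states (empty strings must not keep two spaces apart); whatever else pandoc's normalizeSpaces does is not modelled.

```python
from typing import List, Tuple

Inline = Tuple[str, ...]          # hypothetical stand-in for pandoc's Inline type
SPACE: Inline = ("Space",)


def Str(s: str) -> Inline:
    return ("Str", s)


def normalize_spaces(inlines: List[Inline]) -> List[Inline]:
    """Drop empty Str elements and collapse runs of Space into one."""
    out: List[Inline] = []
    for el in inlines:
        if el == ("Str", ""):
            continue              # an empty string should not separate spaces
        if el == SPACE and out and out[-1] == SPACE:
            continue              # collapse consecutive spaces
        out.append(el)
    return out
```

Filtering the empty `Str` before comparing neighbours is what lets `Space : Str "" : Space` reduce to a single `Space`.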
fiddlosopher
6cc5f6b199 Allow htmlComments as rawHtmlInline in HTML reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@844 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-15 17:28:21 +00:00
fiddlosopher
a8e2199034 Major code cleanup in all modules. (Removed unneeded imports,
reformatted, etc.)  More major changes are documented below:

+ Removed Text.Pandoc.ParserCombinators and moved all its definitions
  to Text.Pandoc.Shared.
+ In Text.Pandoc.Shared:
  - Removed unneeded 'try' in blanklines.
  - Removed endsWith function and rewrote functions to use isSuffixOf instead.
  - Added >>~ combinator.
  - Rewrote stripTrailingNewlines, removeLeadingSpaces.
+ Moved Text.Pandoc.Entities -> Text.Pandoc.CharacterReferences.
  - Removed unneeded functions charToEntity, charToNumericalEntity.
  - Renamed functions using proper terminology (character references,
    not entities).  decodeEntities -> decodeCharacterReferences,
    characterEntity -> characterReference.
  - Moved escapeStringToXML to Docbook writer, which is the only thing
    that uses it.
  - Removed old entity parser in HTML and Markdown readers; replaced with
    new charRef parser in Text.Pandoc.Shared.
+ Fixed accent bug in Text.Pandoc.Readers.LaTeX:  \^{} now correctly
  parses as a '^' character.
+ Text.Pandoc.ASCIIMathML is no longer an exported module.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@835 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-15 06:00:58 +00:00
fiddlosopher
e814a3f6d2 Major change in the way ordered lists are handled:
+ The changes are documented in README, under Lists.
+ The OrderedList block element now stores information
  about list number style, list number delimiter, and
  starting number.
+ The readers parse this information, when possible.
+ The writers use this information to style ordered
  lists.
+ Test suites have been changed accordingly.

Motivation:  It's often useful to start lists with
numbers other than 1, and to have control over the
style of the list.

Added to Text.Pandoc.Shared:
+ camelCaseToHyphenated
+ toRomanNumeral
+ anyOrderedListMarker
+ orderedListMarker
+ orderedListMarkers

Added to Text.Pandoc.ParserCombinators:
+ charsInBalanced'
+ withHorizDisplacement
+ romanNumeral

RST writer:
+ Force blank line before lists, so that sublists will be handled
  correctly.

LaTeX reader:
+ Fixed bug in parsing of footnotes containing multiple paragraphs,
  introduced by use of charsInBalanced.  Fix: use charsInBalanced'
  instead.

LaTeX header:
+ use mathletters option in ucs package, so that basic unicode Greek
  letters will work properly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@834 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-08 02:43:15 +00:00
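e814a3f6d2 adds toRomanNumeral to Text.Pandoc.Shared, presumably so that roman-numbered list markers can be rendered. A minimal Python sketch of such a conversion follows; it is not pandoc's Haskell code, the name `to_roman_numeral` is hypothetical, and the supported range of 1 to 3999 is an assumption of this sketch.

```python
def to_roman_numeral(n: int) -> str:
    """Convert an integer in 1..3999 to an uppercase roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("supported range is 1..3999 (assumption of this sketch)")
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        count, n = divmod(n, value)   # how many times this symbol fits
        out.append(symbol * count)
    return "".join(out)
```

A lowercase list style could reuse the same function and lowercase the result.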
fiddlosopher
22a6538557 Added parsing for \url to LaTeX reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@833 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-03 22:44:47 +00:00
fiddlosopher
271c3f987e Use \url{} for autolinks in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@832 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-03 22:14:32 +00:00
fiddlosopher
44b136740b Changes to Markdown reader:
+ added try to def of indentSpaces
+ in def of 'reference', check to make sure it's not a note reference
  first.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@827 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 19:14:50 +00:00
fiddlosopher
5e13f8c320 Removed examplep specific stuff in LaTeX reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@826 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 18:23:13 +00:00
fiddlosopher
1be1f77569 LaTeX writer:
+ No longer using examplep (too many quirks, too hard to install)
+ Instead, using deVerb function for environments that don't support
  \verb
+ And fancyvrb for footnotes and verbatim environments in footnotes.
+ Add fancyvrb to header if Code inline occurs in a footnote.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@825 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 18:14:40 +00:00
fiddlosopher
7c8dcc3db6 Cleaned up and fixed autolinks in RST. All that's needed
is a bare email address or URL.  This is now handled with
a separate matching clause in the definition of inlineToRST,
rather than with conditionals.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@821 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 15:33:58 +00:00
fiddlosopher
db0757d65b Don't put autolinks in typewriter font in ConTeXt, since
ConTeXt has its own way of printing links.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@820 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 15:33:04 +00:00
fiddlosopher
f273965e16 Fixed problems with obfuscateLink introduced by last round
of changes.  Changed type so that text parameter is String,
not HTML, which allows easier testing for autolinks.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@819 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 06:38:49 +00:00
fiddlosopher
d3404581d4 Changes in LaTeX reader to accommodate Pandoc's own use of
examplep.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@818 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 05:45:27 +00:00
fiddlosopher
d65f01f467 Man page writer: modified treatment of autolinks,
in accord with recent change from Str to Code.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@816 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 04:14:32 +00:00
fiddlosopher
94ed30cf15 LaTeX writer: include fancyvrb and \VerbatimFootnotes
line in header only if absolutely needed -- that is, only
if there is actually a code block in a footnote.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@815 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 03:58:09 +00:00
fiddlosopher
465c0849ff Simplified HTML writer's treatment of autolinks.
There are now a few different cases for Link, and
less conditional logic needed.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@813 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 02:43:04 +00:00
fiddlosopher
9939b0f07e Make URLs and emails in autolinks appear as Code.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@810 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 02:12:40 +00:00
fiddlosopher
cf87eb854d Fixed a bug in Docbook writer: email links with text were being
incorrectly treated as autolinks.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@809 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 02:11:04 +00:00
fiddlosopher
d488dd0f66 Reinstated dependence on fancyvrb. It is compatible with examplep.
fancyvrb is needed for verbatim environments in footnotes.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@808 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 01:40:48 +00:00
fiddlosopher
b29f221cba Changed LaTeX writer to use the examplep package instead
of fancyvrb. examplep allows verbatim text in places where
fancyvrb does not, e.g. definition list terms, and provides
for line-breaking of verbatim text.
+ examplep code put in LaTeX header instead of being dynamically
  included, since it is frequently used, and people may want to
  customize the options.
+ documented dependency on examplep
+ added texlive-latex-extra as a "Suggested" package in debian/control
+ examplep's \Q{} is now used instead of \verb:  note that
  \Q requires backslash-escaping symbols in its scope.
+ modified README so that the verbatim sections will look good at
  shorter line lengths.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@807 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 01:10:04 +00:00
fiddlosopher
4283ce3662 Use ` as default character for \verb in LaTeX output.
If ` is in the content to be escaped, another symbol
will be used as before.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@806 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-27 22:52:11 +00:00
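Illustrative sketch of the delimiter choice described in 4283ce3662, in Python rather than pandoc's Haskell. `latex_verb` and the fallback characters in `candidates` are hypothetical; the commit only says that a backtick is the default and that another symbol is used when the content contains one.

```python
def latex_verb(code: str, candidates: str = "`!|@#+") -> str:
    """Wrap code in \\verb using the first delimiter that does not occur in it."""
    for delim in candidates:
        if delim not in code:
            return "\\verb" + delim + code + delim
    raise ValueError("no usable \\verb delimiter among the candidates")
```

Putting the backtick first in the candidate list is all that is needed to make it the default.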
fiddlosopher
1f9d2f8fe7 Include empty \author{} in LaTeX preamble if no
author specified; otherwise LaTeX gives an error.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@803 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-26 03:49:21 +00:00
fiddlosopher
453dc53457 Fixes in LaTeX writer:
+ put \VerbatimFootnotes right before \title block, to avoid
  bad interactions.
+ added deVerb in description list.
+ removed \texttt{} from deVerb, because it cannot go in description
  lists.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@802 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-26 03:15:35 +00:00
fiddlosopher
f2e21a8476 Changed how ASCIIMathML is handled:
+ -m|--asciimathml option now takes an *optional* argument,
  the URL to an asciiMathML.js script.  This is much better
  in situations where multiple files with math must be served,
  as the script can be cached.
+ If the argument is provided, a link is inserted; otherwise,
  the whole script is inserted as before.
+ Nothing is inserted unless there is inline LaTeX.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@799 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-26 02:19:20 +00:00
fiddlosopher
6c8bb8e9a3 Fixed bug in TOC generation in HTML writer (regression,
introduced by the revision in the WriterState type).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@793 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-24 00:11:34 +00:00
fiddlosopher
e7a49d7c12 LaTeX writer: Make sure \VerbatimFootnotes goes after the
preamble; otherwise it has bad interaction effects
with the other stuff in the header.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@791 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 23:14:26 +00:00
fiddlosopher
d8762eb436 In HTML reader, filter Nulls in lists of blocks. (These can
be caused by raw HTML when the parse-raw option isn't selected.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@787 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 02:50:29 +00:00
fiddlosopher
9f871ab1ec Fixed bug in spanStrikeout: case was not exhaustive.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@786 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 02:28:15 +00:00
fiddlosopher
9a410e1635 README: Removed the statement that the RST reader doesn't parse
definition lists.
HTML reader:  Added failIfStrict to the definitionList parser, so
definition lists will be passed through as raw HTML if --strict
specified.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@783 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 01:41:37 +00:00
fiddlosopher
6a6b1a5842 Added support for definition lists to RST reader.
Added a relevant test to the test suite.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@782 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 01:35:03 +00:00
fiddlosopher
f69d9efc83 Added definition list support to HTML reader.
Added a test for definition lists to the html-reader test suite.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@781 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 01:21:21 +00:00
fiddlosopher
1296273c85 Added support for definition lists to LaTeX reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@780 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 00:19:37 +00:00
fiddlosopher
caef362065 Renamed parseFromStr -> parseFromString.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@779 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-23 00:19:00 +00:00
fiddlosopher
4beb0b9130 Superscript and Subscript support for RST reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@777 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 21:26:22 +00:00
fiddlosopher
c1f397622d Added a "try" to the end parser in enclosed (Text.Pandoc.ParserCombinators).
This makes errors in its use less likely. Removed some now-unneeded try's
in calling code.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@776 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 21:01:17 +00:00
fiddlosopher
3ca9932ca6 Added subscript and superscript support to LaTeX reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@775 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 20:30:07 +00:00
fiddlosopher
bb5ac55f67 Man writer: Use ~ and ^ for subscripts and superscripts.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@770 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 20:13:43 +00:00
fiddlosopher
7006d9a044 HTML writer: Use a record for state, instead of a tuple, for
easy extensibility.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@769 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 20:00:49 +00:00
fiddlosopher
b4d289c65e HTML writer: include css for .strikethrough only if strikethrough
is actually used in the document.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@768 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 19:32:39 +00:00
fiddlosopher
9ddd464a7e Added ~ to the list of characters the markdown
writer should backslash-escape.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@765 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 18:33:41 +00:00
fiddlosopher
4905ebadb3 LaTeX reader: Added clauses for tilde and caret.
Tilde is \ensuremath{\sim}, and caret is \^{},
not \^ as before.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@764 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 18:26:10 +00:00
fiddlosopher
b8e1e53053 Cleaned up character escaping in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@763 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 18:24:34 +00:00
fiddlosopher
b19c36970e Removed an extra occurrence of escapedChar in definition
of inline.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@762 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 17:51:15 +00:00
fiddlosopher
9f38f9d039 Man writer:
+ Make sure to include "" if no section is specified
  in a man page TH line.
+ Updated man writer tests.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@758 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 16:17:20 +00:00
fiddlosopher
062cdfe7de Changed text to char for one-character strings
in RST, Man, and Docbook writers.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@757 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 16:10:48 +00:00
fiddlosopher
2b3c2d43ef Markdown writer: Substituted char for text for single characters.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@756 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 16:08:19 +00:00
fiddlosopher
6bb6dd2bfd + Added support for superscript, subscript, and
strikeout to all writers. (Thanks to Bradley Sif
  for the patches for strikeout, here slightly modified.)
+ Refactored character escaping using the new functions
  escapeStringUsing and backslashEscapes.
+ Added state to LaTeX writer, which now keeps track of what
  packages need to be included in the preamble, based on the
  content of the document. (Thus, e.g., ulem is only required
  if you use strikeout.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@755 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 16:05:38 +00:00
fiddlosopher
d03ec5a4a2 Added support for strikeout (\sout) to latex
reader.  (Thanks to Bradley Sif for the patch.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@754 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 05:14:43 +00:00
fiddlosopher
a2194f23db Added support for Strikeout, Superscript, and Subscript to
HTML reader.  Thanks to Bradley Sif for the patch for
Strikeout (Issue #18).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@753 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 22:54:40 +00:00
fiddlosopher
44b11214ba + Added support for Strikeout, Superscript, and
Subscript in markdown reader.
+ Also replaced constants like emphStart with literals. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@752 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 22:52:07 +00:00
fiddlosopher
dc60aa3aea + Added Strikeout support to Markdown writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@751 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 22:07:10 +00:00
fiddlosopher
9b664073d5 Moved failIfStrict from Markdown reader to
Text.Pandoc.Shared.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@750 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 21:55:19 +00:00
fiddlosopher
f0fb4c496b Changes to functions for character escaping:
+ Removed escapeCharAsString
+ Added escapeStringUsing
+ Changed backslashEscape to backslashEscapes, which
  now returns an association list of escapes to be
  passed to escapeStringUsing


git-svn-id: https://pandoc.googlecode.com/svn/trunk@748 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 21:34:13 +00:00
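Illustrative sketch of the design described in f0fb4c496b, in Python rather than pandoc's Haskell: one function builds an association list of escapes, another applies any such list to a string. The Python names and signatures are hypothetical renderings of backslashEscapes and escapeStringUsing, not pandoc's actual types.

```python
from typing import List, Tuple

Escapes = List[Tuple[str, str]]   # association list: character -> replacement


def backslash_escapes(chars: str) -> Escapes:
    """Build an association list that backslash-escapes each given character."""
    return [(c, "\\" + c) for c in chars]


def escape_string_using(escapes: Escapes, s: str) -> str:
    """Replace every character that has an entry in the association list."""
    table = {}
    for key, replacement in escapes:
        table.setdefault(key, replacement)   # first entry wins, as in a list lookup
    return "".join(table.get(ch, ch) for ch in s)
```

Separating the table from the traversal lets each writer supply its own escapes, for example `backslash_escapes("*_") + [("&", "&amp;")]`, without duplicating the loop.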
fiddlosopher
2f7a38e1ab Changed system for indicating man page title, section,
header and footer.  Documented in README.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@745 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 20:30:40 +00:00
fiddlosopher
ec0445de0a Added Strikeout, Superscript, and Subscript to
refsMatch function in Pandoc.Shared.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@742 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 19:09:04 +00:00
fiddlosopher
a009a4b387 Added Strikeout, Superscript, and Subscript as
Inline elements (Pandoc.Definition).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@741 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-21 19:07:15 +00:00
fiddlosopher
e02fb21452 Refactored character escaping in Text.Pandoc.Writers.Markdown using
escapeCharAsString.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@739 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-19 08:09:17 +00:00
fiddlosopher
7d2c9c6fe6 Added escapeCharAsString to Text.Pandoc.Shared.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@738 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-19 08:08:51 +00:00
fiddlosopher
48d67d0d1f Simplified inlinesInBalanced, using lookAhead.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@732 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-16 07:22:17 +00:00
fiddlosopher
7e1370aa87 Markdown reader: Added inlinesInBalanced parser combinator to
unify treatment of embedded brackets in links and inline footnotes.
Note that the solution adopted here causes one of John Gruber's
markdown tests to fail:
  [with_underscore](/url/with_underscore)
Here the whole phrase "underscore](/url/with" is treated as
emphasized. The previous version of the markdown reader handled
this the way Gruber's script handles it, but it ran into trouble
on the following:
  [link with verbatim `]`](/url)
where the inner ] was treated as the end of the reference link
label. I don't see any good way to handle both cases in the framework
of pandoc, so I choose to require an escape in the first example:
  [with\_underscore](/url/with_underscore)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@729 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 23:53:22 +00:00
fiddlosopher
8e71c4c388 Added charsInBalanced parser combinator to Text.Pandoc.ParserCombinators.
This is not currently used, but should be useful in parsing strings
containing balanced pairs of brackets or parentheses.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@728 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 23:48:17 +00:00
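
As a rough illustration of what a balanced-bracket combinator such as charsInBalanced (commit 8e71c4c388) does, here is a small Python sketch. It is not the Parsec implementation; the function name, argument order, and return shape are assumptions made for the example, and the opening and closing characters are assumed to differ.

```python
def chars_in_balanced(open_ch, close_ch, text):
    """Return (inner, rest) if `text` starts with a balanced bracketed span, else None."""
    if not text.startswith(open_ch):
        return None
    depth = 0
    for i, ch in enumerate(text):
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                # Drop the outer brackets; keep any nested ones.
                return text[1:i], text[i + 1:]
    return None  # input ended before the brackets balanced
```

Counting nesting depth is what lets an inner `]` stay inside the span instead of ending it early.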
fiddlosopher
2f928d4c9d ConTeXt writer changes:
+ Use defined blockquote environment for block quotes (smaller
  font, no indent, narrower text)
+ Changed default font to 12pt
+ Changed default page layout


git-svn-id: https://pandoc.googlecode.com/svn/trunk@725 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 15:08:27 +00:00
fiddlosopher
465b2d791e Merged branches/context: addition of a ConTeXt writer
<http://www.pragma-ade.nl/>.
+ Text.Pandoc.Writers.ConTeXt added.
+ Text.Pandoc modified to export the basic ConTeXt writer.
+ Main.hs modified to recognize 'context' as a writer.
+ ConTeXtHeader added to headers
+ DefaultHeaders.hs template modified to include ConTeXt header
+ Tests added (writer.context, tables.context), and runtests.pl
  modified to run them
+ pandoc.cabal updated to include Text.Pandoc.Writers.ConTeXt. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@716 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 02:56:34 +00:00
fiddlosopher
78aeebc143 Removed an unused function in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@714 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 02:48:21 +00:00
fiddlosopher
8ce4b2fb62 Simplified special character escaping code in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@705 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-14 17:59:06 +00:00
fiddlosopher
ab74f97422 Small comment fix in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@704 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-14 17:28:26 +00:00
fiddlosopher
a786f43e8c Change to footnotes in HTML writer: Instead of putting the footnote
backlink on a line by itself, after the content of the note,
we now put it at the end of the last paragraph of the footnote.
This saves space and looks better.  More specifically:
+ If the last block of the note is a Para or Plain block, the
  backlink is put at the end of that block's contents.
+ Otherwise, the backlink is put in a separate Plain block by
  itself, after the footnote's contents. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@697 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-14 03:37:41 +00:00
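
The backlink rule from commit a786f43e8c can be sketched as follows. This is a Python approximation, not the HTML writer's code: blocks are modelled as hypothetical (kind, contents) pairs, and `add_backlink` is a name invented for the example.

```python
def add_backlink(blocks, backlink):
    """Attach `backlink` to a footnote's blocks, given as (kind, contents) pairs."""
    if blocks and blocks[-1][0] in ("Para", "Plain"):
        # Last block is a paragraph: put the backlink at the end of its contents.
        kind, contents = blocks[-1]
        return blocks[:-1] + [(kind, contents + [backlink])]
    # Otherwise the backlink gets a Plain block of its own after the note.
    return blocks + [("Plain", [backlink])]
```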
fiddlosopher
30375bb847 Changed encodeUTF8 to toUTF8, decodeUTF8 to fromUTF8,
for clarity.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@692 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-13 06:34:33 +00:00
fiddlosopher
bd5a5d48e7 Cleaned up Text.Pandoc. Added lots of documentation,
including an example program.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@691 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-12 08:32:57 +00:00
fiddlosopher
e962c76f05 HTML reader: haddock comment fix.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@690 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-12 08:31:39 +00:00
fiddlosopher
a9bd39b10e Export NoteTable in Text.Pandoc.Shared.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@688 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-12 08:30:17 +00:00
fiddlosopher
0ec7722363 Pandoc.hs:
+ added haddock documentation
+ added export of prettyPandoc and writeMan


git-svn-id: https://pandoc.googlecode.com/svn/trunk@686 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-12 07:07:51 +00:00
fiddlosopher
4f78aa9eca Change to defaultWriterOptions: standalone is False by default.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@685 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-12 07:06:25 +00:00
fiddlosopher
1780228b7c Markdown reader: Parse bracketed text in inline footnotes. Previously,
"test^[my [note] contains brackets]" would yield a note with contents
"my [note". Now it yields a note with contents "my [note] contains
brackets".  New function: inlinesInBrackets.

Resolves Issue 14. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@665 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 06:51:06 +00:00
fiddlosopher
d58dca502e RST reader: Allow hyperlink target URIs to be split over multiple
lines, and to start on the line after the reference.
Resolves Issue 7. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@664 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 06:23:59 +00:00
fiddlosopher
655363da51 Moved Text.ParserCombinators.Pandoc ->
Text.Pandoc.ParserCombinators.  This way, all the
Pandoc modules are in one place.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@663 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 03:39:25 +00:00
fiddlosopher
622d4e5223 Added type declaration for hsepBlocks in
Text.Pandoc.Blocks.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@662 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 03:38:03 +00:00
fiddlosopher
69072efc00 Text.Pandoc.Blocks:
+ Fixed a bug in hPad, which previously padded the rightmost
  cell.  This is fixed by introducing a case for a singleton
  list.
+ Fixed Markdown writer so that the space after a table is not
  nested two spaces.
+ Adjusted Markdown table tests accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@660 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 01:36:53 +00:00
fiddlosopher
0ceb538ecd Markdown writer:
Fixed a small problem with lengths of dashed lines in tables.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@659 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 01:24:49 +00:00
fiddlosopher
ad7bb70cce Added --toc support to Markdown writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@658 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 01:14:35 +00:00
fiddlosopher
56efd6176f Added support for --toc to RTF writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@657 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 00:51:07 +00:00
fiddlosopher
489a2bb1d9 Moved isHeaderBlock from Text.Pandoc.Writers.HTML
to Text.Pandoc.Shared.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@656 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 00:50:14 +00:00
fiddlosopher
2d4a22d0be Regularized the scheme for unique header identifiers in HTML writer:
- punctuation is now all removed (except -)
- spaces are turned into -
- all lowercase
This scheme should be fairly predictable.
Updated tests accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@655 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 17:33:03 +00:00
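
The identifier scheme from commit 2d4a22d0be might look like this in Python. It is only a sketch: the message does not say exactly which characters count as punctuation, so the example assumes that anything other than letters, digits, hyphens, and spaces is dropped, and `header_identifier` is a hypothetical name.

```python
def header_identifier(title):
    """Lowercase the title, drop punctuation other than '-', and turn spaces into '-'."""
    kept = [ch for ch in title.lower() if ch.isalnum() or ch in "- "]
    return "".join(kept).replace(" ", "-")
```

Under these assumptions a header such as "Shell scripts" yields `shell-scripts`, which is easy to predict when writing links by hand.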
fiddlosopher
7f5638853b HTML writer changes:
+ change in scheme for construction of unique identifiers for headers:
  - all lowercase
  - spaces turn into -, not _
+ TOC items now have their own identifiers, starting with TOC-
+ when there is a TOC, headers link back to the corresponding TOC items,
  rather than the top of the TOC.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@653 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 17:09:17 +00:00
fiddlosopher
c3fb1dd4ea Added --toc support to RST writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@652 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 16:48:54 +00:00
fiddlosopher
dfcae807b0 Man writer: Don't print .\" t at beginning unless we're
in --standalone mode.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@650 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 03:54:07 +00:00
fiddlosopher
e58a55eb41 LaTeX writer:
+ Leave extra blank line after \maketitle
+ Insert \tableofcontents if --toc option was selected

Test suite:
+ extra blank line after \maketitle in writer.latex


git-svn-id: https://pandoc.googlecode.com/svn/trunk@649 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 03:48:07 +00:00
fiddlosopher
608b22e9dd HTML writer: Slight change in code for generating unique
identifiers.  The numbers used after duplicate identifiers
are now separated from them by a hyphen:  Duplicate-1
rather than Duplicate1.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@648 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 03:40:22 +00:00
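
A possible reading of the duplicate handling in commit 608b22e9dd, sketched in Python. The function name echoes uniqueIdentifiers from the log, but the details are assumptions: the first occurrence is left unchanged and later ones are numbered from 1.

```python
def unique_identifiers(idents):
    """Append -1, -2, ... to repeated identifiers, leaving first occurrences unchanged."""
    seen = {}
    result = []
    for ident in idents:
        count = seen.get(ident, 0)
        result.append(ident if count == 0 else f"{ident}-{count}")
        seen[ident] = count + 1
    return result
```

This sketch does not guard against a header whose own identifier already looks like a numbered duplicate; a complete implementation would need to check for that collision.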
fiddlosopher
e252fa300a Fixed bug in Notes ($$ instead of <>), which caused
note blocks to be indented.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@646 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-08 03:38:54 +00:00
fiddlosopher
a414ec9d63 Adjusted copyright notices to 2006-7; use
real email address instead of lamely attempting
to obfuscate.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@640 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 22:51:55 +00:00
fiddlosopher
956deeda4b Haddock documentation for Text.Pandoc.Blocks.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@638 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 22:37:11 +00:00
fiddlosopher
3e6184763e Man writer: Use integral n measures instead of fractional i
measures.  Calculate on basis of a 70 character line, since
the default is 78 but the table will appear indented 8 spaces
in standard man output.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@637 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 22:18:46 +00:00
fiddlosopher
6f4017b8ea Put table of contents in its own div (id="toc").
git-svn-id: https://pandoc.googlecode.com/svn/trunk@635 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 21:49:46 +00:00
fiddlosopher
54a0e4c3b2 HTML writer modifications:
+ Added code to HTML Writer to generate a table of contents if the
  writerTableOfContents option is specified.  This is an unordered list
  with links to the headers.  It is constructed hierarchically, based on
  the order of the headers and their levels.
+ If a TOC is used, the headers become links back to the TOC.
+ Removed Toc from WriterState; instead, the TOC is generated at the top
  level, by the function tableOfContents.
+ Fixed a bug in uniqueIdentifiers which prevented it from handling more than
  one duplicate.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@634 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 19:08:11 +00:00
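
The hierarchical construction mentioned in commit 54a0e4c3b2 (a table of contents built from the order of the headers and their levels) can be sketched in Python. This is not the writer's code: headers are assumed to arrive as (level, title) pairs with levels starting at 1, and the output shape of (title, children) pairs is chosen for the example.

```python
def hierarchicalize(headers):
    """Nest a flat list of (level, title) pairs into (title, children) trees."""
    root = []
    stack = [(0, root)]  # (level, children list) for each currently open section
    for level, title in headers:
        # Close every open section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        node = (title, [])
        stack[-1][1].append(node)
        stack.append((level, node[1]))
    return root
```

Each nested list then maps directly onto a nested unordered list in the output.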
fiddlosopher
f2b16c2065 + Introduced writerIgnoreNotes option in WriterOptions. This is needed
  for processing header blocks for a table of contents, since notes on
  headers should not appear in the TOC.  Set default in Main.hs.
+ Moved Element, headerAtLeast, and hierarchicalize from Docbook writer
  to Text.Pandoc.Shared.  This is because HTML writer now uses these in
  constructing a table of contents.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@633 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 19:04:10 +00:00
fiddlosopher
fdf31cd23d Added writerTableOfContents to WriterOptions, and added a
--table-of-contents/--toc command-line option to Main.hs.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@632 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 05:43:23 +00:00
fiddlosopher
5a0ce1bcac Changes to HTML writer to incorporate automatic identifiers for
headers and table of contents:
+ WriterState now includes a list of header identifiers and a table
  of contents in addition to notes.
+ The function uniqueIdentifiers creates a list of unique identifiers
  from a list of inline lists (e.g. headers).
+ This list is part of WriterState and gets consumed by blockToHtml
  each time a header is encountered.
+ Headers are now printed with unique identifiers based on their names,
  e.g. Shell_scripts for "# Shell scripts".  Fancy stuff like links,
  italics, etc. gets ignored.  A numerical index is added to the end if
  there is already an identifier by the same name, e.g. "Shell_scripts1".
+ Provision has been made for a table-of-contents block element, but this
  has not yet been added.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@630 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 03:52:10 +00:00
fiddlosopher
0a250edfde Minor comment change.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@629 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-07 03:05:06 +00:00
fiddlosopher
ad2c642713 Pandoc.hs: Export all definitions in Text.Pandoc.Definition,
rather than exporting the module.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@628 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-06 16:37:46 +00:00
fiddlosopher
f9c988e703 Fixed bug in Markdown reader: links in footnotes were not
being processed.  Solution:  three-stage parse.  First, get
all the reference keys and add information to state.  Next,
get all the notes and add information to state.  (Reference
keys may be needed at this stage.)  Finally, parse everything
else.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@625 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-06 06:46:31 +00:00
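
The three-stage parse described in commit f9c988e703 can be illustrated with a deliberately simplified Python sketch. It is line-based and regex-driven, which is not how the Markdown reader works; the patterns, the function name, and the HTML link output are all assumptions chosen to show why the keys must be collected before the notes.

```python
import re

KEY_LINE = re.compile(r"\[([^\]^][^\]]*)\]:\s+(\S+)\s*$")
NOTE_LINE = re.compile(r"\[\^([^\]]+)\]:\s+(.*)$")
REF_LINK = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")


def three_stage_parse(lines):
    """Return (body, notes) with reference links resolved in both."""
    # Stage 1: collect reference keys and set those lines aside.
    keys = {}
    remaining = []
    for line in lines:
        match = KEY_LINE.match(line)
        if match:
            keys[match.group(1)] = match.group(2)
        else:
            remaining.append(line)

    def resolve(text):
        # Replace [label][key] with a link when the key is known.
        def link(match):
            url = keys.get(match.group(2))
            if url is None:
                return match.group(0)
            return f'<a href="{url}">{match.group(1)}</a>'
        return REF_LINK.sub(link, text)

    # Stage 2: collect notes; the keys from stage 1 are already available.
    notes = {}
    body = []
    for line in remaining:
        match = NOTE_LINE.match(line)
        if match:
            notes[match.group(1)] = resolve(match.group(2))
        else:
            body.append(line)

    # Stage 3: parse everything else.
    return [resolve(line) for line in body], notes
```

Because the keys are known before any note is read, a link inside a footnote resolves even when its key is defined at the very end of the document.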
fiddlosopher
1ca8a731bc Added table support to RST writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@624 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-05 06:14:48 +00:00
fiddlosopher
09f0247fd8 Added table support to markdown writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@623 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-04 22:21:52 +00:00
fiddlosopher
8d776e1e42 Improvements/bug fixes to Text.Pandoc.Blocks
library.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@622 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-04 22:21:08 +00:00
fiddlosopher
1a8b32afd5 Added Text.Pandoc.Blocks module for prettyprinting of
text tables.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@620 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-04 18:53:12 +00:00
fiddlosopher
4fe56a8d18 Man writer:
- Added scheme for specifying manual section and additional
  headers:
  % PROGNAM | 1 | User Manual | Version 4.0
- Modified man page sources to include section 1


git-svn-id: https://pandoc.googlecode.com/svn/trunk@619 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-04 15:53:49 +00:00
fiddlosopher
d5c47c33ca Added table support to man writer (using the tbl preprocessor).
The writer state now includes a list of "preprocessor" codes.
If the document contains a table, "t" (for "tbl") is added to the list.
If this list is nonempty, the man page starts with
.\" <list>
which instructs man to run the file through the appropriate
preprocessor before processing with groff.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@618 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-04 03:14:43 +00:00
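
The preprocessor line from commit d5c47c33ca could be produced along these lines. This is a Python sketch with a hypothetical function name; the message only says the line has the form `.\" <list>`, so joining the codes without separators is an assumption.

```python
def man_preamble(preprocessors):
    """Return the first line of a man page, or '' if no preprocessor is needed."""
    if not preprocessors:
        return ""
    return '.\\" ' + "".join(preprocessors) + "\n"
```

A document containing a table would call this with `["t"]`, and one without tables would emit no preamble at all.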
fiddlosopher
bd5f3876a4 Man writer: don't change - to \- (minus sign).
Leave them as hyphens.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@614 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-02 02:38:14 +00:00
fiddlosopher
2cb22a1f8a Man writer: better output for line break:
.PD 0     # set interparagraph space to 0
.P        # new paragraph
.PD       # reset interparagraph space to default.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@613 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-02 02:31:36 +00:00
fiddlosopher
f5a3d44494 Minor changes in Man writer:
- escape ' as \[aq], because ' can trigger groff commands.
- remove unneeded line breaks.
- use CR font in code blocks.
- use .P 0 for line breaks.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@612 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-02 02:23:05 +00:00
fiddlosopher
e6f67fcc57 Modified escaping in Man writer. Also changed
format of footnote references and authors list.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@608 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-01 23:23:36 +00:00
fiddlosopher
c726e83ce9 Added groff man writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@606 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-06-30 22:42:32 +00:00
fiddlosopher
61024d93ed Require blankspace (but not multiple lines) between URL and
title in links and reference keys.  (Markdown reader.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@599 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-10 22:06:13 +00:00
fiddlosopher
f2c1777598 Fixed bug with indented blocks occurring in definition lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@598 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-10 22:05:25 +00:00
fiddlosopher
f9731108e8 Improved prettyprinting of definition lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@596 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-10 18:50:12 +00:00
fiddlosopher
5454c841a9 Added support for definition lists in Docbook writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@595 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-09 00:47:29 +00:00
fiddlosopher
71cf0a11b3 + Use new alignment parameter in title/author/date,
  instead of hardcoded \qc.
+ Adjusted test suite to account for changes in RTF writer.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@594 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-09 00:44:56 +00:00
fiddlosopher
c6323b2c78 Changes to RTF writer:
+ Added support for definition lists.
+ Removed extra '\cell' in table output, which caused
  a blank column to the left.
+ Added support for captions in tables.
+ Added an 'alignment' parameter to RTF block writers.
+ Added support for column alignments in tables.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@593 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-03 14:46:37 +00:00
fiddlosopher
09fa7e6f63 Added support for definition lists to RST writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@592 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-03 14:44:28 +00:00
fiddlosopher
292d2bd38d Added support for definition lists to markdown
writer.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@591 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-03 14:44:04 +00:00
fiddlosopher
cf081435ff Changed definition list syntax in markdown reader and simplified
the parsing code. A colon is now required before every block in a
definition. This fixes a problem with the old syntax, in which the last
block in the following was ambiguous between a regular paragraph in the
definition and a code block following the definition list:

term
:   definition

    is this code or more definition?



git-svn-id: https://pandoc.googlecode.com/svn/trunk@589 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-03 14:42:40 +00:00
fiddlosopher
485fa81559 Resolved issue #10: instead of adding "\n\n" to the
end of strings in Main, do it in readMarkdown and readRST.
(Note: the point of this is to ensure that a block at the
end of the file gets treated as if it has blank space after
it, which is generally what is wanted.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@588 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-22 04:38:05 +00:00
fiddlosopher
726685af1b Fixed bug in anyLine parser. Previously anyLine would parse an
empty string "".  But it should fail on an empty string, or we get an
error from its use inside "many" combinators.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@587 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-22 04:19:34 +00:00
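
The problem fixed in commit 726685af1b is easy to reproduce outside Parsec. The Python sketch below is only an analogy: `any_line` and `many` are hand-written stand-ins that return None on failure, and raising ValueError is this example's way of modelling the error Parsec reports when "many" is applied to a parser that accepts an empty string.

```python
def any_line(text):
    """Parse one line; fail (return None) on empty input instead of succeeding."""
    if text == "":
        return None
    line, _, rest = text.partition("\n")
    return line, rest


def many(parser, text):
    """Apply `parser` repeatedly until it fails, collecting the results."""
    results = []
    while True:
        parsed = parser(text)
        if parsed is None:
            return results, text
        value, rest = parsed
        if rest == text:
            # Without this check the loop would never terminate.
            raise ValueError("parser succeeded without consuming input")
        results.append(value)
        text = rest
```

If `any_line` returned `("", "")` for empty input, every call at end of input would succeed again without making progress, which is why it has to fail there.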
fiddlosopher
7742301c09 Support for definition lists in LaTeX writer.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@586 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-21 21:38:19 +00:00
fiddlosopher
ffd7248af8 Fixed export declarations; removed unneeded import of
Pandoc.Shared.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@585 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-15 16:22:29 +00:00
fiddlosopher
e15fd3bf86 Moved escape and nullBlock parsers from ParserCombinators/Pandoc
to Pandoc/Shared.  Reason:  ParserCombinators/Pandoc is for
general-purpose parsers that don't require Pandoc.Definition.
Also removed some unnecessary imports from Pandoc/Shared.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@584 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-15 16:20:07 +00:00
fiddlosopher
944740ce5c Added Table to prettyBlock in Shared.hs.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@582 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-13 01:33:10 +00:00
fiddlosopher
92845a9834 Added Text.Pandoc module that exports basic readers, writers,
definitions, and utility functions.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@581 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-11 22:57:45 +00:00
fiddlosopher
23df0ed176 Extensive changes stemming from a rethinking of the Pandoc data
structure. Key and Note blocks have been removed. Link and image URLs
are now stored directly in Link and Image inlines, and note blocks
are stored in Note inlines. This requires changes in both parsers
and writers. Markdown and RST parsers need to extract data from key
and note blocks and insert them into the relevant inline elements.
Other parsers can be simplified, since there is no longer any need to
construct separate key and note blocks. Markdown, RST, and HTML writers
need to construct lists of notes; Markdown and RST writers need to
construct lists of link references (when the --reference-links option
is specified); and the RST writer needs to construct a list of image
substitution references. All writers have been rewritten to use the
State monad when state is required.  This rewrite yields a small speed
boost and considerably cleaner code. 

* Text/Pandoc/Definition.hs:
  + blocks:  removed Key and Note
  + inlines:  removed NoteRef, added Note
  + modified Target:  there is no longer a 'Ref' target; all targets
    are explicit URL, title pairs

* Text/Pandoc/Shared.hs:

  + Added 'Reference', 'isNoteBlock', 'isKeyBlock', 'isLineClump',
    used in some of the readers.
  + Removed 'generateReference', 'keyTable', 'replaceReferenceLinks',
    'replaceRefLinksBlockList', along with some auxiliary functions
    used only by them.  These are no longer needed, since
    reference links are resolved in the Markdown and RST readers.
  + Moved 'inTags', 'selfClosingTag', 'inTagsSimple', and 'inTagsIndented'
    to the Docbook writer, since that is now the only module that uses
    them.
  + Changed name of 'escapeSGMLString' to 'escapeStringForXML'
  + Added KeyTable and NoteTable types
  + Removed fields from ParserState:  'stateKeyBlocks', 'stateKeysUsed',
    'stateNoteBlocks', 'stateNoteIdentifiers', 'stateInlineLinks'. 
    Added 'stateKeys' and 'stateNotes'.
  + Added clause for Note to 'prettyBlock'.
  + Added 'writerNotes', 'writerReferenceLinks' fields to WriterOptions.

* Text/Pandoc/Entities.hs: Renamed 'escapeSGMLChar' and
  'escapeSGMLString' to 'escapeCharForXML' and 'escapeStringForXML'

* Text/ParserCombinators/Pandoc.hs: Added lineClump parser: parses a raw
  line block up to and including following blank lines.

* Main.hs:  Replaced --inline-links with --reference-links.

* README: 
  + Documented --reference-links and removed description of --inline-links.
  + Added note that footnotes may occur anywhere in the document, but must
    be at the outer level, not embedded in block elements.
  
* man/man1/pandoc.1, man/man1/html2markdown.1: Removed --inline-links
  option, added --reference-links option

* Markdown and RST readers:
  + Rewrote to fit new Pandoc definition.  Since there are no longer
    Note or Key blocks, all note and key blocks are parsed on a first pass
    through the document.  Once tables of notes and keys have been constructed,
    the remaining parts of the document are reassembled and parsed.
  + Refactored link parsers.

* LaTeX and HTML readers: Rewrote to fit new Pandoc definition. Since
  there are no longer Note or Key blocks, notes and references can be
  parsed in a single pass through the document.

* RST, Markdown, and HTML writers: Rewrote using state monad and new
  Pandoc definition. State is used to hold lists of references and
  footnotes to be printed at the end of the document.

* RTF and LaTeX writers: Rewrote using new Pandoc definition. (Because
  of the different treatment of footnotes, the "notes" parameter is no
  longer needed in the block and inline conversion functions.)

* Docbook writer:
  + Moved the functions 'attributeList', 'inTags', 'selfClosingTag',
    'inTagsSimple', 'inTagsIndented' from Text/Pandoc/Shared, since
    they are now used only by the Docbook writer.
  + Rewrote using new Pandoc definition.  (Because of the different
    treatment of footnotes, the "notes" parameter is no longer needed
    in the block and inline conversion functions.)

* Updated test suite

* Throughout:  old haskell98 module names replaced by hierarchical module
  names, e.g. List by Data.List.

* debian/control: Include libghc6-xhtml-dev instead of libghc6-html-dev
  in "Build-Depends."

* cabalize: 
  + Remove haskell98 from BASE_DEPENDS (since now the new hierarchical
    module names are being used throughout)
  + Added mtl to BASE_DEPENDS (needed for state monad)
  + Removed html from GHC66_DEPENDS (not needed since xhtml is now used)



git-svn-id: https://pandoc.googlecode.com/svn/trunk@580 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-10 01:56:50 +00:00
fiddlosopher
74e7497226 Fixed bug in email obfuscation (issue #15). If the text to be obfuscated
contains an entity, this needs to be decoded before obfuscation.
Thanks to thsutton for the patch.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@579 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-04-08 21:04:47 +00:00
fiddlosopher
571c3b4173 Removed Blank block element as unnecessary.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@578 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-17 20:23:47 +00:00
fiddlosopher
9b3d5d88c2 Consolidated 'text', 'special', and 'inline' into 'inline'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@577 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-17 17:25:28 +00:00
fiddlosopher
cdf7c78b5f Added trys to two list start routines. Reason:
<|> only parses second parser when first hasn't consumed
input.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@576 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-16 06:05:43 +00:00
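
The reason given in commit cdf7c78b5f can be made concrete with a toy model of the two combinators. This Python sketch is not Parsec: results are plain tuples, `attempt` stands in for try, `choice` stands in for <|>, and all three names are invented for the example.

```python
def string(expected):
    """Match `expected` character by character; a partial match consumes input."""
    def parser(text):
        for i, ch in enumerate(expected):
            if text[i:i + 1] != ch:
                return ("fail", i > 0)  # second field: was any input consumed?
        return ("ok", expected, text[len(expected):])
    return parser


def attempt(p):
    """Like try: report a failure as if no input had been consumed."""
    def parser(text):
        result = p(text)
        return ("fail", False) if result[0] == "fail" else result
    return parser


def choice(p, q):
    """Like <|>: run q only if p failed without consuming input."""
    def parser(text):
        result = p(text)
        if result[0] == "fail" and not result[1]:
            return q(text)
        return result
    return parser
```

With input `*a`, `choice(string("* "), string("*a"))` fails because the first alternative consumed the `*` before failing, while wrapping the first alternative in `attempt` lets the second one run.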
fiddlosopher
0d2e5eab79 Added clauses for DefinitionList and Table to replaceReferenceLinks in
Text/Pandoc/Shared.hs. This ensures that reference-style links inside
tables and definition lists will be handled properly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@575 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-12 00:36:41 +00:00
fiddlosopher
d55346ca9a Simplified keyTable, using assumption that key blocks are not
inside other block elements (an assumption that the Markdown
reader uses in making its initial pass anyway).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@574 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-12 00:23:39 +00:00
fiddlosopher
7bf7aba7e7 Changes to Markdown reader relating to definition lists:
+ fixed bug in indentSpaces (which didn't properly handle
  cases with mixed spaces and tabs)
+ rewrote definition list code to conform to new syntax
+ include definition lists in list block
+ failIfStrict on definition lists


git-svn-id: https://pandoc.googlecode.com/svn/trunk@572 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-11 07:56:29 +00:00
fiddlosopher
aca046702e Added support for DefinitionList blocks to HTML writer.
Cleaned up bullet and ordered list code by using
ordList and unordList instead of raw olist and ulist.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@571 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-11 07:54:47 +00:00
fiddlosopher
2e794fdf37 Fixed bug in HTML email obfuscation using --strict mode.
The problem is that the "href" function escapes &, so
(href "&#108;") is 'href="&amp;#108;"'.  Fixed by using
primHtml for the whole link.  Resolves issue 9.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@569 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-11 00:19:15 +00:00
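
The double escaping behind commit 2e794fdf37 can be reproduced with Python's standard `html.escape`, used here as a stand-in for an attribute-escaping function like the one the message describes. The `obfuscate` helper is hypothetical and simply encodes each character as a decimal character reference.

```python
import html


def obfuscate(text):
    """Encode every character as a decimal character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = obfuscate("l")               # a character reference for "l"
double_escaped = html.escape(encoded)  # the ampersand gets escaped a second time
```

Once the ampersand has been escaped again, a browser shows the reference literally instead of decoding it, so already-encoded text has to be emitted without a further escaping pass.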
fiddlosopher
115cad8882 Changed syntax of definition lists in Markdown parser:
+ definition blocks must be indented throughout (not
  just in first line)
+ compact lists can be formed by leaving no blank line
  between a definition and the next term


git-svn-id: https://pandoc.googlecode.com/svn/trunk@568 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-10 20:45:19 +00:00
fiddlosopher
5ec31cc727 Added parser for definition lists, derived from reStructuredText
syntax:

term 1
    Definition 1

    Paragraph 2 of definition 1.

term 2
    There must be whitespace between entries.
    Any kind of block may serve as a definition,
    but the first line of each block must be indented.

terms can contain any *inline* elements
    If you want to be lazy, you can just
indent the first line of the definition block.



git-svn-id: https://pandoc.googlecode.com/svn/trunk@566 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-10 17:48:16 +00:00
fiddlosopher
0ce965f34c Modified prettyPandoc to handle DefinitionList elements.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@565 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-10 17:45:00 +00:00
fiddlosopher
68cd976838 Added definition for DefinitionList block element.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@564 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-10 17:44:27 +00:00
fiddlosopher
d5f9b1dbb4 Change in ordered lists in Markdown reader:
+ Lists may begin with lowercase letters only, and only 'a' through
  'n'. Otherwise first initials and page references (e.g., p. 400)
  are too easily parsed as lists.
+ Numbers beginning list items must end with '.' (not ')', which is
  now allowed only after letters).
NOTE:  This change may cause documents to be parsed differently.
Users should take care in upgrading.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@561 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-09 02:37:49 +00:00
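
One way to read the marker rules in commit d5f9b1dbb4 is as a single pattern. The Python sketch below is an interpretation, not the reader's parser: it assumes letters may be followed by either '.' or ')', that numbers take only '.', and that whitespace must follow the marker.

```python
import re

# Digits followed by '.', or a letter from a to n followed by '.' or ')'.
LIST_MARKER = re.compile(r"(?:[0-9]+\.|[a-n][.)])[ \t]")


def is_ordered_list_start(line):
    """True if `line` begins with an ordered-list marker under the assumed rules."""
    return LIST_MARKER.match(line) is not None
```

Under this reading, "p. 400" and "B. Russell" stay ordinary text, while "1) item" no longer starts a list.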
fiddlosopher
f1191f029a More smart quote adjustments:
+ remove support for all-caps contractions (too
  much potential for conflict with things like
  'M. Mitterrand')
+ add support for 'm as a contraction


git-svn-id: https://pandoc.googlecode.com/svn/trunk@560 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-07 20:53:37 +00:00
fiddlosopher
42ade05ebe Smart quote parsing in Markdown reader:
treat ' followed by ll, re, ve, then a
non-letter, as a contraction.  (e.g.
I've, you're, he'll)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@559 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-07 03:01:43 +00:00
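
The contraction rule in commit 42ade05ebe amounts to a short lookahead, sketched here in Python. The regular expression and the function name are assumptions, and "non-letter" is taken to mean anything other than an ASCII letter, including end of input.

```python
import re

# An apostrophe, then ll, re, or ve, not followed by another letter.
CONTRACTION = re.compile(r"'(?:ll|re|ve)(?![A-Za-z])")


def is_contraction(text, pos):
    """True if the apostrophe at `pos` begins a contraction such as 've."""
    return CONTRACTION.match(text, pos) is not None
```

The lookahead is what keeps an opening quote in a phrase like 'really' from being mistaken for a contraction.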
fiddlosopher
402016df28 Fixed bug in noscript part of email obfuscation:
& instead of &amp;


git-svn-id: https://pandoc.googlecode.com/svn/trunk@557 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-04 07:40:22 +00:00
fiddlosopher
21ba6b08f2 Made image parsing in HTML reader sensitive to the
--inline-links option.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@556 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-03 18:25:49 +00:00
fiddlosopher
31c030e3a5 Added --inline-links option to force links in HTML to be parsed
as inline links, rather than reference links.  (Addresses Issue
#4.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@554 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-03-03 18:19:31 +00:00
fiddlosopher
463a0e5c3e Changes to test suite for new XHTML output.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@550 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-27 07:05:11 +00:00
fiddlosopher
59065c103f Modified HTML writer to use the Text.XHtml library. This results
in cleaner, faster code, and it makes it easier to use Pandoc in
other projects, like wikis, that use Text.XHtml.  Two functions
are now provided, writeHtml and writeHtmlString:  the former outputs
an Html structure, the latter a rendered string.  The S5 writer is
also changed, in parallel ways (writeS5, writeS5String).  The Html
header is now written programmatically, so it has been removed from
the 'headers' directory.  The S5 header is still needed, but the
doctype and some of the meta declarations have been removed, since
they are written programmatically.  The INSTALL file and cabalize
have been updated to reflect the new dependency on the xhtml package.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@549 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-26 19:08:10 +00:00
fiddlosopher
3456355174 Added defaultWriterOptions to Shared.hs.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@545 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-21 01:22:08 +00:00
fiddlosopher
7d3382e9f0 In writing Markdown, print unicode nonbreaking space
(160) as "&nbsp;", since otherwise it is hard to distinguish
from a regular space.  (Addresses Issue #3.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@541 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-17 04:57:41 +00:00
fiddlosopher
7a04caeea8 Escape non-breaking space in SGML as '&nbsp;' instead of
printing a unicode non-breaking space, which is
hard to distinguish visually from a regular space.
(Resolves issue #3.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@540 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-17 03:40:28 +00:00
fiddlosopher
1855af4be5 Refactored str and strong in Markdown reader, for clarity.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@539 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-15 02:27:36 +00:00
fiddlosopher
87eb8be143 Got rid of two unneeded 'getState's. Note that
lookAhead automatically saves and restores the state.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@538 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-15 01:43:12 +00:00
fiddlosopher
9ad7c73b57 Use lookAhead instead of getInput/setInput in RST reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@537 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-15 01:33:35 +00:00
fiddlosopher
d3efa4aa8a Use lookAhead parser for the "first pass" looking for
reference keys in Markdown parser, instead of parsing
normally, then using setInput to reset input.  Slight
performance improvement. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@536 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-15 01:22:02 +00:00
fiddlosopher
ae19b94fd1 Removed followedBy' parser from Text/ParserCombinators/Pandoc,
replacing it with the 'lookAhead' parser from
Text/ParserCombinators/Parsec.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@535 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-15 01:10:15 +00:00
fiddlosopher
0114f68d21 Introduced a new map, reverseEntityTable, for lookups
of entity by character, in Entities.hs. This yields a 
small performance improvement.
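
A minimal sketch of the idea, assuming a forward table keyed by entity name. The table contents and the helper name `entityForChar` are illustrative assumptions, not the contents of Entities.hs.

```haskell
import qualified Data.Map as Map

-- | A tiny stand-in for the real entity table (name -> character).
entityTable :: Map.Map String Char
entityTable = Map.fromList
  [ ("amp", '&'), ("lt", '<'), ("gt", '>'), ("quot", '"'), ("nbsp", '\160') ]

-- | The same data keyed by character, built once so that a lookup by
-- character does not have to scan the forward table.
reverseEntityTable :: Map.Map Char String
reverseEntityTable =
  Map.fromList [ (c, name) | (name, c) <- Map.toList entityTable ]

-- | Hypothetical helper: the named entity for a character, if any.
entityForChar :: Char -> Maybe String
entityForChar c = fmt <$> Map.lookup c reverseEntityTable
  where
    fmt name = "&" ++ name ++ ";"
```

The second map costs some memory, but a lookup by character becomes logarithmic in the table size rather than a linear search through the values.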


git-svn-id: https://pandoc.googlecode.com/svn/trunk@534 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-14 18:02:51 +00:00
fiddlosopher
1266b189a1 Changed Entities.hs to use Data.Map rather than
an association list, for a slight performance boost.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@532 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-14 17:39:06 +00:00
fiddlosopher
7f65a4e914 Fixed issue #8: slow performance in parsing inline literals in
RST reader.  The problem was that ``#`` was seen by
'inline' as a potential link or image.  Fix:  insert
'notFollowedBy (char '`')' in link parsers.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@529 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-14 06:57:23 +00:00
fiddlosopher
6e467c26c4 Replaced "choice [(try (string ...), ...]" idiom with
"oneOfStrings" in LaTeX reader.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@528 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-12 17:18:20 +00:00
fiddlosopher
f3437dd27c + Added some needed "try"s before multicharacter parsers,
especially in "option" contexts.
+ Removed the "try" from the "end" parser in "enclosed"
  (Text.Pandoc.Shared).  Now "enclosed" behaves like
  "option", "manyTill", etc.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@527 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-12 17:14:11 +00:00
fiddlosopher
73cbae7202 Added 'try' in front of 'string', where needed, or
used a different parser, in RST reader. This fixes
a bug where ````` would not be correctly parsed as
a verbatim `.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@526 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-12 03:53:14 +00:00
fiddlosopher
ca93680f72 Allow the URI in a RST hyperlink target to start on the line
after the reference key.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@525 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-02-12 02:46:39 +00:00
fiddlosopher
6c3d96a776 Changed 'encodeEntities' to 'escapeSGMLString'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@520 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-28 16:40:44 +00:00
fiddlosopher
dc6925542c + Simplified entity handling by removing stringToSGML from Entities.hs.
It is no longer needed now that all entities are processed in the markdown 
  and HTML readers.  All calls to stringToSGML have been replaced by calls
  to encodeEntities.
+ Since inTag's attribute handling already encodes entities, 
  calls to encodeEntities are no longer needed for attribute values, so
  they've been removed.
+ The HTML and Markdown readers now call decodeEntities on all raw
  strings (e.g. authors, dates, link titles), to ensure that no unprocessed
  entities are included in the native representation of the document. 
  (In the HTML reader, most of this work is done by a change in
  extractAttributeName.)
+ The result is a small speed improvement (around 5% on my benchmark)
  and cleaner code.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@519 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-28 00:04:43 +00:00
fiddlosopher
21484713c6 Use encodeEntities rather than stringToSGML for contents of
Str inline in Docbook and HTML writers, since now these
strings should not contain literal entity references.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@518 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 22:58:03 +00:00
fiddlosopher
8e0ad5a006 Cleaned up handling of embedded quotes in link titles.
Now these are stored as a '"' character, not as '&quot;'.
The function escapeLinkTitle in the Markdown writer is
unnecessary and was removed.  Tests modified accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@517 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 22:45:14 +00:00
fiddlosopher
141affdb51 More changes in entity handling: Instead of using entities for characters
above 128 in HTML and Docbook output, we now just use unicode.  After all,
we're declaring UTF-8 content in the header.  This makes the HTML and
docbook files produced by pandoc much more readable and editable.

Changes to Entities.hs:
+ Removed specialCharToEntity
+ Added escapeSGMLChar (which just escapes the basic four, <>&")
+ Modified encodeEntities and stringToSGML to use escapeSGMLChar
+ Removed encodeEntitiesNumerical
+ Rewrote encodeEntities for better performance
+ Rewrote stringToSGML for better performance 



git-svn-id: https://pandoc.googlecode.com/svn/trunk@516 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 22:13:11 +00:00
fiddlosopher
d06417125d Changes in entity handling:
+ Entities are parsed (and unicode characters returned) in both
  Markdown and HTML readers.
+ Parsers characterEntity, namedEntity, decimalEntity, hexEntity added
  to Entities.hs; these parse a string and return a unicode character.
+ Changed 'entity' parser in HTML reader to use the 'characterEntity'
  parser from Entities.hs.  
+ Added new 'entity' parser to Markdown reader, and added '&' as a 
  special character.  Adjusted test suite accordingly since now we 
  get 'Str "AT",Str "&",Str "T"' instead of 'Str "AT&T"'.
+ stringToSGML moved to Entities.hs.  escapeSGML removed as redundant,
  given encodeEntities.
+ stringToSGML, encodeEntities, and specialCharToEntity are given a
  boolean parameter that causes only numerical entities to be used.
  This is used in the docbook writer.  The HTML writer uses named
  entities where possible, but not all docbook-consumers know about
  the named entities without special instructions, so it seems safer
  to use numerical entities there.
+ decodeEntities is rewritten in a way that avoids Text.Regex, using
  the new parsers.
+ charToEntity and charToNumericalEntity added to Entities.hs.
+ Moved specialCharToEntity from Shared.hs to Entities.hs.
+ Removed unneeded 'decodeEntities' from 'str' parser in HTML and
  Markdown readers.
+ Removed sgmlHexEntity, sgmlDecimalEntity, sgmlNamedEntity, and
  sgmlCharacterEntity from Shared.hs.
+ Modified Docbook writer so that it doesn't rely on Text.Regex for
  detecting "mailto" links.



git-svn-id: https://pandoc.googlecode.com/svn/trunk@515 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 03:04:40 +00:00
fiddlosopher
f2de08864e Rewrote functions in Text/Pandoc/Shared so as not to use Text.Regex,
which does not support unicode:
  - escapePreservingRegex removed
  - stringToSGML rewritten using Parsec parser
  - new parsers for SGML character entities
  - escapeSGML rewritten using specialCharToEntity
  - new function specialCharToEntity


git-svn-id: https://pandoc.googlecode.com/svn/trunk@514 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 23:25:27 +00:00
fiddlosopher
11456ba5ce Changed Markdown autoLink parsing to conform better to
Markdown.pl's behavior.  <google.com> is not treated as
a link, but <http://google.com>, <ftp://google.com>, and
<mailto:google@google.com> are.
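
The distinction can be sketched as a check on the text between the angle brackets. `isAutoLinkTarget` is a hypothetical name, and the sketch covers only the three schemes named above; the actual parser may accept others.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical check: does the text between '<' and '>' start with
-- one of the three schemes named in the commit message?
isAutoLinkTarget :: String -> Bool
isAutoLinkTarget s = any (`isPrefixOf` s) ["http://", "ftp://", "mailto:"]
```

A bare host name such as google.com fails the check and would be left as ordinary text.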


git-svn-id: https://pandoc.googlecode.com/svn/trunk@513 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 20:55:27 +00:00
fiddlosopher
c94dacec35 Fixed bug in 'extractTagType' in HTML reader: previous
version was not skipping / in close tags.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@512 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 20:26:06 +00:00
fiddlosopher
890fbe97ec Refactored markdown reader so that Text.Regex is not used.
Replaced email regex test with a custom email autolink parser
(autoLinkEmail).  Also replaced 'selfClosingTag' with a
custom function 'isSelfClosingTag'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@511 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 19:44:43 +00:00
fiddlosopher
8f0cfe9bd0 Fixed a bug in extractTagType in HTML Reader: the previous
version extracted the attributes, too, which is not wanted.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@510 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 19:40:32 +00:00
fiddlosopher
c61f2b6984 Fixed bug in HTML attribute parser: now a space is
required before an attribute.  Previously, <a.b>
would be parsed as an HTML tag with an attribute!


git-svn-id: https://pandoc.googlecode.com/svn/trunk@509 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 19:07:35 +00:00
fiddlosopher
ca6cb23f23 Modified Markdown writer to use autolinks when possible.
So, instead of [site.com](site.com) we get <site.com>.
Changed test suite accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@508 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 19:06:30 +00:00
fiddlosopher
0646eef976 Rewrote 'extractTagType' in HTML reader so that it doesn't use
regexs.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@507 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 17:43:39 +00:00
fiddlosopher
96919a6ac5 More smart quote bug fixes:
+ LaTeX writer now handles consecutive quotes properly:
  for example, ``\,`hello'\,''
+ LaTeX reader now parses '\,' as empty Str
+ normalizeSpaces function in Shared now removes empty Str elements
+ Modified tests accordingly


git-svn-id: https://pandoc.googlecode.com/svn/trunk@506 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 08:14:43 +00:00
fiddlosopher
e6cc2aa3cf Fixed bug in smart quoting: recognize ' in contractions like
"don't" as not beginning single quoted contexts.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@505 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-24 04:49:36 +00:00
fiddlosopher
1121e8738b Removed 'gsub' entirely and replaced its uses with 'substitute'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@501 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-22 22:52:39 +00:00
fiddlosopher
8f0750574a + Added a 'substitute' function to Shared.hs. This is a generic
list function that can be used to substitute one substring
  for another in a string, like 'gsub' except without regular
  expressions.
+ Use 'substitute' instead of 'gsub' in the LaTeX writer.  This
  avoids what appears to be a bug in Text.Regex, whereby "\\^"
  matches "\350".  There seems to be a slight speed improvement
  as well.  (Note:  If this works, it would be good to replace
  other uses of gsub that don't employ regexs with 'substitute'.) 
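
A generic list substitution of this kind might look like the following. It is a sketch rather than the code committed to Shared.hs, and leaving the input unchanged for an empty target is an assumption made here.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of the sublist 'target' with 'replacement'.
-- Works on any list of comparable elements, so it also works on Strings.
-- An empty target leaves the input unchanged (an assumption made here,
-- since an empty target would otherwise match everywhere).
substitute :: Eq a => [a] -> [a] -> [a] -> [a]
substitute _ _ [] = []
substitute [] _ xs = xs
substitute target replacement xs@(y : ys)
  | target `isPrefixOf` xs =
      replacement ++ substitute target replacement (drop (length target) xs)
  | otherwise = y : substitute target replacement ys
```

Because the match is a literal prefix comparison, characters such as '^' or '\\' in the target carry no special meaning, which is the property the LaTeX writer needs.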


git-svn-id: https://pandoc.googlecode.com/svn/trunk@500 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-22 21:28:46 +00:00
fiddlosopher
a7839a18a7 Small bug fix to last change, and count "'S" as well as "'s" as
possessive when followed by non-alphanumeric.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@499 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-18 02:54:07 +00:00
fiddlosopher
2c64e7947b More tweaks to smart quote parsing: a ' is not a single quote
start if followed by 's' and then a non-alphanumeric.  (Yes,
this is English-centric, I'm afraid.  But it does help, and I
can't think of a language in which 's' by itself is a word.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@498 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-18 02:47:27 +00:00
fiddlosopher
5a48839168 Minor tweaks to smart quoting code.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@497 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-16 23:51:57 +00:00
fiddlosopher
8c257385ac Fixed bug in smart quote recognition: ' before ) or certain
other punctuation must not be an open quote.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@496 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-16 23:46:33 +00:00
fiddlosopher
f3cf1cc168 Fixed haddock documentation errors.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@495 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-16 00:07:42 +00:00
fiddlosopher
60989d0637 Added support for tables in markdown reader and in LaTeX,
DocBook, and HTML writers.  The syntax is documented in
README.  Tests have been added to the test suite.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@493 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-15 19:52:42 +00:00
fiddlosopher
19123a540e Don't use named entities in docbook writer. Instead, use
numerical entities, for portability across stylesheets.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@473 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-09 05:40:40 +00:00
fiddlosopher
d83b70f3a0 + Changed 'escapedChar' in Markdown reader so that only the
characters Markdown escapes are escaped in strict mode.
  When not in strict mode, Pandoc allows all non-alphanumeric 
  characters to be escaped.
+ Added documentation of backslash escapes to README.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@461 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-08 23:54:15 +00:00
fiddlosopher
15ea29b223 Small improvements to indentSpaces. (Allow combinations
of spaces and tabs.)


git-svn-id: https://pandoc.googlecode.com/svn/trunk@446 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-07 01:44:24 +00:00
fiddlosopher
f2c2494b66 Modified HTML output for Image elements, to conform to
Markdown.pl:
+ title attribute comes after alt attribute
+ title is included even if null


git-svn-id: https://pandoc.googlecode.com/svn/trunk@445 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-07 01:06:34 +00:00
fiddlosopher
1bc585837c Fixed performance problems with '--strict' option:
+ Replaced skipEndline with "option ' ' newline" where possible.
+ Replaced "notFollowedBy' header" in definition of endline with
  a faster but equally accurate test for a following header.
+ Removed check at the beginning of 'reference' for
  a noteStart: This is not needed, because note comes before
  referenceKey in the definition of block.
+ Replaced check for a following anyHtmlBlockTag in autoLink
  with a check for anyHtmlTag or anyHtmlEndTag.
+ Other small code cleanups.  


git-svn-id: https://pandoc.googlecode.com/svn/trunk@444 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-07 00:48:46 +00:00
fiddlosopher
233148f963 Fixed bug in Markdown reader's handling of underscores and other
inline formatting markers inside reference labels:  for example,
in '[A_B]: /url/a_b', the material between underscores was being
parsed as emphasized inlines.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@442 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 20:47:00 +00:00
fiddlosopher
58dcef0625 Added support for hexadecimal entities: e.g. &#xA0AB;
git-svn-id: https://pandoc.googlecode.com/svn/trunk@441 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 19:47:05 +00:00
fiddlosopher
1645fb65e4 Fixed serious performance problems with new Markdown reader:
Instead of using lookahead to determine whether a single quote
is an apostrophe, we now use state.  Inside single quotes,
a ' character won't be recognized as the beginning of a single
quote.  'stateQuoteContext' has been added to keep track of
this. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@437 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 18:41:01 +00:00
fiddlosopher
bb8478e4e2 Merged changes from 'quotes' branch since r431. Smart typography
is now handled in the Markdown and LaTeX readers, rather than in
the writers.  The HTML writer has been rewritten to use the
prettyprinting library.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@436 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 09:54:58 +00:00
fiddlosopher
39eb8cbad8 Changed Markdown writer so that it does not use the single-bracket
style of implicit reference link.  It now uses [this style][],
not [this style].  Reason:  only newer, beta versions of Markdown
allow the single-bracket style.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@419 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-05 00:55:38 +00:00
fiddlosopher
a5e3c09fc7 Fixed small bug in consolidateList: added case
for (Str a):Space:Space:rest.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@418 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 22:58:28 +00:00
fiddlosopher
030d94e1c3 Refactored SGML escaping functions and "in tag" functions to
Text/Pandoc/Shared.  (escapeSGML, stringToSGML, inTag,
inTagSimple, inTagIndented, selfClosingTag)  These can be
used by both the HTML and Docbook writers.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@417 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 22:52:16 +00:00
fiddlosopher
24f3710e09 Fixed bug in encodeEntities (characters less than 128, not 127,
should be encoded).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@416 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 17:51:46 +00:00
fiddlosopher
b770a9f009 Removed unneeded 'options' parameter from 'indentedInTags' function
in Docbook writer.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@413 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 08:40:22 +00:00
fiddlosopher
99959b68e9 + Improved text wrapping algorithm in markdown, docbook, and RST writers.
LineBreaks no longer cause ugly wrapping in Markdown output.
+ Replaced splitBySpace with the more general, polymorphic function
  splitBy (in Text/Pandoc/Shared).
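
A polymorphic splitter along these lines can be sketched as follows. The exact behaviour of the committed function, for example for adjacent or trailing separators, is not stated in the message, so those details are assumptions.

```haskell
-- | Split a list at every occurrence of a separator element, dropping
-- the separators. Polymorphic in the element type, so it works on
-- Strings as well as on lists of other comparable values.
splitBy :: Eq a => a -> [a] -> [[a]]
splitBy _ [] = []
splitBy sep xs =
  case break (== sep) xs of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitBy sep rest
```

Splitting a string on spaces then becomes one instance of the general function rather than a special case.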


git-svn-id: https://pandoc.googlecode.com/svn/trunk@411 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 01:04:56 +00:00
fiddlosopher
e4880319e6 Modified HTML reader to skip a newline following a <br> tag.
Otherwise the newline will be treated as a space at the beginning
of the next line.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@410 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-03 20:52:12 +00:00
fiddlosopher
d4454536f0 Change 'HtmlEntities' module to 'Entities'. Adjusted calling
code accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@395 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-02 00:40:12 +00:00
fiddlosopher
4e5745134a Use entities for all characters above 127 in docbook output.
Though XML tools should support unicode, some people will be
using SGML tools, and these do not.  Using entities makes the
docbook files more portable.

Also refactored encodeEntities and charToHtmlEntity in
HtmlEntities.hs


git-svn-id: https://pandoc.googlecode.com/svn/trunk@394 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-02 00:29:22 +00:00
fiddlosopher
2716943855 Changed representation of code blocks to use <screen> and
escaped characters rather than <programlisting> and CDATA.
Reason:  XML source more easily editable and readable.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@393 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-01 22:07:19 +00:00
fiddlosopher
a734aaf2ae Removed a line that was causing a compiler warning in docbook
writer.  The line isn't necessary, since we have a case for
every kind of block element.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@388 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-01 21:11:47 +00:00
fiddlosopher
a9e32505de Merged changes from docbook branch since r363.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@386 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-01 21:08:12 +00:00
fiddlosopher
0d182772f9 Revised inline code parsing in Markdown reader to conform to
Markdown.pl.  Now any number of `'s can begin inline code,
which will end with the same number of `'s.  For example, to
have two backticks as code, write
``` `` ```
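
The matching rule can be sketched without Parsec as a function on plain strings. `inlineCode` is a hypothetical helper, not the reader's parser, and trimming the surrounding spaces is an assumption taken from the example above.

```haskell
import Data.List (dropWhileEnd)

-- | Hypothetical helper: if the input starts with a run of n backticks,
-- return the contents up to the next run of exactly n backticks,
-- together with the remaining input. Shorter or longer runs inside the
-- span are kept as literal text.
inlineCode :: String -> Maybe (String, String)
inlineCode input
  | null ticks = Nothing
  | otherwise  = go "" afterOpen
  where
    (ticks, afterOpen) = span (== '`') input
    n = length ticks

    -- 'acc' holds the contents collected so far, in reverse order.
    go _ [] = Nothing
    go acc s@(c : cs)
      | c == '`' =
          let (run, rest) = span (== '`') s
          in if length run == n
               then Just (trim (reverse acc), rest)
               else go (reverse run ++ acc) rest
      | otherwise = go (c : acc) cs

    trim = dropWhileEnd (== ' ') . dropWhile (== ' ')
```

Applied to the example, the three opening backticks fix n at 3, so the inner pair is kept as content and only the closing run of three ends the span.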


git-svn-id: https://pandoc.googlecode.com/svn/trunk@360 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-31 19:22:02 +00:00
fiddlosopher
8b3ac98171 Simplified list parsing code in RST reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@356 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-31 18:14:51 +00:00
fiddlosopher
3f5194b3bf Cleaned up some code in RST reader.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@354 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-31 17:34:26 +00:00
fiddlosopher
0b6dc98a0a Changed Markdown reader so that the first pass, in which a list
of reference keys is made, is much faster.  This gets us a big
performance boost.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@353 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-31 17:34:06 +00:00
fiddlosopher
83cddbc682 Removed unneeded 'do' block from 'parseBlocks' definition
in Markdown reader.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@352 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-31 16:46:48 +00:00
fiddlosopher
4ea1b2bdc0 Merged 'strict' branch from r324. This adds a '--strict'
option to pandoc, which forces it to stay as close as possible
to official Markdown syntax.  


git-svn-id: https://pandoc.googlecode.com/svn/trunk@347 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-30 22:51:49 +00:00
fiddlosopher
eea359203a Reversed changes from r246:
+ Removed invisible anchors in front of header tags in HTML output.
  Reason:  no way to prevent duplicate ID attributes (which is invalid
  HTML), since there might be duplicate header titles.  See 
  http://six.pairlist.net/pipermail/markdown-discuss/2005-January/000975.html.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@306 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-29 08:04:39 +00:00
fiddlosopher
d2105f6693 + Added regression tests with footnotes in quote blocks and lists.
+ This uncovered an existing bug in the RTF writer, which got indentation
  wrong on footnotes occurring in indented blocks like lists.  Fixed
  this bug.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@263 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-21 19:33:57 +00:00
fiddlosopher
48b8267126 Fixed a serious bug in the Markdown reader (also affecting LaTeX
and RST readers).  The problem:  these readers ran 'runParser' on
processed chunks of text to handle embedded block lists in lists
and quotation blocks.  But then any changes made to the parser state
in these chunks was lost, as the state is local to the parser.
So, for example, footnotes didn't work in quotes or list items.

The fix:  instead of calling runParser on some raw text, use
setInput to make it the input, then parse it, then use setInput
to restore the input to what it was before.  This is shorter and more
elegant, and it fixes the problem.
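
The setInput technique can be sketched with the current parsec package (Text.Parsec), which is an assumption, since the code at the time used Text.ParserCombinators.Parsec. `parseFromChunk`, `countLetters`, and `demo` are hypothetical names used only for illustration.

```haskell
import Text.Parsec

-- | Run parser 'p' over 'str' inside the current parse, so that any
-- user-state changes made by 'p' are kept. The surrounding input is
-- restored afterwards. (Source positions are not adjusted in this sketch.)
parseFromChunk :: Parsec String u a -> String -> Parsec String u a
parseFromChunk p str = do
  saved <- getInput
  setInput str
  result <- p
  setInput saved
  return result

-- | Toy parser: bump a counter in the user state for each letter seen.
countLetters :: Parsec String Int ()
countLetters = skipMany (letter >> modifyState (+ 1))

-- | Parse a chunk with 'countLetters', then read the state back from
-- the outer parser.
demo :: Either ParseError Int
demo =
  runParser (parseFromChunk countLetters "abc" >> getState) 0 "demo" "outer input"
```

Because the chunk is parsed by the same running parser rather than by a fresh runParser call, the counter updated inside the chunk is still visible afterwards, which is the property footnotes inside quotes and list items rely on.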

'BlockQuoteContext' was also eliminated from ParserContext, as it
isn't used anywhere.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@261 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-21 09:02:06 +00:00
fiddlosopher
862471e417 Fixed two small haddock bugs.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@260 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-21 08:30:08 +00:00
fiddlosopher
11cd6e94e0 Added license text to top of source files.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@258 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 20:54:23 +00:00
fiddlosopher
70d291026d Changed 'stability' from 'provisional' to 'alpha'.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@257 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 20:20:10 +00:00
fiddlosopher
1fded403c5 Changed 'status' in comment headers from 'unstable' to 'provisional'
(which seems to be the term that is used in this context).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@255 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 19:48:46 +00:00
fiddlosopher
b98edf2c74 Made javascript obfuscation of emails even more obfuscatory,
by combining it with entity obfuscation.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@254 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 18:16:07 +00:00
fiddlosopher
dc9c6450f3 + Added module data for haddock.
+ Reformatted code consistently.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@252 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 06:50:14 +00:00
fiddlosopher
5cf769b1cd Modified the HTML writer to add invisible anchors to each section
heading.  The anchors are derived from the text of the section
heading as described in README.  This makes it easy to insert
links that jump from one part of a document to another:
for example, '[back to the Introduction](#Introduction)'.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@246 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 00:25:54 +00:00
fiddlosopher
c1ebe94e40 + Replaced 'comparing' combinator in markdown reader with 'compare'.
'comparing' is from Data.Ord, which is not available in GHC 6.4.
+ Added line break after </li> in HTML footnote output, for easier
  inspection of the source.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@245 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 23:46:07 +00:00
fiddlosopher
34bb7a125e Fixed a minor mistake introduced in resolving conflicts from the
merge.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@243 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 23:17:53 +00:00
fiddlosopher
661c7e7b1d Merged changes to footnotes branch r219-r240.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@241 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 23:13:03 +00:00
fiddlosopher
206c59a386 Removed three files from the repository. These are generated
from templates in src/templates, and so should not be in the
repository.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@234 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 21:35:14 +00:00
fiddlosopher
3a6296acae Changed footnote syntax to conform to the de facto standard
for markdown footnotes.  References are now like this[^1]
rather than like this^(1).  There are corresponding changes
in the footnotes themselves.  See the updated README for
more details.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@230 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 07:30:36 +00:00
fiddlosopher
a8bbd950e5 Changed '--smartypants' to '--smart' and adjusted documentation
and symbols accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@224 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-18 22:02:39 +00:00
fiddlosopher
75dbe3248e Removed a / in a comment that was causing haddock to fail.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@213 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-17 18:41:26 +00:00
fiddlosopher
7c319e55ff Modified markdown reader to allow ordered list items to begin
with (single) letters, as well as numbers.  The list item marker
may now be terminated either by '.' or by ')'.  These extensions
to standard markdown are documented in README.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@211 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-16 19:43:00 +00:00
fiddlosopher
19bad8253d Improvements to smart-quote regexs. Now we can better handle
cases where latex commands or HTML entity references appear
after quotes.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@202 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-16 06:14:37 +00:00
fiddlosopher
fe66a90a2a Changed 'putStrLn' to 'putStr' in Main.hs, and modified some
of the readers to make spacing at end of output more consistent.
Modified tests accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@201 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-16 05:05:02 +00:00
fiddlosopher
6411ea7466 In HTML writer, include <title></title> even if title is null.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@171 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-09 02:05:23 +00:00
fiddlosopher
6536e64128 Minor improvements to LaTeX reader:
+ added nullBlock to preamble parsing, so we can handle unusual
  things like pure TeX
+ modified escapedChar to allow a \ at the end of line to count
  as escaped whitespace
+ treat "thanks" commands as footnotes


git-svn-id: https://pandoc.googlecode.com/svn/trunk@146 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-11-27 06:51:03 +00:00
fiddlosopher
7384774d83 Refactored LaTeX reader for clarity (added isArg function).
git-svn-id: https://pandoc.googlecode.com/svn/trunk@138 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-11-26 07:08:46 +00:00
fiddlosopher
986c1f9dee Pandoc bug fixes:
+ LaTeX reader did not parse metadata correctly.  Now the title,
  author, and date are parsed correctly, and everything else in
  the preamble is skipped.
+ Simplified parsing of LaTeX command arguments and options.  
  The function commandArgs now returns a list of arguments OR
  options (in whatever order they appear).  The brackets are
  included, and a new stripFirstAndLast function is provided
  to strip them off when needed.  This fixes a problem in dealing
  with \newcommand, etc.
+ Added a "try" before "parser" in definition of notFollowedBy'
  combinator.  Adjusted the code using this combinator accordingly.
+ Changed handling of code blocks.  Previously, some readers allowed
  trailing newlines, while others stripped them.  Now, all readers
  strip trailing newlines in code blocks; writers insert a newline
  at the end of code blocks as needed.
+ Changed test suite to reflect these changes. 


git-svn-id: https://pandoc.googlecode.com/svn/trunk@137 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-11-26 07:01:37 +00:00
fiddlosopher
82d7190a0c Fixed two small Haddock comment bugs in Shared.hs.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@87 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-11-08 17:50:09 +00:00
fiddlosopher
3dbd266d21 Improved LaTeX writer's handling of dashes:
+ Recognize a double hyphen as an Em-dash, even when it occurs next
  to punctuation (e.g. a quotation mark).
+ Collapse space around Em-dashes.
+ Process quotes before dashes.  This way (foo -- 'bar') will turn into
  (foo---`bar') instead of (foo---'bar').


git-svn-id: https://pandoc.googlecode.com/svn/trunk@49 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-10-30 23:15:28 +00:00
fiddlosopher
09473903dc Changes to RTF writer:
+ use Helvetica instead of Times New Roman as default font
+ specify \f0 in every \pard; otherwise font sizes are not registered properly
+ modify test of RTF writer accordingly


git-svn-id: https://pandoc.googlecode.com/svn/trunk@32 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-10-29 08:56:26 +00:00
fiddlosopher
df7b682251 initial import
git-svn-id: https://pandoc.googlecode.com/svn/trunk@2 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-10-17 14:22:29 +00:00