25 commits

Author SHA1 Message Date
fiddlosopher
7c35c0bc25 Fixed bug in Markdown parser: regular $s triggering math mode.
For example:  "shoes ($20) and socks ($5)."

The fix consists in two new restrictions:
+ the $ that ends a math span may not be directly followed by a digit.
+ no blank lines may be included within a math span.

Thanks to Joseph Reagle for noticing the bug.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1326 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 20:41:27 +00:00
fiddlosopher
7447ecc255 Escape '\160' as "&#160;", not "&nbsp;" in XML.
"nbsp" isn't a predefined XML entity.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1303 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-13 16:31:34 +00:00
fiddlosopher
824bb2d22e In smart mode, use nonbreaking spaces after abbreviations in markdown parser.
Thus, for example, "Mr. Brown" comes out as "Mr.~Brown" in LaTeX, and does
not produce a sentence-separating space.  Resolves Issue #75.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1298 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 02:14:57 +00:00
fiddlosopher
8ed710bc9d Treat '\ ' in (extended) markdown as nonbreaking space.
Print nonbreaking space appropriately in each writer (e.g. ~ in LaTeX).


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1297 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-11 01:24:15 +00:00
fiddlosopher
cd38d4ae79 Markdown smart typography: Em dashes no longer eat surrounding whitespace.
Resolves Issue #69.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1279 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-06-08 03:20:15 +00:00
fiddlosopher
aea6f6802b Removed support for "box-style" block quotes in markdown.
This adds unneeded complexity and makes pandoc diverge further
than necessary from other markdown extensions.
Brought documentation, tests, and debian/changelog up to date.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1141 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 19:32:18 +00:00
fiddlosopher
f8a257589b Updated test suite -- no italics for digits.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1133 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-02 08:35:24 +00:00
fiddlosopher
d1832da9e1 Added Text.Pandoc.Readers.TeXMath and changed default handling of math.
+ Text.Pandoc.Readers.TeXMath exports readTeXMath, which reads raw TeX
  math and outputs a string of pandoc inlines that tries to render it
  as far as possible, lapsing into literal TeX when needed.
+ Added Text.Pandoc.Readers.TeXMath to pandoc.cabal + ghc66 version.
+ Modified writers so that readTeXMath is used for default output
  in HTML, S5, RTF, Docbook.
+ Updated README with information about how math is rendered in all formats.
+ Updated test suite.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1129 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-02 00:36:32 +00:00
fiddlosopher
b008e5d347 Modified writer tests for new Math and TeX output.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1119 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-11-29 08:09:37 +00:00
fiddlosopher
f11360f50e Added new rule for enhanced markdown ordered lists: if the list marker
is a capital letter followed by a period (including a single-letter
capital roman numeral), then it must be followed by at least two spaces.
The point of this is to avoid accidentally treating people's initials as
list markers: a paragraph may begin:

    B. Russell was an English philosopher.

and this shouldn't be treated as a list.
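
Note: a minimal Haskell sketch of the rule described above, not the Markdown reader's actual code. The name startsOrderedItem is hypothetical, and the sketch only handles decimal and single-letter markers followed by a period and literal spaces (no tabs, no parenthesized markers).

```haskell
import Data.Char (isDigit, isLower, isUpper)

-- | Hypothetical check: does this line begin with an ordered list marker?
-- A capital-letter marker must be followed by at least two spaces;
-- decimal and lowercase markers need only one.
startsOrderedItem :: String -> Bool
startsOrderedItem line =
  case span isDigit line of
    (_ : _, '.' : ' ' : _) -> True            -- decimal marker, e.g. "2. "
    _ ->
      case line of
        (c : '.' : ' ' : rest)
          | isLower c -> True                 -- lowercase marker, e.g. "b. "
          | isUpper c -> take 1 rest == " "   -- capital marker needs two spaces
        _ -> False
```

Under these assumptions, "B. Russell was an English philosopher." is not a list start, while "B.  Second item" (two spaces) is.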

Modified Markdown reader and README documentation.
Added a test case.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@880 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-23 04:25:09 +00:00
fiddlosopher
e48f046aa0 + Fixed bug in markdown ordered list parsing. The problem was
  that anyOrderedListStart did not check for a space following the
  ordered list marker. So, 'A.B. 2007' would be parsed as a list item,
  then fail because of the lack of space after 'A.' (required by
  orderedListStart). Resolves Issue #22.
+ Fixed a similar problem in RST reader.
+ Added regression test.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@861 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-18 15:26:29 +00:00
fiddlosopher
e814a3f6d2 Major change in the way ordered lists are handled:
+ The changes are documented in README, under Lists.
+ The OrderedList block element now stores information
  about list number style, list number delimiter, and
  starting number.
+ The readers parse this information, when possible.
+ The writers use this information to style ordered
  lists.
+ Test suites have been changed accordingly.

Motivation:  It's often useful to start lists with
numbers other than 1, and to have control over the
style of the list.
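
Note: one way to picture the extra information stored with an ordered list is the Haskell sketch below. The type and constructor names (NumberStyle, NumberDelim, ListAttrs, markers) are assumptions chosen for illustration and are not taken from this commit; only a few styles are modeled.

```haskell
-- Hypothetical types modeling the three pieces of list information.
data NumberStyle = Decimal | LowerAlpha | UpperAlpha deriving (Show, Eq)
data NumberDelim = Period | OneParen | TwoParens deriving (Show, Eq)

-- | (starting number, number style, delimiter style)
type ListAttrs = (Int, NumberStyle, NumberDelim)

-- | Infinite list of item markers for the given attributes.
-- Alphabetic styles wrap around after 26 items in this sketch.
markers :: ListAttrs -> [String]
markers (start, style, delim) = map (wrap . render) [start ..]
  where
    render n = case style of
      Decimal    -> show n
      LowerAlpha -> alpha 'a' n
      UpperAlpha -> alpha 'A' n
    alpha base n = [toEnum (fromEnum base + (n - 1) `mod` 26)]
    wrap s = case delim of
      Period    -> s ++ "."
      OneParen  -> s ++ ")"
      TwoParens -> "(" ++ s ++ ")"
```

A writer could then take as many markers as there are items, for example take 3 (markers (3, Decimal, OneParen)) for a list that starts at 3.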

Added to Text.Pandoc.Shared:
+ camelCaseToHyphenated
+ toRomanNumeral
+ anyOrderedListMarker
+ orderedListMarker
+ orderedListMarkers
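
Note: the behavior suggested by the names toRomanNumeral and camelCaseToHyphenated can be sketched as below. These are illustrative reimplementations under the hypothetical names toRoman and camelToHyphen; the signatures and edge-case handling are assumptions, not the definitions added to Text.Pandoc.Shared.

```haskell
import Data.Char (isLower, isUpper, toLower)

-- | Convert a positive integer to an uppercase roman numeral.
-- Returns "" for zero or negative input in this sketch.
toRoman :: Int -> String
toRoman n
  | n <= 0    = ""
  | otherwise = go n table
  where
    table =
      [ (1000, "M"), (900, "CM"), (500, "D"), (400, "CD")
      , (100, "C"), (90, "XC"), (50, "L"), (40, "XL")
      , (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I") ]
    go 0 _ = ""
    go k ((v, s) : rest)
      | k >= v    = s ++ go (k - v) ((v, s) : rest)  -- reuse the same symbol
      | otherwise = go k rest                        -- move to a smaller one
    go _ [] = ""

-- | Turn "camelCase" into "camel-case": a hyphen is inserted wherever a
-- lowercase letter is directly followed by an uppercase one.
camelToHyphen :: String -> String
camelToHyphen (a : b : rest)
  | isLower a && isUpper b = a : '-' : toLower b : camelToHyphen rest
camelToHyphen (a : rest) = toLower a : camelToHyphen rest
camelToHyphen []         = []
```

A conversion like the second one is handy for turning a constructor-style name such as "LowerRoman" into a hyphenated string such as "lower-roman".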

Added to Text.Pandoc.ParserCombinators:
+ charsInBalanced'
+ withHorizDisplacement
+ romanNumeral

RST writer:
+ Force blank line before lists, so that sublists will be handled
  correctly.

LaTeX reader:
+ Fixed bug in parsing of footnotes containing multiple paragraphs,
  introduced by use of charsInBalanced.  Fix: use charsInBalanced'
  instead.

LaTeX header:
+ use mathletters option in ucs package, so that basic unicode Greek
  letters will work properly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@834 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-08 02:43:15 +00:00
fiddlosopher
de72aea6b4 Brought test suite up to date.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@828 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-28 19:15:30 +00:00
fiddlosopher
c9e89e1793 Updated test suite for writers, adding tests for
strikeout, superscript, subscript.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@766 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 18:49:23 +00:00
fiddlosopher
17caba2ffc Added a test case with an inline link containing bracketed text.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@667 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 06:59:13 +00:00
fiddlosopher
5660e6ba11 Updated test suite with new tests for definition lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@597 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-10 22:04:36 +00:00
fiddlosopher
141affdb51 More changes in entity handling: Instead of using entities for characters
above 128 in HTML and Docbook output, we now just use unicode.  After all,
we're declaring UTF-8 content in the header.  This makes the HTML and
docbook files produced by pandoc much more readable and editable.

Changes to Entities.hs:
+ Removed specialCharToEntity
+ Added escapeSGMLChar (which just escapes the basic four, <>&")
+ Modified encodeEntities and stringToSGML to use escapeSGMLChar
+ Removed encodeEntitiesNumerical
+ Rewrote encodeEntities for better performance
+ Rewrote stringToSGML for better performance 
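
Note: escaping only "the basic four" characters could look like the Haskell sketch below. The names escapeChar and escapeString are hypothetical stand-ins; this is not the code in Entities.hs.

```haskell
-- | Escape the four characters that are special in SGML/XML text and
-- attribute values; every other character, including non-ASCII
-- unicode, is passed through unchanged.
escapeChar :: Char -> String
escapeChar '<' = "&lt;"
escapeChar '>' = "&gt;"
escapeChar '&' = "&amp;"
escapeChar '"' = "&quot;"
escapeChar c   = [c]

-- | Apply 'escapeChar' to every character of a string.
escapeString :: String -> String
escapeString = concatMap escapeChar
```

Because characters above 128 are left alone, the output stays readable as long as the document declares a unicode encoding such as UTF-8.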



git-svn-id: https://pandoc.googlecode.com/svn/trunk@516 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 22:13:11 +00:00
fiddlosopher
d06417125d Changes in entity handling:
+ Entities are parsed (and unicode characters returned) in both
  Markdown and HTML readers.
+ Parsers characterEntity, namedEntity, decimalEntity, hexEntity added
  to Entities.hs; these parse a string and return a unicode character.
+ Changed 'entity' parser in HTML reader to use the 'characterEntity'
  parser from Entities.hs.  
+ Added new 'entity' parser to Markdown reader, and added '&' as a 
  special character.  Adjusted test suite accordingly since now we 
  get 'Str "AT",Str "&",Str "T"' instead of 'Str "AT&T"'.
+ stringToSGML moved to Entities.hs.  escapeSGML removed as redundant,
  given encodeEntities.
+ stringToSGML, encodeEntities, and specialCharToEntity are given a
  boolean parameter that causes only numerical entities to be used.
  This is used in the docbook writer.  The HTML writer uses named
  entities where possible, but not all docbook-consumers know about
  the named entities without special instructions, so it seems safer
  to use numerical entities there.
+ decodeEntities is rewritten in a way that avoids Text.Regex, using
  the new parsers.
+ charToEntity and charToNumericalEntity added to Entities.hs.
+ Moved specialCharToEntity from Shared.hs to Entities.hs.
+ Removed unneeded 'decodeEntities' from 'str' parser in HTML and
  Markdown readers.
+ Removed sgmlHexEntity, sgmlDecimalEntity, sgmlNamedEntity, and
  sgmlCharacterEntity from Shared.hs.
+ Modified Docbook writer so that it doesn't rely on Text.Regex for
  detecting "mailto" links.
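
Note: the idea of parsing an entity and returning a unicode character can be sketched without a parser library. The function below is a hypothetical illustration covering only decimal and hexadecimal entities; it is not the decimalEntity or hexEntity parser from Entities.hs, and it rejects entities with more than seven digits to keep the arithmetic simple.

```haskell
import Data.Char (chr, digitToInt, isDigit, isHexDigit)

-- | Hypothetical decoder: if the input starts with a numeric entity such
-- as "&#160;" or "&#xA0;", return the character and the remaining input.
decodeNumeric :: String -> Maybe (Char, String)
decodeNumeric ('&' : '#' : rest) =
  case rest of
    (x : hexDigits) | x `elem` "xX" -> finish isHexDigit 16 hexDigits
    _                               -> finish isDigit 10 rest
  where
    finish ok base s =
      case span ok s of
        (ds@(_ : _), ';' : more)
          | length ds <= 7
          , let n = foldl (\acc d -> acc * base + digitToInt d) 0 ds
          , n <= 0x10FFFF -> Just (chr n, more)   -- valid code point
        _ -> Nothing                              -- no digits or no ';'
decodeNumeric _ = Nothing
```

Named entities such as "&amp;" would need a lookup table in addition to this, which the sketch leaves out.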



git-svn-id: https://pandoc.googlecode.com/svn/trunk@515 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-27 03:04:40 +00:00
fiddlosopher
458bb40989 Fixed docbook writer test -- removed named entities.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@474 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-09 05:45:06 +00:00
fiddlosopher
233148f963 Fixed bug in Markdown reader's handling of underscores and other
inline formatting markers inside reference labels:  for example,
in '[A_B]: /url/a_b', the material between underscores was being
parsed as emphasized inlines.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@442 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 20:47:00 +00:00
fiddlosopher
bb8478e4e2 Merged changes from 'quotes' branch since r431. Smart typography
is now handled in the Markdown and LaTeX readers, rather than in
the writers.  The HTML writer has been rewritten to use the
prettyprinting library.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@436 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 09:54:58 +00:00
fiddlosopher
030d94e1c3 Refactored SGML escaping functions and "in tag" functions to
Text/Pandoc/Shared.  (escapeSGML, stringToSGML, inTag,
inTagSimple, inTagIndented, selfClosingTag)  These can be
used by both the HTML and Docbook writers.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@417 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-04 22:52:16 +00:00
fiddlosopher
4e5745134a Use entities for all characters above 127 in docbook output.
Though XML tools should support unicode, some people will be
using SGML tools, and these do not.  Using entities makes the
docbook files more portable.

Also refactored encodeEntities and charToHtmlEntity in
HtmlEntities.hs


git-svn-id: https://pandoc.googlecode.com/svn/trunk@394 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-02 00:29:22 +00:00
fiddlosopher
2716943855 Changed representation of code blocks to use <screen> and
escaped characters rather than <programlisting> and CDATA.
Reason:  XML source more easily editable and readable.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@393 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-01 22:07:19 +00:00
fiddlosopher
a9e32505de Merged changes from docbook branch since r363.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@386 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-01 21:08:12 +00:00