Commit graph

26 commits

Author SHA1 Message Date
John MacFarlane
4af8eed764 Markdown reader: revised definition list syntax (closes #1429).
* This change brings pandoc's definition list syntax into alignment
  with that used in PHP markdown extra and multimarkdown (with the
  exception that pandoc is more flexible about the definition markers,
  allowing tildes as well as colons).

* Lazily wrapped definitions are now allowed; blank space is required
  between list items; and the space before definition is used to
  determine whether it is a paragraph or a "plain" element.

* For backwards compatibility, a new extension,
  `compact_definition_lists`, has been added that restores the behavior
  of pandoc 1.12.x, allowing tight definition lists with no blank space
  between items, and disallowing lazy wrapping.
2014-07-20 16:33:59 -07:00
John MacFarlane
8d441af3da Adjusted writers and tests for change in parsing of div/span.
Textile, MediaWiki, Markdown, Org, RST will emit raw HTML div tags for divs.
Otherwise Div and Span are "transparent" block containers.
2013-08-18 14:36:40 -07:00
John MacFarlane
da8425598a New treatment of dashes in --smart mode.
* `---` is always em-dash, `--` is always en-dash.
* pandoc no longer tries to guess when `-` should be en-dash.
* A new option, `--old-dashes`, is provided for legacy documents.

Rationale: The rules for en-dash are too complex and
language-dependent for a guesser to work reliably.  This
change gives users greater control.  The alternative of
using unicode isn't very good, since unicode em- and en-
dashes are barely distinguishable in a monospace font.
2012-01-01 13:48:28 -08:00
John MacFarlane
904050fa36 New HTML reader using tagsoup as a lexer.
* The new reader is faster and more accurate.

* API changes for Text.Pandoc.Readers.HTML:
   - removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
     anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
     htmlBlockElement, htmlComment
   - added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag

* tagsoup is a new dependency.

* Text.Pandoc.Parsing: Generalized type on readWith.

* Benchmark.hs: Added length calculation to force full evaluation.

* Updated HTML reader tests.

* Updated markdown and textile readers to use the functions from
  the HTML reader.

* Note: The markdown reader now correctly handles some cases it did not
  before. For example:

    <hr/>

  is reproduced without adding a space.

    <script>
      a = '<b>';
    </script>

  is parsed correctly.
2010-12-30 13:55:40 -08:00
John MacFarlane
71e0557e61 Markdown reader: Fixed regression in reference key parser.
* The recent change allowing spaces and newlines in the URL
  caused problems when reference keys are stacked up without
  blank lines between. This is now fixed.
* Added test.
2010-12-13 20:03:12 -08:00
John MacFarlane
ace3b80f1e Smart punctuation: don't alllow ellipses containing spaces.
Previously we allowed '. . .', ' . . . ', etc.  This caused
too many complications, and removed author's flexibility in
combining ellipses with spaces and periods.
2010-12-07 20:08:14 -08:00
fiddlosopher
f5e00c50b8 Markdown reader: Allow footnotes to be indented < 4 spaces.
This fixes a regression.  A test case has been added in testsuite.txt.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1859 788f1e2b-df1e-0410-8736-df70ead52e1b
2010-03-01 22:37:06 +00:00
fiddlosopher
54dda0ff9e Markdown reader: use ; as separator between authors.
This allows you to use ',' within author names:
e.g. "John Jones, Jr."

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1726 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-31 01:16:27 +00:00
fiddlosopher
5082b5411b Improved syntax for markdown definition lists.
Definition lists are now more compatible with PHP Markdown Extra.
Resolves Issue #24.

+ You can have multiple definitions for a term (but still not
  multiple terms).
+ Multi-block definitions no longer need a
  column before each block (indeed, this will now cause
  multiple definitions).
+ The marker no longer needs to be flush with the left margin,
  but can be indented at or two spaces.  Also, ~ as well as :
  can be used as the marker (this suggestion due to David
  Wheeler.)
+ There can now be a blank line between the term and
  the definitions.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@1656 788f1e2b-df1e-0410-8736-df70ead52e1b
2009-12-07 08:26:53 +00:00
fiddlosopher
f53fb554fe Support for display math; changed ASCIIMathML -> LaTeXMathML:
Resolves Issue #47.

+ Added a DisplayMath/InlineMath selector to Math inlines.
+ Markdown parser yields DisplayMath for $$...$$.
+ LaTeX parser yields DisplayMath when appropriate.  Removed
  mathBlock parsers, since the same effect is achieved by the math
  inline parsers, now that they handle display math.
+ Writers handle DisplayMath as appropriate for the format.
+ Changed -m option to use LaTeXMathML rather than ASCIIMathML.
  LaTeXMathML is closer to LaTeX in its display of math, and
  supports many non-math LaTeX environments.
+ Modified HTML writer to print raw TeX when LaTeXMathML is
  being used instead of suppressing it.
+ Removed ASCIIMathML files from data/ and added LaTeXMathML.
+ Replaced ASCIIMathML with LaTeXMathML in source files.
+ Modified README and pandoc man page source.
+ Modified web page.
+ Added --latexmathml option (kept --asciimathml as a synonym
  for backwards compatibility)
+ Modified tests accordingly; added new tests for display math.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1409 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-08-13 03:02:42 +00:00
fiddlosopher
7c35c0bc25 Fixed bug in Markdown parser: regular $s triggering math mode.
For example:  "shoes ($20) and socks ($5)."

The fix consists in two new restrictions:
+ the $ that ends a math span may not be directly followed by a digit.
+ no blank lines may be included within a math span.

Thanks to Joseph Reagle for noticing the bug.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1326 788f1e2b-df1e-0410-8736-df70ead52e1b
2008-07-15 20:41:27 +00:00
fiddlosopher
aea6f6802b Removed support for "box-style" block quotes in markdown.
This adds unneeded complexity and makes pandoc diverge further
than necessary from other markdown extensions.
Brought documentation, tests, and debian/changelog up to date.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@1141 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-12-08 19:32:18 +00:00
fiddlosopher
f11360f50e Added new rule for enhanced markdown ordered lists: if the list marker
is a capital letter followed by a period (including a single-letter
capital roman numeral), then it must be followed by at least two spaces.
The point of this is to avoid accidentally treating people's initials as
list markers: a paragraph may begin:

    B. Russell was an English philosopher.

and this shouldn't be treated as a list.

Modified Markdown reader and README documentation.
Added a test case.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@880 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-23 04:25:09 +00:00
fiddlosopher
e48f046aa0 + Fixed bug in markdown ordered list parsing. The problem was
that anyOrderedListStart did not check for a space following the
  ordered list marker. So, 'A.B. 2007' would be parsed as a list item,
  then fail because of the lack of space after 'A.' (required by
  orderedListStart). Resolves Issue #22.
+ Fixed a similar problem in RST reader.
+ Added regression test.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@861 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-18 15:26:29 +00:00
fiddlosopher
e814a3f6d2 Major change in the way ordered lists are handled:
+ The changes are documented in README, under Lists.
+ The OrderedList block element now stores information
  about list number style, list number delimiter, and
  starting number.
+ The readers parse this information, when possible.
+ The writers use this information to style ordered
  lists.
+ Test suites have been changed accordingly.

Motivation:  It's often useful to start lists with
numbers other than 1, and to have control over the
style of the list.

Added to Text.Pandoc.Shared:
+ camelCaseToHyphenated
+ toRomanNumeral
+ anyOrderedListMarker
+ orderedListMarker
+ orderedListMarkers

Added to Text.Pandoc.ParserCombinators:
+ charsInBalanced'
+ withHorizDisplacement
+ romanNumeral

RST writer:
+ Force blank line before lists, so that sublists will be handled
  correctly.

LaTeX reader:
+ Fixed bug in parsing of footnotes containing multiple paragraphs,
  introduced by use of charsInBalanced.  Fix: use charsInBalanced'
  instead.

LaTeX header:
+ use mathletters option in ucs package, so that basic unicode Greek
  letters will work properly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@834 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-08-08 02:43:15 +00:00
fiddlosopher
c9e89e1793 Updated test suite for writers, adding tests for
strikeout, superscript, subscript.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@766 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-22 18:49:23 +00:00
fiddlosopher
c14082cab2 Modified the test of a link containing an underscore in
both the label and the URL.  The underscore must be
backslash-escaped.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@730 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-15 23:54:02 +00:00
fiddlosopher
17caba2ffc Added a test case with an inline link containing bracketed text.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@667 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-09 06:59:13 +00:00
fiddlosopher
5660e6ba11 Updated test suite with new tests for definition lists.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@597 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-05-10 22:04:36 +00:00
fiddlosopher
233148f963 Fixed bug in Markdown reader's handling of underscores and other
inline formatting markers inside reference labels:  for example,
in '[A_B]: /url/a_b', the material between underscores was being
parsed as emphasized inlines.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@442 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-01-06 20:47:00 +00:00
fiddlosopher
4ea1b2bdc0 Merged 'strict' branch from r324. This adds a '--strict'
option to pandoc, which forces it to stay as close as possible
to official Markdown syntax.  


git-svn-id: https://pandoc.googlecode.com/svn/trunk@347 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-30 22:51:49 +00:00
fiddlosopher
d2105f6693 + Added regression tests with footnotes in quote blocks and lists.
+ This uncovered an existing bug in the RTF writer, which got indentation
  wrong on footnotes occuring in indented blocks like lists.  Fixed
  this bug.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@263 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-21 19:33:57 +00:00
fiddlosopher
661c7e7b1d Merged changes to footnotes branch r219-r240.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@241 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 23:13:03 +00:00
fiddlosopher
3a6296acae Changed footnote syntax to conform to the de facto standard
for markdown footnotes.  References are now like this[^1]
rather than like this^(1).  There are corresponding changes
in the footnotes themselves.  See the updated README for
more details.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@230 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-19 07:30:36 +00:00
fiddlosopher
fe66a90a2a Changed 'putStrLn' to 'putStr' in Main.hs, and modified some
of the readers to make spacing at end of output more consistent.
Modified tests accordingly.


git-svn-id: https://pandoc.googlecode.com/svn/trunk@201 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-16 05:05:02 +00:00
fiddlosopher
df7b682251 initial import
git-svn-id: https://pandoc.googlecode.com/svn/trunk@2 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-10-17 14:22:29 +00:00