pandoc

Author	SHA1	Message	Date
John MacFarlane	0487eae7ee	Markdown reader: Fixed bug in code block attribute parser. Previously the ID attribute got lost if it didn't come first. Now attributes can come in any order.	2012-01-28 12:36:51 -08:00
John MacFarlane	d1ded4b026	Support github syntax for fenced code blocks. You can now write ```ruby x = 2 ``` instead of ~~~ {.ruby} x = 2 ~~~~	2012-01-28 12:25:24 -08:00
John MacFarlane	ff93a8e789	Fixed table parsing with wide or combining characters. Closes #348. Closes #108.	2012-01-27 00:39:00 -08:00
John MacFarlane	1ce7c38bc4	LaTeX reader: Handle \@.	2012-01-26 11:52:25 -08:00
John MacFarlane	ba81cda7f1	Added Docx writer. * New module `Text.Pandoc.Docx`. * New output format `docx`. * Added reference.docx. * New option `--reference-docx`. The writer includes support for highlighted code blocks and math (which is converted from TeX to OMML using texmath's new OMML module).	2012-01-19 12:10:49 -08:00
John MacFarlane	a3988d89c8	Added "title" to list of docbook block-level tags.	2012-01-12 21:13:52 -08:00
John MacFarlane	5b49c47414	Markdown reader: fixed bug in table/hrule parsing. Top line of table must not be followed by a blank line. This bug caused slowdown on some files with hrules and tables, and pandoc tried to interpret the hrules as the tops of multiline tables.	2012-01-10 12:45:19 -08:00
John MacFarlane	0ee49911f6	Markdown reader: Allow links in image captions. This change also means that [link with [link](/url)](/url) will turn into <p><a href="/url">link with link</a></p> instead of <p><a href="/url">link with [link](/url)</a></p>	2012-01-08 09:52:39 -08:00
John MacFarlane	5b7c209373	Markdown reader: Fix parsing of consecutive lists. Pandoc previously behaved like Markdown.pl for consecutive lists of different styles. Thus, the following would be parsed as a single ordered list, rather than an ordered list followed by an unordered list: 1. one 2. two - one - two This patch makes pandoc behave more sensibly, parsing this as two lists. Any change in list type (ordered/unordered) or in list number style will trigger a new list. Thus, the following will also be parsed as two lists: 1. one 2. two a. one b. two Since we regard this as a bug in Markdown.pl, and not something anyone would ever rely on, we do not preserve the old behavior even when `--strict` is selected.	2012-01-02 17:04:59 -08:00
John MacFarlane	da8425598a	New treatment of dashes in --smart mode. * `---` is always em-dash, `--` is always en-dash. * pandoc no longer tries to guess when `-` should be en-dash. * A new option, `--old-dashes`, is provided for legacy documents. Rationale: The rules for en-dash are too complex and language-dependent for a guesser to work reliably. This change gives users greater control. The alternative of using unicode isn't very good, since unicode em- and en- dashes are barely distinguishable in a monospace font.	2012-01-01 13:48:28 -08:00
John MacFarlane	3cf60c7306	Support for math in RST reader and writer. Inline math uses the :math:`...` construct. Display math uses .. math:: ... or if multiline .. math:: ... These seem to be supported now by rst2latex.py.	2011-12-31 11:40:47 -08:00
John MacFarlane	d8272d0356	Support Sphinx style math in RST reader. Inline: :math:`E=mc^2` Block: .. math:: E = mc^2 .. math:: E = mc^2 a = b^2 (This latter will turn into a paragraph with two display math elements.) Closes #117.	2011-12-30 23:46:43 -08:00
John MacFarlane	925a4c5164	Better smart quote parsing. * Added stateLastStrPos to ParserState. This lets us keep track of whether we're parsing the position immediately after a 'str'. If we encounter a ' in such a location, it must be an apostrophe, and can't be a single quote start. * Set this in the markdown, textile, html, and rst str parsers. * Closes #360.	2011-12-29 23:44:12 -08:00
John MacFarlane	a579e2c892	Replaced Apostrophe, Ellipses, EmDash, EnDash w/ unicode strings.	2011-12-27 15:45:34 -08:00
John MacFarlane	8838f473a8	LaTeX reader: Return Str instead of Apostrophe.	2011-12-27 11:19:23 -08:00
John MacFarlane	4f76fe5a1d	Markdown reader: Improved previous patch to allow unicode apostrophe.	2011-12-27 11:01:34 -08:00
John MacFarlane	dd96267626	Modified str parser to capture apostrophes in smart mode. This solves a problem stemming from the fact that a parser doesn't know what came before in the input stream. Previously pandoc would parse D'oh l'*aide* as containing a single quoted "oh l", when both `'`s should be apostrophes. (Issue #360.) There are two issues here. (a) It is obvious that the first `'` is not an open quote, because of the preceding `D`. This patch solves the problem. (b) It is obvious to us that the second `'` is not an open quote, because we see that *aide* is some text. But getting a good algorithm that has good performance is a bit tricky. You can't assume that `'` followed by `*` is always an apostrophe: *'this is quoted'* This patch does not fix (b).	2011-12-26 23:04:45 -08:00
John MacFarlane	9f9a57de19	Markdown reader: Fixed backslash escapes in reference links. Closes #312.	2011-12-05 21:33:47 -08:00
John MacFarlane	26371975f8	Markdown: Better handling of escapes in link URLs and titles.	2011-12-05 21:13:24 -08:00
John MacFarlane	d34f85613a	Changes to fit new charsInBalanced.	2011-12-05 20:55:23 -08:00
John MacFarlane	c39cdc15ba	Markdown reader: internal changes. Refactored escapedChar into escapedChar', escapedChar.	2011-12-05 20:27:10 -08:00
John MacFarlane	7b971517b0	Parsing: Changed type of escaped to return Char	2011-12-05 20:22:27 -08:00
John MacFarlane	bf4f8ffe55	LaTeX reader: Don't crash on commands like `\itemsep`. Closes #314.	2011-11-12 13:20:29 -08:00
John MacFarlane	da57775171	LaTeX reader: Ignore empty groups {}, { }. Closes #322.	2011-11-12 13:03:11 -08:00
John MacFarlane	d74e8d14a5	Markdown citations: don't strip off initial space in locator. Previously `[@item1 and nowhere else]` yielded the locator ", and nowhere else", or, with the new citeproc-hs, "and nowhere else". Now it yields " and nowhere else".	2011-11-09 13:18:01 -08:00
John MacFarlane	c2f7ba3b69	TeXMath writer: Use unicode thin spaces for thin spaces. Partially resolves issue #333.	2011-11-08 18:22:28 -08:00
John MacFarlane	dd6ed88707	Markdown reader: allow punctuation only internally in cite keys. The characters '.',':',';','$','<','>','~','#','-','_' can be used only between two letters or digits in a citation key. This means that '@item1.' will be parsed as a citation, 'item1', followed by a period, instead of a citation 'item1.', as was the case previously. Thanks to David Sanson for alerting us to the problem.	2011-11-06 16:00:23 -08:00
John MacFarlane	1b81981c5f	HTML reader now recognizes DocBook block and inline tags. It was always possible to include raw DocBook tags in a markdown document, but now pandoc will be able to distinguish block from inline tags and behave accordingly. Thus, for example, <sidebar> hello </sidebar> will not be wrapped in `<para>` tags.	2011-10-25 12:44:20 -07:00
takahashim	724de8314c	allow footnotes followed by newline without space chars	2011-08-23 09:56:58 +09:00
John MacFarlane	6c639d3420	HTML reader: Fixed bug parsing tables w both thead and tbody. See bug #274, which was not completely fixed by the last patch.	2011-08-01 11:56:15 -07:00
John MacFarlane	8be6cc210c	Added PRAGMA needed for ghc 6.12.	2011-07-30 19:58:46 -07:00
John MacFarlane	81381a9305	Removed applicative stuff in Markdown reader. It requires parsec 3, and currently pandoc can build with parsec 2.	2011-07-30 19:43:20 -07:00
John MacFarlane	b66b7a791c	Markdown reader: Improved emph/strong parsing. Ported code from pandoc2. Now all tests pass.	2011-07-30 18:08:49 -07:00
John MacFarlane	35cef01659	RST reader: Partial support for labeled footnotes. Also made simpleReferenceName parser more accurate, which affects several other parsers.	2011-07-23 18:51:02 -07:00
John MacFarlane	6424e7d02c	Properly handle characters in the 128..159 range. These aren't valid in HTML, but many HTML files produced by Windows tools contain them. We substitute correct unicode characters.	2011-07-23 12:43:01 -07:00
John MacFarlane	fe14bf9447	LaTeX reader: Handle \subtitle command. If there's a subtitle, it is added to the title, separated by a colon and linebreak. Closes #280.	2011-07-21 13:33:51 -07:00
John MacFarlane	6c029621ed	LaTeX reader & writer: Use \and to separate authors. Closes #279.	2011-07-21 10:09:51 -07:00
John MacFarlane	dd59cd2341	HTML reader: treat Plain as Para when needed. For example, in Just a few glitches remaining. <ul><li> In this situation, one loses the list. </ul> And in this, the preformatting. <pre>Preformatted text not starting with its own blank line. </pre> Thanks to Dirk Laurie for noticing the issue.	2011-07-16 09:42:16 -07:00
John MacFarlane	934867f858	HTML reader: Handle tbody, thead in simple tables. Closes #274.	2011-07-15 21:16:49 -07:00
John MacFarlane	b30afc2009	Merge pull request #273 from qerub/master Textile reader: Make it possible to have colons after links.	2011-07-11 08:31:29 -07:00
John MacFarlane	c83b578f58	LaTeX reader: Gobble option & space after linebreak \\[10pt].	2011-07-10 19:07:40 -07:00
John MacFarlane	4134dad500	Make HTML reader more forgiving of bad HTML. * Skip spaces after <b>, <emph>, etc. * Convert Plain elements into Para when they're in a list item with Para, Pre, BlockQuote, CodeBlock. An example of HTML that pandoc handles better now: ~~~~ <h4> Testing html to markdown </h4> <ul> <li> <b> An item in a list </b> <p> An introductory sentence. <pre> Some preformatted text at this stage comes next. But alas! much havoc is wrought by Pandoc. </pre> </ul> ~~~~ Thanks to Dirk Laurie for reporting the issues.	2011-07-10 16:54:46 -07:00
Christoffer Sawicki	8fa4e8bff1	Textile reader: Make it possible to have colons after links.	2011-07-10 16:30:14 +02:00
John MacFarlane	9e71dc3f48	Support \dots as well as \ldots in LaTeX reader.	2011-06-22 20:06:29 -07:00
John MacFarlane	6e59053d32	Forbid ()s in citation item keys. Resolves Issue #304: problems with (@item1; @item2) because the final paren was being parsed as part of the item key.	2011-05-22 20:24:18 -07:00
John MacFarlane	b42c48e919	Disallow notes within notes in reST and markdown. These previously caused infinite looping and stack overflows. For example: [^1] [^1]: See [^1] Note references are allowed in reST notes, so this isn't a full implementation of reST. That can come later. For now we need to prevent the stack overflows. Partially resolves Issue #297.	2011-04-20 11:42:27 -07:00
John MacFarlane	4b90ffe1bd	Allow '\|' followed by newline in RST line block.	2011-04-11 14:45:42 -07:00
John MacFarlane	6beba76f61	Changed uri parser so it doesn't include trailing punctuation. So, in RST, 'http://google.com.' should be parsed as a link to 'http://google.com' followed by a period. The parser is smart enough to recognize balanced parentheses, as often occur in wikipedia links: 'http://foo.bar/baz_(bam)'. Also added ()s to RST specialChars, so '(http://google.com)' will be parsed as a link in parens. Added test cases. Resolves Issue #291.	2011-03-18 11:30:20 -07:00
John MacFarlane	403bb521cd	Fixed bug in RST field list parser. The bug affected field lists with multi-line items at the end of the list.	2011-03-12 17:08:23 -08:00
John MacFarlane	eebd77829c	Markdown+lhs reader: Require space after inverse bird tracks. The point of the change is to allow html tags to be used freely at the left margin of a markdown+lhs document. Thanks to Conal Elliott for the suggestion.	2011-03-02 12:47:17 -08:00

1927 commits
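
The dash rule described in commit da8425598a (`---` is always em-dash, `--` is always en-dash, a single `-` is left alone) can be illustrated with a small Python sketch. This is not pandoc's implementation, which is a Haskell parser; the function name `smart_dashes` is hypothetical, and the sketch ignores code spans, escapes, and the `--old-dashes` option.

```python
def smart_dashes(text: str) -> str:
    """Toy illustration of the new --smart dash rule (not pandoc's code)."""
    em_dash = "\u2014"
    en_dash = "\u2013"
    # Replace the longer run first, so '---' is not consumed as '--' plus '-'.
    text = text.replace("---", em_dash)
    text = text.replace("--", en_dash)
    # A single '-' is never touched: no guessing about en-dashes.
    return text


if __name__ == "__main__":
    print(smart_dashes("pages 10--20 --- see also 3-4"))
```

Replacing the three-hyphen run before the two-hyphen run is the only ordering that keeps the two rules from interfering with each other.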