pandoc

Author	SHA1	Message	Date
John MacFarlane	38200c0291	Strip out illegal XML characters in escapeXMLString. Closes #5119.	2018-12-04 09:24:15 -08:00
John MacFarlane	4060df6891	Markdown writer: include needed whitespace after HTML figure. We use HTML for a figure in markdown dialects that can't represent it natively. Closes #5121.	2018-12-03 15:10:13 -08:00
John MacFarlane	83c0789205	Added test for #5053. Note that the fix for #5099 also fixes #5053, a pandoc 2.4 regression in parsing underscore emphasis after symbols.	2018-11-25 22:50:16 -08:00
John MacFarlane	edc651059e	Fix parsing of citations and quotes after parentheses. Starting with pandoc 2.4, citations and quoted inlines were no longer recognized after parentheses. This is because of commit `9b0bd4ec6f`, which is reverted here. The point of that commit was to allow relocation of soft line breaks to before an abbreviation, so that a nonbreaking space could be added after the abbreviation. Now we simply leave the soft line break in place, even though this means that we won't get a nonbreaking space after "Mr." at the end of a line (and in LaTeX this may result in a longer intersentential space). Those who care about this issue should take care not to end lines with an abbreviation, or to insert nonbreaking spaces manually. Closes #5099.	2018-11-25 22:29:54 -08:00
John MacFarlane	d532eb14eb	HTML reader: allow tfoot before body rows. Closes #5079.	2018-11-16 11:29:15 -08:00
John MacFarlane	e61f632531	HTML reader: parse `<small>` as a Span with class "small". Closes #5080.	2018-11-15 22:36:01 -08:00
John MacFarlane	e61d1d0da9	Asciidoc writer: Render Spans using `[#id .class]#contents#`. See #5080.	2018-11-15 22:29:15 -08:00
John MacFarlane	1a102c11a9	Fix test case for #5014.	2018-11-13 14:50:26 -08:00
John MacFarlane	1cfdd3662f	HTML reader: allow thead containing a row with td rather than th. See #5014. Note that this doesn't address the original issue in #5014, only an unrelated side-issue.	2018-11-13 14:49:12 -08:00
John MacFarlane	52a57a5362	LaTeX writer: don't emit `[<+->]` unless beamer output, even if `writerIncremental` is True. See #5072.	2018-11-12 09:43:12 -08:00
John MacFarlane	5bc38a741b	Exactly match GitHub's identifier generating algorithm. See #5057.	2018-11-11 20:45:38 -08:00
John MacFarlane	a36d202e86	Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier. The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on how `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.	2018-11-11 13:46:23 -08:00
John MacFarlane	5f030f3c2c	Add command test for #5050.	2018-11-06 22:57:11 -08:00
quasicomputational	a747268823	CommonMark writer: respect --ascii (#5043)	2018-11-05 09:33:10 -08:00
John MacFarlane	511d647290	XML: toHtml5Entities: prefer shorter entities... when there are several choices for a particular character.	2018-11-04 22:15:53 -08:00
John MacFarlane	805b9f8a12	Roff reader: Improved handling of custom strings as arguments. Added test.	2018-11-02 21:35:49 -07:00
John MacFarlane	26341c1632	Implement --ascii for Markdown writer.	2018-11-01 16:31:04 -07:00
John MacFarlane	f379edc4ad	HTML writer: use character entities references when possible for HTML5.	2018-11-01 16:08:27 -07:00
John MacFarlane	e0290fd18b	LaTeX writer: add newline if math ends in a comment. This prevents the closing delimiter from being swallowed up in the comment. Closes #4880.	2018-10-31 21:51:20 -07:00
John MacFarlane	c51be5dfc8	LaTeX reader: allow space at end of math after `\`. Closes #5010. Expose trimMath from T.P.Shared.	2018-10-29 22:20:14 -07:00
Albert Krewinkel	096cbe6987	Lua: allow access to pandoc state (#5015) * Lua: allow access to pandoc state Lua filters and custom writers now have read-only access to most fields of pandoc's internal state via the global variable `PANDOC_STATE`. * Lua: allow iterating through fields of PANDOC_STATE * Lua filters doc: describe CommonState * Lua filters doc: mention global variable PANDOC_STATE * Lua: add access to logs Log messages can currently only be printed, but not decomposed.	2018-10-25 22:12:14 -07:00
John MacFarlane	8efb8975ed	Groff writer character escaping changes. T.P.GroffChar: replaced `essentialEscapes` with `manEscapes`, which includes all the escapes mentioned in the groff_man manual. T.P.Writers.Groff: removed escapeCode; changed parameter on escapeString from Bool to new type `EscapeMode`. Rewrote `escapeString`.	2018-10-23 21:44:07 -07:00
Brian Leung	7eea5c62ed	LaTeX reader: add support for `nolinkurl` command. (#4992)	2018-10-22 23:36:44 -07:00
John MacFarlane	efbb329f1a	Groff escaping changes. - `--ascii` is now turned on automatically for man output, for portability. All man output will be escaped to ASCII. - In T.P.Writers.Groff, `escapeChar`, `escapeString`, and `escapeCode` now take a boolean parameter that selects ascii-only output. This is used by the Ms writer for `--ascii`, instead of doing an extra pass after writing the document. - In ms output without `--ascii`, unicode is used whenever possible (e.g. for double quotes). - A few escapes are changed: e.g. `\[rs]` instead of `\\` for backslash, and `\[ga]` instead of `` \` `` for backtick.	2018-10-18 10:21:34 -07:00
John MacFarlane	f48960b75f	Move common groff functions to Text.Pandoc.Writers.Groff (unexported module). These are used in both the man and ms writers. Moved groffEscape out of Text.Pandoc.Writers.Shared [cancels earlier API change from adding it, which was after last release]. This fixes strong/code combination on man (should be `\f[CB]` not `\f[BC]`), mentioned in #4973. Updated tests. Closes #4975.	2018-10-17 17:26:37 -07:00
Alexander Krotov	b3feaba6af	Man writer: use \f[R] instead of \f[] to reset font. Fixes #4973.	2018-10-17 18:29:07 +03:00
John MacFarlane	6f6ad0514d	LaTeX reader: make macroDef polymorphic and allow in inline context. Otherwise we can't parse something like ``` \lowercase{\def\x{Foo}} ``` I have actually seen tex like this in the wild.	2018-10-15 11:46:31 -07:00
John MacFarlane	22f81f78bd	Added failing test case for macros.	2018-10-15 00:37:17 -07:00
John MacFarlane	88faa45f1d	Markdown writer: ensure blank between raw block and normal content. Otherwise a raw block can prevent a paragraph from being recognized as such. Closes #4629.	2018-10-14 17:12:06 -07:00
John MacFarlane	cf8224045b	Markdown reader: Fix awkward soft break movements before abbreviations. Closes #4635.	2018-10-14 13:02:36 -07:00
John MacFarlane	f5c64c3060	HTML reader: fix htmlTag and isInlineTag to accept processing instructions. Fixes regression #3123 (since 2.0). Added regression test.	2018-10-11 09:58:25 -07:00
John MacFarlane	a92e43575f	LaTeX writer: with `--biblatex`, use `\autocite` when possible. `\autocites{a1}{a2}{a3}` will not collapse the entries. So, if we don't have prefixes and suffixes, we use instead `\autocite{a1;a2;a3}`. Closes #4960.	2018-10-08 20:47:09 -07:00
John MacFarlane	145710c4c3	RST reader: don't allow single-dash separator in headerless table. Closes #4382.	2018-10-07 12:37:08 -07:00
John MacFarlane	b806bff5b4	LaTeX reader: fix bugs omitting raw tex. The default is `-raw_tex`, so no raw tex should result unless we explicitly say `+raw_tex`. Previously some raw commands did make it through. Closes #4527.	2018-10-07 12:21:43 -07:00
John MacFarlane	08fef6b210	RST reader: pass through fields in unknown directives as div attributes. This commit also adds support for `class` and `name` attributes to directives in general. Closes #4715.	2018-10-07 11:44:11 -07:00
Brian Leung	e257b54124	Org reader: fix behavior for successive calls of `#+EXCLUDE_TAGS`. (#4951) Calling `#+EXCLUDE_TAGS` multiple times should preserve the status of the previously declared tags.	2018-10-05 22:21:20 -07:00
quasicomputational	6207bdeb68	CommonMark writer: add plain text fallbacks. (#4531) Previously, the writer would unconditionally emit HTMLish output for subscripts, superscripts, strikeouts (if the strikeout extension is disabled) and small caps, even with raw_html disabled. Now there are plain-text (and, where possible, fancy Unicode) fallbacks for all of these corresponding (mostly) to the Markdown fallbacks, and the HTMLish output is only used when raw_html is enabled. This commit adds exported functions `toSuperscript` and `toSubscript` to `Text.Pandoc.Writers.Shared`. [API change] Closes #4528.	2018-10-05 21:33:14 -07:00
Brian Leung	a26b3a2d6a	Org reader: Add partial support for `#+EXCLUDE_TAGS` option. (#4950) Closes #4284. Headers with the corresponding tags should not appear in the output. If one or more of the specified tags contains a non-tag character like `+`, Org-mode will not treat that as a valid tag, but will nonetheless continue scanning for valid tags. That behavior is not replicated in this patch; entering `cat+dog` as one of the entries in `#+EXCLUDE_TAGS` and running the file through Pandoc will cause the parser to fail and result in the only excluded tag being the default, `noexport`.	2018-10-05 14:28:17 -07:00
John MacFarlane	36f1846cc3	Implement `--ascii` (`writerPreferAscii`) in writers, not App. Now the `write*` functions for Docbook, HTML, ICML, JATS, Man, Ms, OPML are sensitive to `writerPreferAscii`. Previously the to-ascii translation was done in Text.Pandoc.App, and thus not available to those using the writer functions directly. In addition, the LaTeX writer is now sensitive to `writerPreferAscii` and to `--ascii`. 100% ASCII output can't be guaranteed, but the writer will use commands like `\"{a}` and `\l` whenever possible, to avoid emitting a non-ASCII character. A new unexported module, Text.Pandoc.Groff, has been added to store functions used in the different groff-based writers.	2018-09-30 22:32:00 -07:00
John MacFarlane	190ee279c9	LaTeX reader: allow verbatim blocks ending with blank lines. Closes #4624.	2018-09-29 10:57:11 -07:00
leungbk	6e8f31dab1	Force inline code blocks to honor export options. `exportsCode` is moved from `Blocks.hs` to `Shared.hs` and exported accordingly.	2018-09-26 08:49:13 +02:00
Brian Leung	72363cd2fc	Add support for multiprenote and multipostnote arguments in LaTeX. (#4930) * Add support for multiprenote and multipostnote arguments. The multiprenotes occur before the first prefix of a multicite, and the multipostnotes follow the last suffix. * Add test for multiprenote and multipostnote.	2018-09-25 20:49:13 -07:00
John MacFarlane	37c6f6adfe	RST reader: fix bug with internal link targets. They were gobbling up indented content underneath. Closes #4919.	2018-09-20 11:15:03 -07:00
John MacFarlane	136bf901aa	Markdown reader: distinguish autolinks in the AST. With this change, autolinks are parsed as Links with the `uri` class. (The same is true for bare links, if the `autolink_bare_uris` extension is enabled.) Email autolinks are parsed as Links with the `email` class. This allows the distinction to be represented in the URI. Formerly the `uri` class was added to autolinks by the HTML writer, but it had to guess what was an autolink and could not distinguish `[http://example.com](http://example.com)` from `<http://example.com>`. It also incorrectly recognized `[pandoc](pandoc)` as an autolink. Now the HTML writer simply passes through the `uri` attribute if it is present, but does not add anything. The Textile writer has been modified so that the `uri` class is not explicitly added for autolinks, even if it is present. Closes #4913.	2018-09-19 14:53:29 -07:00
John MacFarlane	44e4f7b292	Markdown reader: example_lists should work without startnum. Closes #4908.	2018-09-16 20:40:32 -07:00
mb21	5347e9454f	add test for --metadata-file	2018-09-15 17:06:10 +02:00
mb21	bd5500ba7f	add test yaml-metadata-blocks.md	2018-09-15 12:10:10 +02:00
John MacFarlane	fa4ebd71a3	LaTeX reader: resolve `\ref` for figure numbers.	2018-09-09 22:53:18 -07:00
John MacFarlane	a211edc819	HTML reader: parse `<script type="math/tex` tags as math. These are used by MathJax. Closes #4877.	2018-09-07 09:41:17 -07:00
John MacFarlane	85ed24e849	RST reader: don't skip link definitions after comments. Closes #4860.	2018-08-29 14:40:04 -07:00
John MacFarlane	a2c4261b32	HTML reader: allow enabling `raw_tex` extension. This now allows raw LaTeX environments, `\ref`, and `\eqref` to be parsed (which is helpful for translating HTML documents using MathJax). Closes #1126.	2018-08-24 18:04:00 -07:00
Alexander Krotov	937b92cd30	HTML reader: extract spaces inside links instead of trimming them. Fixes #4845.	2018-08-22 12:43:15 +03:00
John MacFarlane	3b5949e8f2	LaTeX reader: support blockcquote, foreignblockquote from csquotes. Also foreigncblockquote, hyphenblockquote, hyphencblockquote. Closes #4848. But note: currently foreignquote will be parsed as a regular Quoted inline (not using the quotes appropriate to the foreign language).	2018-08-21 21:03:43 -07:00
John MacFarlane	a733068ebf	LaTeX reader: support enquote*, foreignquote, hyphenquote... from csquotes. See #4848. Still TBD: blockquote, blockcquote, foreignblockquote.	2018-08-21 17:39:27 -07:00
John MacFarlane	42f4632e60	LaTeX reader: Support more text-mode accents. Add support for `\|`, `\b`, `\G`, `\h`, `\d`, `\f`, `\r`, `\t`, `\U`, `\i`, `\j`, `\newtie`, `\textcircled`. Also fall back to combining characters when composed characters are not available. Closes #4652.	2018-08-17 23:19:38 -07:00
Marc Schreiber	175da00295	Add support for latex mintinline (#4365)	2018-08-17 20:57:36 -07:00
John MacFarlane	1b66865763	LaTeX reader: fix siunitx unit commands... ...they should only be recognized in siunitx contexts. For example, `\l` outside of an siunitx context should be l-slash, not l (for liter)! Closes #4842.	2018-08-17 15:22:47 -07:00
John MacFarlane	13dea94a91	Markdown reader: Use "tex" instead of "latex" for raw tex-ish content. We can't always tell if it's LaTeX, ConTeXt, or plain TeX. Better just to use "tex" always. Also changed: ConTeXt writer: now outputs raw "tex" blocks as well as "context". (Closes #969). RST writer: uses ".. raw:: latex" for "tex" content. (RST doesn't support raw context anyway.) Note that if "context" or "latex" specifically is desired, you can still force that in a markdown document by using the raw attribute (see MANUAL.txt): ```{=latex} \foo ``` Note that this change may affect some filters, if they assume that raw tex parsed by the Markdown reader will be RawBlock (Format "latex"). In most cases it should be trivial to modify the filters to accept "tex" as well.	2018-08-15 10:25:12 -07:00
John MacFarlane	c27ce1e70e	LaTeX reader: handle parameter patterns for `\def`. For example: `\def\foo#1[#2]{#1 and #2}`. Closes #4768. Also fixes #4771. API change: in Text.Pandoc.Readers.LaTeX.Types, new type ArgSpec added. Second parameter of Macro constructor is now `[ArgSpec]` instead of `Int`.	2018-08-14 00:03:55 -07:00
John MacFarlane	919c50162c	RST writer: render Divs with admonition classes as admonitions. Also omit Div with class "admonition-title". These are generated by the RST reader and should be omitted on round-trip. Closes #4833.	2018-08-13 11:17:26 -07:00
John MacFarlane	6d14f53bd9	LaTeX reader: Allow `%` characters in URLs. This affects `\href` and `\url`. Closes #4832.	2018-08-12 16:46:48 -07:00
John MacFarlane	b76203ccf1	Markdown reader: Properly handle boolean values in YAML metadata. This fixes a regression in 2.2.3, which caused boolean values to be parsed as MetaInlines instead of MetaBool. Note also an undocumented (but desirable) change in 2.2.3: numbers are now parsed as MetaInlines rather than MetaString. Closes #4819.	2018-08-07 09:26:58 -07:00
John MacFarlane	94c3753c08	Fix parsing of embedded mappings in YAML metadata. This fixes a regression in 2.2.3 which caused embedded mappings (e.g. mappings in sequences) not to work in YAML metadata. Closes #4817.	2018-08-06 12:32:04 -07:00
John MacFarlane	581a3514ca	RST reader: improve parsing of inline interpreted text roles. * Use a Span with class "title-reference" for the default title-reference role. * Use B.text to split up contents into Spaces, SoftBreaks, and Strs for title-reference. * Use Code with class "interpreted-text" instead of Span and Str for unknown roles. (The RST writer has also been modified to round-trip this properly.) * Disallow blank lines in interpreted text. * Backslash-escape now works in interpreted text. * Backticks followed by alphanumerics no longer end interpreted text. Closes #4811.	2018-08-05 09:56:43 -07:00
John MacFarlane	f7dc3e7487	Added test case for #4669 to repository.	2018-08-05 09:21:45 -07:00
danse	be2d7921cb	RST reader: remove support for nested inlines. RST does not allow nested emphasis, links, or other inline constructs. Closes #4581, double parsing of links with URLs as link text. This supersedes the earlier fix for #4581 in `6419819b46`. Fixes #4561, a bug parsing with URLs inside emphasis. Closes #4792.	2018-07-24 15:35:50 -07:00
John MacFarlane	50e8c3b107	MediaWiki writer: Avoid extra blank line in tables with empty cells. Note that the old output is semantically identical, but the new output looks better. Closes #4794.	2018-07-24 11:38:24 -07:00
John MacFarlane	6419819b46	RST reader: fix double-link bug. Link labels containing raw URLs were parsed as autolinks, but links within links are not allowed. Closes #4581.	2018-07-21 22:53:04 -07:00
John MacFarlane	34b229dd5a	Fix for bug in parsing `\include` in markdown. Starting in 2.2.2, everything after an `\input` (or `\include`) in a markdown file would be parsed as raw LaTeX. This commit fixes the issue and adds a regression test. Closes #4781.	2018-07-19 17:44:16 -07:00
John MacFarlane	af445b34d8	Make markdown and github writers respect the `emoji` extension.	2018-07-15 16:02:46 -07:00
Anders Waldenborg	ec30fb37c1	Wrap emojis in span nodes (#4759) Text.Pandoc.Emoji now exports `emojiToInline`, which returns a Span inline containing the emoji character and some attributes with metadata (class `emoji`, attribute `data-emoji` with emoji name). Previously, emojis (as supported in Markdown and CommonMark readers, e.g. "😄") were simply translated into the corresponding unicode code point. By wrapping them in Span nodes, we make it possible to do special handling such as giving them a special font in HTML output. We also open up the possibility of treating them differently when the `--ascii` option is selected (though that is not part of this commit). Closes #4743.	2018-07-15 15:14:40 -07:00
Mauro Bieg	5809d5bef2	AsciiDoc Writer: escape square brackets at start of line (#4708). Closes #4545.	2018-07-12 19:37:37 +02:00
Alexander Krotov	41cf6d540f	More spellcheck	2018-07-02 19:07:28 +03:00
John MacFarlane	016e0a09e2	RST writer: don't treat 'example' as a syntax name. This fixes conversions from org with example blocks. Closes #4748.	2018-06-30 11:45:49 +02:00
Anders Waldenborg	904924d172	CommonMark reader: Handle ascii_identifiers extension (#4733) Non-ascii characters were not stripped from identifiers even if the `ascii_identifiers` extension was enabled (which it is by default for gfm). Closes #4742.	2018-06-29 10:41:26 +02:00
Mauro Bieg	0459d1be26	TikiWiki reader: improve list parsing (#4723) - remove trailing Space from list items - parse lists that have no space after marker (fixes #4722)	2018-06-28 13:35:54 +02:00
John MacFarlane	48a505c5a0	Markdown reader: allow empty code spans. E.g. `` ` ` ``.	2018-06-13 11:12:10 -07:00
Mauro Bieg	7e477db95c	LaTeX Reader: parse figure label into Image id (#4704). Closes #4700.	2018-06-13 10:41:30 -07:00
John MacFarlane	a41222db3e	Adjust command test not to use echo. This is fraught on Windows.	2018-06-11 10:53:56 -07:00
Mauro Bieg	905dee6ee3	beamer output: fix single digit column percentage (#4691). Fixes #4690.	2018-06-07 10:50:14 -07:00
John MacFarlane	d32e866449	LaTeX reader: handle includes without surrounding blanklines. In addition, `\input` can now be used in an inline context, e.g. to provide part of a paragraph, as it can in LaTeX. Closes #4553.	2018-06-01 09:25:10 -07:00
John MacFarlane	7119715a6a	LaTeX reader `rawLaTeXBlock`: handle macros that resolve to a... ...`\begin` or `\end`. Fixes #4667.	2018-05-30 12:49:01 -07:00
John MacFarlane	252ab9b773	Markdown writer: preserve `implicit_figures` with attributes... ...even if `implicit_attributes` is not set, by rendering in raw HTML. Fixes #4677.	2018-05-30 09:24:52 -07:00
John MacFarlane	58447bba98	rawLaTeXBlock: don't expand macros in macro definitions! Closes #4653. Note that this only affected LaTeX in markdown. Added regression test.	2018-05-15 09:19:13 -07:00
John MacFarlane	a00ca6f0d8	Removed inadvertently added .orig files from repository. These were added by `96d10c72cc` Closes #4648.	2018-05-11 17:10:32 -07:00
John MacFarlane	d3be567a73	Fix regression with tex math environments in HTML + MathJax. Closes #4639.	2018-05-09 10:37:04 -07:00
John MacFarlane	81881ce470	Parsing: Lookahead for non-whitespace after single/double quote start. Closes #4637.	2018-05-09 10:00:34 -07:00
John MacFarlane	44f1c72b28	Add test for #4576. Closes #4576.	2018-05-08 09:14:58 -07:00
John MacFarlane	a96c762a10	RST reader: fix anonymous redirects with backticks. Closes #4598.	2018-04-26 12:23:25 -07:00
John MacFarlane	aba0f7e063	Add tests for #4589 and #4594 (currently failing).	2018-04-25 23:04:08 -07:00
John MacFarlane	dab3330a58	RST reader: allow < 3 spaces indent under directives. Closes #4579.	2018-04-22 12:20:25 -07:00
John MacFarlane	7fbe473b2e	Markdown reader/writer: spacing adjustments in tables. * Markdown writer now includes a blank line at the end of the row in a single-row multiline table, to prevent it from being interpreted as a simple table. Closes #4578. * Markdown reader does a better job computing the relative width of the last column in a multiline table, so we can round-trip tables without constantly shrinking the last column.	2018-04-21 13:06:57 -07:00
John MacFarlane	276894a2f2	RST writer: use more consistent indentation. Previously we used an odd mix of 3- and 4-space indentation. Now we use 3-space indentation, except for ordered lists, where indentation must depend on the width of the list marker. Closes #4563.	2018-04-19 13:47:16 -07:00
John MacFarlane	d5b98c8c6e	Man writer: Don't escape U+2019 as '. Closes #4550.	2018-04-14 10:42:05 -07:00
John MacFarlane	d77e8f45c9	LaTeX reader: properly resolve section numbers with \ref and chapters. Closes #4529.	2018-04-05 10:14:06 -07:00
quasicomputational	13538ce6eb	CommonMark writer: correctly ignore LaTeX raw blocks when not raw_tex (#4533) Issue #4527.	2018-04-05 08:53:42 -07:00
Marc Schreiber	16523ea3d1	LaTeX reader: parse sloppypar environment (#4517)	2018-04-02 09:33:29 -07:00
John MacFarlane	b9602766d8	Textile reader: fixed tables with no body rows. Previously these raised an exception. Closes #4513.	2018-03-30 14:56:36 -07:00
John MacFarlane	5a79948e0c	Mediawiki reader: improve table parsing. This fixes detection of table attributes and also handles `!` characters in cells. Closes #4508.	2018-03-28 08:59:34 -07:00
Marc Schreiber	155a2ac039	Add support to parse unit string of \SI command (closes #4296).	2018-03-17 20:59:20 -07:00
Francesco Occhipinti	e5845f33ad	Don't wrap lines in grid tables when `--wrap=none` (#4320) * Annotate gridTable code with comments and abstract small functions * Don't wrap lines in tables when `--wrap=none`. Instead, expand cells, even if it results in cells that don't respect relative widths or surpass page column width. * This change affects RST, Markdown, and Haddock writers.	2018-03-17 20:31:43 -07:00
John MacFarlane	b76c0e6a4a	RST reader: Allow unicode bullet characters. Closes #4454.	2018-03-14 17:33:00 -07:00
John MacFarlane	17725a0661	Beamer: put hyperlink after `\begin{frame}`. and not in the title. If it's in the title, then we get a titlebar on slides with the `plain` attribute, when the id is non-null. This fixes a regression from 1.9.x. Closes #4307.	2018-03-13 10:03:51 -07:00
John MacFarlane	adefd86cd4	LaTeX reader: Fix regression in package options including underscore. Closes #4424.	2018-03-02 09:33:18 -08:00
John MacFarlane	377640402f	LaTeX reader: Fixed comments inside citations. Closes #4374.	2018-02-17 23:06:54 -08:00
Henri Menke	751b5ad010	ConTeXt writer: new section syntax and --section-divs (#4295) Fixes #2609. This PR introduces the new-style section headings: `\section[my-header]{My Header}` -> `\section[title={My Header},reference={my-header}]`. On top of this, the ConTeXt writer now supports the `--section-divs` option to write sections in the fenced style, with `\startsection` and `\stopsection`.	2018-01-25 11:56:28 -08:00
John MacFarlane	ac08a887cf	Markdown reader: Fix parsing bug with nested fenced divs. Closes #4281. Previously we allowed "nonindent spaces" before the opening and closing `:::`, but this interfered with list parsing, so now we require the fences to be flush with the margin of the containing block.	2018-01-20 14:44:08 -08:00
John MacFarlane	957c0e110d	RST reader: fix parsing of headers with trailing space. This was a regression in pandoc 2.0. Closes #4280.	2018-01-20 11:10:09 -08:00
John MacFarlane	ca8cd38bdc	Markdown reader: don't coalesce adjacent raw LaTeX blocks... if they are separated by a blank line. See lierdakil/pandoc-crossref#160 for motivation.	2018-01-17 09:22:35 -08:00
John MacFarlane	615a99c2c2	RST reader: add aligned environment when needed in math. rst2latex.py uses an align* environment for math in `.. math::` blocks, so this math may contain line breaks. If it does, we put the math in an `aligned` environment to simulate rst2latex.py's behavior. Closes #4254.	2018-01-14 15:11:11 -08:00
John MacFarlane	d9584d73f9	Markdown reader: Improved inlinesInBalancedBrackets. The change both improves performance and fixes a regression whereby normal citations inside inline notes were not parsed correctly. Closes jgm/pandoc-citeproc#315.	2018-01-14 12:24:21 -08:00
John MacFarlane	e7d95cadf5	LaTeX reader: pass through macro defs in rawLaTeXBlock... even if the `latex_macros` extension is set. This reverts to earlier behavior and is probably safer on the whole, since some macros only modify things in included packages, which pandoc's macro expansion can't modify. Closes #4246.	2018-01-13 22:12:32 -08:00
John MacFarlane	dca0032b0e	LaTeX reader: allow macro definitions inside macros. Previously we went into an infinite loop with ``` \newcommand{\noop}[1]{#1} \noop{\newcommand{\foo}[1]{#1}} \foo{hi} ``` See #4253.	2018-01-13 12:22:25 -08:00
John MacFarlane	49007ded7b	RST reader: better handling for headers with an anchor. Instead of creating a div containing the header, we put the id directly on the header. This way header promotion will work properly. Closes #4240.	2018-01-10 12:07:33 -08:00
John MacFarlane	13f7c2cf83	Fixed a test case so it works on windows too.	2018-01-09 17:45:02 -08:00
John MacFarlane	e3f01235e9	HTML writer: Fixed footnote backlinks with --id-prefix. Closes #4235.	2018-01-09 15:29:27 -08:00
John MacFarlane	3c93ac5cf0	LaTeX reader: be more tolerant of `&` character. This allows us to parse unknown tabular environments as raw LaTeX. Closes #4208.	2017-12-28 08:41:53 -08:00
John MacFarlane	acfa846aab	Merge pull request #4184 from mb21/html-reader-figcaption HTML Reader: be more forgiving about figcaption	2017-12-27 13:33:00 -07:00
John MacFarlane	a888083ee1	HTML reader: parse div with class `line-block` as LineBlock. See #4162.	2017-12-27 12:26:15 -08:00
John MacFarlane	9e1d86638c	LaTeX reader: support `\foreignlanguage` from babel.	2017-12-26 10:57:57 -08:00
John MacFarlane	ee5fe9bf2c	RST reader: allow empty list items (as docutils does). Closes #4193.	2017-12-24 13:02:18 -08:00
mb21	9b54b94612	HTML Reader: be more forgiving about figcaption. Fixes #4183.	2017-12-23 09:42:04 +01:00
John MacFarlane	28b736bf95	`latex_macros` extension changes. Don't pass through macro definitions themselves when `latex_macros` is set. The macros have already been applied. If `latex_macros` is enabled, then `rawLaTeXBlock` in Text.Pandoc.Readers.LaTeX will succeed in parsing a macro definition, and will update pandoc's internal macro map accordingly, but the empty string will be returned. Together with earlier changes, this closes #4179.	2017-12-22 18:03:51 -08:00
John MacFarlane	4a07977715	Markdown reader: improved raw tex parsing. + Preserve original whitespace between blocks. + Recognize `\placeformula` as context.	2017-12-22 18:03:51 -08:00
John MacFarlane	9758720a24	RST writer: fix anchors for headers. We were missing an `_`. See #4188.	2017-12-22 10:36:37 -08:00
Alexander Krotov	d035689a06	Org writer: do not wrap "-" to avoid accidental bullet lists. Also add TODO for ordered lists.	2017-12-21 16:36:29 +03:00
Alexander Krotov	1e21cfb251	Muse writer: don't wrap note references to the next line. Closes #4172.	2017-12-19 13:30:48 +03:00
Alexander Krotov	ef8430e702	Fix for #4171 fix: don't wrap note references after SoftBreak	2017-12-19 13:30:48 +03:00
John MacFarlane	c0cc9270cb	Org writer: don't allow fn refs to wrap to beginning of line. Otherwise they can be interpreted as footnote definitions. Closes #4171.	2017-12-18 16:33:52 -08:00
John MacFarlane	808f6d3fa1	OPML reader: enable raw HTML and other extensions by default for notes. This fixes a regression in 2.0. Note that extensions can now be individually disabled, e.g. `-f opml-smart-raw_html`. Closes #4164.	2017-12-17 09:52:53 -08:00
John MacFarlane	044d58bb24	Fixed regression in LaTeX tokenization. This mainly affects the Markdown reader when parsing raw LaTeX with escaped spaces. Closes #4159.	2017-12-15 09:45:29 -08:00
John MacFarlane	b94f1e2045	RST reader: more accurate parsing of references. Previously we erroneously included the enclosing backticks in a reference ID (closes #4156). This change also disables interpretation of syntax inside references, as in docutils. So, there is no emphasis in `my link`_	2017-12-14 12:48:43 -08:00
John MacFarlane	7093a3b44c	Markdown: Improved computation of relative cell widths in pipe tables.	2017-12-12 15:36:29 -08:00
John MacFarlane	67b6abc806	LaTeX reader: fix \ before newline. This should be a nonbreaking space, as long as it's not followed by a blank line. This has been fixed at the tokenizer level. Closes #4134.	2017-12-08 16:34:15 -08:00
John MacFarlane	f6007e7146	Markdown reader: accept processing instructions as raw HTML. Closes #4125.	2017-12-06 16:05:50 -08:00
John MacFarlane	6a2562efb5	Rewrite empty_paragraphs test so it will run on Windows.	2017-12-04 15:41:09 -08:00
John MacFarlane	fac3953abf	Markdown reader: Don't parse native div as table caption. Closes #4119.	2017-12-04 15:04:47 -08:00
John MacFarlane	ae60e0196c	Add `empty_paragraphs` extension. * Deprecate `--strip-empty-paragraphs` option. Instead we now use an `empty_paragraphs` extension that can be enabled on the reader or writer. By default, disabled. * Add `Ext_empty_paragraphs` constructor to `Extension`. * Revert "Docx reader: don't strip out empty paragraphs." This reverts commit `d6c58eb836`. * Implement `empty_paragraphs` extension in docx reader and writer, opendocument writer, html reader and writer. * Add tests for `empty_paragraphs` extension.	2017-12-04 14:56:57 -08:00
John MacFarlane	03496d1810	Test for #4113. Closes #4113.	2017-12-03 20:15:40 -08:00
John MacFarlane	1193c1a505	LaTeX writer: allow specifying just width or height for image size. Previously both needed to be specified (unless the image was being resized to be smaller than its original size). If height but not width is specified, we now set width to textwidth (and similarly if width but not height is specified). Since we have keepaspectratio, this yields the desired result.	2017-12-01 21:18:29 -08:00
John MacFarlane	b2a190546d	Revert "LaTeX writer: Add keepaspectratio to includegraphics..." This reverts commit `171187a452`.	2017-12-01 13:51:33 -08:00
John MacFarlane	171187a452	LaTeX writer: Add keepaspectratio to includegraphics... ...if only one of height/width is given.	2017-11-30 16:03:28 -08:00
John MacFarlane	03ddac451e	Support beamer `\alert` in LaTeX reader. Closes #4091.	2017-11-29 21:30:13 -08:00
John MacFarlane	508aab0bd5	Text.Pandoc.Parsing.uri: allow `&` and `=` as word characters. This fixes a bug where pandoc would stop parsing a URI with an empty attribute: for example, `&a=&b=` would stop at `a`. (The uri parser tries to guess which punctuation characters are part of the URI and which might be punctuation after it.) Closes #4068.	2017-11-14 22:08:14 -08:00
John MacFarlane	51897937cd	LaTeX reader: allow optional arguments on `\footnote`. Closes #4062.	2017-11-13 21:19:38 -08:00
John MacFarlane	8d6e0e516a	Markdown writer: fix bug with doubled footnotes in grid tables. Closes #4061.	2017-11-13 21:12:04 -08:00
John MacFarlane	eeaa3b048c	LaTeX reader: support column specs like `*{2}{r}`. This is equivalent to `rr`. We now expand it like a macro. Closes #4056.	2017-11-12 14:46:29 -08:00
John MacFarlane	7ba0ae8b4d	LaTeX reader: allow optional args for parbox. See #4056.	2017-11-12 14:19:58 -08:00
John MacFarlane	fb5ba1bb00	Fixed YAML metadata with "chomp" (`|-`). Previously if a YAML block under `|-` contained a blank line, pandoc would not parse it as metadata.	2017-11-11 10:17:53 -05:00
John MacFarlane	1592d38821	Allow fenced code blocks to be indented 1-3 spaces. This brings our handling of them into alignment with CommonMark's. Closes #??.	2017-11-09 23:22:44 -05:00
John MacFarlane	fef5770591	Fix regression with --metadata. It should replace a metadata value set in the document itself, rather than creating a list including a new value. Closes #4054.	2017-11-08 21:54:23 -08:00
John MacFarlane	8e53489cbc	Fix strikethrough in gfm writer. Previously we got a crash, because we were trying to print a native cmark STRIKETHROUGH node, and the commonmark writer in cmark-github doesn't support this. Work around this by using a raw node to add the strikethrough delimiters. Closes #4038.	2017-11-04 10:35:52 -07:00
John MacFarlane	642d603666	Improved support for columns in HTML. * Move as much as possible to the CSS in the template. * Ensure that all the HTML-based templates (including epub) contain the CSS for columns. * Columns default to 50% width unless they are given a width attribute. Closes #4028.	2017-11-02 20:57:05 -07:00
John MacFarlane	6d00e6e8c3	Fixed revealjs slide column width issues. * Remove "width" attribute which is not allowed on div. * Remove space between `<div class="column">` elements, since this prevents columns whose widths sum to 100% (the space takes up space). Closes #4028.	2017-11-02 10:23:04 -07:00
John MacFarlane	ed3d466384	Really fix #3989. The previous fix only worked in certain cases. Other cases with `>` in an HTML attribute broke.	2017-11-01 09:27:51 -07:00
John MacFarlane	f1ebdb8145	Updated command test for #3989. We didn't fix it completely before.	2017-11-01 09:15:15 -07:00
John MacFarlane	fb6e5812bc	Fixed regression in parsing of HTML comments in markdown... and other non-HTML formats (`Text.Pandoc.Readers.HTML.htmlTag`). The parser stopped at the first `>` character, even if it wasn't the end of the comment. Closes #4019.	2017-10-31 21:14:38 -07:00
John MacFarlane	0e57b8b85d	Add Millimeter constructor to Dimension in ImageSize. Minor API change. Now sizes given in 'mm' are no longer converted to 'cm'. Closes #4012.	2017-10-31 11:58:43 -07:00
John MacFarlane	5f9f458df3	LaTeX reader: handle `%` comment right after command. For example \emph% {hi}	2017-10-31 11:31:35 -07:00
John MacFarlane	556c6c2c6d	Markdown reader: make sure fenced div closers work in lists. Previously the following failed: ::: {.class} 1. one 2. two ::: and you needed a blank line before the closing `:::`.	2017-10-31 10:57:20 -07:00
John MacFarlane	81610144f9	Make `fenced_divs` affect the Markdown writer. If `fenced_divs` is enabled, fenced divs will be used.	2017-10-31 10:57:20 -07:00
John MacFarlane	244b42dbaf	Added failing command test for #4007.	2017-10-30 11:04:40 -07:00
John MacFarlane	513b16a71b	Fenced divs: ensure that paragraph at end doesn't become Plain. Added test case.	2017-10-24 09:53:29 -07:00
John MacFarlane	ecb5475a2a	Back to using [WARNING] and [INFO] to mark messages.	2017-10-23 23:01:37 -07:00
John MacFarlane	fda0c0119f	Implemented fenced Divs. + Added Ext_fenced_divs to Extensions (default for pandoc Markdown). + Document fenced_divs extension in manual. + Implemented fenced code divs in Markdown reader. + Added test. Closes #168.	2017-10-23 22:45:28 -07:00
John MacFarlane	896803b0d5	HTML reader: `htmlTag` improvements. We previously failed on cases where an attribute contained a `>` character. This patch fixes the bug. Closes #3989.	2017-10-23 17:29:32 -07:00
John MacFarlane	1a82ecbb68	More pleasing presentation of warnings and info messages. !! warning -- info	2017-10-23 15:00:11 -07:00
John MacFarlane	cecf02e326	Fixed test for change in log level.	2017-10-23 11:20:22 -07:00
mb21	e2123a4033	LaTeX Reader: support \lettrine	2017-10-22 20:33:30 +02:00
John MacFarlane	28bb5d610d	LaTeX reader: support `\expandafter`. Closes #3983.	2017-10-19 13:23:50 -07:00
John MacFarlane	61641f996f	Revised command test 3971 to work with Windows.	2017-10-16 22:51:25 -07:00
John MacFarlane	c40857b389	Improved handling of include files in LaTeX reader. Previously `\include` wouldn't work if the included file contained, e.g., a begin without a matching end. We've changed the Tok type so that it stores a full SourcePos, rather than just a line and column. So tokens keep track of the file they came from. This allows us to use a simpler method for includes, which doesn't require parsing the included document as a whole. Closes #3971.	2017-10-16 22:05:34 -07:00
John MacFarlane	9cf9a64923	RST writer: correctly handle inline code containing backticks. (Use a :literal: role.) Closes #3974.	2017-10-16 20:54:43 -07:00
John MacFarlane	cba18c19a6	RST writer: don't backslash-escape word-internal punctuation. Closes #3978.	2017-10-16 20:39:19 -07:00
John MacFarlane	75d8c99c73	ConTeXt writer: Use identifiers for chapters. Closes #3968.	2017-10-11 20:21:55 -07:00
John MacFarlane	8cd1e00bbc	Add test - closes #3958.	2017-10-08 21:57:26 -07:00
John MacFarlane	492f496842	Markdown reader: Fixed bug with indented code following raw LaTeX. Closes #3947.	2017-10-02 21:28:14 -07:00
John MacFarlane	2314534d4d	RST writer: add header anchors when header has non-standard id. Closes #3937.	2017-09-27 20:42:04 -07:00
John MacFarlane	b1ee747a24	Added `--strip-comments` option, `readerStripComments` in `ReaderOptions`. * Options: Added readerStripComments to ReaderOptions. * Added `--strip-comments` command-line option. * Made `htmlTag` from the HTML reader sensitive to this feature. This affects Markdown and Textile input. Closes #2552.	2017-09-17 13:01:27 -07:00
John MacFarlane	4177ee8626	Textile reader: allow 'pre' code in list item. Closes #3916.	2017-09-12 08:58:47 -07:00
John MacFarlane	5fc4980216	Markdown writer: Escape pipe characters when `pipe_tables` enabled. Closes #3887.	2017-09-07 22:10:13 -07:00
Albert Krewinkel	6a6c3858b4	Org writer: stop using raw HTML to wrap divs. Divs are difficult to translate into org syntax, as there are multiple div-like structures (drawers, special blocks, greater blocks) which all have their advantages and disadvantages. Previously pandoc would use raw HTML to preserve the full div information; this was rarely useful and resulted in visual clutter. Div-rendering was changed to discard the div's classes and key-value pairs if there is no natural way to translate the div into an org structure. Closes: #3771	2017-09-01 00:08:12 +02:00
John MacFarlane	8fcf66453c	RST reader: Fixed `..include::` directive. Closes #3880.	2017-08-27 17:09:55 -07:00
John MacFarlane	1b3431a165	LaTeX reader: improved support for \hyperlink, \hypertarget. Closes #2549.	2017-08-25 22:04:57 -07:00
John MacFarlane	d70b89c0d9	Use pandoc-types 1.17.1. Tests updated for new simpleTable behavior... with empty headers.	2017-08-20 23:24:51 -07:00
John MacFarlane	9cc128b579	LaTeX reader: Set identifiers on Spans used for \label.	2017-08-20 16:52:03 -07:00
John MacFarlane	a31241a08b	Markdown reader: use CommonMark rules for list item nesting. Closes #3511. Previously pandoc used the four-space rule: continuation paragraphs, sublists, and other block level content had to be indented 4 spaces. Now the indentation required is determined by the first line of the list item: to be included in the list item, blocks must be indented to the level of the first non-space content after the list marker. Exception: if there are 5 or more spaces after the list marker, then the content is interpreted as an indented code block, and continuation paragraphs must be indented two spaces beyond the end of the list marker. See the CommonMark spec for more details and examples. Documents that adhere to the four-space rule should, in most cases, be parsed the same way by the new rules. Here are some examples of texts that will be parsed differently: - a - b will be parsed as a list item with a sublist; under the four-space rule, it would be a list with two items. - a code Here we have an indented code block under the list item, even though it is only indented six spaces from the margin, because it is four spaces past the point where a continuation paragraph could begin. With the four-space rule, this would be a regular paragraph rather than a code block. - a code Here the code block will start with two spaces, whereas under the four-space rule, it would start with `code`. With the four-space rule, indented code under a list item always must be indented eight spaces from the margin, while the new rules require only that it be indented four spaces from the beginning of the first non-space text after the list marker (here, `a`). This change was motivated by a slew of bug reports from people who expected lists to work differently (#3125, #2367, #2575, #2210, #1990, #1137, #744, #172, #137, #128) and by the growing prevalence of CommonMark (now used by GitHub, for example). Users who want to use the old rules can select the `four_space_rule` extension.
* Added `four_space_rule` extension. * Added `Ext_four_space_rule` to `Extensions`. * `Parsing` now exports `gobbleAtMostSpaces`, and the type of `gobbleSpaces` has been changed so that a `ReaderOptions` parameter is not needed.	2017-08-19 15:45:01 -07:00
John MacFarlane	5ab1162def	Markdown reader: fixed parsing of fenced code after list... ...when there is no intervening blank line. Closes #3733.	2017-08-18 21:46:55 -07:00
John MacFarlane	bfbdfa646a	LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false from etoolbox. Closes #3853.	2017-08-18 10:13:41 -07:00
John MacFarlane	d1444b4ecd	RST reader/writer: support unknown interpreted text roles... ...by parsing them as Span with "role" attributes. This way they can be manipulated in the AST. Closes #3407.	2017-08-17 16:01:44 -07:00
John MacFarlane	b1f6fb4af5	HTML reader: support column alignments. These can be set either with an `align` attribute or with `text-align` in a `style` attribute. Closes #1881.	2017-08-17 12:08:32 -07:00
John MacFarlane	db715ca847	LaTeX reader: use Link instead of Span for `\ref`. This makes more sense semantically and avoids unnecessary Span [Link] nestings when references are resolved.	2017-08-16 10:56:12 -07:00
schrieveslaach	cf4b40162d	LaTeX reader: add support for `glossaries` and `acronym` packages (#3589). Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.	2017-08-16 10:24:46 -07:00
John MacFarlane	68434957d6	Fixed command test #2994 on Windows.	2017-08-16 09:47:25 -07:00
John MacFarlane	892a4edeb1	Implement multicolumn support for slide formats. The structure expected is: <div class="columns"> <div class="column" width="40%"> contents... </div> <div class="column" width="60%"> contents... </div> </div> Support has been added for beamer and all HTML slide formats. Closes #1710. Note: later we could add a more elegant way to create this structure in Markdown than to use raw HTML div elements. This would come for free with a "native div syntax" (#168). Or we could devise something specific to slides	2017-08-14 23:17:44 -07:00
John MacFarlane	319d7ed6ff	Changed command test for #2994 so it actually tests the writer.	2017-08-14 00:00:50 -07:00
schrieveslaach	2845ab5976	Put content of \ref, \label commands into span… (#3639) * Put content of `\ref` and `\label` commands into Span elements so they can be used in filters. * Add support for `\eqref`	2017-08-13 10:58:45 -07:00
John MacFarlane	8f65590ce9	CommonMark writer: prefer pipe tables to HTML tables... ...even if it means losing relative column width information. See #3734.	2017-08-13 10:43:43 -07:00
John MacFarlane	506866ef73	Markdown writer: Use pipe tables if `raw_html` disabled... and `pipe_tables` enabled, even if the table has relative width information. Closes #3734.	2017-08-13 10:37:24 -07:00
John MacFarlane	418bda8128	Docx writer: pass through comments. We assume that comments are defined as parsed by the docx reader: I want <span class="comment-start" id="0" author="Jesse Rosenthal" date="2016-05-09T16:13:00Z">I left a comment.</span>some text to have a comment <span class="comment-end" id="0"></span>on it. We assume also that the id attributes are unique and properly matched between comment-start and comment-end. Closes #2994.	2017-08-12 22:59:53 -07:00

487 commits