pandoc

Author	SHA1	Message	Date
John MacFarlane	d532eb14eb	HTML reader: allow tfoot before body rows. Closes #5079.	2018-11-16 11:29:15 -08:00
John MacFarlane	e61f632531	HTML reader: parse `<small>` as a Span with class "small". Closes #5080.	2018-11-15 22:36:01 -08:00
John MacFarlane	e61d1d0da9	Asciidoc writer: Render Spans using `[#id .class]#contents#`. See #5080.	2018-11-15 22:29:15 -08:00
John MacFarlane	1a102c11a9	Fix test case for #5014.	2018-11-13 14:50:26 -08:00
John MacFarlane	1cfdd3662f	HTML reader: allow thead containing a row with td rather than th. See #5014. Note that this doesn't address the original issue in #5014, only an unrelated side-issue.	2018-11-13 14:49:12 -08:00
John MacFarlane	52a57a5362	LaTeX writer: don't emit `[<+->]` unless beamer output, even if `writerIncremental` is True. See #5072.	2018-11-12 09:43:12 -08:00
John MacFarlane	5bc38a741b	Exactly match GitHub's identifier generating algorithm. See #5057.	2018-11-11 20:45:38 -08:00
John MacFarlane	a36d202e86	Text.Pandoc.Shared: add parameter to uniqueIdent, inlineListToIdentifier. The parameter is Extensions. This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`. This allows us to use `uniqueIdent` in the CommonMark reader, replacing some custom code. It also means that `gfm_auto_identifiers` can now be used in all formats. Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on how `gfm_auto_identifiers` and `ascii_identifiers` are set. Closes #5057.	2018-11-11 13:46:23 -08:00
John MacFarlane	5f030f3c2c	Add command test for #5050.	2018-11-06 22:57:11 -08:00
quasicomputational	a747268823	CommonMark writer: respect --ascii (#5043)	2018-11-05 09:33:10 -08:00
John MacFarlane	511d647290	XML: toHtml5Entities: prefer shorter entities... when there are several choices for a particular character.	2018-11-04 22:15:53 -08:00
John MacFarlane	805b9f8a12	Roff reader: Improved handling of custom strings as arguments. Added test.	2018-11-02 21:35:49 -07:00
John MacFarlane	26341c1632	Implement --ascii for Markdown writer.	2018-11-01 16:31:04 -07:00
John MacFarlane	f379edc4ad	HTML writer: use character entity references when possible for HTML5.	2018-11-01 16:08:27 -07:00
John MacFarlane	e0290fd18b	LaTeX writer: add newline if math ends in a comment. This prevents the closing delimiter from being swallowed up in the comment. Closes #4880.	2018-10-31 21:51:20 -07:00
John MacFarlane	c51be5dfc8	LaTeX reader: allow space at end of math after `\`. Closes #5010. Expose trimMath from T.P.Shared.	2018-10-29 22:20:14 -07:00
Albert Krewinkel	096cbe6987	Lua: allow access to pandoc state (#5015) * Lua: allow access to pandoc state Lua filters and custom writers now have read-only access to most fields of pandoc's internal state via the global variable `PANDOC_STATE`. * Lua: allow iterating through fields of PANDOC_STATE * Lua filters doc: describe CommonState * Lua filters doc: mention global variable PANDOC_STATE * Lua: add access to logs Log messages can currently only be printed, but not decomposed.	2018-10-25 22:12:14 -07:00
John MacFarlane	8efb8975ed	Groff writer character escaping changes. T.P.GroffChar: replaced `essentialEscapes` with `manEscapes`, which includes all the escapes mentioned in the groff_man manual. T.P.Writers.Groff: removed escapeCode; changed parameter on escapeString from Bool to new type `EscapeMode`. Rewrote `escapeString`.	2018-10-23 21:44:07 -07:00
Brian Leung	7eea5c62ed	LaTeX reader: add support for `nolinkurl` command. (#4992)	2018-10-22 23:36:44 -07:00
John MacFarlane	efbb329f1a	Groff escaping changes. - `--ascii` is now turned on automatically for man output, for portability. All man output will be escaped to ASCII. - In T.P.Writers.Groff, `escapeChar`, `escapeString`, and `escapeCode` now take a boolean parameter that selects ascii-only output. This is used by the Ms writer for `--ascii`, instead of doing an extra pass after writing the document. - In ms output without `--ascii`, unicode is used whenever possible (e.g. for double quotes). - A few escapes are changed: e.g. `\[rs]` instead of `\\` for backslash, and `\[ga]` instead of `` \` `` for backtick.	2018-10-18 10:21:34 -07:00
John MacFarlane	f48960b75f	Move common groff functions to Text.Pandoc.Writers.Groff (unexported module). These are used in both the man and ms writers. Moved groffEscape out of Text.Pandoc.Writers.Shared [cancels earlier API change from adding it, which was after last release]. This fixes strong/code combination on man (should be `\f[CB]` not `\f[BC]`), mentioned in #4973. Updated tests. Closes #4975.	2018-10-17 17:26:37 -07:00
Alexander Krotov	b3feaba6af	Man writer: use \f[R] instead of \f[] to reset font. Fixes #4973.	2018-10-17 18:29:07 +03:00
John MacFarlane	6f6ad0514d	LaTeX reader: make macroDef polymorphic and allow in inline context. Otherwise we can't parse something like ``` \lowercase{\def\x{Foo}} ``` I have actually seen tex like this in the wild.	2018-10-15 11:46:31 -07:00
John MacFarlane	22f81f78bd	Added failing test case for macros.	2018-10-15 00:37:17 -07:00
John MacFarlane	88faa45f1d	Markdown writer: ensure blank between raw block and normal content. Otherwise a raw block can prevent a paragraph from being recognized as such. Closes #4629.	2018-10-14 17:12:06 -07:00
John MacFarlane	cf8224045b	Markdown reader: Fix awkward soft break movements before abbreviations. Closes #4635.	2018-10-14 13:02:36 -07:00
John MacFarlane	f5c64c3060	HTML reader: fix htmlTag and isInlineTag to accept processing instructions. Fixes regression #3123 (since 2.0). Added regression test.	2018-10-11 09:58:25 -07:00
John MacFarlane	a92e43575f	LaTeX writer: with `--biblatex`, use `\autocite` when possible. `\autocites{a1}{a2}{a3}` will not collapse the entries. So, if we don't have prefixes and suffixes, we use instead `\autocite{a1;a2;a3}`. Closes #4960.	2018-10-08 20:47:09 -07:00
John MacFarlane	145710c4c3	RST reader: don't allow single-dash separator in headerless table. Closes #4382.	2018-10-07 12:37:08 -07:00
John MacFarlane	b806bff5b4	LaTeX reader: fix bugs omitting raw tex. The default is `-raw_tex`, so no raw tex should result unless we explicitly say `+raw_tex`. Previously some raw commands did make it through. Closes #4527.	2018-10-07 12:21:43 -07:00
John MacFarlane	08fef6b210	RST reader: pass through fields in unknown directives as div attributes. This commit also adds support for `class` and `name` attributes to directives in general. Closes #4715.	2018-10-07 11:44:11 -07:00
Brian Leung	e257b54124	Org reader: fix behavior for successive calls of `#+EXCLUDE_TAGS`. (#4951) Calling `#+EXCLUDE_TAGS` multiple times should preserve the status of the previously declared tags.	2018-10-05 22:21:20 -07:00
quasicomputational	6207bdeb68	CommonMark writer: add plain text fallbacks. (#4531) Previously, the writer would unconditionally emit HTMLish output for subscripts, superscripts, strikeouts (if the strikeout extension is disabled) and small caps, even with raw_html disabled. Now there are plain-text (and, where possible, fancy Unicode) fallbacks for all of these corresponding (mostly) to the Markdown fallbacks, and the HTMLish output is only used when raw_html is enabled. This commit adds exported functions `toSuperscript` and `toSubscript` to `Text.Pandoc.Writers.Shared`. [API change] Closes #4528.	2018-10-05 21:33:14 -07:00
Brian Leung	a26b3a2d6a	Org reader: Add partial support for `#+EXCLUDE_TAGS` option. (#4950) Closes #4284. Headers with the corresponding tags should not appear in the output. If one or more of the specified tags contains a non-tag character like `+`, Org-mode will not treat that as a valid tag, but will nonetheless continue scanning for valid tags. That behavior is not replicated in this patch; entering `cat+dog` as one of the entries in `#+EXCLUDE_TAGS` and running the file through Pandoc will cause the parser to fail and result in the only excluded tag being the default, `noexport`.	2018-10-05 14:28:17 -07:00
John MacFarlane	36f1846cc3	Implement `--ascii` (`writerPreferAscii`) in writers, not App. Now the `write*` functions for Docbook, HTML, ICML, JATS, Man, Ms, OPML are sensitive to `writerPreferAscii`. Previously the to-ascii translation was done in Text.Pandoc.App, and thus not available to those using the writer functions directly. In addition, the LaTeX writer is now sensitive to `writerPreferAscii` and to `--ascii`. 100% ASCII output can't be guaranteed, but the writer will use commands like `\"{a}` and `\l` whenever possible, to avoid emitting a non-ASCII character. A new unexported module, Text.Pandoc.Groff, has been added to store functions used in the different groff-based writers.	2018-09-30 22:32:00 -07:00
John MacFarlane	190ee279c9	LaTeX reader: allow verbatim blocks ending with blank lines. Closes #4624.	2018-09-29 10:57:11 -07:00
leungbk	6e8f31dab1	Force inline code blocks to honor export options. `exportsCode` is moved from `Blocks.hs` to `Shared.hs` and exported accordingly.	2018-09-26 08:49:13 +02:00
Brian Leung	72363cd2fc	Add support for multiprenote and multipostnote arguments in LaTeX. (#4930) * Add support for multiprenote and multipostnote arguments. The multiprenotes occur before the first prefix of a multicite, and the multipostnotes follow the last suffix. * Add test for multiprenote and multipostnote.	2018-09-25 20:49:13 -07:00
John MacFarlane	37c6f6adfe	RST reader: fix bug with internal link targets. They were gobbling up indented content underneath. Closes #4919.	2018-09-20 11:15:03 -07:00
John MacFarlane	136bf901aa	Markdown reader: distinguish autolinks in the AST. With this change, autolinks are parsed as Links with the `uri` class. (The same is true for bare links, if the `autolink_bare_uris` extension is enabled.) Email autolinks are parsed as Links with the `email` class. This allows the distinction to be represented in the URI. Formerly the `uri` class was added to autolinks by the HTML writer, but it had to guess what was an autolink and could not distinguish `[http://example.com](http://example.com)` from `<http://example.com>`. It also incorrectly recognized `[pandoc](pandoc)` as an autolink. Now the HTML writer simply passes through the `uri` attribute if it is present, but does not add anything. The Textile writer has been modified so that the `uri` class is not explicitly added for autolinks, even if it is present. Closes #4913.	2018-09-19 14:53:29 -07:00
John MacFarlane	44e4f7b292	Markdown reader: example_lists should work without startnum. Closes #4908.	2018-09-16 20:40:32 -07:00
mb21	5347e9454f	add test for --metadata-file	2018-09-15 17:06:10 +02:00
mb21	bd5500ba7f	add test yaml-metadata-blocks.md	2018-09-15 12:10:10 +02:00
John MacFarlane	fa4ebd71a3	LaTeX reader: resolve `\ref` for figure numbers.	2018-09-09 22:53:18 -07:00
John MacFarlane	a211edc819	HTML reader: parse `<script type="math/tex` tags as math. These are used by MathJax. Closes #4877.	2018-09-07 09:41:17 -07:00
John MacFarlane	85ed24e849	RST reader: don't skip link definitions after comments. Closes #4860.	2018-08-29 14:40:04 -07:00
John MacFarlane	a2c4261b32	HTML reader: allow enabling `raw_tex` extension. This now allows raw LaTeX environments, `\ref`, and `\eqref` to be parsed (which is helpful for translating HTML documents using MathJax). Closes #1126.	2018-08-24 18:04:00 -07:00
Alexander Krotov	937b92cd30	HTML reader: extract spaces inside links instead of trimming them. Fixes #4845.	2018-08-22 12:43:15 +03:00
John MacFarlane	3b5949e8f2	LaTeX reader: support blockcquote, foreignblockquote from csquotes. Also foreigncblockquote, hyphenblockquote, hyphencblockquote. Closes #4848. But note: currently foreignquote will be parsed as a regular Quoted inline (not using the quotes appropriate to the foreign language).	2018-08-21 21:03:43 -07:00
John MacFarlane	a733068ebf	LaTeX reader: support enquote*, foreignquote, hyphenquote... from csquotes. See #4848. Still TBD: blockquote, blockcquote, foreignblockquote.	2018-08-21 17:39:27 -07:00

333 commits