pandoc

Author	SHA1	Message	Date
Albert Krewinkel	6b462e5933	Lua: allow passing custom reader options to `pandoc.read` Reader options can now be passed as an optional third argument to `pandoc.read`. The object can either be a table or a ReaderOptions value like `PANDOC_READER_OPTIONS`. Creating new ReaderOptions objects is possible through the new constructor `pandoc.ReaderOptions`. Closes: #7656	2021-11-06 09:04:29 -07:00
Rowan Rodrik van der Molen	7a70a46c03	Support for <indexterm>s when reading DocBook (#7607) * Support for <indexterm>s when reading DocBook * Update implementation status of `<n-ary>` tags * Remove non-idiomatic parentheses * More complete `<indexterm>` support, with tests Co-authored-by: Rowan Rodrik van der Molen <rowan@ytec.nl>	2021-11-05 10:22:38 -07:00
John MacFarlane	938d557844	Docx reader: don't let first line indents trigger block quotes. This fixes a regression introduced in pandoc 2.15 by PR #7606. Closes #7655.	2021-11-02 14:04:38 -07:00
Albert Krewinkel	45bcd7d3f1	Lua: fix typo in SoftBreak constructor	2021-11-02 21:53:08 +01:00
Albert Krewinkel	dbc654e4a7	Lua tests: ensure Inline elements have all expected properties	2021-11-02 21:40:37 +01:00
Albert Krewinkel	421fd736d4	Lua: re-add `content` property to Strikeout elements Fixes a regression introduced in 2.15.	2021-11-02 21:22:59 +01:00
Albert Krewinkel	cce49c5d4b	Lua: be more forgiving when retrieving the Image `caption` property Fixes a regression introduced in 2.15.	2021-11-02 17:40:07 +01:00
Albert Krewinkel	b26f950cca	Lua: display Attr values using their native Haskell representation	2021-11-02 17:25:47 +01:00
Albert Krewinkel	c467f0fed1	Lua: allow omitting the 2nd parameter in pandoc.Code constructor Fixes a regression introduced in 2.15 which required users to always specify an Attr value when constructing a Code element.	2021-11-02 17:23:24 +01:00
Albert Krewinkel	210e4c98b0	Lua: allow comparing and showing Citation values Comparisons of Citation values are performed in Haskell; values are equal if they represent the same Haskell value. Converting a Citation value to a string now yields its native Haskell string representation.	2021-11-02 16:49:50 +01:00
Albert Krewinkel	3a0ac52f7b	Lua tests: ensure Block elements have expected properties	2021-11-02 10:23:14 +01:00
Albert Krewinkel	759aa50951	Lua: restore `content` property on Header elements	2021-11-01 15:43:51 +01:00
Albert Krewinkel	96e76d4cd4	Lua: restore List behavior of MetaList Fixes a regression introduced in 2.16 which had MetaList elements lose the `pandoc.List` properties. Fixes #7650	2021-11-01 08:27:14 +01:00
Albert Krewinkel	3de8f4fdc5	Lua: re-add `content` property to Link elements This was a regression introduced in version 2.15. Fixes: #7647	2021-10-31 11:15:50 +01:00
Tristan Stenner	2444cbc668	Docx writer: add IDs to native_numbering test	2021-10-29 08:40:20 -07:00
Tristan Stenner	31d6a494de	Update test golden master for docx native numbering	2021-10-29 08:40:20 -07:00
Albert Krewinkel	f4d9b443d8	Lua: use hslua module abstraction where possible This will make it easier to generate module documentation in the future.	2021-10-29 17:08:30 +02:00
Albert Krewinkel	e1cf0ad1be	Lua: fix placement of tests for Block elements in pandoc module tests	2021-10-28 14:10:54 +02:00
Albert Krewinkel	7fcf1d6184	Lua: re-add `t` and `tag` property to Attr values Removal of these properties from Attr values was a regression.	2021-10-27 22:32:19 +02:00
John MacFarlane	d226a35c0a	Switch back from HsYAML to yaml. Reasons: - Performance: HsYAML is around 20 times slower in parsing large YAML bibliographies (#6084). - An issue was submitted to HsYAML, but it hasn't gotten any attention. HsYAML seems borderline unmaintained; it hasn't had a commit in over a year. - Unfortunately this goes back on our attempts to free ourselves from C dependencies (#4535). But I don't see a better alternative until a better pure Haskell parser is available. Closes #6084. Notes: - We've removed the FromYAML instances for all types that had them, since this is a HsYAML-specific typeclass [API change]. (The yaml package just uses From/ToJSON.) - Unlike HsYAML (in the configuration we were using), yaml parses 'Y', 'N', 'Yes', 'No', 'On', 'Off' as boolean values. Users may need to quote these when they are meant to be interpreted as strings. Similarly, 'null' is parsed as a YAML null value (and will be treated as an empty string by pandoc rather than the string 'null'). Quoting it will force it to be interpreted as a string. - Some tests had to be adjusted accordingly. - Pandoc now behaves better when the YAML metadata contains escaping errors: instead of just falling back on treating the section as a table, it raises a YAML parsing error.	2021-10-27 12:50:51 -07:00
Albert Krewinkel	b95e864ecf	Lua: marshal SimpleTable values as userdata objects	2021-10-26 21:45:16 +02:00
Albert Krewinkel	a493c7029c	Lua: marshal Block values as userdata objects Properties of Block values are marshalled lazily, which generally improves performance considerably. Script users may also notice the following differences: - Block element properties can no longer be accessed by numerical indexing of the `.c` field. The `.c` property now serves as an alias for `.content`, so some filter that used this undocumented method for property access may continue to work, while others will need to be updated and use proper property names. - The marshalled Block elements now have a `show` method, and a `__tostring` metamethod. Both return the Haskell string representation of the element. - Block values now have the Lua type `userdata` instead of `table`.	2021-10-26 14:40:10 +02:00
Albert Krewinkel	230b133db5	Lua: marshal Citation values as userdata objects	2021-10-25 09:08:58 +02:00
John MacFarlane	113a66bd08	Fix more epub files in epub reader tests. Closes #7586.	2021-10-24 15:24:59 -07:00
John MacFarlane	73a105f59c	Clean up wasteland.epub and formatting.epub from reader tests. Make them valid according to epubcheck.	2021-10-24 15:11:02 -07:00
John MacFarlane	c282451fb2	Fixed test/epub/img.epub and img_no_cover.epub... so they're valid epubs.	2021-10-24 14:13:04 -07:00
John MacFarlane	6df9dc1262	Fix conformance errors in test/epub/features.epub and test/epub/formatting.epub. See #7586.	2021-10-23 23:23:22 -07:00
John MacFarlane	c712d13b67	Org reader: allow an initial :PROPERTIES: drawer to add to metadata. Closes #7520.	2021-10-22 22:10:25 -07:00
Albert Krewinkel	8523bb01b2	Lua: marshal Attr values as userdata - Adds a new `pandoc.AttributeList()` constructor, which creates the associative attribute list that is used as the third component of `Attr` values. Values of this type can often be passed to constructors instead of `Attr` values. - `AttributeList` values can no longer be indexed numerically.	2021-10-22 11:16:51 -07:00
Albert Krewinkel	e4287e6c95	Lua: marshal Pandoc values as userdata	2021-10-22 11:16:51 -07:00
Albert Krewinkel	9e74826ba9	Switch to hslua-2.0 The new HsLua version takes a somewhat different approach to marshalling and unmarshalling, relying less on typeclasses and more on specialized types. This allows for better performance and improved error messages. Furthermore, new abstractions make it possible to document the code and exposed functions.	2021-10-22 11:16:51 -07:00
John MacFarlane	0a93acf91a	Markdown reader: don't parse links or bracketed spans as citations. Previously pandoc would parse [link to (@a)](url) as a citation; similarly [(@a)]{#ident} This is undesirable. One should be able to use example references in citations, and even if `@a` is not defined as an example reference, `[@a](url)` should be a link containing an author-in-text citation rather than a normal citation followed by literal `(url)`. Closes #7632.	2021-10-20 10:34:47 -07:00
Milan Bracke	465c28d28e	Docx reader: fix handling of empty fields Some fields only have an instrText and no content. Pandoc didn't understand these, causing other fields to be misunderstood because it seemed like a field was still open when it wasn't.	2021-10-18 19:15:40 -07:00
Milan Bracke	6acc82c5d2	Docx parser: implement PAGEREF fields These fields, often used in tables of contents, can be a hyperlink.	2021-10-18 19:15:40 -07:00
Milan Bracke	193f6bfeba	Docx reader: fix handling of nested fields Fields delimited by fldChar elements can contain other fields. Before, the nested fields would be ignored, except for the end, which would be considered the end of the parent field. To fix this issue, fields needed to be considered containing ParParts instead of Runs, since a Run can't represent complex enough structures. This also impacted Hyperlinks since they can originate from a field.	2021-10-18 19:15:40 -07:00
Emily Bourke	8de261ba4e	pptx: Line up continuation paragraphs This commit changes the `marL` and `indent` values used for plain paragraphs and numbered lists, and changes the spacing defined in the reference doc master for bulleted lists. For paragraphs, there is now a left-indent taken from the `otherStyle` in the master. For numbered lists, the number is positioned where the text would be if this were a plain paragraph, and the text is indented to the next level. This means that continuation paragraphs line up nicely with numbered lists. It also /mostly/ matches the observed PowerPoint behaviour when inserting paragraphs and numbered lists: the only difference is that PowerPoint was using a different margin value for the first level numbered lists – I’ve changed this to match the other levels, as I don’t think it makes the spacing unappealing and it allows continuation paragraphs at any level to line up. With bulleted lists, I’m keeping the observed PowerPoint behaviour of specifying only a level, letting `marL` and `indent` be automatically taken from `bodyStyle`. To that end, this commit changes the `bodyStyle` spacing in the master of the default reference doc, to: - line up the text of the first paragraph in each bullet with any continuation paragraphs - line up nested bullet markers in any continuation paragraphs with the first paragraph, matching lists and plain paragraphs This does mean the continuation paragraphs still won’t line up for anyone using their own reference doc where they haven’t matched the `otherStyle` and `bodyStyle` indent levels, but I think people in that situation will be able to troubleshoot.	2021-10-17 17:24:30 -07:00
Emily Bourke	8af15ab345	pptx: Fix list level numbering In PowerPoint, the content of a top-level list is at the same level as the content of a top-level paragraph – the only difference is that a list style has been applied. At the moment, the pptx writer increments the paragraph level on each list, turning what should be top-level lists into second-level lists. This commit changes that logic, only incrementing the paragraph level on continuation paragraphs of lists. - Fixes https://github.com/jgm/pandoc/issues/4828 - Fixes https://github.com/jgm/pandoc/issues/4663	2021-10-17 17:24:30 -07:00
John MacFarlane	3f489bcb58	Ensure that babel is loaded also with pdflatex. This fixes a regression in #7604, which modernized babel usage but omitted to load babel for pdflatex, with the result that even simple documents could no longer be produced. Closes #7627.	2021-10-16 23:34:53 -07:00
Samuel Tardieu	a41c1fe0bb	asciidoc writer: translate numberLines attribute to linesnum switch Asciidoctor allows requesting line numbering on code blocks by using a switch on the `source` block, such as in: ``` [source%linesnum,haskell] ---- some Haskell code here ---- ```	2021-10-14 13:41:12 -07:00
Samuel Tardieu	628cde48cf	DocBook reader: honor linenumbering attribute The DocBook `linenumbering="numbered"` attribute on code blocks maps to "numberLines" internally.	2021-10-14 09:04:56 -07:00
John MacFarlane	49c4e1d014	Fix markdown parsing bug for math in bracketed spans and links. This affects math with unbalanced brackets (e.g. `$(0,1]$`) inside links, images, bracketed spans. Closes #7623.	2021-10-13 08:59:37 -07:00
John MacFarlane	4ba4533d70	Update wasteland tests. When we trimmed it down we left out some notes.	2021-10-11 09:09:51 -07:00
Milan Bracke	0f98cbff4b	Avoid blockquote when parent style has more indent When a paragraph has an indentation different from the parent (named) style, it used to be considered a blockquote. But this only makes sense when the paragraph has more indentation. So this commit adds a check for the indentation of the parent style.	2021-10-10 16:27:32 -07:00
John MacFarlane	c72277e986	LaTeX reader: Properly handle `\^` followed by group closing. Closes #7615.	2021-10-10 11:24:28 -07:00
Emily Bourke	aa78765bf9	pptx: Remove excessive layout tests When I added the tests for moved layouts and deleted layouts, I added them to all tests. However, this doesn’t really give a lot more info than having single tests, and the extra tests take up time and disk space. This commit removes the moved-layouts and deleted-layouts tests, in favour of a single test for each of those scenarios.	2021-10-07 08:45:43 -07:00
John MacFarlane	b8d460eeab	Powerpoint writer: consolidate text runs when possible. This slims down the output files by avoiding unnecessary text run elements. Updated golden tests.	2021-10-04 12:24:12 -07:00
John MacFarlane	11baeb8850	OOXML tests: use pretty-printed form to display diffs. Otherwise everything is on one line and the diff is uninformative.	2021-10-04 12:12:16 -07:00
John MacFarlane	82d587493d	Revert "Powerpoint writer: consolidate text run nodes." This reverts commit `62f83aa486`. This was already being done, it seems. I misidentified the problem; it is really with `Str ""` nodes.	2021-10-04 11:50:32 -07:00
John MacFarlane	62f83aa486	Powerpoint writer: consolidate text run nodes. This should reduce the size of the generated files.	2021-10-04 11:45:01 -07:00
John MacFarlane	0088e79cf5	Update tests for babel-related changes in latex template.	2021-10-03 19:27:37 -07:00
John MacFarlane	6ff04ac52d	Fix compareXML helper in Tests.Writers.OOXML. Given how it is used, we were getting "mine" and "good" flipped in the test results.	2021-10-02 06:52:40 -07:00
Ezwal	472b33095e	Docx reader: Add placeholder for word diagram	2021-09-30 12:44:44 -07:00
John MacFarlane	92abe45863	Further test updates for switch to pretty-show.	2021-09-29 08:28:54 -07:00
John MacFarlane	0bdcf415e4	Switch from pretty-simple to pretty-show for native output. Update tests. Reason: it turns out that the native output generated by pretty-simple isn't always readable by the native reader. According to https://github.com/cdepillabout/pretty-simple/issues/99 it is not a design goal of the library that the rendered values be readable using 'read'. This makes it unsuitable for our purposes. pretty-show is a bit slower and it uses 4-space indents (non-configurable), but it doesn't have this serious drawback.	2021-09-28 21:17:53 -07:00
John MacFarlane	665e6d3d94	BibTeX parser: fix expansion of special strings in series... e.g. `newseries` or `library`. Expansion should not happen when these strings are protected in braces, or when they're capitalized. Closes #7591.	2021-09-23 22:21:05 -07:00
John MacFarlane	aa89f6be18	HTML reader: handle empty tbody element in table. Closes #7589.	2021-09-23 09:25:37 -07:00
John MacFarlane	3de0d5a977	test/epub: an excerpt from The Wasteland is enough! Saves over 100K.	2021-09-21 18:54:42 -07:00
John MacFarlane	c6fa0bc271	Revert "Remove unused epub test file features.epub." This reverts commit `83ebb85b64`.	2021-09-21 17:03:25 -07:00
John MacFarlane	a88b0098d6	Make test/epub/wasteland.epub valid.	2021-09-21 16:49:58 -07:00
John MacFarlane	83ebb85b64	Remove unused epub test file features.epub.	2021-09-21 16:42:33 -07:00
John MacFarlane	c266734448	Use pretty-simple to format native output. Previously we used our own homespun formatting. But this produces over-long lines that aren't ideal for diffs in tests. Easier to use something off-the-shelf and standard. Closes #7580. Performance is slower by about a factor of 10, but this isn't really a problem because native isn't suitable as a serialization format. (For serialization you should use json, because the reader is so much faster than native.)	2021-09-21 12:37:42 -07:00
John MacFarlane	5f7e7f539a	Add missing `%` on command tests. This prevented `--accept` from working properly.	2021-09-21 10:42:24 -07:00
John MacFarlane	a1ca51c979	Command tests: raise error if command doesn't begin with `%`.	2021-09-21 10:42:14 -07:00
John MacFarlane	dd7b83ac91	Use babel, not polyglossia, with xelatex. Previously polyglossia worked better with xelatex, but that is no longer the case, so we simplify the code so that babel is used with all latex engines. This involves a change to the default LaTeX template.	2021-09-19 09:40:59 -07:00
Emily Bourke	50adea220d	pptx: Support footers in the reference doc In PowerPoint, it’s possible to specify footers across all slides, containing a date (optionally automatically updated to today’s date), the slide number (optionally starting from a higher number than 1), and static text. There’s also an option to hide the footer on the title slide. Before this commit, none of that footer content was pulled through from the reference doc: this commit supports all the functionality listed above. There is one behaviour which may not be immediately obvious: if the reference doc specifies a fixed date (i.e. not automatically updating), and there’s a date specified in the metadata for the document, the footer date is replaced by the metadata date. - Include date, slide number, and static footer content from reference doc - Respect “slide number starts from” option - Respect “Don’t show on title slide” option - Add tests	2021-09-18 09:55:45 -07:00
John MacFarlane	57d93cca56	Org writer: don't indent contents of code blocks. We previously indented them by two spaces, following a common convention. Since the convention is fading, and the indentation is inconvenient for copy/paste, we are discontinuing this practice. Closes #5440.	2021-09-17 09:41:34 -07:00
John MacFarlane	a07d955d6f	Fix code blocks using `--preserve-tabs`. Previously they did not behave as the equivalent input with spaces would. Closes #7573.	2021-09-16 20:46:05 -07:00
Emily Bourke	7c22c0202e	pptx: Support specifying slide background images In the reveal-js output, it’s possible to use reveal’s `data-background-image` class on a slide’s title to specify a background image for the slide. With this commit, it’s possible to use `background-image` in the same way for pptx output. Only the “stretch” mode is supported, and the background image is centred around the slide in the image’s larger axis, matching the observed default behaviour of PowerPoint. - Support `background-image` per slide. - Add tests. - Update manual.	2021-09-16 19:45:53 -07:00
Emily Bourke	0fb6474a55	pptx: Add support for incremental lists - Support -i option - Support incremental/noincremental divs - Support older block quote syntax - Add tests One thing not clear from the manual is what should happen when the input uses a combination of these things. For example, what should the following produce? ```md ::: {.incremental .nonincremental} - are - these - incremental? ::: ::: incremental ::::: nonincremental - or - these? ::::: ::: ::: nonincremental > - how > - about > - these? ::: ``` In this commit I’ve taken the following approach, matching the observed behaviour for beamer and reveal.js output: - if a div has both classes, incremental wins - the innermost incremental/nonincremental div is the one which takes effect - a block quote containing a list as its first element inverts whether the list is incremental, whether or not the quote is inside an incremental/non-incremental div I’ve added some tests to verify this behaviour. This commit closes issue #5689 (https://github.com/jgm/pandoc/issues/5689).	2021-09-15 09:13:05 -07:00
John MacFarlane	a3162d341b	RST reader: handle escaped colons in reference definitions. Closes #7568.	2021-09-13 22:57:08 -07:00
Emily Bourke	0ebe65e651	pptx: Fix logic for choosing Comparison layout There was a mistake in the logic used to choose between the Comparison and Two Content layouts: if one column contained only non-text (an image or a table) and the other contained only text, the Comparison layout was chosen instead of the desired Two Content layout. This commit fixes that logic: > If either column contains text followed by non-text, use Comparison. Otherwise, use Two Content. It also adds a test asserting this behaviour.	2021-09-13 08:30:36 -07:00
John MacFarlane	6271b09c50	Docx writer: make id used in native_numbering predictable. If the image has the id IMAGEID, then we use the id ref_IMAGEID for the figure number. Closes #7551. This allows one to create a filter that adds a figure number with figure name, e.g. <w:fldSimple w:instr=" REF ref_superfig "><w:r><w:t>Figure X</w:t></w:r></w:fldSimple> For this to be possible it must be possible to predict the figure number id from the image id. If images lack an id, an id of the form `ref_fig1` is used.	2021-09-12 15:30:29 -07:00
Emily Bourke	2b98991551	pptx: Include all themes in output archive - Accept test changes: they’re adding the second theme (for all tests not containing speaker notes), or changing its position in the XML (for the ones containing speaker notes).	2021-09-10 17:06:45 -07:00
Emily Bourke	8ec9b884f1	pptx: Fix capitalisation of notesMasterId I don’t think this has caused any problems, but before now it’s been "NotesMasterId", which is incorrect according to [ECMA-376]. [ECMA-376]: https://www.ecma-international.org/publications-and-standards/standards/ecma-376/	2021-09-10 17:06:45 -07:00
John MacFarlane	8beca46611	Fix command test for #7557.	2021-09-10 12:07:11 -07:00
John MacFarlane	0216a2f504	Org reader: don't parse a list as first item in a list item. Closes #7557.	2021-09-10 09:50:05 -07:00
Francesco Mazzoli	99a4d1d0b0	Support `--reference-location` for HTML output (#7461) The HTML writer now supports `EndOfBlock`, `EndOfSection`, and `EndOfDocument` for reference locations. EPUB and HTML slide show formats are also affected by this change. This works similarly to the markdown writer, but with special care taken to skip section divs with regard to the block level. The change also takes care not to modify the output if `EndOfDocument` is used.	2021-09-10 09:30:05 -07:00
John MacFarlane	b185560a8e	RTF reader: better handling of `\` and bookmarks. We now ensure that groups starting with `\` never cause text to be added to the document. In addition, bookmarks now create a span between the start and end of the bookmark, rather than an empty span.	2021-09-04 11:06:01 -07:00
Emily Bourke	b82a01b688	pptx: Add support for more layouts Until now, the pptx writer only supported four slide layouts: “Title Slide” (used for the automatically generated metadata slide), “Section Header” (used for headings above the slide level), “Two Column” (used when there’s a columns div containing at least two column divs), and “Title and Content” (used for all other slides). This commit adds support for three more layouts: Comparison, Content with Caption, and Blank. - Support “Comparison” slide layout This layout is used when a slide contains at least two columns, at least one of which contains some text followed by some non-text (e.g. an image or table). The text in each column is inserted into the “body” placeholder for that column, and the non-text is inserted into the ObjType placeholder. Any extra content after the non-text is overlaid on top of the preceding content, rather than dropping it completely (as currently happens for the two-column layout). + Accept straightforward test changes Adding the new layout means the “-deleted-layouts” tests have an additional layout added to the master and master rels. + Add new tests for the comparison layout + Add new tests to pandoc.cabal - Support “Content with Caption” slide layout This layout is used when a slide’s body contains some text, followed by non-text (e.g. an image or a table). Before now, in this case the image or table would break onto a new slide: to get that output again, users can add a horizontal rule before the image or table. + Accept straightforward tests The “-deleted-layouts” tests all have an extra layout and relationship in the master for the Content with Caption layout. + Accept remove-empty-slides test Empty slides are still removed, but the Content with Caption layout is now used. + Change slide-level-0/h1-h2-with-text description This test now triggers the content with caption layout, giving a different (but still correct) result. + Add new tests for the new layout + Add new tests to the cabal file - Support “Blank” slide layout This layout is used when a slide contains only blank content (e.g. non-breaking spaces). No content is inserted into any placeholders in the layout. Fixes #5097. + Accept straightforward test changes Blank layout now copied over from reference doc as well, when layouts have been deleted. + Add some new tests A slide should use the blank layout if: - It contains only speaker notes - It contains only an empty heading with a body of nbsps - It contains only a heading containing only nbsps - Change ContentType -> Placeholder This type was starting to have a constructor for each placeholder on each slide (e.g. `ComparisonUpperLeftContent`). I’ve changed it instead to identify a placeholder by type and index, as I think that’s clearer and less redundant. - Describe layout-choosing logic in manual	2021-09-01 07:16:17 -07:00
Emily Bourke	8dbea49092	pptx: Restructure tests - Use dashes consistently rather than underscores - Make a folder for each set of tests - List test files explicitly (Cabal doesn’t support ** until version 2.4)	2021-09-01 07:16:17 -07:00
John MacFarlane	5dcd4610e2	Improve asciidoc escaping for `--` in URLs. Closes #7529.	2021-08-29 10:12:20 -07:00
Emily Bourke	8e5a79f264	pptx: Make first heading title if slide level is 0 Before this commit, the pptx writer adds a slide break before any table, “columns” div, or paragraph starting with an image, unless the only thing before it on the same slide is a heading at the slide level. In that case, the item and heading are kept on the same slide, and the heading is used as the slide title (inserted into the layout’s “title” placeholder). However, if the slide level is set to 0 (as was recently enabled) this makes it impossible to have a slide with a title which contains any of those items in its body. This commit changes this behaviour: now if the slide level is 0, then items will be kept with a heading of any level, if the heading’s the only thing before the item on the same slide.	2021-08-27 09:47:03 -07:00
John MacFarlane	e4d7a6177f	Ensure we have unique ids for wp:docPr and pic:cNvPr elements. This will, I hope, fix #7527 and #7503.	2021-08-27 09:42:59 -07:00
John MacFarlane	7ff06a8c43	Fix test for #7521.	2021-08-24 12:56:31 -07:00
John MacFarlane	3f9b7a10ad	Markdown reader: fix interaction of --strip-comments and list parsing. Use of `--strip-comments` was causing tight lists to be rendered as loose (as if the comment were a blank line). Closes #7521.	2021-08-23 22:06:39 -07:00
Simon Schuster	591cdca38b	LaTeX-parser: restrict \endinput to current file	2021-08-21 18:08:27 -07:00
John MacFarlane	07d847a910	RST reader: Fix `:literal:` includes. These should create code blocks, not insert raw RST. Closes #7513.	2021-08-20 09:54:42 -07:00
Emily Bourke	5616d00d09	pptx: Include image title in description The image title (i.e. `![alt text](link "title")`) was previously ignored when writing to pptx. This commit includes it in PowerPoint's description of the image, along with the link (which was already included). Fixes #7352.	2021-08-18 10:10:55 -07:00
John MacFarlane	fd99fe4d7e	Revise citeproc code to fit new citeproc 0.5 API. Linkification of URLs in the bibliography is now done in the citeproc library, depending on the setting of an option. We set that option depending on the value of the metadata field `link-bibliography` (defaulting to true, for consistency with earlier behavior, though the new behavior includes the CSL draft recommendation of hyperlinking the title or the whole entry if a DOI, PMID, PMCID, or URL field is present but not explicitly rendered). These changes implement the following recommendations from the draft CSL v1.0.2 spec (Appendix VI): > The CSL syntax does not have support for configuration of links. > However, processors should include links on bibliographic references, > using the following rules: > If the bibliography entry for an item renders any of the following > identifiers, the identifier should be anchored as a link, with the > target of the link as follows: > - url: output as is > - doi: prepend with "`https://doi.org/`" > - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`" > - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`" > If the identifier is rendered as a URI, include rendered URI components > (e.g. "`https://doi.org/`") in the link anchor. Do not include any other > affix text in the link anchor (e.g. "Available from: ", "doi: ", "PMID: "). > If the bibliography entry for an item does not render any of > the above identifiers, then set the anchor of the link as the item > title. If title is not rendered, then set the anchor of the link as the > full bibliography entry for the item. Set the target of the link as one > of the following, in order of priority: > > - doi: prepend with "`https://doi.org/`" > - pmcid: prepend with "`https://www.ncbi.nlm.nih.gov/pmc/articles/`" > - pmid: prepend with "`https://www.ncbi.nlm.nih.gov/pubmed/`" > - url: output as is > > If the item data does not include any of the above identifiers, do not > include a link. > > Citation processors should include an option flag for calling > applications to disable bibliography linking behavior. Thanks to Benjamin Bray for getting this all working.	2021-08-17 15:34:23 -07:00
John MacFarlane	2e9a8935fb	OOXML tests: silence warnings. These can make the test output confusing, making people think tests are failing when they're passing.	2021-08-17 15:33:10 -07:00
Emily Bourke	72823ad947	pptx: Select layouts from reference doc by name Until now, users had to make sure that their reference doc contains layouts in a specific order: the first four layouts in the file had to have a specific structure, or else pandoc would error (or sometimes successfully produce a pptx file, which PowerPoint would then fail to open). This commit changes the layout selection to use the layout names rather than order: users must make sure their reference doc contains four layouts with specific names, and if a layout with the right name isn’t found pandoc will output a warning and use the corresponding layout from the default reference doc as a fallback. I believe the use of names rather than order will be clearer to users, and the clearer errors will help them troubleshoot when things go wrong. - Add tests for moved layouts - Add tests for deleted layouts - Add newly included layouts to slideMaster1.xml to fix tests	2021-08-17 09:35:25 -07:00
Emily Bourke	9204e5c9b1	Don’t compare cdLine in OOXML golden tests The `cdLine` field gives the line of the file some CData was found on. I don’t think this is a difference that should fail these golden tests, as the XML should still be parsable if nothing else has changed.	2021-08-17 09:35:25 -07:00
Emily Bourke	8474d488a5	Provide more detailed XML diff in tests I had some failing tests and couldn’t tell what was different in the XML. Updating the comparison to return what’s different made it easier to figure out what was wrong, and I think will be helpful for others in future.	2021-08-17 09:35:25 -07:00
OCzarnecki	e37cf4484d	Multimarkdown sub- and superscripts (#5512) (#7188) Added an extension `short_subsuperscripts` which modifies the behavior of `subscript` and `superscript`, allowing subscripts or superscripts containing only alphanumerics to end with a space character (e.g. `x^2 = 4` or `H~2 is combustible`). This improves support for multimarkdown. Closes #5512. Add `Ext_short_subsuperscripts` constructor to `Extension` [API change]. This is enabled by default for `markdown_mmd`.	2021-08-15 21:57:57 -07:00
John MacFarlane	4340bd52c4	Make docx writer sensitive to `native_numbering` extension. Figure and table numbers are now only included if `native_numbering` is enabled. (By default it is disabled.) This is a behavior change with respect to 2.14.1, but the behavior is that of previous versions. The change was necessary to avoid incompatibilities between pandoc's native numbering and third-party cross reference filters like pandoc-crossref. Closes #7499.	2021-08-15 15:05:54 -07:00
John MacFarlane	2c466a15af	Remove misleading description from command/citeproc-87 test.	2021-08-15 09:26:30 -07:00
John MacFarlane	82638ad53b	Convert Quoted in bib entries to special Spans... before passing them off to citeproc. This ensures that we get proper localization and flipflopping if, e.g., quotes are used in titles. Closes jgm/citeproc#87.	2021-08-13 19:25:29 -07:00
John MacFarlane	15683bb607	Citeproc: avoid odd handling of quotes. citeproc changes allow us to ignore Quoted elements; citeproc now uses its own method for representing quoted things, and only localizes and flipflops quotes it adds itself. See #87. The one thing left to do is to convert Quoted elements in bibliography databases (esp. titles) to `Span ("",["csl-quoted"],[])` before passing them to citeproc, IF the localized quotes for the quote type match the standard inverted commas.	2021-08-13 18:13:06 -07:00
John MacFarlane	418155aa95	Fix raw LaTeX injection issue (LaTeX writer). Using a code block containing `\end{verbatim}`, one could inject raw TeX into a LaTeX document even when `raw_tex` is disabled. Thanks to Augustin Laville for noticing the bug. Closes #7497.	2021-08-13 11:27:04 -07:00
William Lupton	fc20672bb9	Various sample.lua editorial fixes. (#7493) These address most of the items mentioned in #7487. There's also a table caption fix (the caption wasn't escaped).	2021-08-12 10:16:34 -07:00
John MacFarlane	dd1a956a8a	LaTeX reader: Support `\global` before `\def`, `\let`, etc. See #7494.	2021-08-11 16:28:53 -07:00
John MacFarlane	e3a263df46	Fix scope for LaTeX macros. They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.	2021-08-11 16:14:34 -07:00
John MacFarlane	a0e44b1ff6	LaTeX reader: improve handling of plain TeX macro primitives. - Fixed semantics for `\let`. - Implement `\edef`, `\gdef`, and `\xdef`. - Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit. Closes #7474.	2021-08-11 10:32:52 -07:00
John MacFarlane	06d97131e5	Tests.Helpers: export testGolden and use it in RTF reader. This gives a diff output on failure.	2021-08-10 22:07:48 -07:00
John MacFarlane	3a924d8f96	HTML reader: treat comments as blank when parsing. This modifies pBlank. Previously comments could sometimes flummox the parser. Closes #7482.	2021-08-10 12:50:23 -07:00
John MacFarlane	7ca4233793	Add test for #7488.	2021-08-10 11:11:33 -07:00
John MacFarlane	6543b05116	Add RTF reader. - `rtf` is now supported as an input format as well as output. - New module Text.Pandoc.Readers.RTF (exporting `readRTF`). [API change] Closes #3982.	2021-08-10 10:48:55 -07:00
John MacFarlane	dea1f0f080	RTF writer: emit \outlinelevel for section headings.	2021-08-04 16:37:20 -06:00
Peter Fabinski	8667ba2bcc	LaTeX table writer: Increase column width precision (#7466) In some cases, the rounding performed by the LaTeX table writer would introduce visible overrun outside the text area. This adds two more decimal places to the width values.	2021-08-03 15:34:39 -06:00
John MacFarlane	f938378d00	RTF writer: omit `\bin` in `\pict`. According to the spec, this is not needed or wanted when the data is in hexadecimal format, as it is here.	2021-08-01 22:45:41 -06:00
John MacFarlane	ca12e198ba	RTF template: specify font family for fixed-width font f1. According to the spec, this is mandatory.	2021-08-01 09:45:09 -06:00
Jan Tojnar	06408d08e5	DocBook reader: add support for citerefentry (#7437) Originally intended for referring to UNIX manual pages, either part of the same DocBook document as refentry element, or external – hence the manvolnum element. These days, refentry is more general, for example the element documentation pages linked below are each a refentry. As per the Processing expectations section of citerefentry, the element is supposed to be a hyperlink to a refentry (when in the same document) but pandoc does not support refentry tag at the moment so that is moot. https://tdg.docbook.org/tdg/5.1/citerefentry.html https://tdg.docbook.org/tdg/5.1/manvolnum.html https://tdg.docbook.org/tdg/5.1/refentry.html This roughly corresponds to a `manpage` role in rST syntax, which produces a `Code` AST node with attributes `.interpreted-text role=manpage` but that does not fit DocBook parser. https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-manpage	2021-07-11 15:28:52 -07:00
John MacFarlane	ac0a9da6d8	Improved parsing of raw LaTeX from Text streams (rawLaTeXParser). We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.	2021-07-11 13:50:28 -07:00
John MacFarlane	ae22b1e977	RST reader: fix regression with code includes. With the recent changes to include infrastructure, included code blocks were getting an extra newline. Closes #7436. Added regression test.	2021-07-09 12:27:41 -07:00
Michael Hoffmann	e56e2b0e0b	Recognize data-external when reading HTML img tags (#7429) Preserve all attributes in img tags. If attributes have a `data-` prefix, it will be stripped. In particular, this preserves a `data-external` attribute as an `external` attribute in the pandoc AST.	2021-07-06 16:06:29 -07:00
John MacFarlane	3a31fe68ef	Add command test for #7394. And fix a small bug in handling of citations in notes, which led to commas at the end of sentences in some cases.	2021-07-05 15:10:14 -07:00
Mauro Bieg	de4da56079	document-css: reset overflow-wrap on code blocks fixes #7423	2021-07-05 08:57:23 -07:00
John MacFarlane	972db3cdca	Revert "LaTeX template: move title, author, date up to top of preamble." This reverts commit `cc088687b4` and PR #7295. This fixes issues people had when using LaTeX commands defined later in the preamble (or in some cases UTF-8 text) in the title or author fields. Closes #7422.	2021-07-03 15:34:42 -07:00
Aner Lucero	cb038bb312	HTML5 writer, remove aria-hidden when explicit alt text is provided.	2021-07-02 13:02:52 -07:00
John MacFarlane	0948af9cc5	Docx writer: Add table numbering for captioned tables. The numbers are added using fields, so that Word can create a list of tables that will update automatically.	2021-06-29 11:15:40 -07:00
John MacFarlane	a3d745e485	Docx writer: support figure numbers. These are set up in such a way that they will work with Word's automatic table of figures. Closes #7392.	2021-06-29 09:56:21 -07:00
John MacFarlane	b7572db224	Use dev version of citeproc. This eliminates double hyperlinks in author-in-text citations. Author-only citations are no longer hyperlinked. See jgm/citeproc#77.	2021-06-29 09:18:49 -07:00
Aner Lucero	f4ef652a41	Remove duplicated alt text in HTML output.	2021-06-29 09:02:13 -07:00
John MacFarlane	851d037b3e	Improve punctuation moving with `--citeproc`. Previously, using `--citeproc` could cause punctuation to move in quotes even when there are no citations. This has been changed; now, punctuation moving is limited to citations. In addition, we only move footnotes around punctuation if the style is a note style, even if `notes-after-punctuation` is `true`.	2021-06-28 22:41:14 -07:00
John MacFarlane	dd098d4e15	Markdown writer: put space between Plain and following fenced Div. Closes #4465.	2021-06-28 11:33:22 -07:00
John MacFarlane	1b07997f4a	Fix regression with comment-only YAML metadata blocks. Closes #7400.	2021-06-22 09:55:50 -07:00
John MacFarlane	8eed5b90d0	LaTeX writer: add strut at end of minipage if it contains... line breaks. Without them, the last line is shorter than it should be, at least in some cases.	2021-06-21 23:33:00 -07:00
John MacFarlane	2ef2049b4e	Update command test for change to LaTeX LineBreak handling.	2021-06-21 22:34:38 -07:00
John MacFarlane	ed3974a254	LaTeX writer: always use a minipage for cells with line breaks... if width information is available. Otherwise the way we treat them can lead to content that overflows a cell. Closes #7393.	2021-06-21 18:25:36 -07:00
John MacFarlane	a39313eddb	Fix test for #7397	2021-06-21 09:30:23 -07:00
John MacFarlane	82ad855f38	Markdown writer: Fix regression in code blocks with attributes. Code blocks with a single class but nonempty attributes were having attributes dropped as a result of #7242. Closes #7397.	2021-06-21 08:49:00 -07:00
John MacFarlane	b0cd6c6224	Fix regression in citeproc processing. If inline references are used (in the metadata `references` field), we should still only include in the bibliography items that are actually cited -- unless `nocite` is used. Closes #7376.	2021-06-12 10:16:44 -07:00
John MacFarlane	21cc52abe3	LaTeX writer: Fix regression in table header position. In recent versions the table headers were no longer bottom-aligned (if more than one line). This patch fixes that by using minipages for table headers in non-simple tables. Closes #7347.	2021-06-05 14:13:58 -06:00
Jan Tojnar	af9de925de	DocBook writer: Remove non-existent admonitions attention, error and hint are actually just reStructuredText specific. danger was too until introduced in DocBook 5.2: https://github.com/docbook/docbook/issues/55	2021-06-05 08:02:21 -06:00
John MacFarlane	2e4ef14d91	Markdown reader: fix pipe table regression in 2.11.4. Previously pipe tables with empty headers (that is, a header line with all empty cells) would be rendered as headerless tables. This broke in 2.11.4. The fix here is to produce an AST with an empty table head when a pipe table has all empty header cells. Closes #7343.	2021-06-01 21:44:55 -06:00
John MacFarlane	abb59bd582	LaTeX reader: don't allow optional * on symbol control sequences. Generally we allow optional starred variants of LaTeX commands (since many allow them, and if we don't accept these explicitly, ignoring the star usually gives acceptable results). But we don't want to do this for `$*$` and similar cases. Closes #7340.	2021-06-01 13:54:51 -06:00
John MacFarlane	62f46b3995	Fix regression with commonmark/gfm yaml metadata block parsing. A regression in 2.14 led to the document body being omitted after YAML metadata in some cases. This is now fixed. Closes #7339.	2021-05-31 21:34:51 -06:00
John MacFarlane	cc206af392	Have LoadedResource use relative paths. The immediate reason for this is to allow the test output of #3752 to work on both windows and linux.	2021-05-30 10:23:00 -07:00
John MacFarlane	c210b98366	Fix test #3752 (1) for Windows.	2021-05-29 14:36:49 -07:00
John MacFarlane	5772f7f943	Further test image size reductions.	2021-05-29 12:27:59 -07:00
John MacFarlane	7aade73dce	Replace biblatex-exmaples.bib with shorter averroes.bib in tests.	2021-05-29 12:14:37 -07:00
John MacFarlane	e86f6abc45	Further test image size reductions.	2021-05-29 12:09:21 -07:00
John MacFarlane	3ba9ef01eb	Reduce size of image in fb2 image test.	2021-05-29 11:54:03 -07:00
John MacFarlane	5cf887db20	Reduce size of cover image in test epub.	2021-05-29 11:48:52 -07:00
John MacFarlane	8660f42f09	Modify pptx tests to take a whole lot less space. - Replace a 300K image in the reference pptx with a 2K one. - Updated all the *_templated.pptx files based on the new reference pptx. - These changes should reduce the size of the tarball by roughly 7 MB! See haskell/hackage-server#935	2021-05-29 10:59:14 -07:00
John MacFarlane	b6b2331fdc	Support `rebase_relative_paths` for commonmark based formats. (Including `gfm`.)	2021-05-28 13:58:44 -07:00
Emily Bourke	56b211120c	Docx reader: Support new table features. * Column spans * Row spans - The spec says that if the `val` attribute is omitted, its value should be assumed to be `continue`, and that its values are restricted to {`restart`, `continue`}. If the attribute has any other value, I think it seems reasonable to default it to `continue`. It might cause problems if the spec is extended in the future by adding a third possible value, in which case this would probably give incorrect behaviour, and wouldn't error. * Allow multiple header rows * Include table description in simple caption - The table description element is like alt text for a table (along with the table caption element). It seems like we should include this somewhere, but I’m not 100% sure how – I’m pairing it with the simple caption for the moment. (Should it maybe go in the block caption instead?) * Detect table captions - Check for caption paragraph style /and/ either the simple or complex table field. This means the caption detection fails for captions which don’t contain a field, as in an example doc I added as a test. However, I think it’s better to be too conservative: a missed table caption will still show up as a paragraph next to the table, whereas if I incorrectly classify something else as a table caption it could cause havoc by pairing it up with a table it’s not at all related to, or dropping it entirely. * Update tests and add new ones Partially fixes: #6316	2021-05-28 20:15:23 +02:00
Emily Bourke	44484d0dee	Docx reader: Read table column widths.	2021-05-28 20:15:23 +02:00
John MacFarlane	4842c5fb82	Two citeproc locator/suffix improvements: - Recognize locators spelled with a capital letter. Closes #7323. - Add a comma and a space in front of the suffix if it doesn't start with space or punctuation. Closes #7324.	2021-05-27 18:28:52 -07:00
John MacFarlane	4b16d181e7	rebase_relative_paths: leave empty paths unchanged.	2021-05-27 14:16:37 -07:00


1883 commits