* docx writer: support custom properties. Solves the writer part of #3024.
Also supports additional core properties: `subject`, `lang`, `category`,
`description`.
* odt writer: improve standard properties, including the following core properties:
`generator` (Pandoc/VERSION), `description`, `subject`, `keywords`,
`initial-creator` (from authors), `creation-date` (actual creation date).
Also fix date.
* pptx writer: support custom properties. Also supports additional core
properties: `subject`, `category`, `description`.
* Includes golden tests.
* MANUAL: document metadata support for docx, odt, pptx writers
Word has a 40 character limit for bookmark names. In
addition, bookmarks must begin with a letter. Since
pandoc's auto-generated identifiers may not respect
these constraints, some internal links did not work.
With this change, pandoc uses a bookmark name based
on the SHA1 hash of the identifier when the identifier
isn't a legal bookmark name.
Closes#5091.
It makes more sense not to interpret -- otherwise using the original
document as the reference-doc would produce two of everything: the
interpreted version and the uninterpreted style version.
This commit adjusts the test cases for the Docx writer after the fix of #3930.
- Adjusted test cases with inline images. The inline images now have the correct sizing, title and description.
- Modified the test case to include an image multiple times with different sizing each time.
- Tested on Windows 8.1 with Word 2007 (12.0.6705.5000) The files are not corrupted and display exactly what is expected.
Previously Emph, Strong, etc were outside the custom-style span. This
moves them inside in order to make it easier to write filters that act
on the formatting in these contents.
Tests and MANUAL example are changed to match.
The previous commit had a bug where custom-style spans would be read
with every recurrsion. This fixes that, and changes the example given
in the manual.
These are based off the reader tests, with some removed (where the
reader output was identical, based on different docx inputs). There
are still more to be added. In particular, tests for custom-styles
need to be added.
All golden docx files have been checked in MS Word
2013 (windows). There is no corruption.
There is questionable output in the `tables` test: the three tables
seemed to be joined. This will be addressed in a future commit, and
the golden docx file will be changed.
This is difficult to recreate with a modern version of Word, so I'm
using the file submitted with the bug report. It would be preferable
to find a smaller example with Latin characters, though, so as not to
confuse the issue being tested.
There isn't any reason to have numberous anchors in the same place,
since we can't maintain docx's non-nesting overlapping. So we reduce
to a single anchor, and have all links pointing to one of the
overlapping anchors point to that one. This changes the behavior from
commit e90c714c7 slightly (use the first anchor instead of the last)
so we change the expected test result.
Note that because this produces a state that has to be set after every
invocation of `parPartToInlines`, we make the main function into a
primed subfunction `parPartToInlines'`, and make `parPartToInlines` a
wrapper around that.
Previously we had only read the first child of an sdtContents tag. Now
we replace sdt with all children of the sdtContents tag.
This changes the expected test result of our nested_anchors test,
since now we read docx's generated TOCs.
* Deprecate `--strip-empty-paragraphs` option. Instead we now
use an `empty_paragraphs` extension that can be enabled on
the reader or writer. By default, disabled.
* Add `Ext_empty_paragraphs` constructor to `Extension`.
* Revert "Docx reader: don't strip out empty paragraphs."
This reverts commit d6c58eb836.
* Implement `empty_paragraphs` extension in docx reader and writer,
opendocument writer, html reader and writer.
* Add tests for `empty_paragraphs` extension.
We now have the `--strip-empty-paragraphs` option for that,
if you want it. Closes#2252.
Updated docx reader tests.
We use stripEmptyParagraphs to avoid changing too
many tests. We should add new tests for empty paragraphs.
* Added underlineSpan builder function. This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.
* Docx Reader: Use underlineSpan and update test
* Org Reader: Use underlineSpan and add test
* Textile Reader: Use underlineSpan and add test case
* Txt2Tags Reader: Use underlineSpan and update test
* HTML Reader: Use underlineSpan and add test case