pandoc/test/docx
John MacFarlane 8ca191604d Add new unexported module T.P.XMLParser.
This exports functions that use xml-conduit's parser to
produce an xml-light Element or [Content].  This allows
existing pandoc code to use a better parser without
much modification.

The new parser is used in all places where xml-light's
parser was previously used.  Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).

Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation.  It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.
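
As an illustration only, the conversion described above might look
roughly like the following sketch. It is not the actual T.P.XMLParser
code: the names parseToLight, toLightElement, toQName and
toLightContent are hypothetical, and the sketch assumes the xml-conduit
and xml-light packages are available. It shows where the String
allocation comes from (every Text is unpacked) and why attribute order
is not kept (xml-conduit stores attributes in a Map).

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as M
import Data.Maybe (mapMaybe)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Text.XML as Conduit      -- from the xml-conduit package
import qualified Text.XML.Light as Light  -- from the xml-light package

-- Hypothetical helper: parse with xml-conduit, return an xml-light Element.
-- Parse failures are returned as a String instead of being thrown.
parseToLight :: TL.Text -> Either String Light.Element
parseToLight t =
  case Conduit.parseText Conduit.def t of
    Left err  -> Left (show err)
    Right doc -> Right (toLightElement (Conduit.documentRoot doc))

-- xml-conduit keeps attributes in a Map, so M.toList yields them in
-- key order rather than in source order.
toLightElement :: Conduit.Element -> Light.Element
toLightElement (Conduit.Element name attrs nodes) =
  Light.Element
    (toQName name)
    [ Light.Attr (toQName k) (T.unpack v) | (k, v) <- M.toList attrs ]
    (mapMaybe toLightContent nodes)
    Nothing

-- Every Text field is unpacked to a String here; this is the extra
-- allocation the note above refers to.
toQName :: Conduit.Name -> Light.QName
toQName (Conduit.Name local ns prefix) =
  Light.QName (T.unpack local) (T.unpack <$> ns) (T.unpack <$> prefix)

-- Comments and processing instructions are dropped in this sketch.
toLightContent :: Conduit.Node -> Maybe Light.Content
toLightContent (Conduit.NodeElement el) =
  Just (Light.Elem (toLightElement el))
toLightContent (Conduit.NodeContent t) =
  Just (Light.Text (Light.CData Light.CDataText (T.unpack t) Nothing))
toLightContent _ = Nothing

main :: IO ()
main =
  case parseToLight "<doc b=\"2\" a=\"1\"><p>hi</p></doc>" of
    Left err -> putStrLn ("XML error: " ++ err)
    Right el -> putStrLn (Light.showElement el)
```

Existing code that consumes xml-light values could call a function
like parseToLight in place of xml-light's own parser and otherwise
stay unchanged, which is the point of keeping the xml-light types
for now.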

The new parser also reports errors, and pandoc passes
these on to the user when possible.

A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].

Closes #7091, which was the main stimulus.

These changes revealed the need for some changes
in the tests.  The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.

Add entity defs to docbook-reader.docbook.

Update golden tests for docx.
2021-02-10 22:04:11 -08:00
golden Add new unexported module T.P.XMLParser. 2021-02-10 22:04:11 -08:00
0_level_headers.docx Docx reader: Add tests for avoiding zero-level header. 2017-08-06 19:36:25 -07:00
0_level_headers.native Use the new builders, modify readers to preserve empty headers 2020-04-15 23:03:22 -04:00
adjacent_codeblocks.docx [Docx Reader] Update tests 2019-09-21 11:37:21 -07:00
adjacent_codeblocks.native Docx reader tests: Test for combining adjacent code blocks. 2018-04-17 09:29:54 -04:00
already_auto_ident.docx
already_auto_ident.native
alternate_document_path.docx Docx reader: Tests for alternate document.xml 2019-02-06 21:14:46 -05:00
alternate_document_path.native Support new Underline element in readers and writers (#6277) 2020-04-28 07:53:06 -07:00
block_quotes.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
block_quotes_parse_indent.native
char_styles.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
char_styles.native
codeblock.docx
codeblock.native
comments.docx
comments.native Add empty_paragraphs extension. 2017-12-04 14:56:57 -08:00
comments_no_comments.native
comments_warning.docx
compact-style-removal.docx [Docx Reader] Use style names, not ids, for assigning semantic meaning 2019-09-21 11:18:15 -07:00
compact-style-removal.native [Docx Reader] Use style names, not ids, for assigning semantic meaning 2019-09-21 11:18:15 -07:00
custom-style-no-styles.native Docx reader tests: test custom style extension. 2018-02-22 13:05:44 -05:00
custom-style-preserve.native Preserve built-in styles in DOCX with custom style (#5670) 2019-09-20 22:13:29 -07:00
custom-style-reference.docx
custom-style-roundtrip-end.native
custom-style-with-styles.native [Docx Reader] Update tests 2019-09-21 11:37:21 -07:00
custom_style.native Docx writer tests: Add tests for custom styles 2018-01-27 11:46:41 -05:00
deep_normalize.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
deep_normalize.native
definition_list.docx
definition_list.native
document-properties-short-desc.native Improve writing metadata for docx, pptx and odt (#5252) 2019-01-26 16:14:35 -08:00
document-properties.native Improve writing metadata for docx, pptx and odt (#5252) 2019-01-26 16:14:35 -08:00
drop_cap.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
drop_cap.native
dummy_item_after_list_item.docx
dummy_item_after_list_item.native
dummy_item_after_paragraph.docx
dummy_item_after_paragraph.native
enumerated_headings.docx
enumerated_headings.native
german_styled_lists.docx
german_styled_lists.native
hanging_indent.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
hanging_indent.native
headers.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
headers.native
i18n_blocks.docx
i18n_blocks.native
image.docx
image_no_embed.native
image_no_embed_writer.native
image_vml.docx
image_vml.native
image_writer_test.native Docx test: adjust test for fix of bug 2018-02-23 11:50:33 -05:00
inline_code.docx
inline_code.native
inline_formatting.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
inline_formatting.native [Docx Reader] Refactor/update smushInlines 2020-07-07 09:04:38 +03:00
inline_formatting_writer.native
inline_images.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
inline_images.native
inline_images_writer.native
inline_images_writer_test.native Docx writer tests: Use new golden framework 2018-01-27 08:08:25 -05:00
instrText_hyperlink.docx Docx reader: Add test for hyperlinks in instrText tag 2018-01-16 13:22:02 -05:00
instrText_hyperlink.native Docx reader: Add test for hyperlinks in instrText tag 2018-01-16 13:22:02 -05:00
link_in_notes.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
link_in_notes.native
links.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
links.native
links_writer.native
lists-compact.docx [Docx Reader] Use style names, not ids, for assigning semantic meaning 2019-09-21 11:18:15 -07:00
lists-compact.native [Docx Reader] Use style names, not ids, for assigning semantic meaning 2019-09-21 11:18:15 -07:00
lists.docx [Docx Reader] Update tests 2019-09-21 11:37:21 -07:00
lists.native [Docx Reader] Update tests 2019-09-21 11:37:21 -07:00
lists_continuing.docx Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
lists_continuing.native Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
lists_level_override.docx Docx: add test for lists with level overrides. 2018-12-10 19:24:56 -05:00
lists_level_override.native Docx reader tests: fix test file with trailing space. 2019-02-18 15:49:36 -05:00
lists_multiple_initial.native Docx writer: better handle list items whose contents are lists (#6522) 2020-10-02 09:30:05 -07:00
lists_restarting.docx Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
lists_restarting.native Docx writer: Add tests for list continuation. 2017-12-13 15:16:44 -05:00
lists_sublist_reset.docx Docx reader: fix list number resumption for sublists. Closes #4324. 2019-11-03 12:54:42 -08:00
lists_sublist_reset.native Docx reader: fix list number resumption for sublists. Closes #4324. 2019-11-03 12:54:42 -08:00
lists_writer.native
metadata.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
metadata.native
metadata_after_normal.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
metadata_after_normal.native
nested_anchors_in_header.docx
nested_anchors_in_header.native [Docx Reader] Update tests 2019-09-21 11:37:21 -07:00
nested_sdt.docx Docx reader: Handle nested sdt tags. 2018-02-28 16:32:20 -05:00
nested_sdt.native Docx reader: Handle nested sdt tags. 2018-02-28 16:32:20 -05:00
nested_smart_tags.docx Docx reader: add tests for nested smart tags. 2018-03-13 22:16:54 -04:00
nested_smart_tags.native Docx reader: add tests for nested smart tags. 2018-03-13 22:16:54 -04:00
normalize.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
normalize.native
notes.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
notes.native
numbered_header.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
numbered_header.native
overlapping_targets.docx Docx reader: tests for overlapping targets (anchor spans). 2017-12-31 09:36:42 -05:00
overlapping_targets.native Docx reader: tests for overlapping targets (anchor spans). 2017-12-31 09:36:42 -05:00
paragraph_insertion_deletion.docx Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
paragraph_insertion_deletion_accept.native Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
paragraph_insertion_deletion_all.native Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
paragraph_insertion_deletion_reject.native Docx reader: Add tests for paragraph insertion/deletion. 2018-01-02 11:32:48 -05:00
raw-blocks.native Docx writer: keep raw openxml strings verbatim. 2020-12-13 14:09:59 +01:00
raw-bookmarks.native Docx writer: keep raw openxml strings verbatim. 2020-12-13 14:09:59 +01:00
sdt_elements.docx Docx reader: add tests for structured document tags unwrapping. 2017-12-27 10:03:00 -05:00
sdt_elements.native Use the new builders, modify readers to preserve empty headers 2020-04-15 23:03:22 -04:00
sdt_in_footnote.docx Docx reader: Add test for reading sdts in footnotes. 2019-02-12 17:26:37 -05:00
sdt_in_footnote.native Docx reader: Add test for reading sdts in footnotes. 2019-02-12 17:26:37 -05:00
special_punctuation.docx
special_punctuation.native
table_one_row.docx
table_one_row.native Use the new builders, modify readers to preserve empty headers 2020-04-15 23:03:22 -04:00
table_variable_width.docx Docx reader: Pick table width from the longest row or header 2018-02-15 15:06:01 -05:00
table_variable_width.native Adapt to the removal of the RowSpan, ColSpan, RowHeadColumns accessors 2020-04-15 23:03:22 -04:00
table_with_list_cell.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
table_with_list_cell.native Adapt to the removal of the RowSpan, ColSpan, RowHeadColumns accessors 2020-04-15 23:03:22 -04:00
tables.docx Remove nonfree ICC profiles from thumbnails in test docx files. 2018-04-25 17:00:21 -07:00
tables.native Use the new builders, modify readers to preserve empty headers 2020-04-15 23:03:22 -04:00
tabs.docx
tabs.native
track_changes_deletion.docx
track_changes_deletion_accept.native
track_changes_deletion_all.native
track_changes_deletion_reject.native
track_changes_insertion.docx
track_changes_insertion_accept.native
track_changes_insertion_all.native
track_changes_insertion_reject.native
track_changes_move.docx
track_changes_move_accept.native
track_changes_move_all.native
track_changes_move_reject.native
track_changes_scrubbed_metadata.docx DOCX reader: Allow empty dates in comments and tracked changes (#6726) 2020-10-06 21:03:00 -07:00
track_changes_scrubbed_metadata.native DOCX reader: Allow empty dates in comments and tracked changes (#6726) 2020-10-06 21:03:00 -07:00
trailing_spaces_in_formatting.docx
trailing_spaces_in_formatting.native
trim_last_inline.docx Docx reader: add tests for trimming last inline. 2019-02-18 15:49:00 -05:00
trim_last_inline.native Docx reader: add tests for trimming last inline. 2019-02-18 15:49:00 -05:00
unicode.docx
unicode.native
unused_anchors.docx Docx reader: tests for removing unused anchors. 2017-12-30 22:43:33 -05:00
unused_anchors.native Docx Reader: Combine adjacent anchors. 2017-12-31 09:29:51 -05:00
verbatim_subsuper.docx
verbatim_subsuper.native