This function can be used to sanitize reference labels so that
they do not contain any of the illegal characters \#[]",{}%()|= .
Currently only Links have their labels sanitized, because they
are the only Elements that use passed labels.
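A minimal sketch of such a sanitizer, in Python for illustration only. The name `sanitize_label` and the replacement scheme (each illegal character becomes `ux` followed by its hex code point) are assumptions made for this example; only the character set comes from the description above.

```python
# Characters listed above as illegal in reference labels.
ILLEGAL_CHARS = set('\\#[]",{}%()|=')


def sanitize_label(label: str) -> str:
    """Replace each illegal character with 'ux' + its hex code point.

    The 'ux..' encoding is an assumed scheme chosen for this sketch.
    """
    return "".join(
        f"ux{ord(ch):x}" if ch in ILLEGAL_CHARS else ch
        for ch in label
    )
```

Labels without illegal characters pass through unchanged, so the function is safe to apply to every passed label.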
We previously took the old relationship names of the headers and footers in
secptr. That led to collisions. We now make a map of available names in the
relationships file, and then rename them in secptr.
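The renaming idea can be sketched roughly as follows. This is only an illustration under assumptions: the helper names `build_rename_map` and `rename_refs`, the `rId` prefix, and the flat-list representation of names are all hypothetical and not taken from the actual code.

```python
def build_rename_map(old_names, used_names, prefix="rId"):
    """Map each old relationship name to a fresh name not yet in use."""
    taken = set(used_names)
    mapping = {}
    n = 1
    for old in old_names:
        # Skip over every candidate that is already taken.
        while f"{prefix}{n}" in taken:
            n += 1
        mapping[old] = f"{prefix}{n}"
        taken.add(mapping[old])
    return mapping


def rename_refs(refs, mapping):
    """Rewrite references (e.g. those held by secptr) using the map."""
    return [mapping.get(ref, ref) for ref in refs]
```

Because every new name is checked against the set of names already in use, two headers or footers can no longer end up sharing one relationship name.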
Graphics in `\section`/`\subsection` etc. titles need to be `\protect`ed.
This adds a state value and manually turns it on before every invocation
of `sectionHeader` and manually turns it off after. Using a writer value
and applying `local` would probably be cleaner, but this fits with the
current style.
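To illustrate the two approaches mentioned above, here is a small Python sketch. The names `WriterState`, `in_heading`, `inline_graphic`, and `heading_scope` are hypothetical; the context manager only stands in for the scoped, `local`-like alternative and is not how the writer is implemented.

```python
from contextlib import contextmanager


class WriterState:
    def __init__(self):
        self.in_heading = False  # hypothetical flag name


def inline_graphic(state, path):
    """Emit a graphic, protected when we are inside a section title."""
    cmd = f"\\includegraphics{{{path}}}"
    return "\\protect" + cmd if state.in_heading else cmd


def section_header_manual(state, path):
    """Manual style: turn the flag on before, and off after."""
    state.in_heading = True
    try:
        title = inline_graphic(state, path)
    finally:
        state.in_heading = False
    return f"\\section{{{title}}}"


@contextmanager
def heading_scope(state):
    """Scoped style: restore the previous value automatically."""
    previous = state.in_heading
    state.in_heading = True
    try:
        yield
    finally:
        state.in_heading = previous
```

The scoped variant cannot forget to reset the flag, which is the sense in which it would be cleaner; the manual variant matches code that already threads mutable state everywhere.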
When we encounter one of the polyglot header styles, we want to remove
that from the par styles after we convert to a header. To do that, we
have to keep track of the style name, and remove it appropriately.
We're just keeping a list of header formats that different languages use
as their default styles. At the moment, we have English, German, Danish,
and French. We can continue to add to this.
This is simpler than parsing the styles file, and perhaps less
error-prone, since there seems to be some variation, even within a
language, in how a style file will define headers.
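A rough Python sketch of this lookup-and-remove step. The prefixes in `HEADER_PREFIXES` are illustrative guesses at the default style names for the four languages, and `split_header_style` is a hypothetical helper; neither is copied from the actual list.

```python
import re

# Assumed default heading style prefixes (lowercased); illustrative only.
HEADER_PREFIXES = {
    "english": "heading",
    "german": "überschrift",
    "danish": "overskrift",
    "french": "titre",
}

_HEADER_RE = re.compile(
    r"^(?:" + "|".join(HEADER_PREFIXES.values()) + r")\s*([1-9])$"
)


def split_header_style(par_styles):
    """Return (level, remaining styles); level is None if no header style."""
    for i, style in enumerate(par_styles):
        match = _HEADER_RE.match(style.lower())
        if match:
            # Drop the header style so it is not also applied as a par style.
            rest = par_styles[:i] + par_styles[i + 1:]
            return int(match.group(1)), rest
    return None, list(par_styles)
```

Supporting another language then only means adding one more prefix to the table.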
When users number their headers, Word understands that as a single-item
enumerated list. We make the assumption that such a list is, in fact, a header.
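The assumption can be sketched like this, using a toy tuple representation of blocks that is hypothetical and much simpler than the real document model:

```python
def unwrap_numbered_header(block):
    """Treat a one-item ordered list holding only a header as that header.

    Toy representation (an assumption of this sketch):
      ("OrderedList", [item, ...]) where each item is a list of blocks
      ("Header", level, text)
    """
    if block[0] == "OrderedList":
        items = block[1]
        if len(items) == 1 and len(items[0]) == 1:
            inner = items[0][0]
            if inner[0] == "Header":
                return inner
    return block
```

Lists with more than one item, or whose single item is not a header, are left untouched.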
Because of the built-in line skip, LaTeX can't handle a section header
as the first element in a list item. (To be precise, it can't handle it
if the list immediately follows a section header, but the instance is
rare enough that we can afford to be a bit more general). This puts a
non-breaking space before the header to solve this problem. We won't see
this space, since the header skips a line before printing anyway.
The output is ugly in LaTeX and this structure seems like it should
probably be avoided. But it is valid HTML and native pandoc, so we
should have some sort of typesettable representation in LaTeX.
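The non-breaking-space workaround described above might look roughly like this in Python. The function `render_list_item` and the `(kind, latex)` pair representation are assumptions made for the example.

```python
NBSP = "~"  # LaTeX non-breaking space


def render_list_item(blocks):
    """Render one list item from (kind, latex) pairs.

    If the first block is a section header, a non-breaking space is
    emitted right after \\item so the header is not the first element.
    """
    body = "\n".join(latex for _, latex in blocks)
    if blocks and blocks[0][0] == "header":
        return "\\item " + NBSP + "\n" + body
    return "\\item " + body
```

Items that start with ordinary content are rendered as before, so only the problematic case pays for the extra space.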
Previously text that ended a div would be parsed as Plain
unless there was a blank line before the closing div tag.
Test case:
<div class="first">
This is a paragraph.

This is another paragraph.
</div>
Closes #1591.
Previously we just expected 'title', 'subtitle', 'author', 'date'.
Now we still support those, but also support the format recommended
for epub metadata in the pandoc README:
---
title:
- type: main
  text: My Book
- type: subtitle
  text: An investigation of metadata
creator:
- role: author
  text: John Smith
- role: editor
  text: Sarah Jones
identifier:
- scheme: DOI
  text: doi:10.234234.234/33
publisher: My Press
rights: (c) 2007 John Smith, CC BY-NC
...
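As an illustration of accepting both shapes, here is a Python sketch that normalizes titles from already-parsed metadata. The helper `collect_titles` and the plain dict/list representation are assumptions for this example only.

```python
def collect_titles(meta):
    """Return a list of {"type", "text"} dicts from either metadata shape."""
    titles = []
    raw = meta.get("title")
    if isinstance(raw, str):
        # Old form: a plain 'title' string is the main title.
        titles.append({"type": "main", "text": raw})
    elif isinstance(raw, list):
        # Structured form: a list of entries with 'type' and 'text'.
        for entry in raw:
            if isinstance(entry, str):
                titles.append({"type": "main", "text": entry})
            else:
                titles.append({"type": entry.get("type", "main"),
                               "text": entry["text"]})
    # Old form: a separate top-level 'subtitle' field.
    if isinstance(meta.get("subtitle"), str):
        titles.append({"type": "subtitle", "text": meta["subtitle"]})
    return titles
```

Both the simple and the structured form reduce to the same list, so later code only has to handle one shape.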
We can now handle all different alignment types, for simple
tables only (no captions, no relative widths, cell contents just
plain inlines). Other tables are still handled using raw HTML.
Addresses #1585 as far as it can be addressed, I believe.
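The "simple table" test could be sketched along these lines. The function `is_simple_table` and its argument shapes are hypothetical, and treating a width of 0 as "no relative width" is an assumption of this sketch.

```python
def is_simple_table(caption, widths, rows):
    """True if the table has no caption, no relative widths, plain cells.

    rows: list of rows; each cell is a list of (kind, text) blocks.
    A width of 0 is taken to mean "no relative width" (an assumption).
    """
    def is_plain(cell):
        # An empty cell or a single Plain block counts as plain inlines.
        return len(cell) <= 1 and all(kind == "Plain" for kind, _ in cell)

    return (not caption
            and all(width == 0 for width in widths)
            and all(is_plain(cell) for row in rows for cell in row))
```

Tables that fail the test would fall back to the raw HTML path.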
Removed outdated claim that pandoc will look in the user data
directory if a relative path is specified and the file is not
found locally. Closes #1572.