Commit graph

1618 commits

Author SHA1 Message Date
John MacFarlane
4e990a8cf9 LaTeX/siunitx: fix parsing of \cubic etc. See . 2021-05-20 10:13:20 -07:00
John MacFarlane
bc5058234f LaTeX reader siunitx: fix + sign on ang. 2021-05-20 10:13:20 -07:00
John MacFarlane
5dc917da3e LaTeX reader siunitx: add leading 0 to numbers starting with . 2021-05-20 10:13:20 -07:00
Denis Maier
183ce58477
ConTeXt writer: improve ordered lists ()
Closes  

- change ordered list from itemize to enumerate
- add new itemgroup for ordered lists
- add fontfeature for table figures
- remove width from itemize in ConTeXt writer
2021-05-20 09:59:53 -07:00
John MacFarlane
a366bd6abc LaTeX reader: Fix parsing of +- in siunitx numbers.
See .
2021-05-20 09:03:29 -07:00
John MacFarlane
8437a4a002 LaTeX reader: support \pm in \SI{..}.
Closes .
2021-05-20 08:16:46 -07:00
Albert Krewinkel
b6239f4150
ZimWiki writer: allow links and emphasis in headers
The latest version of ZimWiki supports this.

Closes: 
2021-05-20 12:48:05 +02:00
John MacFarlane
5736b331d8 LaTeX reader: better support for \xspace.
Previously we only supported it in inline contexts; now
we support it in all contexts, including math.

Partially addresses .
2021-05-19 16:14:49 -07:00
Albert Krewinkel
eb3dff148e
LaTeX writer: separate successive quote chars with thin space
Successive quote characters are separated with a thin space to improve
readability and to prevent unwanted ligatures. Detection of these quotes
sometimes failed if the second quote was nested in a span element.

Closes: 
2021-05-18 22:55:47 +02:00
Albert Krewinkel
1843a8793a
HTML reader: keep attributes from code nested below pre tag.
If a code block is defined with `<pre><code
class="language-x">…</code></pre>`, where the `<pre>` element has no
attributes, then the attributes from the `<code>` element are used
instead. Any leading `language-` prefix in the code's *class*
attribute is dropped to improve syntax highlighting.

Closes: 
2021-05-17 18:08:02 +02:00
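
A minimal Python sketch of the rule described in the commit above, not pandoc's Haskell implementation. The function names are hypothetical, and leaving the classes of `<pre>` untouched when it has its own attributes is an assumption.

```python
PREFIX = "language-"


def strip_language_prefix(class_name):
    """Drop a leading 'language-' prefix, if present."""
    if class_name.startswith(PREFIX):
        return class_name[len(PREFIX):]
    return class_name


def code_block_classes(pre_classes, code_classes):
    """Pick the classes for a <pre><code>...</code></pre> block.

    If <pre> carries classes of its own, they are kept as-is (assumption).
    Otherwise the classes of the nested <code> are used, prefix stripped.
    """
    if pre_classes:
        return list(pre_classes)
    return [strip_language_prefix(c) for c in code_classes]


print(code_block_classes([], ["language-haskell", "numberLines"]))
```

With an attribute-free `<pre>`, the class `language-haskell` on `<code>` becomes `haskell`, which is the name a syntax highlighter would look up.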
Albert Krewinkel
25f5b92777
HTML writer: ensure headings only have valid attribs in HTML4
Fixes: 
2021-05-17 15:42:15 +02:00
Albert Krewinkel
4417dacc44
ConTeXt writer: use span identifiers as reference anchors.
Closes: 
2021-05-17 13:14:32 +02:00
Albert Krewinkel
d3ca48656f
ConTeXt writer tests: keep code lines below 80 chars. 2021-05-17 13:11:33 +02:00
John MacFarlane
cc088687b4 LaTeX template: move title, author, date up to top of preamble.
This allows header-includes to use them, and puts them
in a position where you can see them immediately.
Closes .
2021-05-16 14:35:13 -07:00
John MacFarlane
5a6399d9f6 Markdown writer: fewer unneeded escapes for #.
See .
2021-05-16 12:23:34 -07:00
John MacFarlane
0a4c6925b6 Docx writer: copy over more settings from reference.docx.
From settings.xml in the reference-doc, we now include:
`zoom`, `embedSystemFonts`, `doNotTrackMoves`, `defaultTabStop`,
`drawingGridHorizontalSpacing`, `drawingGridVerticalSpacing`,
`displayHorizontalDrawingGridEvery`, `displayVerticalDrawingGridEvery`,
`characterSpacingControl`, `savePreviewPicture`, `mathPr`, `themeFontLang`,
`decimalSymbol`, `listSeparator`, `autoHyphenation`, `compat`.

Closes .
2021-05-15 15:40:49 -07:00
John MacFarlane
2cf971cf56 Docx writer: Remove rsids from settings.xml.
Word will add these when revisions are made.  But it's
pointless to start out with a set of them.
2021-05-15 10:54:05 -07:00
Albert Krewinkel
0794862aac
HTML reader: parse <header> as a Div
HTML5 `<header>` elements are treated like `<div>` elements.
2021-05-15 16:46:02 +02:00
Albert Krewinkel
013e4a3164
HTML reader: keep h1 tags as normal headers ()
The tags `<title>` and `<h1 class="title">` often contain the same
information, so the latter was dropped from the document. However, as
this can lead to loss of information, the heading is now always
retained.

Use `--shift-heading-level-by=-1` to turn the `<h1>` into the document
title, or a filter to restore the previous behavior.

Closes: 
2021-05-14 12:31:24 -07:00
John MacFarlane
76a4e7127b Beamer writer: support exampleblock and alertblock.
A block will be rendered as an exampleblock if the heading
has class `example` and alertblock if it has class `alert`.

Closes .
2021-05-14 10:09:46 -07:00
Albert Krewinkel
17d96404f5
Docx writer: allow multirow table headers 2021-05-14 16:19:20 +02:00
Albert Krewinkel
875f8f3654
HTML reader: don't fail on unmatched closing "script" tag.
Prevent the reader from crashing if the HTML input contains an unmatched
closing `</script>` tag.

Fixes: 
2021-05-14 12:13:40 +02:00
John MacFarlane
3f09f53459 Implement curly-brace syntax for Markdown citation keys.
The change provides a way to use citation keys that contain
special characters not usable with the standard citation
key syntax.  Example: `@{foo_bar{x}'}` for the key `foo_bar{x}`.
Closes .

The change requires adding a new parameter to the `citeKey`
parser from Text.Pandoc.Parsing [API change].

Markdown reader: recognize @{..} syntax for citations.

Markdown writer: use @{..} syntax for citations when needed.

Update manual with curly-brace syntax for citations.

Closes .
2021-05-13 21:59:32 -07:00
John MacFarlane
0217ae2a4f Handle 'annote' field in bibtex/biblatex writer.
Closes .
2021-05-12 11:05:55 -07:00
John MacFarlane
5eb7ad7d1e Improve integration of settings from reference.docx.
The settings we can carry over from a reference.docx are
autoHyphenation, consecutiveHyphenLimit, hyphenationZone,
doNotHyphenateCap, evenAndOddHeaders, and proofState.

Previously this was implemented in a buggy way, so that the
reference doc's values AND the new values were included.

This change allows users to create a reference.docx that
sets w:proofState for spelling or grammar to "dirty",
so that spell/grammar checking will be triggered on the
generated docx.

Closes .
2021-05-11 22:31:38 -06:00
John MacFarlane
2bd5d0cafb LaTeX writer: better handling of line breaks in simple tables.
Now we also handle the case where they're embedded in other
elements, e.g. spans. Closes .
2021-05-11 07:52:05 -06:00
John MacFarlane
6e45607f99 Change reader types, allowing better tracking of source positions.
Previously, when multiple file arguments were provided, pandoc
simply concatenated them and passed the contents to the readers,
which took a Text argument.

As a result, the readers had no way of knowing which file
was the source of any particular bit of text.  This meant that
we couldn't report accurate source positions on errors or
include accurate source positions as attributes in the AST.
More seriously, it meant that we couldn't resolve resource
paths relative to the files containing them
(see e.g. , , , ).

Add Text.Pandoc.Sources (exported module), with a `Sources` type
and a `ToSources` class.  A `Sources` wraps a list of `(SourcePos,
Text)` pairs. [API change] A parsec `Stream` instance is provided for
`Sources`.  The module also exports versions of parsec's `satisfy` and
other Char parsers that track source positions accurately from a
`Sources` stream (or any instance of the new `UpdateSourcePos` class).

Text.Pandoc.Parsing now exports these modified Char parsers instead of
the ones parsec provides.  Modified parsers to use a `Sources` as stream
[API change].

The readers that previously took a `Text` argument have been
modified to take any instance of `ToSources`. So, they may still
be used with a `Text`, but they can also be used with a `Sources`
object.

In Text.Pandoc.Error, modified the constructor PandocParsecError
to take a `Sources` rather than a `Text` as first argument,
so parse error locations can be accurately reported.

T.P.Error: showPos, do not print "-" as source name.
2021-05-09 19:11:34 -06:00
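
The idea behind `Sources` in the commit above, keeping each input's text paired with its own starting position instead of concatenating everything into one string, can be sketched in Python. This is a simplified illustration, not the Haskell API: `to_sources` and `locate` are hypothetical helpers, and tab stops are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePos:
    name: str    # file the text came from
    line: int
    column: int


def to_sources(files):
    """Wrap (filename, text) pairs as (SourcePos, text) pairs starting at 1:1."""
    return [(SourcePos(name, 1, 1), text) for name, text in files]


def locate(sources, offset):
    """Map an offset into the concatenated input to a position in the right file."""
    for start, text in sources:
        if offset < len(text):
            line, column = start.line, start.column
            for ch in text[:offset]:
                if ch == "\n":
                    line, column = line + 1, 1
                else:
                    column += 1
            return SourcePos(start.name, line, column)
        offset -= len(text)  # skip this file and continue in the next one
    raise IndexError("offset past end of input")


sources = to_sources([("a.md", "one\ntwo\n"), ("b.md", "three\n")])
print(locate(sources, 9))
```

Because the file name travels with the text, an error at offset 9 is reported against `b.md` rather than against an anonymous concatenated input.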
Albert Krewinkel
8357b835d9
App: allow tab expansion even if file-scope is used
Tabs in plain-text inputs are now handled correctly, even if the
`--file-scope` flag is used.

Closes: 
2021-05-05 19:09:21 +02:00
Albert Krewinkel
ddbf83f62c
Docx writer: support colspans and rowspans in tables
See: 
2021-05-01 18:52:24 +02:00
mbrackeantidot
b6a65445e1
Docx reader: add handling of vml image objects (jgm#4735) ()
They represent images, the same way as other images in vml format.
2021-04-29 09:11:44 -07:00
John MacFarlane
d14c5f94df Further improvements in smart quotes.
Improves heuristic for detection of an "open double quote."
Closes .
2021-04-29 08:48:49 -07:00
John MacFarlane
80e2e88287 Smarter smart quotes.
Treat a leading " with no closing " as a left curly quote.
This supports the practice, in fiction, of continuing
paragraphs quoting the same speaker without an end quote.
It also helps with quotes that break over lines in line
blocks.

Closes .
2021-04-28 23:32:37 -07:00
Albert Krewinkel
85f379e474
JATS writer: use either styled-content or named-content for spans.
If the element has a content-type attribute, or at least one class, then
that value is used as `content-type` and the span is put inside a
`<named-content>` element. Otherwise a `<styled-content>` element is
used instead.

Closes: 
2021-04-28 22:21:34 +02:00
Albert Krewinkel
0921b82d98
Docx writer: autoset table width if no column has an explicit width. 2021-04-27 13:27:20 +02:00
Jan Tojnar
e9c0f9f97b
Markdown writer: Cleaner (code)blocks with single class ()
When a block only has a single class and no other attributes,
it is not necessary to wrap the class attribute in curly braces –
the class name can be placed after the opening mark as is.

This will result in slightly cleaner output when pandoc is used
as a markdown pretty-printer.
2021-04-25 10:36:06 -07:00
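
The single-class rule from the commit above can be sketched as follows. This is an illustrative Python version, not the writer's Haskell code; `fence_info` is a hypothetical name, and any further checks the real writer may apply to the class name are not modeled.

```python
def fence_info(identifier, classes, keyvals):
    """Build the text that follows the opening fence of a code block.

    A lone class with no identifier and no key-value attributes is
    written bare; anything else is wrapped in curly braces.
    """
    if not identifier and not keyvals and len(classes) == 1:
        return classes[0]
    parts = []
    if identifier:
        parts.append("#" + identifier)
    parts.extend("." + c for c in classes)
    parts.extend('{}="{}"'.format(k, v) for k, v in keyvals)
    return "{" + " ".join(parts) + "}" if parts else ""


print(fence_info("", ["haskell"], []))
print(fence_info("", ["haskell", "numberLines"], []))
```

The first call yields the bare class name, the second falls back to the braced attribute form because more than one class is present.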
John MacFarlane
547bc2cdf8 Add quotes properly in markdown YAML metadata fields.
This fixes a bug, which caused the writer to look at the LAST
rather than the FIRST character in determining whether quotes
were needed.  So we got spurious quotes in some cases and
didn't get necessary quotes in others.

Closes .  Updated a number of test cases accordingly.
2021-04-25 10:31:33 -07:00
John MacFarlane
7f4850c9de Remove biblatex-nussbaum.md test.
It is basically the same as biblatex-quotes.md.
2021-04-25 10:29:03 -07:00
John MacFarlane
73d394ca2a Use MetaInlines not MetaBlocks for multimarkdown metadata fields.
This gives better results in converting to e.g. pandoc markdown.

Ref: <https://groups.google.com/d/msgid/pandoc-discuss/9728d1f4-040e-4392-aa04-148f648a8dfdn%40googlegroups.com>
2021-04-18 22:01:12 -07:00
John MacFarlane
a478a5c4c8 Update to released unicode-collation, latest citeproc dev version.
Update citeproc test.
2021-04-17 16:15:14 -07:00
John MacFarlane
099ac9985b Use BCP47 language codes in citeproc tests. 2021-04-17 16:15:14 -07:00
John MacFarlane
ff5a504809 Use new citeproc + unicode-collation.
Add command test for unicode-collation.
2021-04-17 16:15:13 -07:00
Albert Krewinkel
5f79a66ed6
JATS writer: reduce unnecessary use of <p> elements for wrapping
The `<p>` element is used for wrapping in cases where the contents would
otherwise not be allowed in a certain context. Unnecessary wrapping is
avoided, especially around quotes (`<disp-quote>` elements).

Closes: 
2021-04-16 22:47:37 +02:00
Albert Krewinkel
2d60524de4
JATS writer: convert spans to <named-content> elements
Spans with attributes are converted to `<named-content>` elements
instead of being wrapped with `<milestone-start/>` and `<milestone-end/>`
elements. Milestone elements are not allowed in documents using the
articleauthoring tag set, so this change ensures the creation of valid
documents.

Closes: 
2021-04-10 11:49:18 +02:00
Albert Krewinkel
051b7ffeaf
JATS writer: add footnote number as label in backmatter
Footnotes in the backmatter are given the footnote's number as a label.
The articleauthoring output is unaffected by this change, as footnotes
are placed inline there.

Closes: 
2021-04-10 10:57:06 +02:00
John MacFarlane
20cd33e5a4 Fix regression in grid tables for wide characters.
In the translation from String to Text, a char-width-sensitive
splitAt' was dropped.  This commit reinstates it.
Closes .
2021-04-08 14:48:29 -07:00
John MacFarlane
60974538b2 Commonmark writer: Use backslash escapes for < and |...
instead of entities.  Closes .
2021-04-05 23:29:22 -07:00
Albert Krewinkel
038261ea52
JATS writer: escape disallowed chars in identifiers
XML identifiers must start with an underscore or letter, and can contain
only a limited set of punctuation characters. Any IDs not adhering to
these rules are rewritten by writing the offending characters as Uxxxx,
where `xxxx` is the character's hex code.
2021-04-05 21:55:54 +02:00
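
A rough Python sketch of the escaping scheme described in the commit above. The exact set of permitted punctuation (`ALLOWED_PUNCTUATION`), the uppercase hex digits, and the zero-padding to four digits are assumptions; `escape_id` is a hypothetical name, not the writer's function.

```python
# Assumption: punctuation allowed after the first character.
ALLOWED_PUNCTUATION = "_-."


def escape_id(identifier):
    """Rewrite characters that are not valid in an XML identifier as Uxxxx."""
    out = []
    for i, ch in enumerate(identifier):
        if i == 0:
            # The first character must be an underscore or a letter.
            ok = ch == "_" or ch.isalpha()
        else:
            ok = ch.isalnum() or ch in ALLOWED_PUNCTUATION
        out.append(ch if ok else "U{:04X}".format(ord(ch)))
    return "".join(out)


print(escape_id("1st item"))
print(escape_id("_ok-id.1"))
```

An identifier that already follows the rules passes through unchanged, so only offending characters are affected.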
tecosaur
4371223d13
Org writer: Use LaTeX-style math delimiters ()
Org works better with LaTeX-style delimiters.
2021-04-01 23:36:02 +02:00
niszet
40da6c402b
Treat tabs as spaces in ODT Reader. () 2021-03-31 16:44:34 -07:00
John MacFarlane
56ce1fc126 Fix DocBook reader mathml regression...
...caused by the switch in XML libraries.
Also fixed a similar issue in JATS.
Closes .
2021-03-24 12:04:33 -07:00