Image scaling in ODT was broken when a width was set to
a percentage. The width was passed to the svg:width field
as a percentage, which is not correct according to the ODT
standard.
Instead, the real dimensions should be passed as width and
height, the style:rel-width attribute should be set to the
percentage, and the style:rel-height attribute should be set to
"scale". The converse is true if a percentage height is given.
This is now fixed, and the documents produced are properly
scaled.
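The attribute selection described above can be sketched as follows. This is a minimal Python illustration, not pandoc's implementation; the function name `odt_image_attrs`, its parameters, and the inch-based formatting are assumptions made for the example.

```python
def odt_image_attrs(real_width_in, real_height_in, width_pct=None, height_pct=None):
    """Return ODT frame attributes for an image (hypothetical helper).

    The real dimensions always go into svg:width / svg:height. A
    percentage goes into the matching style:rel-* attribute, and the
    other relative dimension is set to "scale".
    """
    attrs = {
        "svg:width": f"{real_width_in}in",
        "svg:height": f"{real_height_in}in",
    }
    if width_pct is not None:
        attrs["style:rel-width"] = f"{width_pct}%"
        attrs["style:rel-height"] = "scale"
    elif height_pct is not None:
        attrs["style:rel-height"] = f"{height_pct}%"
        attrs["style:rel-width"] = "scale"
    return attrs
```

The point of the sketch is that svg:width never carries a percentage; the percentage only appears in a style:rel-* attribute.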
Inclusion of planning info (*DEADLINE*, *SCHEDULED*, and *CLOSED*) can
be controlled via the `p` export option: setting the option to `t` will
add all planning information in a *Plain* block below the respective
headline.
The RecordWildCards and ViewPatterns language extensions can be used to
shorten code, but usually also make it harder to read. The DocumentTree
module was hence refactored and no longer relies on these extensions.
This now allows raw LaTeX environments, `\ref`, and `\eqref` to
be parsed (which is helpful for translating HTML documents that use
MathJax).
Closes #1126.
Also foreigncblockquote, hyphenblockquote, hyphencblockquote.
Closes #4848. But note: currently foreignquote will be
parsed as a regular Quoted inline (not using the quotes
appropriate to the foreign language).
Add support for `\|`, `\b`, `\G`, `\h`, `\d`, `\f`,
`\r`, `\t`, `\U`, `\i`, `\j`, `\newtie`, `\textcircled`.
Also fall back to combining characters when composed
characters are not available.
Closes #4652.
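The fallback to combining characters can be illustrated with Unicode normalization. This is a Python sketch of the general idea only, not the reader's actual code; the helper name `apply_accent` is hypothetical.

```python
import unicodedata

def apply_accent(base, combining):
    """Attach a combining mark to a base character (hypothetical helper).

    NFC normalization yields the precomposed character when Unicode
    defines one; otherwise the base character followed by the combining
    mark is returned unchanged.
    """
    return unicodedata.normalize("NFC", base + combining)

# U+0323 is COMBINING DOT BELOW, the mark associated with a command like \d.
dot_below = "\u0323"
composed = apply_accent("a", dot_below)   # a precomposed form exists
fallback = apply_accent("q", dot_below)   # no precomposed form exists
print(len(composed), len(fallback))
```

In the second call the result keeps the base letter followed by the combining mark, which is the fallback behaviour described above.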
Closes #4826. This isn't a complete solution, since other
nestings of display math may still cause problems, but it should
work for what is by far the most common case.
Note that this also involves an API change: `isDisplayMath`
is now exported from Text.Pandoc.Writers.Shared.
tex is now used for math: inline math in `\(..\)` and display math in `\[..\]`.
Previously we'd "fake it with unicode" and fall back to tex when
that didn't work. But as of 3f50b95532,
haddock supports latex math.
...they should only be recognized in siunitx contexts.
For example, `\l` outside of an siunitx context should be l-slash,
not l (for liter)!
Closes #4842.
We can't always tell if it's LaTeX, ConTeXt, or plain TeX.
Better just to use "tex" always.
Also changed:
ConTeXt writer: now outputs raw "tex" blocks as well as "context".
(Closes #969).
RST writer: uses ".. raw:: latex" for "tex" content.
(RST doesn't support raw context anyway.)
Note that if "context" or "latex" specifically is desired,
you can still force that in a markdown document by using
the raw attribute (see MANUAL.txt):
```{=latex}
\foo
```
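Because raw TeX from the Markdown reader is now tagged "tex", a filter that tests the format of a raw block may need to accept more than one name. The following Python sketch shows one way such a check could look. It assumes the block comes from pandoc's JSON AST, where a RawBlock is encoded as `{"t": "RawBlock", "c": [format, text]}`; the names `TEX_FORMATS` and `is_tex_raw_block` are hypothetical.

```python
# Format names a filter might treat as TeX (assumption for this sketch).
TEX_FORMATS = {"tex", "latex"}

def is_tex_raw_block(node):
    """True for a RawBlock whose format is "tex" or "latex".

    `node` is assumed to be a block from pandoc's JSON AST, where a
    RawBlock looks like {"t": "RawBlock", "c": [format, text]}.
    """
    if node.get("t") != "RawBlock":
        return False
    fmt, _text = node["c"]
    return fmt in TEX_FORMATS
```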
Note that this change may affect some filters, if they assume that raw
tex parsed by the Markdown reader will be RawBlock (Format "latex").
In most cases it should be trivial to modify the filters to accept
"tex" as well.
For example: `\def\foo#1[#2]{#1 and #2}`.
Closes #4768. Also fixes #4771.
API change: in Text.Pandoc.Readers.LaTeX.Types,
new type ArgSpec added. Second parameter of Macro
constructor is now `[ArgSpec]` instead of `Int`.
This fixes a regression in 2.2.3, which caused boolean values to
be parsed as MetaInlines instead of MetaBool.
Note also an undocumented (but desirable) change in 2.2.3:
numbers are now parsed as MetaInlines rather than MetaString.
Closes #4819.
* Use a Span with class "title-reference" for the default
title-reference role.
* Use B.text to split up contents into Spaces, SoftBreaks, and Strs
for title-reference.
* Use Code with class "interpreted-text" instead of Span and Str for
unknown roles. (The RST writer has also been modified to round-trip
this properly.)
* Disallow blank lines in interpreted text.
* Backslash-escape now works in interpreted text.
* Backticks followed by alphanumerics no longer end interpreted text.
Closes #4811.
Closes #4803
After this commit use `$titleblock$` in order to get what was contained
in `$title$` before, that is a title and subtitle rendered according to
the official rST method (see
http://docutils.sourceforge.net/docs/user/rst/quickstart.html#document-title-subtitle).
With this commit, the `$title$` and `$subtitle$` metadata are available and they
simply carry the metadata values. This opens up more possibilities in templates.
Now we properly parse title and subtitle elements that are
direct children of book and article (as well as children of
bookinfo, articleinfo, or info).
We also now use the "subtitle" metadata field for subtitles,
rather than tacking the subtitle on to the title.
Exposes a function which flattens a list of blocks into a
list of inlines. An example use case would be the conversion of Note
elements into other inlines.
RST does not allow nested emphasis, links, or other inline
constructs.
Closes #4581, double parsing of links with URLs as
link text. This supersedes the earlier fix for #4581
in 6419819b46.
Fixes #4561, a bug in parsing URLs inside emphasis.
Closes #4792.
Emphasis was not parsed when it followed directly after some block types
(e.g., lists).
The org reader uses a wrapper for the `parseFromString` function to
handle org-specific state. The last position of a character allowed
before emphasis was reset incorrectly in this wrapper. Emphasized text
was not recognized when placed directly behind a block which the reader
parses using `parseFromString`.
Fixes: #4784
Starting in 2.2.2, everything after an `\input` (or `\include`)
in a markdown file would be parsed as raw LaTeX.
This commit fixes the issue and adds a regression test.
Closes #4781.
In version 2.2.1 and earlier, pandoc found latex templates in the
templates directory under the data directory, but this no longer
works in 2.2.2.
MANUAL says: "If the template is not found, pandoc will search for it in
the templates subdirectory of the user data directory (see `--data-dir`)."
This commit fixes the regression, which stems from 07bce91.
Closes #4777.
Text.Pandoc.Emoji now exports `emojiToInline`, which returns a Span inline
containing the emoji character and some attributes with metadata (class
`emoji`, attribute `data-emoji` with emoji name). Previously, emojis (as
supported in Markdown and CommonMark readers, e.g. "😄") were simply
translated into the corresponding unicode code point. By wrapping them in Span
nodes, we make it possible to do special handling such as giving them a special font
in HTML output. We also open up the possibility of treating them differently when the
`--ascii` option is selected (though that is not part of this commit).
Closes #4743.
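As an illustration of the special handling this enables, the Python sketch below builds such a Span in the shape of pandoc's JSON AST and shows how a filter might recognize it. The helper names `emoji_span` and `is_emoji` are hypothetical, and the JSON layout (an Attr encoded as `[identifier, classes, key-value pairs]`) is assumed rather than taken from this commit.

```python
def emoji_span(name, char):
    """Build a JSON-AST Span for an emoji (hypothetical helper).

    An Attr is encoded as [identifier, classes, key-value pairs].
    """
    attr = ["", ["emoji"], [["data-emoji", name]]]
    return {"t": "Span", "c": [attr, [{"t": "Str", "c": char}]]}

def is_emoji(inline):
    """True if the inline is a Span carrying the "emoji" class."""
    return inline.get("t") == "Span" and "emoji" in inline["c"][0][1]

span = emoji_span("smile", "\U0001F604")
print(is_emoji(span))
```

A filter could use a check like `is_emoji` to pick out these spans and, for example, wrap them in markup that selects a different font.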
We now allow arbitrary LaTeX values.
This helps with #4761. The `\maxwidth` is still not
propagated to the latex destination, but at least we don't
choke on parsing.
Closes #4755.
This will mean some increase in the time it takes to
produce an image-heavy PDF with xelatex, but it will
make tables of contents correct, which is more important.
Note that the production time should also be decreased
by the previous commit, which fixed a logic error
affecting the number of runs. That change might mitigate
the effect of this one.
Non-ascii characters were not stripped from identifiers even if the
`ascii_identifiers` extension was enabled (which it is by default for
gfm).
Closes #4742
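The intended effect on an identifier can be pictured with a deliberately simplified Python sketch. It only drops characters outside the ASCII range; the function name `strip_non_ascii` is hypothetical, and the real extension may treat some characters differently.

```python
def strip_non_ascii(identifier):
    """Drop every character outside the ASCII range (simplified model)."""
    return "".join(ch for ch in identifier if ord(ch) < 128)

print(strip_non_ascii("caf\u00e9-menu"))
```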
As noted [here](https://tex.stackexchange.com/a/49805) ([beamer
commit here](ff70090f36)),
`noframenumbering` is an undocumented, but long existing option
to disable frame numbering for a particular slide. This is useful
to avoid numbering backup slides.
Ideally we'd turn these on only when reading beamer, but currently
beamer is not distinguished from latex as an input format.
This commit also activates parsing of overlay specifications
after commands in general (e.g. `\item`), since they can occur
in many contexts in beamer.
Closes #4669.
Error messages produced by Lua were not displayed by Pandoc. The writer
was using the bottom-most stack element, while the error message is the
top-most element. This led the writer to always show "Lua 5.3" as the
error message, disregarding the actual message.
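The difference between the two stack positions can be modelled in a few lines of Python. This is a toy model of a stack as a list, not the Lua C API; the function name `error_message` and the sample error text are illustrative assumptions.

```python
def error_message(stack):
    """Return the error message from a Lua-style stack (simplified model).

    The stack is modelled as a list whose first element is the
    bottom-most entry and whose last element is the top-most entry.
    """
    return stack[-1]  # top of the stack, not stack[0]

# Bottom-most entry first; the actual error sits on top.
stack = ["Lua 5.3", "attempt to call a nil value"]
print(error_message(stack))
```

Reading `stack[0]` instead would reproduce the bug: the same bottom-most value would be reported regardless of the error.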
Use a function from the *filepath* library to check whether a string is
a valid file name. The custom validity checker that was used before gave
wrong results, e.g. for absolute file paths on
Windows (kawabata/ox-pandoc#52).
This adjusts the path from a file: URI in a way that is sensitive
to Windows/Linux differences. Thus, on Windows,
`/c:/foo` gets interpreted as `c:/foo`, but on Linux,
`/c:/foo` gets interpreted as `/c:/foo`.
See #4613.
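The platform-sensitive adjustment can be sketched in Python as follows. The function name `uri_path_to_file_path` and the explicit `on_windows` flag are assumptions made so that both behaviours can be shown on any machine; this is not the code from the commit.

```python
def uri_path_to_file_path(path, on_windows):
    """Adjust the path component of a file: URI (hypothetical helper).

    On Windows a leading slash before a drive letter is dropped, so
    "/c:/foo" becomes "c:/foo"; elsewhere the path is left unchanged.
    """
    has_drive = (
        len(path) >= 3
        and path[0] == "/"
        and path[1].isalpha()
        and path[2] == ":"
    )
    if on_windows and has_drive:
        return path[1:]
    return path
```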
* Revert "PDF: Use withTempDir in html2pdf." We're going back to using tmpFile instead of piping
* Revert "html2pdf: inject base tag wih current working directory (#4443)"
Fixes #4413
If we do not catch these errors, any malformed entry in a media bag
could cause the loss of a whole document output. An example of
malformed entry is an entry with an empty file path.
Also `\MakeTextUppercase`, `\MakeTextLowercase` from textcase
and `\uppercase`, `\lowercase`.
We don't mimic exactly the quirky semantic differences between
these commands, but just uppercase/lowercase regular strings within
them. We leave commands and code alone.
Closes #4595.
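The simplified semantics described above (change the case of regular strings, leave commands and code alone) can be sketched in Python over a toy token list. The `(kind, text)` token model and the function name `uppercase_tokens` are assumptions made for the example and do not reflect the reader's internal types.

```python
def uppercase_tokens(tokens):
    """Uppercase plain strings; leave commands and code untouched.

    Each token is a (kind, text) pair, where kind is "str", "cmd", or
    "code". This token model is an assumption made for the example.
    """
    return [
        (kind, text.upper() if kind == "str" else text)
        for kind, text in tokens
    ]

tokens = [("str", "see "), ("cmd", "\\alpha"), ("code", "x = 1")]
print(uppercase_tokens(tokens))
```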
Removed `--latexmathml`, `--gladtex`, `--mimetex`, `--jsmath`, `-m`,
`--asciimathml` options.
Removed `JsMath`, `LaTeXMathML`, and `GladTeX` constructors from
`Text.Pandoc.Options.HTMLMathMethod` [API change].
Removed unneeded data file LaTeXMathML.js and updated tests.
Bumped version to 2.2.
* Markdown writer now includes a blank line at the end
of the row in a single-row multiline table, to prevent it from being
interpreted as a simple table. Closes #4578.
* Markdown reader does a better job computing the relative width of
the last column in a multiline table, so we can round-trip tables
without constantly shrinking the last column.
Otherwise we can have problems with things like epstopdf.pl,
which pdflatex runs to convert eps files and which won't run
on a file above the working directory in restricted mode.
Instead of application/postscript. This will ensure that
we retain the eps extension after reading the image into
a mediabag and writing it again. See #2067.
Previously we used an odd mix of 3- and 4-space indentation.
Now we use 3-space indentation, except for ordered lists,
where indentation must depend on the width of the list marker.
Closes #4563.
This prevents a multiline code block in Word from being read as
separate paragraphs. This takes place in the Combine module, so that it
occurs during the normal combining of divs during conversion.
Note that this specifies that the attributes of the `CodeBlock`s must
be the same. The docx reader creates `CodeBlock`s without attrs, so this
is trivially satisfied.
Previously the parser tried to be efficient -- if no end double
quote was found, it would just return the contents. But this
could backfire in a case like:
**this should "be bold**
since the fallback would return the content `"be bold**` and the
closing boldface delimiter would never be encountered.
* Use `\f[R]` rather than `\f[]` to reset. The latter
returns to the previous font, which gives unintended
results in some cases.
* Use `\f[BI]` and `\f[CB]` in headers, instead of `\f[I]` and `\f[C]`,
since the header font is automatically bold.
* Use `\f[CB]` rather than `\f[BC]` for monospace bold.
Closes #4552.
+ Create pdf anchor for a Div with an identifier.
+ Escape `/` character in anchor ids.
+ Improve escaping for anchor ids: we now use _uNNN_ instead of uNNN
to avoid ambiguity.
This is intended to help with #4515; however, in my tests, the
link to the reference does not seem to work. I'm not sure why.
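The reason for delimiting the escape with underscores can be shown with a small Python sketch. Several details here are assumptions made for the illustration and are not stated above: that NNN is the decimal code point, that only ASCII letters and digits are kept as-is, and the function name `escape_anchor_id`.

```python
def escape_anchor_id(identifier):
    """Escape an anchor id, delimiting each escape with underscores.

    Assumptions for this sketch: ASCII letters and digits are kept, every
    other character (including "/") is escaped, and NNN is the decimal
    code point.
    """
    out = []
    for ch in identifier:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append(f"_u{ord(ch)}_")
    return "".join(out)

print(escape_anchor_id("a/b"))
```

Under these assumptions, an escape without delimiters would be ambiguous whenever a digit follows the escaped character, since the digit could be read as part of the code point. The trailing underscore marks where the escape ends.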
Previously this looked in the filesystem, even if pandoc
was compiled with `embed_data_files` (and sometimes it looked
in a nonexistent build directory). Now the bash completion
script just includes a hard-coded list of data file names.
See #4549.
HorizontalRule corresponds to the <hr> element in the default output
format, HTML. The current HTML standard defines the <hr> element as a
"paragraph-level thematic break". In typography it is often
represented by extra space or a centered asterism ("⁂"), but since
FB2 does not support text centering, an empty line (similar to extra space)
is the only solution.
Line breaks, on the other hand, don't generate <empty-line />
anymore. Previously line breaks generated <empty-line /> element
inside paragraph, which is not allowed. So, this commit addresses
issue #2424 ("FB2 produced by pandoc doesn't validate").
FB2 does not have a way to represent line breaks inside paragraphs.
They are replaced with an LF character, which is not rendered by
FB2 readers, but at least preserves some information.