Makes the code more consistent and makes it easier to use double quotes
in strings, which is the usual quoting style used for HTML attributes.
Closes: #7487
If a file path does not exist relative to the working directory, but
it does exist relative to the user data directory, and it exists outside
of the user data directory, do not read it. This applies to readDataFile
and readMetadataFile in PandocMonad and, by extension, any module that
uses these by passing them relative paths.
If files specified with `--metadata-file` are not found in the working
directory, look in `$DATADIR/metadata`.
Expose new `readMetadataFile` function from Text.Pandoc.Class
[API change].
Expose new `PandocCouldNotFindMetadataFileError` constructor for
`PandocError` from Text.Pandoc.Error [API change].
Closes#5876.
If speaker notes (a Div with class 'notes') occur right
after a section heading, but above slide level, the
resulting `\note{..}` caommand should not be wrapped in
a frame, as that will cause a spurious blank slide.
Closes#7857.
With the new (default) line wrapping of HTML, in
conjunction with the default CSS which includes
`code { whitespace: pre-wrap; }`, spurious line
breaks could be introduced into inline code.
Closes#7858.
If a table has explicit column width information *and* the
content extends beyond the `--columns` width, we need to
adjust the widths of the pipe separators to encode this width
information.
Closes#7847.
Adjacent docx tables need to be separated by an empty paragraph. If
there's a RawBlock between tables which renders to nothing, be sure to
still insert the empty paragraph so that they will not collapse
together.
Fixes#7724
The V font is defined conditionally, so that it renders
like CB in output formats that support that, and like B
in those that don't (e.g. the terminal).
We could just redefine C, but this would affect code
blocks, too, and putting them all in boldface looks ugly,
I think.
Possible drawback: fragments created by pandoc's man
writer will presuppose a nonstandard V font.
Closes#7506.
Supersedes 253467a549.
Closes#7506.
This also allows us to get rid of some special casing
on definition lists that ensured that options in code
spans would be boldface. (If this change is ever reverted,
we'll need that again.)
Make sure that we only create one bullet per list item in docx. In
particular, when a div is a list item, its contained paragraphs will
now no longer wrongly get individual bullets.
This is accomplished by making sure that for each list, we only use
the associated numId once. Any repeated use would add incorrect
bullets to the document.
Closes#7689
The commit 7a9832166e
had the effect that blank lines would be collapsed
in HTML attributes.
We also roll back a change that collapsed multiple
spaces into one.
This adds support for alphabetical lists in org by enabling the
extension Ext_fancy_lists, mimicking the behaviour of Org Mode when
org-list-allow-alphabetical is enabled.
Enabling Ext_fancy_lists will also make Pandoc differentiate between the
delimiters of ordered lists (periods or closing parentheses). Org does
this differentiation by default when exporting to some formats (e.g.
plain text) but does not in others (e.g. html and latex), so I decided
to copy Pandoc's markdown reader behaviour.
Prior to this commit the MediaWiki writer always added the display
text for a wiki link:
* [[Help|Help]]
* [[Bubbles|Everyone loves bubbles]]
However the display text in the first example is redundant since
MediaWiki uses the target as the default display text. The result being:
* [[Help]]
* [[Bubbles|Everyone loves bubbles]]
This adds support for counter cookies in org lists. Such cookies are
used to override the item counter in ordered lists. In org it is
possible to set the counter at any list item, but since Pandoc AST does
not support this, we restrict the usage to setting an offset for the
entire ordered list, by using the cookie in the first list item.
Note that even though unordered lists do not have counters, Org Mode
still parses such cookies in unordered lists and suppresses them in the
output, so we do the same.
Also, even though org-list-allow-alphabetical is disabled in Emacs by
default, for some reason alphabetical cookies are always parsed and used
in Org Mode regardlessly of whether this option is enabled or the list
style is decimal, so we do the same.
E.g.
2. test
3. test
Is parsed as an ordered list starting at 1, as before. This also
conforms to Org Mode behaviour.
1. [@2] test
2. test
Is now parsed as an ordered list starting at 2, so that it conforms to
Org Mode behaviour.
Note that when parsing
1. [@2] test
2. [@9] test
the second cookie is silenced and the entire list starts at 2. This is
because the current Pandoc AST does not support expressing a change in
the counter at a specific item.
Previously some displayed formulas would be floated above
a preceding text line. This is fixed by setting vertical-rel
to 'text' rather than 'paragraph-content'.
Closes#7777.
- With `--wrap=none`, we now output line breaks between
block-level elements. Previously they were omitted
entirely, so the whole document was on one line, unless
there were literal line breaks in pre sections. This makes
the HTML writer's behavior more consistent with that of
other writers.
- Put newline after `<dd>`.
- Put newlines after block-level elements in footnote section.
Previously the HTML writer was exceptional in not being
sensitive to the `--wrap` option. With this change `--wrap`
now works for HTML. The default (as with other formats) is
automatic wrapping to 72 columns.
A new internal module, T.P.Writers.Blaze, exports `layoutMarkup`.
This converts a blaze Html structure into a doclayout Doc Text.
In addition, we now add a line break between an `img` tag
and the associated `figcaption`.
Note: Output is never wrapped in `writeHtmlStringForEPUB`.
This accords with previous behavior since previously the HTML
writer was insensitive to `--wrap` settings. There's no real
need to wrap HTML inside a zipped container.
Note that the contents of script, textarea, and pre tags are
always laid out with the `flush` combinator, so that unwanted
spaces won't be introduced if these occur in an indented context
in a template.
Closes#7764.
The function behaves like the default `type` function from Lua's
standard library, but is aware of pandoc userdata types. A typical
use-case would be to determine the type of a metadata value.
Markua is a markdown variant used by Leanpub.
More information about Markua can be found at https://leanpub.com/markua/read.
Adds a new exported function `writeMarkua` from T.P.Writers.Markdown.
[API change]
Closes#1871.
Co-authored by Tim Wisotzki and Samuel Lemmenmeier.
Previously, both `fmt == f` case and Image have a rank of 1.
In the end, e.g. from ipynb to html conversion,
if both html and image exists, it actually prefers the image.
This commit changes this, so that fmt == f is always highest rank,
and rank never collides.
This is achieved by keeping fmt == f case having rank 1,
and every other rank increased by 1.
Property tests that roundtrip elements through the Lua stack are
performed in the test-suite of the pandoc-lua-marshal package. No need
to test this here as well.
Write RawBlock of markdown in code-cell output.
#7561 makes the ipynb reader reads code-cell output with mime
"text/markdown" to a RawBlock of markdown
This commit makes the ipynb writer writes this RawBlock of markdown
back inside a code-cell output with the same mime, preserving this
information in round-trip
Add tests of ipynb reader (#7561) and ipynb writer (#7563)'s ability to
handle a "text/markdown" mime type in a code-cell output
- `walk` methods are added to `Block` and `Inline` values; the methods
are similar to `pandoc.utils.walk_block` and
`pandoc.utils.walk_inline`, but apply to filter also to the element
itself, and therefore return a list of element instead of a single
element.
- Functions of name `Doc` are no longer accepted as alternatives for
`Pandoc` filter functions. This functionality was undocumented.
The new `pandoc.Inlines` function behaves identical on string input, but
allows other Inlines-like arguments as well.
The `pandoc.utils.text` function could be written as
function pandoc.utils.text (x)
assert(type(x) == 'string')
return pandoc.Inlines(x)
end
The marshaling functions for pandoc's AST are extracted into a separate
package. The package comes with a number of changes:
- Pandoc's List module was rewritten in C, thereby improving error
messages.
- Lists of `Block` and `Inline` elements are marshaled using the new
list types `Blocks` and `Inlines`, respectively. These types
currently behave identical to the generic List type, but give better
error messages. This also opens up the possibility of adding
element-specific methods to these lists in the future.
- Elements of type `MetaValue` are no longer pushed as values which
have `.t` and `.tag` properties. This was already true for
`MetaString` and `MetaBool` values, which are still marshaled as Lua
strings and booleans, respectively. Affected values:
+ `MetaBlocks` values are marshaled as a `Blocks` list;
+ `MetaInlines` values are marshaled as a `Inlines` list;
+ `MetaList` values are marshaled as a generic pandoc `List`s.
+ `MetaMap` values are marshaled as plain tables and no longer
given any metatable.
- The test suite for marshaled objects and their constructors has
been extended and improved.
- A bug in Citation objects, where setting a citation's suffix
modified it's prefix, has been fixed.
We were including the ams environment type in addition
to the number. This is proper behavior for `\cref` but
not for `\ref`. To support `\cref` we need to store
the environment label separately.
Fixed calculation of maximum column widths in pipe tables.
It is now based on the length of the markdown line, rather
than a "stringified" version of the parsed line. This should
be more predictable for users. In addition, we take into account
double-wide characters such as emojis.
Closes#7713.
The function converts a string to `Inlines`, treating interword spaces
as `Space`s or `SoftBreak`s. If you want a `Str` with literal spaces,
use `pandoc.Str`.
Closes: #7709
Using a Lua string where a list of inlines is expected will cause the
string to be split into words, replacing spaces and tabs into
`pandoc.Space()` elements and newlines into `pandoc.SoftBreak()`.
The previous behavior was to treat the string `s` as `{pandoc.Str(s)}`.
The old behavior can be recovered by wrapping the string into a table
`{s}`.
We need to generate a span when the header's ID doesn't match
the one MediaWiki would generate automatically. But MediaWiki's
generation scheme is different from ours (it uses uppercase letters,
and `_` instead of `-`, for example).
This means that in going from markdown -> mediawiki, we'll now get
spans before almost every heading, unless explicit identifiers are
used that correspond to the ones MediaWiki auto-generates.
This is uglier output but it's necessary for internal links to
work properly.
See #7697.
The `lpeg` and `re` modules are loaded into globals of the respective
name, but they are not necessarily registered as loaded packages. This
ensures that
- the built-in library versions are preferred when setting the globals,
- a shared library is used if pandoc has been compiled without `lpeg`,
and
- the `require` mechanism can be used to load the shared library if
available, falling back to the internal version if possible and
necessary.
Reader options can now be passed as an optional third argument to
`pandoc.read`. The object can either be a table or a ReaderOptions value
like `PANDOC_READER_OPTIONS`. Creating new ReaderOptions objects is
possible through the new constructor `pandoc.ReaderOptions`.
Closes: #7656
* Support for <indexterm>s when reading DocBook
* Update implementation status of `<n-ary>` tags
* Remove non-idiomatic parentheses
* More complete `<indexterm>` support, with tests
Co-authored-by: Rowan Rodrik van der Molen <rowan@ytec.nl>