Previously we always added an empty div before the list
item, but this created problems with spacing in tight
lists. Now we do this:
* If the list item contents begin with a Plain block,
  we modify the Plain block by adding a Span around
  its contents.
* Otherwise, we add a Div around the contents of the
  list item (instead of adding an empty Div to the
  beginning, as before).
Closes #3596.
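The two cases above can be sketched on a toy AST. This is an illustrative Python sketch, not pandoc's Haskell code: the dict-based node shape, the `attr` parameter, and the name `wrap_list_item` are all assumptions made for the example.

```python
def wrap_list_item(blocks, attr):
    """Attach `attr` to a list item's contents without adding an empty Div.

    `blocks` is a list of toy block nodes such as {"t": "Plain", "c": [...]}.
    (Hypothetical representation, chosen only for this sketch.)
    """
    if blocks and blocks[0]["t"] == "Plain":
        # Case 1: wrap the inlines of the leading Plain block in a Span.
        span = {"t": "Span", "attr": attr, "c": blocks[0]["c"]}
        return [{"t": "Plain", "c": [span]}] + blocks[1:]
    # Case 2: wrap all of the item's contents in a single Div.
    return [{"t": "Div", "attr": attr, "c": blocks}]
```

Because the leading Plain block stays a Plain block in the first case, a tight list remains tight, which is the spacing problem the empty Div used to cause.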
Filtering functions take element components as arguments instead of the
whole block elements. This resembles the way elements are handled in
custom writers.
Instead of taking the whole inline element, forcing users to destructure it
themselves, the components of the elements are passed to the filtering
functions.
Plain text readers are exposed to lua scripts via the `pandoc.reader`
submodule, which is further subdivided by format. Converting e.g. a
markdown string into a pandoc document is possible from within lua:
    doc = pandoc.reader.markdown.read_doc("Hello, World!")
A `read_block` convenience function is provided for all formats,
although it will still parse the whole string but return only the first
block as the result.
Custom reader options are not supported yet; default options are used
for all parsing operations.
Closes #3547.
Macro definitions are inserted in the template when there is highlighted
code.
Limitations: background colors and underline are currently not
supported.
This avoids overly narrow labels for ordered lists with
() delimiters.
However, arguably it creates overly wide labels for bullets.
Also, lists now start flush with the margin, rather than
indented.
Fixes #2421.
Modified template to include a `<back>` and `<body>` section.
This should give authors more flexibility, e.g. to put
acknowledgements metadata in `<back>`. References are
automatically extracted and put into `<back>`.
* Fix lstinline handling: lstinline with braces can be used (verb cannot be used with braces)
* Use codeWith and determine the language from lstinline
* Improve code
* Add another test: convert lstinline without language option
* New module: Text.Pandoc.Writers.Ms.
* New template: default.ms.
* The writer uses texmath's new eqn writer to convert math
to eqn format, so an ms file produced with this writer
should be processed with `groff -ms -e` if it contains
math.
This is enabled by default in pandoc and GitHub markdown but not the
other flavors.
This requires a space between the opening #'s and the header
text in ATX headers (as CommonMark does but many other implementations
do not). This is desirable to avoid falsely capturing things like
    #hashtag
or
    #5
Closes #3512.
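The space requirement can be approximated with a regular expression. This is a minimal Python sketch, not pandoc's parser: the name `is_atx_header` is hypothetical, and the pattern ignores details such as leading indentation.

```python
import re

# One to six '#' characters, followed by whitespace or the end of the line.
ATX_WITH_SPACE = re.compile(r"^#{1,6}(\s|$)")

def is_atx_header(line):
    """Return True if `line` opens an ATX header when a space is required."""
    return bool(ATX_WITH_SPACE.match(line))
```

With this rule, `# Header` is treated as a header, while `#hashtag` and `#5` are left as ordinary text.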
* Add `--lua-filter` option. This works like `--filter` but takes pathnames of special lua filters and uses the lua interpreter baked into pandoc, so that no external interpreter is needed. Note that lua filters are all applied after regular filters, regardless of their position on the command line.
* Add Text.Pandoc.Lua, exporting `runLuaFilter`. Add `pandoc.lua` to data files.
* Add private module Text.Pandoc.Lua.PandocModule to supply the default lua module.
* Add Tests.Lua to tests.
* Add data/pandoc.lua, the lua module pandoc imports when processing its lua filters.
* Document in MANUAL.txt.
This was failing because of a small discrepancy in markdown
table header line lengths on appveyor.
It's a minor issue, I can't see what is causing it, and
it's irrelevant to the issue this is testing, so we'll
just write native for this test.
Closes #1905.
Removed stateChapters from ParserState.
Now we parse chapters as level 0 headers, and parts as level -1 headers.
After parsing, we check for the lowest header level, and if it's
less than 1 we bump everything up so that 1 is the lowest header level.
So `\part` will always produce a header; no command-line options
are needed.
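The level adjustment described above can be sketched as follows. This is an illustrative Python version working on a plain list of header levels; the function name `normalize_header_levels` is an assumption, and the real reader operates on the parsed document rather than on a list of integers.

```python
def normalize_header_levels(levels):
    """Shift header levels so that the lowest level in the document is 1.

    `levels` lists header levels in document order, with chapters parsed
    as level 0 and parts as level -1.
    """
    if not levels:
        return []
    lowest = min(levels)
    if lowest >= 1:
        # No chapters or parts were seen: leave the levels untouched.
        return list(levels)
    shift = 1 - lowest
    return [level + shift for level in levels]
```

A document with a part, a chapter, and a section (levels -1, 0, 1) comes out as levels 1, 2, 3; a document that uses only sections is unchanged.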
Previously only autogenerated ids were added to the list
of header identifiers in state, so explicit ids weren't taken
into account when generating unique identifiers. Duplicated
identifiers could result.
This simple fix ensures that explicitly given identifiers are
also taken into account.
Fixes #1745.
Note some limitations, however. An autogenerated identifier
may still coincide with an explicit identifier that is given
for a header later in the document, or with an identifier on
a div, span, link, or image. Fixing this would be much more
difficult, because we need to run `registerHeader` before
we have the complete parse tree (so we can't get a complete
list of identifiers from the document by walking the tree).
However, it might be worth issuing warnings for duplicate
header identifiers; I think we can do that. It is not
common for headers to have the same text, and the issue
can always be worked around by adding explicit identifiers,
if the user is aware of it.
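The fix and its remaining limitation can be sketched in a few lines. This Python version is illustrative only: the signature of `register_header` here is invented for the example, and the numeric suffix scheme (`-1`, `-2`, ...) is an assumption about how a clash is resolved.

```python
def register_header(seen, auto_id, explicit_id=None):
    """Return the identifier for a header and record it in `seen`.

    `seen` is the set of header identifiers registered so far.
    """
    if explicit_id is not None:
        # The fix: explicit identifiers are recorded as well, so that
        # later autogenerated identifiers can avoid them.
        seen.add(explicit_id)
        return explicit_id
    candidate = auto_id
    n = 1
    while candidate in seen:
        # Assumed disambiguation scheme: append an increasing suffix.
        candidate = f"{auto_id}-{n}"
        n += 1
    seen.add(candidate)
    return candidate
```

An explicit `intro` followed by a header whose autogenerated identifier is also `intro` now yields `intro` and `intro-1`. In the reverse order the autogenerated `intro` is registered first and the later explicit `intro` duplicates it, which is the limitation noted above.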
* Previously we got overlong lists with `--wrap=none`. This is fixed.
* Previously a multiline list could become a simple list (and would
always become one with `--wrap=none`).
Closes #3384.
when they occur without space surrounding them.
E.g. equation, math.
This avoids incorrect vertical space around equations.
Closes #3309.
Closes #2171.
See also rstudio/bookdown#358.
E.g. an HTML table with two cells in the first row and one
in the second (but no row/colspan).
We now calculate the number of columns based on the longest
row (or the length of aligns or widths).
Closes #3337.
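The column calculation can be sketched as below. This is a Python illustration under one reading of the description, namely that the column count is the maximum of the longest row and the lengths of `aligns` and `widths`; the function name `column_count` is hypothetical.

```python
def column_count(rows, aligns=(), widths=()):
    """Number of table columns: the length of the longest row, or the
    length of `aligns` or `widths` if either of those is longer."""
    longest_row = max((len(row) for row in rows), default=0)
    return max(longest_row, len(aligns), len(widths))
```

For the example above, a table with two cells in the first row and one in the second is counted as having two columns.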
When multiple YAML metadata blocks are used, and two define
the same field, the value defined first takes precedence,
according to the manual. This was changed briefly in
ba3ee62323. This commit
reverts to the original behavior and adds a test case.
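The precedence rule can be sketched with plain dictionaries. This is an illustrative Python version, not the Markdown reader's code; the name `merge_metadata_blocks` and the use of dicts for metadata blocks are assumptions.

```python
def merge_metadata_blocks(blocks):
    """Merge metadata blocks in document order.

    When two blocks define the same field, the value defined first wins.
    """
    merged = {}
    for block in blocks:
        for key, value in block.items():
            # setdefault keeps an existing value, so earlier blocks win.
            merged.setdefault(key, value)
    return merged
```

Given a first block with `title: A` and a second with `title: B` and `date: 2017`, the merged metadata keeps `title: A` and gains `date: 2017`.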
All templates now include `code{white-space: pre-wrap}`
and CSS for `q` if `--html-q-tags` is used.
Previously some templates had `pre` and others `pre-wrap`;
the `q` styles were only sometimes included.
See #3485.
Added test cases.
Fixed HTML reader to parse a span with class "smallcaps" as
SmallCaps.
Fixed Markdown writer to render SmallCaps as a native span
when native spans are enabled.
Polyglot markup is HTML5 that is also valid XHTML. See
<https://www.w3.org/TR/html-polyglot>. With this change, pandoc's
html5 writer creates HTML that is both valid HTML5 and valid XHTML.
See jgm/pandoc-templates#237 for prior discussion.
* Add xml namespace to `<html>` element.
* Make all `<meta>` elements self closing.
See <https://www.w3.org/TR/html-polyglot/#empty-elements>.
* Add `xml:lang` attribute on `<html>` element, defaulting to blank, and
always include `lang` attribute, even when blank. See
<https://www.w3.org/TR/html-polyglot/#language-attributes>.
* Update test files for template changes.
The key justification for having language values default to blank: it
turns out the HTML5 spec requires it (as I read it). Under
[the HTML5 spec, section "3.2.5.3. The lang and xml:lang
attributes"](https://www.w3.org/TR/html/dom.html#the-lang-and-xmllang-attributes),
providing attributes with blank contents both:
* Has meaning, "unknown", and
* Is a MUST (written as "must") if a language value is not provided ...
> The lang attribute (in no namespace) specifies the primary language
> for the element's contents and for any of the element's attributes that
> contain text. Its value must be a valid BCP 47 language tag, or the
> empty string. Setting the attribute to the empty string indicates that
> the primary language is unknown.
In short, it seems that where a language value is not provided then a
blank value MUST be provided for Polyglot Markup conformance, because
the HTML5 spec stipulates a "must". So although the Polyglot Markup spec
is unclear on this issue, it would seem that, if it were correctly written,
it would therefore require blank attributes.
Further justifications are found at
https://github.com/jgm/pandoc-templates/issues/237#issuecomment-275584181
(but the HTML5 spec justification given above would seem to be the
clincher).
In addition to having lang values default to blank, I recommend that, when
an author does not provide a lang value, pandoc emit a warning message like
the following on command execution:
> Polyglot markup stipulates that 'The root element SHOULD always specify
> the language'. It is therefore recommended you specify a language value in
> your source document. See
> <https://www.w3.org/International/articles/language-tags/> for valid
> language values.
The citations appear at the end of the document as a definition
list in a special div with id `citations`.
Citations link to the definitions.
Added stateCitations to ParserState.
Closes #853.
Previously we would refuse to parse anything as raw inline if
it was in the blockCommands list. Now we allow exceptions
if they're listed under ignoreInlines in inlineCommands.
This should make it easier e.g. to include an \hspace
between two side-by-side raw LaTeX tables.
Otherwise things like `\noindent foo` break and turn into
`\noindentfoo`.
Affects `-f latex+raw_tex` and `-f markdown` (and other formats
that allow `raw_tex`).
Closes #1773.
For example, in
    \begin{tabular}{>{$}l<{$}>{$}l<{$} >{$}l<{$}}
each cell will be interpreted as if it has a `$`
before its content and a `$` after (math mode).
Currently the support for the `.. table` directive is a bit
limited; we don't yet support the `widths` field. But at least
you can have a proper captioned table.
These were confusing.
Now we rely on the +raw_tex or +raw_html extension with latex
or html input.
Thus, instead of
    --parse-raw -f latex
we use
    -f latex+raw_tex
and instead of
    --parse-raw -f html
we use
    -f html+raw_html
The org writer was inserting two spaces after list bullets. Emacs
Org-mode defaults to a single space, so behavior is changed to reflect
this.
Closes: #3417