If the metadata field is all on one line, we try to interpret
it as Inlines, and only try parsing as Blocks if that fails.
If it extends over one line (including possibly the `|` or
`>` character signaling an indented block), then we parse as
Blocks.
This was motivated by some German users finding that
date: '22. Juin 2017'
got parsed as an ordered list.
Closes#3755.
Note that if the table has a first page header and a
continuation page header, the notes will appear only
on the first occurrence of the header.
Closes#2378.
* New module Text.Pandoc.Readers.Vimwiki, exporting readVimwiki [API change].
* New input format `vimwiki`.
* New data file, `data/vimwiki.css`, for displaying the HTML produced by this reader and pandoc's HTML writer in the style of vimwiki's own HTML export.
Note that as a result of this change, the following,
which formerly produced a header with two lines separated
by a line break, will now produce a header followed by a
paragraph:
# Hi\
there
This may affect some existing documents that relied on
this undocumented and unintended behavior.
This change makes pandoc more consistent with other
Markdown implementations, and with itself (since the two-space
version of a line break doesn't work inside ATX headers, and
neither version works inside Setext headers).
Closes#3730.
* XML.toEntities: changed type to Text -> Text.
* Shared.tabFilter -- fixed so it strips out CRs as before.
* Modified writers to take Text.
* Updated tests, benchmarks, trypandoc.
[API change]
Closes#3731.
Currently we only handle the form `0.9\linewidth`.
Anything else would have to be converted to a percentage,
using some kind arbitrary assumptions about line widths.
See #3709.
The Emacs default is to include tags in the headline when exporting.
Instead of just empty spans, which contain the tag name as attribute,
tags are rendered as small caps and wrapped in those spans.
Non-breaking spaces serve as separators for multiple tags.
Babel result blocks can have block attributes like captions and names.
Result blocks with attributes were not recognized and were parsed as
normal blocks without attributes.
Fixes: #3706
Until now, org-ref cite keys included special characters also at the
end. This caused problems when citations occur right before colons or
at the end of a sentence.
With this change, all non alphanumeric characters at the end of a cite
key are ignored.
This also adds `,` to the list of special characters that are legal
in cite keys to better mirror the behaviour of org-export.
With `--reference-location` of `section` or `block`, pandoc
will now repeat references that have been used in earlier
sections.
The Markdown reader has also been modified, so that *exactly*
repeated references do not generate a warning, only
references with the same label but different targets.
The idea is that, with references after every block,
one might want to repeat references sometimes.
Closes#3701.
Emacs parses org documents into a tree structure, which is then
post-processed during exporting. The reader is changed to do the same,
turning the document into a single tree of headlines starting at
level 0.
Fixes: #3695
- Export `inEm` from ImageSize [API change].
- Change `showFl` and `show` instance for `Dimension` so
extra decimal places are omitted.
- Added `Em` as a constructor of `Dimension` [API change].
- Allow `em`, `cm`, `in` to pass through without conversion
in HTML, LaTeX.
Closes#3450.
This is now the default for pandoc's Markdown.
It allows whitespace between the two parts of a
reference link: e.g.
[a] [b]
[b]: url
This is now forbidden by default.
Closes#2602.
This is a verison of parseFromString specialied to
ParserState, which resets stateLastStrPos at the end.
This is almost always what we want.
This fixes a bug where `_hi_` wasn't treated as emphasis in
the following, because pandoc got confused about the
position of the last word:
- [o] _hi_
Closes#3690.
Parsing of smart quotes and special characters can either be enabled via
the `smart` language extension or the `'` and `-` export options. Smart
parsing is active if either the extension or export option is enabled.
Only smart parsing of special characters (like ellipses and en and em
dashes) is enabled by default, while smart quotes are disabled.
This means that all smart parsing features will be enabled by adding the
`smart` language extension. Fine-grained control is possible by leaving
the language extension disabled. In that case, smart parsing is
controlled via the aforementioned export OPTIONS only.
Previously, all smart parsing was disabled unless the language extension
was enabled.
Support for the `#+INCLUDE:` file inclusion mechanism was added.
Recognized include types are *example*, *export*, *src*, and normal org
file inclusion. Advanced features like line numbers and level selection
are not implemented yet.
Closes: #3510
The grid table parsers for markdown and rst was combined into one single
parser, slightly changing parsing behavior of both parsers:
- The markdown parser now compactifies block content cell-wise: pure
text blocks in cells are now treated as paragraphs only if the cell
contains multiple paragraphs, and as plain blocks otherwise. Before,
this was true only for single-column tables.
- The rst parser now accepts newlines and multiple blocks in header
cells.
Closes: #3638
* LaTeX: Load `parskip` before `hyperref`.
According to `hyperref` package's `README.pdf`, page 22, `hyperref` package
should be loaded after `parskip` package.
* Adjust tests for previous change.
Previously we inadvertently interpreted indented HTML as
code blocks. This was a regression.
We now seek to determine the indentation level of the contents
of an HTML block, and (optionally) skip that much indentation.
As a side effect, indentation may be stripped off of raw
HTML blocks, if `markdown_in_html_blocks` is used. This
is better than having things interpreted as indented code
blocks.
Closes#1841.
* Fix keyval funtion: pandoc did not parse options in braces correctly. Additionally, dot, dash, and colon were no valid characters
* Add | as possible option value
* Improved code
Previously the Markdown writer would sometimes create links where there
were none in the source. This is now avoided by selectively escaping bracket
characters when they occur in a place where a link might be created.
Closes#3619.
Ensure that we do not generate reference links
whose labels differ only by case.
Also allow implicit reference links when the link
text and label are identical up to case.
Closes#3615.
The implicitly defined global filter (i.e. all element filtering
functions defined in the global lua environment) is used if no filter is
returned from a lua script. This allows to just write top-level
functions in order to define a lua filter. E.g
function Emph(elem) return pandoc.Strong(elem.content) end
Previously the LaTeX writer created invalid LaTeX
when `--listings` was specified and a code span occured
inside emphasis or another construction.
This is because the characters `%{}\` must be escaped
in lstinline when the listinline occurs in another
command, otherwise they must not be escaped.
To deal with this, adoping Michael Kofler's suggestion,
we always wrap lstinline in a dummy command `\passthrough`,
now defined in the default template if `--listings` is
specified. This way we can consistently escape the
special characters.
Closes#1629.
A single `read` function parsing pandoc-supported formats is added to
the module. This is simpler and more convenient than the previous method
of exposing all reader functions individually.
We now issue `<div class="line-block">` and include a
default definition for `line-block` in the default
templates, instead of hard-coding a `style` on the
div.
Closes#1623.
A figure with two subfigures turns into two pandoc
figures; the subcaptions are used and the main caption
ignored, unless there are no subcaptions.
Closes#3577.
Source block parameter names are no longer prefixed with *rundoc*. This
was intended to simplify working with the rundoc project, a babel
runner. However, the rundoc project is unmaintained, and adding those
markers is not the reader's job anyway.
The original language that is specified for a source element is now
retained as the `data-org-language` attribute and only added if it
differs from the translated language.
The line-numbering switch that can be given to source blocks (`-n` with
an start number as an optional parameter) is parsed and translated to a
class/key-value combination used by highlighting and other readers and
writers.
Previously we always added an empty div before the list
item, but this created problems with spacing in tight
lists. Now we do this:
If the list item contents begin with a Plain block,
we modify the Plain block by adding a Span around
its contents.
Otherwise, we add a Div around the contents of the
list item (instead of adding an empty Div to the
beginning, as before).
Closes#3596.
Filtering functions take element components as arguments instead of the
whole block elements. This resembles the way elements are handled in
custom writers.
Instead of taking the whole inline element, forcing users to destructure it
themselves, the components of the elements are passed to the filtering
functions.
Plain text readers are exposed to lua scripts via the `pandoc.reader`
submodule, which is further subdivided by format. Converting e.g. a
markdown string into a pandoc document is possible from within lua:
doc = pandoc.reader.markdown.read_doc("Hello, World!")
A `read_block` convenience function is provided for all formats,
although it will still parse the whole string but return only the first
block as the result.
Custom reader options are not supported yet, default options are used
for all parsing operations.
Closes#3547.
Macro definitions are inserted in the template when there is highlighted
code.
Limitations: background colors and underline currently not
supported.
This avoids overly narrow labels for ordered lists with
() delimiters.
However, arguably it creates overly wide labels for bullets.
Also, lists now start flush with the margin, rather than
indented.
Fixes#2421.
Modified template to include a `<back>` and `<body>` section.
This should give authors more flexibility, e.g. to put
acknowledgements metadata in `<back>`. References are
automatically extracted and put into `<back>`.
* Fix lstinline handling: lstinline with braces can be used (verb cannot be used with braces)
* Use codeWith and determine the language from lstinline
* Improve code
* Add another test: convert lstinline without language option
* New module: Text.Pandoc.Writers.Ms.
* New template: default.ms.
* The writer uses texmath's new eqn writer to convert math
to eqn format, so a ms file produced with this writer
should be processed with `groff -ms -e` if it contains
math.
This is enabled by default in pandoc and GitHub markdown but not the
other flavors.
This requirse a space between the opening #'s and the header
text in ATX headers (as CommonMark does but many other implementations
do not). This is desirable to avoid falsely capturing things ilke
#hashtag
or
#5Closes#3512.
* Add `--lua-filter` option. This works like `--filter` but takes pathnames of special lua filters and uses the lua interpreter baked into pandoc, so that no external interpreter is needed. Note that lua filters are all applied after regular filters, regardless of their position on the command line.
* Add Text.Pandoc.Lua, exporting `runLuaFilter`. Add `pandoc.lua` to data files.
* Add private module Text.Pandoc.Lua.PandocModule to supply the default lua module.
* Add Tests.Lua to tests.
* Add data/pandoc.lua, the lua module pandoc imports when processing its lua filters.
* Document in MANUAL.txt.
This was failing because of a small discrepancy in markdown
table header line lengths on appveyor.
It's a minor issue, I can't see what is causing it, and
it's irrelevant to the issue this is testing, so we'll
just write native for this test.