Reader helper functions were defined in the top-level Text.Pandoc
module. These functions are moved to the Readers submodule as to enable
reuse in other submodules.
The lua filters and custom lua writer system defined very similar
StackValue instances for strings and tuples. These instance definitions
are extracted to a separate module to enable sharing.
These are caught (and lead to exit) in pandoc.hs, but
other uses of Text.Pandoc.App may want to recover in another
way.
Added PandocAppError to PandocError (API change).
This is a stopgap: later we should have a separate constructor
for each type of error.
Also fixed uses of 'exit' in Shared.readDataFile, and
removed 'err' from Shared (API change).
Finally, removed the dependency on extensible-exceptions.
See #3548.
Plain text readers are exposed to lua scripts via the `pandoc.reader`
submodule, which is further subdivided by format. Converting e.g. a
markdown string into a pandoc document is possible from within lua:
doc = pandoc.reader.markdown.read_doc("Hello, World!")
A `read_block` convenience function is provided for all formats,
although it will still parse the whole string but return only the first
block as the result.
Custom reader options are not supported yet, default options are used
for all parsing operations.
I think template haskell is robust enough now across platforms
that this will work.
Motivation: file-embed gives us better dependency tracking: if a data
file changes, ghc/stack/cabal know to recompile the Data module.
This also removes hsb2hs as a build dependency.
The 0.5.0 release of hslua fixes problems with lua C modules on linux.
The signature of the `loadstring` function changed, so a compatibility
wrapper is introduced to allow both 0.4.* and 0.5.* versions to be used.
* New module: Text.Pandoc.Writers.Ms.
* New template: default.ms.
* The writer uses texmath's new eqn writer to convert math
to eqn format, so a ms file produced with this writer
should be processed with `groff -ms -e` if it contains
math.
* Add `--lua-filter` option. This works like `--filter` but takes pathnames of special lua filters and uses the lua interpreter baked into pandoc, so that no external interpreter is needed. Note that lua filters are all applied after regular filters, regardless of their position on the command line.
* Add Text.Pandoc.Lua, exporting `runLuaFilter`. Add `pandoc.lua` to data files.
* Add private module Text.Pandoc.Lua.PandocModule to supply the default lua module.
* Add Tests.Lua to tests.
* Add data/pandoc.lua, the lua module pandoc imports when processing its lua filters.
* Document in MANUAL.txt.
This contains a list of strings that will be recognized by pandoc's
Markdown parser as abbreviations. (A nonbreaking space will
be inserted after the period, preventing a sentence space in
formats like LaTeX.)
Users can override the default by putting a file abbreviations
in their user data directory (`~/.pandoc` on *nix).
Closes#1905.
Removed stateChapters from ParserState.
Now we parse chapters as level 0 headers, and parts as level -1 headers.
After parsing, we check for the lowest header level, and if it's
less than 1 we bump everything up so that 1 is the lowest header level.
So `\part` will always produce a header; no command-line options
are needed.
+ Text.Pandoc.Writers.Shared
+ Text.Pandoc.Parsing
+ Text.Pandoc.Asciify
+ Text.Pandoc.Emoji
+ Text.Pandoc.ImageSize
[API change]
These are often helpful to people writing their own
reader or writer modules. Closes#3260.
This now contains the Verbosity definition previously
in Options, as well as a new LogMessage datatype that
will eventually be used instead of raw strings for
warnings.
This will enable us, among other things, to provide
machine-readable warnings if desired.
See #3392.
The App module provides a function that does a pandoc conversion,
based on option settings. The program (pandoc.hs) now does nothing
more than parse options and pass them to this function, which can
easily be used by other applications (e.g. a GUI wrapper).
The Opt structure has been further simplified.
API changes:
* New exposed module Text.Pandoc.App
* Text.Pandoc.Highlighting has been exposed.
* highlightingStyles has been moved to Text.Pandoc.Highlighting.
Removed writerDocbookVersion in WriterOptions.
Renamed default.docbook template to default.docbook4.
Allow docbook4 as an output format.
But alias docbook = docbook4.
* Text.Pandoc.Writers.HTML: removed writeHtml, writeHtmlString,
added writeHtml4, writeHtml4String, writeHtml5, writeHtml5String.
* Removed writerHtml5 from WriterOptions.
* Renamed default.html template to default.html4.
* "html" now aliases to "html5"; to get the old HTML4 behavior,
you must now specify "-t html4".
The type is implemented in terms of an underlying bitset
which should be more efficient.
API change: from Text.Pandoc.Extensions export Extensions,
emptyExtensions, extensionsFromList, enableExtension, disableExtension,
extensionEnabled.
* Remove exported module `Text.Pandoc.Readers.TeXMath`
* Add exported module `Text.Pandoc.Writers.Math`
* The function `texMathToInlines` now lives in `Text.Pandoc.Writers.Math`
* Export helper function `convertMath` from `Text.Pandoc.Writers.Math`
* Use these functions in all writers that do math conversion.
This ensures that warnings will always be issued for failed
math conversions.
Introduce a new module, Text.Pandoc.Free, with pure versions, based on
the free monad, of numerous IO functions used in writers and
readers. These functions are in a pure
Monad (PandocAction). PandocAction takes as a parameter the type of
IORefs in it. It can be aliased in individual writers and readers to
avoid this parameter.
Note that this means that at the moment a reader can only use one type
of IORef. If possible, it would be nice to remove this limitation.
So far this just reproduces capacity.
Later we'll be able to add features like warning
messages, dynamic loading of xml syntax definitions,
and dynamic loading of themes.
+ Removed Text.Pandoc.Readers.Docx.Fonts
+ Moved its code to texmath; we now use (from texmath 0.9)
Text.TeXMath.Unicode.Fonts
+ Use texmath 0.9 (currently from git).
+ Updated epub tests because texmath now handles more mathml.
We already lower-bound tagsoup at 0.13.7, which means we were always
running the compatibility layer (it was conditional on min value
0.13). Better to just use `lookupEntity` from the library directly, and
convert a string to a char if need be.
Some code was duplicated (copy-pasted) or placed in an inappropriate
module during the modularization refactoring. Those functions are moved
into a `Shared` module, as was originally intended but forgotten.
Better documentation of the respective functions is a positive
side-effect.
Inline parsing code is moved to a separate module. Parsers for block
starts are extracted as well, as those are used in the `endline` parser.
This is part of the Org-mode reader cleanup effort.
The Org-mode reader uses many functions defined in the
`Text.Pandoc.Parsing` utility module. Some of the functions are
overwritten with versions adapted to Org-mode idiosyncrasies. These
special functions, as well as the normal Pandoc versions, are combined
in a single module to increase the ease of use.
This leads to decoupling of Org-mode and Pandoc and hence to slightly
cleaner code. The downside is code-bloat due to repeated import/export
statements.
Now instead of using `findExecutable`, which has limitations
on Windows, we just do `progname --version` and see if it
returns successfully. Closes#2903.
The docx reader used to use a Modifiable typeclass to combine both
Blocks and Inlines. But all the work was in the inlines. So most of the
generality was wasted, at the expense of making the code harder to
understand. This gets rid of the generality, and adds functions for
Blocks and Inlines. It should be a bit easier to work with going forward.