John MacFarlane 8ca191604d Add new unexported module T.P.XMLParser.
This exports functions that use xml-conduit's parser to
produce an xml-light Element or [Content].  This allows
existing pandoc code to use a better parser without
much modification.

The new parser is used in all places where xml-light's
parser was previously used.  Benchmarks show a significant
performance improvement in parsing XML-based formats
(especially ODT and FB2).

Note that the xml-light types use String, so the
conversion from xml-conduit types involves a lot
of extra allocation.  It would be desirable to
avoid that in the future by gradually switching
to using xml-conduit directly. This can be done
module by module.

The new parser also reports errors, which we report
when possible.

A new constructor PandocXMLError has been added to
PandocError in T.P.Error [API change].

Closes #7091, which was the main stimulus.

These changes revealed the need for some changes
in the tests.  The docbook-reader.docbook test
lacked definitions for the entities it used; these
have been added. And the docx golden tests have been
updated, because the new parser does not preserve
the order of attributes.

Add entity defs to docbook-reader.docbook.

Update golden tests for docx.
2021-02-10 22:04:11 -08:00
.circleci CircleCI: fix stack installation. 2020-12-29 15:41:18 -08:00
.github CI: use haskell/actions/setup. 2021-02-06 19:00:00 -08:00
app Avoid unnecessary use of NoImplicitPrelude pragma (#7089) 2021-02-07 10:02:35 -08:00
benchmark Avoid unnecessary use of NoImplicitPrelude pragma (#7089) 2021-02-07 10:02:35 -08:00
citeproc/biblatex-localization Add built-in citation support using new citeproc library. 2020-09-21 10:15:50 -07:00
data LaTeX template: Update to iftex package (#7073) 2021-02-03 08:54:11 -08:00
doc doc/lua-filters.md: improve docs for pandoc.mediabag.insert 2021-02-04 19:07:59 +01:00
linux Re-add -optc-Os to static linux build, because it makes binary smaller. 2020-11-18 19:02:04 -08:00
macos Add built-in citation support using new citeproc library. 2020-09-21 10:15:50 -07:00
man Update README and man page. 2021-01-22 21:37:59 -08:00
prelude Use implicit Prelude (#6187) 2020-03-15 09:45:44 -07:00
src/Text Add new unexported module T.P.XMLParser. 2021-02-10 22:04:11 -08:00
test Add new unexported module T.P.XMLParser. 2021-02-10 22:04:11 -08:00
tools make changes_github: use details tag to make changelog collapsible. 2020-03-15 11:23:55 -07:00
trypandoc Avoid unnecessary use of NoImplicitPrelude pragma (#7089) 2021-02-07 10:02:35 -08:00
windows Add built-in citation support using new citeproc library. 2020-09-21 10:15:50 -07:00
.editorconfig
.gitattributes Added .gitattributes. 2019-09-15 10:40:59 -07:00
.gitignore Improve .gitignore. 2021-01-15 22:41:39 -08:00
.hlint.yaml Lua filters: use same function names in Haskell and Lua 2021-02-04 19:07:59 +01:00
.mailmap Add .mailmap 2019-01-07 08:44:40 +03:00
.stylish-haskell.yaml More spellcheck 2018-07-02 19:07:28 +03:00
AUTHORS.md Update AUTHORS.md. 2021-01-23 15:54:38 -08:00
BUGS
cabal.project cabal.project - more heap space 2021-02-02 14:47:27 -08:00
changelog.md Update changelog. 2021-01-22 21:37:04 -08:00
CONTRIBUTING.md CONTRIBUTING: note on GNU xargs 2021-01-12 21:53:39 -08:00
COPYING.md
COPYRIGHT JATS writer: fix citations (#7018) 2021-01-10 15:35:48 -08:00
default.nix Use simple default.nix. 2021-01-13 09:52:09 -08:00
INSTALL.md INSTALL.md: Remove references to pandoc-citeproc. 2020-11-19 09:40:46 -08:00
Makefile Makefile: give allocation data in benchmarks. 2021-01-31 18:18:53 -08:00
MANUAL.txt Add new unexported module T.P.XMLParser. 2021-02-10 22:04:11 -08:00
pandoc.cabal Add new unexported module T.P.XMLParser. 2021-02-10 22:04:11 -08:00
README.md Update README and man page. 2021-01-22 21:37:59 -08:00
README.template Update copyright notices for 2021 (#7012) 2021-01-08 09:38:20 -08:00
RELEASE-CHECKLIST Update RELEASE-CHECKLIST for CircleCI. 2020-12-17 23:47:25 -08:00
release.nix Use project.nix instead of default.nix for generated file. 2021-01-12 22:32:26 -08:00
Setup.hs Removed custom Setup.hs, use build-type: simple. 2019-01-02 17:02:02 -08:00
shell.nix shell.nix - install zlib 2021-02-02 14:47:14 -08:00
stack.yaml Use lts-17.2 resolver (with ghc 8.10.3). 2021-02-08 10:11:06 -08:00

Pandoc

The universal markup converter

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert from

It can convert to

Pandoc can also produce PDF output via LaTeX, Groff ms, or HTML.

Pandoc's enhanced version of Markdown includes syntax for tables, definition lists, metadata blocks, footnotes, citations, math, and much more. See the User's Manual below under Pandoc's Markdown.

Pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document (an abstract syntax tree or AST), and a set of writers, which convert this native representation into a target format. Thus, adding an input or output format requires only adding a reader or writer. Users can also run custom pandoc filters to modify the intermediate AST (see the documentation for filters and Lua filters).
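The reader, filter, and writer stages described above can be sketched with the library as follows. This is a minimal illustration, not a recommended setup: it assumes the `pandoc` and `pandoc-types` packages are available, and `shout` and `convert` are hypothetical names introduced only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Walk (walk)

-- A tiny "filter": a pure function on AST nodes.
-- (Hypothetical example; it upper-cases every plain string.)
shout :: Inline -> Inline
shout (Str s) = Str (T.toUpper s)
shout x       = x

-- Reader -> filter -> writer, run in pandoc's pure monad.
-- `convert` is an illustrative name, not part of the pandoc API.
convert :: Text -> Either PandocError Text
convert input = runPure $ do
  doc <- readMarkdown def input          -- reader: Markdown text to AST
  writeHtml5String def (walk shout doc)  -- writer: filtered AST to HTML

main :: IO ()
main = case convert "Hello *world*" of
  Left err   -> putStrLn ("conversion failed: " ++ show err)
  Right html -> TIO.putStrLn html
```

Swapping `readMarkdown` or `writeHtml5String` for another reader or writer changes the input or output format without touching the filter, which is the point of routing everything through the shared AST.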

Because pandoc's intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc's simple document model. While conversions from pandoc's Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc's Markdown can be expected to be lossy.

Installing

Here's how to install pandoc.

Documentation

Pandoc's website contains a full User's Guide. It is also available here as pandoc-flavored Markdown. The website also contains some examples of the use of pandoc and a limited online demo.

Contributing

Pull requests, bug reports, and feature requests are welcome. Please make sure to read the contributor guidelines before opening a new issue.

License

© 2006-2021 John MacFarlane (jgm@berkeley.edu). Released under the GPL, version 2 or greater. This software carries no warranty of any kind. (See COPYRIGHT for full copyright and warranty notices.)