Contributing to pandoc
======================
Have a question?
----------------
Ask on [pandoc-discuss].
Found a bug?
------------
Bug reports are welcome! Please report all bugs on pandoc's GitHub
[issue tracker].
Before you submit a bug report, search the [open issues] *and* [closed issues]
to make sure the issue hasn't come up before. Also, check the [User's Guide] and
[FAQs] for anything relevant.
Make sure you can reproduce the bug with the [latest released version] of pandoc
(or, even better, the [development version]).
Your report should give detailed, *reproducible* instructions, including
* the pandoc version (check using `pandoc -v`)
* the exact command line used
* the exact input used
* the output received
* the output you expected instead
A small test case (just a few lines) is ideal. If your input is large,
try to whittle it down to a *minimum working example*.
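
As a rough sketch of how these pieces could be gathered in one go, the shell function below prints a report skeleton. The name `bug_report` and the `PANDOC` environment variable are hypothetical conveniences for this illustration, not part of pandoc; only `pandoc -v` and the ordinary command-line invocation come from the list above.

```shell
# bug_report INPUT_FILE [PANDOC_OPTIONS...]
# Hypothetical helper (not part of pandoc): prints the version, the exact
# command line, the input, and the output received.
# PANDOC is an assumption of this sketch; it selects the binary to run.
bug_report() {
  input=$1
  shift
  pandoc_bin=${PANDOC:-pandoc}
  printf 'Version: %s\n' "$("$pandoc_bin" -v | head -n 1)"
  printf 'Command: %s' "$pandoc_bin"
  printf ' %s' "$@" "$input"
  printf '\nInput:\n'
  cat "$input"
  printf 'Output received:\n'
  "$pandoc_bin" "$@" "$input"
  printf 'Output expected:\n(describe what you expected instead)\n'
}
```

Calling `bug_report input.md -f markdown -t html` (with `input.md` standing in for your own small test case) would produce text that can be pasted into an issue. Arguments containing spaces are printed without quoting, so double-check the command line by hand.
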
Out of scope?
-------------
A less than perfect conversion does not necessarily mean there's
a bug in pandoc. Quoting from the MANUAL:
> Because pandoc's intermediate representation of a document is less
> expressive than many of the formats it converts between, one should
> not expect perfect conversions between every format and every other.
> Pandoc attempts to preserve the structural elements of a document, but
> not formatting details such as margin size. And some document elements,
> such as complex tables, may not fit into pandoc's simple document
> model. While conversions from pandoc's Markdown to all formats aspire
> to be perfect, conversions from formats more expressive than pandoc's
> Markdown can be expected to be lossy.
For example, both `docx` and `odt` formats can represent margin size, but
because pandoc's internal document model does not contain a representation of
margin size, this information will be lost on converting from `docx`
to `odt`. (You can, however, customize margin size using `--reference-doc`.)
So before submitting a bug report, consider whether it might be
"out of scope." If it concerns a feature of documents that isn't
representable in pandoc's Markdown, then it very likely is.
(If in doubt, you can always ask on [pandoc-discuss].)
Fixing bugs from the issue tracker
----------------------------------
Almost all the bugs on the issue tracker have one or more associated
tags. These are used to indicate the *complexity* and *nature* of a
bug. There is not yet a way to indicate priority. An up-to-date
summary of issues can be found on [GitHub labels].
* [good first issue] — The perfect starting point for new contributors. The
issue is generic and can be resolved without deep knowledge of the code
base.
* [enhancement] — A feature which would be desirable. We recommend
you discuss any proposed enhancement on pandoc-discuss before
writing code.
* [bug] — A problem which needs to be fixed.
* [complexity:low] — The fix should only be a couple of lines.
* [complexity:high] — The fix might require structural changes or in depth
knowledge of the code base.
* [new:reader] — A request to add a new input format.
* [new:writer] — A request to add a new output format.
* [docs] — A discrepancy, or ambiguity in the documentation.
* [status:in-progress] — Someone is actively working on or planning to work on the
ticket.
* [status:more-discussion-needed] — It is unclear what the correct approach
to solving the ticket is. Before starting on such a ticket, it is
advisable to comment on it first.
* [status:more-info-needed] — We require more information from a user before
we can classify a report properly.
Issues related to a specific format are tagged accordingly, e.g. feature requests
or bug reports related to Markdown are labelled with [format:markdown].
Have an idea for a new feature?
-------------------------------
First, search [pandoc-discuss] and the issue tracker (both [open issues] *and*
[closed issues]) to make sure that the idea has not been discussed before.
Explain the rationale for the feature you're requesting. Why would this
feature be useful? Consider also any possible drawbacks, including backwards
compatibility, new library dependencies, and performance issues.
It is best to discuss a potential new feature on [pandoc-discuss]
before opening an issue.
Patches and pull requests
-------------------------
Patches and pull requests are welcome. Before you put time into a nontrivial
patch, it is a good idea to discuss it on [pandoc-discuss], especially if it is
for a new feature (rather than fixing a bug).
Please follow these guidelines:
1. Each patch (commit) should make a single logical change (fix a bug, add
a feature, clean up some code, add documentation). Everything
related to that change should be included (including tests and
documentation), and nothing unrelated should be included.
2. The first line of the commit message should be a short description
of the whole commit (ideally <= 50 characters). Then there should
be a blank line, followed by a more detailed description of the
change.
3. Follow the stylistic conventions you find in the existing
pandoc code. Use spaces, not tabs, and wrap code to 80 columns.
Always include type signatures for top-level functions.
Consider installing [EditorConfig]; this will help you follow the
coding style prevalent in pandoc.
4. Your code should compile without warnings (`-Wall` clean).
5. Run the tests to make sure your code does not introduce new bugs.
(See below under [Tests](#tests).) All tests should pass.
6. It is a good idea to add test cases for the bug you are fixing. (See
below under [Tests](#tests).) If you are adding a new writer or reader,
you must include tests.
7. If you are adding a new feature, include updates to `MANUAL.txt`.
8. All code must be released under the general license governing pandoc
(GPL v2).
9. It is better not to introduce new dependencies. Dependencies on
external C libraries should especially be avoided.
10. We aim for compatibility with GHC versions from 8.0 to the
latest release. All pull requests and commits are tested
automatically on CircleCI.
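
Guidelines 2 and 3 lend themselves to a quick mechanical check before you commit. The following is a minimal sketch, not a tool that ships with pandoc: the function names `check_commit_msg` and `check_style` are made up for this example, and the checks cover only the 50-character subject line, the blank second line, tabs, and the 80-column limit.

```shell
# check_commit_msg FILE
# Succeeds if the first line is at most 50 characters and the second
# line (if there is one) is blank. The guideline says "ideally", so
# treat a failure as a hint rather than a hard error.
check_commit_msg() {
  awk '
    NR == 1 && length($0) > 50 {
      print "subject longer than 50 characters"; bad = 1
    }
    NR == 2 && $0 != "" {
      print "second line is not blank"; bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# check_style FILE...
# Reports every line that contains a tab or is longer than 80 columns.
check_style() {
  awk '
    /\t/ {
      printf "%s:%d: tab character\n", FILENAME, FNR; bad = 1
    }
    length($0) > 80 {
      printf "%s:%d: longer than 80 columns\n", FILENAME, FNR; bad = 1
    }
    END { exit bad + 0 }
  ' "$@"
}
```

`check_commit_msg` could be called from a Git `commit-msg` hook, which Git invokes with the path of the message file as its first argument. Depending on the awk implementation, `length` may count bytes rather than characters, so lines with non-ASCII text can be flagged slightly early.
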
Tests
-----
Tests can be run as follows:

    cabal install --only-dependencies --enable-tests
    cabal configure --enable-tests
    cabal build
    cabal test

or, if you're using [stack],

    stack setup
    stack test

The test program is `test/test-pandoc.hs`.
To run particular tests (pattern-matching on their names), use
the `-p` option:

    cabal install pandoc --enable-tests
    cabal test --test-options='-p markdown'

Or with stack:

    stack test --test-arguments='-p markdown'

It is often helpful to add `-j4` (run tests in parallel)
and `--hide-successes` (don't clutter output with successes)
to the test arguments as well.
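
For instance, combining a name pattern with both extra flags might look like the commands printed by the sketch below. `test_cmd` is a hypothetical helper written for this illustration; it only prints the command line, so you can inspect it before running it yourself.

```shell
# test_cmd TOOL PATTERN
# Hypothetical helper: prints (does not run) the test command for
# TOOL (cabal or stack), selecting tests whose names match PATTERN and
# adding the -j4 and --hide-successes flags mentioned above.
test_cmd() {
  tool=$1
  pattern=$2
  args="-p $pattern -j4 --hide-successes"
  case $tool in
    cabal) printf "cabal test --test-options='%s'\n" "$args" ;;
    stack) printf "stack test --test-arguments='%s'\n" "$args" ;;
    *)
      printf 'unknown tool: %s\n' "$tool" >&2
      return 1
      ;;
  esac
}
```

With this definition, `test_cmd stack markdown` prints `stack test --test-arguments='-p markdown -j4 --hide-successes'`. Patterns containing spaces or quotes would need extra quoting that this sketch does not attempt.
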
If you add a new feature to pandoc, please add tests as well, following
the pattern of the existing tests. The test suite code is in
`test/test-pandoc.hs`. If you are adding a new reader or writer, it is
probably easiest to add some data files to the `test` directory, and
modify `test/Tests/Old.hs`. Otherwise, it is better to modify the module
under the `test/Tests` hierarchy corresponding to the pandoc module you
are changing. Alternatively, you may add a "command test" to
the `test/command/` hierarchy, following the pattern of the tests there.
For `docx` tests, you can rebuild the golden tests by passing `--accept`
to the test script (so if you're using stack,
`stack test --test-arguments "-p Docx --accept"`). Then make sure to
commit the changed golden files in their own commit, with a message stating
that you checked them in Word (mentioning Word version and OS), that they
weren't corrupted, and that they had the expected output.
Benchmarks
----------
To run benchmarks with cabal:

    cabal configure --enable-benchmarks
    cabal build
    cabal bench

With stack:

    stack bench

You can also build pandoc with the `weigh-pandoc` flag and
run `weigh-pandoc` to get some statistics on memory usage.
(Eventually this should be incorporated into the benchmark
suite.)
Using the REPL
--------------
With a recent version of cabal, you can do `cabal repl` and get
a ghci REPL for working with pandoc. With [stack], use
`stack ghci`.
We recommend using the following `.ghci` file (which can be
placed in the source directory):

    :set -fobject-code
    :set -XTypeSynonymInstances
    :set -XScopedTypeVariables
    :set -XOverloadedStrings

Profiling
---------
To diagnose a performance issue with parsing, first try using
the `--trace` option. This will give you a record of when block
parsers succeed, so you can spot backtracking issues.
To use the GHC profiler with cabal:

    cabal clean
    cabal install --enable-library-profiling --enable-executable-profiling
    pandoc +RTS -p -RTS [file]...
    less pandoc.prof

With stack:

    stack clean
    stack install --profile
    pandoc +RTS -p -RTS [file]...
    less pandoc.prof

Templates
---------
The default templates live in `data/templates`, which is a git
subtree linked to <https://github.com/jgm/pandoc-templates.git>.
The purpose of maintaining a separate repository is to allow
people to maintain variant templates as a fork.
You can modify the templates and submit patches without worrying
much about this: when these patches are merged, we will
push them to the main templates repository by doing

    git subtree push --prefix=data/templates templates master

where `templates` is a remote pointing to the templates
repository.
The code
--------
Pandoc has a publicly accessible git repository on
GitHub: <http://github.com/jgm/pandoc>. To get a local copy of the source:

    git clone https://github.com/jgm/pandoc.git

The source for the main pandoc program is `pandoc.hs`. The source for
the pandoc library is in `src/`, the source for the tests is in
`test/`, and the source for the benchmarks is in `benchmark/`.
The modules `Text.Pandoc.Definition`, `Text.Pandoc.Builder`, and
`Text.Pandoc.Generic` are in a separate library `pandoc-types`. The code can
be found in <http://github.com/jgm/pandoc-types>.
To build pandoc, you will need a working installation of the
[Haskell platform].
The library is structured as follows:
- `Text.Pandoc` is a top-level module that exports what is needed
by most users of the library. Any patches that add new readers
or writers will need to make changes here, too.
- `Text.Pandoc.Definition` (in `pandoc-types`) defines the types
used for representing a pandoc document.
- `Text.Pandoc.Builder` (in `pandoc-types`) provides functions for
building pandoc documents programmatically.
- `Text.Pandoc.Generic` (in `pandoc-types`) provides functions allowing
you to promote functions that operate on parts of pandoc documents
to functions that operate on whole pandoc documents, walking the
tree automatically.
- `Text.Pandoc.Readers.*` are the readers, and `Text.Pandoc.Writers.*`
are the writers.
- `Text.Pandoc.Biblio` is a utility module for formatting citations
using citeproc-hs.
- `Text.Pandoc.Data` is used to embed data files when the `embed_data_files`
cabal flag is used. It is generated from `src/Text/Pandoc/Data.hsb` using
the preprocessor [hsb2hs].
- `Text.Pandoc.Highlighting` contains the interface to the
skylighting library, which is used for code syntax highlighting.
- `Text.Pandoc.ImageSize` is a utility module containing functions for
calculating image sizes from the contents of image files.
- `Text.Pandoc.MIME` contains functions for associating MIME types
with extensions.
- `Text.Pandoc.Options` defines reader and writer options.
- `Text.Pandoc.PDF` contains functions for producing a PDF from a
LaTeX source.
- `Text.Pandoc.Parsing` contains parsing functions used in multiple readers.
- `Text.Pandoc.Pretty` is a pretty-printing library specialized to
the needs of pandoc.
- `Text.Pandoc.SelfContained` contains functions for making an HTML
file "self-contained," by importing remotely linked images, CSS,
and JavaScript and turning them into `data:` URLs.
- `Text.Pandoc.Shared` is a grab-bag of shared utility functions.
- `Text.Pandoc.Writers.Shared` contains utilities used in writers only.
- `Text.Pandoc.Slides` contains functions for splitting a Markdown document
into slides, using the conventions described in the MANUAL.
- `Text.Pandoc.Templates` defines pandoc's templating system.
- `Text.Pandoc.UTF8` contains functions for converting text to and from
UTF-8 bytestrings (strict and lazy).
- `Text.Pandoc.Asciify` contains functions to derive ASCII versions of
identifiers that use accented characters.
- `Text.Pandoc.UUID` contains functions for generating UUIDs.
- `Text.Pandoc.XML` contains functions for formatting XML.
Lua filters
-----------
If you've written a useful pandoc [Lua filter](lua-filters.html),
you may want to consider submitting a pull request to the
[lua-filters repository](https://github.com/pandoc/lua-filters).
[open issues]: https://github.com/jgm/pandoc/issues
[closed issues]: https://github.com/jgm/pandoc/issues?q=is%3Aissue+is%3Aclosed
[latest released version]: https://github.com/jgm/pandoc/releases/latest
[development version]: https://github.com/pandoc-extras/pandoc-nightly/releases/latest
[pandoc-discuss]: http://groups.google.com/group/pandoc-discuss
[issue tracker]: https://github.com/jgm/pandoc/issues
[User's Guide]: http://pandoc.org/MANUAL.html
[FAQs]: http://pandoc.org/faqs.html
[EditorConfig]: http://editorconfig.org/
[Haskell platform]: http://www.haskell.org/platform/
[hsb2hs]: http://hackage.haskell.org/package/hsb2hs
[GitHub labels]: https://github.com/jgm/pandoc/labels
[good first issue]: https://github.com/jgm/pandoc/labels/good%20first%20issue
[enhancement]: https://github.com/jgm/pandoc/labels/enhancement
[bug]: https://github.com/jgm/pandoc/labels/bug
[complexity:low]: https://github.com/jgm/pandoc/labels/complexity:low
[complexity:high]: https://github.com/jgm/pandoc/labels/complexity:high
[docs]: https://github.com/jgm/pandoc/labels/docs
[format:markdown]: https://github.com/jgm/pandoc/labels/format:markdown
[new:reader]: https://github.com/jgm/pandoc/labels/new:reader
[new:writer]: https://github.com/jgm/pandoc/labels/new:writer
[status:in-progress]: https://github.com/jgm/pandoc/labels/status:in-progress
[status:more-discussion-needed]: https://github.com/jgm/pandoc/labels/status:more-discussion-needed
[status:more-info-needed]: https://github.com/jgm/pandoc/labels/status:more-info-needed
[stack]: https://github.com/commercialhaskell/stack