Regardless of input type, we should use default handling if we are
dealing with stdin. In other words, there should be no file-scope if
there are no files. This was an issue with pandoc json, which could be
piped on stdin, but which was read by default with `--file-scope`.
Traditionally pandoc operates on multiple files by first concetenating
them (around extra line breaks) and then processing the joined file. So
it only parses a multi-file document at the document scope. This has the
benefit that footnotes and links can be in different files, but it also
introduces a couple of difficulties:
- it is difficult to join files with footnotes without some sort of
preprocessing, which makes it difficult to write academic documents
in small pieces.
- it makes it impossible to process multiple binary input files, which
can't be catted.
- it makes it impossible to process files from different input
formats.
This commit introduces alternative method. Instead of catting the files
first, it parses the files first, and then combines the parsed
output. This makes it impossible to have links across multiple files,
and auto-identified headers won't work correctly if headers in multiple
files have the same name. On the other hand, footnotes across multiple
files will work correctly and will allow more freedom for input formats.
Since ByteStringReaders can currently only read one binary file, and
will ignore subsequent files, we also changes the behavior to
automatically parse before combining if using the ByteStringReader. If
we use one file, it will work as normal. If there is more than one file
it will combine them after parsing (assuming that the format is the
same).
Note that this is intended to be an optional method, defaulting to
off. Turn it on with `--file-scope`.
Previously, if you tried to do `pandoc -s -t /path/to/lua/script.lua`,
pandoc would look for the template in
`~/.pandoc/templates/default./path/to/lua/script.lua`.
With this change it will look in the more reasonable
`~/.pandoc/templates/default.script.lua`.
This makes it possible to store default templates for custom
writers.
Closes#2625.
* Bumped version to 1.16.
* Added Attr field to Link and Image.
* Added `common_link_attributes` extension.
* Updated readers for link attributes.
* Updated writers for link attributes.
* Updated tests
* Updated stack.yaml to build against unreleased versions of
pandoc-types and texmath.
* Fixed various compiler warnings.
Closes#261.
TODO:
* Relative (percentage) image widths in docx writer.
* ODT/OpenDocument writer (untested, same issue about percentage widths).
* Update pandoc-citeproc.
This change makes `--no-tex-ligatures` affect the LaTeX reader
as well as the LaTeX and ConTeXt writers. If it is used,
the LaTeX reader will parse characters `` ` ``, `'`, and `-`
literally, rather than parsing ligatures for quotation marks
and dashes. And the LaTeX writer will print unicode quotation
mark and dash characters literally, rather than converting
them to the standard ASCII ligatures.
Note that `--smart` has no affect on the LaTeX reader.
`--smart` is still the default for all input formats when
LaTeX or ConTeXt is the output format, *unless* `--no-tex-ligatures`
is used.
Some examples to illustrate the logic:
```
% echo "'hi'" | pandoc -t latex
`hi'
% echo "'hi'" | pandoc -t latex --no-tex-ligatures
'hi'
% echo "'hi'" | pandoc -t latex --no-tex-ligatures --smart
‘hi’
% echo "'hi'" | pandoc -f latex --no-tex-ligatures
<p>'hi'</p>
% echo "'hi'" | pandoc -f latex
<p>’hi’</p>
```
Closes#2541.
The filter pandoc-citeproc is automatically used when
`--bibliography` is specified on the command line, unless
`--natbib` or `--biblatex` is used.
However, previously this only worked if `--bibliography`
was spelled out in full, and not if `--biblio` was used.
This patch fixes that problem.
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude,
but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.
This allows us to remove many imports of Data.Monoid
and Control.Applicative, and remove Text.Pandoc.Compat.Monoid.
It should allow us to use -Wall again for ghc 7.10.
`src/Text/Pandoc/Shared.hs`, so that all Writers can access this variable
without importing `src/Text/Pandoc.hs`, preventing circular import.
* pandoc.hs: Import pandocVersion from `Text.Pandoc.Shared`.
* src/Text/Pandoc.hs: Remove the definition of pandocVersion
and relevant import.
* src/Text/Pandoc/Shared.hs: Add the definition of pandocVersion
and relevant import.
Update the default KaTeX JS/CSS links to the current version. KaTeX v0.5.1 has far more functions and symbols than v0.1.0, so it seems like a better default. I think technically this might break compatibility because we released a breaking change due to the greediness of the `\color` function, but this probably has very little impact.
* Added `Ext_common_link_attributes` constructor to `Extension`
(for link and image attributes).
* Added this to `pandocExtensions` and `phpMarkdownExtraExtensions`.
* Added `writerDpi` to `WriterOptions`.
* pandoc.hs: Added `--dpi` option.
* Updated README for `--dpi` and `common_link_attributes` extension.
Patch due to mb21, with some modifications: `writerDpi` is now an
`Int` rather than a `Double`.
+ Removed `--man1`, `--man5` options (breaking change).
+ Removed `Text.Pandoc.ManPages` module (breaking API change).
+ Version bump to 1.15 because of the breaking changes, even
though they involve features that have only been in pandoc
for a day.
+ Makefile target for `man/man1/pandoc.1`. This uses pandoc to
create the man page from README using a custom template and filters.
+ Added `man/` directory with template and filters needed to build
man page.
+ We no longer have two man pages: pandoc.1 and pandoc_markdown.5.
Now there is just pandoc.1, which has all the content from README.
This change was needed because of the extensive cross-references
between parts of the README.
+ Removed old `data/pandoc.1.template` and
`data/pandoc_markdown.5.template`.
This change adds `--man1` and `--man5` options to pandoc, so
pandoc can generate its own man pages.
It removes the old overly complex method of building a separate
executable (but not installing it) just to create the man pages.
The man pages are no longer automatically created in the build
process.
The man/ directory has been removed. The man page templates
have been moved to data/.
New unexported module: Text.Pandoc.ManPages.
Text.Pandoc.Data now exports readmeFile, and `readDataFile`
knows how to find README.
Closes#2190.
This reverts commit 1c2951dfd9.
See #2040.
The semantics was too squishy. `--css` takes a URL, but
for EPUB we need files that we can read. I prefer keeping
the old system for now, with `--epub-stylesheet`.
* Allow `--css` to be used to specify stylesheets.
* Deprecated `--epub-stylesheet` and made it a synoynym of
`--css`.
* If a code block with class "css" is given as contents of the
`stylesheet` metadata field, use its literal code as contents of
the epub stylesheet. Otherwise, treat it as a filename and
read the file.
* Note: `--css` and `stylesheet` in metadata are not compatible.
`stylesheet` takes precedence.
This cause problems on Windows 8, where the variable is called
`Path`.
Instead, simply trap the exception that will be raised by
`findExecutable` if path is not set.
This should fix#1542.
* mkSelfContained now takes just two arguments, WriterOptions and
the string.
* It no longer looks in data files. This only made sense when we
had copies of slidy and S5 code there.
* Shared.fetchItem' is used instead of the nearly duplicate getItem.
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
This has been documented to affect the epub and docx readers, so
we should either add the epub reader before the next release or
change the documentation.
We now check the writerName for a lua script in pandoc.hs, so that
lowercasing and format parsing aren't done. Note this behavior
change: getWriter in Text.Pandoc no longer returns a custom writer on
input "foo.lua".
Pandoc first tries to find the executable (searching the path
if path isn't given). If it fails, but the file exists and has
a .py, .pl, .rb, .hs, or .php extension, pandoc runs the filter
using the appropriate interpreter.
This should make it easier to use filters on Windows, and make
it more convenient for everyone.
Closes#1096.
This is to debug backtracking-related parsing bugs.
So far it is only implemented for markdown, but it would
be good to extend it to latex and html readers.
Note that anything not parseable as a YAML boolean or string
is treated as a literal string.
Note that you can still get a string value with "yes" or any
of the strings interpretable as booleans:
-M boolvalue=yes -M stringvalue='"yes"'
Going forward we'll use pandoc-citeproc, as an external filter.
The `--bibliography`, `--csl`, and `--citation-abbreviation` fields
have been removed. Instead one must include `bibliography`, `csl`,
or `csl-abbrevs` fields in the document's YAML metadata. The filter
can then be used as follows:
pandoc --filter pandoc-citeproc
The `Text.Pandoc.Biblio` module has been removed. Henceforth,
`Text.CSL.Pandoc` from pandoc-citations can be used by library users.
The Markdown and LaTeX readers now longer format bibliographies and
citations. That must be done using `processCites` or `processCites'`
from Text.CSL.Pandoc.
All bibliography-related fields have been removed from `ReaderOptions`
and `WriterOptions`: `writerBiblioFiles`, `readerReferences`,
`readerCitationStyle`.
API change.
Previously we used to store the directory of the first input file,
even if it was local, and used this as a base directory for
finding images in ODT, EPUB, Docx, and PDF.
This has been confusing to many users. It seems better to look for
images relative to the current working directory, even if the first
file argument is in another directory.
writerSourceURL is set to 'Just url' when the first command-line
argument is an absolute URL. (So, relative links will be resolved
in relation to the first page.) Otherwise, 'Nothing'.
The ODT, EPUB, Docx, and PDF writers have been modified accordingly.
Note that this change may break some existing workflows. If you
have been assuming that relative links will be interpreted relative
to the directory of the first file argument, you'll need to
make that the current directory before running pandoc.
Closes#942.
This way filters can figure out what the target format is
and react appropriately.
Example:
#!/usr/bin/env runghc
import Text.Pandoc.JSON
import Data.Char
main = toJSONFilter cap
where cap (Just "html") (Str xs) = Str $ map toUpper xs
cap _ x = x
This capitalizes text only for html output.
* `Text.Pandoc.PDF` exports `makePDF` instead of `tex2pdf`.
(API change.)
* `makePDF` walks the pandoc AST and checks for the existence of
images in the local directory. If they are not found, it attempts
to find them, either in the directory containing the first source
file, or at an absolute URL, or at a URL relative to the base URL
of the first command line argument.
* Closes#917.
pandoc -t data/sample.lua
will load the script sample.lua and use it as a custom writer.
data/sample.lua is provided as an example.
Added `--print-custom-lua-writer` option to print the sample
script.
Support unordered and ordered lists with "fragment" elements.
Modified by JGM to remove the --reveal_js-url command-line option.
Instead use -V reveal_js-url=... as with slidy and the other slide
formats. Also cleaned up the list code in the HTML writer.
The _note attribute is supported. This is unofficial, but
used e.g. in OmniOutliner and supported by multimarkdown.
We treat the contents as markdown blocks under a section
header.
Added to documentation and tests.
* Moved code for translating listings language names to
highlighting-kate names and back from LaTeX reader to Highlighting.
* Text.Pandoc.Highlighting no longer exposed (API change)
* Text.Pandoc.Highlighting exports toListingsLang, fromListingsLang
Also `writerNumberFrom` -> `writeNumberOffset`.
The offset is a list of numbers (0 by default).
These are added to the section, subsection, etc.
numbers that would have been generated automatically.
* RTF writer: Export writeRTFWithEmbeddedImages instead of
rtfEmbedImage.
* Text.Pandoc: Use writeRTFWithEmbeddedImages for RTF.
* Moved code for embedding images in RTF out of pandoc.hs.
The previous default was to use `<q>` tags in HTML5.
But `<q>` tags are also valid HTML4, and they are not very
robust in HTML5. Some user agents don't support them,
and some CSS resets prevent pandoc's quotes CSS from working
properly (e.g. bootstrap). It seems a better default just
to insert quote characters, but the option is provided for
those who have gotten used to using `<q>` tags.
Users who want pure readers can still get them; this just affects
the function getReader that looks up a reader based on the format
name.
The point of this change is to make it possible to print warnings
from the parser.
* Added `embed_data_files` flag. (not yet used)
* Shared no longer exports `findDataFile`.
* `readDataFile` now returns a strict bytestring.
* Shared now exports `readDataFileUTF8` which returns a string like
the old `readDataFile`.
* Rewrote modules to use new data file functions and to avoid
using functions from Paths_pandoc directly.
* Remove executable and library flags.
* Expose `Text.Pandoc.XML` and `Text.Pandoc.Biblio`.
* Depend on pandoc library in executable, so we don't recompile
everything.
* Move pandoc.hs from src/ to .