Finished pandoc-server documentation.

John MacFarlane 2022-08-16 16:11:04 -07:00
parent 9a9fd0720a
commit c5f8a38f38
2 changed files with 252 additions and 92 deletions

@ -24,8 +24,26 @@
\f[V]pandoc-server\f[R] is a web server that can perform pandoc \f[V]pandoc-server\f[R] is a web server that can perform pandoc
conversions. conversions.
It can be used either as a running server or as a CGI program. It can be used either as a running server or as a CGI program.
To use \f[V]pandoc-server\f[R] as a CGI program, rename it as To use \f[V]pandoc-server\f[R] as a CGI program, rename it (or symlink
\f[V]pandoc-server.cgi\f[R]. it) as \f[V]pandoc-server.cgi\f[R].
(Note: if you symlink it, you may need to adjust your webserver\[cq]s
configuration in order to allow it to follow symlinks for the CGI
script.)
.PP
All pandoc functions are run in the PandocPure monad, which ensures that
they can do no I/O operations on the server.
This should provide a high degree of security.
It does, however, impose certain limitations:
.IP \[bu] 2
PDFs cannot be produced.
.IP \[bu] 2
Filters are not supported.
.IP \[bu] 2
Resources cannot be fetched via HTTP.
.IP \[bu] 2
Any images, include files, or other resources needed for the document
conversion must be explicitly included in the request, via the
\f[V]files\f[R] field (see below under API).
.SH OPTIONS .SH OPTIONS
.TP .TP
\f[V]--port NUM\f[R] \f[V]--port NUM\f[R]
@ -37,7 +55,10 @@ Timeout in seconds, after which a conversion is killed.
Default: 2. Default: 2.
.TP .TP
\f[V]--help\f[R] \f[V]--help\f[R]
Help Print this help.
.TP
\f[V]--version\f[R]
Print version.
.SH API .SH API
.SS Root endpoint .SS Root endpoint
.PP .PP
@ -59,63 +80,182 @@ The body of the POST request should be a JSON object, with the following
fields. fields.
Only the \f[V]text\f[R] field is required; all of the others can be Only the \f[V]text\f[R] field is required; all of the others can be
omitted for default values. omitted for default values.
When there are several string alternatives, the first one given is the
default.
.TP .TP
\f[V]text\f[R] (string) \f[V]text\f[R] (string)
the document to be converted. The document to be converted.
Note: if the \f[V]from\f[R] format is binary (e.g., \f[V]epub\f[R] or Note: if the \f[V]from\f[R] format is binary (e.g., \f[V]epub\f[R] or
\f[V]docx\f[R]), then \f[V]text\f[R] should be a base64 encoding of the \f[V]docx\f[R]), then \f[V]text\f[R] should be a base64 encoding of the
document. document.
.TP .TP
\f[V]from\f[R] (string, default \f[V]\[dq]markdown\[dq]\f[R]) \f[V]from\f[R] (string, default \f[V]\[dq]markdown\[dq]\f[R])
the input format, possibly with extensions, just as it is specified on The input format, possibly with extensions, just as it is specified on
the pandoc command line. the pandoc command line.
.TP .TP
\f[V]to\f[R] (string, default \f[V]\[dq]html\[dq]\f[R]) \f[V]to\f[R] (string, default \f[V]\[dq]html\[dq]\f[R])
the output format, possibly with extensions, just as it is specified on The output format, possibly with extensions, just as it is specified on
the pandoc command line. the pandoc command line.
.IP .TP
.nf \f[V]wrapText\f[R] (\f[V]\[dq]auto\[dq]|\[dq]preserve\[dq]|\[dq]none\[dq]\f[R])
\f[C] Text wrapping option: either \f[V]\[dq]auto\[dq]\f[R] (automatic
wrapText :: Maybe WrapOption hard-wrapping to fit within a column width),
columns :: Maybe Int \f[V]\[dq]preserve\[dq]\f[R] (insert newlines where they are present in
standalone :: Maybe Bool the source), or \f[V]\[dq]none\[dq]\f[R] (don\[cq]t insert any
template :: Maybe Text unnecessary newlines at all).
tabStop :: Maybe Int .TP
indentedCodeClasses :: Maybe [Text] \f[V]columns\f[R] (integer, default 72)
abbreviations :: Maybe (Set Text) Column width (affects text wrapping and calculation of table column
defaultImageExtension :: Maybe Text widths in plain text formats)
trackChanges :: Maybe TrackChanges .TP
stripComments :: Maybe Bool \f[V]standalone\f[R] (boolean, default false)
citeproc :: Maybe Bool If true, causes a standalone document to be produced, using the default
variables :: Maybe (DocTemplates.Context Text) template or the custom template specified using \f[V]template\f[R].
tableOfContents :: Maybe Bool If false, a fragment will be produced.
incremental :: Maybe Bool .TP
htmlMathMethod :: Maybe HTMLMathMethod \f[V]template\f[R] (string)
numberSections :: Maybe Bool String contents of a document template (see Templates in
numberOffset :: Maybe [Int] \f[V]pandoc(1)\f[R] for the format).
sectionDivs :: Maybe Bool .TP
referenceLinks :: Maybe Bool \f[V]tabStop\f[R] (integer, default 4)
dpi :: Maybe Int Tab stop (spaces per tab).
emailObfuscation :: Maybe ObfuscationMethod .TP
identifierPrefix :: Maybe Text \f[V]indentedCodeClasses\f[R] (array of strings)
citeMethod :: Maybe CiteMethod List of classes to be applied to indented Markdown code blocks.
htmlQTags :: Maybe Bool .TP
slideLevel :: Maybe Int \f[V]abbreviations\f[R] (array of strings)
topLevelDivision :: Maybe TopLevelDivision List of strings to be regarded as abbreviations when parsing Markdown.
listings :: Maybe Bool See \f[V]--abbreviations\f[R] in \f[V]pandoc(1)\f[R] for details.
highlightStyle :: Maybe Text .TP
setextHeaders :: Maybe Bool \f[V]defaultImageExtension\f[R] (string)
epubSubdirectory :: Maybe Text Extension to be applied to image sources that lack extensions
epubFonts :: Maybe [FilePath] (e.g.\ \f[V]\[dq].jpg\[dq]\f[R]).
epubMetadata :: Maybe Text .TP
epubChapterLevel :: Maybe Int \f[V]trackChanges\f[R] (\f[V]\[dq]accept\[dq]|\[dq]reject\[dq]|\[dq]all\[dq]\f[R])
tocDepth :: Maybe Int Specifies what to do with insertions, deletions, and comments produced
referenceDoc :: Maybe FilePath by the MS Word \[lq]Track Changes\[rq] feature.
referenceLocation :: Maybe ReferenceLocation Only affects docx input.
preferAscii :: Maybe Bool .TP
files :: Maybe [(FilePath, Blob)] \f[V]stripComments\f[R] (boolean, default false)
\f[R] Causes HTML comments to be stripped in Markdown or Textile source,
.fi instead of being passed through to the output format.
.TP
\f[V]citeproc\f[R] (boolean, default false)
Causes citations to be processed using citeproc.
See Citations in \f[V]pandoc(1)\f[R] for details.
.TP
\f[V]citeMethod\f[R] (\f[V]\[dq]citeproc\[dq]|\[dq]natbib\[dq]|\[dq]biblatex\[dq]\f[R])
Determines how citations are formatted in LaTeX output.
.TP
\f[V]tableOfContents\f[R] (boolean, default false)
Include a table of contents (in supported formats).
.TP
\f[V]tocDepth\f[R] (integer, default 3)
Depth of sections to include in the table of contents.
.TP
\f[V]numberSections\f[R] (boolean, default false)
Automatically number sections (in supported formats).
.TP
\f[V]numberOffset\f[R] (array of integers)
Offsets to be added to each component of the section number.
For example, \f[V][1]\f[R] will cause the first section to be numbered
\[lq]2\[rq] and the first subsection \[lq]2.1\[rq]; \f[V][0,1]\f[R] will
cause the first section to be numbered \[lq]1\[rq] and the first
subsection \[lq]1.2.\[rq]
.TP
\f[V]identifierPrefix\f[R] (string)
Prefix to be added to all automatically-generated identifiers.
.TP
\f[V]sectionDivs\f[R] (boolean, default false)
Arrange the document into a hierarchy of nested sections based on the
headings.
.TP
\f[V]htmlQTags\f[R] (boolean, default false)
Use \f[V]<q>\f[R] elements in HTML instead of literal quotation marks.
.TP
\f[V]listings\f[R] (boolean, default false)
Use the \f[V]listings\f[R] package to format code in LaTeX output.
.TP
\f[V]referenceLinks\f[R] (boolean, default false)
Create reference links rather than inline links in Markdown output.
.TP
\f[V]setextHeaders\f[R] (boolean, default false)
Use Setext (underlined) headings instead of ATX (\f[V]#\f[R]-prefixed)
in Markdown output.
.TP
\f[V]preferAscii\f[R] (boolean, default false)
Use entities and escapes when possible to avoid non-ASCII characters in
the output.
.TP
\f[V]referenceLocation\f[R] (\f[V]\[dq]document\[dq]|\[dq]section\[dq]|\[dq]block\[dq]\f[R])
Determines whether link references and footnotes are placed at the end
of the document, the end of the section, or the end of the block
(e.g.\ paragraph), in certain formats.
(See \f[V]pandoc(1)\f[R] under \f[V]--reference-location\f[R].)
.TP
\f[V]topLevelDivision\f[R] (\f[V]\[dq]default\[dq]|\[dq]part\[dq]|\[dq]chapter\[dq]|\[dq]section\[dq]\f[R])
Determines how top-level headings are interpreted in LaTeX, ConTeXt,
DocBook, and TEI.
The \f[V]\[dq]default\[dq]\f[R] value tries to choose the best
interpretation based on heuristics.
.TP
\f[V]emailObfuscation\f[R] (\f[V]\[dq]none\[dq]|\[dq]references\[dq]|\[dq]javascript\[dq]\f[R])
Determines how email addresses are obfuscated in HTML.
.TP
\f[V]htmlMathMethod\f[R] (\f[V]\[dq]plain\[dq]|\[dq]webtex\[dq]|\[dq]gladtex\[dq]|\[dq]mathml\[dq]|\[dq]mathjax\[dq]|\[dq]katex\[dq]\f[R])
Determines how math is represented in HTML.
.TP
\f[V]variables\f[R] (JSON mapping)
Variables to be interpolated in the template.
(See Templates in \f[V]pandoc(1)\f[R].)
.TP
\f[V]dpi\f[R] (integer, default 96)
Dots-per-inch to use for conversions between pixels and other
measurements (for image sizes).
.TP
\f[V]incremental\f[R] (boolean, default false)
If true, lists appear incrementally by default in slide shows.
.TP
\f[V]slideLevel\f[R] (integer)
Heading level that determines slide divisions in slide shows.
The default is to pick the highest heading level under which there is
body text.
.TP
\f[V]highlightStyle\f[R] (string, default \f[V]\[dq]pygments\[dq]\f[R])
Specify the style to use for syntax highlighting of code.
Standard styles are \f[V]\[dq]pygments\[dq]\f[R] (the default),
\f[V]\[dq]kate\[dq]\f[R], \f[V]\[dq]monochrome\[dq]\f[R],
\f[V]\[dq]breezeDark\[dq]\f[R], \f[V]\[dq]espresso\[dq]\f[R],
\f[V]\[dq]zenburn\[dq]\f[R], \f[V]\[dq]haddock\[dq]\f[R], and
\f[V]\[dq]tango\[dq]\f[R].
Alternatively, the path of a \f[V].theme\f[R] file with a KDE syntax theme
may be used (in this case, the relevant file contents must also be
included in \f[V]files\f[R], see below).
.TP
\f[V]epubMetadata\f[R] (string)
Dublin Core XML elements to be used for EPUB metadata.
.TP
\f[V]epubChapterLevel\f[R] (integer, default 1)
Heading level at which chapter splitting occurs in EPUBs.
.TP
\f[V]epubSubdirectory\f[R] (string, default \[lq]EPUB\[rq])
Name of content subdirectory in the EPUB container.
.TP
\f[V]epubFonts\f[R] (array of file paths)
Fonts to include in the EPUB.
The fonts themselves must be included in \f[V]files\f[R] (see below).
.TP
\f[V]referenceDoc\f[R] (file path)
Reference doc to use in creating \f[V]docx\f[R] or \f[V]odt\f[R] or
\f[V]pptx\f[R].
See \f[V]pandoc(1)\f[R] under \f[V]--reference-doc\f[R] for details.
.TP
\f[V]files\f[R] (JSON mapping of file paths to base64-encoded strings)
Any files needed for the conversion, including images referred to in the
document source, should be included here.
Binary data must be base64-encoded.
Textual data may be left as it is, unless it is \f[I]also\f[R] valid
base64 data, in which case it will be interpreted that way.
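
As a rough illustration of the `numberOffset` arithmetic described above, the following Python sketch adds each offset to the matching component of a section number. It is not pandoc's own implementation; the function name `apply_number_offset` is hypothetical, and section numbers are modelled as plain lists of integers.

```python
def apply_number_offset(section_number, number_offset):
    """Add each offset to the matching component of a section number.

    Hypothetical helper for illustration only. Components without a
    corresponding offset are left unchanged (treated as offset 0).
    """
    missing = len(section_number) - len(number_offset)
    padded = list(number_offset) + [0] * max(missing, 0)
    return [n + off for n, off in zip(section_number, padded)]


def render(section_number):
    """Format a section number such as [2, 1] as "2.1"."""
    return ".".join(str(n) for n in section_number)


# Offset [1]: first section is "2", first subsection is "2.1".
first_section = render(apply_number_offset([1], [1]))
first_subsection = render(apply_number_offset([1, 1], [1]))

# Offset [0,1]: first section is "1", first subsection is "1.2".
other_section = render(apply_number_offset([1], [0, 1]))
other_subsection = render(apply_number_offset([1, 1], [0, 1]))
```

The results match the two cases given in the field description.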
.SS \f[V]/batch\f[R] endpoint .SS \f[V]/batch\f[R] endpoint
.PP .PP
The \f[V]/batch\f[R] endpoint behaves like the root endpoint, except for The \f[V]/batch\f[R] endpoint behaves like the root endpoint, except for

@ -71,97 +71,97 @@ the first one given is the default.
`text` (string) `text` (string)
: the document to be converted. Note: : The document to be converted. Note:
if the `from` format is binary (e.g., `epub` or `docx`), then if the `from` format is binary (e.g., `epub` or `docx`), then
`text` should be a base64 encoding of the document. `text` should be a base64 encoding of the document.
`from` (string, default `"markdown"`) `from` (string, default `"markdown"`)
: the input format, possibly with extensions, just as it is : The input format, possibly with extensions, just as it is
specified on the pandoc command line. specified on the pandoc command line.
`to` (string, default `"html"`) `to` (string, default `"html"`)
: the output format, possibly with extensions, just as it is : The output format, possibly with extensions, just as it is
specified on the pandoc command line. specified on the pandoc command line.
`wrapText` (`"auto"|"preserve"|"none"`) `wrapText` (`"auto"|"preserve"|"none"`)
: text wrapping option: either `"auto"` (automatic : Text wrapping option: either `"auto"` (automatic
hard-wrapping to fit within a column width), `"preserve"` hard-wrapping to fit within a column width), `"preserve"`
(insert newlines where they are present in the source), (insert newlines where they are present in the source),
or `"none"` (don't insert any unnecessary newlines at all). or `"none"` (don't insert any unnecessary newlines at all).
`columns` (integer, default 72) `columns` (integer, default 72)
: column width (affects text wrapping and calculation of : Column width (affects text wrapping and calculation of
table column widths in plain text formats) table column widths in plain text formats)
`standalone` (boolean, default false) `standalone` (boolean, default false)
: if true, causes a standalone document to be produced, using : If true, causes a standalone document to be produced, using
the default template or the custom template specified using the default template or the custom template specified using
`template`. If false, a fragment will be produced. `template`. If false, a fragment will be produced.
`template` (string) `template` (string)
: string contents of a document template (see Templates in : String contents of a document template (see Templates in
`pandoc(1)` for the format). `pandoc(1)` for the format).
`tabStop` (integer, default 4) `tabStop` (integer, default 4)
: tab stop (spaces per tab). : Tab stop (spaces per tab).
`indentedCodeClasses` (array of strings) `indentedCodeClasses` (array of strings)
: list of classes to be applied to indented Markdown code blocks. : List of classes to be applied to indented Markdown code blocks.
`abbreviations` (array of strings) `abbreviations` (array of strings)
: list of strings to be regarded as abbreviations when : List of strings to be regarded as abbreviations when
parsing Markdown. See `--abbreviations` in `pandoc(1)` for parsing Markdown. See `--abbreviations` in `pandoc(1)` for
details. details.
`defaultImageExtension` (string) `defaultImageExtension` (string)
: extension to be applied to image sources that lack extensions : Extension to be applied to image sources that lack extensions
(e.g. `".jpg"`). (e.g. `".jpg"`).
`trackChanges` (`"accept"|"reject"|"all"`) `trackChanges` (`"accept"|"reject"|"all"`)
: specifies what to do with insertions, deletions, and : Specifies what to do with insertions, deletions, and
comments produced by the MS Word "Track Changes" feature. Only comments produced by the MS Word "Track Changes" feature. Only
affects docx input. affects docx input.
`stripComments` (boolean, default false) `stripComments` (boolean, default false)
: causes HTML comments to be stripped in Markdown or Textile : Causes HTML comments to be stripped in Markdown or Textile
source, instead of being passed through to the output format. source, instead of being passed through to the output format.
`citeproc` (boolean, default false) `citeproc` (boolean, default false)
: causes citations to be processed using citeproc. See : Causes citations to be processed using citeproc. See
Citations in `pandoc(1)` for details. Citations in `pandoc(1)` for details.
`citeMethod` (`"citeproc"|"natbib"|"biblatex"`) `citeMethod` (`"citeproc"|"natbib"|"biblatex"`)
: determines how citations are formatted in LaTeX output. : Determines how citations are formatted in LaTeX output.
`tableOfContents` (boolean, default false) `tableOfContents` (boolean, default false)
: include a table of contents (in supported formats). : Include a table of contents (in supported formats).
`tocDepth` (integer, default 3) `tocDepth` (integer, default 3)
: depth of sections to include in the table of contents. : Depth of sections to include in the table of contents.
`numberSections` (boolean, default false) `numberSections` (boolean, default false)
: automatically number sections (in supported formats). : Automatically number sections (in supported formats).
`numberOffset` (array of integers) `numberOffset` (array of integers)
: offsets to be added to each component of the section number. : Offsets to be added to each component of the section number.
For example, `[1]` will cause the first section to be For example, `[1]` will cause the first section to be
numbered "2" and the first subsection "2.1"; `[0,1]` will numbered "2" and the first subsection "2.1"; `[0,1]` will
cause the first section to be numbered "1" and the first cause the first section to be numbered "1" and the first
@ -169,96 +169,116 @@ the first one given is the default.
`identifierPrefix` (string) `identifierPrefix` (string)
: prefix to be added to all automatically-generated identifiers. : Prefix to be added to all automatically-generated identifiers.
`sectionDivs` (boolean, default false) `sectionDivs` (boolean, default false)
: arrange the document into a hierarchy of nested sections : Arrange the document into a hierarchy of nested sections
based on the headings. based on the headings.
`htmlQTags` (boolean, default false) `htmlQTags` (boolean, default false)
: use `<q>` elements in HTML instead of literal quotation marks. : Use `<q>` elements in HTML instead of literal quotation marks.
`listings` (boolean, default false) `listings` (boolean, default false)
: use the `listings` package to format code in LaTeX output. : Use the `listings` package to format code in LaTeX output.
`referenceLinks` (boolean, default false) `referenceLinks` (boolean, default false)
: create reference links rather than inline links in Markdown output. : Create reference links rather than inline links in Markdown output.
`setextHeaders` (boolean, default false) `setextHeaders` (boolean, default false)
: use Setext (underlined) headings instead of ATX (`#`-prefixed) : Use Setext (underlined) headings instead of ATX (`#`-prefixed)
in Markdown output. in Markdown output.
`preferAscii` (boolean, default false) `preferAscii` (boolean, default false)
: use entities and escapes when possible to avoid non-ASCII : Use entities and escapes when possible to avoid non-ASCII
characters in the output. characters in the output.
`referenceLocation` (`"document"|"block"|"section"`) `referenceLocation` (`"document"|"section"|"block"`)
: : Determines whether link references and footnotes are placed
at the end of the document, the end of the section, or the
end of the block (e.g. paragraph), in
certain formats. (See `pandoc(1)` under `--reference-location`.)
`topLevelDivision` (`"default"|"part"|"chapter"|"section"`) `topLevelDivision` (`"default"|"part"|"chapter"|"section"`)
: : Determines how top-level headings are interpreted in
LaTeX, ConTeXt, DocBook, and TEI. The `"default"` value
tries to choose the best interpretation based on heuristics.
`emailObfuscation` (`"none"|"references"|"javascript"`) `emailObfuscation` (`"none"|"references"|"javascript"`)
: : Determines how email addresses are obfuscated in HTML.
`htmlMathMethod` (`"plain"|"webtex"|"gladtex"|"mathml"|"mathjax"|"katex"`) `htmlMathMethod` (`"plain"|"webtex"|"gladtex"|"mathml"|"mathjax"|"katex"`)
: : Determines how math is represented in HTML.
`variables` (JSON mapping) `variables` (JSON mapping)
: : Variables to be interpolated in the template. (See Templates
in `pandoc(1)`.)
`dpi` (integer, default 96) `dpi` (integer, default 96)
: : Dots-per-inch to use for conversions between pixels and
other measurements (for image sizes).
`incremental` (boolean, default false) `incremental` (boolean, default false)
: : If true, lists appear incrementally by default in slide shows.
`slideLevel` (integer) `slideLevel` (integer)
: : Heading level that determines slide divisions in slide shows.
The default is to pick the highest heading level under which
there is body text.
`highlightStyle` (string, default `"pygments"`) `highlightStyle` (string, default `"pygments"`)
: pygments (the default), kate, monochrome, breezeDark, : Specify the style to use for syntax highlighting of code.
espresso, zenburn, haddock, and tango or .theme file Standard styles are `"pygments"` (the default), `"kate"`,
`"monochrome"`, `"breezeDark"`, `"espresso"`, `"zenburn"`,
`"haddock"`, and `"tango"`. Alternatively, the path of
a `.theme` file with a KDE syntax theme may be used (in this
case, the relevant file contents must also be included
in `files`, see below).
`epubMetadata` (string) `epubMetadata` (string)
: : Dublin Core XML elements to be used for EPUB metadata.
`epubChapterLevel` (integer, default 1) `epubChapterLevel` (integer, default 1)
: : Heading level at which chapter splitting occurs in EPUBs.
`epubSubdirectory` (string, default "EPUB") `epubSubdirectory` (string, default "EPUB")
: : Name of content subdirectory in the EPUB container.
`epubFonts` (array of file paths) `epubFonts` (array of file paths)
: : Fonts to include in the EPUB. The fonts themselves must be
included in `files` (see below).
`referenceDoc` (file path) `referenceDoc` (file path)
: : Reference doc to use in creating `docx` or `odt` or `pptx`.
See `pandoc(1)` under `--reference-doc` for details.
`files` (JSON mapping of file paths to base64-encoded strings) `files` (JSON mapping of file paths to base64-encoded strings)
: : Any files needed for the conversion, including images
referred to in the document source, should be included here.
Binary data must be base64-encoded. Textual data may be
left as it is, unless it is *also* valid base64 data,
in which case it will be interpreted that way.
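
A minimal sketch of how a client might assemble a request body from the fields above, using only the Python standard library. The helper name `build_request` and the sample file name `logo.png` are assumptions made for illustration. The sketch base64-encodes every entry in `files`, textual or binary, so that text which happens to be valid base64 cannot be misinterpreted.

```python
import base64
import json


def build_request(text, from_format="markdown", to_format="html", files=None):
    """Return a JSON request body for the root endpoint.

    Hypothetical helper: `files` maps file paths to raw bytes.
    """
    body = {"text": text, "from": from_format, "to": to_format}
    if files:
        # Encode every file as base64, whether textual or binary.
        body["files"] = {
            path: base64.b64encode(data).decode("ascii")
            for path, data in files.items()
        }
    return json.dumps(body)


# A Markdown document that refers to an image supplied via `files`.
payload = build_request(
    "![logo](logo.png)\n",
    files={"logo.png": b"\x89PNG\r\n\x1a\n"},
)
```

The resulting string would be sent as the body of the POST request. For a binary `from` format such as `docx`, the `text` value itself would also need to be base64-encoded before being passed in.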
## `/batch` endpoint ## `/batch` endpoint