John Luke Bentley 07d51d9e30 Make default.html5 polyglot markup conformant. (#3473)
Polyglot markup is HTML5 that is also valid XHTML. See
<https://www.w3.org/TR/html-polyglot>.  With this change, pandoc's
html5 writer creates HTML that is both valid HTML5 and valid XHTML.
See jgm/pandoc-templates#237 for prior discussion.

* Add xml namespace to `<html>` element.
* Make all `<meta>` elements self closing.
  See <https://www.w3.org/TR/html-polyglot/#empty-elements>.
* Add `xml:lang` attribute on `<html>` element, defaulting to blank, and
  always include `lang` attribute, even when blank.  See
  <https://www.w3.org/TR/html-polyglot/#language-attributes>.
* Update test files for template changes.
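
The effect of these changes can be checked mechanically. The following is a minimal Python sketch, assuming a hand-written document skeleton (`POLYGLOT_DOC` is illustrative and not copied from `default.html5`) and a hypothetical helper name `check_polyglot_root`. It parses the markup with a strict XML parser and inspects the `<html>` element. Note that this only tests XML well-formedness and the root attributes, not full polyglot conformance.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical skeleton illustrating the listed changes (assumption, not the real template).
POLYGLOT_DOC = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <title>Example</title>
</head>
<body>
  <p>Hello</p>
</body>
</html>"""


def check_polyglot_root(markup):
    """Parse markup as XML and report which root-element conditions hold."""
    # ET.fromstring raises ET.ParseError if the markup is not well-formed XML,
    # for example when a <meta> element is left unclosed.
    root = ET.fromstring(markup)
    return {
        "xhtml_namespace": root.tag == "{%s}html" % XHTML_NS,
        "has_lang": "lang" in root.attrib,
        "has_xml_lang": ("{%s}lang" % XML_NS) in root.attrib,
    }


if __name__ == "__main__":
    print(check_polyglot_root(POLYGLOT_DOC))
```

Replacing `<meta charset="utf-8" />` with `<meta charset="utf-8">` in the skeleton makes the XML parser reject the document, which is why the self-closing form matters.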

The key justification for having language values default to blank: it
turns out the HTML5 spec requires it (as I read it).  Under
[the HTML5 spec, section "3.2.5.3. The lang and xml:lang
attributes"](https://www.w3.org/TR/html/dom.html#the-lang-and-xmllang-attributes),
providing attributes with blank contents both:

    * Has meaning, "unknown", and
    * Is a MUST (written as "must") if a language value is not provided ...

> The lang attribute (in no namespace) specifies the primary language
> for the element's contents and for any of the element's attributes that
> contain text. Its value must be a valid BCP 47 language tag, or the
> empty string. Setting the attribute to the empty string indicates that
> the primary language is unknown.

In short, it seems that where a language value is not provided, a
blank value MUST be provided for Polyglot Markup conformance, because
the HTML5 spec stipulates a "must". So although the Polyglot Markup spec
is unclear on this issue, it would seem that if it were correctly written
it would require blank attributes.

Further justifications are found at
https://github.com/jgm/pandoc-templates/issues/237#issuecomment-275584181
(but the HTML5 spec justification given above would seem to be the
clincher).

In addition to having lang-values-default-to-blank I recommend that, when an
author does not provide a lang value, a warning message like the following
be provided upon pandoc command execution:

> Polyglot markup stipulates that 'The root element SHOULD always specify
> the language'. It is therefore recommended you specify a language value in
> your source document. See
> <https://www.w3.org/International/articles/language-tags/> for valid
> language values.
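
The recommended behaviour could be sketched as follows. This is a minimal Python illustration under stated assumptions, not pandoc's implementation: the helper `root_lang_attributes` and the plain-dictionary metadata layout are hypothetical. It defaults both attributes to the empty string and returns the warning text only when no language was given.

```python
# Hypothetical helper; neither the name nor the metadata layout is pandoc's API.
WARNING = (
    "Polyglot markup stipulates that 'The root element SHOULD always specify "
    "the language'. It is therefore recommended you specify a language value "
    "in your source document. See "
    "<https://www.w3.org/International/articles/language-tags/> for valid "
    "language values."
)


def root_lang_attributes(metadata):
    """Return (attributes for <html>, warning text or None)."""
    # A missing or empty language falls back to the blank value ("unknown").
    lang = metadata.get("lang") or ""
    attrs = {"lang": lang, "xml:lang": lang}
    warning = WARNING if lang == "" else None
    return attrs, warning
```

A caller would write the attributes into the `<html>` element and, if the second value is not `None`, print it to standard error.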
2017-03-04 10:08:38 +01:00
benchmark Fix stale references to tests directory (#3469) 2017-02-25 10:31:40 +01:00
data Make default.html5 polyglot markup conformant. (#3473) 2017-03-04 10:08:38 +01:00
deb deb/make_deb.sh fixes. 2017-02-12 23:13:30 +01:00
doc Added skeletons for docs on customizing pandoc and using pandoc API. 2017-02-01 12:50:44 +01:00
lib/fonts lib: Added symbol.txt and file to generate codepoint to unicode mapping 2014-08-09 22:37:12 -04:00
macos Replaced {deb,macos,windows}/stack.yaml with stack.pkg.yaml. 2017-02-12 21:45:30 +01:00
man Updated man page. 2017-01-29 21:19:16 +01:00
prelude Remove unnecessary CPP in custom Prelude. 2016-09-03 15:23:32 -04:00
src/Text OpenDocument writer: fixed dropped elements in some ordered lists. 2017-03-03 22:48:37 +01:00
test Make default.html5 polyglot markup conformant. (#3473) 2017-03-04 10:08:38 +01:00
tools Moved extract-changes.hs and github-upload.sh to tools/. 2017-01-25 17:07:41 +01:00
trypandoc Adjustments for building trypandoc with stack. 2017-01-29 22:01:11 +01:00
windows Windows packaging fixes to use new stack.pkg.yaml. 2017-02-12 22:04:53 +01:00
.editorconfig Fix editorconfig for test files 2014-04-12 12:22:09 +02:00
.gitignore Added deb/.vagrant to gitignore 2017-02-01 12:36:56 +01:00
.travis.yml Travis: move 7.8.4 out of allowed-failures. 2017-02-20 16:39:37 +01:00
appveyor.yml appveyor.yml: Fixed some paths. 2017-02-12 23:17:35 +01:00
BUGS BUGS: Added reference to CONTRIBUTING.md. 2013-04-14 22:14:44 -07:00
changelog Updated changelog. 2017-01-29 21:09:21 +01:00
CONTRIBUTING.md Fixed typos in CONTRIBUTING.md (#3479) 2017-03-01 15:00:53 +01:00
COPYING.md Download markdown version of the license from GNU and rename to COPYING.md 2016-10-19 04:11:36 -07:00
COPYRIGHT Fix stale references to tests directory (#3469) 2017-02-25 10:31:40 +01:00
INSTALL.md Fix stale references to tests directory (#3469) 2017-02-25 10:31:40 +01:00
Makefile Makefile: make version overridable. 2017-02-07 19:01:21 +01:00
MANUAL.txt Removed --epub-stylesheet; use --css instead. 2017-02-27 21:29:16 +01:00
pandoc.cabal Bumped syb upper bound. 2017-03-02 10:42:07 +01:00
pandoc.hs Consolidated file arguments into Opt. 2017-02-06 14:52:16 +01:00
README.md Rearrange and extend badges in README (#3354) 2017-01-15 10:48:28 +01:00
RELEASE-CHECKLIST Update RELEASE_CHECKLIST. 2017-02-27 10:19:22 +01:00
RELEASE-CHECKLIST.md Updated RELEASE-CHECKLIST and markdownified. 2017-01-25 17:07:41 +01:00
Setup.hs Setup.hs - removed some unneeded imports. 2016-10-18 15:11:05 +02:00
stack.full.yaml Use lts-7.5 resolver. 2016-10-26 12:32:30 +02:00
stack.pkg.yaml Use latest pandoc-citeproc commit in pkg.yaml. 2017-02-23 10:16:57 +01:00
stack.yaml Use latest skylighting (0.3). 2017-02-20 13:31:15 +01:00

Pandoc


The universal markup converter

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can read Markdown, CommonMark, PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown, and (subsets of) Textile, reStructuredText, HTML, LaTeX, MediaWiki markup, TWiki markup, Haddock markup, OPML, Emacs Org mode, DocBook, txt2tags, EPUB, ODT and Word docx; and it can write plain text, Markdown, CommonMark, PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown, reStructuredText, XHTML, HTML5, LaTeX (including `beamer` slide shows), ConTeXt, RTF, OPML, DocBook, OpenDocument, ODT, Word docx, GNU Texinfo, MediaWiki markup, DokuWiki markup, ZimWiki markup, Haddock markup, EPUB v2 or v3, FictionBook2, Textile, groff man pages, Emacs Org mode, AsciiDoc, InDesign ICML, TEI Simple, and Slidy, Slideous, DZSlides, reveal.js or S5 HTML slide shows. It can also produce PDF output on systems where LaTeX, ConTeXt, or wkhtmltopdf is installed.

Pandoc's enhanced version of Markdown includes syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, superscripts and subscripts, strikeout, metadata blocks, automatic tables of contents, embedded LaTeX math, citations, and Markdown inside HTML block elements. (These enhancements, described further under Pandoc's Markdown, can be disabled using the markdown_strict input or output format.)

In contrast to most existing tools for converting Markdown to HTML, which use regex substitutions, pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document, and a set of writers, which convert this native representation into a target format. Thus, adding an input or output format requires only adding a reader or writer.
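
The reader/writer split can be illustrated with a toy sketch. The Python below is not pandoc's code or API (pandoc is written in Haskell); the `Header` and `Para` types, the tiny Markdown subset, and the `READERS`/`WRITERS` tables are all assumptions made for illustration. It shows why a new format needs only one new function: every reader produces the same intermediate representation and every writer consumes it.

```python
from dataclasses import dataclass
from html import escape


# Toy intermediate representation (assumption; far simpler than pandoc's).
@dataclass
class Header:
    level: int
    text: str


@dataclass
class Para:
    text: str


def read_markdown(source):
    """Toy reader: ATX headers and blank-line-separated paragraphs only."""
    blocks = []
    for chunk in source.strip().split("\n\n"):
        chunk = " ".join(chunk.split())  # collapse internal whitespace
        if not chunk:
            continue
        if chunk.startswith("#"):
            level = len(chunk) - len(chunk.lstrip("#"))
            blocks.append(Header(level, chunk[level:].strip()))
        else:
            blocks.append(Para(chunk))
    return blocks


def write_html(blocks):
    """Toy writer: render the block list as HTML."""
    out = []
    for block in blocks:
        if isinstance(block, Header):
            out.append("<h%d>%s</h%d>" % (block.level, escape(block.text), block.level))
        else:
            out.append("<p>%s</p>" % escape(block.text))
    return "\n".join(out)


def write_plain(blocks):
    """Toy writer: render the block list as plain text with upper-case headers."""
    return "\n\n".join(
        block.text.upper() if isinstance(block, Header) else block.text
        for block in blocks
    )


# Adding a format means adding one entry to one of these tables.
READERS = {"markdown": read_markdown}
WRITERS = {"html": write_html, "plain": write_plain}


def convert(source, from_format, to_format):
    return WRITERS[to_format](READERS[from_format](source))
```

For example, `convert("# Title\n\nHello & bye", "markdown", "html")` goes through `read_markdown` and then `write_html`, and neither function knows about the other.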

Because pandoc's intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc's simple document model. While conversions from pandoc's Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc's Markdown can be expected to be lossy.

Installing

Here's how to install pandoc.

Documentation

Pandoc's website contains a full User's Guide. It is also available here as pandoc-flavored Markdown. The website also contains some examples of the use of pandoc and a limited online demo.

Contributing

Pull requests, bug reports, and feature requests are welcome. Please make sure to read the contributor guidelines before opening a new issue.

License

© 2006-2016 John MacFarlane (jgm@berkeley.edu). Released under the GPL, version 2 or greater. This software carries no warranty of any kind. (See COPYRIGHT for full copyright and warranty notices.)