Make default.html5 polyglot markup conformant. (#3473)

Polyglot markup is HTML5 that is also valid XHTML. See
<https://www.w3.org/TR/html-polyglot>.  With this change, pandoc's
html5 writer creates HTML that is both valid HTML5 and valid XHTML.
See jgm/pandoc-templates#237 for prior discussion.

* Add the XHTML namespace declaration (`xmlns`) to the `<html>` element.
* Make all `<meta>` elements self-closing.
  See <https://www.w3.org/TR/html-polyglot/#empty-elements>.
* Add `xml:lang` attribute on `<html>` element, defaulting to blank, and
  always include `lang` attribute, even when blank.  See
  <https://www.w3.org/TR/html-polyglot/#language-attributes>.
* Update test files for template changes.
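
As a rough illustration of the effect of the changes listed above, here is a small Python sketch (not part of this commit). It checks whether a document parses as XML, using abbreviated versions of the old and new test output. The names `OLD`, `NEW` and `is_well_formed_xml` are made up for this example, and XML well-formedness is only a necessary condition for polyglot conformance, not a full validation.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Abbreviated output before the change: <meta> is left unclosed.
OLD = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title></title>
</head>
<body></body>
</html>"""

# Abbreviated output after the change: namespace, lang/xml:lang, closed <meta>.
NEW = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<title></title>
</head>
<body></body>
</html>"""


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("old output parses as XML:", is_well_formed_xml(OLD))
    print("new output parses as XML:", is_well_formed_xml(NEW))
    root = ET.fromstring(NEW)
    # ElementTree reports namespaced names as "{namespace}local-name".
    print("root tag:", root.tag)
    print("xml:lang:", repr(root.get("{%s}lang" % XML_NS)))
```

The unclosed `<meta>` in the old output is what makes an XML parser reject it; the self-closing form is accepted by both HTML5 and XML parsers.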

The key justification for having language values default to blank is
that, as I read it, the HTML5 spec requires it.  Under
[the HTML5 spec, section "3.2.5.3. The lang and xml:lang
attributes"](https://www.w3.org/TR/html/dom.html#the-lang-and-xmllang-attributes),
providing the attributes with blank contents both:

    * Has a meaning, "unknown", and
    * Is a MUST (written as "must") if a language value is not provided ...

> The lang attribute (in no namespace) specifies the primary language
> for the element's contents and for any of the element's attributes that
> contain text. Its value must be a valid BCP 47 language tag, or the
> empty string. Setting the attribute to the empty string indicates that
> the primary language is unknown.

In short, it seems that where a language value is not provided, a
blank value MUST be provided for Polyglot Markup conformance, because
the HTML5 spec stipulates a "must". So although the Polyglot Markup spec
is unclear on this point, it would seem that a correctly written version
of it would require blank attributes.

Further justifications are found at
https://github.com/jgm/pandoc-templates/issues/237#issuecomment-275584181
(but the HTML5 spec justification given above would seem to be the
clincher).
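
A minimal sketch of the always-emit behaviour argued for above, written in Python for illustration only. `html_open_tag` is a hypothetical helper, not pandoc code; it mirrors the `<html>` line of the updated default.html5 template, where `lang` and `xml:lang` are written even when empty and `dir` stays conditional.

```python
from html import escape


def html_open_tag(lang: str = "", dir_: str = "") -> str:
    """Build the opening <html> tag the way the updated template does.

    lang and xml:lang are always emitted with the same value (possibly
    empty); dir is emitted only when a value is given.
    """
    attrs = [
        ("xmlns", "http://www.w3.org/1999/xhtml"),
        ("lang", lang),
        ("xml:lang", lang),
    ]
    if dir_:
        attrs.append(("dir", dir_))
    rendered = "".join(
        f' {name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<html{rendered}>"


if __name__ == "__main__":
    print(html_open_tag())             # no language supplied
    print(html_open_tag("en", "ltr"))  # language and direction supplied
```

With no language supplied, this produces the same `<html>` line as the updated test files in this commit.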

In addition to having lang values default to blank, I recommend that, when
an author does not provide a lang value, pandoc print a warning message
like the following when the command is run:

> Polyglot markup stipulates that 'The root element SHOULD always specify
> the language'. It is therefore recommended you specify a language value in
> your source document. See
> <https://www.w3.org/International/articles/language-tags/> for valid
> language values.
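
The recommended check could look roughly like the following Python sketch. This is not pandoc's implementation (pandoc is written in Haskell, and this commit adds no warning); `check_lang` and the plain-dictionary metadata are assumptions made purely to show the logic.

```python
import sys
from typing import Mapping, Optional

# Warning text taken from the recommendation in this commit message.
LANG_WARNING = (
    "Polyglot markup stipulates that 'The root element SHOULD always "
    "specify the language'. It is therefore recommended you specify a "
    "language value in your source document. See "
    "<https://www.w3.org/International/articles/language-tags/> for "
    "valid language values."
)


def check_lang(metadata: Mapping[str, str]) -> Optional[str]:
    """Return the warning if no usable lang value is set, else None.

    `metadata` stands in for the document metadata (hypothetical shape).
    """
    lang = metadata.get("lang", "")
    if lang.strip():
        return None
    return LANG_WARNING


if __name__ == "__main__":
    warning = check_lang({"title": "Pandoc Test Suite"})
    if warning is not None:
        # Warnings go to stderr so the generated HTML on stdout stays clean.
        print("[WARNING] " + warning, file=sys.stderr)
```

The blank `lang=""` would still be written to the output; the warning only nudges the author towards supplying a real language tag.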
John Luke Bentley 2017-03-04 20:08:38 +11:00 committed by John MacFarlane
parent ce9d49ef04
commit 07d51d9e30
4 changed files with 22 additions and 22 deletions

@@ -1,17 +1,17 @@
 <!DOCTYPE html>
-<html$if(lang)$ lang="$lang$"$endif$$if(dir)$ dir="$dir$"$endif$>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="$lang$" xml:lang="$lang$"$if(dir)$ dir="$dir$"$endif$>
 <head>
-<meta charset="utf-8">
-<meta name="generator" content="pandoc">
-<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+<meta charset="utf-8" />
+<meta name="generator" content="pandoc" />
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 $for(author-meta)$
-<meta name="author" content="$author-meta$">
+<meta name="author" content="$author-meta$" />
 $endfor$
 $if(date-meta)$
-<meta name="dcterms.date" content="$date-meta$">
+<meta name="dcterms.date" content="$date-meta$" />
 $endif$
 $if(keywords)$
-<meta name="keywords" content="$for(keywords)$$keywords$$sep$, $endfor$">
+<meta name="keywords" content="$for(keywords)$$keywords$$sep$, $endfor$" />
 $endif$
 <title>$if(title-prefix)$$title-prefix$ $endif$$pagetitle$</title>
 <style type="text/css">code{white-space: pre;}</style>

@@ -1,9 +1,9 @@
 <!DOCTYPE html>
-<html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
 <head>
-<meta charset="utf-8">
-<meta name="generator" content="pandoc">
-<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+<meta charset="utf-8" />
+<meta name="generator" content="pandoc" />
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <title></title>
 <style type="text/css">code{white-space: pre;}</style>
 <style type="text/css">

@@ -1,9 +1,9 @@
 <!DOCTYPE html>
-<html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
 <head>
-<meta charset="utf-8">
-<meta name="generator" content="pandoc">
-<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+<meta charset="utf-8" />
+<meta name="generator" content="pandoc" />
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <title></title>
 <style type="text/css">code{white-space: pre;}</style>
 <style type="text/css">

@@ -1,12 +1,12 @@
 <!DOCTYPE html>
-<html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
 <head>
-<meta charset="utf-8">
-<meta name="generator" content="pandoc">
-<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
-<meta name="author" content="John MacFarlane">
-<meta name="author" content="Anonymous">
-<meta name="dcterms.date" content="2006-07-17">
+<meta charset="utf-8" />
+<meta name="generator" content="pandoc" />
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+<meta name="author" content="John MacFarlane" />
+<meta name="author" content="Anonymous" />
+<meta name="dcterms.date" content="2006-07-17" />
 <title>Pandoc Test Suite</title>
 <style type="text/css">code{white-space: pre;}</style>
 <!--[if lt IE 9]>