README: Documented scheme for header identifiers in HTML.

git-svn-id: https://pandoc.googlecode.com/svn/trunk@680 788f1e2b-df1e-0410-8736-df70ead52e1b
fiddlosopher 2007-07-12 04:33:15 +00:00
parent 389c762afc
commit 62227fe259

README

@@ -597,40 +597,6 @@ These work like simple tables, but with the following differences:
- They must end with a row of dashes, then a blank line.
- The rows must be separated by blank lines.
Embedded HTML
-------------
Pandoc treats embedded HTML in markdown a bit differently than
Markdown 1.0. While Markdown 1.0 leaves HTML blocks exactly as they
are, Pandoc treats text between HTML tags as markdown. Thus, for
example, Pandoc will turn
<table>
<tr>
<td>*one*</td>
<td>[a link](http://google.com)</td>
</tr>
</table>
into
<table>
<tr>
<td><em>one</em></td>
<td><a href="http://google.com">a link</a></td>
</tr>
</table>
whereas `Markdown.pl` will preserve it as is.
There is one exception to this rule: text between `<script>` and
`</script>` tags is not interpreted as markdown.
This departure from standard markdown should make it easier to mix
markdown with HTML block elements. For example, one can surround
a block of markdown text with `<div>` tags without preventing it
from being interpreted as markdown.
Title blocks
------------
@@ -675,6 +641,75 @@ For example,
The middle of the man page footer is used for the date.
Markdown in HTML blocks
-----------------------
While standard markdown leaves HTML blocks exactly as they are, Pandoc
treats text between HTML tags as markdown. Thus, for example, Pandoc
will turn
<table>
<tr>
<td>*one*</td>
<td>[a link](http://google.com)</td>
</tr>
</table>
into
<table>
<tr>
<td><em>one</em></td>
<td><a href="http://google.com">a link</a></td>
</tr>
</table>
whereas `Markdown.pl` will preserve it as is.
There is one exception to this rule: text between `<script>` and
`</script>` tags is not interpreted as markdown.
This departure from standard markdown should make it easier to mix
markdown with HTML block elements. For example, one can surround
a block of markdown text with `<div>` tags without preventing it
from being interpreted as markdown.
Header identifiers in HTML
--------------------------
Each header element in pandoc's HTML output is given a unique
identifier. This identifier is based on the text of the header. To
derive the identifier from the header text,
- Remove all formatting, links, etc.
- Remove all punctuation, except dashes and hyphens.
- Replace all spaces, dashes, newlines, and hyphens with hyphens.
- Convert all alphabetic characters to lowercase.
Thus,
Header text                             Identifier
---------------------------------  ---  ----------------------------
Header identifiers in HTML          →   `header-identifiers-in-html`
*Dogs*?--in *my* house?             →   `dogs--in-my-house`
[HTML], [S5], or [RTF]?             →   `html-s5-or-rtf`
These rules should, in most cases, allow one to determine the identifier
from the header text. The exception is when several headers have the
same text; in this case, the first will get an identifier as described
above; the second will get the same identifier with `-1` appended; the
third with `-2`; and so on.
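The derivation rules and the numbering of repeated headers can be sketched in a few lines of Python. This is only an illustration of the scheme described above, not pandoc's actual code (pandoc is written in Haskell). The function names `base_identifier` and `unique_identifiers` are hypothetical, the input is assumed to be header text with formatting and links already removed, and "punctuation" is assumed to mean any character that is not alphanumeric, whitespace, a hyphen, or a dash.

```python
from collections import defaultdict

# Assumption: "dashes and hyphens" means these three characters.
DASHES = {"-", "\u2013", "\u2014"}  # hyphen, en dash, em dash


def base_identifier(plain_text):
    """Derive an identifier from header text that is already free of markup."""
    kept = []
    for ch in plain_text:
        if ch in DASHES or ch.isspace():
            # Spaces, newlines, dashes, and hyphens each become one hyphen.
            kept.append("-")
        elif ch.isalnum():
            # Letters are lowercased; digits pass through unchanged.
            kept.append(ch.lower())
        # Anything else is treated as punctuation and dropped.
    return "".join(kept)


def unique_identifiers(headers):
    """Return one identifier per header, numbering repeats -1, -2, ..."""
    seen = defaultdict(int)  # base identifier -> times seen so far
    result = []
    for text in headers:
        base = base_identifier(text)
        count = seen[base]
        result.append(base if count == 0 else f"{base}-{count}")
        seen[base] += 1
    return result


if __name__ == "__main__":
    for ident in unique_identifiers(["Dogs?--in my house?", "Intro", "Intro"]):
        print(ident)
```

Note that the sketch follows the rules literally, so it does not guard against a numbered identifier such as `intro-1` colliding with a separate header whose own text happens to yield `intro-1`.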
These identifiers are used to provide link targets in the table of
contents generated by the `--toc` (or `--table-of-contents`) option. They
also make it easy to provide links from one section of a document to
another. A link to this section, for example, might look like this:
See the section on [header identifiers](#header-identifiers-in-html).
Note, however, that this method of providing links to sections works
only in HTML.
Box-style blockquotes
---------------------