* The new reader is faster and more accurate.
* API changes for Text.Pandoc.Readers.HTML:
- removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag,
anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType,
htmlBlockElement, htmlComment
- added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
* tagsoup is a new dependency.
* Text.Pandoc.Parsing: Generalized type on readWith.
* Benchmark.hs: Added length calculation to force full evaluation.
* Updated HTML reader tests.
* Updated markdown and textile readers to use the functions from
the HTML reader.
* Note: The markdown reader now correctly handles some cases it did not
before. For example:
<hr/>
is reproduced without adding a space.
<script>
a = '<b>';
</script>
is parsed correctly.
Interact.hs loads ghci from the src directory, specifying
all the options needed to load pandoc modules (including
specific package dependencies, which it gets by parsing
dist/setup-config).
I had previously assumed that we needed to ignore
</script> occuring in a string literal or javascript
comment. It turns out, though, that browsers aren't
that smart.
This was achieved by rearranging the parsers in inline.
Benchmarks went from 500ms to 307ms -- not quite back to the
279ms we had in 1.6, before supporting smart punctuation and
footnotes, but close.
* Added Text.Pandoc.Pretty.
This is better suited for pandoc than the 'pretty' package.
One advantage is that we now get proper wrapping; Emph [Inline]
is no longer treated as a big unwrappable unit. Previously
we only got breaks for spaces at the "outer level." We can also
more easily avoid doubled blank lines. Performance is
significantly better as well.
* Removed Text.Pandoc.Blocks.
Text.Pandoc.Pretty allows you to define blocks and concatenate
them.
* Modified markdown, RST, org readers to use Text.Pandoc.Pretty
instead of Text.PrettyPrint.HughesPJ.
* Text.Pandoc.Shared: Added writerColumns to WriterOptions.
* Markdown, RST, Org writers now break text at writerColumns.
* Added --columns command-line option, which sets stColumns
and writerColumns.
* Table parsing: If the size of the header > stColumns,
use the header size as 100% for purposes of calculating
relative widths of columns.