pandoc (0.4) UNRELEASED; urgency=low

  [ John MacFarlane ]

  * Added support for Markdown tables. Two kinds of tables are supported
    (a simple table with one-line rows, and a more complex variety with
    multiline rows). Currently only the Markdown reader and the LaTeX,
    Docbook, and HTML writers support tables. The syntax is documented in
    README.

  * Refactored to avoid reliance on Haskell's Text.Regex library, which
    (a) is slow, and (b) does not properly handle unicode. This fixed
    some strange bugs, e.g. in parsing S-cedilla, and improved performance.

    + Replaced 'gsub' with a general list function 'substitute'
      that does not rely on Text.Regex.
    + Rewrote extractTagType in HTML reader so that it doesn't use
      regexes.
    + In Markdown reader, replaced email regex test with a custom email
      autolink parser (autoLinkEmail). Also replaced selfClosingTag regex
      with a custom function isSelfClosingTag.
    + Modified Docbook writer so that it doesn't rely on Text.Regex for
      detecting 'mailto' links.
    + Removed escapePreservingRegex and revamped entity-handling
      functions in Text/Pandoc/Shared.hs and Text/Pandoc/Entities.hs to
      avoid reliance on Text.Regex (see below on Entity handling changes).
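For illustration, a general list function in the spirit of 'substitute' could look like the sketch below. This is an assumption about its shape, not the definition in Text/Pandoc/Shared.hs: the argument order and the handling of an empty target are guesses.

```haskell
import Data.List (isPrefixOf)

-- | Replace every non-overlapping occurrence of 'target' in a list.
-- Works on any list whose elements support equality, so Strings are
-- handled without going through Text.Regex.
-- (Sketch only: argument order is an assumption.)
substitute :: Eq a => [a] -> [a] -> [a] -> [a]
substitute [] _ xs = xs   -- an empty target would match forever; leave input alone
substitute _ _ [] = []
substitute target replacement xs@(y : ys)
  | target `isPrefixOf` xs =
      replacement ++ substitute target replacement (drop (length target) xs)
  | otherwise = y : substitute target replacement ys
```

Because the function is polymorphic, the same code serves strings and any other list of comparable elements, for example `substitute "ab" "X" "cabab"` or `substitute [1,2] [9] [1,2,3]`.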
  * Changed handling of SGML entities. Entities are now parsed (and unicode
    characters returned) in the Markdown and HTML readers, rather than being
    handled in the writers. In HTML and Docbook writers, UTF-8 is now used
    instead of entities for characters above 128. This makes the HTML and
    Docbook output much more readable and more easily editable.

    + Removed sgmlHexEntity, sgmlDecimalEntity, sgmlNamedEntity, and
      sgmlCharacterEntity regexes from Shared.hs.
    + Added parsers characterEntity, namedEntity, decimalEntity, hexEntity
      to Entities.hs; these parse a string and return a unicode character.
    + Added new 'entity' parser to Markdown reader, and added '&' as a
      special character.
    + Changed 'entity' parser in HTML reader to use characterEntity.
    + Rewrote decodeEntities to use the new parsers instead of Text.Regex.
    + Modified HTML and Markdown readers to call decodeEntities on all raw
      strings (e.g. authors, dates, link titles), to ensure that no
      unprocessed entities are included in the native representation of
      the document. (In the HTML reader, most of this work is done by a
      change in extractAttributeName.)
    + Added escapeSGMLChar to Entities.hs. Modified escapeSGMLString to
      use escapeSGMLChar.
    + In SGML and Markdown output, escape unicode nonbreaking space as
      '&nbsp;', since a unicode non-breaking space is impossible to
      distinguish visually from a regular space. (Resolves issue #3.)
    + Replaced all calls to stringToSGML and encodeEntities with calls to
      escapeSGMLString.
    + Rewrote escapeSGMLString for better performance.
    + Added charToEntity and charToNumericalEntity to Entities.hs.
      Removed encodeEntitiesNumerical.
    + Use Data.Map for entityTable and (new) reverseEntityTable, for a
      slight performance boost over the old association list.
    + Removed unneeded decodeEntities from 'str' parser in HTML and
      Markdown readers.
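The following is a minimal sketch of entity decoding backed by Data.Map, to show the idea behind the entries above. The table here is a five-entry sample, and 'decodeEntity' is a hypothetical helper written with plain list functions; the actual parsers in Entities.hs (characterEntity, namedEntity, decimalEntity, hexEntity) may be structured quite differently.

```haskell
import Data.Char (chr, digitToInt, isDigit, isHexDigit)
import qualified Data.Map as Map

-- Sample name-to-character table (the real table is much larger).
entityTable :: Map.Map String Char
entityTable = Map.fromList
  [("amp", '&'), ("lt", '<'), ("gt", '>'), ("quot", '"'), ("nbsp", '\160')]

-- Inverse table, built once, for character-to-name lookups.
reverseEntityTable :: Map.Map Char String
reverseEntityTable =
  Map.fromList [(c, name) | (name, c) <- Map.toList entityTable]

-- | Decode one complete entity such as "&amp;", "&#233;" or "&#xE9;".
-- Hypothetical helper: returns Nothing for anything it does not recognize.
decodeEntity :: String -> Maybe Char
decodeEntity ('&' : rest) =
  case break (== ';') rest of
    ('#' : x : digits, ";") | x `elem` "xX" -> numeric 16 isHexDigit digits
    ('#' : digits, ";") -> numeric 10 isDigit digits
    (name, ";") -> Map.lookup name entityTable
    _ -> Nothing
  where
    numeric :: Integer -> (Char -> Bool) -> String -> Maybe Char
    numeric base ok ds
      | not (null ds) && all ok ds && n <= 0x10FFFF = Just (chr (fromInteger n))
      | otherwise = Nothing
      where n = foldl (\acc d -> acc * base + toInteger (digitToInt d)) 0 ds
decodeEntity _ = Nothing
```

Both directions are logarithmic-time map lookups, which is the kind of gain over a linear scan of an association list that the entry describes.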
  * Fixed several bugs in HTML reader (extractTagType, attribute parsing).

  * Markdown reader:

    + Fixed several bugs in smart quote recognition.
    + Changed autoLink parsing to conform better to Markdown.pl's
      behavior. <google.com> is not treated as a link, but
      <http://google.com>, <ftp://google.com>, and <mailto:google@google.com> are.
    + Cleaned up handling of embedded quotes in link titles. Now these are
      stored as a '"' character, not as '&quot;'.
    + Use lookAhead parser for the 'first pass' (looking for reference keys),
      instead of parsing normally, then using setInput to reset input. This
      yields a slight performance boost.
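The autolink rule described in this list can be sketched as a plain predicate. The function name and the prefix list are hypothetical; only the three schemes named in the entry are included, and bare e-mail autolinks (handled by autoLinkEmail) are left out.

```haskell
import Data.List (isPrefixOf)

-- Prefixes that make bracketed text an autolink in this sketch.
-- Only the schemes named in the changelog entry are listed.
autoLinkPrefixes :: [String]
autoLinkPrefixes = ["http://", "ftp://", "mailto:"]

-- | Return the link target if the input has the form <scheme...>,
-- otherwise Nothing. Whitespace or a nested '<' disqualifies the input.
autoLinkTarget :: String -> Maybe String
autoLinkTarget ('<' : rest) =
  case break (== '>') rest of
    (body, ">")
      | any (`isPrefixOf` body) autoLinkPrefixes
      , all (`notElem` " \t\n<") body -> Just body
    _ -> Nothing
autoLinkTarget _ = Nothing
```

With this rule `<http://google.com>` yields a target, while `<google.com>` does not and would be left for other parsers to handle.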
  * Markdown writer: Use autolinks when possible. Instead of
    [site.com](site.com), use <site.com>.

  * RST Reader:

    + Allow the URI in a RST hyperlink target to start on the line
      after the reference key.
    + Added 'try' in front of 'string', where needed, or used a different parser,
      in RST reader. This fixes a bug where ````` would not be correctly parsed as
      a verbatim `.
    + Fixed slow performance in parsing inline literals in RST reader. The
      problem was that ``#`` was seen by 'inline' as a potential link or image.
      Fix: inserted 'notFollowedBy (char '`')' in link parsers.
      (Resolves issue #8.)
    + Use lookAhead instead of getInput/setInput in RST reader. Removed
      unneeded getState call, since lookAhead automatically saves and
      restores the parser state.

  * LaTeX Reader: replaced 'choice [(try (string ...), ...]' idiom with
    'oneOfStrings' in LaTeX reader, for clarity.

  * Modified LaTeX writer to insert '\,' between consecutive quotes.

  * Text.ParserCombinators.Pandoc:

    + Removed followedBy' parser, replacing it with the lookAhead parser from
      Text/ParserCombinators/Parsec.
    + Added some needed 'try's before multicharacter parsers, especially in
      'option' contexts.
    + Removed the 'try' from the 'end' parser in 'enclosed', so that
      'enclosed' behaves like 'option', 'manyTill', etc.

  * Added defaultWriterOptions to Text/Pandoc/Shared.

  * Improved website target:

    + Use a subsidiary Makefile that can be run from the website
      directory.
    + Improved "Examples" page: added a templating system, syntax
      highlighting of xml, tex, and html files, and a demo of
      docbook postprocessed by xmlto.
    + Download links now go to Google's download details page (with
      SHA1 checksum) rather than directly to the files.

  * Added FreeBSD port.

 -- Recai Oktaş <roktas@debian.org>  Tue, 16 Jan 2007 00:37:21 +0200

pandoc (0.3) unstable; urgency=low

  [ John MacFarlane ]

  * Changes in pandoc options:

    + Allow options to follow or precede arguments.
    + Changed '--smartypants' to '--smart' and adjusted symbols accordingly.
    + Added '--strict' option.
    + Added '-o/--output' option.
    + Added '--dump-args' and '--ignore-args' options (for use in wrappers).
    + Modified '-v' and '-h' output to go to STDERR, not STDOUT, and return
      error conditions. This is helpful for writing wrappers.
    + Added copyright message to '-v' output, modeled after FSF messages.
    + Reformatted usage message so that it doesn't wrap illegibly.
    + Removed extra blanks after '-h' and '-D' output.

  * Added docbook writer.

  * Added implicit setting of default input and output format based
    on input and output filename extensions. These defaults are
    overridden if explicit input and output formats are specified using
    '-t', '-f', '-r', or '-w' options. Documented in pandoc(1) man page
    and README.

  * Allow ordered list items to begin with (single) letters, as well
    as numbers. The list item marker may now be terminated either by
    '.' or by ')'. This extension to standard markdown is documented
    in README.
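As a small illustration of the marker rule just described, the check below accepts a run of digits or a single letter followed by '.' or ')'. The function name is hypothetical and the sketch looks only at the marker token itself; how the reader treats the whitespace after the marker is documented in README, not here.

```haskell
import Data.Char (isAlpha, isDigit)

-- | True if the token is an ordered-list marker: one or more digits,
-- or exactly one letter, followed by a '.' or ')' terminator.
isOrderedListMarker :: String -> Bool
isOrderedListMarker s =
  case span isDigit s of
    (_ : _, [end]) -> isTerminator end               -- "1." or "12)"
    ([], [c, end]) -> isAlpha c && isTerminator end  -- "a." or "B)"
    _ -> False
  where
    isTerminator = (`elem` ".)")
```

So "1.", "12)", "a." and "B)" are markers, while "ab." (two letters) and "1" (no terminator) are not.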
  * Revised footnote syntax. (See README for full details.) The
    '[^1]' format now standard in markdown extensions is supported,
    as are inline footnotes with this syntax: '^[My note.]'.
    The earlier footnote syntax '^(1)' is no longer supported.

  * Improved HTML representation of footnotes. All footnotes
    are now auto-numbered and appear in an ordered list at the
    end of the HTML document. Since the default appearance is now
    acceptable, the old footnote styles have been removed from the
    HTML header.

  * Bug fixes:

    + Fixed a serious bug in the markdown, LaTeX, and RST readers.
      These readers ran 'runParser' on processed chunks of text to handle
      embedded block lists in lists and quotation blocks. But then
      any changes made to the parser state in these chunks were lost,
      as the state is local to the parser. So, for example, footnotes
      didn't work in quotes or list items. The fix: instead of calling
      runParser on some raw text, use setInput to make it the input, then
      parse it, then use setInput to restore the input to what it was
      before. This is shorter and more elegant, and it fixes the problem.
    + Fixed bug in notFollowedBy' combinator (adding 'try' before
      'parser'). Adjusted code that uses this combinator accordingly.
    + Fixed bug in RTF writer that caused improper indentation on
      footnotes occurring in indented blocks like lists.
    + Fixed parsing of metadata in LaTeX reader. Now the title, author,
      and date are parsed correctly. Everything else in the preamble
      is skipped.
    + Modified escapedChar in LaTeX reader to allow a '\' at the end of a
      line to count as escaped whitespace.
    + Modified LaTeX reader to produce inline links rather than reference
      links. Otherwise, links in footnotes aren't handled properly.
    + Fixed handling of titles in links in Markdown reader, so that
      embedded quotation marks are now handled properly.
    + Fixed Markdown reader's handling of embedded brackets in links.
    + Fixed Markdown reader so that it only parses bracketed material
      as a reference link if there is actually a corresponding key.
    + Revised inline code parsing in Markdown reader to conform to
      markdown standard. Now any number of `s can begin inline code,
      which will end with the same number of `s. For example, to
      have two backticks as code, write ``` `` ```. Modified Markdown
      writer accordingly.
    + Fixed bug in text-wrapping routine in Markdown and RST writers.
      Now LineBreaks no longer cause wrapping problems.
    + Supported hexadecimal numerical entity references as well as
      decimal ones.
    + Fixed bug in Markdown reader's handling of underscores and other
      inline formatting markers inside reference labels: for example,
      in '[A_B]: /url/a_b', the material between underscores was being
      parsed as emphasized inlines.
    + Changed Markdown reader's handling of backslash escapes so that
      only non-alphanumeric characters can be escaped. Strict mode
      follows Markdown.pl in only allowing a select group of punctuation
      characters to be escaped.
    + Modified HTML reader to skip a newline following a <br> tag.
      Otherwise the newline will be treated as a space at the beginning
      of the next line.
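The inline code rule from the list above (a span opened by N backticks is closed by the next run of exactly N backticks) can be sketched with ordinary list functions. This is not the reader's Parsec code: 'inlineCode' is a hypothetical name, and trimming the surrounding spaces is an assumption based on Markdown.pl's behavior.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Try to read an inline code span at the start of the input.
-- Returns the code and the remaining input, or Nothing if the input
-- does not begin with a properly closed span.
inlineCode :: String -> Maybe (String, String)
inlineCode input =
  case span (== '`') input of
    ([], _) -> Nothing
    (open, rest) -> go (length open) "" rest
  where
    -- Accumulate content (reversed) until a run of exactly n backticks.
    go _ _ [] = Nothing
    go n acc s@(c : cs)
      | c == '`' =
          let (ticks, after) = span (== '`') s
          in if length ticks == n
               then Just (trim (reverse acc), after)
               else go n (reverse ticks ++ acc) after  -- shorter/longer run is content
      | otherwise = go n (c : acc) cs
    trim = dropWhileEnd isSpace . dropWhile isSpace
```

Applied to the example in the entry, a span opened and closed by three backticks around two backticks yields the two backticks as code, because the inner run has the wrong length to close the span.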
  * Made handling of code blocks more consistent. Previously, some
    readers allowed trailing newlines, while others stripped them.
    Now, all readers strip trailing newlines in code blocks. Writers
    insert a newline at the end of code blocks as needed.

  * Modified readers to make spacing at the end of output more consistent.

  * Minor improvements to LaTeX reader:

    + '\thanks' now treated like a footnote.
    + Simplified parsing of LaTeX command arguments and options.
      commandArgs now returns a list of arguments OR options (in
      whatever order they appear). The brackets are included, and
      a new stripFirstAndLast function is provided to strip them off
      when needed. This fixes a problem in dealing with \newcommand
      and \newenvironment.

  * Revised RTF writer:

    + Default font is now Helvetica.
    + An '\f0' is added to each '\pard', so that font resizing works
      correctly.

  * Moved handling of "smart typography" from the writers to the Markdown
    and LaTeX readers. This allows great simplification of the writers
    and more accurate smart quotes, dashes, and ellipses. DocBook can
    now use '<quote>'. The '--smart' option now toggles an option in
    the parser state rather than a writer option. Several new kinds
    of inline elements have been added: Quoted, Ellipses, Apostrophe,
    EmDash, EnDash.

  * Changes in HTML writer:

    + Include title block in header even when title is null.
    + Made javascript obfuscation of emails even more obfuscatory,
      by combining it with entity obfuscation.

  * Changed default ASCIIMathML text color to black.

  * Test suite:

    + Added --strip-trailing-cr option to diff in runtests.pl, for
      compatibility with Windows.
    + Added regression tests with footnotes in quote blocks and lists.

  * Makefile changes:

    + osx-pkg target creates a Mac OS X package (directory). New osx
      directory contains files needed for construction of the package.
    + osx-dmg target creates a compressed disk image containing the package.
    + win-pkg target creates Windows binary package.
    + tarball target creates distribution source tarball.
    + website target generates pandoc's website automatically, including
      demos. New 'web' directory contains files needed for construction
      of the website (which will be created as the 'pandoc' subdirectory
      of 'web').
    + Makefile checks to see if we're running Windows/Cygwin; if so,
      a '.exe' extension is added to each executable in EXECS.

  * Removed all wrappers except markdown2pdf and html2markdown.

  * Added new wrapper hsmarkdown, to be used as a drop-in replacement
    for Markdown.pl. hsmarkdown calls pandoc with the '--strict'
    option and disables other options.

  * Added code to html2markdown that tries to determine the character
    encoding of an HTML file, by parsing the "Content-type" meta tag.

    + If the encoding can't be determined, then if the content is local,
      the local encoding is used; if it comes from a URL, UTF-8 is used
      by default.
    + If input is from STDIN, don't try to determine character encoding.
    + Encoding can be specified explicitly using '-e' option.

  * Improved warning messages in wrappers:

    + Print warning if iconv not available
    + More user-friendly error messages in markdown2pdf, when
      pdflatex fails.

  * Code cleanup:

    + Renamed 'Text/Pandoc/HtmlEntities' module to
      'Text/Pandoc/Entities'. Also changed function names so as
      not to be HTML-specific.
    + Refactored SGML string escaping functions from HTML and Docbook
      writers into Text/Pandoc/Shared. (escapeSGML, stringToSGML)
    + Removed 'BlockQuoteContext' from ParserContext, as it isn't
      used anywhere.
    + Removed splitBySpace and replaced it with a general, polymorphic
      splitBy function.
    + Refactored LaTeX reader for clarity (added isArg function).
    + Converted some CR's to LF's in src/ui/default/print.css.
    + Added license text to top of source files.
    + Added module data for haddock to source files.
    + Reformatted code for consistency.
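A polymorphic splitBy of the kind mentioned in the cleanup list might look like the sketch below. The signature and the treatment of repeated separators are assumptions (runs of separators are collapsed here); the function in the source tree may behave differently.

```haskell
-- | Split a list on a separator element. Works for any element type
-- with equality, so splitting a String on ' ' is just one instance.
-- (Sketch: consecutive separators are treated as a single break.)
splitBy :: Eq a => a -> [a] -> [[a]]
splitBy _ [] = []
splitBy sep xs =
  let (chunk, rest) = break (== sep) xs   -- up to the next separator
      rest' = dropWhile (== sep) rest     -- skip the separator run
  in chunk : splitBy sep rest'
```

The old space-specific helper then becomes a one-liner such as `splitBy ' '`, and the same function can split lists of numbers or tokens.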
  * Rewrote documentation and man pages. Split README into INSTALL
    and README.

  * Split LICENSE into COPYING and COPYRIGHT.

  * Removed TODO, since we now maintain ToDo on the wiki.

  * Made COPYRIGHT in top level a symlink to debian/copyright, to avoid
    duplication.

  [ Recai Oktaş ]

  * Revamped build process to conform to debian standards and created
    a proper debian package. Closes: #391666.

  * Modified build process to support GHC 6.6.

    + The package can still be compiled using GHC 6.4.2, though because
      of dependencies the "make deb" target works only with GHC 6.6+.
    + The script 'cabalize' is used to create an appropriate
      'Pandoc.cabal' from 'Pandoc.cabal.in', depending on the GHC and
      Cabal versions.

  * Refactored template processing (fillTemplates.pl).

  * Modified wrapper scripts to make them more robust and portable.
    To avoid code duplication and ensure consistency, wrappers are
    generated via a templating system from templates in src/wrappers.

    + Wrappers now accept multiple filenames, when appropriate.
    + Spaces and tabs allowed in filenames.
    + getopts shell builtin is used for portable option parsing.
    + Improved html2markdown's web grabber code, making it more robust,
      configurable and verbose. Added '-e', '-g' options.

 -- Recai Oktaş <roktas@debian.org>  Fri, 05 Jan 2007 09:41:19 +0200

pandoc (0.2) unstable; urgency=low

  * Fixed unicode/utf-8 translation

 -- John MacFarlane <jgm@berkeley.edu>  Mon, 14 Aug 2006 00:00:00 -0400

pandoc (0.1) unstable; urgency=low

  * Initial creation of debian package

 -- John MacFarlane <jgm@berkeley.edu>  Mon, 14 Aug 2006 00:00:00 -0400