From 95b1327f6bdefd63b3a9241e2924bdf7bffe0156 Mon Sep 17 00:00:00 2001 From: John MacFarlane Date: Tue, 2 Mar 2021 14:23:02 -0800 Subject: [PATCH] Update changelog. --- changelog.md | 280 +++++++++++++++++++++++++++++++++------------------ 1 file changed, 182 insertions(+), 98 deletions(-) diff --git a/changelog.md b/changelog.md index b13bf1fd4..a48b668b5 100644 --- a/changelog.md +++ b/changelog.md @@ -2,6 +2,11 @@ ## pandoc 2.12 (UNRELEASED -- PROVISIONAL) + * When downloading content from URL arguments, be sensitive to + the character encoding (#5600). We can properly handle UTF-8 and latin1 + (ISO-8859-1); for others we raise an error. Fall back to latin1 if + no charset is given in the mime type and UTF-8 decoding fails. + * Add new unexported module Text.Pandoc.XML.Light, as well as Text.Pandoc.XML.Light.Types, Text.Pantoc.XML.Light.Proc, Text.Pandoc.XML.Light.Output. (Closes #6001, #6565, #7091). @@ -34,75 +39,6 @@ these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes. - * Text.Pandoc.App - - + Add `parseOptionsFromArgs` [API change, new exported function]. - - * Text.Pandoc.Citeproc.BibTeX - - + `Text.Pandoc.Citeproc.writeBibTeXString` now returns - `Doc Text` instead of `Text` (#7068). - + Correctly handle `pages` (= `page` in CSL) (#7067). - + Correctly handle BibLaTeX `langid` (= `language` in CSL, #7067). - + In BibTeX output, protect foreign titles since there's no language - field (#7067). - + Clean up BibTeX parsing (#7049). Previously there was a messy code - path that gave strange results in some cases, not passing through raw - tex but trying to extract a string content. This was an artefact of - trying to handle some special bibtex-specific commands in the BibTeX - reader. Now we just handle these in the LaTeX reader and simplify - parsing in the BibTeX reader. This does mean that more raw tex will - be passed through (and currently this is not sensitive to the - `raw_tex` extension; this should be fixed). - - * Text.Pandoc.Citeproc.MetaValue - - + Correctly parse "raw" date value in markdown references metadata. - (See jgm/citeproc#53.) - - * Text.Pandoc.Class - - + Add `getTimestamp` [API change]. This attempts to read the - `SOURCE_DATE_EPOCH` environment variable and parse a UTC time - from it (treating it as a unix date stamp, see - https://reproducible-builds.org/specs/source-date-epoch/). If the - variable is not set or can't be parsed as a unix date stamp, then the - function returns the current date. - - * Text.Pandoc.Error - - + Remove unused variables (Albert Krewinkel) - + Export `renderError` [API change]. - + Refactor `handleError` to use `renderError`. This allows us render - error messages without exiting. - - * Text.Pandoc.Extensions - - + `Ext_task_lists` is now supported by org (and turned - on by default) (Albert Krewinkel, #6336). - + Remove `Ext_fenced_code_attributes` from allowed commonmark attributes - (#7097). This attribute was listed as allowed, but it didn't actually - do anything. Use `attributes` for code attributes and more. - - * Lua subsystem: - - + Always load built-in Lua scripts from default data-dir (Albert - Krewinkel). The Lua modules `pandoc` and `pandoc.List` are now always - loaded from the system's default data directory. Loading from a - different directory by overriding the default path, e.g. via - `--data-dir`, is no longer supported to avoid unexpected behavior - and to address security concerns. - + Add module "pandoc.path" (Albert Krewinkel, #6001, #6565). - The module allows to work with file paths in a convenient and - platform-independent manner. - - * Text.Pandoc.PDF - - + Disable `smart` extension when building PDF via LaTeX. - This is to prevent accidental creation of ligatures like - `` ?` `` and `` !` `` (especially in languages with quotations like - German), and similar ligature issues. (See jgm/citeproc#54.) - * DocBook reader: + Avoid expensive tree normalization step, as it is not necessary @@ -137,6 +73,11 @@ * LaTeX reader: + + Don't export `tokenize`, `untokenize` [API change]. These are internal + implementation details, which were only exported for testing. + They don't belong in the public API. + + Improved efficiency of the parser. With these changes the reader + is almost twice as fast as in the last release in our benchmarks. + Code cleanup, removing some unnecessary things. + Rewrite `withRaw` so it doesn't rely on fragile assumptions about token positions (which break when macros are expanded) @@ -156,11 +97,19 @@ it doesn't change anything. + Improve `braced'`. Remove the parameter, have it parse the opening brace, and make it more efficient. + + Factor out pieces of the LaTeX reader to make the module smaller. + This reduces memory demands when compiling. Created + Text.Pandoc.Readers.{LaTeX,Math,Citation,Table,Macro,Accent}. + Changed Text.Pandoc.Readers.LaTeX.SIunitx to export a command map + instead of individual commands. + + * Make Text.Pandoc.Readers.LaTeX.Types an unexported module [API change]. * Markdown reader: + Improved handling of mmd link attributes in references (#7080). Previously they only worked for links that had titles. + + Improved efficiency of the parser (benchmarks show a 15% speedup). * OPML reader: @@ -218,7 +167,7 @@ + Use `getTimestamp` instead of `getCurrentTime` for timestamp. Setting `SOURCE_DATE_EPOCH` will allow reproducible builds. - * Text.Pandoc.Writers.EPUB + * EPUB writer: + Use `getTimestamp` instead of `getCurrentTime` for timestamp. Setting `SOURCE_DATE_EPOCH` will allow reproducible builds (#7093). @@ -233,6 +182,10 @@ contains an ampersand or another character with a special meaning in XML. + * Jira writer: + + + Use Span identifiers as anchors (Albert Krewinkel, + * LaTeX writer: + Adjust hypertargets to beginnings of paragraphs (#7078). @@ -241,6 +194,8 @@ This makes a particular difference for links to citations using `--citeproc` and `link-citations: true`. + Change BCP47 lang tag from `jp` to `ja` (Mauro Bieg, #7047). + + Use function instead of map for accent lookup (should be + more efficient). * Markdown writer: @@ -279,40 +234,151 @@ `--include-in-header`, `--include-before-body`, `--include-after-body`) may be used. - * LaTeX template: Update to iftex package (#7073) (Andrew Dunning) + * LaTeX template: + + + Update to iftex package (#7073) (Andrew Dunning) + + Wrap url colours in braces (#7121) (Loïc Grobol). * revealjs template: Add 'center' option for vertical slide centering. (maurerle, #7104). * Text.Pandoc.XML: Improve efficiency of `fromEntities`. - * Test suite: a more robust way of testing the executable. - Many of our tests require running the pandoc executable. This is - problematic for a few different reasons. First, cabal-install will - sometimes run the test suite after building the library but before - building the executable, which means the executable isn't in place for - the tests. One can work around that by first building, then building and - running the tests, but that's fragile. Second, we have to find the - executable. So far, we've done that using a function `findPandoc` that - attempts to locate it relative to the test executable (which can be - located using findExecutablePath). But the logic here is delicate and - work with every combination of options. To solve both problems, we add - an `--emulate` option to the `test-pandoc` executable. When `--emulate` - occurs as the first argument passed to `test-pandoc`, the program simply - emulates the regular pandoc executable, using the rest of the arguments - (after `--emulate`). Thus, `test-pandoc --emulate -f markdown -t latex` - is just like `pandoc -f markdown -t latex`. - Since all the work is done by library functions, implementing this - emulation just takes a couple lines of code and should be entirely - reliable. With this change, we can test the pandoc executable by running - the test program itself (locatable using `findExecutablePath`) with the - `--emulate` option. This removes the need for the fragile `findPandoc` - step, and it means we can run our integration tests even when we're just - building the library, not the executable. [Note: part of this change - involved simplifying some complex handling to set environment variables - for dynamic library paths. I have tested a build with - `--enable-dynamic-executable`, and it works, but further testing may be - needed.] + * Text.Pandoc.MIME + + + Add exported function `getCharset` [API change]. + + * Text.Pandoc.UTF8: change IO functions to return Text, not String + [API change]. This affects `readFile`, `getContents`, `writeFileWith`, + `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`. + `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`. + This avoids the need to uselessly create a linked list of characters + when emiting output. + + * Text.Pandoc.App + + + Add `parseOptionsFromArgs` [API change, new exported function]. + + * Text.Pandoc.Citeproc.BibTeX + + + `Text.Pandoc.Citeproc.writeBibTeXString` now returns + `Doc Text` instead of `Text` (#7068). + + Correctly handle `pages` (= `page` in CSL) (#7067). + + Correctly handle BibLaTeX `langid` (= `language` in CSL, #7067). + + In BibTeX output, protect foreign titles since there's no language + field (#7067). + + Clean up BibTeX parsing (#7049). Previously there was a messy code + path that gave strange results in some cases, not passing through raw + tex but trying to extract a string content. This was an artefact of + trying to handle some special bibtex-specific commands in the BibTeX + reader. Now we just handle these in the LaTeX reader and simplify + parsing in the BibTeX reader. This does mean that more raw tex will + be passed through (and currently this is not sensitive to the + `raw_tex` extension; this should be fixed). + + * Text.Pandoc.Citeproc.MetaValue + + + Correctly parse "raw" date value in markdown references metadata. + (See jgm/citeproc#53.) + + * Text.Pandoc.Citeproc + + + Use https URLs for links (Salim B, #7122). + + * Text.Pandoc.Class + + + Add `getTimestamp` [API change]. This attempts to read the + `SOURCE_DATE_EPOCH` environment variable and parse a UTC time + from it (treating it as a unix date stamp, see + https://reproducible-builds.org/specs/source-date-epoch/). If the + variable is not set or can't be parsed as a unix date stamp, then the + function returns the current date. + + * Text.Pandoc.Error + + + Add `PandocUnsupportedCharsetError` constructor for + `PandocError` [API change]. + + Export `renderError` [API change]. + + Refactor `handleError` to use `renderError`. This allows us render + error messages without exiting. + + Remove unused variables (Albert Krewinkel) + + * Text.Pandoc.Extensions + + + `Ext_task_lists` is now supported by org (and turned + on by default) (Albert Krewinkel, #6336). + + Remove `Ext_fenced_code_attributes` from allowed commonmark attributes + (#7097). This attribute was listed as allowed, but it didn't actually + do anything. Use `attributes` for code attributes and more. + + * Lua subsystem: + + + Always load built-in Lua scripts from default data-dir (Albert + Krewinkel). The Lua modules `pandoc` and `pandoc.List` are now always + loaded from the system's default data directory. Loading from a + different directory by overriding the default path, e.g. via + `--data-dir`, is no longer supported to avoid unexpected behavior + and to address security concerns. + + Add module "pandoc.path" (Albert Krewinkel, #6001, #6565). + The module allows to work with file paths in a convenient and + platform-independent manner. + + Use strict evaluation when retrieving AST value from the stack + (Albert Krewinkel, #6674). + + * Text.Pandoc.PDF + + + Disable `smart` extension when building PDF via LaTeX. + This is to prevent accidental creation of ligatures like + `` ?` `` and `` !` `` (especially in languages with quotations like + German), and similar ligature issues. (See jgm/citeproc#54.) + + * Text.Pandoc.CSV: + + + Fix parsing of unquoted values (#7112). Previously we didn't allow + unescaped quotes in unquoted values, but they are allowed + in CSV. + + * Test suite: + + + Use a more robust method for testing the executable. Many + of our tests require running the pandoc executable. This + is problematic for a few different reasons. First, + cabal-install will sometimes run the test suite after + building the library but before building the executable, + which means the executable isn't in place for the tests. + One can work around that by first building, then building + and running the tests, but that's fragile. Second, we + have to find the executable. So far, we've done that using + a function `findPandoc` that attempts to locate it + relative to the test executable (which can be located + using findExecutablePath). But the logic here is delicate + and work with every combination of options. To solve both + problems, we add an `--emulate` option to the + `test-pandoc` executable. When `--emulate` occurs as the + first argument passed to `test-pandoc`, the program simply + emulates the regular pandoc executable, using the rest of + the arguments (after `--emulate`). Thus, `test-pandoc + --emulate -f markdown -t latex` is just like `pandoc -f + markdown -t latex`. Since all the work is done by library + functions, implementing this emulation just takes a couple + lines of code and should be entirely reliable. With this + change, we can test the pandoc executable by running the + test program itself (locatable using `findExecutablePath`) + with the `--emulate` option. This removes the need for the + fragile `findPandoc` step, and it means we can run our + integration tests even when we're just building the + library, not the executable. [Note: part of this change + involved simplifying some complex handling to set + environment variables for dynamic library paths. I have + tested a build with `--enable-dynamic-executable`, and it + works, but further testing may be needed.] + + Print accurate location if a test fails (Albert + Krewinkel). Ensures that tasty-hunit reports the location + of the failing test instead of the location of the helper + `test` function. + + * Documentation: Update URLs and use `https` where possible (#7122, + Salim B). * MANUAL.txt @@ -328,6 +394,9 @@ * Makefile: in `make bench`, create CSV files for comparison and compare against previous benchmark run. Add timestamp to CSV filenames. + * cabal.project: don't explicitly set -trypandoc. + If we do, this can't be overridden on the cabal command line. + * doc/lua-filters.md: improve documentation for `pandoc.mediabag.insert`, `pandoc.mediabag.fetch`, `directory`, `normalize` (Albert Krewinkel). @@ -339,6 +408,14 @@ * Require citeproc 0.3.0.7, which correctly titlecases when titles contain non-ASCII characters. + * Use skylighting 0.10.4. This version of skylighting uses xml-conduit + rather than hxt. This speeds up parsing of XML syntax definitions + fourfold, and removes four packages from pandoc's dependency graph: + hxt-charproperties, hxt-unicode, hxt-regex-xmlschema, hxt. + + * Add script `tools/parseTimings.pl` to help pin down which + modules take the most time and memory to compile. + * Avoid unnecessary use of NoImplicitPrelude pragma (#7089) (Albert Krewinkel) @@ -353,6 +430,13 @@ + Clean up benchmark code. + Allow specifying patterns using `-p blah'. + * trypandoc: add 2 second timeout. + + * Use `-split-sections` in creating linux release binary. + This reduces executable size significantly (by about 30%). + + * Remove `weigh-pandoc`. It's not really useful any more, now that our + regular benchmarks include data on allocation. ## pandoc 2.11.4 (2021-01-22)