Commit graph

9141 commits

Author SHA1 Message Date
John MacFarlane
6f6e83a06e Parsing: added takeP, takeWhileP for efficient parsing of [Char]. 2017-07-07 12:37:21 +02:00
John MacFarlane
0feb7504b1 Rewrote LaTeX reader with proper tokenization.
This rewrite is primarily motivated by the need to
get macros working properly.  A side benefit is that the
reader is significantly faster (27s -> 19s in one
benchmark, and there is a lot of room for further
optimization).

We now tokenize the input text, then parse the token stream.

Macros modify the token stream, so they should now be effective
in any context, including math. Thus, we no longer need the clunky
macro processing capacities of texmath.

A custom state LaTeXState is used instead of ParserState.
This, plus the tokenization, will require some rewriting
of the exported functions rawLaTeXInline, inlineCommand,
rawLaTeXBlock.

* Added Text.Pandoc.Readers.LaTeX.Types (new exported module).
  Exports Macro, Tok, TokType, Line, Column.  [API change]
* Text.Pandoc.Parsing: adjusted type of `insertIncludedFile`
  so it can be used with token parser.
* Removed old texmath macro stuff from Parsing.
  Use Macro from Text.Pandoc.Readers.LaTeX.Types instead.
* Removed texmath macro material from Markdown reader.
* Changed types for Text.Pandoc.Readers.LaTeX's
  rawLaTeXInline and rawLaTeXBlock.  (Both now return a String,
  and they are polymorphic in state.)
* Added orgMacros field to OrgState.  [API change]
* Removed readerApplyMacros from ReaderOptions.
  Now we just check the `latex_macros` reader extension.
* Allow `\newcommand\foo{blah}` without braces.

Fixes #1390.
Fixes #2118.
Fixes #3236.
Fixes #3779.
Fixes #934.
Fixes #982.
2017-07-07 12:36:00 +02:00
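The tokenize-then-parse idea in commit 0feb7504b1 can be illustrated with a small sketch. It is written in Python rather than pandoc's Haskell, and `TOKEN_RE`, `tokenize`, and `expand` are hypothetical names, not part of pandoc. It assumes argument-less macros only and has no guard against self-referential definitions.

```python
import re

# Hypothetical token pattern: a control word (\name), a control symbol
# (backslash plus one character), or any single other character.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|.", re.DOTALL)

def tokenize(text):
    """Split LaTeX-like input into a flat list of token strings."""
    return TOKEN_RE.findall(text)

def expand(tokens, macros):
    """Replace argument-less macros by their replacement tokens."""
    out = []
    stack = list(reversed(tokens))
    while stack:
        tok = stack.pop()
        if tok in macros:
            # Push the replacement back so nested macros expand too.
            # Assumption: no macro refers to itself, or this never ends.
            stack.extend(reversed(macros[tok]))
        else:
            out.append(tok)
    return out

# Roughly what \newcommand\foo{blah} would register in this sketch.
macros = {r"\foo": tokenize("blah")}
result = "".join(expand(tokenize(r"$x + \foo$ and \foo"), macros))
```

Because expansion works on the token stream and not on parsed structures, the same macro is replaced inside and outside the math delimiters.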
John MacFarlane
1dd769e558 Logging: added MacroAlreadyDefined. 2017-07-06 14:50:51 +02:00
John MacFarlane
8300b3fbda MANUAL: document ibooks specific epub metadata. 2017-06-30 23:56:01 +02:00
John MacFarlane
20103ac2bc Allow ibooks-specific metadata in epubs. Closes #2693.
You can now have the following fields in your YAML metadata,
and it will be treated appropriately in the generated
EPUB.

```
ibooks:
  version: 1.3.4
  specified-fonts: false
  ipad-orientation-lock: portrait-only
  iphone-orientation-lock: landscape-only
  binding: true
  scroll-axis: vertical
```

This commit also fixes a regression in stylesheet paths.
2017-06-30 23:33:51 +02:00
John MacFarlane
57122f3760 Updated stack.pkg.yaml. 2017-06-30 23:26:17 +02:00
John MacFarlane
d3dae1200a Removed hard_line_breaks extension from markdown_github.
GitHub has two Markdown modes, one for long-form documents like READMEs
and one for short things like issue comments. In issue comments, a line
break is treated as a hard line break. In README, wikis, etc., it is
treated as a space as in regular Markdown.

Since pandoc is more likely to be used to convert long-form
documents from GitHub Markdown, `-hard_line_breaks` is a better
default.

Closes #3594.
2017-06-30 22:26:42 +02:00
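The difference between the two modes described in commit d3dae1200a can be shown with a toy renderer. This is only a sketch: `render_paragraph` is a hypothetical function, and the HTML output format is an assumption chosen for illustration.

```python
def render_paragraph(text, hard_line_breaks=False):
    """Render one paragraph, treating newlines as spaces or hard breaks."""
    # With hard_line_breaks, each newline becomes an explicit break;
    # otherwise it is treated as a space, as in regular Markdown.
    separator = "<br />\n" if hard_line_breaks else " "
    return "<p>" + separator.join(text.split("\n")) + "</p>"

readme_style = render_paragraph("first line\nsecond line")
comment_style = render_paragraph("first line\nsecond line", hard_line_breaks=True)
```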
John MacFarlane
69b2cb38a8 Make east_asian_line_breaks affect all readers/writers.
Closes #3703.
2017-06-30 22:23:15 +02:00
John MacFarlane
fa43c0feaa Updated jats tests for new texmath version. 2017-06-30 20:50:37 +02:00
John MacFarlane
69388d52e3 Use latest texmath. 2017-06-30 20:39:53 +02:00
John MacFarlane
e574d50b1c Markdown writer: Ensure that + and - are escaped properly...
so they don't cause spurious lists.  Previously they were escaped only
if followed by a space, not if they were at end of line.

Closes #3773.
2017-06-30 17:41:25 +02:00
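The escaping rule from commit e574d50b1c can be sketched as follows. `BULLET_RE` and `escape_bullets` are hypothetical names, and the sketch assumes that only a `+` or `-` at the start of a line (after optional indentation) needs attention.

```python
import re

# Assumption: a line that starts with + or - followed by a space OR by
# the end of the line could be read back as a bullet list item.
BULLET_RE = re.compile(r"^([ \t]*)([+-])(?= |$)", re.MULTILINE)

def escape_bullets(text):
    """Backslash-escape leading + and - that would start a list."""
    return BULLET_RE.sub(r"\1\\\2", text)
```

The `$` alternative in the lookahead covers the end-of-line case that the commit message says was previously missed.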
John MacFarlane
5e00cf8086 Added parameter for user data directory to runLuaFilter.
in Text.Pandoc.Lua.  Also to pushPandocModule.

This change allows users to override pandoc.lua with a file
in their local data directory, adding custom functions, etc.

@tarleb, if you think this is a bad idea, you can revert this.
But in general our data files are all overridable.
2017-06-29 17:13:19 +02:00
John MacFarlane
0f658eb46c data/pandoc.lua: regularize constructors.
We now use Pandoc instead of Doc (though Doc remains a deprecated
synonym), and we deprecate DoubleQuoted, SingleQuoted,
InlineMath, and DisplayMath.
2017-06-29 17:08:59 +02:00
John MacFarlane
cb25326fa3 Text.Pandoc.Lua: more code simplification.
Also, now we check before running walkM that the function
table actually does contain something relevant.  E.g. if
your filter just defines Str, there's no need to run walkM
for blocks, meta, or the whole document. This should
help performance a bit (and it does, in my tests).
2017-06-29 17:07:30 +02:00
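The check described in commit cb25326fa3 amounts to deciding which traversals are worth running before walking the document. A rough sketch, in which the element-type sets and the `passes_needed` function are hypothetical and much smaller than pandoc's real type lists:

```python
# Assumption: tiny stand-ins for the real inline and block element types.
INLINE_TYPES = {"Str", "Emph", "Space"}
BLOCK_TYPES = {"Para", "Header"}

def passes_needed(filter_table):
    """Return the traversals that this filter table can actually affect."""
    names = set(filter_table)
    passes = []
    if names & INLINE_TYPES:
        passes.append("inlines")
    if names & BLOCK_TYPES:
        passes.append("blocks")
    if "Meta" in names:
        passes.append("meta")
    if "Pandoc" in names:
        passes.append("document")
    return passes

# A filter that only defines Str needs the inline pass and nothing else.
str_only = passes_needed({"Str": lambda el: el})
```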
John MacFarlane
780a65f8a8 Lua filters: Remove special treatment of Quoted, Math.
No more SingleQuoted, DoubleQuoted, InlineMath, DisplayMath.
This makes everything uniform and predictable, though it does
open up a difference between Lua filters and custom writers.
2017-06-29 15:47:27 +02:00
John MacFarlane
5c80aca0e2 Text.Pandoc.Lua: refactored to remove duplicated code. 2017-06-29 14:31:02 +02:00
John MacFarlane
6ad74815f6 Text.Pandoc.Lua: use generics to reduce boilerplate.
I tested this with the str.lua filter on MANUAL.txt, and
I could see no significant performance degradation.

Doing things this way will ease maintenance, as we won't
have to manually modify this module when types change.

@tarleb, do we really need special cases for things like
DoubleQuoted and InlineMath?
2017-06-29 14:00:26 +02:00
John MacFarlane
c505de59c0 Added link-table example to doc/lua-filters.md. 2017-06-28 15:31:42 +02:00
John MacFarlane
2902260b63 Make papersize: a4 work regardless of the case of a4.
It is converted to `a4` in LaTeX and `A4` in ConTeXt.
2017-06-28 15:07:35 +02:00
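The case handling in commit 2902260b63 might look like the sketch below. `normalize_papersize` and the writer names used as dictionary keys are assumptions for illustration; values other than `a4` are passed through untouched because the commit message does not say how they are treated.

```python
def normalize_papersize(value, writer):
    """Map any spelling of a4 to the form the target format expects."""
    if value.lower() != "a4":
        return value  # assumption: other values are left unchanged
    return {"latex": "a4", "context": "A4"}.get(writer, value)
```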
John MacFarlane
7ea49da002 LaTeX template: added natbiboptions variable.
Closes #3768.
2017-06-28 14:34:12 +02:00
Alexander Krotov
79cc56726c Muse reader: parse indented blockquotes (#3769) 2017-06-28 14:32:53 +02:00
John MacFarlane
cd690d0401 LaTeX writer: fixed detection of otherlangs.
We weren't recursing into inline contexts.

Closes #3770.
2017-06-28 14:20:53 +02:00
Albert Krewinkel
f5f8485923 Text.Pandoc.Lua: catch lua errors in filter functions
Replace lua errors with `LuaException`s.
2017-06-27 17:55:47 +02:00
Albert Krewinkel
4282abbd07 Text.Pandoc.Lua: keep element unchanged if filter returns nil
This was suggested by jgm and is consistent with the behavior of other
filtering libraries.
2017-06-27 17:11:42 +02:00
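The behavior introduced in commit 4282abbd07 can be sketched in Python, with `None` standing in for Lua's `nil`. The dictionary-shaped elements and the names `apply_filter` and `upper_str` are hypothetical.

```python
def apply_filter(fn, element):
    """Run fn on element; a None result means 'leave it as it is'."""
    result = fn(element)
    return element if result is None else result

def upper_str(el):
    if el["t"] == "Str":
        return {"t": "Str", "c": el["c"].upper()}
    # Falling off the end returns None, so other elements are kept.

doc = [{"t": "Str", "c": "hi"}, {"t": "Space"}]
filtered = [apply_filter(upper_str, el) for el in doc]
```

With this convention a filter function only has to handle the cases it wants to change.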
Albert Krewinkel
beb78a552c Text.Pandoc.Lua: simplify filter function runner
The code still allowed passing an arbitrary number of arguments to the
filter function, as element properties were passed as function arguments
at some point. Now we only pass the element as the single arg, so the
code to handle multiple arguments is no longer necessary.
2017-06-27 17:11:42 +02:00
Albert Krewinkel
4fab0d2c30 data/pandoc.lua: add accessors to Table elements 2017-06-27 17:11:42 +02:00
John MacFarlane
7d9d77ca44 Require nonempty alt text for implicit_figures.
A figure with an empty caption doesn't make sense.

Closes #2844.
2017-06-27 15:25:37 +02:00
John MacFarlane
33a29fbf87 RST reader: support anchors.
E.g.

    `hello`

    .. _hello:

    paragraph

This is supported by putting "paragraph" in a Div with
id `hello`.

Closes #262.
2017-06-27 15:03:16 +02:00
John MacFarlane
563c9c8687 RST reader: Handle chained link definitions.
For example,

    .. _hello:
    .. _goodbye: example.com

Here both `hello` and `goodbye` should link to `example.com`.

Fixes the first part of #262.
2017-06-27 14:35:03 +02:00
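The chained-definition rule from commit 563c9c8687 can be sketched as a single pass over the lines. `TARGET_RE` and `resolve_targets` are hypothetical names, and the sketch makes a simplifying assumption: any non-target line, including a blank one, ends the chain.

```python
import re

# Matches lines such as ".. _name:" or ".. _name: url".
TARGET_RE = re.compile(r"^\.\. _([^:]+):[ \t]*(.*)$")

def resolve_targets(lines):
    """Map each target name to the URL that closes its chain."""
    targets = {}
    pending = []  # names seen so far that have no URL yet
    for line in lines:
        match = TARGET_RE.match(line)
        if not match:
            pending = []  # assumption: a non-target line breaks the chain
            continue
        name, url = match.group(1), match.group(2).strip()
        pending.append(name)
        if url:
            for pending_name in pending:
                targets[pending_name] = url
            pending = []
    return targets

links = resolve_targets([".. _hello:", ".. _goodbye: example.com"])
```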
John MacFarlane
a868b238f2 Docx writer: Allow 9 list levels.
Closes #3519.
2017-06-27 12:42:56 +02:00
John MacFarlane
cf3b9a4058 Removed redundant element from data/docx/word/numbering.xml.
The elements we need are generated when the document is
compiled; this didn't do anything.
2017-06-27 12:39:13 +02:00
bucklereed
460b6c470b HTML reader: Use the lang value of <html> to set the lang meta value. (#3765)
* HTML reader: Use the lang value of <html> to set the lang meta value.

* Fix for pre-AMP environments.
2017-06-27 10:19:37 +02:00
John MacFarlane
19d9482fc4 OpenDocument/ODT writer: Added support for table of contents.
Closes #2836.

Thanks to @anayrat.
2017-06-26 16:46:56 +02:00
John MacFarlane
6773447c8c Support --toc in opendocument/odt. 2017-06-26 16:31:28 +02:00
John MacFarlane
75f4e41d7d Use table-of-contents for contents of toc, make toc a boolean.
Changed markdown, rtf, and HTML-based templates accordingly.

This allows you to set `toc: true` in the metadata; this
previously produced strange results in some output formats.

Closes #2872.

For backwards compatibility, `toc` is still set to the
toc contents.  But it is recommended that you update templates
to use `table-of-contents` for the toc contents and `toc`
for a boolean flag.
2017-06-26 16:20:09 +02:00
Alexander Krotov
fa515e46f3 Muse writer: fix hlint errors (#3764) 2017-06-26 15:07:45 +02:00
John MacFarlane
b2fe009d8f LaTeX writer: use BCP47 parser. 2017-06-26 15:04:22 +02:00
John MacFarlane
700a0843b2 parseBCP47: Parse extensions and private-use as variants.
Even though officially they aren't.  This suffices
for our purposes.
2017-06-26 15:03:51 +02:00
Yuchen Pei
f09473eab7 minor updates to vimwiki reader. (#3759)
- updated comments in Vimwiki.hs to reflect current status of
implementation
- added vimwiki to trypandoc
2017-06-26 08:41:51 +02:00
Alexander Krotov
492b3b1291 Muse reader: fix horizontal rule parsing (#3762)
Do not parse 3 dashes as horizontal rule and allow whitespace after rule
2017-06-26 08:41:17 +02:00
Alexander Krotov
b95f391beb Muse reader: simplify para implementation (#3761) 2017-06-26 08:40:53 +02:00
John MacFarlane
4cbbc9dd58 BCP47: split toLang from getLang, rearranged types. 2017-06-25 23:16:55 +02:00
John MacFarlane
d0d2443f2e Refactored ConTeXt writer to use BCP47.
BCP47 - consistent case for BCP47 fields (e.g. uppercase
for region).
2017-06-25 21:56:29 +02:00
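The BCP47 handling in these commits (consistent case per field, with everything after the region collected as variants) can be sketched as below. This is not pandoc's parsec parser: `parse_lang` and its dictionary result are hypothetical, and the subtag length rules are a simplified reading of BCP 47.

```python
def parse_lang(tag):
    """Parse a language tag into normalized fields, or None if invalid."""
    parts = tag.split("-")
    lang = parts.pop(0)
    if not (lang.isascii() and lang.isalpha() and 2 <= len(lang) <= 3):
        return None
    result = {"language": lang.lower(), "script": "", "region": "", "variants": []}
    # Script: four letters, written in title case (e.g. Hant).
    if parts and len(parts[0]) == 4 and parts[0].isalpha():
        result["script"] = parts.pop(0).title()
    # Region: two letters or three digits, written in upper case (e.g. US).
    if parts and ((len(parts[0]) == 2 and parts[0].isalpha())
                  or (len(parts[0]) == 3 and parts[0].isdigit())):
        result["region"] = parts.pop(0).upper()
    # Assumption: whatever remains is kept as a lower-cased variant.
    result["variants"] = [p.lower() for p in parts]
    return result
```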
John MacFarlane
ac9423eccc Moved BCP47 specific functions from Writers.Shared to new module.
Text.Pandoc.BCP47 (unexported, internal module).
`getLang`, `Lang(..)`, `parseBCP47`.
2017-06-25 21:00:35 +02:00
John MacFarlane
643cbdf104 Writers.Shared: improve type of Lang and bcp47 parser.
Use a real parsec parser for BCP47, include variants.
2017-06-25 18:31:59 +02:00
John MacFarlane
a85d833576 Fixed log message for InvalidLang. 2017-06-25 15:52:30 +02:00
John MacFarlane
e7cd3cb466 Writers.Shared: refactored getLang, splitLang...
into `Lang(..)`, `getLang`, `parseBCP47`.
2017-06-25 15:36:30 +02:00
John MacFarlane
3ae4105d14 Fixed support for lang attribute in OpenDocument and ODT writers.
This improves on the last commit, which didn't work in
some important ways.

See #1667.
2017-06-25 13:49:40 +02:00
John MacFarlane
083a224d1e Support lang attribute in OpenDocument and ODT writers.
This adds the required attributes to the temporary styles,
and also replaces existing language attributes in styles.xml.

Support for lang attributes on Div and Span has also been
added.

Closes #1667.
2017-06-25 12:46:26 +02:00
John MacFarlane
a02f08c9fc Added InvalidLang to LogMessage. 2017-06-25 12:46:26 +02:00