Commit graph

149 commits

Author SHA1 Message Date
John MacFarlane
40d8100d44 Use texmath 0.7 interface. 2014-08-04 11:13:09 -07:00
Matthew Pickering
95fb0755c1 Parsing: Added isbn and pmid schemes 2014-07-27 19:59:57 +01:00
Matthew Pickering
5e2d22a27e Generalised more in Parsing.hs to enable the use of custom state 2014-07-26 21:23:49 +01:00
Matthew Pickering
83028c5982 Exported runParserT and Stream 2014-07-22 16:37:41 +01:00
Matthew Pickering
e045b1d5f2 Generalised readWith to readWithM 2014-07-22 15:13:54 +01:00
John MacFarlane
b6c769084e Fix behavior of markdown_attribute extension.
It now works as in PHP Markdown Extra.  Setting `markdown="1"` on
an outer tag affects all contained tags until it is reversed with
`markdown="0"`.  Closes #1378.

Added `stateMarkdownAttribute` to `ParserState`.
2014-07-20 17:44:28 -07:00
John MacFarlane
cdc4ecbe98 readWith: reverted generalization from f201bdcb.
We need input to be a string so we can print the offending line
on an error.
2014-07-20 13:51:03 -07:00
John MacFarlane
47a5f04761 Parsing: Simplified dash and ellipsis.
This originated with @dubiousjim's observation in #1419
that there was a typo in the definition of enDash.
It returned an em dash character instead of an en dash.

I thought about why this had not been noticed before, and
realized that en dashes were just being parsed as regular
symbols.

That made me realize that, now that we no longer have
dedicated EnDash, EmDash, and Ellipses inline elements, as
we used to in pandoc, we no longer need to parse the
unicode characters specially.  This allowed a considerable
simplification of the code.

Partially resolves #1419.
2014-07-12 23:44:56 -07:00
John MacFarlane
4676bfdf82 Removed space at ends of lines in source. 2014-07-12 22:57:22 -07:00
Matthew Pickering
72fe742ca0 Removed inline fmap from Parsing.hs
Replaced all inline occurrences of fmap with the more idiomatic (<$>).
2014-07-11 12:53:31 +01:00
Matthew Pickering
2fb8063f78 Removed (>>~) function
This function is equivalent to the more general (<*) which is defined in
Control.Applicative. This change makes pandoc code easier to understand for
those not familiar with the codebase.
2014-07-11 12:51:26 +01:00
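To illustrate the equivalence claimed in the commit above, here is a minimal sketch. The definition of `(>>~)` below is a reconstruction assumed for illustration, not copied from the pandoc source, and `ReadP` from base stands in for pandoc's Parsec-based parsers.

```haskell
import Control.Applicative ((<*))
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S)

-- Assumed shape of the removed helper: run both actions in order,
-- keep the result of the first.
(>>~) :: Monad m => m a -> m b -> m a
a >>~ b = a >>= \x -> b >> return x

-- One or more digits.
number :: ReadP String
number = munch1 isDigit

-- The same parser written with the old helper and with (<*).
oldStyle, newStyle :: ReadP String
oldStyle = number >>~ char ';' >>~ eof
newStyle = number <* char ';' <* eof

main :: IO ()
main = do
  print (readP_to_S oldStyle "42;")
  print (readP_to_S newStyle "42;")
```

Both parsers consume the trailing `;` and return only the digits, which is why the dedicated operator could be dropped in favour of the standard one.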
Matthew Pickering
f201bdcb58 Generalised all functions in Parsing.hs
Previously it wasn't possible to use these general combinators with the ParsecT
transformer, but with the more general types this is now possible.
2014-07-11 12:45:34 +01:00
John MacFarlane
3d4e76f342 Parsing: Added stateInHtmlBlock to ParserState.
This is used to keep track of the ending tag we're waiting
for when we're parsing inside HTML block tags.
2014-07-07 15:53:59 -06:00
John MacFarlane
2e80613451 Markdown reader: inline math must have nonspace before final $.
Closes #1313.
2014-05-27 11:59:28 -07:00
Albert Krewinkel
2423f9e6b1 Move citeKey from Readers.Markdown to Parsing
The function can be used by other readers, so it is made accessible for
all parsers.
2014-05-14 14:58:05 +02:00
Albert Krewinkel
9df589b9c5 Introduce class HasLastStrPosition, generalize functions
Both `ParserState` and `OrgParserState` keep track of the parser position at
which the last string ended.  This patch introduces a new class
`HasLastStrPosition` and makes the above types instances of that class.  This
enables the generalization of functions updating the state or checking if one
is right after a string.
2014-05-14 14:57:00 +02:00
Albert Krewinkel
8fdbef841d Update copyright notices for 2014, add missing notices 2014-05-09 00:46:08 +02:00
John MacFarlane
743dac493f LaTeX reader: Better error messages with include files.
Closes #1274.

Rewrote handleIncludes.

We now report the actual source file and position where the error
occurs, even if it is included.  We do this by inserting special
commands, `\PandocStartInclude` and `\PandocEndInclude`, that encode
this information in the preprocessing phase.

Also generalized the types of a couple functions from
`Text.Pandoc.Parsing`.
2014-05-03 17:37:54 -07:00
Matthew Pickering
5a51a67abd Changed the smart punctuation parser to return Inlines rather than an Inline element and updated files accordingly 2014-04-01 13:53:34 +01:00
John MacFarlane
0934c4430a Parsing: Added stateCaption.
This is primarily for use in the LaTeX reader, so far.
2014-03-25 22:44:16 -07:00
John MacFarlane
6992050161 Parsing: Added HasMacros, simplified other typeclasses.
Removed updateHeaderMap, setHeaderMap, getHeaderMap,
updateIdentifierList, setIdentifierList, getIdentifierList.
2014-03-25 14:55:18 -07:00
John MacFarlane
6ec3ee3a67 Whitespace change, and note:
Contrary to the previous commit message, there was no API
change, since Text.Pandoc.Parsing is not an exposed module.
2014-03-25 13:51:55 -07:00
John MacFarlane
08d1404b31 API changes to HasReaderOptions, HasHeaderMap, HasIdentifierList.
Previously these were typeclasses of monads.  They've been changed
to be typeclasses of states.  This simplifies the instance definitions
and provides more flexibility.

This is an API change!  However, it should be backwards compatible
unless you're defining instances of HasReaderOptions, HasHeaderMap,
or HasIdentifierList.  The old getOption function should work as
before (albeit with a more general type).

The function askReaderOption has been removed.
extractReaderOptions has been added.
getOption has been given a default definition.

In HasHeaderMap, extractHeaderMap and updateHeaderMap have been added.
Default definitions have been given for getHeaderMap, putHeaderMap,
and modifyHeaderMap.

In HasIdentifierList, extractIdentifierList and updateIdentifierList
have been added.  Default definitions have been given for
getIdentifierList, putIdentifierList, and modifyIdentifierList.

The ultimate goal here is to allow different parsers to use their
own, tailored parser states (instead of ParserState) while still
using shared functions.
2014-03-25 13:43:34 -07:00
John MacFarlane
3fa38db80b Parsing: Make F an instance of Applicative. Closes #1138. 2014-03-24 10:29:24 -07:00
Merijn Verstraaten
66fd9bf759 Clarified field values in RstCustomRoles. 2014-02-15 17:57:08 +01:00
Merijn Verstraaten
fe246ce01c Enhanced Pandoc's support for rST roles.
rST parser now supports:
    - All built-in rST roles
    - New role definition
    - Role inheritance

Issues/TODO:
    - Silently ignores illegal fields on roles
    - Silently drops class annotations for roles
    - Only supports :format: fields with a single format for :raw: roles,
      requires a change to Text.Pandoc.Definition.Format to support multiple
      formats.
    - Allows direct use of :raw: role, rST only allows indirect (i.e.,
      inherited use of :raw:).
2014-02-15 17:51:33 +01:00
Henry de Valence
0c5e7cf8cb HLint: use elem and notElem
Replaces long conditional chains with calls to `elem` and `notElem`.
2013-12-19 20:19:24 -05:00
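The kind of rewrite HLint suggests here can be sketched as follows. The predicate names and the character set are hypothetical, chosen only for illustration, and are not taken from the pandoc source.

```haskell
-- Before: a chain of equality tests joined with (||).
isSpecialChain :: Char -> Bool
isSpecialChain c = c == '*' || c == '_' || c == '`' || c == '\\'

-- After: the same test expressed with `elem`.
isSpecialElem :: Char -> Bool
isSpecialElem c = c `elem` "*_`\\"

-- The negated form uses `notElem` instead of a chain of (/=) and (&&).
isPlain :: Char -> Bool
isPlain c = c `notElem` "*_`\\"

main :: IO ()
main = print (map isSpecialElem "a*b", map isPlain "a*b")
```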
John MacFarlane
def05d3504 HTML reader: Parse LaTeX math if appropriate options are set.
* Moved inlineMath, displayMath from Markdown reader to Parsing.
* Export them from Parsing.  (API change.)
* Generalize their types.
2013-12-06 17:15:13 -08:00
John MacFarlane
d5660275a3 Parsing: Generalized type of registerHeader, using new typeclasses.
New type classes HasReaderOptions, HasIdentifierList, HasHeaderMap.
These allow certain common functions to be reused even in parsers
that use custom state (instead of ParserState), such as the MediaWiki
reader.

Minor API bump.
2013-11-17 08:45:21 -08:00
John MacFarlane
6ed41fdfcc Factored out registerHeader from markdown reader, added to Parsing.
Text.Pandoc.Parsing now exports registerHeader, which can be
used in other readers.
2013-09-01 08:54:10 -07:00
John MacFarlane
af786829a0 Parsing: Added stateMeta' to ParserState. 2013-08-18 16:22:56 -07:00
John MacFarlane
12e7ec4070 Added Text.Pandoc.Compat.TagSoupEntity.
This allows pandoc to compile with tagsoup 0.13.x.
Thanks to Dirk Ullrich for the patch.
2013-08-08 10:42:52 -07:00
John MacFarlane
5050cff37c Removed comment that chokes recent cpp.
Closes #933.
2013-08-03 23:16:54 -07:00
John MacFarlane
e973bbbbc8 Markdown reader: Better error messages for yaml headers. 2013-07-02 09:23:43 -07:00
John MacFarlane
f869f7e08d Use new flexible metadata type.
* Depend on pandoc 1.12.
* Added yaml dependency.
* `Text.Pandoc.XML`: Removed `stripTags`.  (API change.)
* `Text.Pandoc.Shared`:  Added `metaToJSON`.
  This will be used in writers to create a JSON object for use
  in the templates from the pandoc metadata.
* Revised readers and writers to use the new Meta type.
* `Text.Pandoc.Options`: Added `Ext_yaml_title_block`.
* Markdown reader:  Added support for YAML metadata block.
  Note that it must come at the beginning of the document.
* `Text.Pandoc.Parsing.ParserState`:  Replace `stateTitle`,
  `stateAuthors`, `stateDate` with `stateMeta`.
* RST reader:  Improved metadata.
  Treat initial field list as metadata when standalone specified.
  Previously ALL fields "title", "author", "date" in field lists
  were treated as metadata, even if not at the beginning.
  Use `subtitle` metadata field for subtitle.
* `Text.Pandoc.Templates`:  Export `renderTemplate'` that takes a string
  instead of a compiled template.
* OPML template:  Use 'for' loop for authors.
* Org template: '#+TITLE:' is inserted before the title.
  Previously the writer did this.
2013-06-24 20:29:41 -07:00
John MacFarlane
a578a490ee Parsing: Generalized state type on readWith. 2013-06-24 20:27:36 -07:00
John MacFarlane
7cb8b60910 Parsing: Better error reporting in readWith.
- Specialize readWith to String input.
- On error have it print the line in which the error occurred,
  with a caret pointing to the column.
- This should help diagnose parsing problems in LaTeX especially.
2013-03-28 22:20:05 -07:00
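The error display described in the commit above (the offending line with a caret under the error column) can be sketched roughly like this. `showErrorContext` is a hypothetical helper written for illustration, not the function used in `readWith`; it assumes 1-based line and column numbers, as in Parsec's `SourcePos`.

```haskell
-- | Return the given line of the input followed by a second line
-- holding a caret under the given column.  Both indices are 1-based.
-- An out-of-range line number yields the empty string.
showErrorContext :: String -> Int -> Int -> String
showErrorContext input line col =
  case drop (line - 1) (lines input) of
    []      -> ""
    (l : _) -> l ++ "\n" ++ replicate (col - 1) ' ' ++ "^"

main :: IO ()
main = putStrLn (showErrorContext "\\begin{itemize}\n\\item $x\n\\end{itemize}" 2 7)
```

This only works when the input is available as a `String`, which matches the reason given in the later revert (cdc4ecbe98) for keeping `readWith` specialized.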
John MacFarlane
ee0fc19bc5 Parsing: Further improvements to uri parser.
Don't treat punctuation before percent-encoding as final punctuation.
Don't treat '+' as final punctuation.
2013-03-28 11:33:01 -07:00
John MacFarlane
099b4b7769 Mediawiki: Fixed regression for <ref>URL</ref>.
`<` is no longer allowed in URLs, according to the uri parser
in Text.Pandoc.Parsing.

Added a test case.
2013-03-28 09:54:02 -07:00
John MacFarlane
07e8cedf2b Make implicit_header_references work with explicit header ids.
(Markdown reader.)
2013-02-21 19:53:35 -08:00
John MacFarlane
cc410a71b5 Allow & in emails (for entities).
Added tests for entities in titles and links.
Closes #723.
2013-02-15 23:02:17 -08:00
John MacFarlane
59764fa388 Parsing: uri, email: resolve entities.
A markdown link `<http://g&ouml;ogle.com>` should
be a link to http://göogle.com.
2013-02-15 22:39:49 -08:00
John MacFarlane
a6c167125f Optimized oneOfStringsCI.
The call to toLower in ciMatch was very expensive (and very often
used), because toLower from Data.Char calls a fully unicode
aware function.  This optimization avoids the call to toLower
for the most common, ASCII cases.  This dramatically reduces the
speed penalty that comes from enabling the `autolink_bare_uris`
extension.  The penalty is still substantial (in one test, from 0.33s
to 0.44s), but nowhere near what it used to be.
2013-02-02 18:46:10 -08:00
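The idea behind this optimization can be sketched as below: lower-case ASCII letters with simple arithmetic and fall back to the Unicode-aware `toLower` only for non-ASCII characters. This is a sketch under that assumption, not the code from the commit; `ciPrefix` is a hypothetical helper added to show how such a matcher might be used.

```haskell
import Data.Char (chr, isAscii, ord, toLower)

-- Case-insensitive character comparison with an ASCII fast path.
ciMatch :: Char -> Char -> Bool
ciMatch x y = lower x == lower y
  where
    lower c
      | c >= 'A' && c <= 'Z' = chr (ord c + 32) -- ASCII upper case: cheap arithmetic
      | isAscii c            = c                -- other ASCII: unchanged
      | otherwise            = toLower c        -- non-ASCII: full Unicode lookup

-- Hypothetical helper: does the string start with the prefix, ignoring case?
ciPrefix :: String -> String -> Bool
ciPrefix pre s = length s >= length pre && and (zipWith ciMatch pre s)

main :: IO ()
main = print (ciPrefix "HTTP" "http://example.com", ciMatch '\201' '\233')
```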
John MacFarlane
8c55023d18 Fixed latex macro parsing.
Now latex macro definitions are preserved when output is latex,
and applied when it is another format, as originally intended.

Partially addresses #730.
\providecommand is still not supported.  For this we need changes
to texmath.
2013-01-28 10:50:58 -08:00
John MacFarlane
f989ff2d5d Parsing: More improvements of anyLine parser. 2013-01-25 18:32:06 -08:00
John MacFarlane
d27dc6a420 More anyLine tweaks: Use incSourceLine. 2013-01-25 17:59:57 -08:00
John MacFarlane
0801b120b9 anyLine: Set position properly. 2013-01-25 17:53:50 -08:00
John MacFarlane
4c74b7aaab Parsing: Much faster new version of anyLine.
Not only faster but uses less memory.
2013-01-25 15:32:10 -08:00
John MacFarlane
c4b93bc3e7 Fixed bug in uri parser.
The bug prevented an autolink at the end of a string (e.g.
at the end of a line block line) from counting as a link.

Closes #711.
2013-01-20 20:23:50 -08:00
John MacFarlane
bf3a911a1c Changed Ext_autolink_urls -> Ext_autolink_bare_uris.
Added tests.
2013-01-15 12:44:50 -08:00