pandoc

Author	SHA1	Message	Date
John MacFarlane	cc410a71b5	Allow `&` in emails (for entities). Added tests for entities in titles and links. Closes #723.	2013-02-15 23:02:17 -08:00
John MacFarlane	59764fa388	Parsing: uri, email: resolve entities. A markdown link `<http://g&ouml;ogle.com>` should be a link to http://göogle.com.	2013-02-15 22:39:49 -08:00
John MacFarlane	a6c167125f	Optimized oneOfStringsCI. The call to toLower in ciMatch was very expensive (and very often used), because toLower from Data.Char calls a fully unicode aware function. This optimization avoids the call to toLower for the most common, ASCII cases. This dramatically reduces the speed penalty that comes from enabling the `autolink_bare_uris` extension. The penalty is still substantial (in one test, from 0.33s to 0.44s), but nowhere near what it used to be.	2013-02-02 18:46:10 -08:00
John MacFarlane	8c55023d18	Fixed latex macro parsing. Now latex macro definitions are preserved when output is latex, and applied when it is another format, as originally intended. Partially addresses #730. \providecommand is still not supported. For this we need changes to texmath.	2013-01-28 10:50:58 -08:00
John MacFarlane	f989ff2d5d	Parsing: More improvements of anyLine parser.	2013-01-25 18:32:06 -08:00
John MacFarlane	d27dc6a420	More anyLine tweaks: Use incSourceLine.	2013-01-25 17:59:57 -08:00
John MacFarlane	0801b120b9	anyLine: Set position properly.	2013-01-25 17:53:50 -08:00
John MacFarlane	4c74b7aaab	Parsing: Much faster new version of anyLine. Not only faster but uses less memory.	2013-01-25 15:32:10 -08:00
John MacFarlane	c4b93bc3e7	Fixed bug in uri parser. The bug prevented an autolink at the end of a string (e.g. at the end of a line block line) from counting as a link. Closes #711.	2013-01-20 20:23:50 -08:00
John MacFarlane	bf3a911a1c	Changed Ext_autolink_urls -> Ext_autolink_bare_uris. Added tests.	2013-01-15 12:44:50 -08:00
John MacFarlane	5971721ec1	Case-insensitive parsing of URI schemes.	2013-01-15 11:48:21 -08:00
John MacFarlane	95c02f6b57	Parsing: Improve oneOfStrings, export oneOfStringsCI. oneOfStrings will now take the longest match it can in a list of strings, so if 'foo' and 'foobar' are both included, 'foobar' will match even if 'foo' is first in the list.	2013-01-15 11:47:35 -08:00
John MacFarlane	e0e36ce543	Revised URI parser. * It no longer uses Network.URI's URI parser, which is too restrictive (not allowing unicode URIs unless encoded). * It allows many more schemes. * It better handles punctuation so as to avoid capturing trailing punctuation in bare URLs.	2013-01-15 10:52:02 -08:00
John MacFarlane	51e0bd277a	Parsing: Fixed uri -- escape unicode URLs. Otherwise Network.URI.parseURI fails on e.g. Chinese URLs. Changed an incorrect test in markdown-reader-more.	2013-01-14 17:38:34 -08:00
John MacFarlane	127851ea61	Parsing: Simplified and improved singleQuoteStart. This makes 's', 'l', etc. parse properly. Formerly we had some English-centric heuristics, but they are no longer needed now that we keep track of the last 'Str' position in state. Closes #698.	2013-01-14 16:06:45 -08:00
John MacFarlane	0598cf0fee	Moved lineBlockLines to Parsing. This will be used by both RST and markdown readers.	2013-01-13 11:39:32 -08:00
John MacFarlane	cf4cd2ccb0	More improvements in emailAddress parser.	2013-01-09 21:32:42 -08:00
John MacFarlane	a71641a2a0	Made email parser more correct. Now it's based on RFC 822, though it still doesn't implement quoted strings in email addresses.	2013-01-09 17:19:32 -08:00
John MacFarlane	d599c4cdab	Added Attr field to Header. Previously header ids were autogenerated by the writers. Now they are generated (unless supplied explicitly) in the markdown parser, if the `header_identifiers` extension is selected. In addition, the textile reader now supports id attributes on headers.	2013-01-09 09:30:05 -08:00
John MacFarlane	ef806f6a99	Markdown reader: Warn about duplicate link references.	2013-01-04 12:01:09 -08:00
John MacFarlane	7ef07ea3fc	Added stateWarnings. It is not connected to anything yet.	2013-01-03 20:52:51 -08:00
John MacFarlane	c435e9cda7	Implemented `Ext_header_identifiers`, `Ext_implicit_header_references`. Now by default pandoc will act as if link references have been defined for all headers. So, you can do this: # My header Link to [My header]. Another link to [it][My header]. Closes #691.	2013-01-03 20:35:01 -08:00
John MacFarlane	2695434113	Fixed bug in withRaw. Didn't correctly handle case where nothing is parsed.	2012-12-13 19:04:01 -08:00
John MacFarlane	1b68dc3405	Revert "Added stateWarnings to ParserState, added warning function." This reverts commit `5419b504ce`.	2012-10-05 19:38:43 -07:00
John MacFarlane	5419b504ce	Added stateWarnings to ParserState, added warning function. This will be used to provide warnings for things like duplicate footnote refs and link refs.	2012-10-05 19:25:26 -07:00
John MacFarlane	93e92a4716	Renamed removeLeadingTrailingSpace to trim. Also removeLeadingSpace to triml, removeTrailingSpace to trimr.	2012-09-29 17:09:34 -04:00
John MacFarlane	7633d51971	Parsing: Changed type of stateSubstitutions to use Inlines.	2012-09-27 16:44:49 -07:00
John MacFarlane	35662e14a9	Removed nullBlock. Don't use nullBlock in Textile reader. Better to know about parsing problems than to skip stuff when we get stuck.	2012-09-27 16:06:29 -07:00
John MacFarlane	1be27ffb3a	Added stateSubstitutions to ParserState, use for RST substitutions.	2012-09-27 15:20:29 -07:00
John MacFarlane	12045d84b6	Revert "More intelligent handling of text encodings." This reverts commit `7272735b3d`.	2012-09-23 22:53:34 -07:00
John MacFarlane	7272735b3d	More intelligent handling of text encodings. Previously, UTF-8 was enforced for both input and output. The new system: * For input, UTF-8 is tried first; if an error is raised, the locale encoding is tried. * For output, the locale encoding is always used.	2012-09-23 22:12:21 -07:00
John MacFarlane	f67333696b	Revert "Use local encoding for input/output rather than forcing UTF8." This reverts commit `c69837adb6`.	2012-09-23 12:33:17 -07:00
John MacFarlane	c69837adb6	Use local encoding for input/output rather than forcing UTF8. Note that system templates are stored as UTF8 and will still be read as such, even if the local encoding is different. Text downloaded from URLs will also be treated as UTF-8.	2012-09-23 11:01:33 -07:00
John MacFarlane	167012daf7	Export 'nested' in Parsing.	2012-09-12 08:45:03 -07:00
John MacFarlane	58a096c058	Text.Pandoc.Parsing: Handle trailing slash in 'uri'.	2012-09-12 08:45:03 -07:00
John MacFarlane	7fc804ed22	Parsing: Generalized type of withQuoteContext.	2012-09-09 18:12:18 -07:00
John MacFarlane	dfa4b76630	Changes to literate haskell options. - Removed writerLiterateHaskell from WriterOptions. - Removed readerLiterateHaskell from ReaderOptions. - Added Ext_literate_haskell to Extensions. Test for this instead of the above. - Removed failUnlessLHS from Shared. Note: At this point, +lhs and the .lhs extension no longer have any effect. Need to fix.	2012-08-08 23:18:19 -07:00
John MacFarlane	33fd791ea1	Made F a newtype, moved definitions to Parser. Parser now exports F(..), askF, asksF, runF.	2012-08-02 17:12:20 -07:00
John MacFarlane	a1677b612b	Parsing: removed duplication of Key and Key'. Now we just use the former Key' (string contents), renamed Key. lookupKeySrc and fromKey are no longer exported. Key', toKey' and KeyTable' have become Key, toKey, and KeyTable.	2012-08-01 22:40:07 -07:00
John MacFarlane	fadc7b0d87	Major rewrite of markdown reader. * Use Builder's Inlines/Blocks instead of lists. * Return values in the reader monad, which are then run (at the end of parsing) against the final parser state. This allows links, notes, and example numbers to be resolved without a second parser pass. * An effect of using Builder is that everything is normalized automatically. * New exports from Text.Pandoc.Parsing: widthsFromIndices, NoteTable', KeyTable', Key', toKey', withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart, doubleQuoteEnd, ellipses, apostrophe, dash * Updated opendocument tests. * Don't derive Show for ParserState. * Benchmarks: markdown reader takes 82% of the time it took before. Markdown writer takes 92% of the time (here the speedup is probably due to the fact that everything is normalized by default).	2012-08-01 21:45:40 -07:00
John MacFarlane	973c7ecacf	Removed commented-out pandoc2 code. This will be developed in a branch, noreparsing.	2012-07-27 21:04:38 -07:00
John MacFarlane	c76ef95308	Parser: Changed types to use type alias Parser, not Parsec.	2012-07-27 20:50:03 -07:00
John MacFarlane	6d7f0a1b81	Fixed whitespace errors.	2012-07-26 22:32:53 -07:00
John MacFarlane	14c911ba06	Parsing: Removed failIfStrict.	2012-07-26 22:20:44 -07:00
John MacFarlane	5186da929d	Parsing: Added guardEnabled, guardDisabled.	2012-07-26 19:10:56 -07:00
John MacFarlane	2654da3823	Moved stateApplyMacros, stateIndentedCodeClasses to ReaderOptions.	2012-07-25 22:05:06 -07:00
John MacFarlane	070b968ae0	stateCitations -> readerCitations.	2012-07-25 22:05:06 -07:00
John MacFarlane	856aa8c244	Moved stateLiterateHaskell to readerLiterateHaskell in Options.	2012-07-25 22:05:06 -07:00
John MacFarlane	1dba82f25e	Got rid of stateStandalone, which was hardly used anyway. The only possible effect will be with rst fragments that begin with an rst title block, which will now cause the header transform.	2012-07-25 20:08:42 -07:00
John MacFarlane	95570ba34c	Moved stateOldDashes to readerOldDashes in ReaderOptions.	2012-07-25 12:37:04 -07:00

109 commits