Commit graph

  • c6323b2c78 Changes to RTF writer: + Added support for definition lists. + Removed extra '\cell' in table output, which caused a blank column to the left. + Added support for captions in tables. + Added an 'alignment' parameter to RTF block writers. + Added support for column alignments in tables. fiddlosopher 2007-05-03 14:46:37 +0000
  • 09fa7e6f63 Added support for definition lists to RST writer. fiddlosopher 2007-05-03 14:44:28 +0000
  • 292d2bd38d Added support for definition lists to markdown writer. fiddlosopher 2007-05-03 14:44:04 +0000
  • 23841bdf77 Add -asxhtml flag to tidy in html2markdown. This will perhaps help the parser. fiddlosopher 2007-05-03 14:43:16 +0000
  • cf081435ff Changed definition list syntax in markdown reader and simplified the parsing code. A colon is now required before every block in a definition. This fixes a problem with the old syntax, in which the last block in the following was ambiguous between a regular paragraph in the definition and a code block following the definition list: fiddlosopher 2007-05-03 14:42:40 +0000
  • 485fa81559 Resolved issue #10: instead of adding "\n\n" to the end of strings in Main, do it in readMarkdown and readRST. (Note: the point of this is to ensure that a block at the end of the file gets treated as if it has blank space after it, which is generally what is wanted.) fiddlosopher 2007-04-22 04:38:05 +0000
  • 726685af1b Fixed bug in anyLine parser. Previously anyLine would parse an empty string "". But it should fail on an empty string, or we get an error from its use inside "many" combinators. fiddlosopher 2007-04-22 04:19:34 +0000
  • 7742301c09 Support for definition lists in LaTeX writer. fiddlosopher 2007-04-21 21:38:19 +0000
  • ffd7248af8 Fixed export declarations; removed unneeded import of Pandoc.Shared. fiddlosopher 2007-04-15 16:22:29 +0000
  • e15fd3bf86 Moved escape and nullBlock parsers from ParserCombinators/Pandoc to Pandoc/Shared. Reason: ParserCombinators/Pandoc is for general-purpose parsers that don't require Pandoc.Definition. Also removed some unnecessary imports from Pandoc/Shared. fiddlosopher 2007-04-15 16:20:07 +0000
  • 24ee5f1f49 Updated test suite to reflect new prettyprinted native Table format. fiddlosopher 2007-04-13 22:18:47 +0000
  • 944740ce5c Added Table to prettyBlock in Shared.hs. fiddlosopher 2007-04-13 01:33:10 +0000
  • 92845a9834 Added Text.Pandoc module that exports basic readers, writers, definitions, and utility functions. fiddlosopher 2007-04-11 22:57:45 +0000
  • 23df0ed176 Extensive changes stemming from a rethinking of the Pandoc data structure. Key and Note blocks have been removed. Link and image URLs are now stored directly in Link and Image inlines, and note blocks are stored in Note inlines. This requires changes in both parsers and writers. Markdown and RST parsers need to extract data from key and note blocks and insert them into the relevant inline elements. Other parsers can be simplified, since there is no longer any need to construct separate key and note blocks. Markdown, RST, and HTML writers need to construct lists of notes; Markdown and RST writers need to construct lists of link references (when the --reference-links option is specified); and the RST writer needs to construct a list of image substitution references. All writers have been rewritten to use the State monad when state is required. This rewrite yields a small speed boost and considerably cleaner code. fiddlosopher 2007-04-10 01:56:50 +0000
  • 74e7497226 Fixed bug in email obfuscation (issue #15). If the text to be obfuscated contains an entity, this needs to be decoded before obfuscation. Thanks to thsutton for the patch. fiddlosopher 2007-04-08 21:04:47 +0000
  • 571c3b4173 Removed Blank block element as unnecessary. fiddlosopher 2007-03-17 20:23:47 +0000
  • 9b3d5d88c2 Consolidated 'text', 'special', and 'inline' into 'inline'. fiddlosopher 2007-03-17 17:25:28 +0000
  • cdf7c78b5f Added trys to two list start routines. Reason: <|> only parses second parser when first hasn't consumed input. fiddlosopher 2007-03-16 06:05:43 +0000
  • 0d2e5eab79 Added clauses for DefinitionList and Table to replaceReferenceLinks in Text/Pandoc/Shared.hs. This ensures that reference-style links inside tables and definition lists will be handled properly. fiddlosopher 2007-03-12 00:36:41 +0000
  • d55346ca9a Simplified keyTable, using assumption that key blocks are not inside other block elements (an assumption that the Markdown reader uses in making its initial pass anyway). fiddlosopher 2007-03-12 00:23:39 +0000
  • 7bf7aba7e7 Changes to Markdown reader relating to definition lists: + fixed bug in indentSpaces (which didn't properly handle cases with mixed spaces and tabs) + rewrote definition list code to conform to new syntax + include definition lists in list block + failIfStrict on definition lists fiddlosopher 2007-03-11 07:56:29 +0000
  • aca046702e Added support for DefinitionList blocks to HTML writer. Cleaned up bullet and ordered list code by using ordList and unordList instead of raw olist and ulist. fiddlosopher 2007-03-11 07:54:47 +0000
  • 44cec96e61 New syntax documentation for definition lists. Now we require a ':' at the beginning of the definition; otherwise, too many false positives for definition lists. fiddlosopher 2007-03-11 07:53:21 +0000
  • 2e794fdf37 Fixed bug in HTML email obfuscation using --strict mode. The problem is that the "href" function escapes &, so (href "&#108;") is 'href="&amp;#108;"'. Fixed by using primHtml for the whole link. Resolves issue 9. fiddlosopher 2007-03-11 00:19:15 +0000
  • 115cad8882 Changed syntax of definition lists in Markdown parser: + definition blocks must be indented throughout (not just in first line) + compact lists can be formed by leaving no blank line between a definition and the next term fiddlosopher 2007-03-10 20:45:19 +0000
  • d277baebe4 Added documentation for definition lists. fiddlosopher 2007-03-10 20:43:59 +0000
  • 5ec31cc727 Added parser for definition lists, derived from reStructuredText syntax: fiddlosopher 2007-03-10 17:48:16 +0000
  • 0ce965f34c Modified prettyPandoc to handle DefinitionList elements. fiddlosopher 2007-03-10 17:45:00 +0000
  • 68cd976838 Added definition for DefinitionList block element. fiddlosopher 2007-03-10 17:44:27 +0000
  • d5f9b1dbb4 Change in ordered lists in Markdown reader: + Lists may begin with lowercase letters only, and only 'a' through 'n'. Otherwise first initials and page references (e.g., p. 400) are too easily parsed as lists. + Numbers beginning list items must end with '.' (not ')', which is now allowed only after letters). NOTE: This change may cause documents to be parsed differently. Users should take care in upgrading. fiddlosopher 2007-03-09 02:37:49 +0000
  • f1191f029a More smart quote adjustments: + remove support for all-caps contractions (too much potential for conflict with things like 'M. Mitterand') + add support for 'm as a contraction fiddlosopher 2007-03-07 20:53:37 +0000
  • 42ade05ebe Smart quote parsing in Markdown reader: treat ' followed by ll, re, ve, then a non-letter, as a contraction. (e.g. I've, you're, he'll) fiddlosopher 2007-03-07 03:01:43 +0000
  • 402016df28 Fixed bug in noscript part of email obfuscation: & instead of &amp; fiddlosopher 2007-03-04 07:40:22 +0000
  • 21ba6b08f2 Made image parsing in HTML reader sensitive to the --inline-links option. fiddlosopher 2007-03-03 18:25:49 +0000
  • 31c030e3a5 Added --inline-links option to force links in HTML to be parsed as inline links, rather than reference links. (Addresses Issue #4.) fiddlosopher 2007-03-03 18:19:31 +0000
  • f99cedd236 Strip executable binaries before installing. fiddlosopher 2007-02-27 07:16:20 +0000
  • 463a0e5c3e Changes to test suite for new XHTML output. fiddlosopher 2007-02-27 07:05:11 +0000
  • 59065c103f Modified HTML writer to use the Text.XHtml library. This results in cleaner, faster code, and it makes it easier to use Pandoc in other projects, like wikis, that use Text.XHtml. Two functions are now provided, writeHtml and writeHtmlString: the former outputs an Html structure, the latter a rendered string. The S5 writer is also changed, in parallel ways (writeS5, writeS5String). The Html header is now written programmatically, so it has been removed from the 'headers' directory. The S5 header is still needed, but the doctype and some of the meta declarations have been removed, since they are written programmatically. The INSTALL file and cabalize have been updated to reflect the new dependency on the xhtml package. fiddlosopher 2007-02-26 19:08:10 +0000
  • e0303dfc79 'cp -a' does not work in BSD. Replace with 'cp -R'. Note that we don't want user and group to be preserved, anyway. fiddlosopher 2007-02-23 17:31:46 +0000
  • e9d66b8ed8 Modified changelog: note on defaultWriterOptions. fiddlosopher 2007-02-21 01:23:11 +0000
  • 3456355174 Added defaultWriterOptions to Shared.hs. fiddlosopher 2007-02-21 01:22:08 +0000
  • 90390b65f4 Modified changelog to incorporate changes since the 0.3 release. fiddlosopher 2007-02-20 01:21:42 +0000
  • 06ff1feea0 Use 'highlight' to produce syntax-highlighted versions of xml, html, and tex demo pages (in website /examples/). Add links so that html files can be viewed as web pages (without syntax highlighting). fiddlosopher 2007-02-19 04:47:12 +0000
  • 8138fc48ae Change extensions of example text files in website demo to '.text' -- the only reason for this is that I use '.txt' for page source files, and generally exclude them from being uploaded to the website. fiddlosopher 2007-02-19 02:03:40 +0000
  • 7d3382e9f0 In writing Markdown, print unicode nonbreaking space (160) as "&nbsp;", since otherwise it is hard to distinguish from a regular space. (Addresses Issue #3.) fiddlosopher 2007-02-17 04:57:41 +0000
  • 7a04caeea8 Escape non-breaking space in SGML as '&nbsp;' instead of printing a unicode non-breaking space, which is hard to distinguish visually from a regular space. (Resolves issue #3.) fiddlosopher 2007-02-17 03:40:28 +0000
  • 1855af4be5 Refactored str and strong in Markdown reader, for clarity. fiddlosopher 2007-02-15 02:27:36 +0000
  • 87eb8be143 Got rid of two unneeded 'getState's. Note that lookAhead automatically saves and restores the state. fiddlosopher 2007-02-15 01:43:12 +0000
  • 9ad7c73b57 Use lookAhead instead of getInput/setInput in RST reader. fiddlosopher 2007-02-15 01:33:35 +0000
  • d3efa4aa8a Use lookAhead parser for the "first pass" looking for reference keys in Markdown parser, instead of parsing normally, then using setInput to reset input. Slight performance improvement. fiddlosopher 2007-02-15 01:22:02 +0000
  • ae19b94fd1 Removed followedBy' parser from Text/ParserCombinators/Pandoc, replacing it with the 'lookAhead' parser from Text/ParserCombinators/Parsec. fiddlosopher 2007-02-15 01:10:15 +0000
  • 0114f68d21 Introduced a new map, reverseEntityTable, for lookups of entity by character, in Entities.hs. This yields a small performance improvement. fiddlosopher 2007-02-14 18:02:51 +0000
  • 1266b189a1 Changed Entities.hs to use Data.Map rather than an association list, for a slight performance boost. fiddlosopher 2007-02-14 17:39:06 +0000
  • 5a9b7c229d Added freebsd directory, with freebsd port, to repository. fiddlosopher 2007-02-14 17:09:17 +0000
  • 7f65a4e914 Fixed issue #8: slow performance in parsing inline literals in RST reader. The problem was that `# was seen by 'inline' as a potential link or image. Fix: insert 'notFollowedBy (char '`')' in link parsers. fiddlosopher 2007-02-14 06:57:23 +0000
  • 6e467c26c4 Replaced "choice [(try (string ...), ...]" idiom with "oneOfStrings" in LaTeX reader. fiddlosopher 2007-02-12 17:18:20 +0000
  • f3437dd27c + Added some needed "try"s before multicharacter parsers, especially in "option" contexts. + Removed the "try" from the "end" parser in "enclosed" (Text.Pandoc.Shared). Now "enclosed" behaves like "option", "manyTill", etc. fiddlosopher 2007-02-12 17:14:11 +0000
  • 73cbae7202 Added 'try' in front of 'string', where needed, or used a different parser, in RST reader. This fixes a bug where ```` would not be correctly parsed as a verbatim . fiddlosopher 2007-02-12 03:53:14 +0000
  • ca93680f72 Allow the URI in a RST hyperlink target to start on the line after the reference key. fiddlosopher 2007-02-12 02:46:39 +0000
  • df03ca2c95 Added link to FreeBSD port in website. fiddlosopher 2007-02-11 19:20:03 +0000
  • f7fcbf528f Use "gaps" in copyrightMessage string for cleaner code formatting. fiddlosopher 2007-01-31 01:08:57 +0000
  • 6c3d96a776 Changed 'encodeEntities' to 'escapeSGMLString'. fiddlosopher 2007-01-28 16:40:44 +0000
  • dc6925542c + Simplified entity handling by removing stringToSGML from Entities.hs. It is no longer needed now that all entities are processed in the markdown and HTML readers. All calls to stringToSGML have been replaced by calls to encodeEntities. + Since inTag's attribute handling already encodes entities, calls to encodeEntities are no longer needed for attribute values, so they've been removed. + The HTML and Markdown readers now call decodeEntities on all raw strings (e.g. authors, dates, link titles), to ensure that no unprocessed entities are included in the native representation of the document. (In the HTML reader, most of this work is done by a change in extractAttributeName.) + The result is a small speed improvement (around 5% on my benchmark) and cleaner code. fiddlosopher 2007-01-28 00:04:43 +0000
  • 21484713c6 Use encodeEntities rather than stringToSGML for contents of Str inline in Docbook and HTML writers, since now these strings should not contain literal entity references. fiddlosopher 2007-01-27 22:58:03 +0000
  • 8e0ad5a006 Cleaned up handling of embedded quotes in link titles. Now these are stored as a '"' character, not as '&quot;'. The function escapeLinkTitle in the Markdown writer is unnecessary and was removed. Tests modified accordingly. fiddlosopher 2007-01-27 22:45:14 +0000
  • 141affdb51 More changes in entity handling: Instead of using entities for characters above 128 in HTML and Docbook output, we now just use unicode. After all, we're declaring UTF-8 content in the header. This makes the HTML and docbook files produced by pandoc much more readable and editable. fiddlosopher 2007-01-27 22:13:11 +0000
  • d06417125d Changes in entity handling: + Entities are parsed (and unicode characters returned) in both Markdown and HTML readers. + Parsers characterEntity, namedEntity, decimalEntity, hexEntity added to Entities.hs; these parse a string and return a unicode character. + Changed 'entity' parser in HTML reader to use the 'characterEntity' parser from Entities.hs. + Added new 'entity' parser to Markdown reader, and added '&' as a special character. Adjusted test suite accordingly since now we get 'Str "AT",Str "&",Str "T"' instead of 'Str "AT&T"'. + stringToSGML moved to Entities.hs. escapeSGML removed as redundant, given encodeEntities. + stringToSGML, encodeEntities, and specialCharToEntity are given a boolean parameter that causes only numerical entities to be used. This is used in the docbook writer. The HTML writer uses named entities where possible, but not all docbook-consumers know about the named entities without special instructions, so it seems safer to use numerical entities there. + decodeEntities is rewritten in a way that avoids Text.Regex, using the new parsers. + charToEntity and charToNumericalEntity added to Entities.hs. + Moved specialCharToEntity from Shared.hs to Entities.hs. + Removed unneeded 'decodeEntities' from 'str' parser in HTML and Markdown readers. + Removed sgmlHexEntity, sgmlDecimalEntity, sgmlNamedEntity, and sgmlCharacterEntity from Shared.hs. + Modified Docbook writer so that it doesn't rely on Text.Regex for detecting "mailto" links. fiddlosopher 2007-01-27 03:04:40 +0000
  • f2de08864e Rewrote functions in Text/Pandoc/Shared so as not to use Text.Regex, which does not support unicode: - escapePreservingRegex removed - stringToSGML rewritten using Parsec parser - new parsers for SGML character entities - escapeSGML rewritten using specialCharToEntity - new function specialCharToEntity fiddlosopher 2007-01-24 23:25:27 +0000
  • 11456ba5ce Changed Markdown autoLink parsing to conform better to Markdown.pl's behavior. <google.com> is not treated as a link, but <http://google.com>, <ftp://google.com>, and <mailto:google@google.com> are. fiddlosopher 2007-01-24 20:55:27 +0000
  • c94dacec35 Fixed bug in 'extractTagType' in HTML reader: previous version was not skipping / in close tags. fiddlosopher 2007-01-24 20:26:06 +0000
  • 890fbe97ec Refactored markdown reader so that Text.Regex is not used. Replaced email regex test with a custom email autolink parser (autoLinkEmail). Also replaced 'selfClosingTag' with a custom function 'isSelfClosingTag'. fiddlosopher 2007-01-24 19:44:43 +0000
  • 8f0cfe9bd0 Fixed a bug in extractTagType in HTML Reader: the previous version extracted the attributes, too, which is not wanted. fiddlosopher 2007-01-24 19:40:32 +0000
  • c61f2b6984 Fixed bug in HTML attribute parser: now a space is required before an attribute. Previously, <a.b> would be parsed as an HTML tag with an attribute! fiddlosopher 2007-01-24 19:07:35 +0000
  • ca6cb23f23 Modified Markdown writer to use autolinks when possible. So, instead of [site.com](site.com) we get <site.com>. Changed test suite accordingly. fiddlosopher 2007-01-24 19:06:30 +0000
  • 0646eef976 Rewrote 'extractTagType' in HTML reader so that it doesn't use regexes. fiddlosopher 2007-01-24 17:43:39 +0000
  • 96919a6ac5 More smart quote bug fixes: + LaTeX writer now handles consecutive quotes properly: for example, ``\,`hello'\,'' + LaTeX reader now parses '\,' as empty Str + normalizeSpaces function in Shared now removes empty Str elements + Modified tests accordingly fiddlosopher 2007-01-24 08:14:43 +0000
  • e6cc2aa3cf Fixed bug in smart quoting: recognize ' in contractions like "don't" as not beginning single quoted contexts. fiddlosopher 2007-01-24 04:49:36 +0000
  • 1121e8738b Removed 'gsub' entirely and replaced its uses with 'substitute'. fiddlosopher 2007-01-22 22:52:39 +0000
  • 8f0750574a + Added a 'substitute' function to Shared.hs. This is a generic list function that can be used to substitute one substring for another in a string, like 'gsub' except without regular expressions. + Use 'substitute' instead of 'gsub' in the LaTeX writer. This avoids what appears to be a bug in Text.Regex, whereby "\\^" matches "\350". There seems to be a slight speed improvement as well. (Note: If this works, it would be good to replace other uses of gsub that don't employ regexes with 'substitute'.) fiddlosopher 2007-01-22 21:28:46 +0000
  • a7839a18a7 Small bug fix to last change, and count "'S" as well as "'s" as possessive when followed by non-alphanumeric. fiddlosopher 2007-01-18 02:54:07 +0000
  • 2c64e7947b More tweaks to smart quote parsing: a ' is not a single quote start if followed by 's' and then a non-alphanumeric. (Yes, this is English-centric, I'm afraid. But it does help, and I can't think of a language in which 's' by itself is a word.) fiddlosopher 2007-01-18 02:47:27 +0000
  • 5a48839168 Minor tweaks to smart quoting code. fiddlosopher 2007-01-16 23:51:57 +0000
  • 8c257385ac Fixed bug in smart quote recognition: ' before ) or certain other punctuation must not be an open quote. fiddlosopher 2007-01-16 23:46:33 +0000
  • f3cf1cc168 Fixed haddock documentation errors. fiddlosopher 2007-01-16 00:07:42 +0000
  • b4f3eb53f1 Fix debian/changelog. roktas 2007-01-15 22:50:29 +0000
  • 60989d0637 Added support for tables in markdown reader and in LaTeX, DocBook, and HTML writers. The syntax is documented in README. Tests have been added to the test suite. fiddlosopher 2007-01-15 19:52:42 +0000
  • 4224d91388 Changed website to link to Google's download details page (with SHA1 checksum) rather than directly to the files. fiddlosopher 2007-01-10 22:41:05 +0000
  • 9a2c653277 More website tweaks. Added demo of extra xsl configuration and CSS in chunked xhtml produced from docbook. fiddlosopher 2007-01-10 03:17:37 +0000
  • 3528c72d7e Minor changes to Makefile required by changes to website build system. fiddlosopher 2007-01-10 02:41:27 +0000
  • e7dabc9dd0 More changes to website target. Moved to a templating system for the examples page. fiddlosopher 2007-01-10 02:33:33 +0000
  • 8d4a1ed039 More website changes. Include demo of docbook postprocessed by xmlto. fiddlosopher 2007-01-09 20:42:13 +0000
  • c80f181137 Reorganized Makefile target - now uses a subsidiary Makefile that can be run from the website directory for small changes. fiddlosopher 2007-01-09 18:55:50 +0000
  • 55031090de Need to export TMPDIR in tempdir.sh. fiddlosopher 2007-01-09 08:39:19 +0000
  • f55b54917a On Cygwin, set TMPDIR to . before using mktemp. Otherwise one gets an error creating the output file in the /tmp directory. I haven't tracked this one down, but this should serve as a workaround. fiddlosopher 2007-01-09 08:32:59 +0000
  • 46580147a5 Reverted r471. My alternative to --strip-trailing-cr didn't work. This only affects the test target on systems without GNU diff (rare), so I'm not too worried about it. fiddlosopher 2007-01-09 07:40:00 +0000
  • edfac2b491 Small tweak on last demo in website. fiddlosopher 2007-01-09 07:16:51 +0000
  • 5bcfa22330 Added DocBook to description of package in Pandoc.cabal.in. fiddlosopher 2007-01-09 07:07:21 +0000
  • b37f8da181 Small change in web page for "Pandoc features." fiddlosopher 2007-01-09 07:06:15 +0000
  • b25706e098 Cleaned up markdown2pdf.in. Note that bibtex does not return an error condition when it gives warnings, so instead we grep for warnings or error messages to see if we need to print the log. fiddlosopher 2007-01-09 06:38:15 +0000
  • 8adb142720 Minor changes to markdown2pdf: removed an unnecessary '|| exit $?', and made sure error output goes to stderr. fiddlosopher 2007-01-09 06:11:29 +0000