Also, if page-progression-direction is not specified in metadata,
don't include the attribute even in EPUB3; not including it is
the same as including it with the value "default", as we did before.
Closes #1550.
Previously a section like this would be enclosed in a paragraph,
with RawInline for the video tags (since video is a tag that can
be either block or inline):
<video controls="controls">
<source src="../videos/test.mp4" type="video/mp4" />
<source src="../videos/test.webm" type="video/webm" />
<p>
The videos can not be played back on your system.<br/>
Try viewing on Youtube (requires Internet connection):
<a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
</p>
</video>
This change will cause the video and source tags to be parsed
as RawBlock instead, giving better output.
The general change is this: when we're parsing a "plain" sequence
of inlines, we don't parse anything that COULD be a block-level tag.
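To illustrate, here is a hand-written sketch (not actual reader output)
of the difference in the parsed AST:
```
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.Definition

-- Previously: the tags ended up as RawInline inside a Para.
before :: Block
before = Para [ RawInline (Format "html") "<video controls=\"controls\">"
              , RawInline (Format "html") "<source src=\"test.mp4\" type=\"video/mp4\" />"
              ]

-- Now: the tags are parsed as block-level raw HTML.
after :: [Block]
after = [ RawBlock (Format "html") "<video controls=\"controls\">"
        , RawBlock (Format "html") "<source src=\"test.mp4\" type=\"video/mp4\" />"
        ]
```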
We always favor an explicit positive or negative in a style in a
descendant, and only turn to the ancestor if nothing is set.
We also introduce an (empty) list of blacklisted styles that we won't
check (think underlines in hyperlinks).
Two points here: (1) we go bottom-up, starting from styles that are not
based on anything, to avoid circular dependencies or any other sort of
maliciousness/incompetence. And (2) each style points to its parent, so
we don't need the whole tree to pass a style over to Docx.hs.
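As a rough sketch of the lookup (names and record shape here are
hypothetical, not the actual parser code):
```
-- Sketch only: the field and type names are made up.
data Style = Style
  { styleName   :: String
  , boldSetting :: Maybe Bool  -- Just True/False = explicit; Nothing = unset here
  , parentStyle :: Maybe Style -- each style points to its parent
  }

-- Favor an explicit positive or negative in the descendant; only walk
-- up the ancestor chain when nothing is set at this level.
resolveBold :: Style -> Maybe Bool
resolveBold sty = case boldSetting sty of
  Just b  -> Just b
  Nothing -> parentStyle sty >>= resolveBold
```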
* Create a type synonym for MIME type (instead of `String`).
* Add `getMimeTypeDef` function.
* Avoid recreating MIME type `Map`s every time.
* Move “Formula-...” case handling into `getMimeType`.
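In outline, the API now looks something like this (the table contents
and the fallback value below are illustrative assumptions, not the real
module):
```
import Data.Maybe (fromMaybe)
import System.FilePath (takeExtension)
import qualified Data.Map as M

type MimeType = String

-- Built once at the top level, not recreated on every lookup.
mimeTypes :: M.Map String MimeType
mimeTypes = M.fromList [("png", "image/png"), ("svg", "image/svg+xml")]

getMimeType :: FilePath -> Maybe MimeType
getMimeType fp = M.lookup (drop 1 (takeExtension fp)) mimeTypes

getMimeTypeDef :: FilePath -> MimeType
getMimeTypeDef = fromMaybe "application/octet-stream" . getMimeType
```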
We want to be able to read user-defined styles. Eventually we'll be able
to figure out styles in terms of inheritance as well. The actual
cascading will happen in the docx reader.
In docx, super- and subscript are attributes of Vertalign. It makes more
sense to follow this, and have different possible values of Vertalign in
runStyle. This is mainly a preparatory step for real style parsing,
since it can distinguish between vertical align being explicitly turned
off and it not being set.
In addition, it makes parsing a bit clearer, and makes sure we don't do
docx-impossible things like being simultaneously super and sub.
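A sketch of the shape this gives the run style (constructor and field
names here are assumptions):
```
-- Super and sub are mutually exclusive values of one field, so being
-- "simultaneously super and sub" is unrepresentable.
data VertAlign = BaselineAlign | SupAlign | SubAlign
  deriving (Show, Eq)

data RunStyle = RunStyle
  { verticalAlign :: Maybe VertAlign
    -- Just BaselineAlign       = explicitly turned off
    -- Just SupAlign / SubAlign = superscript / subscript
    -- Nothing                  = not set at all
  }
```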
Functions like runElemsToInlines and parPartsToInlines are just defined
in terms of concatting and mapping their singular versions
(e.g. `runElemToInlines`). Having two functions with almost identical
names makes it easier to introduce errors. It's easy enough to concat
and map the singular version inline, and that makes it clearer what is
going on in the code.
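In miniature, the pattern being removed looks like this (the types
below are hypothetical stand-ins):
```
-- Hypothetical stand-in types, just to show the shape of the change.
data RunElem = TextRun String | Tab

runElemToInlines :: RunElem -> [String]
runElemToInlines (TextRun s) = [s]
runElemToInlines Tab         = [" "]

-- Wrappers like this add a near-identical name without adding meaning;
-- call sites can simply write `concatMap runElemToInlines` instead.
runElemsToInlines :: [RunElem] -> [String]
runElemsToInlines = concatMap runElemToInlines
```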
The big news here is a rewrite of Docx to use the builder
functions. As opposed to previous attempts, we now see a significant
speedup -- times are cut in half (or more) in a few informal tests.
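For reference, "using the builder functions" means constructing output
with Text.Pandoc.Builder's monoidal combinators instead of
hand-concatenated lists, e.g.:
```
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.Builder

-- Inlines and Blocks here are Monoid newtypes over Data.Sequence,
-- so appends are cheap.
sample :: Blocks
sample = para ("Hello, " <> emph "world" <> ".")
      <> bulletList [plain "one", plain "two"]
```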
Reducible has also been rewritten. It can doubtless be simplified and
clarified further. We can consider this, at the moment, a reference for
correct behavior.
Note that "Italic" can be on, and, from the last commit, `<w:i>` can be
present, but be turned off. In that case, the turned-off tag takes
precedence. So, we have to distinguish between something being off and
something not being there. Hence, isItalic, isBold, isStrike, and
isSmallCaps have become Maybes.
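In other words, each of these properties now carries three states
(a sketch; the real record lives in the parser):
```
-- Just True:  the tag is present and on           (e.g. <w:i/>)
-- Just False: the tag is present but turned off   (e.g. <w:i w:val="0"/>)
-- Nothing:    the tag is absent; defer to the style chain
data RunFlags = RunFlags
  { isItalic    :: Maybe Bool
  , isBold      :: Maybe Bool
  , isStrike    :: Maybe Bool
  , isSmallCaps :: Maybe Bool
  }
```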
Indented code at the beginning of a list item must be indented eight
spaces from the margin (or from the edge of the container), or four
spaces past the list marker, whichever is farther.
Some examples in `tests/markdown-reader-more.txt`.
Introduces a new function in Reducible, concatR. The idea is that if we
have two lists of Reducibles (blocks or inlines), we can combine them and
just perform the reduction on the joining parts (the last element of the
first list, the first element of the second list). This is useful in cases
where the two lists are already reduced, and we're only worried about the
joining elements.
This actually improves the efficiency a bit further, because concatR can be
smart about empty lists.
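The idea in sketch form (the signature and the reduce parameter below
are assumptions, not the module's actual API):
```
-- Given two already-reduced lists, only the seam needs reducing.
concatR :: ([a] -> [a]) -> [a] -> [a] -> [a]
concatR _      [] ys = ys           -- smart about empty lists
concatR _      xs [] = xs
concatR reduce xs ys =
  init xs ++ reduce [last xs, head ys] ++ tail ys
```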
Before, we had to run reduceList on the whole combined paragraph, which
was redundant, and could take some time for long paragraphs. We only
need to combine the drop cap with the first inline of the next
paragraph.
Closes #1513.
Lists can now start without an intervening blank line.
Also, HTML block-level tags that don't start a line are parsed
as RawInline and don't interrupt paragraphs, as in RedCloth.
This allows users to turn off the default pandoc behavior of
parsing contents of div and span tags in markdown and HTML
as native pandoc Div blocks and Span inlines.
Setting of default epub extensions has been moved from the EPUB
reader to Text.Pandoc.
We now maintain the invariant that when fetchImages is called,
all images have absolute paths.
This patch fixes several bugs relating to this, as there are three
places where images can be introduced:
(1) During the HTML parse
(2) As spine elements
(3) As a cover image
For (1), the paths are corrected by the renameImages transformation.
For (2) and (3), we need to append the "root" to the path we parse from
the spine.
Before, the images were relative to the position of the package file.
The collapse function changed this so that they were absolute within the
archive, but the fetchImages function wasn't updated to recognise this.
`pandoc -t markdown-raw_html` should not emit any raw HTML, even
span and div tags that go with pandoc Span and Div elements.
Cleaned up a bit of the logic with extensions and plain.
This changes the signature of the exported `readOMML` to `String ->
Either String [Exp]`, so it can now, in theory, be slotted into
TeXMath. It doesn't have any real error reporting yet, but that might
make more sense once I put it in a branch, and understand how it works
in the other readers.
It also now reads strings that parse to either oMath or oMathPara
elements. Note that the distinction is lost in the output. It's up to
the caller to remember the display type.
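A hypothetical caller might look like this (only the `readOMML`
signature is from this change; the rest is a stub for illustration):
```
import Text.TeXMath.Types (Exp)

-- Stub standing in for the actual reader, with the new signature.
readOMML :: String -> Either String [Exp]
readOMML = error "provided by the OMML reader"

-- The oMath/oMathPara distinction is lost in [Exp], so a caller has to
-- record the display type alongside the parsed expressions itself.
parseMath :: Bool -> String -> Either String (Bool, [Exp])
parseMath isDisplay s = fmap ((,) isDisplay) (readOMML s)
```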
We still need to test against prefixes, but this is only going to look
at oMath fragments, so we're not going to be worried about looking up
the real namespace.
The new version of TeXMath can translate from its type system into
LaTeX. So instead of writing the LaTeX ourselves, we write to the TeXMath
`Exp` type, and let TeXMath do the rest.
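A minimal sketch of handing the work to TeXMath (the symbol
classification below is a guess; the point is just `Exp` in, LaTeX out):
```
{-# LANGUAGE OverloadedStrings #-}
import Text.TeXMath (writeTeX)
import Text.TeXMath.Types (Exp(..), TeXSymbolType(..))

-- Build Exps by hand and let TeXMath render the LaTeX.
main :: IO ()
main = print (writeTeX [EIdentifier "x", ESymbol Bin "+", ENumber "1"])
```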
Using `map toUpper` to capitalise text is wrong, as e.g.
“Straße” should be converted to “STRASSE”, which is 1 character
longer. This commit adds a `capitalize` function and replaces
2 identical implementations in different modules (`toCaps` and
`capitalize`) with it.
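The string-level issue can be seen with Data.Text, whose toUpper applies
full Unicode case mappings (pandoc's actual `capitalize` works on inline
elements; this sketch only shows the character-level problem):
```
import Data.Char (toUpper)
import qualified Data.Text as T

-- map toUpper maps each character independently, so 'ß' stays 'ß';
-- a string-level uppercasing handles one-to-many mappings like 'ß' -> "SS".
main :: IO ()
main = do
  putStrLn (map toUpper "Straße")                   -- "STRAßE"
  putStrLn (T.unpack (T.toUpper (T.pack "Straße"))) -- "STRASSE"
```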
* More consistent logic: absolute URIs are fetched from the net;
other things are treated as relative URIs if sourceURL is a Just,
otherwise as file paths.
* We escape characters that are not allowed in URIs before trying
to parse them (e.g. '|', which often occurs in the wild).
* When treating relative paths as local file paths, we drop
any fragment or query. This is useful e.g. when you've downloaded
web fonts locally, but your source still contains the original
relative URLs.
Together with the previous commit, this should close #1477.
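A rough sketch of that decision logic (not the actual fetchItem code;
the helper below is made up):
```
import Network.URI (escapeURIString, isAllowedInURI, isURI)

data Source = FromNet String | FromFile FilePath
  deriving Show

classify :: Maybe String -> String -> Source
classify sourceURL raw
  | isURI s                = FromNet s                  -- absolute URI: fetch from the net
  | Just base <- sourceURL = FromNet (base ++ "/" ++ s) -- treat as relative URI
  | otherwise              = FromFile (dropFragmentAndQuery raw)
  where
    -- escape characters not allowed in URIs (e.g. '|') before parsing
    s = escapeURIString isAllowedInURI raw
    dropFragmentAndQuery = takeWhile (`notElem` ['#', '?'])
```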
* mkSelfContained now takes just two arguments, WriterOptions and
the string.
* It no longer looks in data files. This only made sense when we
had copies of slidy and S5 code there.
* Shared.fetchItem' is used instead of the nearly duplicate getItem.
The parser had been changing footnotes and endnotes into footnotes. This
isn't a problem, because pandoc collapses them, but the parser should
maintain as much of the docx structure as possible, and let the
top-level reader worry about how to translate it into Pandoc. (This would
be an issue when, as is planned, the docx parser spins off into its
own module.)
The output is the same, so no test change is required.
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
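For example (a sketch; signatures follow Text.Pandoc.MediaBag as of
this change and may differ in later versions):
```
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy as BL
import Text.Pandoc.MediaBag (MediaBag, insertMedia)

-- An empty bag is just mempty, since MediaBag is a Monoid.
emptyBag :: MediaBag
emptyBag = mempty

withLogo :: BL.ByteString -> MediaBag
withLogo bytes = insertMedia "images/logo.png" (Just "image/png") bytes mempty
```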
This will make paragraphs styled with `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract` into pandoc metavalues, rather than text. The
implementation only takes those elements from the beginning of the
document (ignoring empty paragraphs).
Multiple paragraphs in the `Author` style will be made into a MetaList,
one paragraph per item. Hard linebreaks (shift-return) in the paragraph
will be maintained, and can be used for institution, email, etc.
Math now appears in Unicode if possible, without the distracting
italics around identifiers.
Blank lines around headers are more consistent.
Footnotes appear in regular [n] style.
We now largely follow the style of Project Gutenberg.
Emphasis is rendered with `_underscores_`, strong with ALL CAPS.
The appearance of horizontal rules has changed (even in regular
markdown) to a line across the whole page.
Headings are rendered differently, using space to set them off.
http://txt2tags.org/
There are two points which currently do not match the official
implementation.
1. In the official implementation, lists cannot be nested like the
following, but this reader interprets it as a bullet list whose first
item is a numbered list:
```
- + This is not a list
```
2. The specification describes how URIs automatically become links.
Unfortunately, as is often the case, their definition of a URI is not
clear. I tried three solutions but was unsure about which to adopt.
* Using isURI from Network.URI; this matches far too many strings and is
therefore unsuitable.
* Using uri from Text.Pandoc.Shared; this doesn't match all strings that
the reference implementation matches.
* Trying to simulate the regex that is used in the native code.
I went with the third approach, but it is not perfect; for example,
trailing punctuation is captured in URLs.