pandoc

Author	SHA1	Message	Date
John MacFarlane	a4d28cdd6d	Fixed absolute URI detection in EPUB writer. Closes #1672.	2014-10-08 14:54:03 -07:00
Freiric Barral	24231623f3	fix inDirectory to reset to the original directory in case an exception occurs	2014-10-08 23:25:01 +02:00
John MacFarlane	d60707eed0	EPUB writer: Don't add sourceURL to absolute URIs! Closes #1669. If there are further issues, please open a new, targeted issue on the tracker. Some notes on the further issues you gestured at: Data URIs are indeed dereferenced, but why is this a problem? (The function being used to fetch from URLs is used for many different formats. Preserving data URIs would make sense in EPUBs, but not for e.g. PDF output. And by dereferencing we can get a smaller, more efficient EPUB, with the data stored as bytes in a file rather than encoded in textual representation.) "absolute uris are not recognized" -- I assume that is the problem just fixed. If not, please open a new issue. "relative uris are resolved (wrongly) like file paths" -- can you give an example? `<base>` tag is ignored. Yes. I didn't know about the base tag. Could you open a new issue just for this?	2014-10-08 11:52:47 -07:00
Grégory Bataille	8a1a5948be	Get the page width from the reference file and use it to scale images that are too large. When there is no reference file, default to a US letter portrait size to scale the images.	2014-10-05 14:53:06 +02:00
Jason Ronallo	3dc58090d2	add mime type for WebVTT	2014-10-04 22:40:02 -04:00
John MacFarlane	bf00556c72	Added `track` to list of tags treated by `--self-contained`. Closes #1664.	2014-10-04 11:39:08 -07:00
Wikiwide	678aa31561	cref, sep Adding inlineCommands	2014-10-03 11:33:02 +10:00
John MacFarlane	08ac33815b	RST writer: Wrap line blocks with spaces before continuations. Improves on fix to #1656.	2014-09-30 09:25:54 -07:00
John MacFarlane	29e1c9529f	Don't wrap lines in rST line blocks. Closes #1656. Fixing pandoc to wrap the lines but insert spaces would be much more complicated. This at least makes the output semantically correct.	2014-09-29 21:48:59 -07:00
John MacFarlane	fe6d43b3e0	Merge pull request #1601 from jkr/windowsfix Fix path-slashes inside archive for windows	2014-09-27 16:21:17 -07:00
John MacFarlane	9c4e33f085	Merge pull request #1589 from mszep/master Add function to sanitize ConTeXt labels	2014-09-27 16:20:56 -07:00
Matthew Pickering	5cb475c374	Org Reader: Parse multi-inline terms correctly in definition list Closes #1649	2014-09-27 22:40:25 +01:00
Artyom	bc115ffc2d	Fix 'Ext_lists_without_preceding_blankline' bug. * Fixes #1636. * Adds a test.	2014-09-26 13:32:08 +04:00
mpickering	6740a9592a	HTML Reader: Recognise <br> tags inside <pre> blocks Closes #1620	2014-09-25 19:20:12 +01:00
mpickering	1f0ba8ec11	HTML Writer: Don't double render when email-obfuscation=none Closes #1625	2014-09-25 18:46:36 +01:00
mpickering	515a120d04	Add support for KaTeX HTML math Closes #1626	2014-09-25 18:32:42 +01:00
mpickering	575c76e36b	HTML Writer: MathML is now output with a TeX annotation. Closes #1635	2014-09-25 15:28:50 +01:00
mpickering	cc07d0c6bf	Shared: Make collapseFilePath OS-agnostic	2014-09-25 12:42:53 +01:00
mpickering	56e4ecab20	MediaBag: Fixes Windows-specific path problems. Changes the internal representation to fix the problem. I haven't tested this on Windows. Closes #1597	2014-09-25 12:19:52 +01:00
Mark Szepieniec	84b75a1c2a	ConTeXt writer: add function toLabel This function can be used to sanitize reference labels so that they do not contain any of the illegal characters \#[]",{}%()\|= . Currently only Links have their labels sanitized, because they are the only Elements that use passed labels.	2014-09-18 23:27:14 +02:00
Jesse Rosenthal	020a527c15	Docx writer: Renumber header and footer relationships to avoid collisions. We previously took the old relationship names of the headers and footers in secptr. That led to collisions. We now make a map of available names in the relationships file, and then rename in secptr.	2014-09-11 15:11:44 -04:00
Jesse Rosenthal	1326a41780	LaTeX writer: Protect graphics in headers. Graphics in `\section`/`\subsection` etc titles need to be `\protect`ed. This adds a state value and manually turns it on before every invocation of `sectionHeader` and manually turns it off after. Using a writer value and applying `local` would probably be cleaner, but this fits with the current style.	2014-09-09 10:56:37 -04:00
Jesse Rosenthal	132814aeb6	Docx Reader: Remove header class properly in other langs. When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.	2014-09-06 07:53:29 -04:00
Jesse Rosenthal	71452946d9	Docx reader: Use polyglot header list. We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.	2014-09-05 21:59:58 -04:00
Jesse Rosenthal	13fefd7959	Docx Reader: Start list of polyglot section headers.	2014-09-05 17:31:24 -04:00
Jesse Rosenthal	73b887e2df	Org reader: Added state changing blanklines. This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).	2014-09-04 19:55:53 -04:00
Jesse Rosenthal	ac8ed1fa93	Docx reader: Rewrite rewriteLink to work with new headers. There could be new top-level headers after making lists, so we have to rewrite links after that.	2014-09-04 16:44:21 -04:00
Jesse Rosenthal	7fe54505df	Docx reader: Single-item headers in ordered lists are headers. When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.	2014-09-04 16:35:57 -04:00
Jesse Rosenthal	4ef850ded5	Docx reader: Fix window path for image lookup. Don't use os-sensitive "combine", since we always want the paths in our zip-archive to use forward-slashes.	2014-09-02 13:45:01 -04:00
John MacFarlane	db90667a79	EPUB writer: Don't include nav node in spine unless --toc was requested. Previously we included it in the spine with `linear="no"`, leading to odd results in some readers. Closes #1593.	2014-09-01 16:31:32 -07:00
John MacFarlane	cb1a8da01c	LaTeX writer: Avoid using reserved characters as \lstinline delimiters. Closes #1595.	2014-09-01 10:11:09 -07:00
John MacFarlane	43ebb0229f	EPUB writer: Fixed typo.	2014-09-01 08:04:21 -07:00
John MacFarlane	3533218d6d	Merge pull request #1594 from jkr/itemFix Item fix	2014-08-31 19:31:38 -07:00
John MacFarlane	01b7957812	EPUB writer: Extract title even from structured title. Added docTitle'.	2014-08-31 14:47:07 -07:00
Jesse Rosenthal	d3053807a8	LaTeX writer: Put `~` before header in item text. Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item. (To be precise, it can't handle it if the list immediately follows a section header, but the instance is rare enough that we can afford to be a bit more general). This puts a non-breaking space before the header to solve this problem. We won't see this space, since the header skips a line before printing anyway. The output is ugly in LaTeX and this structure seems like it should probably be avoided. But it is valid HTML and native pandoc, so we should have some sort of typesettable representation in LaTeX.	2014-08-31 16:05:09 -04:00
John MacFarlane	598d3ee23b	Markdown reader: better handling of paragraph in div. Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag. Test case: <div class="first"> This is a paragraph. This is another paragraph. </div> Closes #1591.	2014-08-31 12:55:47 -07:00
John MacFarlane	54df49335a	EPUB writer: Don't use opf:title-type for epub2. It is not supported and epubcheck complains.	2014-08-31 12:08:17 -07:00
John MacFarlane	611bc27862	Shared: Moved import of toChunks outside of conditional. Closes #1590.	2014-08-31 11:17:53 -07:00
John MacFarlane	374bb3c147	DokuWiki writer: Make tables prettier by aligning columns. Also cleaned up crufty code and added tests.	2014-08-30 21:24:33 -07:00
John MacFarlane	d97aed3903	DokuWiki writer: Handle table cell alignments. Closes #1566.	2014-08-30 20:54:33 -07:00
John MacFarlane	a8273009ba	Textile reader: Improved table support. We can now handle all different alignment types, for simple tables only (no captions, no relative widths, cell contents just plain inlines). Other tables are still handled using raw HTML. Addresses #1585 as far as it can be addressed, I believe.	2014-08-30 20:34:42 -07:00
John MacFarlane	b50927527c	PDF: Catch errors in conversion of images and display message. See #1582.	2014-08-30 18:45:58 -07:00
John MacFarlane	f70e3c3297	Merge branch 'mime' of https://github.com/Aelve/John into Aelve-mime Conflicts: src/Text/Pandoc/Writers/Docx.hs	2014-08-30 11:49:50 -07:00
John MacFarlane	8b09d954f9	Merge pull request #1580 from jkr/stringCellDokuWiki DokuWiki writer: Backslash newlines in table cells	2014-08-30 09:22:38 -07:00
Jesse Rosenthal	ccda2a902c	DokuWiki writer: Use backslash newlines in table cells. Write out strings in table cells with backslash linebreaks in place of newlines. We also want to remove the first two spaces of an indent in lists.	2014-08-30 07:17:55 -04:00
John MacFarlane	218633548f	Merge pull request #1574 from jlduran/latex-horizontal-rule LaTeX writer: Make Horizontal Rules more flexible	2014-08-29 21:40:55 -07:00
John MacFarlane	017d44af1d	Merge branch 'ugly-tables' of https://github.com/jlduran/pandoc into jlduran-ugly-tables	2014-08-29 21:24:36 -07:00
Jose Luis Duran	4c684561ee	LaTeX writer: Add `\strut` to fix multiline tables See: http://tex.stackexchange.com/questions/34971	2014-08-29 13:54:08 +00:00
Jesse Rosenthal	c931be24e1	Docx Reader: Read single para in table cell as plain. This makes the docx reader's native output fit with the way the markdown reader understands its markdown output. I.e., as far as table cells go: docx -> native == docx -> native -> markdown -> native. (This identity isn't true for other things outside of table cells, of course.)	2014-08-28 14:35:33 -04:00
Jose Luis Duran	1fc665c07d	LaTeX writer: Make Horizontal Rules more flexible Currently, pandoc has hard-coded the following in order to make horizontal rules in LaTeX: ```hs "\\begin{center}\\rule{3in}{0.4pt}\\end{center}" ``` Which is fine, but does not allow customizations. It also does not take into consideration the current line width. I'm proposing this change: ```diff @@ In Writers/LaTeX.hs: -"\\begin{center}\\rule{3in}{0.4pt}\\end{center}" +"\\begin{center}\\rule{0.5\\linewidth}{\\linethickness}\\end{center}" ```	2014-08-28 03:12:37 +00:00
Jose Luis Duran	f1d330b7b5	LaTeX writer: Fix tables - [x] Fix a bug introduced in `66378062b6`, which causes the table caption to repeat across all pages - [x] Address the issues discussed [here](https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J) regarding the extra vertical space. - [ ] NOTE: This will cause multiline table cells to appear unpadded. See http://tex.stackexchange.com/questions/34971 - [x] Use [`\tabularnewline`](http://tex.stackexchange.com/questions/78796) instead of `\\`.	2014-08-28 02:02:20 +00:00
Matthew Pickering	404a58f456	DokuWiki Writer: Refactor to use Reader monad	2014-08-27 14:29:09 +01:00
Matthew Pickering	495f55b03e	DokuWiki Writer: Hlint cleanup	2014-08-27 13:48:19 +01:00
Matthew Pickering	3412018287	DokuWiki Writer: Qualified all imports	2014-08-27 13:29:09 +01:00
John MacFarlane	8f2aa45d69	Merge pull request #1564 from jkr/trackChangesWriter Docx writer: write track changes.	2014-08-26 22:22:16 -07:00
Calvin Beck	f813755c55	Fixed exampleLine parser to accept example lines which have indentation at the start of the line.	2014-08-26 21:56:40 -06:00
Jesse Rosenthal	b613d85af9	Docx writer: Accommodate GHC 7.4 (no lookupEnv)	2014-08-26 06:43:14 -04:00
Jesse Rosenthal	21253b59e8	Docx writer: Default to user login and time of change if not given.	2014-08-25 14:03:35 -04:00
Jesse Rosenthal	e1bb28a388	Docx writer: Implement track changes. These have default authors and dates of "unknown" and timestamp-zero, respectively.	2014-08-25 12:37:56 -04:00
John MacFarlane	9f8051d95d	Hlint changes to Docx writer.	2014-08-24 11:37:23 -07:00
John MacFarlane	0ef1f787c7	Docx writer: Bibliography entries get Bibliography style. Closes #1559.	2014-08-23 20:52:09 -07:00
John MacFarlane	2956ef251c	Fixed --self-contained with Windows paths. Previously C:\foo.js was being wrongly interpreted as a URI. Closes #1558.	2014-08-22 23:21:57 -07:00
mpickering	aa808055f0	Txt2Tags Reader: Fixed crash when reading from stdin	2014-08-21 17:11:21 +01:00
mpickering	3b6d7afa71	Txt2Tags Reader: Corrected formatting of %%mtime macro	2014-08-21 17:11:16 +01:00
mpickering	2a7319541d	Txt2Tags Reader: Parse Meta information. The header is now parsed as meta information. The first line is the `title`, the second is the `author` and the third is the `date`.	2014-08-21 17:09:40 +01:00
mpickering	2cd049a1bf	Txt2Tags reader: Header is now parsed only if standalone flag is set	2014-08-20 18:11:37 +01:00
John MacFarlane	60f3e777f3	EPUB writer: don't use page-progression-direction in EPUB2. Also, if page-progression-direction not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. Closes #1550.	2014-08-19 09:21:26 -07:00
John MacFarlane	716ad5fd8a	Merge pull request #1547 from jkr/styleparse Docx reader: parsing styles	2014-08-18 14:14:01 -07:00
John MacFarlane	6dce8c6760	HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline): <video controls="controls"> <source src="../videos/test.mp4" type="video/mp4" /> <source src="../videos/test.webm" type="video/webm" /> <p> The videos can not be played back on your system.<br/> Try viewing on Youtube (requires Internet connection): <a href="http://youtu.be/etE5urBps_w">Relative Velocity on Youtube</a>. </p> </video> This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.	2014-08-18 12:41:09 -07:00
Jesse Rosenthal	4b38e9f1f0	Docx reader: whitespace fix.	2014-08-17 20:11:50 -04:00
Jesse Rosenthal	198aea190f	Docx reader: remove emph styles and strong styles list. We no longer need the explicit lists since we're deriving them from the ground up.	2014-08-17 17:04:55 -04:00
Jesse Rosenthal	9da7b0946e	Docx reader: Add "Hyperlink" to blacklisted styles. This is the only one so far. We'll add others as they show up.	2014-08-17 17:04:14 -04:00
Jesse Rosenthal	15ce28b8ca	Docx reader: Use style resolver. We now no longer check against explicit styles.	2014-08-17 17:03:44 -04:00
Jesse Rosenthal	03d5d8e596	Docx Reader: Introduce function for resolving dependent run styles. We always favor an explicit positive or negative in a style in a descendent, and only turn to the ancestor if nothing is set. We also introduce an (empty) list of styles that are black-listed. We won't check them. (Think underlines in hyperlinks).	2014-08-17 16:54:11 -04:00
John MacFarlane	8e60d35d58	Merge pull request #1536 from considerate/master Add row width to tables in Docx XML	2014-08-17 12:54:38 -07:00
John MacFarlane	b6103eeb83	Merge pull request #1543 from jkr/superSubVert Docx reader: Change behavior of Super/Subscript	2014-08-17 12:54:10 -07:00
John MacFarlane	c14088ea93	Docx writer: Fixed regression, bungled list numbering. In pandoc 1.13, all lists come out as basic ordered lists. This fixes that bad regression. Closes #1544.	2014-08-17 12:48:41 -07:00
Jesse Rosenthal	99491f0d98	Docx Parse: build a bottom-up style tree. Two points here: (1) We're going bottom-up, from styles not based on anything, to avoid circular dependencies or any other sort of maliciousness/incompetence. And (2) each style points to its parent. That way, we don't need the whole tree to pass a style over to Docx.hs	2014-08-17 15:46:17 -04:00
Artyom Kazak	357172f13a	Remove an unnecessary import.	2014-08-17 23:44:02 +04:00
Artyom Kazak	6a34cd3ddf	Update Reader.EPUB to use `MimeType`.	2014-08-17 21:00:55 +04:00
Artyom Kazak	cca9e8feb4	MIME cleanup. * Create a type synonym for MIME type (instead of `String`). * Add `getMimeTypeDef` function. * Avoid recreating MIME type `Map`s every time. * Move “Formula-...” case handling into `getMimeType`.	2014-08-17 21:00:50 +04:00
Jesse Rosenthal	b8f1658c36	Alias string and runStyle to CharStyle type.	2014-08-17 11:30:22 -04:00
Jesse Rosenthal	c4871ac790	Docx Style parser: Basic one now just takes a parent style. This will make it easier to build the style map from the bottom up (to avoid any infinite references).	2014-08-17 10:19:48 -04:00
Jesse Rosenthal	75eec0a6b8	Docx reader: work with new rStyle. Just discards info at the moment, so at least it works the same.	2014-08-17 09:22:25 -04:00
Jesse Rosenthal	ea85a797c2	Parser: Framework for parsing styles. We want to be able to read user-defined styles. Eventually we'll be able to figure out styles in terms of inheritance as well. The actual cascading will happen in the docx reader.	2014-08-17 09:22:21 -04:00
Jesse Rosenthal	dc5b0ba09b	Docx reader: Change behavior of Super/Subscript. In docx, super- and subscript are attributes of Vertalign. It makes more sense to follow this, and have different possible values of Vertalign in runStyle. This is mainly a preparatory step for real style parsing, since it can distinguish between vertical align being explicitly turned off and it not being set. In addition, it makes parsing a bit clearer, and makes sure we don't do docx-impossible things like being simultaneously super and sub.	2014-08-17 08:20:00 -04:00
John MacFarlane	9d52ecdd42	HTML reader: Parse appropriately styled span as SmallCaps.	2014-08-16 22:57:00 -07:00
Viktor Kronvall	753e47194c	Simplify row width calculation.	2014-08-17 03:24:44 +02:00
Christoffer Ackelman	9f3c34841b	Include row width in table rows. Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000).	2014-08-17 03:24:44 +02:00
John MacFarlane	cb4ae6112e	Markdown writer: don't escape $, ^, ~ when extensions are deactivated. `tex_math_dollars`, `superscript`, and `subscript` extensions, respectively. Closes #1127.	2014-08-16 17:14:51 -07:00
Jesse Rosenthal	9bb0b99981	Docx reader: Remove unnecessary plural functions. Functions like runElemsToInlines and parPartsToInlines are just defined in terms of concatting and mapping their singular version (e.g. `runElemToInlines`). Having two functions with almost identical names makes it easier to introduce errors. It's easy enough to just concat and map inline, and it makes it clearer what is going on in the code.	2014-08-16 15:07:41 -04:00
Jesse Rosenthal	9969b2ebee	Docx reader: Fix bug in character styles. Style handling has been cleaned up, but introduced a bug here. There wasn't previously a test to catch it.	2014-08-16 14:05:19 -04:00
Jesse Rosenthal	0ff9ec2f4e	Rewrite Docx.hs and Reducible to use Builder. The big news here is a rewrite of Docx to use the builder functions. As opposed to previous attempts, we now see a significant speedup -- times are cut in half (or more) in a few informal tests. Reducible has also been rewritten. It can doubtless be simplified and clarified further. We can consider this, at the moment, a reference for correct behavior.	2014-08-16 10:22:55 -04:00
John MacFarlane	8bf39cf6d6	Markdown reader: Better handle quote characters in inline links. This was previously failing to be recognized as a link: [Test](http://en.wikipedia.org/wiki/Ward's_method) Closes #1534.	2014-08-14 10:59:27 -07:00
John MacFarlane	e917bcc124	Make `raw_tex` extension non-default for textile reader, writer. Enable `raw_tex` extension in textile writer. Closes #1532.	2014-08-14 09:49:31 -07:00
John MacFarlane	52a9ccce4f	Merge pull request #1531 from jkr/morefonts Docx reader: Interpret "Strong" and "Emphasis" run styles.	2014-08-13 19:47:42 -07:00
John MacFarlane	17b2fd567b	Fixed haddock comment.	2014-08-13 13:59:50 -07:00
John MacFarlane	05b7fd8dee	Removed unneeded import.	2014-08-13 11:35:09 -07:00
Jesse Rosenthal	6897905602	Docx reader: Interpret "Strong" and Emphasis run styles.	2014-08-13 12:23:03 -04:00
John MacFarlane	22ab3367c6	Removed unneeded CPP.	2014-08-12 22:50:51 -07:00
Jesse Rosenthal	a1320a76f9	Docx: Reducible forgot about smallcaps	2014-08-13 00:09:40 -04:00
Jesse Rosenthal	dca55630e6	Docx Reader: Trim line breaks from the beginning and end of Section Headers. We might also want to do this elsewhere (for pars, for example).	2014-08-12 23:42:01 -04:00
Jesse Rosenthal	378a795eaa	Docx: More robust handling of multiple bookmarks in header.	2014-08-12 23:41:57 -04:00
Jesse Rosenthal	85579052b5	Docx reader: Check for null-id'd anchors too. Otherwise they get left dangling in the document.	2014-08-12 23:33:03 -04:00
Jesse Rosenthal	194ed88852	Docx reader: accept explicit "Italic" and "Bold" rStyles. Note that "Italic" can be on, and, from the last commit, `<w:i>` can be present, but be turned off. In that case, the turned-off tag takes precedence. So, we have to distinguish between something being off and something not being there. Hence, isItalic, isBold, isStrike, and isSmallCaps have become Maybes.	2014-08-12 22:39:18 -04:00
Jesse Rosenthal	aae71ad595	Docx reader: Add "BlockQuotation" to divs list.	2014-08-12 22:08:30 -04:00
Jesse Rosenthal	d4748038d7	Docx Reader: Fix font style parsing. Before we just checked for the existence of a tag. Now, we make sure to check for its on/off value.	2014-08-12 22:04:07 -04:00
John MacFarlane	e883ef4eb9	Merge pull request #1527 from mpickering/juicypixels Attempts to convert gif, tiff and bmp to png in pdf writer	2014-08-12 16:57:22 -07:00
John MacFarlane	7684b24959	Merge pull request #1528 from mpickering/epubtitlepage EPUB Reader: Ignores titlepage attribute	2014-08-12 16:56:38 -07:00
Matthew Pickering	2b31df32de	LaTeX Writer: Added missing closing braces to hyperdef commands	2014-08-13 00:37:18 +01:00
Matthew Pickering	57bebe26df	PDF Writer: Attempts to convert images to pdf renderable formats. Now depends on the JuicyPixels library. Will attempt to convert an image (gif, tiff, bmp) to png when converting to pdf.	2014-08-13 00:37:18 +01:00
John MacFarlane	81157c7cc6	HTML writer: use 'uri' or 'email' class for autolinks. This allows them to be styled specially. Closes #1501.	2014-08-12 15:49:43 -07:00
John MacFarlane	da507dcb84	ConTeXt writer: improved autolink detection. It previously failed in some cases with escaped special characters.	2014-08-12 15:49:20 -07:00
Matthew Pickering	34cf016251	EPUB Reader: Ignore title pages	2014-08-12 23:03:24 +01:00
John MacFarlane	f16dd1bfdf	DocBook: Support equations with mathml. equation, informalequation, inlineequation and mml:math elements.	2014-08-12 14:00:46 -07:00
John MacFarlane	d6f0973128	Merge pull request #1524 from jkr/dropCap3 Docx reader: move dropcap combining logic to Reducible	2014-08-12 11:13:27 -07:00
John MacFarlane	f97ec6db2c	Markdown reader: Improved parsing of indented code in list items. Indented code at the beginning of a list item must be indented eight spaces from the margin (or from the edge of the container), or four spaces past the list marker, whichever is farther. Some examples in `tests/markdown-reader-more.txt`.	2014-08-12 11:10:48 -07:00
John MacFarlane	ab75e1d3bd	Beamer: Use \footnote<.->{..} for notes. This ensures that the footnotes will not appear before the overlays in which their corresponding note markers appear. Closes #1525.	2014-08-12 10:56:57 -07:00
Jesse Rosenthal	9d0b390d48	Docx reader: move combining logic to Reducible. Introduces a new function in Reducibles, concatR. The idea is that if we have two lists of Reducibles (blocks or inlines), we can combine them and just perform the reduction on the joining parts (the last element of the first list, the first element of the second list). This is useful in cases where the two lists are already reduced, and we're only worried about the joining elements. This actually improves the efficiency a bit further, because concatR can be smart about empty lists.	2014-08-12 10:26:49 -04:00
Jesse Rosenthal	e4a8e4a636	Docx reader: Make dropcap combining more efficient. Before, we had to run reduceList on the whole combined paragraph, which was redundant, and could take some time for long paragraphs. We only need to combine the drop cap with the first inline of the next paragraph.	2014-08-12 09:00:53 -04:00
Jesse Rosenthal	45ec035e93	Docx reader: combine inlines properly in dropcaps. Make sure that adjacent inlines are combined properly in dropcaps. This updates the test results as well.	2014-08-11 23:31:16 -04:00
Jesse Rosenthal	3e32cd5bb1	Docx reader: Use dropcap state. If we get to a dropcap, we hold the inlines until the next paragraph, and combine them there.	2014-08-11 23:08:33 -04:00
Jesse Rosenthal	bca74a2bd0	Add dropCap to paragraph style.	2014-08-11 21:42:02 -04:00
John MacFarlane	86d4da994a	EPUB reader: use walk instead of bottomUp. This should be more efficient.	2014-08-11 14:48:42 -07:00
John MacFarlane	31811657fa	Merge pull request #1521 from jkr/emptyEmph Discard empty formatters	2014-08-11 14:44:08 -07:00
John MacFarlane	211fe266e0	LaTeX writer: Don't produce `\label{}` for Div or Span. Just `\hyperdef`. A slight amendment to #1519.	2014-08-11 12:20:44 -07:00
John MacFarlane	95d9b43b42	Merge pull request #1519 from mpickering/more EPUB Normalisation and anchors for div blocks in tex	2014-08-11 11:28:11 -07:00
John MacFarlane	6fae136cbb	Textile reader: list and HTML block parsing improvements. Closes #1513. Lists can now start without an intervening blank line. Also, html block-level tags that don't start a line are parsed as RawInline and don't interrupt paragraphs, as in RedCloth.	2014-08-11 11:22:39 -07:00
John MacFarlane	4a535211d8	Merge pull request #1365 from gbataille/docx-margin Scale images to fit the page for DOCX	2014-08-11 10:44:52 -07:00
Jesse Rosenthal	0411fe7ccf	Docx reader: handle empty reducibles.	2014-08-11 12:48:16 -04:00
Matthew Pickering	1952dd0592	TeX Writer: Write hyperdef and label for identifiers on Div blocks	2014-08-11 16:23:05 +01:00
Matthew Pickering	72b1470713	EPUB Reader: Fixed another normalisation problem.	2014-08-11 16:23:05 +01:00
John MacFarlane	e690fe4a3e	Merge pull request #1516 from mpickering/epubmetadata EPUB improvements	2014-08-11 08:14:54 -07:00
Matthew Pickering	973ed469de	Docx Parse: Improved font recognition when specified in rFonts element	2014-08-11 10:30:32 -04:00
Matthew Pickering	427466f80c	Docx Fonts: Derives Show and Eq	2014-08-11 10:30:32 -04:00
Matthew Pickering	e02360d3d8	EPUB Reader: Can now parse multiple meta data fields	2014-08-11 13:12:42 +01:00
Matthew Pickering	285d56dea7	EPUB Writer: Added page-progression-direction meta field	2014-08-11 11:21:38 +01:00
Matthew Pickering	9eded27e32	EPUB reader: Fixed bug where filepaths weren't sufficiently normalised	2014-08-11 11:20:33 +01:00
Matthew Pickering	1f02ff60ba	EPUB Writer: Added explicit imports	2014-08-11 10:21:52 +01:00
John MacFarlane	65b31e0cac	Merge pull request #1510 from jkr/spacefix Docx reader: Fix spacing issue.	2014-08-10 07:11:59 -07:00
John MacFarlane	7ec8dd956f	Removed OMath module, depend on texmath >= 0.8.	2014-08-10 06:19:41 -07:00
Jesse Rosenthal	c15978ce5e	Change head/tail to pattern guards.	2014-08-10 09:10:34 -04:00
Jesse Rosenthal	a02ce74acf	Docx reader: Fix spacing issue. Previously spaces at the beginning of Emph/Strong/etc were kept inside. This makes sure they are moved out.	2014-08-09 23:35:09 -04:00
Matthew Pickering	3bb19307f6	Docx Parse: Recognises code points in sym elements which are in the private range	2014-08-09 22:37:12 -04:00
Matthew Pickering	edc57f77fc	Added Text.Pandoc.Readers.Docx.Fonts	2014-08-09 22:37:12 -04:00
Matthew Pickering	2deaa7096f	Docx Reader: Added recognition of sym element in paragraphs	2014-08-09 22:37:12 -04:00
Matthew Pickering	4ae61bdf8f	EPUB: Fixed another mediabag-related regression.	2014-08-10 00:12:09 +01:00
Matthew Pickering	a6648e5a73	EPUB Reader: Changed image paths to be relative to manifest file	2014-08-09 23:06:16 +01:00
John MacFarlane	4983083079	HTML writer: Don't include empty TOC items for slide shows. Previously creating a slide with a horizontal rule would result in an empty list item in the TOC. This patch fixes that.	2014-08-09 10:29:39 -07:00
John MacFarlane	bc06ef0edb	Merge branch 'newbranch' of https://github.com/mpickering/pandoc into mpickering-newbranch Conflicts: src/Text/Pandoc/Readers/EPUB.hs	2014-08-08 22:22:55 -07:00
John MacFarlane	19daf6cf0a	Added `native_divs` and `native_spans` extensions. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines. Setting of default epub extensions has been moved from the EPUB reader to Text.Pandoc.	2014-08-08 21:05:34 -07:00
John MacFarlane	a4a6b6f28c	Plain writer: Use ALL CAPS for level 1 headers.	2014-08-08 15:20:29 -07:00
Matthew Pickering	cfd8c0214c	EPUB Reader: Improved robustness of image extraction. We now maintain the invariant that when fetchImages is called, all images have absolute paths. This patch fixes several bugs relating to this, as there are three places where images can be introduced: (1) during the HTML parse, (2) as spine elements, (3) as a cover image. For (1), the paths are corrected by the transformation renameImages. For (2) and (3), we need to append the "root" to the path we parse from the spine.	2014-08-08 23:04:03 +01:00
Matthew Pickering	40ae8efddc	EPUB Reader: Fixed regressions in image extraction. Before, the images were relative to the position of the package file. The collapse function changed this so that they were then absolute in the archive, but the fetchImages function wasn't updated to recognise this.	2014-08-08 22:31:27 +01:00
Matthew Pickering	8c551f6f43	EPUB Reader: Use collapseFilePath	2014-08-08 22:31:22 +01:00
Matthew Pickering	2d956677ef	Shared: Added collapseFilePath function This function removes intermediate "." and ".." from a path.	2014-08-08 22:31:02 +01:00
Matthew Pickering	116f03a70a	EPUB Reader: Removed incorrectly set reader flag	2014-08-08 22:31:02 +01:00
John MacFarlane	aae90a8671	Merge pull request #1503 from jkr/streamlineMath OMath parser: Change signature of exported function.	2014-08-08 13:45:30 -07:00
John MacFarlane	f723a0575d	Markdown writer: Respect -raw_html. pandoc -t markdown-raw_html should not emit any raw HTML, even span and div tags that go with pandoc Span and Div elements. Cleaned up a bit of the logic with extensions and plain.	2014-08-08 13:34:57 -07:00
Jesse Rosenthal	a426812ccc	OMath parser: Change signature of exported function. This changes the signature of the exported `readOMML` to `String -> Either String [Exp]`, so it can now, in theory, be slotted into TeXMath. It doesn't have any real error reporting yet, but that might make more sense once I put it in a branch, and understand how it works in the other readers. It also now reads strings that parse to either oMath or oMathPara elements. Note that the distinction is lost in the output. It's up to the caller to remember the display type.	2014-08-08 16:34:38 -04:00
John MacFarlane	7b47042ae6	Textile reader: fixed list parsing bug. Closes #1500.	2014-08-08 12:18:47 -07:00
John MacFarlane	dd78dd6d1b	Textile reader: don't allow inline formatting to extend over newline. This matches behavior of RedCloth, avoids some ugly bugs, and improves performance.	2014-08-08 12:18:47 -07:00
Jesse Rosenthal	2f7a627f6d	OMath: Finish initial cleanup. This gets rid of commented-out functions, cleans up whitespace errors, and exports and imports the correct functions.	2014-08-08 14:16:54 -04:00
Jesse Rosenthal	ba5804f9ec	OMath: Remove Namespaces. We still need to test against prefixes, but this is only going to look at oMath fragments, so we're not going to be worried about looking up the real namespace.	2014-08-08 14:15:17 -04:00
Jesse Rosenthal	0acd139fb1	OMath: Start phasing out internal OMath type. This is the first step in removing the intermediate OMath type, which we no longer need since we're writing straight to TeXMath Exp.	2014-08-08 14:14:30 -04:00
Jesse Rosenthal	cf849443cb	OMath parser: don't group expressions if there's only one.	2014-08-08 14:12:05 -04:00
Matthew Pickering	40602c3df6	HTML EPUB exts: switch element can now be in either the inline or block position	2014-08-08 10:25:40 -07:00
John MacFarlane	94466c0060	HTML reader: Really ignore DOCTYPE and xml declarations. This actually does what `d71b013841` said it did. Revised epub tests to remove the repeated DOCTYPE and xml tags.	2014-08-07 22:12:44 -07:00
John MacFarlane	3c4079edc8	Merge pull request #1488 from mpickering/epubfixes EPUB Reader: Improved image extraction	2014-08-07 19:00:32 -07:00
John MacFarlane	08bed142ba	Merge pull request #1496 from mpickering/master Org Writer: Write anchor elements	2014-08-07 16:29:20 -07:00
Matthew Pickering	07bb41d6da	Org Writer: Write anchor elements. The Org Writer now writes empty span elements which have an id as an anchor. For example, `Span ("uid", [], []) []` becomes `<<uid>>`.	2014-08-08 00:20:18 +01:00
Matthew Pickering	19d2ff68b1	EPUB Reader: Improved how images are extracted	2014-08-07 22:56:30 +01:00
John MacFarlane	17e48ba81e	Merge pull request #1494 from jkr/math-module Math module	2014-08-07 13:44:19 -07:00
Jesse Rosenthal	7bd7d4d476	Docx reader: Handle inline drawings. Previously, drawings that were under some other toplevel run (e.g., a hyperlink) wouldn't be properly handled. This should fix that.	2014-08-07 15:01:05 -04:00
Jesse Rosenthal	d293dd528b	OMath module: Add new file.	2014-08-07 12:41:33 -04:00
Jesse Rosenthal	a7967d1aef	Docx reader: Split math out into math module. Could use some cleanup, but this is the first step for getting an OMML reader into TeXMath.	2014-08-07 12:20:22 -04:00
Matthew Pickering	13f26af84f	Docx Reader: Added Default instances and removed withDState Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>	2014-08-06 19:15:33 -04:00
Jesse Rosenthal	91ab2f155f	Get rid of unused docx variable. Since changing the Docx type, this is no longer necessary. Thanks to Matthew Pickering for picking up on this.	2014-08-06 12:19:24 -04:00
John MacFarlane	444b1c2ad8	Merge pull request #1491 from jkr/texmath-equations Docx Reader: Use TeXMath for writing equations.	2014-08-06 09:07:00 -07:00
Jesse Rosenthal	cd9ca5a18a	Docx reader: remove now-unnecessary state variable. This also introduces a `defaultDState` value.	2014-08-06 11:20:41 -04:00
Jesse Rosenthal	cdd769624f	Remove now-unnecessary TexChar. TeXMath does the work now.	2014-08-06 11:20:41 -04:00
Jesse Rosenthal	06488c95fa	Add a note on how `mapD` works.	2014-08-06 11:20:41 -04:00
Jesse Rosenthal	3bc2ea4cf7	Docx reader: Use TeXMath to write math. The new version of TeXMath can translate from its type system into LaTeX. So instead of writing the LaTeX ourselves, we write to the TeXMath `Exp` type, and let TeXMath do the rest.	2014-08-06 11:20:27 -04:00
Uli Köhler	9d07db933c	MediaWiki reader doesn't recognize German "Bild"	2014-08-06 00:47:23 +02:00
Matthew Pickering	b04bb3b6d2	MediaBag: Improved normalisation when writing files	2014-08-05 11:02:23 +01:00
John MacFarlane	2de2842bdd	Merge pull request #1486 from Aelve/minor Very minor cleanup and readability changes	2014-08-04 22:07:02 -07:00
John MacFarlane	39b59b7603	Merge pull request #1476 from jkr/endnote-fix Docx Parser: Produce endnotes.	2014-08-04 21:59:58 -07:00
John MacFarlane	d71b013841	HTML reader: ignore <?xml..> and <DOCTYPE..> tags. Previously they were parsed as raw.	2014-08-04 18:39:39 -07:00
John MacFarlane	40d8100d44	Use texmath 0.7 interface.	2014-08-04 11:13:09 -07:00
Artyom Kazak	141fdf944a	Add PatternGuards pragmas.	2014-08-04 19:58:25 +04:00
Artyom Kazak	eb88444452	Remove redundant isHexDigit function.	2014-08-04 19:58:25 +04:00
Artyom Kazak	e51a2cedf9	Remove dangling `where` from one function.	2014-08-04 19:58:25 +04:00
Artyom Kazak	82118b3328	Use `stripPrefix` where appropriate.	2014-08-04 19:57:42 +04:00
Artyom Kazak	feebab9740	Clean up `mediaTypeOf` a bit.	2014-08-04 19:41:37 +04:00
Artyom Kazak	f659644fcc	Use `mapM_` instead of `() <$ mapM` in one place.	2014-08-04 19:41:37 +04:00
John MacFarlane	4630cff2a6	Merge branch 'epubend' of https://github.com/mpickering/pandoc into mpickering-epubend Conflicts: pandoc.cabal	2014-08-04 07:36:18 -07:00
Artyom Kazak	ec88d47f23	Correctly implement capitalisation. Using `map toUpper` to capitalise text is wrong, as e.g. “Straße” should be converted to “STRASSE”, which is 1 character longer. This commit adds a `capitalize` function and replaces 2 identical implementations in different modules (`toCaps` and `capitalize`) with it.	2014-08-03 17:37:37 +04:00
John MacFarlane	842c705097	SelfContained: Fixed determining of source URL from within CSS files. (This fixes a bug introduced a couple commits back.)	2014-08-02 16:33:22 -07:00
John MacFarlane	85ff3c5771	fetchItem: improved mime type guessing. Strip a fragment like `?#iefix` from the extension before doing the mime lookup.	2014-08-02 16:32:11 -07:00
John MacFarlane	1d137fbed6	Shared: fetchItem improvements. * More consistent logic: absolute URIs are fetched from the net; other things are treated as relative URIs if sourceURL is a Just, otherwise as file paths. * We escape characters that are not allowed in URIs before trying to parse them (e.g. '|', which often occurs in the wild). * When treating relative paths as local file paths, we drop any fragment or query. This is useful e.g. when you've downloaded web fonts locally, but your source still contains the original relative URLs. Together with the previous commit, this should close #1477.	2014-08-02 16:12:05 -07:00

2975 commits