pandoc

Author	SHA1	Message	Date
Jesse Rosenthal	24fd0ad04d	Docx reader: Handle lists correctly inside table cells. Previously we didn't transform lists inside table cells.	2015-02-13 09:02:16 -05:00
Jesse Rosenthal	ba59e5447f	Docx writer: Add footnotes id -1 and 0. Word uses, by default, footnotes with id -1 and 0 for separators. If a user modifies reference.docx, they will end up with a settings.xml file that references these footnotes, but no such footnotes in the document. This will produce a corruption error. Here we add these to the document and settings.xml file, so future modifications won't break the file.	2015-02-12 09:21:41 -05:00
Jesse Rosenthal	96d5c8a5dc	Docx Writer: Add "BodyText" Style We apply a "BodyText" style to all unstyled paragraphs. This is, essentially, the same as "Normal" up until now -- except that since not everything inherits from "BodyText" (the metadata won't, for example, or the headers or footnote numbers) we can change the text in the body without having to make exceptions for everything. This will still inherit from Normal, so if we want to change everything, we can do it through "Normal".	2015-02-11 15:06:36 -05:00
Jesse Rosenthal	25ef68d266	Docx Writer: Use FirstParagraph style at beginning. Before we had used `FirstParagraph` style after Headings, BlockQuotes, and other blocks a user might not want an indentation after. We hadn't actually used it for the first paragraph -- i.e. the opening of the body. This makes sure the first body paragraph gets that style.	2015-02-10 21:05:13 -05:00
Jesse Rosenthal	daab4c3f22	Docx Writer: Implement FirstParagraph Style Following the odt writer, we make the first text paragraph following an image, blockquote, table, or heading into a "FirstParagraph" style. This allows it to be styled differently, if the user wishes. The default is for it to be the same as "Normal"	2015-02-09 23:22:52 -05:00
John MacFarlane	12962e2332	Merge pull request #1927 from freephile/master update syntax for Images/Media files in MediaWiki	2015-02-07 20:33:33 -08:00
John MacFarlane	bd7cf8dbd5	Merge branch 'patch/fixTexinfoWrap' of https://github.com/timtylin/scholdoc into timtylin-patch/fixTexinfoWrap Conflicts: src/Text/Pandoc/Writers/Texinfo.hs	2015-02-07 20:28:56 -08:00
Tim Lin	858ebf99eb	Texinfo writer: fix wrapping by using breakable spaces	2015-02-06 01:16:40 -08:00
Greg Rundlett	218f28af6d	update syntax for Images/Media files in MediaWiki The preferred syntax for Images and other media is [[File:Foo.jpg]] in MediaWiki since v1.14 (2008). [[Image:Foo.jpg]] is deprecated but still works as an alias to the File namespace. I don't think this would break any existing wikis since talk of switching the syntax/namespace for images started back in 2002 (https://phabricator.wikimedia.org/T2044). NS_FILE became the new namespace for Files in v 1.14 in late 2008. (https://www.mediawiki.org/wiki/Release_notes/1.14) There is still a namespace alias so '[[Image:]]' still works today. It's just that MediaWiki supports other media as well, and so the name and syntax used in documentation (see https://www.mediawiki.org/wiki/Help:Images) has long been '[[File:foo.jpg]]'	2015-02-05 17:07:50 -05:00
Tim Lin	0c18f3a854	Append newline to the LineBreak of various writers This change improves output formatting of content with a large amount of force line breaks, such as line-blocks. The following writers are affected: * Dokuwiki * HTML * EPUB (via HTML) * LaTeX * MediaWiki * OpenDocument * Texinfo This commit resolves #1924	2015-02-04 22:42:22 -08:00
John MacFarlane	fb7a03dcda	Textile reader: table improvements. * Handle newlines in cells. * Handle empty cells. * Closes #1919.	2015-02-02 10:45:50 -08:00
John MacFarlane	7050c26abc	LaTeX writer: Don't escape $ in URL. Closes #1913 .	2015-02-01 11:19:55 -08:00
John MacFarlane	6a0d4da382	HTML writer: Add "inline" or "display" class to math spans. This allows inline and display math to be styled differently. Closes #1914.	2015-02-01 11:08:27 -08:00
Konstantin Zudov	92e762c2d6	Refactored `if x then [] else y` to `[y | not x]`	2015-01-29 22:36:23 +02:00
Konstantin Zudov	b5cc01e976	Do not omit missing `alt` attribute on `img` tag Fixes #1131	2015-01-29 22:16:38 +02:00
John MacFarlane	82c04a28ce	Fixed list-style-type for numbered example lists. Should be "decimal," not "example." Closes #1902.	2015-01-27 16:56:56 -08:00
John MacFarlane	33d1c8cc01	Merge pull request #1885 from mb21/html-reader-tables fixes HTML Reader: tables	2015-01-25 10:46:47 -08:00
mb21	b40d33b174	fixes #1859 HTML Reader table parsing	2015-01-25 09:41:12 +01:00
John MacFarlane	d90dc6b8b5	LaTeX reader: don't limit includes to .tex extension. Previously `\input` and `\include` would only work if the included files had the extension `.tex`. This change relaxes that restriction, though if the extension is not `.tex`, it must be given explicitly in the `\input` or `\include`. Closes #1882.	2015-01-22 23:17:25 -08:00
Jesse Rosenthal	eb11c61182	Docx: Parse images in deprecated vml format. Some older versions of Word use vml (vector markup language) and put their images in a "v:imagedata" tag inside a "w:pict". We read those as we read the more modern "blip" inside a "w:drawing". Note that this does not mean the reader knows anything about vml. It just looks for a `v:imagedata`. It's possible that, with more complicated uses of images in vml, it won't do the right thing.	2015-01-21 13:41:16 -05:00
John MacFarlane	8a7db5cc9d	Use CPP to avoid unneeded import warning for blaze-markup >= 0.6.3. See https://github.com/jgm/pandoc/pull/1888#issuecomment-70470409	2015-01-19 10:29:57 -08:00
John MacFarlane	030d3b597d	Custom writer: Raise `PandocLuaException` instead of using 'error'. Eventually we'll change the return type so that no exception is involved, but at least this can be trapped.	2015-01-18 22:04:42 -08:00
John MacFarlane	ab8b00ea0c	Custom writer: raise error if loadstring returns an error status. This will make debugging custom scripts much easier.	2015-01-18 21:48:04 -08:00
John MacFarlane	25e12ca7b2	EPUB writer: properly handle internal links to IDs in spans, divs. Closes #1884.	2015-01-17 11:27:49 -08:00
mb21	6aa41b86d0	don't log Try xelatex if xelatex already in use, closes #1832	2015-01-11 15:24:04 +01:00
Mark Wright	9c68017786	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:48:01 +11:00
Mark Wright	dbe1b38816	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:47:33 +11:00
Mark Wright	5ea3856bb0	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies	2015-01-05 14:46:57 +11:00
Mark Wright	c80c9ac9da	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:46:40 +11:00
Mark Wright	8b9bded796	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:46:15 +11:00
Mark Wright	e4c7894d01	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies	2015-01-05 14:42:45 +11:00
Mark Wright	2a6f68f4bf	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies	2015-01-05 14:42:26 +11:00
Mark Wright	4e3281c550	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:41:54 +11:00
Mark Wright	cd5b1fe5e3	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:41:35 +11:00
Mark Wright	ed7606da9a	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies	2015-01-05 14:40:59 +11:00
Mark Wright	b748833889	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies ; ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:40:30 +11:00
Mark Wright	10d53989d8	ghc 7.10.1 RC1 requires FlexibleContexts https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#Inferredtype-signaturesnowmayrequiretoenableFlexibleContextsGADTsorTypeFamilies ; ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:40:06 +11:00
Mark Wright	f18ceb1b5e	ghc 7.10.1 RC1 requires specifying the type of String literals https://ghc.haskell.org/trac/ghc/wiki/Migration/7.10#GHCsaysNoinstanceforFoldable...arisingfromtheuseof ...	2015-01-05 14:38:06 +11:00
Mark Wright	693f9abb18	Allow haddock-library 1.2, by calling the Documentation.Haddock.Types.MetaDoc record accessor function _doc :: MetaDoc mod id -> DocH mod id	2015-01-05 14:35:31 +11:00
John MacFarlane	e7187fa3bb	LaTeX reader: handle `tabular` environment. This change allows pandoc not to choke on the table-width parameter of `tabular`. Note that the table width is not actually parsed or taken into account, but this should give tolerable results in many cases. Closes #1850.	2015-01-01 08:46:45 -08:00
John MacFarlane	52310eb470	SelfContained: Add `;charset=utf-8` to script mime type if missing. Closes #1842.	2014-12-31 14:51:23 -08:00
John MacFarlane	e3422dc438	Added `--verbose` flag for debugging output in PDF production. Closes #1840. Closes #1653.	2014-12-26 11:19:55 -07:00
John MacFarlane	2c3310a592	Added Text.Pandoc.Compat.Locale to assist with transition to time 1.5.	2014-12-19 16:13:38 -08:00
John MacFarlane	005eda2f02	MediaWiki writer: Fixed links with URL = text. Previously these were rendered as bare words, even if the URL was not an absolute URL. Closes #1825.	2014-12-19 11:32:37 -08:00
John MacFarlane	7e41d0b1ee	LaTeX reader: parse math environments as inline when possible. Closes #1821.	2014-12-16 12:27:04 -08:00
John MacFarlane	fcd1599b09	FB2 writer: Add newline to output.	2014-12-15 22:14:29 -08:00
John MacFarlane	4c03231e9b	getDefaultTemplate: don't fail when called with "fb2". Closes #1660.	2014-12-15 22:10:03 -08:00
John MacFarlane	08abbc4604	LaTeX writer: Handle consecutive linebreaks. Closes #1733.	2014-12-15 22:04:18 -08:00
John MacFarlane	58ea1ce5f1	LaTeX reader: parse label after caption into a span... instead of inserting an additional paragraph of bracketed text. Closes #1747.	2014-12-15 21:50:10 -08:00
John MacFarlane	0eb3f8cff2	HTML writer: put newline btw img and caption paragraph.	2014-12-15 21:49:16 -08:00
John MacFarlane	61d2c2e8cb	LaTeX writer: better handling of display math in simple tables. We convert display math to inline math in simple tables, since LaTeX can't deal with display math in simple tables. Closes #1754.	2014-12-15 21:14:07 -08:00
John MacFarlane	cb8bb4705f	EPUB writer: include "landmarks" section in nav document for epub3. Closes #1757.	2014-12-15 20:56:03 -08:00
John MacFarlane	8a3363a269	Merge branch 'patch-1' of https://github.com/Wikiwide/pandoc into Wikiwide-patch-1 Conflicts: src/Text/Pandoc/Readers/LaTeX.hs	2014-12-15 20:27:42 -08:00
Matthew Pickering	e6bd29e9b8	Text.Pandoc.Writers.RTF: Add blankline at end of output Closes #1732	2014-12-15 21:35:47 +00:00
Matthew Pickering	a17f6f2f39	Text.Pandoc.Readers.HTML: Retain display type of MathML output Closes #1719	2014-12-15 21:35:47 +00:00
Matthew Pickering	58e4e4a608	Text.Pandoc.Parsing: Change parseFromString to fail if not all input is consumed.	2014-12-15 21:35:46 +00:00
John MacFarlane	a5cac0a0c4	Don't treat a citation as a reference link label. Closes #1763.	2014-12-15 10:54:12 -08:00
John MacFarlane	9bf76fa5a2	LaTeX reader: better handling of `\noindent` and `\greektext`. Closes #1783.	2014-12-15 10:34:59 -08:00
John MacFarlane	c37db57c9a	EPUB writer: Removed playOrder from navpoint elements in ncx file. These aren't required, and they make manual modification of epubs difficult. Closes #1760.	2014-12-15 10:22:19 -08:00
John MacFarlane	47c360e079	Improved texorpdfstring patch #1148 . * Make LaTeX reader recognize texorpdfstring. * Don't use texorpdfstring unless it's actually needed. * Fix tests.	2014-12-15 10:06:03 -08:00
John MacFarlane	544f3e5b45	Merge branch 'use-texorpdfstring' of https://github.com/wilx/pandoc into wilx-use-texorpdfstring Conflicts: src/Text/Pandoc/Writers/LaTeX.hs tests/Tests/Writers/LaTeX.hs	2014-12-15 10:01:50 -08:00
John MacFarlane	a864e9a348	Merge pull request #1805 from bergey/rst RST Reader - Improved Role Support	2014-12-15 09:06:45 -08:00
John MacFarlane	9e75b9b84b	DocBook readers: Include id on section headers. Closes #1818.	2014-12-14 23:46:25 -08:00
John MacFarlane	269b33d24b	DocBook reader: Handle menuchoice elements better. They are now rendered with a `>` between them. Closes #1817.	2014-12-14 23:37:54 -08:00
John MacFarlane	c350847943	DocBook reader: get string content in inner tags for literal elements. Closes #1816.	2014-12-14 19:12:48 -08:00
John MacFarlane	9e83cd62a6	DocBook reader: handle keycombo, keycap. Closes #1815.	2014-12-14 19:03:48 -08:00
John MacFarlane	1d3ca088f2	Merge pull request #1813 from tarleb/file-links Org reader: properly handle links to `file:target`	2014-12-14 13:36:34 -08:00
Albert Krewinkel	4d85b17fc5	Org reader: properly handle links to `file:target` Org links like `[[file:target][title]]` were not handled correctly, parsing the link target verbatim. The org reader is changed such that the leading `file:` is dropped from the link target. This is related to issues #756 and #1812.	2014-12-14 21:30:10 +01:00
John MacFarlane	2b08e32a90	Fixed autolinks with following punctuation. Closes #1811. The price of this is that autolinked bare URIs can no longer contain `>` characters, but this is not a big issue.	2014-12-14 12:20:33 -08:00
Daniel Bergey	ea157cf23f	RST: warn about ignored fields in role directives	2014-12-12 14:45:45 +00:00
Daniel Bergey	689fb112bf	RST Reader: compute Attrs when role is defined Move recursive role lookup from renderRole to addNewRole. The Attr value will be the same for every occurrence of this role, so there's no reason to compute it every time. This allows simplifying the stateRstCustomRoles map considerably. We could go even further, and remove the fmt and attr arguments to renderRole, which are null except for custom roles.	2014-12-12 14:45:45 +00:00
Daniel Bergey	dc3ea9840e	RST reader: improve support for custom roles - Add "sourceCode" to classes for :code: role, and anything inheriting from it. - Add the name of the custom role to classes if the Inline constructor supports Attr. - If the custom role directive does not specify a parent role, inherit from the :span: role. This differs somewhat from the rst2xml.py behavior. If a custom role inherits from another custom role, Pandoc will attach both roles' names as classes. rst2xml.py will only use the class of the directly invoked role (though in the case of inheriting from a :code: role with a :language: defined, it will also provide the inherited language as a class).	2014-12-12 14:45:45 +00:00
Daniel Bergey	dba066a33d	RST: literal role should produce Code, code role should have "code" class. http://docutils.sourceforge.net/docs/ref/rst/roles.html says that :literal:`text` is the same as ``text``. docutils outputs a <literal> element in both cases, whereas for the code role, it outputs a <literal> element with the "code" class.	2014-12-12 14:45:44 +00:00
Daniel Bergey	15816853a3	expose warnings from RST reader; refactor This commit moves some code which was only used for the Markdown Reader into a generic form which can be used for any Reader. Otherwise, it takes naming and interface cues from the preexisting Markdown code.	2014-12-12 14:45:44 +00:00
John MacFarlane	4ffa70970d	Merge pull request #1695 from bjornbm/master Escape inline verbatim spaces in LaTeX output	2014-12-10 09:07:46 -08:00
Bryan O'Sullivan	2150903230	DocBook reader: document/test "type" as implemented	2014-12-08 23:17:27 -08:00
Bryan O'Sullivan	fe1d147187	DocBook reader: add support for classname	2014-12-08 23:12:06 -08:00
Bryan O'Sullivan	33fdb6bc15	DocBook reader: add support for calloutlist and callout We treat a calloutlist as a bulleted list. This works well in practice.	2014-12-08 22:26:09 -08:00
Matthew Pickering	48e2586ec8	Merge pull request #1746 from shelf/dw-ext-images DokuWiki writer: fix external images	2014-12-08 23:55:36 +00:00
Nikolay Yakimov	f7b265e2ff	Fix for #1641 (docx table captions above tables) Word doesn't really treat table captions as something special. It's just a paragraph with special style, nothing more, so simple reversal of output order in writer works fine.	2014-12-08 22:50:57 +00:00
Daniel Bergey	87e536b438	RST Reader: Warn about skipped directives move `addWarning` to Parsing.hs, so it can be used by Markdown & RST readers.	2014-12-08 14:43:04 +00:00
Matthew Pickering	068bdbbc91	Merge pull request #1716 from lierdakil/issue1607-pullreq First step to fixing internationalisation problems with docx output	2014-12-07 16:44:24 +00:00
Matthew Pickering	9761283c8f	Text.Pandoc.Pretty: Improve performance of realLength Eliminates memory usage and gives a twofold increase in speed.	2014-12-06 22:58:40 +00:00
Daniel Bergey	74c1b547c2	parse RST class directives The class directive accepts one or more class names, and creates a Div value with those classes. If the directive has an indented body, the body is parsed as the children of the Div. If not, the first block following the directive is made a child of the Div. This differs from the behavior of rst2xml, which does not create a Div element. Instead, the specified classes are applied to each child of the directive. However, most Pandoc Block constructors do not take an Attr argument, so we can't duplicate this behavior.	2014-12-01 18:22:03 +00:00
Daniel Bergey	2cdfa5eb20	parse RST quoted literal blocks closes #65 RST quoted literal blocks are the same as indented literal blocks (which pandoc already supports) except that the quote character is preserved in each line. This includes test cases for the quoted literal block, as well as additional tests for line blocks and indented literal blocks, to verify that these are unaffected by the changes.	2014-12-01 18:22:03 +00:00
John MacFarlane	4d296f70df	ICML writer: Don't force all citations into footnotes.	2014-11-30 22:30:04 -08:00
John MacFarlane	d8fde9547e	Reverted "omit blank lines after list items," better fix for #1777 . Now we do as before, including blank lines after list items in loose lists (even though RST doesn't care -- this is just a matter of visual appeal). But we chomp any excess whitespace after the last list item, which solves #1777.	2014-11-25 12:34:44 -08:00
John MacFarlane	25e2c42347	RST writer: Omit blank lines after list items. They are optional in RST (except after the last list item, of course). Fixes #1777.	2014-11-25 12:24:33 -08:00
John MacFarlane	dc92a62883	RST writer: Ensure blank line after figure.	2014-11-25 12:24:14 -08:00
John MacFarlane	6c0943000d	LaTeX reader: support `\smartcite` and `\Smartcite` from biblatex. See jgm/pandoc-citeproc#26.	2014-11-25 10:03:43 -08:00
John MacFarlane	159c711675	Fixed double-rendering of footnotes in RST tables. Closes #1769.	2014-11-19 16:20:07 -08:00
John MacFarlane	7a5cb29319	Really fix #1758 . Add `id="cover"` to body on cover page. Not title page!	2014-11-17 15:43:40 -08:00
John MacFarlane	91a26fcdc4	Use regular page template for nav.xhtml. This includes the HTML doctype. Closes #1759.	2014-11-16 21:43:05 -08:00
John MacFarlane	4aadcd51b5	Make `embed` tag either block or inline. Closes #1756.	2014-11-16 20:51:35 -08:00
John MacFarlane	333fc60684	Changed mime type for otf to application/vnd.ms-opentype. Closes #1761. This is needed for epub3 validation. See http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm#Section2.3.1	2014-11-16 20:29:38 -08:00
John MacFarlane	46d343f474	Fixed bug in org with bulleted lists: - a - b * c was being parsed as a list, even though an unindented `*` should make a heading. See <http://orgmode.org/manual/Plain-lists.html#fn-1>.	2014-11-13 23:40:18 -08:00
Caleb McDaniel	196c4f2343	Account for external link URLs with anchors Previously, if a URL had an anchor, such as http://johnmacfarlane.net/pandoc/README.html#synopsis the reader would incorrectly identify it as an internal link and return "#synopsis" for the link in output.	2014-11-13 00:42:58 -05:00
John MacFarlane	43c1978fae	Merge pull request #1645 from neongreen/issue1636 Fix 'Ext_lists_without_preceding_blankline' bug.	2014-11-12 09:05:29 -08:00
Timothy Humphries	ffae1567fd	DokuWiki writer: fix external images Handles #1739. Preface relative links with ":", absolute URIs without.	2014-11-09 00:35:29 -05:00
Albert Krewinkel	e6cd8c9077	Org reader: allow empty links for gitit interop While empty links are not allowed in Emacs org-mode, Pandoc org-mode should support them: gitit relies on empty links as they are used to create wiki links. Fixes jgm/gitit#471	2014-11-05 23:15:28 +01:00
Albert Krewinkel	daaf635806	Org reader: absolute, relative paths in links The org reader was too restrictive when parsing links: some relative links and links to files given as absolute paths were not recognized correctly. The org reader's link parsing function was amended to handle such cases properly. This fixes #1741	2014-11-05 22:27:25 +01:00
John MacFarlane	f3ac41937d	DokuWiki writer: Better handling of block quotes. This change ensures that multiple paragraph blockquotes are rendered using native `>` rather than as HTML. Closes #1738.	2014-11-04 14:52:19 -08:00
John MacFarlane	1d268876f8	Merge pull request #1726 from AlexanderS/twiki-parser TWiki Reader: add new twiki reader	2014-10-31 12:02:14 -07:00
John MacFarlane	deaefb18ca	ODT writer: Correctly handle images without extensions. Closes #1729.	2014-10-30 15:54:04 -07:00
Alexander Sulfrian	c3780992ab	TWiki Reader: add new twiki reader	2014-10-30 19:54:48 +01:00
Todd Sifleet	8ad321d0d4	Strip querystring in ODT write * Resolve #1682 * Strip querystring from filename before rendering ODT files, ODT cannot handle querystrings in files.	2014-10-28 18:45:36 -07:00
Nikolay Yakimov	96c4b9e2e6	Docx reader: fix for Issue #1692 (i18n styles) This patch builds paragraph styles tree, then checks if paragraph has style.styleId or style/name.val matching predetermined patterns. Works with "Heading#" (name.val="heading #") for headings and "Quote"|"BlockQuote"|"BlockQuotation" (name.val="Quote"|"Block Text") for block quotes.	2014-10-25 15:54:44 -04:00
Nikolay Yakimov	3c894987b2	Docx Writer: Partial fix for #1607 International heading styles are inferred based on `<w:name val="heading #">` fallback, if there are no en-US "Heading#" styles	2014-10-24 23:57:06 +04:00
John MacFarlane	e16683b539	HTML writer: Make header attributes work outside top level. Previously they only appeared on top level header elements. Now they work e.g. in blockquotes. Closes #1711.	2014-10-23 10:27:14 -07:00
John MacFarlane	e4f3475eaa	DOCX writer: Look in user data dir for archive reference.docx.	2014-10-21 14:33:15 -07:00
John MacFarlane	78bdc08de7	Merge pull request #1706 from tarleb/org-symbol-entities Org reader: parse LaTeX-style MathML entities	2014-10-21 10:11:19 -07:00
John MacFarlane	0714a363c6	Merge pull request #1668 from gbataille/widthFromRef2 Getting the page width from the reference file	2014-10-21 10:05:51 -07:00
John MacFarlane	7f6bbfadf4	Pretty: Make CR + BLANKLINE = BLANKLINE. This fixes an extra blank line we were getting at the end of markdown fragments (as well as rst, org, etc.) Closes #1705.	2014-10-20 20:26:08 -07:00
Albert Krewinkel	a5eb02f6a7	Org reader: parse LaTeX-style MathML entities Org supports special symbols which can be included using LaTeX syntax, but are actually MathML entities. Examples for this are `\nbsp` (non-breaking space), `\Aacute` (the letter A with accent acute) or `\copy` (the copyright sign ©). This fixes #1657.	2014-10-20 22:57:36 +02:00
John MacFarlane	d7169c715d	Parsing: fixed `inlineMath` so it handles `\text{..}` containing `$`. For example: `$x = \text{the $n$th root of $y$}`. Closes #1677.	2014-10-19 16:42:56 -07:00
John MacFarlane	328ff8e71f	Markdown reader: allow `startnum` to work without `fancy_lists`. Formerly `pandoc -f markdown-fancy_lists+startnum` did not work properly.	2014-10-18 13:58:08 -07:00
John MacFarlane	84f6b1e41a	Merge pull request #1680 from shelf/master Respect indent when parsing Org bullet lists	2014-10-18 13:20:27 -07:00
John MacFarlane	31713d572a	Merge pull request #1700 from tarleb/org-emphasis-fix Org reader: fix rules for emphasis recognition	2014-10-18 13:19:42 -07:00
Albert Krewinkel	e3c36ed6ce	Org reader: Drop COMMENT document trees Document trees under a header starting with the word `COMMENT` are comment trees and should not be exported. Those trees are dropped silently. This closes #1678.	2014-10-18 22:11:53 +02:00
Albert Krewinkel	d571bec454	Org reader: fix rules for emphasis recognition Things like `/hello,/` or `/hi'/` were falsely recognized as emphasised strings. This is wrong, as `,` and `'` are forbidden border chars and may not occur on the inner border of emphasized text. This patch enables the reader to match the reference implementation in that it reads the above strings as plain text.	2014-10-18 12:47:59 +02:00
Timothy Humphries	f1f56e8533	Fix indent issue for definition lists Tidy up fix for #1650, #1698 as per comments in #1680. Fix same issue for definition lists with the same method.	2014-10-17 20:06:25 -04:00
Bjorn Buckwalter	7960013cd4	Escape spaces. Fixes jgm/pandoc#1694 .	2014-10-15 21:06:41 +02:00
Kristof Bastiaensen	0198e95071	Use '=' instead of '#' for atx-style headers in markdown+lhs.	2014-10-14 13:28:28 +02:00
Timothy Humphries	4f4b0f031d	Respect indent when parsing Org bullet lists Fixes issue with top-level bullet list parsing. Previously we would use `many1 spaceChars` rather than respecting the list's indent level. We also permitted `*` bullets on unindented lists, which should unambiguously parse as `header 1`. Combined, this meant headers at a different indent level were being unwittingly slurped into preceding bullet lists, as per Issue #1650.	2014-10-12 03:18:36 -04:00
John MacFarlane	8b60d430f2	Merge pull request #1674 from freiric/master fix inDirectory to reset to the original directory in case an exception ...	2014-10-08 15:48:59 -07:00
John MacFarlane	2eaa0f6ab1	EPUB reader: Further URI handling improvements. Now we outsource most of the work to `fetchItem'`. Also, do not include queries in file extensions. Improves fix to #1671. It is possible that this will have some unexpected effects, so further testing would be good.	2014-10-08 15:45:50 -07:00
John MacFarlane	f8087b6c43	EPUB writer: correctly resolve relative URIs. (Closes #1671.)	2014-10-08 15:19:27 -07:00
John MacFarlane	a4d28cdd6d	Fixed absolute URI detection in EPUB writer. Closes #1672 .	2014-10-08 14:54:03 -07:00
Freiric Barral	24231623f3	fix inDirectory to reset to the original directory in case an exception occurs	2014-10-08 23:25:01 +02:00
John MacFarlane	d60707eed0	EPUB writer: Don't add sourceURL to absolute URIs! Closes #1669. If there are further issues, please open a new, targeted issue on the tracker. Some notes on the further issues you gestured at: Data URIs are indeed dereferenced, but why is this a problem? (The function being used to fetch from URLs is used for many different formats. Preserving data URIs would make sense in EPUBs, but not for e.g. PDF output. And by dereferencing we can get a smaller, more efficient EPUB, with the data stored as bytes in a file rather than encoded in textual representation.) "absolute uris are not recognized" -- I assume that is the problem just fixed. If not, please open a new issue. "relative uris are resolved (wrongly) like file paths" -- can you give an example? `<base>` tag is ignored. Yes. I didn't know about the base tag. Could you open a new issue just for this?	2014-10-08 11:52:47 -07:00
Grégory Bataille	8a1a5948be	Getting the page width from the reference file Uses it to scale images that are too large. When there is no reference file, default to a US letter portrait size to scale the images	2014-10-05 14:53:06 +02:00
Jason Ronallo	3dc58090d2	add mime type for WebVTT	2014-10-04 22:40:02 -04:00
John MacFarlane	bf00556c72	Added `track` to list of tags treated by `--self-contained`. Closes #1664.	2014-10-04 11:39:08 -07:00
Wikiwide	678aa31561	cref, sep Adding inlineCommands	2014-10-03 11:33:02 +10:00
John MacFarlane	08ac33815b	RST writer: Wrap line blocks with spaces before continuations. Improves on fix to #1656.	2014-09-30 09:25:54 -07:00
John MacFarlane	29e1c9529f	Don't wrap lines in rST line blocks. Closes #1656. Fixing pandoc to wrap the lines but insert spaces would be much more complicated. This at least makes the output semantically correct.	2014-09-29 21:48:59 -07:00
John MacFarlane	fe6d43b3e0	Merge pull request #1601 from jkr/windowsfix Fix path-slashes inside archive for windows	2014-09-27 16:21:17 -07:00
John MacFarlane	9c4e33f085	Merge pull request #1589 from mszep/master Add function to sanitize ConTeXt labels	2014-09-27 16:20:56 -07:00
Matthew Pickering	5cb475c374	Org Reader: Parse multi-inline terms correctly in definition list Closes #1649	2014-09-27 22:40:25 +01:00
Artyom	bc115ffc2d	Fix 'Ext_lists_without_preceding_blankline' bug. * Fixes #1636. * Adds a test.	2014-09-26 13:32:08 +04:00
mpickering	6740a9592a	HTML Reader: Recognise <br> tags inside <pre> blocks Closes #1620	2014-09-25 19:20:12 +01:00
mpickering	1f0ba8ec11	HTML Writer: Don't double render when email-obfuscation=none Closes #1625	2014-09-25 18:46:36 +01:00
mpickering	515a120d04	Add support for KaTeX HTML math Closes #1626	2014-09-25 18:32:42 +01:00
mpickering	575c76e36b	HTML Writer: MathML now outputted with tex annotation. Closes #1635	2014-09-25 15:28:50 +01:00
mpickering	cc07d0c6bf	Shared: Make collapseFilePath OS-agnostic	2014-09-25 12:42:53 +01:00
mpickering	56e4ecab20	MediaBag: Fixes Windows specific path problems Changes the internal representation to fix the problem. I haven't tested this on windows. Closes #1597	2014-09-25 12:19:52 +01:00
Mark Szepieniec	84b75a1c2a	ConTeXt writer: add function toLabel This function can be used to sanitize reference labels so that they do not contain any of the illegal characters \#[]",{}%()\|= . Currently only Links have their labels sanitized, because they are the only Elements that use passed labels.	2014-09-18 23:27:14 +02:00
Jesse Rosenthal	020a527c15	Docx writer: Renumber header and footer relationships to avoid collisions. We previously took the old relationship names of the headers and footers in sectPr. That led to collisions. We now make a map of available names in the relationships file, and then rename in sectPr.	2014-09-11 15:11:44 -04:00
Jesse Rosenthal	1326a41780	LaTeX writer: Protect graphics in headers. Graphics in `\section`/`\subsection` etc titles need to be `\protect`ed. This adds a state value and manually turns it on before every invocation of `sectionHeader` and manually turns it off after. Using a writer value and applying `local` would probably be cleaner, but this fits with the current style.	2014-09-09 10:56:37 -04:00
Jesse Rosenthal	132814aeb6	Docx Reader: Remove header class properly in other langs. When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.	2014-09-06 07:53:29 -04:00
Jesse Rosenthal	71452946d9	Docx reader: Use polyglot header list. We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.	2014-09-05 21:59:58 -04:00
Jesse Rosenthal	13fefd7959	Docx Reader: Start list of polyglot section headers.	2014-09-05 17:31:24 -04:00
Jesse Rosenthal	73b887e2df	Org reader: Added state changing blanklines. This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).	2014-09-04 19:55:53 -04:00
Jesse Rosenthal	ac8ed1fa93	Docx reader: Rewrite rewriteLink to work with new headers. There could be new top-level headers after making lists, so we have to rewrite links after that.	2014-09-04 16:44:21 -04:00
Jesse Rosenthal	7fe54505df	Docx reader: Single-item headers in ordered lists are headers. When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.	2014-09-04 16:35:57 -04:00
Jesse Rosenthal	4ef850ded5	Docx reader: Fix Windows path for image lookup. Don't use os-sensitive "combine", since we always want the paths in our zip-archive to use forward-slashes.	2014-09-02 13:45:01 -04:00
John MacFarlane	db90667a79	EPUB writer: Don't include nav node in spine unless --toc was requested. Previously we included it in the spine with `linear="no"`, leading to odd results in some readers. Closes #1593.	2014-09-01 16:31:32 -07:00
John MacFarlane	cb1a8da01c	LaTeX writer: Avoid using reserved characters as \lstinline delimiters. Closes #1595.	2014-09-01 10:11:09 -07:00
John MacFarlane	43ebb0229f	EPUB writer: Fixed typo.	2014-09-01 08:04:21 -07:00
Jose Luis Duran	9557eb6f8e	LaTeX writer: Use a declaration for tight lists Currently, pandoc has hard-coded the following in order to make tight lists in LaTeX: ```hs text "\\itemsep1pt\\parskip0pt\\parsep0pt" ``` Which is fine, but does not allow customizations. For example, the `memoir` class already has a `\tightlist` declaration for this purpose: ```tex \newcommand{\tightlist}{% \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}} ``` I'm proposing to use a similar solution: ```diff @@ In Writers/LaTeX.hs: -then text "\\itemsep1pt\\parskip0pt\\parsep0pt" +then text "\\tightlist" @@ In templates/default.latex: +\newcommand{\tightlist}{% + \setlength{\itemsep}{1pt}\setlength{\parskip}{0pt}\setlength{\parsep}{0pt}} ``` This allows us to customize the tightness to our needs. Backward Compatibility If a person is using a custom LaTeX template (not based upon the `memoir` class), the `\tightlist` declaration must be added.	2014-09-01 05:08:24 +00:00
John MacFarlane	3533218d6d	Merge pull request #1594 from jkr/itemFix Item fix	2014-08-31 19:31:38 -07:00
John MacFarlane	01b7957812	EPUB writer: Extract title even from structured title. Added docTitle'.	2014-08-31 14:47:07 -07:00
Jesse Rosenthal	d3053807a8	LaTeX writer: Put `~` before header in item text. Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item. (To be precise, it can't handle it if the list immediately follows a section header, but the instance is rare enough that we can afford to be a bit more general). This puts a non-breaking space before the header to solve this problem. We won't see this space, since the header skips a line before printing anyway. The output is ugly in LaTeX and this structure seems like it should probably be avoided. But it is valid HTML and native pandoc, so we should have some sort of typesettable representation in LaTeX.	2014-08-31 16:05:09 -04:00
John MacFarlane	598d3ee23b	Markdown reader: better handling of paragraph in div. Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag. Test case: <div class="first"> This is a paragraph. This is another paragraph. </div> Closes #1591.	2014-08-31 12:55:47 -07:00
John MacFarlane	54df49335a	EPUB writer: Don't use opf:title-type for epub2. It is not supported and epubcheck complains.	2014-08-31 12:08:17 -07:00
John MacFarlane	611bc27862	Shared: Moved import of toChunks outside of conditional. Closes #1590.	2014-08-31 11:17:53 -07:00
John MacFarlane	374bb3c147	DokuWiki writer: Make tables prettier by aligning columns. Also cleaned up crufty code and added tests.	2014-08-30 21:24:33 -07:00
John MacFarlane	d97aed3903	DokuWiki writer: Handle table cell alignments. Closes #1566.	2014-08-30 20:54:33 -07:00
John MacFarlane	a8273009ba	Textile reader: Improved table support. We can now handle all different alignment types, for simple tables only (no captions, no relative widths, cell contents just plain inlines). Other tables are still handled using raw HTML. Addresses #1585 as far as it can be addressed, I believe.	2014-08-30 20:34:42 -07:00
John MacFarlane	b50927527c	PDF: Catch errors in conversion of images and display message. See #1582.	2014-08-30 18:45:58 -07:00
John MacFarlane	f70e3c3297	Merge branch 'mime' of https://github.com/Aelve/John into Aelve-mime Conflicts: src/Text/Pandoc/Writers/Docx.hs	2014-08-30 11:49:50 -07:00
John MacFarlane	8b09d954f9	Merge pull request #1580 from jkr/stringCellDokuWiki DokuWiki writer: Backslash newlines in table cells	2014-08-30 09:22:38 -07:00
Jesse Rosenthal	ccda2a902c	DokuWiki writer: Use backslash newlines in table cells. Write out strings in table cells with backslash linebreaks in place of newlines. We also want to remove the first two spaces of an indent in lists.	2014-08-30 07:17:55 -04:00
John MacFarlane	218633548f	Merge pull request #1574 from jlduran/latex-horizontal-rule LaTeX writer: Make Horizontal Rules more flexible	2014-08-29 21:40:55 -07:00
John MacFarlane	017d44af1d	Merge branch 'ugly-tables' of https://github.com/jlduran/pandoc into jlduran-ugly-tables	2014-08-29 21:24:36 -07:00
Jose Luis Duran	4c684561ee	LaTeX writer: Add `\strut` to fix multiline tables See: http://tex.stackexchange.com/questions/34971	2014-08-29 13:54:08 +00:00
Jesse Rosenthal	c931be24e1	Docx Reader: Read single para in table cell as plain. This makes the docx reader's native output fit with the way the markdown reader understands its markdown output. I.e., as far as table cells go: docx -> native == docx -> native -> markdown -> native (This identity isn't true for other things outside of table cells, of course).	2014-08-28 14:35:33 -04:00
Jose Luis Duran	1fc665c07d	LaTeX writer: Make Horizontal Rules more flexible Currently, pandoc has hard-coded the following in order to make horizontal rules in LaTeX: ```hs "\\begin{center}\\rule{3in}{0.4pt}\\end{center}" ``` Which is fine, but does not allow customizations. It also does not take into consideration the current line width. I'm proposing this change: ```diff @@ In Writers/LaTeX.hs: -"\\begin{center}\\rule{3in}{0.4pt}\\end{center}" +"\\begin{center}\\rule{0.5\\linewidth}{\\linethickness}\\end{center}" ```	2014-08-28 03:12:37 +00:00
Jose Luis Duran	f1d330b7b5	LaTeX writer: Fix tables - [x] Fix a bug introduced in `66378062b6`, which causes the table caption to repeat across all pages - [x] Address the issues discussed [here](https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J) regarding the extra vertical space. - [ ] NOTE: This will cause multiline table cells to appear unpadded. See http://tex.stackexchange.com/questions/34971 - [x] Use [`\tabularnewline`](http://tex.stackexchange.com/questions/78796) instead of `\\`.	2014-08-28 02:02:20 +00:00
Matthew Pickering	404a58f456	DokuWiki Writer: Refactor to use Reader monad	2014-08-27 14:29:09 +01:00
Matthew Pickering	495f55b03e	DokuWiki Writer: Hlint cleanup	2014-08-27 13:48:19 +01:00
Matthew Pickering	3412018287	DokuWiki Writer: Qualified all imports	2014-08-27 13:29:09 +01:00
John MacFarlane	8f2aa45d69	Merge pull request #1564 from jkr/trackChangesWriter Docx writer: write track changes.	2014-08-26 22:22:16 -07:00
Calvin Beck	f813755c55	Fixed exampleLine parser to accept example lines which have indentation at the start of the line.	2014-08-26 21:56:40 -06:00
Jesse Rosenthal	b613d85af9	Docx writer: Accommodate GHC 7.4 (no lookupEnv)	2014-08-26 06:43:14 -04:00
Jesse Rosenthal	21253b59e8	Docx writer: Default to user login and time of change if not given.	2014-08-25 14:03:35 -04:00
Jesse Rosenthal	e1bb28a388	Docx writer: Implement track changes. These have default authors and dates of "unknown" and timestamp-zero, respectively.	2014-08-25 12:37:56 -04:00
John MacFarlane	9f8051d95d	Hlint changes to Docx writer.	2014-08-24 11:37:23 -07:00
John MacFarlane	0ef1f787c7	Docx writer: Bibliography entries get Bibliography style. Closes #1559.	2014-08-23 20:52:09 -07:00
John MacFarlane	2956ef251c	Fixed --self-contained with Windows paths. Previously C:\foo.js was being wrongly interpreted as a URI. Closes #1558.	2014-08-22 23:21:57 -07:00
mpickering	aa808055f0	Txt2Tags Reader: Fixed crash when reading from stdin	2014-08-21 17:11:21 +01:00
mpickering	3b6d7afa71	Txt2Tags Reader: Corrected formatting of %%mtime macro	2014-08-21 17:11:16 +01:00
mpickering	2a7319541d	Txt2Tags Reader: Parse Meta information. The header is now parsed as meta information. The first line is the `title`, the second is the `author` and the third is the `date`.	2014-08-21 17:09:40 +01:00
mpickering	2cd049a1bf	Txt2Tags reader: Header is now parsed only if standalone flag is set	2014-08-20 18:11:37 +01:00
John MacFarlane	60f3e777f3	EPUB writer: don't use page-progression-direction in EPUB2. Also, if page-progression-direction not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. Closes #1550.	2014-08-19 09:21:26 -07:00
John MacFarlane	716ad5fd8a	Merge pull request #1547 from jkr/styleparse Docx reader: parsing styles	2014-08-18 14:14:01 -07:00
John MacFarlane	6dce8c6760	HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline): <video controls="controls"> <source src="../videos/test.mp4" type="video/mp4" /> <source src="../videos/test.webm" type="video/webm" /> <p> The videos can not be played back on your system.<br/> Try viewing on Youtube (requires Internet connection): <a href="http://youtu.be/etE5urBps_w">Relative Velocity on Youtube</a>. </p> </video> This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.	2014-08-18 12:41:09 -07:00
Jesse Rosenthal	4b38e9f1f0	Docx reader: whitespace fix.	2014-08-17 20:11:50 -04:00
Jesse Rosenthal	198aea190f	Docx reader: remove emph styles and strong styles list. We no longer need the explicit lists since we're deriving them from the ground up.	2014-08-17 17:04:55 -04:00
Jesse Rosenthal	9da7b0946e	Docx reader: Add "Hyperlink" to blacklisted styles. This is the only one so far. We'll add others as they show up.	2014-08-17 17:04:14 -04:00
Jesse Rosenthal	15ce28b8ca	Docx reader: Use style resolver. We now no longer check against explicit styles.	2014-08-17 17:03:44 -04:00
Jesse Rosenthal	03d5d8e596	Docx Reader: Introduce function for resolving dependent run styles. We always favor an explicit positive or negative in a style in a descendent, and only turn to the ancestor if nothing is set. We also introduce an (empty) list of styles that are black-listed. We won't check them. (Think underlines in hyperlinks).	2014-08-17 16:54:11 -04:00
John MacFarlane	8e60d35d58	Merge pull request #1536 from considerate/master Add row width to tables in Docx XML	2014-08-17 12:54:38 -07:00
John MacFarlane	b6103eeb83	Merge pull request #1543 from jkr/superSubVert Docx reader: Change behavior of Super/Subscript	2014-08-17 12:54:10 -07:00
John MacFarlane	c14088ea93	Docx writer: Fixed regression, bungled list numbering. In pandoc 1.13, all lists come out as basic ordered lists. This fixes that bad regression. Closes #1544.	2014-08-17 12:48:41 -07:00
Jesse Rosenthal	99491f0d98	Docx Parse: build a bottom-up style tree. Two points here: (1) We're going bottom-up, from styles not based on anything, to avoid circular dependencies or any other sort of maliciousness/incompetence. And (2) each style points to its parent. That way, we don't need the whole tree to pass a style over to Docx.hs	2014-08-17 15:46:17 -04:00
Artyom Kazak	357172f13a	Remove an unnecessary import.	2014-08-17 23:44:02 +04:00
Artyom Kazak	6a34cd3ddf	Update Reader.EPUB to use `MimeType`.	2014-08-17 21:00:55 +04:00
Artyom Kazak	cca9e8feb4	MIME cleanup. * Create a type synonym for MIME type (instead of `String`). * Add `getMimeTypeDef` function. * Avoid recreating MIME type `Map`s every time. * Move “Formula-...” case handling into `getMimeType`.	2014-08-17 21:00:50 +04:00
Jesse Rosenthal	b8f1658c36	Alias string and runStyle to CharStyle type.	2014-08-17 11:30:22 -04:00
Jesse Rosenthal	c4871ac790	Docx Style parser: Basic one now just takes a parent style. This will make it easier to build the style map from the bottom up (to avoid any infinite references).	2014-08-17 10:19:48 -04:00
Jesse Rosenthal	75eec0a6b8	Docx reader: work with new rStyle. Just discards info at the moment, so at least it works the same.	2014-08-17 09:22:25 -04:00
Jesse Rosenthal	ea85a797c2	Parser: Framework for parsing styles. We want to be able to read user-defined styles. Eventually we'll be able to figure out styles in terms of inheritance as well. The actual cascading will happen in the docx reader.	2014-08-17 09:22:21 -04:00
Jesse Rosenthal	dc5b0ba09b	Docx reader: Change behavior of Super/Subscript. In docx, super- and subscript are attributes of Vertalign. It makes more sense to follow this, and have different possible values of Vertalign in runStyle. This is mainly a preparatory step for real style parsing, since it can distinguish between vertical align being explicitly turned off and it not being set. In addition, it makes parsing a bit clearer, and makes sure we don't do docx-impossible things like being simultaneously super and sub.	2014-08-17 08:20:00 -04:00
John MacFarlane	9d52ecdd42	HTML reader: Parse appropriately styled span as SmallCaps.	2014-08-16 22:57:00 -07:00
Viktor Kronvall	753e47194c	Simplify row width calculation.	2014-08-17 03:24:44 +02:00
Christoffer Ackelman	9f3c34841b	Include row width in table rows. Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000).	2014-08-17 03:24:44 +02:00
John MacFarlane	cb4ae6112e	Markdown writer: don't escape $, ^, ~ when extensions are deactivated. `tex_math_dollars`, `superscript`, and `subscript` extensions, respectively. Closes #1127.	2014-08-16 17:14:51 -07:00
Jesse Rosenthal	9bb0b99981	Docx reader: Remove unnecessary plural functions. Functions like runElemsToInlines and parPartsToInlines are just defined in terms of concatting and mapping their singular version (e.g. `runElemToInlines`). Having two functions with almost identical names makes it easier to introduce errors. It's easy enough to just concat and map inline, and it makes it clearer what is going on in the code.	2014-08-16 15:07:41 -04:00
Jesse Rosenthal	9969b2ebee	Docx reader: Fix bug in character styles. Style handling has been cleaned up, but introduced a bug here. There wasn't previously a test to catch it.	2014-08-16 14:05:19 -04:00
Jesse Rosenthal	0ff9ec2f4e	Rewrite Docx.hs and Reducible to use Builder. The big news here is a rewrite of Docx to use the builder functions. As opposed to previous attempts, we now see a significant speedup -- times are cut in half (or more) in a few informal tests. Reducible has also been rewritten. It can doubtless be simplified and clarified further. We can consider this, at the moment, a reference for correct behavior.	2014-08-16 10:22:55 -04:00
John MacFarlane	8bf39cf6d6	Markdown reader: Better handle quote characters in inline links. This was previously failing to be recognized as a link: [Test](http://en.wikipedia.org/wiki/Ward's_method) Closes #1534.	2014-08-14 10:59:27 -07:00
John MacFarlane	e917bcc124	Make `raw_tex` extension non-default for textile reader, writer. Enable `raw_tex` extension in textile writer. Closes #1532.	2014-08-14 09:49:31 -07:00
John MacFarlane	52a9ccce4f	Merge pull request #1531 from jkr/morefonts Docx reader: Interpret "Strong" and "Emphasis" run styles.	2014-08-13 19:47:42 -07:00
John MacFarlane	17b2fd567b	Fixed haddock comment.	2014-08-13 13:59:50 -07:00
John MacFarlane	05b7fd8dee	Removed unneeded import.	2014-08-13 11:35:09 -07:00
Jesse Rosenthal	6897905602	Docx reader: Interpret "Strong" and Emphasis run styles.	2014-08-13 12:23:03 -04:00
John MacFarlane	22ab3367c6	Removed unneeded CPP.	2014-08-12 22:50:51 -07:00
Jesse Rosenthal	a1320a76f9	Docx: Reducible forgot about smallcaps	2014-08-13 00:09:40 -04:00
Jesse Rosenthal	dca55630e6	Docx Reader: Trim line breaks from the beginning and end of Section Headers. We might also want to do this elsewhere (for pars, for example).	2014-08-12 23:42:01 -04:00
Jesse Rosenthal	378a795eaa	Docx: More robust handling of multiple bookmarks in header.	2014-08-12 23:41:57 -04:00
Jesse Rosenthal	85579052b5	Docx reader: Check for null-id'd anchors too. Otherwise they get left dangling in the document.	2014-08-12 23:33:03 -04:00
Jesse Rosenthal	194ed88852	Docx reader: accept explicit "Italic" and "Bold" rStyles. Note that "Italic" can be on, and, from the last commit, `<w:i>` can be present, but be turned off. In that case, the turned-off tag takes precedence. So, we have to distinguish between something being off and something not being there. Hence, isItalic, isBold, isStrike, and isSmallCaps have become Maybes.	2014-08-12 22:39:18 -04:00
Jesse Rosenthal	aae71ad595	Docx reader: Add "BlockQuotation" to divs list.	2014-08-12 22:08:30 -04:00
Jesse Rosenthal	d4748038d7	Docx Reader: Fix font style parsing. Before we just checked for the existence of a tag. Now, we make sure to check for its on/off value.	2014-08-12 22:04:07 -04:00
John MacFarlane	e883ef4eb9	Merge pull request #1527 from mpickering/juicypixels Attempts to convert gif, tiff and bmp to png in pdf writer	2014-08-12 16:57:22 -07:00
John MacFarlane	7684b24959	Merge pull request #1528 from mpickering/epubtitlepage EPUB Reader: Ignores titlepage attribute	2014-08-12 16:56:38 -07:00
Matthew Pickering	2b31df32de	LaTeX Writer: Added missing closing braces to hyperdef commands	2014-08-13 00:37:18 +01:00
Matthew Pickering	57bebe26df	PDF Writer: Attempts to convert images to pdf renderable formats. Now depends on the JuicyPixels library. Will attempt to convert an image (gif, tiff, bmp) to png when converting to pdf.	2014-08-13 00:37:18 +01:00
John MacFarlane	81157c7cc6	HTML writer: use 'uri' or 'email' class for autolinks. This allows them to be styled specially. Closes #1501.	2014-08-12 15:49:43 -07:00
John MacFarlane	da507dcb84	ConTeXt writer: improved autolink detection. It previously failed in some cases with escaped special characters.	2014-08-12 15:49:20 -07:00
Matthew Pickering	34cf016251	EPUB Reader: Ignore title pages	2014-08-12 23:03:24 +01:00
John MacFarlane	f16dd1bfdf	DocBook: Support equations with mathml. equation, informalequation, inlineequation and mml:math elements.	2014-08-12 14:00:46 -07:00
John MacFarlane	d6f0973128	Merge pull request #1524 from jkr/dropCap3 Docx reader: move dropcap combining logic to Reducible	2014-08-12 11:13:27 -07:00
John MacFarlane	f97ec6db2c	Markdown reader: Improved parsing of indented code in list items. Indented code at the beginning of a list item must be indented eight spaces from the margin (or from the edge of the container), or four spaces past the list marker, whichever is farther. Some examples in `tests/markdown-reader-more.txt`.	2014-08-12 11:10:48 -07:00
John MacFarlane	ab75e1d3bd	Beamer: Use \footnote<.->{..} for notes. This ensures that the footnotes will not appear before the overlays in which their corresponding note markers appear. Closes #1525.	2014-08-12 10:56:57 -07:00
Jesse Rosenthal	9d0b390d48	Docx reader: move combining logic to Reducible. Introduces a new function in Reducibles, concatR. The idea is that if we have two lists of Reducibles (blocks or inlines), we can combine them and just perform the reduction on the joining parts (the last element of the first list, the first element of the second list). This is useful in cases where the two lists are already reduced, and we're only worried about the joining elements. This actually improves the efficiency a bit further, because concatR can be smart about empty lists.	2014-08-12 10:26:49 -04:00
Jesse Rosenthal	e4a8e4a636	Docx reader: Make dropcap combining more efficient. Before, we had to run reduceList on the whole combined paragraph, which was redundant, and could take some time for long paragraphs. We only need to combine the drop cap with the first inline of the next paragraph.	2014-08-12 09:00:53 -04:00
Jesse Rosenthal	45ec035e93	Docx reader: combine inlines properly in dropcaps. Make sure that adjacent inlines are combined properly in dropcaps. This updates the test results as well.	2014-08-11 23:31:16 -04:00
Jesse Rosenthal	3e32cd5bb1	Docx reader: Use dropcap state. If we get to a dropcap, we hold the inlines until the next paragraph, and combine them there.	2014-08-11 23:08:33 -04:00
Jesse Rosenthal	bca74a2bd0	Add dropCap to paragraph style.	2014-08-11 21:42:02 -04:00
John MacFarlane	86d4da994a	EPUB reader: use walk instead of bottomUp. This should be more efficient.	2014-08-11 14:48:42 -07:00
John MacFarlane	31811657fa	Merge pull request #1521 from jkr/emptyEmph Discard empty formatters	2014-08-11 14:44:08 -07:00
John MacFarlane	211fe266e0	LaTeX writer: Don't produce `\label{}` for Div or Span. Just `\hyperdef`. A slight amendment to #1519.	2014-08-11 12:20:44 -07:00
John MacFarlane	95d9b43b42	Merge pull request #1519 from mpickering/more EPUB Normalisation and anchors for div blocks in tex	2014-08-11 11:28:11 -07:00
John MacFarlane	6fae136cbb	Textile reader: list and HTML block parsing improvements. Closes #1513. Lists can now start without an intervening blank line. Also, html block-level tags that don't start a line are parsed as RawInline and don't interrupt paragraphs, as in RedCloth.	2014-08-11 11:22:39 -07:00
John MacFarlane	4a535211d8	Merge pull request #1365 from gbataille/docx-margin Scale images to fit the page for DOCX	2014-08-11 10:44:52 -07:00
Jesse Rosenthal	0411fe7ccf	Docx reader: handle empty reducibles.	2014-08-11 12:48:16 -04:00
Matthew Pickering	1952dd0592	TeX Writer: Write hyperdef and label for identifiers on Div blocks	2014-08-11 16:23:05 +01:00
Matthew Pickering	72b1470713	EPUB Reader: Fixed another normalisation problem..	2014-08-11 16:23:05 +01:00
John MacFarlane	e690fe4a3e	Merge pull request #1516 from mpickering/epubmetadata EPUB improvements	2014-08-11 08:14:54 -07:00
Matthew Pickering	973ed469de	Docx Parse: Improved font recognition when specified in rFonts element	2014-08-11 10:30:32 -04:00
Matthew Pickering	427466f80c	Docx Fonts: Derives Show and Eq	2014-08-11 10:30:32 -04:00
Matthew Pickering	e02360d3d8	EPUB Reader: Can now parse multiple meta data fields	2014-08-11 13:12:42 +01:00
Matthew Pickering	285d56dea7	EPUB Writer: Added page-progression-direction meta field	2014-08-11 11:21:38 +01:00
Matthew Pickering	9eded27e32	EPUB reader: Fixed bug where filepaths weren't sufficiently normalised	2014-08-11 11:20:33 +01:00
Matthew Pickering	1f02ff60ba	EPUB Writer: Added explicit imports	2014-08-11 10:21:52 +01:00
John MacFarlane	65b31e0cac	Merge pull request #1510 from jkr/spacefix Docx reader: Fix spacing issue.	2014-08-10 07:11:59 -07:00
John MacFarlane	7ec8dd956f	Removed OMath module, depend on texmath >= 0.8.	2014-08-10 06:19:41 -07:00
Jesse Rosenthal	c15978ce5e	Change head/tail to pattern guards.	2014-08-10 09:10:34 -04:00
Jesse Rosenthal	a02ce74acf	Docx reader: Fix spacing issue. Previously spaces at the beginning of Emph/Strong/etc were kept inside. This makes sure they are moved out.	2014-08-09 23:35:09 -04:00
Matthew Pickering	3bb19307f6	Docx Parse: Recognises code points in sym elements which are in the private range	2014-08-09 22:37:12 -04:00
Matthew Pickering	edc57f77fc	Added Text.Pandoc.Readers.Docx.Fonts	2014-08-09 22:37:12 -04:00
Matthew Pickering	2deaa7096f	Docx Reader: Added recognition of sym element in paragraphs	2014-08-09 22:37:12 -04:00
Matthew Pickering	4ae61bdf8f	EPUB: Fixed another mediabag related regression..	2014-08-10 00:12:09 +01:00
Matthew Pickering	a6648e5a73	EPUB Reader: Changed image paths to be relative to manifest file	2014-08-09 23:06:16 +01:00
John MacFarlane	4983083079	HTML writer: Don't include empty TOC items for slide shows. Previously creating a slide with a horizontal rule would result in an empty list item in the TOC. This patch fixes that.	2014-08-09 10:29:39 -07:00
John MacFarlane	bc06ef0edb	Merge branch 'newbranch' of https://github.com/mpickering/pandoc into mpickering-newbranch Conflicts: src/Text/Pandoc/Readers/EPUB.hs	2014-08-08 22:22:55 -07:00
John MacFarlane	19daf6cf0a	Added `native_divs` and `native_spans` extensions. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines. Setting of default epub extensions has been moved from the EPUB reader to Text.Pandoc.	2014-08-08 21:05:34 -07:00
John MacFarlane	a4a6b6f28c	Plain writer: Use ALL CAPS for level 1 headers.	2014-08-08 15:20:29 -07:00
Matthew Pickering	cfd8c0214c	EPUB Reader: Improved robustness of image extraction. We now maintain the invariant that when fetchImages is called, all images have absolute paths. This patch fixes several bugs relating to this, as there are three places where images can be introduced: (1) during the HTML parse, (2) as spine elements, (3) as a cover image. For (1), the paths are corrected by the transformation renameImages. For (2) and (3), we need to append the "root" to the path we parse from the spine.	2014-08-08 23:04:03 +01:00
Matthew Pickering	40ae8efddc	EPUB Reader: Fixed regressions in image extraction. Before, the images were relative to the position of the package file. The collapse function changed this so that they were then absolute in the archive, but the fetchImages function wasn't updated to recognise this.	2014-08-08 22:31:27 +01:00
Matthew Pickering	8c551f6f43	EPUB Reader: Use collapseFilePath	2014-08-08 22:31:22 +01:00
Matthew Pickering	2d956677ef	Shared: Added collapseFilePath function. This function removes intermediate "." and ".." from a path.	2014-08-08 22:31:02 +01:00
Matthew Pickering	116f03a70a	EPUB Reader: Removed incorrectly set reader flag	2014-08-08 22:31:02 +01:00
John MacFarlane	aae90a8671	Merge pull request #1503 from jkr/streamlineMath OMath parser: Change signature of exported function.	2014-08-08 13:45:30 -07:00
John MacFarlane	f723a0575d	Markdown writer: Respect -raw_html. pandoc -t markdown-raw_html should not emit any raw HTML, even span and div tags that go with pandoc Span and Div elements. Cleaned up a bit of the logic with extensions and plain.	2014-08-08 13:34:57 -07:00
Jesse Rosenthal	a426812ccc	OMath parser: Change signature of exported function. This changes the signature of the exported `readOMML` to `String -> Either String [Exp]`, so it can now, in theory, be slotted into TeXMath. It doesn't have any real error reporting yet, but that might make more sense once I put it in a branch, and understand how it works in the other readers. It also now reads strings that parse to either oMath or oMathPara elements. Note that the distinction is lost in the output. It's up to the caller to remember the display type.	2014-08-08 16:34:38 -04:00
John MacFarlane	7b47042ae6	Textile reader: fixed list parsing bug. Closes #1500 .	2014-08-08 12:18:47 -07:00
John MacFarlane	dd78dd6d1b	Textile reader: don't allow inline formatting to extend over newline. This matches behavior of RedCarpet, avoids some ugly bugs, and improves performance.	2014-08-08 12:18:47 -07:00
Jesse Rosenthal	2f7a627f6d	OMath: Finish initial cleanup. This gets rid of commented-out functions, cleans up whitespace errors, and exports and imports the correct functions.	2014-08-08 14:16:54 -04:00
Jesse Rosenthal	ba5804f9ec	OMath: Remove Namespaces. We still need to test against prefixes, but this is only going to look at oMath fragments, so we're not going to be worried about looking up the real namespace.	2014-08-08 14:15:17 -04:00
Jesse Rosenthal	0acd139fb1	OMath: Start phasing out internal OMath type. This is the first step in removing the intermediate OMath type, which we no longer need since we're writing straight to TeXMath Exp.	2014-08-08 14:14:30 -04:00
Jesse Rosenthal	cf849443cb	OMath parser: don't group expressions if there's only one.	2014-08-08 14:12:05 -04:00
Matthew Pickering	40602c3df6	HTML EPUB exts: switch element can now be in either the inline or block position	2014-08-08 10:25:40 -07:00
John MacFarlane	94466c0060	HTML reader: Really ignore DOCTYPE and xml declarations. This actually does what `d71b013841` said it did. Revised epub tests to remove the repeated DOCTYPE and xml tags.	2014-08-07 22:12:44 -07:00
John MacFarlane	3c4079edc8	Merge pull request #1488 from mpickering/epubfixes EPUB Reader: Improved image extraction	2014-08-07 19:00:32 -07:00
John MacFarlane	08bed142ba	Merge pull request #1496 from mpickering/master Org Writer: Write anchor elements	2014-08-07 16:29:20 -07:00
Matthew Pickering	07bb41d6da	Org Writer: Write anchor elements. The Org Writer now writes empty span elements which have an id as an anchor. For example `Span ("uid", [], []) []` becomes `<<uid>>`	2014-08-08 00:20:18 +01:00
Matthew Pickering	19d2ff68b1	EPUB Reader: Improved how images are extracted	2014-08-07 22:56:30 +01:00


3203 commits