pandoc

Author	SHA1	Message	Date
John MacFarlane	00b11bcbcf	Fixed exponential parsing bug in textile reader. Closes #3020.	2016-07-14 08:42:38 -07:00
Albert Krewinkel	529146decf	Org reader: fix parsing of verbatim inlines Org rules for allowed characters before or after markup chars were not checked for verbatim text. This resulted in wrong parsing outcomes if the verbatim text contained e.g. space enclosed markup characters as part of the text (`=is_substr = True=`). Forcing the parser to update the positions of allowed/forbidden markup border characters fixes this. This fixes #3016.	2016-07-14 13:33:25 +02:00
John MacFarlane	e2659a46db	Merge pull request #3014 from tarleb/org-writer-div Org writer: improve Div handling	2016-07-05 12:46:13 -07:00
Albert Krewinkel	5378b7c5bd	Org writer: improve Div handling Div blocks handling is changed to make the output look more like idiomatic org mode: - Div-wrapped content is output as-is if the div's attribute is the null attribute. - Div containers with an id but neither classes nor key-value pairs are unwrapped and the id is added as an anchor. - Divs with classes associated with greater block elements are wrapped in a `#+BEGIN`...`#+END` block. - The old behavior for Divs with more complex attributes is kept.	2016-07-05 11:49:45 +02:00
Albert Krewinkel	f417fecf5f	Org reader: replace ugly code with view pattern Some less-than-smart code required a pragma switching off overlapping pattern warnings in order to compile seamlessly. Using view patterns makes the code easier to read and also doesn't require overlapping pattern checks to be disabled.	2016-07-04 11:20:05 +02:00
John MacFarlane	e548b8df07	Merge pull request #3010 from tarleb/org-header-tree Org reader: support archived trees, headline levels export setting	2016-07-03 22:57:22 -07:00
John MacFarlane	4099b2dca4	Odt reader: Removed redundant Monoid constraints.	2016-07-03 22:47:32 -07:00
John MacFarlane	b203a31ba7	Fix warning for parseUrl import.	2016-07-03 22:26:08 -07:00
John MacFarlane	261c3af053	CPP workaround for deprecation of parseUrl in http-client.	2016-07-03 21:29:47 -07:00
Albert Krewinkel	5ffa4abf72	Org reader: support headline levels export setting The depths of headlines can be modified using the `H` option. Deeper headlines will be converted to lists.	2016-07-03 23:28:45 +02:00
John MacFarlane	40caf516aa	Allow 'standout' as a beamer frame option. ## Slide title {.standout} Closes #3007.	2016-07-03 11:56:03 -07:00
Albert Krewinkel	c1f6bd2640	Org reader: put export setting parser into module Export option parsing is distinct enough from general block parsing to justify putting it into a separate module.	2016-07-02 13:14:09 +02:00
John MacFarlane	e0cc9e4463	LaTeX reader: strip off double quotes around image source if present. Avoids interpreting these as part of the literal filename. See #2825.	2016-07-01 15:47:42 -07:00
John MacFarlane	7e712abfa6	LaTeX writer: don't URI-escape image source. Usually this is a local file, and replacing spaces with `%20` ruins things. Closes #2825.	2016-07-01 15:41:33 -07:00
Albert Krewinkel	c4cf6d237f	Org reader: support archived trees export options Handling of archived trees can be modified using the `arch` option. Archived trees are either dropped, exported completely, or collapsed to include just the header when the `arch` option is nil, non-nil, or `headline`, respectively.	2016-07-01 23:05:33 +02:00
Albert Krewinkel	1ebaf6de11	Org reader: refactor comment tree handling Comment trees were handled after parsing, as pattern matching on lists is easier than matching on sequences. The new method of reading documents as trees allows for more elegant subtree removal.	2016-07-01 23:05:32 +02:00
Albert Krewinkel	17484ed01a	Org reader: parse as headlines, convert to blocks Emacs org-mode is based on outline-mode, which treats documents as trees with headlines as nodes. The reader is refactored to parse into a similar tree structure. This simplifies transformations acting on document (sub-)trees.	2016-07-01 23:05:32 +02:00
Albert Krewinkel	2f8d6755f4	Org reader: improve tag and properties type safety Specific newtype definitions are used to replace stringly typing of tags and properties. Type safety is increased while readability is improved.	2016-07-01 23:05:32 +02:00
John MacFarlane	b3382cf377	ZimWiki writer: removed commented out code that confused Haddock. See https://travis-ci.org/jgm/pandoc/jobs/141542247	2016-07-01 10:39:32 -07:00
Alex Ivkin	a73c95f61d	Added Zim Wiki writer, template and tests.	2016-06-30 23:59:43 -07:00
Jesse Rosenthal	b103f829f0	Docx writer: set paragraph to FirstPara after display math We treat display math like block quotes, and apply FirstParagraph style to paragraphs that follow them. These can be styled as the user wishes. (But, when the user is using indentation, this allows for paragraphs to continue after display math without indentation.)	2016-07-01 01:14:16 -04:00
Jesse Rosenthal	2c62f0e122	Writers: treat SoftBreak as space for stripping In Writers.Shared, we strip leading and trailing spaces for display math. Since SoftBreak's are treated as spaces, we should strip those too.	2016-07-01 00:52:52 -04:00
John MacFarlane	3429fa6438	LaTeX reader: fixed `\cite` so it is a NormalCitation not AuthorInText.	2016-06-29 07:59:00 -07:00
John MacFarlane	a349814665	Merge pull request #3001 from tarleb/org-figure-label Org reader: support figure labels	2016-06-26 17:51:51 -07:00
Albert Krewinkel	0f3f5ce1a1	Org reader: support figure labels Figure labels given as `#+LABEL: thelabel` are used as the ID of the respective image. This allows e.g. the LaTeX writer to add proper `\label` markup. This fixes half of #2496 and #2999.	2016-06-26 20:42:22 +02:00
John MacFarlane	38c97320ef	Textile reader: Fix overly aggressive interpretation as images. Spaces are not allowed in the image URL in textile. Closes #2998.	2016-06-25 14:04:47 -07:00
John MacFarlane	d283f9c864	Fixed RST links with no explicit link text. The link `<foo>`_ should have `foo` as both its link text and its URL. See RST spec at <http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#embedded-uris-and-aliases> "The reference text may also be omitted, in which case the URI will be duplicated for use as the reference text. This is useful for relative URIs where the address or file name is also the desired reference text: See `<a_named_relative_link>`_ or `<an_anonymous_relative_link>`__ for details." Closes Debian #828167 -- reported by Christian Heller.	2016-06-25 10:56:37 -07:00
John MacFarlane	a4294800bf	Make --webtex work with the Markdown writer. Closes #1177. This is a convenient option for people using websites whose Markdown flavors don't provide for math.	2016-06-24 14:57:21 -07:00
John MacFarlane	69e59e7f29	Process markdown extensions on command line in L->R order. Previously they were processed, very unintuitively, in R->L order, so that `markdown-tex_math_dollars+tex_math_dollars` had `tex_math_dollars` disabled. Closes #2995.	2016-06-23 23:04:42 -07:00
John MacFarlane	a820c1bd1c	Textile reader: fixed attributes. Attributes can't be followed by a space. So, _(class)emph_ but _(noclass) emph_ Closes #2984.	2016-06-23 10:28:54 -07:00
John MacFarlane	139d418d4b	Markdown writer: use raw HTML for simple, pipe tables with linebreaks. Markdown line breaks involve a newline, and simple and pipe tables can't contain one. Closes #2993.	2016-06-23 10:00:33 -07:00
Jesse Rosenthal	032ba8dd0c	Docx reader: Add warning for advanced comment formatting. We can't guarantee we'll convert every comment correctly, though we'll do the best we can. This warns if the comment includes something other than Para or Plain.	2016-06-23 10:50:46 -04:00
Jesse Rosenthal	5f0cd89129	docx reader: enable warnings in top-level reader Previously we had only allowed for warnings in the parser. Now we allow for them in the `Docx.hs` as well. The warnings are simply concatenated.	2016-06-23 10:50:46 -04:00
Jesse Rosenthal	8bb739f7ff	Docx reader: add simple comment functionality. This adds simple track-changes comment parsing to the docx reader. It is turned on with `--track-changes=all`. All comments are converted to inlines, which can list some information. In the future a warning will be added for comments with formatting that seems like it will be excessively denatured. Note that comments can extend across blocks. For that reason there are two spans: `comment-start` and `comment-end`. `comment-start` will contain the comment. `comment-end` will always be empty. The two will be associated by a numeric id.	2016-06-23 10:50:46 -04:00
Jesse Rosenthal	cbc2c15f0f	Shared: Add BlockQuote to blocksToInlines	2016-06-23 10:50:46 -04:00
Jesse Rosenthal	2b701f9389	Shared: introduce blocksToInlines function This is a lossy function for converting `[Block] -> [Inline]`. Its main use, at the moment, is for docx comments, which can contain arbitrary blocks (except for footnotes), but which will be converted to spans. This is, at the moment, pretty useless for everything but the basic `Para` and `Plain` comments. It can be improved, but the docx reader should probably emit a warning if the comment contains more than this.	2016-06-23 10:50:46 -04:00
John MacFarlane	319a56aefc	Merge pull request #2992 from tarleb/org-partial-functions Org reader: remove partial functions	2016-06-22 12:51:37 -07:00
John MacFarlane	ba7868765a	HTML writer: Better support for raw LaTeX environments. Previously we just passed all raw TeX through when MathJax was used for HTML math. This passed through too much. With this patch, only raw LaTeX environments that MathJax can handle get passed through. This patch also causes raw LaTeX environments to be treated as math, when possible, with MathML and WebTeX output. Closes #2758.	2016-06-22 11:47:44 -07:00
Albert Krewinkel	7df656089f	Org reader: remove partial functions Partial functions like `head` lead to avoidable errors and should be avoided. They are replaced with total functions. This fixes #2991.	2016-06-21 23:51:15 +02:00
John MacFarlane	58d60b1c85	Changed email-obfuscation default to no obfuscation. - `writerEmailObfuscation` in `defaultWriterOptions` is now `NoObfuscation` - the default for the command-line `--email-obfuscation` option is now `none`. Closes #2988.	2016-06-20 10:37:23 -07:00
Albert Krewinkel	29552eff3e	Org reader: support arbitrary raw inlines Org mode allows arbitrary raw inlines ("export snippets" in Emacs parlance) to be included as `@@format:raw foreign format text@@`. Support for this feature is added to the Org reader.	2016-06-13 23:53:14 +02:00
Albert Krewinkel	cf2502de8f	Org writer: support arbitrary raw inlines Org mode allows arbitrary raw inlines ("export snippets" in Emacs parlance) to be included as `@@format:raw foreign format text@@`. Support for this feature is added to the Org writer.	2016-06-13 23:13:05 +02:00
Ivo Clarysse	240cdfd1b3	Docbook writer: Declare xlink namespace in Docbook5 output	2016-06-07 06:03:06 -07:00
Albert Krewinkel	8a9f5915ab	Org reader: add support for "Berkeley-style" cites A specification for an official Org-mode citation syntax was drafted by Richard Lawrence and enhanced with the help of others on the orgmode mailing list. Basic support for this citation style is added to the reader. This closes #1978.	2016-06-05 11:28:57 +02:00
Albert Krewinkel	06dfe3276d	Org reader: add semicolon to list of special chars Semicolons are used as special characters in citations syntax. This ensures the correct parsing of Pandoc-style citations: [prefix; @key; suffix] Previously, parsing would have failed unless there was a space or other special character as the last <prefix> character.	2016-06-05 11:28:57 +02:00
Albert Krewinkel	f56792927f	Org reader: support special strings export option Parsing of special strings (like '...' as ellipsis or '--' as en dash) can be toggled using the `-` option.	2016-06-03 11:41:23 +02:00
Albert Krewinkel	d4de8451b9	Org reader: support emphasized text export option Parsing of emphasized text can be toggled using the `*` option. This influences parsing of text marked as emphasized, strong, strikeout, and underline. Parsing of inline math, code, and verbatim text is not affected by this option.	2016-06-03 11:17:02 +02:00
Albert Krewinkel	952a7dac58	Org reader: support smart quotes export option Reading of smart quotes can be toggled using the `'` option.	2016-06-03 11:16:35 +02:00
Albert Krewinkel	729fca311f	Org reader: drop unused field from parser state The `OrgParserState` contained both an `orgStateMeta` and `orgStateMeta'` field, the former for plain meta information and the latter for F-monad wrapped meta info. The plain meta info is only used to make `OrgParserState` an instance of the `HasMeta` class, which in turn is never used in the reader. The (F Meta) version is hence renamed to the "un-primed" version while the other one is dropped.	2016-06-02 15:30:21 +02:00
Albert Krewinkel	512bf2eebf	Org reader: undo code duplication Some code was duplicated (copy-pasted) or placed in an inappropriate module during the modularization refactoring. Those functions are moved into a `Shared` module, as was originally intended but forgotten. Better documentation of the respective functions is a positive side-effect.	2016-06-02 15:30:20 +02:00
John MacFarlane	061bc60f70	Merge pull request #2950 from tarleb/org-ref-support Org reader: support org-ref style citations	2016-05-31 12:44:29 -07:00
John MacFarlane	669ecbd4ab	Merge pull request #2954 from tarleb/org-export-blocks Org export blocks	2016-05-31 11:16:08 -07:00
John MacFarlane	561afac0bc	brazilian -> brazil for polyglossia. Closes #2953.	2016-05-31 11:15:21 -07:00
Albert Krewinkel	c17c62a2c7	Org reader: support new syntax for export blocks Org-mode version 9 uses a new syntax for export blocks. Instead of `#+BEGIN_<FORMAT>`, where `<FORMAT>` is the format of the block's content, the new format uses `#+BEGIN_export <FORMAT>`. Both types are supported.	2016-05-29 21:08:50 +02:00
Albert Krewinkel	4f84cf02c7	Org reader: refactor BEGIN…END block parsing - Reorder functions, grouping related functions together. - Demote simple functions to local functions if they are used just once. - Rename and document functions to increase code readability. - Fix handling of whitespace in blocks, allowing content to be indented less than the block header.	2016-05-29 21:08:36 +02:00
Albert Krewinkel	bc82123122	Org reader: rename `parseInlines` to `inlines` Having a function starting with `parse` in a parsing library is overly redundant. Let's use a nicer, shorter name more in line with the rest of the library.	2016-05-29 21:08:31 +02:00
Albert Krewinkel	f226cb88b0	Org reader: support org-ref style citations The org-ref package is an org-mode extension commonly used to manage citations in org documents. Basic support for the `cite:citeKey` and `[[cite:citeKey][prefix text::suffix text]]` syntax is added.	2016-05-27 21:19:28 +02:00
Albert Krewinkel	eea6d6568f	Org reader: extract blocks parser to module Block parsing code is moved to a separate module. This is part of the Org-mode reader cleanup effort.	2016-05-25 23:21:40 +02:00
Albert Krewinkel	39e8b4276e	Org reader: extract inline parser to module Inline parsing code is moved to a separate module. Parsers for block starts are extracted as well, as those are used in the `endline` parser. This is part of the Org-mode reader cleanup effort.	2016-05-25 22:54:45 +02:00
Albert Krewinkel	a340c7249f	Org reader: extract parsing function to module The Org-mode reader uses many functions defined in the `Text.Pandoc.Parsing` utility module. Some of the functions are overwritten with versions adapted to Org-mode idiosyncrasies. These special functions, as well as the normal Pandoc versions, are combined in a single module to increase the ease of use. This leads to decoupling of Org-mode and Pandoc and hence to slightly cleaner code. The downside is code-bloat due to repeated import/export statements.	2016-05-25 22:53:55 +02:00
mb21	340f0aaef8	EPUB Reader: normalise Link id as well	2016-05-24 17:42:37 +02:00
Carlos Sosa	5667e0959a	Org writer: add drawer capability For the implementation of the Drawer element in the Org Writer, we make use of a generic Block container with attributes. The presence of a `drawer` class defines that the `Div` constructor is a drawer. The first class defines the drawer name to use. The key-value list in the attributes defines the keys to add inside the Drawer. Lastly, the list of Block elements contains miscellaneous blocks elements to add inside of the Drawer. Signed-off-by: Albert Krewinkel <albert@zeitkraut.de>	2016-05-23 10:00:14 +02:00
Albert Krewinkel	a4717c2fc5	Org reader: respect drawer export setting The `d` export option can be used to control which drawers are exported and which are discarded. Basic support for this option is added here.	2016-05-23 09:44:37 +02:00
Albert Krewinkel	f3d27e4c80	Org reader/writer: use CUSTOM_ID in properties The `ID` property is reserved for internal use by Org-mode and should not be used. The `CUSTOM_ID` property is to be used instead; it is converted to the `ID` property for certain export formats. The reader and writer erroneously used `ID`. This is corrected by using `CUSTOM_ID` where appropriate.	2016-05-22 23:01:47 +02:00
John MacFarlane	446cf6a1cf	HTML reader: fixed bug in pClose. This caused exponential parsing behavior in documents with unclosed tags in dl, dd, dt.	2016-05-21 23:05:00 -07:00
Albert Krewinkel	cd3282b08d	Org writer: add :PROPERTIES: drawer support This allows header attributes to be added to org documents in the form of `:PROPERTIES:` drawers. All available attributes are stored as key/value pairs. This reflects the way the org reader handles `:PROPERTIES:` blocks. This closes #1962.	2016-05-20 17:01:50 +02:00
Albert Krewinkel	68d388f833	Org reader: add :PROPERTIES: drawer support Headers can have optional `:PROPERTIES:` drawers associated with them. These drawers contain key/value pairs like the header's `id`. The reader adds all listed pairs to the header's attributes; `id` and `class` attributes are handled specially to match the way `Attr` are defined. This also changes behavior of how drawers of unknown type are handled. Instead of including all unknown drawers, those are not read/exported, thereby matching current Emacs behavior. This closes #1877.	2016-05-20 17:01:26 +02:00
John MacFarlane	0958f2f5d0	Merge pull request #2927 from tarleb/org-attr-html Org reader support for ATTR_HTML statements	2016-05-19 10:44:11 -07:00
Albert Krewinkel	16e233475a	Org reader: add support for ATTR_HTML attributes Arbitrary key-value pairs can be added to some block types using a `#+ATTR_HTML` line before the block. Emacs Org-mode only includes these when exporting to HTML, but since we cannot make this distinction here, the attributes are always added. The functionality is now supported for figures. This closes #1906.	2016-05-19 09:55:12 +02:00
Albert Krewinkel	26e8d98be2	Org reader: use custom `anyLine` Additional state changes need to be made after a newline is parsed, otherwise markup may not be recognized correctly. This fixes a bug where markup after certain block-types would not be recognized. E.g. `/emph/` in the following snippet was not parsed as emphasized. foo # comment /emph/	2016-05-19 09:35:47 +02:00
Albert Krewinkel	1dda535378	Org reader: refactor block attribute handling A parser state attribute was used to keep track of block attributes defined in meta-lines. Global state is undesirable, so block attributes are no longer saved as part of the parser state. Old functions and the respective part of the parser state are removed.	2016-05-19 09:33:51 +02:00
John MacFarlane	847167804a	EPUB reader: unescape URIs in spine. This should fix #2924. Testing on the epub that caused the problem originally would be welcome.	2016-05-17 09:38:52 -07:00
John MacFarlane	7be30a40f1	LaTeX writer: Don't escape underscore in labels. Previously they were escaped as ux5f. Closes #2921.	2016-05-17 09:18:52 -07:00
John MacFarlane	344412cba8	Merge pull request #2894 from sid-kap/rst-code-class Add class option for code block in RST reader	2016-05-12 00:03:14 -07:00
John MacFarlane	609fb33302	Merge pull request #2913 from jlduran/strut-minipage-tables Retake on strut with \minipage inside tables	2016-05-11 23:57:47 -07:00
John MacFarlane	3800cb3d42	Merge pull request #2912 from tarleb/org-export-settings Org reader: basic support for export settings	2016-05-11 13:36:02 -07:00
Albert Krewinkel	be5cccf248	Org reader: parse but ignore export options All known export options are parsed but ignored.	2016-05-11 19:13:43 +02:00
Albert Krewinkel	76143de97e	Org reader: add support for sub/superscript export options Org-mode allows export settings to be specified via `#+OPTIONS` lines. Disabling simple sub- and superscripts is one of these export options; this option is now supported.	2016-05-11 19:13:43 +02:00
Albert Krewinkel	7a0729ea09	Org reader: move parser state into separate module The org reader code has become large and confusing. Extracting smaller parts into submodules should help to clean things up.	2016-05-11 19:13:42 +02:00
Jose Luis Duran	ec2fc30288	Retake on strut with \minipage inside tables Reimplement on `4c684561ee` The problem with `4c68456` was a space between the cell contents and the `\strut` that affected the alignment.	2016-05-11 14:02:09 -03:00
John MacFarlane	f7601297f0	Avoid lazy foldl in LaTeX writer.	2016-05-09 18:25:57 -07:00
John MacFarlane	fd9ec835ec	Merge pull request #2907 from tarleb/org-fixes Org fixes (reader and writer)	2016-05-09 10:17:56 -07:00
Albert Krewinkel	d32878b84b	Org writer: print empty table rows Empty table rows should not be dropped from the output, so row-height is always set to be at least 1.	2016-05-09 19:06:24 +02:00
Albert Krewinkel	10a809f126	Org reader: fix inline-LaTeX regression The last fix for whitespace handling of inline LaTeX commands was incorrect, preventing correct recognition of inline LaTeX commands which contain spaces. This fix ensures that only trailing whitespace is cut off.	2016-05-09 19:06:04 +02:00
roblabla	acd492c7f4	Allow spaces before '!' in MediaWiki table header	2016-05-09 17:54:40 +02:00
John MacFarlane	21d1a3b57c	Merge pull request #2898 from tarleb/org-table-refactoring Org reader: table parsing code refactoring and fixes	2016-05-05 16:22:56 -07:00
Albert Krewinkel	405c3e9c36	Org reader: fix spacing after LaTeX-style symbols The org-reader was dropping space after unescaped LaTeX-style symbol commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä` instead. This seems to be because the LaTeX-reader treats the command-terminating space as part of the command. Dropping the trailing space from the symbol-command fixes this issue.	2016-05-04 23:16:23 +02:00
Albert Krewinkel	2d825603c6	Org reader: fix handling of empty table cells, rows This fixes Org mode parsing of some corner cases regarding empty cells and rows. Empty cells weren't parsed correctly, e.g. `\|\|\|` should be two empty cells, but would be parsed as a single cell containing a pipe character. Empty rows were parsed as alignment rows and dropped from the output. This fixes #2616.	2016-05-04 16:02:03 +02:00
Albert Krewinkel	a51e4e8215	Org reader: refactor rows-to-table conversion This refactors the code converting a list of table lines to an org table ADT. The old code was simplified and is now slightly less ugly.	2016-05-04 16:01:22 +02:00
Albert Krewinkel	d5e4bc179c	Org reader: stop padding short table rows Emacs Org-mode doesn't add any padding to table rows. The first row (header or first body row) is used to determine the column count, no other magic is performed. The org reader was padding rows to the length of the longest table row. This was done due to a misunderstanding of how Org handles tables. This feature reflected how Org-mode handles tables when pressing <TAB>. The Org exporter however, which is what the reader should implement, doesn't do any of this. So this was a mis-feature that made the reader more complex and reduced comparability. It was hence removed.	2016-05-04 15:48:07 +02:00
John MacFarlane	ee4e863225	Merge pull request #2890 from bcdevices/docbook5-writer Docbook5 write support	2016-05-01 22:43:38 -07:00
Sidharth Kapur	490c2b543d	Add class option for code block in RST reader According to http://docutils.sourceforge.net/docs/ref/rst/directives.html#code, the code directive supports the ":class:" option.	2016-05-01 21:42:58 -05:00
Jesse Rosenthal	99eac312fe	Binary fmts throw PandocError on zip-archive fail Commit `91dc3342` made `readDocx` throw PandocError if there was an unarchiving error. This extends that fix to `readOdt` and `readEPUB`.	2016-05-01 18:27:20 -04:00
John MacFarlane	1fbe79db05	LaTeX writer: use {} around options containing special chars. Closes #2892.	2016-05-01 11:20:26 -07:00
Jesse Rosenthal	91dc334249	Docx Reader: Throw PandocError on unzip failure Previously, readDocx would error out if zip-archive failed. We change the archive extraction step from `toArchive` to `toArchiveOrFail`, which returns an Either value.	2016-05-01 12:17:12 -04:00
Ivo Clarysse	fd36e6b64a	Docbook5 writer: Properly handle ulink/link	2016-04-29 16:06:55 -07:00
Ivo Clarysse	987ec3a752	Write out Docbook 5 namespace	2016-04-29 15:43:15 -07:00
John MacFarlane	aa4a1d527a	HTML writer: ensure mathjax link is added when math appears in footnote. Previously if a document only had math in a footnote, the MathJax link would not be added. Closes #2881.	2016-04-29 14:54:54 -07:00
Ivo Clarysse	271cb4d845	Add docbook5 writer support	2016-04-29 14:00:46 -07:00
John MacFarlane	32f1b0a5f1	Revert "LaTeX writer: Add `\strut` to fix multiline tables" This reverts commit `4c684561ee`. See https://groups.google.com/d/msg/pandoc-discuss/u6J-_aCProU/UufN3IYRAgAJ This should fix uneven spacing issues in multiline tables.	2016-04-27 17:25:45 -07:00
John MacFarlane	ece215ed7d	Merge pull request #2735 from mb21/patch-1 LaTeX Writer: fix polyglossia to babel env mapping	2016-04-26 23:09:02 -07:00
John MacFarlane	cc0527bf31	Merge pull request #2829 from adunning/patch-1 LaTeX writer: Add missing languages.	2016-04-26 23:08:15 -07:00
John MacFarlane	cc82851a6a	Merge pull request #2876 from shosti/org-code-indent Ignore leading space in org code blocks	2016-04-26 23:07:29 -07:00
John MacFarlane	1985164816	LaTeX writer: ignore --incremental unless -t beamer. Closes #2843.	2016-04-26 21:50:37 -07:00
Emanuel Evans	1bfe39e24c	Ignore leading space in org code blocks Fixes #2862 Also fix up tab handling for leading whitespace in code blocks.	2016-04-26 10:29:59 -07:00
Jesse Rosenthal	a385ee1d4f	Docx Reader: parse `moveTo` and `moveFrom` `moveTo` and `moveFrom` are track-changes tags that are used when a block of text is moved in the document. We now recognize these tags and treat them the same as `insert` and `delete`, respectively. So, `--track-changes=accept` will show the moved version, while `--track-changes=reject` will show the original version.	2016-04-15 14:09:18 -04:00
John MacFarlane	4b49f923cb	Markdown reader: Fix pandoc title blocks with lines ending in 2 spaces. Closes #2799. Also added -s to markdown-reader-more test.	2016-04-10 09:13:53 -07:00
John MacFarlane	773bbb8fc7	Markdown + HTML readers: be more forgiving about unescaped &. We are now more forgiving about parsing invalid HTML with unescaped `&` as raw HTML. (Previously any unescaped `&` would cause pandoc not to recognize the string as raw HTML.) Closes #2410.	2016-04-10 07:39:36 -07:00
Andrew Dunning	9765ef2ce6	LaTeX writer: Add missing languages. Updates the list from the hyphenation files at <http://mirror.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/loadhyph/>.	2016-04-01 16:47:33 +01:00
Andrew Dunning	0c37a7c488	Recognize `la-x-classic` as Classical Latin. This allows one to access the hyphenation patterns at <http://mirrors.ctan.org/language/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-la-x-classic.tex>, using its private language tag.	2016-03-30 14:15:47 +01:00
John MacFarlane	f74498cb47	EPUB writer: set 'navpage' variable on nav page. This allows templates to treat it differently.	2016-03-26 13:14:50 -07:00
John MacFarlane	9742c48647	Removed two superfluous lines.	2016-03-25 09:05:38 -07:00
John MacFarlane	f47b369f37	LaTeX writer: better positioning for hypertarget in figures. Closes #2813.	2016-03-24 16:44:33 -07:00
John MacFarlane	bb6897a13e	LaTeX writer: Fixed position of label in figures. Partially addresses #2813. This isn't perfect, because now the hypertarget is in the wrong place -- when you link to the figure, the screen is positioned with the caption at the top, and most of the figure off screen. So this needs a bit more tweaking.	2016-03-24 09:41:45 -07:00
John MacFarlane	499985c1a3	Updated copyright dates to include 2016.	2016-03-22 17:20:39 -07:00
John MacFarlane	b1ffdf3b01	Fixed bug in Markdown raw HTML parsing. This was a regression, with the rewrite of `htmlInBalanced` (from `Text.Pandoc.Readers.HTML`) in 1.17. It caused newlines to be omitted in raw HTML blocks. Closes #2804.	2016-03-22 16:56:10 -07:00
Mauro Bieg	44f95484a4	LaTeX Writer: fix polyglossia to babel env mapping allow for optional argument in square brackets, closes #2728	2016-03-20 19:17:30 +01:00
John MacFarlane	3af753de47	Merge pull request #2637 from mb21/latex-figure-label LaTeX writer: figure label	2016-03-19 13:56:14 -07:00
John MacFarlane	976e7e2054	ConTeXt writer: fix whitespace at line beginning in line blocks. Add a `\strut` after `\crlf` before space. Closes #2744, #2745. Thanks to @c-foster. This uses the fix suggested by @c-foster. Mid-line spaces are still not supported, because of limitations of the Markdown parser.	2016-03-18 16:36:56 -07:00
John MacFarlane	e821b05125	LaTeX writer: Avoid double toprule in headerless table with caption. Closes #2742.	2016-03-18 16:16:18 -07:00
Jesse Rosenthal	28c7617f19	Docx reader: Handle alternate content Some word functions -- especially graphics -- give various choices for content so there can be backwards compatibility. This follows the largely undocumented feature by working through the choices until we find one that works. Note that we had to split out the processing of child elems of runs into a separate function so we can recurse properly. Any processing of an element within a run (other than a plain run) should go into `childElemToRun`.	2016-03-18 09:38:26 -04:00
Jesse Rosenthal	855c8b43f0	Docx reader: Don't make numbered heads into lists. Word uses list numbering styles to number its headings. We only call something a numbered list if it does not also have a heading style.	2016-03-16 12:50:32 -04:00
Jesse Rosenthal	5c055b4cf3	Introduce file-scope parsing (parse-before-combine) Traditionally pandoc operates on multiple files by first concatenating them (around extra line breaks) and then processing the joined file. So it only parses a multi-file document at the document scope. This has the benefit that footnotes and links can be in different files, but it also introduces a couple of difficulties: - it is difficult to join files with footnotes without some sort of preprocessing, which makes it difficult to write academic documents in small pieces. - it makes it impossible to process multiple binary input files, which can't be catted. - it makes it impossible to process files from different input formats. This commit introduces an alternative method. Instead of catting the files first, it parses the files first, and then combines the parsed output. This makes it impossible to have links across multiple files, and auto-identified headers won't work correctly if headers in multiple files have the same name. On the other hand, footnotes across multiple files will work correctly and will allow more freedom for input formats. Since ByteStringReaders can currently only read one binary file, and will ignore subsequent files, we also change the behavior to automatically parse before combining if using the ByteStringReader. If we use one file, it will work as normal. If there is more than one file it will combine them after parsing (assuming that the format is the same). Note that this is intended to be an optional method, defaulting to off. Turn it on with `--file-scope`.	2016-03-15 12:52:51 -04:00
Jesse Rosenthal	68fd333ec4	Add a general ByteStringReader with warnings. Have docx reader use it.	2016-03-12 17:08:20 -05:00
Jesse Rosenthal	ee03e954d0	Add readDocxWithWarnings The regular readDocx just becomes a special case.	2016-03-12 17:08:20 -05:00
Jesse Rosenthal	102ba9ecb8	Docx Reader: Add state to the parser, for warnings In order to be able to collect warnings during parsing, we add a state monad transformer to the D monad. At the moment, this only includes a list of warning strings (nothing currently triggers them, however). We use StateT instead of WriterT to correspond more closely with the warnings behavior in T.P.Parsing.	2016-03-12 17:08:20 -05:00
John MacFarlane	a485c42d78	Fixed behavior of base tag. + If the base path does not end with slash, the last component will be replaced. E.g. base = `http://example.com/foo` combines with `bar.html` to give `http://example.com/bar.html`. + If the href begins with a slash, the whole path of the base is replaced. E.g. base = `http://example.com/foo/` combines with `/bar.html` to give `http://example.com/bar.html`. Closes #2777.	2016-03-10 19:59:55 -08:00
mb21	139fa54d48	Docx Writer: handle image alt text closes #2754	2016-03-10 08:56:08 +01:00
John MacFarlane	2b55b76ebe	Markdown reader: Improved pipe table parsing. Fixes #2765. Added test case.	2016-03-09 11:46:00 -08:00
John MacFarlane	54a68616d7	Markdown reader: Clean up pipe table parsing.	2016-03-09 10:11:32 -08:00
John MacFarlane	6e950a8eb5	Markdown reader: allow `+` separators in pipe table cells. We already allowed them in the header, but not in the body rows, for some reason. This gives compatibility with org-mode tables.	2016-03-09 08:44:31 -08:00
John MacFarlane	4ed64835cb	Markdown reader: don't cross line boundary parsing pipe table row. Previously an emph element could be parsed across the newline at the end of the pipe table row. I thought this would help with #2765, but it doesn't.	2016-03-09 08:33:13 -08:00
John MacFarlane	6bfaa5ad15	DokuWiki writer: use $$ for display math.	2016-03-08 10:08:14 -08:00
Jesse Rosenthal	0b9c54d9f3	Docx reader: update feature checklist. The feature checklist in the source code was out of date. Update.	2016-03-08 00:36:13 -05:00
John MacFarlane	7c6a3c0f69	LaTeX reader: handle interior `$` characters in math. e.g. `$$\hbox{$i$}$$`. Partially addresses #2743.	2016-02-28 11:14:03 -08:00
Jesse Rosenthal	a7a0b452a5	Docx Reader: Get rid of Modifiable typeclass. The docx reader used to use a Modifiable typeclass to combine both Blocks and Inlines. But all the work was in the inlines. So most of the generality was wasted, at the expense of making the code harder to understand. This gets rid of the generality, and adds functions for Blocks and Inlines. It should be a bit easier to work with going forward.	2016-02-26 08:57:53 -05:00
John MacFarlane	f2bd6fd37c	Make protocol-relative URIs work again. Closes #2737.	2016-02-23 21:58:10 -08:00
John MacFarlane	04d1e40f37	Markdown reader: use htmlInBalanced for rawVerbatimBlock. This should give better performance. See #2730.	2016-02-21 07:56:41 -08:00
John MacFarlane	9693de7f59	Fixed some linter warnings.	2016-02-20 22:16:39 -08:00
John MacFarlane	29706ee02d	Merge pull request #2646 from tarleb/org-figure-with-no-name Prefix even empty figure names with "fig:"	2016-02-20 21:44:39 -08:00
John MacFarlane	649cfb61b8	Merge pull request #2668 from monofon/fix/yaml-metadata-block-bottom-line Markdown writer: Use hyphens for yaml metadata block bottom line	2016-02-20 21:43:15 -08:00
John MacFarlane	e369e60fb4	Merge pull request #2691 from tarleb/org-image-file-links Org reader: Refactor link-target processing	2016-02-20 21:42:12 -08:00
John MacFarlane	1534052dd9	HTML reader: rewrote htmlInBalanced. This version avoids an exponential performance problem with `<script>` tags, and it should be faster in general. Closes #2730.	2016-02-20 15:00:31 -08:00
Jesse Rosenthal	4438ff17fb	LaTeX writer: clean up options parser. Make sure that we require the closing bracket.	2016-02-18 23:35:38 -05:00
Jesse Rosenthal	4112b321cd	LaTeX writer: treat memoir template with `article` opt as article We currently treat all memoir templates as books. This means that pandoc will infer the `--chapters` argument, even if the `article` option is set for memoir. This commit makes pandoc treat the document as an article if there is an article option (i.e., `\documentclass[12pt,article]{memoir}`). Note that this refactors out the parsec parsers for document class and options, to make it a little clearer what's going on.	2016-02-18 22:32:38 -05:00
John MacFarlane	b8dadc608a	HTML reader: properly handle an empty cell in a simple table. Closes #2718.	2016-02-16 11:05:51 -08:00
John MacFarlane	bbc67dee36	Removed `tex_math_single_backslash` from `markdown_github` options. Closes #2707.	2016-02-09 22:30:52 -08:00
John MacFarlane	a692bd2872	Custom writer: Pass attributes parameter to CaptionedImage. Closes #2697.	2016-02-05 16:49:27 -08:00
John MacFarlane	6cb4991f6b	Markdown reader: Fixed bug with smart quotes around tex math. Previously smart quotes were incorrect in the following: '$\neg(x \in x)$'. (because of the following period). This commit fixes the problem, which was introduced by commit `4229cf2d92`.	2016-02-04 12:09:26 -08:00
John MacFarlane	93a05dffd3	HTML writer: don't include alignment attribute for default table columns. Previously these were given "left" alignment. Better to leave off alignment attributes altogether. Closes #2694.	2016-02-03 13:31:21 -08:00
Jesse Rosenthal	2ee7752d14	Docx reader: Add a "Link" modifier to Reducible We want to make sure that links have their spaces removed, and are appropriately smushed together. This closes #2689	2016-02-02 14:40:09 -05:00
Albert Krewinkel	92e6ae47f6	Org reader: Refactor link-target processing Cleanup of the code for link target handling. Most notably, the canonicalization of a link is handled by a separate function. This fixes #2684.	2016-01-31 23:23:09 +01:00
John MacFarlane	18745585c1	LaTeX reader: `inlineCommand` now gobbles an empty `{}` after any command. This gives better results when people write e.g. `\TeX{}` in Markdown. \TeX{} and \LaTeX{} now work as expected with `pandoc -f markdown -t latex`. Closes #2687.	2016-01-31 10:52:46 -08:00
John MacFarlane	a02c26d9f4	HTML reader: handle multiple meta tags with same name. Put them in a list in the metadata so they are all preserved, rather than (as before) throwing out all but one.	2016-01-29 11:51:01 -08:00
John MacFarlane	76983c31f2	Properly handle LaTeX "math" environment as inline math. See #2171.	2016-01-29 10:11:45 -08:00
John MacFarlane	a1021bdda6	Textile reader: Support `>`, `<`, `=`, `<>` text alignment attributes. Closes #2674.	2016-01-25 09:34:49 -08:00
John MacFarlane	11c5831a1f	Make language extensions trigger highlighting. For example, `py` will now work as well as `python`. Closes jgm/highlighting-kate#83.	2016-01-24 14:15:06 -08:00
John MacFarlane	20170c328f	Changed type of Shared.uniqueIdent argument from [String] to Set String. This avoids performance problems in documents with many identically named headers. Closes #2671.	2016-01-22 10:16:47 -08:00
John MacFarlane	3b39b16a4b	Merge pull request #2638 from c-forster/teiwriter Add TEI Writer.	2016-01-21 15:23:50 -08:00
Henrik Tramberend	556d0c1810	Markdown writer: Use hyphens for yaml metadata block bottom line	2016-01-21 12:44:16 +01:00
Henrik Tramberend	7a18879a36	LaTeX writer: Allow more flexible table alignment	2016-01-20 13:21:26 +01:00
John MacFarlane	4d74a966c4	Added some entity tests in Markdown reader tests. Change types of divs. From Docbook "sect#" and "simplesect" to "level#" and "section." Add tests. Add mention of TEI to README. Small changes to TEI writer.	2016-01-19 14:03:57 -05:00
csforste	25a9ca697a	Add TEI Writer.	2016-01-19 14:03:57 -05:00
John MacFarlane	f2c0974a26	HTML writer: harmless code simplification. Since the 'math' is only put into the template if stMath is set anyway, there's no need for this conditional.	2016-01-14 10:55:04 -08:00
John MacFarlane	f45a8e1d3b	Org writer - pass through RawInline with format "org".	2016-01-13 23:01:51 -08:00
Albert Krewinkel	fabbd1aa79	Prefix even empty figure names with "fig:" The convention used by pandoc for figures is to mark them by prefixing the name with "fig:". The org reader failed to do this if a figure had no name. The test for this was broken as well. This fixes #2643.	2016-01-11 22:23:59 +01:00
John MacFarlane	f34382ef2c	Depend on deepseq rather than deepseq-generics. See fpco/stackage#1096.	2016-01-11 12:49:28 -08:00
John MacFarlane	8611ac56a6	Fixed regression in latex smart quote parsing. Closes #2645. In cases where a match was not found for a quote, everything from the open quote to the end of the paragraph was being dropped.	2016-01-11 12:17:49 -08:00
mb21	1fde92053f	LaTeX writer: figure label	2016-01-10 13:30:32 +01:00
John MacFarlane	1506e62f48	LaTeX writer: restore old treatment of Span. A Span is rendered with surrounding {braces}. This was a regression in 1.16. Closes #2624.	2016-01-09 12:16:24 -08:00
John MacFarlane	729911ad74	Fixed shadowing warning.	2016-01-08 20:20:37 -08:00
John MacFarlane	5884ff6994	Work around tagsoup bug - not allowing uppercase x in hex entities. Issue submitted at tagsoup.	2016-01-08 17:33:32 -08:00
John MacFarlane	12a5bd3c8d	Entity handling fixes: - Text.Pandoc.XML.fromEntities: handle entities without a semicolon. Always lookup character references with the trailing ';', even if it wasn't present. And never add it when looking up numerical entities. (This is what tagsoup seems to require.) - Text.Pandoc.Parsing.characterReference: Always lookup character references with the trailing ';', and leave off the ';' when looking up numerical entities. This fixes a regression for e.g. `&lang;`.	2016-01-08 17:08:01 -08:00
John MacFarlane	9320d359a2	Merge pull request #2629 from tarleb/org-noexport-fix Fix function dropping subtrees tagged :noexport:	2016-01-07 11:34:27 -08:00
Albert Krewinkel	b3b00da43d	Fix function dropping subtrees tagged :noexport: Continue scanning for comment subtrees beyond only the first block. Note to self: when writing a recursive function, don't forget to, you know, actually recurse. Shout to @mrvdb for noticing this. This fixes #2628.	2016-01-07 19:56:44 +01:00
John MacFarlane	c4fdf28815	Markdown reader: renormalize table column widths if they exceed 100%. Closes #2626.	2016-01-07 10:40:30 -08:00
John MacFarlane	a796538d84	RST, Markdown writers: Fixed rendering of grid tables with blank rows. Closes #2615.	2016-01-05 14:04:10 -08:00
John MacFarlane	4990350fc7	Fixed v1.16 regression with --latex-engine. In 1.16 --latex-engine raises an error if a full path is given. This commit fixes this regression. Closes #2618.	2016-01-04 22:44:50 -08:00
John MacFarlane	97c9691696	Textile reader: don't allow block HTML tags in inline contexts. The reader previously did allow this, following redcloth, which happily parses `Html blocks can be <div>inlined</div> as well.` as `<p>Html blocks can be <div>inlined</div> as well.</p>` This is invalid HTML, and this kind of thing can lead to parsing problems (stack overflows) as well. So this commit undoes this behavior. The above sample now produces: `<p>Html blocks can be</p> <div> <p>inlined</p> </div> <p>as well.</p>`	2016-01-02 22:34:06 -08:00
John MacFarlane	75695b1817	MediaWiki reader: interpret markup inside `<tt>`, `<code>`. Closes #2607.	2016-01-02 12:26:16 -08:00
John MacFarlane	a68e072bac	MediaWiki writer: fix spacing issues. + Start cell on new line unless it's a single Para or Plain. + For single Para or Plain, insert a space after the `|` to avoid problems when the text begins with a character like `-`. Closes #2604, closes #2606.	2016-01-02 12:14:12 -08:00
John MacFarlane	b27783e2ec	Use cmark 0.5. Closes #2605.	2015-12-29 19:52:06 -08:00
John MacFarlane	297345098d	ConTeXt writer: set default layout based on margin-left, etc. This sets up `\setuplayout` based on the variables `margin-left`, `margin-right`, `margin-bottom`, and `margin-top`, if no layout is given.	2015-12-22 13:28:11 -08:00
John MacFarlane	f9202f5d39	LaTeX writer: create defaults for geometry using margin-left etc. If `geometry` has no value, but `margin-left`, `margin-right`, `margin-top`, and/or `margin-bottom` are given, a default value for `geometry` is created from these. Note that these variables already affect PDF production via HTML5 with wkhtmltopdf.	2015-12-22 13:10:46 -08:00
John MacFarlane	35e0544977	LaTeX reader: allow blank space between braced arguments of commands. For example \foo {bar} {baz} Closes #2592.	2015-12-22 11:06:06 -08:00
John MacFarlane	46e38d0a0a	Improved treatment of margins in wkhtmltopdf.	2015-12-21 23:47:03 -08:00
John MacFarlane	8b8bdca56a	Allow setting margins from metadata variables for wkhtmltopdf. Variables margin-top, margin-bottom, margin-left, margin-right. Setting them with css inside @page doesn't seem to work, at least with the released wkhtmltopdf.	2015-12-21 22:59:01 -08:00
John MacFarlane	0596b65a74	pdf via wkhtmltopdf: take `title` and `page-size` from metadata. Adjusted default `page-size` to `letter`, to match current LaTeX template.	2015-12-21 22:13:44 -08:00
John MacFarlane	0a768f1cc5	Added preliminary support for PDF creation via wkhtmltopdf. To use this: pandoc -t html5 -o result.pdf (and add `--mathjax` if you have math.)	2015-12-21 17:22:12 -08:00
John MacFarlane	28b2d86b21	LaTeX/Beamer template changes (Thomas Hodgson): * Added `thanks` variable * Use `parskip.sty` when `indent` isn't set (fall back to using `setlength` as before if `parskip.sty` isn't available). * Use `biblio-style` with biblatex. * Added `biblatexoptions` variable. * Added `section-titles` variable (defaults to true) to enable/suppress section title pages in beamer slide shows. * Moved beamer themes after fonts, so that themes can change fonts. (Previously the fonts set were being clobbered by lmodern.sty.)	2015-12-19 18:50:45 -08:00
John MacFarlane	9333814254	Added needed import of FromJSON. Fixes build failure.	2015-12-19 17:54:20 -08:00
John MacFarlane	770641f741	Fix language code for Czech (cs not cz) Closes #2597.	2015-12-19 17:54:02 -08:00
John MacFarlane	4c103f67f9	Merge branch 'master' of https://github.com/AndreasLoow/pandoc into AndreasLoow-master	2015-12-19 00:07:28 -08:00
John MacFarlane	e20f433f38	Markdown reader: fixed parsing bug with macros. Previously macro definitions in indented code blocks were being parsed as macro definitions, not code.	2015-12-19 00:00:04 -08:00
mb21	1ead1f39ad	ICML writer: intersperse line breaks instead of appending them to every ParagraphStyleRange closes #2501	2015-12-17 10:26:59 +01:00
mb21	f3a9bdafef	ICML writer: added figure handling, closes #2590	2015-12-16 11:07:23 +01:00
John MacFarlane	9f43acb5d2	ICML writer: removed redundant import.	2015-12-13 22:18:23 -08:00
John MacFarlane	f3133a8e9e	Merge pull request #2570 from mb21/rst-reader-imgattrs Image attributes	2015-12-13 20:29:13 -08:00
John MacFarlane	a924a3f43d	Fixed ICML image syntax for local files. `file:filename` rather than `file://./filename`. I think this is right; it matches what we had before with people actually using the ICML writer, and seems to match examples in the spec. I don't have a copy of InDesign I can test on, though. @DigitalPublishingToolkit and @mb21, can you have a look?	2015-12-13 20:19:34 -08:00
John MacFarlane	90b8024fac	Use posix path separators in ICML link URIs. Closes #2589.	2015-12-13 17:40:24 -08:00
mb21	df68f25459	ODT/OpenDocument writer: improved image attributes - support for percentage widths/heights - use Attr instead of title to get dimensions from ODT walker to writeOpenDocument	2015-12-13 21:40:13 +01:00
mb21	37931cb0c5	Docx reader: image attributes	2015-12-13 21:40:13 +01:00
mb21	2060f5fe83	new function to extract multiple properties at once in CSS.hs and use it in Textile reader	2015-12-13 21:40:12 +01:00
mb21	30644b291b	RST reader: image attributes	2015-12-13 21:40:12 +01:00
John MacFarlane	e4b3da6929	AsciiDoc writer: support anchors in spans with id elements.	2015-12-13 09:02:37 -08:00
John MacFarlane	3e079a25bc	AsciiDoc writers: Add anchors on Div elements. This partially addresses jgm/pandoc-citeproc#143. It does not use the native asciidoc syntax for citations, but it does get the links to individual citations working.	2015-12-13 08:56:22 -08:00
John MacFarlane	44120ea716	Implemented `east_asian_line_breaks` extension. Text.Pandoc.Options: Added `Ext_east_asian_line_breaks` constructor to `Extension` (API change). This extension is like `ignore_line_breaks`, but smarter -- it only ignores line breaks between two East Asian wide characters. This makes it better suited for writing with a mix of East Asian and non-East Asian scripts. Closes #2586.	2015-12-12 17:28:52 -08:00
John MacFarlane	af7e782436	Modified readers to emit SoftBreak when appropriate.	2015-12-12 09:31:51 -08:00
John MacFarlane	47cc5ad6e0	Restore no wrapping of XML in Docx, ODT. It's possible that wrapping causes problems; safer to turn it off.	2015-12-12 00:28:47 -08:00
John MacFarlane	28a2f4c2a4	Fixed cite key parsing regression. We were capturing final colons as in [@foo: bar]; the citation id was being parsed as "@foo:". Closes jgm/pandoc-citeproc#201.	2015-12-12 00:27:08 -08:00
John MacFarlane	1b0e0998fa	FB2 writer: support SoftBreak. This was omitted earlier.	2015-12-12 00:13:58 -08:00
John MacFarlane	536b6bf538	Implemented SoftBreak and new `--wrap` option. Added threefold wrapping option. * Command line option: deprecated `--no-wrap`, added `--wrap=[auto|none|preserve]` * Added WrapOption, exported from Text.Pandoc.Options * Changed type of writerWrapText in WriterOptions from Bool to WrapOption. * Modified Text.Pandoc.Shared functions for SoftBreak. * Supported SoftBreak in writers. * Updated tests. * Updated README. Closes #1701.	2015-12-11 23:55:08 -08:00
John MacFarlane	63d875c6cb	Markdown reader: parse soft break as SoftBreak.	2015-12-11 15:33:53 -08:00
John MacFarlane	09958d7f95	Fixed Emoji character definitions. There were many bugs in the definitions. Closes #2523.	2015-12-04 09:38:58 -08:00
John MacFarlane	dd8df6cfbc	Markdown reader: Improved pipe table relative widths. Previously pipe table columns got relative widths (based on the header underscore lines) when the source of one of the rows was greater in width than the column width. This gave bad results in some cases where much of the width of the row was due to nonprinting material (e.g. link URLs). Now pandoc only looks at printable width (the width of a plain string version of the source), which should give better results. Thanks to John Muccigrosso for bringing up the issue.	2015-12-03 11:02:45 -08:00
Raniere Silva	13f74d018b	Add support to GAP	2015-12-03 08:23:26 -02:00
mb21	d901a3da03	Textile Reader: image attributes closes #2515	2015-12-03 00:06:18 +01:00
mb21	1f379da94b	Parse CSS that doesn't contain the optional semicolon	2015-12-02 23:56:44 +01:00
John MacFarlane	622f09617e	Docx writer: better handling of PDF images. Previously we tried to get the image size from the image even if an explicit size was specified. Since we still can't get image size for PDFs, this made it impossible to use PDF images in docx. Now we don't try to get the image size when a size is already explicitly specified.	2015-12-01 00:23:03 -08:00
John MacFarlane	6d91fb2563	Markdown writer: use raw HTML for link/image attributes when the `link_attributes` extension is unset and `raw_html` is set. Closes #2554.	2015-11-24 23:28:52 -08:00
John MacFarlane	33d328f1cf	Allow pipe tables with no body rows. Previously this raised a runtime error. Closes #2556.	2015-11-24 20:23:06 -08:00
John MacFarlane	c73ae81628	LaTeX reader: Improved smart quote parsing. This fixes rendering of unmatched quotes. Closes #2555.	2015-11-24 17:20:15 -08:00
John MacFarlane	ce5583460c	Improved fetchItem so that C:/Blah/Blah.jpg isn't treated as URL. The Haskell URI parsing routines will accept "C:" as a scheme, so we rule that out manually. This helps with `--self-contained` and absolute Windows paths. See http://stackoverflow.com/questions/33899126/rchart-in-markdown-doesnt-render-due-to-invalidurlexception-from-pandoc	2015-11-24 11:05:31 -08:00
John MacFarlane	2eb5d2dc42	LaTeX reader: Use curly quotes for unmatched `. Partially addresses #2555. Note that there's still a problem with the code sample given.	2015-11-23 23:44:39 -08:00
John MacFarlane	2633dc2f5e	Beamer writer: mark frame as fragile when it contains verbatim. Closes #1613.	2015-11-23 23:07:56 -08:00
John MacFarlane	b20ecbedc4	AsciiDoc writer: Fixed code blocks. Closes #1861.	2015-11-23 21:29:21 -08:00
John MacFarlane	4361dc0245	Define a `meta-json` variable for all writers. This contains a JSON version of all the metadata, in the format selected for the writer. So, for example, to get just the YAML metadata, you can run pandoc with the following custom template: $meta-json$ Closes #2019. The intent is to make it easier for static site generators and other tools to get at the metadata.	2015-11-23 20:40:27 -08:00
Jesse Rosenthal	07b8a456b1	Docx Reader: Remove DummyListItem type Change `5527465c` introduced a `DummyListItem` type in Docx/Parse.hs. In retrospect, this seems like it mixes parsing and interpretation excessively. What's really going on is that we have a list item without an associated level or numeric info. We can decide what to do with that in Docx.hs (treat it like a list paragraph), but the parser shouldn't make that decision. This commit makes what is going on a bit more explicit. `LevelInfo` is now a Maybe value in the `ListItem` type. If it's a Nothing, we treat it as a ListParagraph. If it's a Just, it's a normal list item.	2015-11-23 11:50:49 -05:00
John MacFarlane	a008e57ddf	hlint fixes	2015-11-22 07:43:48 -08:00
John MacFarlane	f7e37141e5	hlint fixes	2015-11-22 07:42:11 -08:00
John MacFarlane	bbb3d8d442	hlint changes	2015-11-22 07:40:26 -08:00
John MacFarlane	a7f6241f50	hlint fixes.	2015-11-22 07:38:51 -08:00
John MacFarlane	4b293a6a54	hlint fixes.	2015-11-22 07:37:51 -08:00
John MacFarlane	c5b9ae3060	ImageSize: use safeRead instead of readMaybe. readMaybe is only provided in base 4.6+.	2015-11-21 08:46:01 -08:00
John MacFarlane	73e2d7976c	Renamed link attribute extensions. * Old `link_attributes` -> `mmd_link_attributes` * Recently added `common_link_attributes` -> `link_attributes` Note: this change could break some existing workflows.	2015-11-19 23:17:50 -08:00
John MacFarlane	244cd5644b	Merge branch 'new-image-attributes' of https://github.com/mb21/pandoc into mb21-new-image-attributes * Bumped version to 1.16. * Added Attr field to Link and Image. * Added `common_link_attributes` extension. * Updated readers for link attributes. * Updated writers for link attributes. * Updated tests * Updated stack.yaml to build against unreleased versions of pandoc-types and texmath. * Fixed various compiler warnings. Closes #261. TODO: * Relative (percentage) image widths in docx writer. * ODT/OpenDocument writer (untested, same issue about percentage widths). * Update pandoc-citeproc.	2015-11-19 23:14:23 -08:00
John MacFarlane	1ad296dc69	Merge pull request #2532 from michaelbeaumont/fix-2530 Interpret pauses correctly for all headers	2015-11-19 21:06:53 -08:00
John MacFarlane	fdc81be7d2	Merge pull request #2506 from adunning/patch-1 Remove redundant `center` variable for reveal.js.	2015-11-19 21:03:31 -08:00
John MacFarlane	ed1173ace6	Rationalized behavior of --no-tex-ligatures and --smart. This change makes `--no-tex-ligatures` affect the LaTeX reader as well as the LaTeX and ConTeXt writers. If it is used, the LaTeX reader will parse characters `` ` ``, `'`, and `-` literally, rather than parsing ligatures for quotation marks and dashes. And the LaTeX writer will print unicode quotation mark and dash characters literally, rather than converting them to the standard ASCII ligatures. Note that `--smart` has no effect on the LaTeX reader. `--smart` is still the default for all input formats when LaTeX or ConTeXt is the output format, unless `--no-tex-ligatures` is used. Some examples to illustrate the logic: ``` % echo "'hi'" | pandoc -t latex `hi' % echo "'hi'" | pandoc -t latex --no-tex-ligatures 'hi' % echo "'hi'" | pandoc -t latex --no-tex-ligatures --smart ‘hi’ % echo "'hi'" | pandoc -f latex --no-tex-ligatures <p>'hi'</p> % echo "'hi'" | pandoc -f latex <p>’hi’</p> ``` Closes #2541.	2015-11-19 20:30:41 -08:00
Jesse Rosenthal	da4103bc42	Docx reader: Clean up commented-out function A residue of a recent change was left around in the form of a commented-out function. Let's clean that up.	2015-11-18 14:06:13 -05:00
Jesse Rosenthal	5527465c77	Docx reader: Handle dummy list items. These come up when people create a list item and then delete the bullet. It doesn't refer to any real list item, and we used to ignore it. We handle it with a DummyListItem type, which, in Docx.hs, is turned into a normal paragraph with a "ListParagraph" class. If it follows another list item, it is folded as another paragraph into that item. If it doesn't, it's just its own (usually indented, and therefore block-quoted) paragraph.	2015-11-18 13:02:57 -05:00
John MacFarlane	995f28ff07	Haddock writer: omit formatting inside links. It isn't supported by Haddock. Closes #2515.	2015-11-16 20:53:01 -08:00
John MacFarlane	469338a272	Textile reader: skip over attribute in image source. We don't have a place yet for styles or sizes on images, but we can skip the attributes rather than incorrectly taking them to be part of the filename. Closes #2515.	2015-11-16 20:43:07 -08:00
John MacFarlane	f096f032f0	ICML writer: better handling of math. Instead of just printing the raw tex, we now try to fake it with unicode characters.	2015-11-16 20:24:34 -08:00
John MacFarlane	74cf52728e	HTML writer: Include `example` class for example lists. Closes #2524.	2015-11-16 09:57:28 -08:00
John MacFarlane	593cbd8142	Docx writer: insert space between footnote ref and footnote. This matches Word's default behavior. Closes #2527.	2015-11-15 07:53:40 -08:00
John MacFarlane	8f5ff7075c	Derive Generic instances for types in Text.Pandoc.Options.	2015-11-14 17:46:55 -08:00
John MacFarlane	420c86b69a	Allow more customization of opendocument styles. Automatic styles can now be inserted in the template, since the template, not the writer, now provides the enclosing `<office:automatic-styles>` tags. Closes #2520.	2015-11-14 17:19:25 -08:00
michaelbeaumont	8b289326a7	Interpret pauses correctly for all headers Previously, when using headers below the slide level, pauses were left uninterpreted. In my opinion, this was unexpected behavior, though it looks intentional from the code. Fixes #2530	2015-11-15 01:37:39 +01:00
Jesse Rosenthal	e5b374e2ca	Follow relationships correctly in foot/endnotes. There are separate relationship (link) files for foot and endnotes. These had previously been grouped together which led to links not working correctly in notes. This should finally fix that.	2015-11-14 13:41:34 -05:00
John MacFarlane	37285b432c	Text.Pandoc.Emoji: use hex escapes instead of Unicode in source. Some of the unicode characters cause ghc parse errors in older ghc versions.	2015-11-13 14:18:02 -08:00
John MacFarlane	73e6333fae	Merge pull request #2526 from tarleb/org-definition-lists-fix Org reader: Require whitespace around def list markers	2015-11-13 14:07:56 -08:00
Albert Krewinkel	67cb2809fd	Org reader: Require whitespace around def list markers Definition list markers (i.e. double colons `::`) must be surrounded by whitespace to start a definition item. This rule was not checked before, resulting in bugs with footnotes and some link types. Thanks to @conklech for noticing and reporting this issue. This fixes #2518.	2015-11-13 22:04:17 +01:00
John MacFarlane	028a605bf8	Merge pull request #2525 from tarleb/org-smart-fixes Org reader: Fix emphasis rules for smart parsing	2015-11-13 12:27:23 -08:00
John MacFarlane	0a6aaf5e1b	Added `emoji` extension to Markdown. This is enabled by default in `markdown_github`. Added `Ext_emoji` to `Extension` in `Text.Pandoc.Options` (API change). Closes #2523.	2015-11-13 12:14:24 -08:00
Albert Krewinkel	220f3d12b8	Org reader: Fix emphasis rules for smart parsing Smart quotes, ellipses, and dashes should behave like normal quotes, single dashes, and dots with respect to text markup parsing. The parser state was not updated properly in all cases, which has been fixed. Thanks to @conklech for reporting this issue. This fixes #2513.	2015-11-13 20:43:46 +01:00
John MacFarlane	d8080db7f7	Allow `://` in citation keys. Closes jgm/pandoc-citeproc#166.	2015-11-13 11:00:56 -08:00
John MacFarlane	c80c0df1fe	EPUB writer: don't download linked media when `data-external` attribute set. By default pandoc downloads all linked media and includes it in the EPUB container. This can be disabled by setting `data-external` on the tags linking to media that should not be downloaded. Example: <audio controls="1"> <source src="http://www.sixbarsjail.it/tmp/bach_toccata.mp3" data-external="1" type="audio/mpeg"></source> </audio> Closes #2473.	2015-11-12 13:27:41 -08:00
John MacFarlane	83b1aa042d	LaTeX writer: set `colorlinks`... if `linkcolor`, `urlcolor`, `citecolor`, or `toccolor` is set. Closes #2508.	2015-11-12 12:37:20 -08:00
John MacFarlane	64b32e1e81	Fixed shadowing error.	2015-11-09 11:25:05 -08:00
John MacFarlane	c1e474f005	Restored Text.Pandoc.Compat.Monoid. Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci. The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.	2015-11-09 11:19:25 -08:00
John MacFarlane	23b693c029	Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly." This reverts commit `c423dbb5a3`.	2015-11-09 10:08:22 -08:00
Andrew Dunning	6c27e5f2c1	Remove redundant `center` variable for reveal.js. This is no longer needed with the updates to the template in `da139313d2`	2015-11-09 09:38:12 -05:00
John MacFarlane	5a9ca26172	Merge pull request #2502 from minoki/latex-comment-environment LaTeX reader: Handle `comment` environment.	2015-11-08 17:57:52 -08:00
John MacFarlane	237c10992e	Merge pull request #2505 from tarleb/org-header-markup-fix Org reader: fix markup parsing in headers	2015-11-08 17:17:19 -08:00
John MacFarlane	c423dbb5a3	Use -XNoImplicitPrelude and 'import Prelude' explicitly. This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.	2015-11-08 16:56:59 -08:00
Albert Krewinkel	e3b4844bd7	Org reader: fix markup parsing in headers Markup as the very first item in a header wasn't recognized. This was caused by an incorrect parser state: positions at which inline markup can start need to be marked explicitly by changing the parser state. This wasn't done for headers. The proper function to update the state is now called at the beginning of the header parser, fixing this issue. This fixes #2504.	2015-11-08 22:32:00 +01:00
ARATA Mizuki	94500f7469	LaTeX reader: Handle `comment` environment. The `comment` environment is handled in a similar way to the `verbatim` environment, except that its content is discarded.	2015-11-08 02:24:25 +09:00
John MacFarlane	411a25306c	LaTeX writer: properly handle footnotes in captions. Closes #1506.	2015-11-01 15:30:05 -08:00
John MacFarlane	c4ea64203a	LaTeX writer: avoid footnotes in list of figures. Footnotes aren't allowed in the list of figures. This patch causes footnotes to be stripped from captions when entered into the list of figures. Footnotes still don't actually WORK in captions in latex/pdf, but at least an error is no longer raised. See #1506.	2015-11-01 13:42:36 -08:00
John MacFarlane	eb8aee477d	Pipe tables with long lines now get relative cell widths. If a pipe table contains a line longer than the column width (as set by `--columns` or 80 by default), relative widths are computed based on the widths of the separator lines relative to the column width. This should solve persistent problems with long pipe tables in LaTeX/PDF output, and give more flexibility for determining relative column widths in other formats, too. For narrower pipe tables, column widths of 0 are used, telling pandoc not to specify widths explicitly in output formats that permit this. Closes #2471.	2015-10-30 12:37:08 -07:00
John MacFarlane	7843b5759a	HTML writer: use width on whole table if col widths sum to < 100%. Otherwise some browsers display the table with the columns separated far apart.	2015-10-30 12:36:36 -07:00
John MacFarlane	532ae22c29	Textile reader: don't do smart punctuation unless explicitly asked. Closes #2480. Note that although smart punctuation is part of the textile spec, it's not always wanted when converting from textile to, say, Markdown. So it seems better to make this an option.	2015-10-30 10:54:07 -07:00
John MacFarlane	fb1843ecde	Fixed omitted `url(...)` in CSS data-uri with `--self-contained`. Fixes #2489.	2015-10-28 10:06:40 -07:00
John MacFarlane	1d53d452c3	LaTeX writer: add `\protect` to `\hyperlink`. Thanks to Hadrien Mary for the problem and solution. Closes #2490.	2015-10-28 09:37:05 -07:00
John MacFarlane	d5efa9b35c	LaTeX writer: Use `\hypertarget` and `\hyperlink` for links. This works correctly to link to Div or Span elements. We now don't bother defining `\label` for Div or Span elements. Closes jgm/pandoc-citeproc#174.	2015-10-27 14:08:35 -07:00
John MacFarlane	3ea444666a	Markdown reader: improved parser for `mmd_title_block`. We now allow blank metadata fields. These were explicitly disallowed before. For background see #2026. The issue in #2026 has since been fixed in another way, so there is no need to forbid blank metadata fields.	2015-10-26 22:06:34 -07:00
nickbart1980	143093eabd	Added de-CH-1901, fixed el-polyton el-polyton, not el-poly, see http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry	2015-10-26 10:34:32 +00:00
John MacFarlane	89d399b6b1	Merge pull request #2481 from mb21/textarabic LaTeX writer: \textarabic fix	2015-10-25 12:42:41 -07:00
John MacFarlane	45df87cf15	Merge pull request #2477 from tarleb/org-toggling-header-args Org reader: allow toggling header args	2015-10-25 12:10:07 -07:00
mb21	f3f6483510	LaTeX writer: \textarabic fix	2015-10-25 18:31:35 +01:00
Albert Krewinkel	27a8603278	Org reader: allow toggling header args Org-mode allows to skip the argument of a code block header argument if it's toggling a value. Argument-less headers are now recognized, avoiding weird parsing errors. The fixes are not exactly pretty, but neither is the code that was fixed. So I guess it's about par for the course. However, a rewrite of the header parsing code wouldn't hurt in the long run. Thanks to @jo-tham for filing the bug report. This fixes #2269.	2015-10-25 08:54:00 +01:00
Albert Krewinkel	b27366780f	Org reader: fix paragraph/list interaction Paragraphs can be followed by lists, even if there is no blank line between the two blocks. However, this should only be true if the paragraph is not within a list, where the preceding block should be parsed as a plain instead of paragraph (to allow for compact lists). Thanks to @rgaiacs for bringing this up. This fixes #2464.	2015-10-24 19:05:56 +02:00
John MacFarlane	a7150bb6b6	Fixed over-eager raw HTML inline parsing. Tightened up the inline HTML parser so it disallows TagWarnings. This only affects the markdown reader when the `markdown_in_html_blocks` option is disabled. Closes #2469.	2015-10-22 21:18:06 -07:00
John MacFarlane	a21833b638	Avoid compiler warning for unused identifier.	2015-10-22 21:05:52 -07:00
John MacFarlane	317d9eea17	Changed § to % in operators from Odt.Arrows.Utils. This prevents problems building haddocks with "C" locale. Closes #2457.	2015-10-22 17:54:58 -07:00
John MacFarlane	48b68aac43	Textile writer: support start number in ordered lists. e.g. `#3`. Partially addresses #2465. TBD: reader support.	2015-10-22 12:37:40 -07:00
John MacFarlane	8193ebcd99	Allow use of ConTeXt to generate PDFs. pandoc my.md -t context -o my.pdf will now create a PDF using ConTeXt rather than LaTeX. Closes #2463.	2015-10-20 08:16:17 -07:00
mb21	9328f4cd3d	LaTeX and ConTeXt writers: support lang attribute on divs and spans For LaTeX, also collect lang and dir attributes on spans and divs to set the lang, otherlangs and dir variables if they aren’t set already. See #895.	2015-10-18 17:01:37 +02:00
John MacFarlane	7f4b78c064	Text.Pandoc.Data: store paths in dataFiles using posix separators. This way we have uniform separators, whether on Windows or Linux. This should solve a problem where on some Windows versions the data files weren't being found. Closes #2459.	2015-10-17 22:04:02 -07:00
John MacFarlane	34d53aff6e	Remove compiler warning with embed_data_files.	2015-10-17 21:21:52 -07:00
Andreas Lööw	f0c47907ca	Consider header files when determining whether to use csquotes.	2015-10-17 23:04:15 +02:00
John MacFarlane	2357e61748	LaTeX reader: fixed longtable support.	2015-10-15 23:15:40 -07:00
John MacFarlane	504bf3f8e7	Support all frame attributes in Beamer.	2015-10-15 15:11:07 -07:00
John MacFarlane	047cb32dfc	Use unicode super/subscripts for digits in plain output.	2015-10-15 14:35:01 -07:00
John MacFarlane	6dc3b6585d	More changes to avoid compiler warnings on ghc 7.10. * CPP around deprecated `parseTime`. * Text.Pandoc.Compat.Locale -> Text.Pandoc.Compat.Time, now exports Data.Time.	2015-10-14 10:06:18 -07:00
John MacFarlane	82b3e0ab97	Use custom Prelude to avoid compiler warnings. - The (non-exported) prelude is in prelude/Prelude.hs. - It exports Monoid and Applicative, like base 4.8 prelude, but works with older base versions. - It exports (<>) for mappend. - It hides 'catch' on older base versions. This allows us to remove many imports of Data.Monoid and Control.Applicative, and remove Text.Pandoc.Compat.Monoid. It should allow us to use -Wall again for ghc 7.10.	2015-10-14 09:09:10 -07:00
John MacFarlane	198862ee40	LaTeX writer: add `\protect` to `\hyperdef` in inline context. This way we don't get an error when this is used as a moveable argument. Closes #2136.	2015-10-13 21:48:14 -07:00
John MacFarlane	25e0e0bd2a	epub with `--webtex`: include image file rather than data: URI. Closes #2363.	2015-10-13 21:19:43 -07:00
John MacFarlane	24f68654e9	RST writer: do header normalization only in "standalone" mode. If we're producing a fragment, just skip normalization. After all, the fragment might be somewhere in the middle of the document. It's more important for fragments to have consistency in rendering (so they can be pieced together) than to normalize. This closes #2394. It's simpler and more robust than my earlier fix.	2015-10-12 23:00:27 -07:00
John MacFarlane	fb51077712	Revert "RST writer: tweaks to header normalization." This reverts commit `476b383c57`.	2015-10-12 22:44:37 -07:00
John MacFarlane	476b383c57	RST writer: tweaks to header normalization. These changes are intended to make the writer more useful to people who are processing small fragments, which may for example look like this: ### third level header from previous section ## second level header Previously such fragments got turned into two headers of the same level. The new algorithm avoids doing any normalization until we hit the minimal-level header in the fragment (here, the second level header). Closes #2394.	2015-10-12 22:04:40 -07:00
John MacFarlane	0b91c73456	Removed unnecessary import.	2015-10-11 17:27:00 -07:00
John MacFarlane	1e8a25ad69	Percent-encode more special characters in URLs. HTML, LaTeX writers adjusted. The special characters are '<','>','\|','"','{','}','[',']','^', '`'. Closes #1640, #2377.	2015-10-11 17:12:50 -07:00
John MacFarlane	04307a1554	Define Typeable and Exception instances for PandocError. Closes #2386.	2015-10-11 15:50:41 -07:00
John MacFarlane	0e78eba791	HTML reader/writer: better handling of "section" elements. Previously `<section>` tags were just parsed as raw HTML blocks. With this change, section elements are parsed as Div elements with the class "section". The HTML writer will use `<section>` tags to render these Divs in HTML5; otherwise they will be rendered as `<div class="section">`. Closes #2438.	2015-10-11 15:25:49 -07:00
John MacFarlane	60dcaa37d5	Native writer: format Div properly, with blocks separated.	2015-10-11 15:14:35 -07:00
John MacFarlane	72b038d201	Merge pull request #2412 from frerich/reader/docbook/xref_support Added support for <xref> tag in DocBook reader	2015-10-10 14:18:28 -07:00
John MacFarlane	3e4713c2de	Merge pull request #2441 from mb21/polyglossia-lang Change variable to polyglossia-lang.name and .options	2015-10-10 13:52:36 -07:00
John MacFarlane	5e57beac8d	Re-export pandocVersions from Text.Pandoc. The actual definition has been moved to Text.Pandoc.Shared, but to avoid breaking changes we reexport it here.	2015-10-10 13:42:02 -07:00
John MacFarlane	4aabcf3d4e	Merge pull request #2426 from alexvong1995/better-man-writer Better man writer (revised)	2015-10-10 13:11:21 -07:00
John MacFarlane	114103d67f	LaTeX reader: don't eat excess whitespace after macros. Really close #2446.	2015-10-09 14:39:42 -07:00
John MacFarlane	1af8bc6f4d	LaTeX reader: don't eat whitespace after macro with only opt arg. Closes #2446.	2015-10-09 10:32:31 -07:00
mb21	80b851a4cf	Change variable to polyglossia-lang.name and .options closes #2437	2015-10-07 22:53:09 +02:00
Ophir Lifshitz	0b899ce7ef	Docx Reader: Parse soft, no-break hyphen elements	2015-10-04 06:11:07 -04:00
John MacFarlane	421845202d	Fixed typo: Ext_superscript, Ext_subscript.	2015-10-03 16:03:40 -07:00
John MacFarlane	68c02e1d01	For markdown_mmd, add: implicit_figures, superscripts, subscripts. See #2401.	2015-10-03 15:32:01 -07:00
Alex Vong	319832cc19	Set the template variable $pandoc-version$ to pandocVersion by default. * src/Text/Pandoc/Writers/Man.hs: Set $pandoc-version$ to be pandocVersion.	2015-10-01 02:24:34 +08:00
Alex Vong	d7a19c22be	Move the variable pandocVersion from `src/Text/Pandoc.hs` to `src/Text/Pandoc/Shared.hs`, so that all Writers can access this variable without importing `src/Text/Pandoc.hs`, preventing circular import. * pandoc.hs: Import pandocVersion from `Text.Pandoc.Shared`. * src/Text/Pandoc.hs: Remove the definition of pandocVersion and relevant import. * src/Text/Pandoc/Shared.hs: Add the definition of pandocVersion and relevant import.	2015-10-01 02:24:34 +08:00
Alex Vong	f5e33e0dce	Set the template variable $hyphenate$ to true by default * src/Text/Pandoc/Writers/Man.hs: Set $hyphenate$ to be true.	2015-10-01 02:24:34 +08:00
John MacFarlane	af8fb5e792	Removed unneeded imports.	2015-09-26 22:56:13 -07:00
John MacFarlane	6532950b26	MediaBag: ensure that / is always used as path separator.	2015-09-26 22:40:58 -07:00
John MacFarlane	fdfc961284	Merge pull request #2419 from mb21/bidi Support bidirectional text output with XeLaTeX, ConTeXt and HTML	2015-09-26 17:06:56 -07:00
mb21	7b0c1e0d37	Support bidirectional text output with XeLaTeX, ConTeXt and HTML closes #2191	2015-09-26 22:22:24 +02:00
John MacFarlane	29668552c8	Removed unneeded import.	2015-09-26 10:27:55 -07:00
John MacFarlane	da1b599c96	Correctly recognize book documentclass in metadata. Closes #2395.	2015-09-25 23:28:38 -07:00
John MacFarlane	dcb0b02aa3	Markdown reader: handle 'id' and 'class' in parsing key/value attrs. # Header {id="myid" class="foo bar"} is now equivalent to # Header {#myid .foo .bar} Closes #2396.	2015-09-25 23:01:34 -07:00
Frerich Raabe	eee992520c	Improve text generated for <xref> by employing docbook-xsl heuristics docbook-xsl, a set of XSLT scripts to generate HTML out of DocBook, tries harder to generate a nice xref text. Depending on the element being linked to, it looks at the title or other descriptive child elements. Let's do that, too.	2015-09-24 18:28:51 +02:00
Frerich Raabe	35f12b5095	Added proper support for DocBook 'xref' elements 'xref' is used to create cross references to other parts of the document. It is an empty element - the cross reference text depends on various attributes. Quoting 'DocBook: The Definitive Guide': 1. If the endterm attribute is specified on xref, the content of the element pointed to by endterm will be used as the text of the cross-reference. 2. Otherwise, if the object pointed to has a specified XRefLabel, the content of that attribute will be used as the cross-reference text.	2015-09-24 18:26:55 +02:00
Frerich Raabe	f6538144f0	Pass the parsed DocBook content along the state of readDocBook Having access to the entire document will be needed when handling elements which refer to other elements. This is needed for e.g. <xref> or <link>, both of which reference other elements (by the 'id' attribute) for the label text. I suppose that in practice, the [Content] returned by parseXML always only contains one 'Elem' value -- the document element. However, I'm not totally sure about it, so let's just pass all the Content along.	2015-09-23 19:31:25 +02:00
Frerich Raabe	3564cd82ca	Minor refactoring to readDocBook I plan to use the parsed and normalized XML tree read in readDocBook in other places - prepare that commit by factoring this code out into a separate, shared, definition.	2015-09-23 19:25:58 +02:00
John MacFarlane	72e71a1dad	LaTeX reader: support longtable. Closes #2411.	2015-09-23 08:34:37 -07:00
John MacFarlane	f232a0a720	Merge pull request #2369 from mb21/language-variables `lang` variable is now in BCP47 format	2015-09-22 22:21:06 -07:00
John MacFarlane	9b033672e4	Merge pull request #2406 from tarleb/org-verse-fix Make sure verse blocks can contain empty lines	2015-09-20 13:02:47 -07:00
Albert Krewinkel	8007dd97b5	Make sure verse blocks can contain empty lines The previous verse parsing code made the faulty assumption that empty strings are valid (and empty) inlines. This isn't the case, so lines are changed to contain at least a newline. It would generally be nicer and faster to keep the newlines while splitting the string. However, this would require more code, which seems unjustified for a simple (and fairly rare) block as verse. This fixes #2402.	2015-09-19 22:02:43 +02:00
Nikolay Yakimov	5788f62ef5	[RST Writer] Don't normalize heading levels below input minimum	2015-09-19 17:45:54 +03:00
John MacFarlane	4d49f76dbb	Markdown writer: in TOC, add links to headers. Closes #829.	2015-09-17 11:41:05 -07:00
John MacFarlane	bee255cbfe	Use user data directory for reference docx archive. This allows the test suite to work without installing pandoc first. It also brings the docx writer in line with the odt writer.	2015-09-09 10:16:45 -07:00
mb21	622df7034c	`lang` variable is now in BCP47 format strings are converted for LaTeX and ConTeXt output, closes #1614	2015-08-20 23:17:47 +02:00
John MacFarlane	761e1edc30	Merge pull request #2364 from gbataille/bugDoc [BUG] Haddock : * and ^ to be escaped in docs	2015-08-17 12:15:32 -07:00
Grégory Bataille	0dff30271c	[BUG] Haddock : * and ^ to be escaped in docs	2015-08-17 09:03:33 +02:00
John MacFarlane	1f00a5395f	RST reader: better handling of indirect roles. Previously the parser failed on this kind of case .. role:: indirect(code) .. role:: py(indirect) :language: python :py:`hi` Now it correctly recognizes `:py:` as a code role. The previous test for this didn't work, because the name of the indirect role was the same as the language defined in its parent, so it didn't really test for this behavior. Updated test.	2015-08-15 10:22:47 -07:00
John MacFarlane	8c579a5daa	Merge pull request #2360 from jg/issue-2354 Org reader: add auto identifiers if not present on headers	2015-08-15 09:47:56 -07:00
Juliusz Gonera	f1c87ed164	Org reader: add auto identifiers if not present on headers Refs #2354 This should also fix the table of contents (--toc) when generating a html file from org input	2015-08-15 07:57:48 +02:00
John MacFarlane	0302330a27	RST writer: ensure that `\` is inserted when needed... ...before Cite and Span elements that begin with a "complex" element. Closes jgm/pandoc-citeproc#157.	2015-08-13 23:20:22 -07:00
John MacFarlane	c82f3ad61e	RST writer: Don't insert `\` when complex expression in matched pairs. E.g. `` [:sup:`3`] `` is okay; you don't need `` [:sup:`3`\ ] ``.	2015-08-12 21:08:13 -07:00
John MacFarlane	9894012776	EPUB TOC: replace literal "<br/>" with space. Closes #2105.	2015-08-10 16:58:47 -07:00
John MacFarlane	7b8c005d07	EPUB reader: stop mangling external URLs. Closes #2284. Note the changes to the test suite. In each case, a mangled external link has been fixed, so these are all positive.	2015-08-10 16:35:43 -07:00
John MacFarlane	0ad576eb1a	Docx writer: Moved invalid character stripping to `formattedString`. This avoids an inefficient generic traversal. Updates `f3aa03e`. Closes #2356.	2015-08-10 10:49:18 -07:00
John MacFarlane	06d69fe215	Text.Pandoc: disable auto_identifiers for epub. The epub writer inserts its own auto identifiers; this is more complex due to splitting into "chapter" files.	2015-08-08 21:05:43 -07:00
John MacFarlane	467e3be700	MediaWiki reader: handle unquoted table attributes. Closes #2355.	2015-08-08 20:55:00 -07:00
John MacFarlane	2eec8cf61b	HTML reader: add auto identifiers if not present on headers. This makes TOC linking work properly. The same thing needs to be done to the org reader to fix #2354; in addition, `Ext_auto_identifiers` should be added to the list of default extensions for org in Text.Pandoc.	2015-08-08 11:20:15 -07:00
John MacFarlane	eaccef1491	DocBook reader: handle informalexample. It is parsed into a Div with class `informalexample`. Closes #2319.	2015-08-08 10:43:25 -07:00
John MacFarlane	bf7d858a9e	LaTeX reader: Implement \Cite. See #2335.	2015-08-08 10:32:03 -07:00
John MacFarlane	74c31abb1a	Merge pull request #2327 from hftf/list-style HTML Reader: Correctly parse inline list-style(-type) for <ol>	2015-08-07 11:08:53 -07:00
mb21	a010b83a75	Updated readers, writers and README for link attribute	2015-08-07 12:38:37 +02:00
John MacFarlane	92d48fa65b	Updated readers and writers for new image attribute parameter. (mb21)	2015-08-07 12:37:12 +02:00
John MacFarlane	9deb335ca5	ICML writer: changed type of `writeICML`. API change: It is now `WriterOptions -> Pandoc -> IO String`. Also handle new image attributes. (mb21)	2015-08-05 16:08:46 +02:00
John MacFarlane	4391c5f34c	ICML writer: Add Cite style to citations. (mb21)	2015-08-05 16:08:46 +02:00
John MacFarlane	12df4054ad	PDF: Modified for new image size attributes parameter. (mb21)	2015-08-05 16:08:46 +02:00
John MacFarlane	76f0708ef5	Parsing: Add `extractIdClass`, modified type of `KeyTable`. (mb21)	2015-08-05 16:08:46 +02:00
John MacFarlane	878ab00233	ImageSize: Added functions for converting between image dimensions. (mb21)	2015-08-05 16:08:46 +02:00
Sergei Trofimovich	ab7c5f2221	fix build failure with --flags=-https The issue was originally reported by CasperVector as https://github.com/gentoo-haskell/gentoo-haskell/issues/427 Manifests itself as a build failure full of missing zip-archive names: src/Text/Pandoc/Shared.hs:756:49: Not in scope: type constructor or class ‘Archive’ src/Text/Pandoc/Shared.hs:777:38: Not in scope: ‘toEntry’ src/Text/Pandoc/Shared.hs:786:19: Not in scope: ‘toArchive’ Perhaps you meant ‘mbArchive’ (line 778) Included Codec.Archive.Zip unconditionally. Signed-off-by: Sergei Trofimovich <siarheit@google.com>	2015-07-30 22:39:25 +01:00
Ophir Lifshitz	18b1b21a6a	HTML Reader: Detect font-variant with pickStyleAttrProps	2015-07-27 20:08:04 -04:00
John MacFarlane	5df099957e	Text.Pandoc.Options: modifications for image attributes. * Added `Ext_common_link_attributes` constructor to `Extension` (for link and image attributes). * Added this to `pandocExtensions` and `phpMarkdownExtraExtensions`. * Added `writerDpi` to `WriterOptions`. * pandoc.hs: Added `--dpi` option. * Updated README for `--dpi` and `common_link_attributes` extension. Patch due to mb21, with some modifications: `writerDpi` is now an `Int` rather than a `Double`.	2015-07-27 21:52:43 +02:00
John MacFarlane	0baaa1080a	Pipe tables: allow indented columns. Previously the left-hand column could not start with 4 or more spaces indent. This was inconvenient for right-aligned left columns. Note that the first (header column) must still have 3 or fewer spaces indentation, or the table will be treated as an indented code block.	2015-07-27 10:24:06 -07:00
John MacFarlane	defcb5b6b1	Merge pull request #1689 from kuribas/master Use '=' instead of '#' for atx-style headers in markdown+lhs.	2015-07-25 10:21:02 -07:00
John MacFarlane	2e8064346d	Pretty: comment fix (mb21).	2015-07-25 15:51:55 +02:00
Ophir Lifshitz	7ef8700734	HTML Reader: Parse <ol> type, class, and inline list-style(-type) CSS	2015-07-24 02:53:17 -04:00
MarLinn	f068093555	Added odt reader Fully implemented features: * Paragraphs * Headers * Basic styling * Unordered lists * Ordered lists * External Links * Internal Links * Footnotes, Endnotes * Blockquotes Partly implemented features: * Citations Very basic, but pandoc can't do much more * Tables No headers, no sizing, limited styling	2015-07-23 15:37:01 -07:00
John MacFarlane	8390d935d8	Updated tests and removed a skipSpaces.... we no longer need it with the change to toKey, and it is expensive to skip spaces after every inline.	2015-07-23 15:35:18 -07:00
John MacFarlane	35e6c893ec	Parsing: toKey: strip off outer brackets. This makes keys with extra space at the beginning and end work: e.g. [foo]: bar [ foo ] will now be a link to bar (it wasn't before).	2015-07-23 15:34:27 -07:00
John MacFarlane	5db4787330	Merge pull request #2323 from hftf/implicit-header-refs Fix implicit header refs for headers with extra spaces	2015-07-23 14:46:38 -07:00
John MacFarlane	66a72b8eec	LaTeX reader: support abstract environment. The abstract populates an "abstract" metadata field.	2015-07-23 09:31:46 -07:00
Ophir Lifshitz	42c139d302	Markdown Reader: Skip spaces in headers	2015-07-23 02:29:37 -04:00
John MacFarlane	fa2c008ae5	Fix regression: allow HTML comments containing `--`. Technically this isn't allowed in an HTML comment, but we've always allowed it, and so do most other implementations. It is handy if e.g. you want to put command line arguments in HTML comments.	2015-07-21 22:44:18 -07:00
John MacFarlane	ec5960ab11	Use newManager instead of withManager in recent http-client. This avoids a deprecation warning.	2015-07-21 16:32:44 -07:00
John MacFarlane	450bef90e0	DZSlides: Add `role="note"` for speaker notes. Closes #1693.	2015-07-21 14:54:43 -07:00
John MacFarlane	da0842b5b5	HTML reader: handle type attribute on ol. E.g. `<ol type="i">`. Closes #2313.	2015-07-21 13:07:52 -07:00
John MacFarlane	f6ad9e263f	LaTeX reader: properly handle booktabs lines. Lines aren't part of the pandoc table model, but we can just ignore them. Closes #2307.	2015-07-21 10:26:29 -07:00
John MacFarlane	6166cd7559	Removed unneeded import.	2015-07-16 17:06:57 -07:00
John MacFarlane	075ad9a406	LaTeX writer: Fixed detection of 'chapters' from template. If a documentclass isn't specified in metadata, but the template has a hardwired bookish documentclass, act as if `--chapters` was used. This was the default in earlier versions, but it has been broken for a little while.	2015-07-16 15:52:38 -07:00
John MacFarlane	c2ab44af84	`--self-contained`: Fixed overaggressive CSS minimization. Previously `--self-contained` wiped out all spaces in CSS, including semantically significant spaces! Closes #2301. Closes #2286.	2015-07-15 08:16:42 -07:00
John MacFarlane	6c32afc3c4	Updated to use cmark >= 0.4.	2015-07-14 22:51:23 -07:00
John MacFarlane	9e0fb844a9	Markdown reader: don't allow bare URI links or autolinks in link label. Added test cases. Closes #2300.	2015-07-14 13:16:40 -07:00
John MacFarlane	9cdfd4f649	Improved bare autolink detection. Previously we disallowed `-` at the end of an autolink, and disallowed the combination `=-`. This commit liberalizes the rules for allowing punctuation in a bare URI. Added test cases. One potential drawback is that you can no longer put a bare URI in em dashes like this this uri---http://example.com---is an example. But in this respect we now match github's treatment of bare URIs. Closes #2299.	2015-07-14 10:24:39 -07:00
John MacFarlane	b8634b9f75	HTML writer: support speaker notes in dzslides. With this change `<div class="notes">` and also `<div class="notes" role="note">` will be output if `-t dzslides` is used. So we can have speaker notes in dzslides too. Thanks to maybegeek.	2015-07-13 22:50:17 -07:00
Tiziano Müller	f464e49142	DokuWiki: write $..$ instead of <math>..</math> MathJax seems currently to be the only maintained math rendering extension for DokuWiki and it uses $..$ instead of <math>..</math>.	2015-07-13 14:19:48 +02:00
John MacFarlane	2df3dfe883	Changed hierarchicalize so it treats references div as top-level header. Fixes a bug with `--section-divs`, where the final references section added by pandoc-citeproc, enclosed in its own div, got put in the div for the section previous to it. This fixes #2294. Longer term, we might think about how hierarchicalize should interact with Div elements.	2015-07-12 13:58:28 -07:00
John MacFarlane	99fe8594d9	Avoid parsing partial URLs as HTML tags. Closes #2277.	2015-07-10 10:33:27 -07:00
John MacFarlane	b587acb224	Merge pull request #2266 from PromyLOPh/fieldinline RST: Support inline markup for field list names	2015-07-08 22:45:06 -07:00
John MacFarlane	ac79429a12	PDF: Make sure `--latex-engine-opt` goes before the filename... on the command line. LaTeX needs the argument to come after the options. Closes #1779 - again! Thanks to squisher for pointing out the problem.	2015-07-08 17:37:54 -07:00
Andrew Dunning	4850aaf046	Correct superscript/subscript.	2015-07-08 13:57:04 -04:00
John MacFarlane	9e528f4c0c	Fixed email javascript obfuscation with mailto: URLs. This fixes a potential security issue. Because single quotes weren't being escaped in the link portion, a specially crafted email address could allow javascript code injection. [Jim'+alert('hi')+'OBrien](mailto:me@example.com) Closes #2280.	2015-07-07 11:15:40 -07:00
Lars-Dominik Braun	b2adf44e75	Readers.RST: Factor out inline markup string parsing	2015-07-03 16:42:51 +02:00
Lars-Dominik Braun	68b6b9f652	Readers.RST: Parse field list name “Inline markup is parsed in field names.” [1] [1] http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#field-lists	2015-07-03 16:41:28 +02:00
John MacFarlane	e0a88df686	ConTeXt: use `\goto` for internal links.	2015-07-01 12:11:45 -07:00
John MacFarlane	b5d3b4f608	Merge pull request #2255 from mchladek/odt_linebreak Fix #2254 : OpenDocument writer adds space with hard line break	2015-07-01 11:49:58 -07:00
John MacFarlane	8e747004e6	ConTeXt writer: Added a % at end for `\reference` to avoid spurious space.	2015-07-01 11:30:28 -07:00
John MacFarlane	a04c15a422	New method for building man pages. + Removed `--man1`, `--man5` options (breaking change). + Removed `Text.Pandoc.ManPages` module (breaking API change). + Version bump to 1.15 because of the breaking changes, even though they involve features that have only been in pandoc for a day. + Makefile target for `man/man1/pandoc.1`. This uses pandoc to create the man page from README using a custom template and filters. + Added `man/` directory with template and filters needed to build man page. + We no longer have two man pages: pandoc.1 and pandoc_markdown.5. Now there is just pandoc.1, which has all the content from README. This change was needed because of the extensive cross-references between parts of the README. + Removed old `data/pandoc.1.template` and `data/pandoc_markdown.5.template`.	2015-07-01 11:27:15 -07:00

3913 commits