pandoc

Author	SHA1	Message	Date
John MacFarlane	b3b40546cb	HTML reader: Fix performance issue with malformed HTML tables. We let a `</table>` tag close an open `<tr>` or `<td>`. Closes #1167.	2014-06-20 10:47:29 -07:00
John MacFarlane	cab4b829b3	Support --trace in HTML reader.	2014-06-20 10:39:24 -07:00
John MacFarlane	12efffa85a	LaTeX writer: Fixed strikeout + highlighted code. Closes #1294. Previously strikeout highlighted code caused an error.	2014-06-20 10:24:30 -07:00
Jesse Rosenthal	f6ae644831	Make strNormalize go bottomUp. This was how it used to be before it was folded into blockNormalize.	2014-06-20 12:31:36 -04:00
Jesse Rosenthal	2aa5f58c5b	Docx reader: Add a comment explaining strNormalize. `normalize` from Text.Pandoc.Shared is more general. In tests, though, it more than doubles the run time. `strNormalize` does less, but it does what we need. This comment is added for future maintainability.	2014-06-20 10:27:18 -04:00
Jesse Rosenthal	03af19a7e1	Docx reader: Normalize DefinitionLists. Previously DefinitionList had been left out of `blockNormalize`. Now it is included.	2014-06-20 10:20:37 -04:00
Jesse Rosenthal	3da515bdb0	Docx reader: simplify blockNormalize. Use a function `stripSpaces` instead of recursion. This makes it a bit easier to read and maintain, and simplifies normalizing DefinitionList, which was left out the first time.	2014-06-20 10:12:28 -04:00
Jesse Rosenthal	7fd48b30e0	Docx reader: Fix hdr handling in block norm. `blockNormalize` previously forgot to account for the case in which a Header's inlines did not start with a space.	2014-06-20 09:30:30 -04:00
John MacFarlane	557b302731	Docx writer: Use Compact style for empty table cells. Otherwise we get overly tall lines when there are empty table cells and the other cells are compact. Closes #1353.	2014-06-19 23:31:17 -07:00
John MacFarlane	3c059dbe60	HTML reader: Allow space between `<col>` and `</col>`. Test case: ``` <table border="1"> <colgroup> <col> </col> <col></col> </colgroup> <tbody> <tr> <td>X</td> <td>Y</td> </tr> <tr> <td>1</td> <td>2</td> </tr> </tbody> </table> ```	2014-06-19 23:24:28 -07:00
John MacFarlane	0d8e0e5674	Merge pull request #1354 from jkr/literalTab Parse literal tabs in docx	2014-06-19 22:47:32 -07:00
Jesse Rosenthal	a934db9a32	Introduce blockNormalize. This will help take care of spaces introduced at the beginning of strings.	2014-06-19 19:28:55 -04:00
Jesse Rosenthal	0e7d2dbd43	Have Docx reader properly interpret tabs.	2014-06-19 17:55:02 -04:00
Jesse Rosenthal	86fc44d6b3	Add literal tabs to parser.	2014-06-19 17:53:52 -04:00
John MacFarlane	5cb53a48d5	ImageSize: ignore unknown exif header tag rather than crashing. Some images seem to have tag type of 256, which was causing a runtime error.	2014-06-19 14:30:03 -07:00
John MacFarlane	00281559bf	Haddock writer: Use _____ for hrule. Avoids interpretation as list.	2014-06-19 00:28:23 -07:00
John MacFarlane	de7b3a3d08	Haddock writer: Only use Decimal list style.	2014-06-18 18:11:01 -07:00
John MacFarlane	c4182b39ca	Small fix to haddock "tables".	2014-06-18 18:08:41 -07:00
John MacFarlane	ff6a2baeb9	More polish on Haddock reader/writer.	2014-06-18 17:49:59 -07:00
John MacFarlane	35e57db5c2	Finished first draft of Haddock writer.	2014-06-18 17:09:36 -07:00
John MacFarlane	9fc5c8d7af	Rewrote haddock reader to use haddock-library. This brings pandoc's rendering of haddock markup in line with the new haddock. Note that we preserve line breaks in `@` code blocks, unlike the earlier version. Modified tests pass. More tests would be good.	2014-06-18 14:18:55 -07:00
John MacFarlane	ab390a10ec	Removed old haddock reader code. Add dependency on haddock-library. This also removes the dependency on alex and happy.	2014-06-18 11:33:09 -07:00
John MacFarlane	b371e83d73	Highlighting: Let .numberLines work even if no language given. Closes #1287, jgm/highlighting-kate#40.	2014-06-17 15:15:56 -07:00
John MacFarlane	59272e4d99	DocBook reader: Support <?asciidoc-br?>. Closes #1236. Note, this is a bit of a kludge, to work around the fact that xml-light doesn't parse `<?asciidoc-br?>` correctly. We preprocess the input, replacing that instruction with `<br/>`, and then parse that as a line break. Other XML instructions are simply removed from the input stream.	2014-06-17 12:14:02 -07:00
John MacFarlane	fc291efad3	LaTeX reader: Correctly handle table rows with too few cells. LaTeX seems to treat them as if they have empty cells at the end. Closes #241.	2014-06-17 00:38:55 -07:00
John MacFarlane	7d60c798bf	Fixed compiler warning.	2014-06-16 23:02:20 -07:00
John MacFarlane	bbe99003f8	Naming: Use Docx instead of DocX. For consistency with the existing writer.	2014-06-16 22:44:40 -07:00
John MacFarlane	bec9f3c641	Merge branch 'docx' of https://github.com/jkr/pandoc into jkr-docx	2014-06-16 22:16:45 -07:00
John MacFarlane	78ee2416d1	Org reader: make tildes create inline code. Closes #1345. Also relabeled 'code' and 'verbatim' parsers to accord with the org-mode manual. I'm not sure what the distinction between code and verbatim is supposed to be, but I'm pretty sure both should be represented as Code inlines in pandoc. The previous behavior resulted in the text not appearing in any output format.	2014-06-16 22:03:26 -07:00
John MacFarlane	f9b97e6bfb	Small improvement to fix to #1333. This allows blank lines at end of multiline headers.	2014-06-16 21:26:50 -07:00
John MacFarlane	9da5d8955e	Markdown reader: fixed #1333 (table parsing bug).	2014-06-16 21:18:24 -07:00
John MacFarlane	87c08be58f	LaTeX reader: handle leading/trailing spaces in emph better. `\emph{ hi }` gets parsed as `[Space, Emph [Str "hi"], Space]` so that we don't get things like `* hi *` in markdown output. Also applies to textbf and some other constructions. Closes #1146. (`--normalize` isn't touched by this, but normalization should not generally be necessary with the changes to the readers.)	2014-06-16 19:18:33 -07:00
John MacFarlane	459805de4c	LaTeX reader: don't assume preamble doesn't contain environments. Closes #1338.	2014-06-16 17:43:56 -07:00
John MacFarlane	31fd843133	HTML reader: Fixed major parsing problem with HTML tables. Table cells were being combined into one cell. Closes #1341.	2014-06-16 15:45:20 -07:00
John MacFarlane	2b364b34bb	Merge pull request #1344 from mpickering/master Moved extractSpaces to Shared.hs	2014-06-16 14:43:43 -07:00
John MacFarlane	01ef573ac2	Org reader: fixed #1342. This change rewrites `inlineLaTeXCommand` so that parsec will know when input is being consumed. Previously a run-time error would be produced with some input involving raw latex. (I believe this does not affect the last release, as the inline latex reading was added recently.)	2014-06-16 14:18:06 -07:00
mpickering	7807564d44	Moved extractSpaces to Shared.hs. Generalised and moved the extractSpaces function from `HTML.hs` to `Shared.hs` so that the docx reader can also use it.	2014-06-16 20:45:54 +01:00
mpickering	3bc818d2d3	Integrated the docx reader into the main pandoc program. Changes also include generalising the types of reader allowed. The mechanism now mimics the more general output mechanism.	2014-06-16 07:18:40 -04:00
Jesse Rosenthal	293e4cfdc3	Add DocX files to tree. This introduces Text.Pandoc.DocX, and its exported `readDocX` function.	2014-06-16 07:18:34 -04:00
James Aspnes	abbf33ae7d	allow (and discard) optional argument for \caption	2014-06-12 21:19:00 -04:00
John MacFarlane	9681574661	LaTeX reader: Handle comments at the end of tables. This resolves the issue illustrated in http://stackoverflow.com/questions/24009489/comments-in-latex-break-pandoc-table.	2014-06-03 23:17:42 -07:00
John MacFarlane	ab5dda7a60	Markdown writer: Prettier pipe tables. Columns are now aligned. Closes #1323.	2014-06-03 23:17:03 -07:00
John MacFarlane	45f3851611	Docx writer: Section numbering carries over from reference.docx. Closes #1305.	2014-06-03 16:46:55 -07:00
John MacFarlane	0ddb4cd2e8	Docx writer: Combine reference.docx numbering with pandoc's. This should have fixed #1305, allowing the reference.docx to define section numbering, but it doesn't. Now the headings appear with proper indentation, but the numbers don't appear. Unclear why. styles.xml and numbering.xml basically match the docx which has the expected result.	2014-06-03 13:14:32 -07:00
John MacFarlane	ec047aaa8c	Docx writer: pandoc uses only numIds >= 1000 for lists. This opens up the possibility (with further code changes) of preserving some numbering from the reference.docx (e.g. header numbering). See #1305.	2014-06-03 12:13:31 -07:00
John MacFarlane	2842ad5a97	Docx writer: Changed abstractNumId numbering scheme. Now the minimum id used by pandoc is 990. All ids start with "99". This gives some room for a reference.docx to define numbering styles. Note: this is not yet possible, since pandoc generates numbering.xml entirely on its own.	2014-06-03 11:33:09 -07:00
John MacFarlane	05355ac57b	Docx writer: Simplified abstractNumId numbering. Instead of sequential numbering, we assign numbers based on the list marker styles. This simplifies some of the code and should make it easier to modify numbering in the future.	2014-06-03 11:03:40 -07:00
John MacFarlane	9b4e772718	Templates: use ordNub instead of nub. Closes #1022.	2014-06-03 11:01:23 -07:00
John MacFarlane	2a627f85fe	Shared: Added ordNub. API change (adds export).	2014-06-03 11:00:54 -07:00
John MacFarlane	cbfde5cb50	Docx writer: Create overrides per-image for media/ in ref docx. This should be somewhat more robust and cover more types of images.	2014-06-02 20:39:27 -07:00
John MacFarlane	326d7fa8f8	Docx writer: Improved entryFromArchive to avoid parse. No need to parse the XML if we're just going to render it right away!	2014-06-02 20:20:16 -07:00
John MacFarlane	bf915da6cd	Docx writer: Make images work in reference.docx headers/footers. * All media from reference.docx are copied into result. * Added defaults for common image types to [Content Types]. * Avoided redundant XML parse + write for entries taken over from reference.docx, for better performance.	2014-06-02 20:07:41 -07:00
John MacFarlane	e1cf47efa0	Templates: Fail informatively on template syntax errors. With the move from parsec to attoparsec, we lost good error reporting. In fact, since we weren't testing for end of input, malformed templates would fail silently. Here we revert back to Parsec for better error messages.	2014-06-01 23:45:05 -07:00
John MacFarlane	7242165bed	Docx writer: Improved handling of headers/footers.	2014-06-01 22:29:13 -07:00
John MacFarlane	6848f642e8	Docx writer: Header and footer are now carried over from reference.docx.	2014-06-01 21:17:00 -07:00
John MacFarlane	6327ccf523	Minor code reformat.	2014-06-01 15:29:27 -07:00
John MacFarlane	23a9b800a3	Docx writer: Take over document formatting from reference.docx. This includes margins, page size, page orientation.	2014-05-31 22:02:33 -07:00
John MacFarlane	9cf5f74e8f	PDF writer: Fixed treatment of data uris for images. Closes #1062.	2014-05-28 10:41:40 -07:00
John MacFarlane	e656658af8	Merge pull request #1302 from tarleb/inline-latex Org reader: support for inline LaTeX	2014-05-28 09:26:48 -07:00
John MacFarlane	e3ddc371de	Markdown reader: Handle `c++` and `objective-c` as language identifiers in github-style fenced blocks. Closes #1318. Note: This is special-case handling of these two cases. It would be good to do something more systematic.	2014-05-27 12:44:39 -07:00
John MacFarlane	2e80613451	Markdown reader: inline math must have nonspace before final `$`. Closes #1313.	2014-05-27 11:59:28 -07:00
Albert Krewinkel	3238a2f919	Org reader: support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like \tau) and LaTeX commands (like \cite{Coffee}) can be used without any further escaping.	2014-05-20 22:29:21 +02:00
John MacFarlane	3c77ab98bf	EPUB writer: Handle multiple dates with OPF `event` attributes. Note: in EPUB3 we can have only one dc:date, so only the first one is used.	2014-05-19 13:25:44 -07:00
John MacFarlane	8d04c821aa	Avoid `import Prelude hiding (catch)`. See #1309.	2014-05-19 09:45:00 -07:00
John MacFarlane	ee8c8da8cc	Removed dependency on conduit. * http-conduit flag is now https. * Instead of http-conduit, we depend on http-client and http-client-tls.	2014-05-18 22:07:00 -07:00
John MacFarlane	c5c9b0d289	EPUB writer: Fixed regression on cover image. In 1.12.4 and 1.12.4.2, the cover image would not appear properly, because the metadata id was not correct. This was introduced by the fix to #1254. Now we derive the id from the actual cover image filename, which we preserve rather than using "cover-image."	2014-05-15 10:11:48 -07:00
John MacFarlane	60b8b85040	Merge pull request #1293 from tarleb/typo Process: Fix minor typo in pipeProcess' docs	2014-05-14 06:41:04 -07:00
John MacFarlane	b5959b2007	Merge pull request #1297 from tarleb/citations Org reader: support Pandoc's citation extension	2014-05-14 06:37:29 -07:00
Albert Krewinkel	ceeb701c25	Org reader: support Pandoc's citation extension. Citations are defined via the "normal citation" syntax used in markdown, with the sole difference that newlines are not allowed between "[...]". This is for consistency, as org-mode generally disallows newlines between square brackets. The extension is turned on by default and can be turned off via the default syntax-extension mechanism, i.e. by specifying "org-citation" as the input format. Move `citeKey` from Readers.Markdown into Parsing. The function can be used by other readers, so it is made accessible for all parsers.	2014-05-14 15:00:26 +02:00
Albert Krewinkel	2423f9e6b1	Move `citeKey` from Readers.Markdown to Parsing. The function can be used by other readers, so it is made accessible for all parsers.	2014-05-14 14:58:05 +02:00
Albert Krewinkel	9df589b9c5	Introduce class HasLastStrPosition, generalize functions. Both `ParserState` and `OrgParserState` keep track of the parser position at which the last string ended. This patch introduces a new class `HasLastStrPosition` and makes the above types instances of that class. This enables the generalization of functions updating the state or checking if one is right after a string.	2014-05-14 14:57:00 +02:00
John MacFarlane	aa019448d6	LaTeX reader: Support `\addbibresource`.	2014-05-12 13:06:06 -07:00
John MacFarlane	2348f07b11	Shared addMetaField: if old and new values both lists, concatenate.	2014-05-12 13:05:42 -07:00
John MacFarlane	a8319d1339	LaTeX reader: set `bibliography` in metadata from `\bibliography` cmd.	2014-05-11 22:52:29 -07:00
Albert Krewinkel	113a32daa8	Process: Fix minor typo in pipeProcess' docs. Replace fullstop with comma, adjust capitalisation.	2014-05-11 15:07:01 +02:00
John MacFarlane	0092606476	LaTeX reader: Don't error on "%foo" with no newline.	2014-05-10 23:26:32 -07:00
Albert Krewinkel	c5fd631b55	Org reader: Fix block parameter reader, relax constraints. The reader produced wrong results for blocks containing non-letter chars in their parameter arguments. This patch relaxes constraints in that it allows block header arguments to contain any non-space character (except for ']' for inline blocks). Thanks to Xiao Hanyu for noticing this.	2014-05-10 11:35:54 +02:00
John MacFarlane	884693fea8	Merge pull request #1288 from tarleb/update-copyright Update copyright notices for 2014, add missing notices	2014-05-09 09:53:06 -07:00
Albert Krewinkel	07694b3018	Org reader: Fix parsing of blank lines within blocks. Blank lines were parsed as two newlines instead of just one. Thanks to Xiao Hanyu (@xiaohanyu) for pointing this out.	2014-05-09 18:23:23 +02:00
Albert Krewinkel	757c4f68f3	Org reader: Support arguments for code blocks. The general form of source block headers (`#+BEGIN_SRC <language> <switches> <header arguments>`) was not recognized by the reader. This patch adds support for the above form, adds header arguments to the block's key-value pairs and marks the block as a rundoc block if header arguments are present. This closes #1286.	2014-05-09 18:08:30 +02:00
Albert Krewinkel	7760504bb2	Org reader: refactor #+BEGIN..#+END block parsing code	2014-05-09 10:53:08 +02:00
Albert Krewinkel	8fdbef841d	Update copyright notices for 2014, add missing notices	2014-05-09 00:46:08 +02:00
mpickering	f0f88111e6	Small improvement to textile reader fix. Removed 'try'.	2014-05-07 09:48:48 -07:00
mpickering	0050b50905	Fix textile reader hanging. Textile reader hung on `pandoc -f textile http://johnmacfarlane.net/pandoc/demo/example25.textile`. The reader no longer hangs.	2014-05-07 09:32:25 -07:00
John MacFarlane	84f2336a7d	Textile reader: Rearranged inline parsers for performance. This is possible because of the rewrite of simpleInline. Also removed a redundant parser for grouped inlines.	2014-05-06 23:41:56 -07:00
John MacFarlane	442eecc15c	Textile reader: Rewrote simpleInline for clarity and efficiency. This way we only look once for the opening `[`.	2014-05-06 23:27:16 -07:00
John MacFarlane	ea4e947bd0	Textile reader: Disallow blank lines in inline contexts. @hi there@ should not be a single code span.	2014-05-06 23:16:47 -07:00
John MacFarlane	d6a9ba1cdc	Make `--trace` work with textile reader.	2014-05-06 22:28:11 -07:00
John MacFarlane	10644607e3	Textile reader: Rewrote some inline parsing code for clarity. (It seems clearer to put the whitespace parsing in the grouped parser. This also uses stateLastStrPos to determine when the border is adjacent to an alphanumeric.)	2014-05-06 22:14:35 -07:00
Albert Krewinkel	71bd4fb2b3	Org reader: Read inline code blocks. Org's inline code blocks take forms like `src_haskell(print "hi")` and are frequently used to include results from computations called from within the document. The blocks are read as inline code and marked with the special class `rundoc-block`. Proper handling and execution of these blocks is the subject of a separate library, rundoc, which is work in progress. This closes #1278.	2014-05-06 13:21:26 +02:00
John MacFarlane	dbd6c1540f	Fixed the fix to #1154. We need to strip off up to 4 spaces, not up to 3.	2014-05-04 16:21:18 -07:00
John MacFarlane	51aa304834	LaTeX writer: Fixed inconsistencies with reference escaping. - toLabel is now monadic, and it does the needed string escaping. - Closes #1130.	2014-05-04 14:43:05 -07:00
John MacFarlane	0c7e084342	Docx writer: Fall back on distribution reference.docx. * Undid changes to parseXml in last commit. * Instead of a string fallback, we have parseXml fall back on the reference.docx that comes with pandoc if the user's reference.docx does not contain a needed file. * Closes #1185.	2014-05-04 10:54:45 -07:00
John MacFarlane	d728715981	Docx writer: Added ability to give fallback in parseXml.	2014-05-04 10:45:20 -07:00
John MacFarlane	3e42f08e87	Markdown reader: Fixed bug with unwanted code in lists. Closes #1154. When reading a raw list item, we now strip off nonindent spaces.	2014-05-04 08:07:17 -07:00
John MacFarlane	96c0c950ca	AsciiDoc writer: Handle multiblock table cells. Closes #1246.	2014-05-03 21:31:53 -07:00
John MacFarlane	fde52c25a6	AsciiDoc writer: Correctly handle empty table cells. Closes #1245.	2014-05-03 21:08:45 -07:00
John MacFarlane	abd3a039b9	DocBook writer: Small tweaks to last commit. * Use isTightList from Shared. * Adjust writer test, since isTightList is a bit different from what was used before. Closes #1250.	2014-05-03 20:45:38 -07:00
Neil Mayhew	ccbf4fc9c2	Distinguish tight and loose lists in Docbook output. Determined by the first block of the first item being Plain.	2014-05-03 18:37:02 -07:00
John MacFarlane	2ba7873086	LaTeX reader: Fixed regression introduced with last commit. Tests now pass again.	2014-05-03 18:34:23 -07:00

2688 commits
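
Note on commit 87c08be58f: the message says `\emph{ hi }` is parsed as `[Space, Emph [Str "hi"], Space]`, with the edge spaces moved outside the emphasis. The sketch below illustrates that idea only. It is not pandoc's Haskell code: the tuple encoding of inlines and the name `extract_spaces` are assumptions made for this example, and collapsing each run of edge spaces into a single space is also an assumption.

```python
# Toy model of pandoc inlines (an assumption for illustration, not pandoc's API):
# ("Space",) is a space, ("Str", text) is a word, (name, [inlines]) is a wrapper.
SPACE = ("Space",)


def extract_spaces(wrapper, inlines):
    """Wrap `inlines` in `wrapper`, moving edge spaces outside the wrapper."""
    start, end = 0, len(inlines)
    # Skip spaces at the front.
    while start < end and inlines[start] == SPACE:
        start += 1
    # Skip spaces at the back.
    while end > start and inlines[end - 1] == SPACE:
        end -= 1
    leading = [SPACE] if start > 0 else []
    trailing = [SPACE] if end < len(inlines) else []
    inner = inlines[start:end]
    # Emit no wrapper at all when nothing but spaces was inside it.
    wrapped = [(wrapper, inner)] if inner else []
    return leading + wrapped + trailing


print(extract_spaces("Emph", [SPACE, ("Str", "hi"), SPACE]))
```

With the spaces outside the wrapper, a markdown writer can emit ` *hi* ` rather than `* hi *`, which is the problem the commit message describes.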