Jesse Rosenthal
9c7e0dc84b
Implement new normalization.
...
There were some problems with the old str normalization. This fixes those
problems. Also, since it drills down on its own, it only needs to be
mapped over the blocks, not walked over the tree.
2014-06-22 00:45:18 -04:00
John MacFarlane
5d0103606f
Markdown reader: Support smallcaps through span.
...
`<span style="font-variant:small-caps;">foo</span>` will be
parsed as a `SmallCaps` inline, and will work in all output
formats that support small caps.
Closes #1360.
2014-06-20 15:26:45 -07:00
John MacFarlane
d397a66107
MediaWiki reader: Tightened up template parsing.
...
The opening "{{" must be followed by an alphanumeric or ':'.
This prevents the exponential slowdown in #1033.
Closes #1033.
2014-06-20 12:00:26 -07:00
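The rule stated in d397a66107 can be illustrated with a minimal sketch. This is not the reader's actual parser code: the function name `opens_template` is hypothetical, and Python's `str.isalnum` is assumed as the meaning of "alphanumeric".

```python
def opens_template(text, pos=0):
    """Return True if text[pos:] starts with '{{' followed by an
    alphanumeric character or ':' (hypothetical helper, for illustration)."""
    if not text.startswith("{{", pos):
        return False
    # Slicing avoids an IndexError when '{{' is the end of the input.
    nxt = text[pos + 2:pos + 3]
    return nxt != "" and (nxt.isalnum() or nxt == ":")
```

Rejecting a run such as `{{{{{{` at the first character means the parser does not attempt a template parse at every brace, which is the kind of repeated backtracking the commit message refers to.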
John MacFarlane
8f20ac3da3
MediaWiki reader: Support --trace.
2014-06-20 11:39:24 -07:00
John MacFarlane
d81b4358ea
LaTeX writer: Correctly handle figures in notes.
...
Notes can't contain figures in LaTeX, so we fake it to avoid
an error. Closes #1053.
2014-06-20 11:26:38 -07:00
John MacFarlane
56c410ef6a
Markdown reader: Prevent spurious line breaks after list items.
...
When the `hard_line_breaks` option was specified, pandoc would
produce a spurious line break after a tight list item. This
patch solves the problem. Closes #1137.
2014-06-20 11:10:35 -07:00
John MacFarlane
2eadc78053
ImageSize: Use default instead of failing if image size not found
...
in exif header. Closes #1358.
2014-06-20 10:58:26 -07:00
John MacFarlane
b3b40546cb
HTML reader: Fix performance issue with malformed HTML tables.
...
We let a `</table>` tag close an open `<tr>` or `<td>`.
Closes #1167.
2014-06-20 10:47:29 -07:00
John MacFarlane
cab4b829b3
Support --trace in HTML reader.
2014-06-20 10:39:24 -07:00
John MacFarlane
12efffa85a
LaTeX writer: Fixed strikeout + highlighted code. Closes #1294.
...
Previously strikeout highlighted code caused an error.
2014-06-20 10:24:30 -07:00
Jesse Rosenthal
f6ae644831
Make strNormalize go bottomUp.
...
This was how it used to be before it was folded into blockNormalize.
2014-06-20 12:31:36 -04:00
Jesse Rosenthal
2aa5f58c5b
Docx reader: Add a comment explaining strNormalize
...
`normalize` from Text.Pandoc.Shared is more general. In tests, though,
it more than doubles the run time. `strNormalize` does less, but it does
what we need. This comment is added for future maintainability.
2014-06-20 10:27:18 -04:00
Jesse Rosenthal
03af19a7e1
Docx Reader: Normalize DefinitionLists
...
Previously DefinitionList had been left out of `blockNormalize`. Now it
is included.
2014-06-20 10:20:37 -04:00
Jesse Rosenthal
3da515bdb0
Docx reader: simplify blockNormalize
...
Use a function `stripSpaces` instead of recursion. This makes the code a
bit easier to read and maintain, and simplifies normalizing DefinitionList,
which was left out the first time.
2014-06-20 10:12:28 -04:00
Jesse Rosenthal
7fd48b30e0
Docx reader: Fix hdr handling in block norm
...
`blockNormalize` previously forgot to account for the case in which a
Header's inlines did not start with a space.
2014-06-20 09:30:30 -04:00
John MacFarlane
557b302731
Docx writer: Use Compact style for empty table cells.
...
Otherwise we get overly tall lines when there are empty
table cells and the other cells are compact.
Closes #1353.
2014-06-19 23:31:17 -07:00
John MacFarlane
3c059dbe60
HTML reader: Allow space between <col> and </col>.
...
Test case:
```
<table border="1">
<colgroup>
<col> </col>
<col></col>
</colgroup>
<tbody>
<tr>
<td>X</td>
<td>Y</td>
</tr>
<tr>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
```
2014-06-19 23:24:28 -07:00
John MacFarlane
0d8e0e5674
Merge pull request #1354 from jkr/literalTab
...
Parse literal tabs in docx
2014-06-19 22:47:32 -07:00
Jesse Rosenthal
a934db9a32
Introduce blockNormalize
...
This will help take care of spaces introduced at the beginning of strings.
2014-06-19 19:28:55 -04:00
Jesse Rosenthal
0e7d2dbd43
Have Docx reader properly interpret tabs.
2014-06-19 17:55:02 -04:00
Jesse Rosenthal
86fc44d6b3
Add literal tabs to parser.
2014-06-19 17:53:52 -04:00
John MacFarlane
5cb53a48d5
ImageSize: ignore unknown exif header tag rather than crashing.
...
Some images seem to have tag type of 256, which was causing
a runtime error.
2014-06-19 14:30:03 -07:00
John MacFarlane
00281559bf
Haddock writer: Use _____ for hrule.
...
Avoids interpretation as list.
2014-06-19 00:28:23 -07:00
John MacFarlane
de7b3a3d08
Haddock writer: Only use Decimal list style.
2014-06-18 18:11:01 -07:00
John MacFarlane
c4182b39ca
Small fix to haddock "tables".
2014-06-18 18:08:41 -07:00
John MacFarlane
ff6a2baeb9
More polish on Haddock reader/writer.
2014-06-18 17:49:59 -07:00
John MacFarlane
35e57db5c2
Finished first draft of Haddock writer.
2014-06-18 17:09:36 -07:00
John MacFarlane
9fc5c8d7af
Rewrote haddock reader to use haddock-library.
...
This brings pandoc's rendering of haddock markup in line
with the new haddock.
Note that we preserve line breaks in `@` code blocks, unlike
the earlier version.
Modified tests pass. More tests would be good.
2014-06-18 14:18:55 -07:00
John MacFarlane
ab390a10ec
Removed old haddock reader code. Add dependency on haddock-library.
...
This also removes the dependency on alex and happy.
2014-06-18 11:33:09 -07:00
John MacFarlane
b371e83d73
Highlighting: Let .numberLines work even if no language given.
...
Closes #1287, jgm/highlighting-kate#40.
2014-06-17 15:15:56 -07:00
John MacFarlane
59272e4d99
DocBook reader: Support <?asciidoc-br?>.
...
Closes #1236.
Note, this is a bit of a kludge, to work around the fact that xml-light
doesn't parse `<?asciidoc-br?>` correctly. We preprocess the input,
replacing that instruction with `<br/>`, and then parse that as a line
break. Other XML instructions are simply removed from the input stream.
2014-06-17 12:14:02 -07:00
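The preprocessing step described in 59272e4d99 can be sketched roughly as follows. This is an illustration using a regular expression, not the reader's actual code; the names `_PI` and `preprocess` are hypothetical.

```python
import re

# Matches any XML processing instruction, e.g. <?asciidoc-br?> or <?foo bar?>.
_PI = re.compile(r"<\?(.*?)\?>", re.DOTALL)

def preprocess(xml_text):
    """Replace <?asciidoc-br?> with <br/> and drop every other
    processing instruction (hypothetical helper, for illustration)."""
    def repl(match):
        if match.group(1).strip() == "asciidoc-br":
            return "<br/>"
        return ""
    return _PI.sub(repl, xml_text)
```

Note that a pattern this broad also removes a leading `<?xml ...?>` declaration, which is consistent with "other XML instructions are simply removed" but is an assumption about the intended scope.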
John MacFarlane
fc291efad3
LaTeX reader: Correctly handle table rows with too few cells.
...
LaTeX seems to treat them as if they have empty cells at the
end. Closes #241.
2014-06-17 00:38:55 -07:00
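The behaviour described in fc291efad3 amounts to padding short rows with empty cells. A minimal sketch, assuming rows are plain lists of cell contents and using the hypothetical name `pad_row`:

```python
def pad_row(cells, num_columns, empty=""):
    """Return a copy of `cells` extended with `empty` cells at the end
    so that it has at least `num_columns` entries (illustrative only)."""
    missing = max(0, num_columns - len(cells))
    return list(cells) + [empty] * missing
```

Rows that already have enough cells are returned unchanged; what happens to rows with too many cells is outside what the commit message describes, so the sketch leaves them alone.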
John MacFarlane
7d60c798bf
Fixed compiler warning.
2014-06-16 23:02:20 -07:00
John MacFarlane
bbe99003f8
Naming: Use Docx instead of DocX.
...
For consistency with the existing writer.
2014-06-16 22:44:40 -07:00
John MacFarlane
bec9f3c641
Merge branch 'docx' of https://github.com/jkr/pandoc into jkr-docx
2014-06-16 22:16:45 -07:00
John MacFarlane
78ee2416d1
Org reader: make tildes create inline code.
...
Closes #1345. Also relabeled 'code' and 'verbatim' parsers
to accord with the org-mode manual.
I'm not sure what the distinction between code and verbatim
is supposed to be, but I'm pretty sure both should be represented
as Code inlines in pandoc. The previous behavior resulted in the
text not appearing in any output format.
2014-06-16 22:03:26 -07:00
John MacFarlane
f9b97e6bfb
Small improvement to fix for #1333.
...
This allows blank lines at end of multiline headers.
2014-06-16 21:26:50 -07:00
John MacFarlane
9da5d8955e
Markdown reader: fixed #1333 (table parsing bug).
2014-06-16 21:18:24 -07:00
John MacFarlane
87c08be58f
LaTeX reader: handle leading/trailing spaces in emph better.
...
`\emph{ hi }` gets parsed as `[Space, Emph [Str "hi"], Space]`
so that we don't get things like `* hi *` in markdown output.
Also applies to textbf and some other constructions.
Closes #1146. (`--normalize` isn't touched by this, but
normalization should not generally be necessary with the
changes to the readers.)
2014-06-16 19:18:33 -07:00
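The transformation described in 87c08be58f can be sketched on a toy AST. The tuple encoding (`("Str", "hi")`, `("Space",)`, `("Emph", [...])`) and the function name `wrap_without_edge_spaces` are assumptions made for illustration; they only mirror the constructor names quoted in the commit message and are not pandoc's API.

```python
SPACE = ("Space",)

def wrap_without_edge_spaces(tag, inlines):
    """Wrap `inlines` in `tag`, hoisting leading and trailing spaces
    outside the wrapper (each side collapses to a single Space)."""
    start, end = 0, len(inlines)
    while start < end and inlines[start] == SPACE:
        start += 1
    while end > start and inlines[end - 1] == SPACE:
        end -= 1
    leading = [SPACE] if start > 0 else []
    trailing = [SPACE] if end < len(inlines) else []
    core = inlines[start:end]
    # An emphasis containing only spaces yields no wrapper at all.
    wrapped = [(tag, core)] if core else []
    return leading + wrapped + trailing
```

With the spaces outside the wrapper, a Markdown writer emits the asterisks directly against the text rather than against a space, which is the `* hi *` problem mentioned above.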
John MacFarlane
459805de4c
LaTeX reader: don't assume preamble doesn't contain environments.
...
Closes #1338.
2014-06-16 17:43:56 -07:00
John MacFarlane
31fd843133
HTML reader: Fixed major parsing problem with HTML tables.
...
Table cells were being combined into one cell. Closes #1341.
2014-06-16 15:45:20 -07:00
John MacFarlane
2b364b34bb
Merge pull request #1344 from mpickering/master
...
Moved extractSpaces to Shared.hs
2014-06-16 14:43:43 -07:00
John MacFarlane
01ef573ac2
Org reader: fixed #1342.
...
This change rewrites `inlineLaTeXCommand` so that parsec will
know when input is being consumed. Previously a run-time
error would be produced with some input involving raw latex.
(I believe this does not affect the last release, as the inline
latex reading was added recently.)
2014-06-16 14:18:06 -07:00
mpickering
7807564d44
Moved extractSpaces to Shared.hs
...
Generalised and moved the extractSpaces function from `HTML.hs` to
`Shared.hs` so that the docx reader can also use it.
2014-06-16 20:45:54 +01:00
mpickering
3bc818d2d3
Integrated the docx reader into the main pandoc program.
...
Changes also include generalising the types of reader allowed. The
mechanism now mimics the more general output mechanism.
2014-06-16 07:18:40 -04:00
Jesse Rosenthal
293e4cfdc3
Add DocX files to tree.
...
This introduces Text.Pandoc.DocX, and its exported `readDocX` function.
2014-06-16 07:18:34 -04:00
James Aspnes
abbf33ae7d
allow (and discard) optional argument for \caption
2014-06-12 21:19:00 -04:00
John MacFarlane
9681574661
LaTeX reader: Handle comments at the end of tables.
...
This resolves the issue illustrated in
http://stackoverflow.com/questions/24009489/comments-in-latex-break-pandoc-table .
2014-06-03 23:17:42 -07:00
John MacFarlane
ab5dda7a60
Markdown writer: Prettier pipe tables.
...
Columns are now aligned. Closes #1323.
2014-06-03 23:17:03 -07:00
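One way to align pipe-table columns, as ab5dda7a60 describes, is to pad every cell to the width of the widest cell in its column. The sketch below shows that idea only; the function name `pipe_table` is hypothetical, column alignment markers are omitted, and the exact layout the writer produces may differ.

```python
def pipe_table(header, rows):
    """Render a pipe table whose columns are padded to a common width
    (illustrative only; assumes every row has len(header) cells)."""
    table = [header] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]

    def fmt(cells):
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    # Separator dashes span each padded cell plus its two spaces.
    separator = "|" + "|".join("-" * (width + 2) for width in widths) + "|"
    return "\n".join([fmt(header), separator] + [fmt(row) for row in rows])
```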
John MacFarlane
45f3851611
Docx writer: Section numbering carries over from reference.docx.
...
Closes #1305.
2014-06-03 16:46:55 -07:00