Jesse Rosenthal
1de8d4d087
Docx Reader: Simplify makeHeaderAnchors
...
Use a pattern guard, in preparation for doing some more complicated
stuff with it (recording header anchors, so we can change them to auto
ids.)
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
ab76bbebbe
Docx Reader: Clean up guards
...
Use PatternGuards to get rid of the need for `isJust` and `fromJust`
altogether.
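
A minimal sketch of the kind of rewrite this describes, not the reader's actual code. `StyleMap`, `headerLevelOld`, and `headerLevelNew` are hypothetical names used only for illustration.

```haskell
{-# LANGUAGE PatternGuards #-}
import Data.Maybe (fromJust, isJust)

-- Hypothetical style table: paragraph style name -> header level.
type StyleMap = [(String, Int)]

-- Before: test with isJust, then unwrap with the partial fromJust.
headerLevelOld :: StyleMap -> String -> Int
headerLevelOld styles name
  | isJust found = fromJust found
  | otherwise    = 0
  where found = lookup name styles

-- After: the pattern guard tests and binds in one step.
headerLevelNew :: StyleMap -> String -> Int
headerLevelNew styles name
  | Just lvl <- lookup name styles = lvl
  | otherwise                      = 0
```

Both versions return the same results; the second simply has no partial function left to misuse.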
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
db187348cd
Docx rdr: Avoid mapping makeHeaderAnchors globally
...
It only applies to headers, so we can just apply it when we make a
header.
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
4248f25152
Move Docx reader to DocxContext monad
...
This is a ReaderT State stack, which keeps track of some environment info, such
as the options and the docx doc. The state will come in handy in the future,
for a couple of planned features (rewriting the section anchors as auto_idents,
and hopefully smart-quoting).
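
A rough sketch of what a ReaderT-over-State stack of this shape looks like, assuming the mtl package. The record fields and the `freshAnchor` helper are hypothetical; only the general shape (read-only environment plus mutable state) comes from the message above.

```haskell
import Control.Monad.Reader (ReaderT, asks, runReaderT)
import Control.Monad.State (State, evalState, gets, modify)

-- Hypothetical read-only environment (options, document info, ...).
data DEnv = DEnv { envAnchorPrefix :: String }

-- Hypothetical mutable state (e.g. bookkeeping for anchors).
data DState = DState { stNextId :: Int }

-- The stack: environment on top of state.
type DocxContext = ReaderT DEnv (State DState)

-- Read the prefix from the environment and bump a counter in the state.
freshAnchor :: DocxContext String
freshAnchor = do
  prefix <- asks envAnchorPrefix
  n <- gets stNextId
  modify (\s -> s { stNextId = n + 1 })
  return (prefix ++ "-" ++ show n)

-- Run a computation given an initial environment and state.
runDocxContext :: DocxContext a -> DEnv -> DState -> a
runDocxContext ctx env st = evalState (runReaderT ctx env) st
```

With mtl, `gets` and `modify` work inside `ReaderT` without explicit lifting, which keeps functions in the monad short.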
2014-06-28 04:00:16 -04:00
John MacFarlane
b2127311cb
Require haddock-library >= 1.1 and simplify haddock reader code.
...
See #1346.
2014-06-26 12:35:13 -07:00
John MacFarlane
9f694619cd
Merge pull request #1374 from jkr/track-changes-options
...
Track changes with options
2014-06-25 23:51:16 -07:00
Jesse Rosenthal
2396be6f57
Docx reader: Code cleanup in parse.
...
Remove some redundant ways of dealing with Maybe.
2014-06-25 17:12:03 -04:00
Jesse Rosenthal
0e9bf37f64
Docx reader: Make use of track-changes option.
2014-06-25 14:17:20 -04:00
Jesse Rosenthal
d824f89fb3
Add TrackChanges to Options export.
2014-06-25 14:05:21 -04:00
Jesse Rosenthal
6ff84b5e8d
Add reader option for track changes.
2014-06-25 13:57:56 -04:00
Jesse Rosenthal
3ec62d0064
Add TrackChanges type to options.
2014-06-25 13:50:08 -04:00
Jesse Rosenthal
9614ddfedc
Docx reader: Remove unnecessary filter in Parse.
...
mapMaybe does the filtering for us.
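
To illustrate the point with a small standalone sketch (the `parseElem` parser is hypothetical, not the one in the Parse module): filtering for successes before `mapMaybe` is redundant, because `mapMaybe` already drops every `Nothing`.

```haskell
import Data.Maybe (isJust, mapMaybe)
import Text.Read (readMaybe)

-- Hypothetical element parser: succeeds only on integer literals.
parseElem :: String -> Maybe Int
parseElem = readMaybe

-- Redundant: the filter repeats work that mapMaybe does anyway.
parseAllRedundant :: [String] -> [Int]
parseAllRedundant = mapMaybe parseElem . filter (isJust . parseElem)

-- Equivalent and simpler.
parseAll :: [String] -> [Int]
parseAll = mapMaybe parseElem
```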
2014-06-25 11:00:15 -04:00
Jesse Rosenthal
ed44e4ca8c
Docx reader: Add rudimentary track changes support.
...
This will only read the insertions, and ignore the deletions.
2014-06-25 10:38:01 -04:00
Jesse Rosenthal
38e1d3e95b
Docx reader: Parse Insertions and Deletions.
...
This is just for the Parse module, reading it into the Docx format. It
still has to be translated into pandoc.
2014-06-25 10:32:48 -04:00
Jesse Rosenthal
c343f1a90b
Docx Reader: Add change types
...
Insertion and deletion. Dates are just strings for now.
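
A sketch of what such change types might look like, with the date kept as an unparsed string. All type, field, and function names here are hypothetical, and `acceptedText` is only one possible way to consume the type.

```haskell
-- The two kinds of tracked change.
data ChangeType = Insertion | Deletion
  deriving (Eq, Show)

-- Hypothetical record for one tracked change.
data Change = Change
  { changeType   :: ChangeType
  , changeAuthor :: String
  , changeDate   :: String   -- kept as a plain string, not parsed
  , changeText   :: String
  } deriving (Eq, Show)

-- Keep the text of insertions and drop deletions.
acceptedText :: [Change] -> [String]
acceptedText cs = [changeText c | c <- cs, changeType c == Insertion]
```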
2014-06-25 08:10:19 -04:00
Jesse Rosenthal
69743cd598
Docx reader: Ignore zero (or negative) indent
...
If a block has an indentation less than or equal to zero, it should not be
treated as a block quote.
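
The rule can be sketched as a single guard. The types below are simplified stand-ins for illustration, not the reader's own.

```haskell
-- Hypothetical parsed paragraph: optional left indent plus its text.
data Par = Par { parIndent :: Maybe Int, parText :: String }

-- Simplified output blocks.
data Block = Plain String | BlockQuote String
  deriving (Eq, Show)

-- Only a strictly positive indent produces a block quote.
toBlock :: Par -> Block
toBlock p
  | Just n <- parIndent p, n > 0 = BlockQuote (parText p)
  | otherwise                    = Plain (parText p)
```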
2014-06-24 15:06:25 -04:00
Jesse Rosenthal
a8866bc121
Docx reader: remove T.P.Generic import.
...
This marks the removal of the final tree-walk in the code. (Though there
is still one in the Lists module.)
2014-06-24 12:15:26 -04:00
Jesse Rosenthal
5ae6b8c6f1
Docx reader: pass definition test.
...
This commit also fixes a problem with the previous code pushes, which
wouldn't allow code blocks to share a div.
2014-06-24 12:12:02 -04:00
Jesse Rosenthal
bebea5e936
Docx reader: pass code tests.
2014-06-24 10:34:07 -04:00
Jesse Rosenthal
08633fad33
Add copyright block to T.P.R.Docx.Reducible.
2014-06-23 20:26:08 -04:00
John MacFarlane
ac6756009f
Merge pull request #1366 from jkr/reducible3
...
Docx rewrite and cleanup (in terms of Reducible typeclass)
2014-06-23 14:33:38 -07:00
Jesse Rosenthal
11b0778744
Use Reducible in docx reader.
...
This cleans up the implementation, and cuts down on tree-walking.
Anecdotally, I've seen about a 3-fold speedup.
2014-06-23 17:08:17 -04:00
Jesse Rosenthal
94d0fb1538
Move some of the clean-up logic into List module.
...
This will allow us to get rid of more general functions we no longer need in
the main reader.
2014-06-23 17:08:17 -04:00
Jesse Rosenthal
ef5fad2698
Add new typeclass, Reducible
...
This defines a typeclass `Reducible` which allows us to "reduce" pandoc
Inlines and Blocks, like so
    Emph [Strong [Str "foo", Space]] <++> Strong [Emph [Str "bar"]], Str "baz"] =
        [Strong [Emph [Str "foo", Space, Str "bar"], Space, Str "baz"]]
So adjacent formattings and strings are appropriately grouped.
Another set of operators for `(Reducible a) => (Many a)` is also
included.
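
A much-simplified sketch of the grouping idea, not the actual typeclass. It only merges neighbours that share the same outermost constructor, and does not reorder nested formatting as the example above does. The `Inline` type is a cut-down stand-in and `combine` is a hypothetical helper.

```haskell
-- Cut-down stand-in for pandoc's Inline type.
data Inline = Str String | Space | Emph [Inline] | Strong [Inline]
  deriving (Eq, Show)

-- Merge two neighbours when they share an outer constructor.
combine :: Inline -> Inline -> [Inline]
combine (Emph xs)   (Emph ys)   = [Emph (xs <++> ys)]
combine (Strong xs) (Strong ys) = [Strong (xs <++> ys)]
combine (Str a)     (Str b)     = [Str (a ++ b)]
combine x           y           = [x, y]

-- Append two lists, reducing across the seam between them.
(<++>) :: [Inline] -> [Inline] -> [Inline]
[] <++> ys     = ys
xs <++> []     = xs
xs <++> (y:ys) = init xs ++ combine (last xs) y ++ ys
```

For instance, `[Emph [Str "foo", Space]] <++> [Emph [Str "bar"], Str "baz"]` gives `[Emph [Str "foo", Space, Str "bar"], Str "baz"]` under this simplified definition.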
2014-06-23 17:08:05 -04:00
John MacFarlane
87ab01637e
LaTeX writer: Use \textquotesingle for ' in inline code.
...
Otherwise we get curly quotes in the PDF output.
Closes #1364.
2014-06-23 12:51:10 -07:00
John MacFarlane
e03ed7377c
Markdown reader: Combine consecutive latex environments.
...
This helps when you have two minipages which can't have
blank lines between them.
See #690, #1196.
2014-06-23 12:42:27 -07:00
Jesse Rosenthal
8e5bd9d851
Docx reader: Fix spacing in formatting.
...
The normalizing tests revealed a problem with unformatted spaces, brought about
by `spanTrim`. This fixes it by not trimming the spaces out of spans until they
are in their final form.
2014-06-22 01:53:30 -04:00
Jesse Rosenthal
9c7e0dc84b
Implement new normalization.
...
There were some problems with the old str normalization. This fixes those
problems. Also, since it drills down on its own, it only needs to be
mapped over the blocks, not walked over the tree.
2014-06-22 00:45:18 -04:00
John MacFarlane
5d0103606f
Markdown reader: Support smallcaps through span.
...
`<span style="font-variant:small-caps;">foo</span>` will be
parsed as a `SmallCaps` inline, and will work in all output
formats that support small caps.
Closes #1360.
2014-06-20 15:26:45 -07:00
John MacFarlane
d397a66107
MediaWiki reader: Tightened up template parsing.
...
The opening "{{" must be followed by an alphanumeric or ':'.
This prevents the exponential slowdown in #1033.
Closes #1033.
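
The tightened condition can be sketched as a plain string predicate. This is an illustration only: the reader itself is a parser, and `startsTemplate` is a hypothetical name.

```haskell
import Data.Char (isAlphaNum)
import Data.List (stripPrefix)

-- True only when "{{" is immediately followed by an alphanumeric or ':'.
startsTemplate :: String -> Bool
startsTemplate s
  | Just (c:_) <- stripPrefix "{{" s = isAlphaNum c || c == ':'
  | otherwise                        = False
```

Rejecting inputs such as `{{{{{{` at the first character means a long run of braces fails fast instead of being retried as a template at every position.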
2014-06-20 12:00:26 -07:00
John MacFarlane
8f20ac3da3
MediaWiki reader: Support --trace.
2014-06-20 11:39:24 -07:00
John MacFarlane
d81b4358ea
LaTeX writer: Correctly handle figures in notes.
...
Notes can't contain figures in LaTeX, so we fake it to avoid
an error. Closes #1053.
2014-06-20 11:26:38 -07:00
John MacFarlane
56c410ef6a
Markdown reader: Prevent spurious line breaks after list items.
...
When the `hard_line_breaks` option was specified, pandoc would
produce a spurious line break after a tight list item. This
patch solves the problem. Closes #1137.
2014-06-20 11:10:35 -07:00
John MacFarlane
2eadc78053
ImageSize: Use default instead of failing if image size not found
...
in exif header. Closes #1358.
2014-06-20 10:58:26 -07:00
John MacFarlane
b3b40546cb
HTML reader: Fix performance issue with malformed HTML tables.
...
We let a `</table>` tag close an open `<tr>` or `<td>`.
Closes #1167.
2014-06-20 10:47:29 -07:00
John MacFarlane
cab4b829b3
Support --trace in HTML reader.
2014-06-20 10:39:24 -07:00
John MacFarlane
12efffa85a
LaTeX writer: Fixed strikeout + highlighted code. Closes #1294.
...
Previously strikeout highlighted code caused an error.
2014-06-20 10:24:30 -07:00
Jesse Rosenthal
f6ae644831
Make strNormalize go bottomUp.
...
This was how it used to be before it was folded into blockNormalize.
2014-06-20 12:31:36 -04:00
Jesse Rosenthal
2aa5f58c5b
Docx reader: Add a comment explaining strNormalize
...
`normalize` from Text.Pandoc.Shared is more general. In tests, though,
it more than doubles the run time. `strNormalize` does less, but it does
what we need. This comment is added for future maintainability.
2014-06-20 10:27:18 -04:00
Jesse Rosenthal
03af19a7e1
Docx Reader: Normalize DefinitionLists
...
Previously DefinitionList had been left out of `blockNormalize`. Now it
is included.
2014-06-20 10:20:37 -04:00
Jesse Rosenthal
3da515bdb0
Docx reader: simplify blockNormalize
...
Use a function `stripSpaces`, instead of recursion. This makes it a bit
easier to read and maintain, and simplifies normalizing DefinitionList,
which was left out the first time.
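
One plausible shape for such a helper, shown on a cut-down `Inline` type. This is an assumption about what `stripSpaces` does (trim `Space` from both ends of an inline list), not a copy of the reader's definition.

```haskell
import Data.List (dropWhileEnd)

-- Cut-down stand-in for pandoc's Inline type.
data Inline = Str String | Space
  deriving (Eq, Show)

-- Drop leading and trailing Space elements; interior spaces are kept.
stripSpaces :: [Inline] -> [Inline]
stripSpaces = dropWhileEnd (== Space) . dropWhile (== Space)
```

Because the helper works on any inline list, the same call can serve paragraphs, headers, and each term or definition of a DefinitionList.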
2014-06-20 10:12:28 -04:00
Jesse Rosenthal
7fd48b30e0
Docx reader: Fix hdr handling in block norm
...
`blockNormalize` previously forgot to account for the case in which a
Header's inlines did not start with a space.
2014-06-20 09:30:30 -04:00
John MacFarlane
557b302731
Docx writer: Use Compact style for empty table cells.
...
Otherwise we get overly tall lines when there are empty
table cells and the other cells are compact.
Closes #1353.
2014-06-19 23:31:17 -07:00
John MacFarlane
3c059dbe60
HTML reader: Allow space between <col> and </col>.
...
Test case:
```
<table border="1">
<colgroup>
<col> </col>
<col></col>
</colgroup>
<tbody>
<tr>
<td>X</td>
<td>Y</td>
</tr>
<tr>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
```
2014-06-19 23:24:28 -07:00
John MacFarlane
0d8e0e5674
Merge pull request #1354 from jkr/literalTab
...
Parse literal tabs in docx
2014-06-19 22:47:32 -07:00
Jesse Rosenthal
a934db9a32
Introduce blockNormalize
...
This will help take care of spaces introduced at the beginning of strings.
2014-06-19 19:28:55 -04:00
Jesse Rosenthal
0e7d2dbd43
Have Docx reader properly interpret tabs.
2014-06-19 17:55:02 -04:00
Jesse Rosenthal
86fc44d6b3
Add literal tabs to parser.
2014-06-19 17:53:52 -04:00
John MacFarlane
5cb53a48d5
ImageSize: ignore unknown exif header tag rather than crashing.
...
Some images seem to have tag type of 256, which was causing
a runtime error.
2014-06-19 14:30:03 -07:00
John MacFarlane
00281559bf
Haddock writer: Use _____ for hrule.
...
Avoids interpretation as list.
2014-06-19 00:28:23 -07:00