Commit graph

3217 commits

Author SHA1 Message Date
Jesse Rosenthal
75eec0a6b8 Docx reader: work with new rStyle.
Just discards info at the moment, so at least it works the same.
2014-08-17 09:22:25 -04:00
Jesse Rosenthal
ea85a797c2 Parser: Framework for parsing styles.
We want to be able to read user-defined styles. Eventually we'll be able
to figure out styles in terms of inheritance as well. The actual
cascading will happen in the docx reader.
2014-08-17 09:22:21 -04:00
Jesse Rosenthal
dc5b0ba09b Docx reader: Change behavior of Super/Subscript
In docx, super- and subscript are attributes of Vertalign. It makes more
sense to follow this, and have different possible values of Vertalign in
runStyle. This is mainly a preparatory step for real style parsing,
since it can distinguish between vertical align being explicitly turned
off and it not being set.

In addition, it makes parsing a bit clearer, and makes sure we don't do
docx-impossible things like being simultaneously super and sub.
2014-08-17 08:20:00 -04:00
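A minimal sketch of the idea in the commit above, assuming hypothetical type and function names (these are not taken from the pandoc source). Wrapping a single alignment value in `Maybe` separates "explicitly set to baseline" from "not set", and a single field cannot be superscript and subscript at once.

```haskell
-- Hypothetical types for illustration; not pandoc's actual definitions.
data VertAlign = BaseLn | SupScrpt | SubScrpt
  deriving (Show, Eq)

newtype RunStyle = RunStyle
  { rVertAlign :: Maybe VertAlign  -- Nothing means "not set at all"
  } deriving (Show, Eq)

-- Interpret the optional value of a vertical-alignment attribute.
-- Unknown or missing values are treated as "not set" (an assumption).
parseVertAlign :: Maybe String -> Maybe VertAlign
parseVertAlign (Just "superscript") = Just SupScrpt
parseVertAlign (Just "subscript")   = Just SubScrpt
parseVertAlign (Just "baseline")    = Just BaseLn
parseVertAlign _                    = Nothing
```

With this shape, `Just BaseLn` records that alignment was switched off explicitly, while `Nothing` leaves the decision to an inherited style.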
John MacFarlane
9d52ecdd42 HTML reader: Parse appropriately styled span as SmallCaps. 2014-08-16 22:57:00 -07:00
Viktor Kronvall
753e47194c Simplify row width calculation. 2014-08-17 03:24:44 +02:00
Christoffer Ackelman
9f3c34841b Include row width in table rows.
Added a property to all table rows where the sum of column widths
is specified in pct (fraction of 5000).
2014-08-17 03:24:44 +02:00
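As a rough illustration of the arithmetic described above, the sketch below sums column widths expressed in pct units, where 5000 stands for the full width. The input format (fractions of the full width) and the function names are assumptions made for this example.

```haskell
-- Full width in pct units, as stated in the commit message.
fullWidthPct :: Int
fullWidthPct = 5000

-- Convert relative column widths (fractions of the full width, a
-- hypothetical input format) to pct units and sum them for the row.
rowWidthPct :: [Double] -> Int
rowWidthPct fractions = sum (map toPct fractions)
  where
    toPct f = round (f * fromIntegral fullWidthPct)
```

For example, columns at one quarter, one quarter, and one half of the width sum to the full 5000.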
John MacFarlane
cb4ae6112e Markdown writer: don't escape $, ^, ~ when extensions are deactivated.
`tex_math_dollars`, `superscript`, and `subscript` extensions,
respectively.

Closes #1127.
2014-08-16 17:14:51 -07:00
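The behavior described in this commit could be sketched as follows. The `Extension` type and both functions are hypothetical stand-ins, not pandoc's writer code: a character is escaped only when the extension that gives it special meaning is active.

```haskell
-- Hypothetical stand-ins for the three extensions named in the commit.
data Extension = TexMathDollars | Superscript | Subscript
  deriving (Show, Eq)

-- Escape a character only if the extension that makes it special is on.
escapeChar :: [Extension] -> Char -> String
escapeChar exts c
  | c == '$' && TexMathDollars `elem` exts = "\\$"
  | c == '^' && Superscript `elem` exts    = "\\^"
  | c == '~' && Subscript `elem` exts      = "\\~"
  | otherwise                              = [c]

-- Apply the per-character rule to a whole string.
escapeText :: [Extension] -> String -> String
escapeText exts = concatMap (escapeChar exts)
```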
Jesse Rosenthal
9bb0b99981 Docx reader: Remove unnecessary plural functions
Functions like `runElemsToInlines` and `parPartsToInlines` are just defined
in terms of concatenating and mapping their singular
version (e.g. `runElemToInlines`). Having two functions with almost
identical names makes it easier to introduce errors. It's easy enough to
just concat and map inline, and it makes it clearer what is going on in
the code.
2014-08-16 15:07:41 -04:00
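To illustrate the refactoring described above, here is a small sketch with made-up data types (only the name `runElemToInlines` comes from the commit message). Instead of keeping a separate plural helper, the caller applies `concatMap` to the singular function directly.

```haskell
-- Hypothetical, simplified types for illustration.
data RunElem = TextRun String | LnBrk
  deriving (Show, Eq)

data Inline = Str String | LineBreak
  deriving (Show, Eq)

-- The singular conversion: one run element to zero or more inlines.
runElemToInlines :: RunElem -> [Inline]
runElemToInlines (TextRun s) = [Str s]
runElemToInlines LnBrk       = [LineBreak]

-- No plural helper: the caller maps and concatenates in place.
runToInlines :: [RunElem] -> [Inline]
runToInlines elems = concatMap runElemToInlines elems
```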
Jesse Rosenthal
9969b2ebee Docx reader: Fix bug in character styles.
The style-handling cleanup introduced a bug here. There
wasn't previously a test to catch it.
2014-08-16 14:05:19 -04:00
Jesse Rosenthal
0ff9ec2f4e Rewrite Docx.hs and Reducible to use Builder.
The big news here is a rewrite of Docx to use the builder
functions. As opposed to previous attempts, we now see a significant
speedup -- times are cut in half (or more) in a few informal tests.

Reducible has also been rewritten. It can doubtless be simplified and
clarified further. We can consider this, at the moment, a reference for
correct behavior.
2014-08-16 10:22:55 -04:00
John MacFarlane
8bf39cf6d6 Markdown reader: Better handle quote characters in inline links.
This was previously failing to be recognized as a link:

    [Test](http://en.wikipedia.org/wiki/Ward's_method)

Closes #1534.
2014-08-14 10:59:27 -07:00
John MacFarlane
e917bcc124 Make raw_tex extension non-default for textile reader, writer.
Enable `raw_tex` extension in textile writer.

Closes #1532.
2014-08-14 09:49:31 -07:00
John MacFarlane
52a9ccce4f Merge pull request #1531 from jkr/morefonts
Docx reader: Interpret "Strong" and "Emphasis" run styles.
2014-08-13 19:47:42 -07:00
John MacFarlane
17b2fd567b Fixed haddock comment. 2014-08-13 13:59:50 -07:00
John MacFarlane
05b7fd8dee Removed unneeded import. 2014-08-13 11:35:09 -07:00
Jesse Rosenthal
6897905602 Docx reader: Interpret "Strong" and "Emphasis" run styles. 2014-08-13 12:23:03 -04:00
John MacFarlane
22ab3367c6 Removed unneeded CPP. 2014-08-12 22:50:51 -07:00
Jesse Rosenthal
a1320a76f9 Docx: Reducible forgot about smallcaps 2014-08-13 00:09:40 -04:00
Jesse Rosenthal
dca55630e6 Docx Reader: Trim line breaks from the beginning and end of Section
Headers.

We might also want to do this elsewhere (for pars, for example).
2014-08-12 23:42:01 -04:00
Jesse Rosenthal
378a795eaa Docx: More robust handling of multiple bookmarks in header. 2014-08-12 23:41:57 -04:00
Jesse Rosenthal
85579052b5 Docx reader: Check for null-id'd anchors too.
Otherwise they get left dangling in the document.
2014-08-12 23:33:03 -04:00
Jesse Rosenthal
194ed88852 Docx reader: accept explicit "Italic" and "Bold" rStyles.
Note that "Italic" can be on, and, from the last commit, `<w:i>` can be
present, but be turned off. In that case, the turned-off tag takes
precedence. So, we have to distinguish between something being off and
something not being there. Hence, isItalic, isBold, isStrike, and
isSmallCaps have become Maybes.
2014-08-12 22:39:18 -04:00
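The precedence rule in the commit above can be sketched with `Maybe Bool` values. The function name and its two parameters are hypothetical: the first is the state of an explicit tag such as `<w:i>`, the second is what a named style like "Italic" implies.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Resolve a formatting flag. An explicit tag, whether on or off, wins
-- over the value implied by the named style. If neither says anything,
-- the flag defaults to off (an assumption for this sketch).
resolveFlag :: Maybe Bool -> Maybe Bool -> Bool
resolveFlag explicitTag fromStyle =
  fromMaybe False (explicitTag <|> fromStyle)
```

Here `resolveFlag (Just False) (Just True)` is `False`: the turned-off tag overrides the style, which would be impossible to express if the flag were a plain `Bool`.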
Jesse Rosenthal
aae71ad595 Docx reader: Add "BlockQuotation" to divs list. 2014-08-12 22:08:30 -04:00
Jesse Rosenthal
d4748038d7 Docx Reader: Fix font style parsing.
Before we just checked for the existence of a tag. Now, we make sure to
check for its on/off value.
2014-08-12 22:04:07 -04:00
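A minimal sketch of checking a tag's on/off value rather than only its presence. The `Tag` type and `onOff` function are hypothetical simplifications, and treating any value other than "false", "0", or "off" as on is an assumption of this example.

```haskell
-- Hypothetical, simplified view of a run-property tag: its name and an
-- optional value attribute.
data Tag = Tag { tagName :: String, tagVal :: Maybe String }

-- Nothing: the tag is absent. Just True/False: the tag is present and
-- switched on or off. A tag without a value attribute counts as on.
onOff :: String -> [Tag] -> Maybe Bool
onOff name tags =
  case [tagVal t | t <- tags, tagName t == name] of
    []            -> Nothing
    (Nothing : _) -> Just True
    (Just v : _)  -> Just (v `notElem` ["false", "0", "off"])
```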
John MacFarlane
e883ef4eb9 Merge pull request #1527 from mpickering/juicypixels
Attempts to convert gif, tiff and bmp to png in pdf writer
2014-08-12 16:57:22 -07:00
John MacFarlane
7684b24959 Merge pull request #1528 from mpickering/epubtitlepage
EPUB Reader: Ignores titlepage attribute
2014-08-12 16:56:38 -07:00
Matthew Pickering
2b31df32de LaTeX Writer: Added missing closing braces to hyperdef commands 2014-08-13 00:37:18 +01:00
Matthew Pickering
57bebe26df PDF Writer: Attempts to convert images to pdf renderable formats
Now depends on the JuicyPixels library.

Will attempt to convert an image (gif, tiff, bmp) to png when converting
to pdf.
2014-08-13 00:37:18 +01:00
John MacFarlane
81157c7cc6 HTML writer: use 'uri' or 'email' class for autolinks.
This allows them to be styled specially.

Closes #1501.
2014-08-12 15:49:43 -07:00
John MacFarlane
da507dcb84 ConTeXt writer: improved autolink detection.
It previously failed in some cases with escaped special characters.
2014-08-12 15:49:20 -07:00
Matthew Pickering
34cf016251 EPUB Reader: Ignore title pages 2014-08-12 23:03:24 +01:00
John MacFarlane
f16dd1bfdf DocBook: Support equations with mathml.
equation, informalequation, inlineequation and mml:math elements.
2014-08-12 14:00:46 -07:00
John MacFarlane
d6f0973128 Merge pull request #1524 from jkr/dropCap3
Docx reader: move dropcap combining logic to Reducible
2014-08-12 11:13:27 -07:00
John MacFarlane
f97ec6db2c Markdown reader: Improved parsing of indented code in list items.
Indented code at the beginning of a list item must be indented eight
spaces from the margin (or from the edge of the container), or four
spaces past the list marker, whichever is farther.

Some examples in `tests/markdown-reader-more.txt`.
2014-08-12 11:10:48 -07:00
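One way to read the rule above as arithmetic, using hypothetical function and parameter names: the threshold is the larger of eight columns from the container edge and four columns past the end of the list marker.

```haskell
-- Minimum indentation (in columns from the container edge) at which
-- text at the start of a list item counts as indented code.
-- markerEnd is the column just past the list marker (hypothetical name).
codeIndentThreshold :: Int -> Int
codeIndentThreshold markerEnd = max 8 (markerEnd + 4)

-- True if text starting at column `indent` qualifies as indented code.
isIndentedCode :: Int -> Int -> Bool
isIndentedCode markerEnd indent = indent >= codeIndentThreshold markerEnd
```

For a short marker ending at column 1, the eight-column rule applies; for a marker ending at column 9, the threshold moves out to column 13.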
John MacFarlane
ab75e1d3bd Beamer: Use \footnote<.->{..} for notes.
This ensures that the footnotes will not appear before the
overlays in which their corresponding note markers appear.

Closes #1525.
2014-08-12 10:56:57 -07:00
Jesse Rosenthal
9d0b390d48 Docx reader: move combining logic to Reducible
Introduces a new function in Reducible, concatR. The idea is that if we
have two lists of Reducibles (blocks or inlines), we can combine them and
just perform the reduction on the joining parts (the last element of the
first list, the first element of the second list). This is useful in cases
where the two lists are already reduced, and we're only worried about the
joining elements.

This actually improves the efficiency a bit further, because concatR can be
smart about empty lists.
2014-08-12 10:26:49 -04:00
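The joining idea can be sketched as below. Only the name `concatR` comes from the commit message; the `Inline` type and the `combine` rule are simplified assumptions, not the actual Reducible class. Both input lists are assumed to be reduced already, so only the seam between them is examined.

```haskell
-- Hypothetical, simplified inline type.
data Inline = Str String | Space
  deriving (Show, Eq)

-- Combine two adjacent elements where possible (assumed rule:
-- adjacent Str values merge, adjacent Spaces collapse).
combine :: Inline -> Inline -> [Inline]
combine (Str a) (Str b) = [Str (a ++ b)]
combine Space Space     = [Space]
combine x y             = [x, y]

-- Join two already-reduced lists, reducing only at the seam.
-- Empty lists short-circuit without any further work.
concatR :: [Inline] -> [Inline] -> [Inline]
concatR [] ys       = ys
concatR xs []       = xs
concatR xs (y : ys) = init xs ++ combine (last xs) y ++ ys
```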
Jesse Rosenthal
e4a8e4a636 Docx reader: Make dropcap combining more efficient.
Before, we had to run reduceList on the whole combined paragraph, which
was redundant, and could take some time for long paragraphs. We only
need to combine the drop cap with the first inline of the next
paragraph.
2014-08-12 09:00:53 -04:00
Jesse Rosenthal
45ec035e93 Docx reader: combine inlines properly in dropcaps.
Make sure that adjacent inlines are combined properly in dropcaps. This
updates the test results as well.
2014-08-11 23:31:16 -04:00
Jesse Rosenthal
3e32cd5bb1 Docx reader: Use dropcap state.
If we get to a dropcap, we hold the inlines until the next
paragraph and combine them there.
2014-08-11 23:08:33 -04:00
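A small sketch of the held-state approach described above, using plain strings in place of inlines. The `Par` type and `mergeDropCaps` are hypothetical: drop-cap text is carried along as state and prepended to the next ordinary paragraph.

```haskell
-- Hypothetical paragraph: a drop-cap flag plus its text.
data Par = Par { isDropCap :: Bool, parText :: String }
  deriving (Show, Eq)

-- Walk the paragraphs, holding drop-cap text until the next ordinary
-- paragraph and combining it there. A trailing drop cap with no
-- following paragraph is emitted on its own (an assumption).
mergeDropCaps :: [Par] -> [String]
mergeDropCaps = go ""
  where
    go held [] = [held | not (null held)]
    go held (p : ps)
      | isDropCap p = go (held ++ parText p) ps
      | otherwise   = (held ++ parText p) : go "" ps
```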
Jesse Rosenthal
bca74a2bd0 Add dropCap to paragraph style. 2014-08-11 21:42:02 -04:00
John MacFarlane
86d4da994a EPUB reader: use walk instead of bottomUp.
This should be more efficient.
2014-08-11 14:48:42 -07:00
John MacFarlane
31811657fa Merge pull request #1521 from jkr/emptyEmph
Discard empty formatters
2014-08-11 14:44:08 -07:00
John MacFarlane
211fe266e0 LaTeX writer: Don't produce \label{} for Div or Span.
Just `\hyperdef`.
A slight amendment to #1519.
2014-08-11 12:20:44 -07:00
John MacFarlane
95d9b43b42 Merge pull request #1519 from mpickering/more
EPUB Normalisation and anchors for div blocks in tex
2014-08-11 11:28:11 -07:00
John MacFarlane
6fae136cbb Textile reader: list and HTML block parsing improvements.
Closes #1513.

Lists can now start without an intervening blank line.
Also, html block-level tags that don't start a line are parsed
as RawInline and don't interrupt paragraphs, as in RedCloth.
2014-08-11 11:22:39 -07:00
John MacFarlane
4a535211d8 Merge pull request #1365 from gbataille/docx-margin
Scale images to fit the page for DOCX
2014-08-11 10:44:52 -07:00
Jesse Rosenthal
0411fe7ccf Docx reader: handle empty reducibles. 2014-08-11 12:48:16 -04:00
Matthew Pickering
1952dd0592 TeX Writer: Write hyperdef and label for identifiers on Div blocks 2014-08-11 16:23:05 +01:00
Matthew Pickering
72b1470713 EPUB Reader: Fixed another normalisation problem. 2014-08-11 16:23:05 +01:00
John MacFarlane
e690fe4a3e Merge pull request #1516 from mpickering/epubmetadata
EPUB improvements
2014-08-11 08:14:54 -07:00