2513 commits

Author SHA1 Message Date
Jesse Rosenthal
d65fd58171 Docx Reader: A nicer Docx type.
This modifies the Docx type in the parser to avoid all the extra files
(Notes, numbering, etc). A reader monad keeps track of these, and applies
them at the end. The reader monad is stacked with ErrorT to enable better
error-handling than the old Maybes. (Note that the better error handling
isn't really there yet, but it is now possible.)

One long-term goal of these changes is to make it easier to write the Docx
type. This should make it easier to develop a standalone docx package in the
future.
2014-07-12 18:03:27 +01:00
John MacFarlane
616cf6c539 Merge branch 'dokuwiki' of https://github.com/claremacrae/pandoc into claremacrae-dokuwiki 2014-07-07 16:15:35 -06:00
John MacFarlane
e4263d306e Revamped raw HTML block parsing in markdown.
- We no longer include trailing spaces and newlines in the
  raw blocks.
- We look for closing tags for elements (but without backtracking).
- Each block-level tag is its own RawBlock; we no longer try to
  consolidate them (though `--normalize` will do so).

Closes #1330.
2014-07-07 15:53:59 -06:00
John MacFarlane
91b902f02f EPUB writer: better handle HTML media tags. 2014-07-07 15:53:59 -06:00
John MacFarlane
3d4e76f342 Parsing: Added stateInHtmlBlock to ParserState.
This is used to keep track of the ending tag we're waiting
for when we're parsing inside HTML block tags.
2014-07-07 15:53:59 -06:00
John MacFarlane
8c7abf173a normalize: consolidate adjacent RawBlocks when possible. 2014-07-07 15:53:59 -06:00
John MacFarlane
cbeb931554 HTML reader: adjust blockTags and eitherBlockOrInline.
- Added `audio` and `source` in `eitherBlockOrInline`.
- Moved `video`, `svg`, `progress`, `script`, `noscript` from
  `blockTags` to `eitherBlockOrInline`.
- `map` and `object` were mistakenly in both lists; they have been removed
  from `blockTags`.
2014-07-07 15:53:59 -06:00
John MacFarlane
186b8e71e0 Merge pull request #1397 from jkr/equations
Docx Reader: Parse Docx OMML math/equations
2014-07-07 11:13:03 -06:00
John MacFarlane
5ea21760d9 MediaWiki writer: Minor renaming of 'st' prefixed names. 2014-07-04 18:56:11 -06:00
Matej Kollar
0bda602fcb Minor restructuring. 2014-07-04 23:48:58 +02:00
Matej Kollar
d2c81346e7 Move more things to Reader. 2014-07-04 23:42:10 +02:00
Matej Kollar
0bc900e36a HLint suggestions. 2014-07-04 23:25:44 +02:00
Clare Macrae
0c6f06b8a4 DokuWiki writer: Span no longer swallows text 2014-07-02 22:40:34 +01:00
Jesse Rosenthal
d77ccbba63 Docx Reader: Write LaTeX based on equations in Word.
This is a first stab at writing out equations in LaTeX based on
OMML equations in Word. There are some glitches: Unicode chars not known to
LaTeX are silently skipped, and functions (such as `\oiiint`) not in the
standard LaTeX packages are inserted, which can lead to PDF compilation
errors (depending, of course, on your preamble).

Adding, for example, `\usepackage[charter]{mathdesign}` to the preamble will
allow you to use most of the more esoteric functions.
2014-07-02 16:54:33 -04:00
Jesse Rosenthal
9f4bacf86f Docx Reader: Add new file, TexChar.
This will allow us to deal with Unicode characters from Word equations. This
part of the process will need continued improvement.
2014-07-02 16:53:28 -04:00
Jesse Rosenthal
2bc0c77791 Docx Reader: Parse omml equations. 2014-07-02 16:52:39 -04:00
Clare Macrae
3faf31678e DokuWiki writer: Remove todos that I have already done. 2014-07-02 21:44:05 +01:00
Clare Macrae
61cc983bea DokuWiki writer: Retain unknown RawBlock and RawInline text
This added `\cite` and `\begin` LaTeX to the test suite output.
2014-07-02 21:40:12 +01:00
Clare Macrae
d234157d25 DokuWiki output: Implement blockquotes properly
TODO Also implement nested blockquotes.
2014-07-02 21:26:24 +01:00
Clare Macrae
92a962ba63 DokuWiki writer: remove unused code 2014-07-02 21:01:33 +01:00
Matej Kollar
05b94c31ae Use Reader.
This avoids passing opts around explicitly (we do
not use it very much at the moment anyway).
2014-07-02 06:36:31 +02:00
Clare Macrae
3cb76d9560 Merge branch 'master' of git://github.com/jgm/pandoc into dokuwiki 2014-07-01 22:10:08 +01:00
Clare Macrae
244c4eee74 Remove stray <div> and </div> from DokuWiki output (#386) 2014-07-01 21:42:21 +01:00
Clare Macrae
0727579167 Improved HTML Blocks in DokuWiki output (#386)
For example, this fixes the display of a broken table, and
it also fixes the various HTML horizontal rules.
2014-07-01 21:21:09 +01:00
Clare Macrae
a049a60129 Disable warnings about unused parameters. 2014-06-30 22:07:17 +01:00
Jesse Rosenthal
0abfd386a4 Docx reader: clean up parStyle processing.
This gets rid of `divAttrToContainers`: an internal convenience function
which had become pretty inconvenient. Rather than converting classes and
indentations to string lists and back, we deal with the `pPr` attribute
directly.
2014-06-30 11:19:06 -04:00
John MacFarlane
3fbbafd391 Rewrote normalize for efficiency. (Closes #1385.)
* Added normalizeInlines, normalizeBlocks.
* Type signature is now narrower, `Pandoc -> Pandoc` instead of
  `Data a => a -> a`.  Some users may need to change their uses of
  `normalize` to the newly exported `normalizeInlines` or
  `normalizeBlocks`.
2014-06-29 23:05:08 -07:00
John MacFarlane
aad618d9db Merge pull request #1386 from jkr/hanging_indent
Fix hanging indent behavior
2014-06-29 21:31:44 -07:00
Jesse Rosenthal
0f59196e0e Docx reader: Make use of new ParIndentation info.
Here, when hanging indents are greater than or equal to left indents, we
don't set it to block quote. Such indents are frequently used in
academic bibliographies. (Thanks to Caleb McDaniel.)
2014-06-29 23:20:10 -04:00
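
The indentation rule described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the reader's actual code: the field names `leftParIndent` and `hangingParIndent`, the function `isBlockQuoteIndent`, and the requirement that the left indent be positive are all assumptions made for the example.

```haskell
-- Hypothetical stand-in for the parser's ParIndentation type. The field
-- names are assumptions for this sketch, not pandoc's actual definition.
data ParIndentation = ParIndentation
  { leftParIndent    :: Maybe Integer
  , hangingParIndent :: Maybe Integer
  } deriving (Show, Eq)

-- Treat a paragraph as a block quote only when it has a positive left
-- indent that is strictly greater than its hanging indent. A hanging
-- indent greater than or equal to the left indent (common in academic
-- bibliographies) is therefore not a block quote.
isBlockQuoteIndent :: ParIndentation -> Bool
isBlockQuoteIndent (ParIndentation (Just left) (Just hang)) =
  left > 0 && left > hang
isBlockQuoteIndent (ParIndentation (Just left) Nothing) = left > 0
isBlockQuoteIndent _ = False
```

Under these assumptions, a bibliography entry with equal left and hanging indents stays an ordinary paragraph, while a paragraph with only a left indent is still read as a block quote.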
Jesse Rosenthal
c0fcc8a789 Docx reader: Add ParIndentation type to parser.
This lets us keep more information about the indentation, and act
accordingly in the reader.
2014-06-29 23:19:56 -04:00
John MacFarlane
45d26fd01a Merge pull request #1383 from jkr/writer-bookmark-fix
Docx writer: insert bookmark tags inside <w:p> tag.
2014-06-29 20:01:51 -07:00
Jesse Rosenthal
0587334bc0 Docx writer: insert bookmark tags inside <w:p> tag.
This makes the header anchors in pandoc-generated OOXML match those
generated by Word.
2014-06-29 16:38:51 -04:00
Clare Macrae
e35738a3cc Updated Copyright year, for consistency with MediaWiki.hs 2014-06-29 21:21:00 +01:00
Clare Macrae
fdbf52b1cc Updated DokuWiki code and tests to work with latest code from jgm.
The new code was derived by inspecting changes in MediaWiki.hs.

This slightly changes the output of Div blocks, but I'm not
convinced the original behaviour was really correct anyway.

The code for handling Span does nothing for now, until I can
work out the desired behaviour, and add tests for it.
2014-06-29 21:15:17 +01:00
Clare Macrae
717e16660d Merge remote-tracking branch 'jgm/master' into dokuwiki 2014-06-29 19:22:31 +01:00
Jesse Rosenthal
b1eba3c65c Docx Reader: Update state properly
Previously, a fresh state was created for the purpose of updating. In
the future, when there is more than one field in the state, this
obviously won't work.
2014-06-29 08:14:36 -04:00
Jesse Rosenthal
c0a8d5ac72 Docx Reader: All headers get auto id.
Previously, only those with an anchor got an auto id. Now, all do, which
puts it in line with pandoc's markdown extension.
2014-06-28 17:47:00 -04:00
Jesse Rosenthal
dce360e1e6 Docx Reader: Introduce link rewriting. 2014-06-28 04:00:16 -04:00
Jesse Rosenthal
b89a3ba2b1 make makeHeaderAnchors make an auto id
Record relationship between original id and auto id, so we can fix links
after.
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
5969baf5b9 Rewrote header generation.
In preparation for auto ids.
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
1de8d4d087 Docx Reader: Simplify makeHeaderAnchors
Use a pattern guard, in preparation for doing some more complicated
work with it (recording header anchors, so we can change them to auto
ids).
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
ab76bbebbe Docx Reader: Clean up guards
Use PatternGuards to get rid of need for `isJust`, `fromJust`
altogether.
2014-06-28 04:00:16 -04:00
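
The kind of cleanup the commit above describes might look like the sketch below. The functions `describeOld` and `describeNew` and the attribute-list lookup are hypothetical, chosen only to show the general shape of replacing `isJust`/`fromJust` with a pattern guard. They are not taken from the Docx reader.

```haskell
{-# LANGUAGE PatternGuards #-}
import Data.Maybe (fromJust, isJust)

-- Before: test with isJust, then unwrap with the partial fromJust,
-- which repeats the lookup and relies on the guard for safety.
describeOld :: [(String, String)] -> String -> String
describeOld attrs key
  | isJust (lookup key attrs) = "found " ++ fromJust (lookup key attrs)
  | otherwise                 = "missing"

-- After: a pattern guard tests and binds in one step, so neither
-- isJust nor fromJust is needed.
describeNew :: [(String, String)] -> String -> String
describeNew attrs key
  | Just v <- lookup key attrs = "found " ++ v
  | otherwise                  = "missing"
```

Both versions behave the same on every input. The pattern-guard form simply removes the partial function and the duplicated lookup.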
Jesse Rosenthal
db187348cd Docx rdr: Avoid mapping makeHeaderAnchors globally
It only applies to headers, so we can just apply it when we make a
header.
2014-06-28 04:00:16 -04:00
Jesse Rosenthal
4248f25152 Move Docx reader to DocxContext monad
This is a ReaderT State stack, which keeps track of some environment info, such
as the options and the docx doc. The state will come in handy in the future,
for a couple of planned features (rewriting the section anchors as auto_idents,
and hopefully smart-quoting).
2014-06-28 04:00:16 -04:00
John MacFarlane
b2127311cb Require haddock-library >= 1.1 and simplify haddock reader code.
See #1346.
2014-06-26 12:35:13 -07:00
John MacFarlane
9f694619cd Merge pull request #1374 from jkr/track-changes-options
Track changes with options
2014-06-25 23:51:16 -07:00
Jesse Rosenthal
2396be6f57 Docx reader: Code cleanup in parse.
Remove some redundant ways of dealing with Maybe.
2014-06-25 17:12:03 -04:00
Jesse Rosenthal
0e9bf37f64 Docx reader: Make use of track-changes option. 2014-06-25 14:17:20 -04:00
Jesse Rosenthal
d824f89fb3 Add TrackChanges to Options export. 2014-06-25 14:05:21 -04:00
Jesse Rosenthal
6ff84b5e8d Add reader option for track changes. 2014-06-25 13:57:56 -04:00