Commit graph

59 commits

Author SHA1 Message Date
John MacFarlane
e1a9a61774 Docx writer: Implemented csl flipflopping spans. 2013-11-23 14:52:14 -08:00
John MacFarlane
3d453f096c Docx writer: Use mime type info returned by fetchItem. 2013-11-19 13:16:31 -08:00
John MacFarlane
2efd0951d3 Docx writer: fixed core metadata.
- Don't create empty date nodes if no date given.
- Don't create multiple dc:creator nodes; instead separate by
  semicolons.

Closes #1046.
2013-11-07 08:48:59 -08:00
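
The two rules in this commit can be sketched roughly as follows. This is a Python illustration, not pandoc's Haskell code: the helper name `core_properties`, the exact separator string, and the choice of `dcterms:created`/`dcterms:modified` as the date nodes are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Use the conventional prefixes when the tree is serialized.
for prefix, uri in (("dc", DC), ("dcterms", DCTERMS), ("cp", CP), ("xsi", XSI)):
    ET.register_namespace(prefix, uri)


def core_properties(title, authors, date=None):
    """Build a docProps/core.xml tree (hypothetical helper)."""
    root = ET.Element(f"{{{CP}}}coreProperties")
    ET.SubElement(root, f"{{{DC}}}title").text = title
    # A single dc:creator node; multiple authors are separated by semicolons.
    ET.SubElement(root, f"{{{DC}}}creator").text = ";".join(authors)
    # Date nodes are created only when a date was actually given.
    if date:
        for name in ("created", "modified"):
            node = ET.SubElement(root, f"{{{DCTERMS}}}{name}")
            node.set(f"{{{XSI}}}type", "dcterms:W3CDTF")
            node.text = date
    return root
```
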
John MacFarlane
5b99112f22 Docx writer: Fix URL for core-properties in _rels/.rels.
Partially addresses #1046.
2013-11-06 19:18:47 -08:00
John MacFarlane
0d95c15e83 TexMath: Export readTeXMath', which attends to display/inline.
Deprecate readTeXMath, and use readTeXMath' in all the writers.
Require texmath >= 0.6.5.
2013-11-01 14:28:24 -07:00
John MacFarlane
d27e5a6ff0 DOCX writer: Add missing settings.xml to the zip container.
Closes #990.
2013-09-19 09:48:02 -07:00
John MacFarlane
e279175ea5 Options: Changed writerSourceDir to writerSourceURL (now a Maybe).
Previously we stored the directory of the first input file,
even if it was local, and used it as the base directory for
finding images in ODT, EPUB, Docx, and PDF.

This has been confusing to many users.  It seems better to look for
images relative to the current working directory, even if the first
file argument is in another directory.

writerSourceURL is set to 'Just url' when the first command-line
argument is an absolute URL.  (So, relative links will be resolved
in relation to the first page.)  Otherwise, 'Nothing'.

The ODT, EPUB, Docx, and PDF writers have been modified accordingly.

Note that this change may break some existing workflows.  If you
have been assuming that relative links will be interpreted relative
to the directory of the first file argument, you'll need to
make that the current directory before running pandoc.

Closes #942.
2013-08-11 15:58:09 -07:00
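
A rough Python sketch of the behaviour described above, with `None` standing in for `Nothing`. The function names and the decision to treat only `http`/`https` arguments as absolute URLs are assumptions for illustration; this is not the code in `Text.Pandoc.Options`.

```python
from typing import Optional
from urllib.parse import urljoin, urlparse


def source_url(first_arg: str) -> Optional[str]:
    """Return the first argument if it is an absolute URL, else None."""
    parsed = urlparse(first_arg)
    # Assumption: only http(s) arguments count as absolute URLs here.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return first_arg
    return None


def resolve(link: str, base: Optional[str]) -> str:
    """Resolve a relative link against the source URL, if there is one."""
    # With no source URL the link is left alone, so a local path is
    # interpreted relative to the current working directory.
    return urljoin(base, link) if base else link
```

With these definitions, `resolve("img/a.png", source_url("chapters/intro.md"))` leaves the path untouched, which matches the note that images are now looked up relative to the working directory rather than the directory of the first file.
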
John MacFarlane
02a125d0aa Use walk, walkM in place of bottomUp, bottomUpM when possible.
They are significantly faster.
2013-08-10 18:45:00 -07:00
John MacFarlane
cbfa932106 Adjustments for new Format newtype. 2013-08-10 17:24:54 -07:00
John MacFarlane
e9de0f0e22 Preliminary support for new Div and Span elements in writers.
Currently these are "transparent" containers, except in HTML,
where they produce div and span elements with attributes.
2013-08-08 23:14:12 -07:00
John MacFarlane
802dc9a8b9 Added Text.Pandoc.Compat.Monoid.
This allows pandoc to compile with base < 4.5, where Data.Monoid
doesn't export `<>`.  Thanks to Dirk Ullirch for the patch.
2013-08-08 10:41:39 -07:00
John MacFarlane
7c980f39bf Improved fetching of external resources.
* In Shared, openURL and fetchItem now return an Either, for
  better error handling. (API change.)
* Better error message when fetching a URL fails with
  `--self-contained`.
* EPUB writer: If resource not found, skip it, as in Docx writer.
* Closes #916.
2013-07-18 20:58:14 -07:00
John MacFarlane
b2385d0e9b Text.Pandoc.ImageSize: Handle EPS.
Closes #903.  This change will make EPS images properly sized
on conversion to Word.
2013-07-16 22:04:59 -07:00
John MacFarlane
f42095b7b7 Docx writer: Make --no-highlight work properly. 2013-07-13 13:48:50 -07:00
John MacFarlane
bd1079e48e Docx writer: Ignore most components of reference.docx.
We take the word/styles.xml, docProps/app.xml, word/theme/theme1.xml,
and word/fontTable.xml from reference.docx, ignoring everything else.

Perhaps this will help with the corruption problems caused when
different versions of Word resave the reference.docx and
reorganize things.
2013-07-12 20:58:15 +01:00
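
The filtering step can be pictured with the Python sketch below. The four paths are the ones named in the commit message; the helper name `kept_entries` and the in-memory handling are assumptions, and the real writer is Haskell.

```python
import io
import zipfile

# The only reference.docx components that are reused, per the commit message.
KEPT = {
    "word/styles.xml",
    "docProps/app.xml",
    "word/theme/theme1.xml",
    "word/fontTable.xml",
}


def kept_entries(reference_docx: bytes) -> dict:
    """Return {path: contents} for the reused components, ignoring the rest."""
    with zipfile.ZipFile(io.BytesIO(reference_docx)) as zf:
        return {name: zf.read(name) for name in zf.namelist() if name in KEPT}
```
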
John MacFarlane
e7b5f2deb5 Docx writer: Use w:br with w:type 'textWrapping' for linebreaks.
Previously we used w:cr.

I don't see a difference between these in my version of Word,
but apparently some do.  Closes #873.
2013-07-04 16:37:44 -07:00
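
For reference, the element this commit switches to looks like the output of the following Python sketch. The helper name `line_break_run` is hypothetical, and wrapping the break in its own `w:r` run is an assumption about the surrounding markup.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def line_break_run():
    """Build <w:r><w:br w:type="textWrapping"/></w:r>."""
    run = ET.Element(f"{{{W}}}r")
    ET.SubElement(run, f"{{{W}}}br", {f"{{{W}}}type": "textWrapping"})
    return run
```
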
John MacFarlane
f869f7e08d Use new flexible metadata type.
* Depend on pandoc 1.12.
* Added yaml dependency.
* `Text.Pandoc.XML`: Removed `stripTags`.  (API change.)
* `Text.Pandoc.Shared`:  Added `metaToJSON`.
  This will be used in writers to create a JSON object for use
  in the templates from the pandoc metadata.
* Revised readers and writers to use the new Meta type.
* `Text.Pandoc.Options`: Added `Ext_yaml_title_block`.
* Markdown reader:  Added support for YAML metadata block.
  Note that it must come at the beginning of the document.
* `Text.Pandoc.Parsing.ParserState`:  Replace `stateTitle`,
  `stateAuthors`, `stateDate` with `stateMeta`.
* RST reader:  Improved metadata.
  Treat initial field list as metadata when standalone specified.
  Previously ALL fields "title", "author", "date" in field lists
  were treated as metadata, even if not at the beginning.
  Use `subtitle` metadata field for subtitle.
* `Text.Pandoc.Templates`:  Export `renderTemplate'` that takes a string
  instead of a compiled template.
* OPML template:  Use 'for' loop for authors.
* Org template: '#+TITLE:' is inserted before the title.
  Previously the writer did this.
2013-06-24 20:29:41 -07:00
John MacFarlane
72020f1773 Docx writer: Use Compact style for Plain block elements.
This differentiates between tight and loose lists.
Closes #775.
2013-03-30 22:11:00 -07:00
John MacFarlane
d596b0db83 Docx writer: Fixed rendering of display math in lists.
In 1.11 and 1.11.1, display math in lists rendered as a new list
item.  Now it always appears centered, just as outside of lists,
and in proper display math style, no matter how far indented the
containing list item is.

Closes #784.
2013-03-18 19:31:48 -07:00
John MacFarlane
6cfd2e8fa9 Docx writer: Better treatment of display math.
Display math inside a paragraph is now put in a separate
paragraph, so it will render properly (centered and without
extra blank lines around it).

Partially addresses #742.
2013-02-26 22:59:21 -08:00
John MacFarlane
3fca434737 Changed style names in reference docx.
FootnoteReference -> FootnoteRef.
Hyperlink -> Link.

Why?  Because the old names got changed by Word when the
reference.docx was edited.  I don't understand why, but this
fixes things.

Closes #414.
2013-02-26 22:01:47 -08:00
John MacFarlane
caed0df4a7 Docx writer: Create content types and document rels from scratch.
This fixes problems that arise when you edit the reference.docx
with Word.  Word tends to remove things from the `[Content_Types].xml`
and `word/_rels/document.xml.rels` files that are needed (e.g.
references to the `footnotes.xml` file and image default mime types).
So we regenerate these completely rather than taking them from
the `reference.docx`.

We also now encode mime types for each individual image rather
than using defaults.  This should allow us to handle a wider
range of image types.

This mostly addresses #414.  The only remaining issue I can see
is the issue of style IDs, which Word inexplicably changes in
some cases when the reference.docx is saved.  E.g.
`FootnoteReference` becomes `FootnoteReference1`.
2013-02-26 20:31:32 -08:00
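
One way to picture "encode mime types for each individual image" is an `Override` entry per image part in `[Content_Types].xml`, as in this Python sketch. The extension table, the fallback type, and the helper name `content_types` are assumptions; the sketch covers only image parts, not the other entries the writer regenerates.

```python
import os
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Small illustrative table; a real writer would know more types.
MIME_BY_EXT = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".pdf": "application/pdf",
}


def content_types(image_parts):
    """Build a Types element with one Override per image part."""
    root = ET.Element(f"{{{CT_NS}}}Types")
    for part in image_parts:
        ext = os.path.splitext(part)[1].lower()
        mime = MIME_BY_EXT.get(ext, "application/octet-stream")
        # Each image gets its own entry instead of relying on a
        # per-extension Default that Word may have removed.
        ET.SubElement(root, f"{{{CT_NS}}}Override",
                      PartName="/" + part, ContentType=mime)
    return root
```
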
John MacFarlane
c46eac5aea Refactoring in Docx writer. 2013-02-25 19:04:20 -08:00
John MacFarlane
cae409725f Docx writer: Handle PDF images. 2013-02-23 23:04:42 -08:00
John MacFarlane
7bc37e4414 Use 'fig:' instead of '\SOH' in title to indicate figure.
Revises 1a4b47e933
2013-01-15 08:46:09 -08:00
John MacFarlane
1a4b47e933 Implemented Ext_implicit_figures.
* In markdown reader, add a '\1' character to the beginning
  of the title of an image that is alone in its paragraph,
  if implicit_figures extension is selected.
* In writers, check for Para [Image alt (src,'\1':tit)] and treat
  it as a figure if possible.
* Updated tests.

This is a bit of a hack, but it allows us to make implicit_figures
an extension of the markdown reader, rather than the writers.
2013-01-14 20:53:08 -08:00
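
The marker trick can be sketched in Python as below. The function names are hypothetical and the real implementation is in pandoc's Haskell reader and writers. The marker is the `'\1'` character from this commit; commit 7bc37e4414 later replaced it with the prefix `fig:`.

```python
FIGURE_MARKER = "\x01"  # the '\1' (SOH) character used by this commit


def mark_title(title, alone_in_paragraph, implicit_figures=True):
    """Reader side: tag the title of an image that is alone in its paragraph."""
    if alone_in_paragraph and implicit_figures:
        return FIGURE_MARKER + title
    return title


def split_figure_title(title):
    """Writer side: return (is_figure, title_without_marker)."""
    if title.startswith(FIGURE_MARKER):
        return True, title[len(FIGURE_MARKER):]
    return False, title
```

Because the flag travels inside the title string, the document model itself does not have to change, which is why the commit can keep `implicit_figures` a reader extension.
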
John MacFarlane
449ddeb53b Refactoring:
* Shared now exports fetchItem (instead of getItem) and openURL
* fetchItem has different parameters than getItem and includes
  some logic formerly in the ODT and Docx writers
* getItem still used in SelfContained
2013-01-11 16:19:06 -08:00
John MacFarlane
8f7beb6d10 ODT, Docx writers: Properly handle URL refs for images.
These images are now downloaded instead of being ignored (as
used to happen in the docx writer) or causing an error (as
used to happen in the odt writer).
2013-01-11 15:45:19 -08:00
John MacFarlane
2a0ed1c433 Improvements to docx writer.
Avoid reading image files again when we've already processed them.
2013-01-11 13:41:17 -08:00
John MacFarlane
4e4c3537e0 Docx writer: Preliminary improvements.
* Use getItem to fetch images, so we can get them over the net
  if they have absolute URLs.
* Added TODO notes for cleaning up the logic.
2013-01-11 12:17:41 -08:00
John MacFarlane
d599c4cdab Added Attr field to Header.
Previously header ids were autogenerated by the writers.
Now they are generated (unless supplied explicitly) in the
markdown parser, if the `header_identifiers` extension is
selected.

In addition, the textile reader now supports id attributes on
headers.
2013-01-09 09:30:05 -08:00
John MacFarlane
4d1c82de9e Docx writer: Use rIdNN identifiers for r:embed in images. 2013-01-06 19:07:35 -08:00
John MacFarlane
f779411fe2 Docx writer: Use separate footnotes.xml for notes.
This seems to help LibreOffice convert the file, even though
it was valid docx before.

Note that the references in notes must be in
word/_rels/footnotes.xml.rels.  We handle this now by simply
making that file contain all the references in
word/_rels/document.xml.rels.  Something better could be done
eventually, but this works.

Closes #637.
2013-01-06 12:26:44 -08:00
John MacFarlane
1864bb0994 Data files changes.
* Added `embed_data_files` flag.  (not yet used)
* Shared no longer exports `findDataFile`.
* `readDataFile` now returns a strict bytestring.
* Shared now exports `readDataFileUTF8` which returns a string like
  the old `readDataFile`.
* Rewrote modules to use new data file functions and to avoid
  using functions from Paths_pandoc directly.
2012-12-29 17:54:07 -08:00
John MacFarlane
3f86127f5a Docx writer: Added nsid to abstractNum elements.
This helps when merging word documents with numbered or bulleted lists.
Closes #627.
2012-10-02 19:43:18 -07:00
John MacFarlane
02bb0f051a Use integer ids for bookmarks.
Closes #626.
2012-10-02 19:20:51 -07:00
John MacFarlane
6ad7ac1239 Removed need for utf8-string package.
* Depend on text.
* Expose Text.Pandoc.UTF8.
* Text.Pandoc.UTF8 now exports toString, fromString,
  toStringLazy, fromStringLazy.
* These are used instead of the old utf8-string functions.
2012-09-25 19:54:21 -07:00
John MacFarlane
12045d84b6 Revert "More intelligent handling of text encodings."
This reverts commit 7272735b3d.
2012-09-23 22:53:34 -07:00
John MacFarlane
7272735b3d More intelligent handling of text encodings.
Previously, UTF-8 was enforced for both input and output.

The new system:

* For input, UTF-8 is tried first; if an error is raised, the
  locale encoding is tried.
* For output, the locale encoding is always used.
2012-09-23 22:12:21 -07:00
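
The input half of the scheme described here (which commit 12045d84b6 reverted) amounts to a decode-with-fallback, sketched in Python below. The function name and the `fallback` parameter are assumptions added so the example can be exercised without depending on the machine's locale.

```python
import locale


def decode_input(raw: bytes, fallback=None) -> str:
    """Try UTF-8 first; on a decoding error, fall back to the locale encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # `fallback` is a hypothetical override; by default the locale's
        # preferred encoding is used, as the commit message describes.
        return raw.decode(fallback or locale.getpreferredencoding(False))
```
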
John MacFarlane
6f0b465173 Docx writer: Fixed bug with nested lists.
Previously a list like

    1. one
        - a
        - b
    2. two

would come out with a bullet instead of "2."
Thanks to Russell Allen for reporting the bug.
2012-09-05 16:24:37 -07:00
John MacFarlane
fd616665ac Docx line breaks: Use w:cr in w:r instead of w:br.
This seems to fix a problem viewing pandoc-generated
docx files in LibreOffice.
2012-08-17 18:27:48 -07:00
John MacFarlane
00dc1e715e Moved WriterOptions and associated types Shared -> Options. 2012-07-26 22:59:56 -07:00
John MacFarlane
999edd9608 Changed signatures of writeODT, writeDocx, writeEPUB.
These now take WriterOptions and Pandoc only.
The extra parameters for epub stylesheet, epub fonts,
reference Docx, and reference ODT have been removed, since
these things are now in WriterOptions.

Note:  breaking API change!
2012-07-24 09:56:00 -07:00
John MacFarlane
5669dc2a47 Simplified bullet characters so they work with Word 2007.
Closes #520.
2012-06-01 18:44:00 -07:00
John MacFarlane
e6e63d1699 Docx writer: Fixed error message when style file can't be parsed. 2012-04-21 09:27:53 -07:00
John MacFarlane
66f8dc14b7 Docx writer: Fixed multi-paragraph list items.
Previously they each got a list marker.
Closes #457.
2012-04-07 17:08:52 -07:00
John MacFarlane
7376d26c36 Add TableNormal style to tables.
Needs testing with Word.
2012-02-14 17:41:11 -08:00
John MacFarlane
5cfec4b922 Fix _rels/.rels if it has been screwed up by Word.
Closes #414.

Previously, if you edited the reference.docx with Word, then
created a new docx using the edited reference.docx, Word would complain
about the file being corrupt.  The problem seems to be that Word
changes _rels/.rels, changing the Type of the Relationship to
docProps/core.xml from
"http://schemas.openxmlformats.org/officedocument/2006/relationships/metadata/core-properties"
to
"http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties".
As far as I can see, this is a bug in Word, since the latter is not
valid.  (See
http://idippedut.dk/post/2010/04/22/Correct-according-to-spec-or-implementation.aspx.)

This change simply does a global replace on _rels/.rels that reverts
the change Word makes.  And now producing docx files with Word-modified
reference.docx seems to work.
2012-02-11 09:12:58 -08:00
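
The "global replace" amounts to a plain string substitution, as in this Python sketch. The two URLs are copied verbatim from the commit message above; the function name `fix_rels` is hypothetical and the actual fix lives in the Haskell writer.

```python
# Relationship Type written by Word, as quoted in the commit message.
WORD_TYPE = ("http://schemas.openxmlformats.org/package/2006/"
             "relationships/metadata/core-properties")
# The Type the commit restores, as quoted in the commit message.
RESTORED_TYPE = ("http://schemas.openxmlformats.org/officedocument/2006/"
                 "relationships/metadata/core-properties")


def fix_rels(rels_xml: str) -> str:
    """Revert Word's change to the core-properties relationship Type."""
    return rels_xml.replace(WORD_TYPE, RESTORED_TYPE)
```
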
John MacFarlane
f437827b0c Remove dependency on old-time. 2012-01-28 16:04:35 -08:00
John MacFarlane
3a0b3df007 Put date in YYYY-MM-DD format if possible for HTML, docx metadata.
Added normalizeDate to Text.Pandoc.Shared.
2012-01-28 15:54:34 -08:00
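
A minimal Python sketch of the "if possible" normalization: try a few known formats and emit YYYY-MM-DD when one matches. The list of accepted formats is an assumption made for the example, not the set that `normalizeDate` actually recognizes, and month names are parsed according to the current locale.

```python
from datetime import datetime

# Illustrative input formats only.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None
```

A caller that gets `None` back would keep the original date string unchanged.
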