Commit graph

58 commits

Author SHA1 Message Date
John MacFarlane
f3a80034ff Removed writerSourceURL, add source URL to common state.
Removed `writerSourceURL` from `WriterOptions` (API change).
Added `stSourceURL` to `CommonState`.
It is set automatically by `setInputFiles`.

Text.Pandoc.Class now exports `setInputFiles`, `setOutputFile`.

The type of `getInputFiles` has changed; it now returns `[FilePath]`
instead of `Maybe [FilePath]`.

Functions in Class that formerly took the source URL as a parameter
now have one fewer parameter (`fetchItem`, `downloadOrRead`,
`setMediaResource`, `fillMediaBag`).

Removed `WriterOptions` parameter from `makeSelfContained` in
`SelfContained`.
2017-09-30 16:11:20 -05:00
Albert Krewinkel
5debb0da0f Shared: Provide custom isURI that rejects unknown schemes [isURI]
We also export the set of known `schemes`.

The new function replaces the function of the same name
from `Network.URI`, as the latter did not check whether a scheme is
well-known.  E.g. MediaWiki wikis frequently feature pages with names
like `User:John`. These links were interpreted as URIs, thus turning
internal links into global links. This is prevented by also checking
whether the scheme of a URI is frequently used (i.e. is IANA registered
or an otherwise well-known scheme).

Fixes: #2713

Update set of well-known URIs from IANA list
All official IANA schemes (as of 2017-05-22) are included in the set of
known schemes.  The four non-official schemes doi, isbn, javascript, and
pmid are kept.
2017-05-23 09:48:11 +02:00
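
The scheme check described in 5debb0da0f can be sketched roughly as follows. This is a minimal illustration, not pandoc's code: `knownSchemes` is a short, hypothetical stand-in for the full IANA-derived set, `isKnownURI` and `schemeOf` are made-up names, and the sketch only inspects the scheme rather than validating the whole URI.

```haskell
import Data.Char (isAlpha, isAlphaNum, toLower)

-- Hypothetical, heavily abbreviated list; the commit describes a set
-- covering all IANA schemes plus doi, isbn, javascript and pmid.
knownSchemes :: [String]
knownSchemes =
  [ "http", "https", "ftp", "mailto", "file", "data"
  , "doi", "isbn", "javascript", "pmid" ]

-- Extract a syntactically plausible scheme (the text before the first ':'),
-- lower-cased so that the lookup is case-insensitive.
schemeOf :: String -> Maybe String
schemeOf s =
  case break (== ':') s of
    (sch@(c:cs), ':' : _)
      | isAlpha c && all isSchemeChar cs -> Just (map toLower sch)
    _ -> Nothing
  where
    isSchemeChar ch = isAlphaNum ch || ch `elem` "+-."

-- Accept a string only if its scheme is in the known list.
isKnownURI :: String -> Bool
isKnownURI s = maybe False (`elem` knownSchemes) (schemeOf s)

main :: IO ()
main = do
  print (isKnownURI "https://pandoc.org")  -- True
  print (isKnownURI "User:John")           -- False: "user" is not a known scheme
```

With a check of this kind, a MediaWiki page name such as `User:John` is no longer mistaken for a URI, so it stays an internal link.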
John MacFarlane
93eaf33e6e SelfContained: handle @import with quoted string. 2017-05-20 17:32:46 +02:00
John MacFarlane
8d4fbe6a2a SelfContained: fixed problem with embedded fonts.
Closes #3629.

However, there is still room for improvement.

`@import` with following media declaration is not
handled.

Also `@import` with a simple filename (rather than
`url(...)`) is not handled.
2017-05-20 17:09:47 +02:00
Albert Krewinkel
965f1ddd4a Update dates in copyright notices
This follows the suggestions given by the FSF for GPL licensed software.
<https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
2017-05-13 23:30:13 +02:00
John MacFarlane
9f0a80457f Revert "SelfContained: special handling for css @import."
This reverts commit 89b3fcc8e0.
2017-05-05 23:23:49 +02:00
John MacFarlane
89b3fcc8e0 SelfContained: special handling for css @import.
We now avoid creating a data URI for the url under an
@import.
2017-05-05 23:03:31 +02:00
John MacFarlane
c1b45adda0 SelfContained: Handle url() inside material retrieved from url().
This can happen e.g. with an @import of a google web font.
(What is imported is some CSS which contains a URL reference
to the font itself.)

Also, allow unescaped pipe (|) in URL.

This is intended to help with #3629, but it doesn't seem to
work.
2017-05-05 17:03:27 +02:00
John MacFarlane
2f19b5daac SelfContained: export makeDataURI 2017-03-30 16:43:12 +02:00
John MacFarlane
e256c8ce17 Stylish-haskell automatic formatting changes. 2017-03-04 13:03:41 +01:00
John MacFarlane
377c27befe --self-contained: don't incorporate elements with data-external="1".
You can leave an external link as it is by adding the attribute
data-external="1" to the element.  Pandoc will then not try to
incorporate its content when `--self-contained` is used.  This is
similar to a feature already supported by the EPUB writer.

Closes #2656.
2017-02-26 22:48:02 +01:00
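
The opt-out described in 377c27befe amounts to an attribute test on each element. A minimal sketch, assuming attributes are available as name/value pairs (the `Attribute` type and `isExternal` are hypothetical names, not pandoc's):

```haskell
-- Hypothetical representation of a tag's attributes as name/value pairs.
type Attribute = (String, String)

-- True when the element is marked to be left alone as an external link.
isExternal :: [Attribute] -> Bool
isExternal attrs = lookup "data-external" attrs == Just "1"

main :: IO ()
main = do
  print (isExternal [("src", "logo.png"), ("data-external", "1")])  -- True
  print (isExternal [("src", "logo.png")])                          -- False
```

Elements for which the test holds would be passed through unchanged; all others remain candidates for embedding.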
John MacFarlane
ce647d1cd8 Some fixes to the preceding revisions in SelfContained.
Make sure we don't duplicate end tags for script or link.
2017-02-24 13:11:29 +01:00
John MacFarlane
7c0a80c323 SelfContained: don't use data URIs for script or style.
Instead, just use script or style tags with the content inside.
The old method with data URIs prevents certain optimizations
outside pandoc.

Exception: data URIs are still used when a script contains
`</script>` or a style contains `</`.

Closes #3423.

Also, in MIME, use application/javascript (not
application/x-javascript).
2017-02-24 11:55:50 +01:00
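
The rule in 7c0a80c323 (inline the content unless it contains a sequence that would end the enclosing tag early) can be sketched as below. The `Embedding` type and both function names are hypothetical, and the case-sensitive match is an assumption of this sketch.

```haskell
import Data.List (isInfixOf)

-- How a fetched resource ends up in the output document.
data Embedding = InlineContent | DataURI
  deriving (Show, Eq)

-- Scripts are inlined unless they contain a closing script tag.
scriptEmbedding :: String -> Embedding
scriptEmbedding src
  | "</script>" `isInfixOf` src = DataURI
  | otherwise                   = InlineContent

-- Styles are inlined unless they contain "</".
styleEmbedding :: String -> Embedding
styleEmbedding css
  | "</" `isInfixOf` css = DataURI
  | otherwise            = InlineContent

main :: IO ()
main = do
  print (scriptEmbedding "console.log(1);")               -- InlineContent
  print (scriptEmbedding "document.write('</script>');")  -- DataURI
  print (styleEmbedding "body { margin: 0; }")            -- InlineContent
```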
John MacFarlane
0e8b19e709 Refactored getData from getDataURI in SelfContained. 2017-02-24 11:27:52 +01:00
John MacFarlane
2bbf98a613 Put makeSelfContained in PandocMonad instead of IO.
This removes the need to pass MediaBag around and improves
exceptions.  It also opens up the possibility of using
makeSelfContained purely.
2017-02-23 15:06:25 +01:00
John MacFarlane
612f1238aa Use lazy loading for reveal.js slide shows.
* In HTML writer, with reveal.js we use data-src instead of src
  for images.
* In SelfContained, we also load resources from data-src.

Closes #2283.
2017-02-20 22:21:20 +01:00
John MacFarlane
a3c3694024 Removed writerMediaBag from WriterOpts.
...since this is now handled through PandocMonad.

Added an explicit MediaBag parameter to makePDF and makeSelfContained.
2017-01-25 17:07:42 +01:00
John MacFarlane
6aff97e4e1 Text.Pandoc.Shared: Removed fetchItem, fetchItem'.
Made changes where these are used, so that the version
of fetchItem from PandocMonad can be used instead.
2017-01-25 17:07:42 +01:00
John MacFarlane
cf7d7f533a SelfContained: put makeSelfContained in MonadIO. 2017-01-25 17:07:41 +01:00
John MacFarlane
499985c1a3 Updated copyright dates to include 2016. 2016-03-22 17:20:39 -07:00
John MacFarlane
23b693c029 Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."
This reverts commit c423dbb5a3.
2015-11-09 10:08:22 -08:00
John MacFarlane
c423dbb5a3 Use -XNoImplicitPrelude and 'import Prelude' explicitly.
This is needed for ghci to work with pandoc, given that we
now use a custom prelude.

Closes #2503.
2015-11-08 16:56:59 -08:00
John MacFarlane
fb1843ecde Fixed omitted url(...) in CSS data-uri with --self-contained.
Fixes #2489.
2015-10-28 10:06:40 -07:00
John MacFarlane
82b3e0ab97 Use custom Prelude to avoid compiler warnings.
- The (non-exported) prelude is in prelude/Prelude.hs.
- It exports Monoid and Applicative, like base 4.8 prelude,
  but works with older base versions.
- It exports (<>) for mappend.
- It hides 'catch' on older base versions.

This allows us to remove many imports of Data.Monoid
and Control.Applicative, and remove Text.Pandoc.Compat.Monoid.

It should allow us to use -Wall again for ghc 7.10.
2015-10-14 09:09:10 -07:00
John MacFarlane
c2ab44af84 --self-contained: Fixed overaggressive CSS minimization.
Previously `--self-contained` wiped out all spaces in CSS,
including semantically significant spaces!

Closes #2301.
Closes #2286.
2015-07-15 08:16:42 -07:00
John MacFarlane
ed9a118b54 Fixed regression in CSS parsing with --self-contained.
In 1b44acf0c5 we replaced some
hackish CSS parsing with css-text, which I thought was a complete
CSS parser.  It turns out that it is very buggy, which results
in lots of things being silently dropped from CSS when
`--self-contained` is used (#2224).

This commit replaces the use of css-text with a small but
more principled css preprocessor, which only removes whitespace
and replaces URLs with base 64 data when possible.

Closes #2224.
2015-06-28 11:54:18 -07:00
John MacFarlane
1b44acf0c5 SelfContained: properly handle data URIs in css urls.
Also use a proper css parser (adds dependency on text-css).

Closes #2129.
2015-05-04 16:00:28 -07:00
John MacFarlane
9b2f645e2a SelfContained: cssURLs no longer tries to fetch fragment URLs.
The current test is: does the URL start with a `#`?
Closes #2121.
2015-05-01 22:15:43 -07:00
John MacFarlane
1868cb5e42 Updated copyright notices to -2015. Closes #2111. 2015-04-26 10:18:29 -07:00
John MacFarlane
d5469b30fe Improved building of data URIs in SelfContained.
Now base64 is used except for 'text/*' mime types.  Closes #1940.
2015-02-13 21:37:43 -08:00
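
The choice made in d5469b30fe can be modelled as a predicate on the MIME type. This sketch covers only the decision; the commit message does not say how `text/*` content is encoded, so that part is left out. `useBase64` is a hypothetical name, and the case-insensitive comparison is an assumption.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- Base64 is used for everything except text/* MIME types.
useBase64 :: String -> Bool
useBase64 mime = not ("text/" `isPrefixOf` map toLower mime)

main :: IO ()
main = mapM_ (print . useBase64)
  [ "image/png"               -- True
  , "text/css"                -- False
  , "application/javascript"  -- True
  ]
```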
John MacFarlane
52310eb470 SelfContained: Add ;charset=utf-8 to script mime type if missing.
Closes #1842.
2014-12-31 14:51:23 -08:00
John MacFarlane
bf00556c72 Added track to list of tags treated by --self-contained.
Closes #1664.
2014-10-04 11:39:08 -07:00
Artyom Kazak
cca9e8feb4 MIME cleanup.
* Create a type synonym for MIME type (instead of `String`).
* Add `getMimeTypeDef` function.
* Avoid recreating MIME type `Map`s every time.
* Move “Formula-...” case handling into `getMimeType`.
2014-08-17 21:00:50 +04:00
John MacFarlane
842c705097 SelfContained: Fixed determining of source URL from within CSS files.
(This fixes a bug introduced a couple commits back.)
2014-08-02 16:33:22 -07:00
John MacFarlane
ce8922437d Text.Pandoc.SelfContained changes.
* makeSelfContained now takes just two arguments, WriterOptions and
  the string.
* It no longer looks in data files.  This only made sense when we
  had copies of slidy and S5 code there.
* Shared.fetchItem' is used instead of the nearly duplicate getItem.
2014-08-02 16:07:19 -07:00
John MacFarlane
6dd2418476 New module, Text.Pandoc.MediaBag.
Moved `MediaBag` definition and functions from Shared:
`lookupMedia`, `mediaDirectory`, `insertMedia`, `extractMediaBag`.
Removed `emptyMediaBag`; use `mempty` instead, since `MediaBag`
is a Monoid.
2014-07-31 12:00:21 -07:00
John MacFarlane
00662faefb Made MediaBag a newtype, and added mime type information to media.
Shared now exports functions for interacting with a MediaBag:

- `emptyMediaBag`
- `lookupMedia`
- `insertMedia`
- `mediaDirectory`
- `extractMediaBag`
2014-07-31 11:05:35 -07:00
John MacFarlane
e4913d6dba Allow --self-contained to get content from MediaBag.
Added a parameter to makeSelfContained (API change).
2014-07-30 15:26:40 -07:00
Albert Krewinkel
8fdbef841d Update copyright notices for 2014, add missing notices 2014-05-09 00:46:08 +02:00
John MacFarlane
6fda361977 SelfContained: Handle "poster" attribute in "video" tags.
Closes #1188.
2014-03-05 09:10:09 -08:00
John MacFarlane
386e933432 Use isURI instead of isAbsoluteURI.
It allows fragment identifiers.
2013-10-16 09:48:11 -07:00
John MacFarlane
7c980f39bf Improved fetching of external resources.
* In Shared, openURL and fetchItem now return an Either, for
  better error handling. (API change.)
* Better error message when fetching a URL fails with
  `--self-contained`.
* EPUB writer: If resource not found, skip it, as in Docx writer.
* Closes #916.
2013-07-18 20:58:14 -07:00
John MacFarlane
dede39452f Added comment/todo to SelfContained. 2013-04-10 10:22:00 -07:00
John MacFarlane
40f0a6dd66 SelfContained: handle src in embed, audio, source, input tags. 2013-03-26 08:45:25 -07:00
John MacFarlane
0ee54549af SelfContained: strip off fragment, query of relative URL
before treating as a filename.  This fixes `--self-contained`
when used with CSS files that include web fonts using the
method described here:

http://paulirish.com/2009/bulletproof-font-face-implementation-syntax/

Examples from reveal.js themes:

    "../../lib/font/league_gothic-webfont.eot?#iefix"
    "../../lib/font/league_gothic-webfont.svg#LeagueGothicRegular"

Closes #739.
2013-03-25 20:09:24 -07:00
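
The stripping step in 0ee54549af can be sketched as cutting a relative URL at its first `?` or `#` before using it as a file path. `stripQueryAndFragment` is a hypothetical name; the two inputs are the reveal.js examples quoted in the commit message.

```haskell
-- Drop everything from the first '?' or '#' onward, leaving a plain path.
stripQueryAndFragment :: String -> FilePath
stripQueryAndFragment = takeWhile (`notElem` "?#")

main :: IO ()
main = mapM_ (putStrLn . stripQueryAndFragment)
  [ "../../lib/font/league_gothic-webfont.eot?#iefix"
    -- prints ../../lib/font/league_gothic-webfont.eot
  , "../../lib/font/league_gothic-webfont.svg#LeagueGothicRegular"
    -- prints ../../lib/font/league_gothic-webfont.svg
  ]
```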
John MacFarlane
449ddeb53b Refactoring:
* Shared now exports fetchItem (instead of getItem) and openURL
* fetchItem has different parameters than getItem and includes
  some logic formerly in the ODT and Docx writers
* getItem still used in SelfContained
2013-01-11 16:19:06 -08:00
John MacFarlane
77d9ead1b2 Move getItem from SelfContained to Shared; export getItem. 2013-01-11 11:30:31 -08:00
John MacFarlane
1864bb0994 Data files changes.
* Added `embed_data_files` flag.  (not yet used)
* Shared no longer exports `findDataFile`.
* `readDataFile` now returns a strict bytestring.
* Shared now exports `readDataFileUTF8` which returns a string like
  the old `readDataFile`.
* Rewrote modules to use new data file functions and to avoid
  using functions from Paths_pandoc directly.
2012-12-29 17:54:07 -08:00
John MacFarlane
6ad7ac1239 Removed need for utf8-string package.
* Depend on text.
* Expose Text.Pandoc.UTF8.
* Text.Pandoc.UTF8 now exports toString, fromString,
  toStringLazy, fromStringLazy.
* These are used instead of the old utf8-string functions.
2012-09-25 19:54:21 -07:00
John MacFarlane
a6f2b96084 Moved renderTags' from HTML reader & SelfContained to Shared.
Improved removal of markdown="1" attribute in Markdown reader.
2012-08-15 09:42:16 -07:00