4077 commits

Author SHA1 Message Date
John MacFarlane
d51e475bad Export Text.Pandoc.Error in Text.Pandoc.
[API change]
2016-10-24 22:31:36 +02:00
John MacFarlane
bf72a482eb Tighten up parsing of raw email addresses.
Technically `**@user` is a valid email address, but if we
allow things like this, we get bad results in markdown flavors
that autolink raw email addresses. (See #2940.)
So we exclude a few valid email addresses in order to
avoid these more common bad cases.

Closes #2940.
2016-10-23 23:12:36 +02:00
Thomas Weißschuh
96fa1daa6e fix example in documentation (#3176)
Errors are encountered while reading, not writing
2016-10-23 21:48:22 +02:00
Mauro Bieg
9dd18ebf34 ICML writer: replace partial function (!!) in table handling (#3175) 2016-10-23 21:42:33 +02:00
John MacFarlane
696dfbc993 Added angle_brackets_escapable extension.
This is needed because github flavored Markdown has a slightly
different set of escapable symbols than original Markdown;
it includes angle brackets.

Closes #2846.
2016-10-22 23:41:55 +02:00
John MacFarlane
d2a6533d6e EPUB reader: don't add root path to data: URIs.
Closes #3150.

Thanks to @lep for the bug report and patch.
2016-10-22 23:07:15 +02:00
John MacFarlane
1da40d63b1 Merge pull request #3108 from tarleb/part
Add command line option allowing to set type of top-level divisions
2016-10-19 14:07:44 +02:00
Albert Krewinkel
595a171407
Add option for top-level division type
The `--chapters` option is replaced with `--top-level-division` which allows
users to specify the type as which top-level headers should be output. Possible
values are `section` (the default), `chapter`, or `part`.

The formats LaTeX, ConTeXt, and Docbook allow `part` as top-level division, TEI
only allows to set the `type` attribute on `div` containers.  The writers are
altered to respect this option in a sensible way.
2016-10-19 13:12:57 +02:00
Hubert Plociniczak
dd799df0ae Image with a caption needs special formatting
Latex Writer only handles captions if the image's title
is prefixed with 'fig:'.
2016-10-19 11:40:44 +02:00
John MacFarlane
4a2a7a21e5 Merge pull request #3166 from hubertp-lshift/bug/3134
Issue 3143: Don't duplicate text for anchors
2016-10-18 22:03:45 +02:00
John MacFarlane
0cd11b3e54 Merge pull request #3165 from hubertp-lshift/feature/odt-image
[odt] images parser
2016-10-18 22:00:58 +02:00
John MacFarlane
8264ae2abe Better fix for the problem with ghc 7.8. 2016-10-18 16:25:13 +02:00
John MacFarlane
3747abf029 Try to fix build error on ghc 7.8.
@tarleb this is an interesting one, see the build log in
https://travis-ci.org/jgm/pandoc/jobs/168612017

It only failed on ghc 7.8; I think this must have to do with
the change making Applicative a superclass of Monad, hence
this change.
2016-10-18 16:03:34 +02:00
Hubert Plociniczak
c74c5fdd97 Issue 3143: Don't duplicate text for anchors
When creating an anchor element we were adding its representation
as well as the original content, leading to text duplication.
2016-10-18 10:50:37 +02:00
Albert Krewinkel
1266b210ac
Org writer: drop space before footnote markers
The writer no longer adds an extra space before footnote markers.

Fixes: #3162
2016-10-17 22:11:03 +02:00
Hubert Plociniczak
4417e33ea9 Use bind function instead of pattern matching 2016-10-17 16:58:53 +02:00
Hubert Plociniczak
7234321e8f Minor refactoring 2016-10-17 16:50:03 +02:00
Hubert Plociniczak
a02f276ff1 Infer caption from the text following the img
A frame can contain other frames with text boxes.
This is something that has not been considered before
and meant that the whole construction of images was
broken in those cases. Also the captions were fixed/ignored.
2016-10-17 16:35:13 +02:00
Jesse Rosenthal
e666c92bc9 RST reader: skip whitespace before note.
RST requires a space before a footnote marker. We discard those spaces
so that footnotes will be adjacent to the text that comes before
them. This is in line with what rst2latex does. rst2html does not discard
the space, but its html output is different than pandoc's, so this seems
the most semantically correct approach.

Closes #3163
2016-10-17 09:54:59 -04:00
Albert Krewinkel
462c140eb6
Org reader: allow figure with empty caption
A `#+CAPTION` attribute before an image is enough to turn an image into a
figure. This wasn't the case because the `parseFromString` function, which
processes the caption value, would fail on empty values. Adding a newline
character to the caption value fixes this.

Fixes: #3161
2016-10-14 23:16:51 +02:00
John MacFarlane
9bd1da122a Merge pull request #3146 from hubertp-lshift/feature/odt-list-start-value
[ODT Parser] Include list's starting value
2016-10-14 15:15:50 +02:00
Hubert Plociniczak
9282fadc6b Added tests and a corner case for starting number
Review revealed that we didn't handle the case
when the starting point is an empty string. While
this is not a valid .odt file, we simply added
a special case to deal with it.

Also added tests for the new feature.
2016-10-14 13:56:24 +02:00
Jesse Rosenthal
cd1427876e Markdown writer: Abstract out note/ref function.
We do basically the same thing every time we insert notes, so let's cut
down on code duplication.
2016-10-13 11:04:35 -04:00
John MacFarlane
6d13567ac5 Allow http-client 0.4.30, which is the version in stackage lts.
Previously we required 0.5.
Remove CPP conditionals for earlier versions.
2016-10-13 13:01:49 +02:00
John MacFarlane
4a1ef0b51d Revert "Remove http-client CPP conditionals."
This reverts commit 3f82471355.

We might want to revert the requirement of http-client 0.5,
as this is not yet in Stackage and that is starting to
cause problems.  I can't recall why it is there.
2016-10-13 12:35:58 +02:00
Albert Krewinkel
3e60ed9c03
Allow empty lines when parsing line blocks
Line blocks are allowed to contain empty lines and should be parsed as a
single block in that case.  Previously an empty (line block) line would
have terminated parsing of the line block element.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
c9460e7013
Parse line-oriented markup as LineBlock
Markup-features focusing on lines as distinctive part of the markup are read
into `LineBlock` elements. This currently means line blocks in reStructuredText
and Markdown (the latter only if the `line_blocks` extension is enabled), the
`linegroup`/`line` combination from the Docbook 5.1 working draft, and Org-mode
`VERSE` blocks.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
22cb9e3327
Add support for the LineBlock element to writers
The following markup features are used to output the lines of the `LineBlock`
element:

  - AsciiDoc: a `[verse]` block,
  - ConTeXt: text surrounded by `\startlines` and `\endlines`,
  - HTML: `div` with a per-element style setting to interpret the content as
    pre-wrapped,
  - Markdown: line blocks if the `line_blocks` extension is enabled, a simple
    paragraph with hard linebreaks otherwise,
  - Org: VERSE block,
  - RST: a line block, and
  - all other formats: a paragraph, containing hard linebreaks between lines.

Custom lua writers should be updated to use the `LineBlock` element.
2016-10-13 08:46:44 +02:00
Albert Krewinkel
64b77cc2c5
Shared: add function combining lines using LineBreak
The `linesToBlock` function takes a list of lines and combines them by appending
a hard `LineBreak` to each line and concatenating the result, putting it
into a `Para`. This is most useful when converting `LineBlock`
elements.
2016-10-13 08:46:38 +02:00
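
A minimal sketch of the line-combining idea described in the commit above. The `Inline` and `Block` types here are simplified stand-ins for pandoc's AST, and `joinLines` is a hypothetical name, not the function added by the commit. It is also an assumption of this sketch that the break goes between lines rather than after the last one.

```haskell
import Data.List (intercalate)

-- Simplified stand-ins for pandoc's AST types (assumption, illustration only).
data Inline = Str String | Space | LineBreak
  deriving (Show, Eq)

data Block = Para [Inline] | LineBlock [[Inline]]
  deriving (Show, Eq)

-- Hypothetical helper: join the lines with hard line breaks and wrap the
-- result in a single paragraph.
joinLines :: [[Inline]] -> Block
joinLines = Para . intercalate [LineBreak]

-- Fallback a writer without native line-block support might use:
-- convert LineBlock elements and leave every other block untouched.
fallback :: Block -> Block
fallback (LineBlock ls) = joinLines ls
fallback other          = other
```

A writer with no native line-block syntax could apply `fallback` before rendering, which corresponds to the "paragraph containing hard linebreaks" case listed for all other formats.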
Hubert Plociniczak
edc951ee7d [ODT Parser] Include list's starting value
Previously the starting value of the lists' items has been
hardcoded to 1. In reality ODT's list style definition can
provide a new starting value in one of its attributes.

Writers already handle the modified start value so no need
to change anything in that area.
2016-10-12 18:36:40 +02:00
Hubert Plociniczak
c924611de5 Basic support for images in ODT documents
Highly influenced by the docx support, refactored
some code to avoid duplication.
2016-10-12 17:50:35 +02:00
John MacFarlane
901045b0bb Merge pull request #3159 from jkr/refs
Specify location for footnotes (and reference links) in MD output
2016-10-12 11:11:06 +02:00
Jesse Rosenthal
ca50deeeee Markdown writer: Allow footnotes/refs at the end of blocks, sections
This allows footnotes and refs to be placed at the end of blocks and
sections. Note that we only place them at the end of blocks that are at
the top level and before headers that are the top level. We add an
environment variable to keep track of this. Because we clear the
footnotes and refs when we use them, we also add a state variable to
keep track of the starting number.

Finally, note that we still add any remaining footnotes at the end. This
takes care of the final section, if we are placing at the end of a
section, and will always come after a final block as well.
2016-10-11 15:17:01 -04:00
Jesse Rosenthal
6914808139 Add ReaderT monad for environment variables.
This will make it easier to keep track of what level of block we are at.
2016-10-11 13:53:47 -04:00
Jesse Rosenthal
27113bda1f Options: Add references location.
This will be used by the markdown writer for deciding where to put links
and footnotes.
2016-10-11 13:30:01 -04:00
Albert Krewinkel
bb48f2edc4
Org reader: trim verse lines properly
An empty verse line should not result in `Str ""` but in `mempty`.
2016-10-10 21:12:47 +02:00
John MacFarlane
f4b7ab125e More checks for Ext_raw_html when rendering HTML in Markdown.
Previously we'd emit raw HTML tables even if the `raw_html`
extension was disabled.

Now we just emit `[TABLE]` if no table formats are enabled
and raw HTML is not enabled.

We also check for the `raw_html` extension before emitting
a raw HTML block.

Closes #3154.
2016-10-10 11:23:04 +02:00
KolenCheung
46be319ca9 removed mmd raw_tex in src/Text/Pandoc/Options.hs 2016-10-09 21:30:03 +02:00
Jesse Rosenthal
6d05c378d0 Docx writer: Move one more env var to Reader monad
PrintWidth is set at the beginning and stays the same throughout the
document writing, so we just set it as an env variable in the Reader
monad.
2016-10-05 08:15:55 -04:00
Jesse Rosenthal
4eba314c24 Docx writer: code legibility fixups.
More meaningful variable name, and explanatory comment.
2016-10-05 08:06:22 -04:00
Jesse Rosenthal
be27c9c646 Docx writer: Clean up and streamline RTL behavior
Now RTL is turned on and off by a general function, `withDirection`
wrapping `inlineToOpenXML` and `blockToOpenXML`. This acts according to
the `envRTL` variable. This means we can just set the environment at the
outset, and change the environment with `local` as need be.

Note that this requires making the `inlineToOpenXML` and
`blockToOpenXML` functions into wrappers around
primed versions (`{inline,block}ToOpenXML'`) where the real work takes
place.
2016-10-04 11:18:11 -04:00
Jesse Rosenthal
1893f8fe48 Docx writer: move a couple more vars to ReaderT
In general, we want things that are either:

 1. unchanging environment variables, or
 2. environment variables that will change for the scope of
    a function and then pop back

to be in the reader monad. This is safer for (1), since we won't
accidentally change it, and easier for (2), since we can use `local`
instead of setting the old value and then resetting.

We keep the StateT monad for values that we will want to accumulate or
change and then use later.
2016-10-04 08:54:35 -04:00
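
The reasoning in the commit above (scoped changes belong in a reader environment, accumulated values in state) can be sketched as follows. This is only an illustration: the `Env` record, `direction`, `withRTL`, and `demo` are hypothetical names, with `envRTL` borrowed from the log. The sketch uses the plain `Reader` from the `transformers` package instead of the writer's actual monad stack.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, local, runReader)

-- Hypothetical environment; the field name echoes `envRTL` from the log.
newtype Env = Env { envRTL :: Bool }

-- Report the text direction currently in effect.
direction :: Reader Env String
direction = do
  rtl <- asks envRTL
  pure (if rtl then "rtl" else "ltr")

-- Run an action with RTL switched on. `local` restores the previous
-- environment afterwards, so no manual save-and-reset is needed.
withRTL :: Reader Env a -> Reader Env a
withRTL = local (\env -> env { envRTL = True })

-- Direction before, inside, and after the scoped change.
demo :: Reader Env [String]
demo = do
  before <- direction
  inside <- withRTL direction
  after  <- direction
  pure [before, inside, after]
```

Running `runReader demo (Env False)` shows the change applying only inside `withRTL`. Doing the same with state would mean saving the old value and restoring it by hand.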
Jesse Rosenthal
41f4c8741f Clean up commented-out code
A few commented out functions were left in the code during the
conversion from StateT to ReaderT. This removes them.
2016-10-03 21:48:59 -04:00
Jesse Rosenthal
ab31d5ea8d Remove bool on setRTL.
We had to use this because we set the env, which means that setRTL
wouldn't do anything at the top level. We now don't set the env (it will
always be false at the outset), which means the toplevel setRTL will
work if necessary.
2016-10-03 21:39:40 -04:00
Jesse Rosenthal
666c042e80 Filter text/para props correctly.
We only filter on the name, not the prefix.
2016-10-03 21:39:40 -04:00
Jesse Rosenthal
acf352331c Add a boolean flag to the setRTL function.
At the toplevel we don't check to see if RTL is already set.
2016-10-03 21:39:40 -04:00
Jesse Rosenthal
e98be61a38 Test for "dir" metadata. 2016-10-03 21:39:40 -04:00
Jesse Rosenthal
8d148c02a8 Add setRTL and setLTR functions. 2016-10-03 21:39:34 -04:00
Jesse Rosenthal
a2d3854f23 Move more environment vars to Reader Monad.
Things that get pushed and then reset are better in ReaderT, because
they can be run with `local`.
2016-10-03 12:12:38 -04:00
Jesse Rosenthal
6a3d1cf210 Add ReaderT env to the docx writer:
This will allow us to add text and paragraph properties depending on if
rtl is already set or not.

(It would probably be cleaner and safer to move the paraprops and
textprops to this part of the stack in the future.)
2016-10-03 07:50:40 -04:00