Commit graph

2083 commits

Author SHA1 Message Date
John MacFarlane
d62c4a9247 LaTeX reader: Improve handling of accents.
Handle ogonek, and fall back correctly with forms like `\"{}`.
2017-09-05 10:58:34 -07:00
John MacFarlane
146a10780e LaTeX reader: support \k ogonek accent. 2017-09-05 09:55:42 -07:00
Alexander
350c282f20 Muse reader: require at least one space char after * in header (#3895) 2017-09-05 09:41:27 -07:00
Alexander
c09b586147 Muse reader: parse <div> tag (#3888) 2017-09-04 21:22:40 -07:00
John MacFarlane
1d0805ce41 HTML reader: Fix pattern match. 2017-09-04 18:11:26 -07:00
John MacFarlane
50ec64ffbc HTML reader: improved handling of figure.
Previously we had a parse failure if the figure contained
anything besides an image and caption.
2017-08-30 17:05:12 -07:00
Alexander
14f813c3f2 Muse reader: parse verse markup (#3882) 2017-08-29 12:40:34 -07:00
Alexander
2d936ff4e0 hlint Muse reader (#3884) 2017-08-29 09:15:06 -07:00
John MacFarlane
8fcf66453c RST reader: Fixed ..include:: directive.
Closes #3880.
2017-08-27 17:09:55 -07:00
John MacFarlane
1b3431a165 LaTeX reader: improved support for \hyperlink, \hypertarget.
Closes #2549.
2017-08-25 22:04:57 -07:00
Alexander
e6f767b581 Muse reader: parse <verse> tag (#3872) 2017-08-25 07:09:28 -07:00
bucklereed
c80e26f888 LaTeX reader: RN and Rn, from biblatex (#3854) 2017-08-24 09:45:58 -07:00
Alexander
5d74932578 Muse reader: avoid crashes on multiparagraph inline tags (#3866)
Test checks that behavior is consistent with Amusewiki
2017-08-22 23:12:34 -07:00
Alexander
c7d4fd8cf1 Muse reader: do not allow closing tags with EOF (#3863)
This behavior is compatible to Amusewiki
2017-08-22 16:34:18 -07:00
Alexander
0a839cbdc9 Muse reader: add definition list support (#3860) 2017-08-21 21:08:44 -07:00
John MacFarlane
9cc128b579 LaTeX reader: Set identifiers on Spans used for \label. 2017-08-20 16:52:03 -07:00
John MacFarlane
f2fdd275fd LaTeX reader: allow ] inside group in option brackets.
Closes #3857.
2017-08-20 13:42:43 -07:00
John MacFarlane
a31241a08b Markdown reader: use CommonMark rules for list item nesting.
Closes #3511.

Previously pandoc used the four-space rule: continuation paragraphs,
sublists, and other block level content had to be indented 4
spaces.  Now the indentation required is determined by the
first line of the list item:  to be included in the list item,
blocks must be indented to the level of the first non-space
content after the list marker. Exception: if are 5 or more spaces
after the list marker, then the content is interpreted as an
indented code block, and continuation paragraphs must be indented
two spaces beyond the end of the list marker.  See the CommonMark
spec for more details and examples.

Documents that adhere to the four-space rule should, in most cases,
be parsed the same way by the new rules.  Here are some examples
of texts that will be parsed differently:

    - a
      - b

will be parsed as a list item with a sublist; under the four-space
rule, it would be a list with two items.

    - a

          code

Here we have an indented code block under the list item, even though it
is only indented six spaces from the margin, because it is four spaces
past the point where a continuation paragraph could begin.  With the
four-space rule, this would be a regular paragraph rather than a code
block.

    - a

            code

Here the code block will start with two spaces, whereas under
the four-space rule, it would start with `code`.  With the four-space
rule, indented code under a list item always must be indented eight
spaces from the margin, while the new rules require only that it
be indented four spaces from the beginning of the first non-space
text after the list marker (here, `a`).

This change was motivated by a slew of bug reports from people
who expected lists to work differently (#3125, #2367, #2575, #2210,
 #1990, #1137, #744, #172, #137, #128) and by the growing prevalance
of CommonMark (now used by GitHub, for example).

Users who want to use the old rules can select the `four_space_rule`
extension.

* Added `four_space_rule` extension.
* Added `Ext_four_space_rule` to `Extensions`.
* `Parsing` now exports `gobbleAtMostSpaces`, and the type
  of `gobbleSpaces` has been changed so that a `ReaderOptions`
  parameter is not needed.
2017-08-19 15:45:01 -07:00
John MacFarlane
5ab1162def Markdown reader: fixed parsing of fenced code after list...
...when there is no intervening blank line.

Closes #3733.
2017-08-18 21:46:55 -07:00
John MacFarlane
7cac58f126 Markdown reader: parse -@roe as suppress-author citation.
Previously only `[-@roe]` (with brackets) was recognized as
suppress-author, and `-@roe` was treated the same as `@roe`.

Closes jgm/pandoc-citeproc#237.
2017-08-18 11:40:57 -07:00
John MacFarlane
bfbdfa646a LaTeX reader: implement \newtoggle, \iftoggle, \toggletrue|false
from etoolbox.

Closes #3853.
2017-08-18 10:13:41 -07:00
John MacFarlane
d1444b4ecd RST reader/writer: support unknown interpreted text roles...
...by parsing them as Span with "role" attributes.
This way they can be manipulated in the AST.

Closes #3407.
2017-08-17 16:01:44 -07:00
John MacFarlane
b1f6fb4af5 HTML reader: support column alignments.
These can be set either with a `width` attribute or
with `text-width` in a `style` attribute.

Closes #1881.
2017-08-17 12:08:32 -07:00
John MacFarlane
b9b35059f6 LaTeX reader: support \lq, \rq. 2017-08-17 12:08:32 -07:00
John MacFarlane
c175317d03 LaTeX reader: support \textquoteleft|right, \textquotedblleft|right.
Closes #3849.
2017-08-17 10:09:35 -07:00
John MacFarlane
ae61d5f57d LaTeX reader: rudimentary support for \hyperlink. 2017-08-16 10:56:16 -07:00
John MacFarlane
db715ca847 LaTeX reader: use Link instead of Span for \ref.
This makes more sense semantically and avoids unnecessary
Span [Link] nestings when references are resolved.
2017-08-16 10:56:12 -07:00
schrieveslaach
cf4b40162d LaTeX reader: add Support for glossaries and acronym package (#3589)
Acronyms are not resolved by the reader, but acronym and glossary information is put into attributes on Spans so that they can be processed in filters.
2017-08-16 10:24:46 -07:00
John MacFarlane
6aef1bd228 Better handle complex \def macros as raw latex. 2017-08-13 12:45:04 -07:00
John MacFarlane
425b731050 LaTeX reader: Allow @ as a letter in control sequences.
@ is commonly used in macros using `\makeatletter`.
Ideally we'd make the tokenizer sensitive to `\makeatletter`
and `\makeatother`, but until then this seems a good change.
2017-08-13 12:24:06 -07:00
John MacFarlane
bf9ec6dfd8 LaTeX reader: fix \let\a=0 case, with single character token. 2017-08-13 12:16:51 -07:00
John MacFarlane
f9656ece4e Resolve references to section numbers in LaTeX reader. 2017-08-13 11:48:44 -07:00
John MacFarlane
253a7c6201 LaTeX reader: track header numbers and correlate with labels. 2017-08-13 11:30:17 -07:00
schrieveslaach
2845ab5976 Put content of \ref, \label commands into span… (#3639)
* Put content of `\ref` and `\label` commands into Span elements so they can be used in filters.
* Add support for `\eqref`
2017-08-13 10:58:45 -07:00
John MacFarlane
0ab8670a0e LaTeX reader: Fixed space after \figurename etc. 2017-08-12 13:40:28 -07:00
John MacFarlane
3897df868a LaTeX reader: support \chaptername, \partname, \abstractname, etc.
See #3559.
Obsoletes #3560.
2017-08-12 13:28:18 -07:00
John MacFarlane
f035f0ffe3 LaTeX reader: have \setmainlanguage set lang in metadata. 2017-08-12 12:34:36 -07:00
John MacFarlane
74212eb1b0 Added support for translations (localization) (see #3559).
* readDataFile, readDefaultDataFile, getReferenceDocx,
  getReferenceODT have been removed from Shared and
  moved into Class.  They are now defined in terms of
  PandocMonad primitives, rather than being primitve
  methods of the class.

* toLang has been moved from BCP47 to Class.

* NoTranslation and CouldNotLoudTranslations have
  been added to LogMessage.

* New module, Text.Pandoc.Translations, exporting
  Term, Translations, readTranslations.

* New functions in Class: translateTerm, setTranslations.
  Note that nothing is loaded from data files until
  translateTerm is used; setTranslation just sets the
  language to be used.

* Added two translation data files in data/translations.

* LaTeX reader: Support `\setmainlanguage` or `\setdefaultlanguage`
  (polyglossia) and `\figurename`.
2017-08-11 22:22:31 -07:00
John MacFarlane
dee4cbc854 RST reader: implement csv-table directive.
Most attributes are supported, including `:file:` and `:url:`.
A (probably insufficient) test case has been added.

Closes #3533.
2017-08-10 15:01:14 -07:00
John MacFarlane
a5790dd308 RST reader: Basic support for csv-table directive.
* Added Text.Pandoc.CSV, simple CSV parser.
* Options still not supported, and we need tests.

See #3533.
2017-08-10 11:12:41 -07:00
John MacFarlane
f4bff5d359 RST reader: reorganize block parsers for ~20% faster parsing. 2017-08-09 21:16:17 -07:00
John MacFarlane
1dcecffef4 Removed spurious comments. 2017-08-09 20:53:42 -07:00
John MacFarlane
ac18ff90b2 Org reader: use org-language attribute rather than data-org-language. 2017-08-09 09:45:17 -07:00
John MacFarlane
96933c6043 Org reader: use tag-name attribute instead of data-tag-name. 2017-08-09 09:26:57 -07:00
John MacFarlane
09b7df472d LaTeX reader: Use label instead of data-label for label in caption.
See d441e656db, #3639.
2017-08-09 09:15:50 -07:00
bucklereed
db55f7c1b2 HTML reader: parse <main> like <div role=main>. (#3791)
* HTML reader: parse <main> like <div role=main>.

* <main> closes <p> and behaves like a block element generally
2017-08-09 09:10:12 -07:00
Alexander
cfa597fc2a Muse reader: simplify tableCell implementation (#3846) 2017-08-09 09:09:05 -07:00
John MacFarlane
606a8e2af4 RST reader: support :widths: attribute for table directive. 2017-08-08 20:48:30 -07:00
John MacFarlane
3752298d91 Thread options through CommonMark reader.
This is more efficient than doing AST traversals for
emojis and hard breaks.

Also make behavior sensitive to `raw_html` extension.
2017-08-08 13:55:19 -07:00
John MacFarlane
54658b923a Support hard_line_breaks in CommonMark reader. 2017-08-08 13:30:53 -07:00