Support ipynb (Jupyter notebook) as input and output format.

[API change] * Depend on ipynb library. * Add `ipynb` as input and output format. * Added Text.Pandoc.Readers.Ipynb (supports both nbformat v3 and v4). * Added Text.Pandoc.Writers.Ipynb (supports nbformat v4). * Added ipynb readers and writers to T.P.Readers, T.P.Writers, and T.P.Extensions. Register the file extension .ipynb for this format. * Add `PandocIpynbDecodingError` constructor to Text.Pandoc.Error.Error. * Note: there is no template for ipynb.
2019-01-13 07:56:55 -08:00 · 2019-01-13 07:56:55 -08:00 · 395ea03069
commit 395ea03069
parent 5ddd7b121e
13 changed files with 638 additions and 9 deletions
--- a/MANUAL.txt
+++ b/MANUAL.txt
@ -248,6 +248,7 @@ General options {.options}
      if you need extensions not supported in [`gfm`](#markdown-variants).
    - `haddock` ([Haddock markup])
    - `html` ([HTML])
+    - `ipynb` ([Jupyter notebook])
    - `jats` ([JATS] XML)
    - `json` (JSON version of native AST)
    - `latex` ([LaTeX])
@ -300,6 +301,7 @@ General options {.options}
    - `html` or `html5` ([HTML], i.e. [HTML5]/XHTML [polyglot markup])
    - `html4` ([XHTML] 1.0 Transitional)
    - `icml` ([InDesign ICML])
+    - `ipynb` ([Jupyter notebook])
    - `jats` ([JATS] XML)
    - `json` (JSON version of native AST)
    - `latex` ([LaTeX])
@ -471,6 +473,7 @@ General options {.options}
 [PDF]: https://www.adobe.com/pdf/
 [reveal.js]: http://lab.hakim.se/reveal-js/
 [FictionBook2]: http://www.fictionbook.org/index.php/Eng:XML_Schema_Fictionbook_2.1
+[Jupyter notebook]: https://nbformat.readthedocs.io/en/latest/
 [InDesign ICML]: http://wwwimages.adobe.com/www.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-cookbook.pdf
 [TEI Simple]: https://github.com/TEIC/TEI-Simple
 [Muse]: https://amusewiki.org/library/manual
@ -730,6 +733,8 @@ General writer options {.options}
    where there are nonsemantic newlines in the source, there
    will be nonsemantic newlines in the output as well).
    Automatic wrapping does not currently work in HTML output.
+    In `ipynb` output, this option affects wrapping of the
+    contents of markdown cells.

 `--columns=`*NUMBER*

@ -902,6 +907,7 @@ Options affecting specific writers {.options}
 :   Use ATX-style headers in Markdown output. The default is
    to use setext-style headers for levels 1-2, and then ATX headers.
    (Note: for `gfm` output, ATX headers are always used.)
+    This option also affects markdown cells in `ipynb` output.

 `--top-level-division=[default|section|chapter|part]`

@ -1806,6 +1812,10 @@ section [Pandoc's Markdown] below (See [Markdown variants] for
 `commonmark` and `gfm`.) In the following, extensions that also work
 for other formats are covered.

+Note that markdown extensions added to the `ipynb` format
+affect Markdown cells in Jupyter notebooks (as do command-line
+options like `--atx-headers`).
+
 Typography
 ----------

@ -1955,11 +1965,19 @@ This extension can be enabled/disabled for the following formats

 input formats
 :  `latex`, `org`, `textile`, `html` (environments, `\ref`, and
-   `\eqref` only)
+   `\eqref` only), `ipynb`

 output formats
 :  `textile`, `commonmark`

+Note: as applied to `ipynb`, `raw_html` and `raw_tex` affect not
+only raw TeX in markdown cells, but data with mime type
+`text/html` in output cells.  Since the `ipynb` reader attempts
+to preserve the richest possible outputs when several options
+are given, you will get best results if you disable `raw_html`
+and `raw_tex` when converting to formats like `docx` which don't
+allow raw `html` or `tex`.
+
 #### Extension: `native_divs` {#native_divs}

 This extension is enabled by default for HTML input. This means that
@ -4747,6 +4765,112 @@ with the `src` attribute.  For example:
      </source>
    </audio>

+Creating Jupyter notebooks with pandoc
+======================================
+
+When creating a [Jupyter notebook], pandoc will try to infer the
+notebook structure.  Code blocks with the class `code` will be
+taken as code cells, and intervening content will be taken as
+Markdown cells.  Attachments will automatically be created for
+images in Markdown cells. For example:
+
+````
+---
+title: My notebook
+nbformat: 4
+nbformat_minor: 5
+kernelspec:
+   display_name: Python 2
+   language: python
+   name: python2
+language_info:
+   codemirror_mode:
+     name: ipython
+     version: 2
+   file_extension: ".py"
+   mimetype: "text/x-python"
+   name: "python"
+   nbconvert_exporter: "python"
+   pygments_lexer: "ipython2"
+   version: "2.7.15"
+---
+
+# Lorem ipsum
+
+**Lorem ipsum** dolor sit amet, consectetur adipiscing elit. Nunc luctus
+bibendum felis dictum sodales.
+
+``` code
+print("hello")
+```
+
+## Pyout
+
+``` code
+from IPython.display import HTML
+HTML("""
+<script>
+console.log("hello");
+</script>
+<b>HTML</b>
+""")
+```
+
+## Image
+
+This image ![image](myimage.png) will be
+included as a cell attachment.
+````
+
+If you want to add cell attributes, group cells differently, or
+add output to code cells, then you need to include divs to
+indicate the structure. You can use either [fenced
+divs][Extension: `fenced_divs`] or [native divs][Extension:
+`native_divs`] for this.  Here is an example:
+
+````
+:::::: {.cell .markdown}
+# Lorem
+
+**Lorem ipsum** dolor sit amet, consectetur adipiscing elit. Nunc luctus
+bibendum felis dictum sodales.
+::::::
+
+:::::: {.cell .code execution_count=1}
+``` {.python}
+print("hello")
+```
+
+::: {.output .stream .stdout}
+```
+hello
+```
+:::
+::::::
+
+:::::: {.cell .code execution_count=2}
+``` {.python}
+from IPython.display import HTML
+HTML("""
+<script>
+console.log("hello");
+</script>
+<b>HTML</b>
+""")
+```
+
+::: {.output .execute_result execution_count=2}
+```{=html}
+<script>
+console.log("hello");
+</script>
+<b>HTML</b>
+hello
+```
+:::
+::::::
+````
+
 Syntax highlighting
 ===================

--- a/README.md
+++ b/README.md
@ -32,6 +32,7 @@ library. It can convert *from*
  - `creole` ([Creole 1.0](http://www.wikicreole.org/wiki/Creole1.0))
  - `docbook` ([DocBook](http://docbook.org))
  - `docx` ([Word docx](https://en.wikipedia.org/wiki/Office_Open_XML))
+  - `dokuwiki` ([DokuWiki markup](https://www.dokuwiki.org/dokuwiki))
  - `epub` ([EPUB](http://idpf.org/epub))
  - `fb2`
    ([FictionBook2](http://www.fictionbook.org/index.php/Eng:XML_Schema_Fictionbook_2.1)
@ -44,6 +45,8 @@ library. It can convert *from*
  - `haddock` ([Haddock
    markup](https://www.haskell.org/haddock/doc/html/ch03s08.html))
  - `html` ([HTML](http://www.w3.org/html/))
+  - `ipynb` ([Jupyter
+    notebook](https://nbformat.readthedocs.io/en/latest/))
  - `jats` ([JATS](https://jats.nlm.nih.gov) XML)
  - `json` (JSON version of native AST)
  - `latex` ([LaTeX](http://latex-project.org))
@ -105,6 +108,8 @@ It can convert *to*
  - `html4` ([XHTML](http://www.w3.org/TR/xhtml1/) 1.0 Transitional)
  - `icml` ([InDesign
    ICML](http://wwwimages.adobe.com/www.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-cookbook.pdf))
+  - `ipynb` ([Jupyter
+    notebook](https://nbformat.readthedocs.io/en/latest/))
  - `jats` ([JATS](https://jats.nlm.nih.gov) XML)
  - `json` (JSON version of native AST)
  - `latex` ([LaTeX](http://latex-project.org))
--- a/cabal.project
+++ b/cabal.project
@ -4,3 +4,4 @@ source-repository-package
    type: git
    location: https://github.com/jgm/pandoc-citeproc
    tag: 4b467c62af17ddfc739933891c5ea2291a6b9b76
+
--- a/pandoc.cabal
+++ b/pandoc.cabal
@ -20,14 +20,14 @@ description:     Pandoc is a Haskell library for converting from one markup
                 (subsets of) HTML, reStructuredText, LaTeX, DocBook, JATS,
                 MediaWiki markup, DokuWiki markup, TWiki markup,
                 TikiWiki markup, Creole 1.0, Haddock markup, OPML,
-                 Emacs Org-Mode, Emacs Muse, txt2tags,
-                 Vimwiki, Word Docx, ODT, EPUB, FictionBook2, roff man,
-                 and Textile, and it can write Markdown, reStructuredText,
-                 XHTML, HTML 5, LaTeX, ConTeXt, DocBook, JATS, OPML, TEI,
-                 OpenDocument, ODT, Word docx, PowerPoint pptx,
-                 RTF, MediaWiki, DokuWiki, ZimWiki, Textile,
+                 Emacs Org-Mode, Emacs Muse, txt2tags, ipynb (Jupyter
+                 notebooks), Vimwiki, Word Docx, ODT, EPUB, FictionBook2,
+                 roff man, and Textile, and it can write Markdown,
+                 reStructuredText, XHTML, HTML 5, LaTeX, ConTeXt, DocBook,
+                 JATS, OPML, TEI, OpenDocument, ODT, Word docx,
+                 PowerPoint pptx, RTF, MediaWiki, DokuWiki, ZimWiki, Textile,
                 roff man, roff ms, plain text, Emacs Org-Mode,
-                 AsciiDoc, Haddock markup, EPUB (v2 and v3),
+                 AsciiDoc, Haddock markup, EPUB (v2 and v3), ipynb,
                 FictionBook2, InDesign ICML, Muse, LaTeX beamer slides,
                 and several kinds of HTML/JavaScript slide shows
                 (S5, Slidy, Slideous, DZSlides, reveal.js).
@ -398,7 +398,8 @@ library
                 http-types >= 0.8 && < 0.13,
                 case-insensitive >= 1.2 && < 1.3,
                 unicode-transforms >= 0.3 && < 0.4,
-                 HsYAML >= 0.1.1.1 && < 0.2
+                 HsYAML >= 0.1.1.1 && < 0.2,
+                 ipynb >= 0.1 && < 0.2
  if impl(ghc < 8.0)
    build-depends: semigroups == 0.18.*,
                   -- basement 0.0.8 and foundation 0.0.21, transitive
@ -470,12 +471,14 @@ library
                   Text.Pandoc.Readers.Man,
                   Text.Pandoc.Readers.FB2,
                   Text.Pandoc.Readers.DokuWiki,
+                   Text.Pandoc.Readers.Ipynb,
                   Text.Pandoc.Writers,
                   Text.Pandoc.Writers.Native,
                   Text.Pandoc.Writers.Docbook,
                   Text.Pandoc.Writers.JATS,
                   Text.Pandoc.Writers.OPML,
                   Text.Pandoc.Writers.HTML,
+                   Text.Pandoc.Writers.Ipynb,
                   Text.Pandoc.Writers.ICML,
                   Text.Pandoc.Writers.LaTeX,
                   Text.Pandoc.Writers.ConTeXt,
--- a/src/Text/Pandoc/App/FormatHeuristics.hs
+++ b/src/Text/Pandoc/App/FormatHeuristics.hs
@ -90,5 +90,6 @@ formatFromFilePath x =
    ".txt"      -> Just "markdown"
    ".wiki"     -> Just "mediawiki"
    ".xhtml"    -> Just "html"
+    ".ipynb"    -> Just "ipynb"
    ['.',y]     | y `elem` ['1'..'9'] -> Just "man"
    _           -> Nothing
--- a/src/Text/Pandoc/Error.hs
+++ b/src/Text/Pandoc/Error.hs
@ -70,6 +70,7 @@ data PandocError = PandocIOError String IOError
                 | PandocEpubSubdirectoryError String
                 | PandocMacroLoop String
                 | PandocUTF8DecodingError String Int Word8
+                 | PandocIpynbDecodingError String
                 deriving (Show, Typeable, Generic)

 instance Exception PandocError
@ -124,6 +125,8 @@ handleError (Left e) =
      "UTF-8 decoding error in " ++ f ++ " at byte offset " ++ show offset ++
      " (" ++ printf "%2x" w ++ ").\n" ++
      "The input must be a UTF-8 encoded text."
+    PandocIpynbDecodingError w -> err 93 $
+      "ipynb decoding error: " ++ w

 err :: Int -> String -> IO a
 err exitCode msg = do
--- a/src/Text/Pandoc/Extensions.hs
+++ b/src/Text/Pandoc/Extensions.hs
@ -332,6 +332,8 @@ getDefaultExtensions "markdown_phpextra" = phpMarkdownExtraExtensions
 getDefaultExtensions "markdown_mmd" = multimarkdownExtensions
 getDefaultExtensions "markdown_github" = githubMarkdownExtensions
 getDefaultExtensions "markdown"        = pandocExtensions
+getDefaultExtensions "ipynb"           = enableExtension Ext_tex_math_dollars
+                                           githubMarkdownExtensions
 getDefaultExtensions "muse"            = extensionsFromList
                                           [Ext_amuse,
                                            Ext_auto_identifiers]
--- a/src/Text/Pandoc/Readers.hs
+++ b/src/Text/Pandoc/Readers.hs
@ -67,6 +67,7 @@ module Text.Pandoc.Readers
  , readEPUB
  , readMuse
  , readFB2
+  , readIpynb
  -- * Miscellaneous
  , getReader
  , getDefaultExtensions
@ -90,6 +91,7 @@ import Text.Pandoc.Readers.Docx
 import Text.Pandoc.Readers.DokuWiki
 import Text.Pandoc.Readers.EPUB
 import Text.Pandoc.Readers.FB2
+import Text.Pandoc.Readers.Ipynb
 import Text.Pandoc.Readers.Haddock
 import Text.Pandoc.Readers.HTML (readHtml)
 import Text.Pandoc.Readers.JATS (readJATS)
@ -147,6 +149,7 @@ readers = [ ("native"       , TextReader readNative)
           ,("muse"         , TextReader readMuse)
           ,("man"          , TextReader readMan)
           ,("fb2"          , TextReader readFB2)
+           ,("ipynb"        , TextReader readIpynb)
           ]

 -- | Retrieve reader, extensions based on formatSpec (format+extensions).
--- a/src/Text/Pandoc/Readers/Ipynb.hs
+++ b/src/Text/Pandoc/Readers/Ipynb.hs
@ -0,0 +1,249 @@
+{-# LANGUAGE NoImplicitPrelude #-}
+{-# LANGUAGE DeriveGeneric #-}
+{-# LANGUAGE LambdaCase #-}
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE FlexibleContexts  #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-
+Copyright (C) 2019 John MacFarlane <jgm@berkeley.edu>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+-}
+
+{- |
+   Module      : Text.Pandoc.Readers.Ipynb
+   Copyright   : Copyright (C) 2019 John MacFarlane
+   License     : GNU GPL, version 2 or above
+
+   Maintainer  : John MacFarlane <jgm@berkeley.edu>
+   Stability   : alpha
+   Portability : portable
+
+Ipynb (Jupyter notebook JSON format) reader for pandoc.
+-}
+module Text.Pandoc.Readers.Ipynb ( readIpynb )
+where
+import Prelude
+import Data.Maybe (fromMaybe)
+import Data.Digest.Pure.SHA (sha1, showDigest)
+import Text.Pandoc.Options
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Logging
+import Text.Pandoc.Definition
+import Data.Ipynb as Ipynb
+import Text.Pandoc.Class
+import Text.Pandoc.MIME (extensionFromMimeType)
+import Text.Pandoc.UTF8
+import Text.Pandoc.Error
+import Data.Text (Text)
+import qualified Data.Map as M
+import qualified Data.Text as T
+import qualified Data.Text.Encoding as TE
+import qualified Data.ByteString.Lazy as BL
+import Data.Aeson as Aeson
+import Control.Monad.Except (throwError)
+import Text.Pandoc.Readers.Markdown (readMarkdown)
+import Text.Pandoc.Readers.HTML (readHtml)
+
+readIpynb :: PandocMonad m => ReaderOptions -> Text -> m Pandoc
+readIpynb opts t = do
+  let src = BL.fromStrict (TE.encodeUtf8 t)
+  case eitherDecode src of
+    Right (notebook4 :: Notebook NbV4) -> notebookToPandoc opts notebook4
+    Left _ ->
+      case eitherDecode src of
+        Right (notebook3 :: Notebook NbV3) -> notebookToPandoc opts notebook3
+        Left err -> throwError $ PandocIpynbDecodingError err
+
+notebookToPandoc :: (PandocMonad m, FromJSON (Notebook a))
+                 => ReaderOptions -> Notebook a -> m Pandoc
+notebookToPandoc opts notebook = do
+  let cells = notebookCells notebook
+  let (fmt,fmtminor) = notebookFormat notebook
+  let m = M.insert "nbformat" (MetaString $ show fmt) $
+          M.insert "nbformat_minor" (MetaString $ show fmtminor) $
+          jsonMetaToMeta (notebookMetadata notebook)
+  let lang = case M.lookup "kernelspec" m of
+                   Just (MetaMap ks) ->
+                      case M.lookup "language" ks of
+                         Just (MetaString l) -> l
+                         _ -> "python"
+                   _ -> "python"
+  bs <- mconcat <$> mapM (cellToBlocks opts lang) cells
+  let Pandoc _ blocks = B.doc bs
+  return $ Pandoc (Meta m) blocks
+
+cellToBlocks :: PandocMonad m
+             => ReaderOptions -> String -> Cell a -> m B.Blocks
+cellToBlocks opts lang c = do
+  let Source ts = cellSource c
+  let source = mconcat ts
+  let kvs = jsonMetaToPairs (cellMetadata c)
+  let attachments = maybe mempty M.toList $ cellAttachments c
+  mapM_ addAttachment attachments
+  case cellType c of
+    Ipynb.Markdown -> do
+      Pandoc _ bs <- readMarkdown opts source
+      return $ B.divWith ("",["cell","markdown"],kvs)
+             $ B.fromList bs
+    Ipynb.Heading lev -> do
+      Pandoc _ bs <- readMarkdown opts
+        (T.replicate lev "#" <> " " <> source)
+      return $ B.divWith ("",["cell","markdown"],kvs)
+             $ B.fromList bs
+    Ipynb.Raw -> do
+      let format = fromMaybe "" $ lookup "format" kvs
+      let format' =
+            case format of
+              "text/html"       ->  "html"
+              "text/latex"      -> "latex"
+              "application/pdf" -> "latex"
+              "text/markdown"   -> "markdown"
+              "text/x-rsrt"     -> "rst"
+              _                 -> format
+      return $ B.divWith ("",["cell","raw"],kvs) $ B.rawBlock format'
+             $ T.unpack source
+    Ipynb.Code{ codeOutputs = outputs, codeExecutionCount = ec } -> do
+      outputBlocks <- mconcat <$> mapM (outputToBlock opts) outputs
+      let kvs' = maybe kvs (\x -> ("execution_count", show x):kvs) ec
+      return $ B.divWith ("",["cell","code"],kvs') $
+        B.codeBlockWith ("",[lang],[]) (T.unpack source)
+        <> outputBlocks
+
+addAttachment :: PandocMonad m => (Text, MimeBundle) -> m ()
+addAttachment (fname, mimeBundle) = do
+  let fp = T.unpack fname
+  case M.toList (unMimeBundle mimeBundle) of
+    (mimeType, BinaryData bs):_ ->
+      insertMedia fp (Just $ T.unpack mimeType) (BL.fromStrict bs)
+    (mimeType, TextualData t):_ ->
+      insertMedia fp (Just $ T.unpack mimeType)
+          (BL.fromStrict $ TE.encodeUtf8 t)
+    (mimeType, JsonData v):_ ->
+      insertMedia fp (Just $ T.unpack mimeType) (encode v)
+    [] -> report $ CouldNotFetchResource fp "no attachment"
+
+outputToBlock :: PandocMonad m => ReaderOptions -> Output a -> m B.Blocks
+outputToBlock _ Stream{ streamName = sName,
+                        streamText = Source text } = do
+  return $ B.divWith ("",["output","stream",T.unpack sName],[])
+         $ B.codeBlock $ T.unpack . mconcat $ text
+outputToBlock opts DisplayData{ displayData = data',
+                                 displayMetadata = metadata' } =
+  B.divWith ("",["output", "display_data"],[]) <$>
+    handleData opts metadata' data'
+outputToBlock opts ExecuteResult{ executeCount = ec,
+                                   executeData = data',
+                                   executeMetadata = metadata' } =
+  B.divWith ("",["output", "execute_result"],[("execution_count",show ec)])
+    <$> handleData opts metadata' data'
+outputToBlock _ Err{ errName = ename,
+                     errValue = evalue,
+                     errTraceback = traceback } = do
+  return $ B.divWith ("",["output","error"],
+                         [("ename",T.unpack ename),
+                          ("evalue",T.unpack evalue)])
+         $ B.codeBlock $ T.unpack . T.unlines $ traceback
+
+-- We want to display the richest output possible given
+-- the output format.
+handleData :: PandocMonad m
+           => ReaderOptions -> JSONMeta -> MimeBundle -> m B.Blocks
+handleData opts metadata (MimeBundle mb) = do
+  let mimePairs = M.toList mb
+
+  results <- mapM dataBlock mimePairs
+
+  -- return the result with highest priority:
+
+  let highest = maximum (0 : map fst results)
+  return $ case [r | (pr, r) <- results, pr == highest] of
+             x:_  -> x
+             []   -> mempty
+
+  where
+
+    exts = readerExtensions opts
+
+    dataBlock :: PandocMonad m => (MimeType, MimeData) -> m (Int, B.Blocks)
+    dataBlock (mt, BinaryData bs)
+     | "image/" `T.isPrefixOf` mt
+      = do
+      -- normally metadata maps from mime types to key-value map;
+      -- but not always...
+      let meta = case M.lookup mt metadata of
+                   Just v@(Object{}) ->
+                     case fromJSON v of
+                       Success m' -> m'
+                       Error _   -> mempty
+                   _ -> mempty
+      let metaPairs = jsonMetaToPairs meta
+      let bl = BL.fromStrict bs
+      -- SHA1 hash for filename
+      let mt' = T.unpack mt
+      let fname = showDigest (sha1 bl) ++
+            case extensionFromMimeType mt' of
+              Nothing  -> ""
+              Just ext -> '.':ext
+      insertMedia fname (Just mt') bl
+      return (3, B.para $ B.imageWith ("",[],metaPairs) fname "" mempty)
+
+    dataBlock (_, BinaryData _) = return (0, mempty)
+
+    dataBlock ("text/html", TextualData t)
+      | extensionEnabled Ext_raw_html exts
+        = return (2, B.rawBlock "html" $ T.unpack t)
+      | otherwise = do -- try parsing the HTML
+          Pandoc _ bls <- readHtml opts t
+          return (1, B.fromList bls)
+
+    dataBlock ("text/latex", TextualData t) =
+      return $ if extensionEnabled Ext_raw_tex exts
+                  then (2, B.rawBlock "latex" $ T.unpack t)
+                  else (0, mempty)
+
+    dataBlock ("text/plain", TextualData t) =
+      return (0, B.codeBlock $ T.unpack t)
+
+    dataBlock (_, JsonData v) =
+      return (2, B.codeBlockWith ("",["json"],[]) $ toStringLazy $ encode v)
+
+    dataBlock _ = return (0, mempty)
+
+jsonMetaToMeta :: JSONMeta -> M.Map String MetaValue
+jsonMetaToMeta = M.mapKeys T.unpack . M.map valueToMetaValue
+  where
+    valueToMetaValue :: Value -> MetaValue
+    valueToMetaValue x@(Object{}) =
+      case fromJSON x of
+        Error s -> MetaString s
+        Success jm' -> MetaMap $ jsonMetaToMeta jm'
+    valueToMetaValue x@(Array{}) =
+      case fromJSON x of
+        Error s -> MetaString s
+        Success xs -> MetaList $ map valueToMetaValue xs
+    valueToMetaValue (Bool b) = MetaBool b
+    valueToMetaValue (String t) = MetaString (T.unpack t)
+    valueToMetaValue (Number n) = MetaString (show n)
+    valueToMetaValue Aeson.Null = MetaString ""
+
+jsonMetaToPairs :: JSONMeta -> [(String, String)]
+jsonMetaToPairs = M.toList . M.mapMaybe
+     (\case
+        MetaString s -> Just s
+        MetaBool True -> Just "true"
+        MetaBool False -> Just "false"
+        -- for now we skip complex cell metadata:
+        _ -> Nothing) . jsonMetaToMeta
--- a/src/Text/Pandoc/Templates.hs
+++ b/src/Text/Pandoc/Templates.hs
@ -58,6 +58,7 @@ getDefaultTemplate writer = do
       "docx"    -> return ""
       "fb2"     -> return ""
       "pptx"    -> return ""
+       "ipynb"   -> return ""
       "odt"     -> getDefaultTemplate "opendocument"
       "html"    -> getDefaultTemplate "html5"
       "docbook" -> getDefaultTemplate "docbook5"
--- a/src/Text/Pandoc/Writers.hs
+++ b/src/Text/Pandoc/Writers.hs
@ -49,6 +49,7 @@ module Text.Pandoc.Writers
    , writeEPUB2
    , writeEPUB3
    , writeFB2
+    , writeIpynb
    , writeHaddock
    , writeHtml4
    , writeHtml4String
@ -101,6 +102,7 @@ import Text.Pandoc.Writers.Docx
 import Text.Pandoc.Writers.DokuWiki
 import Text.Pandoc.Writers.EPUB
 import Text.Pandoc.Writers.FB2
+import Text.Pandoc.Writers.Ipynb
 import Text.Pandoc.Writers.Haddock
 import Text.Pandoc.Writers.HTML
 import Text.Pandoc.Writers.ICML
@ -140,6 +142,7 @@ writers = [
  ,("epub2"        , ByteStringWriter writeEPUB2)
  ,("epub3"        , ByteStringWriter writeEPUB3)
  ,("fb2"          , TextWriter writeFB2)
+  ,("ipynb"        , TextWriter writeIpynb)
  ,("html"         , TextWriter writeHtml5String)
  ,("html4"        , TextWriter writeHtml4String)
  ,("html5"        , TextWriter writeHtml5String)
--- a/src/Text/Pandoc/Writers/Ipynb.hs
+++ b/src/Text/Pandoc/Writers/Ipynb.hs
@ -0,0 +1,233 @@
+{-# LANGUAGE NoImplicitPrelude #-}
+{-# LANGUAGE DeriveGeneric #-}
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-
+Copyright (C) 2019 John MacFarlane <jgm@berkeley.edu>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+-}
+
+{- |
+   Module      : Text.Pandoc.Writers.Ipynb
+   Copyright   : Copyright (C) 2019 John MacFarlane
+   License     : GNU GPL, version 2 or above
+
+   Maintainer  : John MacFarlane <jgm@berkeley.edu>
+   Stability   : alpha
+   Portability : portable
+
+Ipynb (Jupyter notebook JSON format) writer for pandoc.
+
+-}
+module Text.Pandoc.Writers.Ipynb ( writeIpynb )
+where
+import Prelude
+import Control.Monad (foldM)
+import Control.Monad.State
+import qualified Data.Map as M
+import Data.Maybe (catMaybes, fromMaybe)
+import Text.Pandoc.Options
+import Text.Pandoc.Definition
+import Data.Ipynb as Ipynb
+import Text.Pandoc.Walk (walkM)
+import qualified Text.Pandoc.Builder as B
+import Text.Pandoc.Class
+import Text.Pandoc.Logging
+import Data.Text (Text)
+import qualified Data.Text as T
+import Data.Aeson as Aeson
+import qualified Text.Pandoc.UTF8 as UTF8
+import Text.Pandoc.Shared (safeRead)
+import Text.Pandoc.Writers.Shared (metaToJSON')
+import Text.Pandoc.Writers.Markdown (writeMarkdown)
+import qualified Data.Text.Encoding as TE
+import qualified Data.ByteString.Lazy as BL
+import Data.Aeson.Encode.Pretty (Config(..), defConfig,
+           encodePretty', keyOrder, Indent(Spaces))
+
+writeIpynb :: PandocMonad m => WriterOptions -> Pandoc -> m Text
+writeIpynb opts d = do
+  notebook <- pandocToNotebook opts d
+  return $ TE.decodeUtf8 . BL.toStrict . encodePretty' defConfig{
+             confIndent  = Spaces 1,
+             confCompare = keyOrder
+               [ "cells", "nbformat", "nbformat_minor",
+                 "cell_type", "output_type",
+                 "execution_count", "metadata",
+                 "outputs", "source",
+                 "data", "name", "text" ] }
+         $ notebook
+
+pandocToNotebook :: PandocMonad m
+                 => WriterOptions -> Pandoc -> m (Notebook NbV4)
+pandocToNotebook opts (Pandoc meta blocks) = do
+  let blockWriter bs = writeMarkdown
+           opts{ writerTemplate = Nothing } (Pandoc nullMeta bs)
+  let inlineWriter ils = T.stripEnd <$> writeMarkdown
+           opts{ writerTemplate = Nothing } (Pandoc nullMeta [Plain ils])
+  metadata' <- metaToJSON' blockWriter inlineWriter $
+                 B.deleteMeta "nbformat" $
+                 B.deleteMeta "nbformat_minor" $ meta
+  let metadata = case fromJSON metadata' of
+                   Error _ -> mempty -- TODO warning here? shouldn't happen
+                   Success x -> x
+  cells <- extractCells opts blocks
+  return $ Notebook{
+       notebookMetadata = metadata
+     , notebookFormat = (4, 5)
+     , notebookCells = cells }
+
+addAttachment :: PandocMonad m
+              => Inline
+              -> StateT (M.Map Text MimeBundle) m Inline
+addAttachment (Image attr lab (src,tit)) = do
+  (img, mbmt) <- fetchItem src
+  let mt = maybe "application/octet-stream" (T.pack) mbmt
+  modify $ M.insert (T.pack src)
+          (MimeBundle (M.insert mt (BinaryData img) mempty))
+  return $ Image attr lab ("attachment:" <> src, tit)
+addAttachment x = return x
+
+extractCells :: PandocMonad m => WriterOptions -> [Block] -> m [Cell a]
+extractCells _ [] = return []
+extractCells opts (Div (_id,classes,kvs) xs : bs)
+  | "cell" `elem` classes
+  , "markdown" `elem` classes = do
+      let meta = pairsToJSONMeta kvs
+      (newdoc, attachments) <-
+        runStateT (walkM addAttachment (Pandoc nullMeta xs)) mempty
+      source <- writeMarkdown opts{ writerTemplate = Nothing } newdoc
+      (Cell{
+          cellType = Markdown
+        , cellSource = Source $ breakLines source
+        , cellMetadata = meta
+        , cellAttachments = if M.null attachments
+                               then Nothing
+                               else Just attachments } :)
+            <$> extractCells opts bs
+  | "cell" `elem` classes
+  , "code" `elem` classes = do
+      let (codeContent, rest) =
+            case xs of
+               (CodeBlock _ t : ys) -> (T.pack t, ys)
+               ys                   -> (mempty, ys)
+      let meta = pairsToJSONMeta kvs
+      outputs <- catMaybes <$> mapM blockToOutput rest
+      let exeCount = lookup "execution_count" kvs >>= safeRead
+      (Cell{
+          cellType = Ipynb.Code {
+                codeExecutionCount = exeCount
+              , codeOutputs = outputs
+              }
+        , cellSource = Source $ breakLines codeContent
+        , cellMetadata = meta
+        , cellAttachments = Nothing } :) <$> extractCells opts bs
+  | "cell" `elem` classes
+  , "raw" `elem` classes =
+      case xs of
+        [RawBlock (Format f) raw] -> do
+          let format' =
+                case f of
+                  "html"     -> "text/html"
+                  "revealjs" -> "text/html"
+                  "latex"    -> "text/latex"
+                  "markdown" -> "text/markdown"
+                  "rst"      -> "text/x-rst"
+                  _          -> f
+          (Cell{
+              cellType = Raw
+            , cellSource = Source $ breakLines $ T.pack raw
+            , cellMetadata = M.insert "format"
+                             (Aeson.String $ T.pack format') mempty
+            , cellAttachments = Nothing } :) <$> extractCells opts bs
+        _ -> extractCells opts bs
+extractCells opts (CodeBlock (_id,classes,kvs) raw : bs)
+  | "code" `elem` classes = do
+      let meta = pairsToJSONMeta kvs
+      let exeCount = lookup "execution_count" kvs >>= safeRead
+      (Cell{
+          cellType = Ipynb.Code {
+                codeExecutionCount = exeCount
+              , codeOutputs = []
+              }
+        , cellSource = Source $ breakLines $ T.pack raw
+        , cellMetadata = meta
+        , cellAttachments = Nothing } :) <$> extractCells opts bs
+extractCells opts (b:bs) = do
+      let isCodeOrDiv (CodeBlock (_,cl,_) _) = "code" `elem` cl
+          isCodeOrDiv (Div (_,cl,_) _)       = "cell" `elem` cl
+          isCodeOrDiv _                      = False
+      let (mds, rest) = break (isCodeOrDiv) bs
+      extractCells opts (Div ("",["cell","markdown"],[]) (b:mds) : rest)
+
+blockToOutput :: PandocMonad m => Block -> m (Maybe (Output a))
+blockToOutput (Div (_,["output","stream",sname],_) (CodeBlock _ t:_)) =
+  return $ Just
+         $ Stream{ streamName = T.pack sname
+               , streamText = Source (breakLines $ T.pack t) }
+blockToOutput (Div (_,["output","error"],kvs) (CodeBlock _ t:_)) =
+  return $ Just
+         $ Err{ errName = maybe mempty T.pack (lookup "ename" kvs)
+              , errValue = maybe mempty T.pack (lookup "evalue" kvs)
+              , errTraceback = breakLines $ T.pack t }
+blockToOutput (Div (_,["output","execute_result"],kvs) bs) = do
+  (data', metadata') <- extractData bs
+  return $ Just
+         $ ExecuteResult{ executeCount = fromMaybe 0 $
+                          lookup "execution_count" kvs >>= safeRead
+                        , executeData = data'
+                        , executeMetadata = pairsToJSONMeta kvs <> metadata'}
+blockToOutput (Div (_,["output","display_data"],kvs) bs) = do
+  (data', metadata') <- extractData bs
+  return $ Just
+         $ DisplayData { displayData = data'
+                       , displayMetadata = pairsToJSONMeta kvs <> metadata'}
+blockToOutput _ = return Nothing
+
+extractData :: PandocMonad m => [Block] -> m (MimeBundle, JSONMeta)
+extractData bs = do
+  (mmap, meta) <- foldM go mempty bs
+  return (MimeBundle mmap, meta)
+  where
+    go (mmap, meta) b@(Para [Image (_,_,kvs) _ (src,_)]) = do
+      (img, mbmt) <- fetchItem src
+      case mbmt of
+        Just mt -> return
+          (M.insert (T.pack mt) (BinaryData img) mmap,
+           meta <> pairsToJSONMeta kvs)
+        Nothing -> (mmap, meta) <$ report (BlockNotRendered b)
+    go (mmap, meta) b@(CodeBlock (_,["json"],_) code) =
+      case decode (UTF8.fromStringLazy code) of
+        Just v  -> return
+                    (M.insert "application/json" (JsonData v) mmap, meta)
+        Nothing -> (mmap, meta) <$ report (BlockNotRendered b)
+    go (mmap, meta) (CodeBlock ("",[],[]) code) =
+       return (M.insert "text/plain" (TextualData (T.pack code)) mmap, meta)
+    go (mmap, meta) (RawBlock (Format "html") raw) =
+       return (M.insert "text/html" (TextualData (T.pack raw)) mmap, meta)
+    go (mmap, meta) (RawBlock (Format "latex") raw) =
+       return (M.insert "text/latex" (TextualData (T.pack raw)) mmap, meta)
+    go (mmap, meta) b = (mmap, meta) <$ report (BlockNotRendered b)
+
+pairsToJSONMeta :: [(String, String)] -> JSONMeta
+pairsToJSONMeta kvs =
+  M.fromList [(T.pack k, case v of
+                           "true"  -> Bool True
+                           "false" -> Bool False
+                           _       -> case safeRead v of
+                                        Just n -> Number n
+                                        _      -> String (T.pack v))
+             | (k,v) <- kvs , k /= "execution_count" ]
--- a/stack.yaml
+++ b/stack.yaml
@ -21,6 +21,7 @@ packages:
 extra-deps:
 - github: jgm/pandoc-citeproc
  commit: 4b467c62af17ddfc739933891c5ea2291a6b9b76
+- ipynb-0.1
 - haddock-library-1.7.0
 - HsYAML-0.1.1.2
 - yaml-0.11.0.0