Lua filters: load module pandoc before calling init.lua (#5287)

The file `init.lua` in pandoc's data directory is run as part of
pandoc's Lua initialization process. Previously, the `pandoc` module was
loaded in `init.lua`, and the structure for marshaling was set up afterwards.
This allowed simple patching of element marshaling, but made using
`init.lua` more difficult:

  - it encouraged mixing essential initialization with user-defined
    customization;

  - upstream changes to `init.lua` had to be merged manually;

  - accidentally breaking marshaling by removing required modules was
    possible.

Instead, all required modules are now loaded before calling `init.lua`.
The file can be used entirely for user customization. Patching
marshaling functions, while discouraged, is still possible via the
`debug` module.
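
As a rough illustration, an `init.lua` that carries only user
customization might look like the sketch below. It assumes the global
`pandoc` is already set when the file runs, as described above.
`emph_str` and `registry_function_names` are hypothetical helper names,
not part of pandoc, and nothing is assumed about the keys under which
constructors are stored in the registry.

```lua
-- init.lua: user customization only (illustrative sketch).
-- Assumption: the `pandoc` module is already loaded and assigned to the
-- global `pandoc` before this file runs. The guard keeps the file
-- harmless when it is run by a plain Lua interpreter.
if pandoc then
  -- Hypothetical helper: wrap a plain string in emphasis.
  function emph_str(s)
    return pandoc.Emph{pandoc.Str(s)}
  end
end

-- Hypothetical helper: collect the string keys of the Lua registry whose
-- values are functions. Useful only for inspection; which of these keys
-- (if any) hold marshaling constructors is not assumed here.
function registry_function_names()
  local names = {}
  for key, value in pairs(debug.getregistry()) do
    if type(key) == 'string' and type(value) == 'function' then
      names[#names + 1] = key
    end
  end
  table.sort(names)
  return names
end
```

Nothing in this sketch needs `require 'pandoc'`, and deleting the file
entirely would leave marshaling intact.
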
Albert Krewinkel 2019-02-09 22:56:49 +01:00 committed by John MacFarlane
parent 66ed198fff
commit 75c791b4fe
3 changed files with 37 additions and 22 deletions

@@ -1,5 +1,3 @@
 -- This Lua script is run every time the Lua interpreter is started when running
 -- a Lua filter. It can be customized to load additional modules or to alter the
 -- default modules.
-pandoc = require 'pandoc'

@@ -219,21 +219,15 @@ Some pandoc functions have been made available in lua:
 # Lua interpreter initialization
-The way the Lua interpreter is set-up can be controlled by
-placing a file `init.lua` in pandoc's data directory. The
-default init file loads the `pandoc` and `pandoc.mediabag`
-modules:
-``` {.lua}
-pandoc = require 'pandoc'
-pandoc.mediabag = require 'pandoc.mediabag'
-```
-A common use-case would be to add code to load additional
-modules or to alter default modules. E.g., the following snippet
-adds all unicode-aware functions defined in the [`text`
-module](#module-text) to the default `string` module, prefixed
-with the string `uc_`.
+Initialization of pandoc's Lua interpreter can be controlled by
+placing a file `init.lua` in pandoc's data directory. A common
+use-case would be to load additional modules, or even to alter
+default modules.
+The following snippet is an example of code that might be useful
+when added to `init.lua`. The snippet adds all unicode-aware
+functions defined in the [`text` module] to the default `string`
+module, prefixed with the string `uc_`.
 ``` {.lua}
 for name, fn in pairs(require 'text') do
@@ -244,6 +238,8 @@ end
 This makes it possible to apply these functions on strings using
 colon syntax (`mystring:uc_upper()`).
+[`text` module]: #module-text
 # Examples
 The following filters are presented as examples.

@@ -49,6 +49,7 @@ import Text.Pandoc.Lua.Util (loadScriptFromDataDir)
 import qualified Foreign.Lua as Lua
 import qualified Foreign.Lua.Module.Text as Lua
 import qualified Text.Pandoc.Definition as Pandoc
+import qualified Text.Pandoc.Lua.Module.Pandoc as ModulePandoc
 -- | Lua error message
 newtype LuaException = LuaException String deriving (Show)
@@ -95,16 +96,37 @@ luaPackageParams = do
 -- | Initialize the lua state with all required values
 initLuaState :: LuaPackageParams -> Lua ()
-initLuaState luaPkgParams = do
+initLuaState pkgParams = do
   Lua.openlibs
   Lua.preloadTextModule "text"
-  installPandocPackageSearcher luaPkgParams
-  loadScriptFromDataDir (luaPkgDataDir luaPkgParams) "init.lua"
-  putConstructorsInRegistry
+  installPandocPackageSearcher pkgParams
+  initPandocModule
+  loadScriptFromDataDir (luaPkgDataDir pkgParams) "init.lua"
+ where
+  initPandocModule :: Lua ()
+  initPandocModule = do
+    -- Push module table
+    ModulePandoc.pushModule (luaPkgDataDir pkgParams)
+    -- register as loaded module
+    Lua.pushvalue Lua.stackTop
+    Lua.getfield Lua.registryindex Lua.loadedTableRegistryField
+    Lua.setfield (Lua.nthFromTop 2) "pandoc"
+    Lua.pop 1
+    -- copy constructors into registry
+    putConstructorsInRegistry
+    -- assign module to global variable
+    Lua.setglobal "pandoc"
+-- | AST elements are marshaled via normal constructor functions in the
+-- @pandoc@ module. However, accessing Lua globals from Haskell is
+-- expensive (due to error handling). Accessing the Lua registry is much
+-- cheaper, which is why the constructor functions are copied into the
+-- Lua registry and called from there.
+--
+-- This function expects the @pandoc@ module to be at the top of the
+-- stack.
 putConstructorsInRegistry :: Lua ()
 putConstructorsInRegistry = do
-  Lua.getglobal "pandoc"
   constrsToReg $ Pandoc.Pandoc mempty mempty
   constrsToReg $ Pandoc.Str mempty
   constrsToReg $ Pandoc.Para mempty
@@ -113,7 +135,6 @@ putConstructorsInRegistry = do
   constrsToReg $ Pandoc.Citation mempty mempty mempty Pandoc.AuthorInText 0 0
   putInReg "Attr" -- used for Attr type alias
   putInReg "ListAttributes" -- used for ListAttributes type alias
-  Lua.pop 1
  where
   constrsToReg :: Data a => a -> Lua ()
   constrsToReg = mapM_ (putInReg . showConstr) . dataTypeConstrs . dataTypeOf