Lua filters: load module pandoc before calling init.lua (#5287)

The file `init.lua` in pandoc's data directory is run as part of
pandoc's Lua initialization process. Previously, the `pandoc` module was
loaded in `init.lua`, and the structure for marshaling was set up afterwards.
This allowed simple patching of element marshaling, but made using
`init.lua` more difficult:

  - it encouraged mixing essential initialization with user-defined
    customization;

  - upstream changes to `init.lua` had to be merged manually;

  - it was possible to accidentally break marshaling by removing
    required modules.

Instead, all required modules are now loaded before calling `init.lua`.
The file can be used entirely for user customization. Patching
marshaling functions, while discouraged, is still possible via the
`debug` module.
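
As a rough illustration of that last point, the sketch below replaces a
function held in the Lua registry by way of `debug.getregistry()`. It is
not pandoc's actual marshaling code: the registry key `example.Str` and
the stand-in constructor are hypothetical, chosen so that the snippet
runs in a plain Lua interpreter.

```lua
-- The registry is an ordinary table when viewed through the debug module.
local registry = debug.getregistry()

-- Stand-in for a constructor that a host program copied into the registry.
-- Both the key and the function are hypothetical.
registry["example.Str"] = function (text)
  return {t = "Str", text = text}
end

-- Patch: wrap the stored function so every string is upper-cased first.
local original = registry["example.Str"]
registry["example.Str"] = function (text)
  return original(text:upper())
end

-- Anything that calls the function via the registry now sees the patch.
local elem = registry["example.Str"]("hello")
print(elem.t, elem.text)
```

Wrapping the original function instead of discarding it keeps the
unpatched behavior available, which makes such a patch easier to undo.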
Albert Krewinkel 2019-02-09 22:56:49 +01:00 committed by John MacFarlane
parent 66ed198fff
commit 75c791b4fe
3 changed files with 37 additions and 22 deletions

@@ -1,5 +1,3 @@
-- This Lua script is run every time the Lua interpreter is started when running
-- a Lua filter. It can be customized to load additional modules or to alter the
-- default modules.
pandoc = require 'pandoc'

@@ -219,21 +219,15 @@ Some pandoc functions have been made available in lua:
# Lua interpreter initialization
The way the Lua interpreter is set-up can be controlled by
placing a file `init.lua` in pandoc's data directory. The
default init file loads the `pandoc` and `pandoc.mediabag`
modules:
Initialization of pandoc's Lua interpreter can be controlled by
placing a file `init.lua` in pandoc's data directory. A common
use-case would be to load additional modules, or even to alter
default modules.
``` {.lua}
pandoc = require 'pandoc'
pandoc.mediabag = require 'pandoc.mediabag'
```
A common use-case would be to add code to load additional
modules or to alter default modules. E.g., the following snippet
adds all unicode-aware functions defined in the [`text`
module](#module-text) to the default `string` module, prefixed
with the string `uc_`.
The following snippet is an example of code that might be useful
when added to `init.lua`. The snippet adds all unicode-aware
functions defined in the [`text` module] to the default `string`
module, prefixed with the string `uc_`.
``` {.lua}
for name, fn in pairs(require 'text') do
@@ -244,6 +238,8 @@ end
This makes it possible to apply these functions on strings using
colon syntax (`mystring:uc_upper()`).
[`text` module]: #module-text
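
For reference, a self-contained sketch of the complete pattern might
look as follows. The `text` module only exists inside pandoc, so the
fallback table built from `string` functions is an assumption added here
to keep the snippet runnable in a plain Lua interpreter; the fallback is
not Unicode-aware.

``` {.lua}
-- Use pandoc's `text` module when available; otherwise fall back to a
-- stand-in table (assumption, ASCII-only) so the sketch runs anywhere.
local ok, text = pcall(require, 'text')
if not ok or type(text) ~= 'table' then
  text = {upper = string.upper, lower = string.lower, len = string.len}
end

-- Copy every function into the `string` table under a `uc_` prefix.
for name, fn in pairs(text) do
  if type(fn) == 'function' then
    string['uc_' .. name] = fn
  end
end

-- String values look up methods in the `string` table, so the new
-- functions are reachable with colon syntax.
local shout = ('pandoc'):uc_upper()
print(shout)
```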
# Examples
The following filters are presented as examples.

@@ -49,6 +49,7 @@ import Text.Pandoc.Lua.Util (loadScriptFromDataDir)
import qualified Foreign.Lua as Lua
import qualified Foreign.Lua.Module.Text as Lua
import qualified Text.Pandoc.Definition as Pandoc
import qualified Text.Pandoc.Lua.Module.Pandoc as ModulePandoc
-- | Lua error message
newtype LuaException = LuaException String deriving (Show)
@@ -95,16 +96,37 @@ luaPackageParams = do
-- | Initialize the lua state with all required values
initLuaState :: LuaPackageParams -> Lua ()
initLuaState luaPkgParams = do
initLuaState pkgParams = do
Lua.openlibs
Lua.preloadTextModule "text"
installPandocPackageSearcher luaPkgParams
loadScriptFromDataDir (luaPkgDataDir luaPkgParams) "init.lua"
putConstructorsInRegistry
installPandocPackageSearcher pkgParams
initPandocModule
loadScriptFromDataDir (luaPkgDataDir pkgParams) "init.lua"
where
initPandocModule :: Lua ()
initPandocModule = do
-- Push module table
ModulePandoc.pushModule (luaPkgDataDir pkgParams)
-- register as loaded module
Lua.pushvalue Lua.stackTop
Lua.getfield Lua.registryindex Lua.loadedTableRegistryField
Lua.setfield (Lua.nthFromTop 2) "pandoc"
Lua.pop 1
-- copy constructors into registry
putConstructorsInRegistry
-- assign module to global variable
Lua.setglobal "pandoc"
-- | AST elements are marshaled via normal constructor functions in the
-- @pandoc@ module. However, accessing Lua globals from Haskell is
-- expensive (due to error handling). Accessing the Lua registry is much
-- cheaper, which is why the constructor functions are copied into the
-- Lua registry and called from there.
--
-- This function expects the @pandoc@ module to be at the top of the
-- stack.
putConstructorsInRegistry :: Lua ()
putConstructorsInRegistry = do
Lua.getglobal "pandoc"
constrsToReg $ Pandoc.Pandoc mempty mempty
constrsToReg $ Pandoc.Str mempty
constrsToReg $ Pandoc.Para mempty
@@ -113,7 +135,6 @@ putConstructorsInRegistry = do
constrsToReg $ Pandoc.Citation mempty mempty mempty Pandoc.AuthorInText 0 0
putInReg "Attr" -- used for Attr type alias
putInReg "ListAttributes" -- used for ListAttributes type alias
Lua.pop 1
where
constrsToReg :: Data a => a -> Lua ()
constrsToReg = mapM_ (putInReg . showConstr) . dataTypeConstrs . dataTypeOf
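
Seen from the Lua side, the steps in `initPandocModule` amount roughly
to the following. This is a sketch only: the module name `example` and
its single constructor are hypothetical stand-ins for the real `pandoc`
module, and the naming of the registry keys is an assumption made for
illustration.

```lua
-- Stand-in for the module table pushed by the host program (hypothetical).
local mod = {
  Str = function (text) return {t = 'Str', text = text} end,
}

-- Register as a loaded module, so `require` returns it without searching.
-- The registry field `_LOADED` holds the same table as `package.loaded`.
local registry = debug.getregistry()
registry._LOADED['example'] = mod

-- Copy the constructors into the registry (key naming is an assumption).
for name, fn in pairs(mod) do
  registry['example.' .. name] = fn
end

-- Assign the module to a global variable.
_G.example = mod

-- Both access paths now yield the very same table.
local required = require 'example'
print(required == mod, example == mod)
```

Because the module is registered before any user script runs, a later
`require` call in such a script finds the already-loaded table instead of
loading a second copy.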