Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
{-# LANGUAGE PatternGuards #-}
|
2017-12-28 16:39:47 +01:00
|
|
|
{-# LANGUAGE OverloadedStrings #-}
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
{-# LANGUAGE MultiWayIf #-}
|
2017-12-28 16:39:47 +01:00
|
|
|
|
|
|
|
module Tests.Writers.Powerpoint (tests) where
|
|
|
|
|
|
|
|
import Text.Pandoc
|
|
|
|
import Test.Tasty
|
2018-01-22 15:02:20 +01:00
|
|
|
import Test.Tasty.Golden.Advanced
|
2017-12-28 16:39:47 +01:00
|
|
|
import Codec.Archive.Zip
|
2017-12-29 15:18:54 +01:00
|
|
|
import Text.XML.Light
|
2018-01-22 15:02:20 +01:00
|
|
|
import qualified Data.ByteString as BS
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
import qualified Data.ByteString.Lazy as BL
|
|
|
|
import qualified Data.Text.IO as T
|
2018-01-22 15:02:20 +01:00
|
|
|
import Data.List (isPrefixOf, isSuffixOf, sort, (\\), intercalate, union)
|
|
|
|
import Data.Maybe (catMaybes, mapMaybe)
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
import Tests.Helpers
|
|
|
|
import Data.Algorithm.Diff
|
|
|
|
|
|
|
|
compareXMLBool :: Content -> Content -> Bool
|
|
|
|
-- We make a special exception for times at the moment, and just pass
|
|
|
|
-- them because we can't control the utctime when running IO. Besides,
|
|
|
|
-- so long as we have two times, we're okay.
|
|
|
|
compareXMLBool (Elem myElem) (Elem goodElem)
|
|
|
|
| (QName "created" _ (Just "dcterms")) <- elName myElem
|
|
|
|
, (QName "created" _ (Just "dcterms")) <- elName goodElem =
|
|
|
|
True
|
|
|
|
compareXMLBool (Elem myElem) (Elem goodElem)
|
|
|
|
| (QName "modified" _ (Just "dcterms")) <- elName myElem
|
|
|
|
, (QName "modified" _ (Just "dcterms")) <- elName goodElem =
|
|
|
|
True
|
|
|
|
compareXMLBool (Elem myElem) (Elem goodElem) =
|
2018-01-23 19:25:20 +01:00
|
|
|
elName myElem == elName goodElem &&
|
|
|
|
elAttribs myElem == elAttribs goodElem &&
|
|
|
|
and (zipWith compareXMLBool (elContent myElem) (elContent goodElem))
|
|
|
|
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
compareXMLBool (Text myCData) (Text goodCData) =
|
2018-01-23 19:25:20 +01:00
|
|
|
cdVerbatim myCData == cdVerbatim goodCData &&
|
|
|
|
cdData myCData == cdData goodCData &&
|
|
|
|
cdLine myCData == cdLine goodCData
|
|
|
|
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
compareXMLBool (CRef myStr) (CRef goodStr) =
|
|
|
|
myStr == goodStr
|
|
|
|
compareXMLBool _ _ = False
|
|
|
|
|
|
|
|
displayDiff :: Content -> Content -> String
|
|
|
|
displayDiff elemA elemB =
|
|
|
|
showDiff (1,1) $ getDiff (lines $ ppContent elemA) (lines $ ppContent elemB)
|
|
|
|
|
2018-01-22 15:02:20 +01:00
|
|
|
goldenArchive :: FilePath -> IO Archive
|
|
|
|
goldenArchive fp = (toArchive . BL.fromStrict) <$> BS.readFile fp
|
|
|
|
|
|
|
|
testArchive :: WriterOptions -> FilePath -> IO Archive
|
|
|
|
testArchive opts fp = do
|
|
|
|
txt <- T.readFile fp
|
|
|
|
bs <- runIOorExplode $ readNative def txt >>= writePowerpoint opts
|
|
|
|
return $ toArchive bs
|
|
|
|
|
|
|
|
compareFileList :: FilePath -> Archive -> Archive -> Maybe String
|
|
|
|
compareFileList goldenFP goldenArch testArch =
|
|
|
|
let testFiles = filesInArchive testArch
|
|
|
|
goldenFiles = filesInArchive goldenArch
|
|
|
|
diffTestGolden = testFiles \\ goldenFiles
|
|
|
|
diffGoldenTest = goldenFiles \\ testFiles
|
|
|
|
|
|
|
|
results =
|
|
|
|
[ if null diffGoldenTest
|
|
|
|
then Nothing
|
|
|
|
else Just $
|
|
|
|
"Files in " ++ goldenFP ++ " but not in generated archive:\n" ++
|
|
|
|
intercalate ", " diffGoldenTest
|
|
|
|
, if null diffTestGolden
|
|
|
|
then Nothing
|
|
|
|
else Just $
|
|
|
|
"Files in generated archive but not in " ++ goldenFP ++ ":\n" ++
|
|
|
|
intercalate ", " diffTestGolden
|
|
|
|
]
|
|
|
|
in
|
|
|
|
if null $ catMaybes results
|
|
|
|
then Nothing
|
|
|
|
else Just $ intercalate "\n" $ catMaybes results
|
|
|
|
|
|
|
|
compareXMLFile' :: FilePath -> Archive -> Archive -> Either String ()
|
|
|
|
compareXMLFile' fp goldenArch testArch = do
|
|
|
|
testEntry <- case findEntryByPath fp testArch of
|
|
|
|
Just entry -> Right entry
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't extract " ++ fp ++ " from generated archive"
|
|
|
|
testXMLDoc <- case parseXMLDoc $ fromEntry testEntry of
|
|
|
|
Just doc -> Right doc
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't parse xml in " ++ fp ++ " from generated archive"
|
|
|
|
|
|
|
|
goldenEntry <- case findEntryByPath fp goldenArch of
|
|
|
|
Just entry -> Right entry
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't extract " ++ fp ++ " from archive in stored pptx file"
|
|
|
|
goldenXMLDoc <- case parseXMLDoc $ fromEntry goldenEntry of
|
|
|
|
Just doc -> Right doc
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't parse xml in " ++ fp ++ " from archive in stored pptx file"
|
|
|
|
|
2018-01-23 19:25:20 +01:00
|
|
|
let testContent = Elem testXMLDoc
|
|
|
|
goldenContent = Elem goldenXMLDoc
|
2018-01-22 15:02:20 +01:00
|
|
|
|
2018-01-23 19:25:20 +01:00
|
|
|
if compareXMLBool goldenContent testContent
|
2018-01-22 15:02:20 +01:00
|
|
|
then Right ()
|
|
|
|
else Left $
|
|
|
|
"Non-matching xml in " ++ fp ++ ":\n" ++ displayDiff testContent goldenContent
|
|
|
|
|
|
|
|
compareXMLFile :: FilePath -> Archive -> Archive -> Maybe String
|
|
|
|
compareXMLFile fp goldenArch testArch =
|
|
|
|
case compareXMLFile' fp goldenArch testArch of
|
|
|
|
Right _ -> Nothing
|
|
|
|
Left s -> Just s
|
|
|
|
|
|
|
|
compareAllXMLFiles :: Archive -> Archive -> Maybe String
|
|
|
|
compareAllXMLFiles goldenArch testArch =
|
|
|
|
let allFiles = filesInArchive goldenArch `union` filesInArchive testArch
|
|
|
|
allXMLFiles = sort $
|
|
|
|
filter
|
|
|
|
(\fp -> ".xml" `isSuffixOf` fp || ".rels" `isSuffixOf` fp)
|
|
|
|
allFiles
|
|
|
|
results =
|
|
|
|
mapMaybe (\fp -> compareXMLFile fp goldenArch testArch) allXMLFiles
|
|
|
|
in
|
|
|
|
if null results
|
|
|
|
then Nothing
|
|
|
|
else Just $ unlines results
|
|
|
|
|
|
|
|
compareMediaFile' :: FilePath -> Archive -> Archive -> Either String ()
|
|
|
|
compareMediaFile' fp goldenArch testArch = do
|
|
|
|
testEntry <- case findEntryByPath fp testArch of
|
|
|
|
Just entry -> Right entry
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't extract " ++ fp ++ " from generated archive"
|
|
|
|
goldenEntry <- case findEntryByPath fp goldenArch of
|
|
|
|
Just entry -> Right entry
|
|
|
|
Nothing -> Left $
|
|
|
|
"Can't extract " ++ fp ++ " from archive in stored pptx file"
|
|
|
|
|
2018-01-23 19:25:20 +01:00
|
|
|
if fromEntry testEntry == fromEntry goldenEntry
|
2018-01-22 15:02:20 +01:00
|
|
|
then Right ()
|
|
|
|
else Left $
|
|
|
|
"Non-matching binary file: " ++ fp
|
|
|
|
|
|
|
|
compareMediaFile :: FilePath -> Archive -> Archive -> Maybe String
|
|
|
|
compareMediaFile fp goldenArch testArch =
|
|
|
|
case compareMediaFile' fp goldenArch testArch of
|
|
|
|
Right _ -> Nothing
|
|
|
|
Left s -> Just s
|
|
|
|
|
|
|
|
compareAllMediaFiles :: Archive -> Archive -> Maybe String
|
|
|
|
compareAllMediaFiles goldenArch testArch =
|
|
|
|
let allFiles = filesInArchive goldenArch `union` filesInArchive testArch
|
|
|
|
allMediaFiles = sort $
|
|
|
|
filter
|
|
|
|
(\fp -> "/ppt/media/" `isPrefixOf` fp)
|
|
|
|
allFiles
|
|
|
|
results =
|
|
|
|
mapMaybe (\fp -> compareMediaFile fp goldenArch testArch) allMediaFiles
|
|
|
|
in
|
|
|
|
if null results
|
|
|
|
then Nothing
|
|
|
|
else Just $ unlines results
|
|
|
|
|
|
|
|
pptxTest :: String -> WriterOptions -> FilePath -> FilePath -> TestTree
|
|
|
|
pptxTest testName opts nativeFP goldenFP =
|
|
|
|
goldenTest
|
|
|
|
testName
|
|
|
|
(goldenArchive goldenFP)
|
|
|
|
(testArchive opts nativeFP)
|
|
|
|
(\goldenArch testArch ->
|
|
|
|
let res = catMaybes [ compareFileList goldenFP goldenArch testArch
|
|
|
|
, compareAllXMLFiles goldenArch testArch
|
|
|
|
, compareAllMediaFiles goldenArch testArch
|
|
|
|
]
|
|
|
|
in return $ if null res then Nothing else Just $ unlines res)
|
2018-01-22 15:55:41 +01:00
|
|
|
(\a -> BL.writeFile goldenFP $ fromArchive a)
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
|
|
|
|
--------------------------------------------------------------
|
2017-12-28 16:39:47 +01:00
|
|
|
|
|
|
|
tests :: [TestTree]
|
2018-01-22 15:02:20 +01:00
|
|
|
tests = [ pptxTest
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
"Inline formatting"
|
2018-01-22 15:02:20 +01:00
|
|
|
def
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
"pptx/inline_formatting.native"
|
|
|
|
"pptx/inline_formatting.pptx"
|
2018-01-22 15:02:20 +01:00
|
|
|
, pptxTest
|
|
|
|
"Slide breaks (default slide-level)"
|
|
|
|
def
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
"pptx/slide_breaks.native"
|
|
|
|
"pptx/slide_breaks.pptx"
|
2018-01-22 15:02:20 +01:00
|
|
|
, pptxTest
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
"slide breaks (slide-level set to 1)"
|
2018-01-22 15:02:20 +01:00
|
|
|
def{ writerSlideLevel = Just 1 }
|
Powerpoint writer tests: New test framework for pptx.
Previously we had tested certain properties of the output PowerPoint
slides. Corruption, though, comes as the result of a numebr of
interrelated issues in the output pptx archive. This is a new
approach, which compares the output of the Powerpoint writer with
files that we know to (a) not be corrupt, and (b) to show the desired
output behavior (details below). This commit introduces three tests
using the new framework. More will follow.
The test procedure: given a native file and a pptx file, we generate a
pptx archive from the native file, and then test:
1. Whether the same files are in the two archives
2. Whether each of the contained xml files is the same. (We skip time
entries in `docProps/core.xml`, since these are derived from IO. We
just check to make sure that they're there in the same way in both
files.)
3. Whether each of the media files is the same.
Note that steps 2 and 3, though they compare multiple files, are one
test each, since the number of files depends on the input file (if
there is a failure, it will only report the first failed file
comparison in the test failure).
2018-01-21 06:25:03 +01:00
|
|
|
"pptx/slide_breaks.native"
|
|
|
|
"pptx/slide_breaks_slide_level_1.pptx"
|
2018-01-22 17:52:03 +01:00
|
|
|
, pptxTest
|
|
|
|
"table of contents"
|
|
|
|
def{ writerTableOfContents = True }
|
|
|
|
"pptx/slide_breaks.native"
|
|
|
|
"pptx/slide_breaks_toc.pptx"
|
|
|
|
, pptxTest
|
|
|
|
"end notes"
|
|
|
|
def
|
|
|
|
"pptx/endnotes.native"
|
|
|
|
"pptx/endnotes.pptx"
|
|
|
|
, pptxTest
|
|
|
|
"end notes, with table of contents"
|
|
|
|
def { writerTableOfContents = True }
|
|
|
|
"pptx/endnotes.native"
|
|
|
|
"pptx/endnotes_toc.pptx"
|
2018-01-22 20:06:47 +01:00
|
|
|
, pptxTest
|
|
|
|
"images"
|
|
|
|
def
|
|
|
|
"pptx/images.native"
|
|
|
|
"pptx/images.pptx"
|
|
|
|
, pptxTest
|
|
|
|
"two-column layout"
|
|
|
|
def
|
|
|
|
"pptx/two_column.native"
|
|
|
|
"pptx/two_column.pptx"
|
2017-12-29 15:18:54 +01:00
|
|
|
]
|