Commit Graph

21 Commits

Author SHA1 Message Date
Tissevert aed7af376a WIP: still trying to figure things out, moved to a separate submodule for Navigation, proper naming is hell 2020-02-11 08:29:08 +01:00
Tissevert 9f1b1afafe Implement Text rendering from parsed Content 2020-02-10 10:54:44 +01:00
Tissevert 20466c4f13 WIP: Clean code parsing «pages» (now Content), separated from text rendering (will be reimplemented as an upper layer, also providing modification as stream filters) — Page is also forgotten for now, will need a big improvement in Object navigation 2020-02-09 22:42:57 +01:00
Tissevert 325250383a Add support for fonts and implement MacRomanEncoding 2020-02-08 08:15:32 +01:00
Tissevert f9f799c59b Take the dirty code of «getText» and turn it into a relatively clean module exposing pages, that can be retrieved all at once or by page number (numbered human-style, starting from 1) 2019-11-29 11:51:35 +01:00
Tissevert 42a02808c1 Merge branch 'main' into extract-text 2019-11-27 18:05:47 +01:00
Tissevert 380c1e439b Fix a bug preventing Hufflepdf from reading objects with a ' ' after the `obj` keyword 2019-11-27 18:01:19 +01:00
Tissevert c9f050e64b Remove deprecated debug script and forgotten comments to bypass the selective export of Text module 2019-10-14 10:17:15 +02:00
Tissevert 36d7f9b819 Still debugging, broke pretty much everything and finally implementing a proper coderange parsing for CMap because apparently that's necessary 2019-10-14 10:17:15 +02:00
Tissevert 3b59fd0c61 Separate CMap and Text in two distinct modules 2019-10-14 10:17:15 +02:00
Tissevert 1dd22c3889 Going to try with Text, naturally handling UTF-16 but will still have to parse «int codes» manually from strings 2019-10-14 10:17:15 +02:00
Tissevert c349d9b4c2 Don't trust serializers, they have nothing to do with a reasonable binary encoding 2019-10-14 10:17:15 +02:00
Tissevert e7484ef536 Completely lost, the same old Char8 / Word8 again, implemented all the text reading, still needing a couple details to parse CMaps 2019-10-14 10:17:15 +02:00
Tissevert 6f3c159ea7 Adding a module to implement text reading and a demo program to go with it 2019-10-14 10:17:15 +02:00
Tissevert d6994f0813 Release 0.2.0.0 2019-10-14 10:16:14 +02:00
Tissevert 68f90d20e2 Implement PDF's multilayer updates and use it in getObj to display only the current version of the object taken into account instead of the concatenation of all its versions 2019-09-22 01:40:39 +02:00
Tissevert 9ab010de61 Add to example programs to show how the lib can be used 2019-09-20 22:42:17 +02:00
Tissevert dd79cb3fc7 Release bugfix v0.1.1.1 2019-05-31 15:16:23 +02:00
Tissevert 11cb6504d7 Go strict ByteStrings with attoparsec 2019-05-24 10:48:09 +02:00
Tissevert b60f337cc4 First useable version 2019-05-18 11:09:03 +02:00
Tissevert 2c165daaa7 Finally opt for uppercase Hufflepdf and rename cabal package 2019-05-18 09:49:31 +02:00