|
9f1b1afafe
|
Implement Text rendering from parsed Content
|
2020-02-10 10:54:44 +01:00 |
|
|
20466c4f13
|
WIP: Clean code parsing «pages» (now Content), separated from text rendering (will be reimplemented as an upper layer, also providing modification as stream filters) — Page is also forgotten for now, will need a big improvement in Object navigation
|
2020-02-09 22:42:57 +01:00 |
|
|
325250383a
|
Add support for fonts and implement MacRomanEncoding
|
2020-02-08 08:15:32 +01:00 |
|
|
f9f799c59b
|
Take the dirty code of «getText» and turn it into a relatively clean module exposing pages, that can be retrieved all at once or by page number (numbered human-style, starting from 1)
|
2019-11-29 11:51:35 +01:00 |
|
|
42a02808c1
|
Merge branch 'main' into extract-text
|
2019-11-27 18:05:47 +01:00 |
|
|
380c1e439b
|
Fix a bug preventing Hufflepdf from reading objects with a ' ' after the obj keyword
|
2019-11-27 18:01:19 +01:00 |
|
|
c9f050e64b
|
Remove deprecated debug script and forgotten comments to bypass the selective export of Text module
|
2019-10-14 10:17:15 +02:00 |
|
|
36d7f9b819
|
Still debugging, broke pretty much everything and finally implementing a proper coderange parsing for CMap because apparently that's necessary
|
2019-10-14 10:17:15 +02:00 |
|
|
3b59fd0c61
|
Separate CMap and Text in two distinct modules
|
2019-10-14 10:17:15 +02:00 |
|
|
1dd22c3889
|
Going to try with Text, naturally handling UTF-16 but will still have to parse «int codes» manually from strings
|
2019-10-14 10:17:15 +02:00 |
|
|
c349d9b4c2
|
Don't trust serializer, they have nothing todo with a reasonable binary encoding
|
2019-10-14 10:17:15 +02:00 |
|
|
e7484ef536
|
Completely lost, the same old Char8 / Word8 again, implemented all the text reading, still needing a couple details to parse CMaps
|
2019-10-14 10:17:15 +02:00 |
|
|
6f3c159ea7
|
Adding a module to implement text reading and a demo program to go with it
|
2019-10-14 10:17:15 +02:00 |
|
|
d6994f0813
|
Release 0.2.0.0
|
2019-10-14 10:16:14 +02:00 |
|
|
68f90d20e2
|
Implement PDF's multilayer updates and use it in getObj to display only the current version of the object taken into account instead of the concatenation of all its versions
|
2019-09-22 01:40:39 +02:00 |
|
|
9ab010de61
|
Add to example programs to show how the lib can be used
|
2019-09-20 22:42:17 +02:00 |
|
|
dd79cb3fc7
|
Release bugfix v0.1.1.1
|
2019-05-31 15:16:23 +02:00 |
|
|
11cb6504d7
|
Go strict ByteStrings with attoparsec
|
2019-05-24 10:48:09 +02:00 |
|
|
b60f337cc4
|
First useable version
|
2019-05-18 11:09:03 +02:00 |
|
|
2c165daaa7
|
Finally opt for uppercase Hufflepdf and rename cabal package
|
2019-05-18 09:49:31 +02:00 |
|