Commit graph

15 commits

Author SHA1 Message Date
ae938acc02 Merge branch 'main' into extract-text 2020-02-12 17:34:56 +01:00
325250383a Add support for fonts and implement MacRomanEncoding 2020-02-08 08:15:32 +01:00
8373bd1ea0 Removing +x permission on getText source that shouldn't ever have been set 2019-11-29 19:07:54 +01:00
7eca875900 Improve getObj example to catch no-existing ObjectId and default to listing existing ObjectIds when none is provided 2019-11-29 11:53:08 +01:00
f9f799c59b Take the dirty code of «getText» and turn it into a relatively clean module exposing pages, that can be retrieved all at once or by page number (numbered human-style, starting from 1) 2019-11-29 11:51:35 +01:00
c9f050e64b Remove deprecated debug script and forgotten comments to bypass the selective export of Text module 2019-10-14 10:17:15 +02:00
3a3e1533b4 Clean ByteString types to identify when a ByteString contains the representation of an integer in a given base and fix the last remaining PDF string (un)escaping issue 2019-10-14 10:17:15 +02:00
36d7f9b819 Still debugging, broke pretty much everything and finally implementing a proper coderange parsing for CMap because apparently that's necessary 2019-10-14 10:17:15 +02:00
3b59fd0c61 Separate CMap and Text in two distinct modules 2019-10-14 10:17:15 +02:00
0374b72920 Finish implementing reading, still bugs to investigate 2019-10-14 10:17:15 +02:00
e7484ef536 Completely lost, the same old Char8 / Word8 again, implemented all the text reading, still needing a couple details to parse CMaps 2019-10-14 10:17:15 +02:00
f9e5683bf4 WIP: Use previous changes to start implementing font caching and text parsing (still very broken, doesn't compile) 2019-10-14 10:17:15 +02:00
6f3c159ea7 Adding a module to implement text reading and a demo program to go with it 2019-10-14 10:17:15 +02:00
68f90d20e2 Implement PDF's multilayer updates and use it in getObj to display only the current version of the object taken into account instead of the concatenation of all its versions 2019-09-22 01:40:39 +02:00
9ab010de61 Add to example programs to show how the lib can be used 2019-09-20 22:42:17 +02:00