|
d9f69014a0
|
Make a couple improvements in performance + add an example script to extract pages from a PDF
|
2020-05-28 18:54:15 +02:00 |
|
|
f6664683c7
|
Once again something that should never have been committed
|
2020-03-20 09:34:53 +01:00 |
|
|
c491e8a70c
|
Forgot to remove deprecated source file
|
2020-03-19 12:53:35 +01:00 |
|
|
09bd706748
|
Export Content operators, needed to write filters like reveal
|
2020-03-19 10:27:29 +01:00 |
|
|
729e312f90
|
Actually, the spec calls 'catalog' what we call 'origin' — use 'catalog' for more clarity in regard to the spec
|
2020-03-19 10:27:29 +01:00 |
|
|
6d265633e4
|
Export Instructions constructor from PDF.Content, used by reveal
|
2020-03-19 10:27:29 +01:00 |
|
|
1eb1c23053
|
Found a nicer way to handle the too long IndirectObjCoordinates for Object Navigation
|
2020-03-19 10:27:29 +01:00 |
|
|
44125f75a6
|
The orphan instance for MonadState s m => MonadReader s m really can't be used, so replace it with a mere function that runs an operation on a ReaderT into the monad State, allowing to borrow operations on MonadReader in a MonadState context
|
2020-03-19 10:27:29 +01:00 |
|
|
c8a5e2b191
|
Wait, CachedFonts are indexed by Id Object so it could be an IdMap actually
|
2020-03-19 10:27:29 +01:00 |
|
|
11640c8465
|
Replace 'cacheFonts' by more versatile 'withFonts' inspired by 'withResources' that avoids having to declare an inline function to capture the 'layer' argument and pass it twice
|
2020-03-19 10:27:29 +01:00 |
|
|
e94a09b3ec
|
Add a Traversable instance for IdMap, needed in reveal and useful in general to be able to use atAll
|
2020-03-19 10:27:29 +01:00 |
|
|
ba7dd6a690
|
Make cacheFonts slightly more useful by passing layer directly to it and run the ReaderT underneath
|
2020-03-19 10:27:29 +01:00 |
|
|
d21e14f9a4
|
Hey, zlib isn't needed anymore for getText since all decoding is done directly in the Box instance for Streams
|
2020-03-19 10:27:28 +01:00 |
|
|
a1c2fbf110
|
Add an alias to Id to lift type ambiguities like 'chunk' in PDF.Content.Text
|
2020-03-19 10:27:28 +01:00 |
|
|
24630a04a1
|
Implement 'w' for Pages Box instances
|
2020-03-19 10:27:28 +01:00 |
|
|
ee5e7500a8
|
Implement 'w' for Box m Chunks Content (Indexed Text)
|
2020-03-19 10:27:28 +01:00 |
|
|
d8aec5bf80
|
Add Box instance for IdMap a b, remove restriction on new keys in the Map instance since it's not really needed and could be better implemented like in OrderedMap by first using 'r'
|
2020-03-19 10:27:28 +01:00 |
|
|
25e2823c75
|
Generalize register to all IdMap a b, since it's gonna be needed by Indexed Text too
|
2020-03-19 10:27:28 +01:00 |
|
|
5027b079eb
|
Include page numbers in chunks label, needed for long documents with many pages
|
2020-03-19 10:27:28 +01:00 |
|
|
5722dd1a04
|
Use IntMap for all Maps on Ids
|
2020-03-19 10:27:28 +01:00 |
|
|
f31e9eb38b
|
Generalize Ids out of Content to handle Object Ids too
|
2020-03-19 10:27:21 +01:00 |
|
|
0f857c457d
|
Use a defined monadic stack in Pages to lift the MonadReader ambiguity and allow finishing to reimplement getText demo
|
2020-03-14 16:57:16 +01:00 |
|
|
40475a3093
|
Clean unneeded stuff separating the monadic type constraint from the actual monad stack used, one more step towards MonadFail -> MonadError
|
2020-03-14 16:55:34 +01:00 |
|
|
a9d3e5d326
|
Clean unused dependencies from Map + use a more defined Monad for the Box Chunks instance, hoping we will be able to clear the whole stack someday and stop requiring that RoContext type, unboxing and reboxing the FontSet for no good
|
2020-03-14 16:27:56 +01:00 |
|
|
f2a99e1fd2
|
Reorder module PDF.Body in alphabetical order
|
2020-03-14 16:25:26 +01:00 |
|
|
5bf2b08fa9
|
Try replacing general monadic type constraint by a definite monad stack
|
2020-03-11 22:35:19 +01:00 |
|
|
5b8d951516
|
WIP: Try about everything that's possible to try, OrderedMap or [(,)], try to decouple Box instance for Content and the one for Indexed Text, breaks getText… will probably require some advanced effect library, there seems to be a weird MonadReader conflict in the error messages
|
2020-03-11 18:55:18 +01:00 |
|
|
d3f1b97f3a
|
Replace the fake instance of Box for Content over Indexed Text with the true one using renderText
|
2020-03-11 18:53:41 +01:00 |
|
|
c4c3e35e09
|
Write said instance
|
2020-03-11 18:52:09 +01:00 |
|
|
10f8c711da
|
Implement set and mapi on OrderedMap for convenience and to write a Box instance over OrderedMap like the one over Map
|
2020-03-11 18:51:49 +01:00 |
|
|
b6c1f670ef
|
Generalize the search for FlateDecode (there can be several filters in an array)
|
2020-03-11 10:47:52 +01:00 |
|
|
3b1a5152e4
|
Try connecting all the Box instances in the getText demo, try to encode pages contents with a simple assoc list
|
2020-03-10 22:57:11 +01:00 |
|
|
a04adff1d2
|
Prepare real instance of Box using renderText
|
2020-03-10 22:55:16 +01:00 |
|
|
103037ffb2
|
Fix mistake in arity of operator "
|
2020-03-10 22:53:27 +01:00 |
|
|
dce10ae63a
|
Keep Page as only a reference object keeping the ObjectId explicit so we can modify the actual objects one day, write an OrderedMap data structure to help
|
2020-03-08 22:18:47 +01:00 |
|
|
f2986da96d
|
Simplify Content abstracting over MonadParser for no reason and provide instead a parse that's in MonadFail to avoid having to handle Either outside
|
2020-03-08 22:16:23 +01:00 |
|
|
673321bf0a
|
Implement encoder for good
|
2020-03-08 22:14:36 +01:00 |
|
|
0ade9cc2f5
|
Implement proper text formatting into PDF instructions using the new encode feature available in Fonts
|
2020-03-08 00:04:18 +01:00 |
|
|
457f1755e6
|
Prepare storing the reverse mapping for CMaps, divided by length to be able to implement encoding with a reasonable complexity
|
2020-03-08 00:02:24 +01:00 |
|
|
ca40d2df76
|
Don't use (!?) operator that doesn't exist before containers 0.5.9 for maximum compatibility
|
2020-03-08 00:00:24 +01:00 |
|
|
44bc898ed3
|
Generalize the Indexed type to handle both arbitrary Content instructions and text-related ones that can be viewed as text chunks
|
2020-03-06 19:21:16 +01:00 |
|
|
1ec47c5d07
|
Update Font type to cover both encoding and decoding — WIP for CMap, but complete though not tested yet for MacRoman encoding
|
2020-03-06 19:19:53 +01:00 |
|
|
6e245189fd
|
Add a simple Box instance that exposes IndexedInstructions within a Content
|
2020-03-05 17:44:38 +01:00 |
|
|
90348c57d6
|
Disable text rendering and font loading from the Page abstraction, this code will have to be moved into a separate Box instance
|
2020-03-05 17:40:58 +01:00 |
|
|
50ac0692b2
|
Implement r for access by PageNumber and clean the mess a bit
|
2020-03-05 10:09:09 +01:00 |
|
|
2b9abc24b6
|
Add a separate instance for Raw streams that doesn't try to decode them
|
2020-03-04 18:31:30 +01:00 |
|
|
309f6ed461
|
Actually re-implement getText with the simpler Box instance
|
2020-03-04 18:19:10 +01:00 |
|
|
93c9863426
|
Remove accidentally committed trailing space on a line
|
2020-03-04 18:14:54 +01:00 |
|
|
7cef65d799
|
Fixed vicious bug introduced by 6096a1a237 (since follow is now automatic for references, it's not called explicitly but should be in case of 'several' Content, which is an array of references, each of which should be expanded); TODO: add a unit test for that
|
2020-03-04 18:14:33 +01:00 |
|
|
d288ecf0ac
|
Start reimplementing getAll as a Box instance and try to separate the various monad run steps
|
2020-03-03 18:17:44 +01:00 |
|