author | Nick White <git@njw.name> | 2021-08-09 15:51:50 +0100 |
---|---|---|
committer | Nick White <git@njw.name> | 2021-08-09 15:51:50 +0100 |
commit | 69eeb41a33f6a764fc6baf1a95e629a6482b67ea (patch) | |
tree | 0e28eef52d0eef3405c16d9ee853e43977e33c4f /lib/hocr | |
parent | 1f2a05e466c195dde83effd82c96d4329259d249 (diff) |
pdf: significantly improve character coordinates
A few good changes to make word coordinate lookups significantly
more accurate:
- Set font size dynamically based on the line height (previously it was
  fixed as size 10)
- Correct height and width of word boxes (previously they were way too
  large, which probably didn't make a difference in the general case,
  but now they're correct)
- Set word box margin to zero
Also change PDF size to A5 paper, as that's closer to an average book page size.
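
The changes described above can be illustrated with a minimal sketch. This is not the code from the commit: the `Box` type and the `mmPerPixel`, `WordRect` and `FontSizePt` helpers are hypothetical names, and scaling the page image to the full A5 width is an assumption. It shows one way to derive a word box's position and size, and a per-line font size, from hOCR-style pixel bounding boxes.

```go
package main

import "fmt"

// Box is a hypothetical bounding box in image pixels, in the same
// order as an hOCR "bbox x0 y0 x1 y1" property.
type Box struct{ X0, Y0, X1, Y1 float64 }

const (
	a5WidthMM = 148.0       // A5 paper is 148 x 210 mm
	ptPerMM   = 72.0 / 25.4 // PDF points per millimetre
)

// mmPerPixel returns the scale that fits an image of the given pixel
// width across the A5 page width (assumption: no page margins).
func mmPerPixel(imgWidthPx float64) float64 {
	return a5WidthMM / imgWidthPx
}

// WordRect converts a word's pixel box into a page rectangle in mm.
// Width and height come straight from the box itself, with no extra
// margin added around the word.
func WordRect(b Box, scale float64) (x, y, w, h float64) {
	return b.X0 * scale, b.Y0 * scale, (b.X1 - b.X0) * scale, (b.Y1 - b.Y0) * scale
}

// FontSizePt derives a font size in points from the height of the
// line box, rather than using one fixed size for every line.
func FontSizePt(line Box, scale float64) float64 {
	return (line.Y1 - line.Y0) * scale * ptPerMM
}

func main() {
	scale := mmPerPixel(1480) // hypothetical page image, 1480 px wide
	line := Box{100, 200, 1300, 254}
	word := Box{100, 200, 300, 250}

	x, y, w, h := WordRect(word, scale)
	fmt.Printf("word rect (mm): x=%.2f y=%.2f w=%.2f h=%.2f\n", x, y, w, h)
	fmt.Printf("font size (pt): %.2f\n", FontSizePt(line, scale))
}
```

Deriving both the box and the font size from the same scale keeps the invisible text layer aligned with the page image, which is what makes coordinate lookups land on the right word.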
Diffstat (limited to 'lib/hocr')
0 files changed, 0 insertions, 0 deletions