sh (master branch)
Various shell scripts, mostly superseded by Go tools
Age
Commit message
Author
2019-06-03
Fix dir-to-pdf output naming
Nick White
2019-05-15
Adjust fullocrdir.sh to latest version of pgconf
Nick White
2019-05-14
Add bookgraphv2, to go hand in hand with fullocrdir
Nick White
2019-05-14
fix typo
Nick White
2019-05-14
Add fullocrdir script, which does multiple binarisation options and picks the ones with the highest confidence
Nick White
2019-05-08
Ensure dir-to-pdf saves to dirname.pdf not dirname/.pdf, and handle all different naming conventions
Nick White
2019-05-08
Make scrape scripts executable
Nick White
2019-05-08
Make scrapers more robust, and have them scrape into a directory per book
Nick White
2019-05-08
Make BNF scraper much more robust
Nick White
2019-05-08
Allow an argument to set pdf savefile, and resize pdf images to be way smaller
Nick White
2019-05-08
Rename pdf prep tool as it creates the pdf too now
Nick White
2019-05-08
Use sane page numbering for erara scraper
Nick White
2019-05-08
Add scrape-erara.sh script (not fully tested)
Nick White
2019-05-08
Set DPI for images, and maximally compress jpg (with binarisation it doesn't make much difference)
Nick White
2019-05-08
Add format-for-hocr-pdf.sh script
Nick White
2019-04-23
Save dehyphenated text to a different file, rather than overwriting the original
Nick White
2019-04-23
Add dehyphenate script
Nick White
2019-04-09
Modify traintessv4.sh to include step to construct final training
Nick White
2019-04-02
Fix bugs in traintessv4.sh
Nick White
2019-04-02
Add tesseractv4 training script
Nick White
2019-03-26
Make book graph scripts more robust to dodgy page filenames, and name bookgraph better
Nick White
2019-03-26
Add nonewlines script
Nick White
2019-03-11
Add basic bsb scraper
Nick White
2019-02-25
Make bookgraph script more readable
Nick White
2019-02-25
Add various helper scripts
Nick White