Age  Commit message  Author
2020-06-01  Mention documentation URL  Nick White
2020-04-14  Remove getbests; it belongs with bookpipeline (and putting it there removes an annoying circular dependency)  [v0.1.3]  Nick White
2020-04-14  Add godoc documentation  Nick White
2020-03-13  Update go.mod now that getbests util has a dependency  [v0.1.2]  Nick White
2020-03-13  Add simple "getbests" utility, useful for statistics gathering  Nick White
2020-03-13  Add copyright statements to each file  Nick White
2020-02-28  Add license, copyright statements and a basic readme  [v0.1.1]  Nick White
2020-02-27  Add go.mod  [v0.1.0]  Nick White
2020-02-27  Reorganise all commands to be behind cmd/  Nick White
2020-02-20  [pare-gt] gofmt  Nick White
2020-02-20  [pare-gt] Fix sampling formula, make robust in the face of a 100% sample request, and fix up test output  Nick White
2020-02-20  [pare-gt] Add some tests, and make deterministic  Nick White
            These tests have uncovered at least 2 bugs that haven't yet been squashed:
            - 1% selection hangs
            - 20% selection only takes as many as 10%
2020-02-20  [pare-gt] gofmt  Nick White
2020-02-19  Split sampling functionality in pare-gt into a separate function that can be tested (coming soon)  Nick White
2020-02-11  Add pare-gt tool  Nick White
2020-01-22  Fix up boxtotxt tool  Nick White
2020-01-22  Add GetWordConfs function to hocr pkg  Nick White
2020-01-22  Add simple boxtotxt tool  Nick White
2019-11-12  Clean up, and add comment explaining design choice to fonttobytes  Nick White
2019-11-12  Add fonttobytes, to embed the font into pdf tools in due course  Nick White
2019-10-31  Export a couple of more generally useful functions  Nick White
2019-10-30  Simplify and document hocr package slightly better  Nick White
2019-10-23  Add tiny doc.go, hopefully will ensure 'go get rescribe.xyz/utils' doesn't return an error for lack of .go files  Nick White
2019-10-23  Make bucket-lines and related packages more robust  Nick White
            bucket-lines would crash for any line that didn't have a corresponding image. Lines which weren't grayscale would also cause crashes; now they are just converted to grayscale if necessary. As a bonus, lines in jpeg can also be decoded successfully.
2019-10-08  Remove parts that have been moved elsewhere, and rename to rescribe.xyz/utils  Nick White
            bookpipeline is now at rescribe.xyz/bookpipeline
            preproc is now at rescribe.xyz/preproc
            integralimg is now at rescribe.xyz/preproc/integralimg
2019-10-07  Ensure wipe pipeline uses the expected png files  Nick White
2019-10-02  Improve usage notice for booktopipeline  Nick White
2019-10-02  Add -prebinarised flag to booktopipeline  Nick White
2019-10-02  gofmt  Nick White
2019-10-02  Add wipeonly queue and functionality  Nick White
            This is useful for prebinarised images, which don't need full preprocessing, but do require wiping, albeit with a more conservative threshold.
2019-09-27  Improve wiping procedure to work better with 2 column layouts  Nick White
2019-09-27  Fix crash bug when graph was used on source with less than 10 pages  Nick White
2019-09-27  One more update of graph.go to correspond to new go-chart, and improve usage for wipe  Nick White
2019-09-27  Hardcode to ignore "workhorse" from logs  Nick White
2019-09-27  Update usage of go-chart to correspond to latest version of library  Nick White
2019-09-24  gofmt  Nick White
2019-09-24  Improve ssh logs; ensure only fully operational servers are tried, and ensure connections to new ips not in known_hosts still succeed  Nick White
2019-09-24  Do ssh log collection concurrently  Nick White
2019-09-24  Get ssh logs from all running servers  Nick White
2019-09-24  Add list of books done and in progress to lspipeline  Nick White
2019-09-24  Rewrite GetInstanceDetails so page function is separate  Nick White
2019-09-24  Move ec2 stuff out of lspipeline and into aws.go  Nick White
2019-09-23  gofmt  Nick White
2019-09-23  Move the sqs stuff out to aws.go  Nick White
2019-09-19  Add queue listing to lspipeline  Nick White
2019-09-19  Switch to using a goroutine for ec2 instance info, so can do all aws requests concurrently in due course  Nick White
2019-09-18  Add start of lspipeline  Nick White
2019-09-17  gofmt  Nick White
2019-09-16  Be more careful to try to grab the message after a heartbeat failure more quickly  Nick White
            Rather than waiting for the whole length of a visibility timeout, in which time another process may grab the message, instead wait a short amount of time, each time the message is searched for. Also add a bit more logging.
2019-09-14  Ensure enough time has elapsed before looking for the message to reget in the case of heartbeat running out  Nick White