Age / Commit message / Author
2019-11-19Add ocrpage queue for processing individual pagesNick White
This should be a good way to get around the ongoing heartbeat issue, as individual page jobs will never come close to the 12 hour mark that can cause the bug. The OCR page processing is done and working now; still to do is populating the ocrpage queue (rather than the ocr queue) after preprocessing / wiping.
2019-11-12Merge branch 'addpdf'Nick White
2019-11-12Embed a font, compressed, into the binaryNick White
2019-11-12Fix sleep in unstickocrNick White
2019-11-12Add unstickocr tool, until the heartbeat bug is eliminatedNick White
2019-11-12Add spotme command to start appropriate spot instancesNick White
2019-11-12Merge branch 'addpdf'Nick White
2019-11-11Add go.mod and go.sumNick White
2019-11-11Switch to main gofpdf, now our SetTextRenderingMode has been mergedNick White
2019-11-01Compress the font with zlib, and include it in repoNick White
2019-10-31Add capability to embed font files into toolNick White
2019-10-31PDF: add functionality to use "best" file if it existsNick White
2019-10-31PDF: add space to each word to ensure copy-paste ability from more PDF readersNick White
2019-10-31PDF: lay out every word with coordinates separatelyNick White
I presumed this would mean that multiple words next to each other couldn't be reliably searched for, but this seems not to be the case.
2019-10-31Add flag to switch between binarised and colour outputNick White
2019-10-31Move PDF handling code to a separate fileNick White
2019-10-31Many improvements to pdfbook; basically working nowNick White
2019-10-31Add work in progress PDF producerNick White
2019-10-29Print heartbeat error on failureNick White
2019-10-29Debugging: kill process immediately a heartbeat error is detected (systemd will restart it soon thereafter)Nick White
2019-10-29Another attempt to fix the ongoing heartbeat issueNick White
This time wait up to 1 second between attempts, reduce long polling time significantly, and attempt for longer before giving up.
2019-10-28Try to fix heartbeat renew issue more fullyNick White
This approach first sets the remaining visibility timeout to zero. This should ensure that the message is available to re-find as soon as the process looks for it. Correspondingly the delay between checks is much shorter, as there shouldn't be a reason for much delay.
2019-10-23getpipelinebook: default to downloading corresponding page images, and add option to download the original page images tooNick White
2019-10-23Manually calculate yticks, so they fall on reasonable numbersNick White
2019-10-23Add more annotations to graph; anything outside of the 80% "normal" band gets an annotation now, and that band is labelledNick White
2019-10-17Adjust the heartbeat searching function to hopefully have better luck at finding it and not letting another process steal it.Nick White
2019-10-16Rewrite booktopipeline to use bookpipeline aws interfaceNick White
2019-10-16Sort book list in lspipeline by modified dateNick White
2019-10-16Ensure booktopipeline complains if given too many argumentsNick White
2019-10-16Another attempted fix to "too many open files" issueNick White
2019-10-16Ensure files are promptly closed by booktopipelineNick White
2019-10-11Ensure graph produces output by falling back on generic page numbers if none can be determinedNick White
2019-10-09Make confgraph and graph in general more resilient to bad inputNick White
2019-10-09Match prebinarised presegmented output from ocropus in wipepattern (named like "010001.bin.png")Nick White
2019-10-08Update paths of other rescribe importsNick White
2019-10-08Separate out bookpipeline from catch-all go.git repo, and rename to rescribe.xyz/bookpipelineNick White
The dependencies from the go.git repo will follow in due course.
2019-10-07Ensure wipe pipeline uses the expected png filesNick White
2019-10-02Improve usage notice for booktopipelineNick White
2019-10-02Add -prebinarised flag to booktopipelineNick White
2019-10-02gofmtNick White
2019-10-02Add wipeonly queue and functionalityNick White
This is useful for prebinarised images, which don't need full preprocessing, but do require wiping, albeit with a more conservative threshold.
2019-09-27Improve wiping procedure to work better with 2 column layoutsNick White
2019-09-27Fix crash bug when graph was used on source with fewer than 10 pagesNick White
2019-09-27One more update of graph.go to correspond to new go-chart, and improve usage for wipeNick White
2019-09-27Hardcode to ignore "workhorse" from logsNick White
2019-09-27Update usage of go-chart to correspond to latest version of libraryNick White
2019-09-24gofmtNick White
2019-09-24Improve ssh logs; ensure only fully operational servers are tried, and ensure connections to new ips not in known_hosts still succeedNick White
2019-09-24Do ssh log collection concurrentlyNick White
2019-09-24Get ssh logs from all running serversNick White
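
The 2019-09-24 entries above describe fetching logs from all running servers and then doing that collection concurrently. As a rough illustration of that general pattern (not the repository's actual code), the following Go sketch fans out one goroutine per server and waits for all of them. The names `fetchFunc`, `result` and `collectLogs` are hypothetical, and the fetch step is a stand-in function rather than a real ssh call.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc is a hypothetical stand-in for whatever retrieves one
// server's log; a real tool might run a command over ssh here.
type fetchFunc func(host string) (string, error)

// result pairs a host with the log text fetched from it, or the error.
type result struct {
	Host string
	Text string
	Err  error
}

// collectLogs calls fetch for every host concurrently and returns one
// result per host, in the same order as the hosts slice.
func collectLogs(hosts []string, fetch fetchFunc) []result {
	results := make([]result, len(hosts))
	var wg sync.WaitGroup
	for i, h := range hosts {
		wg.Add(1)
		// Each goroutine writes only to its own slot, so no mutex is needed.
		go func(i int, h string) {
			defer wg.Done()
			text, err := fetch(h)
			results[i] = result{Host: h, Text: text, Err: err}
		}(i, h)
	}
	wg.Wait()
	return results
}

func main() {
	// A fake fetcher that fails for one address, for demonstration only.
	fake := func(host string) (string, error) {
		if host == "10.0.0.2" {
			return "", fmt.Errorf("connection refused")
		}
		return "log from " + host, nil
	}
	for _, r := range collectLogs([]string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, fake) {
		if r.Err != nil {
			fmt.Printf("%s: error: %v\n", r.Host, r.Err)
			continue
		}
		fmt.Printf("%s: %s\n", r.Host, r.Text)
	}
}
```

Recording a per-host error instead of aborting means one unreachable server does not prevent logs from the others being reported, which seems in keeping with the later entry about only trying fully operational servers.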