Age | Commit message | Author | |
---|---|---|---|
2019-12-03 | Rewrite lspipeline book listing part to be much faster by taking advantage of the aws CommonPrefixes output | Nick White | |
2019-12-03 | Don't pause between OCR page jobs; this should save us significant amounts of time when there are large numbers of pages | Nick White | |
2019-11-29 | Make error message clear what page is causing issues | Nick White | |
2019-11-26 | Improve usage notice | Nick White | |
2019-11-26 | Ensure error in file walking is correctly returned | Nick White | |
2019-11-20 | Add x/image to go.mod | Nick White | |
2019-11-20 | Merge branch 'addpdf' | Nick White | |
2019-11-20 | Implement image resizing option into PDF generation, so that smaller PDFs can be generated | Nick White | |
2019-11-19 | Send pages to the individual OCR Page queue by default | Nick White | |
This now concludes the OCR Page queue stuff; it should all be working out of the box now. | |||
2019-11-19 | Add ocrpage queue for processing individual pages | Nick White | |
This should be a good way to get around the ongoing heartbeat issue, as individual page jobs will never come close to the 12 hour mark that can cause the bug. The OCR page processing is done and working now; still to do is to populate the queue (rather than the ocr queue) after preprocessing / wiping. | |||
2019-11-12 | Merge branch 'addpdf' | Nick White | |
2019-11-12 | Embed a font, compressed, into the binary | Nick White | |
2019-11-12 | Fix sleep in unstickocr | Nick White | |
2019-11-12 | Add unstickocr tool, until the heartbeat bug is eliminated | Nick White | |
2019-11-12 | Add spotme command to start appropriate spot instances | Nick White | |
2019-11-12 | Merge branch 'addpdf' | Nick White | |
2019-11-11 | Add go.mod and go.sum | Nick White | |
2019-11-11 | Switch to main gofpdf, now our SetTextRenderingMode has been merged | Nick White | |
2019-11-01 | Compress the font with zlib, and include it in repo | Nick White | |
2019-10-31 | Add capability to embed font files into tool | Nick White | |
2019-10-31 | PDF: add functionality to use "best" file if it exists | Nick White | |
2019-10-31 | PDF: add space to each word to ensure copy-paste ability from more PDF readers | Nick White | |
2019-10-31 | PDF: lay out every word with coordinates separately | Nick White | |
I presumed this would mean that multiple words next to each other couldn't be reliably searched for, but this seems not to be the case. | |||
2019-10-31 | Add flag to switch between binarised and colour output | Nick White | |
2019-10-31 | Move PDF handling code to a separate file | Nick White | |
2019-10-31 | Many improvements to pdfbook; basically working now | Nick White | |
2019-10-31 | Add work in progress PDF producer | Nick White | |
2019-10-29 | Print heartbeat error on failure | Nick White | |
2019-10-29 | Debugging: kill process immediately a heartbeat error is detected (systemd will restart it soon thereafter) | Nick White | |
2019-10-29 | Another attempt to fix the ongoing heartbeat issue | Nick White | |
This time wait up to 1 second between attempts, reduce long polling time significantly, and attempt for longer before giving up. | |||
2019-10-28 | Try to fix heartbeat renew issue more fully | Nick White | |
This approach first sets the remaining visibility timeout to zero. This should ensure that the message is available to re-find as soon as the process looks for it. Correspondingly the delay between checks is much shorter, as there shouldn't be a reason for much delay. | |||
2019-10-23 | getpipelinebook: default to downloading corresponding page images, and add option to download the original page images too | Nick White | |
2019-10-23 | Manually calculate yticks, so they fall on reasonable numbers | Nick White | |
2019-10-23 | Add more annotations to graph; anything outside of the 80% "normal" band gets an annotation now, and that band is labelled | Nick White | |
2019-10-17 | Adjust the heartbeat searching function to hopefully have better luck at finding it and not letting another process steal it. | Nick White | |
2019-10-16 | Rewrite booktopipeline to use bookpipeline aws interface | Nick White | |
2019-10-16 | Sort book list in lspipeline by modified date | Nick White | |
2019-10-16 | Ensure booktopipeline complains if given too many arguments | Nick White | |
2019-10-16 | Another attempted fix to "too many open files" issue | Nick White | |
2019-10-16 | Ensure files are promptly closed by booktopipeline | Nick White | |
2019-10-11 | Ensure graph produces output by falling back on generic page numbers if none can be determined | Nick White | |
2019-10-09 | Make confgraph and graph in general more resilient to bad input | Nick White | |
2019-10-09 | Match prebinarised presegmented output from ocropus in wipepattern (named like "010001.bin.png") | Nick White | |
2019-10-08 | Update paths of other rescribe imports | Nick White | |
2019-10-08 | Separate out bookpipeline from catch-all go.git repo, and rename to rescribe.xyz/bookpipeline | Nick White | |
The dependencies from the go.git repo will follow in due course. | |||
2019-10-07 | Ensure wipe pipeline uses the expected png files | Nick White | |
2019-10-02 | Improve usage notice for booktopipeline | Nick White | |
2019-10-02 | Add -prebinarised flag to booktopipeline | Nick White | |
2019-10-02 | gofmt | Nick White | |
2019-10-02 | Add wipeonly queue and functionality | Nick White | |
This is useful for prebinarised images, which don't need full preprocessing, but do require wiping, albeit with a more conservative threshold. |