Age | Commit message | Author | |
---|---|---|---|
2020-07-21 | [bookpipeline] If preprocessing fails, email us and remove the job from the queue | Nick White | |
This prevents the current situation where a failed preprocessing job is endlessly repeated, potentially spawning thousands of ocrpage jobs in its wake each time. Note that the email stuff works but requires putting secrets into .go files, so need to rewrite that to read from somewhere more sensible like a dotfile on the host. | |||
2020-07-20 | Merge branch 'master' of https://git.rescribe.xyz/bookpipeline | Nick White | |
2020-07-20 | Update preproc to v0.1.4 to take advantage of vertical wiping parameters, and change WipeFile() to take advantage of them (tag: v0.2.5) | Nick White | |
2020-06-16 | [getallhocrs] Skip files which have already been downloaded | Nick White | |
2020-06-15 | Add getallhocrs tool | Nick White | |
2020-06-03 | Hopefully fix last bug in analyse step of bookpipeline | Nick White | |
2020-06-03 | Fix bug in analyse step of bookpipeline | Nick White | |
2020-06-02 | Fix race condition that could cause errors to be silently discarded | Nick White | |
This was a nasty one. By closing the up channel, the up() function would finish and send to the done channel. This means that the select between err and done would be random as to which was picked, whereas of course if there has been an error that path must be taken. | |||
2020-05-29 | [bookpipeline] Remove local copy of original page image once preprocessed | Nick White | |
2020-05-29 | Merge branch 'minimisedisk' (tag: v0.2.4) | Nick White | |
2020-05-26 | Add -c conntype for necessary tools to allow local connection to be used | Nick White | |
2020-05-22 | Fix bookpipeline failing if shutdown option isn't used | Nick White | |
2020-05-22 | [untested] Use less disk space (branch: minimisedisk) | Nick White | |
There are several ways that disk usage is reduced with this patch: (1) files are deleted as soon as they have been uploaded; (2) once a page image has been added to a PDF, it is immediately deleted. This should allow much larger books to be processed without needing bigger disks. | |||
2020-05-19 | Add getandpurgequeue debugging tool | Nick White | |
2020-04-21 | Simplify spotme | Nick White | |
2020-04-14 | Add getbests tool that was previously in the utils repo | Nick White | |
2020-04-14 | Briefly document each of the commands in a godoc friendly way, and improve the cloudsettings documentation slightly | Nick White | |
2020-04-07 | Remove unused OCR queue (was superseded by the ocrpage queue some time ago) | Nick White | |
2020-04-07 | gofmt | Nick White | |
2020-04-07 | Separate out cloud settings into their own file, cloudsettings.go | Nick White | |
2020-03-31 | Disable autoshutdown by default for bookpipeline, and update to ami 0.11 (which re-enables it for spot instances) | Nick White | |
2020-03-31 | [bookpipeline] Fix typo in previous commit and rename HeartbeatTime to HeartbeatSeconds, as it is not a Time | Nick White | |
2020-03-31 | [bookpipeline] Stop using filepath.Join for storage keys, as we want to ensure the delimiter is always a / | Nick White | |
2020-03-31 | [bookpipeline] Improve logging output | Nick White | |
2020-03-31 | [bookpipeline] Add (experimental) log saving functionality | Nick White | |
2020-03-30 | [bookpipeline] Clean up autoshutdown | Nick White | |
2020-03-30 | [bookpipeline] Enable real shutdown when bookpipeline has been idle for 5 minutes | Nick White | |
2020-03-30 | [bookpipeline] Neaten shutdown fix | Nick White | |
2020-03-30 | [bookpipeline] Fix hang bug when restarting shutdown timer | Nick White | |
2020-03-30 | Rewrite autoshutdown to do things right [bugs excluded] (wip) | Nick White | |
2020-03-24 | [bookpipeline] Improve autoshutdown wip | Nick White | |
2020-03-24 | [bookpipeline] Add experimental (dummy) shutdown part | Nick White | |
2020-03-23 | [getpipelinebook] Switch to MinimalInit() so that it can be run without SQS permissions | Nick White | |
2020-03-23 | Add Log() function to Pipeliner interface | Nick White | |
This simplifies things nicely from using conn.GetLogger().Println() to conn.Log() | |||
2020-03-23 | Replace errors.New(fmt.Sprintf with fmt.Errorf | Nick White | |
Embarrassing I hadn't noticed the fmt.Errorf function before, but better late than never. | |||
2020-03-23 | Don't try to make a graph with one line (it will fail), and don't mark analysis as failed if graph isn't made for that reason | Nick White | |
2020-03-23 | [getpipelinebook] Add -binarisedpdf and -colourpdf flags | Nick White | |
2020-03-23 | [getpipelinebook] Add -graph flag to download just graphs | Nick White | |
2020-03-09 | Add nobooks flag to lspipeline so it has a faster mode | Nick White | |
2020-02-27 | Remove fonttobytes (use the one in rescribe.xyz/utils repo instead) | Nick White | |
2020-02-27 | Add documentation, license notices, and license | Nick White | |
2020-02-27 | Improve usage description of confgraph and pagegraph | Nick White | |
2020-02-05 | Fix allOCRed for wipeonly books (hopefully) | Nick White | |
allOCRed was checking for wipePattern files, however they should have been transformed into the regular preprocessedPattern for OCR anyway, so shouldn't have been directly OCRed. Thus, allOCRed was mistakenly looking for .hocr versions of the original wipePattern files, which never would have been produced. | |||
2020-01-22 | [pagegraph] Stop printing debug output | Nick White | |
2020-01-22 | [pagegraph] Fix bug where word graphs weren't stable as their number wasn't parsed by graph, and add line or word option | Nick White | |
2020-01-22 | Make pagegraph use lines again | Nick White | |
2020-01-22 | Remove unused function in pagegraph | Nick White | |
2020-01-21 | Add pagegraph tool | Nick White | |
2019-12-17 | Add png flag to getpipelinebook | Nick White | |
2019-12-17 | Add pdf flag to getpipelinebook | Nick White | |