Commit messages

- TESSDATA_PREFIX accordingly
- only use 0.1,0.2,0.3
- minimal.
- called rescribe
- No functionality changes, but this should make it easier to make custom
  builds using the pipeline in slightly different ways.
- just be stopped after a period, rather than the whole computer shut down
- (failing) log saving, mail sending, and removing erroneous references to AWS
- This ensures that bookpipeline will still work even if TESSDATA_PREFIX has
  been set to a directory without configs in it.
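The log doesn't show how this guard is implemented; the sketch below is only
one plausible shape for it in Go. The default directory, the configs/hocr
layout and the tessdataDir name are all assumptions, not bookpipeline's
actual code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Assumed fallback location; the real default would be system-dependent.
const defaultTessdata = "/usr/share/tesseract-ocr/tessdata"

// tessdataDir returns TESSDATA_PREFIX if it actually contains the named
// config, and the assumed default location otherwise.
func tessdataDir(cfg string) string {
	if dir := os.Getenv("TESSDATA_PREFIX"); dir != "" {
		if _, err := os.Stat(filepath.Join(dir, "configs", cfg)); err == nil {
			return dir
		}
	}
	return defaultTessdata
}

func main() {
	fmt.Println(tessdataDir("hocr"))
}
```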
- arguments, even if all are strings
- There was no reason not to do this with wipeonly as well, and sure enough a
  single broken PNG image in a wipeonly task would cause the queue to fill
  exponentially, as happened previously.
- as that isn't present on go1.11
- queue
  This prevents the current situation where a failed preprocessing job is
  endlessly repeated, potentially spawning thousands of ocrpage jobs in its
  wake each time. Note that the email stuff works, but it requires putting
  secrets into .go files, so it needs to be rewritten to read from somewhere
  more sensible, like a dotfile on the host.
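The exact queue handling isn't visible in this log; the Go sketch below only
illustrates the idea of removing a failed job so it cannot be retried
forever. Queue, Msg, process and handleOne are hypothetical stand-ins, not
bookpipeline's API.

```go
package main

import "log"

type Msg struct{ Body string }

type Queue interface {
	Receive() (Msg, error)
	Delete(Msg) error
}

func process(m Msg) error { return nil } // placeholder for the real work

func handleOne(q Queue) {
	m, err := q.Receive()
	if err != nil {
		log.Println("receive failed:", err)
		return
	}
	if err := process(m); err != nil {
		// Previously a failed job stayed on the queue and was picked up
		// again and again; logging and falling through to Delete stops
		// that loop. Notifying an operator by email could happen here.
		log.Println("processing failed, dropping job:", err)
	}
	if err := q.Delete(m); err != nil {
		log.Println("delete failed:", err)
	}
}

func main() {}
```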
- and change WipeFile() to take advantage of them
- This was a nasty one. By closing the up channel, the up() function would
  finish and send to the done channel. This meant that the select between
  err and done was random as to which was picked, whereas of course if there
  has been an error that path must be taken.
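A standalone Go sketch of the race described above (hypothetical names, not
the pipeline's code): when both channels are ready, select picks a case at
random, so the done path has to re-check for a pending error.

```go
package main

import (
	"errors"
	"fmt"
)

func main() {
	errc := make(chan error, 1)
	done := make(chan struct{})

	go func() {
		errc <- errors.New("upload failed")
		close(done) // done becomes ready even though an error was sent
	}()

	// With both channels ready, a bare select chooses randomly and could
	// take the done path, silently losing the error. Re-checking errc
	// inside the done case guarantees the error path is always taken.
	select {
	case err := <-errc:
		fmt.Println("error:", err)
	case <-done:
		select {
		case err := <-errc:
			fmt.Println("error:", err)
		default:
			fmt.Println("success")
		}
	}
}
```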
- There are several ways that disk usage is reduced with this patch:
  - Files are deleted as soon as they have been uploaded
  - Once a page image has been added to a PDF, it is deleted immediately
  This should allow much larger books to be processed without needing bigger
  disks.
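A rough Go sketch of this delete-as-you-go pattern, with upload and addToPDF
as hypothetical placeholders for the pipeline's real steps:

```go
package main

import (
	"fmt"
	"os"
)

func upload(path string) error   { return nil } // placeholder
func addToPDF(path string) error { return nil } // placeholder

// processPages uploads each page, embeds it in the PDF, then removes the
// local copy straight away so peak disk usage stays small for large books.
func processPages(pages []string) error {
	for _, p := range pages {
		if err := upload(p); err != nil {
			return fmt.Errorf("upload %s: %v", p, err)
		}
		if err := addToPDF(p); err != nil {
			return fmt.Errorf("add %s to pdf: %v", p, err)
		}
		// Both consumers of the file are done with it; delete it now
		// rather than waiting until the whole book has been processed.
		if err := os.Remove(p); err != nil {
			return fmt.Errorf("remove %s: %v", p, err)
		}
	}
	return nil
}

func main() {}
```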
- the cloudsettings documentation slightly
- (which re-enables it for spot instances)
- HeartbeatSeconds, as it is not a Time
- ensure it is always a / delimiter
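If this refers to joining key-style names in Go (an assumption), the
standard way to guarantee a / separator on every platform is the path
package rather than path/filepath; the example names below are illustrative.

```go
package main

import (
	"fmt"
	"path"
)

func main() {
	// path.Join always uses "/", whereas filepath.Join would use the OS
	// separator ("\" on Windows), which is wrong for S3-style key names.
	fmt.Println(path.Join("bookname", "page_0001.png"))
	// Output: bookname/page_0001.png
}
```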