This should be a good way to get around the ongoing heartbeat
issue, as individual page jobs will never come close to the
12-hour mark that can cause the bug.
The OCR page processing is done and working now; still to do
is to populate the queue (rather than the OCR queue) after
preprocessing / wiping.
I presumed this would mean that multiple words next to
each other couldn't be reliably searched for, but this
seems not to be the case.
will restart it soon thereafter)
This time, wait up to 1 second between attempts, reduce the
long-polling time significantly, and keep trying for longer
before giving up.
This approach first sets the remaining visibility timeout to zero.
This should ensure that the message is available to re-find as soon
as the process looks for it.
Correspondingly the delay between checks is much shorter, as there
shouldn't be a reason for much delay.
option to download the original page images too
gets an annotation now, and that band is labelled
finding it and not letting another process steal it.
can be determined
like "010001.bin.png")
rescribe.xyz/bookpipeline
The dependencies from the go.git repo will follow in due course.
This is useful for prebinarised images, which don't need full preprocessing,
but do require wiping, albeit with a more conservative threshold.
for wipe
ensure connections to new IPs not in known_hosts still succeed
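One common way to get this behaviour (an assumption; the commit may achieve it differently, for example with a per-command `-o` flag) is OpenSSH's `accept-new` host key policy, available since OpenSSH 7.6:

```
# ~/.ssh/config — accept host keys from hosts not yet in known_hosts,
# while still rejecting changed keys for hosts already known
Host *
    StrictHostKeyChecking accept-new
```

This keeps protection against key changes on known hosts while letting freshly provisioned instances connect without an interactive prompt.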