Age | Commit message | Author |
|
Some issues:
1) The PDF generation library holds every page in memory while it constructs the PDF. That
means these full size PDFs are more likely to fail by running out of memory. There's no
getting around this except by improving the PDF generation library, which is not easy.
2) Currently I've just changed the pipeline to always generate these full size PDFs, and
then the rescribe tool just deletes them if they weren't requested. This is bad in
particular because of point 1, and would probably cause failures in the server
pipeline as a result.
Therefore the plan is to add a tag to queue messages so that full size generation can be
selectively enabled.
Also, it should be split from the loop with colour PDF generation, as holding them both in RAM at
the same time is unnecessary.
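A minimal sketch of how such a queue message tag could be parsed, assuming the message body is a book name optionally followed by whitespace-separated tags. The tag name fullsizepdf and the function parseMsg are hypothetical, chosen only for illustration; the commit message does not say what the tag will be called.

```go
package main

import (
	"fmt"
	"strings"
)

// fullSizeTag is a hypothetical tag name; the real one is not given
// in the commit message.
const fullSizeTag = "fullsizepdf"

// parseMsg splits a queue message body into the book name and a flag
// saying whether full size PDF generation was requested.
func parseMsg(body string) (book string, fullSize bool) {
	fields := strings.Fields(body)
	if len(fields) == 0 {
		return "", false
	}
	book = fields[0]
	for _, f := range fields[1:] {
		if f == fullSizeTag {
			fullSize = true
		}
	}
	return book, fullSize
}

func main() {
	for _, body := range []string{"mybook", "mybook fullsizepdf"} {
		book, full := parseMsg(body)
		fmt.Printf("%s: full size PDF requested: %v\n", book, full)
	}
}
```

With a flag like this, the expensive full size PDF would only be built when the message asks for it, rather than being built and then deleted.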
|
Tesseract
|
graph.png file, and allow failure to download that as it won't be created in the case of a 1 page book, which is fine
|
interface, to ensure no "duplicate function" errors when compiling
|
they need
We were using Pipeliner as a catch-all, but it's nicer if the functions
can just state what they need, e.g. download functionality, so the
interface is now decomposed into smaller ones to allow that.
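The decomposition described above might look roughly like this. It is a sketch only: Pipeliner is the name from the commit message, but the Downloader and Uploader interfaces, their method signatures, fakeConn and fetchAll are all assumptions made for illustration, not the project's actual definitions.

```go
package main

import "fmt"

// Downloader and Uploader are hypothetical narrow interfaces.
type Downloader interface {
	Download(bucket, key, path string) error
}

type Uploader interface {
	Upload(bucket, key, path string) error
}

// Pipeliner embeds the narrow interfaces, for callers that really do
// need everything.
type Pipeliner interface {
	Downloader
	Uploader
}

// fakeConn is a stand-in connection that just records what was asked.
type fakeConn struct{ log []string }

func (c *fakeConn) Download(bucket, key, path string) error {
	c.log = append(c.log, "download "+key)
	return nil
}

func (c *fakeConn) Upload(bucket, key, path string) error {
	c.log = append(c.log, "upload "+key)
	return nil
}

// fetchAll only needs download functionality, so it asks for a
// Downloader rather than the whole Pipeliner.
func fetchAll(d Downloader, bucket string, keys []string) error {
	for _, k := range keys {
		if err := d.Download(bucket, k, k); err != nil {
			return fmt.Errorf("downloading %s: %w", k, err)
		}
	}
	return nil
}

func main() {
	c := &fakeConn{}
	var _ Pipeliner = c // fakeConn still satisfies the full interface
	if err := fetchAll(c, "bucket", []string{"0001.png", "0002.png"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.log)
}
```

A side benefit is that tests can pass in a small fake that implements only the methods a function actually uses.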
|
This involved adding a test queue, so it can be run safely without
interfering with the pipeline.
|
an error
This is needed so that in tests the error can be selected out reliably,
rather than an empty process signal.
|
This can also result in the file being uploaded twice simultaneously,
as up() is running in a separate goroutine. This can cause failures
on Windows, as one upload process tries to remove the file while the
other still has it open for upload. It could probably also fail if
one process completed (so the file was deleted) before the other
had started.
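One general way to guard against this kind of double upload is to let each path be claimed only once, as sketched below. This is not necessarily the fix made in this commit, which is not described here; onceSet and claim are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// onceSet remembers which paths have already been claimed for upload.
type onceSet struct {
	mu   sync.Mutex
	seen map[string]bool
}

// claim reports whether the caller is the first to ask for path.
// Only the first caller should go on to upload and remove the file.
func (s *onceSet) claim(path string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen == nil {
		s.seen = make(map[string]bool)
	}
	if s.seen[path] {
		return false
	}
	s.seen[path] = true
	return true
}

func main() {
	var s onceSet
	var wg sync.WaitGroup
	var mu sync.Mutex
	uploads := 0

	// Two goroutines race to handle the same file; only one wins.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.claim("book/0001.hocr") {
				mu.Lock()
				uploads++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("uploads:", uploads)
}
```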
|
There were a couple of places where a file was uploaded while still open.
The upload is followed by an attempt to remove the file, and Windows
returns an error when removing a file that is still open.
The allOCRed function also included an assumption that the path separator
would be a /, which is always correct for AWS, and correct for local on
Linux and OSX, but not for local Windows. Fixed by leaving the separator
well alone.
Also, the local connection was not stripping a leading \ as it did a
leading /, which caused an issue with Windows local.
Windows local is now tested and working, at least through Wine.
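As an illustration of the leading separator fix, stripping both / and \ from the start of a key can be done in one call. This is a sketch under the assumption that keys are plain strings; stripLeadingSeps is a hypothetical helper name, not the function used in the project.

```go
package main

import (
	"fmt"
	"strings"
)

// stripLeadingSeps removes any leading / or \ characters from key,
// leaving separators inside the key untouched.
func stripLeadingSeps(key string) string {
	return strings.TrimLeft(key, `/\`)
}

func main() {
	fmt.Println(stripLeadingSeps("/book/0001.png"))
	fmt.Println(stripLeadingSeps(`\book\0001.png`))
}
```

Separators in the middle of the key are deliberately left as they are, in line with leaving the separator alone elsewhere.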
|
some error output)
|
only use 0.1,0.2,0.3
|
called rescribe
|