interface, to ensure no "duplicate function" errors when compiling

they need
We were using Pipeliner as a catch-all, but it's nicer if each function can state exactly what it needs, e.g. download functionality, so decompose the interface so that's how things work now.
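A minimal sketch of that kind of decomposition, using illustrative Downloader/Uploader/Queuer interface names rather than bookpipeline's actual names and signatures: a single concrete connection type can still satisfy all of the small interfaces, while each function asks only for the capability it uses.

```go
// Sketch of splitting a catch-all interface into narrower ones, so each
// function can declare only the capability it needs. The interface and
// method names here are illustrative, not the project's actual API.
package main

import "fmt"

type Downloader interface {
	Download(bucket string, key string, path string) error
}

type Uploader interface {
	Upload(bucket string, key string, path string) error
}

type Queuer interface {
	AddToQueue(url string, msg string) error
}

// A concrete connection can still satisfy all of the small interfaces...
type Conn struct{}

func (c Conn) Download(bucket, key, path string) error { return nil }
func (c Conn) Upload(bucket, key, path string) error   { return nil }
func (c Conn) AddToQueue(url, msg string) error        { return nil }

// ...but a function that only downloads now states just that.
func fetchPage(d Downloader, bucket, key, path string) error {
	if err := d.Download(bucket, key, path); err != nil {
		return fmt.Errorf("download of %s failed: %v", key, err)
	}
	return nil
}

func main() {
	_ = fetchPage(Conn{}, "inprogress", "book/0001.jpg", "/tmp/0001.jpg")
}
```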
be needed

internal library later as it's only needed for tests

This involved adding a test queue, so it can be run safely without interfering with the pipeline.
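As a rough sketch of the test-queue idea, assuming an SQS-style queue driven through aws-sdk-go, with the queue name ("testqueue") and region chosen purely for illustration:

```go
// Sketch: a test that talks to its own queue, so nothing it sends or
// receives can interfere with the real pipeline queues.
package main

import (
	"testing"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

func TestQueueRoundTrip(t *testing.T) {
	sess, err := session.NewSession(&aws.Config{Region: aws.String("eu-west-2")})
	if err != nil {
		t.Fatalf("session: %v", err)
	}
	svc := sqs.New(sess)

	// A queue used only by tests, separate from the production queues.
	q, err := svc.GetQueueUrl(&sqs.GetQueueUrlInput{QueueName: aws.String("testqueue")})
	if err != nil {
		t.Fatalf("get queue url: %v", err)
	}

	_, err = svc.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    q.QueueUrl,
		MessageBody: aws.String("examplebook"),
	})
	if err != nil {
		t.Fatalf("send: %v", err)
	}

	out, err := svc.ReceiveMessage(&sqs.ReceiveMessageInput{
		QueueUrl:        q.QueueUrl,
		WaitTimeSeconds: aws.Int64(5),
	})
	if err != nil || len(out.Messages) == 0 {
		t.Fatalf("receive: %v", err)
	}
}
```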
rescribe tool

This prevents issues if a .DS_Store file is present in a directory.
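A guard against stray files like .DS_Store could look something like this sketch, which skips dotfiles (and subdirectories) when collecting page images from a directory; the function name and everything beyond the skip itself are assumptions.

```go
// Sketch: collect image files from a book directory while ignoring
// macOS .DS_Store files (and other dotfiles) that may be present.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func imageFiles(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, fmt.Errorf("failed to read directory %s: %v", dir, err)
	}
	var imgs []string
	for _, e := range entries {
		name := e.Name()
		// Skip .DS_Store and any other hidden files rather than trying
		// to process them as page images.
		if strings.HasPrefix(name, ".") || e.IsDir() {
			continue
		}
		imgs = append(imgs, filepath.Join(dir, name))
	}
	return imgs, nil
}

func main() {
	imgs, err := imageFiles(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, i := range imgs {
		fmt.Println(i)
	}
}
```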
available to Pipeliner

an error
This is needed so that in tests the error can be selected out reliably, rather than an empty process signal.
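One reading of this change, sketched with assumed names: the worker reports failures on an error channel, so a test can select on that channel and receive a concrete error value instead of only an empty done signal.

```go
// Sketch: a worker sends any failure on an error channel, so callers
// (and tests) can select on it and see the actual error, instead of
// only receiving an empty "done" signal.
package main

import (
	"errors"
	"fmt"
	"time"
)

func doWork() error {
	return errors.New("something went wrong")
}

func process(errc chan<- error, done chan<- struct{}) {
	if err := doWork(); err != nil {
		errc <- err
		return
	}
	done <- struct{}{}
}

func main() {
	errc := make(chan error)
	done := make(chan struct{})
	go process(errc, done)

	select {
	case err := <-errc:
		fmt.Println("got error:", err)
	case <-done:
		fmt.Println("finished cleanly")
	case <-time.After(time.Second):
		fmt.Println("timed out")
	}
}
```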
destination file, so if it fails an empty file isn't left behind
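The pattern described here, sketched with a plain HTTP fetch standing in for whatever transfer the tool actually performs: write to a temporary file, and only rename it to the destination once the download has fully succeeded.

```go
// Sketch: write the download to a temporary file first and only rename
// it to the real destination on success, so a failed or interrupted
// download never leaves an empty (or partial) file behind.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

func download(url, dest string) error {
	tmp := dest + ".tmp"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}

	resp, err := http.Get(url)
	if err != nil {
		f.Close()
		os.Remove(tmp) // clean up so no empty file is left behind
		return err
	}
	defer resp.Body.Close()

	if _, err = io.Copy(f, resp.Body); err != nil {
		f.Close()
		os.Remove(tmp)
		return err
	}
	if err = f.Close(); err != nil {
		os.Remove(tmp)
		return err
	}

	// Only now does the destination file come into existence.
	return os.Rename(tmp, dest)
}

func main() {
	if err := download("https://example.com/page.jpg", "page.jpg"); err != nil {
		fmt.Fprintln(os.Stderr, "download failed:", err)
		os.Exit(1)
	}
}
```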
lspipeline as there are some hard-to-debug issues in the concurrency version

The -prefix option is useful to us.
Previously only a .jpg for page number 100 was retrieved, which failed if the book had fewer (or unusually named) pages, and also didn't provide a corresponding .hocr at all (bug introduced with 48958d2). Using 'best', which is (effectively) randomly sorted, provides a page that is guaranteed to exist, and a random one at that.
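A speculative sketch of that logic, with every helper name assumed: read the book's 'best' file (one page name per line, in effectively random order), take the first entry, and fetch that page's .jpg and .hocr, rather than hard-coding page 100.

```go
// Sketch of picking a sample page that is guaranteed to exist: read the
// book's 'best' file (a list of page names), take the first entry, and
// fetch the matching .jpg and .hocr. download() is a placeholder for the
// real transfer code.
package main

import (
	"bufio"
	"fmt"
	"os"
)

// download stands in for fetching a remote file to a local path.
func download(key, dest string) error {
	fmt.Println("would download", key, "to", dest)
	return nil
}

func samplePage(book string) error {
	// The 'best' file is assumed to already be present locally and to
	// list one page name per line, in effectively random order.
	f, err := os.Open("best")
	if err != nil {
		return err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	if !scanner.Scan() {
		return fmt.Errorf("no pages listed in best file for %s", book)
	}
	page := scanner.Text()

	// Fetch both the page image and its corresponding OCR output.
	if err := download(book+"/"+page+".jpg", page+".jpg"); err != nil {
		return err
	}
	return download(book+"/"+page+".hocr", page+".hocr")
}

func main() {
	if err := samplePage("examplebook"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```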
ssh://ssh.phx.nearlyfreespeech.net/home/public/bookpipeline

large books

ListObjectWithMeta for single file listing, so we can still be as fast, but do not have a misleading API
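A single-object listing along these lines would keep the call as cheap as before while making the intent explicit; the bucket, key, region and struct fields here are placeholders, not the package's actual API.

```go
// Sketch: list at most one object matching an exact key prefix, returning
// its size and last-modified time. MaxKeys of 1 keeps the request as
// cheap as a single-key lookup without pretending to be a full listing.
package main

import (
	"fmt"
	"os"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

type ObjMeta struct {
	Name string
	Size int64
	Date time.Time
}

func listObjectWithMeta(svc *s3.S3, bucket, key string) (ObjMeta, error) {
	out, err := svc.ListObjectsV2(&s3.ListObjectsV2Input{
		Bucket:  aws.String(bucket),
		Prefix:  aws.String(key),
		MaxKeys: aws.Int64(1),
	})
	if err != nil {
		return ObjMeta{}, err
	}
	if len(out.Contents) == 0 {
		return ObjMeta{}, fmt.Errorf("no object found for %s", key)
	}
	o := out.Contents[0]
	return ObjMeta{Name: *o.Key, Size: *o.Size, Date: *o.LastModified}, nil
}

func main() {
	sess, err := session.NewSession(&aws.Config{Region: aws.String("eu-west-2")})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	meta, err := listObjectWithMeta(s3.New(sess), "examplebucket", "book/best")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(meta.Name, meta.Size, meta.Date)
}
```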
single results from ListObjects requests

up the request markedly

books starting with "1"

already has a hocr directory in it will work