author | Nick White <git@njw.name> | 2020-11-24 12:40:54 +0000 |
---|---|---|
committer | Nick White <git@njw.name> | 2020-11-24 12:40:54 +0000 |
commit | 0d914a5de3f8169d41df4fcff1ee4aea6d01afbe (patch) | |
tree | 6ba24389250bfc13edd32798af120b3f56dc0d73 /cmd/booktopipeline/main.go | |
parent | 0b9bd466dd2e099bf6c7d3165f1285f4b7a8f38e (diff) |
[booktopipeline] Add a check to disallow adding a book that already exists
This is important because if a book that has already been processed is
added again, an analyse job will be added every time a page is OCRed,
which will clog up the pipeline with unnecessary work. Also, if a book
were added with the same name but differently named files, or a
different number of pages, the results would almost certainly not be as
intended.
If a book really does need to be added with a particular name, either
the original directory can be removed from S3, or "v2" or similar can
be appended to the book name before calling booktopipeline.
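The check described above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: the `lister` interface, the `memStore` type, the `bookExists` helper, and the bucket name `wip` are all hypothetical, and the sketch assumes that listing objects by book name behaves as a key-prefix match.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// lister is a hypothetical stand-in for the storage connection used by
// booktopipeline. It is an assumption for illustration only.
type lister interface {
	ListObjects(bucket, prefix string) ([]string, error)
}

// memStore is an in-memory lister mapping a bucket name to its object keys.
type memStore map[string][]string

// ListObjects returns every key in bucket that starts with prefix.
func (m memStore) ListObjects(bucket, prefix string) ([]string, error) {
	var out []string
	for _, k := range m[bucket] {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out, nil
}

// bookExists reports whether any object is already stored under bookname.
func bookExists(l lister, bucket, bookname string) (bool, error) {
	list, err := l.ListObjects(bucket, bookname)
	if err != nil {
		return false, err
	}
	return len(list) > 0, nil
}

func main() {
	store := memStore{"wip": {"mybook/0001.png", "mybook/0002.png"}}
	// "mybook" is already present; "mybook_v2" shows the renaming workaround.
	for _, name := range []string{"mybook", "mybook_v2"} {
		exists, err := bookExists(store, "wip", name)
		if err != nil {
			log.Fatalln(err)
		}
		fmt.Printf("%s exists: %v\n", name, exists)
	}
}
```

Under the prefix-match assumption, a new name that is itself a prefix of an existing book's name would also be rejected, which is the conservative behaviour.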
Diffstat (limited to 'cmd/booktopipeline/main.go')
-rw-r--r-- | cmd/booktopipeline/main.go | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/cmd/booktopipeline/main.go b/cmd/booktopipeline/main.go
index 7254d78..b4f4d99 100644
--- a/cmd/booktopipeline/main.go
+++ b/cmd/booktopipeline/main.go
@@ -102,6 +102,15 @@ func main() {
 		log.Fatalln(err)
 	}
 
+	verboselog.Println("Checking that a book hasn't already been uploaded with that name")
+	list, err := conn.ListObjects(conn.WIPStorageId(), bookname)
+	if err != nil {
+		log.Fatalln(err)
+	}
+	if len(list) > 0 {
+		log.Fatalf("Error: There is already a book in S3 named %s", bookname)
+	}
+
 	verboselog.Println("Uploading all images are valid in", bookdir)
 	err = pipeline.UploadImages(bookdir, bookname, conn)
 	if err != nil {