author Nick White <git@njw.name> 2020-11-24 12:40:54 +0000
committer Nick White <git@njw.name> 2020-11-24 12:40:54 +0000
commit 0d914a5de3f8169d41df4fcff1ee4aea6d01afbe
tree 6ba24389250bfc13edd32798af120b3f56dc0d73 /cmd/booktopipeline
parent 0b9bd466dd2e099bf6c7d3165f1285f4b7a8f38e
[booktopipeline] Add a check to disallow adding a book that already exists
This is important because if a book that has already been processed is added again, an analyse job will be added every time a page is OCRed, clogging the pipeline with unnecessary work. Also, if a book were added with the same name but with differently named files or a different number of pages, the results would almost certainly not be as intended. If a book really does need to be added under a particular name, either remove the original directory on S3, or append "v2" or similar to the book name before calling booktopipeline.
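The check described above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: the `lister` interface, the `memStore` fake and the `bookExists` helper are hypothetical names, and the sketch assumes that `ListObjects` returns every key in a bucket that starts with the given prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// lister is a hypothetical stand-in for the storage connection used by
// booktopipeline; only the call needed for the check is modelled, and
// its signature is an assumption.
type lister interface {
	ListObjects(bucket, prefix string) ([]string, error)
}

// memStore is an in-memory fake mapping bucket names to object keys.
type memStore map[string][]string

// ListObjects returns every key in bucket that starts with prefix.
func (m memStore) ListObjects(bucket, prefix string) ([]string, error) {
	var keys []string
	for _, k := range m[bucket] {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	return keys, nil
}

// bookExists reports whether any object is already stored under bookname.
func bookExists(l lister, bucket, bookname string) (bool, error) {
	list, err := l.ListObjects(bucket, bookname)
	if err != nil {
		return false, err
	}
	return len(list) > 0, nil
}

func main() {
	store := memStore{"wip": {"mybook/0001.png", "mybook/0002.png"}}
	for _, name := range []string{"mybook", "newbook"} {
		exists, err := bookExists(store, "wip", name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s exists: %v\n", name, exists)
	}
}
```

In the real command the equivalent of `bookExists` returning true would end the run before any images are uploaded, which is what keeps duplicate analyse jobs out of the queue.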
Diffstat (limited to 'cmd/booktopipeline')
-rw-r--r-- cmd/booktopipeline/main.go 9
1 file changed, 9 insertions, 0 deletions
diff --git a/cmd/booktopipeline/main.go b/cmd/booktopipeline/main.go
index 7254d78..b4f4d99 100644
--- a/cmd/booktopipeline/main.go
+++ b/cmd/booktopipeline/main.go
@@ -102,6 +102,15 @@ func main() {
 		log.Fatalln(err)
 	}
 
+	verboselog.Println("Checking that a book hasn't already been uploaded with that name")
+	list, err := conn.ListObjects(conn.WIPStorageId(), bookname)
+	if err != nil {
+		log.Fatalln(err)
+	}
+	if len(list) > 0 {
+		log.Fatalf("Error: There is already a book in S3 named %s", bookname)
+	}
+
 	verboselog.Println("Uploading all images are valid in", bookdir)
 	err = pipeline.UploadImages(bookdir, bookname, conn)
 	if err != nil {