author Nick White <git@njw.name> 2020-02-27 17:05:13 +0000
committer Nick White <git@njw.name> 2020-02-27 17:10:48 +0000
commit 6cce1b77f9ea5dec26099398b19d4edd39a88d4e
tree 6cf10d6f629739617c0761dee3abcde48ec13ae8 /README
parent 9a35fe6d0e4c654697126939bfab8f0461d8780a
Add documentation, license notices, and license
Diffstat (limited to 'README')
-rw-r--r-- README 48
1 files changed, 48 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..e5e918f
--- /dev/null
+++ b/README
@@ -0,0 +1,48 @@
+# rescribe.xyz/bookpipeline package
+
+This package contains various tools and functions for the OCR of
+books, with a focus on distributed OCR using short-lived virtual
+servers.
+
+This is a Go package, and can be installed in the standard Go way,
+by running `go get rescribe.xyz/bookpipeline/...`
+
+## Commands
+
+The commands in the cmd/ directory are at the heart of this
+package. For more details on their usage, use `go doc` or read
+doc.go in the package repository. The key commands are:
+
+ - bookpipeline    : processes items from queues, doing preprocessing,
+                     OCR and postprocessing, and moving items on to
+                     the next queue step on completion. This is the
+                     core command of the package.
+ - booktopipeline  : uploads a book to the pipeline and adds it to the
+                     appropriate queue.
+ - getpipelinebook : downloads the pipeline results for a book.
+ - lspipeline      : prints useful information about the status of the
+                     pipeline.
+ - mkpipeline      : sets up storage buckets and queues for use by the
+                     pipeline.
+ - spotme          : starts up a short-lived virtual server running
+                     bookpipeline.
+
+There are also some commands which are more useful in a standalone
+setting:
+
+ - confgraph : creates a graph showing average word confidence of
+               each page of hOCR in a directory
+ - pagegraph : creates a graph showing average confidence of each
+               word in a page of hOCR
+ - pdfbook   : creates a searchable PDF from a directory of hOCR
+               and image files
+
+## Contributions
+
+Any and all comments, bug reports, patches or pull requests are
+very welcome. Please email them to <nick@rescribe.xyz>.
+
+## License
+
+This package is licensed under the GPLv3. See the LICENSE file for
+more details.
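
The confgraph and pagegraph entries in the README above both rely on word confidence values in hOCR. As a rough illustration of that idea (not the package's actual implementation), the sketch below averages the `x_wconf` property that hOCR stores in the `title` attribute of word elements. The function name `avgWordConf` and the regular-expression approach are assumptions made for this example; a real tool would more likely parse the hOCR as XML.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// wconfRe matches the x_wconf property found in the title attribute of an
// hOCR word element, e.g. title="bbox 10 10 50 30; x_wconf 92".
var wconfRe = regexp.MustCompile(`x_wconf\s+(\d+(?:\.\d+)?)`)

// avgWordConf is a hypothetical helper (not part of bookpipeline). It
// returns the mean x_wconf value found in hocr and the number of values
// counted, or (0, 0) when no confidence values are present.
func avgWordConf(hocr string) (float64, int) {
	var sum float64
	var n int
	for _, m := range wconfRe.FindAllStringSubmatch(hocr, -1) {
		v, err := strconv.ParseFloat(m[1], 64)
		if err != nil {
			continue // skip values that do not parse as numbers
		}
		sum += v
		n++
	}
	if n == 0 {
		return 0, 0
	}
	return sum / float64(n), n
}

func main() {
	// A tiny hand-written hOCR fragment with two words.
	page := `<span class="ocrx_word" title="bbox 0 0 10 10; x_wconf 90">Hello</span>
<span class="ocrx_word" title="bbox 12 0 30 10; x_wconf 80">world</span>`

	avg, n := avgWordConf(page)
	fmt.Printf("%.1f over %d words\n", avg, n)
}
```

Running this per page and plotting the averages would give something like the per-page view confgraph describes, while plotting each word's value on a single page corresponds to pagegraph.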