From 6cce1b77f9ea5dec26099398b19d4edd39a88d4e Mon Sep 17 00:00:00 2001 From: Nick White Date: Thu, 27 Feb 2020 17:05:13 +0000 Subject: Add documentation, license notices, and license --- README | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 README (limited to 'README') diff --git a/README b/README new file mode 100644 index 0000000..e5e918f --- /dev/null +++ b/README @@ -0,0 +1,48 @@ +# rescribe.xyz/bookpipeline package + +This package contains various tools and functions for the OCR of +books, with a focus on distributed OCR using short-lived virtual +servers. + +This is a Go package, and can be installed in the standard go way, +by running `go get rescribe.xyz/bookpipeline/...` + +## Commands + +The commands in the cmd/ directory are at the heart of this +package. For more details on their usage, use `go doc` or read +doc.go in the package repository. The key commands are: + + - bookpipeline : processes items from queues, doing preprocessing, + ocr and postprocessing, and moving items on to + the next queue step on completion. this is the + core command of the package. + - booktopipeline : uploads a book to the pipeline and adds it to the + appropriate queue. + - getpipelinebook : downloads the pipeline results for a book. + - lspipeline : prints useful information about the status of the + pipeline. + - mkpipeline : sets up storage buckets and queues for use by the + pipeline. + - spotme : starts up a short-lived virtual server running + bookpipeline. + +There are also some commands which are more useful in a standalone +setting: + + - confgraph : creates a graph showing average word confidence of + each page of hOCR in a directory + - pagegraph : creates a graph showing average confidence of each + word in a page of hOCR + - pdfbook : creates a searchable PDF from a directory of hOCR + and image files + +## Contributions + +Any and all comments, bug reports, patches or pull requests would +be very welcomely received. Please email them to . + +## License + +This package is licensed under the GPLv3. See the LICENSE file for +more details. -- cgit v1.2.1-24-ge1ad