| Age | Commit message | Author |
|---|---|---|
|  | graph.png file, and allow failure to download that as it won't be created in the case of a 1-page book, which is fine | 
|  | don't lock gui when processing | 
|  | other of binarised or colour may not exist | 
|  | return an error if none downloaded, as there are times when the colour PDF will not exist, which is fine | 
|  | labels for the progress bar text to show what's being done | 
|  | ends, so multiple books can be processed by the gui one after the other | 
|  | rather than an entry | 
|  | folder | 
|  | are used to create a coherent graph if any page numbers cannot be found from file names | 
|  | upload | 
|  | filename when images are uploaded to the pipeline | 
|  | There are several TODO items before this can be considered "good
enough", let alone complete. See the comments in the code for
details.
On a good day, with a fair wind, though, this works. | 
|  | change) | 
|  | output for training file not found, so that it's clear that the file specified may not exist | 
|  | Previously for PDFs using binarised images we kept them as PNG, but
there's no good reason to do so; it's better to just get the space
savings on offer from JPEG. | 
|  | is chosen | 
|  | EC2's rate limiting | 
|  | makefile to where it makes sense | 
|  | - Words are stretched to fit their boxes, which means the accuracy
  is now very high indeed. This was done by modifying gofpdf to add
  the SetCellStretchToFit function, which will hopefully be
  upstreamed in due course.
- Copy-pasting from a PDF works well, with lines rarely if ever being
  erroneously broken by the PDF reader. There was quite a bit of
  trial and error to improve this, and stretching the text plus adding
  a space after the word in CellFormat worked best (it also preserves
  the accuracy of word and character locations). | 
|  | things neater in the PDF in most cases |