Age  Commit message  Author
2019-09-11  Work around the SQS limit of 12 hours of visibility timeout  Nick White
This is done by checking for the error that is emitted in such a case, and if it's found trying several times to find the message back in the queue, and returning the message with an updated handle back to the caller to use in the future.
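The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `queue` interface, the `msg` type, the `errVisibilityLimit` sentinel, and `fakeQueue` are all hypothetical stand-ins and do not match the AWS SDK. A real implementation would also need to deal with any other messages it receives while searching.

```go
package main

import (
	"errors"
	"fmt"
)

// errVisibilityLimit is a hypothetical sentinel standing in for the error
// returned once a message's total visibility timeout would exceed 12 hours.
var errVisibilityLimit = errors.New("visibility timeout beyond limit")

// msg is a hypothetical queue message: a body plus the handle used to modify it.
type msg struct {
	Body, Handle string
}

// queue is a hypothetical minimal queue API, not the AWS SDK interface.
type queue interface {
	ExtendVisibility(handle string) error
	Receive() (msg, bool) // false when no message is currently visible
}

// heartbeat extends the visibility of m. If the limit error is hit, it polls
// the queue up to tries times for a message with the same body and returns
// that message, which carries a new handle. Otherwise m is returned unchanged.
func heartbeat(q queue, m msg, tries int) (msg, error) {
	err := q.ExtendVisibility(m.Handle)
	if err == nil {
		return m, nil
	}
	if !errors.Is(err, errVisibilityLimit) {
		return m, err
	}
	for i := 0; i < tries; i++ {
		if found, ok := q.Receive(); ok && found.Body == m.Body {
			return found, nil
		}
	}
	return m, fmt.Errorf("message %q not found after %d tries", m.Body, tries)
}

// fakeQueue simulates a message whose old handle has hit the limit and which
// becomes visible again on the second receive.
type fakeQueue struct {
	receives int
}

func (f *fakeQueue) ExtendVisibility(handle string) error {
	if handle == "old" {
		return errVisibilityLimit
	}
	return nil
}

func (f *fakeQueue) Receive() (msg, bool) {
	f.receives++
	if f.receives < 2 {
		return msg{}, false
	}
	return msg{Body: "book1", Handle: "new"}, true
}

func main() {
	m, err := heartbeat(&fakeQueue{}, msg{Body: "book1", Handle: "old"}, 3)
	fmt.Println(m.Handle, err)
}
```

The caller keeps using whatever message `heartbeat` returns, so later calls pick up the updated handle.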
2019-09-06  Add flags to disable checking various queues  Nick White
2019-09-05  Handle no words found error in a better way so any page that is actually 0 confidence is recognised  Nick White
2019-09-05  Don't abort analysis if we encounter a hocr with no words, just skip it  Nick White
2019-09-05  gofmt  Nick White
2019-09-05  Update Pipeliner interface in getpipelinebook, and update some comments  Nick White
2019-09-04  Rewrite heartbeat so errors during it will be reported, and the aws api doesn't rely on channels  Nick White
2019-09-04  Ensure any channels that need to be consumed before goroutine is finished are done in the case of an error  Nick White
2019-09-03  Improve debug logging  Nick White
2019-09-02  Log upload and download events  Nick White
2019-09-02  Add initial getpipelinebook cmd (untested)  Nick White
2019-08-28  Add medium and bad lines to graphs  Nick White
2019-08-28  Add standalone graph tool; confgraph  Nick White
2019-08-28  Move booktopipeline and mkpipeline into bookpipeline/cmd  Nick White
2019-08-28  Split out bookpipeline to cmd/  Nick White
2019-08-28  Move graph function to its own file, and further improve layout  Nick White
2019-08-28  Separate graph creation from analyse().  Nick White
2019-08-27  Print x axis ticks nicely  Nick White
2019-08-27  Add annotations for pages with confidence below 70  Nick White
2019-08-27  Add basic graphing (still work to do, but basics are working)  Nick White
2019-08-27  Add basic analyse step, working but incomplete  Nick White
2019-08-23  Expect source files to be .jpg  Nick White
2019-08-23  Fix gaping bugs by using correct queues and downloads  Nick White
This has involved refactoring to make the interface simpler, and just use the URLs / IDs for the necessary queues and storage locations, rather than wrap these in functions.
2019-08-22  Generalise preprocessing and ocring to reuse common code  Nick White
2019-08-22  Switch to using flag to process command line, and allow different training to be passed  Nick White
2019-08-22  gofmt  Nick White
2019-08-22  Update usage string, and comments  Nick White
2019-08-22  Improve timing of queue checks  Nick White
Now each queue is checked every 3 minutes, though the channel for each queue check request won't be rechecked until any previous job is completed.
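One way to get the timing described above is to give each queue its own timer channel and run each job inline in the select loop, so no timer channel is read again until the current job returns. The sketch below assumes that approach. `pollQueues`, the queue names, the `rounds` limit, and `demoPause` are hypothetical, and the real loop may differ.

```go
package main

import (
	"fmt"
	"time"
)

// demoPause is a short hypothetical interval for demonstration; the commit
// message above describes a 3 minute interval.
const demoPause = 10 * time.Millisecond

// pollQueues is a hypothetical main loop. Each queue has its own timer
// channel, re-armed when its check starts. Jobs run inline, so neither
// channel is read again until the current job has returned.
func pollQueues(pause time.Duration, rounds int, job func(queue string)) {
	checkPre := time.After(0)
	checkOCR := time.After(0)
	for i := 0; i < rounds; i++ {
		select {
		case <-checkPre:
			checkPre = time.After(pause)
			job("preprocess")
		case <-checkOCR:
			checkOCR = time.After(pause)
			job("ocr")
		}
	}
}

func main() {
	pollQueues(demoPause, 4, func(q string) {
		fmt.Println("checking", q)
	})
}
```

With `pause` set to `3 * time.Minute` and no `rounds` limit, each queue would be checked about every 3 minutes, or later if a job is still running.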
2019-08-22  Fix process finishing by closing dl channel  Nick White
2019-08-20  Handle errors properly with goroutines  Nick White
2019-08-20  Handle errors correctly in main parts of program  Nick White
2019-08-20  Substantially improve problematic object listing part of API  Nick White
Switch to regular non-concurrent stuff, concurrency is better handled by the main program anyway. Now we handle errors properly, and things are way simpler.
2019-08-20  Add basic OCR support, and reorganise code  Nick White
The previously committed thing didn't work, as listobjects was sending to a channel synchronously, so it was never being received. The current API isn't great, mixing synchronous and non-synchronous things, not handling errors consistently, and generally is over complicated. That will be fixed soon.
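The failure mode mentioned above, a synchronous send on a channel that nothing is receiving from, can be illustrated with a small sketch. The names `listObjects` and `collect` here are hypothetical and only mirror the description. Calling `listObjects` directly from the goroutine that later reads the channel would block forever on the first send; running it in its own goroutine avoids that.

```go
package main

import "fmt"

// listObjects is a hypothetical producer that sends each name on out and
// closes the channel when done. Each send on an unbuffered channel blocks
// until a receiver is ready.
func listObjects(names []string, out chan<- string) {
	for _, n := range names {
		out <- n
	}
	close(out)
}

// collect runs the producer in its own goroutine so that every send has a
// receiver, then gathers the results until the channel is closed.
func collect(names []string) []string {
	out := make(chan string)
	go listObjects(names, out)
	var got []string
	for n := range out {
		got = append(got, n)
	}
	return got
}

func main() {
	fmt.Println(collect([]string{"0001.jpg", "0002.jpg"}))
}
```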
2019-08-20  Split aws implementation from main.go in pipelinepreprocess  Nick White
2019-08-20  Export qmsg type  Nick White
2019-08-19  Fix pipelinepreprocess segfaults  Nick White
These were caused by using non-pointer methods, which meant that the values set in Init() were not saved.
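The difference between the two receiver kinds can be shown with a short sketch. The `conn` type and its fields are hypothetical, not the type used in the repository. With a value receiver the method works on a copy, so anything it sets is lost when it returns. If the lost value is a pointer, a later dereference of the still-nil field panics, which would match the segfaults described.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a type whose Init() stores state.
type conn struct {
	queueURL string
}

// InitByValue has a value receiver, so it modifies a copy of c and the
// assignment is lost when the method returns.
func (c conn) InitByValue(url string) {
	c.queueURL = url
}

// InitByPointer has a pointer receiver, so the assignment persists.
func (c *conn) InitByPointer(url string) {
	c.queueURL = url
}

func main() {
	var a, b conn
	a.InitByValue("https://example.invalid/queue")
	b.InitByPointer("https://example.invalid/queue")
	fmt.Printf("value receiver: %q\n", a.queueURL)
	fmt.Printf("pointer receiver: %q\n", b.queueURL)
}
```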
2019-08-19  Work in progress rearchitecture to use interfaces; currently pointers are screwy causing segfaults  Nick White
2019-08-13  Various improvements to pipelinepreprocess  Nick White
- Ensure temporary directory already being present isn't an issue
- Remove temporary directory when done with it
- Ensure any already preprocessed files aren't preprocessed themselves (this could happen in the case of a run stopping half way through)
2019-08-13  Correct typo in bucket name for pipelinepreprocess; tested and seems to work, remarkably  Nick White
2019-08-13  Add bonus verbose log points  Nick White
2019-08-13  Add booktopipeline tool (only lightly tested)  Nick White
2019-08-13  Reduce SQS WaitTime to something in-spec, and add bonus verbose log points  Nick White
2019-08-13  Switch ksizes to use by preprocmulti  Nick White
2019-08-13  Add basic verbose logging capabilities to pipelinepreprocess  Nick White
2019-07-25  Add first draft of pipelinepreprocess - completely untested, will contain bugs  Nick White
2019-07-19  rename setupawspipeline to mkpipeline  Nick White
2019-07-19  rename pipelineaws to setupawspipeline  Nick White
2019-07-19  Add aws pipeline setup  Nick White
2019-06-25  Remove 0.6 binarisation threshold option from preprocmulti  Nick White
2019-06-25  Experimentally adjust wipe threshold according to binarisation level  Nick White