<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bookpipeline/cmd, branch v0.3.3</title>
<subtitle>Tools to process books in a cloud based pipeline system</subtitle>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/'/>
<entry>
<title>rescribe: change default training directory to trainings/</title>
<updated>2021-03-16T12:26:18+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-03-16T12:26:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=8ab178eaa23215c8f5ac41ca36f7863d56d06cf3'/>
<id>8ab178eaa23215c8f5ac41ca36f7863d56d06cf3</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>lspipeline: Rename to lspipeline-ng, and restore pre-concurrency version to lspipeline, as there are some hard-to-debug issues in the concurrency version</title>
<updated>2021-02-22T16:02:55+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-02-22T16:02:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=866aa409f8844ab2cb8d578672a703b4ddead30c'/>
<id>866aa409f8844ab2cb8d578672a703b4ddead30c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>getsamplepages: Add -prefix option, and use 'best' to get random page numbers</title>
<updated>2021-02-15T17:09:20+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-02-15T17:09:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=44a027984044a55a8a483268ddf0b841e9f33e83'/>
<id>44a027984044a55a8a483268ddf0b841e9f33e83</id>
<content type='text'>
The -prefix option is useful to us.

Previously only a .jpg for page number 100 was retrieved, which
failed if the book had fewer (or unusually named) pages, and also
didn't provide a corresponding .hocr at all (bug introduced with
48958d2). Using 'best', which is (effectively) randomly sorted,
provides a page that is guaranteed to exist, and a random one at that.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The -prefix option is useful to us.

Previously only a .jpg for page number 100 was retrieved, which
failed if the book had fewer (or unusually named) pages, and also
didn't provide a corresponding .hocr at all (bug introduced with
48958d2). Using 'best', which is (effectively) randomly sorted,
provides a page that is guaranteed to exist, and a random one at that.
</pre>
</div>
</content>
</entry>
<entry>
<title>Make ListObjectsWithMeta generic again and create a specialised ListObjectWithMeta for single file listing, so we can still be as fast, but do not have a misleading API</title>
<updated>2021-01-26T14:56:10+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-01-26T14:56:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=5c3cee66a90ce6ef87e125b3bf011a6903d38083'/>
<id>5c3cee66a90ce6ef87e125b3bf011a6903d38083</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Improve lspipeline concurrency by removing WaitGroup stuff</title>
<updated>2021-01-26T14:17:19+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-01-26T14:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=54150b54cd06e3deba44e73b151070b74a4d8e76'/>
<id>54150b54cd06e3deba44e73b151070b74a4d8e76</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Speed up lspipeline by making s3 requests concurrently and only processing single results from ListObjects requests</title>
<updated>2021-01-26T13:50:02+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-01-26T13:50:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=86cc5d6c921ac05e0d08f66b205b51e1f5adb938'/>
<id>86cc5d6c921ac05e0d08f66b205b51e1f5adb938</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[rmbook] Append / to end of bookname, to ensure e.g. "1" doesn't match all books starting with "1"</title>
<updated>2020-12-15T12:38:36+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-12-15T12:38:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=7d9f77c2f1102aec026d1af78d1fe4725ed76674'/>
<id>7d9f77c2f1102aec026d1af78d1fe4725ed76674</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[rmbook] Add -dryrun flag</title>
<updated>2020-12-15T12:37:43+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-12-15T12:37:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=fb1069b504e8cd37b9a2bcdccefa9699d0e1dee9'/>
<id>fb1069b504e8cd37b9a2bcdccefa9699d0e1dee9</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add rmbook tool</title>
<updated>2020-12-14T17:08:14+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-12-14T17:08:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=9147e57a3a634ad303e8f1e7c456988996d5c75b'/>
<id>9147e57a3a634ad303e8f1e7c456988996d5c75b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[rescribe] Fix up *.hocr glob, which ensures that using a savedir that already has a hocr directory in it will work</title>
<updated>2020-12-07T17:04:12+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-12-07T17:04:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=17b2d91d5f323fd985ca012e50d36908cbceba87'/>
<id>17b2d91d5f323fd985ca012e50d36908cbceba87</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
