<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bookpipeline/internal/pipeline, branch v1.0.2</title>
<subtitle>Tools to process books in a cloud based pipeline system</subtitle>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/'/>
<entry>
<title>Only generate full-size PDF if requested</title>
<updated>2022-03-21T13:51:51+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-03-21T13:51:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=76d91ea8f65c6ad52efb24ac2c94b22c2908bc5c'/>
<id>76d91ea8f65c6ad52efb24ac2c94b22c2908bc5c</id>
<content type='text'>
Large PDFs require a lot of RAM to generate, so there is a risk of
running out of memory. Generating them when they were not requested
also wastes space and time.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Large PDFs require a lot of RAM to generate, so there is a risk of
running out of memory. Generating them when they were not requested
also wastes space and time.
</pre>
</div>
</content>
</entry>
<entry>
<title>Separate out full-size PDF creation from colour PDF creation, so less memory is needed</title>
<updated>2022-03-11T17:34:48+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-03-11T17:34:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=af8650c074bc111200b132b0918d44cacd423b6e'/>
<id>af8650c074bc111200b132b0918d44cacd423b6e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add initial support for full-size PDF generation</title>
<updated>2022-03-11T13:36:59+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-03-11T13:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=9d1382b69700129a66541d786ba3b784eda56e36'/>
<id>9d1382b69700129a66541d786ba3b784eda56e36</id>
<content type='text'>
Some issues:
1) The PDF generation stores every page in memory while it constructs the PDF. That means
there's a higher chance of running out of memory and failing with full-size PDFs. There's no
getting around this except by improving the PDF generation library, which is not easy.

2) Currently I've just changed the pipeline to always generate these full-size PDFs, and
then the rescribe tool just deletes them if they weren't requested. This is bad, in
particular because of point 1, and would probably cause failures in the server
pipeline as a result.

Therefore the plan is to add a tag to queue messages so that full-size generation can be
selectively enabled.

Also, it should be split from the loop with colour PDF generation, as holding them both in RAM at
the same time is unnecessary.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some issues:
1) The PDF generation stores every page in memory while it constructs the PDF. That means
there's a higher chance of running out of memory and failing with full-size PDFs. There's no
getting around this except by improving the PDF generation library, which is not easy.

2) Currently I've just changed the pipeline to always generate these full-size PDFs, and
then the rescribe tool just deletes them if they weren't requested. This is bad, in
particular because of point 1, and would probably cause failures in the server
pipeline as a result.

Therefore the plan is to add a tag to queue messages so that full-size generation can be
selectively enabled.

Also, it should be split from the loop with colour PDF generation, as holding them both in RAM at
the same time is unnecessary.
</pre>
</div>
</content>
</entry>
<entry>
<title>Adjust file renaming to make suffixes of png and jpg files lowercase and change jpeg to jpg</title>
<updated>2022-02-28T16:41:30+00:00</updated>
<author>
<name>Antonia Rescribe</name>
<email>antonia@rescribe.xyz</email>
</author>
<published>2022-02-28T15:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=787074fda5c9e4c7c959e20dd8ffe2d39f248a14'/>
<id>787074fda5c9e4c7c959e20dd8ffe2d39f248a14</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add PreNoWipe queue, that just does binarisation but no wiping</title>
<updated>2022-02-28T16:17:35+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-02-28T16:17:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=21d49b546a27de6c53d8fe7d1a68d5a3b5506c93'/>
<id>21d49b546a27de6c53d8fe7d1a68d5a3b5506c93</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Ensure that no new console windows are opened on Windows when executing Tesseract</title>
<updated>2022-02-21T12:33:48+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-02-21T12:33:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=69ac99da02988ad2ed675570ccaa5ff7777f0279'/>
<id>69ac99da02988ad2ed675570ccaa5ff7777f0279</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>pipeline: Fail if no images are present</title>
<updated>2022-01-31T17:09:16+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-01-31T17:09:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=4fae7d93e0c07d5f6fc4c33db389c60d95276d01'/>
<id>4fae7d93e0c07d5f6fc4c33db389c60d95276d01</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Make pipeline context-aware, so the rescribe tool can cancel jobs</title>
<updated>2022-01-31T14:11:21+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-01-31T14:11:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=550752fa2ab493fb6d10aa9d963fc45996c0d100'/>
<id>550752fa2ab493fb6d10aa9d963fc45996c0d100</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>internal/pipeline: if a graph cannot be created, don't leave an empty graph.png file, and allow failure to download that as it won't be created in the case of a 1 page book, which is fine</title>
<updated>2022-01-17T13:10:33+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-01-17T13:10:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=cbd4adcdf3b27fc157df526fcdcbac8d6b74bb81'/>
<id>cbd4adcdf3b27fc157df526fcdcbac8d6b74bb81</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>internal/pipeline: Have DownloadPdfs() try to download all PDFs, but only return an error if none downloaded, as there are times when the colour PDF will not exist, which is fine</title>
<updated>2022-01-10T16:15:52+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2022-01-10T16:15:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/bookpipeline/commit/?id=436551c5e6f9d96f82fdf31e01e422b2a937b6ee'/>
<id>436551c5e6f9d96f82fdf31e01e422b2a937b6ee</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
