<feed xmlns='http://www.w3.org/2005/Atom'>
<title>utils/pkg/hocr, branch v0.1.4</title>
<subtitle>Packages and tools for image preprocessing</subtitle>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/'/>
<entry>
<title>gofmt</title>
<updated>2021-07-23T15:17:40+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-07-23T15:17:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=b571f780dcc4baafa659bebbc316204a46e42f4f'/>
<id>b571f780dcc4baafa659bebbc316204a46e42f4f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>hocr: Add ability to specify a custom image path for hocr line extraction, and use it in extracthocrlines</title>
<updated>2021-03-23T11:14:35+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-03-23T11:14:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=fa4f48ad54ec94c222269d335d40b21becff92a4'/>
<id>fa4f48ad54ec94c222269d335d40b21becff92a4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>hocr: Use extracted page name for line naming</title>
<updated>2021-02-09T17:45:35+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-02-09T17:45:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=45943f847b3db8db5142c79a806f251659264ca0'/>
<id>45943f847b3db8db5142c79a806f251659264ca0</id>
<content type='text'>
This means that even in multipage hocrs where lines on different
pages share the same id (like line_1_1), the page name will differ,
so extracthocrlines no longer gives different lines the same name
and overwrites them.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This means that even in multipage hocrs where lines on different
pages share the same id (like line_1_1), the page name will differ,
so extracthocrlines no longer gives different lines the same name
and overwrites them.
</pre>
</div>
</content>
</entry>
<entry>
<title>hocr: Use image specified in ocr_page title, so can support multipage hocrs cleanly</title>
<updated>2021-02-09T17:34:45+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2021-02-09T16:58:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=13ce8fc4b45073e1f81a39c4923e44420509be73'/>
<id>13ce8fc4b45073e1f81a39c4923e44420509be73</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add godoc documentation</title>
<updated>2020-04-14T10:56:56+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-04-14T10:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=848f961c277525ebba3ab08fadc116970bcfed24'/>
<id>848f961c277525ebba3ab08fadc116970bcfed24</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add copyright statements to each file</title>
<updated>2020-03-13T16:51:31+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-03-13T16:51:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=97f27954532d85b0fe39ce639337c5d0b59433af'/>
<id>97f27954532d85b0fe39ce639337c5d0b59433af</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add GetWordConfs function to hocr pkg</title>
<updated>2020-01-22T16:17:05+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2020-01-22T16:17:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=632a149df0196f0f057fa4d552aa28d22901bcda'/>
<id>632a149df0196f0f057fa4d552aa28d22901bcda</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Export a couple of more generally useful functions</title>
<updated>2019-10-31T10:40:28+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2019-10-31T10:40:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=a46b217fd34f1bb3ee9f675fcb24cbc9a8cc2847'/>
<id>a46b217fd34f1bb3ee9f675fcb24cbc9a8cc2847</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Simplify and document hocr package slightly better</title>
<updated>2019-10-30T13:01:17+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2019-10-30T13:01:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=9060bc1bf6c90d21f67de77d3f95c0de84f41d68'/>
<id>9060bc1bf6c90d21f67de77d3f95c0de84f41d68</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Make bucket-lines and related packages more robust</title>
<updated>2019-10-23T09:29:29+00:00</updated>
<author>
<name>Nick White</name>
<email>git@njw.name</email>
</author>
<published>2019-10-23T09:29:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.rescribe.xyz/cgit/cgit.cgi/utils/commit/?id=8b5c87c6d4e9ed8e220a4f8732c0cd48e92c7a09'/>
<id>8b5c87c6d4e9ed8e220a4f8732c0cd48e92c7a09</id>
<content type='text'>
bucket-lines would crash for any line that didn't have a corresponding image.

Lines which weren't grayscale would also cause crashes; now they are just
converted to grayscale if necessary.

As a bonus, lines in jpeg can also be decoded successfully.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bucket-lines would crash for any line that didn't have a corresponding image.

Lines which weren't grayscale would also cause crashes; now they are just
converted to grayscale if necessary.

As a bonus, lines in jpeg can also be decoded successfully.
</pre>
</div>
</content>
</entry>
</feed>
