---
title: "Adaptive Binarisation"
date: 2019-12-17
draft: true
categories: [binarisation, preprocessing, image manipulation]
---
The [previous post](/posts/binarisation-introduction) covered the
basics of binarisation, and introduced the Otsu algorithm, a good
method for finding a global threshold number for a page. But there
are inevitable limitations with using a global threshold for
binarisation. Better would be to use a threshold that is adapted
over different regions of the page, so that as the conditions of the
page change so can the threshold. This technique is called adaptive
binarisation.

For each pixel of an image, adaptive binarisation considers the
pixels around it to determine a good threshold. This means that even
in an area which is heavily shaded, for example near the spine of a
book, the text will be correctly differentiated from the background.
Both the text and the background there may be darker than the text
on the rest of the page, but that doesn't matter, as it is the
darkness of a pixel relative to its surroundings that counts.
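
To make the difference concrete, here is a rough sketch in Python
with NumPy. It is not the Sauvola algorithm, just the simplest local
rule I could think of: a pixel counts as text if it is darker than
the mean of its neighbourhood by some margin. The function names,
the `window` and `offset` values and the synthetic "page" are all
made up for illustration.

```python
import numpy as np

def global_threshold(img, t):
    """Mark pixels darker than one page-wide threshold as text (True)."""
    return img < t

def local_mean_threshold(img, window=15, offset=10):
    """Mark pixels darker than their neighbourhood mean minus an offset.

    `window` must be odd. Both defaults are illustrative guesses.
    """
    pad = window // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    # Integral image: every window sum then costs four lookups.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(0).cumsum(1)
    h, w = img.shape
    sums = (integral[window:window + h, window:window + w]
            - integral[:h, window:window + w]
            - integral[window:window + h, :w]
            + integral[:h, :w])
    mean = sums / (window * window)
    return img < mean - offset

# Synthetic page: the background fades from 200 (left) to 80 (right,
# "near the spine"), and the text is one-pixel vertical strokes that
# are 60 levels darker than the background around them.
h, w = 32, 64
background = np.tile(np.linspace(200, 80, w), (h, 1))
truth = np.zeros((h, w), dtype=bool)
truth[:, 4::8] = True
page = np.where(truth, background - 60, background).round().astype(np.uint8)

glob = global_threshold(page, 110)
local = local_mean_threshold(page)

print("global threshold errors:", int((glob != truth).sum()))
print("local threshold errors: ", int((local != truth).sum()))
```

On this fake page no single global number can work, as the text on
the light side is brighter than the background on the dark side,
whereas the local rule only ever compares a pixel to its neighbours.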

<!--
(diagram showing 2 different areas of a page, one light and one dark,
comparing global and local thresholding [can be fake, as the global
threshold diagram was])
(actually can probably just have a dark area of a page, comparing global
and local thresholding, setting the global one such that the image is
screwed up)
-->

A popular algorithm for this adaptive binarisation technique was
described in a 2000 paper by J. Sauvola and M. Pietikäinen,
[Adaptive document image binarization](http://www.ee.oulu.fi/mvg/files/pdf/pdf_24.pdf),
and is now generally just referred to as the "Sauvola algorithm".

Only some of what Sauvola actually outlined in the paper is generally
used today, and that part is a modification of Niblack's method
(1986). Sauvola's paper suggests several different binarisation
methods, with an algorithm to switch between them, but these days it
is just the main "binarization of textual components" part (section
3.3) that the paper is remembered for, and that part is what is
generally referred to simply as "Sauvola".

There are two parameters: `k`, the so-called "threshold value", and
the window size.
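
To show where those two parameters come in, the threshold from
section 3.3 of the paper is `T = m * (1 + k * (s / R - 1))`, where
`m` and `s` are the mean and standard deviation of the pixels in the
window around the current pixel, and `R` is the dynamic range of the
standard deviation. Below is a minimal sketch of that formula in
Python with NumPy. The function names are my own, the window size of
15 is an arbitrary choice, and `k = 0.5` with `R = 128` are the
values the paper suggests for 8-bit images, so treat it as an
illustration rather than a reference implementation.

```python
import numpy as np

def _window_mean(a, window):
    """Mean of each (window x window) neighbourhood, edges replicated."""
    pad = window // 2
    padded = np.pad(a, pad, mode="edge")
    # Integral image: every window sum then costs four lookups.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(0).cumsum(1)
    h, w = a.shape
    total = (integral[window:window + h, window:window + w]
             - integral[:h, window:window + w]
             - integral[window:window + h, :w]
             + integral[:h, :w])
    return total / (window * window)

def sauvola_threshold(img, window=15, k=0.5, r=128.0):
    """Per-pixel threshold T = m * (1 + k * (s / R - 1)); window must be odd."""
    img = img.astype(np.float64)
    mean = _window_mean(img, window)
    sq_mean = _window_mean(img * img, window)
    # Clamp tiny negative variances caused by floating point rounding.
    std = np.sqrt(np.maximum(sq_mean - mean * mean, 0.0))
    return mean * (1.0 + k * (std / r - 1.0))

def sauvola_binarise(img, window=15, k=0.5, r=128.0):
    """Return a boolean mask, True where a pixel is classed as text."""
    return img <= sauvola_threshold(img, window, k, r)

# Tiny made-up example: a light page with one dark vertical stroke.
demo = np.full((32, 32), 200, dtype=np.uint8)
demo[10:22, 16] = 40
demo_truth = demo == 40
demo_text = sauvola_binarise(demo)

# On a completely flat area s is 0, so T is just m * (1 - k).
flat_t = sauvola_threshold(np.full((20, 20), 200, dtype=np.uint8))
```

A larger `k` pulls the threshold further below the local mean, so
fewer pixels are classed as text, and the window size sets how much
of the surrounding page counts as "local".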