Share My Creation At The Threshold

this was a fun project, of interest,
perhaps, to ocr mavens. the problem
it addresses (background noise) doesn't
occur as much with barcode scanning.

background noise can drown out the
text you are trying to extract. attempting
to separate the text (so-called foreground)
from the noise (background) is achieved
with varying degrees of success using
a process called thresholding.
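
as a minimal sketch of the idea (not the code used in the app), the simplest form of thresholding picks one cutoff and turns every grayscale pixel into either black or white. the image here is a plain list of rows of 0-255 values, and the name `threshold` and the cutoff of 128 are assumptions made for illustration.

```python
# minimal sketch: fixed global threshold on a grayscale image.
# the image is a list of rows of 0-255 values (an assumption for
# illustration; a real app would read the pixels from a bitmap).

def threshold(gray, cutoff=128):
    """Return a binary image: 0 (black) below the cutoff, 255 (white) otherwise."""
    return [[0 if px < cutoff else 255 for px in row] for row in gray]

noisy = [
    [200, 190, 60, 210],
    [180, 50, 40, 220],
    [90, 205, 70, 195],
]

binary = threshold(noisy, cutoff=128)
for row in binary:
    print(row)
```

a single fixed cutoff is only the starting point; the methods differ mainly in how they choose the cutoff, and in whether they choose one for the whole image or one per neighborhood.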

there are a number of such methods.
i've tried to put a number of them under
one roof.

if text extraction fails, the user could tap
a button to run the various thresholding
methods and choose the best one (if any)
to use. in some cases, in addition to
performing thresholding, the resultant
image might need to be inverted. ocr
expects the text to be black. a button
could be tapped to perform that operation
as well.
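
the inversion step can be sketched as follows. this is a hedged illustration, not the app's code: the helper names `invert` and `needs_inversion` are hypothetical, and the "mostly black means the text is probably white" rule is only a rough assumption; in the app the user decides by tapping the button.

```python
# minimal sketch: invert a binarized image so the text ends up black.
# pixels are 0 (black) or 255 (white), stored as a list of rows.

def invert(binary):
    """Swap black and white."""
    return [[255 - px for px in row] for row in binary]

def needs_inversion(binary):
    """Rough guess (an assumption): text covers less area than background,
    so a mostly black image probably has white text on a black background."""
    pixels = [px for row in binary for px in row]
    black = sum(1 for px in pixels if px == 0)
    return black > len(pixels) / 2

white_on_black = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

fixed = invert(white_on_black) if needs_inversion(white_on_black) else white_on_black
for row in fixed:
    print(row)
```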

attached please find an example of a noisy image
and its thresholded counterparts.
 

Attachments

  • threshold.png
  • flip.jpg

JohnC

It looks like the screen pixels are making the text harder to read: even where a letter is white, it is speckled with blue dots.

I wonder whether running the image through a despeckle process first would improve the thresholding, and with it the OCR success rate.
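
One common way to despeckle is a small median filter applied to the grayscale image before thresholding. The sketch below is only an illustration of that idea under stated assumptions: the function name `despeckle` and the 3x3 window are my choices, and whether it actually helps on the attached image is untested.

```python
# Minimal sketch: 3x3 median filter on a grayscale image (list of rows, 0-255).
# At the borders the window simply shrinks to the pixels that exist.

def despeckle(gray):
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                gray[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
            # Take the middle value; an isolated speck never reaches the middle.
            row.append(window[len(window) // 2])
        out.append(row)
    return out

# A bright patch (part of a white letter) with one dark speck in the middle.
speckled = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]

cleaned = despeckle(speckled)
for row in cleaned:
    print(row)
```

A median filter removes isolated dots but can also erode thin strokes, so on small text a larger window may do more harm than good.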
 

Johan Schoeman

I messed around with Otsu thresholding a few years ago, and I am not sure whether it would yield a more OCR-able result here. It first converts the image to grayscale and then calculates the threshold value used to binarize it.
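
For reference, the calculation can be sketched roughly as below. This is a plain textbook-style version of Otsu's method (pick the cutoff that maximizes the between-class variance of the histogram), not the code from my app; the function name `otsu_threshold` and the tiny sample image are assumptions for illustration.

```python
# Minimal sketch of Otsu's method on a grayscale image (list of rows, 0-255).
# Pixels <= t form one class, pixels > t the other; the chosen t maximizes
# the between-class variance.

def otsu_threshold(gray):
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1

    total = sum(hist)
    sum_all = sum(i * hist[i] for i in range(256))

    best_t, best_var = 0, -1.0
    w_low = 0      # number of pixels with value <= t
    sum_low = 0    # sum of those pixel values
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue
        w_high = total - w_low
        if w_high == 0:
            break
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

gray = [
    [40, 200, 210],
    [50, 190, 60],
]

t = otsu_threshold(gray)
binary = [[0 if px <= t else 255 for px in row] for row in gray]
print(t, binary)
```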

 

drgottjr

of course, i've seen and tried the otsu thresholding app (thanks, as usual, for a fine job), but it wouldn't run with the image
i had at hand. it complained about the particular format of the image (the same image which i had no problems with otherwise
and which i use in my example, just a .jpg taken by my device). so rather than wrestle with it, i put it aside for the moment.

in any case, otsu is one of many thresholding methods. they all share the same behavior that you describe: convert the base
image to grayscale and then work out the thresholding point. there are a lot of people who have spent a lot of time working
on thresholding. if you read their scholarly papers and look at their examples, otsu is not always the best. so i present
a representative selection for the user's convenience.

i spent a lot of time with leptonica in my tesseract days. i had some amazing results with sauvola thresholding. even leptonica's
base thresholding and tesseract's own minimal thresholding and zxing's hybrid binarizer all produce good results on the types of
images that we usually try to extract text from. i could have added them (at great additional app size). but, frankly, any ocr app
already includes one of them, invoked automatically before attempting to extract the text.
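
for anyone curious about why sauvola does well on unevenly lit images, here is a minimal sketch. it is not leptonica's implementation: it is the commonly cited formula t = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the pixels around each point. the function name `sauvola`, the tiny window, and the values k = 0.2 and R = 128 are assumptions chosen for illustration.

```python
# minimal sketch of sauvola-style local thresholding on a grayscale image
# (list of rows, 0-255). each pixel gets its own cutoff, computed from the
# mean and standard deviation of its neighborhood.
from math import sqrt

def sauvola(gray, radius=1, k=0.2, r=128.0):
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            win = [
                gray[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(win) / len(win)
            std = sqrt(sum((v - mean) ** 2 for v in win) / len(win))
            cutoff = mean * (1 + k * (std / r - 1))
            row.append(255 if gray[y][x] > cutoff else 0)
        out.append(row)
    return out

# one row fading from bright to shadowed; the "text" pixels are 150 and 30.
faded = [[230, 150, 230, 210, 190, 170, 150, 130, 110, 30, 110]]

local_result = sauvola(faded)
global_result = [[255 if px >= 128 else 0 for px in row] for row in faded]
print(local_result[0])
print(global_result[0])
```

in this toy row the fixed cutoff of 128 misses the text pixel in the bright part and turns the shadowed background black, while the local cutoff marks only the two text pixels. real images need a much larger window than the one used here.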

the purpose of my example was to produce a kind of "point and shoot" way of examining thresholding from different points of
view. if text extraction didn't work initially, the user could ask for a set of variously binarized images, and she could
choose one (if one such image was appropriate) without being a thresholding expert. maybe it's the otsu version. some images
produce no completely satisfactory result. some images need to be binarized by breaking up the image and dealing with each part
separately by hand, in stages. in such cases - eg, digitization of older, stained, skewed manuscripts - opencv and a lot of custom
thresholding techniques are required. in addition, other operations, such as edge finding, despeckling, deskewing, denoising,
rotating, inverting, etc may be required. leptonica, for example, has hundreds of such methods. having them all run without guidance
is beyond the scope of the example. and who, running an ocr app on his phone, is going to know exactly which arbitrary values are to
be assigned to the many, many variables required to binarize all images optimally? if these are the types of images one has to deal with
on a daily basis, then something other than a representative sampling is needed. i get that. i thought there might be some value to
my example; i can't be right all the time.
 