Android Question Tesseract Revisited

drgottjr · Aug 2, 2017

this link:

https://www.b4x.com/android/forum/threads/ocr-offline-tesseract.57727/#content

starts an interesting thread about using tesseract offline with b4a. what you end up with - or at least what i ended up with - is not nearly as polished or as good as textfairy, but there is hope.

that thread hasn't been updated, but there are new links which might be of use to anyone who got tesseract going, but whose results were less than satisfactory.

i refer to the so-called javacpp-presets and the tessdata. i think the updates make a difference. some of my tests were pretty gratifying.

here is the new link for the javacpp-presets (the latest version is 1.3, but its format is different from what is expected; 1.2 is the latest in .zip format):
https://repo1.maven.org/maven2/org/bytedeco/javacpp-presets/1.2/javacpp-presets-1.2-bin.zip

new link for tessdata (pick your languages. regrettably, the files are even bigger than they were):
https://github.com/tesseract-ocr/tessdata

i just overwrote the old files with the new ones. the sample app in the original thread is pretty much all you need.

Eme Fibonacci · Aug 3, 2017

what does that mean?

drgottjr · Aug 5, 2017

it just means that you might get better results if you download the updated files and plug them into your project. they aren't part of the b4a libraries, so if you weren't following the thread closely, you might not have been aware they were out there. for me, ocr is something in the back of my mind that i revisit from time to time. this time things clicked. there are even some tools available for pre-processing the images which weren't available early on. they seem to fit nicely into the jigsaw puzzle that is tesseract.

JordiCP · Aug 5, 2017

Thanks for the input. I also followed that post and used it to test some code.
About the preprocessing tools, are you talking about something that is in the same tesseract page-project or somewhere else?

drgottjr · Aug 5, 2017

the pre-processing tools (about which, i believe you know a bit) are at the bottom of page 2 of the original thread. from my limited knowledge, tesseract seems to use, among others, opencv for thresholding, eroding, etc. user joilts provided some valuable inline java code which is easily modified to suit and seems to fit in well with the original test example. the tools were suggested as help to another user who was having a problem recognizing automobile license plates, but it appears that every image will probably benefit from at least some pre-processing. i think it's your opencv library that's used in this case.
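For anyone who hasn't seen what thresholding boils down to, here is a minimal sketch in plain Java. It is not the OpenCV-based inline code from the original thread, just a stand-in to show the idea. It assumes the image is already available as an array of ARGB ints (the layout Android's Bitmap.getPixels() fills in). The class name, the method names and the fixed cutoff of 128 are hypothetical choices for illustration; a real image will likely need a tuned or adaptive cutoff.

```java
public class SimpleThreshold {
    // Approximate luminance (0..255) of one ARGB pixel, using integer Rec. 601 weights.
    static int luminance(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return (299 * r + 587 * g + 114 * b) / 1000;
    }

    // Returns a new array in which every pixel is either opaque black or opaque white.
    // Pixels darker than the cutoff become black, everything else becomes white.
    static int[] binarize(int[] pixels, int cutoff) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = luminance(pixels[i]) < cutoff ? 0xFF000000 : 0xFFFFFFFF;
        }
        return out;
    }

    public static void main(String[] args) {
        // Three sample pixels: dark grey, light grey, mid grey.
        int[] sample = {0xFF202020, 0xFFE0E0E0, 0xFF808080};
        for (int p : binarize(sample, 128)) {
            System.out.println(Integer.toHexString(p));
        }
    }
}
```

On Android the input array would come from Bitmap.getPixels() and the result would go back with Bitmap.setPixels() before the bitmap is handed to tesseract. Eroding and dilating follow the same pattern, but look at a pixel's neighbours instead of the pixel alone.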

regarding the updated trained data that i linked to, i think it may be the wrong version. that is to say, you will see trained data available there, but it is for tesseract 4, not 3.0X. the updated javacpp (v1.2) code may be based on tesseract 3.0X. i have to check that. in any case, the tesseract engine initializes successfully (with the new data), so it may not be an issue. plus, in my case, i turn off a number of tesseract options that use the dictionary. that may or may not be helpful to someone who expects to use a dictionary.

JordiCP · Aug 5, 2017

Indeed, I suppose Tesseract uses image processing techniques internally (perhaps OpenCV or ad hoc routines). I also agree that it is better to do some pre-processing beforehand, especially if the "zone of interest" has to be first detected and then "cleaned".

Regarding license plate recognition, it seems quite "easy" to translate a working example, but very difficult to get a "fine tuned" solution, i.e., something that can be deployed commercially.
I will need something similar soon, and I am wondering which of these options (alone or combined) is best: OpenCV, Tesseract or OpenALPR, since they seem to complement each other but also overlap... and some previous knowledge of them is needed (*)
(*) at least one must know them well enough to guess which options need to be (de)activated in order to tune it for a specific case... For instance, what you've said about turning off the dictionary in Tesseract is a good point for license plates.

MindTorque · Aug 7, 2017

I must say I'm curious as to how one switches Tesseract options. I don't see an obvious way of doing so with the files provided, given that I don't see a way of providing a command line or of supplying a config file. I don't imagine you recompiled & reconverted the original C source. On the plus side I got it going without run time errors, although so far the OCR results from "normal" (not preprocessed) camera shots have been unimpressive.

drgottjr · Aug 12, 2017

sorry for the cone of silence. have been struggling with issues relating to running the example under android 7. otherwise, the results you're getting have nothing to do with the example. tesseract is a stern and powerful mistress - if that's what does it for you.

options can be changed at runtime. you just need to add the relevant calls to the inline java code in the example (eg, api.SetVariable() to change things). you can also use api calls to query the state of options you've changed. whether or not you fully understand what the options do is another matter. you can also add opencv functionality to the example the same way. (tesseract itself uses leptonica, whose functions could be added similarly.) you may not be able to cobble together a fullblown ocr app this way, but you could spring for a few euros and get one of the libraries or wrappers that seem to be available here.

if you just want to dip your toe in the waters, the example works quite well. to learn the api calls, you need
https://zdenop.github.io/tesseract-doc/group___advanced_a_p_i.html
it's in c++ as i recall, but if you look at the java in the example, converting the c++ calls is trivial. i can only tell you they work. you can convert, thresh, erode, dilate and get the text - all from the example and a couple of additional functions that easily fit in. follow the original thread and research "changing tesseract" (and variations of that) online. obviously a lot depends on what you're looking to achieve.

drgottjr · Aug 12, 2017

here: https://github.com/tesseract-ocr/tesseract/wiki/ControlParams
live example (to insert in inline java after initializing tesseract): api.SetVariable("tessedit_enable_doc_dict", "0");
hope this helps.
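If several options need changing, it can help to keep them in one place and check which ones tesseract refused. The following is only a sketch under assumptions: the TessOptions class, the Setter interface and the plateOptions() helper are hypothetical names, and the setter is a stand-in so the code runs without the tesseract binaries. In the inline java you would pass a Setter that simply forwards to api.SetVariable(name, value), which, as far as i know, returns false when the name is unknown. tessedit_char_whitelist is a standard tesseract parameter, but whether it suits your images is something to test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TessOptions {
    // Stand-in for the real call; an implementation would forward to api.SetVariable(name, value).
    interface Setter {
        boolean set(String name, String value);
    }

    // Example settings for license-plate style text (an assumption, tune for your case):
    // no document dictionary, and only capital letters and digits allowed.
    static Map<String, String> plateOptions() {
        Map<String, String> opts = new LinkedHashMap<String, String>();
        opts.put("tessedit_enable_doc_dict", "0");
        opts.put("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
        return opts;
    }

    // Applies every option in insertion order and returns the names the setter rejected.
    static List<String> apply(Map<String, String> opts, Setter setter) {
        List<String> rejected = new ArrayList<String>();
        for (Map.Entry<String, String> e : opts.entrySet()) {
            if (!setter.set(e.getKey(), e.getValue())) {
                rejected.add(e.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Fake setter that just logs each call and accepts everything.
        List<String> rejected = apply(plateOptions(), new Setter() {
            public boolean set(String name, String value) {
                System.out.println("SetVariable(" + name + ", " + value + ")");
                return true;
            }
        });
        System.out.println("rejected: " + rejected);
    }
}
```

As in the live example above, call this after tesseract has been initialized. Anything that lands in the rejected list is worth looking up on the ControlParams page before assuming it took effect.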

MindTorque · Aug 13, 2017

It does indeed. Thank you much.

My apologies for gapping it heh...

drgottjr · Aug 13, 2017

just to keep your hopes up, i took a picture of a fair amount of some pretty small type in an instruction manual and fed it to the example (with no image preprocessing). every single word was correctly converted to text. i literally had to make sure i wasn't looking at the photograph instead of the derived text. tesseract can be slow, and it definitely needs to see things the way it needs to see them (clear, black text on white background. or vice versa. you'll read that the vice versa isn't true, but it isn't not true.)

on the other hand, and unless what you're trying to do is write an ocr app, you might want to consider using ocr apps that are out there and feed their output to your app. once you see what's really involved in just getting something usable to tesseract, you may rethink. in my case, if my users don't aim the camera correctly, the app shoots 600 volts into their index finger as a warning.

fixit30 · Aug 13, 2017

drgottjr said:
if my users don't aim the camera correctly, the app shoots 600 volts into their index finger as a warning

I need the same feature for my app when users disable location services.

Can you share your B4A code?

drgottjr · Aug 13, 2017

just think of what we could do with a network of thousands of b4a-bots so endowed. sigh.
