Android Example OCR OFFLINE - Tesseract

Discussion in 'Tutorials & Examples' started by joilts, Aug 28, 2015.

  1. joilts

    joilts Member Licensed User

    I'm working on a project that needs offline OCR. I made some progress and decided to share it here to get some feedback.
    I searched this forum and Google a bit and found options for online OCR (NJDUDE's lib or Erel's example). In the same project I also need to manipulate some images, and I found DrewG's example of using inline code with JavaCV/OpenCV. That led me to test Tesseract OCR the same way (inline code).
    I downloaded the lib from this link: https://repo1.maven.org/maven2/org/bytedeco/javacpp-presets/1.0/javacpp-presets-1.0-bin.zip
    More details about this can be found here.

    I unzipped it and copied the files I needed to my Additional Libraries folder.
    The files I used: javacpp.jar, tesseract-android-arm.jar, leptonica-android-arm.jar, tesseract.jar, leptonica.jar

    I coded the basic example from the bytedeco page and changed it to take the path of an image saved somewhere on the phone and return the recognized text.

    Note: I needed to download the tessdata files to my phone. I tried to add them to my app, but they were too big and I got an error deploying the app to the phone (I need to look at this more carefully). The files for many languages can be found here or at the Google project page. I downloaded this one for my test example.

    Here is the code I used. My test phone is an S4.

    Hope it helps.
     

    Attached Files:

    Last edited: Aug 28, 2015
    Devan, JordiCP, tuhatinhvn and 7 others like this.
  2. hibrid0

    hibrid0 Active Member Licensed User

    Thanks for sharing
     
  3. Urishev

    Urishev Member Licensed User

    Hello! I'm a doctor and a newbie programmer. I want to create an application
    that scans and recognizes a standard blood test and produces a computed conclusion.
    The problem is the text recognition.
    Where do I start?
     
  4. lemonisdead

    lemonisdead Well-Known Member Licensed User

    Hello,
    What do you mean by that, please? Have you tried the examples provided in the first message of this thread?
     
  5. Urishev

    Urishev Member Licensed User

    Thanks for the reply.
    I downloaded "javacpp-presets-1.0-bin", but failed to install the library for the application "testTest".
    How do I install the library?
    Log:"java.io.FileNotFoundException: /tesseract-ocr-3.02.eng.tar.gz (Read-only file system)"
    Where to insert "tesseract-ocr-3.02.eng.tar.gz"?
     
    Last edited: Jan 24, 2016
  6. lemonisdead

    lemonisdead Well-Known Member Licensed User

    As I understand the first post, you should unzip the downloaded file and copy the .jar files to your Additional Libraries folder.
    The required .jar files are linked to the project in lines 69 to 73:
    Code:
    #AdditionalJar: javacpp
    #AdditionalJar: tesseract-android-arm
    #AdditionalJar: leptonica-android-arm
    #AdditionalJar: tesseract
    #AdditionalJar: leptonica
    Did you do it like that? I will try it this way and report back.

    Edit: it works great, as expected. The only error I got was a crash on install with an Intel CPU. In such cases you have to copy tesseract-android-x86 and leptonica-android-x86 to the Additional Libraries folder too.
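
    As a rough illustration of the ARM/x86 point above, the jar names can be derived from the CPU family. This is only a sketch: the class and method names (NativeJars, jarsFor) are hypothetical, and the jar names are the ones mentioned in this thread, not a verified list for every release of the presets.

```java
import java.util.ArrayList;
import java.util.List;

final class NativeJars {
    // Jars that are needed regardless of the CPU, as listed in this thread.
    private static final String[] COMMON = {"javacpp", "tesseract", "leptonica"};

    /** Returns the #AdditionalJar names for a CPU family ("arm" or "x86"). */
    static List<String> jarsFor(String cpu) {
        if (!cpu.equals("arm") && !cpu.equals("x86")) {
            throw new IllegalArgumentException("unsupported cpu: " + cpu);
        }
        List<String> jars = new ArrayList<>();
        for (String name : COMMON) {
            jars.add(name);
        }
        // The native libraries ship in one jar per CPU family.
        jars.add("tesseract-android-" + cpu);
        jars.add("leptonica-android-" + cpu);
        return jars;
    }
}
```

    Calling NativeJars.jarsFor("x86") yields the three common jars plus tesseract-android-x86 and leptonica-android-x86, which matches the fix described for the Intel device.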
     
    Last edited: Jan 20, 2016
    Johan Schoeman and joilts like this.
  7. Urishev

    Urishev Member Licensed User

    I did as you said. Log:
    ** Activity (main) Create, isFirst = true **
    Here - getText()
    Before Init
    RETCODE =-1
    Could not initialize tesseract.
    ** Activity (main) Pause, UserClosed = false **

    Log:"java.io.FileNotFoundException: /tesseract-ocr-3.02.eng.tar.gz (Read-only file system)"
    Where to insert "tesseract-ocr-3.02.eng.tar.gz"?
     
    Last edited: Jan 24, 2016
  8. DonManfred

    DonManfred Expert Licensed User

    Try to uncheck FILTERED to get the unfiltered log and see if the log outputs more info now
     
  9. joilts

    joilts Member Licensed User

    Hello, sorry for taking so long to answer (I was on vacation).
    I'm going to describe here what I did to install the lib. I'm sorry if it's too "rookie", but that's what I am in B4X.
    1- Downloaded the lib from this link: https://repo1.maven.org/maven2/org/bytedeco/javacpp-presets/1.0/javacpp-presets-1.0-bin.zip
    2- Unzipped the files to a folder. The files I used: javacpp.jar, tesseract-android-arm.jar, leptonica-android-arm.jar, tesseract.jar, leptonica.jar (but you may need the files with the -x86 suffix instead, as lemonisdead said in the post above).
    3- Opened B4A -> Tools -> Configure Paths.
    4- In the Additional Libraries field, entered the folder used to unzip in step 2, for example "C:\Program Files (x86)\Anywhere Software\Basic4android\AdditionalLibs".
    5- Compiled the program.

    That's all I did.

    I do not have the original project here and I'm not able to download the example from the post right now, but I have a modified project (for ANPR) that uses these libs:


    Code:
    #AdditionalJar: opencv
    #AdditionalJar: opencv-android-arm
    #AdditionalJar: javacv
    #AdditionalJar: javacpp
    #AdditionalJar: tesseract-android-arm
    #AdditionalJar: leptonica-android-arm
    #AdditionalJar: tesseract
    #AdditionalJar: leptonica

    OpenCV is used to work with images and I don't think you need it. Anyway, here is a link to all the additional libs I have in my path right now.

    https://drive.google.com/file/d/0B-i5U_B2M-ETaWxWNktYdi1FNmc/view?usp=sharing

    Hope it helps.
     
    roberto64 likes this.
  10. Urishev

    Urishev Member Licensed User

    Thank you for your attention to my problem. What did I do wrong? The log outputs more info:
    onReceive
    widget onReceive ->InfoAlarmWidget.action.widget.news.scroll
    Starting: Intent { act=android.intent.action.MAIN flg=0x30000000 cmp=b4a.example/.main } from pid 2264
    HistoryRecord{40b54838 b4a.example/.main} failed creating starting window
    java.lang.RuntimeException: Binary XML file line #25: You must supply a layout_height attribute.
    at android.content.res.TypedArray.getLayoutDimension(TypedArray.java:491)
    at android.view.ViewGroup$LayoutParams.setBaseAttributes(ViewGroup.java:3599)
    at android.view.ViewGroup$MarginLayoutParams.<init>(ViewGroup.java:3678)
    at android.widget.LinearLayout$LayoutParams.<init>(LinearLayout.java:1400)
    at android.widget.LinearLayout.generateLayoutParams(LinearLayout.java:1326)
    at android.widget.LinearLayout.generateLayoutParams(LinearLayout.java:47)
    at android.view.LayoutInflater.rInflate(LayoutInflater.java:625)
    at android.view.LayoutInflater.inflate(LayoutInflater.java:408)
    at android.view.LayoutInflater.inflate(LayoutInflater.java:320)
    at android.view.LayoutInflater.inflate(LayoutInflater.java:276)
    at com.android.internal.policy.impl.PhoneWindow.generateLayout(PhoneWindow.java:2400)
    at com.android.internal.policy.impl.PhoneWindow.installDecor(PhoneWindow.java:2455)
    at com.android.internal.policy.impl.PhoneWindow.getDecorView(PhoneWindow.java:1621)
    at com.android.internal.policy.impl.PhoneWindowManager.addStartingWindow(PhoneWindowManager.java:1092)
    at com.android.server.WindowManagerService$H.handleMessage(WindowManagerService.java:8182)
    at android.os.Handler.dispatchMessage(Handler.java:99)
    at android.os.Looper.loop(Looper.java:130)
    at com.android.server.WindowManagerService$WMThread.run(WindowManagerService.java:576)
    ** Activity (main) Pause, UserClosed = false **
    Start proc b4a.example for activity b4a.example/.main: pid=2796 uid=10097 gids={1015, 3003}
    setHidden false
    Could not find method android.view.ViewGroup.addChildrenForAccessibility, referenced from method anywheresoftware.b4a.BALayout.addChildrenForAccessibility
    VFY: unable to resolve virtual method 319: Landroid/view/ViewGroup;.addChildrenForAccessibility (Ljava/util/ArrayList;)V
    VFY: replacing opcode 0x6f at 0x0009
    VFY: dead code 0x000c-000c in Lanywheresoftware/b4a/BALayout;.addChildrenForAccessibility (Ljava/util/ArrayList;)V
    setHidden false
    Could not find method android.view.View.setPivotX, referenced from method anywheresoftware.b4a.objects.ViewWrapper.AnimateFrom
    VFY: unable to resolve virtual method 313: Landroid/view/View;.setPivotX (F)V
    VFY: replacing opcode 0x6e at 0x0025
    VFY: dead code 0x0028-018e in Lanywheresoftware/b4a/objects/ViewWrapper;.AnimateFrom (Landroid/view/View;IIIII)V
    VFY: dead code 0x0190-0192 in Lanywheresoftware/b4a/objects/ViewWrapper;.AnimateFrom (Landroid/view/View;IIIII)V
    Could not find method android.animation.ValueAnimator.ofFloat, referenced from method anywheresoftware.b4a.objects.ViewWrapper.SetColorAnimated
    VFY: unable to resolve static method 12: Landroid/animation/ValueAnimator;.ofFloat ([F)Landroid/animation/ValueAnimator;
    VFY: replacing opcode 0x71 at 0x004c
    VFY: dead code 0x004f-0089 in Lanywheresoftware/b4a/objects/ViewWrapper;.SetColorAnimated (III)V
    Could not find method android.animation.ObjectAnimator.ofFloat, referenced from method anywheresoftware.b4a.objects.ViewWrapper.SetVisibleAnimated
    VFY: unable to resolve static method 7: Landroid/animation/ObjectAnimator;.ofFloat (Ljava/lang/Object;Ljava/lang/String;[F)Landroid/animation/ObjectAnimator;
    VFY: replacing opcode 0x71 at 0x0029
    Could not find method android.animation.ObjectAnimator.ofFloat, referenced from method anywheresoftware.b4a.objects.ViewWrapper.SetVisibleAnimated
    VFY: unable to resolve static method 7: Landroid/animation/ObjectAnimator;.ofFloat (Ljava/lang/Object;Ljava/lang/String;[F)Landroid/animation/ObjectAnimator;
    VFY: replacing opcode 0x71 at 0x0059
    VFY: dead code 0x002c-004e in Lanywheresoftware/b4a/objects/ViewWrapper;.SetVisibleAnimated (IZ)V
    VFY: dead code 0x005c-005e in Lanywheresoftware/b4a/objects/ViewWrapper;.SetVisibleAnimated (IZ)V
    GC_EXTERNAL_ALLOC freed 82K, 47% free 2963K/5575K, external 2462K/2652K, paused 20ms
    setHidden false
    Displayed b4a.example/.main: +449ms
    setHidden false
    ** Activity (main) Create, isFirst = true **
    setHidden false
    Here - getText()
    setHidden false
    No JNI_OnLoad found in /system/lib/libc.so 0x40513ed0, skipping init
    No JNI_OnLoad found in /system/lib/libm.so 0x40513ed0, skipping init
    No JNI_OnLoad found in /system/lib/libz.so 0x40513ed0, skipping init
    setHidden false
    No JNI_OnLoad found in /system/lib/libdl.so 0x40513ed0, skipping init
    No JNI_OnLoad found in /system/lib/liblog.so 0x40513ed0, skipping init
    Trying to load lib /data/data/b4a.example/lib/liblept.so 0x40513ed0
    Added shared lib /data/data/b4a.example/lib/liblept.so 0x40513ed0
    No JNI_OnLoad found in /data/data/b4a.example/lib/liblept.so 0x40513ed0, skipping init
    Trying to load lib /data/data/b4a.example/lib/libjnilept.so 0x40513ed0
    setHidden false
    Added shared lib /data/data/b4a.example/lib/libjnilept.so 0x40513ed0
    setHidden false
    Trying to load lib /data/data/b4a.example/lib/libtesseract.so 0x40513ed0
    setHidden false
    Added shared lib /data/data/b4a.example/lib/libtesseract.so 0x40513ed0
    No JNI_OnLoad found in /data/data/b4a.example/lib/libtesseract.so 0x40513ed0, skipping init
    setHidden false
    Trying to load lib /data/data/b4a.example/lib/libjnitesseract.so 0x40513ed0
    Added shared lib /data/data/b4a.example/lib/libjnitesseract.so 0x40513ed0
    setHidden false
    Before Init
    setHidden false
    RETCODE =-1
    Could not initialize tesseract.
    setHidden false
    ...
     
  11. joilts

    joilts Member Licensed User

    The RETCODE =-1 seems to be caused by the missing data file that holds the Tesseract trained data. The files must be copied to your phone manually, as noted in the first post: "I needed to download the tessdata files to my phone. I tried to add them to my app, but they were too big and I got an error deploying the app to the phone (I need to look at this more carefully). The files for many languages can be found here or at the Google project page. I downloaded this one for my test example."
    The app code must be changed to use these files from wherever you decide to put them.
    Unfortunately I'm away from my programming laptop and I can't add the code that copies the file from the app folder to the destination folder.

    UPDATE: Here is the code I used in another project to copy the trained data from my app's Files folder to DirRootExternal. The file must be unzipped first.

    Code:
    'Create the directory for the trained data if it does not exist yet
    tessDataPath = File.DirRootExternal & "/tessdata"
    If File.IsDirectory(File.DirRootExternal, "tessdata") = False Then
        File.MakeDir(File.DirRootExternal, "tessdata")
    End If

    'Copy the trained data files from the app assets to the tessdata directory
    Dim fList As List = File.ListFiles(File.DirAssets)
    Dim fileName As String
    For i = 0 To fList.Size - 1
        fileName = fList.Get(i)
        'Compare in lower case so that the lower-case pattern can match
        If fileName.ToLowerCase.Contains("pla.traineddata") Then
            'Skip files that were already copied on a previous run
            If File.Exists(tessDataPath, fileName) = False Then
                File.Copy(File.DirAssets, fileName, tessDataPath, fileName)
            End If
        End If
    Next
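
    Since a missing data file only shows up as RETCODE =-1, it may help to check for the file before calling Init. The following is a minimal sketch in plain Java; the class and method names (TessDataCheck, hasTrainedData) are hypothetical and are not part of Tesseract or B4A. It takes the tessdata directory itself, because whether Init expects that directory or its parent is not confirmed in this thread.

```java
import java.io.File;

final class TessDataCheck {
    /**
     * Returns true when <tessdataDir>/<lang>.traineddata exists and is not empty.
     * This only checks that the file is present; it cannot tell whether its content is valid.
     */
    static boolean hasTrainedData(File tessdataDir, String lang) {
        File data = new File(tessdataDir, lang + ".traineddata");
        return data.isFile() && data.length() > 0;
    }
}
```

    A check such as TessDataCheck.hasTrainedData(new File(tessDataPath), "eng") before Init lets the app log a clearer message than the bare return code.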
     
    Last edited: Feb 7, 2016
  12. Urishev

    Urishev Member Licensed User

    I understand it now. Thank you very much!
     
  13. MarcoRome

    MarcoRome Expert Licensed User

    Hi all. Joilts, thank you for this example, it is very useful.
    I tried it and it works well in English. But if I add another language, for example Italian, I always get the "Could not initialize tesseract." message.
    The code that I modified is this:
    Code:
    public static String getTextIta(String path, String filename, String extension, String TrainFileDir) {
        BA.Log("Here - getText() ");
        BytePointer outText;
        TessBaseAPI api = new TessBaseAPI();
        BA.Log("Before Init ");
        // Initialize Tesseract with the Italian trained data found under TrainFileDir
        int retCode = api.Init(TrainFileDir, "ita");
        BA.Log("RETCODE =" + retCode);
        if (retCode != 0) {
            return "Could not initialize tesseract.";
        }

        // Read the image with Leptonica and hand it to Tesseract
        PIX image = pixRead(path + filename + extension);
        BA.Log("File Open");
        api.SetImage(image);
        BA.Log("Before get Text");
        outText = api.GetUTF8Text();
        // Copy the text into a Java String before freeing the native buffer
        String result = outText.getString();
        api.End();
        outText.deallocate();
        pixDestroy(image);
        return result;
    }
    The change is at the api.Init line, where I now pass "ita" (it was "eng" before).

    I also added the file ita.traineddata.
    The strange thing is that if I download this file from GitHub I get a 13.6 MB file (ita.traineddata), while if I download it from Google I get a 2.3 MB file.
    Anyway, I tried both and I still get the "Not initialize" message.

    Of course, if I switch back to eng.traineddata and pass "eng" at the api.Init line, everything works.
    Any idea?
    Thank you
    Marco
     
  14. joilts

    joilts Member Licensed User

    Hi Marco,

    It is probably a bad (corrupted) file, or a file missing from the tessdata directory. I just downloaded ita.traineddata from Google (it was a .gz file of about 917 KB), unzipped it (the final size was around 2 MB), and it seems to work fine. I took a PNG with Italian text and made a test app. Here it is, with the data file attached, along with a screenshot of the result. I have no idea what is written in Italian, so forgive me if it's no good.
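
    For a single .gz download like the one described above, the decompression step can also be done in code. This is a minimal sketch using java.util.zip from the standard library; the class and method names (Gunzip, gunzip) are hypothetical. It does not handle .tar.gz archives, which contain a tar file that still has to be unpacked afterwards.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

final class Gunzip {
    /** Decompresses a gzip stream into out and returns the number of bytes written. */
    static long gunzip(InputStream gz, OutputStream out) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        // GZIPInputStream throws an IOException when the data is truncated or corrupt.
        try (GZIPInputStream in = new GZIPInputStream(gz)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

    Because a damaged download usually fails here with an exception, this step can double as a cheap sanity check before the file is handed to Tesseract.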

    As the example project got large with the tessdata file, I uploaded it to Google Drive. You can find it here.
    The screenshot is attached to this post.

    Hope it helps.

    See you.
     

    Attached Files:

    Last edited: Mar 18, 2016
    MarcoRome likes this.
  15. MarcoRome

    MarcoRome Expert Licensed User

    Yes, you were right: it was a corrupted file.
    Thank you very much for your support
     
    joilts likes this.
  16. roberto64

    roberto64 Active Member Licensed User

    joilts, excuse me. I'm trying to create an ANPR app for license plate recognition, but I have searched in vain for a plate recognition library for B4A. I read that you are building one; could you give me a hand?
    Greetings
     
  17. joilts

    joilts Member Licensed User

    Hi Roberto. I've done some work on an ANPR solution. It works on a picture of the vehicle. To make it work I had to train the OCR to recognize the font used in my country; you should do the same for yours. I used jTessBoxEditor and the Tesseract tools for training. There are many other tools for that (some online). But before running the picture through the OCR functions, I had to crop the image so that it contains only the plate (there are many examples on the internet). Then I applied some image transformations to sharpen the image and convert it to black and white. These depend on the color of the plates in your country. In my country we have 5 types of plates, and I had to design 4 different methods to transform the image. With the prepared image, you can run the OCR. Hope it helps. Some of the image transformations I made can be found in this lib for B4i.
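
    To give an idea of the black-and-white step mentioned above, here is a minimal sketch in plain Java. It is not joilts's method (he used OpenCV), only a simple fixed-threshold substitute; the names PlatePrep and toBlackAndWhite are hypothetical, and a suitable threshold depends on the plate colors. It works on packed ARGB pixels, the layout that Android's Bitmap.getPixels returns.

```java
final class PlatePrep {
    /**
     * Converts packed ARGB pixels to pure black or pure white.
     * Pixels whose luminance is at or above threshold (0..255) become white.
     */
    static int[] toBlackAndWhite(int[] argb, int threshold) {
        int[] out = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            // Integer approximation of the Rec. 601 luma weights (0.299, 0.587, 0.114).
            int luma = (299 * r + 587 * g + 114 * b) / 1000;
            out[i] = luma >= threshold ? 0xFFFFFFFF : 0xFF000000;
        }
        return out;
    }
}
```

    A fixed threshold is the simplest choice; plates with colored backgrounds would probably need a different threshold or a per-channel rule, which is in line with the several transformation methods described in the post.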
     
  18. roberto64

    roberto64 Active Member Licensed User

    Hello joilts, and thank you for your time. I have never used B4i; can the lib also be used in B4A?
    Thank you
     
  19. joilts

    joilts Member Licensed User

    No. This lib for B4i cannot be used in B4A because it uses inline code (Objective-C). I pointed to the B4i lib only to give you an idea of which image transformations are needed (in my case). You must rewrite them to work in B4A. I did it using inline Java code and the OpenCV lib. But there are some B4A libs that have some of the transformations you may need.
     
  20. roberto64

    roberto64 Active Member Licensed User

    Hello joilts, I have used Picasso to transform the image and Tesseract OCR to recognize the letters and numbers, with no success. Could you help me with some examples you already have? Unfortunately I am not familiar with Java, only VB.NET.
    Greetings, Roberto
     