B4A Library OCR - Extracting text from a bitmap using the Play Services Vision API

Discussion in 'Additional libraries, classes and official updates' started by Johan Schoeman, Oct 18, 2017.

  1. Johan Schoeman

    Johan Schoeman Expert Licensed User

    This request comes from here:
    https://www.b4x.com/android/forum/threads/ocr-offline-on-screen.84867/#post-538774

    The attached project extracts text from a bitmap that you pass to the library, which uses the Android (Play Services) Vision API.

    Sample code:
    Code:
    #Region  Project Attributes
        #ApplicationLabel: b4aMobileVisionBitmap
        #VersionCode: 1
        #VersionName:
        'SupportedOrientations possible values: unspecified, landscape or portrait.
        #SupportedOrientations: portrait
        #CanInstallToExternalStorage: False
    #End Region

    #Region  Activity Attributes
        #FullScreen: False
        #IncludeTitle: True
    #End Region

    Sub Process_Globals
        'These global variables will be declared once when the application starts.
        'These variables can be accessed from all modules.

    End Sub

    Sub Globals
        'These global variables will be redeclared each time the activity is created.
        'These variables can only be accessed from this module.

        Dim mvbm As MobileVisionBitmap

        Private Button1 As Button
        Private iv1 As ImageView
        Private bm As Bitmap
    End Sub

    Sub Activity_Create(FirstTime As Boolean)
        'Do not forget to load the layout file created with the visual designer. For example:
        Activity.LoadLayout("main")
        bm.Initialize(File.DirAssets, "saying1.png")   'saying.png and saying1.png are in the /Files folder of the B4A project
        iv1.Bitmap = bm

        mvbm.Initialize("mvbm")
    End Sub

    Sub Activity_Resume

    End Sub

    Sub Activity_Pause (UserClosed As Boolean)

    End Sub

    Sub Button1_Click
        mvbm.decodeBitmap(bm)
        bm = Null
        bm.Initialize(File.DirAssets, "saying1.png")   'saying.png and saying1.png are in the /Files folder of the B4A project
        iv1.Bitmap = bm
    End Sub

    Sub mvbm_blocks_result(blocks As String)
        Log("B4A Blocks = " & blocks)
    End Sub

    Sub mvbm_lines_result(lines As String)
        Log("B4A Lines = " & lines)
    End Sub

    Sub mvbm_words_result(words As String)
        Log("B4A Words = " & words)
    End Sub

    Sub mvbm_error_result(error As String)
        Log("B4A ERROR = " & error)
    End Sub
    Take note of the B4A manifest file:
    Code:
    'This code will be applied to the manifest file during compilation.
    'You do not need to modify it in most cases.
    'See this link for more information: https://www.b4x.com/forum/showthread.php?p=78136
    AddManifestText(
    <uses-sdk android:minSdkVersion="5" android:targetSdkVersion="22"/>
    <supports-screens android:largeScreens="true"
        android:normalScreens="true"
        android:smallScreens="true"
        android:anyDensity="true"/>)
    SetApplicationAttribute(android:icon, "@drawable/icon")
    SetApplicationAttribute(android:label, "$LABEL$")
    'End of default text.

    AddApplicationText(<meta-data
                android:name="com.google.android.gms.version"
                android:value="@integer/google_play_services_version" />
            <meta-data
                android:name="com.google.android.gms.vision.DEPENDENCIES"
                android:value="ocr" />)
    If you don't have the Android Vision dependencies installed then you will need an initial internet connection for the dependencies to be installed. After that an internet connection will not be required.

    This is version 1.00. It will expire on 31 October 2017.


    [Image 1.png: Bitmap loaded into the ImageView.]

    [Image 2.png: B4A log when clicking the "Read Text" button.]
     

    Attached Files:

    Last edited: Oct 19, 2017
  2. Sandman

    Sandman Well-Known Member Licensed User

    Why an expiration date? Is this a commercial library?
     
  3. Johan Schoeman

    Johan Schoeman Expert Licensed User

    I have just spent the last 3 hours on this to embed it in B4A.
     
    Shivito1 and MarcoRome like this.
  4. DonManfred

    DonManfred Expert Licensed User

    If a dev doesn't do this, then the number of donations one can expect is close to zero.
     
    moster67 and Johan Schoeman like this.
  5. Sandman

    Sandman Well-Known Member Licensed User

    Aha, that makes it easier to understand. And I kind of suspect that in full it means "This is a time-limited version that will expire on 31 October 2017. If you want the version without that limitation, you should buy it by giving me a donation and I'll send you the file."

    I suspect that most of the regulars and all of the seniors here in the forum understood Johan's meaning without problems. But I still consider myself a beginner and a rookie here, and to me it wasn't really clear at all. (I'm not trying to pick a fight or act stupid or anything, I was genuinely confused by the time-limit.)

    And for the record, I think it absolutely is fair to be paid for one's efforts.
     
    Syd Wright, Krammig and moster67 like this.
  6. Syd Wright

    Syd Wright Active Member Licensed User

    I do agree that it is better to (somehow) allow for a 30-day trial period (or something like that), but that is technically more complicated to implement.

    PS
    Today I tested the Camera versus the Bitmap version to perform OCR on a TV subtitle (to make a speaking subtitle reader for blind users). At first I thought the camera performed better, but after more experimentation there appears to be no difference: the results vary between excellent, good and poor. If part of a subtitle has a bright background, then that part of the text gets garbled by the OCR. I am still looking for ways to enhance a bitmap (not easy) to make the (white) text more pronounced.
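
    One possible way to make white subtitle text stand out is to keep only the near-white pixels and discard everything else. The following is a minimal sketch in plain Java, not part of the MobileVisionBitmap library: the class name `WhiteTextMask`, the method `maskNearWhite` and the cut-off value are hypothetical, and the pixel array is assumed to hold packed ARGB values (for example as returned by `android.graphics.Bitmap.getPixels`). It will not help where the background is as bright as the text itself.

```java
/** Hypothetical helper for illustration; not part of the MobileVisionBitmap library. */
final class WhiteTextMask {
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /**
     * Returns a new ARGB pixel array in which every pixel whose red, green and
     * blue channels are all >= minChannel becomes black, and every other pixel
     * becomes white. Near-white text thus ends up as black text on white.
     */
    static int[] maskNearWhite(int[] argb, int minChannel) {
        int[] out = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int p = argb[i];
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            boolean nearWhite = r >= minChannel && g >= minChannel && b >= minChannel;
            out[i] = nearWhite ? BLACK : WHITE;
        }
        return out;
    }
}
```

    A cut-off such as 230 is only a starting guess; the value that works best would have to be found by experiment on real subtitle frames.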
     
    Last edited: Oct 25, 2017
  7. Johan Schoeman

    Johan Schoeman Expert Licensed User

    Syd, it seems to be rather complicated to extract text from some images, especially when the color of the text is close to that of the background. I guess the only way to try and do it successfully is to greyscale the image (there are numerous ways to do it; I have posted a sample project here: https://www.b4x.com/android/forum/t...-of-converting-rgb-images-to-grayscale.44316/ ), determine the Otsu threshold of the greyscaled image (see this project: https://www.b4x.com/android/forum/threads/otsu-thresholding-binarization-of-images.44406/ ), and then binarize the greyscaled image, from which one can then try to extract the text. I have tried it this way with mixed results, from excellent to good to poor to nothing. I have, however, only tried extracting text from a binarized image, not from a greyscaled one. But I guess we can expect similar results when passing either a greyscaled or a binarized image to the lib: excellent, good, poor, nothing....
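
    The greyscale, Otsu threshold and binarize steps described above can be sketched as follows. This is a generic plain-Java illustration, not the code from the linked projects or from the library: the class `OtsuBinarizer` and its method names are hypothetical, the greyscale conversion uses the common BT.601 luma weights (one of the "numerous ways"), and the input is assumed to be an array of packed ARGB pixels.

```java
/** Hypothetical helper for illustration; not taken from the linked projects. */
final class OtsuBinarizer {
    /** Luma-weighted grey value (BT.601 weights), in the range 0..255. */
    static int gray(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return (299 * r + 587 * g + 114 * b + 500) / 1000;
    }

    /**
     * Otsu's method: returns the grey level t that maximizes the between-class
     * variance when pixels <= t form one class and pixels > t the other.
     */
    static int otsuThreshold(int[] grays) {
        int[] hist = new int[256];
        for (int v : grays) hist[v]++;
        long total = grays.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (long) i * hist[i];

        long weightLow = 0;   // number of pixels with grey value <= t
        long sumLow = 0;      // sum of grey values of those pixels
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightLow += hist[t];
            sumLow += (long) t * hist[t];
            if (weightLow == 0) continue;          // lower class still empty
            long weightHigh = total - weightLow;
            if (weightHigh == 0) break;            // upper class empty from here on
            double meanLow = (double) sumLow / weightLow;
            double meanHigh = (double) (sumAll - sumLow) / weightHigh;
            double diff = meanLow - meanHigh;
            double between = (double) weightLow * weightHigh * diff * diff;
            if (between > bestVar) {
                bestVar = between;
                best = t;
            }
        }
        return best;
    }

    /** Greyscales the pixels, then maps values above the Otsu threshold to white and the rest to black. */
    static int[] binarize(int[] argb) {
        int[] grays = new int[argb.length];
        for (int i = 0; i < argb.length; i++) grays[i] = gray(argb[i]);
        int t = otsuThreshold(grays);
        int[] out = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            out[i] = grays[i] > t ? 0xFFFFFFFF : 0xFF000000;
        }
        return out;
    }
}
```

    Because the threshold is computed over the whole image, a frame with strongly uneven lighting may still binarize poorly, which would be consistent with the mixed results mentioned above.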

    When time permits I will look into the lib again to try and understand what the original author did and then see if we can somehow improve the OCR.
     
    Last edited: Oct 25, 2017
    Syd Wright and JordiCP like this.
  8. Syd Wright

    Syd Wright Active Member Licensed User

    Thank you very much. Today I have tried to use a camera to obtain a bitmap and then perform OCR.
    Below is my code (using the Advanced Camera Library, ACL). The results could be better. One of the problems is that the camera does not focus properly on nearby objects with text. I compared the results to "Google Goggles" (which I assume also uses Google Vision) and the results are then 10 times better. Could it be that your library is based on an older Google Vision version? Also it appears that Google Goggles takes more time to do the OCR (which can be derived from the blue vertical line that goes from left to right).

    Code:
    #Region  Project Attributes
        #ApplicationLabel: OCR-DSH1
        #VersionCode: 1
        #VersionName:
        'SupportedOrientations possible values: unspecified, landscape or portrait.
        #SupportedOrientations: unspecified
        #CanInstallToExternalStorage: False
    #End Region

    #Region  Activity Attributes
        #FullScreen: true
        #IncludeTitle: false
    #End Region

    Sub Process_Globals
        Dim Ticks1 As Long
    End Sub

    Sub Globals
        Dim mvbm As MobileVisionBitmap
        Private Button1 As Button
        Dim CamOn1 As Int
        Dim Camera1 As AdvancedCamera           'Was: Camera, now "ACL"
        Private ImageView1 As ImageView
        Private Bitmap1 As Bitmap
        Private PicName1 As String
        Private Panel1 As Panel
    End Sub

    Sub Activity_Create(FirstTime As Boolean)
        Activity.LoadLayout("main")          'Holds: Panel1, Button1 and Imageview1
        PicName1 = "Foto1.jpg"
        Panel1.Top = 0
        Panel1.Left = 0
        Panel1.Width = 100%x
        Panel1.Height = 100%y
        ImageView1.Top = 0
        ImageView1.Left = 50%x
        ImageView1.Width = 40%x
        ImageView1.Height = 40%y
        Panel1.BringToFront
        Button1.BringToFront
        mvbm.Initialize("mvbm")
    End Sub

    Sub Activity_Resume
        Camera1.Initialize(Panel1, "Camera1")
    End Sub

    Sub Activity_Pause (UserClosed As Boolean)
        Camera1.StopPreview
        Camera1.Release
    End Sub

    Sub Button1_Click
        Ticks1 = DateTime.Now
        Camera1.TakePicture
    End Sub

    Sub mvbm_blocks_result(blocks As String)
        Dim Dhulp1 As String
        Log(CRLF & "B4A Blocks = " & CRLF & blocks)
        Dhulp1 = (DateTime.Now - Ticks1)
        Log("Speed= " & Dhulp1 & " msec.")
    End Sub

    Sub mvbm_lines_result(lines As String)
        'Log("B4A Lines = " & lines)
    End Sub

    Sub mvbm_words_result(words As String)
        'Log("B4A Words = " & words)
    End Sub

    Sub mvbm_error_result(error As String)
        Log("B4A ERROR = " & error)
    End Sub

    Sub Camera1_Ready (Success As Boolean)
        If Success Then
            Camera1.StartPreview
            CamOn1 = 1
        Else
            CamOn1 = 0
        End If
    End Sub

    Sub Camera1_PictureTaken(Data() As Byte)
        Dim out As OutputStream
        out = File.OpenOutput(File.DirRootExternal, PicName1, False)
        out.WriteBytes(Data, 0, Data.Length)
        out.Close

        ToastMessageShow("Image saved: " & File.Combine(File.DirRootExternal, PicName1), True)

        'Camera1.StartPreview
        Bitmap1 = Null
        Bitmap1 = LoadBitmapSample(File.DirRootExternal, PicName1, 800, 480)
        ImageView1.Bitmap = Bitmap1

        'Wait about 1.5 seconds before starting the OCR.
        'Note: Ticks1 is reset here, so the logged "Speed" includes this wait.
        Ticks1 = DateTime.Now
        Do While DateTime.Now < Ticks1 + 1500
            DoEvents
        Loop
        mvbm.decodeBitmap(Bitmap1)
    End Sub
     
    Last edited: Oct 25, 2017
  9. Syd Wright

    Syd Wright Active Member Licensed User

    PS, I tried greyscaled images and images with reduced color depth (16M, 65k, 256 or 16 colors), but it hardly makes any difference.
    What does make a difference is the bitmap resolution. The best trade-off between OCR speed and quality is a resolution of around 800 x 480. Higher resolutions add almost nothing to the results, and with small(er) images the OCR is bad to very bad.
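
    To aim for roughly that resolution without distorting the picture, the target size can be computed so that the image fits inside an 800 x 480 box while keeping its aspect ratio. A minimal plain-Java sketch of that arithmetic, with a hypothetical class `OcrScale` and method `fitWithin` (neither is part of the library or of B4A):

```java
/** Hypothetical helper for illustration; not part of the library or of B4A. */
final class OcrScale {
    /**
     * Returns {width, height} of the source scaled to fit inside maxW x maxH
     * while keeping the aspect ratio. Images that already fit are not enlarged.
     */
    static int[] fitWithin(int srcW, int srcH, int maxW, int maxH) {
        double scale = Math.min(1.0, Math.min((double) maxW / srcW, (double) maxH / srcH));
        int w = Math.max(1, (int) Math.round(srcW * scale));
        int h = Math.max(1, (int) Math.round(srcH * scale));
        return new int[] {w, h};
    }
}
```

    For example, `OcrScale.fitWithin(4000, 3000, 800, 480)` gives 640 x 480 for a 4:3 camera picture. The sketch deliberately does not enlarge small images, on the assumption that upscaling adds no detail for the OCR to work with.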
     
  10. Eme Fibonacci

    Eme Fibonacci Well-Known Member Licensed User

    It's hard to get results. They vary a lot from Excellent to poor.
     
  11. peacemaker

    peacemaker Well-Known Member Licensed User

    The lib "cannot start as expired".

    Can the trial be updated?
    I'd like to check whether it works, as I found that the Vision libs cannot be initialized; the app stops on my f...ng Samsung with Android 7.
     
    Last edited: Nov 20, 2017
    ThRuST likes this.
  12. ThRuST

    ThRuST Well-Known Member Licensed User

    This is the weakness and downside of small 3rd-party solutions that cost money. As time goes by, the developer might not even be around or reachable to update his product range. It would be better to make all source code publicly available because of this. Why else should the library post still reside on a public forum for developers?
     
  13. Johan Schoeman

    Johan Schoeman Expert Licensed User

    I will post the Java code for you, and you can then change it to your liking, recompile it, and post the new lib files in this thread. Will do so tomorrow.
     
  14. ThRuST

    ThRuST Well-Known Member Licensed User

    Thanks for updating your library so initialization is possible, even though there are other solutions now that your previous update is getting old. Cheers :)
     
  15. Johan Schoeman

    Johan Schoeman Expert Licensed User

    Here is the source code - change it to your liking.
     

    Attached Files:

    • src.zip (File size: 2.5 KB, Views: 52)
    DonManfred likes this.
  16. ThRuST

    ThRuST Well-Known Member Licensed User

    Thanks, but that's not really needed since I found another solution. Hopefully your solution can help someone else. In general, source code updates should always be posted in the first post to prevent library updates from being scattered all over the place. Erel might have told you this as well already.
     
  17. ThRuST

    ThRuST Well-Known Member Licensed User

    If not he should :D
     
  18. moster67

    moster67 Expert Licensed User

    and no one has told you to "hålla käften" (Swedish for "shut up")? :oops:
    :D
     
    ThRuST and DonManfred like this.
  19. ThRuST

    ThRuST Well-Known Member Licensed User

    Thanks monster67, you're welcome to place your comments >here< :)
     
  20. Johan Schoeman

    Johan Schoeman Expert Licensed User

    Use from the forum whatever works for you. There are probably many other solutions on the forum that will solve your requirements.
     