B4A Library OCR - Extracting text from a bitmap using the Play Services Vision API

This request comes from here:
https://www.b4x.com/android/forum/threads/ocr-offline-on-screen.84867/#post-538774

The attached project extracts text from a bitmap that you pass to the library, which uses the Android Vision API.

Sample code:
B4X:
#Region  Project Attributes
    #ApplicationLabel: b4aMobileVisionBitmap
    #VersionCode: 1
    #VersionName:
    'SupportedOrientations possible values: unspecified, landscape or portrait.
    #SupportedOrientations: portrait
    #CanInstallToExternalStorage: False
#End Region


#Region  Activity Attributes
    #FullScreen: False
    #IncludeTitle: True
#End Region

Sub Process_Globals
    'These global variables will be declared once when the application starts.
    'These variables can be accessed from all modules.

End Sub

Sub Globals
    'These global variables will be redeclared each time the activity is created.
    'These variables can only be accessed from this module.
 
    Dim mvbm As MobileVisionBitmap

    Private Button1 As Button
    Private iv1 As ImageView
    Private bm As Bitmap
End Sub

Sub Activity_Create(FirstTime As Boolean)
    'Do not forget to load the layout file created with the visual designer. For example:
    Activity.LoadLayout("main")
    bm.Initialize(File.DirAssets,"saying1.png")   'saying.png and saying1.png are in the /Files folder of the B4A project
    iv1.Bitmap = bm
 
    mvbm.Initialize("mvbm")

End Sub

Sub Activity_Resume

End Sub

Sub Activity_Pause (UserClosed As Boolean)

End Sub


Sub Button1_Click
 
    mvbm.decodeBitmap(bm) 
    bm = Null
    bm.Initialize(File.DirAssets,"saying1.png")   'saying.png and saying1.png are in the /Files folder of the B4A project
    iv1.Bitmap = bm
 
End Sub

Sub mvbm_blocks_result(blocks As String)
 
    Log("B4A Blocks = " & blocks)
 
End Sub

Sub mvbm_lines_result(lines As String)
 
    Log("B4A Lines = " & lines)
 
End Sub

Sub mvbm_words_result(words As String)
 
    Log("B4A Words = " & words)
 
End Sub

Sub mvbm_error_result(error As String)
 
    Log("B4A ERROR = " & error)
 
End Sub

Take note of the B4A manifest file:
B4X:
'This code will be applied to the manifest file during compilation.
'You do not need to modify it in most cases.
'See this link for more information: https://www.b4x.com/forum/showthread.php?p=78136
AddManifestText(
<uses-sdk android:minSdkVersion="5" android:targetSdkVersion="22"/>
<supports-screens android:largeScreens="true"
    android:normalScreens="true"
    android:smallScreens="true"
    android:anyDensity="true"/>)
SetApplicationAttribute(android:icon, "@drawable/icon")
SetApplicationAttribute(android:label, "$LABEL$")
'End of default text.

AddApplicationText(<meta-data
            android:name="com.google.android.gms.version"
            android:value="@integer/google_play_services_version" />
        <meta-data
            android:name="com.google.android.gms.vision.DEPENDENCIES"
            android:value="ocr" />)

If you don't have the Android Vision dependencies installed, you will need an initial internet connection so that they can be installed. After that, an internet connection is not required.

This is version 1.00. It will expire on 31 October 2017.


Bitmap loaded into the imageview.
1.png



B4A Log when clicking on button Read Text
2.png
 

Attachments

  • MobileVisionBitmap.xml (1.2 KB)
  • MobileVisionBitmap.jar (10 KB)
  • b4aMobileVisionBitmap.zip (130 KB)

Sandman

Well-Known Member
Licensed User
"if a dev not doing this then the number of donations one can expect are near null"

Aha, that makes it easier to understand. And I kind of suspect that in full it means "This is a time-limited version that will expire on 31 October 2017. If you want the version without that limitation, you should buy it by giving me a donation and I'll send you the file."

I suspect that most of the regulars and all of the seniors here in the forum understood Johan's meaning without problems. But I still consider myself a beginner and a rookie here, and to me it wasn't really clear at all. (I'm not trying to pick a fight or act stupid or anything, I was genuinely confused by the time-limit.)

And for the record, I think it absolutely is fair to be paid for one's efforts.
 

Syd Wright

Well-Known Member
Licensed User
"if a dev not doing this then the number of donations one can expect are near null."
I do agree that it is better to (somehow) allow for a 30 day trial period (or something like that), but that is technically more complicated to realize.

PS
Today I tested the Camera versus the Bitmap version to perform OCR on a TV subtitle (to make a speaking subtitle reader for blind users). At first I thought the camera performed better, but after more experimentation there appears to be no difference: The results vary between excellent, good and poor. If parts of a subtitle have a bright background then that part of the text gets distorted with the OCR. I am still looking for ways to enhance a bitmap (not easy) to make the (white) text more pronounced.
 

Johan Schoeman

Expert
Licensed User
Syd, extracting text from some images is rather complicated, especially when the color of the text is close to that of the background. I guess the only way to try to do it successfully is to greyscale the image (there are numerous ways to do this; I have posted a sample project here: https://www.b4x.com/android/forum/t...-of-converting-rgb-images-to-grayscale.44316/ ), determine the Otsu threshold of the greyscaled image (see this project: https://www.b4x.com/android/forum/threads/otsu-thresholding-binarization-of-images.44406/ ), and then binarize the greyscaled image, from which one can then try to extract the text. I have tried it this way with mixed results, from excellent to good to poor to nothing. I have, however, only tried to extract text from a binarized image, not from an image that was greyscaled only. But I guess we can expect similar results when passing either a greyscaled or a binarized image to the lib: excellent, good, poor, nothing...

When time permits I will look into the lib again to try and understand what the original author did and then see if we can somehow improve the OCR.
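
The greyscale and Otsu steps described above could be sketched in B4A roughly as follows. This is only an illustration and not the code from the linked projects: the sub names GreyHistogram and OtsuThreshold are made up for this example, the weights 0.299 / 0.587 / 0.114 are just one of the common greyscale conversions, and reading pixels one at a time with Bitmap.GetPixel will be slow on large images.

```b4x
'Hypothetical helper: builds a 256-bin histogram of grey levels for a bitmap.
Sub GreyHistogram (bmp As Bitmap) As Int()
    Dim hist(256) As Int
    For y = 0 To bmp.Height - 1
        For x = 0 To bmp.Width - 1
            Dim c As Int = bmp.GetPixel(x, y)                    'ARGB value
            Dim r As Int = Bit.And(Bit.ShiftRight(c, 16), 0xFF)
            Dim g As Int = Bit.And(Bit.ShiftRight(c, 8), 0xFF)
            Dim b As Int = Bit.And(c, 0xFF)
            Dim grey As Int = 0.299 * r + 0.587 * g + 0.114 * b  'truncated to 0..255
            hist(grey) = hist(grey) + 1
        Next
    Next
    Return hist
End Sub

'Hypothetical helper: returns the Otsu threshold (0..255) for a 256-bin histogram.
'It picks the grey level that maximizes the between-class variance.
Sub OtsuThreshold (Hist() As Int) As Int
    Dim total As Double = 0
    Dim sumAll As Double = 0
    For i = 0 To 255
        total = total + Hist(i)
        sumAll = sumAll + i * Hist(i)
    Next
    Dim wB As Double = 0       'pixel count of the darker class
    Dim sumB As Double = 0     'sum of grey levels in the darker class
    Dim bestVar As Double = -1
    Dim best As Int = 0
    For t = 0 To 255
        wB = wB + Hist(t)
        sumB = sumB + t * Hist(t)
        If wB = 0 Then Continue            'no dark pixels yet
        Dim wF As Double = total - wB      'pixel count of the brighter class
        If wF = 0 Then Exit                'all pixels are already in the dark class
        Dim diff As Double = sumB / wB - (sumAll - sumB) / wF
        Dim v As Double = wB * wF * diff * diff
        If v > bestVar Then
            bestVar = v
            best = t
        End If
    Next
    Return best
End Sub
```

To binarize, a pixel whose grey level is above the returned threshold would become white and every other pixel black. For a histogram with 10 pixels at level 50 and 10 pixels at level 200, the sketch returns 50, so the level-200 pixels end up white.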
 

Syd Wright

Well-Known Member
Licensed User
Thank you very much. Today I have tried to use a camera to obtain a bitmap and then perform OCR.
Below is my code (using the Advanced Camera Library, ACL). The results could be better. One of the problems is that the camera does not focus properly on nearby objects with text. I compared the results to "Google Goggles" (which I assume also uses Google Vision) and the results are then 10 times better. Could it be that your library is based on an older Google Vision version? Also it appears that Google Goggles takes more time to do the OCR (which can be derived from the blue vertical line that goes from left to right).

B4X:
#Region  Project Attributes
    #ApplicationLabel: OCR-DSH1
    #VersionCode: 1
    #VersionName:
    'SupportedOrientations possible values: unspecified, landscape or portrait.
    #SupportedOrientations: unspecified
    #CanInstallToExternalStorage: False
#End Region

#Region  Activity Attributes
    #FullScreen: true
    #IncludeTitle: false
#End Region

Sub Process_Globals
    Dim Ticks1 As Long
End Sub

Sub Globals
    Dim mvbm As MobileVisionBitmap
    Private Button1 As Button
    Dim CamOn1 As Int
    Dim Camera1 As AdvancedCamera           'Was: Camera, now "ACL"
    Private ImageView1 As ImageView
    Private Bitmap1 As Bitmap
    Private PicName1 As String
    Private Panel1 As Panel
End Sub

Sub Activity_Create(FirstTime As Boolean)
    Activity.LoadLayout("main")          'Holds: Panel1, Button1 and Imageview1
    PicName1="Foto1.jpg"
    Panel1.Top=0
    Panel1.Left=0
    Panel1.Width=100%x
    Panel1.Height=100%y
    ImageView1.Top=0
    ImageView1.Left=50%x
    ImageView1.Width=40%x
    ImageView1.Height=40%y
    Panel1.BringToFront
    Button1.BringToFront
    mvbm.Initialize("mvbm")
End Sub

Sub Activity_Resume
    Camera1.Initialize(Panel1, "Camera1")
End Sub

Sub Activity_Pause (UserClosed As Boolean)
    Camera1.StopPreview
    Camera1.Release
End Sub

Sub Button1_Click
    Ticks1=DateTime.Now
    Camera1.TakePicture
End Sub

Sub mvbm_blocks_result(blocks As String)
    Dim Dhulp1 As String
    Log(CRLF & "B4A Blocks = " & CRLF &  blocks)
    Dhulp1 = (DateTime.Now-Ticks1)
    Log("Speed= " & Dhulp1 & " msec.")
End Sub

Sub mvbm_lines_result(lines As String)
    'Log("B4A Lines = " & lines)
End Sub

Sub mvbm_words_result(words As String)
    'Log("B4A Words = " & words)
End Sub

Sub mvbm_error_result(error As String)
    Log("B4A ERROR = " & error)
End Sub

Sub Camera1_Ready (Success As Boolean)
    If Success Then
     Camera1.StartPreview
     CamOn1 = 1
    Else
     CamOn1 = 0
    End If
End Sub

Sub Camera1_PictureTaken(Data() As Byte)
    Dim out As OutputStream
    out = File.OpenOutput(File.DirRootExternal, PicName1, False)
    out.WriteBytes(Data, 0, Data.Length)
    out.Close

    ToastMessageShow("Image saved: " & File.Combine(File.DirRootExternal, PicName1), True)

    'Camera1.StartPreview
    Bitmap1 = Null
    Bitmap1 = LoadBitmapSample(File.DirRootExternal,PicName1,800,480)
    ImageView1.Bitmap = Bitmap1

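    'Wait 1.5 s before starting the OCR. Note that Ticks1 is reset here,
    'so the speed logged in mvbm_blocks_result includes this 1500 ms wait.
    Ticks1=DateTime.Now
    Do While DateTime.Now < Ticks1+1500
     DoEvents
    Loop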
    mvbm.decodeBitmap(Bitmap1)
End Sub
 

Syd Wright

Well-Known Member
Licensed User
PS, I tried greyscaled images and reduced bit-depth images (16M, 65k, 256 or 16 colors), but it hardly makes any difference.
What does make a difference is the bitmap resolution. The optimum between OCR speed and best quality is when the resolution is around 800 x 480. Higher resolutions add almost nothing to the results and with small(er) images OCR is bad to very bad.
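
To aim for roughly that resolution without distorting the image, the target size can be computed from the source size first. A minimal sketch: FitInside is a made-up helper name, and 800 x 480 is simply the value reported above, not a limit of the library.

```b4x
'Hypothetical helper: largest size that fits inside MaxW x MaxH
'while keeping the aspect ratio of SrcW x SrcH. Never scales up.
Sub FitInside (SrcW As Int, SrcH As Int, MaxW As Int, MaxH As Int) As Int()
    Dim scale As Double = Min(MaxW / SrcW, MaxH / SrcH)
    If scale > 1 Then scale = 1
    Dim size(2) As Int
    size(0) = Round(SrcW * scale)   'target width
    size(1) = Round(SrcH * scale)   'target height
    Return size
End Sub
```

For example, FitInside(1920, 1080, 800, 480) gives 800 x 450. The resulting width and height can then be used with whichever bitmap loading or resizing call is available in your B4A version.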
 

peacemaker

Expert
Licensed User
The lib "cannot start as expired".

Can the trial be updated?
I'd like to check whether it works, as I found that the Vision libs cannot be initialized: the app stops on my f...ng Samsung running Android 7.
 

ThRuST

Well-Known Member
Licensed User
This is the weakness and downside of small 3rd party solutions that cost money. As time goes by, the developer might no longer be around or reachable to update his product range. Because of this, it would be better to make all source code publicly available. Why else should the library post still reside on a public forum for developers?
 

Johan Schoeman

Expert
Licensed User
I will post the Java code for you. You can then change it to your liking, recompile it, and post the new lib files in this thread. Will do so tomorrow.
 

ThRuST

Well-Known Member
Licensed User
"If you don't have the Android Vision dependencies installed then you will need an initial internet connection for the dependencies to be installed. After that an internet connection will not be required. This is version 1.00. It will expire on 31 October 2017."

Thanks for updating your library so initialization is possible, even though there are other solutions by now, since your previous update is getting old. Cheers :)
 

Johan Schoeman

Expert
Licensed User
Here is the source code - change it to your liking.
 

Attachments

  • src.zip (2.5 KB)

ThRuST

Well-Known Member
Licensed User
Thanks, but that's not really needed since I found another solution. Hopefully your solution can help someone else. In general, source code updates should always be posted in the first post to prevent library updates from being scattered all over the place. Erel might have told you this as well already.
 

Johan Schoeman

Expert
Licensed User
Use from the forum whatever works for you. There are probably many other solutions on the forum that will solve your requirements.
 