B4J Library [B4X] xOCR Class

Discussion in 'B4J Libraries & Classes' started by Blueforcer, Oct 29, 2018.

  1. Blueforcer

    Blueforcer Active Member Licensed User

    This class (for B4J, B4A and B4i) uses the ocr.space service to convert scans or (smartphone) photos of text documents into editable text using Optical Character Recognition (OCR). The recognition quality is comparable to commercial OCR SDKs (e.g. ABBYY).

    You can extract the raw text and, optionally, the bounding-box coordinates of each word, grouped by line.

    You need a free API key, which is limited to 25,000 requests/month or 500 calls/day.
    Register with ocr.space for your free OCR API key.
    The free key may also be used in commercial apps!

    This class depends on XUI and OkHttpUtils2.

    The uploaded image should be 1 MB or less.
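
    To catch oversized images before spending an API call, you could check the file size first. This is only a rough sketch: IsWithinUploadLimit is a hypothetical helper (not part of the xOCR class), treating 1 MB as 1024 * 1024 bytes is an assumption, and the class may re-encode the bitmap before uploading, so the size on disk is only an approximation. File.Size may also not report a usable value for files in File.DirAssets, so this is meant for images stored elsewhere.

    Code:
    'Hypothetical helper: returns True if the file on disk is at most 1 MB.
    'Assumes 1 MB = 1024 * 1024 bytes.
    Sub IsWithinUploadLimit (Dir As String, FileName As String) As Boolean
        Return File.Size(Dir, FileName) <= 1024 * 1024
    End Sub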

    Code:
    Sub Process_Globals
        Private fx As JFX
        Private MainForm As Form
        Dim xui As XUI
        Dim OCR As xOCR
    End Sub

    Sub AppStart (Form1 As Form, Args() As String)
        MainForm = Form1
        MainForm.RootPane.LoadLayout("Main") 'Load the layout file.
        MainForm.Show
        xui.SetDataFolder("OCR")

        'The event name "ocr" routes the results to OCR_finished and OCR_overlay.
        OCR.Initialize(Me, "ocr", "your_api_key")
        'Parameters: Language, Image, Autorotate, Overlay
        OCR.OCR("ger", xui.LoadBitmap(File.DirAssets, "image.jpg"), False, True)
    End Sub

    'Return true to allow the default exceptions handler to handle the uncaught exception.
    Sub Application_Error (Error As Exception, StackTrace As String) As Boolean
        Return True
    End Sub

    'Raised when the recognition has finished.
    Sub OCR_finished (Text As String, ProcessingTime As Int)
        Log(Text)
        Log(ProcessingTime & " ms")
    End Sub

    'Raised with the word coordinates when Overlay was set to True.
    Sub OCR_overlay (Overlay As Map)
        Log(Overlay)
    End Sub

    The OCR function takes the following parameters:

    Language:
    The language used for OCR. If you pass "", English ("eng") is used as the default.
    IMPORTANT: The language code always has 3 letters (not 2), so it is "eng" and not "en".

    Arabic = ara
    Bulgarian = bul
    Chinese(Simplified) = chs
    Chinese(Traditional) = cht
    Croatian = hrv
    Czech = cze
    Danish = dan
    Dutch = dut
    English = eng
    Finnish = fin
    French = fre
    German = ger
    Greek = gre
    Hungarian = hun
    Korean = kor
    Italian = ita
    Japanese = jpn
    Norwegian = nor
    Polish = pol
    Portuguese = por
    Russian = rus
    Slovenian = slv
    Spanish = spa
    Swedish = swe
    Turkish = tur
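
    If the language code comes from user input or a settings file, a small check against the list above can catch a 2-letter code before the request is sent. This is a minimal sketch: IsSupportedLanguage is a hypothetical helper and not part of the xOCR class, and the list only covers the codes named in this post.

    Code:
    'Hypothetical helper: True for "" (falls back to English) or one of the codes listed above.
    Sub IsSupportedLanguage (Code As String) As Boolean
        Dim Codes As List = Array As String("ara", "bul", "chs", "cht", "hrv", "cze", "dan", _
            "dut", "eng", "fin", "fre", "ger", "gre", "hun", "kor", "ita", "jpn", "nor", _
            "pol", "por", "rus", "slv", "spa", "swe", "tur")
        If Code = "" Then Return True
        Return Codes.IndexOf(Code.ToLowerCase) > -1
    End Sub

    For example, IsSupportedLanguage("en") returns False, while IsSupportedLanguage("eng") returns True.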

    Image
    The image to be processed.

    Autorotate
    If set to True, the API automatically rotates the image to the correct orientation.

    Overlay
    If true, returns the coordinates of the bounding boxes for each word.
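
    Instead of logging the whole map, the OCR_overlay handler could walk it word by word. The sketch below assumes that the Overlay map has the same shape as the TextOverlay object in the ocr.space JSON response (a "Lines" list whose entries each hold a "Words" list with WordText, Left, Top, Width and Height). Whether the class passes the map through in exactly this form is an assumption; log the map once to verify the keys before relying on them.

    Code:
    Sub OCR_overlay (Overlay As Map)
        'Assumed shape: {Lines: [{Words: [{WordText, Left, Top, Width, Height}, ...]}, ...]}
        If Overlay.IsInitialized = False Then Return
        If Overlay.ContainsKey("Lines") = False Then Return
        Dim Lines As List = Overlay.Get("Lines")
        For Each LineMap As Map In Lines
            Dim Words As List = LineMap.Get("Words")
            For Each WordMap As Map In Words
                'One log entry per recognized word with its bounding box.
                Log(WordMap.Get("WordText") & ": left=" & WordMap.Get("Left") & _
                    ", top=" & WordMap.Get("Top") & ", width=" & WordMap.Get("Width") & _
                    ", height=" & WordMap.Get("Height"))
            Next
        Next
    End Sub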
     

    Last edited: Oct 31, 2018
  2. Blueforcer

    Blueforcer Active Member Licensed User

    Should I post this class in the B4A forum too?
     
  3. Erel

    Erel Administrator Staff Member Licensed User

    No need. The search engine knows that [B4X] threads are cross platform.

    Note that your code will work in B4i as well. You just need to change the number in xui.SubExists to the correct number of parameters (it will not affect the behavior in B4A and B4J).
     
  4. Blueforcer

    Blueforcer Active Member Licensed User

    Updated the class to work with B4i. Thanks, Erel.
     