B4J Question [SOLVED] Tesseract API - a $120 opportunity

Daestrum

Expert
Licensed User
Longtime User
Try this. It can read from an image file or a byte buffer.

jTesseract.bas is the source for the library.

pic1.png is what I used for testing it.
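
For the byte-buffer route, the bytes normally have to be decoded back into an image before an OCR engine can use them. Below is a rough Java sketch of that one step. It is not taken from jTesseract.bas, it assumes the buffer holds an encoded image (PNG, JPEG, ...), and `BufferDecode` / `toImage` are hypothetical names used only for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of jTesseract or Tess4J.
final class BufferDecode {
    /** Decodes an encoded image (PNG, JPEG, ...) that is held in memory. */
    static BufferedImage toImage(byte[] encoded) throws IOException {
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(encoded));
        if (img == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("Buffer does not contain a supported image format");
        }
        return img;
    }
}
```

The resulting `BufferedImage` is the kind of object Tess4J's `doOCR` accepts, so a wrapper could pass it straight on.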
 

Attachments

  • jTesseract.zip (3.1 KB)
  • jTesseractTester.zip (988 bytes)
  • jTesseract.bas (2.2 KB)
  • pic1.png (8.3 KB)

xulihang

Active Member
Licensed User
Longtime User
Since this is a desktop app, we can also simply run Tesseract's command-line program. I made a PDF-to-text tool this way.
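
A minimal Java sketch of the command-line approach, for illustration only (this is not the PDF-to-text tool mentioned above). It assumes a `tesseract` executable is installed and reachable via the given path or PATH, and it uses the CLI form `tesseract <image> stdout -l <lang>`, which prints the recognized text to standard output. `TesseractCli`, `buildCommand` and `ocr` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; works on Java 8.
final class TesseractCli {
    /** Builds the argument list: <exe> <image> stdout -l <lang>. */
    static List<String> buildCommand(String exe, String imagePath, String lang) {
        return Arrays.asList(exe, imagePath, "stdout", "-l", lang);
    }

    /** Runs tesseract and returns whatever it wrote to standard output. */
    static String ocr(String exe, String imagePath, String lang)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(exe, imagePath, lang));
        // Send warnings and errors to this process's stderr so they don't mix with the text.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = pb.start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("tesseract exited with code " + exit);
        }
        return out.toString("UTF-8");
    }
}
```

Usage would look like `TesseractCli.ocr("tesseract", "pic1.png", "eng")`. In B4J the same command could presumably be launched with the jShell library instead.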
 

jroriz

Active Member
Licensed User
Longtime User
I get an error when I try to compile:

B4J Version: 7.00
Java Version: 8
Parsing code. (0.21s)
Building folders structure. (0.02s)
Compiling code. (0.24s)
Compiling layouts code. (0.00s)
Organizing libraries. (0.00s)
Compiling generated Java code. Error
Cannot find: c:\temp\tess4j\commons-beanutils-1.9.2.jar
 

Daestrum

Sorry, these are the extra jar files Tesseract requires (I just placed them in c:/temp/tess4j to stop my extralibs getting too large).
B4X:
commons-beanutils-1.9.2.jar     
commons-collections-3.2.1.jar
commons-io-2.6.jar             
commons-logging-1.2.jar
fontbox-2.0.12.jar             
ghost4j-1.0.1.jar
itext-2.1.7.jar                 
jai-imageio-core-1.4.0.jar
jbig2-imageio-3.0.2.jar         
jboss-logging-3.1.4.GA.jar
jboss-vfs-3.2.14.Final.jar     
jcl-over-slf4j-1.7.25.jar
jna-5.1.0.jar                   
jul-to-slf4j-1.7.25.jar
lept4j-1.10.0.jar               
log4j-1.2.17.jar
log4j-over-slf4j-1.7.25.jar     
logback-classic-1.2.3.jar
logback-core-1.2.3.jar         
pdfbox-2.0.12.jar               
pdfbox-debugger-2.0.12.jar
pdfbox-tools-2.0.12.jar         
slf4j-api-1.7.25.jar
tess4j-4.3.1.jar               
xmlgraphics-commons-1.4.jar
 

jroriz

Could you please attach them all?
 

jroriz

Daestrum said:
Get them from here (then we don't use up Erel's storage)
https://jar-download.com/artifact-search/tess4j

Now the error is gone.
But there is another one:

Code:
B4X:
    Dim t As jTesseract
    t.Initialize("C:\tess\tessdata")
    Dim c3po As AWTRobot
    Log(t.OcrFromBuffer(c3po.ScreenCaptureAsByteArray))

Raises the error:
jtesseract._ocrfrombuffer (java line: 72)
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at anywheresoftware.b4j.object.JavaObject.RunMethod(JavaObject.java:132)
at b4j.example.jtesseract._ocrfrombuffer(jtesseract.java:72)
at b4j.example.main$ResumableSub_Teste.resume(main.java:7700)
at b4j.example.main._teste(main.java:7672)
at b4j.example.main._mainform_mouseclicked(main.java:4286)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at anywheresoftware.b4a.BA.raiseEvent2(BA.java:91)
at anywheresoftware.b4a.BA.raiseEvent(BA.java:78)
at anywheresoftware.b4j.objects.NodeWrapper$1.handle(NodeWrapper.java:93)
at anywheresoftware.b4j.objects.NodeWrapper$1.handle(NodeWrapper.java:1)
at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:54)
at javafx.event.Event.fireEvent(Event.java:198)
at javafx.scene.Scene$ClickGenerator.postProcess(Scene.java:3470)
at javafx.scene.Scene$ClickGenerator.access$8100(Scene.java:3398)
at javafx.scene.Scene$MouseHandler.process(Scene.java:3766)
at javafx.scene.Scene$MouseHandler.access$1500(Scene.java:3485)
at javafx.scene.Scene.impl_processMouseEvent(Scene.java:1762)
at javafx.scene.Scene$ScenePeerListener.mouseEvent(Scene.java:2494)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:394)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:295)
at java.security.AccessController.doPrivileged(Native Method)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.lambda$handleMouseEvent$353(GlassViewEventHandler.java:432)
at com.sun.javafx.tk.quantum.QuantumToolkit.runWithoutRenderLock(QuantumToolkit.java:389)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.handleMouseEvent(GlassViewEventHandler.java:431)
at com.sun.glass.ui.View.handleMouseEvent(View.java:555)
at com.sun.glass.ui.View.notifyMouse(View.java:937)
at com.sun.glass.ui.win.WinApplication._runLoop(Native Method)
at com.sun.glass.ui.win.WinApplication.lambda$null$147(WinApplication.java:177)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.UnsatisfiedLinkError: Não foi possível encontrar o módulo especificado. [English: The specified module could not be found.]
at com.sun.jna.Native.open(Native Method)
at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:288)
at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:427)
at com.sun.jna.Library$Handler.<init>(Library.java:179)
at com.sun.jna.Native.loadLibrary(Native.java:641)
at com.sun.jna.Native.loadLibrary(Native.java:625)
at net.sourceforge.tess4j.util.LoadLibs.getTessAPIInstance(LoadLibs.java:85)
at net.sourceforge.tess4j.TessAPI.<clinit>(TessAPI.java:42)
at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:426)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:310)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:293)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:274)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:258)
at b4j.example.jtesseract.getImgTextFromBuffer(jtesseract.java:118)
... 47 more
 

Daestrum

Did you download the traineddata file for the language you want to use?

In the tess4j folder there should be 25 jar files and one directory (tessdata).
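
That layout (25 jar files plus a tessdata directory holding the language's .traineddata file) is easy to check by hand, but a few lines of Java can do it too. This is only a sketch: the default path is the c:/temp/tess4j folder mentioned earlier in the thread, and `Tess4jFolderCheck`, `countJars` and `hasTrainedData` are made-up names.

```java
import java.io.File;

// Hypothetical helper for checking the folder layout described above.
final class Tess4jFolderCheck {
    /** Counts entries ending in .jar directly inside dir (0 if dir does not exist). */
    static int countJars(File dir) {
        File[] jars = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".jar"));
        return jars == null ? 0 : jars.length;
    }

    /** True if dir/tessdata/<lang>.traineddata exists as a regular file. */
    static boolean hasTrainedData(File dir, String lang) {
        return new File(new File(dir, "tessdata"), lang + ".traineddata").isFile();
    }

    public static void main(String[] args) {
        // Assumed default location; pass another folder as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "c:/temp/tess4j");
        System.out.println("jar files: " + countJars(dir));
        System.out.println("eng.traineddata present: " + hasTrainedData(dir, "eng"));
    }
}
```

A jar count other than 25, or a missing eng.traineddata, would point to an incomplete download or a wrong path.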
 

jroriz

This is what I have done so far:
- downloaded tess4j and extracted it to c:\tess4j
- edited jTesseract.xml and changed c:\temp\tess4j to c:\tess4j, where I put the other jars.

I'm using English as the language.

This is my C:\Tess4J\tessdata folder:
Capturar.PNG
 

Daestrum

OK, try this. It uses jTesseract as a class module.

You will need to change the location of the jars in the main module.
 

Attachments

  • tess4jClassAsModule.zip (2.2 KB)

Daestrum

I loaded your B4J app, changed the path to the jars in the main module, and it worked fine.
 

Daestrum

Note that my tessdata folder only has eng.traineddata in it, no other files at all.
 

jroriz

Is there something special about the file tess4j-3.4.8.jar, which is in the C:\Tess4J\dist folder? Should it be copied to any special place?
Did you install something? I simply downloaded and unzipped tess4j.
 

Daestrum

From the link I posted in post #8, I just downloaded the file (it was called jar_files.zip) and unpacked it into my c:/temp/tess4j folder.
 

jroriz

I did it all again and now it works.
But it's issuing a warning, which I think slows down the OCR process:
"Warning: Parameter not found: enable_new_segsearch"
 

Daestrum

You really don't owe me anything.
I enjoy writing code, and not for monetary reward.
What you could do with it:
A, buy a book on Java; this will help you.
B, have a meal with it.
C, use it to extend your B4X licence.
D, keep it.

The plus for me is I had never heard of Tesseract before today, so I learned something new too.
 