Share My Creation Screen OCR in 5 minutes and 50 lines!

roberto64 · Apr 3, 2018

Hi, with the jAWTRobot library ver. 1.55 the commands "c3po.runCommand", "c3po.rectangleAsArbitrary" and "c3po.CreateScreenCaptureToFile" are not recognized.

B4X:

Sub DoOCR
    c3po.runCommand("C:\TO\tesseract frame.png text")
End Sub
' error
Sub ScreenCapture
    c3po.rectangleAsArbitrary(OCRFrame.f.WindowLeft, OCRFrame.f.WindowTop, OCRFrame.f.Width, OCRFrame.f.Height)
    c3po.CreateScreenCaptureToFile("Frame.png")
End Sub

jroriz · Apr 4, 2018

roberto64 said:
Hi, with the jAWTRobot library ver. 1.55 the commands "c3po.runCommand", "c3po.rectangleAsArbitrary" and "c3po.CreateScreenCaptureToFile" are not recognized.

B4X:

Sub DoOCR
    c3po.runCommand("C:\TO\tesseract frame.png text")
End Sub
' error
Sub ScreenCapture
    c3po.rectangleAsArbitrary(OCRFrame.f.WindowLeft, OCRFrame.f.WindowTop, OCRFrame.f.Width, OCRFrame.f.Height)
    c3po.CreateScreenCaptureToFile("Frame.png")
End Sub

Roberto must be Brazilian... There are plenty of ways to solve this. One of them is to use version 1.0 (attached).

You can also use jShell:

B4X:

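Sub DoOCR
    Dim shl As Shell    ' requires the jShell library
    ' Executable first, then each command-line argument as its own array element
    shl.Initialize("", "C:\TO\tesseract", Array As String("frame.png", "text"))
    shl.WorkingDirectory = File.DirApp
    shl.Run(1000)    ' timeout in milliseconds
    'c3po.runCommand("C:\TO\tesseract frame.png text")    ' change tesseract folder if needed
End Sub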

joulongleu · Apr 4, 2018

Hi jroriz, can this be used with Chinese? I copied chi_sim.traineddata into tesseract-OCR\tessdata, but it doesn't work.

jroriz · Apr 4, 2018

joulongleu said:
Hi jroriz, can this be used with Chinese? I copied chi_sim.traineddata into tesseract-OCR\tessdata, but it doesn't work.

Hi.
You will need to change
c3po.runCommand("C:\TO\tesseract frame.png text")
to
c3po.runCommand("C:\TO\tesseract frame.png text -l chi_sim")
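
If you are using the jShell variant from earlier in the thread instead of c3po.runCommand, the same change might look like the sketch below. It assumes Tesseract is installed in C:\TO, that frame.png is in File.DirApp, and that chi_sim.traineddata is in the tessdata folder; the sub name DoOCRChinese and the 10 second timeout are only illustrative.

```b4x
' Sketch only: the same command line, passed through jShell with the language flag.
' DoOCRChinese is a hypothetical name; adjust the C:\TO path to your install folder.
Sub DoOCRChinese
    Dim shl As Shell
    ' "-l" and "chi_sim" are two separate arguments, so they are two array elements
    shl.Initialize("", "C:\TO\tesseract", Array As String("frame.png", "text", "-l", "chi_sim"))
    shl.WorkingDirectory = File.DirApp    ' frame.png is read from this folder, text.txt is written to it
    shl.Run(10000)                        ' timeout in milliseconds (arbitrary choice)
End Sub
```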

There are newer versions of tessdata, with better-trained files.
Search for them and run some tests.

Note that there are other parameters you can try:

Usage: tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]

pagesegmode values are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
-l lang and/or -psm pagesegmode must occur before any configfile.

Single options:
-v --version: version info
--list-langs: list available languages for tesseract engine
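
As a rough illustration of the options above, here is one way they could be passed through jShell. This is only a sketch: it assumes Tesseract is in C:\TO and that the jShell library is referenced, and the names RunTesseract, DoOCRWithOptions and ListLanguages are made up for this example. Note that Tesseract 4 and later spell the flag --psm, so adjust it to match the usage text of your version.

```b4x
' Sketch only. RunTesseract, DoOCRWithOptions and ListLanguages are hypothetical names.
Sub RunTesseract (Args As List)
    Dim shl As Shell
    shl.Initialize("shl", "C:\TO\tesseract", Args)    ' "shl" is the event name prefix
    shl.WorkingDirectory = File.DirApp
    shl.Run(10000)                                    ' timeout in milliseconds (arbitrary choice)
End Sub

' Builds: tesseract frame.png text [-l Lang] [-psm Psm]
' Pass "" for Lang or -1 for Psm to leave that option out.
Sub DoOCRWithOptions (Lang As String, Psm As Int)
    Dim args As List
    args.Initialize
    args.AddAll(Array As String("frame.png", "text"))
    If Lang <> "" Then args.AddAll(Array As String("-l", Lang))
    If Psm >= 0 Then args.AddAll(Array As String("-psm", "" & Psm))
    RunTesseract(args)
End Sub

' Asks Tesseract which languages it can see in its tessdata folder
Sub ListLanguages
    Dim args As List = Array As String("--list-langs")
    RunTesseract(args)
End Sub

' Raised by jShell when the process ends. Both streams are logged because
' some Tesseract builds write their messages to StdErr instead of StdOut.
Sub shl_ProcessCompleted (Success As Boolean, ExitCode As Int, StdOut As String, StdErr As String)
    Log("Success=" & Success & ", ExitCode=" & ExitCode)
    Log(StdOut)
    Log(StdErr)
End Sub
```

For example, DoOCRWithOptions("chi_sim", 7) would treat the captured frame as a single line of Chinese text, and ListLanguages is a quick way to check that chi_sim is actually visible to the engine.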

Hugs.

supriono · May 17, 2018

jroriz said:
Exactly like the thread title! 5 minutes. 50 lines...

1 - Download and install Tesseract (<15Mb): https://sourceforge.net/projects/tesseract-ocr-alt/files/
2 - Install Tesseract in "C:\TO" - OR change the DoOcr sub to match the location where you installed it
3 - Start the program. Move the frame to the position of the screen where the text to be read is.
4 - Click OCR!
5 - That's it!

I found this error:
Cannot run program "C:\TO\tesseract": CreateProcess error=2, The system cannot find the file specified

DonManfred · May 17, 2018

supriono said:
I found this error:
Cannot run program "C:\TO\tesseract": CreateProcess error=2, The system cannot find the file specified

jroriz said:
2 - Install Tesseract in "C:\TO" - OR change the DoOcr sub to match the location where you installed it

Did you adapt the code to match the folder where you installed it?
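
One way to catch this earlier is to check for the executable before launching it. The sketch below assumes a Windows install (so the file is tesseract.exe) and uses a hypothetical tessDir variable that you would set to your own install folder; it is a variation on the jShell version of DoOCR from this thread, not the original code.

```b4x
' Sketch only: check the install folder before running Tesseract.
Sub DoOCR
    Dim tessDir As String = "C:\TO"    ' assumption: change to where you installed Tesseract
    If File.Exists(tessDir, "tesseract.exe") = False Then
        Log("tesseract.exe not found in " & tessDir)
        Return
    End If
    Dim shl As Shell
    shl.Initialize("", File.Combine(tessDir, "tesseract"), Array As String("frame.png", "text"))
    shl.WorkingDirectory = File.DirApp
    shl.Run(10000)    ' timeout in milliseconds (arbitrary choice)
End Sub
```

"CreateProcess error=2" is the Windows "file not found" code, so if the log line above appears, the path is the problem rather than the OCR itself.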
