OCR online free

laviniut

I tried to build a POST request, but I failed. Please help me!
My code is attached.
I think a simpler request syntax is needed.
The API documentation from the OCR server says:

1. Introduction to WeOCR API

WeOCR API has been designed so that it can provide various
applications with a very simple means for accessing online OCR
services.

WeOCR API uses HTTP GET and POST methods.
No SOAP nor special protocol is used.


2. Non-interactive use of WeOCR server

WeOCR servers can be used non-interactively as well as via an
HTML page. To do so, you just call the CGI program directly
using a web tool such as cURL.

Example:
$ curl -F userfile=@Image_File_Name \
-F outputencoding="utf-8" \
-F outputformat="txt" \
http://Server_Address/cgi-bin/weocr/submit.cgi >result.txt

You probably need to use --max-time and --connect-timeout
options as well to avoid undesirable blockings of the process.

If you specify outputformat="txt", the first line of the
output data is used as the status line. The first line will be
blank upon successful recognition.
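
The status-line rule above can be handled with a few lines of code. This is only a minimal sketch in Python, assuming the response has already been downloaded and decoded to a string; the function name `split_weocr_txt` and the sample error text are made up for illustration and do not come from the WeOCR documentation.

```python
def split_weocr_txt(response_text):
    """Split a WeOCR "txt" response into (status, recognized_text).

    The first line is the status line; an empty status means success.
    """
    status, _, body = response_text.partition("\n")
    # strip() also removes a trailing "\r" if the server sends CRLF line ends
    return status.strip(), body


# Successful recognition: the first line is blank.
status, text = split_weocr_txt("\nHello\nWorld\n")
print(repr(status), repr(text))

# Failure: the first line carries a message (hypothetical wording).
status, text = split_weocr_txt("error: unsupported image\n")
print(repr(status), repr(text))
```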


3. Locating the server

To find a WeOCR server suitable for your application, visit
http://weocr.ocrgrid.org/

(There may be some other WeOCR search engine sites.)

Every WeOCR web site has a server spec file "srvspec.xml"
in the site top directory. The CGI program can be located by
looking at the following entry in the spec file.

<ocrserver specversion="1.x">
<svinfo>
<cgi> ... </cgi>
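
Reading the `<cgi>` entry out of a spec file could look roughly like the sketch below. It uses Python's standard `xml.etree.ElementTree` and assumes the spec uses no XML namespace; the helper name `find_cgi_url` and the sample spec text are illustrative only (the sample reuses the placeholder address from the cURL example).

```python
import xml.etree.ElementTree as ET


def find_cgi_url(spec_xml):
    """Return the text of the first non-empty <cgi> element, or None."""
    root = ET.fromstring(spec_xml)
    # iter() walks the root element and all of its descendants
    for node in root.iter("cgi"):
        if node.text and node.text.strip():
            return node.text.strip()
    return None


# Illustrative spec text, not a real server's srvspec.xml.
SAMPLE_SPEC = """<ocrserver specversion="1.x">
  <svinfo>
    <cgi>http://Server_Address/cgi-bin/weocr/submit.cgi</cgi>
  </svinfo>
</ocrserver>"""

print(find_cgi_url(SAMPLE_SPEC))
```

The returned value is the address the image would be posted to.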


4. Parameters

WeOCR servers accept the following parameters.
The default value is used if the parameter is not specified.

outputencoding: Output Encoding {utf-8, ...} (default: utf-8)
specifies the encoding of the output data.
For example, "utf-8" and "iso-8859-15" specify UTF-8 and
Latin9 (ISO-8859-15), respectively.
At least UTF-8 must be supported by the server.


contentlang: Contents Language {eng, deu, ...} (default: auto)
specifies the language used in the document image in case
the server supports multiple languages. The special value
"auto" allows the server to assume the language supported
by it or to detect automatically the language(s).
The language code is based on ISO 639-3, although some
derivatives may be accepted by some servers.


outputformat: Output Format {html,txt} (default: txt)
specifies the format of the output data.

If the output format is "txt", the first line of the text
data is used as the status line. A blank line represents
"no error".
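
For anyone who wants to send these parameters without cURL, here is a rough Python sketch of the same multipart POST using only the standard library. The field names (`userfile`, `outputencoding`, `outputformat`, `contentlang`) are the ones from the text above; the function names, the `application/octet-stream` content type and the default timeout are my own assumptions, and this has not been checked against a live WeOCR server.

```python
import os
import uuid
import urllib.request


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Build a multipart/form-data body; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    # The file part: header, raw bytes, then the closing boundary.
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_image(cgi_url, image_path, timeout=30, **params):
    """POST an image to a WeOCR CGI URL and return the decoded response."""
    fields = {"outputencoding": "utf-8", "outputformat": "txt"}
    fields.update(params)  # e.g. contentlang="eng"
    with open(image_path, "rb") as f:
        data = f.read()
    body, content_type = build_multipart(
        fields, "userfile", os.path.basename(image_path), data)
    request = urllib.request.Request(
        cgi_url, data=body,
        headers={"Content-Type": content_type}, method="POST")
    # The timeout plays the role of cURL's --max-time option.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode(fields["outputencoding"])
```

A call would then look something like `submit_image(cgi_url, "page.png", contentlang="eng")`, where `cgi_url` is the `<cgi>` value from the server's spec file and `page.png` is a placeholder file name.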

And the OCR server specification says:
<svinfo>
<!--
Please update the revision every time you change
the configuration of the server.
The revision change will tell the automatic spec
collecter to update the database entry for your server.
format: <revision>YYMMDD.Sr</revision>
YY: last 2 digit of the year
MM: month in 2 digit (01-12)
DD: day in 2 digit (01-31)
Sr: arbitrary serial number in 2 digit (01-99)
-->
<revision>130602.01</revision>

<svengine name="OCRExtract" version="0.10"/>
<title>OCR server with Tesseract 3.02.02</title>
<url>http://www.ocr-extract.com/</url>
<cgi>http://www.ocr-extract.com/api/rest/extract</cgi>
<specxml>http://www.ocr-extract.com/srvspec.xml</specxml>
<organization>cluster:Systems CSG GmbH</organization>
<department>SaaS and Appliances</department>
<address>Gletscherstr.13, 16341 Panketal</address>
<country>GERMANY</country>
<contact>webmaster[a t]clustersystems.de</contact>

<!-- single/hybrid/synergetic -->
<svtype>single</svtype>

<!-- regular/experimental -->
<svlevel>regular</svlevel>
</svinfo>

So, what URL do I need to use in the POST request?
And how can I get it to work?
 

Attachments

  • OCR.ZIP
    11.6 KB

Frank

Hi, did you get any further with this? I would be interested in an OCR solution as well.
Thanks, Frank
 

mjtaryan

Erel, how fast is the turnaround from the service? I want to create a "live" handheld OCR-to-speech app for visually impaired people. The app therefore needs to photograph the "page", send the image, receive the results, and convert them to speech quickly, in real time. Having our own library (or a free, thoroughly tested and debugged wrapper for Tesseract) would probably speed up the process, among many other benefits; hint, hint :)
 