Optical Character Recognition
Zia Optical Character Recognition (OCR) electronically detects textual characters in images or digital documents and converts them into machine-encoded text. Zia OCR can recognize text in the international and Indian languages listed in the tables below.
Note: Catalyst does not store any of the files you upload in its systems. The files you upload are used for one-time processing only. They are not used for ML model training purposes either. Catalyst components are fully compliant with all applicable data protection and privacy laws.
OCR
Description
This API detects textual characters in images and documents and returns the recognized text as a JSON response. The response also contains a confidence score, which indicates how reliable the detection is.
You must specify the path to the image or document file in the API request, as shown in the sample request. You can optionally specify the languages present in the text for quicker processing. OCR supports 9 international languages and 10 Indian languages, which are listed in the tables below. If you do not specify a language, it is detected automatically before the text is processed.
Request URL
https://api.catalyst.zoho.com/baas/v1/project/{project_id}/ml/ocr
project_id - The unique ID of the project
Request Headers
Authorization: Zoho-oauthtoken 1000.910***************************16.2f****************************57
content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW
Request Method
POST
Scope
scope=ZohoCatalyst.mlkit.READ
Form-Data Properties
| Parameter Name | Data Type | Mandatory | Description |
|---|---|---|---|
| image | File | Yes | The input file to be processed. You must provide the path to it in your local system. Allowed formats: .jpg, .jpeg, .png, .bmp, .tiff, .pdf. File size limit: 20 MB. |
| language | String | No | The language code of the text to be identified. Refer to the tables below for the language codes. |
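The properties above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the Catalyst SDK: the helper name `build_ocr_request` and its parameters are hypothetical, the multipart body is assembled by hand with the standard library, and the `application/octet-stream` part type and the 1024-based reading of "20 MB" are assumptions. It only constructs the request object and does not send it.

```python
import os
import uuid
import urllib.request

# Values taken from the Form-Data Properties table above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".pdf"}
MAX_FILE_SIZE = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes

BASE_URL = "https://api.catalyst.zoho.com/baas/v1/project/{project_id}/ml/ocr"


def build_ocr_request(project_id, oauth_token, filename, file_bytes, language=None):
    """Hypothetical helper: validate the input and build a POST request for OCR."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file format: {extension or filename}")
    if len(file_bytes) > MAX_FILE_SIZE:
        raise ValueError("File exceeds the 20 MB size limit")

    boundary = uuid.uuid4().hex
    parts = []
    if language:
        # Optional text field carrying the language code(s).
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="language"\r\n\r\n'
                f"{language}\r\n"
            ).encode("utf-8")
        )
    # Mandatory file field named "image".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="image"; filename="{os.path.basename(filename)}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        BASE_URL.format(project_id=project_id),
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Zoho-oauthtoken {oauth_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request, provided the OAuth token is valid and carries the required scope.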
International Languages Supported by OCR
| Language | Language Codes |
|---|---|
| Arabic | ara |
| Chinese | chi_sim |
| English | eng |
| French | fra |
| Italian | ita |
| Japanese | jpn |
| Portuguese | por |
| Romanian | ron |
| Spanish | spa |
Indian Languages Supported by OCR
| Language | Language Codes |
|---|---|
| Hindi | hin |
| Bengali | ben |
| Marathi | mar |
| Telugu | tel |
| Tamil | tam |
| Gujarati | guj |
| Urdu | urd |
| Kannada | kan |
| Malayalam | mal |
| Sanskrit | san |
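The sample request in this document passes two languages as a single comma-separated value (`language=eng,spa`). The Python sketch below collects the codes from the two tables and builds such a value. `LANGUAGE_CODES` and `language_param` are hypothetical names, and the comma-separated format is inferred from the sample request rather than stated as a rule.

```python
# Language codes copied from the two tables above.
LANGUAGE_CODES = {
    "Arabic": "ara", "Chinese": "chi_sim", "French": "fra", "Italian": "ita",
    "Japanese": "jpn", "Portuguese": "por", "Romanian": "ron", "Spanish": "spa",
    "English": "eng", "Hindi": "hin", "Bengali": "ben", "Marathi": "mar",
    "Telugu": "tel", "Tamil": "tam", "Gujarati": "guj", "Urdu": "urd",
    "Kannada": "kan", "Malayalam": "mal", "Sanskrit": "san",
}


def language_param(*languages):
    """Hypothetical helper: map language names to codes and join them with commas."""
    codes = []
    for name in languages:
        try:
            codes.append(LANGUAGE_CODES[name])
        except KeyError:
            raise ValueError(f"Language not supported by OCR: {name}") from None
    return ",".join(codes)


print(language_param("English", "Spanish"))  # eng,spa
```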
SDK documentation
Sample Request: OCR
curl -X POST \
https://api.catalyst.zoho.com/baas/v1/project/4000000006007/ml/ocr \
-H "Authorization: Zoho-oauthtoken 1000.910***************************16.2f****************************57" \
-H 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \
-F 'image=@/Desktop/HelplineCard.jpg' \
-F 'language=eng,spa'
Sample Response: OCR
{
"status":"success",
"data":{
"confidence":79.71514892578125,
"text":"Whenever you\nneed to talk,\nwe‘re open\n\n[—] text eseses\n[J] KidsHelpPhone.ca\n\n(@, call 1—800—663—6868 Kids Help Phone ©"
}
}
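A client will typically read the `text` and `confidence` fields from this response. The following Python sketch shows one way to do that. `parse_ocr_response` and its `min_confidence` threshold are hypothetical, the sample body is an abbreviated version of the response above, and treating any `status` other than `success` as a failure is an assumption, since the error format is not described here.

```python
import json


def parse_ocr_response(raw_json, min_confidence=0.0):
    """Hypothetical helper: return (text, confidence) from an OCR response body."""
    payload = json.loads(raw_json)
    # Assumption: any status other than "success" indicates a failed request.
    if payload.get("status") != "success":
        raise RuntimeError(f"OCR request failed: {payload}")
    data = payload["data"]
    confidence = float(data["confidence"])
    if confidence < min_confidence:
        raise ValueError(f"Confidence {confidence:.2f} is below {min_confidence}")
    return data["text"], confidence


# Abbreviated version of the sample response shown above.
sample = (
    '{"status": "success", "data": {"confidence": 79.71514892578125, '
    '"text": "Whenever you\\nneed to talk,\\nwe are open"}}'
)
text, confidence = parse_ocr_response(sample)
print(round(confidence, 2))  # 79.72
print(text.splitlines()[0])  # Whenever you
```

The recognized text keeps its line breaks as `\n` characters, so `splitlines()` is a convenient way to process it line by line.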