Full-text recognition

The /fulltext and /fulltext_by_lines methods return all the text found in a document. Unlike the /recognize method, they do not search for specific fields, do not use dictionaries or masks, and cannot send text for manual re-checking.
API specification
Below is the API specification for the two full-text recognition methods. For more details on how to compose a query, see Connecting and testing.
fulltext
POST
https://latest.handl.ai/fulltext
The tool requires access to the cloud version of Handl to work correctly. The text is returned word by word, and each word is accompanied by a confidence level.
Query Parameters
priority
integer
Task priority, takes "1" by default
async
boolean
true - request in asynchronous mode, see "Asynchronous mode" in the "Connecting and testing" section. false - request in the synchronous mode
doc2pdf
boolean
true - returns a PDF file with the recognition results embedded in the text layer. false - standard mode
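The query parameters above can be appended to the endpoint URL roughly as in the following minimal sketch. The helper name build_fulltext_url is hypothetical, and sending booleans as lowercase true/false is an assumption rather than something this specification states. Authentication and the upload of the file in the image field are left out; see Connecting and testing for those.

```python
from urllib.parse import urlencode

# Endpoint taken from this specification.
FULLTEXT_URL = "https://latest.handl.ai/fulltext"


def build_fulltext_url(priority=1, run_async=False, doc2pdf=False,
                       base_url=FULLTEXT_URL):
    """Return the endpoint URL with the documented query parameters appended.

    Hypothetical helper: the name and argument names are illustrative only.
    """
    params = {
        "priority": priority,
        # Assumption: boolean flags are serialized as "true" / "false".
        "async": "true" if run_async else "false",
        "doc2pdf": "true" if doc2pdf else "false",
    }
    return f"{base_url}?{urlencode(params)}"


print(build_fulltext_url(run_async=True))
# https://latest.handl.ai/fulltext?priority=1&async=true&doc2pdf=false
```

The same helper can target the line-based method by passing base_url="https://latest.handl.ai/fulltext_by_lines".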
Request Body
image
string
File to be recognized
{
  "detail": [], // technical information
  "items": [
    {
      "words": [
        {
          "text": "text", // a word from the file
          "confidence": 0.8697810769 // confidence that the word was recognized correctly
        },
        {
          "text": "example", // a word from the file
          "confidence": 0.8697810769 // confidence that the word was recognized correctly
        }
      ]
    }
  ],
  "task_id": null, // internal id of the task
  "code": null, // error code
  "message": null, // error message
  "errno": null, // error number
  "traceback": null, // error description
  "fake": null,
  "pages_count": null,
  "docs_count": null
}
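Once a response like the one above has been decoded from JSON, the words can be joined back into running text. The sketch below is illustrative only: the function name words_to_text and the min_confidence threshold are hypothetical, joining words with single spaces is an assumption, and the second confidence value in the sample is made up to show the filter.

```python
def words_to_text(response, min_confidence=0.0):
    """Join recognized words into one string, skipping low-confidence words.

    `response` is the decoded JSON body of a /fulltext reply.
    """
    words = []
    # "items" may be missing or null on failure, so fall back to an empty list.
    for item in response.get("items") or []:
        for word in item.get("words") or []:
            if word["confidence"] >= min_confidence:
                words.append(word["text"])
    return " ".join(words)


# Sample shaped like the response above; 0.42 is a made-up value.
sample = {
    "items": [
        {
            "words": [
                {"text": "text", "confidence": 0.8697810769},
                {"text": "example", "confidence": 0.42},
            ]
        }
    ]
}

print(words_to_text(sample))                      # text example
print(words_to_text(sample, min_confidence=0.5))  # text
```

Because these methods cannot send text for manual re-checking, a local confidence threshold like this is one way to decide which words deserve a second look.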
fulltext_by_lines
POST
https://latest.handl.ai/fulltext_by_lines
The tool can work in a closed internal IT system. The text is returned line by line, and each line is accompanied by a confidence level.
Query Parameters
priority
integer
Task priority, takes "1" by default
async
boolean
true - request in asynchronous mode, see "Asynchronous mode" in the "Connecting and testing" section. false - request in synchronous mode
language
boolean
true - returns a PDF file with the recognition results embedded in the text layer. false - standard mode
Request Body
image
string
File to be recognized
{
  "detail": [], // technical information
  "items": [
    {
      "words": [
        {
          "text": "text", // a line of text from the input file
          "confidence": 0.8697810769 // confidence of the recognized line
        },
        {
          "text": "example", // a line of text from the input file
          "confidence": 0.8697810769 // confidence of the recognized line
        }
      ]
    }
  ],
  "task_id": null, // internal id of the task
  "code": null, // error code
  "message": null, // error message
  "errno": null, // error number
  "traceback": null, // error description
  "fake": null, // not used in this method
  "pages_count": null, // not used in this method
  "docs_count": null // not used in this method
}
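A line-based response can be turned into plain text in a similar way. The following is a minimal sketch, not an official client: the name lines_to_text is hypothetical, and treating a non-null code or message as a failure is an assumption based on both fields being null in the example above.

```python
def lines_to_text(response):
    """Return the recognized lines joined with newlines.

    `response` is the decoded JSON body of a /fulltext_by_lines reply.
    """
    # Assumption: a non-null "code" or "message" means the task failed.
    if response.get("code") is not None or response.get("message") is not None:
        raise RuntimeError(
            f"recognition failed: code={response.get('code')!r}, "
            f"message={response.get('message')!r}"
        )
    lines = []
    for item in response.get("items") or []:
        # Each entry under "words" holds a whole line for this method.
        for line in item.get("words") or []:
            lines.append(line["text"])
    return "\n".join(lines)


sample = {
    "items": [
        {
            "words": [
                {"text": "text", "confidence": 0.8697810769},
                {"text": "example", "confidence": 0.8697810769},
            ]
        }
    ],
    "code": None,
    "message": None,
}

print(lines_to_text(sample))
```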