Full-text recognition

The /fulltext and /fulltext_by_lines methods return all the text found in a document. They differ from the /recognize method in several ways: they do not search for specific fields, do not use dictionaries or masks, and cannot send text for manual re-checking.

API specification

Below is the API specification for the two full-text recognition methods.

fulltext

POST https://latest.handl.ai/fulltext

This method requires access to the cloud version of Handl to work correctly. The text is returned word by word, and each word is accompanied by a confidence level.

Query Parameters

priority (integer): Task priority. Defaults to 1.
async (boolean): true sends the request in asynchronous mode (see "Asynchronous mode" in the "Connecting and Testing" section); false sends it in synchronous mode.
doc2pdf (boolean): true returns a PDF file with the recognition results embedded in the text layer; false uses the standard mode.
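
As an illustration of how these query parameters combine into a request URL, here is a minimal Python sketch using only the standard library. The helper name build_fulltext_url is hypothetical, the parameter is assumed to be spelled "priority" as in /fulltext_by_lines, and sending booleans as lowercase "true"/"false" is an assumption rather than documented behaviour.

```python
from urllib.parse import urlencode

# Host taken from the endpoint definition above.
BASE_URL = "https://latest.handl.ai"


def build_fulltext_url(priority=1, run_async=False, doc2pdf=False):
    """Return the /fulltext URL with its query parameters encoded.

    Hypothetical helper: it only builds the URL and sends nothing.
    """
    params = {
        "priority": priority,
        # Assumption: booleans are passed as lowercase "true"/"false".
        "async": str(run_async).lower(),
        "doc2pdf": str(doc2pdf).lower(),
    }
    return f"{BASE_URL}/fulltext?{urlencode(params)}"


print(build_fulltext_url(priority=2, run_async=True))
```

The keyword argument is named run_async because async is a reserved word in Python; the query string itself still uses the documented name.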

Request Body

image (string): File to be recognized.
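
The specification lists only the field name "image" for the request body. The sketch below assumes the file is uploaded as a multipart/form-data part with that name, which this page does not confirm, and it leaves out authentication entirely. The helper build_upload_request and the file name scan.png are hypothetical. The request is built but never sent.

```python
import urllib.request
import uuid


def build_upload_request(url, filename, file_bytes,
                         content_type="application/octet-stream"):
    """Build (but do not send) a POST request carrying one "image" part.

    Assumption: the API accepts multipart/form-data uploads.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + file_bytes + tail
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder bytes stand in for a real scanned document.
req = build_upload_request(
    "https://latest.handl.ai/fulltext", "scan.png", b"placeholder-bytes"
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; any credentials the service requires would have to be added as headers first.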

Example response:

{
  "detail": [], // technical information
  "items": [
    {
      "words": [
        {
          "text": "text", // a word from the document
          "confidence": 0.8697810769 // confidence that the word was recognized correctly
        },
        {
          "text": "example", // a word from the document
          "confidence": 0.8697810769 // confidence that the word was recognized correctly
        }
      ]
    }
  ],
  "task_id": null, // internal ID of the task
  "code": null, // error code
  "message": null, // error description
  "errno": null, // error number
  "traceback": null, // error traceback
  "fake": null,
  "pages_count": null,
  "docs_count": null
}

Example validation error response:

{
  "detail": [
    {
      "loc": [
        "path",
        "task_id"
      ],
      "msg": "value is not a valid uuid",
      "type": "type_error.uuid"
    }
  ]
}
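
A response in the shape shown for /fulltext can be turned into plain text by walking "items" and then "words". The following sketch does that and also collects words whose confidence falls below a threshold. The function name extract_text, the 0.5 threshold, and the sample values are illustrative assumptions; the documentation does not recommend a particular cut-off.

```python
import json

# Hypothetical sample in the documented shape (comments removed so it is valid JSON).
SAMPLE = json.loads("""
{
  "detail": [],
  "items": [
    {"words": [
      {"text": "text", "confidence": 0.87},
      {"text": "example", "confidence": 0.41}
    ]}
  ],
  "task_id": null
}
""")


def extract_text(response, min_confidence=0.5):
    """Join recognized words and list those below min_confidence."""
    words = []
    uncertain = []
    for item in response.get("items") or []:
        for word in item.get("words") or []:
            words.append(word["text"])
            if word["confidence"] < min_confidence:
                uncertain.append(word["text"])
    return " ".join(words), uncertain


text, uncertain = extract_text(SAMPLE)
print(text)
print(uncertain)
```

Since these methods cannot send text for manual re-checking, a list like this is one way a client could flag doubtful words for its own review.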

fulltext_by_lines

POST https://latest.handl.ai/fulltext_by_lines

This method can work in a closed internal IT system. The text is returned line by line, and each line is accompanied by a confidence level.

Query Parameters

priority (integer): Task priority. Defaults to 1.
async (boolean): true sends the request in asynchronous mode (see "Asynchronous mode" in the "Connecting and Testing" section); false sends it in synchronous mode.
language (boolean): true returns a PDF file with the recognition results embedded in the text layer; false uses the standard mode.

Request Body

image (string): File to be recognized.

Example response:

{
  "detail": [], // technical information
  "items": [
    {
      "words": [
        {
          "text": "text", // a string of text from the input file
          "confidence": 0.8697810769 // confidence of the recognized string
        },
        {
          "text": "example", // a string of text from the input file
          "confidence": 0.8697810769 // confidence of the recognized string
        }
      ]
    }
  ],
  "task_id": null, // internal ID of the task
  "code": null, // error code
  "message": null, // error message
  "errno": null, // error number
  "traceback": null, // error traceback
  "fake": null, // not used in this method
  "pages_count": null, // not used in this method
  "docs_count": null // not used in this method
}

Example validation error response:

{
  "detail": [
    {
      "loc": [
        "path",
        "task_id"
      ],
      "msg": "value is not a valid uuid",
      "type": "type_error.uuid"
    }
  ]
}
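
The two response shapes shown for /fulltext_by_lines can be told apart on the client side. The sketch below treats a non-empty "detail" list as a validation error and otherwise collects the recognized strings. That rule is an assumption drawn only from the two samples on this page, and summarize_response is a hypothetical helper name.

```python
def summarize_response(payload):
    """Return ("error", messages) or ("ok", lines) for a parsed response.

    Assumption: a non-empty "detail" list signals a validation error.
    """
    detail = payload.get("detail") or []
    if detail:
        messages = []
        for entry in detail:
            location = ".".join(str(part) for part in entry["loc"])
            messages.append(f"{location}: {entry['msg']}")
        return "error", messages
    lines = []
    for item in payload.get("items") or []:
        for line in item.get("words") or []:
            lines.append(line["text"])
    return "ok", lines


# Payloads mirroring the two samples above.
ok_payload = {
    "detail": [],
    "items": [{"words": [
        {"text": "text", "confidence": 0.8697810769},
        {"text": "example", "confidence": 0.8697810769},
    ]}],
}
error_payload = {
    "detail": [{
        "loc": ["path", "task_id"],
        "msg": "value is not a valid uuid",
        "type": "type_error.uuid",
    }]
}

print(summarize_response(ok_payload))
print(summarize_response(error_payload))
```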
