Manual Recognition
Human-in-the-loop (HITL) is an additional module for manual verification of recognition results that provides the highest data accuracy even on the most complex corner cases. The module is available in both the cloud and the local versions of the Handl service. The crowd platform has more than 1 million registered users, 37,000 of whom are active. This large pool of available users (validators) allows the HITL module to process requests online at any time of the day or night.
The platform does not receive personal data: the human validators receive a mixed set of fields from various documents.
The validators receive the "cut out field + extracted text" pair and evaluate the correctness of the result using the "Yes"/"No" buttons. Each field passes through several validators. The digitized text is considered correct only if the validators reach a consensus.
If at least one validator chooses "No", the cut-out field is sent to manual input. A validator then enters the text using widgets and dictionaries: for example, a date must be selected from a calendar, and a car model must match the make chosen in the previous field. The algorithm requests new answers for the field from different validators until a consensus is reached.
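The verification flow described above can be sketched as a simple loop. This is an illustrative sketch only: the function names, vote count, and retry limit are assumptions for the example, not the Handl API.

```python
def validate_field(field_image, ocr_text, get_vote, get_manual_input,
                   votes_needed=3, max_rounds=5):
    """Sketch of the HITL consensus loop for a single field.

    get_vote(field_image, text)      -> "yes" or "no" from one validator
    get_manual_input(field_image)    -> text typed in by a validator

    Both callbacks, the vote count, and the round limit are
    hypothetical parameters chosen for this example.
    """
    candidate = ocr_text
    for _ in range(max_rounds):
        # Each field passes through several validators.
        votes = [get_vote(field_image, candidate) for _ in range(votes_needed)]
        if all(vote == "yes" for vote in votes):
            return candidate  # consensus reached: text confirmed
        # At least one "No": the field goes to manual input,
        # then the new answer is voted on again.
        candidate = get_manual_input(field_image)
    return None  # no consensus after max_rounds
```

In practice the callbacks would dispatch tasks to the crowd platform; here they are stand-ins so the control flow is visible.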
Sometimes the HITL validators cannot reach a consensus. This result can be caused by:
defects on the documents, such as glare or creases;
poor quality of the incoming image;
illegible handwritten text.
To indicate such situations, HITL adjusts the confidence score:
1.00 - "absolutely sure".
0.80-0.99 - "quite sure".
0.70-0.79 - "there might be an error in the answer".
0.69 and lower - there are obvious problems in the field; the original extracted text from the OCR step is returned.
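The confidence bands above can be expressed as a small lookup. The function name and return shape are illustrative assumptions; only the thresholds and the OCR fallback behavior come from the documentation.

```python
def interpret_confidence(confidence, hitl_text, ocr_text):
    """Map a HITL confidence score to the returned text and its label.

    Below 0.70 there are obvious problems in the field, so the
    original OCR text is returned instead of the HITL answer.
    """
    if confidence >= 1.00:
        return hitl_text, "absolutely sure"
    if confidence >= 0.80:
        return hitl_text, "quite sure"
    if confidence >= 0.70:
        return hitl_text, "there might be an error in the answer"
    # 0.69 and lower: fall back to the raw OCR extraction
    return ocr_text, "obvious problems, OCR text returned"
```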