Understanding the key-value pair extraction confidence score

PSPDFKit Processor has been deprecated and replaced by Document Engine. To migrate to Document Engine and unlock advanced document processing capabilities, refer to our migration guide. Learn more about these enhancements on our blog.

Nutrient’s key-value pair (KVP) extraction engine calculates a confidence score that indicates how likely the engine considers the extracted data to be accurate.

The confidence score is calculated by considering the following factors, among others:

The confidence in the optical character recognition (OCR) result at the character level. Some characters are more difficult to recognize than others.
The confidence in the OCR result at the word level. Some words are more difficult to recognize than others.
The data type of the key. Some data types are more difficult to recognize than others. For example, dates and IBANs are relatively easy to recognize, while phone numbers and addresses are generally more difficult.

The confidence score enables you to filter results based on their assumed accuracy. For example, you can disregard data extraction results with a low confidence score or flag them as data items that require manual checks.
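The filtering described above can be sketched as follows. This is a minimal illustration, not the engine's actual output format: the `ExtractedPair` shape, the `triage` helper, the 0.0 to 1.0 score range, the two threshold values, and the sample data are all assumptions made for the example. Check the real response schema and score range before adapting it.

```python
from dataclasses import dataclass


@dataclass
class ExtractedPair:
    # Hypothetical shape of one extraction result; not the engine's real schema.
    key: str
    value: str
    confidence: float  # assumed to be normalized to the range 0.0-1.0


def triage(pairs, accept_at=0.9, review_at=0.5):
    """Split results into accepted, needs-review, and discarded buckets.

    Both thresholds are illustrative defaults, not recommended values.
    """
    accepted, review, discarded = [], [], []
    for pair in pairs:
        if pair.confidence >= accept_at:
            accepted.append(pair)   # trusted without a manual check
        elif pair.confidence >= review_at:
            review.append(pair)     # flagged for a manual check
        else:
            discarded.append(pair)  # disregarded as too unreliable
    return accepted, review, discarded


# Made-up sample results, for illustration only.
results = [
    ExtractedPair("IBAN", "DE89 3704 0044 0532 0130 00", 0.97),
    ExtractedPair("Phone", "+1 555 0100", 0.62),
    ExtractedPair("Address", "1 Main St", 0.31),
]

accepted, review, discarded = triage(results)
```

In practice, suitable thresholds depend on how costly a wrong value is compared with a manual check, so they are best tuned against a sample of your own documents.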

