OCR

You can perform optical character recognition (OCR) on any document with Nutrient Web SDK, an advanced OCR SDK that includes intelligent character recognition (ICR) technology.

Information

OCR is available when using Web SDK with Document Engine. For more information, refer to the operational mode guide.

To perform OCR, open the document from Document Engine and apply the performOcr document operation with Instance.applyOperations:

await instance.applyOperations([
  { type: "performOcr", language: "english", pageIndexes: "all" }
]);

This detects the English text on every page of the document and makes it available for searching and manual text selection.
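As a minimal sketch, the call above can be wrapped in a small helper that reports a failure instead of leaving a rejected promise unhandled. The helper name `runOcr` and the shape of its return value are hypothetical and not part of the SDK. Only `instance.applyOperations` and the `performOcr` operation come from this guide, and `instance` is assumed to be an already loaded Web SDK instance backed by Document Engine.

```javascript
// Hypothetical helper (not part of the SDK): runs OCR on every page
// and returns a result object instead of throwing.
// Assumes `instance` is a loaded Web SDK instance backed by Document Engine.
async function runOcr(instance, language = "english") {
  const operation = { type: "performOcr", language, pageIndexes: "all" };
  try {
    await instance.applyOperations([operation]);
    return { ok: true, operation };
  } catch (error) {
    // Pass the original error through so the caller can log or display it.
    return { ok: false, operation, error };
  }
}
```

A caller could then check `result.ok` before enabling search or text selection in its own UI.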

Other languages

If your document is written in a language other than English, you can recognize its text by changing the language parameter. For example, to perform OCR in Spanish, run:

await instance.applyOperations([
  { type: "performOcr", language: "spanish", pageIndexes: "all" }
]);

Nutrient Web SDK can perform OCR in the following languages:

  • Croatian
  • Czech
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Indonesian
  • Italian
  • Malay
  • Norwegian
  • Polish
  • Portuguese
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Turkish
  • Welsh
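
The list above can be turned into a simple guard that rejects an unsupported language before any request is made. This is a sketch under one assumption: that every language is identified by its lowercase English name, as with "english" and "spanish" in the examples in this guide. The identifiers for the other languages are not confirmed here, and `OCR_LANGUAGES` and `buildOcrOperation` are hypothetical names rather than SDK exports.

```javascript
// Assumption: language identifiers are the lowercase English names,
// following the "english" and "spanish" examples in this guide.
const OCR_LANGUAGES = [
  "croatian", "czech", "danish", "dutch", "english", "finnish", "french",
  "german", "indonesian", "italian", "malay", "norwegian", "polish",
  "portuguese", "serbian", "slovak", "slovenian", "spanish", "swedish",
  "turkish", "welsh"
];

// Hypothetical helper: builds a performOcr operation for all pages,
// or throws if the language is not in the list above.
function buildOcrOperation(languageName) {
  const language = String(languageName).trim().toLowerCase();
  if (!OCR_LANGUAGES.includes(language)) {
    throw new RangeError(`Unsupported OCR language: ${languageName}`);
  }
  return { type: "performOcr", language, pageIndexes: "all" };
}

// Usage, given a loaded instance:
//   await instance.applyOperations([buildOcrOperation("German")]);
```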