OCR
You can perform OCR (optical character recognition) on any document with Nutrient Web SDK, an advanced OCR SDK that includes intelligent character recognition (ICR) technology.
OCR is available when using the Web SDK with Document Engine in server-backed operational mode.
To do so, open the document from Document Engine and apply the performOcr document operation with Instance.applyOperations:
await instance.applyOperations([ { type: "performOcr", language: "english", pageIndexes: "all" } ]);
This will detect all English text in the document and make it available for searching and manual text selection.
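The call above can be wrapped in a small helper so that failures are reported instead of surfacing as unhandled promise rejections. This is a minimal sketch, not part of the SDK: buildOcrOperation and runOcr are hypothetical names, and instance is assumed to be an already loaded viewer instance backed by Document Engine.

```javascript
// Build the performOcr operation object. The shape mirrors the call shown
// above; the "english" default is an assumption made for this sketch.
function buildOcrOperation(language = "english") {
  return { type: "performOcr", language, pageIndexes: "all" };
}

// Hypothetical helper: applies OCR to every page and reports whether the
// operation succeeded. `instance` is assumed to be a loaded viewer instance.
async function runOcr(instance, language) {
  try {
    await instance.applyOperations([buildOcrOperation(language)]);
    return true;
  } catch (error) {
    console.error(`OCR failed for language "${language}":`, error);
    return false;
  }
}
```

Calling await runOcr(instance, "english") resolves to true when the operation is applied and to false when applyOperations rejects.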
Other languages
If your document is written in a language other than English, you can extract its text by modifying the language parameter. For example, to perform OCR in Spanish, run:
await instance.applyOperations([ { type: "performOcr", language: "spanish", pageIndexes: "all" } ]);
Nutrient Web SDK can perform OCR in the following languages:
- Croatian
- Czech
- Danish
- Dutch
- English
- Finnish
- French
- German
- Indonesian
- Italian
- Malay
- Norwegian
- Polish
- Portuguese
- Serbian
- Slovak
- Slovenian
- Spanish
- Swedish
- Turkish
- Welsh
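Before passing user input to the language parameter, it can help to check it against the list above. The following sketch assumes that each language is identified by its lowercase English name, as with "english" and "spanish" in the examples; the identifiers for the other languages are an assumption and should be confirmed against the API reference. OCR_LANGUAGES and toOcrLanguage are hypothetical names, not SDK exports.

```javascript
// Lowercase names taken from the list of supported languages above.
// Assumption: the SDK accepts each language as its lowercase English name.
const OCR_LANGUAGES = [
  "croatian", "czech", "danish", "dutch", "english", "finnish", "french",
  "german", "indonesian", "italian", "malay", "norwegian", "polish",
  "portuguese", "serbian", "slovak", "slovenian", "spanish", "swedish",
  "turkish", "welsh",
];

// Hypothetical helper: normalizes a language name and rejects unknown ones.
function toOcrLanguage(name) {
  const id = String(name).trim().toLowerCase();
  if (!OCR_LANGUAGES.includes(id)) {
    throw new RangeError(`Unsupported OCR language: ${name}`);
  }
  return id;
}
```

The returned value can then be used as the language field of the performOcr operation.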