Supported languages: 100+ OCR language dictionaries
Nutrient .NET SDK (formerly GdPicture.NET) includes the following language dictionaries for recognizing text with optical character recognition (OCR):
Language | Code |
---|---|
Arabic | ara |
German | deu |
English | eng |
French | fra |
Hebrew | heb |
Italian | ita |
Dutch, Flemish | nld |
Portuguese | por |
Spanish, Castilian | spa |
Vietnamese | vie |
To recognize languages not listed above, follow the steps below:
- Download the language files provided by the Tesseract team, which cover more than 120 languages. To use older language data files that do not rely on the long short-term memory (LSTM) engine, download a previous release provided by the Tesseract team.
- Add the language files to the folder where your OCR dictionaries are already installed. The default language resources are located in `GdPicture.NET 14\Redist\OCR`.
- Determine language names based on the language codes and the Tesseract documentation.
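
The copy step above can be scripted. The following is a minimal sketch in Python that uses only file operations and does not call the Nutrient .NET SDK. It assumes that Tesseract language files are named `<code>.traineddata`, so the language code can be read from the file name. The function names `install_language_file` and `installed_language_codes` are hypothetical helpers introduced for illustration.

```python
import shutil
from pathlib import Path

# Assumption: Tesseract language data files are named "<code>.traineddata",
# for example "pol.traineddata" for Polish.
TRAINEDDATA_SUFFIX = ".traineddata"


def install_language_file(downloaded_file, ocr_folder):
    """Copy one downloaded language file into the OCR resource folder.

    Returns the language code taken from the file name.
    (Hypothetical helper, not part of any SDK.)
    """
    source = Path(downloaded_file)
    target_dir = Path(ocr_folder)

    # Reject anything that is not a language data file.
    if source.suffix != TRAINEDDATA_SUFFIX:
        raise ValueError(f"not a Tesseract language file: {source.name}")
    # The OCR folder must already exist; it is where the bundled
    # dictionaries are installed.
    if not target_dir.is_dir():
        raise FileNotFoundError(f"OCR folder not found: {target_dir}")

    shutil.copy2(source, target_dir / source.name)
    return source.stem


def installed_language_codes(ocr_folder):
    """Return the sorted language codes of all language files in the folder."""
    pattern = "*" + TRAINEDDATA_SUFFIX
    return sorted(path.stem for path in Path(ocr_folder).glob(pattern))
```

For example, calling `install_language_file` with a downloaded `pol.traineddata` and the path to your `GdPicture.NET 14\Redist\OCR` folder copies the file and returns the code `pol`, which you can then match against the Tesseract documentation to confirm the language name.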