---
title: "Supported languages"
canonical_url: "https://www.nutrient.io/guides/dws-data-extraction/supported-languages/"
md_url: "https://www.nutrient.io/guides/dws-data-extraction/supported-languages.md"
last_updated: "2026-05-26T22:37:31.557Z"
description: "Complete list of OCR languages supported by the Nutrient Data Extraction API, including language codes and full name aliases."
---

# Supported languages

The Data Extraction API supports more than 100 languages for OCR. Specify languages using the `options.language` parameter in your request instructions.

Languages with an entry in the Full name alias column accept either the alias or the language code. All other languages require the language code. Codes are based on ISO 639-2, with script and variant suffixes for some languages (e.g. `chi_sim`, `deu_frak`).
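As a minimal sketch of where the parameter sits, the following Python snippet builds an instructions fragment that sets `options.language`. Only the `options.language` path comes from this guide. The helper name `build_instructions` is hypothetical, and a real request probably needs further fields, so treat the API overview as the reference for the full request shape.

```python
import json


def build_instructions(language: str) -> dict:
    """Return an instructions fragment that sets options.language.

    Hypothetical helper: only the options.language path is taken from this
    guide; any other fields a real request needs are not shown here.
    """
    if not language:
        raise ValueError("language must be a non-empty string")
    return {"options": {"language": language}}


if __name__ == "__main__":
    # "deu" is the code listed for German in the table below.
    print(json.dumps(build_instructions("deu"), indent=2))
```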

| Language                         | Code           | Full name alias |
| -------------------------------- | -------------- | --------------- |
| Afrikaans                        | `afr`          |                 |
| Albanian                         | `sqi`          |                 |
| Amharic                          | `amh`          |                 |
| Arabic                           | `ara`          |                 |
| Armenian                         | `hye`          |                 |
| Assamese                         | `asm`          |                 |
| Azerbaijani                      | `aze`          |                 |
| Azerbaijani — Cyrillic           | `aze_cyrl`     |                 |
| Basque                           | `eus`          |                 |
| Belarusian                       | `bel`          |                 |
| Bengali                          | `ben`          |                 |
| Bosnian                          | `bos`          |                 |
| Breton                           | `bre`          |                 |
| Bulgarian                        | `bul`          |                 |
| Burmese                          | `mya`          |                 |
| Catalan                          | `cat`          |                 |
| Cebuano                          | `ceb`          |                 |
| Central Khmer                    | `khm`          |                 |
| Cherokee                         | `chr`          |                 |
| Chinese — Simplified             | `chi_sim`      |                 |
| Chinese — Simplified (vertical)  | `chi_sim_vert` |                 |
| Chinese — Traditional            | `chi_tra`      |                 |
| Chinese — Traditional (vertical) | `chi_tra_vert` |                 |
| Corsican                         | `cos`          |                 |
| Croatian                         | `hrv`          | `croatian`      |
| Czech                            | `ces`          | `czech`         |
| Danish                           | `dan`          | `danish`        |
| Danish — Fraktur                 | `dan_frak`     |                 |
| Dhivehi                          | `div`          |                 |
| Dutch                            | `nld`          | `dutch`         |
| Dzongkha                         | `dzo`          |                 |
| English                          | `eng`          | `english`       |
| English, Middle (1100–1500)      | `enm`          |                 |
| Esperanto                        | `epo`          |                 |
| Estonian                         | `est`          |                 |
| Faroese                          | `fao`          |                 |
| Filipino                         | `fil`          |                 |
| Finnish                          | `fin`          | `finnish`       |
| French                           | `fra`          | `french`        |
| French, Middle (ca. 1400–1600)   | `frm`          |                 |
| Galician                         | `glg`          |                 |
| Georgian                         | `kat`          |                 |
| Georgian — Old                   | `kat_old`      |                 |
| German                           | `deu`          | `german`        |
| German — Fraktur                 | `deu_frak`     |                 |
| German Fraktur                   | `frk`          |                 |
| Greek, Ancient                   | `grc`          |                 |
| Greek, Modern                    | `ell`          |                 |
| Gujarati                         | `guj`          |                 |
| Haitian Creole                   | `hat`          |                 |
| Hebrew                           | `heb`          |                 |
| Hindi                            | `hin`          |                 |
| Hungarian                        | `hun`          |                 |
| Icelandic                        | `isl`          |                 |
| Indonesian                       | `ind`          | `indonesian`    |
| Inuktitut                        | `iku`          |                 |
| Irish                            | `gle`          |                 |
| Italian                          | `ita`          | `italian`       |
| Italian — Old                    | `ita_old`      |                 |
| Japanese                         | `jpn`          |                 |
| Japanese (vertical)              | `jpn_vert`     |                 |
| Javanese                         | `jav`          |                 |
| Kannada                          | `kan`          |                 |
| Kazakh                           | `kaz`          |                 |
| Kirghiz                          | `kir`          |                 |
| Korean                           | `kor`          |                 |
| Korean (vertical)                | `kor_vert`     |                 |
| Kurdish                          | `kur`          |                 |
| Kurmanji                         | `kmr`          |                 |
| Lao                              | `lao`          |                 |
| Latin                            | `lat`          |                 |
| Latvian                          | `lav`          |                 |
| Lithuanian                       | `lit`          |                 |
| Luxembourgish                    | `ltz`          |                 |
| Macedonian                       | `mkd`          |                 |
| Malay                            | `msa`          | `malay`         |
| Malayalam                        | `mal`          |                 |
| Maltese                          | `mlt`          |                 |
| Maori                            | `mri`          |                 |
| Marathi                          | `mar`          |                 |
| Math / equation detection        | `equ`          |                 |
| Mongolian                        | `mon`          |                 |
| Nepali                           | `nep`          |                 |
| Norwegian                        | `nor`          | `norwegian`     |
| Occitan                          | `oci`          |                 |
| Oriya                            | `ori`          |                 |
| Panjabi                          | `pan`          |                 |
| Pashto                           | `pus`          |                 |
| Persian                          | `fas`          |                 |
| Polish                           | `pol`          | `polish`        |
| Portuguese                       | `por`          | `portuguese`    |
| Quechua                          | `que`          |                 |
| Romanian                         | `ron`          |                 |
| Russian                          | `rus`          |                 |
| Sanskrit                         | `san`          |                 |
| Scottish Gaelic                  | `gla`          |                 |
| Serbian                          | `srp`          | `serbian`       |
| Serbian — Latin                  | `srp_latn`     |                 |
| Sindhi                           | `snd`          |                 |
| Sinhala                          | `sin`          |                 |
| Slovak                           | `slk`          | `slovak`        |
| Slovak — Fraktur                 | `slk_frak`     |                 |
| Slovenian                        | `slv`          | `slovenian`     |
| Spanish                          | `spa`          | `spanish`       |
| Spanish — Old                    | `spa_old`      |                 |
| Sundanese                        | `sun`          |                 |
| Swahili                          | `swa`          |                 |
| Swedish                          | `swe`          | `swedish`       |
| Syriac                           | `syr`          |                 |
| Tagalog                          | `tgl`          |                 |
| Tajik                            | `tgk`          |                 |
| Tamil                            | `tam`          |                 |
| Tatar                            | `tat`          |                 |
| Telugu                           | `tel`          |                 |
| Thai                             | `tha`          |                 |
| Tibetan                          | `bod`          |                 |
| Tigrinya                         | `tir`          |                 |
| Tonga                            | `ton`          |                 |
| Turkish                          | `tur`          | `turkish`       |
| Uighur                           | `uig`          |                 |
| Ukrainian                        | `ukr`          |                 |
| Urdu                             | `urd`          |                 |
| Uzbek                            | `uzb`          |                 |
| Uzbek — Cyrillic                 | `uzb_cyrl`     |                 |
| Vietnamese                       | `vie`          |                 |
| Welsh                            | `cym`          |                 |
| Western Frisian                  | `fry`          |                 |
| Yiddish                          | `yid`          |                 |
| Yoruba                           | `yor`          |                 |

## Language format

You can specify languages in three ways:

| Format                | Example                   | Description                      |
| --------------------- | ------------------------- | -------------------------------- |
| Full name (lowercase) | `"english"`, `"german"`   | Languages with a full name alias |
| Language code         | `"eng"`, `"deu"`          | All languages                    |
| Code with variant     | `"chi_sim"`, `"deu_frak"` | Script or historical variants    |

The API normalizes full language names to language codes internally.
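If you want to store or log a single canonical form on the client side, the alias-to-code mapping can be mirrored locally. The sketch below is an illustration only: the API's internal normalization is not documented here, `normalize_language` and `FULL_NAME_ALIASES` are hypothetical names, and the mapping is copied from the alias column of the language table on this page.

```python
# Alias-to-code pairs copied from the "Full name alias" column on this page.
FULL_NAME_ALIASES = {
    "croatian": "hrv", "czech": "ces", "danish": "dan", "dutch": "nld",
    "english": "eng", "finnish": "fin", "french": "fra", "german": "deu",
    "indonesian": "ind", "italian": "ita", "malay": "msa",
    "norwegian": "nor", "polish": "pol", "portuguese": "por",
    "serbian": "srp", "slovak": "slk", "slovenian": "slv",
    "spanish": "spa", "swedish": "swe", "turkish": "tur",
}


def normalize_language(value: str) -> str:
    """Map a lowercase full name alias to its code.

    Hypothetical client-side helper. Values that are not known aliases
    (codes such as "eng" or variants such as "chi_sim") pass through
    unchanged, so validation is left to the API.
    """
    return FULL_NAME_ALIASES.get(value, value)


if __name__ == "__main__":
    for name in ("german", "deu", "chi_sim"):
        print(name, "->", normalize_language(name))
```

Passing unknown values through unchanged, rather than rejecting them, keeps the API as the source of truth for which codes are valid.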

---

## Related pages

- [API overview](/guides/dws-data-extraction/api-overview.md)
- [Supported file types](/guides/dws-data-extraction/file-types.md)
- [Error handling](/guides/dws-data-extraction/errors.md)
- [Get started](/guides/dws-data-extraction/getting-started.md)
- [DWS Data Extraction API](/guides/dws-data-extraction.md)
- [Support](/guides/dws-data-extraction/support.md)
- [Security](/guides/dws-data-extraction/security.md)
- [Privacy](/guides/dws-data-extraction/privacy.md)

