Supported languages

The Data Extraction API supports more than 100 languages for OCR. Specify languages using the options.language parameter in your request instructions.

Languages that list a full name alias accept either the alias or the code. All other languages require the language code. Codes are based on ISO 639-2, with script and variant suffixes for some languages (e.g. chi_sim, deu_frak).
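As an illustration of the parameter described above, the following Python sketch builds a minimal instructions object that sets options.language. The helper name build_instructions is hypothetical, and placing options at the top level of the instructions is an assumption inferred from the parameter path; the full request schema is not shown on this page, so check the API reference before relying on this shape.

```python
import json


def build_instructions(language: str) -> dict:
    """Return a minimal instructions object that sets options.language.

    Assumption: "options" sits at the top level of the instructions.
    This page only names the parameter path, not the full schema.
    """
    return {"options": {"language": language}}


# English lists a full name alias, so either form should identify it.
by_alias = build_instructions("english")
by_code = build_instructions("eng")

# Simplified Chinese has no alias, so the code is required.
simplified_chinese = build_instructions("chi_sim")

print(json.dumps(simplified_chinese))
```

The sketch only constructs the JSON payload. How the instructions are attached to the HTTP request is outside the scope of this page.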

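| Language | Code | Full name alias |
| --- | --- | --- |
| Afrikaans | afr | |
| Albanian | sqi | |
| Amharic | amh | |
| Arabic | ara | |
| Armenian | hye | |
| Assamese | asm | |
| Azerbaijani | aze | |
| Azerbaijani — Cyrillic | aze_cyrl | |
| Basque | eus | |
| Belarusian | bel | |
| Bengali | ben | |
| Bosnian | bos | |
| Breton | bre | |
| Bulgarian | bul | |
| Burmese | mya | |
| Catalan | cat | |
| Cebuano | ceb | |
| Central Khmer | khm | |
| Cherokee | chr | |
| Chinese — Simplified | chi_sim | |
| Chinese — Simplified (vertical) | chi_sim_vert | |
| Chinese — Traditional | chi_tra | |
| Chinese — Traditional (vertical) | chi_tra_vert | |
| Corsican | cos | |
| Croatian | hrv | croatian |
| Czech | ces | czech |
| Danish | dan | danish |
| Danish — Fraktur | dan_frak | |
| Dhivehi | div | |
| Dutch | nld | dutch |
| Dzongkha | dzo | |
| English | eng | english |
| English, Middle (1100–1500) | enm | |
| Esperanto | epo | |
| Estonian | est | |
| Faroese | fao | |
| Filipino | fil | |
| Finnish | fin | finnish |
| French | fra | french |
| French, Middle (ca. 1400–1600) | frm | |
| Galician | glg | |
| Georgian | kat | |
| Georgian — Old | kat_old | |
| German | deu | german |
| German — Fraktur | deu_frak | |
| German Fraktur | frk | |
| Greek, Ancient | grc | |
| Greek, Modern | ell | |
| Gujarati | guj | |
| Haitian Creole | hat | |
| Hebrew | heb | |
| Hindi | hin | |
| Hungarian | hun | |
| Icelandic | isl | |
| Indonesian | ind | indonesian |
| Inuktitut | iku | |
| Irish | gle | |
| Italian | ita | italian |
| Italian — Old | ita_old | |
| Japanese | jpn | |
| Japanese (vertical) | jpn_vert | |
| Javanese | jav | |
| Kannada | kan | |
| Kazakh | kaz | |
| Kirghiz | kir | |
| Korean | kor | |
| Korean (vertical) | kor_vert | |
| Kurdish | kur | |
| Kurmanji | kmr | |
| Lao | lao | |
| Latin | lat | |
| Latvian | lav | |
| Lithuanian | lit | |
| Luxembourgish | ltz | |
| Macedonian | mkd | |
| Malay | msa | malay |
| Malayalam | mal | |
| Maltese | mlt | |
| Maori | mri | |
| Marathi | mar | |
| Math / equation detection | equ | |
| Mongolian | mon | |
| Nepali | nep | |
| Norwegian | nor | norwegian |
| Occitan | oci | |
| Oriya | ori | |
| Panjabi | pan | |
| Persian | fas | |
| Polish | pol | polish |
| Portuguese | por | portuguese |
| Pashto | pus | |
| Quechua | que | |
| Romanian | ron | |
| Russian | rus | |
| Sanskrit | san | |
| Scottish Gaelic | gla | |
| Serbian | srp | serbian |
| Serbian — Latin | srp_latn | |
| Sindhi | snd | |
| Sinhala | sin | |
| Slovak | slk | slovak |
| Slovak — Fraktur | slk_frak | |
| Slovenian | slv | slovenian |
| Spanish | spa | spanish |
| Spanish — Old | spa_old | |
| Sundanese | sun | |
| Swahili | swa | |
| Swedish | swe | swedish |
| Syriac | syr | |
| Tagalog | tgl | |
| Tajik | tgk | |
| Tamil | tam | |
| Tatar | tat | |
| Telugu | tel | |
| Thai | tha | |
| Tibetan | bod | |
| Tigrinya | tir | |
| Tonga | ton | |
| Turkish | tur | turkish |
| Uighur | uig | |
| Ukrainian | ukr | |
| Urdu | urd | |
| Uzbek | uzb | |
| Uzbek — Cyrillic | uzb_cyrl | |
| Vietnamese | vie | |
| Welsh | cym | |
| Western Frisian | fry | |
| Yiddish | yid | |
| Yoruba | yor | |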

Language format

You can specify languages in three ways:

| Format | Example | Description |
| --- | --- | --- |
| Full name (lowercase) | "english", "german" | Common languages only |
| Language code | "eng", "deu" | All languages |
| Code with variant | "chi_sim", "deu_frak" | Script or historical variants |

The API normalizes full language names to language codes internally.
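The API's own normalization logic is not published here, but the idea can be sketched client-side in Python as a lookup from alias to code. The ALIASES mapping below is copied from the full name alias column of the table on this page. The function name normalize_language is hypothetical, and passing unknown values through unchanged is an assumption, not documented API behavior.

```python
# Full name aliases from the table on this page, mapped to their language codes.
ALIASES = {
    "croatian": "hrv", "czech": "ces", "danish": "dan", "dutch": "nld",
    "english": "eng", "finnish": "fin", "french": "fra", "german": "deu",
    "indonesian": "ind", "italian": "ita", "malay": "msa", "norwegian": "nor",
    "polish": "pol", "portuguese": "por", "serbian": "srp", "slovak": "slk",
    "slovenian": "slv", "spanish": "spa", "swedish": "swe", "turkish": "tur",
}


def normalize_language(value: str) -> str:
    """Map a full name alias to its code; return any other value unchanged.

    This does not validate codes: an unknown string is passed through as-is.
    """
    return ALIASES.get(value, value)


print(normalize_language("german"))    # alias resolved to its code
print(normalize_language("deu_frak"))  # codes and variants pass through
```

Normalizing before sending a request can be handy for logging or caching, since "german" and "deu" then resolve to the same key.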