---
title: "Extracting text from multilingual images | Nutrient Java SDK"
canonical_url: "https://www.nutrient.io/guides/java/extraction/read-text-from-image-multi-language/"
md_url: "https://www.nutrient.io/guides/java/extraction/read-text-from-image-multi-language.md"
last_updated: "2026-05-30T02:20:01.341Z"
description: "Extract text from multilingual images using OCR with Nutrient Java SDK."
---

# Extracting text from multilingual images

Multi-language text extraction lets you process images that contain text in more than one language. This capability is essential for international companies handling multilingual contracts, government agencies processing public documents in several languages, and educational institutions managing multilingual content.

Recognizing and extracting text in multiple languages in a single pass eliminates the need for separate per-language workflows. This enables efficient handling of documents that contain mixed-language text, such as:

- International business correspondence

- Multilingual product documentation

- Travel documents and passports

- Cross-border legal materials

From automated translation workflows to compliance systems processing multilingual regulatory documents, multi-language OCR enables businesses to handle diverse linguistic content with the same efficiency as single-language processing. This breaks down language barriers in document digitization and content management.

## Streamlining document workflows with our Java SDK

Developers can implement this feature by adding just a few lines of code to their applications. The SDK integrates multi-language Adaptive OCR text extraction directly, eliminating the requirement for external tools or complex setups. Whether you’re building a document processing pipeline or adding extraction functionality to a web application, our SDK provides a reliable and efficient solution right out of the box.

## Preparing the project

Specify a package name for the sample:

```java

package io.nutrient.Sample;

```

Import the required Nutrient Java SDK classes and declare the class. Explicit class imports are recommended, although wildcard imports also work:

```java

import io.nutrient.sdk.Document;
import io.nutrient.sdk.Vision;
import io.nutrient.sdk.enums.VisionEngine;
import io.nutrient.sdk.exceptions.NutrientException;

import java.io.FileWriter;
import java.io.IOException;

public class ReadTextFromImageMultiLanguage {

```

Create the `main` method and declare the exceptions it can throw. In a production environment, you may prefer to catch these in a try-catch block for custom error handling:

```java

    public static void main(String[] args) throws NutrientException, IOException {

```

With the Java environment ready, you can now focus on the SDK-specific implementation.

## Loading and configuring multi-language Adaptive OCR

Open the image file and configure the vision API with multi-language support. Setting the default languages tells the Adaptive OCR engine which language models to load for optimal recognition accuracy:

```java

        try (Document document = Document.open("input_ocr_multiple_languages.png")) {
            // Configure OCR engine for text extraction
            document.getSettings().getVisionSettings().setEngine(VisionEngine.AdaptiveOcr);

            // Configure multiple languages for recognition
            document.getSettings().getOcrSettings().setDefaultLanguages("eng+fra");

```

The `setDefaultLanguages` method accepts a string of ISO language codes separated by plus signs (for example, `"eng+fra"` for English and French). Each added language loads specialized recognition models that include:

- Character sets specific to the language.

- Linguistic patterns and dictionaries.

- Contextual analysis rules for improved word accuracy.
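
The plus-separated format described above is easy to build programmatically when the language set comes from configuration or user input. The following is a minimal sketch of a helper that does this. `OcrLanguages` and `join` are hypothetical names that are not part of the Nutrient SDK, and the three-letter check is an assumption based on the `eng` and `fra` examples, so relax it if your SDK version accepts other code forms:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of the Nutrient SDK) that builds a language string. */
final class OcrLanguages {
    private OcrLanguages() {}

    /**
     * Joins language codes with '+', for example ["eng", "fra"] becomes "eng+fra".
     * Codes are trimmed, lowercased, and de-duplicated in first-seen order.
     */
    static String join(List<String> codes) {
        if (codes == null || codes.isEmpty()) {
            throw new IllegalArgumentException("At least one language code is required");
        }
        Set<String> unique = new LinkedHashSet<>();
        for (String code : codes) {
            if (code == null) {
                throw new IllegalArgumentException("Language codes must not be null");
            }
            String normalized = code.trim().toLowerCase(Locale.ROOT);
            // Assumption: codes are three ASCII letters, like "eng" and "fra".
            if (!normalized.matches("[a-z]{3}")) {
                throw new IllegalArgumentException("Expected a three-letter code, got: " + code);
            }
            unique.add(normalized);
        }
        return String.join("+", unique);
    }
}
```

The result can then be passed to the call shown earlier, for example `setDefaultLanguages(OcrLanguages.join(Arrays.asList("eng", "fra")))`.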

## Executing multi-language text extraction

Create a vision instance and extract the text content. The vision API applies language-specific recognition algorithms and extracts text while maintaining high accuracy:

```java

            Vision vision = Vision.set(document);
            String contentJson = vision.extractContent();

```

The Adaptive OCR engine automatically handles language transitions within the document. It maintains accuracy when text switches between languages, preserving the natural flow and organization of multilingual content.

## Saving extracted results

Write the extracted content to a JSON file for use in downstream applications:

```java

            try (FileWriter writer = new FileWriter("output.json")) {
                writer.write(contentJson);
            }
        }
    }
}

```

This creates a JSON file containing all recognized text from the image in all configured languages. The structured output is ready for:

- Translation workflows

- Content management systems

- Automated data processing
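
Because multilingual output often contains non-ASCII characters, it can help to make the file encoding explicit. `new FileWriter(String)` uses the platform default charset, which is only guaranteed to be UTF-8 from Java 18 onward. The sketch below shows one alternative using `java.nio.file.Files`. The `JsonOutput` class and its method names are hypothetical helpers, not SDK APIs:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of the Nutrient SDK) for explicit UTF-8 file output. */
final class JsonOutput {
    private JsonOutput() {}

    /** Writes the JSON string as UTF-8 bytes, replacing any existing file. */
    static void writeUtf8(Path target, String json) throws IOException {
        Files.write(target, json.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the file back as a UTF-8 string, for example to verify the output. */
    static String readUtf8(Path source) throws IOException {
        return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
    }
}
```

In the sample above, `JsonOutput.writeUtf8(Paths.get("output.json"), contentJson)` would take the place of the `FileWriter` block.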

## Understanding the output

The `extractContent` method returns a JSON structure that provides comprehensive metadata for the processed document. This structure includes:

- **Text content** — The full string of extracted text from the document, preserving multi-language characters.

- **Bounding boxes** — The precise (x, y) coordinates and dimensions (width/height) of text regions on the page.

- **Word-level data** — Detailed information for individual words, including their specific coordinates and confidence scores.
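
Confidence scores are useful for routing uncertain words to manual review. This guide does not specify the JSON field names or the confidence scale, so the sketch below works on an illustrative in-memory type rather than the SDK's actual output. `WordFilter`, `Word`, and the 0.0 to 1.0 scale are all assumptions; map them to the real structure after parsing the JSON with a library of your choice:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical post-processing helper; not part of the Nutrient SDK. */
final class WordFilter {
    private WordFilter() {}

    /** Illustrative stand-in for one recognized word; not the SDK's actual schema. */
    static final class Word {
        final String text;
        final double confidence; // assumed to be in the range 0.0 to 1.0

        Word(String text, double confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    /** Returns the words whose confidence is below the threshold, in original order. */
    static List<Word> needsReview(List<Word> words, double threshold) {
        List<Word> flagged = new ArrayList<>();
        for (Word word : words) {
            if (word.confidence < threshold) {
                flagged.add(word);
            }
        }
        return flagged;
    }
}
```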

## Error handling

Nutrient Java SDK reports errors through exceptions. The SDK methods presented in this guide throw a `NutrientException` if a failure occurs, and the `FileWriter` calls can throw an `IOException`. Catching these exceptions helps with troubleshooting and lets you implement error handling logic for your Adaptive OCR workflows.
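
One way to structure that handling is to wrap the extraction in a small runner that converts exceptions into a result object. The following is only a sketch: `ExtractionRunner` and `Result` are hypothetical names, and a `Callable<String>` stands in for the SDK calls so the example stays self-contained. In real code, the lambda would contain the steps from this guide, and the generic `Exception` branch would catch `NutrientException` instead:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical wrapper (not part of the Nutrient SDK) around an extraction step. */
final class ExtractionRunner {
    private ExtractionRunner() {}

    /** Outcome of one attempt: either extracted content or an error message. */
    static final class Result {
        final String content;
        final String error;

        private Result(String content, String error) {
            this.content = content;
            this.error = error;
        }

        boolean succeeded() {
            return error == null;
        }
    }

    /** Runs the extraction and maps any exception to a failed Result. */
    static Result run(Callable<String> extraction) {
        try {
            return new Result(extraction.call(), null);
        } catch (IOException e) {
            // File access problems, such as an unreadable input or output path.
            return new Result(null, "I/O failure: " + e.getMessage());
        } catch (Exception e) {
            // Stand-in branch: with the SDK on the classpath, catch NutrientException here.
            return new Result(null, "Extraction failure: " + e.getMessage());
        }
    }
}
```

A caller can then check `succeeded()` and log or retry based on the `error` text rather than letting the exception end the program.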

## Conclusion

That’s all it takes to extract text from a multi-language image! The extracted content preserves the linguistic diversity and organization of the original document while enabling further processing, translation, or analysis. You can also download [this ready-to-use sample package](https://www.nutrient.io/downloads/samples/java/read-text-from-image-multi-language.zip), which is fully configured to help you explore the Java SDK and its seamless multi-language text extraction capabilities.

---

## Related pages

- [Applying OCR to a PDF page](/guides/java/extraction/apply-ocr-to-pdf-page.md)
- [Applying OCR to a PDF document](/guides/java/extraction/apply-ocr-to-pdf.md)
- [Generating image descriptions using Claude](/guides/java/extraction/describe-image-with-claude.md)
- [Generating image descriptions using local AI](/guides/java/extraction/describe-image-with-local-ai.md)
- [Extracting data from images using vision language models](/guides/java/extraction/extract-data-from-image-vlm.md)
- [Extracting JSON data from a PDF document](/guides/java/extraction/json-data-extraction.md)
- [Speeding up first ICR operation by predownloading models](/guides/java/extraction/speed-up-first-icr-by-downloading-requirements.md)
- [Extracting text from images](/guides/java/extraction/read-text-from-image.md)
- [Extracting data from images using ICR](/guides/java/extraction/extract-data-from-image-icr.md)
- [Extracting data from images using OCR](/guides/java/extraction/extract-data-from-image-ocr.md)
- [Generating image descriptions using OpenAI](/guides/java/extraction/describe-image-with-openai.md)
- [Nutrient Java SDK extraction guides](/guides/java/extraction.md)

