Extracting text from multilingual images

Documents used in global operations often contain several languages within the same image. International companies handling multilingual contracts, government agencies processing diverse public documents, and educational institutions managing multicultural content all need to extract that text reliably.

Recognizing and extracting text in multiple languages in a single pass eliminates the need for a separate workflow per language. This enables the efficient handling of documents that contain mixed-language text, such as:

International business correspondence
Multilingual product documentation
Travel documents and passports
Cross-border legal materials

From automated translation workflows to compliance systems processing multilingual regulatory documents, multi-language OCR enables businesses to handle diverse linguistic content with the same efficiency as single-language processing. This breaks down language barriers in document digitization and content management.

Streamlining document workflows with our Java SDK

Developers can implement this feature by adding just a few lines of code to their applications. The SDK integrates multi-language Adaptive OCR text extraction directly, eliminating the requirement for external tools or complex setups. Whether you’re building a document processing pipeline or adding extraction functionality to a web application, our SDK provides a reliable and efficient solution right out of the box.

Preparing the project

Specify a package name and create a new class for the task:

package io.nutrient.Sample;

Import the Nutrient Java SDK classes. It’s recommended to import the specific classes you use, though wildcard imports are also supported:

import io.nutrient.sdk.Document;
import io.nutrient.sdk.Vision;
import io.nutrient.sdk.enums.VisionEngine;
import io.nutrient.sdk.exceptions.NutrientException;

import java.io.FileWriter;
import java.io.IOException;

public class ReadTextFromImageMultiLanguage {

Create the main method and declare the checked exceptions it can throw. In a production environment, you may choose to wrap the calls in a try-catch block for custom error handling:

    public static void main(String[] args) throws NutrientException, IOException {

With the Java environment ready, you can now focus on the SDK-specific implementation.

Loading and configuring multi-language Adaptive OCR

Open the image file and configure the vision API with multi-language support. Setting the default languages tells the Adaptive OCR engine which language models to load for optimal recognition accuracy:

        try (Document document = Document.open("input_ocr_multiple_languages.png")) {
            // Configure OCR engine for text extraction
            document.getSettings().getVisionSettings().setEngine(VisionEngine.AdaptiveOcr);

            // Configure multiple languages for recognition
            document.getSettings().getOcrSettings().setDefaultLanguages("eng+fra");

The setDefaultLanguages method accepts a string of three-letter ISO language codes separated by plus signs (for example, “eng+fra” for English and French). If you need to detect a document’s language before choosing OCR settings, refer to the detect document language guide. Each added language loads specialized recognition models that include:

Character sets specific to the language.
Linguistic patterns and dictionaries.
Contextual analysis rules for improved word accuracy.
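
When the languages come from configuration or user input, the plus-separated string can be assembled in code. The following is a minimal sketch and not part of the SDK: LanguageList and toOcrString are hypothetical names, and the three-letter check is an assumption based on the “eng+fra” example above rather than a documented rule.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Nutrient Java SDK).
final class LanguageList {
    private LanguageList() {}

    // Builds a string such as "eng+fra" from individual language codes.
    // Assumption: every code is three ASCII letters, as in the guide's example.
    static String toOcrString(List<String> codes) {
        // LinkedHashSet drops duplicates while keeping the caller's order.
        Set<String> unique = new LinkedHashSet<>();
        for (String code : codes) {
            String normalized = code.trim().toLowerCase(Locale.ROOT);
            if (!normalized.matches("[a-z]{3}")) {
                throw new IllegalArgumentException(
                        "Expected a three-letter language code: " + code);
            }
            unique.add(normalized);
        }
        if (unique.isEmpty()) {
            throw new IllegalArgumentException("At least one language code is required");
        }
        return String.join("+", unique);
    }
}
```

The result can then be passed to the method shown earlier, for example setDefaultLanguages(LanguageList.toOcrString(List.of("eng", "fra"))).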

Executing multi-language text extraction

Create a vision instance and extract the text content. The vision API applies the recognition models for the configured languages and returns the result as a JSON string:

            Vision vision = Vision.set(document);
            String contentJson = vision.extractContent();

The Adaptive OCR engine automatically handles language transitions within the document. It maintains accuracy when text switches between languages, preserving the natural flow and organization of multilingual content.

Saving extracted results

Write the extracted content to a JSON file for use in downstream applications:

            try (FileWriter writer = new FileWriter("output.json")) {
                writer.write(contentJson);
            }
        }
    }
}

This creates a JSON file containing all recognized text from the image in all configured languages. The structured output is ready for:

Translation workflows
Content management systems
Automated data processing
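
Multilingual output usually contains non-ASCII characters. The FileWriter(String) constructor encodes with the JVM’s default charset, which is UTF-8 by default only from Java 18 onward, so on older runtimes the bytes written can differ between platforms. The following is a minimal sketch of writing the JSON with an explicit UTF-8 encoding using only the standard library (Java 11 or later). The JsonOutput class and its methods are hypothetical names, and wrapping IOException in an unchecked exception is a choice made for this sketch only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Nutrient Java SDK).
final class JsonOutput {
    private JsonOutput() {}

    // Writes the extracted JSON as UTF-8, replacing any existing file.
    static Path writeUtf8(String contentJson, Path target) {
        try {
            return Files.writeString(target, contentJson, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back as UTF-8, for example to verify the round trip.
    static String readUtf8(Path source) {
        try {
            return Files.readString(source, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the guide’s main method, the FileWriter block could be replaced with JsonOutput.writeUtf8(contentJson, Path.of("output.json")).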

Understanding the output

The extractContent method returns a JSON string that describes the processed document. This structure includes:

Text content — The full string of extracted text from the document, preserving multi-language characters.
Bounding boxes — The precise (x, y) coordinates and dimensions (width/height) of text regions on the page.
Word-level data — Detailed information for individual words, including their specific coordinates and confidence scores.
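
Word-level confidence scores make it possible to flag uncertain words for manual review. This guide does not show the JSON field names, so the sketch below does not parse the output. It assumes the words have already been mapped, with a JSON library of your choice, into a hypothetical RecognizedWord type, and that confidence ranges from 0.0 to 1.0. Both are assumptions to check against the actual output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type; the real JSON field names may differ.
final class RecognizedWord {
    final String text;
    final double confidence; // assumed scale: 0.0 (lowest) to 1.0 (highest)

    RecognizedWord(String text, double confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

// Hypothetical helper (not part of Nutrient Java SDK).
final class ConfidenceReview {
    private ConfidenceReview() {}

    // Returns the words whose confidence is below the threshold, in input order.
    static List<RecognizedWord> needsReview(List<RecognizedWord> words, double threshold) {
        List<RecognizedWord> flagged = new ArrayList<>();
        for (RecognizedWord word : words) {
            if (word.confidence < threshold) {
                flagged.add(word);
            }
        }
        return flagged;
    }
}
```

A suitable threshold depends on the documents and languages involved, so it is worth tuning against a sample of your own images.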

Error handling

Nutrient Java SDK reports errors through exceptions. The SDK methods presented in this guide throw a NutrientException if a failure occurs, and writing the output file can throw an IOException. Catching the two types separately helps with troubleshooting and with implementing error handling logic for your Adaptive OCR workflows.
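
One way to structure custom error handling is to run the whole "open, extract, save" sequence as a single step and catch SDK failures separately from file errors. The sketch below is self-contained so that it compiles without the SDK: SdkFailure is a stand-in for NutrientException, and SafeExtraction, Step, and attempt are hypothetical names. In a real project you would catch NutrientException directly.

```java
import java.io.IOException;
import java.util.Optional;

// Hypothetical wrapper (not part of Nutrient Java SDK).
final class SafeExtraction {
    private SafeExtraction() {}

    // Stand-in for the SDK's NutrientException, used only in this sketch.
    static class SdkFailure extends Exception {
        SdkFailure(String message) {
            super(message);
        }
    }

    // One unit of work that produces the extracted JSON.
    @FunctionalInterface
    interface Step {
        String run() throws SdkFailure, IOException;
    }

    // Runs the step and returns its result, or an empty Optional on failure.
    static Optional<String> attempt(Step step) {
        try {
            return Optional.ofNullable(step.run());
        } catch (SdkFailure e) {
            // Recognition or document-level failure reported by the SDK.
            System.err.println("Text extraction failed: " + e.getMessage());
        } catch (IOException e) {
            // File could not be read or written.
            System.err.println("File access failed: " + e.getMessage());
        }
        return Optional.empty();
    }
}
```

Returning an Optional keeps the caller free to decide whether a failed image should stop a batch or just be skipped and logged.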

Conclusion

That’s all it takes to extract text from a multi-language image! The extracted content preserves the linguistic diversity and organization of the original document while enabling further processing, translation, or analysis. You can also download this ready-to-use sample package, which is fully configured to help you explore the Java SDK and its seamless multi-language text extraction capabilities.
