---
title: "Generating image descriptions using Claude | Nutrient Java SDK"
canonical_url: "https://www.nutrient.io/guides/java/extraction/describe-image-with-claude/"
md_url: "https://www.nutrient.io/guides/java/extraction/describe-image-with-claude.md"
last_updated: "2026-06-10T14:59:56.779Z"
description: "Generate accessible image descriptions using Claude AI with Nutrient Java SDK."
---

# Generating image descriptions using Claude

Use image description to generate alt text and visual summaries from images.

Common use cases include:

- Accessibility workflows for screen readers

- Digital asset cataloging

- Document enrichment for scanned reports

- E-learning content description

- Archive and metadata generation

This guide uses Claude as the vision language model (VLM) provider through the Nutrient Vision API.

[Download sample](https://www.nutrient.io/downloads/samples/java/describe-image-with-claude.zip)

## How Nutrient helps

Nutrient Java SDK handles provider configuration, request handling, and response parsing.

The SDK handles:

- VLM API authentication, endpoint configuration, and request formatting

- Image encoding and multimodal API request structures

- Model parameters such as temperature, max tokens, and provider-specific settings

- Vision service failures and API rate limits, which surface as `NutrientException`

## Complete implementation

This example generates an image description using Claude. The code is split across the sections below, starting with the package declaration:

```java

package io.nutrient.Sample;

```

Import the required classes and define the sample class:

```java

import io.nutrient.sdk.Document;
import io.nutrient.sdk.Vision;
import io.nutrient.sdk.enums.VlmProvider;
import io.nutrient.sdk.exceptions.NutrientException;
import io.nutrient.sdk.settings.ClaudeApiSettings;
import io.nutrient.sdk.settings.VisionSettings;

import java.io.FileWriter;
import java.io.IOException;

public class DescribeImageWithClaude {

```

## Configuring the Claude provider

Create the main method, open the image in try-with-resources, and configure Claude.

In this sample:

- `setProvider(VlmProvider.Claude)` selects Claude.

- `setApiKey("CLAUDE_API_KEY")` sets the Anthropic API key. `CLAUDE_API_KEY` is a placeholder; replace it with your own key.

- Input can be PNG, JPEG, GIF, BMP, or TIFF.

```java

    public static void main(String[] args) throws NutrientException, IOException {
        try (Document document = Document.open("input_photo.png")) {
            // Configure Claude as the VLM provider
            VisionSettings visionSettings = document.getSettings().getVisionSettings();
            visionSettings.setProvider(VlmProvider.Claude);

            // Set the Claude API key
            ClaudeApiSettings claudeSettings = document.getSettings().getClaudeApiSettings();
            claudeSettings.setApiKey("CLAUDE_API_KEY");

```

## Creating a vision instance and generating the description

Create a vision instance and call `describe()` to generate text.

In this sample:

- `Vision.set(document)` binds processing to the opened image.

- `vision.describe()` returns a description string.

- The SDK handles encoding, request construction, and response parsing.

```java

            Vision vision = Vision.set(document);
            String description = vision.describe();

```

## Saving the description

Write the description to a text file.

This sample uses try-with-resources for both document and file writer cleanup:

```java

            try (FileWriter writer = new FileWriter("output.txt")) {
                writer.write(description);
            }
        }
    }
}

```
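
The single-argument `FileWriter` constructor uses the platform default charset on Java versions before 18, so a description containing non-ASCII characters may be encoded differently from one machine to another. The following is a minimal sketch, assuming Java 11 or later, that writes the text as UTF-8 explicitly. `DescriptionWriter` and `save` are hypothetical names used for illustration and are not part of the Nutrient SDK:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of the Nutrient SDK.
final class DescriptionWriter {

    // Writes the description as UTF-8, creating the file or overwriting it if it exists.
    static Path save(String description, Path target) throws IOException {
        return Files.writeString(target, description, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("description", ".txt");
        save("A caf\u00e9 terrace at dusk.", target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

In the sample, a call such as `DescriptionWriter.save(description, Path.of("output.txt"))` could stand in for the `FileWriter` block.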

## Understanding the output

`describe()` returns natural language text for accessibility and content understanding.

Claude descriptions are typically:

- **Concise** — Focused on key subjects and details, often one to three sentences

- **Accessible** — Suitable for users who rely on screen readers

- **Accurate** — Based on visible content only

- **Contextual** — Include relevant relationships and scene context

Use this output for accessibility metadata, image search, and document workflows.
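
When the description is placed in an HTML `alt` attribute, it usually needs its whitespace normalized and markup characters escaped first. The sketch below shows one way to do that, assuming the description arrives as a plain `String` and Java 11 or later. `AltText` and `toAltAttribute` are hypothetical names, not SDK APIs:

```java
// Hypothetical helper; not part of the Nutrient SDK.
final class AltText {

    // Collapses whitespace runs (including newlines) into single spaces and escapes
    // characters that are unsafe inside a double-quoted HTML attribute value.
    static String toAltAttribute(String description) {
        String singleLine = description.strip().replaceAll("\\s+", " ");
        StringBuilder escaped = new StringBuilder(singleLine.length());
        for (int i = 0; i < singleLine.length(); i++) {
            char c = singleLine.charAt(i);
            switch (c) {
                case '&': escaped.append("&amp;"); break;
                case '<': escaped.append("&lt;"); break;
                case '>': escaped.append("&gt;"); break;
                case '"': escaped.append("&quot;"); break;
                default: escaped.append(c);
            }
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        String description = "A sign reading \"Open\" above\na <blue> door.";
        System.out.println("<img src=\"input_photo.png\" alt=\"" + toAltAttribute(description) + "\">");
    }
}
```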

## Claude API settings

The Claude provider uses these `ClaudeApiSettings` properties:

- `ApiEndpoint` — The Claude API endpoint (default: `https://api.anthropic.com/v1/`).

- `ApiKey` — Your Anthropic API key for authentication.

- `Model` — The model identifier to use (default: `claude-sonnet-4-5`).

- `Temperature` — Controls response creativity (0.0 = deterministic, 1.0 = creative).

- `MaxTokens` — Maximum tokens in the response (default: 16384).
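
Hardcoding the key, as the sample does with the `CLAUDE_API_KEY` placeholder, works for a demo but is risky in shared code. A minimal sketch of reading the key from the environment instead follows. The `ANTHROPIC_API_KEY` variable name and the `ApiKeys` helper are assumptions chosen for illustration; the SDK does not require either:

```java
import java.util.Map;

// Hypothetical helper; not part of the Nutrient SDK.
final class ApiKeys {

    // The variable name is an assumption for this sketch; use whatever your deployment defines.
    static final String ENV_VAR = "ANTHROPIC_API_KEY";

    // Returns the key with surrounding whitespace removed, or fails fast with a
    // clear message when the variable is absent or blank.
    static String fromEnvironment(Map<String, String> env) {
        String key = env.get(ENV_VAR);
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("Environment variable " + ENV_VAR + " is not set");
        }
        return key.strip();
    }

    public static void main(String[] args) {
        // In the sample, the returned value would be passed to claudeSettings.setApiKey(...).
        try {
            String key = fromEnvironment(System.getenv());
            System.out.println("Loaded an API key of length " + key.length());
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Taking the environment as a `Map` parameter rather than calling `System.getenv()` inside the method keeps the lookup easy to test.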

## Error handling

The sample can throw:

- `NutrientException` for vision and API issues

- `IOException` for file I/O operations

Common failure scenarios include:

- The input image can’t be read due to path, permission, or format issues

- The Claude API key is missing or invalid

- The Claude API is unavailable

- Rate limits are exceeded

- Network requests fail before reaching the API

- Image data is too large or corrupted

- File writing fails due to path, disk, or permission issues
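
Some of the input-side failures above can be caught before any API call is made. The sketch below checks that a path exists, is readable, and has an extension matching the formats this guide lists. `InputCheck` and `problemWith` are hypothetical names, and the extension test is only a heuristic; the SDK remains the authority on what it can open:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; not part of the Nutrient SDK.
final class InputCheck {

    // Extensions for the formats this guide lists: PNG, JPEG, GIF, BMP, and TIFF.
    private static final Set<String> EXTENSIONS =
            Set.of("png", "jpg", "jpeg", "gif", "bmp", "tif", "tiff");

    // Returns an empty Optional when the file looks usable, otherwise a short reason.
    static Optional<String> problemWith(Path image) {
        Path fileName = image.getFileName();
        String name = fileName == null ? "" : fileName.toString();
        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!EXTENSIONS.contains(extension)) {
            return Optional.of("Unsupported file extension: " + name);
        }
        if (!Files.isRegularFile(image)) {
            return Optional.of("File not found: " + image);
        }
        if (!Files.isReadable(image)) {
            return Optional.of("File is not readable: " + image);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path image = Path.of(args.length > 0 ? args[0] : "input_photo.png");
        System.out.println(problemWith(image).orElse("Input looks usable: " + image));
    }
}
```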

In production code:

- Catch `NutrientException` and `IOException`.

- Return clear error messages.

- Log failure details for debugging.

- Add retry logic for transient API failures.
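
The retry recommendation above can be sketched as a small wrapper with exponential backoff. `Retry` and `withBackoff` are hypothetical names, not SDK APIs. For simplicity the sketch retries on any exception, because this guide does not show how `NutrientException` distinguishes transient failures from permanent ones such as an invalid key; real code would ideally retry only the transient cases:

```java
import java.util.concurrent.Callable;

// Hypothetical helper; not part of the Nutrient SDK.
final class Retry {

    // Runs the task up to maxAttempts times, doubling the delay after each failure.
    // Rethrows the last exception when every attempt fails.
    static <T> T withBackoff(Callable<T> task, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated transient failure");
            }
            return "description text";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the sample, the call might look like `String description = Retry.withBackoff(() -> vision.describe(), 3, 500);`, with the attempt count and delay tuned to the rate limits you observe.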

## Conclusion

Use this workflow to generate image descriptions with Claude:

1. Open the image file using try-with-resources for automatic resource cleanup.

2. The SDK supports multiple image formats, including PNG, JPEG, GIF, BMP, and TIFF.

3. Retrieve the vision settings with `document.getSettings().getVisionSettings()` to configure the VLM provider.

4. Set the provider to Claude with `setProvider(VlmProvider.Claude)` instead of alternatives like OpenAI or local models.

5. Retrieve Claude-specific settings with `document.getSettings().getClaudeApiSettings()` for API configuration.

6. Set the Anthropic API key with `setApiKey()` using credentials obtained from the Anthropic Console.

7. Claude API settings control endpoint URLs, model selection (default: claude-sonnet-4-5), temperature, and max tokens.

8. Create a vision instance with `Vision.set()` bound to the document with configured provider settings.

9. Generate the description with `vision.describe()`, which sends the image to Claude’s vision endpoint and returns natural language text.

10. The SDK encodes image data, constructs multimodal API requests, and parses responses automatically.

11. Generated descriptions are typically concise (often one to three sentences), suitable for screen reader users, based on visible content only, and contextual.

12. Write the description to a file using try-with-resources with `FileWriter` for automatic resource cleanup.

13. Handle `NutrientException` for vision processing failures, including authentication errors, API failures, and rate limits.

14. Handle `IOException` for file operations, including read failures or write errors when saving output.

For related image workflows, refer to the [Java SDK guides](https://www.nutrient.io/guides/java.md).

Download [this ready-to-use sample package](https://www.nutrient.io/downloads/samples/java/describe-image-with-claude.zip) to explore Claude-based image description.

---

## Related pages

- [Applying OCR to a PDF page](/guides/java/extraction/apply-ocr-to-pdf-page.md)
- [Applying OCR to a PDF document](/guides/java/extraction/apply-ocr-to-pdf.md)
- [Generating image descriptions using local AI](/guides/java/extraction/describe-image-with-local-ai.md)
- [Generating image descriptions using OpenAI](/guides/java/extraction/describe-image-with-openai.md)
- [Extracting data from images using ICR](/guides/java/extraction/extract-data-from-image-icr.md)
- [Extracting data from images using OCR](/guides/java/extraction/extract-data-from-image-ocr.md)
- [Extracting data from images using vision language models](/guides/java/extraction/extract-data-from-image-vlm.md)
- [Nutrient Java SDK extraction guides](/guides/java/extraction.md)
- [Extracting structured data from documents](/guides/java/extraction/extract-structured-data.md)
- [Extracting form fields from images](/guides/java/extraction/extract-form-fields-from-image.md)
- [Labeling form fields with a vision language model](/guides/java/extraction/label-form-fields-with-vlm.md)
- [Extracting JSON data from a PDF document](/guides/java/extraction/json-data-extraction.md)
- [Extracting text from multilingual images](/guides/java/extraction/read-text-from-image-multi-language.md)
- [Extracting text from images](/guides/java/extraction/read-text-from-image.md)
- [Extracting text from PDF documents](/guides/java/extraction/pdf-to-text.md)
- [Speeding up first ICR operation by predownloading models](/guides/java/extraction/speed-up-first-icr-by-downloading-requirements.md)

