---
title: "Extracting form fields from images | Nutrient Java SDK"
canonical_url: "https://www.nutrient.io/guides/java/extraction/extract-form-fields-from-image/"
md_url: "https://www.nutrient.io/guides/java/extraction/extract-form-fields-from-image.md"
last_updated: "2026-06-09T21:11:56.021Z"
description: "Detect form fields in scanned forms and export them as structured JSON with Nutrient Java SDK."
---

# Extracting form fields from images

Scanned forms such as tax filings, healthcare intake sheets, lease agreements, and expense reports contain text boxes, checkboxes, and signature lines, but a flat image has no machine-readable structure. To process these forms downstream, you need to locate each fillable region on the page.

This sample shows how to detect form fields on every page of a document and export the result as structured JSON with Nutrient Java SDK. Detection runs locally and offline, so no external model is contacted. To also assign each field a human-readable semantic label, such as "First name" or "Date of birth", using a vision language model, refer to the [label form fields with a VLM](https://www.nutrient.io/guides/java/extraction/label-form-fields-with-vlm.md) guide.

If you want to turn an image-based form into a fillable PDF with AcroForm widgets, refer to the [detect and add form fields](https://www.nutrient.io/guides/java/editor/detect-and-add-form-fields.md) guide. That workflow uses the same detection model, but it writes fields back into the PDF instead of exporting data.

[Download sample](https://www.nutrient.io/downloads/samples/java/extract-form-fields-from-image.zip)

## How Nutrient helps

Nutrient Java SDK runs the full form-detection pipeline behind a single method call. It handles:

- Rendering each page of the document to a bitmap at the resolution the detection model expects

- Running the form-field detection model on every page and classifying each region as text, checkbox, or signature

- Recording each field's type, bounding box, and confidence

- Serializing the result to JSON

The result is structured data you can index, validate, or feed into a downstream workflow.

## Supported field types and limits

The model recognizes three field types: text, checkbox, and signature. Because detection runs on rendered page images, accuracy depends on image quality. Clean scans produce better results than heavily compressed or skewed pages. Form detection requires the vision form feature in your license.

## Prepare the project

Set a package name:

```java

package io.nutrient.Sample;

```

Import the required classes from the SDK and the standard library, then declare the main class:

```java

import io.nutrient.sdk.Document;
import io.nutrient.sdk.Vision;
import io.nutrient.sdk.exceptions.NutrientException;

import java.io.FileWriter;
import java.io.IOException;

public class ExtractFormFieldsFromImage {

```

## Load the document

Open the document with try-with-resources so the SDK closes resources after processing:

```java

    public static void main(String[] args) throws NutrientException, IOException {
        try (Document document = Document.open("input_forms_detection.pdf")) {

```

## Detect form fields

Create a vision instance from the document with `Vision.set(document)`, then call `detectForms()`. Detection runs locally and offline:

```java

            Vision vision = Vision.set(document);
            String formsJson = vision.detectForms();

```

Write the JSON result to a file for downstream processing:

```java

            try (FileWriter writer = new FileWriter("output.json")) {
                writer.write(formsJson);
            }
        }
    }
}

```
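
The sample above writes the file with `FileWriter`, which uses the platform default charset on Java versions before 18. As a minimal sketch, assuming Java 11 or later, `Files.writeString` writes the JSON with an explicit UTF-8 encoding instead. The `JsonOutput` class, its `save` helper, and the placeholder JSON string are hypothetical; the placeholder stands in for the `formsJson` value returned by `detectForms()`:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class JsonOutput {

    // Hypothetical helper: writes the JSON text as UTF-8, creating or replacing the target file.
    static Path save(String json, Path target) throws IOException {
        return Files.writeString(target, json, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder standing in for the string returned by detectForms().
        String formsJson = "{\"elements\": []}";
        Path written = save(formsJson, Path.of("output.json"));
        System.out.println("Wrote " + Files.size(written) + " bytes to " + written);
    }
}
```

Pinning the encoding keeps the output identical across machines, which matters if a downstream consumer expects UTF-8 JSON.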

## Understand the output

`detectForms()` returns structured JSON. The `elements` array holds one form element per page. Each form element includes its `pageNumber` and a `fields` list, so fields from a multi-page document stay grouped by the page they came from. Each field includes:

- **`fieldType`** — The detected type: `Text`, `Checkbox`, or `Signature`.

- **`bounds`** — The bounding box of the field on the page.

- **`confidence`** — The detection confidence for the field.

- **`id`** — A unique identifier for the field.
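
Once the JSON is parsed, a common next step is to keep only the fields you trust. The following is a minimal sketch, assuming Java 16 or later and assuming you've already mapped the JSON into plain records with a JSON library of your choice. The `DetectedField` and `PageFields` records, the `FieldFilter` class, the `0.5` threshold, the assumption that confidence is a value between 0 and 1, and the sample values are all hypothetical. `bounds` is left out because its exact shape isn't described here:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical records mirroring the documented keys (bounds omitted).
record DetectedField(String id, String fieldType, double confidence) {}

record PageFields(int pageNumber, List<DetectedField> fields) {}

final class FieldFilter {

    // Counts fields per type across all pages, skipping those below minConfidence.
    static Map<String, Long> countByType(List<PageFields> pages, double minConfidence) {
        return pages.stream()
                .flatMap(page -> page.fields().stream())
                .filter(field -> field.confidence() >= minConfidence)
                .collect(Collectors.groupingBy(
                        DetectedField::fieldType, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data; real values come from the parsed detectForms() output.
        List<PageFields> pages = List.of(
                new PageFields(1, List.of(
                        new DetectedField("f1", "Text", 0.93),
                        new DetectedField("f2", "Checkbox", 0.41))),
                new PageFields(2, List.of(
                        new DetectedField("f3", "Signature", 0.88),
                        new DetectedField("f4", "Text", 0.77))));

        // Prints {Signature=1, Text=2}: the low-confidence checkbox is skipped.
        System.out.println(countByType(pages, 0.5));
    }
}
```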

## Tune detection

If the model misses fields of a particular type or detects too many of them, adjust the logit-bias settings in [form recognition settings](https://www.nutrient.io/api/java-sdk/nutrient-java-sdk/io.nutrient.sdk.settings/form-recognition-settings/index.html): `textLogitBias`, `checkboxLogitBias`, and `signatureLogitBias`. A positive value makes the model more likely to report that field type, and a negative value makes it less likely. The default of `0` applies no bias. These settings change detection itself, so they apply to every detection run.
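
To build intuition for what a logit bias does, here's a minimal sketch, assuming the bias is added to a class's raw score (its logit) before that score is converted to a probability with a sigmoid. This is a general illustration of logit biasing, not a description of the SDK's internals. The `LogitBiasDemo` class and the sample logit are hypothetical:

```java
final class LogitBiasDemo {

    // Standard logistic function: maps a raw score to a value in (0, 1).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Score after adding a bias to the raw logit (assumed behavior, for illustration only).
    static double biasedScore(double logit, double bias) {
        return sigmoid(logit + bias);
    }

    public static void main(String[] args) {
        double logit = -0.4; // made-up raw score for a borderline candidate
        for (double bias : new double[] {-1.0, 0.0, 1.0}) {
            System.out.printf("bias %+.1f -> score %.3f%n", bias, biasedScore(logit, bias));
        }
    }
}
```

Under these assumptions, a positive bias raises the score of a borderline candidate and a negative bias lowers it, which is why small adjustments mostly affect detections near the decision boundary.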

## Handle errors

The Vision API throws `VisionException`, which derives from `NutrientException`, when detection fails.

Common failure scenarios include:

- The document can't be read due to path or permission issues

- The page produces no renderable image

- The form detection model is missing or inaccessible, or the feature isn't licensed
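
The first scenario can be caught before the SDK is involved. As a minimal sketch, the hypothetical `InputCheck.describeProblem` helper below uses only `java.nio.file` to report a missing or unreadable input path. It doesn't call the SDK, and it doesn't replace catching `NutrientException`:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

final class InputCheck {

    // Returns a human-readable problem description, or empty if the path looks usable.
    static Optional<String> describeProblem(Path input) {
        if (!Files.exists(input)) {
            return Optional.of("Input file not found: " + input.toAbsolutePath());
        }
        if (!Files.isRegularFile(input)) {
            return Optional.of("Input path is not a regular file: " + input.toAbsolutePath());
        }
        if (!Files.isReadable(input)) {
            return Optional.of("Input file is not readable: " + input.toAbsolutePath());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path input = Path.of("input_forms_detection.pdf");
        describeProblem(input).ifPresentOrElse(
                System.err::println,
                () -> System.out.println("Input looks readable: " + input));
    }
}
```

A check like this only narrows down the cause. The file can still change between the check and the call that opens it, so the SDK call needs its own error handling.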

In production code:

- Catch `NutrientException`.

- Return a clear error message.

- Log failure details for debugging.
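
The three points above can be sketched as a small wrapper. To keep the example runnable without the SDK, it accepts any `Callable<String>` and catches `Exception`; in real code, you'd pass a call such as `() -> vision.detectForms()` and catch `NutrientException` instead. The `SafeDetection` class, its `run` method, and the messages are hypothetical:

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SafeDetection {

    private static final Logger LOG = Logger.getLogger(SafeDetection.class.getName());

    // Runs a detection step. On failure, logs the details and returns empty.
    // Exception is caught only to keep this sketch SDK-free.
    static Optional<String> run(Callable<String> detection) {
        try {
            return Optional.ofNullable(detection.call());
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "Form detection failed", e);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real detection call, simulating a failure.
        Optional<String> result = run(() -> {
            throw new IllegalStateException("simulated failure");
        });
        System.out.println(result.isPresent()
                ? "Detection succeeded."
                : "Could not detect form fields. See the log for details.");
    }
}
```

Keeping the stack trace in the log and returning a short message to the caller separates what operators need for debugging from what users see.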

## Conclusion

The workflow for extracting form-field data from an image is:

1. Open the source document using try-with-resources for automatic resource cleanup.

2. Create a vision instance with `Vision.set()`.

3. Call `detectForms()` to detect every field and export the result as JSON.

4. Write the JSON to a file for indexing, validation, or downstream processing.

5. Handle `NutrientException` for robust error recovery.

Detection runs locally and offline. To assign semantic labels to each field with a vision model, refer to the [label form fields with a VLM](https://www.nutrient.io/guides/java/extraction/label-form-fields-with-vlm.md) guide. To produce a fillable PDF instead of data, refer to the [detect and add form fields](https://www.nutrient.io/guides/java/editor/detect-and-add-form-fields.md) guide.

For related image extraction workflows, refer to the [Java SDK](https://www.nutrient.io/guides/java.md) guides.

Download the [sample package](https://www.nutrient.io/downloads/samples/java/extract-form-fields-from-image.zip) to explore form-field extraction.

---

## Related pages

- [Applying OCR to a PDF document](/guides/java/extraction/apply-ocr-to-pdf.md)
- [Applying OCR to a PDF page](/guides/java/extraction/apply-ocr-to-pdf-page.md)
- [Generating image descriptions using Claude](/guides/java/extraction/describe-image-with-claude.md)
- [Generating image descriptions using local AI](/guides/java/extraction/describe-image-with-local-ai.md)
- [Extracting data from images using OCR](/guides/java/extraction/extract-data-from-image-ocr.md)
- [Generating image descriptions using OpenAI](/guides/java/extraction/describe-image-with-openai.md)
- [Extracting data from images using ICR](/guides/java/extraction/extract-data-from-image-icr.md)
- [Extracting JSON data from a PDF document](/guides/java/extraction/json-data-extraction.md)
- [Extracting data from images using vision language models](/guides/java/extraction/extract-data-from-image-vlm.md)
- [Extracting structured data from documents](/guides/java/extraction/extract-structured-data.md)
- [Extracting text from PDF documents](/guides/java/extraction/pdf-to-text.md)
- [Labeling form fields with a vision language model](/guides/java/extraction/label-form-fields-with-vlm.md)
- [Nutrient Java SDK extraction guides](/guides/java/extraction.md)
- [Extracting text from multilingual images](/guides/java/extraction/read-text-from-image-multi-language.md)
- [Extracting text from images](/guides/java/extraction/read-text-from-image.md)
- [Speeding up first ICR operation by predownloading models](/guides/java/extraction/speed-up-first-icr-by-downloading-requirements.md)

