Vision

Provides machine learning and computer vision capabilities for document processing. Enables AI-powered document description and content extraction.

from nutrient_sdk import Vision

Construction

Vision cannot be instantiated directly. Obtain instances through static factory methods or via other SDK classes.

Class Methods

set

@classmethod
def set(cls, document: Document) -> Vision

Creates a Vision instance for the specified document.

Parameters:

Name | Type | Description
document | Document | The document to analyze using vision capabilities.

Returns: Vision - A Vision instance ready to perform analysis on the document.


Methods

describe

def describe(self) -> str

Generates an AI-powered description of the document content.

Returns: str - A string containing the document description.
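
A minimal sketch of the usual call sequence, assuming a Document has already been opened elsewhere in the SDK (how to open one is not covered on this page). The helper name `describe_document` and its `vision_factory` parameter are hypothetical, introduced only so the sketch runs without the SDK installed; with the SDK you would pass `Vision.set` as the factory.

```python
from typing import Any, Callable


def describe_document(document: Any, vision_factory: Callable[[Any], Any]) -> str:
    """Create a Vision-like object for `document` and return its description.

    `vision_factory` is a hypothetical parameter for this sketch; with the SDK
    installed it would be `Vision.set`.
    """
    vision = vision_factory(document)  # e.g. Vision.set(document)
    description = vision.describe()    # documented to return a str
    return description.strip()         # trim surrounding whitespace only
```

With the SDK available, the call would look like `describe_document(document, Vision.set)`.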


detect_forms

def detect_forms(self) -> str

Detects form fields on the document and exports the result. The output format (JSON or IR Lite) is determined by the document’s settings. Each detected field carries its type and bounding box. To also assign AI semantic labels (e.g. “First name”), set FormLabelingSettings.EnableAiLabeling on the document’s settings before calling; no separate method is needed.

Returns: str - The exported content as a string (JSON or IR Lite JSON depending on settings).
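
Because detect_forms returns a string, callers will usually parse it before use. The sketch below assumes only that the returned string is valid JSON, and makes no assumption about the layout of the parsed data, which this page does not specify. `load_detected_forms` is a hypothetical helper name, not part of the SDK.

```python
import json
from typing import Any


def load_detected_forms(vision: Any) -> Any:
    """Run form detection and parse the returned JSON string.

    Assumes `vision` exposes detect_forms() -> str as documented above.
    Raises ValueError if the returned string is not valid JSON.
    """
    raw = vision.detect_forms()
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("detect_forms() did not return valid JSON") from exc
```

Wrapping the decode error keeps the failure message tied to the SDK call rather than to the JSON parser.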


detect_forms_to_file

def detect_forms_to_file(self, output_path: str) -> None

Detects form fields on the document and writes the exported result to a file. The output format (JSON or IR Lite) is determined by the document’s settings.

Parameters:

Name | Type | Description
output_path | str | Path to the output file.
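
This page does not say whether the SDK creates missing parent directories for output_path. The sketch below therefore creates them up front, which is harmless if the SDK already does so. `export_forms` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Any


def export_forms(vision: Any, output_path: str) -> Path:
    """Write detected form fields to `output_path` and return it as a Path.

    Assumes `vision` exposes detect_forms_to_file(output_path: str) -> None
    as documented above.
    """
    target = Path(output_path)
    # Create parent directories first; whether the SDK does this is unknown.
    target.parent.mkdir(parents=True, exist_ok=True)
    vision.detect_forms_to_file(str(target))
    return target
```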

extract_content

def extract_content(self) -> str
def extract_content(self, settings: DocumentLayoutJsonExportSettings) -> str

Extracts structured content from the document using machine vision processing. Both the pipeline used and the output format are determined by the document’s settings.

Parameters:

Name | Type | Description
settings (optional) | DocumentLayoutJsonExportSettings | Settings controlling what to include in the JSON output.

Returns: str - The exported content as a string (JSON, Markdown, or IR Lite JSON depending on settings).
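
The two signatures above differ only in the optional settings argument. A thin wrapper can pick the right form, as in this sketch. `extract_document_content` is a hypothetical name, and how a DocumentLayoutJsonExportSettings object is constructed is not covered on this page, so the sketch accepts one that already exists.

```python
from typing import Any, Optional


def extract_document_content(vision: Any, settings: Optional[Any] = None) -> str:
    """Call extract_content with or without export settings.

    Assumes `vision` exposes both documented forms:
    extract_content() and extract_content(settings).
    """
    if settings is None:
        # No settings supplied: use the zero-argument form.
        return vision.extract_content()
    # Settings supplied: pass them through unchanged.
    return vision.extract_content(settings)
```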


extract_content_to_file

def extract_content_to_file(self, output_path: str) -> None
def extract_content_to_file(self, output_path: str, settings: DocumentLayoutJsonExportSettings) -> None

Extracts structured content from the document and writes it to a file. Both the pipeline used and the output format are determined by the document’s settings.

Parameters:

Name | Type | Description
output_path | str | Path to the output file.
settings (optional) | DocumentLayoutJsonExportSettings | Settings controlling what to include in the JSON output.

extract_structured

def extract_structured(self, request: StructuredExtractionRequest) -> str

Extracts structured data from the document, shaped to the JSON Schema carried by the request’s {“schema”: …} envelope. The document is first read by the configured extraction pipeline, then an AI model fills the schema from the recognized content. Provider, model, endpoint, and confidence reporting are driven by AiProcessingSettings on the document’s settings.

Parameters:

Name | Type | Description
request | StructuredExtractionRequest | The extraction request carrying the schema envelope (required) and optional instructions.

Returns: str - A JSON string with two top-level nodes: extraction (the schema-shaped extracted fields) and metadata (per-field source locations and grounding labels).
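
A minimal sketch of the data on either side of the call: wrapping a JSON Schema in the {"schema": …} envelope, and splitting the returned string into its two documented top-level nodes. Both helper names are hypothetical. How the envelope is handed to StructuredExtractionRequest, and whether it expects a string or a dictionary, is not covered on this page, so that step is left out.

```python
import json
from typing import Any, Dict, Tuple


def build_schema_envelope(schema: Dict[str, Any]) -> str:
    """Wrap a JSON Schema in a {"schema": ...} envelope, serialized as JSON.

    Serializing to a string is an assumption of this sketch.
    """
    return json.dumps({"schema": schema})


def split_structured_result(raw: str) -> Tuple[Any, Any]:
    """Parse an extract_structured result and return (extraction, metadata)."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in ("extraction", "metadata") if key not in result]
    if missing:
        raise KeyError(f"result is missing top-level node(s): {missing}")
    return result["extraction"], result["metadata"]
```

Checking for both nodes up front gives a clearer error than a bare lookup failure if the result shape ever differs from what is documented here.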


extract_structured_to_file

def extract_structured_to_file(self, request: StructuredExtractionRequest, output_path: str) -> None

Extracts structured data from the document, shaped to the JSON Schema carried by the request’s {“schema”: …} envelope, and writes the JSON result to a file. See extract_structured for the result shape.

Parameters:

Name | Type | Description
request | StructuredExtractionRequest | The extraction request carrying the schema envelope (required) and optional instructions.
output_path | str | Path to the output file.

warmup

def warmup(self) -> None

Preloads (warms up) all resources needed for vision processing. This downloads all model files required by the document’s VisionSettings before execution. Call this to avoid download delays during extract_content().
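
One way to see the effect of warming up is to time the two phases separately, as in this sketch. `warm_then_extract` is a hypothetical helper name. It assumes only that the object exposes warmup() and extract_content() as documented above.

```python
import time
from typing import Any, Tuple


def warm_then_extract(vision: Any) -> Tuple[str, float, float]:
    """Preload resources, then extract content.

    Returns (content, warmup_seconds, extract_seconds).
    """
    start = time.perf_counter()
    vision.warmup()                     # model downloads happen here
    warmed = time.perf_counter()
    content = vision.extract_content()  # should not wait on downloads now
    done = time.perf_counter()
    return content, warmed - start, done - warmed
```

In a long-running service, calling warmup() once at startup moves the download cost out of the first request.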