Speeding up the first ICR operation by pre-downloading models

Use warmup to pre-download vision models before processing documents.

Common use cases include:

Removing first-request latency in user-facing apps
Preparing batch jobs before processing starts
Marking containers ready only after dependencies are available
Preloading models before offline operation
Meeting latency targets for production APIs

This guide shows how to warm up ICR models so extract_content() runs without initial download delays.

How Nutrient helps

Nutrient Python SDK handles model download orchestration and cache management.

The SDK handles:

Model downloads and cache storage details
Engine-specific model dependencies
Download retries and transient failure handling
Readiness checks for model availability

Complete implementation

This example warms up ICR models and then runs extraction:

from nutrient_sdk import Document, Vision, VisionEngine

Warming up Vision API

Open a document in a context manager, set the engine to VisionEngine.ICR, create a vision instance, and call warmup().

In this sample:

vision_settings.engine = VisionEngine.ICR selects ICR mode.
vision.warmup() downloads required models.
Models are cached for subsequent requests.
Print statements mark the start and the end of the download.

with Document.open("input.png") as document:
    # Configure ICR engine
    document.settings.vision_settings.engine = VisionEngine.ICR

    # Create Vision instance
    vision = Vision.set(document)

    # Pre-download all required models
    # This ensures subsequent extract_content() calls are fast
    print("Downloading Vision models...")
    vision.warmup()
    print("Models ready!")

Processing documents after warmup

After warmup, run extract_content() without download latency.

In this sample:

extract_content() returns a JSON string.
The JSON output is written to output.json.
File handling uses a nested context manager.

    # Now extract_content() won't need to download anything
    content_json = vision.extract_content()

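    # Write the JSON with an explicit encoding so output is consistent across platforms
    with open("output.json", "w", encoding="utf-8") as f: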
        f.write(content_json)
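To check that warmup actually removed the first-call delay, it can help to time the extraction call. The following is a minimal sketch, not part of the SDK: timed is a hypothetical helper that works with any zero-argument callable.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Call fn() and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Hypothetical usage inside the `with Document.open(...)` block, after warmup():
#     content_json, seconds = timed(vision.extract_content)
#     print(f"extract_content() took {seconds:.2f}s")
```

Comparing this figure with and without a preceding warmup() call gives a rough measure of how much download time moved out of the request path.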

Best practices

Apply these patterns for using warmup effectively in production environments:

Application startup — Run warmup before accepting requests.
Background thread — Run warmup asynchronously during initialization.
Health checks — Expose warmup status in readiness probes.
Deployment pipelines — Validate model availability during deployment.
Offline environments — Download models while connected, then process offline.
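The startup, background thread, and health check patterns above can be combined. The following is a minimal sketch under stated assumptions, not an SDK feature: WarmupState and start_background_warmup are hypothetical helpers, and warmup_fn stands in for a function that opens a document, sets the engine, and calls vision.warmup() as shown earlier.

```python
import threading
from typing import Callable, Optional


class WarmupState:
    """Tracks whether warmup has finished, for use in a readiness probe."""

    def __init__(self) -> None:
        self._done = threading.Event()
        self.error: Optional[BaseException] = None

    def mark_done(self, error: Optional[BaseException] = None) -> None:
        # Record the outcome first, then signal completion to any waiters.
        self.error = error
        self._done.set()

    def wait(self, timeout: Optional[float] = None) -> bool:
        """Block until warmup finishes; return False if the timeout expires."""
        return self._done.wait(timeout)

    @property
    def ready(self) -> bool:
        """True only when warmup finished without raising."""
        return self._done.is_set() and self.error is None


def start_background_warmup(warmup_fn: Callable[[], None]) -> WarmupState:
    """Run warmup_fn on a daemon thread and return a state object to poll."""
    state = WarmupState()

    def run() -> None:
        try:
            warmup_fn()  # assumption: wraps the vision.warmup() sequence
        except Exception as exc:
            state.mark_done(exc)
        else:
            state.mark_done()

    threading.Thread(target=run, name="vision-warmup", daemon=True).start()
    return state
```

A readiness endpoint could then return success only while state.ready is true, and a startup routine that must block could call state.wait() before accepting requests.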

What gets downloaded?

Warmup downloads model sets based on VisionSettings.engine:

ICR mode (VisionEngine.ICR) — Layout, text, tables, equations, and key-value detection models
OCR mode (VisionEngine.ADAPTIVE_OCR) — OCR language and text recognition resources
VLM-enhanced mode (VisionEngine.VLM_ENHANCED_ICR) — ICR resources plus VLM-related resources

Downloaded models are cached locally and reused across restarts until the cache is cleared or models are updated.

Conclusion

Use this workflow to pre-download ICR requirements:

Open a document using a context manager for automatic resource cleanup after warmup and processing complete.
The SDK supports multiple document formats, including PNG, JPEG, PDF, and TIFF for vision operations.
Access the vision settings with document.settings.vision_settings.engine to configure the vision engine.
Set the engine to ICR by assigning VisionEngine.ICR. This enables advanced document understanding with layout detection, text recognition, table extraction, equation recognition, and key-value pair detection.
Alternative engines include OCR mode for basic text extraction and VLM-enhanced mode for semantic understanding with vision language models.
Create a vision instance with Vision.set(document). The instance is bound to the document and uses its configured engine settings.
Call vision.warmup() to trigger pre-download of all AI models required for the configured vision engine, fetching models from the SDK’s model repository and caching them locally.
Warmup downloads different model sets based on engine configuration — ICR downloads comprehensive document understanding models, OCR downloads text recognition models, and VLM downloads ICR models plus semantic understanding resources.
Print statements signal when the model download starts and when it completes, which is useful because the download can take several seconds.
After warmup completes, call vision.extract_content() to perform ICR operations without model download delays, ensuring predictable and fast processing for all subsequent requests.
The extract_content() method returns extracted content as JSON, including document structure (headings, paragraphs, tables, lists), textual content, table structures, equations, and key-value pairs.
Write the extracted JSON to a file using a nested context manager with open() for automatic resource cleanup after writing completes.
Handle NutrientException for vision processing failures, including model download errors, processing failures, or configuration issues.
The context manager ensures proper resource cleanup when processing completes or exceptions occur.
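The error-handling step above can be sketched as follows. This is an illustration only: warm_up_and_extract and VisionLike are hypothetical names, the vision argument is any object that exposes warmup() and extract_content() as described in this guide, and error_type is a placeholder for the SDK's NutrientException, whose import path is not shown here.

```python
from typing import Optional, Protocol, Type


class VisionLike(Protocol):
    """Minimal interface assumed from this guide's vision object."""

    def warmup(self) -> None: ...

    def extract_content(self) -> str: ...


def warm_up_and_extract(
    vision: VisionLike,
    error_type: Type[Exception] = Exception,
) -> Optional[str]:
    """Warm up, then extract; return the JSON string, or None on failure.

    Pass the SDK's exception class as error_type (assumption: NutrientException)
    so that unrelated errors still propagate to the caller.
    """
    try:
        vision.warmup()
        return vision.extract_content()
    except error_type as exc:
        # Covers download errors, processing failures, and configuration issues.
        print(f"Vision processing failed: {exc}")
        return None
```

Calling this helper inside the document's context manager keeps cleanup automatic, whether the call returns content or None.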

For related image extraction workflows, refer to the Python SDK guides.

Download this ready-to-use sample package to integrate warmup into application startup.
