Speeding up the first ICR operation by pre-downloading models
Use warmup to pre-download vision models before processing documents.
Common use cases include:
- Removing first-request latency in user-facing apps
- Preparing batch jobs before processing starts
- Marking containers ready only after dependencies are available
- Preloading models before offline operation
- Meeting latency targets for production APIs
This guide shows how to warm up ICR models so extract_content() runs without initial download delays.
How Nutrient helps
Nutrient Python SDK handles model download orchestration and cache management.
The SDK handles:
- Model downloads and cache storage details
- Engine-specific model dependencies
- Download retries and transient failure handling
- Readiness checks for model availability
Complete implementation
This example warms up ICR models and then runs extraction:

    from nutrient_sdk import Document, Vision
    from nutrient_sdk.settings import VisionEngine

Warming up Vision API
Open a document in a context manager, set VisionEngine.Icr, create a vision instance, and call warmup().
In this sample:
- vision_settings.engine = VisionEngine.Icr selects ICR mode.
- vision.warmup() downloads required models.
- Models are cached for subsequent requests.
- Print statements show progress.

    with Document.open("input.png") as document:
        # Configure ICR engine
        document.settings.vision_settings.engine = VisionEngine.Icr

        # Create Vision instance
        vision = Vision.set(document)

        # Pre-download all required models
        # This ensures subsequent extract_content() calls are fast
        print("Downloading Vision models...")
        vision.warmup()
        print("Models ready!")

Processing documents after warmup
After warmup, run extract_content() without download latency.
In this sample:
- extract_content() returns a JSON string.
- The JSON output is written to output.json.
- File handling uses a nested context manager.

        # Still inside the with Document.open(...) block from the warmup step
        # Now extract_content() won't need to download anything
        content_json = vision.extract_content()

        with open("output.json", "w") as f:
            f.write(content_json)

Best practices
Apply these patterns for using warmup effectively in production environments:
- Application startup — Run warmup before accepting requests.
- Background thread — Run warmup asynchronously during initialization.
- Health checks — Expose warmup status in readiness probes.
- Deployment pipelines — Validate model availability during deployment.
- Offline environments — Download models while connected, then process offline.
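The background-thread and health-check patterns above can be combined in a small helper. The following is a minimal sketch, not part of the SDK: WarmupTracker and its methods are hypothetical names, and the helper takes any zero-argument callable, so the function you pass in would be responsible for opening a document and calling vision.warmup() as shown in the sample.

```python
import threading
from typing import Callable, Optional


class WarmupTracker:
    """Hypothetical helper: runs a warmup callable on a background thread
    and records whether it finished successfully."""

    def __init__(self, warmup_fn: Callable[[], None]) -> None:
        self._warmup_fn = warmup_fn
        self._done = threading.Event()
        self._error: Optional[BaseException] = None

    def start(self) -> None:
        """Start warmup without blocking the caller."""
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self) -> None:
        try:
            self._warmup_fn()
        except Exception as exc:
            # Keep the failure so a readiness probe can report it.
            self._error = exc
        finally:
            self._done.set()

    def wait(self, timeout: Optional[float] = None) -> bool:
        """Block until warmup finishes; returns False on timeout."""
        return self._done.wait(timeout)

    @property
    def ready(self) -> bool:
        """True only when warmup has finished without raising."""
        return self._done.is_set() and self._error is None

    @property
    def error(self) -> Optional[BaseException]:
        """The exception raised by warmup, if any."""
        return self._error
```

A readiness probe could then return a success status only while tracker.ready is True, and surface tracker.error in logs when warmup failed.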
What gets downloaded?
Warmup downloads model sets based on VisionSettings.engine:
- ICR mode (VisionEngine.Icr): Layout, text, tables, equations, and key-value detection models
- OCR mode (VisionEngine.Ocr): OCR language and text recognition resources
- VLM-enhanced mode (VisionEngine.VLM_ENHANCED_ICR): ICR resources plus VLM-related resources
Downloaded models are cached locally and reused across restarts until the cache is cleared or models are updated.
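One way to check that the cache is being reused is to time the warmup call on a cold start and again on a later run. The helper below is a generic sketch using only the standard library; timed_call is a hypothetical name, not an SDK function, and it makes no assumption about how long either run should take.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed_call(fn: Callable[[], T]) -> Tuple[T, float]:
    """Call fn() and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Example usage with the vision instance from the sample (requires the SDK):
# _, seconds = timed_call(vision.warmup)
# print(f"warmup took {seconds:.2f}s")
```

If the models are already cached, the second measurement should be noticeably shorter than the first, since no download is needed.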
Conclusion
Use this workflow to pre-download ICR requirements:
- Open a document using a context manager for automatic resource cleanup after warmup and processing complete.
- The SDK supports multiple document formats, including PNG, JPEG, PDF, and TIFF for vision operations.
- Access the vision settings with document.settings.vision_settings.engine to configure the vision engine.
- Set the engine to ICR by assigning VisionEngine.Icr to enable advanced document understanding with layout detection, text recognition, table extraction, equation recognition, and key-value pair detection.
- Alternative engines include OCR mode for basic text extraction and VLM-enhanced mode for semantic understanding with vision language models.
- Create a vision instance with Vision.set() bound to the document with configured engine settings.
- Call vision.warmup() to trigger pre-download of all AI models required for the configured vision engine, fetching models from the SDK’s model repository and caching them locally.
- Warmup downloads different model sets based on engine configuration: ICR downloads comprehensive document understanding models, OCR downloads text recognition models, and VLM-enhanced mode downloads ICR models plus semantic understanding resources.
- Print statements provide feedback during model downloads, informing users about download progress and completion status for potentially multi-second operations.
- After warmup completes, call vision.extract_content() to perform ICR operations without model download delays, ensuring predictable and fast processing for all subsequent requests.
- The extract_content() method returns extracted content as JSON, including document structure (headings, paragraphs, tables, lists), textual content, table structures, equations, and key-value pairs.
- Write the extracted JSON to a file using a nested context manager with open() for automatic resource cleanup after writing completes.
- Handle NutrientException for vision processing failures, including model download errors, processing failures, or configuration issues.
- The context manager ensures proper resource cleanup when processing completes or exceptions occur.
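The sample code in this guide does not show the exception handling mentioned in the list above, so here is one possible shape for it. This is a sketch under assumptions: warm_then_extract is a hypothetical helper, and because this guide does not show where NutrientException is imported from, the exception types to catch are passed in by the caller rather than imported here.

```python
from typing import Callable, Optional, Tuple, Type


def warm_then_extract(
    warmup: Callable[[], object],
    extract: Callable[[], str],
    handled: Tuple[Type[BaseException], ...] = (Exception,),
) -> Tuple[Optional[str], Optional[str]]:
    """Run warmup, then extraction. Returns (content, None) on success or
    (None, message) on failure, naming the step that failed."""
    try:
        warmup()
    except handled as exc:
        # For example, a model download error during warmup.
        return None, f"warmup failed: {exc}"
    try:
        return extract(), None
    except handled as exc:
        # For example, a processing or configuration error.
        return None, f"extraction failed: {exc}"


# Example usage with the vision instance from the sample (requires the SDK;
# check the SDK reference for the NutrientException import path):
# content, error = warm_then_extract(
#     vision.warmup, vision.extract_content, handled=(NutrientException,)
# )
```

Separating the two steps makes it clear whether a failure came from downloading models or from processing the document.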
For related image extraction workflows, refer to the Python SDK guides.
Download this ready-to-use sample package to integrate warmup into application startup.