AI Document Processing
AI Document Processing (formerly XtractFlow) is an intelligent document processing (IDP) SDK that extends Nutrient’s existing key-value pair (KVP) extraction technology with large language models (LLMs) to improve extraction and classification accuracy. By combining LLMs, heuristics, math, and machine learning, this hybrid approach is designed to achieve higher accuracy than both traditional data extraction methods and pure AI/ML alternatives.
Why use AI Document Processing with Nutrient .NET SDK
While Nutrient .NET SDK offers robust extraction features such as key-value extraction, table and form data extraction, OMR, MRZ, MICR, and metadata extraction, AI Document Processing extends them in the following ways:
- Combining LLMs with KVP and machine vision — This hybrid approach delivers higher extraction and classification accuracy, especially for complex, unstructured, or semi-structured documents, compared to traditional AI/ML-only solutions.
- Automating classification and extraction — AI Document Processing enables unsupervised, automated document classification and data extraction, reducing the need for manual labeling or rule-based configuration.
- No-code and natural language support — It enables data extraction using natural language instructions, minimizing the need for exhaustive coding or rigid rules.
- Broad file type support — Processes more than 100 input file types, including PDFs, images, Office documents, and emails.
- Security and compliance — The SDK is designed to run on-premises, giving you full control over document processing. LLM-powered features can be configured to use external providers (such as OpenAI) or local LLMs, depending on data privacy requirements. You can review and customize this behavior to align with your organization’s compliance and retention policies.
- Versatility — Supports integration as a REST microservice or directly within .NET applications, making it suitable for a wide range of deployment scenarios.
- Performance — Offers multithreaded support for fast, automated, and batch processing of large document volumes.
How core features work
Below is an overview of how core features of AI Document Processing work.
Automated data extraction
- Data extraction is powered by generative AI and machine vision, enabling the system to understand and extract information from documents using natural language instructions, such as “extract the customer’s name, address, and invoice total.”
- The system can extract structured data, such as tables and forms, as well as unstructured data, such as paragraphs or freeform text.
- It supports extraction of textual data, numerical values, identification numbers, and workflow-specific information — for example, legal clauses and medical codes.
- There’s no need for predefined templates or KVP rules, though you can use customizable templates and strategic hinting for even greater precision if desired.
- The process is unsupervised, meaning the system learns and adapts without requiring annotated training data.
For more information, refer to our guide on invoice recognition and data extraction.
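To make the idea of natural language extraction instructions more concrete, the following Python sketch shows one way a calling application might model the fields it wants and check what comes back. It does not use the AI Document Processing API: `FieldSpec`, `build_instruction`, `check_result`, and the sample result are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field to extract, described in plain language (hypothetical type)."""
    name: str
    description: str
    required: bool = True


def build_instruction(fields):
    """Join the field descriptions into one natural language instruction."""
    parts = [f.description for f in fields]
    if not parts:
        raise ValueError("at least one field is required")
    if len(parts) <= 2:
        listing = " and ".join(parts)
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"Extract {listing}."


def check_result(fields, result):
    """Return names of required fields that are missing or empty in a result."""
    return [f.name for f in fields if f.required and not result.get(f.name)]


# Field names and descriptions are assumptions chosen to mirror the
# invoice example in the text above.
INVOICE_FIELDS = [
    FieldSpec("customer_name", "the customer's name"),
    FieldSpec("customer_address", "the customer's address"),
    FieldSpec("invoice_total", "the invoice total"),
]

if __name__ == "__main__":
    print(build_instruction(INVOICE_FIELDS))
    # A hand-written stand-in for whatever an extraction step would return.
    sample = {"customer_name": "Acme Ltd.", "invoice_total": "1200.00"}
    print(check_result(INVOICE_FIELDS, sample))
```

Keeping the field descriptions as data rather than hard-coded rules reflects the template-free approach described above: changing what is extracted means editing a description, not rewriting parsing logic.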
Unsupervised, automated document classification
- The system uses LLMs combined with heuristics, mathematics, and machine learning to automatically recognize and categorize documents.
- It doesn’t require you to manually label documents or define extraction rules in advance.
- You can provide natural language instructions, such as “classify invoices, contracts, and receipts,” and the system will identify and sort documents into predefined or custom categories based on their content and structure.
For more information, refer to our guide on document classification and recognition.
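As a rough illustration of sorting documents into predefined or custom categories, the Python sketch below groups file names by an assigned label and sets aside anything that falls outside the known categories. It is not the SDK's API or its classification logic: the `CATEGORIES` descriptions, the `sort_documents` helper, and the label dictionary are assumptions standing in for the output of a real classification step.

```python
from collections import defaultdict

# Hypothetical categories, each described in plain language.
CATEGORIES = {
    "invoice": "a request for payment listing items and amounts due",
    "contract": "a signed agreement between two or more parties",
    "receipt": "proof that a payment was made",
}


def sort_documents(labels, categories):
    """Group file names by label; labels outside `categories` go to 'unclassified'."""
    buckets = defaultdict(list)
    for filename, label in labels.items():
        key = label if label in categories else "unclassified"
        buckets[key].append(filename)
    return dict(buckets)


if __name__ == "__main__":
    # Hand-written labels standing in for a classifier's output.
    assigned = {
        "scan_001.pdf": "invoice",
        "scan_002.pdf": "receipt",
        "scan_003.pdf": "memo",
        "scan_004.pdf": "invoice",
    }
    print(sort_documents(assigned, CATEGORIES))
```

Routing unexpected labels to a separate bucket keeps them visible for review instead of silently dropping them or forcing them into the wrong category.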
Integrating AI Document Processing with Nutrient .NET SDK
For detailed instructions on integrating AI Document Processing with Nutrient .NET SDK, refer to our guide for getting started with AI Document Processing.
Explore AI Document Processing guides
- Important concepts to understand
- Build a custom data extraction template
- Classify and recognize documents
- Recognize and process invoices