PDF OCR server
Document Engine includes custom-built optical character recognition (OCR) technology to accurately recognize text and patterns, as well as generate searchable PDF/A files.
Looking for more advanced OCR capabilities?
Nutrient .NET SDK OCR offers additional powerful features, such as zonal OCR, key-value extraction, image preprocessing, searchable PDF/A generation with layout retention, orientation detection, confidence scoring, and more. It’s available as a separate SDK and can be used in conjunction with Document Engine.
Comparing OCR SDKs — Nutrient vs. Apryse
Feature | Document Engine (Server) OCR | Nutrient .NET SDK OCR | Apryse OCR |
---|---|---|---|
Multi-language support | 30+ built-in languages | 30+ built-in languages | Six built-in languages with the OCR module binary; 10 with the IRIS OCR module |
Searchable PDF creation | ✅ | ✅ | ✅ |
OCR with exact bounding box coordinates | ❌ | ✅ | ✅ |
Zone-based OCR/custom OCR regions | ❌ | ✅ | ✅ |
Key-value/table extraction | ✅ (available through the Data Extraction API) | ✅ | ❌ |
Orientation detection | ❌ | ✅ | ✅ |
Image preprocessing (deskew, etc.) | ❌ | ✅ | ✅ (manual) |
Performance and speed | ✅ Fast | ✅ Fast | Depends on SDK setup (OCR module/IRIS module) |
API access | Three-step API call once initial setup is done | Requires SDK setup | Requires SDK setup |
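
As a rough illustration of what an OCR request to Document Engine might look like, the sketch below builds the JSON instructions for a single OCR action. The `/api/build` endpoint named in the comments, the `ocr` action type, the `language` value, and the part name `scan` are assumptions to verify against the Document Engine API reference. The helper `build_ocr_instructions` is hypothetical and only assembles the payload; it does not contact a server.

```python
import json


def build_ocr_instructions(language: str = "english", part_name: str = "scan") -> str:
    """Return a JSON string describing one OCR action on one uploaded file.

    Assumption: the server accepts a document with "parts" and "actions"
    keys, where an action of type "ocr" takes a "language" name.
    """
    instructions = {
        # part_name must match the name of the file field in the upload.
        "parts": [{"file": part_name}],
        # A single OCR pass in the requested language.
        "actions": [{"type": "ocr", "language": language}],
    }
    return json.dumps(instructions)


if __name__ == "__main__":
    # Assumed usage: send this string as the "instructions" field of a
    # multipart POST to <your-server>/api/build, with the scanned file
    # attached under the field name "scan" and your API token in the
    # Authorization header.
    print(build_ocr_instructions("german"))
```

Keeping the payload construction separate from the HTTP call makes it easy to check the request shape before pointing it at a real server.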
Key capabilities
- Highly accurate: Completely custom-built AI- and ML-powered OCR engine
- Language support: 30+ built-in languages, including English, French, German, and Spanish
- Searchable PDF: Turn scans, images, and documents into searchable PDF or PDF/A documents
- Extract data: Extract key-value pairs from unstructured documents
- Post-processing: Add signatures and annotations, assemble documents, and more
- Display PDFs: Open PDFs in integrated web or mobile PDF viewers
- Extendable: Add forms, signing, annotations, and more
Real-world use cases
- Invoice OCR — Convert scanned invoices into searchable PDFs, or extract totals and vendor info using OCR.
- Contract digitization — Turn scanned contracts into searchable, selectable PDFs for legal archiving.
- Form processing — Use OCR to extract fields like names, dates, and signatures from scanned forms.
- Multi-language document digitization — OCR documents in multiple languages with full Unicode support.
Which OCR SDK should I use?
Need | SDK to use |
---|---|
Basic OCR from PDFs/images | Document Engine OCR |
Production-ready OCR solution without SDK setup | Document Engine OCR |
OCR with form data, zones, or orientation detection | Nutrient .NET SDK OCR |
Batch processing of scanned documents | Either, depending on volume |
Layout and table preservation | Nutrient .NET SDK OCR (preferred) |
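
The decision table above can be read as a simple rule: any requirement that only the .NET SDK covers decides the choice, and otherwise Document Engine OCR is enough. The sketch below encodes that reading. The requirement names (`zonal_ocr`, `form_data`, and so on) and the function `recommend_ocr_product` are hypothetical labels chosen for illustration, not part of either product.

```python
# Requirements that the tables on this page attribute only to the .NET SDK.
NET_SDK_ONLY = {
    "bounding_boxes",
    "zonal_ocr",
    "orientation_detection",
    "image_preprocessing",
    "form_data",
}

# Requirements where this page states a preference rather than a hard rule.
PREFER_NET_SDK = {"layout_preservation", "table_preservation"}


def recommend_ocr_product(needs: set[str]) -> str:
    """Map a set of requirement labels to the product the tables suggest."""
    if needs & NET_SDK_ONLY:
        return "Nutrient .NET SDK OCR"
    if needs & PREFER_NET_SDK:
        return "Nutrient .NET SDK OCR (preferred)"
    # Basic OCR and searchable PDF creation need no SDK setup.
    return "Document Engine OCR"


if __name__ == "__main__":
    print(recommend_ocr_product({"searchable_pdf"}))
    print(recommend_ocr_product({"searchable_pdf", "zonal_ocr"}))
```

Batch processing is deliberately left out, since the table says either product can fit depending on volume.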