How to OCR PDFs with Nutrient OCR API: JavaScript, Python, PHP, and Java
Implement OCR with Nutrient’s OCR engine API in JavaScript, Python, PHP, and Java. Convert scanned PDFs to searchable text with code examples for each language.
Convert Office files, images, and more than 100 other formats to PDF. Extract data, add watermarks, run OCR, and hyper-compress — whatever your workflow needs.
Handle Microsoft Office, InfoPath, CAD, HTML, and scores of other file types with a single service call.
Call a lightweight REST endpoint or self-hosted SOAP server — code samples in C#, Java, PHP, JavaScript, and Ruby get you running fast.
Chain post-processing in one job: Watermark, merge, split, compress, or OCR the output without extra tooling.
The engine preserves fonts, graphics, links, and metadata while handling high-volume workloads proven in enterprise deployments.
CAPABILITIES
Convert, extract, encrypt, and process documents at scale. Every feature is production-ready, automation-friendly, and built to fit into your stack.
Create automated flows that convert numerous file formats to editable PDF documents with perfect fidelity.
Manually add watermarks to your documents, or add them as part of an automated workflow.
Combine multiple files into a single PDF. Automatically add bookmarks and a table of contents to ease navigation.
Extract data from PDF documents and images using key-value pair (KVP) extraction.
OCR text in images, image-based PDFs, and scanned or faxed documents. Optical character recognition makes the content editable and searchable.
Convert PDFs into PDF/A-1b, PDF/A-2b, and PDF/A-3b files to meet the guidelines for retaining business records.
Encrypt PDF and Office documents, or restrict the ability to print and copy content to ensure confidentiality and compliance.
Split a single PDF file into individual PDF files, or split it based on the number of pages or PDF bookmarks.
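As a rough illustration of splitting by page count, the sketch below computes which page ranges would end up in each output file. It is plain Python and does not call the product. The function name `split_page_ranges` and its parameters are hypothetical and chosen only for this example.

```python
def split_page_ranges(total_pages: int, pages_per_file: int) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges, one per output file.

    Hypothetical helper for illustration only; it is not part of any Nutrient API.
    """
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    # Step through the document in chunks; the last chunk may be shorter.
    return [
        (start, min(start + pages_per_file - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_file)
    ]


if __name__ == "__main__":
    # A 10-page document split every 4 pages yields three files.
    print(split_page_ranges(10, 4))
```

A 10-page document split every 4 pages gives the ranges 1-4, 5-8, and 9-10, so the final file simply holds the remaining pages.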
DEPLOYMENT
Convert 100+ file types to PDF with a single call — use Nutrient’s cloud REST API, or host the Windows service yourself and access it over SOAP. Get started fast with ready-made client samples for C#, Java, Node.js, PHP, Ruby, and .NET Core.
Nutrient SDKs and Cloud APIs add full document lifecycle support to any platform, tech stack, or infrastructure in minutes. The same technology meets Fortune 500 requirements while helping startups ship fast.
Clean documentation, drop-in code, and MCP hooks for both hands-on developers and AI agents.
Web, mobile, desktop, server, or Nutrient Cloud — with no lock-in.
SOC 2 Type 2 and WCAG 2.2-compliant workflows with PDF/UA-accessible documents.
Built-in document AI with support for leading LLMs and their private implementations.
PROVEN AT SCALE
The digital arm of Germany’s national railway digitizes millions of track maintenance blueprints with the Nutrient PDF SDK, keeping 40,000 trains rolling each day.
Governance portal trusted by 2,000+ boards in 30 countries embeds Nutrient Web SDK to enable in‑portal annotations and cross‑device continuity, achieving 80 percent user engagement.
Rolled out nationwide PAdES-compliant signatures with the Nutrient PDF SDK, letting every Austrian citizen sign official documents securely in seconds.
FREE TRIAL
Start building with Document Converter Services in minutes, with no payment information required.
Document Converter Services can convert more than 100 file types to PDF, including MS Office documents, InfoPath forms, CAD drawings, and HTML files. This versatility enables users to handle a wide range of document formats with ease.
Navigate to the orders screen (you may be asked to log in first), click Get License Key for the relevant order, and then click Download. After downloading the license key, you can forward it to someone else by attaching it to an email.
You can make a purchase online or via a purchase order (wire transfer or ACH). We support Mastercard and Visa credit and debit cards, as well as American Express.
Our support desk typically provides information, advice, and guidance for no additional fee. However, if you need us to carry out (part of) the implementation of your project, please contact us. We may pick it up internally or put you in contact with a third party familiar with our products and services.
All our historical releases have been driven by real-world customer requests. If you require a generic feature to be implemented, then please contact us.
It depends on the number of servers, the number of cores in each server, the type and complexity of the documents you’re converting, the number of concurrent users, and other factors. You can read more about scalability and performance here.
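For a first back-of-the-envelope figure, the factors above can be combined into a simple estimate. The sketch below is not a benchmark or a documented sizing formula. It assumes, purely for illustration, that each core handles one conversion at a time and that only a fraction of the theoretical capacity is achieved in practice. The function name, parameters, and default utilization are all hypothetical.

```python
def estimate_docs_per_hour(
    servers: int,
    cores_per_server: int,
    avg_seconds_per_doc: float,
    utilization: float = 0.7,
) -> float:
    """Rough throughput estimate under the assumptions stated above.

    Hypothetical model: one conversion per core at a time, scaled by a
    utilization factor between 0 and 1. Measure real workloads before sizing.
    """
    if servers < 1 or cores_per_server < 1:
        raise ValueError("servers and cores_per_server must be at least 1")
    if avg_seconds_per_doc <= 0:
        raise ValueError("avg_seconds_per_doc must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    parallel_slots = servers * cores_per_server
    return parallel_slots * utilization * 3600 / avg_seconds_per_doc


if __name__ == "__main__":
    # Two 8-core servers, 4 seconds per document, 50 percent utilization.
    print(estimate_docs_per_hour(2, 8, 4.0, utilization=0.5))
```

Document complexity and concurrent users show up here only indirectly, through the average time per document and the utilization factor, so the result is best treated as a starting point for testing with representative files.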
With Document Converter Services, you can easily convert 100+ file types to PDF, and there are some cross-conversion capabilities as well. Check out the complete list of supported file types.
You’ll need to uninstall the old version and install the new one. Depending on the version you have, there are also some things to keep in mind. Learn more here about upgrading to the latest version of Document Converter Services.
Take a look at the most common deployment scenarios for Document Converter Services, and check out the advice on securing the server through firewall ports, network authentication, secure communication, and secure Office installation.
Document Converter Services comes with a comprehensive but friendly web services interface that can be accessed from any modern web services-based development environment, including C#, VB, Java, Documentum, SAP, and SharePoint. Learn how to apply PDF security from your own source code.
Yes, we do. It’s available on GitHub.
Here are the articles about watermarking documents with Document Converter Services. This is where you can find everything you need to apply watermarks to your Word, Excel, PowerPoint, and PDF files.
The most common reason for this is that, by default, watermarks are displayed in the background, behind the document’s content. In most cases, this isn’t a problem, as the content of most documents is largely transparent. However, this isn’t the case for PowerPoint presentations or scanned content. Learn more here.