Optimize PDFs with advanced OCR features

The Muhimbi Document Converter supports a number of OCR (optical character recognition) facilities, including the ability to make image-based PDFs (scans, faxes) fully searchable and indexable. It also supports extracting the recognized text, so that information such as invoice numbers, purchase order numbers, or other identifying details can be pulled out and used as part of a larger software or workflow process.
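As a rough illustration of the second use case, the sketch below shows how text that has already been extracted by OCR might be searched for an invoice number and a purchase order number. It does not call the Muhimbi API; the sample text, the `extract_fields` helper, and the label patterns are all assumptions made for illustration and would need adjusting to match real documents.

```python
import re

# Hypothetical sample of text returned by an OCR step; real output will vary.
ocr_text = """ACME Supplies Ltd
Invoice No: INV-2024-00187
Purchase Order: 553201
Total due: 1,250.00
"""

# Assumed label formats. Adjust these patterns to the layout of your documents.
INVOICE_RE = re.compile(
    r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)",
    re.IGNORECASE,
)
PO_RE = re.compile(
    r"Purchase\s+Order\s*(?:No\.?|Number|#)?\s*[:\-]?\s*(\d+)",
    re.IGNORECASE,
)


def extract_fields(text):
    """Return the first invoice and purchase order numbers found, or None."""
    invoice = INVOICE_RE.search(text)
    purchase_order = PO_RE.search(text)
    return {
        "invoice_number": invoice.group(1) if invoice else None,
        "purchase_order": purchase_order.group(1) if purchase_order else None,
    }


fields = extract_fields(ocr_text)
print(fields)
```

In a workflow, values like these could then be written to document metadata or passed to a later step. OCR output often contains misread characters, so it is usually worth validating extracted values before relying on them.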

For more details and examples, see the following articles:

The How and Why of OCR / Providing document access to the visually impaired
OCR Facilities provided by Muhimbi’s server based PDF Conversion products
Converting scans and images to searchable PDFs using Java and server side OCR
Converting scans and images to searchable PDFs using C# and server side OCR
Converting scans and images to searchable PDFs using SharePoint Designer Workflows
Converting scans and images to searchable PDFs using OCR & Nintex Workflow
Extract text from scanned content using OCR and SharePoint Designer Workflows
Extract text from scanned content using OCR and Nintex Workflow
Utilise 3rd party OCR Engines in Muhimbi’s range of Server Side PDF Products

Please note that using OCR in a production environment requires a valid license for the OCR and PDF/A Archiving Add-on, installed alongside a regular license.
