OCR facilities for document conversion

The Muhimbi Document Converter supports a number of OCR (optical character recognition) facilities, including the ability to make image-based PDFs (scans, faxes) fully searchable and indexable. In addition, it supports extracting the recognized text, so that information such as invoice numbers, purchase order numbers, or other identifying information can be used as part of a larger software or workflow process.
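As a rough illustration of the workflow step described above, the sketch below pulls an invoice number out of text that has already been extracted by OCR. It does not call the Muhimbi Document Converter. The function name `find_invoice_number`, the regular expression, and the sample text are all hypothetical and would need adapting to the layout of real documents.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not part of the product):
# the word "Invoice", an optional "No." / "Number" / "#", an optional
# separator, then an identifier that ends in a digit.
INVOICE_PATTERN = re.compile(
    r"invoice\s*(?:no\.?|number|#)?\s*[:#]?\s*([A-Z0-9][A-Z0-9-]*\d)",
    re.IGNORECASE,
)


def find_invoice_number(ocr_text: str) -> Optional[str]:
    """Return the first invoice-like identifier in the text, or None."""
    match = INVOICE_PATTERN.search(ocr_text)
    return match.group(1) if match else None


# Made-up sample standing in for text recognized from a scanned invoice.
sample_text = "ACME Ltd\nInvoice No: INV-2024-0042\nTotal due: 118.00"
print(find_invoice_number(sample_text))
```

In a real workflow the returned value would typically be written to a metadata column or passed to the next step, and a `None` result would be routed to manual review, since OCR output can contain recognition errors.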

For more details and examples see the following articles:

Please note that in order to use OCR in a production environment, a valid add-on license for the OCR and PDF/A Archiving Add-on must be installed alongside a regular license.