Class OcrProcessor

An Optical Character Recognition (OCR) processor used to process documents with images that contain text.

OCR functionality requires the additional NuGet package PSPDFKitOcr.NET, which must be referenced by the calling project. The languages available for OCR are defined by OcrLanguage.

Inheritance
System.Object
OcrProcessor
Namespace: PSPDFKit.Ocr
Assembly: PSPDFKit.dll
Syntax
public class OcrProcessor : object

Constructors

OcrProcessor(Document)

Creates an instance of an OCR processor.

Declaration
public OcrProcessor(Document document)
Parameters
Type      Name      Description
Document  document  The source document on which to perform OCR.

Fields

Pages

The page indices of the source document on which to perform OCR. If empty, all source document pages will be processed.

Declaration
public List<int> Pages
Field Value
Type Description
List<System.Int32>

Properties

Language

The OcrLanguage with which to perform OCR.

Declaration
public OcrLanguage Language { get; set; }
Property Value
Type Description
OcrLanguage

Methods

PerformOcr(IWritableDataProvider)

Performs optical character recognition (OCR) on the Pages of the source document using the configured Language. If no Language is set, English is used. For more information, see https://pspdfkit.com/guides/dotnet/current/ocr/overview/.

Declaration
public void PerformOcr(IWritableDataProvider writableDataProvider)
Parameters
Type                   Name                  Description
IWritableDataProvider  writableDataProvider  The data provider to which the processed document is written.
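
Example

The members above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not an official sample. The file names are hypothetical, the page indices are assumed to be zero-based, and the Sdk.InitializeTrial() call, the FileDataProvider type, its PSPDFKit.Providers namespace, and the OcrLanguage.English value are assumptions that do not appear on this page. The PSPDFKitOcr.NET package must also be referenced by the project.

```csharp
using System.Collections.Generic;
using PSPDFKit;            // Assumption: Document and Sdk live in this namespace.
using PSPDFKit.Ocr;        // OcrProcessor (documented above).
using PSPDFKit.Providers;  // Assumption: FileDataProvider lives in this namespace.

internal static class OcrExample
{
    private static void Main()
    {
        // Assumption: the SDK has to be initialized once before any other call.
        Sdk.InitializeTrial();

        // Hypothetical input file containing scanned pages.
        var document = new Document(new FileDataProvider("scanned.pdf"));

        var ocrProcessor = new OcrProcessor(document)
        {
            // Set explicitly here; the method description states English is the default.
            Language = OcrLanguage.English
        };

        // Restrict OCR to the first two pages (assuming zero-based indices).
        // Leaving Pages empty processes every page of the source document.
        ocrProcessor.Pages = new List<int> { 0, 1 };

        // Hypothetical output file; FileDataProvider is assumed to
        // implement IWritableDataProvider.
        ocrProcessor.PerformOcr(new FileDataProvider("scanned-ocr.pdf"));
    }
}
```

Writing to a separate output file keeps the source document unchanged, which makes it easier to compare the result against the original scan.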