Class OcrProcessor

An Optical Character Recognition (OCR) processor used to process documents with images that contain text.

OCR functionality requires the additional NuGet package PSPDFKitOcr.NET, which must be referenced by the calling project. The processor can perform OCR in any of the languages listed in OcrLanguage.

Inheritance

System.Object

OcrProcessor

Namespace: PSPDFKit.Ocr

Assembly: PSPDFKit.dll

Syntax

public class OcrProcessor : object

Constructors

OcrProcessor(Document)

Creates an instance of an OCR processor.

Declaration

public OcrProcessor(Document document)

Parameters

Type	Name	Description
Document	document	The source document on which to perform OCR.

Fields

Pages

The indices of the source document pages on which to perform OCR. If the list is empty, all pages of the source document are processed.

Declaration

public List<int> Pages

Field Value

Type	Description
List<System.Int32>	The page indices to process. An empty list selects every page.

Properties

Language

The OcrLanguage with which to perform OCR.

Declaration

public OcrLanguage Language { get; set; }

Property Value

Type	Description
OcrLanguage	The language used when performing OCR.

Methods

PerformOcr(IWritableDataProvider)

Performs optical character recognition (OCR) on the Pages of the source document using the configured Language, which defaults to English. The processed document is written to the given data provider. For more information, see https://pspdfkit.com/guides/dotnet/current/ocr/overview/.

Declaration

public void PerformOcr(IWritableDataProvider writableDataProvider)

Parameters

Type	Name	Description
IWritableDataProvider	writableDataProvider	The data provider to which the processed document is exported.
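
Example

The members above can be combined as in the following minimal sketch. It is illustrative only and rests on several assumptions that this page does not confirm: that Document is in the PSPDFKit namespace and can be constructed from a data provider, that a FileDataProvider type exists in PSPDFKit.Providers and implements IWritableDataProvider, that OcrLanguage has an English member, and that page indices are zero-based. The file names are placeholders, and any license initialization the SDK requires is omitted.

```csharp
using System.Collections.Generic;
using PSPDFKit;            // Assumed namespace of Document.
using PSPDFKit.Ocr;        // OcrProcessor and OcrLanguage (per this page).
using PSPDFKit.Providers;  // Assumed namespace of FileDataProvider.

public static class OcrExample
{
    public static void Main()
    {
        // Assumption: a Document can be opened from a file-backed data provider.
        // "scanned.pdf" is a placeholder path.
        var document = new Document(new FileDataProvider("scanned.pdf"));

        // The constructor takes the source document on which to perform OCR.
        var processor = new OcrProcessor(document);

        // Language defaults to English according to this page; it is set
        // explicitly here for clarity. OcrLanguage.English is an assumed member name.
        processor.Language = OcrLanguage.English;

        // Restrict OCR to selected pages. Leaving the list empty processes
        // every page. Zero-based indexing is an assumption.
        processor.Pages = new List<int> { 0, 2 };

        // Export the processed document to a writable data provider.
        // Assumption: FileDataProvider implements IWritableDataProvider.
        processor.PerformOcr(new FileDataProvider("scanned-ocr.pdf"));
    }
}
```

Because Pages is a public field rather than a constructor argument, the same processor instance could in principle be reconfigured between calls to PerformOcr, although this page does not state whether that is supported.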