Applying OCR to a PDF document | Nutrient .NET SDK

OCR converts image-based PDFs into searchable, selectable documents. This workflow helps you process scanned files while preserving original page appearance.

Use this sample to:

  • Extract text from scanned PDF pages
  • Add an invisible text layer for search and copy
  • Keep original layout and visual content intact

Project setup

Install:

  • The core Nutrient .NET SDK package
  • GdPicture.Resources for OCR language and recognition resources

Prepare the project

Register the SDK license before running OCR operations. For setup details, refer to the getting started with .NET SDK guide.

using GdPicture14;
LicenseManager licence = new LicenseManager();
licence.RegisterKEY(""); // Set your license key

Load the PDF document

Create a GdPicturePDF instance and load the source PDF:

using GdPicturePDF pdf = new GdPicturePDF();
pdf.LoadFromFile(@"input_image_based.pdf");

Apply OCR processing

Run OCR across all pages:

pdf.OcrPages("*", 0, "eng", "", "", 200);

Parameter summary:

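  • "*": Process all pages
  • 0: Thread count; 0 leaves the number of threads to the SDK
  • "eng": Use English OCR language data
  • "" (fourth argument): Path to the OCR resource folder; left empty in this sample
  • "" (fifth argument): Character allowlist; an empty string places no restriction on recognized characters
  • 200: Process at 200 DPI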

Save the OCRed document

Write the processed PDF with the added text layer:

pdf.SaveToFile(@"output.pdf");

The output PDF keeps its original visual content and adds searchable text for indexing, text selection, and accessibility tooling.
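
The steps above can be combined into one console program. The following is a minimal sketch, not a definitive implementation: it assumes that LoadFromFile, OcrPages, and SaveToFile each return a GdPictureStatus value that can be compared with GdPictureStatus.OK. The class name OcrPdfSample, the command-line handling, and the error messages are hypothetical and not part of the original sample.

```csharp
using System;
using GdPicture14;

internal static class OcrPdfSample
{
    // Returns 0 on success and 1 on failure so the result can serve as an exit code.
    private static int Main(string[] args)
    {
        // Hypothetical argument handling; the defaults match the file names used above.
        string inputPath = args.Length > 0 ? args[0] : "input_image_based.pdf";
        string outputPath = args.Length > 1 ? args[1] : "output.pdf";

        // Register the license before any OCR call.
        LicenseManager licence = new LicenseManager();
        licence.RegisterKEY(""); // Set your license key

        using GdPicturePDF pdf = new GdPicturePDF();

        // Assumption: each call below reports its outcome as a GdPictureStatus.
        GdPictureStatus status = pdf.LoadFromFile(inputPath);
        if (status != GdPictureStatus.OK)
        {
            Console.Error.WriteLine($"Could not load '{inputPath}': {status}");
            return 1;
        }

        // Same arguments as in the OCR step above.
        status = pdf.OcrPages("*", 0, "eng", "", "", 200);
        if (status != GdPictureStatus.OK)
        {
            Console.Error.WriteLine($"OCR failed: {status}");
            return 1;
        }

        status = pdf.SaveToFile(outputPath);
        if (status != GdPictureStatus.OK)
        {
            Console.Error.WriteLine($"Could not save '{outputPath}': {status}");
            return 1;
        }

        Console.WriteLine($"Searchable PDF written to '{outputPath}'.");
        return 0;
    }
}
```

Checking the status after each step stops the program at the first failure, so a missing input file or a failed OCR pass does not end with a saved file that has no text layer.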