---
title: "Applying OCR to a PDF page | Nutrient Python SDK"
canonical_url: "https://www.nutrient.io/guides/python/extraction/apply-ocr-to-pdf-page/"
md_url: "https://www.nutrient.io/guides/python/extraction/apply-ocr-to-pdf-page.md"
last_updated: "2026-05-30T02:20:01.349Z"
description: "How to run OCR on a single PDF page using Nutrient Python SDK."
---

# Applying OCR to a PDF page

Running OCR on a single page is useful when only part of a document needs recognition, or when pages are processed on demand as a larger workflow progresses. Examples include scanning the cover page of a batch to extract a reference number, applying OCR only to pages flagged during triage, or incrementally processing a long document without reprocessing pages that are already searchable.

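Applying OCR at the page level adds an invisible text layer to just that page. The rest of the document is untouched, so the cost depends on one page rather than on the length of the document. Any hidden text already on the page is removed first, so repeating the operation on the same page replaces the text layer instead of stacking a second one.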

This sample shows how to run OCR on a single page of a document using Nutrient Python SDK and save the result. The input can be any document format the SDK supports, such as an image-based PDF, a multi-page TIFF, or a single image. If the input isn't already a PDF, the SDK converts it to PDF automatically when you create the editor.

[Download sample](https://www.nutrient.io/downloads/samples/python/apply-ocr-to-pdf-page.zip)

## How Nutrient helps

Nutrient Python SDK exposes the OCR pipeline it uses for whole documents at the page level as well. Behind a single method call, the SDK:

- Implicitly converts non-PDF inputs (images, multi-page TIFFs, Office documents) to PDF when the editor is created

- Renders just the target page to a bitmap at the resolution OCR needs

- Runs text recognition with the configured languages

- Preserves reading order and text block orientation returned by the recognizer

- Places an invisible, correctly positioned text layer over the original page content

Other pages in the document aren't touched.

## Preparing the project

Import the classes used in the sample:

```python

from nutrient_sdk import Document
from nutrient_sdk import PdfEditor
from nutrient_sdk import NutrientException

```

## Running OCR on a single page

The `main()` function opens the source document inside a [context manager](https://docs.python.org/3/reference/datamodel.html#context-managers), configures the OCR language, then runs OCR only on the first page. The sample passes an image-based PDF as input, but the same code handles raw images or any other supported document format. The context manager closes the document automatically when the block ends, even if an error is raised:

```python

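def main():
    try:
        # The context manager closes the document when the block ends.
        with Document.open("input_image_based.pdf") as document:
            # Language models the recognizer should load.
            document.settings.ocr_settings.default_languages = "eng"

            # Non-PDF inputs are converted to PDF when the editor is created.
            editor = PdfEditor.edit(document)
            pages = editor.get_page_collection()

            # Run OCR on the first page only.
            page = pages.get_first()
            page.make_searchable()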

```

Assigning `document.settings.ocr_settings.default_languages = "eng"` tells the recognizer which language models to load. Combine languages with `+` (for example `"eng+deu"`) when the page contains more than one language.
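When the language list comes from configuration rather than a literal, the `+`-separated value can be assembled in code. The following is a minimal sketch: `join_ocr_languages` is a hypothetical helper, not part of the SDK, and it assumes the codes are plain strings such as `eng` or `deu`.

```python
def join_ocr_languages(codes):
    """Build a "+"-separated language string such as "eng+deu".

    Hypothetical helper, not part of Nutrient Python SDK. It trims
    whitespace, drops empty entries, and removes duplicates while
    keeping the first-seen order.
    """
    unique = []
    for code in codes:
        cleaned = code.strip()
        if cleaned and cleaned not in unique:
            unique.append(cleaned)
    if not unique:
        raise ValueError("at least one OCR language code is required")
    return "+".join(unique)


# The result would be assigned to the setting shown above, for example:
# document.settings.ocr_settings.default_languages = join_ocr_languages(["eng", "deu"])
print(join_ocr_languages(["eng", "deu", "eng"]))  # eng+deu
```

Rejecting an empty list early keeps a misconfigured language setting from reaching the recognizer unnoticed.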

`PdfEditor.edit(document)` attaches an editor to the open document. If the input isn't already a PDF, the SDK converts it to PDF at this step. `editor.get_page_collection().get_first()` then returns the first page of the resulting PDF as a `PdfPage`. Calling `page.make_searchable()` runs OCR on that page and writes an invisible text layer on top of it. Any hidden text already present on the page is removed before the new layer is drawn. Other pages in the document are left unchanged.

To target a different page, use zero-based index access on the page collection (for example, `pages[2]` for the third page) and call `make_searchable()` on that page instead.
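The same index access extends to several pages, such as the pages flagged during triage. The sketch below is illustrative only: `make_pages_searchable` is a hypothetical helper that relies solely on the index access and `make_searchable()` call described above, and `StubPage` is a stand-in so the example runs without the SDK.

```python
def make_pages_searchable(pages, page_numbers):
    """Run OCR on the given 1-based page numbers and return them in processing order.

    Hypothetical helper. `pages` is expected to support zero-based index
    access and to return objects with a make_searchable() method.
    """
    processed = []
    # Sorting puts any invalid number (< 1) first, so it fails before OCR starts.
    for number in sorted(set(page_numbers)):
        if number < 1:
            raise ValueError(f"page numbers start at 1, got {number}")
        pages[number - 1].make_searchable()  # convert to a zero-based index
        processed.append(number)
    return processed


class StubPage:
    """Stand-in for a real page so the sketch runs without the SDK."""

    def __init__(self):
        self.searchable = False

    def make_searchable(self):
        self.searchable = True


stub_pages = [StubPage() for _ in range(5)]
print(make_pages_searchable(stub_pages, [3, 1, 3]))  # [1, 3]
```

With the real SDK, `pages` would be the collection returned by `editor.get_page_collection()`.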

## Saving the result

Save the modified document to a new file and close the editor. The `except NutrientException` clause completes the `try` block opened at the start of `main()` and reports any licensing, language-pack, or I/O issue that the SDK raises:

```python

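            # Still inside the `with` block from the previous snippet.
            editor.save_as("output.pdf")
            editor.close()
    except NutrientException as e:
        # Licensing, language-pack, and I/O problems reported by the SDK land here.
        print(f"Error: {e}")

if __name__ == "__main__":
    main()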

```
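In the sample, `editor.close()` is skipped if `save_as()` raises. If the editor should be closed even when saving fails, one option is a `try`/`finally` block. The sketch below is an illustration under that assumption: `save_and_close` is a hypothetical helper that uses only the `save_as()` and `close()` calls shown above, and `StubEditor` stands in for a real editor so the example runs without the SDK.

```python
def save_and_close(editor, output_path):
    """Save the edited document, closing the editor even if saving fails."""
    try:
        editor.save_as(output_path)
    finally:
        editor.close()


class StubEditor:
    """Stand-in for a real editor so the sketch runs without the SDK."""

    def __init__(self, fail=False):
        self.fail = fail
        self.saved_to = None
        self.closed = False

    def save_as(self, path):
        if self.fail:
            raise OSError("simulated write failure")
        self.saved_to = path

    def close(self):
        self.closed = True


editor = StubEditor()
save_and_close(editor, "output.pdf")
print(editor.saved_to, editor.closed)  # output.pdf True
```

The exception still propagates after `close()` runs, so an enclosing `except NutrientException` handler would continue to report the failure.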

## Conclusion

The workflow for OCR-ing a single PDF page is:

1. Open the source document.

2. Configure OCR languages on the document settings.

3. Create a `PdfEditor` for the document.

4. Get the target page from `editor.get_page_collection()`.

5. Call `make_searchable()` on that page.

6. Save the result and close the editor.

Only the targeted page gains the invisible text layer. The content of every other page is left as it was in the input.

---

## Related pages

- [Generating image descriptions using local AI](/guides/python/extraction/describe-image-with-local-ai.md)
- [Generating image descriptions using Claude](/guides/python/extraction/describe-image-with-claude.md)
- [Extracting data from images using ICR](/guides/python/extraction/extract-data-from-image-icr.md)
- [Extracting text from multilingual images](/guides/python/extraction/read-text-from-image-multi-language.md)
- [Nutrient Python SDK extraction guides](/guides/python/extraction.md)
- [Extracting structured JSON data from PDF documents](/guides/python/extraction/json-data-extraction.md)
- [Extracting data from images using vision language models](/guides/python/extraction/extract-data-from-image-vlm.md)
- [Extracting text from images](/guides/python/extraction/read-text-from-image.md)
- [Extracting data from images using OCR](/guides/python/extraction/extract-data-from-image-ocr.md)
- [Speeding up first ICR operation by predownloading models](/guides/python/extraction/speed-up-first-icr-by-downloading-requirements.md)
- [Applying OCR to a PDF document](/guides/python/extraction/apply-ocr-to-pdf.md)
- [Generating image descriptions using OpenAI](/guides/python/extraction/describe-image-with-openai.md)