How to edit a PDF in Python: Add text, images, and annotations


    Editing PDFs in Python is more difficult than reading them. Most open source libraries let you overlay new content and fill basic AcroForm fields, but truly rewriting text or handling advanced form workflows like XFA, validation, and signatures typically requires a dedicated SDK. This guide covers what each approach can do — with code you can copy.
    TL;DR

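    Four ways to edit PDFs in Python:

    • PyPDF with ReportLab: stamp text overlays onto existing pages.
    • PyMuPDF: insert text and images, and add annotations.
    • Nutrient Python SDK: fill forms, apply structural redaction, and add digital signatures.
    • Nutrient API: run the same editing operations over HTTP with no local install.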

    Open source tools handle simple overlays and basic form filling. For advanced form workflows, structural redaction, or digital signatures, you’ll need Nutrient Python SDK.

    Choosing between libraries? See our Python PDF library comparison for a full feature-by-feature breakdown of seven tools.

    What “editing” a PDF means

    PDF editing ranges from simple overlays to structural changes. The right tool depends on what you need:

    • Text overlay — Stamp new text on top of existing content without changing the original. Good for watermarks or headers.
    • Content insertion — Add text, images, or shapes at specific coordinates. The original content stays untouched.
    • Form filling — Write values into predefined form fields (AcroForm or XFA).
    • Annotation — Add highlights, sticky notes, or freehand drawings as a separate layer.
    • Text replacement — Find and replace text in the PDF content stream. This is rare and fragile in open source libraries.
    • Redaction — Permanently remove content from the document structure; don’t just draw a black box over it.
    • Digital signatures — Apply certificate-based signatures for legal validity.

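    Most open source Python PDF libraries handle the first two well. Form filling and annotation have partial open source support, while reliable text replacement, structural redaction, and digital signatures typically require a commercial SDK.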

    Prerequisites

    All examples in this tutorial use Python 3.10+. Create a virtual environment and install the libraries as you go:

    python -m venv pdf-edit-env
    source pdf-edit-env/bin/activate # macOS/Linux
    # pdf-edit-env\Scripts\activate # Windows

    You’ll also need a sample PDF to work with. Any multipage PDF with text will do.
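
If no suitable file is at hand, a small test document can be generated. The following is a minimal sketch that writes a bare-bones multipage PDF using only the standard library. `build_sample_pdf` is a hypothetical helper name, and the layout (US Letter pages, one line of Helvetica text per page, Latin-1 text only) is an assumption chosen for simplicity, not something the later examples require.

```python
def build_sample_pdf(page_texts):
    """Return the bytes of a minimal PDF with one line of text per page."""
    count = len(page_texts)
    # Object layout: 1 = catalog, 2 = page tree, 3 = font,
    # then one (page, content stream) pair per page starting at object 4.
    objects = {
        1: b"<< /Type /Catalog /Pages 2 0 R >>",
        3: b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    }
    kids = " ".join(f"{4 + 2 * i} 0 R" for i in range(count))
    objects[2] = f"<< /Type /Pages /Kids [{kids}] /Count {count} >>".encode()

    for i, text in enumerate(page_texts):
        page_num = 4 + 2 * i
        content_num = page_num + 1
        # Backslashes and parentheses must be escaped inside PDF strings.
        escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        stream = f"BT /F1 18 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")
        objects[page_num] = (
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
            "/Resources << /Font << /F1 3 0 R >> >> "
            f"/Contents {content_num} 0 R >>"
        ).encode()
        objects[content_num] = (
            b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream"
        )

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num in range(1, len(objects) + 1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % num + objects[num] + b"\nendobj\n"

    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)
```

Writing the result to disk, for example with `pathlib.Path("input.pdf").write_bytes(build_sample_pdf(["Page one", "Page two", "Page three"]))`, gives the examples below a three-page file to work on. Real-world PDFs are far more varied, so it is worth testing against actual documents as well.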

    Method 1: Add a text overlay with PyPDF and ReportLab

    PyPDF can merge PDF pages, so you can generate a “stamp” page with ReportLab and merge it onto an existing page. This is the most common open source approach for adding text to a PDF.

    Install

    pip install pypdf reportlab

    Code example

    from io import BytesIO
    from pypdf import PdfReader, PdfWriter
    from reportlab.pdfgen import canvas
    # Read the source PDF first so the overlay matches its page size.
    reader = PdfReader("input.pdf")
    first_page = reader.pages[0]
    page_width = float(first_page.mediabox.width)
    page_height = float(first_page.mediabox.height)
    # Create a text overlay sized to the source page.
    buffer = BytesIO()
    c = canvas.Canvas(buffer, pagesize=(page_width, page_height))
    c.setFont("Helvetica", 36)
    c.setFillColorRGB(0.8, 0, 0) # Red text
    c.drawString(100, page_height - 100, "DRAFT — DO NOT DISTRIBUTE")
    c.save()
    buffer.seek(0)
    # Merge overlay onto the first page.
    overlay_reader = PdfReader(buffer)
    writer = PdfWriter()
    first_page.merge_page(overlay_reader.pages[0])
    writer.add_page(first_page)
    # Copy remaining pages unchanged.
    for page in reader.pages[1:]:
        writer.add_page(page)
    with open("output_stamped.pdf", "wb") as f:
        writer.write(f)
    print("Overlay applied to first page.")

    What this approach can do

    • Stamp text, shapes, or images anywhere on existing pages.
    • Add watermarks or headers/footers across all pages.
    • Combine multiple PDFs into one.
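
The single-page example above extends to every page by building one overlay per page. The following is a sketch under a few assumptions: pypdf and reportlab are installed, each page's MediaBox starts at the origin, and pages are not rotated. `footer_text`, `make_overlay`, and `stamp_all_pages` are hypothetical helper names, and the third-party imports are deferred into the functions so the module loads even where the libraries are missing.

```python
from io import BytesIO


def footer_text(label, number, total):
    """Build the footer string for one page."""
    return f"{label} | Page {number} of {total}"


def make_overlay(width, height, text):
    """Return an in-memory one-page PDF with `text` centered near the bottom."""
    from reportlab.pdfgen import canvas  # third-party: pip install reportlab

    buffer = BytesIO()
    c = canvas.Canvas(buffer, pagesize=(width, height))
    c.setFont("Helvetica", 9)
    c.drawCentredString(width / 2, 24, text)  # baseline 24 pt above the bottom edge
    c.save()
    buffer.seek(0)
    return buffer


def stamp_all_pages(input_path, output_path, label="CONFIDENTIAL"):
    """Merge a footer overlay onto every page and return the page count."""
    from pypdf import PdfReader, PdfWriter  # third-party: pip install pypdf

    reader = PdfReader(input_path)
    writer = PdfWriter()
    total = len(reader.pages)
    for number, page in enumerate(reader.pages, start=1):
        # Size each overlay to its own page, since sizes can differ within a file.
        width = float(page.mediabox.width)
        height = float(page.mediabox.height)
        overlay = make_overlay(width, height, footer_text(label, number, total))
        page.merge_page(PdfReader(overlay).pages[0])
        writer.add_page(page)
    with open(output_path, "wb") as f:
        writer.write(f)
    return total
```

Generating a fresh overlay per page is slower than reusing one stamp, but it keeps the footer centered when page sizes vary and allows per-page text such as page numbers.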

    Limitations

    • You can’t change existing text; you can only add new layers on top.
    • There’s no annotation or redaction support. PyPDF does support basic AcroForm filling via update_page_form_field_values(), but that’s a separate workflow from the overlay approach shown here.
    • Positioning is manual (you specify x/y coordinates in points).
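
Manual positioning is easier with a couple of conversion helpers. ReportLab measures in points (72 per inch) from the bottom-left corner of the page, which is why the example above writes `page_height - 100` to place text near the top. The helper names below are hypothetical; this is a small sketch, not part of either library.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def inches_to_points(inches):
    """Convert inches to PDF points."""
    return inches * POINTS_PER_INCH


def mm_to_points(mm):
    """Convert millimeters to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def from_top_left(x, y_from_top, page_height):
    """Convert a position measured from the top-left corner into the
    bottom-left-origin (x, y) pair that ReportLab's canvas expects."""
    return (x, page_height - y_from_top)


# One inch in from the left and one inch down from the top of a US Letter page.
x, y = from_top_left(inches_to_points(1), inches_to_points(1), 792.0)
print(x, y)  # 72.0 720.0
```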

    Method 2: Insert text and images with PyMuPDF

    PyMuPDF provides direct page-level methods for inserting text and images without needing a separate library.

    Install

    pip install PyMuPDF

    Code example: Insert text

    import pymupdf
    doc = pymupdf.open("input.pdf")
    page = doc[0]
    # Insert text at a specific position.
    page.insert_text(
        (72, 72), # x, y in points (1 inch from top-left)
        "Reviewed: 2026-02-27",
        fontsize=14,
        color=(0, 0, 0.8), # Blue
    )
    doc.save("output_text.pdf")
    doc.close()
    print("Text inserted on first page.")

    Code example: Insert an image

    import pymupdf
    doc = pymupdf.open("input.pdf")
    page = doc[0]
    # Define a rectangle where the image should appear.
    rect = pymupdf.Rect(400, 20, 550, 80) # x0, y0, x1, y1
    page.insert_image(rect, filename="logo.png")
    doc.save("output_logo.pdf")
    doc.close()
    print("Image inserted on first page.")

    Code example: Add a highlight annotation

    import pymupdf
    doc = pymupdf.open("input.pdf")
    page = doc[0]
    # Search for text and highlight it.
    text_instances = page.search_for("important")
    for inst in text_instances:
        page.add_highlight_annot(inst)
    doc.save("output_highlighted.pdf")
    doc.close()
    print(f"Highlighted {len(text_instances)} instances.")

    What this approach can do

    • Insert text, images, and vector shapes at precise coordinates.
    • Add annotations (highlights, underlines, sticky notes, stamps).
    • Basic redaction via add_redact_annot() and apply_redactions().
    • Extract and replace images.

    Limitations

    • The AGPL-3.0 license requires you to release your own code under AGPL terms when you distribute your application or offer it over a network, unless you purchase a commercial license.
    • Text replacement is fragile — it works by redacting the old text and inserting new text at the same position, which can break with complex fonts or layouts.
    • Form filling is supported via the Widget class, but digital signature creation isn’t.
    • Redaction is annotation-based, which is less robust than structural redaction. add_redact_annot() and apply_redactions() remove visible page content, but recoverable data may remain depending on the document structure and how the file is saved. Use a commercial SDK for regulatory or compliance redaction.
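
The annotation-based redaction flow mentioned above can be sketched as follows, assuming PyMuPDF is installed. `redact_phrase` and `phrase_remains` are hypothetical helper names. The follow-up check only searches page text, so it cannot prove that metadata, attachments, or other parts of the file are clean.

```python
def redact_phrase(input_path, output_path, phrase):
    """Black out every occurrence of `phrase` and return the number of hits."""
    import pymupdf  # third-party: pip install PyMuPDF

    doc = pymupdf.open(input_path)
    hits = 0
    for page in doc:
        rects = page.search_for(phrase)
        for rect in rects:
            page.add_redact_annot(rect, fill=(0, 0, 0))  # mark the area, black fill
        if rects:
            page.apply_redactions()  # remove the marked content from the page
            hits += len(rects)
    # garbage=4 drops unreferenced objects so removed content is not kept in the file.
    doc.save(output_path, garbage=4, deflate=True)
    doc.close()
    return hits


def phrase_remains(path, phrase):
    """Return True if `phrase` can still be found in any page's text."""
    import pymupdf

    with pymupdf.open(path) as doc:
        return any(page.search_for(phrase) for page in doc)
```

A call such as `redact_phrase("input.pdf", "output_redacted.pdf", "important")` followed by `phrase_remains("output_redacted.pdf", "important")` gives a quick sanity check, not a compliance guarantee.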

    Method 3: Full PDF editing with Nutrient Python SDK

    Nutrient Python SDK supports structural redaction, digital signatures, and reliable text annotation in a single pip install. The first two are capabilities the open source libraries above don’t provide.

    Install

    pip install nutrient-sdk

    Code example: Add a text annotation

    from nutrient_sdk import Document, PdfEditor, Color
    with Document.open("input.pdf") as document:
        editor = PdfEditor.edit(document)
        page = editor.get_page_collection().get_first()
        annotations = page.get_annotation_collection()
        annotations.add_free_text(
            100.0, 700.0, 250.0, 40.0, # x, y, width, height
            "Reviewer",
            "Reviewed and approved",
            "Arial",
            14.0,
            Color.from_argb(255, 0, 0, 0),
        )
        editor.save_as("output_annotated.pdf")
        editor.close()
    print("Text annotation added.")

    Code example: Fill form fields

    from nutrient_sdk import Document, PdfEditor
    with Document.open("form.pdf") as document:
        editor = PdfEditor.edit(document)
        form_fields = editor.get_form_field_collection()
        name = form_fields.find_by_full_name("name")
        if name is not None:
            name.set_value("Jane Doe")
        email = form_fields.find_by_full_name("email")
        if email is not None:
            email.set_value("jane@example.com")
        date = form_fields.find_by_full_name("date")
        if date is not None:
            date.set_value("2026-02-27")
        editor.save_as("output_filled.pdf")
        editor.close()
    print("Form fields filled.")

    Code example: Apply structural redaction

    from nutrient_sdk import Document, PdfEditor, Color
    with Document.open("report.pdf") as document:
        editor = PdfEditor.edit(document)
        page = editor.get_page_collection().get_first()
        annotations = page.get_annotation_collection()
        redaction = annotations.add_redact(
            72.0, 200.0, 468.0, 30.0, # x, y, width, height
        )
        redaction.interior_color = Color.from_argb(255, 0, 0, 0)
        editor.save_as("output_redacted.pdf")
        editor.close()
    print("Redaction applied — content permanently removed on save.")

    Code example: Apply a digital signature

    from nutrient_sdk import Document, Signature, DigitalSignatureOptions
    with Signature() as signer, Document.open("contract.pdf") as document:
        options = DigitalSignatureOptions()
        options.certificate_path = "certificate.pfx"
        options.certificate_password = "cert-password"
        options.signer_name = "Jane Doe"
        options.reason = "Contract Approval"
        signer.sign(document, "signed_contract.pdf", options)
    print("Digital signature applied.")

    What this approach can do

    Everything in methods 1 and 2, plus:

    • Fill AcroForm and XFA form fields programmatically.
    • Apply structural redaction that removes content from the PDF stream (not just a visual overlay).
    • Add certificate-based digital signatures.
    • Convert between common document and image formats (PDF, Word, Excel, PowerPoint, HTML, and images).
    • Perform optical character recognition (OCR) on scanned documents with multilanguage support.
    • Batch process thousands of files in server-side workflows.

    Limitations

    • Commercial license — requires a paid subscription.
    • Larger install than pure-Python libraries.

    Method 4: Edit PDFs via Nutrient API

    Nutrient API provides the same editing operations over HTTP, with no local install. This works well for serverless environments, CI pipelines, or offloading processing to a remote service.

    Code example: Add a watermark via API

    import os
    import requests
    # Store the API key in an environment variable — don't hardcode or commit it.
    api_key = os.environ.get("NUTRIENT_API_KEY")
    if not api_key:
        raise ValueError(
            "Set the NUTRIENT_API_KEY environment variable before running this example."
        )
    url = "https://api.nutrient.io/build"
    with open("input.pdf", "rb") as input_file:
        response = requests.post(
            url,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": input_file},
            data={
                "instructions": '{"parts":[{"file":"file"}],'
                '"actions":[{"type":"watermark",'
                '"text":"CONFIDENTIAL","fontSize":48}]}'
            },
            timeout=30,
        )
    response.raise_for_status()
    with open("output_watermarked.pdf", "wb") as f:
        f.write(response.content)
    print("Watermark applied via API.")

    The free tier includes 200 credits.
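
The `instructions` value in the watermark request above is a hand-concatenated JSON string, which is easy to break once the watermark text contains quotes or non-ASCII characters. A minimal sketch of building the same payload with `json.dumps` follows. `build_watermark_instructions` is a hypothetical helper, and it uses only the keys that appear in the example above.

```python
import json


def build_watermark_instructions(text, font_size=48):
    """Serialize the watermark instructions used in the request above."""
    instructions = {
        "parts": [{"file": "file"}],  # refers to the uploaded form field named "file"
        "actions": [
            {"type": "watermark", "text": text, "fontSize": font_size},
        ],
    }
    return json.dumps(instructions)


payload = build_watermark_instructions("CONFIDENTIAL")
```

The result can be passed to the request as `data={"instructions": payload}`. Letting the serializer handle escaping means a label such as `He said "draft"` still produces valid JSON.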

    Comparison: Which method fits your use case?

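    Capability | PyPDF and ReportLab | PyMuPDF | Nutrient SDK | Nutrient API
    Text overlay | ✅ | ✅ | ✅ | ✅
    Image insertion | ✅ (via ReportLab) | ✅ | ✅ | ✅
    Annotations | ❌ | ✅ | ✅ | ✅
    Form filling | ✅ Basic | ✅ | ✅ | ✅
    Structural redaction | ❌ | ❌ | ✅ | ✅
    Digital signatures | ❌ | ❌ | ✅ | ✅
    Text replacement | ❌ | ⚠️ Fragile | ✅ | ✅
    License | BSD-3 + BSD | AGPL-3.0 | Commercial | Pay-per-use
    Local install | ✅ | ✅ | ✅ | ❌ (HTTP)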

    Conclusion

    For text overlays or watermarks, either PyPDF with ReportLab or PyMuPDF works well at zero cost. PyMuPDF is faster and has better annotation support, but its AGPL-3.0 license means closed source commercial products need a separate commercial license.

    For structural redaction, digital signatures, or cross-format conversion, the open source libraries covered here fall short. Nutrient Python SDK covers those use cases in a single library.

    FAQ

    Can I edit existing text inside a PDF with Python?

    True in-place text editing (changing the words already in a PDF) is difficult with open source libraries. PyMuPDF can redact old text and insert new text at the same position, but this breaks with complex fonts or multiline content. Nutrient Python SDK provides more reliable text editing through its document engine.
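
The redact-and-reinsert technique described above can be sketched with PyMuPDF's redaction annotations, which accept replacement text. This assumes PyMuPDF is installed and that the built-in Helvetica font ("helv") is an acceptable substitute for the original typeface. `replace_text` is a hypothetical helper name.

```python
def replace_text(input_path, output_path, old, new, fontsize=11):
    """Replace each occurrence of `old` with `new`; return the number of hits."""
    import pymupdf  # third-party: pip install PyMuPDF

    doc = pymupdf.open(input_path)
    replaced = 0
    for page in doc:
        for rect in page.search_for(old):
            # The redaction removes the old glyphs, fills the area white,
            # and writes `new` into the same rectangle.
            page.add_redact_annot(
                rect, text=new, fontname="helv", fontsize=fontsize, fill=(1, 1, 1)
            )
            replaced += 1
        page.apply_redactions()
    doc.save(output_path)
    doc.close()
    return replaced
```

Because the new text has to fit the rectangle of the old text and uses a different font, results tend to degrade when the replacement is longer than the original or the page has a colored background.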

    What’s the difference between a text overlay and real PDF editing?

    A text overlay adds new content on top of an existing page, like placing a sticky note on a printed page. The original content remains unchanged underneath. Real PDF editing modifies the document’s internal objects, such as page content streams and form field dictionaries, which is needed for operations like form filling, redaction, and text replacement.

    Do I need a paid library to fill PDF form fields in Python?

    Not necessarily. PyPDF supports basic AcroForm filling via update_page_form_field_values(), and PyMuPDF provides form interaction through its Widget class. Both handle simple text fields and checkboxes well. For advanced form types (XFA forms, complex field validation, signature fields) or production batch workflows, Nutrient Python SDK provides broader coverage with support for text fields, checkboxes, dropdowns, radio buttons, and signature fields.
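
For the PyPDF route, a minimal sketch of basic AcroForm filling might look like this. It assumes pypdf is installed and that the field names passed in match the names defined in the form. `missing_fields` and `fill_acroform` are hypothetical helper names.

```python
def missing_fields(requested, available):
    """Return the requested field names that the form does not define, sorted."""
    return sorted(set(requested) - set(available))


def fill_acroform(input_path, output_path, values):
    """Write `values` (field name -> text) into the form and save a copy."""
    from pypdf import PdfReader, PdfWriter  # third-party: pip install pypdf

    reader = PdfReader(input_path)
    available = reader.get_fields() or {}  # get_fields() returns None without a form
    unknown = missing_fields(values, available)
    if unknown:
        raise KeyError(f"Fields not found in form: {unknown}")

    writer = PdfWriter()
    writer.append(reader)  # append() is what pypdf's form-filling docs use
    for page in writer.pages:
        writer.update_page_form_field_values(page, values, auto_regenerate=False)
    with open(output_path, "wb") as f:
        writer.write(f)
```

A call such as `fill_acroform("form.pdf", "output_filled.pdf", {"name": "Jane Doe"})` mirrors the SDK example earlier in this guide. Checking names up front turns a silently unfilled field into an explicit error.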

    Is PyMuPDF free for commercial use?

    PyMuPDF is released under AGPL-3.0, which requires you to release your source code under the same license if you distribute your application or let users interact with it over a network. For commercial use without open sourcing, you need a commercial license from Artifex. Our Python PDF library comparison covers licensing for all major libraries.

    Hulya Masharipov


    Technical Writer

    Hulya is a frontend web developer and technical writer who enjoys creating responsive, scalable, and maintainable web experiences. She’s passionate about open source, web accessibility, cybersecurity privacy, and blockchain.
