Adding text markup annotations to a PDF document
Adding text markup annotations to PDFs programmatically enables teams to automate document review processes, build collaborative editing tools, and implement content highlighting workflows. Whether you’re creating automated proofreading systems, building document editing platforms, implementing spellcheck feedback tools, or creating collaborative review applications, text markup annotations provide visual indicators for text-level changes. Text markup annotations include highlight overlays for emphasizing important content, underline markings for drawing attention to specific text, strikeout lines for indicating deleted or outdated content, and squiggly underlines for marking spelling or grammar issues, all with customizable colors and precise positioning control.
How Nutrient helps you achieve this
Nutrient Python SDK handles PDF text markup annotation structures and appearance generation. With the SDK, you don’t need to worry about:
- Parsing text markup annotation dictionaries and quadrilateral arrays
- Managing translucent overlay rendering and blend modes
- Handling text coordinate calculations and bounding box intersections
- Generating complex annotation appearance streams and color transformations
Instead, Nutrient provides an API that handles all the complexity behind the scenes, letting you focus on your business logic.
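To give a sense of what the quadrilateral arrays mentioned above involve, here is a minimal sketch in plain Python. It is not Nutrient SDK code. It turns a rectangle given as x, y, width, and height into the eight numbers that describe one quadrilateral. The function name rect_to_quad_points is hypothetical, and two things are assumptions made for illustration: that (x, y) is the lower-left corner in PDF points, and that vertices are listed top-left, top-right, bottom-left, bottom-right (an ordering commonly seen in practice).

```python
from typing import List


def rect_to_quad_points(x: float, y: float, width: float, height: float) -> List[float]:
    """Return eight numbers describing one quadrilateral for a rectangle.

    Assumptions (not taken from the Nutrient SDK): (x, y) is the lower-left
    corner in PDF points, and vertices are listed as top-left, top-right,
    bottom-left, bottom-right.
    """
    left, right = x, x + width
    bottom, top = y, y + height
    return [
        left, top,      # top-left
        right, top,     # top-right
        left, bottom,   # bottom-left
        right, bottom,  # bottom-right
    ]


# Same rectangle as the highlight example later in this guide.
quad = rect_to_quad_points(50.0, 700.0, 150.0, 20.0)
print(quad)
```

With the SDK you pass only the rectangle, so bookkeeping of this kind stays out of your application code.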
Complete implementation
Below is a complete working example that demonstrates adding various text markup annotations to a PDF. The following lines set up the Python application. The import statements bring in all necessary classes from the Nutrient SDK:
from nutrient_sdk import Document
from nutrient_sdk import PdfEditor
from nutrient_sdk import Color
from nutrient_sdk import NutrientException

Working with text markup annotations
The main() function defines the entry point that will contain the text markup annotation creation logic. The Document.open() call opens the PDF document. The context manager syntax ensures the document is automatically closed when you’re done, preventing resource leaks. The following code creates a PDF editor, accesses the page collection, ensures at least one page exists by adding a letter-size page (612×792 points) if the document is empty, and retrieves the annotation collection from the first page:
def main():
    try:
        with Document.open("input.pdf") as document:
            editor = PdfEditor.edit(document)
            pages = editor.get_page_collection()

            if pages.get_count() == 0:
                pages.add(612.0, 792.0)

            page = pages.get_first()
            annotations = page.get_annotation_collection()

Adding a highlight annotation
The following code adds a highlight annotation at coordinates (50, 700) with dimensions 150×20 points. The add_highlight() method creates a translucent colored overlay covering a rectangular text region, defined by position (x, y), size (width, height), author name, and contents metadata. Highlight annotations are created with a default yellow color, which is commonly used to emphasize important information or key passages. After creation, the color property is assigned a custom translucent yellow color using ARGB values (128, 255, 255, 0), where the alpha value of 128 (50 percent transparency) creates a semi-transparent overlay that reveals the underlying text while providing visual emphasis:
            highlight = annotations.add_highlight(
                50.0, 700.0, 150.0, 20.0,  # x, y, width, height
                "Highlighter",
                "Important information"
            )
            # Optionally customize the color
            highlight.color = Color.from_argb(128, 255, 255, 0)

Adding an underline annotation
The following code adds an underline annotation at coordinates (50, 650) with dimensions 150×20 points, positioned 50 points below the highlight annotation to maintain vertical spacing. The add_underline() method creates a horizontal line beneath the specified text region, drawing attention to content without obscuring it with an overlay. Underline annotations are created with a default green color. After creation, the color property is assigned a blue color using ARGB values (255, 0, 0, 255), creating a fully opaque blue underline. This pattern is commonly used for emphasizing hyperlinks, marking important terms in technical documentation, or indicating text requiring attention:
            underline = annotations.add_underline(
                50.0, 650.0, 150.0, 20.0,  # x, y, width, height
                "Underliner",
                "Emphasized text"
            )
            # Optionally customize the color
            underline.color = Color.from_argb(255, 0, 0, 255)

Adding a strikeout annotation
The following code adds a strikeout annotation at coordinates (50, 600) with dimensions 150×20 points, maintaining the 50-point vertical spacing pattern. The add_strike_out() method creates a horizontal line through the middle of the specified text region, visually indicating deletion or obsolescence. Strikeout annotations are created with a default red color (ARGB: 255, 255, 0, 0), making the deletion marker visually prominent. Unlike the highlight and underline examples, this example keeps the default red color without customization, because red is widely understood to signal removal or deletion. This annotation style is commonly used in legal document redlining, content revision workflows, or marking deprecated information:
            strike_out = annotations.add_strike_out(
                50.0, 600.0, 150.0, 20.0,  # x, y, width, height
                "Reviewer",
                "This content should be removed"
            )

Adding a squiggly annotation
The following code adds a squiggly annotation at coordinates (50, 550) with dimensions 150×20 points, completing the vertical series of text markup annotations with 50-point spacing. The add_squiggly() method creates a wavy underline beneath the specified text region, mimicking the visual feedback provided by word processors for spelling and grammar errors. Squiggly annotations are created with a default red color, typically used to indicate errors requiring correction. After creation, the color property is assigned a green color using ARGB values (255, 0, 128, 0). While red squiggly lines conventionally indicate spelling or grammar issues, customized colors enable alternative feedback systems, such as green for suggestions or warnings rather than errors:
            squiggly = annotations.add_squiggly(
                50.0, 550.0, 150.0, 20.0,  # x, y, width, height
                "Proofreader",
                "Check spelling here"
            )
            # Optionally customize the color
            squiggly.color = Color.from_argb(255, 0, 128, 0)

Saving the document
The final code block saves the document with all text markup annotations and closes the editor. The try-except block handles potential errors using NutrientException:
            editor.save_as("output.pdf")
            editor.close()
    except NutrientException as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Conclusion
The text markup annotation workflow consists of several key operations:
- Open the document using a context manager for automatic resource cleanup.
- Create an editor and access the page collection.
- Ensure at least one page exists by adding a letter-size page if needed.
- Retrieve the annotation collection from the target page.
- Add highlight annotations with translucent overlays using add_highlight() for emphasizing content.
- Customize highlight transparency using ARGB alpha values (128 = 50 percent transparency).
- Add underline annotations with add_underline() for drawing attention beneath text.
- Add strikeout annotations with add_strike_out() for marking deleted or outdated content.
- Add squiggly annotations with add_squiggly() for indicating spelling or grammar issues.
- Customize text markup colors using the color property with ARGB color values.
- Position annotations with consistent vertical spacing to create organized markup sequences.
- Save and close the editor.
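The four add calls in the steps above share one argument shape (x, y, width, height, author, contents), so they can be driven from data. The sketch below is one possible way to do that, not part of the Nutrient SDK. MarkupSpec, add_markup_series, and RecordingAnnotations are hypothetical names introduced here. The recorder stands in for the real annotation collection so the sketch runs without the SDK installed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MarkupSpec:
    """One text markup annotation to create (hypothetical helper type)."""
    method: str    # name of the add_* method on the annotation collection
    author: str
    contents: str


def add_markup_series(annotations, specs: List[MarkupSpec],
                      x: float = 50.0, top_y: float = 700.0,
                      width: float = 150.0, height: float = 20.0,
                      spacing: float = 50.0) -> list:
    """Add each markup in order, moving down by `spacing` points each time."""
    created = []
    for index, spec in enumerate(specs):
        y = top_y - index * spacing  # 700, 650, 600, 550 with the defaults
        add = getattr(annotations, spec.method)
        created.append(add(x, y, width, height, spec.author, spec.contents))
    return created


SPECS = [
    MarkupSpec("add_highlight", "Highlighter", "Important information"),
    MarkupSpec("add_underline", "Underliner", "Emphasized text"),
    MarkupSpec("add_strike_out", "Reviewer", "This content should be removed"),
    MarkupSpec("add_squiggly", "Proofreader", "Check spelling here"),
]


class RecordingAnnotations:
    """Stand-in for the annotation collection so the sketch runs without the SDK."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only add_* calls are recorded; anything else is a genuine error.
        if not name.startswith("add_"):
            raise AttributeError(name)

        def record(*args):
            self.calls.append((name, args))
            return (name, args)
        return record


recorder = RecordingAnnotations()
add_markup_series(recorder, SPECS)
for name, args in recorder.calls:
    print(name, args[:2])  # method name and its (x, y) position
```

In a real script, the object returned by page.get_annotation_collection() would take the place of the recorder, and colors could be set on the returned annotations afterward.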
Nutrient handles text markup annotation dictionary structures, quadrilateral array calculations, translucent overlay rendering, blend mode operations, and appearance stream generation so you don’t need to understand PDF text markup specifications or manage coordinate transformations manually. The text markup annotation system provides visual indicators for document review processes, collaborative editing workflows, automated proofreading systems, and content highlighting applications where precise text-level feedback is required.
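As a rough numerical illustration of what a translucent overlay means, the sketch below converts an 8-bit alpha value to an opacity fraction and composites an ARGB color over a white page using simple source-over blending. It is plain Python with hypothetical helper names (alpha_to_opacity, blend_over). It does not describe how Nutrient or any particular viewer renders highlights, which may use a different blend mode.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def alpha_to_opacity(alpha: int) -> float:
    """Convert an 8-bit alpha value (0-255) to an opacity fraction (0.0-1.0)."""
    if not 0 <= alpha <= 255:
        raise ValueError("alpha must be in the range 0-255")
    return alpha / 255


def blend_over(argb: Tuple[int, int, int, int], background: RGB) -> RGB:
    """Composite an ARGB color over an opaque background (source-over)."""
    alpha, *color = argb
    opacity = alpha_to_opacity(alpha)
    return tuple(
        round(opacity * c + (1 - opacity) * b)
        for c, b in zip(color, background)
    )


highlight_argb = (128, 255, 255, 0)   # the translucent yellow used in this guide
white_page = (255, 255, 255)

print(f"{alpha_to_opacity(128):.1%}")          # roughly half opaque
print(blend_over(highlight_argb, white_page))  # pale yellow over white
```

An alpha of 255, as in the blue underline example, gives an opacity of 1.0, so the background does not show through at all.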