---
title: "Pattern highlighting with Python | Nutrient DCS"
canonical_url: "https://www.nutrient.io/guides/document-converter/document-converter-services/document-security/pattern-highlighting-using-python/"
md_url: "https://www.nutrient.io/guides/document-converter/document-converter-services/document-security/pattern-highlighting-using-python.md"
last_updated: "2026-06-08T09:14:14.317Z"
description: "Highlight text patterns in documents using Python and Nutrient Document Converter Services. Complete code example with regex pattern matching."
---

This guide demonstrates how to implement pattern highlighting in PDF documents using Python and Nutrient Document Converter Services (DCS). Pattern highlighting visually marks specific text patterns without removing the underlying content, making it ideal for:

- **Document review workflows** - Mark sensitive information for review before publication

- **Compliance auditing** - Highlight regulated data patterns for verification

- **Content analysis** - Identify and mark key information patterns for further processing

- **Training materials** - Emphasize important concepts or data in educational documents

Unlike pattern redaction, which permanently removes content, highlighting preserves the original text while making it visually distinct. This approach is particularly useful when you need to mark content for human review or maintain document integrity while indicating areas of interest.

You can run this code in any Python environment with access to the [Zeep library](https://docs.python-zeep.org/en/master/in_depth.html#).

The Zeep library enables interaction with Web Services Description Language (WSDL), which defines how to call the web services and describes the data structures returned. Nutrient Document Converter Services (DCS) provides these WSDL definitions for document highlighting operations.

## Prerequisites

Before implementing pattern highlighting, ensure you have:

- Python 3.x installed on your system

- The Zeep library installed (`pip install zeep`)

- Nutrient Document Converter Services running locally on port 41734

- Sample PDF documents for testing highlighting operations

- Basic understanding of regular expressions for pattern matching

- Appropriate file system permissions for reading input files and writing output

For initial DCS setup with Python, refer to the [using Document Converter Services with Python](https://www.nutrient.io/guides/document-converter/document-converter-services/dcs-with-python.md) guide.

## WSDL

Zeep extracts the following WSDL definitions:

```python
PatternHighlight(sourceFile: xsd:base64Binary, openOptions: ns2:OpenOptions, patternHighlightSettings: ns3:PatternHighlightSettings) -> PatternHighlightResult: xsd:base64Binary...
ns2:OpenOptions(UserName: xsd:string, Password: xsd:string, FileExtension: xsd:string, OriginalFileName: xsd:string, RefreshContent: xsd:boolean, AllowExternalConnections: xsd:boolean, AllowMacros: ns3:MacroSecurityOption, SystemSettings: ns5:SystemSettings, SubscriptionSettings: ns9:SubscriptionSettings)...
ns3:PatternHighlightSettings(Alpha: xsd:unsignedByte, Red: xsd:unsignedByte, Green: xsd:unsignedByte, Blue: xsd:unsignedByte, CaseSensitive: ns3:BooleanEnum, Debug: ns3:BooleanEnum, PageRange: xsd:string, Pattern: xsd:string)
```

The `PatternHighlight` method requires three parameters:

- `sourceFile: xsd:base64Binary`

- `openOptions: ns2:OpenOptions`

- `patternHighlightSettings: ns3:PatternHighlightSettings`

Use a Base64-encoded binary string for `sourceFile`, as defined by the W3C XML schema.
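As a minimal sketch of this step, the helper below reads a file and returns its contents as a Base64 string, mirroring what the sample code later in this guide does inline. The function name `encode_file_base64` is hypothetical and isn't part of Zeep or DCS.

```python
import base64
from pathlib import Path


def encode_file_base64(path):
    """Return the file's bytes as a Base64-encoded ASCII string.

    Hypothetical helper for illustration; not part of Zeep or DCS.
    """
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

Reading the file in binary mode matters here: opening a PDF in text mode can corrupt the bytes before they're encoded.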

Create the `openOptions` and `patternHighlightSettings` objects using Zeep type factories (`ns2` and `ns3` namespaces, respectively).

Configure the `OpenOptions` type by setting basic fields such as file name and extension.

The `PatternHighlightSettings` type supports:

- Highlight color via `Red`, `Green`, `Blue`, and `Alpha` byte values

- Case sensitivity and debug options using `BooleanEnum`

- Target `PageRange`

- Search `Pattern`

The method returns the highlighted file as `xsd:base64Binary`. The sample code below writes the returned bytes directly to the output file.
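Because the color fields are `xsd:unsignedByte`, each one must be a whole number from 0 to 255. The sketch below validates those values before they reach the service. The helper name `build_highlight_settings` is hypothetical, and its defaults are copied from the sample code in this guide rather than from any documented DCS default.

```python
def build_highlight_settings(pattern, red=0, green=0, blue=255, alpha=128, page_range="*"):
    """Validate inputs and return keyword arguments for PatternHighlightSettings.

    Hypothetical helper for illustration; not part of Zeep or DCS.
    """
    if not pattern:
        raise ValueError("pattern must be a non-empty string")

    channels = {"Alpha": alpha, "Red": red, "Green": green, "Blue": blue}
    for name, value in channels.items():
        # xsd:unsignedByte only allows whole numbers from 0 to 255.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not 0 <= value <= 255:
            raise ValueError(f"{name} must be an integer from 0 to 255, got {value!r}")

    return {"Pattern": pattern, "PageRange": page_range, **channels}
```

The returned dictionary can be unpacked into the `ns3` type factory, for example `factory2.PatternHighlightSettings(**build_highlight_settings("374245455400126"))`.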

## Sample code

The following Python code demonstrates how to highlight text in a PDF file based on a regular expression pattern:

```python
import base64

import zeep

print("Highlight a PDF file using a regular expression pattern")

# Service URL.
service_url = "http://localhost:41734/Muhimbi.DocumentConverter.WebService/"

# WSDL URL.
wsdl_url = service_url + "?WSDL"

# Source file.
source_file = "Redaction-Test-2.pdf"

# Construct the WS-Addressing header type.
header = zeep.xsd.Element(
    "Header",
    zeep.xsd.ComplexType(
        [
            zeep.xsd.Element(
                "{http://www.w3.org/2005/08/addressing}Action", zeep.xsd.String()
            ),
            zeep.xsd.Element(
                "{http://www.w3.org/2005/08/addressing}To", zeep.xsd.String()
            ),
        ]
    ),
)

# Create a header object. It isn't passed to the call below; if your endpoint
# requires these headers, Zeep accepts them through the `_soapheaders` argument.
header_value = header(Action=service_url, To=service_url)

# Create the client.
client = zeep.Client(wsdl=wsdl_url)

# Load the source file as a Base64 string.
with open(source_file, "rb") as pdf_file:
    encoded_string = base64.b64encode(pdf_file.read()).decode("utf-8")

# Create a type factory for objects with the prefix ns2 (see the WSDL).
factory = client.type_factory("ns2")

# Create the OpenOptions object with minimum settings.
open_options = factory.OpenOptions(OriginalFileName=source_file, FileExtension="pdf")

# Create a type factory for objects with the prefix ns3 (see the WSDL).
factory2 = client.type_factory("ns3")

# Create the PatternHighlightSettings with the page range, a semitransparent
# blue highlight color, and the pattern to search for.
highlight_settings = factory2.PatternHighlightSettings(
    PageRange="*", Alpha=128, Red=0, Green=0, Blue=255, Pattern="374245455400126"
)

# Call the PatternHighlight method with the required parameters.
result = client.service.PatternHighlight(encoded_string, open_options, highlight_settings)

# Write the highlighted file.
with open("Redaction-Test-highlighted.pdf", "wb") as f:
    f.write(result)

print("Done")
```

The code creates a highlighted PDF by processing the original document with the specified pattern. The output file contains the original content with matching patterns visually highlighted in the specified color.

## Troubleshooting

**Pattern matching error: No patterns found**

- Verify that the pattern exists in the document content

- Test your regex pattern with online regex validators

- Ensure the pattern format matches the document’s text structure

- Check if the pattern is case-sensitive and adjust accordingly
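One way to work through the checks above is to try the pattern locally against text copied from the document. The sketch below uses Python's built-in `re` module; it assumes the pattern behaves similarly in the DCS regex engine, which this guide doesn't confirm, so treat a local match as a hint rather than proof. The helper name `find_matches` and the sample text are made up for illustration.

```python
import re


def find_matches(pattern, text, case_sensitive=True):
    """Return every substring of text that the pattern matches (Python re syntax)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return [match.group(0) for match in re.finditer(pattern, text, flags)]


# Made-up sample text containing two 15-digit numbers that start with 34 or 37.
sample_text = "Card on file: 374245455400126, backup: 378282246310005"
print(find_matches(r"\b3[47]\d{13}\b", sample_text))
```

If the local check finds nothing, the pattern itself is the likely problem. If it matches locally but DCS highlights nothing, compare the page range and the text actually embedded in the PDF.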

**Service connection error: Cannot connect to DCS**

- Ensure Nutrient Document Converter Services is running on `localhost:41734`

- Check that no firewall is blocking the connection

- Verify the service URL in your code matches your DCS installation
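A quick way to rule out the first two causes is to test whether the port accepts TCP connections before involving Zeep. This is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and the default host and port are the ones used elsewhere in this guide.

```python
import socket


def is_port_open(host="localhost", port=41734, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("DCS port reachable:", is_port_open())
```

A `True` result only shows that something is listening on the port. It doesn't confirm that the listener is DCS or that the WSDL URL is correct.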

**File access error: Permission denied**

- Verify that Python has read access to the source PDF files

- Check that the output directory has write permissions

- Ensure files aren’t locked by other applications or PDF viewers
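The first two checks can be scripted before calling the service. The sketch below uses a hypothetical helper, `check_file_access`, that reports missing or unreadable input files and unwritable output directories. It can't detect files locked by another application, so that check remains manual.

```python
import os


def check_file_access(source_path, output_path):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    if not os.path.isfile(source_path):
        problems.append(f"Source file not found: {source_path}")
    elif not os.access(source_path, os.R_OK):
        problems.append(f"Source file is not readable: {source_path}")

    # The output file may not exist yet, so check its parent directory instead.
    output_dir = os.path.dirname(os.path.abspath(output_path))
    if not os.path.isdir(output_dir):
        problems.append(f"Output directory not found: {output_dir}")
    elif not os.access(output_dir, os.W_OK):
        problems.append(f"Output directory is not writable: {output_dir}")

    return problems
```

For the file names used in the sample code, the call would be `check_file_access("Redaction-Test-2.pdf", "Redaction-Test-highlighted.pdf")`.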

**Highlighting not visible: Pattern processed but no highlighting appears**

- Verify that the `Alpha` value is set appropriately (128 is recommended)

- Check that the `Red`, `Green`, and `Blue` values produce a color distinct from the document background

- Ensure the page range includes the pages containing your pattern

- Test with a simpler, more visible pattern first

## What’s next

Now that you can highlight patterns in documents with Python, explore these related document security capabilities:

- **Pattern redaction** - Learn how to permanently remove sensitive content with [pattern redaction using Python](https://www.nutrient.io/guides/document-converter/document-converter-services/document-security/pattern-redaction-using-python.md)

- **Smart redaction with C#** - Discover automated sensitive data detection using [smart redaction](https://www.nutrient.io/guides/document-converter/document-converter-services/document-security/smart-redaction.md) (C# implementation)

- **C# implementation** - Compare approaches with [pattern highlighting using C#](https://www.nutrient.io/guides/document-converter/document-converter-services/document-security/code-samples.md) code samples

- **Complete Python setup** - Review the [using Document Converter Services with Python](https://www.nutrient.io/guides/document-converter/document-converter-services/dcs-with-python.md) guide for more features

---

## Related pages

- [Sample code in C# for pattern redaction and highlighting](/guides/document-converter/document-converter-services/document-security/code-samples.md)
- [Secure PDF API](/guides/document-converter/document-converter-services/document-security.md)
- [Secure and protect PDFs and Office documents in C#](/guides/document-converter/document-converter-services/document-security/csharp.md)
- [Secure PDF and MS Office documents with Java](/guides/document-converter/document-converter-services/document-security/java.md)
- [Secure your PDFs and Office files easily with .NET](/guides/document-converter/document-converter-services/document-security/dotnet-core.md)
- [Secure PDF and Office documents with JavaScript](/guides/document-converter/document-converter-services/document-security/javascript.md)
- [Pattern redaction and highlighting with C#](/guides/document-converter/document-converter-services/document-security/pattern-redaction-and-highlighting.md)
- [Password protection for PDF documents in PHP](/guides/document-converter/document-converter-services/document-security/php.md)
- [Pattern redaction using Python](/guides/document-converter/document-converter-services/document-security/pattern-redaction-using-python.md)
- [Smart document redaction with C#](/guides/document-converter/document-converter-services/document-security/smart-redaction.md)

