This guide shows you how to use the redaction API to permanently remove sensitive information from PDF documents.

Introduction to PDF redaction

Redaction is the process of removing image, text, and vector content from a PDF page. It involves not only obscuring the content, but also removing the underlying data within the specified region of the document.

Redaction is generally used to remove personally identifiable or sensitive information from a document to ensure confidentiality and to comply with regulations and privacy laws, such as the General Data Protection Regulation (GDPR) or the Health Insurance Portability and Accountability Act (HIPAA). Once content is removed with the Redaction component, it can’t be restored, which guarantees privacy.

Redaction is a two-step process:

  • First, redaction annotations are created in the areas that are to be redacted. This step won’t remove any content from the document yet; it just marks regions for redaction.
  • Second, to actually remove the content, the redaction annotations need to be applied. In this step, the page content within the region of the redaction annotations is irreversibly removed.

Until the redaction annotations are applied, you can edit and remove them the same as any other annotation.

Text search redaction

In this example, you’ll create redactions using a text search rule. Any text matching a provided query will be covered by redaction annotations. To create redactions, use the createRedactions action with a text strategy.

To do this, add a document.pdf file to the same folder as your code. You can use any document containing text, or use our provided sample document.

Run the code, and you’ll get a result.pdf file with all occurrences of the searched text marked with redaction annotations.

Create redactions from search (basic API):

curl -X POST https://api.nutrient.io/processor/redact \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F file=@document.pdf \
  -F data='{
    "strategy": "text",
    "strategyOptions": {
      "text": "macaque",
      "caseSensitive": false
    }
  }'

Advanced API:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F document=@document.pdf \
  -F instructions='{
    "parts": [
      {
        "file": "document"
      }
    ],
    "actions": [
      {
        "type": "createRedactions",
        "strategy": "text",
        "strategyOptions": {
          "text": "macaque",
          "includeAnnotations": true,
          "caseSensitive": false
        }
      }
    ]
  }'
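
When the search text comes from a script variable rather than a literal, it has to be JSON-escaped before it’s placed inside the instructions. The following is a minimal sketch, assuming a POSIX shell with sed. The json_escape and text_redaction_instructions helpers are hypothetical names for illustration, not part of the API, and the escaping covers only backslashes and double quotes.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the redaction API.

# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Assumption: the value contains no control characters (newlines, tabs, ...).
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print Build API instructions that mark every match of the given text.
text_redaction_instructions() {
  query=$(json_escape "$1")
  cat <<EOF
{
  "parts": [{ "file": "document" }],
  "actions": [
    {
      "type": "createRedactions",
      "strategy": "text",
      "strategyOptions": {
        "text": "$query",
        "caseSensitive": false
      }
    }
  ]
}
EOF
}
```

To use it, replace the inline JSON in the curl command above with -F instructions="$(text_redaction_instructions 'macaque')".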

Preset pattern redaction

The redaction API enables you to create redactions on top of text matching predefined patterns, such as email addresses, URLs, and more.

For our example, you can create redactions using the email-address preset to search for all occurrences of email addresses. To do this, add a document.pdf file to the same folder as your code. You can use any document containing text, or use our provided sample document.

Run the code, and you’ll get a result.pdf file with all occurrences of email addresses marked with redaction annotations.

For a complete list of supported presets, refer to our API reference.

Create redactions by searching for a pattern (basic API):

curl -X POST https://api.nutrient.io/processor/redact \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F file=@document.pdf \
  -F data='{
    "strategy": "preset",
    "strategyOptions": {
      "preset": "email-address"
    }
  }'

Advanced API:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F document=@document.pdf \
  -F instructions='{
    "parts": [
      {
        "file": "document"
      }
    ],
    "actions": [
      {
        "type": "createRedactions",
        "strategy": "preset",
        "strategyOptions": {
          "preset": "email-address"
        }
      }
    ]
  }'
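
Because actions is an array, a text search and a preset search can plausibly be combined in one request. The sketch below only prints such instructions. It assumes that the Build API accepts more than one createRedactions action and processes them in order, which this guide doesn’t state explicitly. The combined_instructions function name is hypothetical.

```shell
#!/bin/sh
# Sketch: two createRedactions actions in one set of Build API instructions.
# Assumption: several createRedactions actions may appear in "actions" and
# are processed in the order listed.
combined_instructions() {
  cat <<'EOF'
{
  "parts": [{ "file": "document" }],
  "actions": [
    {
      "type": "createRedactions",
      "strategy": "text",
      "strategyOptions": { "text": "macaque", "caseSensitive": false }
    },
    {
      "type": "createRedactions",
      "strategy": "preset",
      "strategyOptions": { "preset": "email-address" }
    }
  ]
}
EOF
}
```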

Regex pattern redaction

The redaction API enables you to create redactions on top of text matching a provided regular expression. This is the most versatile redaction creation strategy.

In the example below, you’ll create redactions using a regex pattern. Any text matching the pattern will be covered by redaction annotations. To create redactions, use the createRedactions action with a regex strategy. To do this, add a document.pdf file to the same folder as your code. You can use any document containing text, or use our provided sample document.

Run the code, and you’ll get a result.pdf file with all text matching the regex pattern marked with redaction annotations.

Create redactions by searching with a regular expression (basic API):

curl -X POST https://api.nutrient.io/processor/redact \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F file=@document.pdf \
  -F data='{
    "strategy": "regex",
    "strategyOptions": {
      "regex": "macaques?",
      "caseSensitive": false
    }
  }'

Advanced API:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F document=@document.pdf \
  -F instructions='{
    "parts": [
      {
        "file": "document"
      }
    ],
    "actions": [
      {
        "type": "createRedactions",
        "strategy": "regex",
        "strategyOptions": {
          "regex": "macaques?",
          "includeAnnotations": true,
          "caseSensitive": false
        }
      }
    ]
  }'
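
Before sending a pattern to the API, it can help to preview what it matches on some plain text you already have locally. The sketch below uses grep for that. It assumes a grep that supports the -o option (GNU and BSD grep do), and it assumes that grep’s extended regular expressions and the API’s regex engine agree on a simple pattern like this one; more advanced syntax may behave differently. The preview_matches function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the strings a pattern matches in a text file.
# $1: regular expression, $2: path to a plain-text file.
# -o prints only the matched parts, -i ignores case (like "caseSensitive": false),
# -E selects extended regular expressions.
preview_matches() {
  grep -oiE -e "$1" -- "$2"
}
```

For example, running preview_matches 'macaques?' sample.txt prints each match on its own line, where sample.txt is a placeholder for your own text file.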

Applying redactions

After you create redaction annotations, apply them to the document to permanently remove the covered content. You can achieve this by adding the applyRedactions action to the instructions.
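
As a rough sketch of what adding the action looks like, the instructions below list applyRedactions after a createRedactions action, so matches are marked and then removed in the same Build API request. The create_and_apply_instructions function name is hypothetical, and the sketch only prints the JSON.

```shell
#!/bin/sh
# Sketch: create redaction annotations, then apply them, in one request.
# applyRedactions is listed after createRedactions so the annotations exist
# by the time they are applied.
create_and_apply_instructions() {
  cat <<'EOF'
{
  "parts": [{ "file": "document" }],
  "actions": [
    {
      "type": "createRedactions",
      "strategy": "text",
      "strategyOptions": { "text": "macaque", "caseSensitive": false }
    },
    { "type": "applyRedactions" }
  ]
}
EOF
}
```

The output can be passed to the build endpoint with -F instructions="$(create_and_apply_instructions)". Note that this removes content without a review step.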

The basic redact API creates and applies redactions in a single step, so content is removed immediately. To review redactions before applying them, set redactionState to "stage". This creates redaction annotations without removing the underlying content.

To apply existing redaction annotations with the advanced API, use the result.pdf file from the previous example as the input. Make sure it’s in the same folder as your code.

Stage redactions instead of applying them automatically (basic API):

curl -X POST https://api.nutrient.io/processor/redact \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F file=@document.pdf \
  -F data='{
    "strategy": "text",
    "strategyOptions": {
      "text": "macaque",
      "caseSensitive": false
    },
    "redactionState": "stage"
  }'

Advanced API:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F document=@result.pdf \
  -F instructions='{
    "parts": [
      {
        "file": "document"
      }
    ],
    "actions": [
      {
        "type": "applyRedactions"
      }
    ]
  }'
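
Since each command passes --fail, curl returns a nonzero exit status for most HTTP error responses, so a script can check that status before using result.pdf. As an extra sanity check, the sketch below tests whether a file starts with the PDF signature. It assumes a head command that supports the -c option, and the is_pdf function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file begins with "%PDF-",
# the signature found at the start of PDF files.
is_pdf() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

A script could run the curl command, then call is_pdf result.pdf, and continue only if both succeed. This confirms that the output is a PDF, not that every intended region was redacted.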