1.16 release notes
Before attempting to upgrade to Document Engine 1.16, make sure your application runs as expected on your current version. If you’re on version 1.6.1 or later, you can upgrade directly to 1.16. If you’re on an earlier version, follow the step-by-step upgrade path outlined in our upgrade guide.
Highlights
Three changes ship in 1.16, all of which are outlined below.
Clustering
Multiple Document Engine nodes can now form a cluster that automatically routes document requests using consistent hashing for better scalability. See the horizontal scaling guide for details.
Password protection removal for PDF exports
The /api/documents/{documentId}/pdf endpoint now accepts a remove_password_protection option to strip password protection and security restrictions from downloaded PDFs.
Faster updates for unsigned PDFs
Document Engine now skips unnecessary digital signature refreshes after syncing documents that are known to be unsigned. This speeds up annotation, comment, bookmark, and form operations, especially for large PDFs.
Breaking changes
There are two breaking changes — read about both before upgrading.
Structured logging emits flattened fields by default
When LOG_STRUCTURED=true, structured logs now emit flattened fields by default. Set LOG_STRUCTURED_FLATTEN=false to keep nested JSON fields.
Nutrient Web SDK (server-backed) keys are no longer supported
Document Engine no longer starts with Nutrient Web SDK (server-backed) activation keys or legacy server-backed/Processor license keys, including legacy offline license keys.
If one of these keys is configured through ACTIVATION_KEY or LICENSE_KEY, Document Engine logs an error and fails to start.
This also affects existing installations that previously activated with a Nutrient Web SDK (server-backed) or Processor license and keep the activated license in the database or persistent storage. If no replacement key is configured, Document Engine detects the stored legacy license during startup, logs an error, and fails to start.
Before upgrading, replace any Nutrient Web SDK (server-backed) activation key or legacy server-backed/Processor license key with a Document Engine activation key or offline license key.
Document Engine activation keys must start with the 3- prefix. You can get Document Engine activation keys and offline license keys from the Nutrient Portal.
If an upgraded installation is already stuck on a stored legacy license, clear the stored activated license from the database or persistent storage, configure a Document Engine activation key or offline license key, and start Document Engine again.
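As a pre-upgrade sanity check, the prefix rule above can be tested in a few lines. This is a minimal sketch, not an official validation tool: the function name is hypothetical, it only checks the documented 3- prefix, and it says nothing about whether the key is otherwise valid. Offline license keys aren’t covered, because these notes don’t describe their format.

```python
import os


def has_document_engine_prefix(activation_key: str) -> bool:
    """Return True if the key starts with the documented "3-" prefix."""
    return activation_key.strip().startswith("3-")


if __name__ == "__main__":
    # ACTIVATION_KEY is the variable named in these notes.
    key = os.environ.get("ACTIVATION_KEY", "")
    if not key:
        print("ACTIVATION_KEY is not set")
    elif has_document_engine_prefix(key):
        print("ACTIVATION_KEY has the 3- prefix")
    else:
        print("ACTIVATION_KEY lacks the 3- prefix; replace it before upgrading")
```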
Deprecations
Core OCR is deprecated and will be removed in a future release. GdPicture OCR is the default and recommended engine, so most teams won’t need to do anything.
The x-pspdfkit-ocr-engine request header and the OCR_ENGINE=core configuration value are also deprecated. No replacement OCR engine selector will be added, because Document Engine will use a single OCR engine after Core OCR is removed.
Clustering
Document Engine now supports clustering multiple nodes into a single logical deployment.
What it does
When clustering is active, document-scoped API requests are automatically routed to the node responsible for a given document using consistent hashing. This improves cache locality and throughput because each document is primarily served by one node rather than all of them competing for the same resources.
Nodes discover each other automatically via a configurable discovery method. In Kubernetes deployments, DNS-based discovery finds peers using a headless service.
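To make the routing idea concrete, here is a toy consistent-hash ring in Python. It illustrates the general technique only and isn’t Document Engine’s implementation: the class name, the use of SHA-256, and the number of virtual nodes are all assumptions chosen for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owners = {}  # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        # Each node is placed on the ring at several positions.
        for i in range(self.vnodes):
            point = self._hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def owner(self, document_id: str) -> str:
        """Return the node whose ring position follows the document's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, self._hash(document_id))
        return self._owners[self._points[idx % len(self._points)]]
```

In a ring like this, removing a node reassigns only the documents that node owned. Every other document keeps its owner, which is what preserves cache locality when the set of nodes changes.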
When to use it
Clustering is useful when a single Document Engine instance is no longer sufficient to handle your workload. If you’re experiencing high CPU or memory pressure during document operations, adding more nodes with clustering enabled allows the load to be distributed across them.
What changed in this release
- The node join protocol now prevents requests from being routed to a node before it’s ready to serve them. Nodes go through a readiness handshake before entering the hash ring.
- All clustering configuration now lives in the standard configuration schema. Clustering environment variables are documented in the configuration reference.
- HTTP/2 shared rendering is now cluster-aware. Tile requests previously bypassed cluster routing because they run outside the standard router pipeline; they’re now resolved to the document’s owner node, so shared rendering and clustering can be used together without routing conflicts.
Getting started
Clustering is disabled by default. To enable it, set CLUSTERING_ENABLED to true and choose a discovery method in your environment configuration. Refer to the horizontal scaling guide for an overview, and to the configuration reference for details.
Operating clustering
On a healthy join, each node logs that the local endpoint is ready and then logs "Cluster handshake completed peer=<node>" as peers become routable. During rolling restarts, "Removing it from the ring" messages are expected when an old pod exits, as long as they’re followed by "Discovered new node" and handshake completion messages for the replacement pod.
When the Prometheus exporter is enabled, clustering publishes:
- cluster_ring_size with the current number of routable nodes
- cluster_peer_changes_total labeled by event=added|removed
- cluster_redirect_total labeled by redirect outcome
- cluster_redirect_retries_total for stale-ring redirect retries
- cluster_task_supervisor_overloaded_total when the clustering task supervisor reaches its child limit
Each cluster metric carries an erlang_node label with the full Erlang node name (e.g. pssync@10.2.130.162).
Use cluster_ring_size as the first health signal. For example, alert when the minimum ring size over several minutes is lower than the expected replica count. If routing looks wrong, inspect redirect failures first. Then compare peer-change churn and ring size across nodes.
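The ring-size check described above can be sketched as a small Python helper. It assumes you have already collected recent cluster_ring_size samples per erlang_node label from your metrics backend; the function name and the input shape are hypothetical, and in practice you would more likely express this as an alert rule in your monitoring system.

```python
from typing import Dict, List, Sequence


def nodes_below_expected(
    ring_sizes_by_node: Dict[str, Sequence[int]],
    expected_replicas: int,
) -> List[str]:
    """Return nodes whose minimum ring size in the window is below the expected count."""
    flagged = []
    for node, samples in ring_sizes_by_node.items():
        # Assumption: a node with no samples in the window is treated as unhealthy.
        if not samples or min(samples) < expected_replicas:
            flagged.append(node)
    return sorted(flagged)
```

Comparing the result across nodes shows whether a single node sees a smaller ring than its peers or whether the whole cluster has shrunk.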
Troubleshooting
Clustering has been tested in representative deployments, but every topology and failure scenario is different. If you encounter unexpected request routing, nodes failing to join the cluster, or degraded performance after enabling clustering, disable it by unsetting CLUSTERING_ENABLED and restarting your nodes.
If you run into any issues, get in touch with Nutrient Support.
Configuration options
TILE_MAX_SCALE no longer has a default limit. If unset, rendered tile requests aren’t rejected based on tile scale.
Set TILE_MAX_SCALE explicitly to bound the maximum allowed ratio between requested tile dimensions and page dimensions.
Dashboard
The dashboard now displays the configured maximum upload size and provides clearer error feedback when document uploads fail. A small HTTP/2 compatibility issue is now fixed.
Document upload and creation improvements
There have been two improvements to the upload and creation path.
Null-byte protection for annotations, bookmarks, and other records
Records persisted to PostgreSQL through the upstream API — annotations, bookmarks, comments, form fields, and form field widgets — now have their content sanitized for null bytes before being written to the database. Previously, content containing null bytes caused PostgreSQL to reject the insert with a 22P05 (untranslatable_character) error that surfaced to clients as an empty 500 response with no actionable information.
The sanitizer applies the same rules already used elsewhere for document properties such as titles and OCG layer names:
- Null bytes at the beginning or end of a value are stripped.
- Inner null bytes are replaced with spaces.
- A run of consecutive inner null bytes is collapsed to a single space.
For example, an annotation submitted as:
    {
      "content": {
        "type": "pspdfkit/text",
        "creatorName": "\u0000Alice\u0000",
        "note": "hello\u0000\u0000world"
      }
    }

is now accepted and persisted as:

    {
      "content": {
        "type": "pspdfkit/text",
        "creatorName": "Alice",
        "note": "hello world"
      }
    }

This applies to both the JSON POST /api/documents/:document_id/annotations path and to annotations extracted from uploaded PDFs whose /Contents or /T fields contain null bytes.
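The sanitization rules listed above can be sketched in Python as follows. This is an illustration of the documented behavior, not the actual implementation: the function names are hypothetical, and applying the rules recursively to every string value in a record is an assumption made for the example.

```python
import re

_INNER_NULLS = re.compile(r"\x00+")


def sanitize_null_bytes(value: str) -> str:
    """Strip leading/trailing null bytes and collapse inner runs to one space."""
    trimmed = value.strip("\x00")
    return _INNER_NULLS.sub(" ", trimmed)


def sanitize_record(record):
    """Apply sanitize_null_bytes to every string in a JSON-like structure."""
    # Assumption: all nested string values are sanitized the same way.
    if isinstance(record, str):
        return sanitize_null_bytes(record)
    if isinstance(record, dict):
        return {key: sanitize_record(value) for key, value in record.items()}
    if isinstance(record, list):
        return [sanitize_record(value) for value in record]
    return record
```

Applied to the annotation above, sanitize_record yields "Alice" for creatorName and "hello world" for note.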
Structured JSON error responses for unhandled database errors
When an unexpected Postgrex.Error reaches the global error handler, the response is now a structured JSON body with the SQLSTATE code instead of an empty body. The HTTP status remains 500 for most errors, but PostgreSQL data-exception errors (SQLSTATE class 22, e.g. malformed Unicode) now return 422.
Here’s an example response when invalid input still reaches PostgreSQL:
    {
      "error": {
        "code": "22P05",
        "message": "Invalid input data."
      }
    }

Raw PostgreSQL message and detail fields are intentionally withheld from the response body to avoid leaking schema and constraint metadata. The full exception is still logged server-side for debugging.
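On the client side, the structured body makes it possible to tell rejected input apart from other database failures. The sketch below mirrors the status rule described above (SQLSTATE class 22 maps to 422, everything else to 500) and parses the example response. Both function names are hypothetical, and the wording of the returned strings is an assumption for illustration.

```python
import json


def status_for_sqlstate(sqlstate: str) -> int:
    """Mirror the documented rule: class 22 (data exception) is 422, otherwise 500."""
    return 422 if sqlstate.startswith("22") else 500


def describe_database_error(status: int, body: str) -> str:
    """Summarize a structured error response for logs or user feedback."""
    error = json.loads(body)["error"]
    code = error["code"]
    if status == 422 and code.startswith("22"):
        # The request data itself was rejected; retrying unchanged won't help.
        return f"rejected input (SQLSTATE {code}): {error['message']}"
    return f"server error (SQLSTATE {code}): {error['message']}"
```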
Email conversion language and timezone
Document Engine now supports selecting the language and timezone used for EML- and MSG-to-PDF conversion. Use the email_language parameter on a Build API file part to pass a BCP 47 language tag. The language controls translated email header labels and date/time formatting in the generated PDF.
Use email_timezone to pass an IANA timezone name for email timestamp rendering.
When email_language is omitted, Document Engine uses en-US. When email_timezone is omitted, Document Engine uses UTC.
Examples
Use the default English formatting explicitly:
    {
      "parts": [
        {
          "file": "email",
          "email_language": "en-US",
          "email_timezone": "UTC"
        }
      ]
    }

This renders labels such as From, To, and Subject, and it formats a UTC timestamp like Thursday, August 15, 2019 5:54:37 AM.
Use German formatting:
    {
      "parts": [
        {
          "file": "email",
          "email_language": "de-DE",
          "email_timezone": "Europe/Prague"
        }
      ]
    }

This renders labels such as Von, An, and Betreff, and it formats a Prague timestamp like Donnerstag, 15. August 2019 07:54:37.
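If you generate Build API instructions in code, the two parameters can be added conditionally so that omitted values fall back to the server defaults (en-US and UTC). The helper names below are hypothetical, and the sketch only builds the JSON shown in the examples above; it doesn’t send a request.

```python
import json
from typing import Optional


def email_part(
    file_name: str,
    language: Optional[str] = None,
    timezone: Optional[str] = None,
) -> dict:
    """Build a file part, adding email_language/email_timezone only when given."""
    part = {"file": file_name}
    if language is not None:
        part["email_language"] = language  # BCP 47 tag, e.g. "de-DE"
    if timezone is not None:
        part["email_timezone"] = timezone  # IANA name, e.g. "Europe/Prague"
    return part


def build_instructions(*parts: dict) -> str:
    """Serialize the parts into the instructions JSON used in the examples."""
    return json.dumps({"parts": list(parts)}, indent=2)
```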
Password protection removal for PDF exports
Document Engine now supports stripping password protection and PDF security restrictions when downloading a document as a PDF. Pass remove_password_protection as a query parameter on GET requests or as a JSON body field on POST requests to the /api/documents/{documentId}/pdf endpoint.
If the document was uploaded with an owner password, include it in the PSPDFKit-Pdf-Password header. Attempting to strip security without the owner password returns 403.
This option cannot be combined with source. To set new passwords on a PDF, use the Build API.
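The two request shapes described above can be sketched with Python’s standard library. The function name and the base URL are assumptions, and any authentication your deployment requires is left out. The sketch only builds the request object; pass it to urllib.request.urlopen to send it.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote, urlencode


def pdf_export_request(
    base_url: str,
    document_id: str,
    owner_password: Optional[str] = None,
    use_post: bool = False,
) -> urllib.request.Request:
    """Build a PDF export request that asks for password protection to be removed."""
    url = f"{base_url.rstrip('/')}/api/documents/{quote(document_id, safe='')}/pdf"
    headers = {}
    if owner_password is not None:
        # Needed when the document was uploaded with an owner password.
        headers["PSPDFKit-Pdf-Password"] = owner_password
    if use_post:
        headers["Content-Type"] = "application/json"
        body = json.dumps({"remove_password_protection": True}).encode("utf-8")
        return urllib.request.Request(url, data=body, headers=headers, method="POST")
    query = urlencode({"remove_password_protection": "true"})
    return urllib.request.Request(f"{url}?{query}", headers=headers, method="GET")
```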
Examples
Query parameter (GET):
    GET /api/documents/{documentId}/pdf?remove_password_protection=true

JSON body (POST), with owner password:

    POST /api/documents/{documentId}/pdf
    Content-Type: application/json
    PSPDFKit-Pdf-Password: ownerpassword123

    { "remove_password_protection": true }

Performance
This release improves syncing performance for unsigned PDFs. When a document is known to be unsigned, Document Engine now skips an unnecessary digital signature refresh after syncing. Previously, that refresh could trigger expensive work, including regenerating the current PDF.
This improves:
- Annotation creation, updates, and deletion
- Comment creation
- Bookmark creation, updates, and deletion
- Form field and form value updates
The improvement is most noticeable for large PDFs and documents with many annotations or pages.
Measured improvements
All benchmarks used synthetic files in a development environment. Production results will vary depending on hardware, concurrent load, and document characteristics.
For representative sync operations on unsigned documents:
- 100-page form-heavy file — 238 ms ⇒ 98 ms (59 percent faster)
- 100-page annotation-heavy file — 233 ms ⇒ 36 ms (85 percent faster)
- 1,000-page file — 292 ms ⇒ 54 ms (82 percent faster)
- 10,000-page file — 2.6 s ⇒ 208 ms (92 percent faster)
- 500 MB file — 8.0 s ⇒ 46 ms (99 percent faster)
Reduced rendering worker churn
This release bounds the cleanup step that runs at the end of each worker session with its own short deadline, so workers are no longer recycled when cleanup gets squeezed out by an exhausted request deadline. This avoids unnecessary restarts without affecting user-facing request behavior.
Faster follow-up rendering and processing after upload
Document Engine now primes the local source PDF cache during upload, improving responsiveness for follow-up operations. Previously, the first follow-up operation — such as rendering a page, generating thumbnails or tiles, or running another PDF operation — could incur extra work to fetch or rematerialize the source file. By warming that cache during upload, common upload-then-view and upload-then-process workflows now complete faster.
Word-based search
Document Engine now supports a new word_based search type on the /search endpoint and as a redaction strategy on the /redactions, /redact, and Build API createRedactions endpoints. Word-based search ignores whitespace between characters — including line breaks, tabs, and Unicode space variants — so phrases that wrap across lines, tables, or columns can be matched as a single hit. Punctuation must still match exactly. The search is case-insensitive by default.
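The matching rule can be approximated in a few lines of Python: drop all whitespace from both the query and the text, fold case, and look for the query as a substring. This is a simplified model of the described behavior and not the engine’s actual algorithm; the function names are hypothetical, and the sketch reports only whether a match exists, not where it is.

```python
def _strip_whitespace(text: str) -> str:
    # str.isspace() covers line breaks, tabs, and Unicode space variants.
    return "".join(ch for ch in text if not ch.isspace())


def word_based_match(query: str, text: str) -> bool:
    """Return True if query occurs in text when whitespace and case are ignored."""
    needle = _strip_whitespace(query).casefold()
    haystack = _strip_whitespace(text).casefold()
    # Punctuation is left untouched, so it must match exactly.
    return bool(needle) and needle in haystack
```

With this model, the query "Smith, John" matches text containing "Smith,\nJohn" but not "Smith John", because the comma in the query must appear in the text.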
Example — search request:
    GET /api/documents/{documentId}/search?q=Smith,%20John&type=word_based

This will match Smith,\nJohn in the document text (newline ignored, comma preserved on both sides), but will not match Smith John (the comma in the query has no counterpart in the document).
Example — redaction request:
    POST /api/documents/{documentId}/redactions

    {
      "strategy": "word_based",
      "strategyOptions": {
        "word_based": "Smith, John"
      }
    }

Structured logging
Structured logging now emits metadata as top-level dotted fields by default. This is useful for OpenTelemetry collector pipelines and log backends that index top-level attributes for filtering and table columns.
For example, with flattening enabled, structured logs emit fields like this:
    {
      "message": "PSPDFKit transport error",
      "meta.event": "pspdfkit_transport_error",
      "meta.phase": "recv",
      "location.file": "lib/shared/pspdfkit/protocol.ex",
      "location.line": 1577
    }

When LOG_STRUCTURED=true, flattening is now enabled by default. Set LOG_STRUCTURED_FLATTEN=false to keep nested JSON fields.
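If a downstream consumer still receives nested logs from older nodes, the same dotted layout can be produced with a small helper. This Python sketch shows the general flattening technique; the function name is hypothetical, and the nested input shape is an assumption inferred from the dotted field names in the example above.

```python
def flatten(fields: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into top-level keys joined with dots."""
    flat = {}
    for key, value in fields.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat
```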
Database migrations
There are no database migrations in this release.