PDF content editing API for Web

Nutrient Web SDK provides a content editing API that lets you programmatically edit text in PDF documents. You can modify, reposition, and resize existing text blocks while preserving the original document structure and formatting.

License requirements

Using content editing requires a license that includes the Content Editor component. Without the proper license, calling methods such as beginContentEditingSession() will throw an error.

Text editing is available when using Web SDK with Document Engine. For more information, refer to the operational mode guide.

Contact Sales to add the Content Editor component to your license.
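
Because beginContentEditingSession() throws when the license doesn't include the Content Editor component, it can help to guard the call. The following is a minimal sketch of one way to do that. The helper name tryBeginContentEditingSession is hypothetical, and since this guide doesn't specify the type or message of the thrown error, the sketch treats any error as "content editing unavailable":

```javascript
// Hypothetical helper: returns a session, or null if one could not be started.
// Assumption: any error thrown by beginContentEditingSession() (for example,
// a license without the Content Editor component) means editing is unavailable.
async function tryBeginContentEditingSession(instance) {
  try {
    return await instance.beginContentEditingSession();
  } catch (error) {
    console.warn("Content editing is unavailable:", error);
    return null;
  }
}
```

A caller can then check the result for null and, for example, hide its own editing controls instead of failing.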

Prerequisites

Before you get started, make sure Nutrient Web SDK is up and running.

Beginning a content editing session

Content editing operations are performed within a dedicated editing session. This session-based approach ensures data consistency and enables you to make multiple changes before committing them to the document.

To start a content editing session, call the Instance#beginContentEditingSession() method on the viewer instance, as demonstrated in the code snippet below. This method returns a session object that provides access to content editing operations.

// Start a content editing session.
const session = await instance.beginContentEditingSession();

Detecting text blocks

Text blocks represent individual paragraphs or text elements that can be retrieved and modified independently. To retrieve the text blocks on a specific page, call the ContentEditingSession#getTextBlocks(pageIndex) method, as demonstrated in the code snippet below. This method returns an array of the text blocks found on the specified page:

// Get all text blocks on the first page (page index 0).
const textBlocks = await session.getTextBlocks(0);
console.log(`Found ${textBlocks.length} text blocks on page 1`);
textBlocks.forEach((block, index) => {
  console.log(`Block ${index + 1}:`);
  console.log(` ID: ${block.id}`);
  console.log(` Text: ${block.text}`);
  console.log(` Position: (${block.anchor.x}, ${block.anchor.y})`);
  console.log(` Max Width: ${block.maxWidth}`);
  console.log(` Bounding Box: ${JSON.stringify(block.boundingBox)}`);
});

Each text block contains the following properties:

  • id — Unique identifier for the text block. Since a PDF doesn’t have a concept of IDs, these are generated by the SDK in a deterministic way.
  • text — The text content of the block. If it’s a multiline block, it’ll contain all lines concatenated with newline characters.
  • anchor — Position coordinates (x, y) of the text block anchor point. This is the point where the text block is anchored in the PDF coordinate system, typically the top-left corner, adjusted for PDF internal offset.
  • maxWidth — Maximum width constraint for the text block.
  • boundingBox — Current bounding rectangle with top, left, width, and height, in PDF points. This is useful for creating overlays or annotations that match the text block’s position.
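
As an illustration of working with these properties, the sketch below selects blocks by their text and derives a slightly enlarged rectangle from a block's bounding box, which could serve as the basis for an overlay. Both helper names are hypothetical and not part of the SDK. The code only assumes the property shapes described above:

```javascript
// Hypothetical helper: returns the text blocks whose text contains `needle`.
function findBlocksContaining(textBlocks, needle) {
  return textBlocks.filter((block) => block.text.includes(needle));
}

// Hypothetical helper: grows a block's bounding box by `padding` PDF points
// on every side, keeping the same { top, left, width, height } shape.
function paddedBoundingBox(block, padding) {
  const { top, left, width, height } = block.boundingBox;
  return {
    top: top - padding,
    left: left - padding,
    width: width + 2 * padding,
    height: height + 2 * padding
  };
}
```

For example, findBlocksContaining(await session.getTextBlocks(0), "Invoice") would return only the blocks on the first page that mention "Invoice".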

Updating text blocks

To update the text content, position, and maximum width of existing text blocks, use the ContentEditingSession#updateTextBlocks(textBlocks) method as demonstrated in the code snippet below. This method accepts an array of objects representing the text blocks to update, where each object contains the block’s id and the properties you want to change.

// Get text blocks from the first page.
const textBlocks = await session.getTextBlocks(0);
const firstBlock = textBlocks[0];
// Update a block.
await session.updateTextBlocks([
  {
    id: firstBlock.id,
    text: "This is the new text content",
    anchor: { x: 100, y: 200 },
    maxWidth: 300
  }
]);

Updating a text block will only stage the changes in the session. Call the commit() method to apply these changes to the document, or the discard() method to cancel the staged changes.

To update a text block, provide the following properties in the update object:

  • id — Required identifier of the text block to update, which you can obtain from the getTextBlocks(pageIndex) method.
  • text — Optional new text content for the block. If you provide this, the existing text will be replaced with the new content.
  • anchor — Optional new position coordinates (x, y) for the text block anchor point. This enables you to reposition the text block within the PDF.
  • maxWidth — Optional new maximum width for the text block. This controls how wide the text can be before it wraps to the next line.

Changing the maximum width may affect text wrapping and layout. If the new width is narrower than the current text, the text wraps to fit within the new constraint. If the new width is wider than the current text, the excess width isn’t persisted, so the text block won’t include the extra space the next time you load the document.
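
Since every property except id is optional, an update object can carry only the property being changed. The sketch below builds such a partial update that moves a block relative to its current anchor while leaving text and maxWidth untouched. The helper name shiftBlock is hypothetical. Which direction a positive dy moves the block depends on the orientation of the coordinate system, which this guide doesn't state, so verify it against your document:

```javascript
// Hypothetical helper: builds a partial update that offsets a block's anchor
// by (dx, dy) and changes nothing else.
// Assumption: the direction of the y-axis is not stated in this guide.
function shiftBlock(block, dx, dy) {
  return {
    id: block.id,
    anchor: { x: block.anchor.x + dx, y: block.anchor.y + dy }
  };
}
```

The result can be passed straight to the session, for example: await session.updateTextBlocks([shiftBlock(textBlocks[0], 10, 0)]).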

Batch updates

To update multiple text blocks in a single operation for better performance, use the updateTextBlocks(textBlocks) method as demonstrated in the code snippet below:

const textBlocks = await session.getTextBlocks(0);
// Update multiple text blocks at once.
await session.updateTextBlocks([
  {
    id: textBlocks[0].id,
    text: "Updated first block"
  },
  {
    id: textBlocks[1].id,
    maxWidth: 250
  },
  {
    id: textBlocks[2].id,
    anchor: { x: 200, y: 300 },
    maxWidth: 400
  }
]);
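
A common use of batch updates is find-and-replace. The sketch below stages one batched update per page for every block that contains a search string. The helper name replaceTextInSession and its pageCount parameter are hypothetical, and the caller supplies the number of pages. This guide doesn't say whether a single updateTextBlocks() call may mix blocks from different pages, so the sketch conservatively issues one call per page:

```javascript
// Hypothetical helper: stages a text replacement in every matching block on
// pages 0..pageCount-1 and returns the number of blocks staged for update.
// It does not commit; call session.commit() or session.discard() afterward.
async function replaceTextInSession(session, pageCount, search, replacement) {
  if (search === "") {
    return 0; // Nothing sensible to replace.
  }
  let updatedCount = 0;
  for (let pageIndex = 0; pageIndex < pageCount; pageIndex++) {
    const textBlocks = await session.getTextBlocks(pageIndex);
    const updates = textBlocks
      .filter((block) => block.text.includes(search))
      .map((block) => ({
        id: block.id,
        // Replace every occurrence of `search` within the block.
        text: block.text.split(search).join(replacement)
      }));
    if (updates.length > 0) {
      await session.updateTextBlocks(updates); // One batch per page.
      updatedCount += updates.length;
    }
  }
  return updatedCount;
}
```

Because the helper only stages changes, the caller can inspect the returned count before deciding whether to commit or discard.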

Committing changes

The ContentEditingSession#commit() method saves all staged changes to the document and closes the session, as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();
// Make your changes.
await session.updateTextBlocks([
  /* ... */
]);
// Save changes and close session.
await session.commit();

Discarding changes

The ContentEditingSession#discard() method cancels all staged changes and closes the session without saving, as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();
// Make your changes.
await session.updateTextBlocks([
  /* ... */
]);
// Cancel changes without saving.
await session.discard();
// Session is now inactive, changes are lost.
console.log(session.active); // false
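
Committing and discarding can be combined into a single pattern: commit when the edits succeed, and discard when anything fails. The following is a sketch of that pattern, not an SDK feature. The wrapper name withContentEditingSession and its edit callback are hypothetical:

```javascript
// Hypothetical wrapper: runs `edit(session)`, then commits. If the edit or
// the commit throws, staged changes are discarded and the error is rethrown.
async function withContentEditingSession(instance, edit) {
  const session = await instance.beginContentEditingSession();
  try {
    await edit(session);
    await session.commit();
  } catch (error) {
    // Only discard if the session is still open.
    if (session.active) {
      await session.discard();
    }
    throw error;
  }
}
```

A call might look like: await withContentEditingSession(instance, async (session) => { /* stage updates here */ }).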

Checking session status

To check if a session is still active, use the session.active property as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();
console.log(session.active); // true
await session.discard();
console.log(session.active); // false

A content editing session is automatically deactivated if the document changes or a UI content editing session begins.
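
Because a session can be deactivated outside your control, it may be worth checking session.active immediately before committing. The sketch below shows one defensive approach. The helper name commitIfActive is hypothetical, and this guide doesn't describe what commit() does on an inactive session, so the sketch simply avoids calling it in that case:

```javascript
// Hypothetical helper: commits only if the session is still active.
// Returns true if commit() was called, false if the session was inactive.
async function commitIfActive(session) {
  if (!session.active) {
    return false;
  }
  await session.commit();
  return true;
}
```

A false result indicates that the staged changes were not committed by this call, so the caller may need to start a new session and apply the edits again.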

Complete example

const session = await instance.beginContentEditingSession();
// Get text blocks from the first page.
const textBlocks = await session.getTextBlocks(0);
// Update the first text block.
await session.updateTextBlocks([
  {
    id: textBlocks[0].id,
    text: "This is the updated text content",
    anchor: { x: 100, y: 200 },
    maxWidth: 300
  }
]);
// Commit changes to the document.
await session.commit();
console.log(session.active); // false

Caveats

  • Only one content editing session can be active at a time.
  • UI content editing and API content editing cannot be used simultaneously.
  • The API session is automatically deactivated if the document changes or UI content editing begins.