PDF content editing API for Node.js

Nutrient Node.js SDK provides a content editing API for programmatically editing text in PDF documents. With it, you can modify, reposition, and resize existing text blocks while preserving the original document structure and formatting.

Prerequisites

Before you get started, make sure Nutrient Node.js SDK is up and running.

The code examples in this guide read a sample PDF from disk (for example, 4-page-example-document.pdf). You can substitute any PDF document of your own.

License requirements

Using the content editing feature of the SDK requires a separate license that includes the Content Editor API component. Without the proper license, calling methods such as beginContentEditingSession() will throw an error.

Contact Sales to add it to your license.
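Because a missing license surfaces as a thrown error, it can help to guard the call that starts a session. The following is a minimal sketch, not part of the SDK: tryBeginContentEditingSession is a hypothetical helper name, and because this guide doesn't specify the error type or message, the sketch only logs whatever it receives.

```javascript
// Hypothetical helper (not part of the SDK): starts a content editing session
// and returns null instead of throwing when the session can't be created,
// for example because the license lacks the content editing component.
async function tryBeginContentEditingSession(instance) {
  try {
    return await instance.beginContentEditingSession();
  } catch (error) {
    // Assumption: the thrown value is usually an Error; fall back to String().
    const message = error instanceof Error ? error.message : String(error);
    console.error(`Content editing is unavailable: ${message}`);
    return null;
  }
}
```

A caller would pass the instance returned by load() and skip the editing steps when the helper returns null.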

Beginning a content editing session

Content editing operations are performed within a dedicated editing session. This session-based approach ensures data consistency and enables you to make multiple changes before committing them to a document.

To start a content editing session, use the beginContentEditingSession() method on the document instance as demonstrated in the code snippet below. This method returns a session object that provides access to content editing operations.

import fs from "node:fs";
import { load } from "@nutrient-sdk/node";

const documentBuffer = fs.readFileSync("4-page-example-document.pdf");
const instance = await load({ document: documentBuffer });

// Start a content editing session.
const session = await instance.beginContentEditingSession();

Detecting text blocks

Text blocks represent individual paragraphs or text elements that can be obtained and modified independently. To retrieve text blocks from a specific page, use the getTextBlocks(pageIndex) method of the content editing session as demonstrated in the code snippet below. This method returns an array of text blocks found on the specified page:

import fs from "node:fs";
import { load } from "@nutrient-sdk/node";

const documentBuffer = fs.readFileSync("4-page-example-document.pdf");
const instance = await load({ document: documentBuffer });

const session = await instance.beginContentEditingSession();

// Get all text blocks on the first page (page index 0).
const textBlocks = await session.getTextBlocks(0);

console.log(`Found ${textBlocks.length} text blocks on page 1`);

textBlocks.forEach((block, index) => {
  console.log(`Block ${index + 1}:`);
  console.log(`  ID: ${block.id}`);
  console.log(`  Text: ${block.text}`);
  console.log(`  Position: (${block.anchor.x}, ${block.anchor.y})`);
  console.log(`  Max Width: ${block.maxWidth}`);
  console.log(`  Bounding Box: ${JSON.stringify(block.boundingBox)}`);
});

Each text block contains the following properties:

  • id — Unique identifier for the text block. Since a PDF doesn’t have a concept of IDs, these are generated by the SDK in a deterministic way.

  • text — The text content of the block. If it’s a multiline block, it’ll contain all lines concatenated with newline characters.

  • anchor — Position coordinates (x, y) of the text block anchor point. This is the point where the text block is anchored in the PDF coordinate system — typically the top-left corner, adjusted for PDF internal offset.

  • maxWidth — Maximum width constraint for the text block.

  • boundingBox — Current bounding rectangle with top, left, width, and height, in PDF points. This is useful for creating overlays or annotations that match the text block’s position.
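The properties listed above are plain data, so they can be filtered and transformed with ordinary JavaScript. The sketch below assumes only the block shape described in this guide. Both function names are hypothetical and not part of the SDK.

```javascript
// Hypothetical helpers (not part of the SDK) that operate on the plain
// objects returned by getTextBlocks(pageIndex).

// Returns the blocks whose text contains `needle`, ignoring case.
function findBlocksContaining(textBlocks, needle) {
  const lowered = needle.toLowerCase();
  return textBlocks.filter((block) =>
    block.text.toLowerCase().includes(lowered)
  );
}

// Returns a copy of a { top, left, width, height } bounding box grown by
// `padding` points on every side, e.g. to size an overlay around a block.
function padBoundingBox(boundingBox, padding) {
  return {
    left: boundingBox.left - padding,
    top: boundingBox.top - padding,
    width: boundingBox.width + 2 * padding,
    height: boundingBox.height + 2 * padding
  };
}
```

For example, findBlocksContaining(textBlocks, "invoice") narrows a page's blocks to the ones worth editing, and padBoundingBox(block.boundingBox, 2) gives a slightly larger rectangle for a highlight.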

Updating text blocks

To update the text content, position, or maximum width of existing text blocks, use the updateTextBlocks(textBlocks) method, as demonstrated in the code snippet below. This method accepts an array of update objects, each containing the block’s id and the properties you want to change:

import fs from "node:fs";
import { load } from "@nutrient-sdk/node";

const documentBuffer = fs.readFileSync("document.pdf");
const instance = await load({ document: documentBuffer });

const session = await instance.beginContentEditingSession();

// Get text blocks from the first page.
const textBlocks = await session.getTextBlocks(0);
const firstBlock = textBlocks[0];

// Update a block.
await session.updateTextBlocks([
  {
    id: firstBlock.id,
    text: "This is the new text content",
    anchor: { x: 100, y: 200 },
    maxWidth: 300
  }
]);

Updating a text block only stages the changes in the session. Call the commit() method to apply the staged changes to the document, or call the discard() method to cancel them.

To update a text block, provide the following properties in the update object:

  • id — Required identifier of the text block to update, which you can obtain from the getTextBlocks(pageIndex) method.

  • text — Optional new text content for the block. If you provide this, the existing text will be replaced with the new content.

  • anchor — Optional new position coordinates (x, y) for the text block anchor point. This enables you to reposition the text block within the PDF.

  • maxWidth — Optional new maximum width for the text block. This controls how wide the text can be before it wraps to the next line.

Changing the maximum width may affect text wrapping and layout. If the new width is narrower than the current text, the text wraps to fit within the new constraint. If the new width is wider than the text needs, the excess width isn’t persisted, so the next time you load the document, the text block won’t include the extra space.
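Since only id is required and the other properties are optional, update objects can be generated from the blocks themselves. The following sketch builds text-only updates for a find-and-replace. buildReplacementUpdates is a hypothetical helper, not an SDK function, and it assumes the block shape returned by getTextBlocks(pageIndex).

```javascript
// Hypothetical helper (not part of the SDK): builds update objects that
// replace every occurrence of `search` with `replacement`. Only blocks that
// contain `search` get an update, and each update carries just id and text,
// leaving anchor and maxWidth unchanged.
function buildReplacementUpdates(textBlocks, search, replacement) {
  if (search === "") {
    return []; // Nothing meaningful to replace.
  }
  return textBlocks
    .filter((block) => block.text.includes(search))
    .map((block) => ({
      id: block.id,
      text: block.text.split(search).join(replacement)
    }));
}
```

The returned array has the shape that updateTextBlocks(textBlocks) accepts, so it could be passed to that method and then committed.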

Batch updates

To update multiple text blocks in a single operation for better performance, pass several update objects to one updateTextBlocks(textBlocks) call, as demonstrated in the code snippet below. Each update object can change a different set of properties:

import fs from "node:fs";
import { load } from "@nutrient-sdk/node";

const documentBuffer = fs.readFileSync("4-page-example-document.pdf");
const instance = await load({ document: documentBuffer });

const session = await instance.beginContentEditingSession();
const textBlocks = await session.getTextBlocks(0);

// Update multiple text blocks at once.
await session.updateTextBlocks([
  {
    id: textBlocks[0].id,
    text: "Updated first block"
  },
  {
    id: textBlocks[1].id,
    maxWidth: 250
  },
  {
    id: textBlocks[2].id,
    anchor: { x: 200, y: 300 },
    maxWidth: 400
  }
]);
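A batch doesn't have to be written by hand. As an illustration, the sketch below derives one update per block to move every block on a page by the same offset. buildShiftUpdates is a hypothetical helper, not part of the SDK, and it assumes each block exposes the anchor property described earlier.

```javascript
// Hypothetical helper (not part of the SDK): builds one update object per
// block that moves the block's anchor by (dx, dy). Text and maxWidth are
// left out of the updates, so they stay unchanged.
function buildShiftUpdates(textBlocks, dx, dy) {
  return textBlocks.map((block) => ({
    id: block.id,
    anchor: { x: block.anchor.x + dx, y: block.anchor.y + dy }
  }));
}
```

The whole array can then go into a single updateTextBlocks(textBlocks) call instead of one call per block.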

Committing changes

The commit() method saves all staged changes to the document and closes the session, as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();

// Make your changes.
await session.updateTextBlocks([
  /* ... */
]);

// Save changes and close the session.
await session.commit();

Discarding changes

The discard() method cancels all staged changes and closes the session without saving, as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();

// Make your changes.
await session.updateTextBlocks([
  /* ... */
]);

// Cancel changes without saving.
await session.discard();

// The session is now inactive, and the staged changes are lost.
console.log(session.active); // false

Checking session status

To check if a session is still active, use the session.active property as demonstrated in the code snippet below:

const session = await instance.beginContentEditingSession();

console.log(session.active); // true

await session.discard();

console.log(session.active); // false

Using document editing operations while a content editing session is active will discard any staged changes and close the session.
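Because a session can end by commit, by discard, or implicitly through another document operation, one option is to wrap the whole lifecycle in a single function. The sketch below is one possible pattern, not an SDK feature. withContentEditingSession is a hypothetical name, and the code relies only on beginContentEditingSession(), commit(), discard(), and the active property described in this guide.

```javascript
// Hypothetical helper (not part of the SDK): runs `edit` inside a content
// editing session, commits when it succeeds, and discards when it throws.
async function withContentEditingSession(instance, edit) {
  const session = await instance.beginContentEditingSession();
  try {
    await edit(session);
    // Something inside `edit` may already have closed the session, in which
    // case the staged changes are gone and there is nothing left to commit.
    if (!session.active) {
      throw new Error("Content editing session was closed before commit");
    }
    await session.commit();
  } catch (error) {
    // Discard staged changes only if the session is still open.
    if (session.active) {
      await session.discard();
    }
    throw error;
  }
}
```

A caller would pass a callback that reads blocks with session.getTextBlocks(pageIndex) and stages edits with session.updateTextBlocks(textBlocks). The helper then decides between commit() and discard().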

Complete example

import fs from "node:fs";
import { load } from "@nutrient-sdk/node";

const documentBuffer = fs.readFileSync("4-page-example-document.pdf");
const instance = await load({ document: documentBuffer });
const session = await instance.beginContentEditingSession();

// Get text blocks from the first page.
const textBlocks = await session.getTextBlocks(0);
// Update the first text block.
await session.updateTextBlocks([
  {
    id: textBlocks[0].id,
    text: "This is the updated text content",
    anchor: { x: 100, y: 200 },
    maxWidth: 300
  }
]);
// Commit changes to the document.
await session.commit();
console.log(session.active); // false

// Close the document instance.
await instance.close();