Node.js large document triage

This example automates a support-oriented triage workflow for large documents.

The script:

Uploads a local file
Extracts the returned document_id
Fetches document_info
Fetches properties
Fetches text from page 0
Optionally runs a search query
Renders page 0 as a PNG preview
Saves all outputs to disk

Prerequisites

A DWS Viewer API key in NUTRIENT_DWS_VIEWER_API_KEY
Node.js 18 or later
A local input file
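The prerequisites above can be checked up front so that a missing key or an old runtime is reported before any upload starts. The following is a minimal sketch, not part of the original script. `checkPrerequisites` is a hypothetical helper name, and it takes its inputs as arguments so it can be exercised without touching the real environment.

```javascript
// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the prerequisites listed above look satisfied.
function checkPrerequisites({ nodeVersion, env, inputPath }) {
  const problems = [];

  // process.versions.node looks like "18.19.0"; only the major version matters here.
  const major = Number.parseInt(String(nodeVersion).split(".")[0], 10);
  if (!Number.isInteger(major) || major < 18) {
    problems.push(`Node.js 18 or later is required (found ${nodeVersion})`);
  }

  // The script reads the API key from this environment variable.
  if (!env.NUTRIENT_DWS_VIEWER_API_KEY) {
    problems.push("NUTRIENT_DWS_VIEWER_API_KEY is not set");
  }

  // The input file path is the first command-line argument.
  if (!inputPath) {
    problems.push("No input file path was given");
  }

  return problems;
}
```

In the script, this could be called with `process.versions.node`, `process.env`, and `process.argv[2]` before the upload, printing each returned problem and exiting if the list is not empty.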

Complete example

Save the following script as triage.mjs. It runs with node and accepts command-line arguments for the input file path, the MIME type (default: application/pdf), and an optional search query:

import { mkdir, readFile, writeFile } from "node:fs/promises";
import path from "node:path";

const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;
const inputPath = process.argv[2];
const mimeType = process.argv[3] ?? "application/pdf";
const searchQuery = process.argv[4] ?? "";

if (!apiKey) {
  throw new Error("Missing NUTRIENT_DWS_VIEWER_API_KEY");
}

if (!inputPath) {
  throw new Error(
    "Usage: node triage.mjs <inputPath> [mimeType] [optionalSearchQuery]",
  );
}

// Read the whole file into memory; its raw bytes become the upload request body.
const fileBuffer = await readFile(inputPath);

const uploadResponse = await fetch("https://api.nutrient.io/viewer/documents", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": mimeType,
    "Content-Length": fileBuffer.length.toString(),
  },
  body: fileBuffer,
});

if (!uploadResponse.ok) {
  const errorText = await uploadResponse.text();
  throw new Error(`Upload failed: ${uploadResponse.status} ${errorText}`);
}

const uploadResult = await uploadResponse.json();
const documentId = uploadResult.data?.document_id;

if (!documentId) {
  throw new Error("No document_id found in upload response");
}

const outputDir = path.resolve(`./dws-triage-${documentId}`);
await mkdir(outputDir, { recursive: true });

// GET an API path and parse the response as JSON; throw on any non-2xx status.
const requestJson = async (pathname) => {
  const response = await fetch(`https://api.nutrient.io${pathname}`, {
    headers: {
      Authorization: `Bearer ${apiKey}`,
      Accept: "application/json",
    },
  });

  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(`${pathname} failed: ${response.status} ${errorText}`);
  }

  return response.json();
};

// GET an API path and return the raw response bytes (used for the PNG preview).
const requestBinary = async (pathname, accept) => {
  const response = await fetch(`https://api.nutrient.io${pathname}`, {
    headers: {
      Authorization: `Bearer ${apiKey}`,
      Accept: accept,
    },
  });

  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(`${pathname} failed: ${response.status} ${errorText}`);
  }

  return Buffer.from(await response.arrayBuffer());
};

const documentInfo = await requestJson(
  `/viewer/documents/${documentId}/document_info`,
);
const documentProperties = await requestJson(
  `/viewer/documents/${documentId}/properties`,
);
const firstPageText = await requestJson(
  `/viewer/documents/${documentId}/pages/0/text`,
);

// Run the search only when a query was passed on the command line.
let searchResults = null;
if (searchQuery) {
  const params = new URLSearchParams({ q: searchQuery });
  searchResults = await requestJson(
    `/viewer/documents/${documentId}/search?${params.toString()}`,
  );
}

const previewImage = await requestBinary(
  `/viewer/documents/${documentId}/pages/0/image?width=1600`,
  "image/png",
);

await Promise.all([
  writeFile(
    path.join(outputDir, "upload-response.json"),
    JSON.stringify(uploadResult, null, 2),
  ),
  writeFile(
    path.join(outputDir, "document-info.json"),
    JSON.stringify(documentInfo, null, 2),
  ),
  writeFile(
    path.join(outputDir, "document-properties.json"),
    JSON.stringify(documentProperties, null, 2),
  ),
  writeFile(
    path.join(outputDir, "page-0-text.json"),
    JSON.stringify(firstPageText, null, 2),
  ),
  writeFile(path.join(outputDir, "page-0.png"), previewImage),
]);

if (searchResults) {
  await writeFile(
    path.join(outputDir, "search-results.json"),
    JSON.stringify(searchResults, null, 2),
  );
}

console.log(`Saved triage packet to ${outputDir}`);
console.log(`document_id: ${documentId}`);

Run the script

Set the required environment variable, then run the script with its command-line arguments. For example:

export NUTRIENT_DWS_VIEWER_API_KEY=your_api_key_here
node triage.mjs ./large-document.pdf application/pdf invoice

Example output directory:

./dws-triage-<document_id>/
├── document-info.json
├── document-properties.json
├── page-0-text.json
├── page-0.png
├── search-results.json
└── upload-response.json

If you don’t want to run a search check, omit the final argument. In that case, search-results.json is not written.
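The MIME type argument defaults to application/pdf, which is easy to forget when triaging other file types. One option is to derive it from the file extension when the argument is omitted. The sketch below is illustrative only: `guessMimeType` and the lookup table are hypothetical, and whether the API accepts each listed type is an assumption to verify against the API documentation.

```javascript
// Hypothetical extension-to-MIME-type table; extend or trim it for your files.
// The MIME strings are standard, but API support for each one is an assumption.
const MIME_TYPES_BY_EXTENSION = new Map([
  [".pdf", "application/pdf"],
  [".docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"],
  [".xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"],
  [".pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"],
  [".png", "image/png"],
  [".jpg", "image/jpeg"],
  [".jpeg", "image/jpeg"],
]);

// Hypothetical helper: picks a MIME type from the file extension,
// falling back to the script's existing default when nothing matches.
function guessMimeType(filePath, fallback = "application/pdf") {
  // Keep only the file name so dots in directory names are ignored.
  const fileName = filePath.split(/[\\/]/).pop();
  const dotIndex = fileName.lastIndexOf(".");

  // No dot, or only a leading dot (for example ".env"), means no usable extension.
  if (dotIndex <= 0) return fallback;

  const extension = fileName.slice(dotIndex).toLowerCase();
  return MIME_TYPES_BY_EXTENSION.get(extension) ?? fallback;
}
```

In the script, the MIME type line could then read `process.argv[3] ?? guessMimeType(inputPath)`, so an explicit argument still takes precedence.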

What this packet helps you validate

Use the saved files to check whether:

The upload succeeded and returned the expected document_id
The document metadata and properties look reasonable
Page 0 exposes usable extracted text
A representative term can be found with search
The first page renders correctly as a server-generated image
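A first pass over the checks above can be automated by confirming that each expected file exists and is not empty. The sketch below does only that. It does not inspect the JSON contents, because the response shapes are not described here. `checkPacket` is a hypothetical helper, and the file names are the ones the script writes.

```javascript
// File names written by the triage script.
const REQUIRED_FILES = [
  "upload-response.json",
  "document-info.json",
  "document-properties.json",
  "page-0-text.json",
  "page-0.png",
];
// Written only when a search query was passed.
const OPTIONAL_FILES = ["search-results.json"];

// Hypothetical helper: reports, per file, whether it exists and is non-empty.
async function checkPacket(packetDir) {
  // Dynamic imports keep the sketch usable from both ES modules and CommonJS.
  const { stat } = await import("node:fs/promises");
  const path = await import("node:path");

  const report = [];
  for (const name of [...REQUIRED_FILES, ...OPTIONAL_FILES]) {
    const required = REQUIRED_FILES.includes(name);
    let status = "ok";
    try {
      const info = await stat(path.join(packetDir, name));
      if (info.size === 0) status = "empty";
    } catch (error) {
      // Anything other than "file not found" is unexpected, so rethrow it.
      if (error.code !== "ENOENT") throw error;
      status = required ? "missing" : "skipped";
    }
    report.push({ name, required, status });
  }
  return report;
}
```

Calling `checkPacket` with the output directory and passing the result to `console.table` gives a quick overview. Any "missing" or "empty" row points at the step to investigate first.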
