Large document triage

Use this recipe when a large document is slow to inspect, hard to validate manually, or needs a support-ready diagnostics packet.

This workflow is optimized for support and triage:

  1. Upload the document.
  2. Capture document information.
  3. Capture document properties.
  4. Verify text extraction and search behavior.
  5. Render a first-page preview image.
  6. Bundle the results into an escalation packet.

Triage checklist

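Use these endpoints in order:

  • POST /viewer/documents
  • GET /viewer/documents/<document_id>/document_info
  • GET /viewer/documents/<document_id>/properties
  • GET /viewer/documents/<document_id>/pages/0/text
  • GET /viewer/documents/<document_id>/search
  • GET /viewer/documents/<document_id>/pages/0/image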

Follow these steps to understand a document’s shape, identify potential issues, and gather the necessary data for support escalation if needed.

1. Upload the document

Start by uploading the file and saving the returned document_id:

curl -X POST https://api.nutrient.io/viewer/documents \
-H "Authorization: Bearer <api_key>" \
-H "Content-Type: application/pdf" \
--data-binary @large-document.pdf \
--fail

Response:

{
"data": {
"document_id": "<document_id>",
"title": "large-document"
}
}
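
To carry the returned document_id into the later calls, the upload can be wrapped in a small shell function. This is a minimal sketch, not official tooling: the function names upload_document and extract_document_id and the API_KEY environment variable are hypothetical, and the sed expression assumes the response shape shown above. A JSON-aware tool such as jq would be more robust where it is installed.

```shell
# Hypothetical helper: upload a PDF and print the raw JSON response.
# Assumes the API key is exported as API_KEY (an assumption of this sketch).
upload_document() {
  curl -X POST https://api.nutrient.io/viewer/documents \
    -H "Authorization: Bearer ${API_KEY}" \
    -H "Content-Type: application/pdf" \
    --data-binary "@$1" \
    --silent --show-error --fail
}

# Hypothetical helper: read an upload response on stdin and print the first
# "document_id" value it finds.
extract_document_id() {
  sed -n 's/.*"document_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example (performs a real upload, so it is left commented out):
# DOCUMENT_ID=$(upload_document large-document.pdf | extract_document_id)
```

Keeping the ID in a variable avoids copy-and-paste mistakes when the same document is queried several times during triage.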

2. Capture document information

Fetch top-level document information:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/document_info" \
-H "Authorization: Bearer <api_key>" \
--fail

Use this response to confirm:

  • Page count
  • Page dimensions
  • Permissions
  • Metadata such as author, title, producer, and modification dates

This is the fastest way to understand the document’s shape before deeper inspection.
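
Because the same response is needed later for escalation, it can help to write it to disk as it is fetched. The sketch below is one way to do that under stated assumptions: endpoint_url and save_response are hypothetical helper names, API_KEY and DOCUMENT_ID are assumed to be set in the environment, and the triage directory is an arbitrary choice.

```shell
# Base URL taken from the curl commands in this guide.
BASE_URL="https://api.nutrient.io/viewer/documents"

# Hypothetical helper: build the URL for a per-document resource,
# for example "document_info" or "properties".
endpoint_url() {
  printf '%s/%s/%s\n' "$BASE_URL" "$1" "$2"
}

# Hypothetical helper: GET a resource and write the response body to a file.
# Assumes the API key is exported as API_KEY (an assumption of this sketch).
save_response() {
  curl -X GET "$(endpoint_url "$1" "$2")" \
    -H "Authorization: Bearer ${API_KEY}" \
    --silent --show-error --fail \
    -o "$3"
}

# Example (sends a real request, so it is left commented out):
# mkdir -p triage
# save_response "$DOCUMENT_ID" document_info triage/document_info.json
```

The same helper works for any other per-document resource by changing the second argument.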

3. Capture document properties

Fetch document properties:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/properties" \
-H "Authorization: Bearer <api_key>" \
--fail

The response is useful for support triage because it includes details such as:

  • Byte size
  • Password-protection status
  • Source PDF SHA-256
  • Storage type
  • Created-at timestamp
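
Since the properties response reports the source PDF SHA-256, hashing the local file gives a quick way to confirm that the uploaded bytes match what you intended to send. This is a minimal sketch: local_sha256 is a hypothetical helper name, and because this guide does not show the exact field name in the response, the comparison is left as a manual step.

```shell
# Hypothetical helper: print the SHA-256 of a local file as lowercase hex.
# Uses sha256sum where present (most Linux systems) and falls back to
# shasum -a 256 (shipped with macOS).
local_sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum < "$1" | awk '{print $1}'
  else
    shasum -a 256 < "$1" | awk '{print $1}'
  fi
}

# Example: compare this value by eye with the source PDF SHA-256 reported
# in the properties response.
# local_sha256 large-document.pdf
```

A mismatch suggests the file was modified or truncated before upload, which is worth noting in an escalation.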

4. Verify text extraction and search behavior

If the issue involves missing or suspicious text, inspect the first page’s extracted text:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/pages/0/text" \
-H "Authorization: Bearer <api_key>" \
--fail

Then run a targeted search against a known word or phrase from the file:

curl -G "https://api.nutrient.io/viewer/documents/<document_id>/search" \
-H "Authorization: Bearer <api_key>" \
--data-urlencode "q=invoice" \
--fail

Interpret the two responses together:

  • If text extraction is empty or clearly wrong, the issue may be with the source file, OCR quality, or embedded text layer.
  • If the page text looks correct but search results are unexpected, include both responses in your escalation.
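
To decide which of the two cases applies, you can check a saved page-text response for a phrase you know is on the page. The following is a rough sketch: check_known_phrase and the file name page-0-text.json are hypothetical, and it assumes the extracted text appears verbatim in the response body. If the API escapes characters or splits text across fields, a literal match may miss it.

```shell
# Hypothetical helper: report whether a saved page-text response contains a
# known phrase. The match is case-insensitive and literal (no regex).
check_known_phrase() {
  if grep -qiF -- "$2" "$1"; then
    echo "found"
  else
    echo "missing"
  fi
}

# Example, assuming the page-text response was saved to page-0-text.json:
# check_known_phrase page-0-text.json "invoice"
```

A result of "missing" points toward the source file, OCR quality, or text layer. A result of "found" alongside unexpected search results is the case where both responses belong in the escalation.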

5. Render a first-page preview image

Render page 0 as a PNG preview:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/pages/0/image?width=1600" \
-H "Authorization: Bearer <api_key>" \
-H "Accept: image/png" \
--fail \
-o page-0.png

This preview helps support confirm whether the problem is visible in server-side rendering without needing the full browser integration.
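
Before attaching the preview, it can be worth confirming that the saved file really is a PNG and not some other payload. The sketch below checks the standard 8-byte PNG signature. The helper name is_png is hypothetical, and the check says nothing about whether the image content is correct.

```shell
# Hypothetical helper: succeed only if the file starts with the 8-byte PNG
# signature (89 50 4e 47 0d 0a 1a 0a).
is_png() {
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "89504e470d0a1a0a" ]
}

# Example:
# is_png page-0.png && echo "page-0.png looks like a PNG"
```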

Escalation packet

For a support-ready packet, include:

  • The original input file, if shareable
  • The returned document_id
  • document_info response
  • properties response
  • One page-text response from an affected page
  • One representative search response
  • The rendered first-page preview image
  • A short note describing the expected result versus the observed result
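
If the items above have been collected in one directory, they can be bundled into a single archive for sending to support. This is a minimal sketch: bundle_packet, the triage directory, and the archive name are all hypothetical choices, not a format that support requires.

```shell
# Hypothetical helper: archive every file in a packet directory into one
# gzip-compressed tarball. Write the archive outside the directory so tar
# does not try to include its own output.
bundle_packet() {
  tar -czf "$2" -C "$1" .
}

# Example, assuming the responses and preview were saved under ./triage:
# bundle_packet triage triage-packet.tar.gz
```

Leave the original input file out of the directory if it is not shareable.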

Complete Node.js example

For a script that automates this workflow and writes the outputs to disk, refer to the Node.js large document triage example.