Large document triage

Use this recipe when a large document is slow to inspect, hard to validate manually, or needs a support-ready diagnostics packet.

This workflow is optimized for support and triage and runs in six steps:

Upload the document.
Capture document information.
Check document properties.
Verify text extraction and search behavior.
Render a first-page preview image.
Bundle the results into an escalation packet.

Triage checklist

Use these endpoints in order:

Upload the document — POST /viewer/documents
Fetch document information — GET /viewer/documents/{documentId}/document_info
Fetch document properties — GET /viewer/documents/{documentId}/properties
Fetch page text — GET /viewer/documents/{documentId}/pages/{pageIndex}/text
Search the document — GET /viewer/documents/{documentId}/search
Render a first-page preview — GET /viewer/documents/{documentId}/pages/{pageIndex}/image
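
The checklist can also be expressed as data, which is handy when scripting the calls. The following is a minimal Node.js sketch, not an official client: the base URL comes from the curl examples in this guide, while the triageEndpoints helper, its step labels, and the default search term are hypothetical. The upload is left out because it is the only request with a body, and page text is listed before search to match the step order in the recommended workflow.

```javascript
// Base URL copied from the curl examples in this guide.
const BASE_URL = "https://api.nutrient.io/viewer/documents";

// Hypothetical helper: returns the read-only triage requests for one document.
function triageEndpoints(documentId, { pageIndex = 0, query = "invoice" } = {}) {
  const doc = `${BASE_URL}/${encodeURIComponent(documentId)}`;
  // URLSearchParams escapes the search term so spaces and special
  // characters are safe inside the URL.
  const search = new URLSearchParams({ q: query }).toString();
  return [
    { step: "document_info", method: "GET", url: `${doc}/document_info` },
    { step: "properties", method: "GET", url: `${doc}/properties` },
    { step: "page_text", method: "GET", url: `${doc}/pages/${pageIndex}/text` },
    { step: "search", method: "GET", url: `${doc}/search?${search}` },
    { step: "page_image", method: "GET", url: `${doc}/pages/${pageIndex}/image` },
  ];
}

// Print the request list for a placeholder document ID.
for (const { step, method, url } of triageEndpoints("<document_id>")) {
  console.log(`${step}: ${method} ${url}`);
}
```

Keeping the list in one place makes it easier to run the same sequence against several documents and to record which step produced which response.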

Recommended workflow

Follow these steps to understand a document’s shape, identify potential issues, and gather the necessary data for support escalation if needed.

1. Upload the document

Start by uploading the file and saving the returned document_id:

curl -X POST https://api.nutrient.io/viewer/documents \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/pdf" \
  --data-binary @large-document.pdf \
  --fail

Response:

{
  "data": {
    "document_id": "<document_id>",
    "title": "large-document"
  }
}
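
Every later request needs the document_id, so it is worth failing early if the upload response does not contain one. The sketch below assumes only the response shape shown above; the extractDocumentId helper and the sample ID abc123 are hypothetical.

```javascript
// Hypothetical helper: pulls data.document_id out of the upload response.
// Accepts either the raw JSON string or an already parsed object.
function extractDocumentId(responseBody) {
  const parsed =
    typeof responseBody === "string" ? JSON.parse(responseBody) : responseBody;
  const id = parsed?.data?.document_id;
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("Upload response did not contain data.document_id");
  }
  return id;
}

// Example with a body shaped like the response above.
const uploadBody = '{"data":{"document_id":"abc123","title":"large-document"}}';
console.log(extractDocumentId(uploadBody)); // prints abc123
```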

2. Capture document information

Fetch top-level document information:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/document_info" \
  -H "Authorization: Bearer <api_key>" \
  --fail

Use this response to confirm:

Page count
Page dimensions
Permissions
Metadata such as author, title, producer, and modification dates

This is the fastest way to understand the document’s shape before deeper inspection.
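
The same request can be made from Node.js when you want to keep the response for the escalation packet. This is a minimal sketch that assumes Node.js 18 or later (for the built-in fetch) and assumes the endpoint returns JSON; the getJson helper and its fetchImpl parameter are hypothetical. Throwing on a non-2xx status mirrors what --fail does in the curl command.

```javascript
// Hypothetical helper: authenticated GET that returns the parsed JSON body.
// fetchImpl defaults to the global fetch and can be swapped for a stub in tests.
async function getJson(url, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    // Treat any non-2xx status as a failure, like curl --fail.
    throw new Error(`GET ${url} failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (placeholders as in the curl command above):
// const info = await getJson(
//   "https://api.nutrient.io/viewer/documents/<document_id>/document_info",
//   "<api_key>"
// );
```

The helper works unchanged for any other endpoint in this guide that returns JSON.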

3. Capture document properties

Fetch document properties:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/properties" \
  -H "Authorization: Bearer <api_key>" \
  --fail

This is useful for support triage because it includes details such as:

Byte size
Password-protection status
Source PDF SHA-256
Storage type
Created-at timestamp

4. Check text extraction and search

If the issue involves missing or suspicious text, inspect the first page’s extracted text (page index 0):

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/pages/0/text" \
  -H "Authorization: Bearer <api_key>" \
  --fail

Then run a targeted search against a known word or phrase from the file:

curl -G "https://api.nutrient.io/viewer/documents/<document_id>/search" \
  -H "Authorization: Bearer <api_key>" \
  --data-urlencode "q=invoice" \
  --fail

Use these together:

If text extraction is empty or clearly wrong, the issue may be with the source file, OCR quality, or the embedded text layer.
If the page text looks correct but search results are unexpected, include both responses in your escalation.
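
The two rules above amount to a small decision procedure. The sketch below encodes it so a script can label each document consistently. It assumes you have already pulled the page text out of the response as a plain string; the classifyTextIssue helper, its input fields, and the returned labels are hypothetical, and the two "looks right" flags are the reviewer's own judgment.

```javascript
// Hypothetical helper: maps the observations from this step to a next action.
function classifyTextIssue({ pageText, textLooksRight, searchLooksRight }) {
  const hasText = typeof pageText === "string" && pageText.trim().length > 0;
  if (!hasText || !textLooksRight) {
    // Empty or clearly wrong text points at the file itself.
    return "check-source-file-ocr-or-text-layer";
  }
  if (!searchLooksRight) {
    // Text is fine but search is not: send both responses to support.
    return "escalate-with-text-and-search-responses";
  }
  return "no-text-issue-detected";
}

console.log(
  classifyTextIssue({
    pageText: "Invoice 2024-001",
    textLooksRight: true,
    searchLooksRight: false,
  })
); // prints escalate-with-text-and-search-responses
```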

5. Render a first-page preview image

Render page 0 as a PNG preview:

curl -X GET "https://api.nutrient.io/viewer/documents/<document_id>/pages/0/image?width=1600" \
  -H "Authorization: Bearer <api_key>" \
  -H "Accept: image/png" \
  --fail \
  -o page-0.png

This preview helps support confirm whether the problem is visible in server-side rendering without needing the full browser integration.
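
Before attaching the preview, it can help to confirm that the saved file really is a PNG and not, for example, an error body. The sketch below checks the standard eight-byte PNG signature; the looksLikePng and checkPreview helpers are hypothetical, and the default path matches the -o page-0.png option in the curl command above.

```javascript
// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Hypothetical helper: true when the bytes begin with the PNG signature.
function looksLikePng(bytes) {
  if (bytes.length < PNG_SIGNATURE.length) return false;
  return PNG_SIGNATURE.every((value, i) => bytes[i] === value);
}

// Reads the saved preview from disk and checks its first bytes.
async function checkPreview(path = "page-0.png") {
  const { readFile } = await import("node:fs/promises");
  return looksLikePng(await readFile(path));
}
```

A failed check does not say what went wrong, only that the file should be inspected before it goes into the escalation packet.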

Escalation packet

For a support-ready packet, include:

The original input file, if shareable
The returned document_id
The document_info response
The properties response
One page-text response from an affected page
One representative search response
The rendered first-page preview image
A short note describing the expected result versus the observed result
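
A small completeness check can catch a missing item before the packet is sent. This is a sketch only: the field names in PACKET_ITEMS are illustrative labels for the entries listed above, and the missingPacketItems helper is hypothetical. The original input file is not checked because it is included only if shareable.

```javascript
// Illustrative labels for the required entries in the list above.
const PACKET_ITEMS = [
  "documentId",
  "documentInfo",
  "properties",
  "pageText",
  "searchResponse",
  "previewImage",
  "expectedVsObserved",
];

// Hypothetical helper: names of required items that are absent or empty.
function missingPacketItems(packet) {
  return PACKET_ITEMS.filter((name) => {
    const value = packet[name];
    return value === undefined || value === null || value === "";
  });
}

const packet = {
  documentId: "abc123",
  documentInfo: {},
  properties: {},
  pageText: "",
  searchResponse: {},
  previewImage: "page-0.png",
  expectedVsObserved: "Expected 12 search hits, observed 0.",
};
console.log(missingPacketItems(packet)); // prints [ 'pageText' ]
```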

Complete Node.js example

For a script that automates this workflow and writes the outputs to disk, refer to the Node.js large document triage example.
