API overview

Nutrient hosts the DWS Data Extraction API at https://api.nutrient.io. Use this HTTP API to extract structured content and domain-specific data from documents.

Base URL

Use this base URL for all Data Extraction API endpoints:

https://api.nutrient.io

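All endpoint paths in this guide are relative to this base URL. For example, POST /extraction/parse resolves to https://api.nutrient.io/extraction/parse.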
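
As a minimal sketch of how a relative endpoint path resolves against the base URL, the Python snippet below joins the two. The helper name `endpoint_url` is hypothetical and is not part of any Nutrient SDK.

```python
from urllib.parse import urljoin

# Base URL taken from this page.
BASE_URL = "https://api.nutrient.io"


def endpoint_url(path: str) -> str:
    """Resolve an endpoint path against the base URL (hypothetical helper)."""
    # Strip a leading slash so "extraction/parse" and "/extraction/parse"
    # both resolve to the same absolute URL.
    return urljoin(BASE_URL + "/", path.lstrip("/"))


print(endpoint_url("/extraction/parse"))
```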

Authentication

Include your API key in the Authorization header with every request:

Authorization: Bearer pdf_live_...

Get API keys from the Data Extraction API dashboard. Use keys that start with pdf_live_ for production. Keys that start with pdf_test_ are for testing and come with limitations.
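
The header format and key prefixes above could be handled in client code roughly as follows. This is a sketch only: the helper names `key_environment` and `auth_headers` are hypothetical, and the key passed in at the end is a placeholder, not a real credential.

```python
# Key prefixes described on this page, mapped to their intended use.
KEY_PREFIXES = {"pdf_live_": "production", "pdf_test_": "test"}


def key_environment(api_key: str) -> str:
    """Return "production" or "test" based on the key prefix (hypothetical helper)."""
    for prefix, environment in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return environment
    raise ValueError("API key should start with pdf_live_ or pdf_test_")


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header to send with every request."""
    key_environment(api_key)  # Raises early if the key has an unknown prefix.
    return {"Authorization": f"Bearer {api_key}"}


# Placeholder key for illustration only.
print(auth_headers("pdf_test_example"))
```

Checking the prefix before sending anything makes it harder to use a test key in production by accident.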

Available endpoints

The API provides the endpoints below for parsing documents and extracting schema-shaped data.

Endpoint	Description
POST /extraction/parse	Extracts structured elements, including spatial elements, or Markdown from documents. Supports four processing modes: `text`, `structure`, `understand`, and `agentic`.
POST /extraction/extract	Extracts domain-specific JSON data from documents and maps it to your JSON Schema, with optional per-field citations.
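
To illustrate how the two endpoints fit together with the base URL and the Authorization header, the sketch below prepares a POST request without sending it. This overview does not specify the request body format or how a processing mode is passed, so the body is treated as opaque bytes supplied by the caller. The names `build_request` and `check_parse_mode` are hypothetical.

```python
import urllib.request

BASE_URL = "https://api.nutrient.io"

# Endpoint paths and processing modes listed in the table above.
ENDPOINTS = {"parse": "/extraction/parse", "extract": "/extraction/extract"}
PARSE_MODES = ("text", "structure", "understand", "agentic")


def check_parse_mode(mode: str) -> str:
    """Return the mode unchanged if it is one of the four listed modes."""
    if mode not in PARSE_MODES:
        raise ValueError(f"Unknown mode {mode!r}; expected one of {PARSE_MODES}")
    return mode


def build_request(endpoint: str, api_key: str, body: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for one of the two endpoints.

    The body layout is an assumption left to the caller; consult the
    endpoint reference for the actual payload format.
    """
    if endpoint not in ENDPOINTS:
        raise ValueError(f"Unknown endpoint {endpoint!r}; expected one of {sorted(ENDPOINTS)}")
    return urllib.request.Request(
        BASE_URL + ENDPOINTS[endpoint],
        data=body,
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )


# Placeholder key and empty body for illustration only.
request = build_request("parse", "pdf_test_example", b"")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. That step is left out here because it needs a valid key and a real payload.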

Further details

Use these guides to continue configuring the Data Extraction API:

Refer to the supported languages guide for the full list of 100+ optical character recognition (OCR) languages with ISO codes and aliases.
Refer to the supported file types guide for PDFs, images, and Office files accepted by the API.
Refer to the error handling guide for HTTP status codes, error response formats, and troubleshooting.
