Nutrient hosts the DWS Data Extraction API at https://api.nutrient.io. Use this HTTP API to extract structured content and domain-specific data from documents.
Base URL
Use this base URL for all Data Extraction API endpoints:
https://api.nutrient.io
All endpoints are relative to this base URL.
Authentication
Include your API key in the Authorization header with every request:
Authorization: Bearer pdf_live_...
Get API keys from the Data Extraction API dashboard. Use keys that start with pdf_live_ for production. Use keys that start with pdf_test_ for testing with limitations.
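The following is a minimal sketch of how the base URL and the Authorization header fit together, using only the Python standard library. The helper name `build_request`, the key-prefix check, and the placeholder key `pdf_test_EXAMPLE` are assumptions made for illustration and aren't part of the API. The request is only constructed, not sent.

```python
import urllib.request

BASE_URL = "https://api.nutrient.io"  # base URL stated in this guide


def build_request(path: str, api_key: str, body: bytes = b"") -> urllib.request.Request:
    """Return an unsent POST request that carries the Bearer API key.

    Hypothetical helper: the prefix check below is an assumption based on
    the pdf_live_ / pdf_test_ key naming described in this guide.
    """
    if not api_key.startswith(("pdf_live_", "pdf_test_")):
        raise ValueError("expected a key starting with pdf_live_ or pdf_test_")
    # Endpoints are relative to the base URL, so join them with a single slash.
    url = BASE_URL + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Placeholder key for illustration only; substitute a key from the dashboard.
request = build_request("/extraction/parse", "pdf_test_EXAMPLE")
print(request.full_url)
print(request.get_header("Authorization"))
```

Keeping the header construction in one helper makes it harder to send a request without the key, and makes switching between test and production keys a one-line change.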
Available endpoints
The API provides the endpoints below for parsing documents and extracting schema-shaped data.
| Endpoint | Description |
|---|---|
| POST /extraction/parse | Extracts structured elements or Markdown from documents. Supports four processing modes: text, structure, understand, and agentic. Supports spatial elements and Markdown output. |
| POST /extraction/extract | Extracts domain-specific JSON data from documents and maps it to your JSON Schema, with optional per-field citations. |
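As a rough sketch, the two endpoints and the four parse modes from the table can be captured as constants so that typos are caught before a request is made. The names `Endpoint`, `PARSE_MODES`, `endpoint_url`, `check_parse_mode`, and the sample `INVOICE_SCHEMA` are hypothetical. This guide doesn't specify how the mode or the schema is passed in the request body, so the sketch stops short of building one.

```python
import json
from enum import Enum

BASE_URL = "https://api.nutrient.io"  # base URL stated in this guide


class Endpoint(str, Enum):
    """Endpoint paths listed in the table above."""
    PARSE = "/extraction/parse"
    EXTRACT = "/extraction/extract"


# Processing modes that the table lists for POST /extraction/parse.
PARSE_MODES = ("text", "structure", "understand", "agentic")


def endpoint_url(endpoint: Endpoint) -> str:
    """Join the base URL with an endpoint path."""
    return BASE_URL + endpoint.value


def check_parse_mode(mode: str) -> str:
    """Return the mode unchanged, or raise if it isn't one of the four listed."""
    if mode not in PARSE_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {PARSE_MODES}")
    return mode


# Hypothetical JSON Schema of the kind /extraction/extract could map data onto.
# The field names are invented for illustration.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number"],
}

print(endpoint_url(Endpoint.PARSE), check_parse_mode("structure"))
print(endpoint_url(Endpoint.EXTRACT))
print(json.dumps(INVOICE_SCHEMA, indent=2))
```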
Further details
Use these guides to continue configuring the Data Extraction API:
- Refer to the supported languages guide for the full list of 100+ optical character recognition (OCR) languages with ISO codes and aliases.
- Refer to the supported file types guide for PDFs, images, and Office files accepted by the API.
- Refer to the error handling guide for HTTP status codes, error response formats, and troubleshooting.