---
title: "DWS Data Extraction API"
canonical_url: "https://www.nutrient.io/guides/dws-data-extraction/"
md_url: "https://www.nutrient.io/guides/dws-data-extraction.md"
last_updated: "2026-05-26T22:37:31.557Z"
description: "Extract structured content from PDFs, images, and Office files using the Nutrient Data Extraction API. Get typed document elements with spatial data or whole-document Markdown."
---

# DWS Data Extraction API

Extract structured content from documents through a simple HTTP API. Upload a PDF, image, or Office file and receive typed document elements with spatial data, or get a whole-document Markdown representation.

## What it does

With DWS Data Extraction API, you can:

- Extract paragraphs, tables, formulas, pictures, and key-value pairs from documents, with bounding box coordinates and confidence scores

- Convert documents to structured Markdown for RAG pipelines, search indexing, and content migration

- Choose between four processing modes: fast text extraction, OCR-based structure extraction, AI-augmented document understanding, and VLM-augmented agentic extraction

- Process documents in more than 100 languages with multilingual OCR support

DWS Data Extraction API is part of Nutrient Document Web Services (DWS). It focuses on content extraction workflows, while [DWS Processor API](https://www.nutrient.io/guides/dws-processor.md) covers document generation, conversion, and editing actions.

## Processing modes

Choose the processing pipeline that fits your use case.

**Text mode**

Fast Markdown extraction from born-digital documents. No OCR or AI. 1 credit per page.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md)

**Structure mode**

OCR-based extraction with typed spatial elements and bounding boxes. 1.5 credits per page.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md)

**Understand mode**

Full AI-augmented pipeline with layout analysis, table detection, and semantic classification. 9 credits per page.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md)

**Agentic mode**

VLM-augmented extraction that builds on understand mode and provides the deepest visual understanding of document content. 18 credits per page.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md)
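
To compare the modes on cost, multiply the per-page credit rate by the number of pages. The sketch below does only that arithmetic. The `CREDITS_PER_PAGE` table and the `estimate_credits` helper are illustrative names, not part of the API, and the rates are copied from this page, so check the pricing guide for current values.

```python
# Per-page credit rates as listed on this page.
# The dictionary and function names are illustrative, not part of the API.
CREDITS_PER_PAGE = {
    "text": 1.0,
    "structure": 1.5,
    "understand": 9.0,
    "agentic": 18.0,
}


def estimate_credits(mode: str, pages: int) -> float:
    """Return the estimated credit cost of processing `pages` pages in `mode`."""
    if mode not in CREDITS_PER_PAGE:
        raise ValueError(f"unknown mode: {mode!r}")
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return CREDITS_PER_PAGE[mode] * pages


if __name__ == "__main__":
    # Compare all four modes for a 200-page document.
    for mode in CREDITS_PER_PAGE:
        print(f"{mode}: {estimate_credits(mode, 200)} credits")
```

For a 200-page document, this gives 200 credits in text mode, 300 in structure mode, 1,800 in understand mode, and 3,600 in agentic mode.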

## Output formats

The API returns one of two formats, depending on what your downstream system needs.

**Spatial elements**

Typed document elements (paragraphs, tables, formulas, pictures, key-value pairs) with bounding boxes, confidence scores, and reading order.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/extract-document-elements.md)
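
Downstream code often filters spatial elements by confidence and then walks them in reading order. The sketch below shows one way to do that on data already parsed into Python dictionaries. The field names (`type`, `text`, `bbox`, `confidence`, `reading_order`) and the sample values are assumptions made for illustration, not the documented response schema. See the linked guide for the actual output format.

```python
from typing import Any, Iterable, Optional

# Hypothetical sample data. Field names and values are assumptions for
# illustration and do not reflect the documented response schema.
SAMPLE_ELEMENTS: list[dict[str, Any]] = [
    {"type": "table", "text": "Q1 | Q2", "bbox": [40, 300, 560, 420],
     "confidence": 0.91, "reading_order": 2},
    {"type": "paragraph", "text": "Revenue grew.", "bbox": [40, 120, 560, 180],
     "confidence": 0.98, "reading_order": 1},
    {"type": "paragraph", "text": "smudged", "bbox": [40, 500, 560, 520],
     "confidence": 0.42, "reading_order": 3},
]


def select_elements(
    elements: Iterable[dict[str, Any]],
    min_confidence: float = 0.8,
    types: Optional[set[str]] = None,
) -> list[dict[str, Any]]:
    """Keep elements at or above `min_confidence`, optionally restricted
    to the given element types, sorted by reading order."""
    kept = [
        e for e in elements
        if e["confidence"] >= min_confidence
        and (types is None or e["type"] in types)
    ]
    return sorted(kept, key=lambda e: e["reading_order"])


if __name__ == "__main__":
    for element in select_elements(SAMPLE_ELEMENTS):
        print(element["reading_order"], element["type"], element["text"])
```

With the sample data, the low-confidence paragraph is dropped and the remaining paragraph is returned before the table.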

**Markdown**

Whole-document Markdown representation. Ideal for RAG, search indexing, and content pipelines.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/parsing/extract-markdown.md)
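
For RAG ingestion, Markdown output is commonly split into chunks before embedding. The sketch below shows one simple strategy: start a new chunk at each ATX heading. The `split_markdown_by_heading` helper is a hypothetical name and not part of the API. It operates on a Markdown string you already have, and it does not account for `#` lines inside fenced code blocks.

```python
import re

# Matches ATX headings such as "# Title" or "### Section".
HEADING_RE = re.compile(r"^#{1,6}\s")


def split_markdown_by_heading(markdown: str) -> list[dict[str, str]]:
    """Split Markdown into chunks, each beginning at an ATX heading.

    Text before the first heading becomes a chunk with an empty heading.
    """
    chunks: list[dict[str, str]] = []
    heading = ""
    body_lines: list[str] = []

    def flush() -> None:
        # Emit the current chunk unless it is completely empty.
        body = "\n".join(body_lines).strip()
        if heading or body:
            chunks.append({"heading": heading, "body": body})

    for line in markdown.splitlines():
        if HEADING_RE.match(line):
            flush()
            heading = line.lstrip("#").strip()
            body_lines = []
        else:
            body_lines.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    doc = "Intro text\n\n# Title\n\nBody one.\n\n## Sub\n\nBody two.\n"
    for chunk in split_markdown_by_heading(doc):
        print(repr(chunk["heading"]), "->", repr(chunk["body"]))
```

Heading-based chunks keep each section's text together with its title, which can then be stored as metadata alongside the embedding.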

## Essential guides

Start with these guides to set up your first request, explore the API, or review pricing.

**Get started**

Sign up, get an API key, and send your first extraction request in less than a minute.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/getting-started.md)

**Developer guides**

API reference, request formats, output schemas, and integration patterns.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/api-overview.md)

**Pricing**

Credit costs per mode and plan options.

[Read more](https://www.nutrient.io/guides/dws-data-extraction/pricing.md)

---

## Related pages

- [API overview](/guides/dws-data-extraction/api-overview.md)
- [Supported file types](/guides/dws-data-extraction/file-types.md)
- [Error handling](/guides/dws-data-extraction/errors.md)
- [Get started](/guides/dws-data-extraction/getting-started.md)
- [Support](/guides/dws-data-extraction/support.md)
- [Security](/guides/dws-data-extraction/security.md)
- [Privacy](/guides/dws-data-extraction/privacy.md)
- [Supported languages](/guides/dws-data-extraction/supported-languages.md)

## Pages in this section

- [Build a document extraction pipeline](/guides/dws-data-extraction/examples/build-document-extraction-pipeline.md)
- [Build a RAG ingestion pipeline](/guides/dws-data-extraction/examples/build-rag-ingestion-pipeline.md)
- [Examples](/guides/dws-data-extraction/examples.md)
- [Pricing](/guides/dws-data-extraction/pricing.md)
- [Coordinate spaces](/guides/dws-data-extraction/parsing/coordinate-spaces.md)
- [Multilingual extraction](/guides/dws-data-extraction/parsing/multilingual-extraction.md)
- [Extract document elements](/guides/dws-data-extraction/parsing/extract-document-elements.md)
- [Extract Markdown](/guides/dws-data-extraction/parsing/extract-markdown.md)
- [Parse endpoint](/guides/dws-data-extraction/parsing.md)
- [Processing modes](/guides/dws-data-extraction/parsing/processing-modes.md)

