---
title: "Parse endpoint"
canonical_url: "https://www.nutrient.io/guides/dws-data-extraction/parsing/"
md_url: "https://www.nutrient.io/guides/dws-data-extraction/parsing.md"
last_updated: "2026-05-26T22:37:31.557Z"
description: "Extract structured content from documents using the /extraction/parse endpoint. Supports multipart upload, URL input, and raw binary."
---

# Parse endpoint

The parse endpoint extracts structured content from a document and returns either typed spatial elements with bounding boxes or a whole-document Markdown representation:

```
POST https://api.nutrient.io/extraction/parse
```

## Request formats

You can send documents to the parse endpoint in three ways.

### Multipart form upload

Upload a file directly with optional processing instructions.

#### curl

```shell
curl -X POST https://api.nutrient.io/extraction/parse \
  -H "Authorization: Bearer your_api_key_goes_here" \
  -F "file=@document.pdf" \
  -F 'instructions={"mode":"understand","output":{"format":"spatial"}}'
```

#### Python

```python
import requests

# Open the file in a context manager so the handle is closed after the upload.
with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://api.nutrient.io/extraction/parse",
        headers={"Authorization": "Bearer your_api_key_goes_here"},
        files={"file": f},
        data={
            "instructions": '{"mode":"understand","output":{"format":"spatial"}}'
        },
    )

print(response.json())
```

#### JavaScript

```javascript
import fs from "node:fs";

// The built-in FormData accepts a Blob or File, not a stream,
// so read the file into a Blob and pass the filename explicitly.
const file = new Blob([fs.readFileSync("document.pdf")]);

const form = new FormData();
form.append("file", file, "document.pdf");
form.append(
  "instructions",
  JSON.stringify({ mode: "understand", output: { format: "spatial" } }),
);

const response = await fetch("https://api.nutrient.io/extraction/parse", {
  method: "POST",
  headers: { Authorization: "Bearer your_api_key_goes_here" },
  body: form,
});

const result = await response.json();
console.log(result);
```

When `instructions` is omitted, the API defaults to `understand` mode with spatial element output.

### JSON body with URL

Process a document hosted at a public URL without uploading a file.

#### curl

```shell
curl -X POST https://api.nutrient.io/extraction/parse \
  -H "Authorization: Bearer your_api_key_goes_here" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://storage.example.com/invoice.pdf",
    "mode": "understand",
    "output": { "format": "spatial" }
  }'
```

#### Python

```python
import requests

response = requests.post(
    "https://api.nutrient.io/extraction/parse",
    headers={
        "Authorization": "Bearer your_api_key_goes_here",
        "Content-Type": "application/json",
    },
    json={
        "url": "https://storage.example.com/invoice.pdf",
        "mode": "understand",
        "output": {"format": "spatial"},
    },
)

print(response.json())
```

#### JavaScript

```javascript
const response = await fetch("https://api.nutrient.io/extraction/parse", {
  method: "POST",
  headers: {
    Authorization: "Bearer your_api_key_goes_here",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://storage.example.com/invoice.pdf",
    mode: "understand",
    output: { format: "spatial" },
  }),
});

const result = await response.json();
console.log(result);
```

### Raw binary upload

Send a file directly as the request body with the appropriate `Content-Type` header.

#### curl

```shell
curl -X POST https://api.nutrient.io/extraction/parse \
  -H "Authorization: Bearer your_api_key_goes_here" \
  -H "Content-Type: application/pdf" \
  --data-binary @document.pdf
```

#### Python

```python
import requests

with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://api.nutrient.io/extraction/parse",
        headers={
            "Authorization": "Bearer your_api_key_goes_here",
            "Content-Type": "application/pdf",
        },
        data=f.read(),
    )

print(response.json())
```

#### JavaScript

```javascript
import fs from "node:fs";

const fileBuffer = fs.readFileSync("document.pdf");

const response = await fetch("https://api.nutrient.io/extraction/parse", {
  method: "POST",
  headers: {
    Authorization: "Bearer your_api_key_goes_here",
    "Content-Type": "application/pdf",
  },
  body: fileBuffer,
});

const result = await response.json();
console.log(result);
```
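
When the file type is not known in advance, the `Content-Type` value for a raw binary upload can be derived from the filename. The following is a minimal sketch using Python's standard `mimetypes` module. The helper name `guess_content_type` and the `application/octet-stream` fallback are assumptions made for illustration, and the sketch does not say which content types the endpoint accepts.

```python
import mimetypes


def guess_content_type(filename: str) -> str:
    """Guess a Content-Type header value from a filename's extension."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # The fallback is an assumption for this sketch, not documented API behavior.
    return content_type or "application/octet-stream"


headers = {
    "Authorization": "Bearer your_api_key_goes_here",
    "Content-Type": guess_content_type("document.pdf"),
}
print(headers["Content-Type"])
```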

## Processing modes

The `mode` parameter controls the extraction pipeline.

| Mode         | Description                                                              | Speed   | Cost                 |
| ------------ | ------------------------------------------------------------------------ | ------- | -------------------- |
| `text`       | Fast Markdown extraction via Document Engine. No OCR or AI augmentation. | Fastest | 1 credit per page    |
| `structure`  | OCR-based structured extraction with spatial element output              | Fast    | 1.5 credits per page |
| `understand` | Full extraction pipeline with AI augmentation for richer results         | Slower  | 9 credits per page   |
| `agentic`    | VLM-augmented extraction for complex documents                           | Slowest | 18 credits per page  |

Default: `understand`. Refer to the [processing modes](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md) guide for a detailed comparison.
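
To compare modes before sending a large document, the per-page costs in the table can be turned into a rough estimate. This sketch assumes the cost scales linearly with page count and uses a hypothetical helper name, `estimate_credits`. Actual billing is whatever the `usage` field of the response reports.

```python
from decimal import Decimal

# Per-page credit costs copied from the processing modes table.
CREDITS_PER_PAGE = {
    "text": Decimal("1"),
    "structure": Decimal("1.5"),
    "understand": Decimal("9"),
    "agentic": Decimal("18"),
}


def estimate_credits(mode: str, pages: int) -> Decimal:
    """Estimate credits, assuming cost scales linearly with page count."""
    if mode not in CREDITS_PER_PAGE:
        raise ValueError(f"Unknown mode: {mode!r}")
    return CREDITS_PER_PAGE[mode] * pages


for mode in CREDITS_PER_PAGE:
    print(f"{mode}: {estimate_credits(mode, 12)} credits for 12 pages")
```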

## Output formats

The `output.format` parameter controls what the API returns.

| Format     | Description                                                                                        | Returns             |
| ---------- | -------------------------------------------------------------------------------------------------- | ------------------- |
| `spatial`  | Typed document elements with bounding boxes and confidence scores. Not available with `text` mode. | `output.elements[]` |
| `markdown` | Whole-document Markdown representation                                                             | `output.markdown`   |

Default format depends on the mode: `text` defaults to `markdown`; `structure`, `understand`, and `agentic` default to `spatial`.

When `output.format` is `spatial`, you can also set `output.includeWords: true` to include word-level OCR data nested inside paragraph and table cell elements.
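
The rules above (mode-dependent defaults, no `spatial` output in `text` mode, and `includeWords` only with `spatial`) can be checked on the client before a request is sent. The helper below is a sketch, not part of any official SDK. The name `build_instructions` is hypothetical, and rejecting `includeWords` for Markdown output is an assumption about sensible client-side behavior.

```python
import json

# Default output format per mode, as described above.
DEFAULT_FORMAT = {
    "text": "markdown",
    "structure": "spatial",
    "understand": "spatial",
    "agentic": "spatial",
}


def build_instructions(mode="understand", output_format=None, include_words=False):
    """Return an instructions JSON string after basic client-side checks."""
    if mode not in DEFAULT_FORMAT:
        raise ValueError(f"Unknown mode: {mode!r}")
    fmt = output_format or DEFAULT_FORMAT[mode]
    if fmt not in ("spatial", "markdown"):
        raise ValueError(f"Unknown output format: {fmt!r}")
    if fmt == "spatial" and mode == "text":
        raise ValueError("spatial output is not available with text mode")
    output = {"format": fmt}
    if include_words:
        # Assumption: treat includeWords with Markdown output as a caller error.
        if fmt != "spatial":
            raise ValueError("includeWords applies only to spatial output")
        output["includeWords"] = True
    return json.dumps({"mode": mode, "output": output})


print(build_instructions("structure", include_words=True))
```

The returned string can be passed as the `instructions` form field of a multipart upload.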

## Response structure

Every successful response includes:

```json
{
  "status": 200,
  "requestId": "req_e5f6g7h8",
  "output": {... },
  "metrics": {
    "processingTimeMs": 4200,
    "pagesProcessed": 1
  },
  "usage": {
    "data_extraction_credits": {
      "cost": 9,
      "remainingCredits": 850
    }
  },
  "configuration": {
    "mode": "understand",
    "outputFormat": "spatial"
  }
}
```

- `requestId` — Unique identifier for debugging and support requests.

- `output` — Contains either `elements` (spatial format) or `markdown` (Markdown format).

- `metrics` — Processing time and pages processed.

- `usage` — Credit consumption for this request.

- `configuration` — The mode and output format that were used.
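
As an illustration of reading these fields, the sketch below pulls the values most useful for logging out of a parsed response body. The function name `summarize_response` is hypothetical, and the sample dictionary reuses the values from the example above with an empty `elements` list standing in for real output.

```python
def summarize_response(body: dict) -> dict:
    """Collect the fields of a parse response that are useful for logging."""
    output = body.get("output", {})
    credits = body.get("usage", {}).get("data_extraction_credits", {})
    if "elements" in output:
        kind = "elements"
    elif "markdown" in output:
        kind = "markdown"
    else:
        kind = None
    return {
        "request_id": body.get("requestId"),
        "output_kind": kind,
        "pages": body.get("metrics", {}).get("pagesProcessed"),
        "cost": credits.get("cost"),
        "remaining_credits": credits.get("remainingCredits"),
    }


# Sample body mirroring the example response; "elements" is a placeholder.
sample = {
    "status": 200,
    "requestId": "req_e5f6g7h8",
    "output": {"elements": []},
    "metrics": {"processingTimeMs": 4200, "pagesProcessed": 1},
    "usage": {"data_extraction_credits": {"cost": 9, "remainingCredits": 850}},
    "configuration": {"mode": "understand", "outputFormat": "spatial"},
}
print(summarize_response(sample))
```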

## Next steps

- [Processing modes](https://www.nutrient.io/guides/dws-data-extraction/parsing/processing-modes.md) — Compare text, structure, understand, and agentic modes.

- [Extract document elements](https://www.nutrient.io/guides/dws-data-extraction/parsing/extract-document-elements.md) — Spatial output with typed elements and bounding boxes.

- [Extract Markdown](https://www.nutrient.io/guides/dws-data-extraction/parsing/extract-markdown.md) — Whole-document Markdown output.

- [Coordinate spaces](https://www.nutrient.io/guides/dws-data-extraction/parsing/coordinate-spaces.md) — Understand units, bounding boxes, and how to map coordinates to rendered pages.

- [Multilingual extraction](https://www.nutrient.io/guides/dws-data-extraction/parsing/multilingual-extraction.md) — Process documents in other languages.

- [Error handling](https://www.nutrient.io/guides/dws-data-extraction/errors.md) — HTTP status codes and error response format.

