If you already have an API key, run this command to extract content from a PDF:

    curl -X POST https://api.nutrient.io/extraction/parse \
      -H "Authorization: Bearer your_api_key_goes_here" \
      -F "file=@document.pdf"

This sends a document to the Data Extraction API using the default settings (understand mode with spatial element output) and returns structured elements with bounding boxes and confidence scores.
Step-by-step setup
Follow these steps to create an account, get an API key, and run your first request.
1. Sign up
Go to the Nutrient dashboard and create an account. If you already have a Nutrient DWS account, skip to step 2.
2. Get your API key
Navigate to the Data Extraction API keys page in the dashboard and copy your live API key. Live keys start with pdf_live_.
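Rather than pasting the key into scripts, it is common to read it from the environment. The sketch below assumes a hypothetical environment variable named NUTRIENT_API_KEY (the guide does not prescribe one) and uses the pdf_live_ prefix mentioned above as a quick sanity check before any request is sent. The helper names are illustrative only.

```python
import os

# Hypothetical variable name; the guide does not prescribe one.
API_KEY_ENV_VAR = "NUTRIENT_API_KEY"
# Prefix that step 2 gives for live keys.
LIVE_KEY_PREFIX = "pdf_live_"


def load_live_api_key(env=os.environ):
    """Return the live API key from the environment, or raise ValueError."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise ValueError(f"{API_KEY_ENV_VAR} is not set")
    if not key.startswith(LIVE_KEY_PREFIX):
        raise ValueError(f"expected a live key starting with {LIVE_KEY_PREFIX!r}")
    return key


def auth_headers(key):
    """Build the Authorization header used by the requests in this guide."""
    return {"Authorization": f"Bearer {key}"}
```

With the variable exported in your shell, `auth_headers(load_live_api_key())` can be passed wherever the examples in step 3 hard-code the Authorization header.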
3. Send your first request
Use the API key to extract content from a document. The examples below upload a local PDF and return structured spatial elements.
curl:

    curl -X POST https://api.nutrient.io/extraction/parse \
      -H "Authorization: Bearer your_api_key_goes_here" \
      -F "file=@document.pdf" \
      -F 'instructions={"mode":"understand","output":{"format":"spatial"}}'

Python:

    import requests

    # Open the PDF in a context manager so the file handle is closed after the upload.
    with open("document.pdf", "rb") as pdf:
        response = requests.post(
            "https://api.nutrient.io/extraction/parse",
            headers={"Authorization": "Bearer your_api_key_goes_here"},
            files={"file": pdf},
            data={
                "instructions": '{"mode":"understand","output":{"format":"spatial"}}'
            },
        )

    print(response.json())

Node.js:

    import { readFile } from "node:fs/promises";

    // The built-in FormData expects a Blob for file parts, so read the PDF into one.
    const pdf = new Blob([await readFile("document.pdf")], { type: "application/pdf" });

    const form = new FormData();
    form.append("file", pdf, "document.pdf");
    form.append(
      "instructions",
      JSON.stringify({ mode: "understand", output: { format: "spatial" } }),
    );

    const response = await fetch("https://api.nutrient.io/extraction/parse", {
      method: "POST",
      headers: { Authorization: "Bearer your_api_key_goes_here" },
      body: form,
    });

    const result = await response.json();
    console.log(result);

HTTP:

    POST /extraction/parse HTTP/1.1
    Host: api.nutrient.io
    Authorization: Bearer your_api_key_goes_here
    Content-Type: multipart/form-data; boundary=boundary

    --boundary
    Content-Disposition: form-data; name="file"; filename="document.pdf"
    Content-Type: application/pdf

    <binary PDF data>
    --boundary
    Content-Disposition: form-data; name="instructions"
    Content-Type: application/json

    {"mode":"understand","output":{"format":"spatial"}}
    --boundary--

4. Review the response
The API returns a JSON response with extracted document elements:

    {
      "status": 200,
      "requestId": "req_e5f6g7h8",
      "output": {
        "elements": [
          {
            "id": "a1b2c3d4-1111-4000-8000-000000000001",
            "type": "paragraph",
            "role": "Title",
            "text": "Quarterly Report",
            "confidence": 0.95,
            "readingOrder": 0,
            "bounds": { "x": 100, "y": 50, "width": 400, "height": 35 },
            "page": { "pageIndex": 0, "pageNumber": 1, "width": 1818, "height": 2422 }
          }
        ]
      },
      "metrics": { "processingTimeMs": 4200, "pagesProcessed": 1 },
      "configuration": { "mode": "understand", "outputFormat": "spatial" }
    }

Each element includes its type, text content, spatial coordinates (bounds), detection confidence, and page reference. Refer to extract document elements for the full element schema.
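As a rough illustration of working with these fields, the sketch below walks the elements of a parsed response, drops low-confidence ones, orders the rest by page and readingOrder, and scales each bounding box to the 0-1 range. It relies only on the field names in the sample response. The 0.5 confidence threshold and the function name are arbitrary choices for this example, and the scaling assumes that bounds and the page width and height share the same units, which the sample suggests but does not state.

```python
def summarize_elements(response, min_confidence=0.5):
    """Return (page number, type, text, normalized box) records in reading order.

    `response` is the decoded JSON body. The threshold is an illustrative
    default, not a value recommended by the API.
    """
    elements = response.get("output", {}).get("elements", [])
    kept = [e for e in elements if e.get("confidence", 0.0) >= min_confidence]
    # Sorting on (pageIndex, readingOrder) gives a sensible order whether
    # readingOrder is counted per page or across the whole document.
    kept.sort(key=lambda e: (e["page"]["pageIndex"], e["readingOrder"]))

    summary = []
    for element in kept:
        page, box = element["page"], element["bounds"]
        summary.append({
            "page": page["pageNumber"],
            "type": element["type"],
            "text": element["text"],
            # Assumes bounds share the page's coordinate units.
            "box": (
                box["x"] / page["width"],
                box["y"] / page["height"],
                box["width"] / page["width"],
                box["height"] / page["height"],
            ),
        })
    return summary


# Trimmed copy of the sample response shown above.
sample = {
    "status": 200,
    "output": {"elements": [{
        "type": "paragraph",
        "role": "Title",
        "text": "Quarterly Report",
        "confidence": 0.95,
        "readingOrder": 0,
        "bounds": {"x": 100, "y": 50, "width": 400, "height": 35},
        "page": {"pageIndex": 0, "pageNumber": 1, "width": 1818, "height": 2422},
    }]},
}

for record in summarize_elements(sample):
    print(record["page"], record["type"], record["text"])
```

Normalized boxes are convenient when the page is later rendered at a different resolution, since each value can be multiplied by the rendered width or height.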
Next steps
- Extract document elements — Learn about all element types and spatial output options.
- Extract Markdown — Get whole-document Markdown instead of structured elements.
- Multilingual extraction — Process documents in other languages.
- API overview — Review request formats, authentication, and response structure.