Parse configuration
The Nutrient DWS Data Extraction API extract endpoint runs in two stages. It first parses the document into structured context, and then extracts your schema’s fields from that context.
Use the parseConfig object and the top-level instructions string to configure the parse stage.
Parse modes
parseConfig.mode selects the vision pipeline that runs before extraction. The extract endpoint doesn’t support text mode because schema extraction requires structured, spatial context. To compare this behavior with parsing, refer to the parse endpoint guide.
| Mode | Pipeline | When to use |
|---|---|---|
| structure | OCR-backed structured extraction | Clean, simple layouts where processing time and cost matter most. |
| understand | ICR-backed document understanding | Default. Most documents, including tables, forms, and multicolumn layouts. |
| agentic | VLM-enhanced analysis | The most complex documents, such as degraded scans, cursive handwriting, and dense visual content. |
The default mode is understand. The parse mode affects extraction quality and the parse component of the request cost. For credit details, refer to the pricing guide.
```json
{
  "schema": {
    "type": "object",
    "properties": {
      "total": { "type": "number" }
    }
  },
  "parseConfig": { "mode": "structure" }
}
```

Language hints
Set parseConfig.options.language to guide OCR for non-English documents. It accepts these values:
- A lowercase language name, such as "english" or "german".
- An ISO 639-2 code, such as "eng" or "deu".
- For multilingual documents, an array such as ["eng", "spa"] or a +-joined string such as "eng+spa".
The following example configures OCR for English and German:
```json
{
  "schema": {
    "type": "object",
    "properties": {
      "total": { "type": "number" }
    }
  },
  "parseConfig": {
    "mode": "understand",
    "options": { "language": ["eng", "deu"] }
  }
}
```

For the full list of codes and aliases, refer to the supported languages guide.
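If language hints reach your code in mixed forms, a small client-side helper can tidy them before you build the request body. The sketch below is only an illustration: `normalize_language_hint` and `with_language` are hypothetical names, not part of the API, and lowercasing the values is an assumption based on the lowercase examples above. It does not check values against the supported languages list.

```python
from typing import List, Union


def normalize_language_hint(hint: Union[str, List[str]]) -> Union[str, List[str]]:
    """Hypothetical helper: tidy a language hint before sending it.

    Accepts a single value ("german"), a "+"-joined string ("eng+spa"),
    or a list (["eng", "spa"]). Returns a string for one language and a
    list for several. Lowercasing is an assumption, not documented behavior.
    """
    parts = hint.split("+") if isinstance(hint, str) else list(hint)
    cleaned = [p.strip().lower() for p in parts if p.strip()]
    if not cleaned:
        raise ValueError("language hint must contain at least one value")
    return cleaned[0] if len(cleaned) == 1 else cleaned


def with_language(body: dict, hint: Union[str, List[str]]) -> dict:
    """Return a copy of the request body with parseConfig.options.language set."""
    parse_config = dict(body.get("parseConfig", {}))
    options = dict(parse_config.get("options", {}))
    options["language"] = normalize_language_hint(hint)
    parse_config["options"] = options
    return {**body, "parseConfig": parse_config}
```

Calling `with_language(body, "eng+deu")` leaves the original `body` untouched and returns a new dictionary whose `parseConfig.options.language` is a two-item list.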
Free-text instructions
The top-level instructions string gives the extraction model document-wide guidance that doesn’t belong on a single schema field. It accepts up to 10,000 characters:
```json
{
  "schema": {
    "type": "object",
    "properties": {
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": { "type": "string" }
          }
        }
      }
    }
  },
  "instructions": "Extract all line items exactly as they appear in the invoice table. Treat shipping and handling as separate line items.",
  "parseConfig": { "mode": "understand" }
}
```

Use instructions for cross-field rules and disambiguation. Use a field’s description for per-field guidance. For field-level descriptions, refer to the define a schema guide.
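Because the instructions string has a length limit, it can help to check it before sending a request. The following is a minimal sketch, assuming the limit is counted the same way as Python's `len()` on a string, which the guide doesn't specify. `build_extract_body` is a hypothetical helper name, not part of the API.

```python
MAX_INSTRUCTIONS_CHARS = 10_000  # limit stated in this guide


def build_extract_body(schema: dict, instructions: str = "", mode: str = "understand") -> dict:
    """Hypothetical helper: assemble an extract request body.

    Raises ValueError when the instructions string exceeds the documented
    limit. How the service counts characters is an assumption here.
    """
    if len(instructions) > MAX_INSTRUCTIONS_CHARS:
        raise ValueError(
            f"instructions has {len(instructions)} characters; "
            f"the limit is {MAX_INSTRUCTIONS_CHARS}"
        )
    body = {"schema": schema, "parseConfig": {"mode": mode}}
    if instructions:
        # Only include the key when there is document-wide guidance to send.
        body["instructions"] = instructions
    return body
```

Serialize the returned dictionary with `json.dumps` to get a request body like the example above.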
Choose a parse mode
Start with the default understand mode. Move to a different mode only when your documents or cost requirements call for it.
- Drop to structure when documents have clean, predictable layouts and you need lower cost or shorter processing time.
- Move to agentic when understand leaves visible gaps, for example on degraded scans, cursive or freeform handwriting, or dense visual content that requires visual reasoning.
Test a representative sample at each mode before you commit a pipeline to a mode. Compare the extracted data and citation confidence. To configure citations, refer to the citations and confidence guide.
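One way to set up such a comparison is to generate one request body per mode from a shared base, so that only parseConfig.mode differs between runs. The sketch below assumes you send and compare the requests yourself. `bodies_for_each_mode` is a hypothetical helper name, and the check against `EXTRACT_MODES` reflects the three modes listed in this guide.

```python
import copy

# Modes the extract endpoint accepts according to this guide; "text" is not among them.
EXTRACT_MODES = ("structure", "understand", "agentic")


def bodies_for_each_mode(base_body: dict, modes=EXTRACT_MODES) -> dict:
    """Hypothetical helper: return {mode: request_body}, varying only the mode."""
    bodies = {}
    for mode in modes:
        if mode not in EXTRACT_MODES:
            raise ValueError(f"unsupported parse mode for extraction: {mode!r}")
        body = copy.deepcopy(base_body)  # keep the shared base unchanged
        body.setdefault("parseConfig", {})["mode"] = mode
        bodies[mode] = body
    return bodies
```

Keeping the schema, instructions, and language options identical across the generated bodies means any difference in the results comes from the parse mode alone.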
Next steps
Use these guides to continue configuring extraction:
- Refer to the define a schema guide for field types, supported keywords, and limits.
- Refer to the citations and confidence guide for per-field grounding and confidence signals.
- Refer to the processing modes guide for a detailed comparison of parse-stage modes.