Parse configuration

The Nutrient DWS Data Extraction API extract endpoint runs in two stages. It first parses the document into structured context, and then extracts your schema’s fields from that context.

Use the parseConfig object and the top-level instructions string to configure the parse stage.

Parse modes

parseConfig.mode selects the vision pipeline that runs before extraction. The extract endpoint doesn’t support text mode because schema extraction requires structured, spatial context. To compare this behavior with parsing, refer to the parse endpoint guide.

Mode	Pipeline	When to use
`structure`	OCR-backed structured extraction	Clean, simple layouts where processing time and cost matter most.
`understand`	ICR-backed document understanding	Default. Most documents, including tables, forms, and multicolumn layouts.
`agentic`	VLM-enhanced analysis	The most complex documents — degraded scans, cursive handwriting, dense visual content.

The default mode is understand. The parse mode affects extraction quality and the parse component of the request cost. For credit details, refer to the pricing guide.

The following example selects structure mode for a request that extracts a single total field:

{
  "schema": { "type": "object", "properties": { "total": { "type": "number" } } },
  "parseConfig": { "mode": "structure" }
}
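
As a rough sketch, a client could reject unsupported modes before it sends anything. The Python helper below is illustrative only: build_extract_body and EXTRACT_PARSE_MODES are hypothetical names, not part of any Nutrient SDK, and the helper only assembles the request body shown above without calling the API.

```python
import json

# Modes listed in the table above. "text" is deliberately absent because
# the extract endpoint doesn't support it.
EXTRACT_PARSE_MODES = ("structure", "understand", "agentic")


def build_extract_body(schema, mode="understand"):
    """Return a request body dict with a locally validated parseConfig.mode.

    Hypothetical helper: it builds the JSON body only and sends nothing.
    """
    if mode not in EXTRACT_PARSE_MODES:
        raise ValueError(
            f"Unsupported parse mode {mode!r}; expected one of {EXTRACT_PARSE_MODES}"
        )
    return {"schema": schema, "parseConfig": {"mode": mode}}


schema = {"type": "object", "properties": {"total": {"type": "number"}}}
body = build_extract_body(schema, mode="structure")
print(json.dumps(body, indent=2))
```

Passing mode="text" raises a ValueError locally, which surfaces the mistake before a request is made.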

Language hints

Set parseConfig.options.language to guide OCR for non-English documents. It accepts these values:

A lowercase language name, such as "english" or "german".
An ISO 639-2 code, such as "eng" or "deu".
For multilingual documents, an array such as ["eng", "spa"] or a string of codes joined with +, such as "eng+spa".

The following example configures OCR for English and German:

{
  "schema": { "type": "object", "properties": { "total": { "type": "number" } } },
  "parseConfig": {
    "mode": "understand",
    "options": { "language": ["eng", "deu"] }
  }
}

For the full list of codes and aliases, refer to the supported languages guide.
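
If language hints arrive from user input in mixed forms, it can help to normalize them before building the request. The following Python sketch is an assumption about client-side handling, not documented API behavior: normalize_languages and with_language are hypothetical helpers, and lowercasing the values is a choice made here to match the lowercase names and codes described above.

```python
def normalize_languages(language):
    """Split a "+"-joined string or copy a list into lowercase values."""
    if isinstance(language, str):
        parts = language.split("+")
    else:
        parts = list(language)
    cleaned = [part.strip().lower() for part in parts]
    if not cleaned or any(not part for part in cleaned):
        raise ValueError(f"Empty language value in {language!r}")
    return cleaned


def with_language(body, language):
    """Return a copy of body with parseConfig.options.language set.

    A single language is written as a string and several as an array,
    mirroring the accepted forms listed above.
    """
    values = normalize_languages(language)
    parse_config = dict(body.get("parseConfig", {}))
    options = dict(parse_config.get("options", {}))
    options["language"] = values[0] if len(values) == 1 else values
    parse_config["options"] = options
    return {**body, "parseConfig": parse_config}


base = {"schema": {"type": "object"}, "parseConfig": {"mode": "understand"}}
print(with_language(base, "eng+deu")["parseConfig"])
```

The helper copies the nested dictionaries rather than mutating them, so the same base body can be reused for several language combinations.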

Free-text instructions

The top-level instructions string gives the extraction model document-wide guidance that doesn’t belong on a single schema field. It accepts up to 10,000 characters. The following example adds invoice-wide guidance for extracting line items:

{
  "schema": {
    "type": "object",
    "properties": {
      "line_items": {
        "type": "array",
        "items": { "type": "object", "properties": { "description": { "type": "string" } } }
      }
    }
  },
  "instructions": "Extract all line items exactly as they appear in the invoice table. Treat shipping and handling as separate line items.",
  "parseConfig": { "mode": "understand" }
}

Use instructions for cross-field rules and disambiguation. Use a field’s description for per-field guidance. For field-level descriptions, refer to the define a schema guide.
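
Because instructions has a length limit, a client may want to check it before sending a request. This Python sketch assumes the limit can be approximated with len(), which counts Unicode code points. How the API counts characters isn't specified here, so treat the check as a local guard rather than an exact mirror of server-side validation. add_instructions is a hypothetical helper name.

```python
MAX_INSTRUCTIONS_CHARS = 10_000  # limit stated in this guide


def add_instructions(body, instructions):
    """Return a copy of body with a length-checked top-level instructions string."""
    length = len(instructions)  # assumption: code points approximate the API's count
    if length > MAX_INSTRUCTIONS_CHARS:
        raise ValueError(
            f"instructions has {length} characters; the limit is {MAX_INSTRUCTIONS_CHARS}"
        )
    return {**body, "instructions": instructions}


body = {"schema": {"type": "object"}, "parseConfig": {"mode": "understand"}}
body = add_instructions(
    body, "Treat shipping and handling as separate line items."
)
print(sorted(body))
```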

Choose a parse mode

Start with the default understand mode. Move to a different mode only when your documents or cost requirements call for it.

Drop to structure when documents have clean, predictable layouts and you need lower cost or shorter processing time.
Move to agentic when understand leaves visible gaps in the output, for example on degraded scans, cursive or freeform handwriting, or dense visual content that requires visual reasoning.

Test a representative sample at each mode before you commit a pipeline to a mode. Compare the extracted data and citation confidence. To configure citations, refer to the citations and confidence guide.
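
The comparison described above could be sketched as a small harness. Everything here is an assumption made for illustration: extract is a caller-supplied function that sends one request and returns the extracted fields as a dict, and the response shape, the document identifiers, and the stub values are all hypothetical. The sketch compares extracted values only and leaves citation confidence to whatever fields your responses actually contain.

```python
MODES = ("structure", "understand", "agentic")


def compare_modes(extract, schema, documents, modes=MODES):
    """Run each document through each mode and collect the results.

    `extract(document, body)` is supplied by the caller (hypothetical here).
    """
    results = {}
    for doc in documents:
        results[doc] = {
            mode: extract(doc, {"schema": schema, "parseConfig": {"mode": mode}})
            for mode in modes
        }
    return results


def fields_that_differ(per_mode):
    """Return sorted field names whose values are not identical across modes."""
    names = set()
    for data in per_mode.values():
        names.update(data)
    return sorted(
        name
        for name in names
        if len({repr(data.get(name)) for data in per_mode.values()}) > 1
    )


def stub_extract(document, body):
    """Stand-in for a real API call; returns made-up values per mode."""
    mode = body["parseConfig"]["mode"]
    return {"total": 10.0 if mode == "structure" else 12.5, "currency": "EUR"}


results = compare_modes(stub_extract, {"type": "object"}, ["invoice-1.pdf"])
print(fields_that_differ(results["invoice-1.pdf"]))
```

Fields that differ between modes are the ones worth reviewing by hand before you settle on a mode for the pipeline.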

Next steps

Use these guides to continue configuring extraction:

Refer to the define a schema guide for field types, supported keywords, and limits.
Refer to the citations and confidence guide for per-field grounding and confidence signals.
Refer to the processing modes guide for a detailed comparison of parse-stage modes.
