Extract data from documents anywhere in your workflow

Use the AI Data Extraction Task Type to extract data from documents by leveraging AI services such as OpenAI or Claude. This task type processes documents in the background and automatically populates form fields with extracted data, eliminating the need for manual data entry.

Note: This feature involves Large Language Models (LLMs). While LLMs are powerful, they can occasionally generate inaccurate or fabricated information (“hallucinations”). Always test your prompts with sample documents to ensure high-quality results.

Prerequisites

Before beginning, have the following ready:

  • An account with a chosen AI provider (e.g. OpenAI or Claude).
  • A valid API key from the AI provider.
  • Administrative access to your workflow automation system.
  • Sample documents for testing the extraction process.

Workflow Overview

Below is a simplified overview of how to use the AI Data Extraction task:

  1. Create an API Credential

    Generate your API key on the provider’s website (OpenAI or Claude), then register it in Nutrient Workflow Automation’s AI Sources.

  2. Design Your Process

    Create a workflow process with forms that include file upload fields and target fields for extracted data.

  3. Add & Configure the AI Data Extraction Task

    • Add the AI Data Extraction task to your process
    • Select the AI provider connection
    • Add a custom prompt describing what to extract
    • Configure prefills to reference uploaded documents
    • Test the prompt with sample documents and refine as needed
    • Map the AI's JSON response to process outputs

Step 1: Create a New API Credential

  1. Obtain an API Key from your AI Provider
  2. Add the Credential in Nutrient Workflow Automation
    1. Open Settings > AI Sources
    2. Click Add Connection and select the AI Provider and Model.
    3. Select Credentials (either select an existing one or add a new one).
    4. Enter a name for your credential (for example, “OpenAI Key” or “Claude Key”)
    5. Paste your API key from the AI provider.
    6. Click Save.

Step 2: Design Your Workflow Process

  1. Create Initial Form Task

    • Add a form task where users can upload documents
    • Include file attachment fields for documents to be processed
    • Add any other input fields needed for the workflow
  2. Add Target Forms for Extracted Data

    • Create forms with fields that will hold the AI-extracted data
    • For example, if extracting invoice information, include fields such as:
      • Invoice Number
      • Due Date
      • Customer Name
      • Payment Terms
      • Total Amount
  3. Design Process Flow

    • Connect your tasks in the desired sequence
    • Plan where the AI Data Extraction task will fit in your process flow

Step 3: Add and Configure the AI Data Extraction Task

  1. Add the Task to Your Process

    1. In the process designer, add a new AI Data Extraction task to your workflow
    2. Position it after the form task that collects the documents to be processed
  2. Configure AI Settings

    1. Right-click the AI Data Extraction task and select Configuration > Configure Task to open the configuration
    2. In the Settings tab, under the AI Settings section:
      • Select the AI Connection created in Step 1
      • The selected provider and model will be displayed
  3. Create the Prompt with Prefills

    The prompt is the instruction given to the AI to tell it what to extract and how to format the response. This step is crucial for getting accurate results.

    Basic Prompt Structure:

    You are given a document. Extract the following information:
    1. Invoice Number
    2. Invoice Date
    3. Customer Name
    4. Total Amount
    5. Line Items (description and amount for each)

    Adding Document References: To reference uploaded documents from previous tasks, use prefills:

    1. Click Add prefills to Prompt
    2. Select the task and field containing the uploaded document
    3. The prefill will be inserted as: {[Task_Input|TaskName|FieldName]}

    Example Prompt with Prefills:

    Analyze the uploaded invoice document and extract:
    - Invoice Number
    - Invoice Date (format: YYYY-MM-DD)
    - Customer Name
    - Total Amount
    - Line Items with descriptions and amounts
    Document to process:
    {[Task_Input|InvoiceUpload|InvoiceFile]}
    Return as JSON with exact field names.
  4. Configure Advanced Settings (Optional)

    Expand the Advanced Settings section to fine-tune AI behavior:

    • Temperature (0-1): Controls randomness (lower = more focused)
    • Max Tokens: Maximum response length
    • Stop Sequences: Text that stops AI generation
    • Retry on Error: Enable automatic retries for failed processing
    • Max Retries: Number of retry attempts (1-10)
    • Return to Request Detail: Process in background while user continues
  5. Test the Configuration

    Before deploying, test your AI configuration:

    1. In the Test the prompt section, set Test Prefill Values
    2. For file attachment prefills, upload sample documents
    3. Click Run Test to see the AI response
    4. Review the JSON output and refine the prompt if needed
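
When reviewing the test output, it can help to check the response mechanically rather than by eye. The following is a minimal sketch in Python, run outside the product, assuming you copy the JSON text from the test result. The function name `check_response`, the expected field list (taken from the invoice example prompt), and the sample values are illustrative assumptions, not part of Nutrient Workflow Automation.

```python
import json

# Field names from the invoice example prompt; adjust to match your own prompt.
EXPECTED_FIELDS = ["Invoice Number", "Invoice Date", "Customer Name", "Total Amount", "Line Items"]


def check_response(raw_text, expected_fields=EXPECTED_FIELDS):
    """Return a list of problems found in the AI response text (empty if none)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"Response is not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["Response is valid JSON but not an object with named fields"]
    problems = []
    for field in expected_fields:
        if field not in data:
            problems.append(f"Missing field: {field}")
        elif data[field] in (None, "", []):
            problems.append(f"Empty value for field: {field}")
    return problems


# Made-up response for illustration only; note that "Line Items" is absent.
sample = '{"Invoice Number": "INV-001", "Invoice Date": "2024-01-15", "Customer Name": "Acme", "Total Amount": 120.5}'
print(check_response(sample))
```

If the list of problems is not empty, that is usually a sign the prompt should name the fields more explicitly or repeat the instruction to return JSON only.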

Step 4: Configure Response Mappings

Map the AI response fields to process outputs so the extracted data can be used in subsequent tasks:

  1. Switch to Response Values Tab

    Click on the Response Values tab to configure output mappings.

  2. Add Response Mappings

    For each field you want to extract from the AI response:

    1. Click Add to create a new mapping
    2. Key: Unique identifier for this output
    3. Label: Display name for the field
    4. AI Response Field: The JSON field name from the AI response
    5. Data Type: Type of data (String, Number, Date, File Attachment)

    Example Mappings:

    • Key: invoice_number, AI Response Field: Invoice Number, Data Type: String
    • Key: total_amount, AI Response Field: Total Amount, Data Type: Number
    • Key: invoice_date, AI Response Field: Invoice Date, Data Type: Date
  3. Configure File Attachments (if applicable)

    For extracted files or generated documents:

    • Set Data Type to "File Attachment"
    • Configure Filename source (fixed value or from AI response)
    • The AI response should contain base64-encoded file data
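
Conceptually, each mapping picks one field out of the AI's JSON response and converts it to the chosen data type. The sketch below illustrates that idea in Python using the example mappings above. It is not how the product implements mappings: the `MAPPINGS` structure, the `convert` and `apply_mappings` functions, and the sample response are all assumptions made for illustration.

```python
import base64
import json
from datetime import date

# Hypothetical representation of the example mappings: key -> (AI response field, data type).
MAPPINGS = {
    "invoice_number": ("Invoice Number", "String"),
    "total_amount": ("Total Amount", "Number"),
    "invoice_date": ("Invoice Date", "Date"),
}


def convert(value, data_type):
    """Convert one raw JSON value to the mapped data type."""
    if data_type == "String":
        return str(value)
    if data_type == "Number":
        return float(value)
    if data_type == "Date":
        return date.fromisoformat(value)  # expects YYYY-MM-DD, as requested in the prompt
    if data_type == "File Attachment":
        return base64.b64decode(value)  # base64 text -> raw file bytes
    raise ValueError(f"Unknown data type: {data_type}")


def apply_mappings(response_text, mappings=MAPPINGS):
    """Return a dict of output key -> converted value; fields absent from the response map to None."""
    response = json.loads(response_text)
    outputs = {}
    for key, (field, data_type) in mappings.items():
        raw = response.get(field)
        outputs[key] = None if raw is None else convert(raw, data_type)
    return outputs


# Made-up response for illustration only.
sample = '{"Invoice Number": "INV-001", "Total Amount": "1250.00", "Invoice Date": "2024-01-15"}'
print(apply_mappings(sample))
```

In this sketch, a response field whose name does not match the mapping exactly yields no value, which is why the prompt asks the AI to return exact field names and a fixed date format.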

Step 5: Use Extracted Data in Subsequent Tasks

Once the AI Data Extraction task completes, use the extracted data in subsequent form tasks:

  1. Create Follow-up Form Tasks

    Add form tasks after the AI Data Extraction task with fields for the extracted data.

  2. Configure Prefills in Forms

    In your form fields, set up prefills to populate with extracted data:

    1. Right-click the Form task and select Configuration > Configure Task to open the configuration

    2. Go to the PREFILL SETTINGS tab

    3. Click Edit 🖊️ on each field to prefill it with the data extracted by the AI

      • Source: Select Data
      • Task: Select your AI Data Extraction task
      • Field: Select the corresponding response output label defined in Step 4

      Example:

      • Invoice Number field prefilled with Invoice Number output
      • Total Amount field prefilled with Total Amount output

Step 6: Deploy and Monitor

  1. Test End-to-End

    Run a complete test of your workflow:

    • Submit documents through the initial form
    • Monitor the AI Data Extraction task execution
    • Verify extracted data appears correctly in subsequent forms

Tips & Best Practices

  • Give Clear Instructions: Provide concise, direct prompts for better results.
  • Validate the AI Output: Always confirm the AI response is formatted correctly as JSON and that it includes the desired fields.
  • Refine Iteratively: Small changes in prompt wording can produce significantly different results, so experiment until the output is consistent across your sample documents.
  • Watch for Hallucinations: The AI may sometimes return information that isn’t in the document. If this happens, refine the prompt and validate the extracted values against the source document.
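
One simple way to spot possible hallucinations is to check that extracted text values actually occur in the source document. The sketch below is a rough heuristic in Python, run outside the product. It assumes you already have the document's text as a plain string (for example, from a separate text extraction step); the function names and the sample data are illustrative assumptions.

```python
def normalize(text):
    """Lowercase and collapse whitespace so minor formatting differences do not matter."""
    return " ".join(text.lower().split())


def find_unsupported_values(extracted, document_text):
    """Return the names of string fields whose value does not appear in the document text."""
    haystack = normalize(document_text)
    unsupported = []
    for field, value in extracted.items():
        # Only non-empty string values are checked; numbers and lists are skipped.
        if isinstance(value, str) and value and normalize(value) not in haystack:
            unsupported.append(field)
    return unsupported


# Made-up document text and extraction result for illustration only.
document_text = "Invoice INV-001\nBill to: Acme Corp\nPayment terms: Net 30"
extracted = {"Invoice Number": "INV-001", "Customer Name": "Acme Corp", "Payment Terms": "Net 45"}
print(find_unsupported_values(extracted, document_text))
```

A check like this flags values for human review rather than proving them wrong: values the prompt asks the AI to reformat, such as dates converted to YYYY-MM-DD, will not match the document text literally.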