Preview: Agent Builder is currently in preview and may change before general availability.
In this tutorial, you build a document processing pipeline that verifies customer onboarding documents against application data. The pipeline receives uploaded files (ID card, proof of address, salary slip), classifies each document, extracts structured data, compares it to what the applicant declared, and routes discrepancies to a human reviewer.

What you will build:
  • A file upload UI that accepts multiple documents
  • A document classification workflow that identifies each document type using AI
  • A fan-out extraction pipeline that routes each type to a specialized extractor
  • An AI reconciliation step that compares extracted data against application data
  • Business rules that flag mismatches (name, address, income)
  • A human review task for documents with discrepancies
  • A summary generation step that produces a verification report
AI node types used: Text Understanding, Extract Data from File, Text Generation
Patterns demonstrated: Fan-out extraction, AI comparison and reconciliation

Architecture overview

The pipeline processes documents in four phases: upload, classify and extract, reconcile, and review.

Workflow breakdown:

Workflow | AI nodes | Purpose
classifyAndExtract | Text Understanding + Extract Data from File | Classify document type, then extract fields using type-specific prompts
reconcileData | Text Understanding | Compare extracted fields against application data
generateSummary | Text Generation | Produce a human-readable verification report

Prerequisites

Before starting, make sure you have:
  • Access to a FlowX Designer workspace with AI Platform enabled
  • Familiarity with creating processes, workflows, and UI flows in FlowX
  • A project with the Documents Plugin configured (for file uploads)

Data model

Define the following data model keys in your process. These keys hold the application data submitted by the customer and the results produced by the AI pipeline.
{
  "applicant": {
    "firstName": "string",
    "lastName": "string",
    "dateOfBirth": "string",
    "address": {
      "street": "string",
      "city": "string",
      "postalCode": "string",
      "country": "string"
    },
    "monthlyIncome": "number",
    "employer": "string"
  },
  "documents": {
    "uploadedFiles": [
      {
        "fileId": "string",
        "filePath": "string",
        "fileName": "string"
      }
    ]
  },
  "extraction": {
    "classifiedDocs": [
      {
        "fileId": "string",
        "documentType": "string",
        "confidence": "number",
        "extractedData": "object"
      }
    ]
  },
  "reconciliation": {
    "matchRate": "number",
    "fieldResults": "array",
    "exceptions": "array",
    "overallStatus": "string"
  },
  "review": {
    "reviewerDecision": "string",
    "reviewerNotes": "string"
  },
  "summary": {
    "report": "string"
  }
}
Define these keys under your process data model before building the workflows. The AI nodes and business rules reference these paths at runtime.
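As a quick sanity check while wiring things up, here is a minimal Python sketch that builds a sample payload matching the model above and verifies the top-level keys are present. The values are illustrative; only the key structure mirrors the tutorial's data model.

```python
# Illustrative sample instance of the process data model.
# Values are made up; only the key structure follows the model above.
sample = {
    "applicant": {
        "firstName": "Jane",
        "lastName": "Doe",
        "dateOfBirth": "1990-04-12",
        "address": {
            "street": "1 Main St",
            "city": "Amsterdam",
            "postalCode": "1011AB",
            "country": "NL",
        },
        "monthlyIncome": 4800,
        "employer": "Acme BV",
    },
    "documents": {"uploadedFiles": []},
    "extraction": {"classifiedDocs": []},
}

def has_keys(data: dict, keys: list) -> bool:
    """Return True when every expected top-level key is present."""
    return all(k in data for k in keys)

print(has_keys(sample, ["applicant", "documents", "extraction"]))  # True
```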

Step 1: Build the classification and extraction workflow

Create a workflow named classifyAndExtract. This workflow receives a single document file path, classifies the document type, and then routes to the appropriate extraction branch. This implements the fan-out extraction pattern.

1.1 Add the classification node

Add a Text Understanding node as the first node after Start Flow. This node reads the document content and classifies it.

Operation Prompt:
You are a document classifier for a bank's customer onboarding process.
Analyze the provided document and determine its type.

Respond with exactly one of the following values:
- ID_CARD
- PROOF_OF_ADDRESS
- SALARY_SLIP
- UNKNOWN

Classification rules:
- ID_CARD: Government-issued identification documents (passport, national ID,
  driver's license). Contains photo, name, date of birth, document number.
- PROOF_OF_ADDRESS: Utility bills, bank statements, or government letters
  showing a name and residential address. Must be dated within the last
  3 months.
- SALARY_SLIP: Employment payslips or salary certificates showing employer
  name, employee name, gross/net salary, and pay period.
- UNKNOWN: Document does not match any of the above categories.

Base your classification on the document layout, headers, field labels,
and content structure. If uncertain, return UNKNOWN.
Response schema:
{
  "type": "object",
  "properties": {
    "document_type": {
      "type": "string",
      "enum": ["ID_CARD", "PROOF_OF_ADDRESS", "SALARY_SLIP", "UNKNOWN"]
    },
    "confidence": {
      "type": "number",
      "description": "Classification confidence between 0 and 1"
    },
    "reasoning": {
      "type": "string",
      "description": "Brief explanation of why this type was chosen"
    }
  },
  "required": ["document_type", "confidence"]
}
Response Key: classificationResult

1.2 Add the Condition node

Add a Condition node after the Text Understanding node. Configure branches based on the document_type value:

Branch | Condition (Python) | Target
If | input["classificationResult"]["document_type"] == "ID_CARD" | Extract Data from File (ID card)
Else if | input["classificationResult"]["document_type"] == "PROOF_OF_ADDRESS" | Extract Data from File (proof of address)
Else if | input["classificationResult"]["document_type"] == "SALARY_SLIP" | Extract Data from File (salary slip)
Else | (default) | Script node (unknown document)
The Else branch handles UNKNOWN documents. Use a Script node to return a structured error so the parent process can flag the document for manual classification.
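The Else-branch Script node can stay very small. A Python sketch of one possible structured error (the output shape here is an assumption; adapt it to however your parent process reads workflow results):

```python
# Hypothetical Script-node body for the Else (UNKNOWN) branch.
# Returns a structured error instead of extracted data so the parent
# process can flag the document for manual classification.
def handle_unknown(classification: dict) -> dict:
    return {
        "documentType": "UNKNOWN",
        "confidence": classification.get("confidence", 0),
        "extractedData": None,
        "error": {
            "code": "UNCLASSIFIED_DOCUMENT",
            "message": "Document type could not be determined; "
                       "route to manual classification.",
        },
    }

result = handle_unknown({"document_type": "UNKNOWN", "confidence": 0.3})
```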

1.3 Configure type-specific extraction nodes

Each branch contains an Extract Data from File node with a prompt and schema tailored to that document type.
Extraction Strategy: LLM Model (handles varied ID layouts, photos, and security features)

Operation Prompt:
Extract all personal identification fields from this ID document.
The document may be a passport, national ID card, or driver's license.
If a field is not present or not legible, return null for that field.
Response schema:
{
  "type": "object",
  "properties": {
    "document_number": { "type": "string" },
    "first_name": { "type": "string" },
    "last_name": { "type": "string" },
    "date_of_birth": {
      "type": "string",
      "description": "Format: YYYY-MM-DD"
    },
    "nationality": { "type": "string" },
    "expiry_date": {
      "type": "string",
      "description": "Format: YYYY-MM-DD"
    },
    "issuing_authority": { "type": "string" },
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "postal_code": { "type": "string" },
        "country": { "type": "string" }
      }
    }
  },
  "required": ["first_name", "last_name", "date_of_birth"]
}
Response Key: extractedData
Choose the extraction strategy based on the document characteristics. LLM Model provides the highest accuracy for complex layouts but costs more per page. See Extract Data from File for strategy comparison.

1.4 Add the End Flow node

Add an End Flow node where all branches converge. Set the body to pass results back to the parent process:
{
  "output": {
    "documentType": "${classificationResult.document_type}",
    "confidence": "${classificationResult.confidence}",
    "extractedData": ${extractedData}
  }
}

Step 2: Build the reconciliation workflow

Create a workflow named reconcileData. This workflow compares the extracted document data against the applicant’s declared data. This implements the AI comparison and reconciliation pattern.

2.1 Add the comparison node

Add a Text Understanding node that receives both the extracted data and the application data.

Operation Prompt:
You are a document verification agent for a bank's customer onboarding
process. Compare the AI-extracted document data against the applicant's
declared data and produce a structured exception report.

Instructions:
1. Compare each field individually. Use fuzzy matching for names
   (e.g., "John Smith" vs "JOHN SMITH" is a MATCH, "Jon Smith" vs
   "John Smith" is a WARNING).
2. For addresses, compare at the component level (street, city,
   postal code). Minor formatting differences are acceptable.
3. For dates, normalize to YYYY-MM-DD before comparing.
4. For income, flag if the extracted net salary differs from the
   declared monthly income by more than 10%.
5. Compute an overall match rate as a percentage (0-100).
6. Assign a confidence score (0-100) reflecting how certain you are
   in the comparison results.
7. Flag each exception with a severity:
   - CRITICAL: Identity mismatch (different person), expired document
   - WARNING: Minor name variation, address component mismatch,
     income difference 10-25%
   - INFO: Formatting differences, abbreviations

Extracted document data:
${extraction.classifiedDocs}

Applicant's declared data:
${applicant}
Response schema:
{
  "type": "object",
  "properties": {
    "matchRate": {
      "type": "number",
      "description": "Overall match rate from 0 to 100"
    },
    "confidenceScore": {
      "type": "number",
      "description": "Confidence in comparison accuracy from 0 to 100"
    },
    "fieldResults": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "fieldName": { "type": "string" },
          "documentType": { "type": "string" },
          "extractedValue": { "type": "string" },
          "declaredValue": { "type": "string" },
          "status": {
            "type": "string",
            "enum": ["MATCH", "MISMATCH", "MISSING"]
          },
          "severity": {
            "type": "string",
            "enum": ["CRITICAL", "WARNING", "INFO"]
          },
          "note": { "type": "string" }
        },
        "required": ["fieldName", "status", "severity"]
      }
    },
    "exceptions": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "fieldName": { "type": "string" },
          "description": { "type": "string" },
          "severity": {
            "type": "string",
            "enum": ["CRITICAL", "WARNING", "INFO"]
          }
        },
        "required": ["fieldName", "description", "severity"]
      }
    },
    "overallStatus": {
      "type": "string",
      "enum": ["APPROVED", "REVIEW_REQUIRED", "REJECTED"],
      "description": "Recommended action based on match rate and exceptions"
    }
  },
  "required": ["matchRate", "confidenceScore", "fieldResults",
    "exceptions", "overallStatus"]
}
Response Key: reconciliationResult
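Because several of the comparison rules above are deterministic, you can mirror them in plain code and cross-check the LLM's output. A Python sketch of the name-normalization and 10% income rules (function names are illustrative):

```python
def names_match(a: str, b: str) -> bool:
    """Case- and whitespace-insensitive exact match (rule 1's MATCH case)."""
    return " ".join(a.split()).casefold() == " ".join(b.split()).casefold()

def income_flagged(declared: float, extracted: float, threshold: float = 0.10) -> bool:
    """True when the extracted net salary differs from the declared
    monthly income by more than 10% (rule 4)."""
    return abs(declared - extracted) / declared > threshold

print(names_match("John Smith", "JOHN SMITH"))  # True
print(income_flagged(5000, 4200))               # True (16% difference)
```

Disagreements between these checks and the AI's fieldResults are a useful signal that the reconciliation prompt needs tightening.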

2.2 Add the End Flow node

Add an End Flow node. Set the body to return the reconciliation result to the parent process:

{
  "output": ${reconciliationResult}
}

Step 3: Build the summary generation workflow

Create a workflow named generateSummary with a single Text Generation node.

Operation Prompt:
You are a compliance documentation assistant. Generate a document
verification summary report based on the reconciliation results.

Structure the report as follows:

1. VERIFICATION OVERVIEW
   - Applicant name
   - Number of documents processed
   - Overall match rate
   - Verification status (Approved / Review Required / Rejected)

2. DOCUMENT DETAILS
   For each document:
   - Document type and classification confidence
   - Fields extracted
   - Match/mismatch status per field

3. EXCEPTIONS
   - List all exceptions with severity and description
   - Highlight any CRITICAL issues

4. RECOMMENDATION
   - Clear recommendation based on the findings
   - Specific follow-up actions if needed

Use professional, concise language. Format the report in Markdown.

Applicant data:
${applicant}

Extraction results:
${extraction.classifiedDocs}

Reconciliation results:
${reconciliation}

Reviewer notes (if any):
${review.reviewerNotes}
Response Key: summaryReport

End Flow body:
{
  "output": {
    "report": "${summaryReport}"
  }
}

Step 4: Build the BPMN process

Create a process named documentVerify that orchestrates the full pipeline using the workflows you built.
1. Add a User Task for file upload

Add a User Task node after the Start Event. This task presents the file upload UI to the user. Configure the task with:
  • Task name: Upload documents
  • Assignment: Assigned to the initiating user
Attach a UI Flow (built in Step 5) that allows uploading multiple files.
2. Loop through uploaded documents

For each uploaded document, trigger the classifyAndExtract workflow. Add a Send Message Task node with a Start Integration Workflow action.

Input mapping:
{
  "filePath": "${documents.uploadedFiles[index].filePath}",
  "fileId": "${documents.uploadedFiles[index].fileId}"
}
Add a Receive Message Task node to capture the extraction output. Set the Result Key to extraction.classifiedDocs[index].
For multiple documents, repeat the Send/Receive pattern for each file, or use a loop structure with an exclusive gateway that iterates until all files are processed.
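If you choose the loop structure, the iteration bookkeeping can live in a small business rule. A Python sketch of one way to do it (the `loop` key is an assumption and is not part of the data model above; key names under `documents` follow the model):

```python
# Hypothetical loop-control rule: advance the index and record whether
# another classifyAndExtract round is needed. The exclusive gateway
# would loop back while state["loop"]["hasMore"] is True.
def advance_loop(state: dict) -> dict:
    files = state["documents"]["uploadedFiles"]
    index = state.get("loop", {}).get("index", -1) + 1
    state["loop"] = {
        "index": index,
        "hasMore": index < len(files),
    }
    return state

state = {"documents": {"uploadedFiles": [{"fileId": "a"}, {"fileId": "b"}]}}
state = advance_loop(state)  # index 0, hasMore True
state = advance_loop(state)  # index 1, hasMore True
state = advance_loop(state)  # index 2, hasMore False
```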
3. Trigger the reconciliation workflow

Add another Send Message Task with a Start Integration Workflow action pointing to the reconcileData workflow.

Input mapping:
{
  "classifiedDocs": "${extraction.classifiedDocs}",
  "applicant": "${applicant}"
}
Add a Receive Message Task node. Set the Result Key to reconciliation.
4. Add a business rule for validation

Add a Business Rule action (JavaScript) to perform deterministic validation checks that supplement the AI reconciliation.
// Check if any CRITICAL exceptions exist
var hasCritical = output.reconciliation.exceptions.some(function(e) {
  return e.severity === "CRITICAL";
});

// Check if ID document is expired
var idDoc = output.extraction.classifiedDocs.find(function(d) {
  return d.documentType === "ID_CARD";
});

var isExpired = false;
if (idDoc && idDoc.extractedData.expiry_date) {
  var expiryDate = new Date(idDoc.extractedData.expiry_date);
  isExpired = expiryDate < new Date();
}

// Check income discrepancy
var salaryDoc = output.extraction.classifiedDocs.find(function(d) {
  return d.documentType === "SALARY_SLIP";
});

var incomeDiscrepancy = false;
if (salaryDoc && salaryDoc.extractedData.net_salary) {
  var declared = output.applicant.monthlyIncome;
  var extracted = salaryDoc.extractedData.net_salary;
  var diff = Math.abs(declared - extracted) / declared;
  incomeDiscrepancy = diff > 0.1; // More than 10% difference
}

// Check all required document types are present
var docTypes = output.extraction.classifiedDocs.map(function(d) {
  return d.documentType;
});
var missingId = docTypes.indexOf("ID_CARD") === -1;
var missingAddress = docTypes.indexOf("PROOF_OF_ADDRESS") === -1;
var missingSalary = docTypes.indexOf("SALARY_SLIP") === -1;

// Set validation result
output.validation = {
  hasCriticalExceptions: hasCritical,
  isIdExpired: isExpired,
  incomeDiscrepancy: incomeDiscrepancy,
  missingDocuments: {
    idCard: missingId,
    proofOfAddress: missingAddress,
    salarySlip: missingSalary
  },
  requiresReview: hasCritical || isExpired || incomeDiscrepancy
    || missingId || missingAddress || missingSalary
};
Business rules provide deterministic, auditable checks. Use them alongside AI reconciliation to catch issues the LLM might miss, such as expired documents or missing required document types.
5. Add the routing gateway

Add an Exclusive Gateway after the business rule. Configure two branches:
Branch | Condition | Target
Auto-approve | validation.requiresReview == false AND reconciliation.matchRate >= 90 | Proceed to summary generation
Human review | (default) | Human review User Task
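For clarity, the auto-approve predicate can be expressed in Python (a sketch of the same logic, not code you deploy to the gateway itself):

```python
def auto_approve(validation: dict, reconciliation: dict) -> bool:
    """Mirror of the gateway condition: no review flags and a match
    rate of at least 90."""
    return (not validation["requiresReview"]) and reconciliation["matchRate"] >= 90

print(auto_approve({"requiresReview": False}, {"matchRate": 95}))  # True
print(auto_approve({"requiresReview": False}, {"matchRate": 85}))  # False
```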
6. Add the human review task

Add a User Task node for manual review. The reviewer sees:
  • Uploaded documents (viewable in a document viewer)
  • Extracted data side-by-side with declared data
  • The exception report from reconciliation
  • Validation flags from the business rule
The reviewer submits a decision:
  • Approve — continue to summary
  • Reject — end process with rejection status
  • Request re-upload — loop back to the upload step
Store the decision in review.reviewerDecision and any notes in review.reviewerNotes.
7. Trigger the summary generation workflow

After both the auto-approve and human-review-approve paths converge, add a Send Message Task to trigger the generateSummary workflow.

Input mapping:
{
  "applicant": "${applicant}",
  "classifiedDocs": "${extraction.classifiedDocs}",
  "reconciliation": "${reconciliation}",
  "reviewerNotes": "${review.reviewerNotes}"
}
Add a Receive Message Task. Set the Result Key to summary.
8. Add the End Event

Add an End Event after the summary is received. The process instance now contains the full verification report at summary.report.

Step 5: Build the upload UI

Create a UI Flow with a page for document upload.
1. Create the UI Flow

Go to UI Flows in the project sidebar and create a new UI Flow named documentUpload.
2. Add an upload component

Add a File Upload component to the page. Configure it to:
  • Accept multiple files
  • Restrict file types to PDF, JPG, PNG
  • Map uploaded file paths to documents.uploadedFiles
3. Add applicant data display

Add form fields (read-only) that display the applicant’s declared data from applicant. This gives context to the person uploading documents.
4

Add a submit button

Add a Button component labeled Submit documents. Configure it to save the data and advance the User Task.
For the human review step, create a second UI Flow page that displays the extracted data, reconciliation results, and exception report alongside the original documents. Use a side-by-side layout so the reviewer can compare easily.

Step 6: Build the review UI

Create a second page in the UI Flow for the human review task. The review page should include:
Section | Data source | Component
Applicant info | applicant | Read-only form fields
Uploaded documents | documents.uploadedFiles | Document viewer
Extraction results | extraction.classifiedDocs | Data table
Reconciliation report | reconciliation.fieldResults | Data table with status badges
Exceptions | reconciliation.exceptions | List with severity highlighting
Validation flags | validation | Alert components for each flag
Decision | review.reviewerDecision | Radio buttons (Approve / Reject / Request re-upload)
Notes | review.reviewerNotes | Text area
Use conditional visibility to highlight rows with MISMATCH or CRITICAL status in the reconciliation table. This draws the reviewer’s attention to the issues that need their judgment.

Testing

1. Test classification in isolation

Open the classifyAndExtract workflow and use Run Workflow with a test file. Upload sample documents one at a time and verify the classification output.
Test document | Expected type | Expected confidence
Scanned passport | ID_CARD | > 0.9
Electricity bill PDF | PROOF_OF_ADDRESS | > 0.9
Monthly payslip | SALARY_SLIP | > 0.9
Random brochure | UNKNOWN | < 0.5
2. Test extraction accuracy

For each document type, compare the extracted fields against the actual document content. Check that:
  • Names are captured correctly (including accented characters)
  • Dates are in the expected YYYY-MM-DD format
  • Numeric values (salary, postal code) are accurate
  • Null is returned for missing fields (not hallucinated values)
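These checks are easy to automate against a set of sample documents. A minimal Python sketch of the date-format and null checks (field names match the ID extraction schema from Step 1.3; the heuristics are illustrative):

```python
import re

# ISO date pattern required by the extraction schema (YYYY-MM-DD).
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def check_extraction(extracted: dict) -> list:
    """Return a list of problems found in an extracted ID-card record."""
    problems = []
    for field in ("date_of_birth", "expiry_date"):
        value = extracted.get(field)
        if value is not None and not DATE_RE.match(value):
            problems.append(f"{field} not in YYYY-MM-DD format: {value!r}")
    # Missing fields must come back as explicit nulls, not filler strings.
    if extracted.get("nationality") in ("", "N/A", "unknown"):
        problems.append("nationality looks hallucinated rather than null")
    return problems

print(check_extraction({"date_of_birth": "1990-04-12", "expiry_date": None}))  # []
```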
3. Test reconciliation with known mismatches

Prepare test data with deliberate mismatches:
{
  "applicant": {
    "firstName": "John",
    "lastName": "Smith",
    "monthlyIncome": 5000
  }
}
Upload an ID card with the name “Jonathan Smith” and a salary slip showing a net salary of 4200. Verify the reconciliation output flags:
  • Name variation as WARNING
  • Income discrepancy (16%) as WARNING
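The 16% figure follows directly from the 10% rule in the reconciliation prompt:

```python
declared, extracted = 5000, 4200
diff = abs(declared - extracted) / declared  # 800 / 5000 = 0.16
print(f"{diff:.0%} difference")  # 16% difference, above the 10% threshold
```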
4. Test the business rule

Verify the JavaScript business rule catches:
  • Expired ID documents
  • Missing required document types
  • Income discrepancy above 10%
  • CRITICAL exceptions from reconciliation
Test edge cases: all documents valid (auto-approve path), one missing document (review path), expired ID (review path).
5. Test the full end-to-end flow

Run the complete documentVerify process:
  1. Upload three documents (ID, proof of address, salary slip)
  2. Verify classification and extraction complete
  3. Check the reconciliation report
  4. If routed to review, complete the reviewer task
  5. Verify the summary report is generated
Test both the auto-approve path (all documents match) and the human review path (with discrepancies).

What you learned

In this tutorial, you built a document processing pipeline that demonstrates several key patterns:
  • Fan-out extraction — classifying documents by type and routing each to a specialized extraction node with tailored prompts and schemas
  • AI reconciliation — comparing AI-extracted data against application data with structured exception reports
  • Hybrid AI + business rules — combining AI-driven comparison with deterministic validation (expired documents, missing types, income thresholds)
  • Human-in-the-loop — routing edge cases to a reviewer while auto-approving clean results
  • Workflow composition — building modular workflows for classification, reconciliation, and summary generation, then orchestrating them from a BPMN process

Last modified on March 16, 2026