Available starting with FlowX.AI 5.6.0. The "Extract Text from Document" node has been renamed to Extract Data from File and now supports image inputs, configurable extraction methods, image extraction options, and signature detection.

Overview

The Extract Data from File node extracts text and structured data from documents and images within Agent Builder workflows. It supports multiple extraction strategies so you can balance accuracy, speed, and cost based on your document types.

Supported file formats

| Category  | Formats                          |
| --------- | -------------------------------- |
| Documents | PDF, DOCX, XLSX, XLS, XLSM, PPTX |
| Images    | JPG, PNG, TIFF                   |
Image files are automatically converted to PDF before processing. This conversion is handled by the Document Parser service.
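The format rules above can be sketched as a small helper. This is purely illustrative, assuming extension-based detection; the function and constant names are not part of any FlowX.AI API.

```python
# Hypothetical helper mirroring the supported-format table above.
# Names are illustrative, not a FlowX.AI API.
from pathlib import Path

DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".xls", ".xlsm", ".pptx"}
IMAGE_EXTENSIONS = {".jpg", ".png", ".tiff"}

def classify_input(file_path: str) -> str:
    """Return 'document' or 'image'; raise for unsupported formats."""
    ext = Path(file_path).suffix.lower()
    if ext in DOCUMENT_EXTENSIONS:
        return "document"
    if ext in IMAGE_EXTENSIONS:
        # The Document Parser service converts images to PDF before processing.
        return "image"
    raise ValueError(f"Unsupported file format: {ext}")
```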

Configuration

To add the node to an Agent Builder workflow:
1. Open your workflow in Agent Builder.
2. Add an Extract Data from File node from the Document Operations category.
3. Configure the extraction settings described below.
(Screenshot: Extract Data from File node configuration)

Document source

Document Source (select, required)
The source system for the document. Select Document Plugin to use files stored in the FlowX Documents Plugin.
Default: Document Plugin

Use test file

Use Test File (boolean)
Toggle ON to use a test file during workflow configuration and testing, without requiring a live file path from process data.
Default: OFF

File path

File Path (string, required)
The path to the input file to process. This can reference a file stored in the Documents Plugin.
When Use Test File is turned off, map this field to a process variable or workflow data key that contains the file path at runtime.

Response key

responseKey (string, required)
The key under which the extraction results are stored in the workflow data.
Example: extractedData
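To illustrate how the response key works, here is a minimal sketch. The result fields (`text`, `pages`) and the helper function are assumptions for illustration only; the actual result schema is defined by the product.

```python
# Illustrative only: extraction results land in workflow data under the
# configured response key. Field names in the result dict are assumptions.
workflow_data = {}

def store_extraction_result(data: dict, response_key: str, result: dict) -> dict:
    """Store an extraction result under the configured response key."""
    data[response_key] = result
    return data

store_extraction_result(
    workflow_data,
    "extractedData",  # the configured responseKey
    {"text": "Invoice #1042 ...", "pages": 3},  # hypothetical result shape
)
```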

Extraction method

Extraction Method (select, required)
The method used to extract content from the file. Each method has different accuracy, speed, and cost characteristics.

| Method       | Best for                                         | Speed  | Cost   | Accuracy   |
| ------------ | ------------------------------------------------ | ------ | ------ | ---------- |
| LLM Model    | Complex layouts, handwritten text, mixed content | Slow   | High   | High       |
| OCR Engine   | Scanned documents, image-heavy files             | Medium | Medium | Medium     |
| Text Parsing | Clean digital PDFs with selectable text          | Fast   | Free   | Low–Medium |
The LLM Model method uses AI vision models (such as GPT-4o) to analyze document content. It provides the highest accuracy and can handle:
  • Complex page layouts with multiple columns
  • Handwritten text and annotations
  • Mixed content (text, tables, images on the same page)
  • Documents with non-standard formatting
LLM Model is the most expensive strategy due to AI API calls per page. Use it when accuracy is critical and the document structure is complex.
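The selection guidance above can be condensed into a small decision helper. This is a hedged sketch of the comparison table, not product logic; the function name and boolean inputs are illustrative assumptions.

```python
# Illustrative decision helper encoding the method-comparison table above.
def choose_extraction_method(has_selectable_text: bool,
                             is_scanned: bool,
                             complex_layout_or_handwriting: bool) -> str:
    if complex_layout_or_handwriting:
        return "LLM Model"      # highest accuracy, highest cost
    if is_scanned:
        return "OCR Engine"     # suited to scanned, image-heavy files
    if has_selectable_text:
        return "Text Parsing"   # fastest and free for clean digital PDFs
    return "OCR Engine"         # fallback when no selectable text exists
```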

Image extraction options

When using LLM Model or OCR Engine, you can configure how images found within the document are handled.
Image Extraction (select)
How images embedded in the document should be processed.

| Option            | Description                             | When to use                                                               |
| ----------------- | --------------------------------------- | ------------------------------------------------------------------------- |
| Image Description | Generates a text description of images  | When you need to understand what images depict (charts, photos, diagrams) |
| Image Contents    | Extracts text and data from images      | When images contain text, tables, or data you need to capture             |
LLM Model supports both Image Description and Image Contents. OCR Engine supports only Image Contents.
Image extraction options are not available when using the Text Parsing strategy, since Text Parsing only handles selectable text content.

Signature detection

Detect Signatures (boolean)
Turn on detection of signatures within the document. When enabled, the node identifies areas of the document that contain signatures and includes their locations in the extraction results.
Default: OFF
Signature detection is only available when using LLM Model or OCR Engine strategies. It is not available for Text Parsing.
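The compatibility rules across this section can be summarized in one sketch: LLM Model supports both image options, OCR Engine supports only Image Contents, and Text Parsing supports neither image extraction nor signature detection. The data structure and validator below are illustrative assumptions, not a FlowX.AI API.

```python
# Sketch of the compatibility rules stated above; structure is illustrative.
from typing import Optional

SUPPORTED_IMAGE_OPTIONS = {
    "LLM Model": {"Image Description", "Image Contents"},
    "OCR Engine": {"Image Contents"},
    "Text Parsing": set(),
}

def validate_config(method: str,
                    image_option: Optional[str],
                    detect_signatures: bool) -> None:
    """Raise ValueError for option combinations the docs rule out."""
    if image_option and image_option not in SUPPORTED_IMAGE_OPTIONS[method]:
        raise ValueError(f"{method} does not support {image_option}")
    if detect_signatures and method == "Text Parsing":
        raise ValueError("Signature detection is not available for Text Parsing")
```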

Examples

Scenario: Extract line items and totals from a scanned paper invoice.
Configuration:
  • Extraction Method: OCR Engine
  • Image Extraction: Image Contents
  • Detect Signatures: ON (to capture the approval signature)
The OCR engine processes the scanned image, extracts text from both the document body and any embedded images (such as company logos with text), and identifies signature areas.
Scenario: Extract text and understand visual elements from a contract that includes charts and diagrams.
Configuration:
  • Extraction Method: LLM Model
  • Image Extraction: Image Description
  • Detect Signatures: OFF
The LLM analyzes each page, extracts the contract text, and generates descriptions of charts and diagrams (for example, “Bar chart showing quarterly revenue growth from Q1 to Q4 2025”).
Scenario: Extract text from a digitally generated report PDF.
Configuration:
  • Extraction Method: Text Parsing
  • Image Extraction: N/A (not available for Text Parsing)
  • Detect Signatures: N/A (not available for Text Parsing)
Text Parsing directly extracts the selectable text from the PDF with no AI or OCR processing, making it the fastest and lowest-cost option.

Best practices

Start with Text Parsing

For digital PDFs, try Text Parsing first. Only use OCR or LLM if the results are insufficient.

Match strategy to document type

Use OCR for scanned documents, LLM for complex layouts, and Text Parsing for clean digital files.

Consider cost at scale

LLM processing costs increase linearly with page count. For high-volume workloads, use Text Parsing or OCR where possible.
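The linear scaling can be made concrete with a back-of-the-envelope estimate. The per-page rate below is a made-up placeholder, not an actual FlowX.AI or model-provider price.

```python
# Back-of-the-envelope sketch: LLM extraction cost grows linearly with
# page count. The default rate is a hypothetical placeholder.
def estimate_llm_cost(pages: int, cost_per_page: float = 0.01) -> float:
    return pages * cost_per_page

# Doubling the page count doubles the cost:
# estimate_llm_cost(1000) == 2 * estimate_llm_cost(500)
```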

Turn off unused features

Turn off signature detection and image extraction when not needed to reduce processing time and cost.

Document Parser setup

Configure the Document Parser service, parsing engines, and deployment sizing

AI node types

Overview of all AI node types available in Agent Builder

Agent Builder overview

Get started with Agent Builder workflows

Use cases

See real-world Agent Builder workflow examples
Last modified on March 25, 2026