> ## Documentation Index
> Fetch the complete documentation index at: https://docs.flowx.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# AI node types

> Understanding the different AI node types available in Agent Builder for building AI workflows.

<Warning>
  **Preview**

  Agent Builder is currently in preview and may change before general availability.
</Warning>

Agent Builder provides AI nodes organized into categories based on the type of content they process. Combine these nodes to create sophisticated workflows.

<Frame>
  ![AI Nodes](https://s3.eu-west-1.amazonaws.com/docx.flowx.ai/5.x/ai_nodes.png)
</Frame>

## Node categories

<CardGroup cols={2}>
  <Card title="Text operations" icon="font" href="#ai-text-operations">
    Process and analyze text content
  </Card>

  <Card title="Document operations" icon="file-lines" href="#ai-document-operations">
    Work with documents and files
  </Card>

  <Card title="Image operations" icon="image" href="#ai-image-operations">
    Analyze visual content
  </Card>

  <Card title="Data operations" icon="database" href="#ai-data-operations">
    Transform and enrich data
  </Card>
</CardGroup>

Additionally, the **Custom Agent**, **Context Retrieval**, **Intent Classification**, and **Speech to Text** nodes provide specialized capabilities.

***

## Intent Classification

Classify user messages and automatically route the workflow to the matching branch — combining AI classification and conditional branching in a single node.

| Node type                 | Description                                                                      | Use cases                                                    |
| ------------------------- | -------------------------------------------------------------------------------- | ------------------------------------------------------------ |
| **Intent Classification** | Classifies a user message into defined intents and routes to the matching branch | Chatbot routing, email triage, support ticket classification |

### Configuration options

* **User message** — Input text to classify (supports `${}` references)
* **Intents** — Up to 10 natural language intent descriptions, each becoming an output branch
* **Use memory** — Include conversation history for context-aware classification
* **Include Reason for Selection** — Add LLM explanation for the chosen intent
* **Fallback branch** — Automatic "If No Intent Matches" path

<Card title="Intent Classification" icon="arrows-split-up-and-left" href="./intent-classification">
  Learn more about configuring intents, output format, and routing behavior
</Card>

***

## Context Retrieval

Perform RAG (Retrieval-Augmented Generation) searches against a Knowledge Base and return relevant chunks — without calling an LLM.

| Node type             | Description                                                                | Use cases                                                                 |
| --------------------- | -------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| **Context Retrieval** | Queries a Knowledge Base and returns matching chunks with relevance scores | RAG pipelines, document search, context gathering for downstream AI nodes |

### Configuration options

* **Source** — Knowledge Base or Memory (Memory only available in conversational workflows)
* **Knowledge Base** — Select which Knowledge Base to query (when source is Knowledge Base)
* **User Query** — the search query, supports process variable expressions
* **Search type** — Hybrid (default), Semantic, or Keywords
* **Max Number of Chunks** — how many chunks to return (1-10)
* **Min Relevance Score** — minimum relevance threshold (0-100%)
* **Metadata Filters** — structured filters with typed operators and AND/OR grouping to refine results by chunk metadata
* **Use advance metadata filters** — toggle for expression-based filtering
* **Use Re-rank** — re-rank retrieved chunks before returning

### Output format

The node outputs an array of retrieved chunks, each containing:

| Field            | Description                                                                                                                               |
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| `chunkContent`   | The text content of the retrieved chunk                                                                                                   |
| `chunkMetadata`  | Metadata associated with the chunk                                                                                                        |
| `relevanceScore` | Similarity score between the query and the chunk                                                                                          |
| `searchType`     | Search strategy that returned this chunk: `HYBRID`, `SEMANTIC`, or `KEYWORD`. Only emitted when the RAG service returns a non-null value. |
| `contentSource`  | The store the chunk belongs to (field name preserved for API compatibility)                                                               |

<Card title="Context Retrieval" icon="magnifying-glass" href="./context-retrieval">
  Learn more about configuring Context Retrieval nodes
</Card>

***

## Custom Agent

Create custom agents with advanced capabilities powered by Model Context Protocol (MCP) tools, optional Knowledge Base retrieval, and direct chat reply in Chat Driven workflows.

| Node type               | Description                      | Use cases                         |
| ----------------------- | -------------------------------- | --------------------------------- |
| **Text generation**     | Create text content from prompts | Reports, summaries, responses     |
| **Summarization**       | Condense long content            | Document summaries, meeting notes |
| **Translation**         | Convert between languages        | Multi-language support            |
| **Document completion** | Fill in templates                | Form letters, contracts           |

### Configuration options

* **Instructions** *(mandatory)* — The agent's role, tasks, and expected input/output. Sent as the System message and **cached** by the LLM between executions, so `${variable}` references are not allowed in this field.
* **Background → Use conversation memory** — Include conversation history in the LLM call (Chat Driven workflows only)
* **Background → Knowledge Base** — Attach a knowledge base for RAG retrieval; includes Search Type (vector / keyword / hybrid) and Re-rank options
* **Background → Context** — Free-text field for the per-execution prompt; supports `${variable}` references. Sent as the User message and not cached.
* **Background → Use only referenced values as input** — Send only values referenced in the prompt to the LLM; narrows context and reduces token usage
* **Tools → MCP Servers** — Attach MCP tools the agent can call
* **Tools → Built-in tools → Use Web Search Tool** — Toggle **ON** to let the agent look up recent information from the public web. Off by default.
* **Personal Information Guard** — Toggle PII redaction on the node's input and/or output
* **Response → Send as Chat Reply** — In Chat Driven workflows, deliver the response directly to the Chat component as Markdown
* **Response → Include Task for Prompt Suggestions** — In Chat Driven workflows, generate AI follow-up prompts shown in the Chat component after the response
* **Response Key** — Key where the node output is stored in the workflow data
* **Response Schema** — Expected JSON structure of the LLM response (hidden when Send as Chat Reply is ON)

<Card title="Custom Agent in Chat Driven workflows" icon="comments" href="/5.9/ai-platform/conversational-workflows#custom-agent-node">
  Full reference for Use Memory, Send as Chat Reply, Prompt Suggestions, and other Chat Driven specifics
</Card>

***

## Speech to Text

Transcribe audio to text within integration workflows.

| Node type          | Description                                                                      | Use cases                                                                                               |
| ------------------ | -------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| **Speech to Text** | Transcribe audio files into text with language detection and confidence metadata | Voice message processing, audio transcription, accessibility, conversational workflows with voice input |

### Configuration options

* **File Source** — Document Plugin, S3, or Chat Input (conversational workflows)
* **File Path** — Path to the audio file (supports `${}` references); becomes a test-file dropdown when **Use Test File** is on
* **Use Test File** — Upload a sample audio file for development testing
* **Response Key** — Key under which the transcript and metadata are stored

<Card title="Speech to Text" icon="microphone" href="./speech-to-text">
  Learn more about configuring transcription, text-to-speech, and chat voice integration
</Card>

***

## Understanding nodes

Understanding nodes analyze content to extract meaning and intent.

| Node type                    | Description                        | Use cases                         |
| ---------------------------- | ---------------------------------- | --------------------------------- |
| **Sentiment analysis**       | Detect emotional tone              | Customer feedback, reviews        |
| **Topic modeling**           | Identify themes and subjects       | Document categorization           |
| **Intent recognition**       | Understand user goals              | Chatbot routing, request handling |
| **Named entity recognition** | Find people, places, organizations | Data extraction, compliance       |

### Configuration options

* **Classification labels** - Define categories for classification
* **Confidence threshold** - Minimum score for results
* **Multi-label** - Allow multiple classifications per input

***

## AI Document Operations

Process documents to extract data, generate reports, or understand content.

| Node                       | Description                                                                             | Use cases                                                            |
| -------------------------- | --------------------------------------------------------------------------------------- | -------------------------------------------------------------------- |
| **Document Generation**    | Automatically build reports or complete templates based on given inputs                 | Report generation, template completion                               |
| **Document Extraction**    | Identify and extract structured data, entities or metadata from documents               | Form processing, invoice data extraction                             |
| **Document Understanding** | Analyze documents to extract meaning, topics, sentiment, or important information       | Document classification, content analysis                            |
| **Extract Data from File** | Extract text and data from documents and images with configurable extraction strategies | OCR, PDF text extraction, image data extraction, signature detection |

<Card title="Extract Data from File" icon="file-lines" href="./extract-data-from-file">
  Learn more about configuring extraction strategies, image extraction, and signature detection
</Card>

### Configuration options

* **Document type** - Specify expected document format
* **Schema definition** - Define expected output structure
* **Field mapping** - Map extracted fields to data model
* **Confidence threshold** - Minimum score for extractions

***

## AI Image Operations

Analyze visual content to generate captions, extract details, or identify objects.

| Node type                 | Description               | Use cases                                  |
| ------------------------- | ------------------------- | ------------------------------------------ |
| **Object recognition**    | Identify items in images  | Document classification, damage assessment |
| **Text extraction (OCR)** | Read text from images     | Invoice processing, ID verification        |
| **Scene understanding**   | Interpret image context   | Property assessment, claims processing     |
| **Emotion analysis**      | Detect facial expressions | Customer experience, fraud detection       |

### Configuration options

* **Detection confidence** - Minimum threshold for detections
* **Region of interest** - Focus on specific image areas
* **Output format** - Structured data or annotations

***

## Combining nodes

Nodes can be connected in workflows to create complex processing pipelines:

```
Document Input → Text Extraction → Entity Recognition → Validation → Output
```

### Best practices

<AccordionGroup>
  <Accordion title="Start simple" icon="1">
    Begin with a single node and add complexity incrementally. Test each addition before moving on.
  </Accordion>

  <Accordion title="Use validation nodes" icon="check">
    Add validation steps after extraction to ensure data quality before processing continues.
  </Accordion>

  <Accordion title="Handle errors" icon="triangle-exclamation">
    Include fallback paths for when nodes fail or return low-confidence results.
  </Accordion>

  <Accordion title="Monitor performance" icon="chart-line">
    Track execution times and accuracy metrics to identify bottlenecks and improvement opportunities.
  </Accordion>
</AccordionGroup>

## Related resources

<CardGroup cols={2}>
  <Card title="Agent Builder overview" icon="robot" href="./overview">
    Get started with Agent Builder
  </Card>

  <Card title="Use cases" icon="lightbulb" href="./use-cases">
    See real-world examples
  </Card>

  <Card title="Conversational workflows" icon="messages" href="../conversational-workflows">
    Multi-turn chat with Custom Agent node changes for chat replies and memory
  </Card>
</CardGroup>
