Preview: Agent Builder is currently in preview and may change before general availability.
Retrieval-augmented generation (RAG) lets an AI agent answer questions by searching your documents first, then synthesizing a response from only the retrieved context. This sharply reduces hallucinations and keeps answers grounded in your actual content.

When to use

Use this pattern when:
  • Users need accurate answers drawn from a specific document corpus (product sheets, FAQs, regulatory guides, policies)
  • Responses must cite sources and avoid fabrication
  • The knowledge changes over time and you want to update documents without retraining a model
  • You need domain-specific answers that a general-purpose LLM cannot reliably provide
Do not use this pattern when:
  • The answer requires real-time data from an external API (use an MCP integration instead)
  • The task is classification or extraction rather than open-ended Q&A (use TEXT_UNDERSTANDING or TEXT_EXTRACTION nodes)
  • The corpus is very small (fewer than 5 documents) and could fit entirely in a system prompt

Architecture

User question
    |
    v
+---------------------------+
| CUSTOM_AGENT node         |
| (with Knowledge Base tool)|
+---------------------------+
    |
    v
+---------------------------+
| Qdrant vector search      |
| (Top K chunks retrieved)  |
+---------------------------+
    |
    v
+---------------------------+
| LLM synthesis             |
| (answer from context only)|
+---------------------------+
    |
    v
Grounded response
The CUSTOM_AGENT node handles the full cycle internally:
  1. Analyzes the user query for key terms and intent
  2. Searches the connected Qdrant collection for relevant document chunks
  3. Receives the top matching chunks ranked by relevance score
  4. Synthesizes a response using only the retrieved context
  5. Returns a grounded answer with source references
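The five-step cycle above can be sketched in miniature. The bag-of-words "embedding" and the string-joining `answer()` stub below are illustrative stand-ins, not the platform's internals: a real Knowledge Base uses dense vectors in Qdrant and an LLM for synthesis.

```python
# Minimal sketch of the retrieve-then-synthesize cycle performed by the
# CUSTOM_AGENT node. Toy word-count "embeddings" stand in for the dense
# vectors a real Knowledge Base stores in Qdrant.
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy embedding: a word-count vector (real systems use dense vectors).
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    # Steps 1-3: analyze the query, search, take the top-ranked chunks.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def answer(query: str, chunks: list[str]) -> str:
    # Steps 4-5: a real agent would prompt the LLM with this context;
    # here we simply return the grounded context joined together.
    return " | ".join(retrieve(query, chunks))

chunks = [
    "DTI means debt-to-income ratio, total monthly debt divided by income.",
    "Fixed-rate mortgages keep the same interest rate for the full term.",
    "Applicants must provide proof of income and a credit report.",
]
print(answer("What is DTI?", chunks))
```

The key property to notice is that the answer can only contain material that survived retrieval, which is what makes the final response grounded.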

Implementation

Prerequisites

Before configuring the workflow:
  • Create a Knowledge Base data source in Integration Designer
  • Upload your documents (PDFs) to the Knowledge Base content sources
  • Wait for automatic chunking and vector indexing to complete
Knowledge Bases use vector embeddings stored in Qdrant for semantic search. Documents are automatically chunked and indexed when uploaded. For details, see the Knowledge Base integration documentation.
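Chunking happens automatically, but it helps to know roughly what it does. The sketch below shows one common strategy, fixed-size windows with overlap; the platform's actual chunk sizes and splitting strategy are not documented here, so treat the numbers as placeholders.

```python
# Illustrative fixed-size chunking with overlap, similar in spirit to
# what the Knowledge Base does on upload. The size/overlap values are
# assumptions, not the platform's real settings.
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` characters."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    # Overlap keeps sentences that straddle a boundary retrievable
    # from both neighboring chunks.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap is the important design choice: without it, a fact split across a chunk boundary can become unretrievable from either side.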

Configure the CUSTOM_AGENT node

1. Add a CUSTOM_AGENT node to your workflow
   In the Agent Builder canvas, drag a Custom Agent node into your workflow.
2. Enable Knowledge Base
   In the node configuration panel, scroll to the Knowledge Base section and enable it. Select the Knowledge Base you created.
3. Set retrieval parameters
   Configure how the agent searches your documents.
4. Write the system prompt
   Define the agent behavior, retrieval instructions, and response constraints (see prompt template below).

Retrieval parameters

Parameter              | Description                                                          | Recommended value
Max. Number of Results | Number of document chunks retrieved per query (1-10)                 | 10
Min. Relevance Score   | Relevance threshold (0-100%); chunks below this score are excluded   | 10%
Content Source Filter  | Restrict search to specific content sources, or search all           | All content sources
Start with a low Min. Relevance Score (around 10%) and increase it if the agent returns too much irrelevant context. A high threshold may cause the agent to miss relevant chunks that use different terminology than the query.
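The two parameters interact in a fixed order: the relevance threshold filters first, then the result cap limits what remains. A small sketch of that logic (the function name and the (chunk, score) tuples are illustrative, not the platform's API; scores use the same 0-100% scale as the node configuration):

```python
# How Min. Relevance Score and Max. Number of Results combine:
# threshold first, then cap. Hypothetical helper for illustration.
def select_chunks(scored: list[tuple[str, float]],
                  max_results: int = 10,
                  min_score: float = 10.0) -> list[tuple[str, float]]:
    # Drop chunks below the relevance threshold.
    kept = [(c, s) for c, s in scored if s >= min_score]
    # Keep the highest-scoring chunks, up to the cap.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_results]

hits = [("chunk A", 84.0), ("chunk B", 31.5), ("chunk C", 7.2)]
print(select_chunks(hits, max_results=10, min_score=10.0))
# chunk C falls below the 10% threshold and is excluded
```

Raising `min_score` shrinks the candidate pool before the cap ever applies, which is why an aggressive threshold can starve the agent of context.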

System prompt template

The prompt must instruct the agent to search the Knowledge Base and follow strict grounding rules.
You are a knowledgeable assistant. Your job is to answer user questions
accurately using ONLY information retrieved from the knowledge base.

INSTRUCTIONS:
1. Analyze the user's question to identify key terms and concepts.
2. Search the knowledge base for relevant information.
3. Synthesize a clear, concise response from ONLY the retrieved context.

STRICT RULES:
- Source fidelity: Base every claim on retrieved document chunks. If the
  knowledge base does not contain relevant information, say: "I don't have
  information about that in my current knowledge base."
- Citations: Reference the source document when possible.
- No hallucinations: Never fabricate information, statistics, or claims
  not present in the retrieved context.
- Stay on topic: Only answer questions related to the knowledge base domain.
The grounding rules are critical. Without explicit instructions to admit gaps in knowledge, the LLM may fall back to its general training data and produce inaccurate answers.
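To make the gap-admission rule concrete, here is a sketch of assembling a grounded prompt with an explicit fallback when no chunks pass the threshold. The fallback sentence mirrors the template above; the prompt layout and `build_prompt` helper are illustrative, not the node's internal format.

```python
# Assembling a context-grounded prompt with an explicit fallback.
# The prompt layout is an assumption for illustration only.
FALLBACK = ("I don't have information about that in my current "
            "knowledge base.")

def build_prompt(question: str, chunks: list[str]) -> str:
    if not chunks:
        # Nothing relevant was retrieved: admit the gap instead of
        # letting the LLM fall back to its general training data.
        return FALLBACK
    context = "\n\n".join(f"[Source {i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using ONLY the context below. If the context is "
        "insufficient, reply with the fallback sentence.\n\n"
        f"CONTEXT:\n{context}\n\nQUESTION: {question}"
    )
```

Numbering the sources in the context is what lets the model produce the citations the rules ask for.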

Input and output

Direction | Key                                      | Description
Input     | User message (from chat or process data) | The question to answer
Output    | Agent response text                      | The grounded answer synthesized from retrieved chunks

Real-world example

The mortgage advisor chatbot tutorial uses this pattern in its knowledgeBaseQA workflow. When the intent classifier detects a KNOWLEDGE_QA intent, the router calls a CUSTOM_AGENT node connected to a Knowledge Base containing mortgage product sheets, regulatory guides, and FAQ documents. The agent answers questions such as:
  • “What is DTI?” (retrieves debt-to-income ratio definition from the FAQ)
  • “What are the requirements for a fixed-rate mortgage?” (retrieves eligibility criteria from product sheets)
  • “What documents do I need to apply?” (retrieves checklist from the regulatory guide)

Mortgage advisor tutorial

Build the full chatbot including intent routing, knowledge base Q&A, and recommendation generation

Variations

Multiple knowledge bases

Query multiple Knowledge Base collections in a single workflow. Use parallel CUSTOM_AGENT nodes, each connected to a different Knowledge Base, then merge the results in a downstream node. When to use: your documents span distinct domains (for example, product documentation and regulatory compliance) that are better organized in separate collections.

Hybrid search

Combine vector (semantic) search with keyword search for better recall. The Knowledge Base integration supports multiple search modes, including hybrid, semantic, and keyword algorithms. When to use: your documents contain precise terminology (part numbers, policy codes, legal references) where exact keyword matching outperforms semantic similarity alone.
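One common way hybrid systems merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). Whether the Knowledge Base's hybrid mode uses RRF specifically is not stated in this documentation; the sketch below only illustrates why combining the two rankings helps with exact terminology.

```python
# Reciprocal rank fusion (RRF): merge several rankings by summing
# 1 / (k + rank) per document. Shown as one possible fusion strategy,
# not necessarily the one the Knowledge Base uses.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical document IDs: keyword search nails the exact policy code,
# semantic search surfaces the conceptually closest FAQ entry.
keyword = ["policy-code-7A", "faq-dti", "guide-apply"]
semantic = ["faq-dti", "guide-apply", "policy-code-7A"]
print(rrf([keyword, semantic]))
```

A document ranked highly by either method ends up near the top of the fused list, so exact-match hits are not drowned out by semantic neighbors.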

Re-ranking

Add a second-pass ranking step after initial retrieval. Use a TEXT_GENERATION node downstream of the CUSTOM_AGENT to score and re-order the retrieved chunks before synthesizing the final answer. When to use: The initial retrieval returns many marginally relevant chunks and you need higher precision in the final response.
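The re-ranking step can be sketched as a generic second pass over the retrieved chunks. In the actual workflow the scoring would come from a TEXT_GENERATION node prompted to rate each chunk against the question; the `overlap_score` stand-in below just keeps the example self-contained.

```python
# Second-pass re-ranking: score each retrieved chunk against the
# question and keep only the best. score_fn is a stand-in for the
# TEXT_GENERATION node that would do the scoring in the workflow.
def rerank(question: str, chunks: list[str], score_fn, keep: int = 3) -> list[str]:
    scored = sorted(chunks, key=lambda c: score_fn(question, c), reverse=True)
    return scored[:keep]

def overlap_score(question: str, chunk: str) -> float:
    # Toy scorer: fraction of question words present in the chunk.
    q = set(question.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0
```

Retrieval optimizes recall (cast a wide net); the re-ranker optimizes precision (keep only what actually answers the question), which is why the two-pass split pays off when retrieval returns many marginal chunks.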

Best practices

Upload clean, well-structured documents. Remove headers, footers, and boilerplate that add noise. Organize documents into separate content sources by topic for easier filtering.
Use the Knowledge Base test interface to run sample queries and verify that relevant chunks are returned with acceptable relevance scores before wiring the node into a workflow.
Start with a higher Max. Number of Results (8-10) and a lower Min. Relevance Score (10-20%). Monitor the chunks returned and adjust. Too few results lead to incomplete answers; too many dilute the context.
Always include instructions for what the agent should do when the Knowledge Base does not contain relevant information. This prevents the LLM from falling back to general knowledge.

Last modified on March 16, 2026