
# Data architecture

> Which data stores FlowX.AI uses, what each stores, and how they fit together.

## Overview

FlowX.AI uses multiple specialized data stores, each chosen for specific workload characteristics. They fall into two categories:

* **Embedded data components** — shipped with the platform, managed by FlowX.AI
* **Platform data services** — third-party dependencies deployed by your team

```mermaid theme={"system"}
flowchart LR
    FX["FlowX.AI Platform"]

    FX -->|"process definitions, instance state,\nplatform config, metadata"| PG[("PostgreSQL")]
    FX -->|"runtime config, workflow state,\nFlowX Database"| MONGO[("MongoDB")]
    FX -->|"cache, sessions,\ndistributed locks"| REDIS[("Redis")]
    FX -->|"event streaming,\nasync messaging"| KAFKA[("Kafka")]
    FX -->|"files, documents,\nbinary assets"| S3[("S3-compatible\nobject storage")]
    FX -->|"audit logs,\nsearch indexes"| ES[("Elasticsearch")]
    FX -->|"vector embeddings,\nsemantic retrieval"| QD[("Qdrant")]

    style PG fill:#336791,color:#fff
    style MONGO fill:#4DB33D,color:#fff
    style REDIS fill:#DC382D,color:#fff
    style KAFKA fill:#231F20,color:#fff
    style S3 fill:#FF9900,color:#fff
    style ES fill:#FED10A,color:#000
    style QD fill:#DC244C,color:#fff
```

***

## Embedded data components

<Note>
  Embedded data components are delivered as part of the FlowX.AI platform (Docker images + Helm charts). They run on Kubernetes only, and their versioning is managed by FlowX.AI. These components are included in the standard deployment and cannot be replaced by alternative implementations.
</Note>

<Card title="Qdrant" icon="diagram-project" href="https://qdrant.tech/documentation/">
  **Type:** Vector database

  Qdrant is used by the AI Platform RAG path (`knowledgebase-rag`, `embedder`, `knowledgebase-indexer-v2`) and by AI agents for semantic retrieval. It stores vector embeddings and serves similarity search over indexed knowledge.
</Card>

***

## Platform data services

<Info>
  The following data services are required by the FlowX.AI platform but are third-party dependencies. You choose the deployment model — managed service or self-hosted, inside or outside the Kubernetes cluster — according to your enterprise standards. FlowX.AI connects to these services via configuration.

  For supported versions and compatibility details, see the [Third-party components](../platform-deep-dive/third-party-components#compatibility-matrix) compatibility matrix.
</Info>

### Relational storage — PostgreSQL / Oracle

PostgreSQL is the primary relational database and the system of record for most core services. Oracle Database is supported as an alternative. The relational store holds:

* Process definitions and metadata
* Runtime instance state
* Platform configuration
* Administrative data

Most FlowX.AI microservices use PostgreSQL or Oracle as their authoritative storage layer.

### Document storage — MongoDB

MongoDB provides document-oriented storage for unstructured and semi-structured content. It stores:

* Runtime configuration
* Runtime workflow state
* Flexible data structures used by apps and integrations

MongoDB is also the storage layer for [FlowX Database](../platform-deep-dive/integrations/flowx-database) — a cross-instance, long-term storage feature that enables apps to persist and query data beyond a single process instance.

### Caching — Redis

Redis is an in-memory cache used for performance optimization across the platform. It stores:

* Cached process definitions
* Compiled scripts
* Transient session data
* Distributed locks across microservices

<Info>
  Redis is used strictly for short-lived, performance-related data and is **not** a system of record. Long-term persistence is handled by PostgreSQL and MongoDB.
</Info>
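
The split between Redis and the persistent stores resembles the familiar cache-aside pattern. The sketch below only illustrates that idea: plain Python dictionaries stand in for Redis and PostgreSQL, and `cache`, `system_of_record`, and `load_process_definition` are hypothetical names, not FlowX.AI APIs.

```python
from typing import Dict, Optional

# Stand-ins for the real stores (assumptions for illustration only).
system_of_record: Dict[str, str] = {"onboarding": "definition-v3"}  # e.g. PostgreSQL
cache: Dict[str, str] = {}  # e.g. Redis


def load_process_definition(name: str) -> Optional[str]:
    """Return a definition, preferring the cache and falling back to the system of record."""
    if name in cache:
        return cache[name]
    value = system_of_record.get(name)
    if value is not None:
        cache[name] = value  # populate the cache for later reads
    return value


first = load_process_definition("onboarding")   # cache miss: read from the system of record
cache.clear()                                    # simulate a cache flush or restart
second = load_process_definition("onboarding")  # still succeeds because the data lives elsewhere
```

Because the cache is only ever filled from the authoritative store, clearing it costs performance but not data.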

### Event streaming — Kafka

Kafka is the event streaming backbone that enables asynchronous, event-driven communication between microservices. It handles:

* Internal event propagation between platform services
* External integration messaging
* Decoupled processing workflows

<Info>
  Kafka is used for transient message propagation. Authoritative business data is persisted in PostgreSQL and MongoDB, not in Kafka.
</Info>

### Object storage — S3-compatible

S3-compatible object storage provides persistent file storage. It stores:

* File attachments
* Document outputs (generated PDFs, converted files)
* Large binary assets

FlowX.AI accesses object storage via an S3-compatible interface.

<Tip>
  For providers without native S3 support (such as Azure Blob Storage), an S3 proxy is required to expose an S3-compatible interface. If you use a native S3 service (like AWS S3), no proxy is needed.
</Tip>

### Search and indexing — Elasticsearch

Elasticsearch is used primarily for audit logging and full-text search across workflow and runtime data. It stores:

* Searchable representations of audit events
* Platform activity logs
* Indexed workflow data for fast retrieval

Elasticsearch is optimized for fast search and traceability rather than primary data storage.

***

## Quick reference

| Data Store              | Role                 | What it stores                                        | Persistence |
| ----------------------- | -------------------- | ----------------------------------------------------- | ----------- |
| **PostgreSQL / Oracle** | System of record     | Process definitions, config, metadata, instance state | Permanent   |
| **MongoDB**             | Document store       | Runtime state, flexible data, FlowX Database          | Permanent   |
| **Redis**               | Cache                | Process definitions cache, sessions, locks            | Transient   |
| **Kafka**               | Event streaming      | Inter-service messages, integration events            | Transient   |
| **Object Storage**      | File store           | Attachments, documents, binary assets                 | Permanent   |
| **Elasticsearch**       | Search engine        | Audit logs, indexed workflow data                     | Permanent   |
| **Qdrant**              | Vector database (AI) | Embeddings for RAG and semantic retrieval             | Permanent   |
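
For backup planning, the Persistence column can serve as a first filter. The following is a minimal sketch that copies that column into a dictionary and groups the stores. `PERSISTENCE` and `stores_with` are illustrative names, and whether a given store needs backups in your environment still depends on your own recovery requirements.

```python
from typing import Dict, List

# Persistence column of the quick reference table, copied as-is.
PERSISTENCE: Dict[str, str] = {
    "PostgreSQL / Oracle": "Permanent",
    "MongoDB": "Permanent",
    "Redis": "Transient",
    "Kafka": "Transient",
    "Object Storage": "Permanent",
    "Elasticsearch": "Permanent",
    "Qdrant": "Permanent",
}


def stores_with(persistence: str) -> List[str]:
    """Return the data stores whose Persistence value matches, sorted by name."""
    return sorted(name for name, value in PERSISTENCE.items() if value == persistence)


permanent = stores_with("Permanent")  # candidates for backup and retention policies
transient = stores_with("Transient")  # short-lived data per the table
```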

***

## Service-level database mapping

The tables and diagrams below show which FlowX.AI services connect to which databases and what operations they perform. This is useful for infrastructure sizing, backup planning, and troubleshooting.

### PostgreSQL databases

Core services each own a dedicated PostgreSQL database and manage its schema through Liquibase migrations at startup, with the exceptions noted below the table.

```mermaid theme={"system"}
flowchart LR
    subgraph Services
        ADM["Admin"]
        ADV["Advancing Controller"]
        APPM["Application Manager"]
        AUTH["Authorization System"]
        DOC["Document Plugin"]
        EMAIL["Email Gateway"]
        LIC["License"]
        ORG["Organization Manager"]
        ENG["Process Engine"]
        TASK["Task Management"]
        DSYNC["Data Sync"]
    end

    subgraph PostgreSQL
        ADMDB[("flowxadmin")]
        ADVDB[("advancing")]
        APPMDB[("app_manager")]
        AUTHDB[("auth_system")]
        DOCDB[("document")]
        EMAILDB[("email_gateway")]
        LICDB[("license")]
        ORGDB[("org_manager")]
        ENGDB[("process_engine")]
    end

    ADM -->|"creates, reads, writes"| ADMDB
    ADV -->|"creates, reads, writes"| ADVDB
    APPM -->|"creates, reads, writes"| APPMDB
    AUTH -->|"creates, reads, writes"| AUTHDB
    DOC -->|"creates, reads, writes"| DOCDB
    EMAIL -->|"creates, reads, writes"| EMAILDB
    LIC -->|"creates, reads, writes"| LICDB
    ORG -->|"creates, reads, writes"| ORGDB
    ENG -->|"creates, reads, writes"| ENGDB
    TASK -->|"reads, writes"| ENGDB
    DSYNC -.->|"manages migrations"| APPMDB
    DSYNC -.->|"manages migrations"| AUTHDB
    DSYNC -.->|"manages migrations"| ENGDB
```

| Service              | Database         | Operations             |
| -------------------- | ---------------- | ---------------------- |
| Admin                | `flowxadmin`     | Creates, reads, writes |
| Advancing Controller | `advancing`      | Creates, reads, writes |
| Application Manager  | `app_manager`    | Creates, reads, writes |
| Authorization System | `auth_system`    | Creates, reads, writes |
| Document Plugin      | `document`       | Creates, reads, writes |
| Email Gateway        | `email_gateway`  | Creates, reads, writes |
| License              | `license`        | Creates, reads, writes |
| Organization Manager | `org_manager`    | Creates, reads, writes |
| Process Engine       | `process_engine` | Creates, reads, writes |
| Task Management      | `process_engine` | Reads, writes          |
| Data Sync            | `app_manager`    | Manages migrations     |
| Data Sync            | `auth_system`    | Manages migrations     |
| Data Sync            | `process_engine` | Manages migrations     |

<Info>
  Task Management shares the `process_engine` database with Process Engine. Data Sync acts as a centralized migration coordinator for `app_manager`, `auth_system`, and `process_engine`.
</Info>
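
When sizing or scheduling maintenance, it can help to invert this mapping and see which databases serve more than one service. The sketch below does that for the read/write relationships in the table. `POSTGRES_ACCESS` and `services_by_database` are illustrative names, not part of the platform.

```python
from collections import defaultdict
from typing import Dict, List

# Service -> PostgreSQL database, copied from the read/write rows of the table above.
POSTGRES_ACCESS: Dict[str, str] = {
    "Admin": "flowxadmin",
    "Advancing Controller": "advancing",
    "Application Manager": "app_manager",
    "Authorization System": "auth_system",
    "Document Plugin": "document",
    "Email Gateway": "email_gateway",
    "License": "license",
    "Organization Manager": "org_manager",
    "Process Engine": "process_engine",
    "Task Management": "process_engine",
}


def services_by_database(access: Dict[str, str]) -> Dict[str, List[str]]:
    """Group services by the database they connect to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for service, database in access.items():
        grouped[database].append(service)
    return {database: sorted(services) for database, services in grouped.items()}


by_database = services_by_database(POSTGRES_ACCESS)
# Databases that more than one service reads and writes.
shared = {db: svcs for db, svcs in by_database.items() if len(svcs) > 1}
```

With the mapping above, `shared` contains only `process_engine`, so an outage or restore of that database affects both Process Engine and Task Management.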

### MongoDB databases

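Most services manage their own MongoDB database. Two databases are shared: `app-runtime` is written by Application Manager and read by several other services, and `notification` is written by both Notification Plugin and Scheduler.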

```mermaid theme={"system"}
flowchart LR
    subgraph Services
        ADM["Admin"]
        APPM["Application Manager"]
        CMS["CMS Core"]
        DOC["Document Plugin"]
        INTD["Integration Designer"]
        NOTIF["Notification Plugin"]
        SCHED["Scheduler"]
        TASK["Task Management"]
        ENG["Process Engine"]
        EMAIL["Email Gateway"]
        NOSQL["NoSQL DB Runner"]
    end

    subgraph MongoDB
        DM[("data-model")]
        AR[("app-runtime")]
        CMSDB[("cms-core")]
        DOCDB[("document")]
        NOTIFDB[("notification")]
        TASKDB[("task-management-plugin")]
        INTDDB[("integration-designer")]
        NOSQLDB[("nosql-db-runner")]
    end

    ADM -->|"creates, reads, writes"| DM
    APPM -->|"creates, reads, writes"| AR
    CMS -->|"creates, reads, writes"| CMSDB
    DOC -->|"creates, reads, writes"| DOCDB
    DOC -->|"reads"| AR
    INTD -->|"creates, reads, writes"| INTDDB
    INTD -->|"reads"| AR
    NOTIF -->|"creates, reads, writes"| NOTIFDB
    NOTIF -->|"reads"| AR
    SCHED -->|"creates, reads, writes"| NOTIFDB
    TASK -->|"creates, reads, writes"| TASKDB
    TASK -->|"reads"| AR
    ENG -->|"reads"| AR
    EMAIL -->|"reads"| AR
    NOSQL -->|"creates, reads, writes"| NOSQLDB
```

| Service              | Database                 | Operations             |
| -------------------- | ------------------------ | ---------------------- |
| Admin                | `data-model`             | Creates, reads, writes |
| Application Manager  | `app-runtime`            | Creates, reads, writes |
| CMS Core             | `cms-core`               | Creates, reads, writes |
| Document Plugin      | `document`               | Creates, reads, writes |
| Document Plugin      | `app-runtime`            | Reads                  |
| Integration Designer | `integration-designer`   | Creates, reads, writes |
| Integration Designer | `app-runtime`            | Reads                  |
| Notification Plugin  | `notification`           | Creates, reads, writes |
| Notification Plugin  | `app-runtime`            | Reads                  |
| Scheduler            | `notification`           | Creates, reads, writes |
| Task Management      | `task-management-plugin` | Creates, reads, writes |
| Task Management      | `app-runtime`            | Reads                  |
| Process Engine       | `app-runtime`            | Reads                  |
| Email Gateway        | `app-runtime`            | Reads                  |
| NoSQL DB Runner      | `nosql-db-runner`        | Creates, reads, writes |

<Warning>
  The `app-runtime` database is created by Application Manager and read by 6 other services. It contains runtime configuration, build data, and deployed app state. Availability of this database is critical for runtime operations.
</Warning>
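
To make the dependency on `app-runtime` concrete, the sketch below encodes the table rows and lists which services would be affected if that database were unavailable. `MONGO_ACCESS` and `services_using` are illustrative names chosen for this example.

```python
from typing import List, Tuple

# (service, database, operations) rows copied from the MongoDB table above.
MONGO_ACCESS: List[Tuple[str, str, str]] = [
    ("Admin", "data-model", "Creates, reads, writes"),
    ("Application Manager", "app-runtime", "Creates, reads, writes"),
    ("CMS Core", "cms-core", "Creates, reads, writes"),
    ("Document Plugin", "document", "Creates, reads, writes"),
    ("Document Plugin", "app-runtime", "Reads"),
    ("Integration Designer", "integration-designer", "Creates, reads, writes"),
    ("Integration Designer", "app-runtime", "Reads"),
    ("Notification Plugin", "notification", "Creates, reads, writes"),
    ("Notification Plugin", "app-runtime", "Reads"),
    ("Scheduler", "notification", "Creates, reads, writes"),
    ("Task Management", "task-management-plugin", "Creates, reads, writes"),
    ("Task Management", "app-runtime", "Reads"),
    ("Process Engine", "app-runtime", "Reads"),
    ("Email Gateway", "app-runtime", "Reads"),
    ("NoSQL DB Runner", "nosql-db-runner", "Creates, reads, writes"),
]


def services_using(database: str, operations: str) -> List[str]:
    """Return the services that access `database` with exactly the given operations."""
    return sorted(
        service
        for service, db, ops in MONGO_ACCESS
        if db == database and ops == operations
    )


readers = services_using("app-runtime", "Reads")
writers = services_using("app-runtime", "Creates, reads, writes")
```

Here `writers` contains only Application Manager and `readers` contains the six read-only consumers, which matches the count in the warning above.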

### Elasticsearch indexes

| Service        | Index                  | Operations             |
| -------------- | ---------------------- | ---------------------- |
| Audit Core     | Audit index            | Creates, reads, writes |
| Process Engine | Process instance index | Creates, reads, writes |
| Data Search    | Process instance index | Reads                  |

### Redis

All core services use Redis for **caching only** via Spring Cache Manager. Events Gateway additionally uses Redis for pub/sub messaging.

Services using Redis for caching: Admin, Application Manager, Authorization System, CMS Core, Document Plugin, Email Gateway, Integration Designer, License, Notification Plugin, Organization Manager, Process Engine, Task Management.

| Service        | Usage          |
| -------------- | -------------- |
| Events Gateway | Cache, pub/sub |

### S3-compatible object storage buckets

<Info>
  The bucket names shown are configuration defaults and can be overridden through environment variables (for example, `MINIO_BUCKET_PREFIX`), so your deployment may use different names. In some deployments, multiple services may share the same bucket. Ensure cross-bucket read access is configured where needed.
</Info>

| Service              | Default Bucket               | Env Variable          | Purpose                                    |
| -------------------- | ---------------------------- | --------------------- | ------------------------------------------ |
| Application Manager  | `applications-bucket`        | —                     | App builds and exported/imported resources |
| CMS Core             | `media-library-bucket`       | —                     | Public media assets                        |
| CMS Core             | `cms-private-storage-bucket` | —                     | Private theme and font files               |
| Document Plugin      | `flowx-dev-bucket`           | `MINIO_BUCKET_PREFIX` | Document templates and generated files     |
| Document Plugin      | `temp-bucket`                | `MINIO_TEMP_BUCKET`   | Temporary processing files                 |
| Integration Designer | `workflows-bucket`           | —                     | Workflow definitions and test files        |
| Notification Plugin  | `flowx-dev`                  | `MINIO_BUCKET_PREFIX` | Notification templates and attachments     |
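
When auditing a deployment, you may want to check which bucket name is actually in effect. The sketch below shows a generic "environment override with default" lookup. It assumes that `MINIO_TEMP_BUCKET` holds a complete bucket name; how the platform combines `MINIO_BUCKET_PREFIX` with other values is not described on this page, so that variable is left out. `resolve_bucket` is a hypothetical helper, and `acme-tmp` is a made-up value.

```python
from typing import Mapping


def resolve_bucket(env_var: str, default: str, env: Mapping[str, str]) -> str:
    """Return the override from `env` if it is set and non-empty, else the documented default."""
    value = env.get(env_var, "").strip()
    return value or default


# No override present: the default from the table applies.
default_name = resolve_bucket("MINIO_TEMP_BUCKET", "temp-bucket", {})

# Override present (hypothetical value): the override wins.
custom_name = resolve_bucket(
    "MINIO_TEMP_BUCKET", "temp-bucket", {"MINIO_TEMP_BUCKET": "acme-tmp"}
)
```

In a real check you would pass `os.environ` as the `env` argument instead of a literal dictionary.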

***

## Related resources

<CardGroup cols={2}>
  <Card title="Third-party components" href="../platform-deep-dive/third-party-components" icon="cube">
    Supported versions and compatibility matrix
  </Card>

  <Card title="FlowX Database" href="../platform-deep-dive/integrations/flowx-database" icon="database">
    Cross-instance, long-term data storage using MongoDB
  </Card>

  <Card title="FlowX.AI architecture" href="./flowx-architecture" icon="diagram-project">
    Overall platform architecture and microservices overview
  </Card>

  <Card title="Redis configuration" href="../../setup-guides/redis-configuration" icon="fire">
    Redis deployment modes and configuration
  </Card>
</CardGroup>
