# FlowX Data Search setup

> Comprehensive guide for installing, configuring, and deploying the FlowX Data Search service

The FlowX Data Search service provides search capabilities across your FlowX platform. This guide explains how to set up, configure, and deploy the service in your environment.

***

## Infrastructure prerequisites

The FlowX Data Search service requires the following infrastructure components:

| Component         | Purpose                                     |
| ----------------- | ------------------------------------------- |
| **Redis**         | Caching search results and configurations   |
| **Kafka**         | Message-based communication with the engine |
| **Elasticsearch** | Indexing and searching data                 |

***

## Configuration

### Kafka configuration

Configure Kafka communication using these environment variables and properties:

#### Basic Kafka settings

| Variable                  | Description                                                              | Default Value      |
| ------------------------- | ------------------------------------------------------------------------ | ------------------ |
| `KAFKA_BOOTSTRAP_SERVERS` | Kafka broker addresses (fallback: `SPRING_KAFKA_BOOTSTRAP_SERVERS`)      | `localhost:9092`   |
| `KAFKA_SECURITY_PROTOCOL` | Security protocol for Kafka (fallback: `SPRING_KAFKA_SECURITY_PROTOCOL`) | `PLAINTEXT`        |
| `KAFKA_CONSUMER_THREADS`  | Number of Kafka consumer threads                                         | `1`                |
| `KAFKA_MESSAGE_MAX_BYTES` | Maximum message size                                                     | `52428800` (50 MB) |
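
The fallback behavior listed in the table can be pictured with a small sketch. It is illustrative only and not the service's actual code: the helper name `resolve_setting` is hypothetical, and the lookup order (primary variable, then the `SPRING_KAFKA_*` fallback, then the default) is an assumption based on the table. The sketch also treats an empty value as unset, which may differ from the real behavior.

```python
import os


def resolve_setting(primary, fallback, default, env=None):
    """Return the primary variable, else the fallback variable, else the default.

    Hypothetical helper: empty strings are treated the same as unset variables.
    """
    env = os.environ if env is None else env
    return env.get(primary) or env.get(fallback) or default


# Resolve the broker list the way the table describes it.
bootstrap_servers = resolve_setting(
    "KAFKA_BOOTSTRAP_SERVERS",
    "SPRING_KAFKA_BOOTSTRAP_SERVERS",
    "localhost:9092",
)
print(bootstrap_servers)
```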

#### OAuth authentication (when using SASL\_PLAINTEXT)

| Environment Variable             | Description          | Default Value          |
| -------------------------------- | -------------------- | ---------------------- |
| `KAFKA_OAUTH_CLIENT_ID`          | OAuth client ID      | `kafka`                |
| `KAFKA_OAUTH_CLIENT_SECRET`      | OAuth client secret  | `kafka-secret`         |
| `KAFKA_OAUTH_TOKEN_ENDPOINT_URI` | OAuth token endpoint | `kafka.auth.localhost` |

<Info>
  When using the `kafka-auth` profile, the security protocol will automatically be set to `SASL_PLAINTEXT` and the SASL mechanism will be set to `OAUTHBEARER`.
</Info>

#### Topic naming configuration

The Data Search service uses a structured topic naming convention:

```
{package}{environment}.{component}.{action}.{version}
```

For example: `ai.flowx.core.trigger.search.data.v1`

| Variable                                  | Description                         | Default Value     |
| ----------------------------------------- | ----------------------------------- | ----------------- |
| `KAFKA_TOPIC_NAMING_PACKAGE`              | Package prefix for topic names      | `ai.flowx.`       |
| `KAFKA_TOPIC_NAMING_ENVIRONMENT`          | Environment segment for topic names | ` `               |
| `KAFKA_TOPIC_NAMING_VERSION`              | Version suffix for topic names      | `.v1`             |
| `KAFKA_TOPIC_NAMING_SEPARATOR`            | Primary separator for topic naming  | `.`               |
| `KAFKA_TOPIC_NAMING_SEPARATOR2`           | Secondary separator                 | `-`               |
| `KAFKA_TOPIC_NAMING_ENGINERECEIVEPATTERN` | Engine receive pattern              | `engine.receive.` |
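
To see how the naming variables combine into the default topic names, here is a minimal sketch. It assumes plain string concatenation of package, environment, topic body, and version, which is inferred from the default values (the package ends with a dot and the version starts with one). The function name `build_topic` is hypothetical, and the service may compose names differently.

```python
import os


def build_topic(body, env=None):
    """Concatenate the naming parts around a topic body.

    Assumption: package, environment, and version carry their own separators,
    as the defaults `ai.flowx.` and `.v1` suggest.
    """
    env = os.environ if env is None else env
    package = env.get("KAFKA_TOPIC_NAMING_PACKAGE", "ai.flowx.")
    environment = env.get("KAFKA_TOPIC_NAMING_ENVIRONMENT", "")
    version = env.get("KAFKA_TOPIC_NAMING_VERSION", ".v1")
    return f"{package}{environment}{body}{version}"


def build_engine_receive_topic(body, env=None):
    """Prefix the topic body with the engine receive pattern."""
    source = os.environ if env is None else env
    pattern = source.get("KAFKA_TOPIC_NAMING_ENGINERECEIVEPATTERN", "engine.receive.")
    return build_topic(f"{pattern}{body}", env)


# With an empty environment, only the documented defaults apply.
print(build_topic("core.trigger.search.data", env={}))
print(build_engine_receive_topic("core.search.data.results", env={}))
```

Under these assumptions, a non-empty `KAFKA_TOPIC_NAMING_ENVIRONMENT` would need to include its own trailing separator (for example `dev.`).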

#### Kafka topics

The service uses these specific topics:

| Topic                         | Default Value                                         | Purpose                  |
| ----------------------------- | ----------------------------------------------------- | ------------------------ |
| `KAFKA_TOPIC_DATA_SEARCH_IN`  | `ai.flowx.core.trigger.search.data.v1`                | Incoming search requests |
| `KAFKA_TOPIC_DATA_SEARCH_OUT` | `ai.flowx.engine.receive.core.search.data.results.v1` | Outgoing search results  |

### Elasticsearch configuration

Configure Elasticsearch connection using the following environment variables:

| Variable                                  | Description                                 | Default Value      | Example Value        |
| ----------------------------------------- | ------------------------------------------- | ------------------ | -------------------- |
| `SPRING_ELASTICSEARCH_REST_URIS`          | URL(s) of Elasticsearch nodes (no protocol) | -                  | `elasticsearch:9200` |
| `SPRING_ELASTICSEARCH_REST_PROTOCOL`      | Connection protocol                         | `https`            | `https` or `http`    |
| `SPRING_ELASTICSEARCH_REST_DISABLESSL`    | Disable SSL verification                    | `false`            | `false`              |
| `SPRING_ELASTICSEARCH_REST_USERNAME`      | Authentication username                     | -                  | `elastic`            |
| `SPRING_ELASTICSEARCH_REST_PASSWORD`      | Authentication password                     | -                  | `your-password`      |
| `SPRING_ELASTICSEARCH_INDEXSETTINGS_NAME` | Index name for search data                  | `process_instance` | `process_instance`   |
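
Because `SPRING_ELASTICSEARCH_REST_URIS` holds host addresses without a protocol, the protocol variable has to be combined with it to get a full URL. The sketch below shows one way to picture that. The function name `elasticsearch_urls` is hypothetical, and the comma-separated format for multiple nodes is an assumption, not something this guide confirms.

```python
def elasticsearch_urls(env):
    """Combine the protocol and the host list into full node URLs.

    Assumption: multiple nodes are given as a comma-separated list.
    """
    protocol = env.get("SPRING_ELASTICSEARCH_REST_PROTOCOL", "https")
    uris = env.get("SPRING_ELASTICSEARCH_REST_URIS", "")
    hosts = [host.strip() for host in uris.split(",") if host.strip()]
    return [f"{protocol}://{host}" for host in hosts]


# Example values taken from the table above.
example_env = {
    "SPRING_ELASTICSEARCH_REST_URIS": "elasticsearch:9200",
    "SPRING_ELASTICSEARCH_REST_PROTOCOL": "https",
}
print(elasticsearch_urls(example_env))
```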

### Security configuration

The Data Search service validates incoming tokens using JWT public key validation. It does not initiate service-to-service calls, so it needs no service-account client registration. Configure it with the following variables:

| Variable                                                  | Description                                                           | Default Value                    |
| --------------------------------------------------------- | --------------------------------------------------------------------- | -------------------------------- |
| `SECURITY_TYPE`                                           | Token validation mechanism (JWT public key validation)                | `jwt-public-key`                 |
| `SECURITY_OAUTH2_BASESERVERURL`                           | Base URL of the Keycloak server                                       |                                  |
| `FLOWX_LIB_SECURITY_SERVICES_ORGANIZATIONMANAGER_BASEURL` | URL of the organization-manager service, used by the security library | `http://organization-manager:80` |

<Warning>
  **Upgrading from 5.1.x?** Remove the legacy opaque-token env vars: `SECURITY_OAUTH2_REALM`, `SECURITY_OAUTH2_CLIENT_CLIENTID`, and `SECURITY_OAUTH2_CLIENT_CLIENTSECRET`. These belong to the removed introspection model and prevent the service from starting on 5.9.x. See the [authentication and IAM migration guide](/5.9/migrating-from-5.1-lts/authentication-iam) for the full list.
</Warning>
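
Before upgrading, it can help to scan the deployment environment for the legacy variables named in the warning. The following is a minimal sketch of such a check. The helper name `find_legacy_vars` is hypothetical, and it only inspects the environment mapping it is given, not Helm values or Kubernetes manifests.

```python
import os

# Legacy opaque-token variables listed in the warning above.
LEGACY_VARS = (
    "SECURITY_OAUTH2_REALM",
    "SECURITY_OAUTH2_CLIENT_CLIENTID",
    "SECURITY_OAUTH2_CLIENT_CLIENTSECRET",
)


def find_legacy_vars(env=None):
    """Return the legacy variable names that are still set."""
    env = os.environ if env is None else env
    return [name for name in LEGACY_VARS if name in env]


leftovers = find_legacy_vars()
if leftovers:
    print("Remove before upgrading:", ", ".join(leftovers))
else:
    print("No legacy security variables found.")
```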

### Logging configuration

Control the verbosity of logs with these variables:

| Variable             | Description                    | Default Value |
| -------------------- | ------------------------------ | ------------- |
| `LOGGING_LEVEL_ROOT` | Root Spring Boot log level     | `INFO`        |
| `LOGGING_LEVEL_APP`  | Application-specific log level | `INFO`        |

***

## Elasticsearch index configuration

The Data Search service creates and manages Elasticsearch indices based on the configured index pattern. The default index name is `process_instance`.

### Index pattern

The service derives the index pattern from the `spring.elasticsearch.index-settings.name` property (environment variable `SPRING_ELASTICSEARCH_INDEXSETTINGS_NAME`) and uses it to query across all indices that match the pattern.

### Sample search query

Below is an example of a search query generated by the Data Search service for Elasticsearch:

```json theme={"system"}
{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "boost": 1,
      "must": [
        {
          "nested": {
            "boost": 1,
            "ignore_unmapped": false,
            "path": "keyIdentifiers",
            "query": {
              "bool": {
                "adjust_pure_negative": true,
                "boost": 1,
                "must": [
                  {
                    "match": {
                      "keyIdentifiers.key.keyword": {
                        "query": "astonishingAttribute",
                        "operator": "OR"
                      }
                    }
                  },
                  {
                    "match": {
                      "keyIdentifiers.originalValue.keyword": {
                        "query": "OriginalGangsta",
                        "operator": "OR"
                      }
                    }
                  }
                ]
              }
            },
            "score_mode": "none"
          }
        },
        {
          "terms": {
            "boost": 1,
            "processDefinitionName.keyword": [
              "TEST_PROCESS_NAME_0",
              "TEST_PROCESS_NAME_1"
            ]
          }
        }
        }
      ]
    }
  }
}
```
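
To reproduce a query of this shape by hand, for example when testing against Elasticsearch directly, a small builder like the one below may help. It is a sketch and not the service's own query generator. The function name `build_search_query` and the sample values are hypothetical, and it leaves out `boost`, `adjust_pure_negative`, `ignore_unmapped`, and `operator` on the assumption that the values shown above match the Elasticsearch defaults.

```python
import json


def build_search_query(key, value, process_definition_names):
    """Build a query matching one key identifier within the given process definitions."""
    key_identifier_clause = {
        "nested": {
            "path": "keyIdentifiers",
            "score_mode": "none",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"keyIdentifiers.key.keyword": {"query": key}}},
                        {"match": {"keyIdentifiers.originalValue.keyword": {"query": value}}},
                    ]
                }
            },
        }
    }
    process_name_clause = {
        "terms": {"processDefinitionName.keyword": list(process_definition_names)}
    }
    return {"query": {"bool": {"must": [key_identifier_clause, process_name_clause]}}}


# Hypothetical sample values, for illustration only.
query = build_search_query("customerId", "12345", ["EXAMPLE_PROCESS"])
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the configured index.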

***

## Integration with Kibana

Kibana provides an interface for visualizing and exploring data indexed by the Data Search service.

### Using Kibana with FlowX Data Search

1. Connect Kibana to the same Elasticsearch instance
2. Create an index pattern matching your configured index name
3. Use the Discover tab to explore indexed data
4. Create visualizations and dashboards based on your data

<Info>
  Kibana is an open-source visualization and exploration tool for Elasticsearch. As the visualization layer of the Elastic Stack, it lets you query, analyze, and visualize the data stored in Elasticsearch. For more information, see the [Kibana official documentation](https://www.elastic.co/guide/en/kibana/current/index.html).
</Info>

***

## Best practices

1. **Security**:
   * Store sensitive credentials in Kubernetes Secrets
   * Use TLS for Elasticsearch and Kafka communication
   * Implement network policies to restrict access

2. **Performance**:
   * Scale the number of replicas based on query load
   * Adjust Kafka consumer threads based on message volume
   * Configure appropriate resource limits and requests

3. **Monitoring**:
   * Set up monitoring for Elasticsearch, Kafka, and Redis
   * Create alerts for service availability and performance
   * Monitor disk space for Elasticsearch data nodes

***

## Troubleshooting

<AccordionGroup>
  <Accordion title="Elasticsearch connection failures">
    **Symptoms:** Service fails to start or search requests return errors.

    **Solutions:**

    1. Verify Elasticsearch is running and accessible at the configured URL
    2. Check that credentials in `SPRING_ELASTICSEARCH_REST_USERNAME` and `SPRING_ELASTICSEARCH_REST_PASSWORD` are correct
    3. Ensure SSL settings match your environment — set `SPRING_ELASTICSEARCH_REST_DISABLESSL` to `true` if not using TLS
    4. Confirm the protocol in `SPRING_ELASTICSEARCH_REST_PROTOCOL` matches your Elasticsearch setup (`https` or `http`)
  </Accordion>

  <Accordion title="Redis issues">
    **Symptoms:** Cache misses, stale search results, or Redis connection errors.

    **Solutions:**

    1. Verify Redis is running and accessible
    2. Check Redis authentication credentials
    3. Ensure network policies allow traffic between the Data Search pod and Redis
    4. Monitor Redis memory usage — eviction policies may cause cache misses under high load
  </Accordion>

  <Accordion title="Kafka sync failures">
    **Symptoms:** Search requests are not received or results are not delivered back to the engine.

    **Solutions:**

    1. Verify Kafka topics exist — check that `KAFKA_TOPIC_DATA_SEARCH_IN` and `KAFKA_TOPIC_DATA_SEARCH_OUT` topics are created
    2. Check Kafka permissions for the consumer group
    3. Ensure bootstrap servers in `KAFKA_BOOTSTRAP_SERVERS` are correctly specified
    4. If using OAuth, verify the token endpoint is accessible and credentials are valid
  </Accordion>

  <Accordion title="Indexing performance">
    **Symptoms:** Slow search responses, high latency, or timeouts.

    **Solutions:**

    1. Increase `KAFKA_CONSUMER_THREADS` to process more messages in parallel
    2. Verify the Elasticsearch cluster health — yellow or red status impacts performance
    3. Check index size and shard distribution in Elasticsearch
    4. Monitor Elasticsearch JVM heap usage and adjust resource limits if needed
    5. Review the `KAFKA_MESSAGE_MAX_BYTES` setting if large payloads are being processed
  </Accordion>
</AccordionGroup>
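
Several of the checks above come down to asking Elasticsearch for its cluster health. The sketch below uses only the Python standard library and the Elasticsearch `_cluster/health` endpoint. The function names are hypothetical, the example URL and credentials are the sample values from the configuration table, and TLS verification is left at Python's defaults, so a self-signed certificate would need extra handling.

```python
import base64
import json
import urllib.request


def cluster_health_request(base_url, username=None, password=None):
    """Build a GET request for the _cluster/health endpoint, with optional basic auth."""
    request = urllib.request.Request(f"{base_url.rstrip('/')}/_cluster/health")
    if username and password:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request


def summarize_health(body):
    """Return (status, needs_attention) from a _cluster/health JSON body."""
    status = json.loads(body)["status"]
    return status, status != "green"


def check_cluster(base_url, username=None, password=None, timeout=5):
    """Call the cluster and summarize its health; raises on connection errors."""
    request = cluster_health_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_health(response.read())
```

For example, `check_cluster("https://elasticsearch:9200", "elastic", "your-password")` returns the status together with a flag that is true for `yellow` or `red`.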

***

## Related resources

<CardGroup cols={2}>
  <Card title="Elasticsearch Indexing" icon="magnifying-glass" href="./flowx-engine-setup-guide/configuring-elasticsearch-indexing/elasticsearch-indexing">
    Configure Elasticsearch indexing for process data
  </Card>

  <Card title="Redis Configuration" icon="database" href="./redis-configuration">
    Complete Redis setup including Sentinel and Cluster modes
  </Card>

  <Card title="Kafka Authentication" icon="lock" href="./kafka-authentication-config">
    Configure Kafka security and authentication
  </Card>

  <Card title="IAM Configuration" icon="key" href="./access-management/configuring-an-iam-solution">
    Identity and access management setup
  </Card>
</CardGroup>
