Overview

The FlowX Data Search service enables powerful searching capabilities across your FlowX platform. This guide provides detailed instructions for setting up, configuring, and deploying the service in your environment.

Quick start

# 1. Ensure infrastructure prerequisites are met (Redis, Kafka, Elasticsearch)
# 2. Configure your environment variables in a data-search.yaml file
# 3. Deploy the service
kubectl apply -f data-search.yaml
# 4. Verify the deployment
kubectl get deployment data-search
kubectl logs deployment/data-search
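
A minimal sketch of what data-search.yaml might contain is shown below. The image name, port, and env values here are illustrative assumptions, not official FlowX settings — adapt them to your registry and environment:

```yaml
# Illustrative Deployment sketch for the Data Search service.
# The image name and env values are assumptions; replace with your own.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-search
spec:
  replicas: 1
  selector:
    matchLabels:
      app: data-search
  template:
    metadata:
      labels:
        app: data-search
    spec:
      containers:
        - name: data-search
          image: flowx/data-search:latest   # assumed image name
          env:
            - name: SPRING_KAFKA_BOOTSTRAPSERVERS
              value: "localhost:9092"
            - name: SPRING_ELASTICSEARCH_REST_URIS
              value: "elasticsearch-master:9200"
```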

Infrastructure prerequisites

The FlowX Data Search service requires the following infrastructure components:

| Component | Minimum Version | Purpose |
| --- | --- | --- |
| Redis | 6.0+ | Caching search results and configurations |
| Kafka | 2.8+ | Message-based communication with the engine |
| Elasticsearch | 7.11.0+ | Indexing and searching data |

Configuration

Kafka configuration

Configure Kafka communication using these environment variables and properties:

Basic Kafka settings

| Variable | Description | Default/Example |
| --- | --- | --- |
| SPRING_KAFKA_BOOTSTRAPSERVERS | Address of Kafka server(s) | localhost:9092 |
| SPRING_KAFKA_SECURITY_PROTOCOL | Security protocol for Kafka | PLAINTEXT |
| KAFKA_CONSUMER_THREADS | Number of Kafka consumer threads | 1 |
| KAFKA_MESSAGE_MAX_BYTES | Maximum message size | 52428800 (50 MB) |
| KAFKA_OAUTH_CLIENT_ID | OAuth client ID for Kafka authentication | kafka |
| KAFKA_OAUTH_CLIENT_SECRET | OAuth client secret | kafka-secret |
| KAFKA_OAUTH_TOKEN_ENDPOINT_URI | OAuth token endpoint | kafka.auth.localhost |
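
As a quick sanity check on the default message size, 52428800 is simply 50 MB expressed in bytes:

```shell
# 50 MB in bytes -- matches the KAFKA_MESSAGE_MAX_BYTES default.
max_bytes=$((50 * 1024 * 1024))
echo "$max_bytes"   # 52428800
```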

Topic naming configuration

The Data Search service uses a structured topic naming convention:

{package}.{environment}.{component}.{action}.{version}

For example: ai.flowx.dev.core.trigger.search.data.v1

| Variable | Description | Default |
| --- | --- | --- |
| KAFKA_TOPIC_NAMING_SEPARATOR | Primary separator for topic naming | . |
| KAFKA_TOPIC_NAMING_SEPARATOR2 | Secondary separator | - |
| KAFKA_TOPIC_NAMING_PACKAGE | Package prefix | ai.flowx. |
| KAFKA_TOPIC_NAMING_ENVIRONMENT | Environment name | dev. |
| KAFKA_TOPIC_NAMING_VERSION | Version suffix | .v1 |
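
With the defaults above, the pieces assemble into the example topic name. A small sketch, where the component-and-action segment `core.trigger.search.data` is taken from the incoming-search-requests example:

```shell
# Assemble a topic name from the naming variables (defaults from the table above).
KAFKA_TOPIC_NAMING_PACKAGE="ai.flowx."
KAFKA_TOPIC_NAMING_ENVIRONMENT="dev."
KAFKA_TOPIC_NAMING_VERSION=".v1"
component_and_action="core.trigger.search.data"  # component + action for incoming search requests
topic="${KAFKA_TOPIC_NAMING_PACKAGE}${KAFKA_TOPIC_NAMING_ENVIRONMENT}${component_and_action}${KAFKA_TOPIC_NAMING_VERSION}"
echo "$topic"   # ai.flowx.dev.core.trigger.search.data.v1
```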

Kafka topics

The service uses these specific topics:

| Topic | Default Value | Purpose |
| --- | --- | --- |
| KAFKA_TOPIC_DATA_SEARCH_IN | ai.flowx.dev.core.trigger.search.data.v1 | Incoming search requests |
| KAFKA_TOPIC_DATA_SEARCH_OUT | ai.flowx.dev.engine.receive.core.search.data.results.v1 | Outgoing search results |

Elasticsearch configuration

Set up Elasticsearch connectivity with these environment variables:

| Variable | Description | Default | Example |
| --- | --- | --- | --- |
| SPRING_ELASTICSEARCH_REST_URIS | Elasticsearch server address(es) | localhost:9200 | elasticsearch-master:9200 |
| SPRING_ELASTICSEARCH_REST_PROTOCOL | Protocol for Elasticsearch communication | http | https |
| SPRING_ELASTICSEARCH_REST_DISABLESSL | Whether to disable SSL verification | false | true |
| SPRING_ELASTICSEARCH_REST_USERNAME | Elasticsearch username | "" (empty) | elastic |
| SPRING_ELASTICSEARCH_REST_PASSWORD | Elasticsearch password | "" (empty) | changeme |
| SPRING_ELASTICSEARCH_INDEX_SETTINGS_NAME | Name of the index to use | process_instance | flowx_data |

Security configuration

Configure authentication and authorization with these variables:

| Variable | Description | Example |
| --- | --- | --- |
| SECURITY_OAUTH2_BASESERVERURL | Base URL for OAuth2 server | https://keycloak.example.com/auth |
| SECURITY_OAUTH2_CLIENT_CLIENTID | OAuth2 client ID | data-search-service |
| SECURITY_OAUTH2_CLIENT_CLIENTSECRET | OAuth2 client secret | data-search-service-secret |
| SECURITY_OAUTH2_REALM | OAuth2 realm name | flowx |
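
Rather than hard-coding the client secret in the manifest, it can be injected from a Kubernetes Secret. A hedged sketch — the Secret name `data-search-secrets` and its key name are assumptions, not FlowX conventions:

```yaml
# Illustrative: wire the OAuth2 client secret from a Kubernetes Secret
# instead of a plain-text value in the Deployment manifest.
env:
  - name: SECURITY_OAUTH2_CLIENT_CLIENTID
    value: data-search-service
  - name: SECURITY_OAUTH2_CLIENT_CLIENTSECRET
    valueFrom:
      secretKeyRef:
        name: data-search-secrets   # assumed Secret name
        key: oauth2-client-secret   # assumed key name
```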

Logging configuration

Control the verbosity of logs with these variables:

| Variable | Description | Default | Example |
| --- | --- | --- | --- |
| LOGGING_LEVEL_ROOT | Root Spring Boot log level | INFO | ERROR |
| LOGGING_LEVEL_APP | Application-specific log level | INFO | DEBUG |

Elasticsearch index configuration

The Data Search service creates and manages Elasticsearch indices based on the configured index pattern. The default index name is process_instance.

Index pattern

The service derives the index pattern from the spring.elasticsearch.index-settings.name property. This pattern is used to query across multiple indices that match the pattern.
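
For instance, with the default index name the service would match every index whose name begins with that value. The wildcard form shown here is an assumption about how the pattern is expanded:

```shell
# Derive a wildcard pattern from the configured index name (illustrative).
index_name="process_instance"    # spring.elasticsearch.index-settings.name
index_pattern="${index_name}*"
echo "$index_pattern"   # process_instance*
```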

Sample search query

Below is an example of a search query generated by the Data Search service for Elasticsearch:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "boost": 1,
      "must": [
        {
          "nested": {
            "boost": 1,
            "ignore_unmapped": false,
            "path": "keyIdentifiers",
            "query": {
              "bool": {
                "adjust_pure_negative": true,
                "boost": 1,
                "must": [
                  {
                    "match": {
                      "keyIdentifiers.key.keyword": {
                        "query": "astonishingAttribute",
                        "operator": "OR"
                      }
                    }
                  },
                  {
                    "match": {
                      "keyIdentifiers.originalValue.keyword": {
                        "query": "OriginalGangsta",
                        "operator": "OR"
                      }
                    }
                  }
                ]
              }
            },
            "score_mode": "none"
          }
        },
        {
          "terms": {
            "boost": 1,
            "processDefinitionName.keyword": [
              "TEST_PORCESS_NAME_0",
              "TEST_PORCESS_NAME_1"
            ]
          }
        }
      ]
    }
  }
}

Troubleshooting

Common issues

  1. Elasticsearch connection problems:

    • Verify Elasticsearch is running and accessible
    • Check if credentials are correct
    • Ensure SSL settings match your environment
  2. Kafka communication issues:

    • Verify Kafka topics exist and are properly configured
    • Check Kafka permissions for the service
    • Ensure bootstrap servers are correctly specified
  3. Search not returning results:

    • Verify index pattern matches existing indices
    • Check if data is being properly indexed
    • Review search query format for errors

Log analysis

Monitor logs for errors and warnings:

# For Docker
docker logs flowx-data-search

# For Kubernetes
kubectl logs deployment/data-search

Integration with Kibana

Kibana provides a powerful interface for visualizing and exploring data indexed by the Data Search service.

  1. Connect Kibana to the same Elasticsearch instance
  2. Create an index pattern matching your configured index name
  3. Use the Discover tab to explore indexed data
  4. Create visualizations and dashboards based on your data

Kibana is an open-source data visualization and exploration tool designed primarily for Elasticsearch. As the visualization layer of the Elastic Stack, it lets users query, analyze, and visualize the data stored in Elasticsearch. For more information, see the official Kibana documentation.

Best practices

  1. Security:

    • Store sensitive credentials in Kubernetes Secrets
    • Use TLS for Elasticsearch and Kafka communication
    • Implement network policies to restrict access
  2. Performance:

    • Scale the number of replicas based on query load
    • Adjust Kafka consumer threads based on message volume
    • Configure appropriate resource limits and requests
  3. Monitoring:

    • Set up monitoring for Elasticsearch, Kafka, and Redis
    • Create alerts for service availability and performance
    • Monitor disk space for Elasticsearch data nodes
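
For the performance points above, a hedged example of replica and resource settings in the Deployment spec — the numbers are starting-point assumptions to tune against your query load, not FlowX recommendations:

```yaml
# Illustrative scaling and resource settings for the data-search Deployment.
spec:
  replicas: 2                      # scale with query load
  template:
    spec:
      containers:
        - name: data-search
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
```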