Reporting setup
This guide explains how to configure the reporting plugin, together with the dependencies and configuration it relies on.
The FlowX Reporting solution provides powerful data analytics and visualization capabilities for your FlowX platform. This guide offers step-by-step instructions for setting up and configuring all components of the reporting system.
The reporting solution consists of three main components:
- Reporting Plugin: Extracts and processes data from the FlowX Engine
- Spark Application: Handles data transformation and loading operations
- Apache Superset: Provides the visualization interface and dashboard capabilities
Dependencies
The reporting plugin, available as a Docker image, requires the following dependencies:
- PostgreSQL: Dedicated instance for reporting data storage.
- Reporting-plugin Helm Chart:
  - Uses a Spark Application to extract data from the FLOWX.AI Engine database and populate the Reporting plugin database.
  - Uses the Spark Operator.
- Superset:
  - Requires a dedicated PostgreSQL database for its operation.
  - Uses Redis for efficient caching.
  - Exposes its user interface via an ingress.
Prerequisites
Before starting the installation, ensure you have:
- Kubernetes cluster with Helm installed
- Access to PostgreSQL databases for:
  - FlowX Engine database (source)
  - Reporting database (destination)
  - Superset metadata database
- Docker registry access for the reporting images
- Redis instance for Superset caching
- Ingress controller for exposing Superset UI
Reporting plugin Helm chart configuration
Configuring the reporting plugin involves several steps:
Installation of Spark Operator
- Install the Spark Operator using Helm.
- Apply the RBAC configurations.
- Build the reporting image.
- Update the reporting-image URL in the spark-app.yml file.
- Configure the correct database ENV variables in the spark-app.yml file (see the Spark Operator deployment options, with and without the webhook).
- Deploy the application.
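The steps above can be sketched as a dry-run command plan. This is a minimal illustration, not the official installation script: the chart repository URL, image name, release name, namespace, and the RBAC file name `spark-rbac.yml` are placeholders chosen for this example, and only `spark-app.yml` comes from this guide. Nothing is executed; the script only prints the commands so they can be reviewed first.

```python
import shlex


def build_install_plan(repo_url, image, release="spark-operator",
                       namespace="spark-operator",
                       rbac_file="spark-rbac.yml",   # hypothetical file name
                       app_file="spark-app.yml"):
    """Return the installation steps as a list of argv lists (nothing is run)."""
    return [
        # 1. Install the Spark Operator using Helm.
        ["helm", "repo", "add", "spark-operator", repo_url],
        ["helm", "install", release, "spark-operator/spark-operator",
         "--namespace", namespace, "--create-namespace"],
        # 2. Apply the RBAC configurations.
        ["kubectl", "apply", "-f", rbac_file],
        # 3. Build (and publish) the reporting image.
        ["docker", "build", "-t", image, "."],
        ["docker", "push", image],
        # 4-5. Editing the image URL and database variables in spark-app.yml
        #      is a manual step and is not represented here.
        # 6. Deploy the application.
        ["kubectl", "apply", "-f", app_file],
    ]


if __name__ == "__main__":
    plan = build_install_plan(
        repo_url="https://example.com/spark-operator-charts",  # placeholder
        image="registry.example.com/reporting-image:latest",   # placeholder
    )
    for argv in plan:
        print(shlex.join(argv))
```

Returning argv lists rather than shell strings keeps quoting explicit, so the same plan could later be passed to `subprocess.run` one step at a time.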
Spark Operator deployment options
Without webhook
For deployments without a webhook, environment variables, including credentials, are set directly in the configuration:
NOTE: In this mode passwords are set as plain strings, which is not secure in a production environment.
With webhook
When using the webhook, define environment variables that reference secrets, so that passwords are not stored as plain strings:
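As a rough illustration of the secret-based approach, the sketch below builds a standard Kubernetes Secret manifest and a container environment entry that reads its value from that Secret. The Secret name `reporting-db`, the key `password`, and the variable name `REPORTING_DB_PASSWORD` are hypothetical; how the chart passes such entries to the driver and executor is not shown here and should be checked against the chart's own values file.

```python
import base64
import json


def build_secret(name, values):
    """Build a Kubernetes Secret manifest; Secret data must be base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


def env_from_secret(env_name, secret_name, key):
    """Build a container env entry whose value is read from a Secret key."""
    return {
        "name": env_name,
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
    }


if __name__ == "__main__":
    # All names and values below are placeholders for illustration only.
    secret = build_secret("reporting-db", {"password": "change-me"})
    env = env_from_secret("REPORTING_DB_PASSWORD", "reporting-db", "password")
    print(json.dumps(secret, indent=2))
    print(json.dumps(env, indent=2))
```

The printed Secret is valid JSON and can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.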
In Kubernetes-based Spark deployments managed by the Spark Operator, the sparkApplication configuration customizes the behavior, resources, and environment of both the driver and executor components of Spark jobs. The driver section holds the parameters that apply only to the driver part of the Spark application.
Below are the configurable values within the chart values.yml file (with webhook):
Superset configuration
For a detailed configuration guide, see Superset configuration and Superset docker image. For in-depth information, refer to the Superset documentation.
Post-installation steps
After installation, perform the following essential configurations:
Datasource configuration
For document-related data storage, configure these environment variables:
- SPRING_DATASOURCE_URL
- SPRING_DATASOURCE_USERNAME
- SPRING_DATASOURCE_PASSWORD
Make sure these values are accurate, because incorrect values cause startup errors. A Liquibase script manages the schema and its migrations.
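Because wrong datasource values only surface as startup errors, a small pre-flight check can catch them earlier. The sketch below is one possible check, not part of the plugin: it verifies that the three variables are set and, on the assumption that the datasource is the PostgreSQL instance described above, that the URL is a PostgreSQL JDBC URL.

```python
import os

REQUIRED = (
    "SPRING_DATASOURCE_URL",
    "SPRING_DATASOURCE_USERNAME",
    "SPRING_DATASOURCE_PASSWORD",
)


def check_datasource_env(env):
    """Return a list of problems in the datasource settings (empty if none)."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    url = env.get("SPRING_DATASOURCE_URL", "")
    # Assumption: the datasource is PostgreSQL, so a JDBC PostgreSQL URL is expected.
    if url and not url.startswith("jdbc:postgresql://"):
        problems.append("SPRING_DATASOURCE_URL is not a PostgreSQL JDBC URL")
    return problems


if __name__ == "__main__":
    for problem in check_datasource_env(os.environ):
        print("WARNING:", problem)
```

The function takes a plain mapping instead of reading `os.environ` directly, which makes it easy to run against values parsed from a deployment file as well.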
Redis configuration
Set the following variables to the values for your Redis instance:
- SPRING_REDIS_HOST
- SPRING_REDIS_PORT
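A quick way to confirm the Redis values before deploying is to parse them and try a plain TCP connection. This is a minimal sketch under the assumption that the host is reachable from where the script runs; it only tests that the port accepts connections and does not speak the Redis protocol or handle authentication. The sample host and port in the usage block are placeholders.

```python
import socket


def redis_endpoint(env):
    """Read and validate the Redis host and port from a mapping of variables."""
    host = env.get("SPRING_REDIS_HOST")
    port = env.get("SPRING_REDIS_PORT")
    if not host or not port:
        raise ValueError("SPRING_REDIS_HOST and SPRING_REDIS_PORT must both be set")
    port_number = int(port)  # raises ValueError if the port is not numeric
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder values; in practice pass os.environ instead.
    sample = {"SPRING_REDIS_HOST": "localhost", "SPRING_REDIS_PORT": "6379"}
    host, port = redis_endpoint(sample)
    print(f"{host}:{port} reachable: {is_reachable(host, port)}")
```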
Keycloak configuration
To implement alternative user authentication:
- Override AUTH_TYPE in your superset.yml configuration file:
  - Set AUTH_TYPE: AUTH_OID
- Provide the reference to your openid-connect realm: OIDC_OPENID_REALM: 'flowx'
With this configuration, the login page changes to a prompt where the user can select the desired OpenID provider.
Extend the security manager
First, make sure that Flask stops using flask-openid and starts using flask-oidc instead.
To do so, create your own security manager that configures flask-oidc as its authentication provider.
To enable OpenID in Superset, you would previously have had to set the authentication type to AUTH_OID.
The security manager still executes all the behavior of the superclass, but overrides the OID attribute with the OpenIDConnect object.
Further, it replaces the default OpenID authentication view with a custom one:
On authentication, the user is redirected back to Superset.
Configure Superset authentication
Finally, add some parameters to the superset.yml file:
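flask-oidc reads the identity provider's endpoints from a client secrets JSON file, referenced through its OIDC_CLIENT_SECRETS setting. The sketch below shows one way to generate such a file for a Keycloak realm. The base URL, client ID, client secret, and Superset URL are placeholders, and only the realm name `flowx` comes from this guide. The endpoint paths follow Keycloak's usual `/realms/<realm>/protocol/openid-connect/...` layout; older Keycloak versions add an `/auth` prefix, which would then belong in `base_url`.

```python
import json


def keycloak_client_secrets(base_url, realm, client_id, client_secret, superset_url):
    """Build a flask-oidc style client secrets document for a Keycloak realm."""
    realm_url = f"{base_url.rstrip('/')}/realms/{realm}"
    oidc_url = f"{realm_url}/protocol/openid-connect"
    return {
        "web": {
            "issuer": realm_url,
            "auth_uri": f"{oidc_url}/auth",
            "token_uri": f"{oidc_url}/token",
            "userinfo_uri": f"{oidc_url}/userinfo",
            "client_id": client_id,
            "client_secret": client_secret,
            # Assumption: Superset is registered in Keycloak with a wildcard redirect.
            "redirect_uris": [f"{superset_url.rstrip('/')}/*"],
        }
    }


if __name__ == "__main__":
    secrets = keycloak_client_secrets(
        base_url="https://keycloak.example.com",      # placeholder
        realm="flowx",
        client_id="superset",                         # placeholder
        client_secret="change-me",                    # placeholder
        superset_url="https://superset.example.com",  # placeholder
    )
    print(json.dumps(secrets, indent=2))
```

The printed JSON can be saved to a file whose path is then supplied to Superset as the OIDC_CLIENT_SECRETS value.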