Improving data management using data partitioning and the archival processes.
Concept
Implementation
process_instance
table can be partitioned by day, week, or month.Benefits
Process
archived_${table_name}_${interval_name}_${reference}
for example: archived_process_instance_monthly_2024_03
.yyyy-MM-dd
if the interval is DAY
, or in the format yyyy-MM
if the interval is MONTH
, or in the format yyyy-weekOfYear
if the interval is WEEK
.Steps
Benefits
Process
Steps
archived__${table_name}__${interval_name}_${reference}
, example: archived__process_instance__weekly_2024_09 .archived__${table_name}__${interval_name}_${reference}
) and data is moved from the primary table here.Benefits
Partition ID Calculation
partition_id
is automatically calculated based on the configured interval (DAY, WEEK, MONTH).
Example for daily partitioning: 2024-03-01 13:00:00
results in partition_id = 124061
. See the Partition ID calculation section.Retention Management
Archiving process
Space management
Environment Variable | Description | Default Value | Possible Values |
---|---|---|---|
FLOWX_DATA_PARTITIONING_ENABLED | Activates data partitioning. | false | true , false |
FLOWX_DATA_PARTITIONING_INTERVAL | Interval for partitioning (the time interval contained in a partition). | MONTH | DAY , WEEK , MONTH |
FLOWX_DATA_PARTITIONING_RETENTION_INTERVALS | Number of intervals retained in the FlowX database (for partitioned tables). | 3 | Integer values (e.g., 1 , 2 , 3 ) |
FLOWX_DATA_PARTITIONING_DETACHED_PARTITION_COMPRESSION | Enables compression for archived (detached) partitions (Oracle only). | OFF | OFF , BASIC , ADVANCED |
FLOWX_DATA_PARTITIONING_MOVED_DATA_BATCH_SIZE | Batch size for moving data (PostgreSQL only). | 5000 | Integer values (e.g., 1000 , 5000 ) |
SCHEDULER_DATAPARTITIONING_ENABLED | Activates the cron job for archiving partitions. | true | true , false |
SCHEDULER_DATAPARTITIONING_CRONEXPRESSION | Cron expression for the data partitioning scheduler. | 0 0 1 * * ? -> every day at 1:00AM | Cron expression (e.g., 0 0 1 * * ? ) |
DATA_PARTITIONING_LOG
: For tracking archived partitions.DATA_PARTITIONING_LOG_ENTRY
: For logging SQL commands executed for archiving.transforms.routeTS.topic.format: "process_instance-${timestamp}"
: This value must start with the index name defined in the process-engine config: flowx.indexing.processInstance.index-name. In this example, the index name is prefixed with “process_instance-” and appended with a timestamp for dynamic index creation. For backward compatibility, the prefix must be “process_instance-”. However, backward compatibility is not strictly required here.
yaml
transforms.routeTS.timestamp.format: "yyyyMMdd"
: This format ensures that timestamps are consistently represented and easily parsed when creating or searching for indices based on the process instance start date. You can adjust this value as needed (e.g., for monthly indexes, use “yyyyMM”). However, note that changing this format will cause existing indexed objects to remain unchanged, and update messages will be treated as new objects, indexed again in new indices. It is crucial to determine your index size and maintain consistency.
partition_id
format follows this structure: <LEVEL || YEAR || BIN_ID_OF_YEAR>
. This ID is calculated based on the start date of the process_instance
, the partition interval, and the partition level.
LEVEL
: This represents the “Partitioning level,” which increments with each change in the partitioning interval (for example, if it changes from DAY
to MONTH
or vice versa).YEAR
: The year extracted from the process_instance
date.BIN_ID_OF_YEAR
: This is the ID of a bucket associated with the YEAR
. It is created for all instances within the selected partitioning interval. The maximum number of buckets is determined by the partitioning frequency:
2024-03-01 13:00:00
with a daily partitioning interval, the partition_id
would be 124061
:
1
: Partitioning Level (LEVEL
)24
: Year - 2024 (YEAR
)061
: Bucket per configuration (61st day of the year)archived__${table_name}__${interval_name}_${reference}
. Examples: