Tiered storage capabilities in Axon Server
Introduction
Tiered storage is a useful feature that allows you to store data across different levels of storage media based on access speed and cost. This helps optimize performance and reduce overall storage costs by placing frequently accessed data on faster storage media, such as SSDs, and less frequently accessed data on slower but more cost-effective storage media, such as HDDs.
In a tiered storage system, data is usually classified according to its importance or frequency of use, and then automatically migrated between storage tiers based on these criteria. This approach helps users to optimize their storage resources, reduce costs, and improve performance by ensuring that data is stored on the most appropriate type of storage media.
In Axon Server, two features let you manage how data is stored: Tiered Storage and Secondary Nodes. They work differently but can complement each other.
Let's first explain these two different options and compare them to each other.
Tiered Storage
Tiered Storage is a highly anticipated feature of the 2023.0.0 release, available with Axon Server Enterprise, that enables each node to maintain a local representation of its own tiered storage over its event store replica. This feature allows you to configure multiple storage tiers based on each node’s role, such as primary, secondary, or backup nodes. With Tiered Storage, you can configure as many tiers as you need, and you can set retention intervals for each tier to determine when the data should be moved from one tier to another. There are several supported local tier types available, including the default, custom storage, and black hole. In the future, we plan to add more tier types, such as one suitable for archiving and cold storage.
Default
The Default tier type is a convenient option that allows you to quickly set up your event store without having to specify a custom location on disk. If you have no specific requirements for the physical storage location of your event data, or you are migrating from an older version of Axon Server where the event store location was set via environment variables, then the Default tier is a suitable initial tier.
Custom storage
The Custom storage tier type enables you to set a custom filesystem location for a specific tier in Axon Server. You can add as many storage locations as needed, such as different hard drives or even mounted network drives.
Configuration is shared by all nodes of the same role. For that reason, every node must provide a path for each named storage location it is expected to use. Once the configuration is set, it is replicated to all nodes. From then on, each node maintains its tiers and runs its segment-moving operations independently.
Storage locations are referenced by name (e.g., slow_disk), and the provided path is resolved at runtime on each node.
It's important that the path on each node points to a unique physical location!
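The split between a shared, name-based configuration and per-node path resolution can be sketched as follows. This is an illustrative Python model only, not Axon Server configuration syntax: the node names, the paths, and the `resolve_storage_path` helper are all hypothetical.

```python
# Hypothetical per-node settings: each node maps a storage name to its own local path.
# The shared tier configuration would only ever mention the name "slow_disk".
NODE_STORAGE_PATHS = {
    "node-1": {"slow_disk": "/mnt/hdd-a/events"},
    "node-2": {"slow_disk": "/mnt/hdd-b/events"},
}


def resolve_storage_path(node, storage_name):
    """Return the local path that the given node configured for a named storage."""
    node_paths = NODE_STORAGE_PATHS.get(node, {})
    if storage_name not in node_paths:
        # A node that takes part in a tier must know where that tier lives locally.
        raise LookupError(f"node {node!r} has no path for storage {storage_name!r}")
    return node_paths[storage_name]
```

In this model, `resolve_storage_path("node-1", "slow_disk")` and `resolve_storage_path("node-2", "slow_disk")` return different paths for the same storage name, which mirrors the idea that the name is shared while the physical location is local to each node.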
Black Hole Tier
The Black Hole tier type is an option in Axon Server that consumes your events and you will never see them again.
Using the Black Hole tier will mark your context as ephemeral, which means that data is permanently removed after a specified retention interval. Once the data is removed, it cannot be recovered, so it's essential to use this feature with caution and only if you're certain that you no longer need the data.
Use Cases for Ephemeral Context
Ephemeral context is particularly useful in scenarios such as event streaming or integration contexts, where events are published to multiple observers in real-time, and after some time, the events are no longer of interest. Another use case is for contexts that produce many events, like notifications, which are no longer useful for the business after a certain period.
However:
- The Black Hole tier should not be used if you have event-sourced aggregates.
- An ephemeral context is not suitable for fine-grained event removal. It removes whole segments, each of which contains many events from different aggregate types, so specific events cannot be targeted.
Conditional Removal
If you use event-sourced aggregates and want to delete events after a period of time while still guaranteeing a valid aggregate state, Axon Server provides an experimental feature called conditional removal, which removes segments only when it is safe to do so. To use this feature, you first need to enable snapshots for your aggregates. Conditional removal instructs the Black Hole tier to remove an event segment only if every event in the segment was previously included in a snapshot. Likewise, it removes a snapshot segment only if a newer snapshot exists for every snapshot in the segment.
Note that this feature is experimental and comes with some caveats. A single segment contains events from many different aggregates, so if just one aggregate in the segment does not have a snapshot, it may prevent the segment from being deleted indefinitely. The feature works best when you have a small number of aggregate instances, ideally one, which is a common case for integration contexts.
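The rule for event segments can be sketched as a simple predicate. This is a minimal Python model, not Axon Server's implementation: the `Event` type and the `segment_is_removable` function are hypothetical, and the sketch assumes that a snapshot with sequence number N covers all events of that aggregate up to and including N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    aggregate_id: str
    sequence_number: int


def segment_is_removable(segment_events, latest_snapshot_seq):
    """Return True only if every event in the segment is covered by a snapshot.

    latest_snapshot_seq maps an aggregate id to the sequence number of its
    newest snapshot; aggregates without a snapshot are simply absent.
    """
    return all(
        event.sequence_number <= latest_snapshot_seq.get(event.aggregate_id, -1)
        for event in segment_events
    )


# One segment holding events of two aggregates.
segment = [Event("order-1", 0), Event("order-1", 1), Event("order-2", 0)]
```

With snapshots `{"order-1": 1}`, the segment is not removable because `order-2` has no snapshot. This is the caveat described above: one aggregate without a snapshot blocks the whole segment. Once `order-2` also has a snapshot at sequence number 0, the predicate returns `True`.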
Retention Intervals
Axon Server supports both time-based and size-based retention intervals for tiered storage.
Time-Based Retention
Time-based retention specifies how long a segment should be in one tier before being moved to another tier. After the segment is closed for writing and the retention time is due, the segment becomes eligible to move to the next tier.
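The two conditions, closed for writing and retention time elapsed, can be expressed as a small check. This is a sketch under assumptions rather than Axon Server code: the function name is hypothetical, and measuring the age from the moment the segment was closed is an assumption of this example.

```python
from datetime import datetime, timedelta


def eligible_for_next_tier(closed_at, now, retention):
    """Return True if a segment may move to the next tier.

    closed_at is None while the segment is still open for writing.
    """
    if closed_at is None:
        return False  # open segments never move
    return now - closed_at >= retention


now = datetime(2023, 1, 8)
one_week = timedelta(days=7)
```

Here, a segment closed on `datetime(2023, 1, 1)` is eligible at `now`, a segment closed on `datetime(2023, 1, 5)` is not yet eligible, and a segment that is still open (`closed_at=None`) is never eligible.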
Size-Based Retention
Size-based retention monitors the size of the whole tier. After the size threshold is breached, the oldest segments in the tier that are outside of the size boundary are moved to the next tier, maintaining the specified size of the tier.
The tier size is calculated from event segment file sizes only and does not include index files, so make sure to leave enough disk space for the indexes and for the currently open segment.
Size-based retention is useful when you want to keep the newest events on fast storage limited by size, while moving everything else to a slower disk.
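The selection of segments to move can be sketched as follows. This is a simplified Python model, not Axon Server's actual algorithm: the function name and the tuple layout are hypothetical, and the input is assumed to contain only closed event segments (no index files), ordered from oldest to newest.

```python
def segments_to_move(segments, max_tier_bytes):
    """Pick the oldest segments to move until the tier fits within its size limit.

    segments is a list of (name, size_bytes) tuples, ordered oldest first.
    """
    total = sum(size for _, size in segments)
    to_move = []
    for name, size in segments:
        if total <= max_tier_bytes:
            break  # the remaining, newer segments fit within the limit
        to_move.append(name)
        total -= size
    return to_move


tier = [("seg-0", 100), ("seg-1", 100), ("seg-2", 100)]
```

With a limit of 250 bytes, only `seg-0` is selected, because removing it brings the tier down to 200 bytes. With a limit of 300 bytes, nothing needs to move.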
Secondary Nodes
Secondary Nodes are a feature in Axon Server that allows you to reduce the number of copies of data that are stored, by keeping only the most recent part of the event store on your primary nodes and keeping the full event store on the secondary nodes. The primary nodes can have faster (more expensive) disks, while the secondary nodes can have slower but more cost-effective disks. This can help reduce storage costs without significantly impacting performance.
When Axon Server processes a transaction to append events, the leader replicates this transaction to all of the nodes in the cluster. This includes the primary nodes, as well as the secondary and backup nodes. While the leader will be satisfied when the majority of the primary nodes have acknowledged the receipt of the transaction, it will also keep track of the progress of the other nodes.
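The acknowledgement rule can be illustrated with a short check. This is a minimal sketch, not Axon Server's replication code: the function name and node names are hypothetical, and the sketch assumes the leader's own acknowledgement is included in the set of acknowledgements.

```python
def is_committed(acks, primary_nodes):
    """Return True once a majority of the primary nodes acknowledged a transaction.

    Acknowledgements from secondary or backup nodes are tracked by the leader
    but do not count towards the majority.
    """
    primary_acks = len(set(acks) & set(primary_nodes))
    return primary_acks > len(primary_nodes) // 2


primaries = {"p1", "p2", "p3"}
```

In this model, acknowledgements from `{"p1", "p2"}` are enough to commit, while `{"p1", "s1", "b1"}` is not, even though three nodes have responded, because only one of them is a primary node.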
Each node holds an exact copy of the data initially. So with a cluster of three nodes, each element of data (typically events) will be stored a total of three times. The main reason for this is to ensure that the failure of a single node will not result in the data becoming unavailable or lost. This is particularly relevant for recent information, which is accessed frequently by various event processors, and when using event sourcing. However, the added value of these extra copies degrades over time, as these entries are accessed less frequently.
Secondary nodes contain a full copy of all the data that the primary nodes also process. While replicating that data, they inform the primary nodes of their progress. Once the data has aged to the configured retention time, it becomes eligible for removal from the last tier in primary nodes, but only if all available secondary nodes have a safe copy of that data. When primary nodes need to access old data, they will retrieve it from the secondary nodes.
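The removal condition on the primary nodes combines two checks, which can be sketched as below. This is an illustrative model only: the function, the use of a token position to express progress, and the node names are hypothetical. Refusing removal when no secondary node is available is a conservative assumption of this sketch, not documented behavior.

```python
def removable_from_primary(segment_last_token, retention_elapsed,
                           secondary_progress, available_secondaries):
    """Return True if a primary node may drop a segment from its last tier.

    secondary_progress maps a secondary node to the highest token it has
    safely stored; segment_last_token is the last token in the segment.
    """
    if not retention_elapsed:
        return False  # data has not aged past the configured retention time
    if not available_secondaries:
        return False  # assumption: never remove without a confirmed safe copy
    return all(
        secondary_progress.get(node, -1) >= segment_last_token
        for node in available_secondaries
    )


progress = {"s1": 100, "s2": 80}
```

For a segment ending at token 90, removal is blocked while both `s1` and `s2` are available, because `s2` has only reached token 80. If only `s1` is available, the segment becomes removable.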
By using secondary nodes, you can leverage concurrent access performance of faster disks while minimizing cost by moving events to slower disks once access requirements are reduced. Additionally, a secondary node could be used to keep access for incidental operational use of older events. This secondary node could use several storage tiers to be able to cope with the large amount of data to store. If needed, after a certain retention period, data can be removed altogether.
Combining Tiered Storage with Secondary Nodes
How are Secondary Nodes Different?
The main difference between Secondary Nodes and Tiered Storage is the cardinality of data. Secondary Nodes allow for a reduction in the number of copies of each data element that is stored; whereas in Tiered Storage, the number of copies always equals the number of nodes in the cluster.
Another significant difference is that Secondary Nodes copy all the data, including the most recent events, even though they still exist on the primary node. Once the data has aged to the configured retention time, it becomes eligible for removal from the primary nodes, but only if all available secondary nodes have a safe copy of that data. In contrast, Tiered Storage involves an actual data copy operation at the moment data transitions from one tier to the next.
How Can They Complement Each Other?
The differences between Secondary Nodes and Tiered Storage allow for interesting data management techniques. High-performance systems require the ability to concurrently ingest data and read events for event sourcing. Additionally, events that are "cooling down" may still occasionally be needed for operational purposes, making the availability of this data essential.
In such scenarios, one could use SSD and HDD on the primary nodes to leverage the concurrent access performance of SSD and minimize costs by moving events to a local HDD once access requirements are reduced. Additionally, a Secondary node could be used to keep access for incidental operational use of older events. This Secondary node could use several storage tiers to cope with large amounts of data to store.
By combining Secondary Nodes and Tiered Storage, users can effectively manage their data and strike a balance between performance and cost.
Conclusion
Tiered Storage is a powerful feature in Axon Server Enterprise that can help you optimize performance and reduce overall storage costs. By storing data across different levels of storage media based on access speed and cost, Tiered Storage can help you minimize the cost of storing infrequently accessed data while also ensuring fast access to frequently accessed data.
There are two features in Axon Server to manage how data is stored: Tiered Storage and Secondary Nodes. Each has its own benefits and limitations, and they can also be used together to implement powerful data management strategies.
While Tiered Storage is a powerful feature, it also has some limitations. It is important to carefully consider your storage requirements and configure the tiers appropriately to ensure optimal performance and data accessibility.