More and more SSD and flash memory storage solutions are coming to the storage market. Some vendors use the fast storage as a cache, others use it as a storage tier. Let’s look at the difference between these two approaches.
Storage Tiering
In Storage Tiering all data is distributed across different storage tiers:
- Tier 1 – SSD
- Tier 2 – FC or SAS HDD
- Tier 3 – NL SAS or SATA HDD
The “hot” data blocks are placed on the fast storage (SSD), and all other data blocks are placed on HDDs. After a learning process identifies which data is “hot”, the data blocks are redistributed between the storage tiers. This process runs periodically on an administrator-defined schedule.
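The learn-then-redistribute cycle described above can be sketched in a few lines of Python. This is only an illustration, not how any particular vendor implements tiering: the `rebalance` function, the use of a plain access count as the "heat" metric, and the two-tier layout are all assumptions made for the example.

```python
from collections import Counter


def rebalance(access_counts, ssd_capacity):
    """Return (ssd_blocks, hdd_blocks) after one scheduled redistribution.

    access_counts: Counter mapping block id -> number of accesses seen
                   during the learning window (a hypothetical heat metric).
    ssd_capacity:  how many blocks fit on the SSD tier.
    """
    # Rank blocks from most to least accessed; the hottest go to the SSD tier.
    ranked = [block for block, _ in access_counts.most_common()]
    ssd_blocks = set(ranked[:ssd_capacity])
    hdd_blocks = set(ranked[ssd_capacity:])
    return ssd_blocks, hdd_blocks


# Learning window: record which blocks the hosts touched.
window = Counter()
for block in ["a", "b", "a", "c", "a", "b", "d"]:
    window[block] += 1

# Scheduled redistribution: each block ends up on exactly one tier.
ssd, hdd = rebalance(window, ssd_capacity=2)
print(sorted(ssd))  # ['a', 'b']
print(sorted(hdd))  # ['c', 'd']
```

Note that each block lives on exactly one tier after the move. Nothing changes between runs of `rebalance`, which is why a tiered system reacts to a new workload pattern only at the next scheduled redistribution.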
Examples: HP 3PAR Adaptive Optimization, IBM Easy Tier, NetApp Flash Pool, EMC FAST
Storage Cache
What is a storage cache? A cache is an intermediate fast device that transparently stores data so that future requests for that data can be served faster. Data written to the cache must also be written to the main storage (HDDs), because the caching device keeps data only temporarily.
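As a rough illustration of this definition, the sketch below puts a small read cache in front of a slow backing store. The `ReadCache` class is hypothetical, a Python dict stands in for the HDDs, and the least-recently-used (LRU) eviction policy is an assumption chosen for simplicity; real storage systems may use different policies.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache in front of a slow backing store (a dict here)."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # stands in for the HDDs
        self.capacity = capacity            # number of blocks the cache holds
        self.blocks = OrderedDict()         # stands in for the fast device
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.backing_store[block_id]    # slow path: go to main storage
        self.blocks[block_id] = data           # keep a copy for future requests
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used copy
        return data


hdd = {1: "alpha", 2: "beta", 3: "gamma"}
cache = ReadCache(hdd, capacity=2)
for block_id in [1, 2, 1, 3, 1]:
    cache.read(block_id)
print(cache.hits, cache.misses)  # 2 3
```

Evicting a block from the cache loses nothing, because the cache only ever holds a copy of data that still exists on the backing store.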
There are two types of cache:
Read-Only – the cache is not used for writes, or is used only in Write-Through mode, where data must be written to the hard drives before the storage system sends the write confirmation to the host. This type of caching is often used with MLC SSDs, which provide excellent read performance but are less suited to write workloads.
Read-Only caching can also be used on the host side.
Examples: NetApp Flex Cache, Nimble Storage.
Read-Write – the cache is used in Write-Back mode, and the storage system sends the write confirmation to the host before the data is written to the main storage (HDDs). Data can be modified within the cache without going to the main storage. This type of cache handles write-intensive workloads, so SLC SSDs or flash memory should be used as the caching device. Because a Read-Write cache can contain data that has not yet been written to the HDDs, it requires RAID or other protection.
Typically, memory located in the storage controller is used as the Read-Write Level 1 cache. It is protected by mirroring between the storage controllers and by battery backup.
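The difference between the two write modes comes down to what has reached the HDDs at the moment the host receives its confirmation. The toy model below is a sketch under simplifying assumptions: the `WriteCache` class, its `flush` method, and the `dirty` set are hypothetical names, and dicts stand in for the fast device and the main storage.

```python
class WriteCache:
    """Toy cache showing what is on the HDDs when a write is acknowledged."""

    def __init__(self, mode):
        if mode not in ("write-through", "write-back"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode
        self.cache = {}     # fast device
        self.hdd = {}       # main storage
        self.dirty = set()  # cached blocks not yet written to the HDDs

    def write(self, block_id, data):
        self.cache[block_id] = data
        if self.mode == "write-through":
            self.hdd[block_id] = data  # data must reach the HDDs first
        else:
            self.dirty.add(block_id)   # HDD write is deferred
        return "ack"                   # confirmation sent to the host

    def flush(self):
        """Write deferred (dirty) blocks to main storage."""
        for block_id in sorted(self.dirty):
            self.hdd[block_id] = self.cache[block_id]
        self.dirty.clear()


wt = WriteCache("write-through")
wt.write(7, "v1")
print(wt.hdd)            # {7: 'v1'}

wb = WriteCache("write-back")
wb.write(7, "v1")
wb.write(7, "v2")        # modified in the cache without touching the HDDs
print(wb.hdd, wb.dirty)  # {} {7}
wb.flush()
print(wb.hdd, wb.dirty)  # {7: 'v2'} set()
```

Between the acknowledgment and the flush, the only copy of block 7 in Write-Back mode is in the cache. Losing the cache at that point would lose acknowledged data, which is why this kind of cache needs mirroring, battery backup, or RAID protection.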
What’s the difference between caching and tiering approaches?
First of all, the cache contains a copy of data that is already placed on the HDDs. With storage tiering, by contrast, the data itself is distributed between the different storage tiers (SSDs and HDDs). Redistributing data between the storage tiers is a resource-demanding process, which is why data should remain on its assigned tier for quite a long time (usually hours or even days).
What’s better?
Storage Tiering is good for relatively sustained workloads, where the workload pattern does not change too often (remember that redistributing data blocks is a resource-demanding process). It works well for both read and write workloads.
Storage Caching is good for unpredictable, fast-changing workloads. The cache responds much faster to a change in the workload pattern. Read-Only caching solutions are becoming more and more popular because they make effective use of relatively inexpensive MLC SSDs.
The main idea is to understand how the proposed solution works and how it meets your requirements.
Also be careful: some solutions marketed as "caching" actually implement the tiered storage approach, and vice versa.