Quality of Service is not a feature. It is an architecture.
Adding QoS features to an existing storage platform may relieve a single bottleneck under specific conditions, but this approach fails to solve the exponentially more significant challenges that occur at cloud scale. A true solution requires a purpose-built storage architecture that solves performance problems comprehensively, not individually.
Rather than bolting on individual features in pursuit of predictable performance, multi-tenant infrastructures demand a storage system capable of delivering total Quality of Service (QoS). SolidFire has established that benchmark with an architecture built from the ground up for exactly this purpose. Only this approach allows cloud providers to guarantee performance regardless of tenant activity or failure conditions, and it rests on six core architectural elements:
All-SSD Architecture: delivers consistent latency for every IO
True Scale-Out Architecture: linear, predictable performance gains as the system scales
RAID-less Data Protection: predictable performance in any failure condition
Balanced Load Distribution: eliminates hot spots that create unpredictable IO latency
Fine-Grain QoS Control: eliminates noisy neighbors and guarantees volume performance
Performance Virtualization: controls performance independent of capacity, on demand
Requirement #1: All-SSD Architecture
Anyone deploying either a large public or private cloud infrastructure is faced with the same issue: how to deal with inconsistent and unpredictable application performance. Overcoming this problem requires an architecture built from the ground up to guarantee QoS for many simultaneous applications. The first requirement for achieving this level of performance is moving from spinning media to an all-SSD architecture. Only an all-SSD architecture allows you to deliver consistent latency for every IO.
At first, this idea might seem like overkill. If you don’t actually need the performance of SSD storage, why can’t you guarantee performance using spinning disk? Or even a hybrid disk and SSD approach?
Fundamentally, it comes down to simple physics. A spinning disk can serve only a single IO at a time, and any seek between IOs adds significant latency. In cloud environments where multiple applications or virtual machines share disks, the unpredictable queue of IO behind the single disk head can easily cause latency to vary by an order of magnitude or more, from 5 ms with no contention to 50 ms or more on a busy disk.
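To make the physics concrete, consider a rough back-of-the-envelope model (a simplified sketch with assumed figures, not measurements of any particular drive): if each random IO costs roughly 5 ms of seek and rotation, the latency a request experiences grows with the number of requests queued ahead of it.

    # Rough model of random-IO latency on a single spinning disk (illustrative only).
    SERVICE_TIME_MS = 5.0  # assumed seek + rotational latency per random IO

    def queued_latency_ms(queue_depth):
        """Latency seen by the last request when queue_depth IOs are ahead of it."""
        return queue_depth * SERVICE_TIME_MS

    for depth in (1, 4, 10):
        print(f"queue depth {depth:2d}: ~{queued_latency_ms(depth):.0f} ms")
    # queue depth  1: ~5 ms (no contention)
    # queue depth 10: ~50 ms (a busy disk shared by many tenants)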
The solutions are part of the problem
Modern storage systems attempt to overcome this fundamental physical bottleneck in a number of ways, including caching (in DRAM and flash), tiering, and wide striping.
Caching is the easiest way to reduce contention for a spinning disk. The hottest data is kept in large DRAM or flash-based caches, which can offload a significant amount of IO from the disks. Indeed, this is why large DRAM caches are standard on every modern disk-based storage system. But while caching can certainly increase the overall throughput of a spinning-disk system, it produces highly variable latency. Data in a DRAM or flash cache can be served in under 1 ms, while cache misses served from disk take 10-100 ms: a swing of up to three orders of magnitude for an individual IO. Clearly, the overall performance of an individual application is going to be strongly influenced by how cache-friendly it is, how large the cache is, and how many other applications are sharing it. In a dynamic cloud environment, that last criterion changes constantly. All told, it’s impossible to predict, much less guarantee, the performance of any individual application in a system that depends on caching.
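A simple expected-latency calculation (a sketch using assumed round numbers, not vendor data) shows how completely the result hinges on cache friendliness, which no tenant controls in a shared system.

    # Expected latency for a cached spinning-disk system (illustrative numbers).
    CACHE_HIT_MS = 0.5    # IO served from DRAM/flash cache
    CACHE_MISS_MS = 20.0  # IO served from spinning disk

    def expected_latency_ms(hit_rate):
        """Average latency for a workload with the given cache hit rate."""
        return hit_rate * CACHE_HIT_MS + (1.0 - hit_rate) * CACHE_MISS_MS

    for hit_rate in (0.95, 0.80, 0.50):
        print(f"hit rate {hit_rate:.0%}: ~{expected_latency_ms(hit_rate):.1f} ms average")
    # 95% hits: ~1.5 ms; 80% hits: ~4.4 ms; 50% hits: ~10.3 ms,
    # while individual IOs still swing between ~0.5 ms and ~20 ms.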
Tiering is another approach to overcoming the physical limits of spinning disk, but it suffers from many of the same problems as caching. Tiered systems move “hot” and “cold” data between different classes of storage in an attempt to give popular applications more performance, yet which tier an application’s data lands on, and when it moves, is just as unpredictable as a cache hit or miss.
Wide striping data for a volume across many spinning disks doesn’t solve the problem either. While this approach can help balance IO load across the system, many more applications are now sharing each individual disk. A backlog at any disk can cause a performance issue, and a single noisy neighbor can ruin the party for everyone.
All-SSD is truly the only way to go
All-SSD architectures have significant advantages when it comes to being able to guarantee QoS. The lack of a moving head means latency is consistent no matter how many applications demand IOs, regardless of whether the IOs are sequential or random. Compared to the single-IO bottleneck of disk, SSDs have eight to 16 channels to serve IOs in parallel, and each IO is completed quickly. So even at a high queue depth, the variance in latency for an individual IO is low. All-SSD architectures often do away with DRAM caching altogether. Modern host operating systems and databases do extensive DRAM caching already, and the low latency of flash means that hitting the SSD is often nearly as fast as serving from a storage-system DRAM cache anyway. The net result in a well-designed system is consistent latency for every IO, a strong requirement for delivering guaranteed performance.
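Extending the earlier disk model to an SSD (again a sketch with assumed round numbers rather than device specifications) illustrates why a deep queue hurts far less when IOs are served in parallel and each one completes quickly.

    # Rough comparison of queued latency: single-head disk vs. multi-channel SSD
    # (illustrative numbers only).
    import math

    DISK_SERVICE_MS = 5.0  # assumed per random IO, served one at a time
    SSD_SERVICE_MS = 0.2   # assumed per IO on flash
    SSD_CHANNELS = 8       # assumed IOs served in parallel

    def disk_latency_ms(queue_depth):
        return queue_depth * DISK_SERVICE_MS

    def ssd_latency_ms(queue_depth):
        # Queued IOs are spread across parallel channels (simplified model).
        return math.ceil(queue_depth / SSD_CHANNELS) * SSD_SERVICE_MS

    for depth in (1, 16, 64):
        print(f"depth {depth:3d}: disk ~{disk_latency_ms(depth):.0f} ms, "
              f"SSD ~{ssd_latency_ms(depth):.1f} ms")
    # Even at a queue depth of 64 the SSD stays under 2 ms, while the disk is at 320 ms.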
An all-SSD architecture is just the starting point for guaranteed QoS, however. Even a fast flash storage system can have noisy neighbors, degraded performance from failures, or unbalanced performance.
Requirement #2: True Scale-Out Architecture
Traditional storage architectures follow a scale-up model, in which a controller (or pair of controllers) is attached to a set of disk shelves. More capacity can be added by simply adding shelves, but controller resources can only be upgraded by moving to the next “larger” controller (often with a data migration). Once you’ve maxed out the biggest controller, the only option is to deploy more storage systems, increasing the management burden and operational costs.
Tipping the scales not in your favor
This scale-up model poses significant challenges to guaranteeing consistent performance to individual applications. As more disk shelves and applications are added to the system, contention for controller resources increases, causing decreased performance as the system scales. While adding disk spindles is typically seen as increasing system performance, many storage architectures only put new volumes on the added disks, or require manual migration. Mixing disks with varying capacities and performance characteristics (such as SATA and SSD) makes it even more difficult to predict how much performance will be gained, particularly when the controller itself can quickly become the bottleneck.
Scaling out is the only way to go
By comparison, a true scale-out architecture such as SolidFire’s adds controller resources and storage capacity together. Each time capacity is increased and more applications are added, a consistent amount of performance is added as well. The SolidFire architecture ensures that the added performance is available to any volume in the system, not just new data. This predictability is critical both for the administrator’s planning and for the storage system itself: if the system can’t predict how much performance it has now or will have in the future, it can’t possibly offer any kind of guaranteed QoS.
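The planning math under a true scale-out model is deliberately simple. As an illustration (the per-node figures below are placeholders, not SolidFire specifications), performance and capacity grow in lockstep with node count, so future performance is as predictable as future capacity.

    # Linear scale-out planning model (placeholder per-node numbers, for illustration).
    IOPS_PER_NODE = 50_000      # assumed per-node performance
    CAPACITY_TB_PER_NODE = 10   # assumed per-node usable capacity

    def cluster_totals(node_count):
        """Aggregate performance and capacity for a cluster of node_count nodes."""
        return node_count * IOPS_PER_NODE, node_count * CAPACITY_TB_PER_NODE

    for nodes in (4, 8, 16):
        iops, tb = cluster_totals(nodes)
        print(f"{nodes:2d} nodes: ~{iops:,} IOPS, {tb} TB")
    # Each added node contributes the same increment, unlike a scale-up system
    # whose single controller eventually becomes the bottleneck.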
Requirement #3: RAID-less Data Protection
The invention of RAID more than 30 years ago was a major advance in data protection, allowing “inexpensive” disks to store redundant copies of data and rebuild onto a new disk when a failure occurred. RAID has advanced over the years, with multiple approaches and parity schemes trying to maintain relevance as disk capacities have increased dramatically, and some form of RAID is used on virtually all enterprise storage systems today. However, the problems with traditional RAID can no longer be glossed over, particularly when you want a storage architecture that can guarantee performance even when failures occur.
The problem with RAID
When it comes to QoS, RAID causes a significant performance penalty when a disk fails, often 50% or more. This penalty occurs because a failure causes a 2-5X increase in IO load to the remaining disks. In a simple RAID10 setup, a mirrored disk now has to serve double the IO load, plus the additional load of a full disk read to rebuild into a spare. The impact is even greater for parity-based schemes like RAID5 and RAID6, where a read that would have hit a single disk now has to hit every disk in the RAID set to rebuild the original data – in addition to the load from reading every disk to rebuild into a spare.
The performance impact of RAID rebuilds is compounded by the long rebuild times incurred by multi-terabyte drives. Because traditional RAID rebuilds entirely onto a new spare drive, the rebuild is bottlenecked by the write speed of that single drive, combined with the read speed of the few other drives in the RAID set. Rebuild times of 24 hours or more are now common, and the performance impact is felt the entire time. How can you possibly meet a performance SLA when a single disk failure can lead to hours or days of degraded performance?
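The arithmetic behind those rebuild times is simple. As a rough sketch (assumed drive size and sustained write rate, not measured values), rebuilding an entire multi-terabyte drive through a single spare is gated by that one drive’s write throughput:

    # Why a traditional RAID rebuild takes so long (illustrative numbers).
    DRIVE_CAPACITY_GB = 4000   # assumed 4 TB drive
    SPARE_WRITE_MBPS = 50      # assumed sustained rebuild write rate onto one spare

    rebuild_seconds = (DRIVE_CAPACITY_GB * 1000) / SPARE_WRITE_MBPS
    print(f"Single-spare rebuild: ~{rebuild_seconds / 3600:.0f} hours")
    # Roughly 22 hours, and the degraded-performance penalty lasts the whole time.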
In a cloud environment, telling the customer “the RAID array is rebuilding from a failure” is of little comfort. The only option available for service providers is to dramatically under-provision the performance of the system and hope that the impact of RAID rebuilds goes unnoticed.
Introducing SolidFire Helix™ data protection
SolidFire’s Helix data protection is a post-RAID distributed replication algorithm. It spreads redundant copies of each drive’s data throughout all the other drives in the cluster rather than confining them to a limited RAID set. Data is distributed in such a way that when a drive fails, the IO load it was serving spreads out evenly among every remaining drive in the system, with each drive handling only a few percent more IO – not double or triple its previous load, as with RAID. Furthermore, data is rebuilt in parallel into the free space on all remaining drives rather than onto a dedicated spare. Each drive in the system simply needs to share 1-2% of its data with its peers, allowing rebuilds to finish in seconds or minutes rather than hours or days. The combination of even load redistribution and rapid rebuilds allows SolidFire to continue to guarantee performance even when failures occur, something that just isn’t possible with traditional RAID.
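Compare that with a distributed rebuild. In the sketch below (the cluster size and per-drive rates are assumptions for illustration, not a description of a specific SolidFire configuration), every surviving drive re-replicates only a small slice of the failed drive’s data, all in parallel:

    # Distributed rebuild: every surviving drive restores a small slice in parallel
    # (illustrative numbers, not a SolidFire specification).
    DRIVE_CAPACITY_GB = 4000    # assumed capacity of the failed drive
    SURVIVING_DRIVES = 99       # e.g. a 100-drive cluster that loses one drive
    PER_DRIVE_WRITE_MBPS = 50   # assumed per-drive rebuild rate

    slice_gb = DRIVE_CAPACITY_GB / SURVIVING_DRIVES            # ~40 GB per drive
    rebuild_seconds = (slice_gb * 1000) / PER_DRIVE_WRITE_MBPS
    print(f"Distributed rebuild: ~{rebuild_seconds / 60:.0f} minutes")
    # About 13 minutes instead of ~22 hours, with each drive absorbing only ~1% extra load.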
Requirement #4: Balanced Load Distribution
Most block storage architectures use very basic algorithms to lay out provisioned space. Data is striped across a set of disks in a RAID set, or possibly across multiple RAID sets in a storage pool. For systems that support thin provisioning, the placement may be done via smaller chunks or extents rather than the entire volume at once. Typically, however, at least several hundred megabytes of data will be striped together.
Once data is placed on a disk, it is seldom moved (except in tiering systems, which may move it to another tier). Even when a drive fails, all of its data is simply restored onto a spare. When new drive shelves are added, they are typically used only for new data – not to rebalance the load from existing volumes. Wide striping is one attempt to deal with this imbalance, simply spreading a single volume across many disks. But when combined with spinning disk, wide striping just increases the number of applications that are affected when a hotspot or failure does occur.
Unbalanced loads cause unbalanced performance
The result of this static data placement is uneven load distribution across storage pools, RAID sets, and individual disks. When the storage pools contain drives of different capacities or types (e.g. SATA, SAS, or SSD), the difference can be even more acute: some drives and RAID sets get maxed out while others sit relatively idle. Managing data placement to balance IO load as well as capacity is left to the storage administrator, often working from Microsoft Excel spreadsheets to figure out the best location for any particular volume. Not only does this manual management model fail to scale to cloud environments, it simply isn’t viable when storage administrators have little or no visibility into the underlying application, or when application owners cannot see the underlying infrastructure. The unbalanced distribution of load also makes it impossible for the storage system itself to make any guarantees about performance. If the system can’t even balance the IO load it has, how can it guarantee QoS to an individual application as that load changes over time?
SolidFire restores the balance
SolidFire’s unique approach to data placement distributes individual 4K blocks of data throughout the storage cluster to evenly balance both capacity and performance. Data is distributed based on content rather than location, which avoids hotspots caused by problematic application behavior such as heavy access to a small range of LBAs.
Furthermore, as capacity is added to (or removed from) the system, data is automatically redistributed in the background across all of the storage. Rather than ending up with a system that has traffic jams in the older neighborhoods while the suburbs sit mostly empty, SolidFire keeps the cluster in balance as it scales. This even distribution of data and IO load allows SolidFire to deliver predictable performance regardless of the IO behavior of any individual application. As load on the system increases, it does so predictably and consistently, and as new capacity and performance are added, the system gains a predictable amount of additional performance. The distribution stays balanced over time, an essential part of delivering consistent performance day after day. You just can’t guarantee QoS without it.
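A minimal sketch of content-based placement appears below (hypothetical names, reduced to a plain hash for illustration; it omits replication, metadata, and rebalancing, and is not SolidFire’s actual algorithm). Because the placement key is derived from a block’s content rather than its logical address, even a workload hammering a narrow LBA range spreads across the cluster.

    # Simplified sketch of content-based block placement (illustrative, not
    # SolidFire's actual algorithm).
    import hashlib

    DRIVES = [f"drive-{i}" for i in range(20)]   # hypothetical 20-drive cluster
    BLOCK_SIZE = 4096                            # 4K blocks

    def place_block(block: bytes) -> str:
        """Choose a drive based on the block's content hash, not its LBA."""
        digest = hashlib.sha256(block).digest()
        return DRIVES[int.from_bytes(digest[:8], "big") % len(DRIVES)]

    # Blocks written to adjacent LBAs still land on different drives,
    # because placement depends on what the data is, not where it sits.
    hot_lba_range = [bytes([i]) * BLOCK_SIZE for i in range(8)]
    print([place_block(block) for block in hot_lba_range])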
Requirement #5: Fine-Grain QoS Control
Another key requirement for guaranteeing Quality of Service is a fine-grain QoS model that describes performance in all situations. Contrast that with today’s rudimentary approaches to QoS, such as rate limiting and prioritization: these features provide only a limited amount of control and cannot specify what performance an application will actually receive in every situation.
The trouble with having no control
For example, basic rate limiting, which sets a cap on the IOPS or bandwidth an application consumes, doesn’t take into account the fact that most storage workloads are prone to performance bursts. Database checkpoints, table scans, page cache flushes, file copies, and other operations tend to occur suddenly, requiring a sharp increase in the amount of performance needed from the system. Setting a hard cap simply means that when an application actually does need to do IO, it is quickly throttled. Latency then spikes and the storage seems painfully slow, even though the application isn’t doing that much IO overall.
Prioritization assigns a label to each workload, yet it suffers similarly with bursty applications. While high-priority workloads may be able to burst easily by stealing resources from lower-priority ones, moderate- or low-priority workloads may not be able to burst at all. Worse, these lower-priority workloads are constantly being impacted by the bursting of high-priority workloads.
Failure and over-provisioned situations also present challenges for coarse-grained QoS. Rate limiting doesn’t provide any guarantees if the system can’t even deliver at the configured limit when it is overtaxed or suffering from performance-impacting failures. While prioritization can minimize the impact of failures for some applications, it still can’t tell you ahead of time how much impact there will be, and the applications in the lower tiers will likely see absolutely horrendous performance.
SolidFire enables the control you’ve been looking for
SolidFire’s QoS controls are built around a robust model for configuring QoS for an individual volume. The model takes into account bursty workloads, changing performance requirements, different IO patterns, and the possibility of over-provisioning. Whether an application is allocated a lot of performance or a little, the amount of performance it gets in any situation is never in doubt. Cloud operators finally have the confidence to guarantee QoS and write firm SLAs against performance. Only an architecture built with a fine-grain Quality of Service model can support these types of guarantees.
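To make the idea concrete, the sketch below shows what a fine-grain, per-volume QoS specification can look like: a guaranteed floor, a sustained ceiling, and a short-lived burst allowance funded by credits earned while the volume runs below its ceiling. The field names and logic are illustrative assumptions, not the SolidFire API.

    # Illustrative per-volume QoS settings (field names and logic are assumptions,
    # not the SolidFire API).
    from dataclasses import dataclass

    @dataclass
    class VolumeQoS:
        min_iops: int    # floor the volume receives even when the system is contended
        max_iops: int    # sustained ceiling during normal operation
        burst_iops: int  # short-term ceiling, paid for with credits earned while idle

        def allowed_iops(self, burst_credits: int) -> int:
            """Ceiling for the next interval, given the credits accumulated so far."""
            return self.burst_iops if burst_credits > 0 else self.max_iops

    db_volume = VolumeQoS(min_iops=1_000, max_iops=5_000, burst_iops=10_000)
    print(db_volume.allowed_iops(burst_credits=2_500))  # 10000 during a checkpoint burst
    print(db_volume.allowed_iops(burst_credits=0))      # 5000 once the credits are spent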
Requirement #6: Performance Virtualization
All modern storage systems virtualize the underlying raw capacity of their disks, creating an opaque pool of space from which individual volumes are carved. However, the performance of those individual volumes is a second-order effect, determined by a number of variables: the number of disks the volume is spread across, the speed of those disks, the RAID level used, how many other applications share the same disks, and the controller resources available to service IO.
Traditional capacity virtualization does not suffice
Historically, this approach has prevented storage systems from delivering any specific level of performance. “More” or “less” performance could be obtained by placing a volume on faster or slower disks or by relocating adjacent applications that were causing impact. However, this is a manual and error-prone process, and in a cloud environment, where both the scale and the dynamic nature of the workloads prevent manual management of individual volumes, it just isn’t possible. Worst of all, significant raw capacity is often wasted as sets of disks max out on performance well before all of their capacity is used.
Finally, performance can be managed independent of capacity
SolidFire’s performance virtualization removes all this complexity, creating separate pools of capacity and performance from which individual volumes can be provisioned. Performance becomes a first-class citizen, and management is as simple as specifying the performance requirements for an application rather than manually placing data and trying to adjust later.
Furthermore, SolidFire performance virtualization allows performance for an individual volume to be changed over time – simply increased or decreased as application workloads change or as requirements become more clear. SolidFire’s ability to dynamically adjust performance gives service providers the complete flexibility to deliver customers the exact performance they need, precisely when they need it.
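In practice, this means performance is provisioned and re-provisioned as just another attribute of the volume, independent of its size. The sketch below uses hypothetical helper names rather than the SolidFire API.

    # Illustrative sketch: capacity and performance managed independently
    # (hypothetical helpers, not the SolidFire API).
    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Volume:
        name: str
        size_gb: int
        min_iops: int
        max_iops: int

    def resize_performance(vol: Volume, min_iops: int, max_iops: int) -> Volume:
        """Change a volume's performance allocation without touching its capacity."""
        return replace(vol, min_iops=min_iops, max_iops=max_iops)

    vol = Volume("tenant-42-db", size_gb=500, min_iops=500, max_iops=2_000)
    # Month-end close needs more headroom: raise performance, capacity unchanged.
    vol = resize_performance(vol, min_iops=2_000, max_iops=8_000)
    print(vol)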
Separating performance from capacity has the added benefit of providing a consistent way to view the current load on the system, both in terms of the capacity and performance that is actually used. Ensuring that the system doesn’t become unexpectedly overloaded is now as simple as reading a gas gauge rather than reading tea leaves. SolidFire’s ability to separate performance from capacity in our architecture is the last essential part of guaranteeing QoS. Without it, you’re left with a manual process full of guessing games that result in poor overall efficiency.
Conclusion
As storage QoS becomes a must-have component of storage infrastructure, the difference between QoS features and a purpose-built architecture will become more evident. Only SolidFire’s all-SSD storage system has been built specifically to overcome every challenge associated with unpredictable performance in the cloud, delivering the only solution for guaranteeing true performance QoS in multi-tenant settings. Now service providers can write firm SLAs around performance for any application in the cloud and confidently run business-critical apps, leading to greater profitability.