Amazon Elastic Block Store (EBS) performance is a mystery to many, especially those unfamiliar with all the factors that affect it. Organizations often spin up Amazon EBS Provisioned IOPS SSD (io1) volumes to get guaranteed performance but don’t see the performance they expect.
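To assess EBS performance, review the factors that most often cause confusion: how IOPS are counted against block size, IO consolidation by the host and by the EBS backend, network bandwidth between the instance and storage, and per-volume and per-instance throughput limits.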
Provisioned IOPS (io1) storage measures IO in blocks of 256K or smaller. Send down larger blocks and AWS splits them into 256K blocks for IOPS accounting: a 512K block counts as two IOs.
Provision an EBS volume with 100 IOPS and AWS offers a block device capable of handling 100 blocks per second, each up to 256K. If your application sends down 512K blocks, it will see only 50 IOPS.
If you expect to see a specific IOPS performance from an application point of view, know the block size used by the application and provision EBS volumes appropriately. In the above example, if your application uses 512K blocks and you want 1000 IOPS from EBS, provision 2000 IOPS.
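The block-size arithmetic above can be sketched in a few lines of Python. This is a minimal illustration that assumes the 256K accounting unit described in the text; the function names are hypothetical and not part of any AWS API.

```python
import math

# Assumption taken from the text: io1 counts IO in units of up to 256K.
ACCOUNTING_BLOCK_KIB = 256


def accounted_ios(block_kib: int) -> int:
    """IOs charged against the provisioned rate for one request of block_kib."""
    return math.ceil(block_kib / ACCOUNTING_BLOCK_KIB)


def application_iops(provisioned_iops: int, block_kib: int) -> float:
    """IOPS the application sees when every request is block_kib in size."""
    return provisioned_iops / accounted_ios(block_kib)


def iops_to_provision(target_app_iops: int, block_kib: int) -> int:
    """Provisioned IOPS needed to deliver target_app_iops at block_kib."""
    return target_app_iops * accounted_ios(block_kib)


print(application_iops(100, 512))    # 50.0
print(iops_to_provision(1000, 512))  # 2000
```

Requests of 256K or smaller count as a single IO in this model, so `iops_to_provision(1000, 64)` is simply 1000.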
The file system and block device layers can consolidate IO requests (e.g., the Linux block device layer may merge sequential 16K blocks into 128K requests). This improves performance because storage devices tend to perform better with larger block sizes, up to a point. Depending on where IOPS are measured, the numbers can differ, which causes confusion.
This is one of the least-known factors in EBS performance. The EBS backend attempts to consolidate IO requests when possible (for the same reasons hosts consolidate them). However, CloudWatch metrics report IO at the rate requests are received by EBS, while the backend measures IOPS against provisioned capacity after consolidation. As a result, CloudWatch can show read/write IOPS that exceed the provisioned figure. If you provision an io1 volume with 200 IOPS, you can send IO at, say, 800 IOPS if you send sequential 64K blocks, since four of them merge into one 256K IO.
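As a rough sketch of the consolidation effect, the following Python function estimates the highest request rate a volume could accept for small sequential IO. It assumes perfect merging into 256K units, which is a best case and not a documented guarantee; the function name is hypothetical.

```python
# Assumption taken from the text: consolidated IO is accounted in 256K units.
ACCOUNTING_BLOCK_KIB = 256


def max_sequential_request_rate(provisioned_iops: int, block_kib: int) -> int:
    """Best-case requests/s for sequential IO, assuming perfect merging.

    Only meaningful for blocks of 256K or smaller; larger blocks are split
    rather than merged, so they are rejected here.
    """
    if block_kib <= 0 or block_kib > ACCOUNTING_BLOCK_KIB:
        raise ValueError("block_kib must be between 1 and 256")
    merge_factor = ACCOUNTING_BLOCK_KIB // block_kib
    return provisioned_iops * merge_factor


print(max_sequential_request_rate(200, 64))  # 800
```

Random IO cannot be merged this way, so for random workloads the request rate stays at the provisioned figure.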
EC2 instances access EBS volumes over network connections. EBS-optimized instances use a dedicated network connection to storage, while non-EBS-optimized instances use a shared network. EBS-optimized instances offer dedicated throughput options from 500 Mbps to 4000 Mbps, with a per-instance maximum of 32,000 IOPS.
With non-EBS-optimized instances, the network is shared by all traffic, storage and non-storage alike. The instance type determines the available network bandwidth. AWS is somewhat vague about the network bandwidth of instance types and categorizes them as ‘low’, ‘moderate’, ‘high’ and ‘10Gb’ (see the ‘Instance Types Matrix’ section of the AWS instance type documentation). Shared bandwidth is an issue for applications that generate significant network traffic (e.g., web or clustered applications).
Bottom line: your application performance is limited by the network bandwidth available to your instances. Understand the network bandwidth capabilities of instances and select appropriate instances that match your application performance needs.
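To get a feel for how link bandwidth bounds IOPS, the sketch below converts a dedicated EBS link speed into a best-case IOPS figure for a given block size. It assumes decimal megabits per second, ignores protocol overhead, and applies the 32,000 per-instance IOPS maximum quoted above; the function name is hypothetical.

```python
# Assumption taken from the text: per-instance maximum of 32,000 IOPS.
INSTANCE_IOPS_LIMIT = 32_000


def network_bound_iops(link_mbps: int, block_kib: int) -> int:
    """Best-case IOPS a link of link_mbps can carry at block_kib per IO.

    Treats Mbps as 10**6 bits per second and ignores protocol overhead,
    so real-world numbers will be lower.
    """
    bytes_per_second = link_mbps * 1_000_000 / 8
    iops = int(bytes_per_second / (block_kib * 1024))
    return min(iops, INSTANCE_IOPS_LIMIT)


print(network_bound_iops(500, 16))   # 3814
print(network_bound_iops(4000, 8))   # 32000 (capped by the instance limit)
```

Under these assumptions, a 500 Mbps link cannot carry even 4,000 IOPS at 16K blocks, regardless of how many IOPS the volume has provisioned.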
As mentioned above, there are per-volume and per-instance limits on throughput. A single io1 EBS volume tops out at 500 MiB/s. If you provision a 20,000 IOPS volume and your application issues 32KB IOs, you will be disappointed to see a maximum of roughly 15,625 IOPS (500,000 KB/s ÷ 32 KB; about 16,000 if the limit is read in binary units). Adding more EBS volumes helps only up to a point, because AWS also limits throughput per instance: a single instance can do a maximum of 1750 MiB/s.
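The throughput ceilings can be combined into a small calculator. This sketch uses the limits quoted in the text (500 MiB/s per volume, 1750 MiB/s per instance) and binary units throughout, so it returns 16,000 for the 32K example rather than the decimal-unit figure of 15,625. The constants and function names are illustrative, not AWS API values.

```python
KIB = 1024
MIB = 1024 * 1024

# Limits as quoted in the text; they may differ for other volume or instance types.
VOLUME_LIMIT_MIB_S = 500
INSTANCE_LIMIT_MIB_S = 1750


def volume_iops_ceiling(provisioned_iops: int, block_kib: int) -> int:
    """IOPS a single volume can reach once the throughput cap is applied."""
    throughput_bound = (VOLUME_LIMIT_MIB_S * MIB) // (block_kib * KIB)
    return min(provisioned_iops, throughput_bound)


def instance_throughput_mib_s(volume_count: int) -> int:
    """Aggregate throughput across volumes, capped by the instance limit."""
    return min(volume_count * VOLUME_LIMIT_MIB_S, INSTANCE_LIMIT_MIB_S)


print(volume_iops_ceiling(20_000, 32))  # 16000
print(instance_throughput_mib_s(3))     # 1500
print(instance_throughput_mib_s(4))     # 1750
```

In this model the fourth volume adds only 250 MiB/s, and a fifth adds nothing.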
Performance is always complex, with many contributing factors. But we believe most performance issues can be addressed by understanding the factors described above and carefully selecting resources capable of meeting application requirements.
Editor's note: This post was originally published on the now retired FittedCloud blog (January 2018).