[2020] Amazon AWS Storage Selection Guide for SQL Server – When to use S3, EBS or EFS

Which AWS storage options work best for SQL Servers?

If you are looking for AWS storage options for SQL Servers for your EC2 instance, your choices are minimal. You can use EBS GP2 volumes or EBS Provisioned IOPS – that is it!

Amazon has multiple storage options. It may be a bit confusing which to use and for what purpose.

I ran into that question way back when I was starting to learn the AWS platform. I was beginning to figure out how to use AWS best for SQL Server installations.

S3 or Simple Storage Service

S3 is one of the most widely adopted cloud services from AWS.

Amazon built S3 from the ground up as a new kind of storage service, with its own API and set of commands for file manipulation.

S3 stores data as objects in a flat environment (without a hierarchy or directories).

It is an object store with a simple key-value store design.

Each object is assigned a name (Key). You can use that Key to access the item from anywhere, even directly through the internet.

The content which is stored in the object is called Value.

Each object (file) in the storage contains a header with an associated sequence of bytes from 0 bytes to 5 TB (the maximum size of an object).

The data can be stored in different “buckets”, which are logical placeholders for data, like the folders on a PC.
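To make the flat key-value idea concrete, here is a toy in-memory sketch in Python. It is not the S3 API: the `ToyBucket` class, its method names, and the sample keys are all invented for illustration. The point it tries to show is that "folders" in an object store are only a naming convention based on key prefixes.

```python
class ToyBucket:
    """Toy in-memory model of a flat object store (hypothetical, not the S3 API)."""

    def __init__(self, name):
        self.name = name
        self._objects = {}  # key (str) -> value (bytes); there is no directory tree

    def put(self, key, value):
        # Store the value under its key, overwriting any previous object.
        self._objects[key] = value

    def get(self, key):
        # Look the object up by its full key.
        return self._objects[key]

    def list_keys(self, prefix=""):
        # "Folders" are just keys that happen to share a prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket("sql-backups")
bucket.put("backups/2020/sales.bak", b"full backup bytes")
bucket.put("backups/2020/hr.bak", b"another backup")
bucket.put("logs/errorlog.txt", b"log text")

backup_keys = bucket.list_keys("backups/")
```

Listing by the prefix `backups/` returns only the two backup objects, even though no `backups` directory was ever created.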

This service is designed for 99.999999999% (eleven 9s) of durability. It stores data for millions of applications for companies all around the world.

The number above gets overlooked, especially by beginners, but it is quite important. The hard drive is still the part that dies first and most frequently in any PC or server. Making that part this reliable is nothing to sneeze at.
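A rough way to get a feel for eleven 9s is to turn the percentage into an expected loss rate. The sketch below assumes, as a simplification, that the durability figure can be read as an independent annual loss probability of 1 in 10^11 per object. The function names and the 10 million object count are illustrative choices, not anything AWS guarantees.

```python
from fractions import Fraction

# Assumption: 99.999999999% durability is read as an annual loss
# probability of 1 in 10**11 per object (exact fraction, no float rounding).
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_objects_lost_per_year(object_count):
    # Expected value = number of objects * per-object loss probability.
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_single_loss(object_count):
    # Average number of years between single-object losses.
    return 1 / expected_objects_lost_per_year(object_count)


lost = expected_objects_lost_per_year(10_000_000)
years = years_per_single_loss(10_000_000)
```

Under that assumption, storing 10 million objects works out to an expected loss of one object roughly every 10,000 years.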

With AWS S3, you can:

  1. append metadata tags to objects,
  2. configure and enforce data access controls,
  3. move and store data across the S3 Storage Classes,
  4. secure data against unauthorized users,
  5. run big data analytics, and
  6. monitor data at the object and bucket levels.

More features may well be available by the time you read this.

At a very basic level, S3 can be:

  •  a host for a company’s documents and files or
  •  mapped as a file server.

These files can be encrypted with Amazon Key Management Service (KMS) keys.

Also, it can be used to host a static web site’s content that can be cached using the AWS CloudFront content delivery network.

Amazon S3 is a popular place for AWS services to store snapshots, log files, and backups of SQL Server or other databases.

Some data stored in S3 can be queried directly with Structured Query Language (SQL) from analytics services like Amazon Athena.

EBS or Amazon Elastic Block Store

AWS EBS provides highly available, consistent, low-latency block storage for Amazon EC2 (Elastic Compute Cloud).

Amazon EBS provides the storage behind the drives of your virtual machines (which works well for SQL Servers!).

Like a traditional disk, EBS stores data in blocks, and its volumes can be attached only to EC2 instances.

Amazon EBS capacity planning is essential. Plan volume additions and expansions ahead of time to avoid unplanned outages.

If a volume runs out of space, you can attach another volume (there is a limit on how many EBS volumes can be attached to an instance), or you can increase the size of the existing EBS volume.

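Increasing the size of an existing volume used to require detaching it from the EC2 instance (which meant stopping the instance). With the Elastic Volumes feature, volumes attached to current-generation instances can be resized while in use. Afterwards, extend the partition and file system inside the operating system to use the new space.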

Typical use cases include relational and NoSQL databases like Microsoft SQL Server and MySQL or Cassandra and MongoDB, Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).

To choose the right Amazon EBS volume type, you need to consider a few parameters such as:

  1. “IOPS” and throughput requirements for your application,
  2. the Read vs. Write ratios,
  3. Data type (Random or Sequential Access) and
  4. the chunk size of data (to align EBS volume to your application).
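IOPS, throughput, and chunk size are tied together by simple arithmetic: throughput is roughly IOPS multiplied by the size of each I/O. Here is a minimal sketch of that relationship. The function names are made up for this example, and real volumes also have burst behavior and per-instance caps that this ignores.

```python
def throughput_mib_per_s(iops, io_size_kib):
    # Throughput = operations per second * size of each operation.
    return iops * io_size_kib / 1024


def iops_needed(throughput_mib_s, io_size_kib):
    # Inverse: how many operations per second a target throughput requires.
    return throughput_mib_s * 1024 / io_size_kib


small_io = throughput_mib_per_s(16_000, 16)  # 16 KiB I/O at 16,000 IOPS
large_io = iops_needed(250, 64)              # 64 KiB I/O at 250 MiB/s
```

With 16 KiB I/O, 16,000 IOPS is the same as 250 MiB/s. With 64 KiB I/O (the size of a SQL Server extent), the same 250 MiB/s needs only 4,000 IOPS, which is why chunk size matters when you compare volume types.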

There are four EBS volume types in AWS:

  1. Provisioned IOPS SSD (io1)

    1. Description: Designed for heavy workloads. Typical usage is for highly transactional RDBMS databases like MS SQL Server. Highest-performance SSD volume for mission-critical low-latency or high-throughput workloads.
    2. IOPS per volume: up to 64,000.
    3. IOPS per instance: up to 80,000.
    4. Max. Throughput/Volume: 1,000 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s
    6. Use cases: Critical business applications that require sustained IOPS performance, or more than 16,000 IOPS or 250 MiB/s of throughput per volume. Large database workloads, such as Microsoft SQL Server, MySQL, and Oracle.
  2. General Purpose SSD Volumes (gp2):

    1. Description: General purpose SSD volume that balances price and performance for a wide variety of workloads. It has baseline performance of 3 IOPS/GB.
    2. Max. IOPS/Volume: 16,000.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 250 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: system boot volumes and small to medium-sized databases such as SQL Server, Oracle, MySQL, MongoDB, Couchbase, etc.
  3. Throughput Optimized HDD (st1)

    1. Description: Low cost HDD volume designed for frequently accessed, throughput intensive workloads. It can be used with testing and development environments on Amazon EC2 or with applications that don’t require a lot of read/write operations.
    2. Max. IOPS/Volume: 500.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 500 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: Streaming workloads requiring consistent, fast throughput at a low price, big data, data warehouses, and log processing. Cannot be a boot volume.
  4. Cold HDD (sc1)

    1. Description: HDD volume that provides the lowest cost per GB of all EBS volume types. Designed for less frequently accessed workloads.
    2. Max. IOPS/Volume: 250.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 250 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: Throughput-oriented storage for large volumes of data that is infrequently accessed, and scenarios where the lowest storage cost is important. Cannot be a boot volume.
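The per-volume limits in the list above can be turned into a quick first-pass filter. The sketch below is only an illustration: the helper names are invented, the limits are copied from this list (so they reflect these four volume types only), and the 100 IOPS floor for gp2 comes from AWS documentation rather than from the list itself. Check current AWS limits before sizing anything real.

```python
# Per-volume limits copied from the list above. The list only rules out
# st1 and sc1 as boot volumes, so io1 and gp2 are marked bootable here.
VOLUME_LIMITS = {
    "io1": {"max_iops": 64_000, "max_mib_s": 1_000, "boot": True},
    "gp2": {"max_iops": 16_000, "max_mib_s": 250, "boot": True},
    "st1": {"max_iops": 500, "max_mib_s": 500, "boot": False},
    "sc1": {"max_iops": 250, "max_mib_s": 250, "boot": False},
}

GP2_IOPS_PER_GB = 3   # baseline from the gp2 description above
GP2_MIN_IOPS = 100    # assumption: floor taken from AWS docs, not from the list


def gp2_baseline_iops(size_gb):
    # 3 IOPS per GB, clamped between the floor and the per-volume maximum.
    scaled = GP2_IOPS_PER_GB * size_gb
    return max(GP2_MIN_IOPS, min(VOLUME_LIMITS["gp2"]["max_iops"], scaled))


def candidate_types(required_iops, required_mib_s, boot=False):
    # Keep every type whose per-volume limits cover the requirement.
    return [
        name
        for name, limits in VOLUME_LIMITS.items()
        if limits["max_iops"] >= required_iops
        and limits["max_mib_s"] >= required_mib_s
        and (limits["boot"] or not boot)
    ]


one_tb_gp2 = gp2_baseline_iops(1000)
busy_oltp = candidate_types(20_000, 300)
modest_db = candidate_types(3_000, 100)
```

A 1,000 GB gp2 volume gets a 3,000 IOPS baseline, so a SQL Server that needs 3,000 IOPS and 100 MiB/s fits on either io1 or gp2, while one that needs 20,000 IOPS per volume is left with io1 only.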

EFS or Amazon Elastic File System

AWS EFS is a shared, elastic file storage system that grows and shrinks as you add and remove files.

It offers a traditional file storage paradigm, with data organized into directories and subdirectories.

It is a highly scalable service for use with AWS Cloud services and on-premises resources.

It uses the NFSv4 protocol to allow traditional hierarchical directory structure.

You can mount EFS to various AWS services and access it from different virtual machines.

Amazon EFS is automatically scalable. That means that your running applications won’t have any problems if the workload suddenly becomes higher (the storage will scale itself automatically).

If the workload decreases, the storage will scale down too.

Amazon Elastic File System was created for applications with high workloads that need scalable storage and relatively fast throughput.

Amazon EFS is especially helpful for running servers, shared volumes (like NAS devices), big data analysis, SaaS applications, and content management systems.

There is a Standard and an Infrequent Access storage class available with Amazon EFS.

Using Lifecycle Management, files not accessed for 30 days will automatically be moved to a cost-optimized Infrequent Access storage class. This gives you a simple way to store and access active and infrequently accessed file system data in the same file system while reducing AWS storage costs by up to 85%.
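The 30-day rule is easy to picture as a small classification function. This is a sketch of the idea only, not how EFS implements it: the function name, the file names, and the dates are made up, and treating exactly 30 idle days as "Infrequent Access" is an assumption.

```python
from datetime import datetime, timedelta

IA_THRESHOLD = timedelta(days=30)  # "not accessed for 30 days", per the text


def storage_class(last_access, now):
    # Assumption: a file idle for 30 days or more counts as Infrequent Access.
    if now - last_access >= IA_THRESHOLD:
        return "Infrequent Access"
    return "Standard"


now = datetime(2020, 6, 30)
files = {
    "reports/q1.csv": datetime(2020, 4, 1),      # idle for 90 days
    "exports/today.csv": datetime(2020, 6, 29),  # accessed yesterday
}
classes = {name: storage_class(ts, now) for name, ts in files.items()}
```

Both files stay in the same file system and keep the same path. Only the storage class behind them, and therefore the price, differs.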

Which Amazon AWS Storage Service is right for you? 

Table 1 – Feature comparison between Amazon AWS Cloud Storage Options and Services – S3 vs. EBS vs. EFS.

The deciding factor between AWS storage options most likely comes down to how much you can afford to pay for storage performance that fits your needs.

Amazon S3 can be accessed from anywhere. It seems to be the cheapest for data storage.

However, there are various other pricing parameters in S3, including cost per number of requests made, S3 Analytics, and data transfer out of S3 per gigabyte. More detail about pricing can be found here.

EBS and EFS are both faster than Amazon S3, with higher max throughput, more IOPS, and lower latency.

An EBS volume can be resized with a single API call, although its size can only be increased, not decreased.

AWS EBS is only available to EC2 instances but is cheaper than EFS. More info about EBS pricing can be found here.

EFS is best used for large quantities of data, such as large analytic workloads.

Data at this scale may not fit on the EBS volumes attached to a single EC2 instance, which would require users to break up the data and distribute it across multiple EBS volumes.

The EFS service allows concurrent access from thousands of EC2 instances, making it possible to process and analyze large amounts of data seamlessly.

More info about EFS pricing can be found here.

Conclusion

AWS has several storage options. For SQL Servers, the choice is simple: EBS General Purpose (gp2) volumes or EBS Provisioned IOPS (io1) volumes.

Price out both choices before jumping into provisioned IOPS as default – as provisioned IOPS can easily cost half your SQL Server bill.
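Pricing out the two options can start as a few lines of arithmetic. In the sketch below, the rates are PLACEHOLDERS chosen only to make the example run. They are not quoted AWS prices, so replace them with the current numbers for your region. The billing shape is the part that matters: gp2 is charged on provisioned capacity, while io1 is charged on capacity plus every provisioned IOPS.

```python
from decimal import Decimal

# PLACEHOLDER rates in USD per month, for illustration only.
# Look up real prices for your region before drawing conclusions.
GP2_PER_GB = Decimal("0.10")
IO1_PER_GB = Decimal("0.125")
IO1_PER_IOPS = Decimal("0.065")


def gp2_monthly_cost(size_gb):
    # gp2 is billed on provisioned capacity only.
    return size_gb * GP2_PER_GB


def io1_monthly_cost(size_gb, provisioned_iops):
    # io1 is billed on capacity plus every provisioned IOPS.
    return size_gb * IO1_PER_GB + provisioned_iops * IO1_PER_IOPS


gp2_cost = gp2_monthly_cost(1000)
io1_cost = io1_monthly_cost(1000, 10_000)
```

Even with made-up rates, the pattern shows up quickly: on io1 the IOPS charge, not the capacity charge, tends to dominate the bill.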

Mark Varnas

Hey I'm Mark, one of the guys behind Red9. I make a living performance tuning SQL Servers and making them more stable. I channel my SQL into our SQL Managed Services, SQL Consulting and our internal database products.
