Amazon AWS Cloud Storage Selection Guide: When to Use S3 vs. EBS vs. EFS, with a Comparison

Which AWS Storage options work for SQL Servers?

If you are looking for AWS storage options for SQL Server on an EC2 instance, your choices are limited. You can use EBS General Purpose SSD (gp2) or EBS Provisioned IOPS, and that’s it!

Amazon has multiple storage options, and it can be a bit confusing which to use for what purpose. We ran into that question back when we were just starting to learn the AWS platform and figuring out how best to use it for SQL Server installations.

S3 or Simple Storage Service

AWS S3 is one of the most widely-adopted cloud services from AWS.

Amazon built it from the ground up as a new kind of storage system, not a traditional file system, with its own set of commands for manipulating objects.

S3 stores data as objects in a flat environment (without a hierarchy).

It’s an object store with a simple key-value design.

Each object is assigned a name (the Key). You can use that Key to access the object from anywhere, even directly over the internet if its permissions allow.

The content stored in the object is called the Value.

Each object (file) consists of a header and an associated sequence of bytes, ranging from 0 bytes to 5 TB (the maximum size of an object).

The data can be stored in different “buckets”, which are logical containers for data, loosely like the top-level folders in a computer file system.
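To make the flat key space concrete, here is a minimal in-memory sketch. It is a toy model, not the S3 API: the class name `FlatObjectStore`, its methods, and the example keys are all made up for illustration. It shows that a "folder" is nothing more than a shared key prefix.

```python
class FlatObjectStore:
    """Toy model of one bucket: a flat mapping of Key -> Value (bytes)."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, value: bytes) -> None:
        # The key is an opaque string; "/" has no special meaning to the store.
        self._objects[key] = value

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list:
        # "Folders" are emulated by filtering keys on a common prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = FlatObjectStore()
bucket.put("backups/2020/sales.bak", b"backup-a")
bucket.put("backups/2021/sales.bak", b"backup-b")
bucket.put("logs/app.log", b"log-data")

backup_keys = bucket.list_keys(prefix="backups/")
print(backup_keys)
```

Listing by prefix is how the flat store gives the appearance of a directory tree without actually having one.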

This service is designed for 99.999999999% (eleven 9s) of durability, and stores data for millions of applications for companies all around the world.
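To get a feel for what eleven 9s means, the sketch below turns the figure into an expected loss rate. It assumes the durability number is read as a per-object, per-year survival probability. The object count of 10 million is only an example.

```python
from fractions import Fraction

# Durability figure quoted above: 99.999999999% (eleven 9s).
# Assumption: treated as the yearly probability that a given object survives.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year under the assumption above."""
    return object_count * (1 - DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


losses = expected_annual_losses(10_000_000)
years = years_per_expected_loss(10_000_000)
print(float(losses), float(years))
```

Under that reading, storing 10 million objects works out to an expected loss of one object roughly every 10,000 years. `Fraction` is used so the tiny difference `1 - DURABILITY` is not distorted by floating-point rounding.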

The features include capabilities to append metadata tags to objects, configure and enforce data access controls, move and store data across the S3 Storage Classes, secure data against unauthorized users, run big data analytics, and monitor data at the object and bucket levels.

At a very basic level, S3 can host a company’s documents and files and be mapped like a file server.

These files can be encrypted with AWS Key Management Service (KMS) keys.

It can also be used to host a static website’s contents, which can be cached using the CloudFront content delivery network.

Amazon S3 is a popular choice for AWS services to store snapshots, log files, or SQL Server database backups (or backups of other databases).

Some types of data stored in S3 can be directly queried from analytics applications like Amazon Athena with Structured Query Language (SQL).

EBS or Amazon Elastic Block Store

AWS EBS provides highly available, consistent, low-latency block storage for Amazon EC2 (Elastic Compute Cloud).

Amazon EBS provides the storage for the drives of your virtual machines (it works well for SQL Servers).

Like a traditional disk, it stores data in blocks, and a volume can be attached only to EC2 instances.

Amazon EBS capacity planning is very important, and adding or expanding volumes should be planned properly to avoid an unplanned business outage.

If a volume runs out of space, you can attach another volume (there is a limit on how many EBS volumes can be attached to an instance) or increase the size of the existing EBS volume.

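On current-generation instance types, an existing volume can be enlarged while it stays attached and in use, after which you extend the file system inside the operating system. Some older instance types and volumes may still require detaching the volume or stopping the EC2 instance first.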

Typical use cases include relational and NoSQL databases (like Microsoft SQL Server and MySQL or Cassandra and MongoDB), Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).

To choose the right Amazon EBS volume type you need to consider a few parameters such as:

  1. the IOPS and throughput requirements of your application,
  2. the read vs. write ratio,
  3. the access pattern (random or sequential), and
  4. the chunk (I/O) size of the data (to align the EBS volume with your application).
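These parameters are linked: throughput is roughly IOPS multiplied by the I/O (chunk) size. The sketch below shows that relationship. The function names and the 16 KiB and 1 MiB I/O sizes are illustrative assumptions, not AWS definitions.

```python
KIB_PER_MIB = 1024


def throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Throughput (MiB/s) implied by an IOPS rate at a fixed I/O size."""
    return iops * io_size_kib / KIB_PER_MIB


def iops_needed(target_mib_s: float, io_size_kib: float) -> float:
    """IOPS required to sustain a target throughput at a fixed I/O size."""
    return target_mib_s * KIB_PER_MIB / io_size_kib


# Many small random I/Os: high IOPS, modest throughput.
small_random = throughput_mib_s(iops=16_000, io_size_kib=16)

# Few large sequential I/Os: low IOPS, high throughput.
large_sequential = throughput_mib_s(iops=500, io_size_kib=1024)

print(small_random, large_sequential)
```

A workload with small random I/O is usually bound by IOPS, while one with large sequential I/O is usually bound by throughput, which is why access pattern and chunk size matter when choosing a volume type.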

There are four EBS volume types in AWS:

  1. Provisioned IOPS SSD (io1)
    1. Description: The highest-performance SSD volume, designed for mission-critical low-latency or high-throughput workloads. Typical usage is for highly transactional RDBMS databases like MS SQL Server.
    2. IOPS per volume: up to 64,000.
    3. IOPS per instance: up to 80,000.
    4. Max. Throughput/Volume: 1,000 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s
    6. Use cases: Critical business applications that require sustained IOPS performance, or more than 16,000 IOPS or 250 MiB/s of throughput per volume; large database workloads such as Microsoft SQL Server, MySQL, and Oracle.
  2. General Purpose SSD Volumes (gp2):
    1. Description: General purpose SSD volume that balances price and performance for a wide variety of workloads. It has baseline performance of 3 IOPS/GB.
    2. Max. IOPS/Volume: 16,000.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 250 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: system boot volumes and small to medium-sized databases (SQL Server, Oracle, MySQL, MongoDB, Couchbase, etc.).
  3. Throughput Optimized HDD (st1)
    1. Description: Low cost HDD volume designed for frequently accessed, throughput intensive workloads. It can be used with testing and development environments on Amazon EC2 or with applications that don’t require a lot of read/write operations.
    2. Max. IOPS/Volume: 500.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 500 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: Streaming workloads requiring consistent, fast throughput at a low price; big data; data warehouses; log processing. Cannot be a boot volume.
  4. Cold HDD (sc1)
    1. Description: HDD volume that provides the lowest cost per GB of all EBS volume types. Designed for less frequently accessed workloads.
    2. Max. IOPS/Volume: 250.
    3. IOPS per instance: 80,000.
    4. Max. Throughput/Volume: 250 MiB/s.
    5. Max. Throughput/Instance: 1,750 MiB/s.
    6. Good choice for: Throughput-oriented storage for large volumes of data that is infrequently accessed; scenarios where the lowest storage cost is important. Cannot be a boot volume.
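The per-volume limits above can be turned into a simple filter. This is a rough sketch, not an AWS tool: the `VolumeType` class and function names are hypothetical, and the limits are copied from the list above. The 100 IOPS floor for gp2 is an assumption that comes from AWS documentation rather than from this article, and the article's "3 IOPS/GB" is treated as per GiB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeType:
    name: str
    max_iops: int            # per-volume IOPS ceiling, from the list above
    max_throughput_mib: int  # per-volume MiB/s ceiling, from the list above
    bootable: bool


VOLUME_TYPES = [
    VolumeType("io1", 64_000, 1_000, bootable=True),
    VolumeType("gp2", 16_000, 250, bootable=True),
    VolumeType("st1", 500, 500, bootable=False),
    VolumeType("sc1", 250, 250, bootable=False),
]


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, capped at 16,000 (assumed floor of 100)."""
    return max(100, min(16_000, 3 * size_gib))


def candidate_volume_types(iops: int, throughput_mib: int, boot: bool = False) -> list:
    """Names of volume types whose per-volume limits cover the requirement."""
    return [
        v.name
        for v in VOLUME_TYPES
        if v.max_iops >= iops
        and v.max_throughput_mib >= throughput_mib
        and (v.bootable or not boot)
    ]


print(candidate_volume_types(iops=20_000, throughput_mib=300))
print(candidate_volume_types(iops=400, throughput_mib=400))
print(gp2_baseline_iops(1_000))
```

The filter only narrows the field. Price and the per-instance limits still decide which of the remaining candidates makes sense.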

EFS or Amazon Elastic File System

AWS EFS is a shared, elastic file storage system that grows and shrinks as you add and remove files.

It offers a traditional file storage paradigm, with data organized into directories and subdirectories.

It is a highly scalable service for use with AWS Cloud services and on-premises resources.

It uses the NFSv4 protocol and presents a traditional hierarchical directory structure.

You can mount EFS to various AWS services and access it from various virtual machines.

Amazon EFS scales automatically, which means your running applications won’t have any problems if the workload suddenly grows (the storage scales itself).

If the workload decreases, the storage will scale down too.

Amazon Elastic File System was created for applications with heavy workloads that need scalable storage and relatively fast throughput.

Amazon EFS is especially helpful for running servers, shared volumes (like NAS devices), big data analysis, SaaS applications and content management systems.

There is a Standard and an Infrequent Access storage class available with Amazon EFS.

Using Lifecycle Management, files not accessed for 30 days will automatically be moved to a cost-optimized Infrequent Access storage class, giving you a simple way to store and access active and infrequently accessed file system data in the same file system while reducing storage costs by up to 85%.
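The 30-day rule can be sketched as a simple classification. This is an illustration of the policy as described above, not the EFS implementation. The file names and dates are made up, and treating exactly 30 days as the cutoff is an assumption.

```python
from datetime import date, timedelta

# Threshold quoted above: files not accessed for 30 days move to Infrequent Access.
IA_THRESHOLD = timedelta(days=30)


def storage_class(last_access: date, today: date) -> str:
    """Storage class a file would be in under the 30-day lifecycle rule."""
    if today - last_access >= IA_THRESHOLD:  # boundary handling is an assumption
        return "Infrequent Access"
    return "Standard"


today = date(2020, 6, 30)
last_access_by_file = {
    "reports/q1.csv": date(2020, 3, 31),  # untouched for about three months
    "reports/q2.csv": date(2020, 6, 25),  # read five days ago
}

classes = {
    name: storage_class(last_access, today)
    for name, last_access in last_access_by_file.items()
}
print(classes)
```

Both files stay in the same file system and under the same paths. Only the storage class, and therefore the price, differs.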

Which Amazon AWS Storage Service is right for you? 

Table 1 – Feature comparison between Amazon AWS Cloud Storage Options and Services – S3 vs. EBS vs. EFS.

The deciding factor between AWS storage options most likely comes down to how much you can afford to pay for storage performance that fits your needs.

Amazon S3 can be accessed from anywhere. It seems to be the cheapest for data storage. However, there are various other pricing parameters in S3, including cost per number of requests made, S3 Analytics, and data transfer out of S3 per gigabyte. More detail about pricing can be found here.
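The way these components add up can be sketched as follows. The rates below are placeholders chosen only to make the arithmetic visible; they are not AWS prices, and the `S3Rates` class and `monthly_cost` function are hypothetical names. S3 Analytics charges are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Rates:
    """Per-unit rates supplied by the caller. None of these are real AWS prices."""
    storage_per_gb_month: float
    per_1000_requests: float
    transfer_out_per_gb: float


def monthly_cost(rates: S3Rates, stored_gb: float, requests: int, transfer_out_gb: float) -> float:
    """Sum the three cost components named above for one month."""
    storage = stored_gb * rates.storage_per_gb_month
    request_charges = requests / 1000 * rates.per_1000_requests
    transfer = transfer_out_gb * rates.transfer_out_per_gb
    return storage + request_charges + transfer


# Placeholder rates for illustration only; look up current pricing before relying on this.
placeholder_rates = S3Rates(
    storage_per_gb_month=0.02,
    per_1000_requests=0.005,
    transfer_out_per_gb=0.10,
)

cost = monthly_cost(placeholder_rates, stored_gb=500, requests=2_000_000, transfer_out_gb=50)
print(round(cost, 2))
```

With these placeholder numbers, request and transfer charges together exceed the storage charge, which is why the per-gigabyte storage price alone can be misleading.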

EBS and EFS are both faster than Amazon S3, with higher max throughput, more IOPS and lower latency.

An EBS volume can be scaled with a single API call (size can only be increased, while type and provisioned performance can go up or down).

AWS EBS can only be used with EC2 instances, but it is cheaper than EFS. More info about EBS pricing can be found here.

EFS is best used for large quantities of data, such as large analytic workloads.

Data at this scale may not fit on the EBS volumes attached to a single EC2 instance, which would require users to break up the data and distribute it across multiple EBS volumes.

The EFS service allows concurrent access from thousands of EC2 instances, making it possible to process and analyze large amounts of data seamlessly.

More info about EFS pricing can be found here.

Conclusion

AWS has several storage options. For SQL Servers, the choice is simpler: it comes down to EBS General Purpose (gp2) or EBS Provisioned IOPS volumes. Price out both choices before jumping into Provisioned IOPS, as those can account for 50% of your whole server cost.
