This blog post was updated to reflect S3 price model changes announced on AWS Storage Day 2021.
Amazon S3 is a cloud storage service provided by Amazon Web Services (AWS), designed for storing objects such as images, audio, videos, documents, and backups. S3 offers a range of storage classes that you can choose from based on the requirements of your workload. S3 storage classes are purpose-built to provide the lowest cost storage for different access patterns and when the access patterns are unknown. In deciding which S3 storage class best fits your workload, you have to consider the access patterns and retention of your data to optimize for the lowest total S3 storage costs over the lifetime of your data.
There are three storage classes in S3 that provide low latency and high throughput performance, and redundantly store data across a minimum of three physically separated AWS Availability Zones. These classes all offer identical millisecond-class access performance – the difference lies purely in the cost model:
- S3 Standard is ideal for frequently accessed data; this is the best choice if you access data more than once or twice a month.
- S3 Standard-Infrequent Access (S3 Standard-IA) offers 40% lower cost per gigabyte and is designed for less frequently accessed workloads: your cost to store data decreases, but the cost to access it increases. S3 Standard-IA is ideal for data retained for at least a month and accessed no more than once every month or two.
- If you have data with unknown or changing access patterns, I recommend S3 Intelligent-Tiering (S3-INT), because it delivers automatic storage cost savings by moving objects between access tiers for a small monthly per-object monitoring charge. There are no retrieval charges when using the S3 Intelligent-Tiering storage class.
In this post I’d like to share three tips for optimizing your S3 storage costs, so that you walk away knowing how to decide which storage class best fits your workload.
Tip #1: When should I use S3 Lifecycle?
While it is cheaper to upload objects directly to their final storage class, if you want to use S3 Standard-IA it often makes sense to upload your objects to the Standard storage class and use the S3 lifecycle feature to move them down to S3 Standard-IA as they age. That is because in many cases fresh objects are initially warm (that is, accessed frequently) before cooling down.
As of the AWS Storage Day 2021 announcement, S3 Intelligent-Tiering has no minimum object size or minimum storage duration. This means that S3 Lifecycle transition policies only make sense for moving existing objects from S3 Standard to S3 Intelligent-Tiering. For newly created objects, we recommend uploading directly to the S3 Intelligent-Tiering storage class.
Now, let’s take a closer look at S3’s pricing. Your S3 storage costs consist of monthly recurring fees, and per-operation fees.
* Using an S3 Lifecycle transition is recommended only for S3 Standard-IA, and only if objects have an initial “warm” period.
Rule of thumb: For newly created objects, don’t use S3 Lifecycle policies with S3 Intelligent-Tiering, as it is always cheaper to upload directly. For S3 Standard-IA, if freshly uploaded objects tend to be warm or short-lived, use S3 Lifecycle policies. If your objects are guaranteed to be long-lived and infrequently accessed, upload directly to S3 Standard-IA.
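As a rough illustration of the Standard-IA case, here is a minimal sketch that builds a Lifecycle configuration in the shape boto3's `put_bucket_lifecycle_configuration` accepts. The helper name, the rule ID, the bucket name, and the object key are all hypothetical, and the 30-day default is simply the earliest age at which S3 allows a transition to Standard-IA. The calls that would touch AWS are left commented out because they need credentials.

```python
import json


def standard_to_ia_lifecycle(days_in_standard=30, prefix=""):
    """Build a Lifecycle configuration that moves aging objects to Standard-IA.

    Hypothetical helper for illustration; it only returns a dictionary.
    """
    if days_in_standard < 30:
        # S3 rejects transitions to Standard-IA earlier than 30 days after creation.
        raise ValueError("Standard-IA transitions require at least 30 days")
    return {
        "Rules": [
            {
                "ID": "warm-then-cool",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_in_standard, "StorageClass": "STANDARD_IA"}
                ],
            }
        ]
    }


config = standard_to_ia_lifecycle(days_in_standard=30)
print(json.dumps(config, indent=2))

# Applying the rule (requires boto3 and AWS credentials; bucket name is made up):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=config)
#
# New objects headed for Intelligent-Tiering skip Lifecycle and are uploaded directly:
# s3.put_object(Bucket="example-bucket", Key="report.csv", Body=b"...",
#               StorageClass="INTELLIGENT_TIERING")
```

If your objects stay warm for longer than a month, raise `days_in_standard` so they only move once they have cooled down.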
Tip #2: Know when to Use S3 Standard-Infrequent Access (Standard-IA)
As its name indicates, the Standard-IA storage class offers major cost savings for infrequently accessed objects that are neither small nor temporary.
But how infrequent is “infrequent”? Let’s do the math.
In the upcoming example, let’s assume that you are uploading 1 TB of objects to your bucket and then keeping them for a year. In this table, I calculated the cost savings for S3 Standard-IA compared to the S3 Standard storage class. The calculation assumes that objects are uploaded directly to S3 Standard-IA, without using the S3 Lifecycle feature.
To check out the math and calculate for your own scenario, download Aron’s S3 ROI calculator here (in .xls format)
* This is 50 accesses per month. I’ve included this as a deliberately extreme example, to demonstrate the cost of accidentally placing “hot” data in the S3 Standard-IA class.
Rule of thumb: S3 Standard-IA is a good choice if your data is accessed, on average, no more than once per two months, assuming that the access patterns are uniform and predictable.
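To see why access frequency matters so much, here is a simplified sketch of the trade-off between storage price and retrieval price. The three prices are illustrative placeholders that I am assuming, not figures from this post, so substitute the current prices for your Region. The model also ignores per-request fees, the minimum storage duration, and the minimum billable object size, each of which makes Standard-IA less attractive than this model suggests.

```python
# Assumed, illustrative prices in USD; check the current S3 pricing page.
STANDARD_STORAGE = 0.023   # per GB-month
IA_STORAGE = 0.0125        # per GB-month
IA_RETRIEVAL = 0.01        # per GB retrieved


def monthly_cost_per_gb(storage_price, retrieval_price, accesses_per_month):
    """Monthly cost of 1 GB that is read in full `accesses_per_month` times."""
    return storage_price + retrieval_price * accesses_per_month


def break_even_accesses(standard_storage=STANDARD_STORAGE,
                        ia_storage=IA_STORAGE,
                        ia_retrieval=IA_RETRIEVAL):
    """Full reads per month at which Standard-IA costs the same as Standard."""
    return (standard_storage - ia_storage) / ia_retrieval


for accesses in (0.5, 1, 50):
    standard = monthly_cost_per_gb(STANDARD_STORAGE, 0.0, accesses)
    ia = monthly_cost_per_gb(IA_STORAGE, IA_RETRIEVAL, accesses)
    print(f"{accesses:>4} accesses/month: Standard ${standard:.4f}, "
          f"Standard-IA ${ia:.4f} per GB")

print(f"Break-even: {break_even_accesses():.2f} accesses per month")
```

Above the break-even point the retrieval fees outweigh the storage discount, and the gap grows linearly with every additional access.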
Tip #3: Know when to Use S3 Intelligent Tiering
The S3 Intelligent-Tiering storage class is useful when access to your data is non-uniform, i.e., some portions are frequently accessed and other portions are rarely accessed. It is also useful as a more predictable alternative to S3 Standard-IA: it protects you from extreme retrieval fees in case your data suddenly needs to be accessed frequently. S3 Intelligent-Tiering dynamically transitions objects between the “Frequent” and “Infrequent” access tiers as data access patterns change, but it does not charge you for each transition, nor do you pay per gigabyte of data retrieved.
Instead, S3 Intelligent-Tiering charges you a fixed and predictable Monitoring & Automation fee based on the total number of objects. This management fee of $2.50 per million objects, per month, is insignificant when dealing with larger objects but may be significant for small objects of 128 KB to 250 KB. Objects smaller than 128 KB are always billed at the “Frequent access” rate, and you are not charged the Monitoring & Automation fee for them.
In the upcoming example, let’s assume 1 TB in your bucket, and let’s calculate the cost savings over 1 year for S3 Intelligent-Tiering compared to the S3 Standard storage class.
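The effect of object size on the monitoring fee is easy to estimate. The sketch below uses the $2.50 per million objects figure quoted above and assumes, for simplicity, that every object in the bucket has the same size. The function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB
TB = 1024 * 1024 * MB

MONITORING_FEE_PER_MILLION = 2.5  # USD per million objects per month, as quoted above
MIN_MONITORED_SIZE = 128 * KB     # smaller objects are not monitored or charged


def monthly_monitoring_fee(total_bytes, object_size_bytes):
    """Estimate the monthly monitoring fee for a bucket of equally sized objects."""
    if object_size_bytes < MIN_MONITORED_SIZE:
        return 0.0
    object_count = total_bytes // object_size_bytes
    return object_count / 1_000_000 * MONITORING_FEE_PER_MILLION


for label, size in (("64 KB", 64 * KB), ("128 KB", 128 * KB),
                    ("500 KB", 500 * KB), ("4 MB", 4 * MB)):
    fee = monthly_monitoring_fee(1 * TB, size)
    print(f"1 TB of {label} objects: ${fee:.2f} per month in monitoring fees")
```

The same terabyte costs roughly thirty times more to monitor as 128 KB objects than as 4 MB objects, which is why consolidating small objects pays off.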
To check out the math and calculate for your own scenario, download Aron’s S3 ROI calculator here (in .xls format)
* This is an extreme example, included here to demonstrate the cost of placing hot data in the S3-INT class. As you can see, the implications of doing so are not that large. One of the greatest benefits of S3 Intelligent-Tiering is that your S3 storage costs are capped at the cost of S3 Standard plus the monitoring and automation fee, which is easily predictable.
Rule of thumb:
S3-INT is a good choice if your data is accessed, on average, no more than once per two months, where objects are typically 500KB and above, and especially when the access patterns are non-uniform or hard to predict.
If your objects tend to be 4MB or larger, there is little to lose by choosing S3-INT, which will deliver similar savings to S3 Standard-IA without you ever having to worry about retrieval fees.
If your objects are primarily 128-500KB in size, S3 Standard-IA or S3 Standard may be more cost-effective. Try to consolidate your objects into larger ones to enjoy greater savings.
Note that S3-INT will charge you at S3 Standard storage rates for small objects under 128KB, and the monitoring and automation fee is waived for these small objects, as of the latest announcement on AWS Storage Day 2021.
Summary
To summarize, here are a few tips to remember:
- Usage of S3 Lifecycle policies: Don’t use S3 Lifecycle policies to transition newly created objects into S3 Intelligent-Tiering, as it is always cheaper to upload directly. Likewise, if your objects are guaranteed to be long-lived and infrequently accessed, upload them directly to S3 Standard-IA. However, if freshly uploaded objects tend to be warm or short-lived, I recommend using S3 Lifecycle policies to transition objects from S3 Standard to S3 Standard-IA.
- When to use S3 Standard: If your objects are accessed once per month or more, as a rule of thumb, keep them in S3 Standard. Otherwise, use S3 Standard-IA or S3 Intelligent-Tiering.
- Choosing between S3 Standard-IA and S3-INT: Choose S3 Intelligent-Tiering when your data access patterns are non-uniform or unpredictable, and your typical objects are at least 500KB. S3-INT is particularly cost-effective if your objects are 4MB and above; in those cases, it will deliver similar savings to S3 Standard-IA without you ever having to worry about retrieval fees. So, for workloads consisting of large objects, there is no point in choosing S3 Standard-IA, and we always recommend choosing S3 Intelligent-Tiering.
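The rules of thumb in this post can be condensed into a small decision helper. This is only a sketch of one possible reading: the function name and parameters are hypothetical, the thresholds are the ones quoted above, and the fallback to S3 Standard for small or mid-sized objects with unpredictable access is my own conservative assumption. The return values are the storage class names used by the S3 API.

```python
KB = 1024
MB = 1024 * KB


def suggest_storage_class(accesses_per_month, typical_object_bytes, predictable_access):
    """Suggest a storage class from the rules of thumb; not a pricing calculation."""
    if accesses_per_month >= 1:
        # Accessed once per month or more: keep it in Standard.
        return "STANDARD"
    if typical_object_bytes >= 4 * MB:
        # Large objects: the monitoring fee is negligible.
        return "INTELLIGENT_TIERING"
    if typical_object_bytes >= 500 * KB and not predictable_access:
        return "INTELLIGENT_TIERING"
    if typical_object_bytes >= 128 * KB and predictable_access:
        return "STANDARD_IA"
    # Small objects, or mid-sized objects with unpredictable access (assumed fallback).
    return "STANDARD"


examples = [
    (4, 1 * MB, True),       # hot data
    (0.1, 8 * MB, True),     # large, cold objects
    (0.1, 1 * MB, False),    # cold on average, hard to predict
    (0.1, 1 * MB, True),     # cold and predictable
    (0.1, 64 * KB, True),    # small objects
]
for accesses, size, predictable in examples:
    choice = suggest_storage_class(accesses, size, predictable)
    print(f"{accesses} accesses/month, {size // KB} KB, "
          f"predictable={predictable}: {choice}")
```

Treat the result as a starting point and confirm it against your real object counts, sizes, and access logs before changing storage classes in bulk.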