AWS S3 Tutorial: Buckets, Storage Classes, Lifecycle & Security (2026)
Amazon S3 is more than object storage — it's a platform for data pipelines, static hosting, backup archival, and event-driven architectures. But getting it right means understanding storage classes, lifecycle automation, access control models, and security gotchas that trip up even experienced engineers. This guide walks through all of it with real-world examples.
Creating Buckets and Basic Operations
S3 bucket names are globally unique across all AWS accounts. Names must be 3–63 characters long, use only lowercase letters, digits, hyphens, and periods (no underscores or uppercase), begin and end with a letter or digit, and must not be formatted like an IP address. The bucket exists in a specific region even though the namespace is global.
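The naming rules above can be checked locally before calling the API. The following is a minimal sketch covering only the rules listed here, not the full AWS rule set (reserved prefixes and suffixes are not checked); the function name `is_valid_bucket_name` is a hypothetical helper.

```python
import re

# Simplified subset of the bucket naming rules: 3-63 characters, lowercase
# letters, digits, hyphens and periods, starting and ending with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Anything shaped like a dotted IPv4 address is rejected.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic checks described above."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:  # adjacent periods are not allowed
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True


for candidate in ["my-company-data-2026", "My_Bucket", "192.168.1.1", "ab"]:
    print(candidate, is_valid_bucket_name(candidate))
```

A local check like this only catches malformed names; whether a name is already taken can only be determined by S3 itself.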
# Create a bucket in us-east-1
aws s3api create-bucket \
--bucket my-company-data-2026 \
--region us-east-1
# For regions other than us-east-1, specify LocationConstraint
aws s3api create-bucket \
--bucket my-company-data-west \
--region us-west-2 \
--create-bucket-configuration LocationConstraint=us-west-2
# Enable default encryption (SSE-S3)
aws s3api put-bucket-encryption \
--bucket my-company-data-2026 \
--server-side-encryption-configuration '{
"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
}'
# Upload, download, sync
aws s3 cp file.txt s3://my-company-data-2026/uploads/file.txt
aws s3 sync ./local-folder s3://my-company-data-2026/backup/ --delete
Storage Classes Compared
Storage class selection is the single biggest lever for S3 cost optimization. The wrong class can mean paying 10–20x more than necessary for infrequently accessed data.
| Storage Class | Min Duration | Retrieval Latency | $/GB/month | Best For |
|---|---|---|---|---|
| S3 Standard | None | Milliseconds | ~$0.023 | Frequently accessed data |
| S3 Intelligent-Tiering | None | Milliseconds | $0.023 + monitoring fee | Unknown or changing access patterns |
| S3 Standard-IA | 30 days | Milliseconds | ~$0.0125 | Infrequent access, rapid retrieval |
| S3 One Zone-IA | 30 days | Milliseconds | ~$0.01 | Reproducible data, single AZ OK |
| S3 Glacier Instant | 90 days | Milliseconds | ~$0.004 | Archive with instant retrieval |
| S3 Glacier Flexible | 90 days | Minutes to 12 hours | ~$0.0036 | Long-term archive, flexible retrieval |
| S3 Glacier Deep Archive | 180 days | 12–48 hours | ~$0.00099 | 7–10 year compliance archive |
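To make the price gap concrete, here is a rough sketch that multiplies a data volume by the approximate per-GB prices from the table. It covers storage only; request, retrieval, and minimum-duration charges are ignored, and the 10,000 GB figure is an arbitrary example.

```python
# Approximate storage-only prices in USD per GB-month, copied from the table above.
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_IR": 0.004,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Storage cost per month, ignoring request and retrieval fees."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


# Compare the monthly bill for 10,000 GB (an arbitrary example volume).
for storage_class in PRICE_PER_GB_MONTH:
    cost = monthly_storage_cost(10_000, storage_class)
    print(f"{storage_class:<13} ${cost:,.2f}")
```

For data that is read often, the retrieval fees of the colder classes can outweigh the storage savings, so the storage price alone should not drive the decision.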
Lifecycle Policies
Lifecycle rules automatically transition objects between storage classes or expire (delete) them after a defined period. This is essential for log files, backups, and any time-bounded data.
{
"Rules": [
{
"ID": "log-archival",
"Status": "Enabled",
"Filter": {"Prefix": "logs/"},
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER_IR"
},
{
"Days": 365,
"StorageClass": "DEEP_ARCHIVE"
}
],
"Expiration": {
"Days": 2555
},
"NoncurrentVersionExpiration": {
"NoncurrentDays": 30
}
}
]
}
# Apply lifecycle policy via CLI
aws s3api put-bucket-lifecycle-configuration \
--bucket my-company-data-2026 \
--lifecycle-configuration file://lifecycle.json
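To sanity-check a rule like the log-archival example before applying it, the transition schedule can be modeled in a few lines. This sketch assumes objects are uploaded as STANDARD and that transitions happen exactly on the configured day; in practice S3 applies lifecycle actions asynchronously, so real transitions can lag. The function name is hypothetical.

```python
# Transition schedule and expiry copied from the log-archival rule above.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER_IR"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555


def storage_class_at(age_days: int):
    """Return the storage class of a current object version at the given age,
    or None once the object has expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"  # assumption: objects are uploaded as STANDARD
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current


for age in (0, 30, 89, 90, 400, 2555):
    print(age, storage_class_at(age))
```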
Use NoncurrentVersionExpiration to clean up old object versions when versioning is enabled. Without it, every overwritten or deleted version accumulates indefinitely and your storage costs grow silently. Also add AbortIncompleteMultipartUpload with a 7-day cutoff to clean up abandoned multipart uploads.
Versioning and Cross-Region Replication
Versioning keeps every version of every object, providing protection against accidental deletes and overwrites. Once enabled, it can only be suspended, not disabled.
# Enable versioning
aws s3api put-bucket-versioning \
--bucket my-company-data-2026 \
--versioning-configuration Status=Enabled
# List object versions
aws s3api list-object-versions \
--bucket my-company-data-2026 \
--prefix important-file.txt
# Download a specific version (re-upload it to make it the current version again)
aws s3api get-object \
--bucket my-company-data-2026 \
--key important-file.txt \
--version-id "abc123xyz" \
restored-file.txt
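When an object has been deleted in a versioned bucket, the newest entry is a delete marker, so the version to restore is the newest real version underneath it. The sketch below picks that version from a dictionary shaped like a list-object-versions response; the sample data and the helper name `newest_version_id` are made up for illustration.

```python
from datetime import datetime, timezone


def newest_version_id(response: dict, key: str):
    """Return the VersionId of the most recent real version of `key`,
    ignoring delete markers, or None if no version exists."""
    # A prefix listing can include other keys, so filter on the exact key.
    versions = [v for v in response.get("Versions", []) if v["Key"] == key]
    if not versions:
        return None
    return max(versions, key=lambda v: v["LastModified"])["VersionId"]


# Hypothetical sample shaped like a list-object-versions response.
sample = {
    "Versions": [
        {"Key": "important-file.txt", "VersionId": "v1", "IsLatest": False,
         "LastModified": datetime(2026, 1, 1, tzinfo=timezone.utc)},
        {"Key": "important-file.txt", "VersionId": "v2", "IsLatest": False,
         "LastModified": datetime(2026, 2, 1, tzinfo=timezone.utc)},
    ],
    "DeleteMarkers": [
        {"Key": "important-file.txt", "VersionId": "dm1", "IsLatest": True,
         "LastModified": datetime(2026, 3, 1, tzinfo=timezone.utc)},
    ],
}

print(newest_version_id(sample, "important-file.txt"))
```

The returned ID can then be passed as the version ID when fetching the object.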
Cross-Region Replication (CRR) automatically copies objects to a bucket in another region. Both buckets must have versioning enabled. CRR is useful for disaster recovery, compliance (data residency in multiple regions), and reducing read latency for geographically distributed users.
{
"Role": "arn:aws:iam::123456789012:role/s3-replication-role",
"Rules": [
{
"ID": "replicate-all",
"Priority": 1,
"Status": "Enabled",
"Filter": {},
"Destination": {
"Bucket": "arn:aws:s3:::my-company-data-eu-west",
"StorageClass": "STANDARD_IA",
"ReplicationTime": {
"Status": "Enabled",
"Time": {"Minutes": 15}
},
"Metrics": {
"Status": "Enabled",
"EventThreshold": {"Minutes": 15}
}
},
"DeleteMarkerReplication": {"Status": "Enabled"}
}
]
}
Bucket Policies vs ACLs vs Block Public Access
Amazon S3 has three overlapping access control mechanisms. Knowing which one controls what is critical for avoiding security misconfigurations.
- Block Public Access: Account or bucket-level guardrail. Overrides bucket policies and ACLs. Always enable at account level.
- Bucket Policies: JSON resource-based policies attached to the bucket. Control access for IAM principals, other AWS accounts, and services. This is the primary mechanism you should use.
- ACLs: Legacy per-object/bucket access lists. AWS recommends disabling ACLs (Object Ownership = Bucket owner enforced) in all new buckets.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowCloudFrontOAC",
"Effect": "Allow",
"Principal": {
"Service": "cloudfront.amazonaws.com"
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-website-bucket/*",
"Condition": {
"StringEquals": {
"AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/ABCDEF"
}
}
},
{
"Sid": "DenyNonHTTPS",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::my-website-bucket",
"arn:aws:s3:::my-website-bucket/*"
],
"Condition": {
"Bool": {"aws:SecureTransport": "false"}
}
}
]
}
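Every Resource ARN in a bucket policy has to refer to the bucket the policy is attached to, so it can help to derive the ARNs from the bucket name instead of typing them by hand. The following sketch builds the HTTPS-only statement that way; the helper name `deny_non_https_statement` is hypothetical, and the bucket name is the example one used earlier.

```python
import json


def deny_non_https_statement(bucket_name: str) -> dict:
    """Build a Deny statement for requests that do not use TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Sid": "DenyNonHTTPS",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Cover both the bucket itself and every object inside it.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [deny_non_https_statement("my-company-data-2026")],
}
print(json.dumps(policy, indent=2))
```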
Pre-Signed URLs
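Pre-signed URLs grant temporary access to a private S3 object without requiring the requester to have AWS credentials. The URL carries the signer's access key ID, an expiry time, and a signature computed from the signer's secret key; the secret itself never appears in the URL. The request runs with the permissions of the signing IAM principal, so the URL stops working if that principal loses access or its temporary credentials expire.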
import boto3
from botocore.config import Config
s3_client = boto3.client(
's3',
region_name='us-east-1',
config=Config(signature_version='s3v4')
)
# Generate a pre-signed URL for download (valid 1 hour)
url = s3_client.generate_presigned_url(
'get_object',
Params={
'Bucket': 'my-private-bucket',
'Key': 'reports/2026-q1.pdf'
},
ExpiresIn=3600
)
print(url) # Share this URL with the user
# Generate a pre-signed URL for upload (PUT); the uploader must send the same Content-Type header
upload_url = s3_client.generate_presigned_url(
'put_object',
Params={
'Bucket': 'my-upload-bucket',
'Key': 'uploads/user-123/avatar.png',
'ContentType': 'image/png'
},
ExpiresIn=300 # 5 minutes
)
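On the client side, an upload through a pre-signed PUT URL is a plain HTTP request with no AWS SDK involved. The sketch below uses only the standard library and assumes the URL was signed with a ContentType parameter, in which case the request has to send the same Content-Type header or S3 rejects the signature. The helper name `build_presigned_put` is hypothetical.

```python
import urllib.request


def build_presigned_put(upload_url: str, body: bytes, content_type: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request for a pre-signed upload URL."""
    return urllib.request.Request(
        upload_url,
        data=body,
        method="PUT",
        # Must match the ContentType used when the URL was signed.
        headers={"Content-Type": content_type},
    )


# Sending the request (needs a real pre-signed URL, so it is left commented out):
# request = build_presigned_put(upload_url, png_bytes, "image/png")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```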
Event Notifications to Lambda and SQS
S3 event notifications trigger downstream processing when objects are created, deleted, or restored. This is the foundation of serverless data pipelines — an image upload triggers a Lambda that creates thumbnails; a CSV upload triggers a Lambda that loads it into Redshift.
{
"LambdaFunctionConfigurations": [
{
"LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
"Events": ["s3:ObjectCreated:*"],
"Filter": {
"Key": {
"FilterRules": [
{"Name": "prefix", "Value": "uploads/"},
{"Name": "suffix", "Value": ".jpg"}
]
}
}
}
],
"QueueConfigurations": [
{
"QueueArn": "arn:aws:sqs:us-east-1:123456789012:s3-events-queue",
"Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
"Filter": {
"Key": {
"FilterRules": [
{"Name": "prefix", "Value": "data/"}
]
}
}
}
]
}
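On the receiving side, a Lambda function gets the notification as a JSON event with one entry per object under Records. A minimal handler sketch follows; it only prints what it received, and the sample event is a trimmed, made-up payload. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list:
    """Return (event_name, bucket, key) tuples from an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the event (spaces arrive as '+').
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((record["eventName"], bucket, key))
    return objects


def handler(event, context):
    """Lambda entry point: log each object and report how many were seen."""
    objects = extract_objects(event)
    for event_name, bucket, key in objects:
        print(f"{event_name}: s3://{bucket}/{key}")
    return {"processed": len(objects)}


# Trimmed, hypothetical sample event for local testing.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-upload-bucket"},
                "object": {"key": "uploads/my+photo%281%29.jpg"},
            },
        }
    ]
}
print(handler(sample_event, None))
```

When the destination is SQS instead of Lambda, the same JSON document arrives as the message body and needs to be parsed from a string first.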
Frequently Asked Questions
Can S3 be used to host a static website?
Yes. Enable static website hosting on a bucket, set the index and error documents, and make the content readable via a bucket policy. The S3 website endpoint serves HTTP only and has no caching, so for production put CloudFront in front of the bucket with Origin Access Control (OAC): this gives you HTTPS and caching and keeps the bucket private. OAC works against the bucket's REST endpoint rather than the website endpoint, so in that setup static website hosting does not need to be enabled on the bucket.
What's the maximum object size in S3?
Single PUT: 5 GB. Multipart upload: up to 5 TB per object. The AWS CLI and boto3 switch to multipart upload automatically once a file exceeds a configurable threshold (8 MB by default). For very large files (100 GB+), combine multipart upload with S3 Transfer Acceleration, which routes traffic through AWS edge locations for faster uploads over long distances.
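Multipart uploads are limited to 10,000 parts, and every part except the last must be at least 5 MiB, so very large objects need a larger part size than the default. The following sketch picks a part size that stays within both limits; the function names and the 8 MiB preferred size are illustrative assumptions.

```python
import math

MIB = 1024 ** 2
MIN_PART_SIZE = 5 * MIB   # minimum size of every part except the last
MAX_PARTS = 10_000        # maximum number of parts per multipart upload


def choose_part_size(object_size: int, preferred: int = 8 * MIB) -> int:
    """Return a part size in bytes that keeps the upload within MAX_PARTS."""
    part_size = max(preferred, MIN_PART_SIZE)
    needed = math.ceil(object_size / MAX_PARTS)
    if needed > part_size:
        # Round up to a whole number of MiB for tidy part boundaries.
        part_size = math.ceil(needed / MIB) * MIB
    return part_size


def part_count(object_size: int, part_size: int) -> int:
    """Number of parts needed to upload the object with this part size."""
    return max(1, math.ceil(object_size / part_size))


size = 100 * 1024 ** 3  # a 100 GiB file
chosen = choose_part_size(size)
print(chosen // MIB, "MiB parts,", part_count(size, chosen), "parts")
```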
How do I prevent accidental bucket deletion?
Enable MFA Delete on the bucket: this requires MFA authentication to permanently delete object versions or change the versioning state. Also apply S3 Object Lock in Compliance mode for data that must not be deleted for regulatory reasons (WORM, Write Once Read Many). For the bucket itself, use an SCP (Service Control Policy) to deny s3:DeleteBucket in production OUs.
What's the difference between S3 Transfer Acceleration and multipart upload?
They solve different problems. Multipart upload splits a large file into chunks and uploads them in parallel, which improves throughput for large files on any connection. Transfer Acceleration routes your upload through Amazon CloudFront edge locations, reducing latency for clients that are geographically distant from the bucket's region. They can be used together for maximum throughput on large files over long distances.
How does S3 Intelligent-Tiering work?
S3 Intelligent-Tiering monitors access patterns per object and automatically moves objects between frequent access, infrequent access, and archive tiers with no retrieval fees. There's a small monitoring fee ($0.0025 per 1,000 objects/month). It's cost-effective for objects larger than 128 KB with unknown or changing access patterns. Objects smaller than 128 KB are always charged at the frequent access tier rate regardless of access pattern.
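The monitoring fee scales with object count rather than data volume, which is why many small objects change the math. The sketch below estimates the monthly fee from a list of object sizes, using the rate quoted above and assuming objects under 128 KB are not monitored and therefore incur no monitoring fee; the function name is hypothetical.

```python
MONITORING_FEE_PER_1000 = 0.0025   # USD per 1,000 monitored objects per month
MIN_MONITORED_BYTES = 128 * 1024   # objects below 128 KB are not monitored


def monthly_monitoring_fee(object_sizes) -> float:
    """Estimate the Intelligent-Tiering monitoring fee for the given object sizes."""
    monitored = sum(1 for size in object_sizes if size >= MIN_MONITORED_BYTES)
    return monitored / 1000 * MONITORING_FEE_PER_1000


# One million 1 MiB objects versus one million 4 KB objects.
large = [1024 * 1024] * 1_000_000
small = [4 * 1024] * 1_000_000
print(f"large objects: ${monthly_monitoring_fee(large):.2f}")
print(f"small objects: ${monthly_monitoring_fee(small):.2f}")
```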