AWS S3 Tutorial: Buckets, Storage Classes, Lifecycle & Security (2026)

Amazon S3 is more than object storage — it's a platform for data pipelines, static hosting, backup archival, and event-driven architectures. But getting it right means understanding storage classes, lifecycle automation, access control models, and security gotchas that trip up even experienced engineers. This guide walks through all of it with real-world examples.

Creating Buckets and Basic Operations

S3 bucket names are globally unique across all AWS accounts. Names must be 3–63 characters, lowercase, no underscores, and must not look like an IP address. The bucket exists in a specific region even though the namespace is global.
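
The naming rules above can be checked locally before calling the API. The following is a minimal sketch, not an official validator: the function name `is_valid_bucket_name` is hypothetical, and two rules it enforces (names start and end with a letter or digit, no adjacent dots) come from the AWS naming rules rather than from the text above. It does not check global uniqueness or reserved prefixes and suffixes.

```python
import re

# Lowercase letters, digits, hyphens, and dots; 3-63 characters in total;
# must start and end with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic bucket naming rules."""
    if not _NAME_RE.fullmatch(name):
        return False  # wrong length, uppercase, underscore, or bad first/last char
    if ".." in name:
        return False  # adjacent dots are not allowed
    if _IP_RE.fullmatch(name):
        return False  # formatted like an IPv4 address
    return True


if __name__ == "__main__":
    for candidate in ("my-company-data-2026", "My_Bucket", "192.168.5.4"):
        print(candidate, is_valid_bucket_name(candidate))
```

A name that passes this check can still be rejected by `create-bucket` if another account already owns it.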

# Create a bucket in us-east-1
aws s3api create-bucket \
  --bucket my-company-data-2026 \
  --region us-east-1

# For regions other than us-east-1, specify LocationConstraint
aws s3api create-bucket \
  --bucket my-company-data-west \
  --region us-west-2 \
  --create-bucket-configuration LocationConstraint=us-west-2

# Set default encryption explicitly (SSE-S3; new buckets already get this by default since January 2023)
aws s3api put-bucket-encryption \
  --bucket my-company-data-2026 \
  --server-side-encryption-configuration '{
    "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
  }'

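# Upload, download, sync
aws s3 cp file.txt s3://my-company-data-2026/uploads/file.txt
aws s3 cp s3://my-company-data-2026/uploads/file.txt ./file-copy.txt
# --delete removes objects under backup/ that no longer exist in ./local-folder
aws s3 sync ./local-folder s3://my-company-data-2026/backup/ --delete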
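Pro Tip: Enable S3 Block Public Access at the account level (AWS console → S3 → Block Public Access settings for this account). It prevents any bucket or object in the account from being made public, regardless of bucket policies or ACLs. The account-level setting takes precedence over bucket-level settings, so it cannot be relaxed for a single bucket. If a bucket genuinely needs public access (e.g., an S3 website endpoint without CloudFront), you have to relax the account-level setting and keep Block Public Access enabled individually on every other bucket.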

Storage Classes Compared

Storage class selection is the single biggest lever for S3 cost optimization. The wrong class can mean paying 10–20x more than necessary for infrequently accessed data.

| Storage Class | Min Duration | Retrieval Latency | $/GB/month | Best For |
|---|---|---|---|---|
| S3 Standard | None | Milliseconds | ~$0.023 | Frequently accessed data |
| S3 Intelligent-Tiering | None | Milliseconds | $0.023 + monitoring fee | Unknown or changing access patterns |
| S3 Standard-IA | 30 days | Milliseconds | ~$0.0125 | Infrequent access, rapid retrieval |
| S3 One Zone-IA | 30 days | Milliseconds | ~$0.01 | Reproducible data, single AZ OK |
| S3 Glacier Instant | 90 days | Milliseconds | ~$0.004 | Archive with instant retrieval |
| S3 Glacier Flexible | 90 days | 1–12 hours | ~$0.0036 | Long-term archive, flexible retrieval |
| S3 Glacier Deep Archive | 180 days | 12–48 hours | ~$0.00099 | 7–10 year compliance archive |

Note: Standard-IA and Glacier classes charge a per-GB retrieval fee on top of storage. For data you access even once a month, Standard may be cheaper than Standard-IA when you factor in retrieval costs. Use S3 Storage Lens to analyze access patterns before migrating storage classes.
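
To make the retrieval-fee trade-off concrete, here is a rough cost sketch. The storage prices are the approximate figures from the table; the Standard-IA retrieval fee of $0.01 per GB is an assumption for illustration, and request fees, minimum storage duration, and minimum billable object size are ignored. Check current pricing for your region before relying on the numbers.

```python
# Prices in USD per GB. Storage prices are the approximate table values;
# the Standard-IA retrieval fee is an ASSUMED figure for illustration.
STORAGE_PRICE = {"STANDARD": 0.023, "STANDARD_IA": 0.0125}
RETRIEVAL_FEE = {"STANDARD": 0.0, "STANDARD_IA": 0.01}


def monthly_cost(storage_class: str, stored_gb: float, retrieved_gb: float) -> float:
    """Storage plus retrieval cost for one month (request fees ignored)."""
    return (stored_gb * STORAGE_PRICE[storage_class]
            + retrieved_gb * RETRIEVAL_FEE[storage_class])


def break_even_reads_per_month() -> float:
    """Full reads of the data set per month at which both classes cost the same."""
    saving = STORAGE_PRICE["STANDARD"] - STORAGE_PRICE["STANDARD_IA"]
    return saving / RETRIEVAL_FEE["STANDARD_IA"]


if __name__ == "__main__":
    for reads in (0, 1, 2):
        std = monthly_cost("STANDARD", 1000, 1000 * reads)
        ia = monthly_cost("STANDARD_IA", 1000, 1000 * reads)
        print(f"{reads} full reads/month: Standard ${std:.2f}, Standard-IA ${ia:.2f}")
    print(f"Break-even: {break_even_reads_per_month():.2f} full reads per month")
```

Under these assumptions the break-even sits at roughly one full read of the data per month, which is why data read monthly or more often is a poor fit for Standard-IA once request fees are added.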

Lifecycle Policies

Lifecycle rules automatically transition objects between storage classes or expire (delete) them after a defined period. This is essential for log files, backups, and any time-bounded data.

{
  "Rules": [
    {
      "ID": "log-archival",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 30
      }
    }
  ]
}
# Apply lifecycle policy via CLI
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-company-data-2026 \
  --lifecycle-configuration file://lifecycle.json
Pro Tip: Add NoncurrentVersionExpiration to clean up old object versions when versioning is enabled. Without it, every overwrite and delete marker accumulates indefinitely and your storage costs grow silently. Also add AbortIncompleteMultipartUpload with a 7-day cutoff to clean up abandoned multipart uploads.
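
One way to keep the multipart cleanup from being forgotten is to generate the lifecycle document from code. The sketch below rebuilds the rule shown earlier and adds `AbortIncompleteMultipartUpload` with the 7-day cutoff. The helper names `build_log_archival_rule` and `check_rule` are hypothetical, and `check_rule` is only a local sanity check, not a replacement for the validation S3 performs.

```python
import json


def build_log_archival_rule(prefix: str = "logs/", abort_after_days: int = 7) -> dict:
    """Lifecycle rule mirroring the log-archival JSON, plus multipart cleanup."""
    return {
        "ID": "log-archival",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER_IR"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": 2555},
        "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        # Remove the parts of multipart uploads that were never completed.
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_after_days},
    }


def check_rule(rule: dict) -> None:
    """Local sanity check: transitions move forward in time and the object
    expires only after its last transition."""
    days = [t["Days"] for t in rule["Transitions"]]
    if any(later <= earlier for earlier, later in zip(days, days[1:])):
        raise ValueError("transition days must be strictly increasing")
    if rule["Expiration"]["Days"] <= days[-1]:
        raise ValueError("expiration must come after the last transition")


if __name__ == "__main__":
    rule = build_log_archival_rule()
    check_rule(rule)
    # Redirect this output to lifecycle.json and apply it with the CLI command above.
    print(json.dumps({"Rules": [rule]}, indent=2))
```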

Versioning and Cross-Region Replication

Versioning keeps every version of every object, providing protection against accidental deletes and overwrites. Once enabled, it can only be suspended, not disabled.

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-company-data-2026 \
  --versioning-configuration Status=Enabled

# List object versions
aws s3api list-object-versions \
  --bucket my-company-data-2026 \
  --prefix important-file.txt

# Download a specific version (to restore it in place, copy that version over the current object)
aws s3api get-object \
  --bucket my-company-data-2026 \
  --key important-file.txt \
  --version-id "abc123xyz" \
  restored-file.txt
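
Downloading a version does not change what the bucket serves. To make an older version current again, a common approach is to copy it onto the same key, which creates a new current version and leaves the history intact. The sketch below assumes `s3_client` is a boto3 S3 client; the function name `restore_version` is hypothetical. A single `copy_object` call only handles objects up to 5 GB, so larger objects would need a multipart copy.

```python
def restore_version(s3_client, bucket: str, key: str, version_id: str) -> dict:
    """Make an older version current again by copying it onto the same key.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    The copy becomes the new current version; older versions are kept.
    """
    return s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key, "VersionId": version_id},
    )


# Example call (requires boto3 and AWS credentials):
# import boto3
# restore_version(boto3.client("s3"), "my-company-data-2026",
#                 "important-file.txt", "abc123xyz")
```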

Cross-Region Replication (CRR) automatically copies objects to a bucket in another region. Both buckets must have versioning enabled. CRR is useful for disaster recovery, compliance (data residency in multiple regions), and reducing read latency for geographically distributed users.

{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [
    {
      "ID": "replicate-all",
      "Priority": 1,
      "Status": "Enabled",
      "Filter": {},
      "Destination": {
        "Bucket": "arn:aws:s3:::my-company-data-eu-west",
        "StorageClass": "STANDARD_IA",
        "ReplicationTime": {
          "Status": "Enabled",
          "Time": {"Minutes": 15}
        },
        "Metrics": {
          "Status": "Enabled",
          "EventThreshold": {"Minutes": 15}
        }
      },
      "DeleteMarkerReplication": {"Status": "Enabled"}
    }
  ]
}

Bucket Policies vs ACLs vs Block Public Access

AWS S3 has three overlapping access control mechanisms. Understanding which one controls what is critical for avoiding security misconfigurations.

  • Block Public Access: Account or bucket-level guardrail. Overrides bucket policies and ACLs. Always enable at account level.
  • Bucket Policies: JSON resource-based policies attached to the bucket. Control access for IAM principals, other AWS accounts, and services. This is the primary mechanism you should use.
  • ACLs: Legacy per-object/bucket access lists. AWS recommends disabling ACLs (Object Ownership = Bucket owner enforced) in all new buckets.

For example, a bucket policy that allows reads only through a CloudFront distribution (via OAC) and denies any request not made over HTTPS:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOAC",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/ABCDEF"
        }
      }
    },
    {
      "Sid": "DenyNonHTTPS",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-website-bucket",
        "arn:aws:s3:::my-website-bucket/*"
      ],
      "Condition": {
        "Bool": {"aws:SecureTransport": "false"}
      }
    }
  ]
}

Pre-Signed URLs

Pre-signed URLs grant temporary access to a private S3 object without requiring the requester to have AWS credentials. The URL carries the signer's access key ID, an expiry time, and a signature over the request (never the secret key); S3 authorizes the request with the permissions of the signing IAM principal.

import boto3
from botocore.config import Config

s3_client = boto3.client(
    's3',
    region_name='us-east-1',
    config=Config(signature_version='s3v4')
)

# Generate a pre-signed URL for download (valid 1 hour)
url = s3_client.generate_presigned_url(
    'get_object',
    Params={
        'Bucket': 'my-private-bucket',
        'Key': 'reports/2026-q1.pdf'
    },
    ExpiresIn=3600
)
print(url)  # Share this URL with the user

# Generate a pre-signed URL for upload (PUT)
upload_url = s3_client.generate_presigned_url(
    'put_object',
    Params={
        'Bucket': 'my-upload-bucket',
        'Key': 'uploads/user-123/avatar.png',
        'ContentType': 'image/png'
    },
    ExpiresIn=300  # 5 minutes
)
Note: Pre-signed URLs inherit the permissions of the IAM entity that signed them. If the signing entity loses access (e.g., IAM role deleted, SCP applied), the URL immediately stops working even if it hasn't expired. Likewise, a URL signed with temporary credentials stops working when those credentials expire, regardless of ExpiresIn. For production file download/upload flows, pre-signed URLs are the standard pattern.
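
On the client side, a pre-signed PUT URL is used with a plain HTTP request and no AWS SDK. The sketch below uses only the standard library; the function names `build_upload_request` and `upload` are hypothetical. It assumes the URL was signed with a ContentType parameter, as in the upload example above, in which case the client has to send the same Content-Type header.

```python
import urllib.request


def build_upload_request(presigned_url: str, data: bytes,
                         content_type: str = "image/png") -> urllib.request.Request:
    """Build (but do not send) the HTTP PUT for a pre-signed upload URL."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        # Must match the ContentType used when the URL was signed;
        # otherwise S3 rejects the request with a signature mismatch.
        headers={"Content-Type": content_type},
    )


def upload(presigned_url: str, data: bytes, content_type: str = "image/png") -> int:
    """Send the PUT and return the HTTP status code."""
    request = build_upload_request(presigned_url, data, content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Splitting request construction from sending keeps the header logic testable without network access; `upload` raises `urllib.error.HTTPError` if S3 answers with an error status such as 403.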

Event Notifications to Lambda and SQS

S3 event notifications trigger downstream processing when objects are created, deleted, or restored. This is the foundation of serverless data pipelines — an image upload triggers a Lambda that creates thumbnails; a CSV upload triggers a Lambda that loads it into Redshift.

{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "uploads/"},
            {"Name": "suffix", "Value": ".jpg"}
          ]
        }
      }
    }
  ],
  "QueueConfigurations": [
    {
      "QueueArn": "arn:aws:sqs:us-east-1:123456789012:s3-events-queue",
      "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "data/"}
          ]
        }
      }
    }
  ]
}
Pro Tip: For high-throughput S3 event processing, prefer SQS as the notification target rather than Lambda directly. S3 delivers each notification to the queue, SQS buffers them, and Lambda polls the queue in batches, which gives you back-pressure and retry semantics. Direct S3-to-Lambda invocations can exhaust your Lambda concurrency during burst uploads.
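
Whichever target you choose, the consumer has to unpack the notification. The sketch below handles both shapes: a direct S3 trigger, and an SQS trigger whose message bodies contain S3 notifications. The names `extract_objects` and `handler` are hypothetical. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
import json
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list:
    """Return (event_name, bucket, key) for every S3 record in a Lambda event."""
    found = []
    for record in event.get("Records", []):
        if "s3" in record:  # direct S3 -> Lambda
            s3_records = [record]
        elif "body" in record:  # S3 -> SQS -> Lambda
            s3_records = json.loads(record["body"]).get("Records", [])
        else:
            s3_records = []
        for item in s3_records:
            bucket = item["s3"]["bucket"]["name"]
            # Keys are URL-encoded (spaces become '+'), so decode them.
            key = unquote_plus(item["s3"]["object"]["key"])
            found.append((item["eventName"], bucket, key))
    return found


def handler(event, context):
    """Hypothetical Lambda entry point: log each affected object."""
    for event_name, bucket, key in extract_objects(event):
        print(f"{event_name}: s3://{bucket}/{key}")
```

The `.get("Records", [])` on the SQS body also skips the test message S3 sends to a queue when the notification is first configured, which has no records.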

Frequently Asked Questions

Can S3 be used to host a static website?

Yes. Enable static website hosting on a bucket, set the index and error documents, and make the content accessible via a bucket policy. For production, always put CloudFront in front of the S3 origin using Origin Access Control (OAC) — this gives you HTTPS, caching, and keeps the bucket private. Serving directly from the S3 website endpoint is HTTP only and has no caching.

What's the maximum object size in S3?

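Single PUT: 5 GB. Multipart upload: up to 5 TB per object. The high-level aws s3 commands and the SDK transfer managers (for example, boto3's upload_file) switch to multipart automatically above a configurable threshold, 8 MB by default in the AWS CLI and boto3. For very large files (100 GB+), combine multipart with S3 Transfer Acceleration to route uploads through AWS edge locations, which speeds up transfers over long distances.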
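
Part size matters for large objects because a multipart upload is limited to 10,000 parts, each between 5 MiB and 5 GiB (the last part may be smaller). The sketch below picks a part size that keeps an upload within that limit; the function names are hypothetical and the 8 MiB default simply mirrors the CLI default mentioned above.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MAX_PARTS = 10_000        # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * MIB   # every part except the last must be at least this size
MAX_PART_SIZE = 5 * GIB


def choose_part_size(object_size: int, preferred: int = 8 * MIB) -> int:
    """Smallest part size >= preferred that fits the object into MAX_PARTS parts."""
    needed = (object_size + MAX_PARTS - 1) // MAX_PARTS  # integer ceiling division
    part_size = max(preferred, needed, MIN_PART_SIZE)
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for a single multipart upload")
    return part_size


def part_count(object_size: int, part_size: int) -> int:
    """Number of parts needed to upload object_size bytes."""
    return (object_size + part_size - 1) // part_size


if __name__ == "__main__":
    for size in (1 * GIB, 100 * GIB):
        chunk = choose_part_size(size)
        print(f"{size / GIB:.0f} GiB -> {chunk} byte parts, {part_count(size, chunk)} parts")
```

With boto3, the chosen value could be passed as `multipart_chunksize` in a `TransferConfig`; a 100 GiB file no longer fits in 10,000 parts at 8 MiB each, so the part size has to grow.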

How do I prevent accidental bucket deletion?

Enable MFA Delete on the bucket: it requires MFA authentication to permanently delete object versions or change the versioning state. For data that must not be deleted for regulatory reasons, use S3 Object Lock in compliance mode (WORM: Write Once Read Many). For the bucket itself, use an SCP (Service Control Policy) to deny s3:DeleteBucket in production OUs.

What's the difference between S3 Transfer Acceleration and multipart upload?

They solve different problems. Multipart upload splits a large file into chunks and uploads them in parallel — it improves throughput for large files on any connection. Transfer Acceleration routes your upload through AWS CloudFront edge locations, reducing latency for geographically distant clients uploading to a bucket in a different region. They can be used together for maximum throughput on large files over long distances.

How does S3 Intelligent-Tiering work?

S3 Intelligent-Tiering monitors access patterns per object and automatically moves objects between frequent access, infrequent access, and archive tiers with no retrieval fees. There's a small monitoring fee ($0.0025 per 1,000 objects/month). It's cost-effective for objects larger than 128 KB with unknown or changing access patterns. Objects smaller than 128 KB are always charged at the frequent access tier rate regardless of access pattern.
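
The tiering behavior can be modeled in a few lines. This is a sketch under stated assumptions rather than the service's actual logic: the 30-day and 90-day thresholds for the infrequent and archive instant tiers are not given in the text above and are assumed here, as is the premise that objects under 128 KB incur no monitoring fee. The tier labels and function names are hypothetical.

```python
MONITORING_FEE_PER_1000 = 0.0025  # USD per 1,000 monitored objects per month
MIN_MONITORED_SIZE = 128 * 1024   # bytes; smaller objects are not auto-tiered

# ASSUMED thresholds (days without access) for the automatic tiers.
INFREQUENT_AFTER_DAYS = 30
ARCHIVE_INSTANT_AFTER_DAYS = 90


def access_tier(size_bytes: int, days_since_last_access: int) -> str:
    """Tier an object would be billed at under the assumptions above."""
    if size_bytes < MIN_MONITORED_SIZE:
        return "FREQUENT"  # small objects always stay in the frequent tier
    if days_since_last_access >= ARCHIVE_INSTANT_AFTER_DAYS:
        return "ARCHIVE_INSTANT"
    if days_since_last_access >= INFREQUENT_AFTER_DAYS:
        return "INFREQUENT"
    return "FREQUENT"


def monthly_monitoring_fee(object_sizes: list) -> float:
    """Monitoring fee in USD for a list of object sizes in bytes."""
    monitored = sum(1 for size in object_sizes if size >= MIN_MONITORED_SIZE)
    return monitored / 1000 * MONITORING_FEE_PER_1000
```

The model makes the small-object caveat visible: a bucket of many tiny objects gains nothing from Intelligent-Tiering, because none of them ever leave the frequent access tier.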