AWS Fargate Pricing and Cost Optimization: Cut Your Bills by 50%

AWS Fargate lets you run containers without managing servers — but that convenience comes with a price tag that surprises many teams the first time they see their AWS bill. The good news: Fargate pricing is completely transparent and almost entirely within your control. Teams that understand the billing model and apply five or six targeted optimizations routinely cut their Fargate spend in half. This guide covers every lever available in 2026 — from Fargate Spot and Graviton ARM64 to Savings Plans, scale-to-zero patterns, and container image optimization — with real cost tables, task definition JSON, and Terraform HCL you can copy and adapt today.

Fargate Pricing Model: vCPU + Memory Per-Second Billing
Right-Sizing Containers: Finding and Eliminating Waste
Fargate Spot: 70% Discount for Interruption-Tolerant Workloads
Graviton on Fargate: ARM64 for 20% More Savings
Compute Savings Plans: Committed Spend Discounts
Auto Scaling to Zero: Pay Nothing for Idle Environments
Container Image Optimization: Faster Starts, Lower Bills
Bin Packing Tasks: Sidecar Patterns and Shared Resources
Cost Visibility: Tagging, Container Insights, and Cost Explorer
Fargate vs EC2 vs Lambda: Break-Even Analysis

Fargate Pricing Model: vCPU + Memory Per-Second Billing

Fargate charges you for two resources: vCPU and memory. Billing is per second with a one-minute minimum. The meter starts when Fargate begins pulling your container image, not when the task reaches the RUNNING state, and it stops when the task terminates. You are billed for the resources you declare in your task definition, not for actual utilization. A task that requests 1 vCPU but idles at 5% CPU costs the same as a task running at 100%.

Current Pricing (us-east-1, 2026)

Resource	On-Demand Price	Fargate Spot Price (est.)
vCPU per hour	$0.04048	~$0.01214 (70% off)
GB memory per hour	$0.004445	~$0.00133 (70% off)
EKS Fargate vCPU per hour	$0.04048	N/A (EKS Fargate has no Spot)
EKS Fargate GB memory per hour	$0.004445	N/A

ECS Fargate and EKS Fargate use the same underlying compute pricing per vCPU-hour and GB-hour. The difference is that EKS adds an EKS cluster charge of $0.10/hour (~$73/month) for the managed control plane, whereas ECS has no cluster fee. For small workloads, ECS Fargate is materially cheaper if you don't need Kubernetes features.

Supported vCPU / Memory Combinations

vCPU	Allowed Memory (GB)	Hourly Cost (vCPU + min mem)	Monthly (30d, 1 task)
0.25	0.5, 1, 2	$0.0123	$8.89
0.5	1 – 4	$0.0247	$17.77
1	2 – 8	$0.0494	$35.55
2	4 – 16	$0.0987	$71.09
4	8 – 30	$0.1975	$142.19

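How in-between values are handled depends on the orchestrator. On ECS, the task-level cpu and memory must match a supported combination exactly, and a Fargate task definition with any other values is rejected at registration. On EKS, Fargate adds 256 MB to the pod's memory request for Kubernetes components and rounds the total up to the nearest supported combination, so a pod requesting 0.3 vCPU and 600 MB is billed as 0.5 vCPU / 1 GB. Either way you pay for the whole tier, so this is the first place cost optimization starts: size to exact supported values, not rough estimates.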

Real Monthly Cost Example: 20-Microservice Fleet

# Scenario: 20 microservices, each 0.5 vCPU / 1 GB, running 24/7
vCPU cost:   20 × 0.5 × $0.04048 × 720h = $291.46
Memory cost: 20 × 1.0 × $0.004445 × 720h = $64.01
Total On-Demand:  $355.47/month

# Same fleet, 100% Fargate Spot:
vCPU cost:   20 × 0.5 × $0.01214 × 720h =  $87.41
Memory cost: 20 × 1.0 × $0.00133 × 720h =  $19.15
Total Spot:  $106.56/month  (savings: ~70%)

# Realistic mixed strategy (70% Spot, 30% On-Demand):
Total mixed: ($106.56 × 0.7) + ($355.47 × 0.3) = $74.59 + $106.64 = $181.23/month
Savings vs pure On-Demand: ~49%
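
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch, not an official calculator: the function name and parameters are illustrative, the rates are the us-east-1 figures quoted in this guide, and Spot is modelled as a flat 70% discount even though real Spot prices vary.

```python
# Rates quoted in this guide for us-east-1 (Linux/x86); check current pricing before relying on them.
VCPU_HOUR = 0.04048
GB_HOUR = 0.004445


def fargate_monthly_cost(tasks, vcpu, mem_gb, hours=720.0,
                         spot_fraction=0.0, spot_discount=0.70):
    """Estimate monthly cost for `tasks` identical tasks.

    spot_fraction is the share of task-hours running on Fargate Spot;
    spot_discount is an assumed flat discount applied to that share.
    """
    on_demand = tasks * (vcpu * VCPU_HOUR + mem_gb * GB_HOUR) * hours
    return on_demand * (1.0 - spot_fraction * spot_discount)


if __name__ == "__main__":
    for label, share in [("On-Demand", 0.0), ("70% Spot mix", 0.7), ("All Spot", 1.0)]:
        cost = fargate_monthly_cost(20, 0.5, 1.0, spot_fraction=share)
        print(f"{label:>13}: ${cost:,.2f}/month")
```

The results differ from the hand calculation by a few cents because the table rounds the Spot rates before multiplying.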

Per-second billing is your friend for batch jobs: A batch task that runs for 8 minutes and 23 seconds is billed for 503 seconds of vCPU and memory (plus the time spent pulling the image), not a full hour. This makes Fargate cost-effective for event-driven and scheduled workloads compared to leaving an EC2 instance running all day.

Right-Sizing Containers: Finding and Eliminating Waste

Because Fargate bills on declared requests rather than actual utilization, every over-provisioned task definition is money spent on nothing. A task declaring 2 vCPU that actually uses 0.3 vCPU is 85% over-provisioned: you're paying for 6.7x the CPU it uses. The fix is systematic right-sizing using CloudWatch Container Insights metrics combined with a disciplined review cycle.

Enable Container Insights on Your ECS Cluster

# Enable Container Insights on new cluster
aws ecs create-cluster \
  --cluster-name production \
  --settings name=containerInsights,value=enabled

# Enable on existing cluster
aws ecs update-cluster-settings \
  --cluster production \
  --settings name=containerInsights,value=enabled

Query CPU and Memory Utilization via CloudWatch Insights

# CloudWatch Logs Insights query: run against /aws/ecs/containerinsights/{cluster}/performance
# Only task-level records are used so container and service records don't skew the averages
filter Type = "Task" and ispresent(ServiceName)
| fields CpuUtilized / CpuReserved * 100 as cpu_pct,
         MemoryUtilized / MemoryReserved * 100 as mem_pct
| stats
    avg(cpu_pct) as avg_cpu_pct,
    max(cpu_pct) as peak_cpu_pct,
    avg(mem_pct) as avg_mem_pct,
    max(mem_pct) as peak_mem_pct
  by ServiceName
| sort avg_cpu_pct asc
| limit 20

Review the results for services where both peak_cpu_pct is below 40% and peak_mem_pct is below 50% — those are your right-sizing targets. The general rule for Fargate right-sizing: set CPU request to 1.2× your observed p99 CPU utilization (in actual vCPU units), and set memory request to 1.3× your observed p99 memory usage. Always leave headroom — OOM kills cost more in incident time than the few cents of over-provisioning.
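
As a rough illustration of that rule, the sketch below applies the 1.2× CPU and 1.3× memory headroom and then picks the smallest supported size from the combinations table earlier in this guide. The function name and the `FARGATE_SIZES` mapping are assumptions made for this example, and the mapping only covers the sizes listed in that table.

```python
# vCPU -> allowed memory values in GB, taken from the combinations table in this guide.
FARGATE_SIZES = {
    0.25: [0.5, 1, 2],
    0.5: list(range(1, 5)),
    1: list(range(2, 9)),
    2: list(range(4, 17)),
    4: list(range(8, 31)),
}


def right_size(p99_vcpu, p99_mem_gb, cpu_headroom=1.2, mem_headroom=1.3):
    """Return the smallest (vcpu, mem_gb) combination that covers p99 usage plus headroom."""
    need_cpu = p99_vcpu * cpu_headroom
    need_mem = p99_mem_gb * mem_headroom
    for vcpu in sorted(FARGATE_SIZES):
        if vcpu < need_cpu:
            continue  # tier too small for the CPU requirement
        for mem in FARGATE_SIZES[vcpu]:
            if mem >= need_mem:
                return vcpu, mem
    raise ValueError("workload does not fit any size in FARGATE_SIZES")


if __name__ == "__main__":
    # Observed p99 of 0.3 vCPU and 0.4 GB needs 0.36 vCPU / 0.52 GB after headroom.
    print(right_size(0.3, 0.4))
```

Walking the tiers in ascending vCPU order means the first match is also the cheapest one, because every larger tier costs more for the same memory.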

Task Definition: Right-Sized Example

{
  "family": "api-service",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.1.0",
      "cpu": 256,
      "memory": 512,
      "memoryReservation": 384,
      "essential": true,
      "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
      "environment": [
        {"name": "JAVA_OPTS", "value": "-Xms256m -Xmx384m -XX:+UseContainerSupport"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/api-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "api"
        }
      }
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512"
}

Java workloads need special attention: Modern JVMs (Java 10 and later, and Java 8 from update 191) enable -XX:+UseContainerSupport by default and size the heap from the container's memory limit. Older JVMs size the heap from the host's RAM instead, which leads to OOM kills and misleading memory metrics. Even with container support, set explicit heap bounds (-Xms/-Xmx or -XX:MaxRAMPercentage) and leave room for non-heap memory such as metaspace and thread stacks, then let Container Insights show real memory usage.

Fargate Spot: 70% Discount for Interruption-Tolerant Workloads

Fargate Spot runs your containers on spare AWS capacity at up to 70% off On-Demand prices. When AWS needs the capacity back, it sends a two-minute interruption notice via the ECS task state change event, then terminates your task. For stateless microservices, batch jobs, CI/CD runners, and dev/staging environments, this interruption is perfectly acceptable — especially when combined with a mixed strategy that keeps a baseline of On-Demand tasks for critical traffic.

ECS Capacity Provider Strategy

The recommended pattern is a capacity provider strategy mixing FARGATE (On-Demand) and FARGATE_SPOT. The base parameter ensures a minimum number of On-Demand tasks always run; the weight parameters control how new tasks are distributed.

# Create capacity provider strategy for 70% Spot, 30% On-Demand
aws ecs put-cluster-capacity-providers \
  --cluster production \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=3,base=2 \
    capacityProvider=FARGATE_SPOT,weight=7,base=0

# Create/update a service using this strategy
aws ecs create-service \
  --cluster production \
  --service-name api-service \
  --task-definition api-service:5 \
  --desired-count 10 \
  --capacity-provider-strategy \
    capacityProvider=FARGATE,weight=3,base=2 \
    capacityProvider=FARGATE_SPOT,weight=7,base=0
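
To see what split a given base and weight combination produces, the following sketch assigns the base tasks first and divides the remainder in proportion to the weights. It is an approximation with made-up names: ECS places whole tasks and its exact rounding is not modelled, so treat the fractional output as the expected long-run share.

```python
def expected_split(desired_count, strategy):
    """Approximate tasks per capacity provider for a base/weight strategy.

    strategy is a list of dicts with keys: capacity_provider, weight, base.
    """
    remaining = desired_count
    split = {}
    # Base tasks are satisfied first.
    for item in strategy:
        take = min(item.get("base", 0), remaining)
        split[item["capacity_provider"]] = float(take)
        remaining -= take
    # Whatever is left is shared in proportion to the weights.
    total_weight = sum(item["weight"] for item in strategy)
    if remaining > 0 and total_weight > 0:
        for item in strategy:
            split[item["capacity_provider"]] += remaining * item["weight"] / total_weight
    return split


STRATEGY = [
    {"capacity_provider": "FARGATE", "weight": 3, "base": 2},
    {"capacity_provider": "FARGATE_SPOT", "weight": 7, "base": 0},
]

if __name__ == "__main__":
    for count in (10, 100):
        print(count, expected_split(count, STRATEGY))
```

With 10 desired tasks the base of 2 pulls the Spot share down to roughly 56%; the split only approaches 70/30 as the task count grows.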

Handling Spot Interruptions Gracefully

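Spot interruption warnings are delivered as ECS task state change events to EventBridge and as a SIGTERM signal to the task's containers. How long your application then has before SIGKILL is set by the container's stopTimeout, which defaults to 30 seconds and can be raised to 120 seconds on Fargate. Set stopTimeout to 120 in the container definition if you want the full two minutes to finish in-flight requests and shut down cleanly. The steps to be interruption-ready: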

// 1. Listen for SIGTERM in your application (Node.js example)
process.on('SIGTERM', () => {
  console.log('Received SIGTERM, draining connections...');
  // Stop accepting new connections, then finish in-flight requests
  server.close(async () => {
    await db.end();  // close DB pool
    console.log('Graceful shutdown complete');
    process.exit(0);
  });
  // Force exit if drain takes too long; keep this below the container's stopTimeout
  setTimeout(() => process.exit(1), 90_000);
});

# 2. Give replacement tasks time to start before load balancer health checks count against them
aws ecs update-service \
  --cluster production \
  --service api-service \
  --health-check-grace-period-seconds 60

# 3. Set ALB deregistration delay to 30s (default 300s is too long)
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:... \
  --attributes Key=deregistration_delay.timeout_seconds,Value=30
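
Queue consumers and batch workers need the same handling even though they have no HTTP server to close. The sketch below shows one way to do it in Python, assuming a hypothetical `fetch_job` / `process_job` pair supplied by your application: the signal handler only sets a flag, and the loop finishes the job in hand before exiting.

```python
import signal
import threading

# Set by the SIGTERM handler; the worker loop checks it between jobs.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the request and return quickly; cleanup happens in the main loop."""
    shutdown_requested.set()


# Must be registered from the main thread.
signal.signal(signal.SIGTERM, handle_sigterm)


def worker_loop(fetch_job, process_job, poll_seconds=1.0):
    """Process jobs until SIGTERM arrives, never abandoning a job midway.

    fetch_job() returns a job or None when the queue is empty;
    process_job(job) does the work. Both are placeholders for your own code.
    """
    while not shutdown_requested.is_set():
        job = fetch_job()
        if job is None:
            # Sleep, but wake immediately if shutdown is requested.
            shutdown_requested.wait(poll_seconds)
            continue
        process_job(job)
```

Keep individual jobs shorter than the container's stopTimeout, or make them safe to retry, since the task is killed once that timeout expires.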

Terraform: Mixed Capacity Strategy

resource "aws_ecs_cluster_capacity_providers" "production" {
  cluster_name       = aws_ecs_cluster.production.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    weight            = 3
    base              = 2   # always keep 2 On-Demand tasks as baseline
  }

  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 7
    base              = 0
  }
}

resource "aws_ecs_service" "api" {
  name            = "api-service"
  cluster         = aws_ecs_cluster.production.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 10

  capacity_provider_strategy {
    capacity_provider = "FARGATE"
    weight            = 3
    base              = 2
  }

  capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 7
    base              = 0
  }
}

Best workloads for 100% Fargate Spot: Batch data processing, CI/CD test runners, dev/staging environments, asynchronous queue consumers (SQS, Kinesis), nightly ETL jobs, and ML inference jobs with retry logic. Avoid pure Spot for databases, cache layers, or synchronous APIs without a load balancer retry budget.

Graviton on Fargate: ARM64 for 20% More Savings

Fargate tasks on AWS Graviton processors (ARM64 architecture) are billed at published rates roughly 20% below the equivalent x86_64 rates: about $0.03238 per vCPU-hour and $0.00356 per GB-hour in us-east-1, versus $0.04048 and $0.004445. Graviton also tends to deliver equal or better performance per vCPU for many workloads, so some services can drop to a smaller task size as well. Fargate Spot supports ARM64 tasks too, so the two discounts can be combined.

Enabling Graviton in a Task Definition

{
  "family": "api-service-arm",
  "runtimePlatform": {
    "operatingSystemFamily": "LINUX",
    "cpuArchitecture": "ARM64"
  },
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.1.0-arm64",
      "cpu": 256,
      "memory": 512,
      "essential": true,
      "portMappings": [{"containerPort": 8080}]
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512"
}

Building Multi-Architecture Images with Docker Buildx

# Set up buildx for cross-compilation
docker buildx create --use --name multi-arch-builder

# Build and push both amd64 and arm64 in one command
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag 123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.1.0 \
  --push \
  .

# Or use separate tags for each arch
docker buildx build --platform linux/arm64 \
  --tag 123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.1.0-arm64 \
  --push .

Migration Steps from x86 to Graviton Fargate

1. Audit dependencies: Most JVM, Node.js, Python, and Go workloads work without code changes. Native extensions (C/C++ addons, custom JNI libs) need recompilation. Check with file ./binary; the output should include ARM aarch64.
2. Build ARM64 image: Use Docker Buildx or a CI/CD pipeline matrix (GitHub Actions: runs-on: ubuntu-latest with QEMU emulation for cross-compile, or native ARM runners).
3. Create a parallel Fargate service with the ARM64 task definition and route a small traffic slice via weighted target groups.
4. Validate performance: Compare CloudWatch Container Insights CPU utilization between the x86 and ARM64 services under equivalent load. Expect 5–30% better vCPU efficiency.
5. Cut over: Shift traffic fully to the ARM64 service and decommission the x86 service.

Not all runtimes are equal on Graviton: Java 17+ and Go 1.19+ have excellent ARM64 performance. Python 3.10+ is near-identical on both architectures. Node.js 18+ performs well. Older runtimes and frameworks with heavy JIT compilation (e.g., .NET Framework on Mono) may see smaller gains or need tuning.

Compute Savings Plans: Committed Spend Discounts

AWS Compute Savings Plans apply automatically to Fargate usage (ECS and EKS) in addition to EC2 and Lambda. You commit to a consistent hourly spend (e.g., $5/hour) for 1 or 3 years, and AWS applies discounts of up to 52% off On-Demand rates in exchange for that commitment. Unlike Reserved Instances, Savings Plans require no instance type, region, or OS specification — the discount applies to any Fargate workload automatically.

Fargate Savings Plan Discount Tiers

Plan Type	Term	Payment	Fargate Discount
Compute Savings Plan	1 year	No upfront	~17%
Compute Savings Plan	1 year	All upfront	~21%
Compute Savings Plan	3 year	No upfront	~40%
Compute Savings Plan	3 year	All upfront	~52%

Purchase Strategy: What to Commit

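# Step 1: Check your past 30 days of Fargate spend
# ECS Fargate usage is normally reported under "Amazon Elastic Container Service";
# confirm the SERVICE value for your account with `aws ce get-dimension-values`
aws ce get-cost-and-usage \
  --time-period Start=2026-05-01,End=2026-06-01 \
  --granularity MONTHLY \
  --filter '{"Dimensions":{"Key":"SERVICE","Values":["Amazon Elastic Container Service"]}}' \
  --metrics "UnblendedCost"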

# Step 2: Find your minimum hourly Fargate spend (your commitment baseline)
# Example output: $720/month → $1.00/hour minimum
# Rule of thumb: commit to 70% of your minimum baseline spend
# Reserve Spot + On-Demand flexibility for the remaining 30%

# Step 3: View Savings Plan recommendations in Cost Explorer
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type COMPUTE_SP \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT \
  --lookback-period-in-days THIRTY_DAYS

Break-Even Analysis: When Do Savings Plans Pay Off?

# Scenario: $1,000/month Fargate On-Demand baseline
# 1-year Compute SP, all upfront: 21% discount

Annual On-Demand cost:   $1,000 × 12 = $12,000
Annual SP cost (21% off): $12,000 × 0.79 = $9,480
Annual savings:          $2,520
Break-even:              ~Month 9.5 (cumulative On-Demand spend passes the $9,480 prepayment)

# 3-year SP, all upfront: 52% discount
3-year On-Demand:  $36,000
3-year SP cost:    $36,000 × 0.48 = $17,280
3-year savings:    $18,720 ($6,240/year)
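
The same calculation generalizes to any baseline and discount. This is a small sketch with illustrative names; it assumes an all-upfront payment and a perfectly steady baseline, and the discount percentages are the approximate tiers from the table above.

```python
def all_upfront_break_even(monthly_on_demand, discount, term_months):
    """Return (upfront_cost, break_even_month, total_savings) for an all-upfront plan.

    break_even_month is when cumulative On-Demand spend would have
    equalled the prepayment; every month after that is net savings.
    """
    on_demand_total = monthly_on_demand * term_months
    upfront = on_demand_total * (1.0 - discount)
    break_even_month = upfront / monthly_on_demand
    return upfront, break_even_month, on_demand_total - upfront


if __name__ == "__main__":
    for label, discount, months in [("1-year", 0.21, 12), ("3-year", 0.52, 36)]:
        upfront, month, savings = all_upfront_break_even(1000, discount, months)
        print(f"{label}: prepay ${upfront:,.0f}, break even at month {month:.1f}, "
              f"save ${savings:,.0f}")
```

If usage drops below the committed baseline before the break-even month, the plan costs more than staying On-Demand, which is why committing to only part of the baseline is the safer default.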

Combine Savings Plans with Fargate Spot: Savings Plans apply to On-Demand Fargate tasks. Spot tasks are already discounted by up to 70%, and Savings Plans do not stack on top of Spot pricing. The optimal strategy is Savings Plans for your On-Demand baseline plus Spot for burst and variable capacity. This combination can achieve 55–65% overall cost reduction vs pure On-Demand.

Auto Scaling to Zero: Pay Nothing for Idle Environments

Fargate's per-second billing means you pay nothing for compute when zero tasks are running. Dev, staging, and preview environments that run 24/7 are often the largest waste on a Fargate bill: a 5-service staging environment at 0.5 vCPU / 1 GB each runs about $89/month On-Demand. Run it only 11 hours a day on weekdays (55 of 168 hours per week, so it is off about 67% of the time) and that drops to about $29/month. The challenge is cold starts, since Fargate task startup typically takes 30–90 seconds. The solutions depend on your acceptable latency.
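
A quick way to check what a schedule is worth is to compute the fraction of the week the environment stays up. The helper below is a sketch with illustrative names; the $88.87 input is the always-on cost of five 0.5 vCPU / 1 GB tasks at the rates used in this guide.

```python
HOURS_PER_WEEK = 7 * 24


def scheduled_cost(always_on_monthly, hours_per_day, days_per_week):
    """Return (fraction of the week running, estimated monthly cost on that schedule)."""
    on_fraction = hours_per_day * days_per_week / HOURS_PER_WEEK
    return on_fraction, always_on_monthly * on_fraction


if __name__ == "__main__":
    # 8am to 7pm, Monday to Friday.
    on_fraction, cost = scheduled_cost(88.87, hours_per_day=11, days_per_week=5)
    print(f"running {on_fraction:.0%} of the week, about ${cost:.2f}/month")
```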

ECS Scheduled Scaling for Non-Production

# Register a scalable target for the ECS service
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/staging/api-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 0 \
  --max-capacity 10

# Scale UP at 8am UTC on weekdays
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --resource-id service/staging/api-service \
  --scalable-dimension ecs:service:DesiredCount \
  --scheduled-action-name scale-up-morning \
  --schedule "cron(0 8 ? * MON-FRI *)" \
  --scalable-target-action MinCapacity=2,MaxCapacity=10

# Scale DOWN to ZERO at 7pm UTC on weekdays
aws application-autoscaling put-scheduled-action \
  --service-namespace ecs \
  --resource-id service/staging/api-service \
  --scalable-dimension ecs:service:DesiredCount \
  --scheduled-action-name scale-down-evening \
  --schedule "cron(0 19 ? * MON-FRI *)" \
  --scalable-target-action MinCapacity=0,MaxCapacity=0

Lambda Warmup Pattern for Scale-to-Zero Production APIs

# Lambda function to pre-warm a Fargate service before traffic arrives
# Set the function timeout to at least 5 minutes: the waiter below can block for up to 300 s
import boto3

ecs = boto3.client('ecs')

def handler(event, context):
    # Scale up the service
    ecs.update_service(
        cluster='production',
        service='api-service',
        desiredCount=2
    )

    # Wait for tasks to become RUNNING
    waiter = ecs.get_waiter('services_stable')
    waiter.wait(
        cluster='production',
        services=['api-service'],
        WaiterConfig={'Delay': 15, 'MaxAttempts': 20}
    )

    print('Service warmed up and stable')
    return {'statusCode': 200, 'body': 'Warmed up'}

# Schedule this Lambda to run 5 minutes before expected traffic
# e.g., every weekday at 7:55am UTC

Terraform: Scale-to-Zero for Dev Environment

resource "aws_appautoscaling_target" "dev_api" {
  max_capacity       = 5
  min_capacity       = 0
  resource_id        = "service/${aws_ecs_cluster.dev.name}/${aws_ecs_service.dev_api.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_scheduled_action" "dev_scale_up" {
  name               = "dev-scale-up"
  service_namespace  = aws_appautoscaling_target.dev_api.service_namespace
  resource_id        = aws_appautoscaling_target.dev_api.resource_id
  scalable_dimension = aws_appautoscaling_target.dev_api.scalable_dimension
  schedule           = "cron(0 8 ? * MON-FRI *)"

  scalable_target_action {
    min_capacity = 1
    max_capacity = 5
  }
}

resource "aws_appautoscaling_scheduled_action" "dev_scale_down" {
  name               = "dev-scale-down"
  service_namespace  = aws_appautoscaling_target.dev_api.service_namespace
  resource_id        = aws_appautoscaling_target.dev_api.resource_id
  scalable_dimension = aws_appautoscaling_target.dev_api.scalable_dimension
  schedule           = "cron(0 19 ? * MON-FRI *)"

  scalable_target_action {
    min_capacity = 0
    max_capacity = 0
  }
}

Container Image Optimization: Faster Starts, Lower Bills

Fargate billing starts when the image pull begins, so task startup time (the period before your application is ready) costs money as well as availability. A large container image (1–3 GB) can take 30–90 seconds to pull from ECR, during which you're paying for the task but getting no work done. Smaller images also mean lower ECR storage and data transfer costs. Optimized images can cut cold start time by 60–80% and reduce ECR storage in proportion to the size reduction.

Multi-Stage Dockerfile: Java Spring Boot Example

# Stage 1: Build (the Maven image is used because eclipse-temurin images do not include mvn)
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
# Resolve dependencies first so this layer stays cached until pom.xml changes
COPY pom.xml .
RUN mvn dependency:go-offline -q
COPY src ./src
RUN mvn package -DskipTests -q

# Stage 2: Extract layered JAR (Spring Boot 3.x layertools)
FROM eclipse-temurin:21-jdk-alpine AS extractor
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
RUN java -Djarmode=layertools -jar app.jar extract

# Stage 3: Minimal runtime image
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Copy layers in order of least-to-most frequently changing
COPY --from=extractor /app/dependencies/ ./
COPY --from=extractor /app/spring-boot-loader/ ./
COPY --from=extractor /app/snapshot-dependencies/ ./
COPY --from=extractor /app/application/ ./

EXPOSE 8080
ENTRYPOINT ["java", \
  "-XX:+UseContainerSupport", \
  "-XX:MaxRAMPercentage=75.0", \
  "-XX:+UseG1GC", \
  "org.springframework.boot.loader.launch.JarLauncher"]

Distroless Images for Node.js and Go

# Go: final image is ~8 MB (vs 300 MB with golang:1.22)
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# GOARCH is left unset so the binary matches the build platform (amd64 or arm64)
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o server .

FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]

# Node.js: use node:20-alpine3.18 + prune dev dependencies
FROM node:20-alpine3.18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

FROM node:20-alpine3.18
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY package.json .
EXPOSE 3000
CMD ["node", "dist/index.js"]

ECR Image Size Comparison

Approach	Image Size	Pull Time (Fargate)	ECR Storage/mo
JDK full + fat JAR (naive)	~650 MB	45–80s	$0.065
JRE Alpine + layered JAR	~180 MB	15–25s	$0.018
Go distroless static	~8 MB	2–4s	<$0.001
Node.js Alpine pruned	~95 MB	8–12s	$0.0095

Don't rely on image caching: Fargate does not keep a shared image cache between task launches, so every new task pulls its image from the registry and image size directly affects startup time. For large images that are hard to shrink, Seekable OCI (SOCI) lazy loading can let a task start before the full image has downloaded. Reference images by immutable digest (@sha256:...) rather than floating tags like :latest so every task runs exactly the build you tested. Also enable ECR lifecycle policies to delete untagged images and images older than 30 days, which prevents ECR storage from silently accumulating.
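
A lifecycle policy along those lines can be generated as JSON and applied with `aws ecr put-lifecycle-policy`. The sketch below builds one in Python; the function name and the 7-day window for untagged images are assumptions for illustration, and a blanket 30-day expiry will also remove images that are still deployed unless they are rebuilt regularly, so adjust the rules before using them.

```python
import json


def build_lifecycle_policy(untagged_days=7, max_age_days=30):
    """Build an ECR lifecycle policy document as a Python dict."""
    def expire_rule(priority, description, tag_status, days):
        return {
            "rulePriority": priority,
            "description": description,
            "selection": {
                "tagStatus": tag_status,
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": days,
            },
            "action": {"type": "expire"},
        }

    return {
        "rules": [
            expire_rule(1, f"Expire untagged images after {untagged_days} days",
                        "untagged", untagged_days),
            # A rule with tagStatus "any" must have the highest priority number.
            expire_rule(2, f"Expire any image older than {max_age_days} days",
                        "any", max_age_days),
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_policy(), indent=2))
```

Save the output to a file and pass it with `--lifecycle-policy-text file://policy.json` along with the repository name.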

Bin Packing Tasks: Sidecar Patterns and Shared Resources

Each Fargate task is billed as a unit: the total vCPU and memory declared at the task level, not per container within the task. This creates an opportunity to consolidate related work into a single task — "bin packing" — so that multiple logical components share the same billed resources instead of each running as a separate task with its own overhead.

Sidecar Patterns That Save Money

{
  "family": "app-with-sidecar",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "myapp:latest",
      "cpu": 384,
      "memory": 768,
      "essential": true,
      "portMappings": [{"containerPort": 8080}]
    },
    {
      "name": "log-router",
      "image": "amazon/aws-for-fluent-bit:latest",
      "cpu": 64,
      "memory": 128,
      "essential": false,
      "environment": [
        {"name": "FLB_LOG_LEVEL", "value": "warn"}
      ]
    },
    {
      "name": "envoy-proxy",
      "image": "public.ecr.aws/appmesh/aws-appmesh-envoy:latest",
      "cpu": 64,
      "memory": 128,
      "essential": false
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc"
}

In this example, the whole task is billed as 0.5 vCPU and 1 GB. Running the app, log router, and proxy as three separate 0.25 vCPU / 0.5 GB tasks would cost 1.5× as much (about $26.66 versus $17.77 per month), a saving of about a third here and more when each standalone task would need a larger tier. The key constraint: all containers in a task share the task's CPU and memory pool, and they start and stop together, so make sure each sidecar is truly coupled to the main container's lifecycle.
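
The comparison between one packed task and three separate minimum-size tasks is easy to verify. This sketch uses the us-east-1 On-Demand rates quoted in this guide; the function name is illustrative.

```python
VCPU_HOUR = 0.04048
GB_HOUR = 0.004445


def task_monthly_cost(vcpu, mem_gb, hours=720.0):
    """On-Demand cost of one task running for `hours` hours."""
    return (vcpu * VCPU_HOUR + mem_gb * GB_HOUR) * hours


# One task holding the app plus both sidecars.
packed = task_monthly_cost(0.5, 1.0)
# Three separate tasks at the smallest Fargate size.
separate = 3 * task_monthly_cost(0.25, 0.5)

if __name__ == "__main__":
    print(f"packed:   ${packed:.2f}/month")
    print(f"separate: ${separate:.2f}/month ({separate / packed:.1f}x)")
```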

Avoid Over-Provisioned Single-Container Tasks

# BEFORE: each microservice as a bloated standalone task
# Service A: 1 vCPU / 2 GB  (actual usage: 0.15 vCPU / 400 MB)
# Service B: 1 vCPU / 2 GB  (actual usage: 0.10 vCPU / 300 MB)
# Monthly: 2 × (1 × $0.04048 + 2 × $0.004445) × 720 = $71.09

# AFTER: right-sized and co-located where lifecycle allows
# Service A: 0.25 vCPU / 0.5 GB
# Service B: 0.25 vCPU / 0.5 GB
# Monthly: 2 × (0.25 × $0.04048 + 0.5 × $0.004445) × 720 = $17.77
# Savings: 75%

Cost Visibility: Tagging, Container Insights, and Cost Explorer

You cannot optimize what you cannot measure. Before applying any of the techniques above, establish a cost visibility layer that tells you exactly which services, teams, and environments are spending what. The three pillars are: a consistent tagging strategy, Container Insights metrics, and Cost Explorer filters.

Tagging Strategy for Fargate

# Apply tags at ECS task, service, and cluster level
aws ecs tag-resource \
  --resource-arn arn:aws:ecs:us-east-1:123456789012:cluster/production \
  --tags \
    key=Environment,value=production \
    key=Team,value=platform \
    key=CostCenter,value=engineering \
    key=Project,value=api-gateway

# Tag a service
aws ecs tag-resource \
  --resource-arn arn:aws:ecs:us-east-1:123456789012:service/production/api-service \
  --tags \
    key=Service,value=api \
    key=Environment,value=production

# Enable cost allocation tags in Billing Console
# (required for tags to appear in Cost Explorer)
aws ce list-cost-allocation-tags  # verify tags are active

Container Insights Cost Dashboard Query

# CloudWatch Logs Insights: approximate On-Demand cost per task, by service
# Requires Container Insights enabled on the cluster
# CpuReserved is in CPU units (1024 = 1 vCPU); MemoryReserved is in MB
# Multiply the result by the service's task count for the service total
filter Type = "Task" and ispresent(ServiceName)
| fields (CpuReserved / 1024 * 0.04048 + MemoryReserved / 1024 * 0.004445) * 720 as task_monthly_cost
| stats avg(task_monthly_cost) as monthly_cost_per_task by ServiceName
| sort monthly_cost_per_task desc
| limit 20

Cost Explorer: Filter by ECS Cluster and Service

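# CLI: get production Fargate costs grouped by the Service tag for the last 30 days
# ECS Fargate usage is normally reported under "Amazon Elastic Container Service";
# confirm the SERVICE value for your account with `aws ce get-dimension-values`
aws ce get-cost-and-usage \
  --time-period Start=2026-05-01,End=2026-06-01 \
  --granularity DAILY \
  --filter '{
    "And": [
      {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Elastic Container Service"]}},
      {"Tags": {"Key": "Environment", "Values": ["production"]}}
    ]
  }' \
  --group-by Type=TAG,Key=Service \
  --metrics "UnblendedCost" "UsageQuantity"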

Use AWS Cost Anomaly Detection for Fargate: Sudden spikes in Fargate spend (runaway tasks, infinite retry loops, forgotten dev environments) are common. Set up an anomaly monitor for the Fargate service with a $50 alert threshold. This alone has caught thousands of dollars in waste across engineering teams before the monthly bill arrives.

Fargate vs EC2 vs Lambda: Break-Even Analysis

Fargate is not always the right compute choice. Understanding the break-even points helps you route workloads to the cheapest option without sacrificing operational simplicity. The three primary options for container and serverless workloads are ECS/EKS Fargate, EC2 managed node groups, and AWS Lambda (for very short-lived functions).

Cost Comparison Table: 10 vCPU Equivalent Workload

Option	Configuration	Monthly Cost	Operational Overhead
Fargate On-Demand	20 × 0.5 vCPU / 1 GB, 24/7	~$355	None (serverless)
Fargate Spot (70%)	Same, all Spot	~$107	None + interruption handling
Fargate mixed 70/30	70% Spot, 30% On-Demand	~$181	None
EC2 t3.2xlarge (x2)	2 × 8 vCPU / 32 GB On-Demand	~$479	Node patching, capacity mgmt
EC2 t3.2xlarge Spot	2 × 8 vCPU / 32 GB Spot	~$144	Same + spot handling
EC2 + Savings Plan (3yr)	2 × t3.2xlarge, 3yr all upfront	~$230	Committed spend, less flex
Lambda (10M invocations/mo)	512 MB, 500ms avg duration	~$44	Cold starts, 15min limit

Break-Even: When to Switch from Fargate to EC2

# Rule of thumb calculation
# EC2 break-even: when your Fargate On-Demand bill exceeds
# the EC2 cost + 20% for operational overhead of node management

fargate_monthly = task_count * (0.5 * 0.04048 + 1.0 * 0.004445) * 720

# Example: 40 × 0.5 vCPU / 1 GB tasks
fargate = 40 * 0.5 * 0.04048 * 720 + 40 * 1 * 0.004445 * 720
fargate = 583 + 128 = $711/month

# Equivalent EC2: 4 × m5.2xlarge (8 vCPU / 32 GB each, On-Demand)
ec2 = 4 * $0.384 * 720 = $1,106/month  # too expensive, try smaller

# Try 3 × m5.xlarge (4 vCPU / 16 GB each)
# Note: 12 vCPU is less than the 20 vCPU the tasks reserve, so this only works
# if real CPU usage is low enough for tasks to share cores on EC2
ec2 = 3 * $0.192 * 720 = $415/month
# + Karpenter/node management overhead: 20% = $83
# Total EC2: $498/month vs Fargate $711/month → EC2 wins at this scale

# Decision matrix:
# < 10 tasks:   Fargate wins (no EC2 overhead, lower total cost)
# 10-30 tasks:  Mixed strategy (Fargate Spot + small On-Demand baseline)
# > 30 tasks:   EC2 Spot node groups likely cheaper (with Karpenter autoscaling)
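
The outcome of this comparison depends heavily on how tightly tasks can be packed onto nodes. The sketch below makes that assumption explicit with a `cpu_overcommit` factor, a hypothetical parameter introduced here: 1.0 means every task gets its full reserved vCPU, and higher values assume tasks share cores because real usage is below the reservation. Instance prices are the On-Demand figures used in the example above.

```python
import math

VCPU_HOUR = 0.04048
GB_HOUR = 0.004445


def fargate_monthly(tasks, vcpu, mem_gb, hours=720.0):
    """Fargate On-Demand cost for `tasks` identical tasks."""
    return tasks * (vcpu * VCPU_HOUR + mem_gb * GB_HOUR) * hours


def ec2_monthly(tasks, task_vcpu, task_mem_gb, node_vcpu, node_mem_gb,
                node_hourly, overhead=0.20, cpu_overcommit=1.0, hours=720.0):
    """Return (node_count, monthly cost including operational overhead)."""
    nodes_for_cpu = math.ceil(tasks * task_vcpu / (node_vcpu * cpu_overcommit))
    nodes_for_mem = math.ceil(tasks * task_mem_gb / node_mem_gb)
    nodes = max(nodes_for_cpu, nodes_for_mem)
    return nodes, nodes * node_hourly * hours * (1.0 + overhead)


if __name__ == "__main__":
    print(f"Fargate: ${fargate_monthly(40, 0.5, 1.0):.2f}")
    for factor in (1.0, 1.7):
        nodes, cost = ec2_monthly(40, 0.5, 1.0, 4, 16, 0.192, cpu_overcommit=factor)
        print(f"EC2 m5.xlarge, overcommit {factor}: {nodes} nodes, ${cost:.2f}")
```

With no overcommit the 40-task fleet needs five m5.xlarge nodes and Fargate comes out cheaper; EC2 only wins once utilization is low enough to pack the same tasks onto three nodes.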

Lambda vs Fargate Decision Framework

# Use Lambda when:
# - Task duration < 15 minutes
# - Invocations are sporadic (< 100/day steady state)
# - Cold start latency is acceptable (100-500ms)
# - Memory fits in 10 GB

# Use Fargate when:
# - Task runs > 15 minutes
# - Need persistent TCP connections (WebSockets, gRPC streaming)
# - Memory > 10 GB needed
# - Need to run multiple containers together
# - Consistent sub-10ms response time required (warm containers)

# Lambda cost formula for comparison:
# Cost = (invocations × $0.0000002) + (GB-seconds × $0.0000166667)
# 1M invocations × 512MB × 1s avg = $0.20 + $8.33 = $8.53/month
# Same on Fargate (continuous): 0.5 vCPU / 1 GB × 720h = $17.77/month
# → Lambda wins for sporadic workloads, Fargate wins for continuous processing
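
To find where the two options cross over for your own workload, the formula above can be turned into a small calculator. This is a sketch with illustrative names: it uses the Lambda and Fargate rates quoted in this guide, ignores the Lambda free tier, and assumes the Fargate task runs continuously.

```python
LAMBDA_PER_REQUEST = 0.0000002
LAMBDA_PER_GB_SECOND = 0.0000166667
VCPU_HOUR = 0.04048
GB_HOUR = 0.004445


def lambda_monthly_cost(invocations, memory_gb, avg_seconds):
    """Request charge plus duration charge, before any free tier."""
    gb_seconds = invocations * memory_gb * avg_seconds
    return invocations * LAMBDA_PER_REQUEST + gb_seconds * LAMBDA_PER_GB_SECOND


def fargate_monthly_cost(vcpu, mem_gb, hours=720.0):
    """One always-on Fargate task at On-Demand rates."""
    return (vcpu * VCPU_HOUR + mem_gb * GB_HOUR) * hours


def break_even_invocations(memory_gb, avg_seconds, fargate_cost):
    """Monthly invocations at which Lambda costs the same as the Fargate task."""
    per_invocation = LAMBDA_PER_REQUEST + memory_gb * avg_seconds * LAMBDA_PER_GB_SECOND
    return fargate_cost / per_invocation


if __name__ == "__main__":
    fargate = fargate_monthly_cost(0.5, 1.0)
    print(f"Lambda, 1M x 512 MB x 1 s: ${lambda_monthly_cost(1_000_000, 0.5, 1.0):.2f}")
    print(f"Fargate 0.5 vCPU / 1 GB:   ${fargate:.2f}")
    print(f"Break-even: {break_even_invocations(0.5, 1.0, fargate):,.0f} invocations/month")
```

For this profile the crossover sits at roughly two million invocations a month; below that Lambda is cheaper, above it a single warm Fargate task costs less, provided it can keep up with the load.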

Recommended optimization roadmap: Start with right-sizing (free, immediate impact). Add Fargate Spot for non-critical services (50–70% savings on those tasks). Migrate dev/staging to scale-to-zero schedules (60–70% savings on those environments). Switch to Graviton ARM64 for compatible services (15–20% additional savings). Purchase Compute Savings Plans for the steady-state On-Demand baseline (17–52% discount). Fully applied, this stack of optimizations routinely produces 50–65% total Fargate bill reduction.