AWS ECS vs EKS vs Lambda: Choosing the Right Compute (2026)
Three AWS services dominate backend compute discussions in 2026: Amazon ECS (managed container orchestration), Amazon EKS (managed Kubernetes), and AWS Lambda (serverless functions). Each one solves a real problem, but they solve different problems — and choosing the wrong one early means painful re-architecture later. This guide gives you the architecture deep-dive, cost model, and decision framework you need to pick with confidence.
Architecture Overview
Before comparing them, it helps to understand what abstraction layer each service operates at. AWS has a layered compute stack — each layer trades control for operational simplicity.
| Dimension | ECS | EKS | Lambda |
|---|---|---|---|
| Abstraction level | Container (AWS-native) | Container (Kubernetes API) | Function (managed runtime) |
| Deployment unit | Task (1+ containers) | Pod (1+ containers) | Function handler |
| Infrastructure model | Fargate (serverless) or EC2 | Fargate profiles or EC2 node groups | Fully serverless (no infra) |
| Scaling model | Task count (Application Auto Scaling) | Pod count (HPA) + node count (Cluster Autoscaler / Karpenter) | Concurrent executions (automatic) |
| Max execution time | Unlimited | Unlimited | 15 minutes |
| Control plane cost | Free | $0.10/hr (~$73/month) | Free |
| Cold start | Slow (30–90 sec for new task) | Slow (30–90 sec for new pod) | Milliseconds to seconds |
| Vendor lock-in | High (AWS-only API) | Low (CNCF standard) | High (AWS event model) |
ECS: Tasks, Services, and Clusters
Amazon ECS models your workload as a cluster of compute capacity, divided into services that run tasks. A task is the atomic unit — it runs one or more containers as defined by a task definition JSON blueprint.
ECS with Fargate Launch Type
Fargate removes EC2 management entirely. AWS provisions, patches, and scales the underlying compute. You define CPU and memory at the task level:
{
"family": "payment-service",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "1024",
"memory": "2048",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/paymentServiceRole",
"containerDefinitions": [
{
"name": "payment-api",
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:v2.3.1",
"essential": true,
"portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
"environment": [{"name": "ENV", "value": "prod"}],
"secrets": [
{"name": "DB_URL", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/db-url"},
{"name": "STRIPE_KEY", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe-key"}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/payment-service",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8080/actuator/health || exit 1"],
"interval": 30, "timeout": 5, "retries": 3
}
}
]
}
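Fargate only accepts certain CPU and memory pairings at the task level, so a task definition like the one above can be rejected at registration time if the two values do not match. The check below is a minimal sketch: the table reflects the Fargate size combinations as documented at the time of writing, and the helper name `is_valid_fargate_size` is hypothetical, not part of any AWS SDK.

```python
# Fargate task sizes: CPU units -> allowed memory values in MiB.
# Assumption: these ranges mirror the ECS task-size documentation; verify before relying on them.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def is_valid_fargate_size(cpu: str, memory: str) -> bool:
    """Return True if the task-level cpu/memory strings form a supported Fargate size."""
    return int(memory) in FARGATE_SIZES.get(int(cpu), [])


# The payment-service task definition uses cpu "1024" and memory "2048".
payment_service_ok = is_valid_fargate_size("1024", "2048")
```

Running a check like this in CI catches size mismatches before `register-task-definition` does.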
ECS Service with Auto Scaling
An ECS Service keeps the desired task count running and handles rolling deployments. Attach Application Auto Scaling to scale based on CPU, memory, or custom metrics:
# Create the ECS service
aws ecs create-service \
--cluster prod \
--service-name payment-service \
--task-definition payment-service:5 \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-aaa111,subnet-bbb222],securityGroups=[sg-web999],assignPublicIp=DISABLED}" \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/payment-tg/abc,containerName=payment-api,containerPort=8080"
# Register as a scalable target
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/prod/payment-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
# Target tracking: keep average CPU at 60%
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/prod/payment-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 60.0,
"PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'
EKS: Control Plane, Node Groups, and Fargate Profiles
Amazon EKS runs a fully managed Kubernetes control plane (etcd, API server, scheduler, controller manager) across multiple Availability Zones. You pay $0.10/hour for this regardless of workload size, as long as the cluster runs a Kubernetes version in standard support; versions in extended support are billed at a higher hourly rate. Your workloads run on EC2-backed managed node groups, Fargate profiles, or both.
EKS Cluster and Node Group (Terraform)
# eks-cluster.tf
resource "aws_eks_cluster" "main" {
name = "prod-cluster"
role_arn = aws_iam_role.eks_cluster.arn
version = "1.30"
vpc_config {
subnet_ids = var.private_subnet_ids
endpoint_private_access = true
endpoint_public_access = false
}
enabled_cluster_log_types = ["api", "audit", "authenticator"]
}
resource "aws_eks_node_group" "general" {
cluster_name = aws_eks_cluster.main.name
node_group_name = "general-ng"
node_role_arn = aws_iam_role.eks_node.arn
subnet_ids = var.private_subnet_ids
instance_types = ["m6i.xlarge"]
capacity_type = "ON_DEMAND"
scaling_config {
desired_size = 3
max_size = 15
min_size = 2
}
update_config {
max_unavailable = 1
}
}
# Fargate profile for batch jobs
resource "aws_eks_fargate_profile" "batch" {
cluster_name = aws_eks_cluster.main.name
fargate_profile_name = "batch-profile"
pod_execution_role_arn = aws_iam_role.fargate_pod.arn
subnet_ids = var.private_subnet_ids
selector {
namespace = "batch"
labels = { compute = "fargate" }
}
}
Deploying a Microservice on EKS
# order-service-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      serviceAccountName: order-service-sa   # IRSA: maps to IAM role
      containers:
        - name: order-service
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/order-service:v1.4.2
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1024Mi"
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: order-service-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: order-service
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
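To see what the HorizontalPodAutoscaler above does with `averageUtilization: 65`, it helps to work through the replica calculation by hand. The sketch below follows the formula in the Kubernetes HPA documentation, desired = ceil(current × currentMetric / targetMetric), and assumes the default 10% tolerance. It ignores stabilization windows, scaling policies, and pods that are not yet ready, so treat it as an approximation rather than the controller's full behaviour.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,  # assumed default HPA tolerance
) -> int:
    """Approximate the HPA replica calculation for a single utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance: the controller leaves the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas / maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


# order-service-hpa: target 65% CPU, 3 to 50 replicas.
flash_sale = desired_replicas(3, 130, 65, min_replicas=3, max_replicas=50)
steady = desired_replicas(10, 68, 65, min_replicas=3, max_replicas=50)
```

With three pods at 130% of requested CPU, the ratio is 2.0 and the HPA asks for six pods; at 68% the ratio sits inside the tolerance band and nothing changes.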
Lambda: Event-Driven, Cold Starts, and Limits
AWS Lambda runs your code in response to events: HTTP requests via API Gateway, S3 uploads, SQS messages, DynamoDB streams, scheduled EventBridge rules (formerly CloudWatch Events), and dozens more. You pay per invocation and for execution time, billed in 1 ms increments. An idle function costs nothing unless you enable provisioned concurrency.
Lambda Function with SQS Trigger
# template.yaml (SAM)
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: python3.12
    Architectures: [arm64]   # Graviton2: about 20% lower GB-second price; benchmark your own workload
    Timeout: 30
    MemorySize: 512
    Environment:
      Variables:
        TABLE_NAME: !Ref OrdersTable
        REGION: !Sub "${AWS::Region}"

Resources:
  # OrdersTable, OrderQueue, and OrderProcessorRole are defined elsewhere in the template
  OrderProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handler.process_order
      CodeUri: src/
      Role: !GetAtt OrderProcessorRole.Arn
      ReservedConcurrentExecutions: 100   # prevent runaway scaling
      # Publish a version and a "live" alias on every deploy. SAM requires
      # AutoPublishAlias for ProvisionedConcurrencyConfig, which keeps five
      # execution environments initialised to avoid cold starts.
      AutoPublishAlias: live
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 5
      Events:
        SQSTrigger:
          Type: SQS
          Properties:
            Queue: !GetAtt OrderQueue.Arn
            BatchSize: 10
            FunctionResponseTypes:
              - ReportBatchItemFailures   # partial batch failure support
      Layers:
        - !Sub "arn:aws:lambda:${AWS::Region}:017000801446:layer:AWSLambdaPowertoolsPythonV2-Arm64:57"
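The `ReportBatchItemFailures` setting only helps if the handler returns the IDs of the messages that failed. The sketch below shows one plausible shape for `handler.process_order` using only the standard library; `handle_order` and its validation rule are hypothetical stand-ins for the real business logic, while the `batchItemFailures` / `itemIdentifier` response keys are the ones Lambda expects for SQS partial batch responses.

```python
import json


def handle_order(order: dict) -> None:
    """Hypothetical business logic: reject orders without an ID."""
    if "order_id" not in order:
        raise ValueError("missing order_id")


def process_order(event, context):
    """SQS-triggered handler that reports only the failed messages."""
    failures = []
    for record in event.get("Records", []):
        try:
            handle_order(json.loads(record["body"]))
        except Exception:
            # Only this message returns to the queue; the rest of the batch is deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {
    "Records": [
        {"messageId": "msg-1", "body": json.dumps({"order_id": "A100"})},
        {"messageId": "msg-2", "body": "not json"},
    ]
}
result = process_order(sample_event, None)
```

Without this return value, a single bad message causes the whole batch of ten to be retried.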
Lambda Limits You Must Know
| Limit | Default / Maximum | Impact |
|---|---|---|
| Max execution time | 15 minutes | Long-running jobs need ECS/EKS or Step Functions |
| Deployment package | 250 MB unzipped, including layers (50 MB zipped for direct upload) | Heavy ML models need container images (up to 10 GB); layers count toward the 250 MB limit |
| Ephemeral storage (/tmp) | 512 MB – 10 GB (configurable) | Large file processing now viable |
| Concurrent executions (account default) | 1,000 (soft limit, can increase) | Burst traffic may get throttled; set reserved concurrency per function |
| Memory | 128 MB – 10 GB | More memory = more vCPU (proportional allocation) |
| VPC cold start | Added ~1 s (first invocation after idle) | Mitigated by Hyperplane ENI since 2020 — no longer a major concern |
Decision Matrix by Use Case
Use this matrix as a starting point. Real decisions involve team skills, existing tooling, and organisational constraints — but these are solid defaults.
| Use Case | Recommended | Why |
|---|---|---|
| REST / GraphQL API (moderate traffic) | Lambda + API Gateway | No idle cost, scales to zero, minimal ops |
| REST / GraphQL API (high, sustained traffic) | ECS Fargate | Lambda per-invocation cost exceeds ECS at high volume |
| Long-running microservices | ECS or EKS | Lambda 15-min limit is a hard constraint |
| Batch / ETL jobs | Lambda (short) / ECS Fargate (long) | Lambda for jobs <15 min; ECS for hours-long jobs |
| Event-driven pipelines (S3, SQS, streams) | Lambda | Native triggers, pay-per-event model, minimal glue code |
| Machine learning inference (real-time) | ECS or SageMaker endpoints | Model size and GPU needs exceed Lambda limits |
| Stateful workloads (sessions, WebSocket) | ECS or EKS | Lambda is stateless; sticky sessions require containers |
| Kubernetes-native tooling (Helm, Istio) | EKS | ECS has no Helm, no CRDs, no service mesh ecosystem |
| Multi-cloud / hybrid strategy | EKS | Kubernetes runs everywhere; ECS and Lambda are AWS-only |
| Startup MVP with small team | Lambda or ECS Fargate | EKS operational overhead is too high early |
| Large platform team (10+ engineers) | EKS | Investment in Kubernetes pays off at scale |
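The matrix can also be read as an ordered set of rules. The function below is a deliberately simplified sketch of that reading: the parameter names and the order in which rules are checked are assumptions made for illustration, and it ignores team size and existing tooling, which the matrix treats as equally important.

```python
def recommend_compute(
    max_runtime_minutes: float,
    needs_gpu: bool = False,
    needs_k8s_ecosystem: bool = False,
    multi_cloud: bool = False,
    stateful: bool = False,
    sustained_high_traffic: bool = False,
) -> str:
    """Return a default recommendation drawn from the decision matrix rows."""
    if needs_k8s_ecosystem or multi_cloud:
        return "EKS"
    if needs_gpu:
        return "ECS or SageMaker endpoints"
    if stateful or max_runtime_minutes > 15:
        # Lambda's 15-minute limit and stateless model rule it out.
        return "ECS or EKS"
    if sustained_high_traffic:
        return "ECS Fargate"
    return "Lambda"


nightly_etl = recommend_compute(max_runtime_minutes=180)
webhook = recommend_compute(max_runtime_minutes=0.5)
```

Treat the result as the starting default the matrix describes, not a final answer.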
Cost Comparison
Cost depends heavily on traffic patterns. Let's model a concrete example: a JSON API serving 10 million requests/month, average response time 200ms, 512 MB memory.
Lambda Cost
# Lambda pricing (us-east-1, arm64, 2026)
Requests: 10,000,000 × $0.0000002 = $2.00
Compute: 10,000,000 × 0.200s × 512MB/1024
= 1,000,000 GB-seconds
× $0.0000133 per GB-s = $13.30
Total Lambda/month ≈ $15.30
# Add API Gateway HTTP API
10,000,000 requests × $1.00/million = $10.00
Total with API Gateway ≈ $25.30/month
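The Lambda arithmetic above generalises to a small function, which makes it easy to rerun with your own traffic numbers. This is a sketch using the per-request and per-GB-second prices quoted in this section as defaults; it ignores the free tier and any provisioned concurrency charges.

```python
def lambda_monthly_cost(
    requests: int,
    avg_duration_s: float,
    memory_mb: int,
    price_per_request: float = 0.0000002,  # from the model above
    price_per_gb_s: float = 0.0000133,     # arm64 rate used above
) -> float:
    """Monthly Lambda cost in USD: request charge plus GB-second compute charge."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    return requests * price_per_request + gb_seconds * price_per_gb_s


def http_api_monthly_cost(requests: int, price_per_million: float = 1.00) -> float:
    """API Gateway HTTP API charge at the per-million rate used above."""
    return requests / 1_000_000 * price_per_million


compute = lambda_monthly_cost(10_000_000, 0.200, 512)
total = compute + http_api_monthly_cost(10_000_000)
```

Doubling the memory setting doubles the compute term, so right-sizing memory matters more than the request charge at this scale.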
ECS Fargate Cost
# ECS Fargate (us-east-1, 2 tasks × 24/7)
# 0.5 vCPU, 1 GB memory per task (handles ~100 RPS each)
vCPU: 2 tasks × 0.5 × 730h × $0.04048/vCPU-hr = $29.55
Memory: 2 tasks × 1.0 × 730h × $0.004445/GB-hr = $6.49
Total Fargate/month (2 tasks) ≈ $36.04
# No API Gateway cost if using ALB (~$20/month base)
EKS Cost
# EKS cluster control plane
Control plane: = $73.00/month
# Nodes: 2× m6i.large (on-demand, us-east-1)
2 × $0.096/hr × 730hr = $140.16
Total EKS (EC2 nodes) ≈ $213/month
# Or with Fargate profiles (no node cost, pod-level billing)
# Same as Fargate above + $73 control plane
Total EKS (Fargate profiles) ≈ $109/month
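A useful follow-up question is where Lambda stops being the cheaper option. The sketch below reuses the prices from this section to estimate the monthly request volume at which Lambda compute cost equals the two-task Fargate baseline. It compares compute only (no API Gateway, ALB, or EKS control plane) and assumes the two Fargate tasks can absorb the load without scaling out.

```python
def fargate_monthly_cost(
    tasks: int,
    vcpu: float,
    memory_gb: float,
    hours: float = 730,
    price_vcpu_hr: float = 0.04048,
    price_gb_hr: float = 0.004445,
) -> float:
    """Monthly cost in USD of always-on Fargate tasks."""
    return tasks * hours * (vcpu * price_vcpu_hr + memory_gb * price_gb_hr)


def lambda_cost_per_request(
    avg_duration_s: float,
    memory_mb: int,
    price_per_request: float = 0.0000002,
    price_per_gb_s: float = 0.0000133,
) -> float:
    """Cost in USD of a single Lambda invocation."""
    return price_per_request + avg_duration_s * (memory_mb / 1024) * price_per_gb_s


fargate = fargate_monthly_cost(tasks=2, vcpu=0.5, memory_gb=1.0)
break_even = fargate / lambda_cost_per_request(0.200, 512)
```

Under these assumptions the break-even lands at roughly 23.6 million requests per month, a little over twice the volume modelled above.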
Migration Paths
Teams rarely pick the perfect compute on day one. Here are the most common migration paths and how to execute them.
Lambda → ECS Fargate (outgrown Lambda limits)
# Step 1: Containerise the Lambda handler
# lambda/handler.py → app/main.py (FastAPI wrapper)
# Dockerfile
# Single stage on python:3.12-slim. Packages installed in the Lambda base image
# live outside /var/task, so copying /var/task across stages would drop them.
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt fastapi uvicorn
# Ship the original handler next to the wrapper so main.py can import it
COPY lambda/handler.py app/main.py ./
EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
# Step 2: Push to ECR
aws ecr get-login-password | docker login --username AWS \
--password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker build -t my-api .
docker tag my-api:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api:latest
# Step 3: Create ECS task definition and service (see ECS guide)
# Step 4: Route traffic gradually via weighted ALB target groups
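The core of the wrapper in Step 1 is translating an incoming HTTP request into the event shape the existing handler already understands. The sketch below shows that translation with the standard library only, assuming the handler was written for API Gateway HTTP API payload format 2.0. `lambda_handler` is a hypothetical stand-in for the code in lambda/handler.py, and `call_handler` is an illustrative helper, not a FastAPI or AWS API.

```python
import json


def lambda_handler(event, context):
    """Stand-in for the existing handler in lambda/handler.py (hypothetical)."""
    name = json.loads(event["body"] or "{}").get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"content-type": "application/json"},
        "body": json.dumps({"hello": name}),
    }


def to_http_api_event(method, path, query_string, headers, body):
    """Build a minimal payload-format-2.0 event from plain HTTP request parts."""
    return {
        "version": "2.0",
        "routeKey": f"{method} {path}",
        "rawPath": path,
        "rawQueryString": query_string,
        "headers": {k.lower(): v for k, v in headers.items()},
        "requestContext": {"http": {"method": method, "path": path}},
        "body": body,
        "isBase64Encoded": False,
    }


def call_handler(method, path, query_string="", headers=None, body=None):
    """Invoke the handler and unpack its response into (status, headers, body)."""
    event = to_http_api_event(method, path, query_string, headers or {}, body)
    result = lambda_handler(event, None)
    return result.get("statusCode", 200), result.get("headers", {}), result.get("body", "")


status, response_headers, response_body = call_handler("POST", "/hello", body='{"name": "ecs"}')
```

In app/main.py, a catch-all FastAPI route would pass the request's method, path, query string, headers, and body to `call_handler` and build the HTTP response from the returned tuple. Keeping the handler untouched lets the Lambda and Fargate versions run side by side during the weighted cutover in Step 4.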
ECS → EKS (need Kubernetes ecosystem)
# Use Kompose to translate Docker Compose → Kubernetes manifests as a starting point
kompose convert -f docker-compose.yml -o k8s/
# Map ECS concepts to Kubernetes:
# ECS Task Definition → Pod spec inside Deployment
# ECS Service → Deployment + Service + HPA
# ECS Cluster → Namespace (logical) + Node Group (compute)
# ECS Exec → kubectl exec -it pod -- /bin/sh
# ECS Service Connect → Kubernetes Service DNS (svc.cluster.local)
# ALB (ECS) → AWS Load Balancer Controller (EKS)
# IAM Task Role → ServiceAccount + IAM OIDC provider binding (IRSA)
# Install AWS Load Balancer Controller on EKS
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=prod-cluster \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller
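The Task Definition to Pod spec mapping is mostly mechanical, and the one unit conversion worth checking is CPU: ECS counts 1024 CPU units per vCPU, while Kubernetes counts 1000 millicores. The sketch below converts one ECS container definition into a Kubernetes container spec as a plain dict. The function name is hypothetical, setting requests equal to limits is an assumption made here for simplicity, and secrets, health checks, and logging are deliberately left out because they have no one-to-one equivalent.

```python
def ecs_container_to_k8s(container: dict, task_cpu_units: int, task_memory_mib: int) -> dict:
    """Translate an ECS container definition into a Kubernetes container spec (as a dict)."""
    # Fall back to the task-level size when the container does not set its own.
    cpu_units = container.get("cpu") or task_cpu_units
    memory_mib = container.get("memory") or task_memory_mib
    millicores = cpu_units * 1000 // 1024  # 1024 ECS CPU units == 1 vCPU == 1000m
    resources = {"cpu": f"{millicores}m", "memory": f"{memory_mib}Mi"}
    return {
        "name": container["name"],
        "image": container["image"],
        "ports": [{"containerPort": p["containerPort"]} for p in container.get("portMappings", [])],
        "env": [{"name": e["name"], "value": e["value"]} for e in container.get("environment", [])],
        # Assumption: requests == limits; tune these separately in practice.
        "resources": {"requests": dict(resources), "limits": dict(resources)},
    }


payment_api = {
    "name": "payment-api",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/payment-api:v2.3.1",
    "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    "environment": [{"name": "ENV", "value": "prod"}],
}
k8s_container = ecs_container_to_k8s(payment_api, task_cpu_units=1024, task_memory_mib=2048)
```

The resulting dict can be dumped to YAML and placed under `spec.template.spec.containers` in a Deployment.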
Real-World Scenarios
Scenario 1: E-Commerce Platform (Microservices)
A mid-size e-commerce site has 12 microservices: catalogue, cart, order, payment, notification, search, recommendation, inventory, user, review, analytics, and an admin API. Traffic peaks during flash sales (10× normal load).
- Core services (catalogue, cart, order, payment): ECS Fargate with Application Auto Scaling. These run continuously and need < 1 s response times. Fargate avoids EC2 management, but new tasks take roughly 30–90 seconds to start, so set minimum capacity with headroom ahead of flash sales.
- Notification service: Lambda triggered by SQS. Sends emails/SMS asynchronously; no sustained load, no idle cost.
- Recommendation engine: ECS on EC2 with GPU instances (for example p3.2xlarge, the smallest P3 size). ML inference needs GPU; Fargate doesn't support GPU.
- Analytics pipeline: Lambda triggered by Kinesis. Processes click-stream events in micro-batches; cost-effective at millions of small events.
Scenario 2: API Backend for Mobile App
A B2C mobile app has 50,000 daily active users with spiky evening traffic and near-zero traffic overnight.
- Choice: Lambda + API Gateway HTTP API. The overnight idle period means ECS/EKS tasks would run unused for 8+ hours daily, and Lambda's scale-to-zero eliminates that cost. Provisioned concurrency on the most latency-sensitive endpoints removes cold starts during peak hours; it is billed for as long as it is configured, so schedule it for the evening peak rather than leaving it on overnight.
- Estimated saving vs ECS Fargate (2 tasks 24/7): ~60% lower monthly compute bill.
Scenario 3: Enterprise Platform Team (100+ Services)
A fintech company has 100+ microservices, a dedicated platform engineering team of 15 engineers, and a mandate for multi-cloud readiness.
- Choice: EKS with Karpenter for node autoscaling. The platform team can justify the EKS learning curve. Helm charts, ArgoCD GitOps, Istio service mesh, and the Prometheus/Grafana observability stack are all Kubernetes-native. Karpenter replaces the Cluster Autoscaler and, in this scenario, is estimated to cut node-level cost by about 40% through bin-packing onto Spot Instances.
- Lambda is still used for glue (event bridges, scheduled jobs, webhook receivers).
Frequently Asked Questions
Can I use ECS and Lambda together in the same architecture?
Absolutely, and it is the recommended pattern. Run your long-lived, latency-sensitive services on ECS Fargate and use Lambda for event-driven, sporadic, or background tasks. They can share the same VPC and IAM boundary, and communicate via SQS, SNS, EventBridge, or internal ALBs.
When is EKS worth the control plane cost?
EKS makes sense when the Kubernetes ecosystem value (Helm, Istio, ArgoCD, Karpenter, custom operators) outweighs the $73/month. For a single small team with 2–3 services, it rarely does. For a platform team managing 20+ services with GitOps pipelines, it's table stakes.
Are Lambda cold starts still a problem?
Less so than before. The Hyperplane ENI model largely removed the VPC cold start penalty. For Python and Node.js on arm64, init times are typically 200–500 ms. Use provisioned concurrency to eliminate cold starts entirely on your most critical endpoints. See our Lambda cold starts guide for benchmark data.
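Can Fargate tasks run in private subnets without a NAT gateway?
Yes, using VPC endpoints for ECR, S3, CloudWatch Logs, and Secrets Manager. This avoids NAT gateway data processing charges (~$0.045/GB), which can become significant at scale. Create interface endpoints for ecr.api, ecr.dkr, logs, and secretsmanager, plus a gateway endpoint for s3 (ECR stores image layers in S3, and gateway endpoints carry no hourly charge).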
What is the difference between EKS Fargate profiles and managed node groups?
Fargate profiles run each Pod in an isolated micro-VM managed by AWS: no node to patch, but no DaemonSets, no GPU, no privileged containers, and no persistent volumes (except EFS). Managed node groups give you full EC2 access, DaemonSet support, GPU, and local NVMe storage, at the cost of managing AMI updates and cluster upgrades across the node group.