AWS ECS: Container Orchestration with Fargate and EC2 (2026)

Amazon Elastic Container Service (ECS) is AWS's native container orchestrator. Unlike Kubernetes, ECS is a fully proprietary AWS service — which means less operational complexity but also less portability. For teams that live inside the AWS ecosystem and don't need Kubernetes-specific features, ECS is often the faster path to production. This guide covers everything from writing your first task definition to zero-downtime blue/green deployments.

ECS vs EKS: When to Choose Which

This is one of the most common architectural decisions for AWS-hosted container workloads. Neither is universally better — the right answer depends on your team's expertise and your operational goals.

| Factor | ECS | EKS |
| --- | --- | --- |
| Learning curve | Low (AWS-native concepts) | High (full Kubernetes API) |
| Operational overhead | Low (Fargate: near zero) | Medium (control plane managed, but add-ons, IRSA, upgrades) |
| Ecosystem / tooling | AWS-only | CNCF ecosystem (Helm, Istio, ArgoCD, etc.) |
| Multi-cloud portability | None | High |
| Control plane cost | Free | $0.10/hour (~$73/month) |
| Scheduling flexibility | Good | Excellent (node selectors, affinity, taints) |
| Best for | AWS-first teams, simple microservices | Large platforms, Kubernetes expertise |

Rule of thumb: Start with ECS Fargate. Migrate to EKS if you hit its limits or if your org standardizes on Kubernetes. Don't over-engineer early.

Task Definitions

A task definition is a JSON blueprint that describes one or more containers — their image, CPU/memory allocation, environment variables, port mappings, logging config, and IAM roles.

{
  "family": "my-api",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/myAppTaskRole",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-api:latest",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "ENV", "value": "production"},
        {"name": "PORT", "value": "8080"}
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-api",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
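
On Fargate, the task-level cpu and memory values must form one of a fixed set of pairs, and registration fails otherwise. The sketch below encodes the Linux combinations as I understand them from the Fargate documentation; the function name is hypothetical, and it assumes both values are plain numeric strings (CPU units and MiB) as in the task definition above.

```python
# Valid Fargate (Linux) memory values in MiB for each CPU value in CPU units.
# Encoded from the published Fargate task size table; verify against current docs.
FARGATE_MEMORY = {
    256: (512, 1024, 2048),
    512: tuple(range(1024, 4096 + 1, 1024)),
    1024: tuple(range(2048, 8192 + 1, 1024)),
    2048: tuple(range(4096, 16384 + 1, 1024)),
    4096: tuple(range(8192, 30720 + 1, 1024)),
    8192: tuple(range(16384, 61440 + 1, 4096)),
    16384: tuple(range(32768, 122880 + 1, 8192)),
}


def is_valid_fargate_size(cpu: str, memory: str) -> bool:
    """Return True if the cpu/memory strings form a supported Fargate pair."""
    try:
        return int(memory) in FARGATE_MEMORY.get(int(cpu), ())
    except ValueError:
        # Non-numeric forms such as "0.5 vCPU" are out of scope for this sketch.
        return False


print(is_valid_fargate_size("512", "1024"))  # True (the pair used above)
print(is_valid_fargate_size("256", "4096"))  # False
```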

Register it with the CLI:

aws ecs register-task-definition \
  --cli-input-json file://task-def.json

Note: Always use the secrets field (backed by Secrets Manager or SSM Parameter Store) rather than putting credentials in environment. The role named in executionRoleArn needs secretsmanager:GetSecretValue (or ssm:GetParameters for Parameter Store) to inject secrets at startup.
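
The rule in the note can be checked mechanically before a task definition is registered. The sketch below is a heuristic, not an AWS feature: the function name and the list of suspicious name fragments are assumptions made for illustration.

```python
# Name fragments that often indicate a credential. This list is an
# illustrative assumption, not an AWS-defined rule.
SUSPICIOUS = ("PASSWORD", "SECRET", "TOKEN", "API_KEY", "PRIVATE_KEY")


def find_plaintext_credentials(task_def: dict) -> list[str]:
    """Return 'container/VAR' for each plain env var whose name looks like a credential."""
    findings = []
    for container in task_def.get("containerDefinitions", []):
        for env in container.get("environment", []):
            name = env.get("name", "")
            if any(fragment in name.upper() for fragment in SUSPICIOUS):
                findings.append(f"{container.get('name', '?')}/{name}")
    return findings


# Hypothetical task definition fragment with a credential in the wrong field.
example = {
    "containerDefinitions": [
        {
            "name": "api",
            "environment": [
                {"name": "ENV", "value": "production"},
                {"name": "DB_PASSWORD", "value": "not-a-real-password"},
            ],
        }
    ]
}
print(find_plaintext_credentials(example))  # ['api/DB_PASSWORD']
```

A check like this could run in CI against task-def.json and fail the build when the list is non-empty.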

ECS Services

An ECS Service keeps a specified number of task copies running, integrates with a load balancer, and handles rolling deploys. Think of it as the ECS equivalent of a Kubernetes Deployment + Service.

aws ecs create-service \
  --cluster prod-cluster \
  --service-name my-api \
  --task-definition my-api:3 \
  --desired-count 3 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={
    subnets=[subnet-abc123,subnet-def456],
    securityGroups=[sg-xyz789],
    assignPublicIp=DISABLED
  }" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=api,containerPort=8080" \
  --deployment-configuration "minimumHealthyPercent=100,maximumPercent=200"
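
The two percentages in --deployment-configuration bound how many tasks may run while a rolling deployment is in progress. A minimal sketch of that arithmetic follows, assuming the documented rounding (lower bound rounded up, upper bound rounded down); the function name is hypothetical.

```python
def rolling_deploy_bounds(desired_count: int, minimum_healthy_percent: int,
                          maximum_percent: int) -> tuple[int, int]:
    """Return (min_running, max_running) task counts during a rolling deployment."""
    # Integer arithmetic avoids float rounding surprises.
    min_running = -(-desired_count * minimum_healthy_percent // 100)  # round up
    max_running = desired_count * maximum_percent // 100              # round down
    return min_running, max_running


print(rolling_deploy_bounds(3, 100, 200))  # (3, 6)
print(rolling_deploy_bounds(3, 50, 100))   # (2, 3)
```

With the settings above (100/200 and a desired count of 3), ECS never drops below three running tasks and may start up to three new ones alongside them, which is why that combination needs spare capacity but avoids a dip in availability.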

For infrastructure-as-code, use a CloudFormation or CDK definition. See the CDK guide for CDK patterns.

Fargate vs EC2 Launch Type

The launch type determines who manages the underlying infrastructure for your containers.

| Feature | Fargate | EC2 |
| --- | --- | --- |
| Infrastructure management | AWS manages it | You manage EC2 instances |
| Billing granularity | Per-task (vCPU + GB) | Per EC2 instance |
| Cost at scale | Higher per-unit cost | Lower with reserved instances |
| Startup time | ~30-90 seconds | Faster (instance already running) |
| GPU workloads | Not supported | Supported (p3, g4dn instances) |
| Custom AMI | Not possible | Fully supported |
| DaemonSet equivalent | Not available | Daemon scheduling supported |

Use Fargate for: microservices without GPU needs, variable-load workloads, dev/staging environments. Use EC2 for: GPU workloads, high-throughput services where per-unit Fargate cost is prohibitive, or workloads requiring host-level tuning.

Capacity Providers

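Capacity providers decouple where tasks run from the service definition. You set a default capacity provider strategy on the cluster, and services that specify neither a launch type nor their own strategy inherit it. A service created with an explicit --launch-type FARGATE ignores the cluster default.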

# Create a capacity provider backed by an Auto Scaling Group
aws ecs create-capacity-provider \
  --name my-asg-cp \
  --auto-scaling-group-provider "autoScalingGroupArn=arn:aws:autoscaling:...,
    managedScaling={status=ENABLED,targetCapacity=80},
    managedTerminationProtection=ENABLED"

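# Attach the Fargate providers with a default strategy: one base task on Fargate,
# then a 4:1 Fargate Spot to Fargate ratio. This call replaces the cluster's provider
# list, so include my-asg-cp here as well if the cluster should use it.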
aws ecs put-cluster-capacity-providers \
  --cluster prod-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE_SPOT,weight=4,base=0 \
    capacityProvider=FARGATE,weight=1,base=1

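With weights of 4 and 1, the strategy above places about 80% of the tasks beyond the base on Fargate Spot (up to 70% cheaper) and the rest on regular Fargate. The base=1 ensures at least one task always runs on regular Fargate for stability. Weights set a placement ratio, not a failover order: if Spot capacity is unavailable, ECS does not automatically place those tasks on regular Fargate instead.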
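
To see how base and weight interact, the sketch below approximates the split for a given task count. It is an illustration of the ratio arithmetic only: the function name is hypothetical, and the order in which ECS actually places individual tasks is an assumption here, not documented behavior.

```python
from fractions import Fraction


def distribute_tasks(total: int, strategy: list[dict]) -> dict[str, int]:
    """Approximate how many tasks land on each capacity provider."""
    counts = {item["capacityProvider"]: 0 for item in strategy}
    remaining = total

    # Step 1: satisfy each provider's base before weights apply.
    for item in strategy:
        take = min(item.get("base", 0), remaining)
        counts[item["capacityProvider"]] += take
        remaining -= take

    # Step 2: hand out the rest one task at a time, keeping the
    # per-provider counts as close to the weight ratio as possible.
    weighted = [item for item in strategy if item.get("weight", 0) > 0]
    if not weighted:
        return counts
    beyond_base = {item["capacityProvider"]: 0 for item in weighted}
    for _ in range(remaining):
        pick = min(
            weighted,
            key=lambda item: (
                Fraction(beyond_base[item["capacityProvider"]] + 1, item["weight"]),
                -item["weight"],  # on ties, prefer the heavier provider
            ),
        )
        name = pick["capacityProvider"]
        beyond_base[name] += 1
        counts[name] += 1
    return counts


strategy = [
    {"capacityProvider": "FARGATE_SPOT", "weight": 4, "base": 0},
    {"capacityProvider": "FARGATE", "weight": 1, "base": 1},
]
print(distribute_tasks(11, strategy))  # {'FARGATE_SPOT': 8, 'FARGATE': 3}
```

Note that with 11 tasks the on-demand share is 3 of 11, not 20%, because the base task is counted before the 4:1 ratio applies.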

Service Discovery with Cloud Map

ECS integrates with AWS Cloud Map for DNS-based service discovery. Each ECS task registers itself with a Cloud Map namespace so other services can reach it by DNS name instead of hardcoded IPs.

# Create a private DNS namespace
aws servicediscovery create-private-dns-namespace \
  --name prod.local \
  --vpc vpc-abc123

# Create a service in that namespace
aws servicediscovery create-service \
  --name my-api \
  --dns-config "NamespaceId=ns-abc123,DnsRecords=[{Type=A,TTL=10}]" \
  --health-check-custom-config FailureThreshold=1

Then reference the service discovery service ARN in your ECS service definition. Tasks become reachable at my-api.prod.local from within the VPC. TTL of 10 seconds means DNS updates propagate quickly when tasks restart.
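
From the client's side, discovery is an ordinary DNS lookup that can return several A records, one per healthy task. The sketch below resolves them and picks one at random on every call, which is one simple way to spread load and notice replaced tasks; the function names are hypothetical, and my-api.prod.local only resolves from inside the VPC.

```python
import random
import socket


def resolve_task_ips(hostname: str, port: int) -> list[str]:
    """Return the unique IPv4 addresses currently published for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})


def pick_endpoint(hostname: str, port: int) -> tuple[str, int]:
    """Re-resolve on each call and choose one task address at random."""
    return random.choice(resolve_task_ips(hostname, port)), port


# Inside the VPC this would be: pick_endpoint("my-api.prod.local", 8080)
```

Long-lived connection pools and resolver caches can hold on to old addresses longer than the 10-second TTL, so clients that keep connections open should reconnect on failure rather than rely on the TTL alone.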

ECS Exec for Debugging

ECS Exec uses AWS Systems Manager Session Manager to open an interactive shell inside a running container — no SSH, no bastion host needed.

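# Enable ECS Exec on the service. Only tasks started afterwards pick it up,
# so force a new deployment to replace the running tasks.
aws ecs update-service \
  --cluster prod-cluster \
  --service my-api \
  --enable-execute-command \
  --force-new-deployment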

# Open a shell in a running task
aws ecs execute-command \
  --cluster prod-cluster \
  --task arn:aws:ecs:us-east-1:123456789012:task/abc123 \
  --container api \
  --interactive \
  --command "/bin/sh"

The task role needs these permissions:

{
  "Effect": "Allow",
  "Action": [
    "ssmmessages:CreateControlChannel",
    "ssmmessages:CreateDataChannel",
    "ssmmessages:OpenControlChannel",
    "ssmmessages:OpenDataChannel"
  ],
  "Resource": "*"
}

Security note: Disable ECS Exec in production after debugging. Any IAM principal with ecs:ExecuteCommand can shell into your containers. Restrict it via IAM conditions or enable it only via a break-glass process.

Blue/Green Deployments with CodeDeploy

ECS integrates with CodeDeploy for blue/green deployments: the new version ("green") runs alongside the old version ("blue"), traffic shifts gradually, and rollback is instant if health checks fail.

# appspec.yaml for CodeDeploy ECS blue/green
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: "arn:aws:ecs:us-east-1:123456789012:task-definition/my-api:NEW_REVISION"
        LoadBalancerInfo:
          ContainerName: "api"
          ContainerPort: 8080
        PlatformVersion: "LATEST"

Hooks:
  - BeforeAllowTraffic: "arn:aws:lambda:us-east-1:123456789012:function:PreTrafficHook"
  - AfterAllowTraffic: "arn:aws:lambda:us-east-1:123456789012:function:PostTrafficHook"

Configure the deployment group with a traffic shifting strategy:

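# Assumes the CodeDeploy application was created with --compute-platform ECS and the
# ECS service uses the CODE_DEPLOY deployment controller.
# The service role name is a placeholder; substitute your own CodeDeploy role.
aws deploy create-deployment-group \
  --application-name my-api-app \
  --deployment-group-name my-api-dg \
  --service-role-arn arn:aws:iam::123456789012:role/CodeDeployServiceRoleForECS \
  --deployment-config-name CodeDeployDefault.ECSCanary10Percent5Minutes \
  --deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL \
  --ecs-services clusterName=prod-cluster,serviceName=my-api \
  --load-balancer-info "targetGroupPairInfoList=[{
    targetGroups=[{name=my-api-tg-blue},{name=my-api-tg-green}],
    prodTrafficRoute={listenerArns=[arn:aws:elasticloadbalancing:...]}
  }]"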

ECSCanary10Percent5Minutes shifts 10% of traffic to green, waits 5 minutes, then shifts 100% if health checks pass. If the Lambda hooks or health checks fail, CodeDeploy automatically rolls back to blue.
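
The predefined configuration names encode their own schedule. As a rough illustration, the sketch below turns a name into (minute, percent of traffic on green) steps; the function is hypothetical and only understands the ECSAllAtOnce, ECSCanary and ECSLinear naming patterns.

```python
import re


def traffic_schedule(config_name: str) -> list[tuple[int, int]]:
    """Return (minute, percent_on_green) steps implied by a config name."""
    name = config_name.removeprefix("CodeDeployDefault.")
    if name == "ECSAllAtOnce":
        return [(0, 100)]

    canary = re.fullmatch(r"ECSCanary(\d+)Percent(\d+)Minutes", name)
    if canary:
        percent, wait = int(canary[1]), int(canary[2])
        # One small shift, a waiting period, then everything.
        return [(0, percent), (wait, 100)]

    linear = re.fullmatch(r"ECSLinear(\d+)PercentEvery(\d+)Minutes", name)
    if linear:
        step, interval = int(linear[1]), int(linear[2])
        schedule, minute, percent = [], 0, 0
        # Equal increments at a fixed interval until all traffic is on green.
        while percent < 100:
            percent = min(100, percent + step)
            schedule.append((minute, percent))
            minute += interval
        return schedule

    raise ValueError(f"unrecognized deployment config name: {config_name}")


print(traffic_schedule("CodeDeployDefault.ECSCanary10Percent5Minutes"))
# [(0, 10), (5, 100)]
```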

Logging to CloudWatch

The awslogs driver is the simplest way to get container logs into CloudWatch Logs. Use awsfirelens (backed by Fluent Bit) for advanced log routing to multiple destinations.

# Two containerDefinitions entries: a Fluent Bit log router sidecar and the app container that logs through it
{
  "name": "log-router",
  "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:latest",
  "essential": true,
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": {
      "enable-ecs-log-metadata": "true"
    }
  }
},
{
  "name": "api",
  "image": "...",
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "cloudwatch_logs",
      "region": "us-east-1",
      "log_group_name": "/ecs/my-api",
      "log_stream_prefix": "from-firelens-",
      "auto_create_group": "true"
    }
  }
}

ALB Integration

ECS services register task IPs directly with ALB target groups (IP target type). When a task is replaced during a deploy, ECS deregisters the old task IP and registers the new one. The target group's deregistration delay (connection draining) gives in-flight requests time to complete before the old task is removed.

# Create target group (IP type for Fargate/awsvpc)
aws elbv2 create-target-group \
  --name my-api-tg \
  --protocol HTTP \
  --port 8080 \
  --vpc-id vpc-abc123 \
  --target-type ip \
  --health-check-path /health \
  --health-check-interval-seconds 15 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

# Deregistration delay is a target group attribute, set in a separate call
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:... \
  --attributes Key=deregistration_delay.timeout_seconds,Value=30

Set the deregistration delay to at least your application's maximum request duration. For APIs with short requests, 30 seconds is plenty. For file uploads or long-running requests, increase it accordingly.

Frequently Asked Questions

Q: How do I pass secrets to ECS containers securely?

Use the secrets field in the task definition, pointing to AWS Secrets Manager ARNs or SSM Parameter Store paths. The ECS agent fetches the secret value at task startup and injects it as an environment variable. The executionRoleArn must have secretsmanager:GetSecretValue or ssm:GetParameters permission.

Q: How do ECS tasks communicate with each other?

In the same VPC, tasks on awsvpc network mode get ENIs and can communicate directly by IP. For reliable service-to-service communication, use AWS Cloud Map (DNS discovery) or an internal ALB. Avoid hardcoding task IPs since they change with every deployment.

Q: What's the difference between task role and execution role?

The execution role is used by the ECS agent to pull images from ECR and fetch secrets. The task role is assumed by the application code running inside the container to access AWS services (S3, DynamoDB, etc.). Always use separate roles with least-privilege policies.

Q: Can I run ECS tasks on a schedule like cron?

Yes. Use an EventBridge (formerly CloudWatch Events) rule with an ECS task as the target. Set a cron or rate expression and the rule starts the task at the specified time. The Scheduled Tasks feature in the ECS console is built on EventBridge as well.
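
EventBridge cron expressions differ from Unix cron: they have six fields (minutes, hours, day-of-month, month, day-of-week, year), and day-of-month and day-of-week cannot both carry a value, so one of them is written as ?. The sketch below checks only that structure, not the allowed range of each field; the function name is hypothetical.

```python
import re


def check_eventbridge_cron(expression: str) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    match = re.fullmatch(r"cron\((.*)\)", expression.strip())
    if not match:
        return ["expression must be wrapped in cron(...)"]

    fields = match.group(1).split()
    if len(fields) != 6:
        return [f"expected 6 fields, found {len(fields)}"]

    day_of_month, day_of_week = fields[2], fields[4]
    if "?" not in (day_of_month, day_of_week):
        return ["one of day-of-month or day-of-week must be '?'"]
    return []


print(check_eventbridge_cron("cron(0 12 * * ? *)"))  # []
print(check_eventbridge_cron("cron(0 12 * * * *)"))
# ["one of day-of-month or day-of-week must be '?'"]
```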

Q: How do I auto-scale an ECS service?

Use Application Auto Scaling with ECS as the scalable target. Scale on CPU utilization, memory utilization, or custom CloudWatch metrics (e.g., SQS queue depth). Step scaling and target tracking policies are both supported. See the SQS guide for queue-based scaling patterns.
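
For queue-based scaling, a common approach is to track backlog per task rather than raw queue depth: decide how many messages one task may have waiting while still meeting a latency target, then size the service accordingly. The sketch below shows only that arithmetic under stated assumptions; the function and parameter names are hypothetical, and in practice Application Auto Scaling performs the adjustment from a metric you publish.

```python
def tasks_for_backlog(queue_depth: int, avg_processing_ms: int,
                      target_latency_ms: int, min_tasks: int, max_tasks: int) -> int:
    """Estimate the task count that keeps per-task backlog within the latency target."""
    # Messages a single task can have queued and still meet the target.
    acceptable_backlog_per_task = max(1, target_latency_ms // avg_processing_ms)
    # Round up: a partial task's worth of backlog still needs a whole task.
    needed = -(-queue_depth // acceptable_backlog_per_task)
    return max(min_tasks, min(max_tasks, needed))


# 1500 queued messages, 100 ms per message, 10 s acceptable latency:
# each task may hold 100 messages, so 15 tasks are needed.
print(tasks_for_backlog(1500, 100, 10_000, min_tasks=1, max_tasks=20))  # 15
```

The min_tasks and max_tasks clamps mirror the minimum and maximum capacity you register on the scalable target.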