AWS CloudWatch: Metrics, Alarms, Logs and Dashboards (2026)

CloudWatch is AWS's observability platform: it collects metrics from AWS services, stores application logs, triggers alarms, and surfaces dashboards. This guide covers the full observability stack: custom metrics, structured logging, Logs Insights queries, composite alarms, and Container Insights for EKS.

Metrics and Namespaces
Publishing Custom Metrics
Alarms and Composite Alarms
CloudWatch Logs and Log Groups
Logs Insights Queries
CloudWatch Agent
Container Insights for EKS
Dashboards
FAQ

1. Metrics and Namespaces

CloudWatch organises metrics into namespaces (e.g., AWS/EC2, AWS/RDS, AWS/Lambda). Each metric has dimensions that narrow it to a specific resource. Key built-in metrics:

EC2: CPUUtilization, NetworkIn/Out, DiskReadOps (note: RAM and disk % require CloudWatch Agent)
Lambda: Duration, Errors, Throttles, ConcurrentExecutions, IteratorAge (for stream triggers such as Kinesis and DynamoDB Streams)
RDS: CPUUtilization, DatabaseConnections, FreeStorageSpace, ReadLatency, WriteLatency
ALB: RequestCount, TargetResponseTime, HTTPCode_Target_5XX_Count, UnHealthyHostCount
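
To illustrate how a namespace, a metric name and a dimension together identify one time series, the sketch below builds a get_metric_statistics request for the CPUUtilization of a single EC2 instance. The helper names cpu_request and fetch_cpu and the instance ID used with them are placeholders, not part of this guide's setup. The boto3 call sits in its own function so the request can be inspected without AWS credentials.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cpu_request(instance_id: str, hours: int = 1, now: Optional[datetime] = None) -> dict:
    """Build get_metric_statistics arguments for one EC2 instance."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",              # namespace groups the service's metrics
        "MetricName": "CPUUtilization",
        # The dimension narrows the metric to a single resource
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,                       # seconds per datapoint
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cpu(instance_id: str, region: str = "us-east-1") -> list:
    """Call CloudWatch and return datapoints sorted by time (needs credentials)."""
    import boto3  # imported lazily so cpu_request works without boto3 installed

    client = boto3.client("cloudwatch", region_name=region)
    response = client.get_metric_statistics(**cpu_request(instance_id))
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
```

Changing the namespace, metric name and dimension name (for example AWS/RDS, FreeStorageSpace and DBInstanceIdentifier) should give the equivalent request for another service.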

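Note: CloudWatch retains metric data for up to 15 months, at decreasing resolution over time: 1-minute data points are kept for 15 days, 5-minute data points for 63 days, and 1-hour data points for 455 days (about 15 months). Standard resolution is 1 minute; high resolution is 1 second (extra cost), and sub-minute data points are kept for only 3 hours.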

2. Publishing Custom Metrics

Push custom application metrics with the AWS SDK:

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

def record_order_processed(order_value: float):
    cloudwatch.put_metric_data(
        Namespace='MyApp/Orders',
        MetricData=[
            {
                'MetricName': 'OrdersProcessed',
                'Value': 1,
                'Unit': 'Count',
                'Dimensions': [
                    {'Name': 'Environment', 'Value': 'production'},
                    {'Name': 'Region', 'Value': 'us-east-1'}
                ]
            },
            {
                'MetricName': 'OrderValue',
                'Value': order_value,
                'Unit': 'None',
                'Dimensions': [{'Name': 'Environment', 'Value': 'production'}]
            }
        ]
    )

Or via CLI for quick testing:

aws cloudwatch put-metric-data \
  --namespace "MyApp/Orders" \
  --metric-name "OrdersProcessed" \
  --value 1 \
  --unit Count \
  --dimensions Environment=production
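
The high-resolution option mentioned earlier is chosen per data point through the StorageResolution field of put_metric_data. The following is a minimal sketch of a helper that builds such entries; build_datum, publish and the CheckoutLatency metric are hypothetical names used only for illustration.

```python
def build_datum(name, value, unit="Count", high_resolution=False, **dimensions):
    """Return one MetricData entry for put_metric_data."""
    return {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        # 1 = high resolution (1-second), 60 = standard resolution (1-minute)
        "StorageResolution": 1 if high_resolution else 60,
        "Dimensions": [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ],
    }


def publish(namespace, data, region="us-east-1"):
    """Send a list of entries in a single request (needs credentials)."""
    import boto3  # lazy import keeps build_datum usable without boto3

    client = boto3.client("cloudwatch", region_name=region)
    client.put_metric_data(Namespace=namespace, MetricData=data)


# Hypothetical latency metric published at 1-second resolution
latency = build_datum(
    "CheckoutLatency", 182.0, unit="Milliseconds",
    high_resolution=True, Environment="production",
)
```

Sending several entries in one publish call keeps the number of API requests down, which matters when metrics are emitted from a hot code path.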

3. Alarms and Composite Alarms

Create an alarm on the Lambda error count with SNS notification. It fires when a function records 5 or more errors per minute for 3 consecutive minutes (replace the placeholder 12-digit account ID with your own):

aws cloudwatch put-metric-alarm \
  --alarm-name "lambda-high-error-rate" \
  --alarm-description "5 or more Lambda errors per minute for 3 minutes" \
  --namespace AWS/Lambda \
  --metric-name Errors \
  --dimensions Name=FunctionName,Value=myapp-processor \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 5 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --treat-missing-data notBreaching \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:myapp-alerts \
  --ok-actions arn:aws:sns:us-east-1:123456789012:myapp-alerts
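
The CLI alarm above compares the Errors sum with a fixed number. If a percentage is wanted instead, one option is a metric math alarm that divides Errors by Invocations. The sketch below builds the put_metric_alarm arguments for that case; the function name error_rate_alarm, the generated alarm name and the 5% default are assumptions for illustration.

```python
def error_rate_alarm(function_name: str, topic_arn: str, threshold_pct: float = 5.0) -> dict:
    """Build put_metric_alarm arguments for a Lambda error-rate (%) alarm."""

    def lambda_sum(metric_id: str, metric_name: str) -> dict:
        # Input series for the expression; ReturnData=False keeps it out of the alarm
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": 60,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"{function_name}-error-rate-pct",
        "Metrics": [
            lambda_sum("errors", "Errors"),
            lambda_sum("invocations", "Invocations"),
            {
                "Id": "rate",
                "Expression": "100 * errors / invocations",
                "Label": "Error rate (%)",
                "ReturnData": True,  # the alarm evaluates this series
            },
        ],
        "EvaluationPeriods": 3,
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def create_alarm(function_name: str, topic_arn: str, region: str = "us-east-1") -> None:
    """Create or update the alarm (needs credentials)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region)
    client.put_metric_alarm(**error_rate_alarm(function_name, topic_arn))
```

A rate-based alarm scales with traffic, whereas a count-based threshold has to be retuned as invocation volume grows.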

Composite alarms combine multiple alarms with AND/OR logic, which reduces alert noise because a notification is sent only when the combined rule is true:

aws cloudwatch put-composite-alarm \
  --alarm-name "service-degraded" \
  --alarm-rule "ALARM(lambda-high-error-rate) AND ALARM(alb-5xx-high)" \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:myapp-critical

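Pro Tip: Use --treat-missing-data notBreaching for metrics that are only published when events occur (like Lambda Errors during low-traffic hours). With the default setting (missing), an alarm whose evaluation window contains no data points moves to INSUFFICIENT_DATA, and with breaching it fires at night simply because nothing ran.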

4. CloudWatch Logs and Log Groups

Structure your logs as JSON so that individual fields can be queried, and avoid plain text logs in production:

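import json
import logging
import time

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        log_obj = {
            # Use the record's creation time, not the time of formatting
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        # request_id exists only when the caller passes extra={"request_id": ...}
        if hasattr(record, 'request_id'):
            log_obj['request_id'] = record.request_id
        return json.dumps(log_obj)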

logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)
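
The formatter above only emits request_id when the record carries that attribute, which the standard logging module sets from the extra argument. The sketch below shows that mechanism in isolation. MiniJsonFormatter is a reduced stand-in for the JsonFormatter above so that the snippet runs on its own, and the logger name and request ID are made up.

```python
import io
import json
import logging


class MiniJsonFormatter(logging.Formatter):
    """Stand-in for the JsonFormatter above, reduced to what the demo needs."""

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        if hasattr(record, "request_id"):
            entry["request_id"] = record.request_id
        return json.dumps(entry)


stream = io.StringIO()  # captures output so the result can be inspected
demo_handler = logging.StreamHandler(stream)
demo_handler.setFormatter(MiniJsonFormatter())

demo_logger = logging.getLogger("orders.demo")
demo_logger.addHandler(demo_handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo out of the root logger

# Keys passed through `extra` become attributes on the LogRecord
demo_logger.info("order processed", extra={"request_id": "req-123"})

print(stream.getvalue().strip())
# {"level": "INFO", "message": "order processed", "request_id": "req-123"}
```

Once request_id is a JSON field, it can be used directly in a Logs Insights filter to pull every line for one request.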

Set log group retention to avoid unbounded storage costs:

aws logs put-retention-policy \
  --log-group-name /aws/lambda/myapp-processor \
  --retention-in-days 30
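
Log groups created automatically (for example by Lambda) have no retention setting unless one is applied. As a rough sketch, the same policy could be applied to every such group with boto3; the helper names are hypothetical, and the 30-day default simply mirrors the CLI example above.

```python
def groups_without_retention(log_groups):
    """Return names of log groups that never expire (no retentionInDays key)."""
    return [g["logGroupName"] for g in log_groups if "retentionInDays" not in g]


def apply_default_retention(days=30, region="us-east-1"):
    """Set a retention policy on every group that lacks one (needs credentials)."""
    import boto3  # lazy import keeps the filter helper usable without boto3

    logs = boto3.client("logs", region_name=region)
    updated = []
    # describe_log_groups is paginated, so walk every page
    for page in logs.get_paginator("describe_log_groups").paginate():
        for name in groups_without_retention(page["logGroups"]):
            logs.put_retention_policy(logGroupName=name, retentionInDays=days)
            updated.append(name)
    return updated
```

Running groups_without_retention on its own first gives a dry-run list before anything is changed.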

5. Logs Insights Queries

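CloudWatch Logs Insights lets you query one or more log groups with a pipe-delimited query language. The time range is chosen when the query is run, not inside the query text. Essential queries: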

# Find the 10 slowest Lambda invocations (set the query time range to the last hour)
fields @timestamp, @duration, @requestId
| filter @type = "REPORT"
| sort @duration desc
| limit 10

# Count errors by type
fields @timestamp, level, message
| filter level = "ERROR"
| stats count(*) as error_count by message
| sort error_count desc

# P99 API response times
fields @timestamp, responseTime
| filter ispresent(responseTime)
| stats pct(responseTime, 99) as p99, pct(responseTime, 95) as p95, avg(responseTime) as avg_ms
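
Queries like these can also be run outside the console. The following is a minimal sketch using the boto3 logs client: start_query submits the query, and get_query_results is polled until it finishes. The function names, the 60-minute default window and the polling interval are illustrative choices.

```python
import time


def rows_to_dicts(results):
    """Flatten Logs Insights rows ([{field, value}, ...]) into plain dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(log_group, query, minutes=60, region="us-east-1", poll_seconds=1.0):
    """Run a Logs Insights query and return its rows (needs credentials)."""
    import boto3  # lazy import keeps rows_to_dicts usable without boto3

    logs = boto3.client("logs", region_name=region)
    end = int(time.time())
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=end - minutes * 60,  # epoch seconds
        endTime=end,
        queryString=query,
    )["queryId"]

    # Queries run asynchronously, so poll until a terminal status is reached
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    return rows_to_dicts(response["results"])
```

All values come back as strings, so numeric fields such as @duration need converting before further arithmetic.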

6. CloudWatch Agent

The CloudWatch Agent runs on EC2 and collects OS-level metrics (memory, disk) that aren't available by default. Install and configure:

# Install on Amazon Linux 2023
sudo dnf install amazon-cloudwatch-agent -y

# Create config file
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

A minimal agent config for memory and disk:

{
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["/"]
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [{
          "file_path": "/var/log/myapp/app.log",
          "log_group_name": "/ec2/myapp",
          "log_stream_name": "{instance_id}"
        }]
      }
    }
  }
}

7. Container Insights for EKS

Container Insights collects cluster, node, pod and container-level metrics from EKS. Enable it with the Amazon CloudWatch Observability add-on (replace the placeholder 12-digit account ID with your own):

aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name amazon-cloudwatch-observability \
  --service-account-role-arn arn:aws:iam::123456789012:role/CloudWatchAgentRole

This deploys the CloudWatch agent as a DaemonSet and enables Container Insights metrics in the ContainerInsights namespace, including pod_cpu_utilization, pod_memory_utilization, node_cpu_utilization, and cluster_failed_node_count.
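
Once the add-on is publishing, these metrics can be read like any other CloudWatch metric. The sketch below builds a get_metric_data query for pod_cpu_utilization. It assumes the metric is published with the ClusterName, Namespace and PodName dimensions, which is worth confirming in the console for your cluster; the function names and the pod name in the example are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def pod_cpu_query(cluster, namespace, pod, period=60):
    """Build the MetricDataQueries list for one pod's CPU utilization."""
    return [{
        "Id": "pod_cpu",
        "MetricStat": {
            "Metric": {
                "Namespace": "ContainerInsights",
                "MetricName": "pod_cpu_utilization",
                # Assumed dimension set; verify against your published metrics
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "Namespace", "Value": namespace},
                    {"Name": "PodName", "Value": pod},
                ],
            },
            "Period": period,
            "Stat": "Average",
        },
        "ReturnData": True,
    }]


def fetch_pod_cpu(cluster, namespace, pod, minutes=30, region="us-east-1"):
    """Return (timestamp, value) pairs for the pod (needs credentials)."""
    import boto3

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch", region_name=region).get_metric_data(
        MetricDataQueries=pod_cpu_query(cluster, namespace, pod),
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
    )
    result = response["MetricDataResults"][0]
    return list(zip(result["Timestamps"], result["Values"]))
```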

8. Dashboards

Create a dashboard via CLI with widgets as JSON (each metric widget names the region its metrics come from):

aws cloudwatch put-dashboard \
  --dashboard-name "MyApp-Production" \
  --dashboard-body '{
    "widgets": [
      {
        "type": "metric",
        "properties": {
          "title": "Lambda Errors",
          "metrics": [["AWS/Lambda","Errors","FunctionName","myapp-processor"]],
          "region": "us-east-1",
          "period": 60,
          "stat": "Sum",
          "view": "timeSeries"
        }
      }
    ]
  }'
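
Hand-written dashboard JSON becomes hard to maintain as widgets are added. One alternative, sketched here under the assumption of a 24-column grid with two half-width widgets, is to generate the body in Python and pass it to put_dashboard. The helper names and the second (Throttles) widget are illustrative additions.

```python
import json


def metric_widget(title, metric, region="us-east-1", x=0, y=0, width=12, height=6):
    """Return one metric widget; `metric` is [namespace, name, dim_name, dim_value]."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": width, "height": height,
        "properties": {
            "title": title,
            "metrics": [metric],
            "region": region,
            "period": 60,
            "stat": "Sum",
            "view": "timeSeries",
        },
    }


def dashboard_body(function_name):
    """Serialize a two-widget dashboard for one Lambda function."""
    widgets = [
        metric_widget(
            "Lambda Errors",
            ["AWS/Lambda", "Errors", "FunctionName", function_name],
        ),
        metric_widget(
            "Lambda Throttles",
            ["AWS/Lambda", "Throttles", "FunctionName", function_name],
            x=12,  # place beside the first widget
        ),
    ]
    return json.dumps({"widgets": widgets})


def publish_dashboard(name, function_name, region="us-east-1"):
    """Create or replace the dashboard (needs credentials)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region)
    return client.put_dashboard(
        DashboardName=name, DashboardBody=dashboard_body(function_name)
    )
```

Keeping the builder in version control gives dashboards the same review and rollback path as the rest of the infrastructure code.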

Frequently Asked Questions

Why can't I see memory usage for my EC2 instances in CloudWatch?

Memory and disk utilization are OS-level metrics that AWS cannot access from the hypervisor. Install and configure the CloudWatch Agent on your EC2 instances to collect and publish these metrics to the custom CWAgent namespace.

How much does CloudWatch cost?

Key cost drivers (approximate list prices; rates vary by region and usage tier, so check the CloudWatch pricing page): custom metrics ($0.30/metric/month after 10 free), Logs ingestion ($0.50/GB), Logs storage ($0.03/GB/month), Logs Insights queries ($0.005 per GB scanned). Avoid ingesting large volumes of verbose logs: use structured JSON and filter at the source.
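
As a worked example of how these figures combine, the sketch below adds up a monthly bill from the four rates listed above. The rates are copied from this answer and should be treated as placeholders to update, and the workload numbers in the example are invented.

```python
# Rates taken from the answer above (USD); update them for your region
PRICES = {
    "custom_metric": 0.30,  # per metric per month
    "free_metrics": 10,     # metrics included at no charge
    "ingest_gb": 0.50,      # per GB of logs ingested
    "storage_gb": 0.03,     # per GB-month of logs stored
    "insights_gb": 0.005,   # per GB scanned by Logs Insights
}


def estimate_monthly_cost(custom_metrics, ingest_gb, stored_gb, scanned_gb):
    """Return an estimated monthly CloudWatch bill in USD, rounded to cents."""
    billable_metrics = max(0, custom_metrics - PRICES["free_metrics"])
    total = (
        billable_metrics * PRICES["custom_metric"]
        + ingest_gb * PRICES["ingest_gb"]
        + stored_gb * PRICES["storage_gb"]
        + scanned_gb * PRICES["insights_gb"]
    )
    return round(total, 2)


# Invented workload: 60 custom metrics, 200 GB ingested,
# 500 GB stored, 1000 GB scanned by queries
example = estimate_monthly_cost(60, 200, 500, 1000)
print(example)  # 135.0
```

In this example log ingestion accounts for 100 of the 135 dollars, which is why reducing log volume at the source tends to be the most effective saving.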

What is the difference between a metric filter and Logs Insights?

Metric filters run continuously and convert matching log events into CloudWatch metrics in real time, which makes them useful for creating alarms on log patterns. Logs Insights is an ad-hoc query engine for historical analysis. Use metric filters for alerting, Logs Insights for investigation.
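
For JSON logs that carry a level field, a metric filter could look like the following sketch, which builds the arguments for put_metric_filter. The filter name, metric name and MyApp/Logs namespace are hypothetical.

```python
def error_metric_filter(log_group, namespace="MyApp/Logs"):
    """Build put_metric_filter arguments that count JSON events with level=ERROR."""
    return {
        "logGroupName": log_group,
        "filterName": "json-error-count",
        # JSON filter pattern: match events whose `level` field equals ERROR
        "filterPattern": '{ $.level = "ERROR" }',
        "metricTransformations": [{
            "metricName": "ErrorCount",
            "metricNamespace": namespace,
            "metricValue": "1",   # add 1 to the metric per matching event
            "defaultValue": 0.0,  # value reported when nothing matches
        }],
    }


def create_filter(log_group, region="us-east-1"):
    """Create or update the metric filter (needs credentials)."""
    import boto3

    logs = boto3.client("logs", region_name=region)
    logs.put_metric_filter(**error_metric_filter(log_group))
```

The resulting ErrorCount metric can then be used as the source of an ordinary metric alarm.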

How do I avoid CloudWatch alarm noise?

Use composite alarms to require multiple conditions before alerting, require M out of N datapoints to breach (--datapoints-to-alarm together with --evaluation-periods), use --treat-missing-data notBreaching for sparse metrics, and suppress notifications during maintenance windows by setting an actions suppressor alarm (--actions-suppressor) on the parent composite alarm.
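
The effect of requiring several breaching datapoints within an evaluation window can be seen in a small simulation. This is a simplified model for intuition only: it ignores missing data and the other details of CloudWatch's real evaluation, and the function name is hypothetical.

```python
def m_of_n_alarm(values, threshold, datapoints_to_alarm, evaluation_periods):
    """Return True if enough of the most recent datapoints breach the threshold.

    Simplified model: looks at the last `evaluation_periods` values and
    counts how many are greater than or equal to `threshold`.
    """
    window = values[-evaluation_periods:]
    breaching = sum(1 for v in window if v >= threshold)
    return breaching >= datapoints_to_alarm


# A single spike does not alarm when 3 of 5 datapoints must breach
single_spike = m_of_n_alarm([0, 9, 0, 0, 0], threshold=5,
                            datapoints_to_alarm=3, evaluation_periods=5)

# Repeated breaches within the same window do
sustained = m_of_n_alarm([6, 0, 7, 8, 1], threshold=5,
                         datapoints_to_alarm=3, evaluation_periods=5)

print(single_spike, sustained)  # False True
```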
