AWS Backup: Centralized Data Protection Strategy

Managing backups across dozens of AWS services — EC2 volumes, RDS databases, DynamoDB tables, EFS file systems, S3 buckets — using individual service controls quickly becomes unmanageable. AWS Backup is a fully managed service that centralizes and automates data protection across AWS services and hybrid environments. This guide covers everything from creating your first backup plan to advanced cross-account, cross-region protection with Audit Manager compliance reporting.

Core Concepts: Plans, Vaults, and Rules
Creating a Backup Plan
Assigning Resources and Tag-Based Selection
Cross-Region and Cross-Account Backup
Service-Specific Backup Configurations
AWS Backup Audit Manager
Restore Testing and Validation
Cost Optimization Tips

1. Core Concepts: Plans, Vaults, and Rules

Before diving into configuration, it's important to understand the core building blocks of AWS Backup:

Backup Plan — a policy that defines when to back up, how long to retain backups, and where to store them. A plan contains one or more backup rules.
Backup Rule — specifies the schedule (cron or rate), backup window, lifecycle (warm/cold storage transition and deletion), and the target vault.
Backup Vault — an encrypted logical container for recovery points (backups). Each vault has an access policy and an optional vault lock for immutability.
Recovery Point — a snapshot or backup of a specific resource at a point in time. Identified by a unique ARN.
Resource Assignment — rules that map AWS resources (by ARN, tag, or resource type) to a backup plan.

Key Insight: For services such as EC2 and RDS, AWS Backup does not store data differently from native service backups: an RDS recovery point is still an RDS snapshot, and an EC2 recovery point is still an AMI backed by EBS snapshots. (EFS and S3 recovery points, by contrast, live in AWS Backup-managed storage.) The value is centralized scheduling, policy enforcement, cross-service visibility, and compliance reporting.

Supported Services

As of 2026, AWS Backup supports the following services:

Service	Resource Types	Notes
Amazon EC2	Instances, EBS Volumes	AMI + EBS snapshots
Amazon RDS	DB Instances, Clusters	Includes Aurora
Amazon DynamoDB	Tables	On-demand and continuous backups
Amazon EFS	File Systems	Full and incremental backups
Amazon S3	Buckets	Object-level continuous backup
Amazon FSx	Windows, Lustre, NetApp ONTAP, OpenZFS	File system backups
AWS Storage Gateway	Volumes	Hybrid on-premises volumes
Amazon Redshift	Clusters	Manual cluster snapshots
Amazon DocumentDB	Clusters	Includes cluster snapshots
Amazon Neptune	Clusters	Graph database backups
VMware Cloud on AWS	VMs	Hybrid workloads

2. Creating a Backup Plan

You can create backup plans via the console, CLI, CloudFormation, or Terraform. Here's a production-ready plan using the AWS CLI that creates daily backups retained for 35 days, plus monthly backups that move to cold storage after 7 days and are retained for 1 year:

# 1. Create a backup vault with KMS encryption
aws backup create-backup-vault \
  --backup-vault-name prod-backup-vault \
  --encryption-key-arn arn:aws:kms:us-east-1:123456789012:key/mrk-abc123 \
  --backup-vault-tags Environment=production,Team=platform

# 2. Create the backup plan JSON
cat > backup-plan.json <<'EOF'
{
  "BackupPlanName": "prod-backup-plan",
  "Rules": [
    {
      "RuleName": "daily-backup",
      "TargetBackupVaultName": "prod-backup-vault",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "StartWindowMinutes": 60,
      "CompletionWindowMinutes": 180,
      "Lifecycle": {
        "DeleteAfterDays": 35
      },
      "CopyActions": []
    },
    {
      "RuleName": "monthly-backup",
      "TargetBackupVaultName": "prod-backup-vault",
      "ScheduleExpression": "cron(0 5 1 * ? *)",
      "StartWindowMinutes": 60,
      "CompletionWindowMinutes": 360,
      "Lifecycle": {
        "MoveToColdStorageAfterDays": 7,
        "DeleteAfterDays": 365
      },
      "CopyActions": []
    }
  ]
}
EOF

# 3. Create the backup plan
aws backup create-backup-plan \
  --backup-plan file://backup-plan.json

Schedule Tip: Use cron(0 5 ? * * *) to run at 05:00 UTC daily. The ? in the day-of-month position means "any" — required because you cannot specify both day-of-month and day-of-week simultaneously. Set the start window large enough (60+ minutes) to accommodate service throttling during peak backup periods.
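
The day-of-month/day-of-week rule is easy to break when schedules are edited by hand. The following is a minimal pre-flight sketch, assuming the six-field cron(...) form used in this guide. `check_backup_cron` is a hypothetical helper, not part of the AWS CLI or SDK, and it checks only the field count and the ? rule, not the full cron grammar.

```python
def check_backup_cron(expression: str) -> list:
    """Return a list of problems found in an AWS-style cron(...) schedule."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        return ["schedule must be wrapped in cron(...)"]
    fields = expression[len("cron("):-1].split()
    if len(fields) != 6:
        return [f"expected 6 fields, found {len(fields)}"]
    _minutes, _hours, day_of_month, _month, day_of_week, _year = fields
    problems = []
    # A value or * in one of the two day fields requires ? in the other.
    if day_of_month != "?" and day_of_week != "?":
        problems.append(
            "day-of-month and day-of-week cannot both be set; use ? in one of them"
        )
    return problems


if __name__ == "__main__":
    for schedule in ("cron(0 5 ? * * *)", "cron(0 5 1 * ? *)", "cron(0 5 1 * 1 *)"):
        print(schedule, "->", check_backup_cron(schedule) or "ok")
```

Running a check like this in CI before `create-backup-plan` catches the most common schedule mistake without calling AWS.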

Vault Lock for Immutability

For compliance requirements (HIPAA, PCI-DSS, SEC Rule 17a-4), enable Vault Lock to make recovery points immutable during the retention period. In compliance mode, even the root account cannot delete them:

# Enable vault lock in compliance mode with a 14-day cooling-off period
# (omit --changeable-for-days to create the lock in governance mode instead)
aws backup put-backup-vault-lock-configuration \
  --backup-vault-name prod-backup-vault \
  --min-retention-days 7 \
  --max-retention-days 365 \
  --changeable-for-days 14

# During the 14 days the lock can still be changed or removed; after that it is IRREVERSIBLE
# In compliance mode even AWS Support cannot delete recovery points
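
Two consequences of the lock settings above are worth checking before applying them: when the cooling-off period ends, and whether each rule's retention fits between the minimum and maximum. The sketch below illustrates both with plain date arithmetic. The function names are hypothetical, the date is approximate (the authoritative value is the LockDate returned by describe-backup-vault), and the 3-day floor on the cooling-off period is taken from the AWS Backup documentation.

```python
from datetime import date, timedelta


def lock_becomes_immutable_on(configured_on: date, changeable_for_days: int) -> date:
    """Approximate date on which a compliance-mode lock can no longer be changed."""
    if changeable_for_days < 3:
        raise ValueError("the cooling-off period must be at least 3 days")
    return configured_on + timedelta(days=changeable_for_days)


def retention_fits_lock(delete_after_days: int, min_days: int, max_days: int) -> bool:
    """True when a rule's retention falls inside the vault lock's allowed range."""
    return min_days <= delete_after_days <= max_days


if __name__ == "__main__":
    print(lock_becomes_immutable_on(date(2026, 1, 1), 14))  # 2026-01-15
    for retention in (5, 35, 365, 400):
        print(retention, retention_fits_lock(retention, 7, 365))
```

Backup and copy jobs whose retention falls outside the locked range fail, so a check like this is a cheap guard when plans and vault locks are managed by different teams.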

3. Assigning Resources and Tag-Based Selection

Resource assignments tell AWS Backup which resources a plan protects. The most scalable approach is tag-based assignment — any resource with a matching tag is automatically included, including resources created after the plan was set up.

# Get the backup plan ID from the create output
PLAN_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

# Create a tag-based resource assignment
# (entries under Conditions are combined with AND, so both tags must match)
aws backup create-backup-selection \
  --backup-plan-id "$PLAN_ID" \
  --backup-selection '{
    "SelectionName": "prod-tagged-resources",
    "IamRoleArn": "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",
    "Resources": ["*"],
    "NotResources": [],
    "Conditions": {
      "StringEquals": [
        {
          "ConditionKey": "aws:ResourceTag/Backup",
          "ConditionValue": "daily"
        },
        {
          "ConditionKey": "aws:ResourceTag/Environment",
          "ConditionValue": "production"
        }
      ]
    }
  }'

With this setup, any resource tagged Backup=daily and Environment=production — whether it's an EC2 instance, RDS cluster, or DynamoDB table — is automatically enrolled in the backup plan. Apply these tags in your infrastructure-as-code (Terraform/CloudFormation) to ensure every new resource is backed up on day one.
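
To make the matching rule concrete, here is a small offline simulation of "both tags must match". It does not call AWS; the inventory and the `is_selected` helper are hypothetical and exist only to show which resources the two tag conditions would pick up, including the fact that tag keys are case-sensitive.

```python
def is_selected(resource_tags: dict, required_tags: dict) -> bool:
    """True when every required tag is present with exactly the required value."""
    return all(resource_tags.get(key) == value for key, value in required_tags.items())


REQUIRED = {"Backup": "daily", "Environment": "production"}

# Hypothetical inventory, for illustration only
inventory = {
    "orders-db": {"Backup": "daily", "Environment": "production"},
    "staging-db": {"Backup": "daily", "Environment": "staging"},
    "cache-node": {"Environment": "production"},
    "legacy-app": {"backup": "daily", "Environment": "production"},  # wrong key case
}

selected = sorted(name for name, tags in inventory.items() if is_selected(tags, REQUIRED))
print(selected)  # ['orders-db']
```

Only `orders-db` qualifies: one resource has the wrong environment, one is missing the Backup tag, and one uses a lowercase key that does not match.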

IAM Role: The AWSBackupDefaultServiceRole is created automatically when you first use AWS Backup in the console. For CLI/IaC workflows, attach the managed policies AWSBackupServiceRolePolicyForBackup and AWSBackupServiceRolePolicyForRestores to a custom role.
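
For CLI/IaC workflows, the custom role can be sketched as follows. This is a minimal illustration, not a hardened setup: `create_backup_role` is a hypothetical helper, the role name is a placeholder, and the caller passes in an IAM client (for example `boto3.client("iam")`) so the function itself has no AWS dependency. The trust policy lets the AWS Backup service principal assume the role.

```python
import json

# Trust policy allowing the AWS Backup service to assume the role
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "backup.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policies named in the note above
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/service-role/AWSBackupServiceRolePolicyForBackup",
    "arn:aws:iam::aws:policy/service-role/AWSBackupServiceRolePolicyForRestores",
]


def create_backup_role(iam_client, role_name: str) -> None:
    """Create the role and attach the backup and restore managed policies."""
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    for policy_arn in MANAGED_POLICY_ARNS:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)


# Usage (requires boto3 and credentials):
#   import boto3
#   create_backup_role(boto3.client("iam"), "prod-backup-role")
```

Passing the client in keeps the helper easy to test with a stub and leaves region and credential handling to the caller.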

4. Cross-Region and Cross-Account Backup

A single-region backup does not protect against an AWS region outage. Cross-region copy creates a second recovery point in another region automatically as part of the backup rule. For even stronger isolation, cross-account copy places backups in a separate AWS account where a compromised production account cannot delete them.

Cross-Region Copy

{
  "RuleName": "daily-with-cross-region-copy",
  "TargetBackupVaultName": "prod-backup-vault",
  "ScheduleExpression": "cron(0 5 ? * * *)",
  "StartWindowMinutes": 60,
  "CompletionWindowMinutes": 480,
  "Lifecycle": {
    "DeleteAfterDays": 35
  },
  "CopyActions": [
    {
      "DestinationBackupVaultArn": "arn:aws:backup:eu-west-1:123456789012:backup-vault:prod-backup-vault-dr",
      "Lifecycle": {
        "DeleteAfterDays": 35
      }
    }
  ]
}

Cross-Account Copy with AWS Organizations

Cross-account backup requires that both accounts belong to the same AWS Organization, that cross-account backup is enabled for the organization, and that the destination vault's access policy lets the source account copy into it:

# Step 1: In the destination (backup) account, allow the production account to copy into the vault
aws backup put-backup-vault-access-policy \
  --backup-vault-name backup-account-vault \
  --policy '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "AWS": "arn:aws:iam::PROD-ACCOUNT-ID:root"
        },
        "Action": [
          "backup:CopyIntoBackupVault"
        ],
        "Resource": "*"
      }
    ]
  }'

# Step 2: In backup plan CopyActions, reference the cross-account vault ARN
# "DestinationBackupVaultArn": "arn:aws:backup:us-east-1:BACKUP-ACCOUNT-ID:backup-vault:backup-account-vault"

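# Step 3: Enable cross-account backup for the organization (run in the management account)
aws backup update-global-settings \
  --global-settings isCrossAccountBackupEnabled=true

# Optional: publish vault job events to SNS so failed copies are noticed.
# Failed backup and restore jobs arrive as *_COMPLETED events with a failure state.
aws backup put-backup-vault-notifications \
  --backup-vault-name prod-backup-vault \
  --sns-topic-arn arn:aws:sns:us-east-1:123456789012:backup-notifications \
  --backup-vault-events BACKUP_JOB_STARTED BACKUP_JOB_COMPLETED \
    COPY_JOB_FAILED RESTORE_JOB_COMPLETED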

Best Practice — 3-2-1 Rule: Keep 3 copies of data, on 2 different storage types, with 1 copy in a different account or region. AWS Backup makes the 3-2-1 rule achievable through a single backup plan with two CopyActions: one cross-region and one cross-account.
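
A rule with two CopyActions can be sketched as a plain dictionary that is then serialized into the plan JSON. The helper names and the second account ID (999999999999) are hypothetical; the vault names reuse the examples from this section. The second function parses the destination ARNs so you can confirm that one copy really lands in another region and one in another account.

```python
def rule_with_copies(name, vault_name, schedule, delete_after_days, destination_vault_arns):
    """Build a backup rule dict with one CopyAction per destination vault."""
    lifecycle = {"DeleteAfterDays": delete_after_days}
    return {
        "RuleName": name,
        "TargetBackupVaultName": vault_name,
        "ScheduleExpression": schedule,
        "Lifecycle": dict(lifecycle),
        "CopyActions": [
            {"DestinationBackupVaultArn": arn, "Lifecycle": dict(lifecycle)}
            for arn in destination_vault_arns
        ],
    }


def copy_locations(rule):
    """Return (region, account) pairs parsed from the destination vault ARNs."""
    return [
        tuple(action["DestinationBackupVaultArn"].split(":")[3:5])
        for action in rule["CopyActions"]
    ]


rule = rule_with_copies(
    "daily-3-2-1",
    "prod-backup-vault",
    "cron(0 5 ? * * *)",
    35,
    [
        "arn:aws:backup:eu-west-1:123456789012:backup-vault:prod-backup-vault-dr",
        "arn:aws:backup:us-east-1:999999999999:backup-vault:backup-account-vault",
    ],
)
print(copy_locations(rule))
# [('eu-west-1', '123456789012'), ('us-east-1', '999999999999')]
```

Dumping `rule` with `json.dumps` yields an entry that can be placed in the Rules array of the plan file shown earlier.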

5. Service-Specific Backup Configurations

While AWS Backup provides a unified interface, each service has nuances worth understanding.

EC2 Instance Backup

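EC2 backups create an AMI (Amazon Machine Image) plus EBS snapshots for all attached volumes. These backups are crash-consistent by default and are taken without rebooting the instance. For application consistency, enable the WindowsVSS advanced backup setting on Windows instances, or quiesce the application yourself, for example with an SSM document that you run shortly before the backup window:

# Tag EC2 instances so the tag-based selection picks them up
# (tag keys starting with "aws:" are reserved and cannot be set by users)
aws ec2 create-tags \
  --resources i-0abc1234def567890 \
  --tags Key=Backup,Value=daily \
         Key=Environment,Value=production

# For application-consistent backups on Linux, create an SSM document that
# quiesces the application. AWS Backup does not invoke it; trigger it yourself
# (for example from a scheduled EventBridge rule) before the backup window.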
aws ssm create-document \
  --name "PreBackupHook-AppQuiesce" \
  --document-type "Command" \
  --content '{
    "schemaVersion": "2.2",
    "description": "Quiesce application before backup",
    "mainSteps": [
      {
        "action": "aws:runShellScript",
        "name": "quiesceApp",
        "inputs": {
          "runCommand": [
            "systemctl stop myapp || true",
            "sync && echo 3 > /proc/sys/vm/drop_caches"
          ]
        }
      }
    ]
  }'

RDS and Aurora Continuous Backup

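AWS Backup integrates with RDS automated backups and adds centralized management and cross-account copy. For point-in-time recovery (PITR) of RDS and Aurora, set EnableContinuousBackup to true on the backup rule; continuous recovery points can be retained for at most 35 days. DynamoDB PITR, by contrast, is switched on per table:

# Enable PITR (continuous backups) on a DynamoDB table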
aws dynamodb update-continuous-backups \
  --table-name orders-prod \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true

# For RDS, verify backup retention via AWS Backup
aws backup list-recovery-points-by-resource \
  --resource-arn arn:aws:rds:us-east-1:123456789012:db:myapp-prod \
  --query 'RecoveryPoints[*].{Created:CreationDate,Status:Status,Size:BackupSizeInBytes}' \
  --output table

EFS Backup

EFS backups use AWS Backup natively (the old EFS backup service was deprecated). Unlike EBS snapshots, EFS recovery points are stored in AWS Backup-managed storage inside the vault, and they are incremental after the first full backup:

# Verify EFS backup is enabled (automatic backup creates a plan)
aws efs describe-backup-policy \
  --file-system-id fs-0abc1234

# Response shows ENABLED or DISABLED
# To enable automatic EFS backup:
aws efs put-backup-policy \
  --file-system-id fs-0abc1234 \
  --backup-policy Status=ENABLED

S3 Continuous Backup

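S3 backup in AWS Backup provides object-level point-in-time recovery, which is useful when S3 Versioning alone isn't sufficient (for example, when you need to restore a bucket to a specific timestamp across millions of objects). Versioning must be enabled on the bucket, and the backup role needs the AWSBackupServiceRolePolicyForS3Backup and AWSBackupServiceRolePolicyForS3Restore managed policies: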

import boto3

backup = boto3.client('backup')

# Create a backup plan with S3 continuous backup rule
response = backup.create_backup_plan(
    BackupPlan={
        'BackupPlanName': 's3-continuous-backup',
        'Rules': [
            {
                'RuleName': 's3-pitr-rule',
                'TargetBackupVaultName': 'prod-backup-vault',
                'ScheduleExpression': 'cron(0 5 ? * * *)',
                'EnableContinuousBackup': True,  # Enables PITR for S3
                'Lifecycle': {
                    'DeleteAfterDays': 35
                }
            }
        ]
    }
)

plan_id = response['BackupPlanId']
print(f"S3 backup plan created: {plan_id}")

# Assign specific S3 bucket
backup.create_backup_selection(
    BackupPlanId=plan_id,
    BackupSelection={
        'SelectionName': 's3-prod-buckets',
        'IamRoleArn': 'arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole',
        'Resources': [
            'arn:aws:s3:::my-prod-data-bucket'
        ]
    }
)
print("S3 bucket assigned to backup plan")

6. AWS Backup Audit Manager

AWS Backup Audit Manager evaluates your backup activity against compliance controls and generates reports. It answers questions like "Do all production resources have a backup plan?" and "Were any backups deleted within their retention period?"

Built-in Controls

Audit Manager provides pre-built controls you can enable without writing custom rules:

Control	What It Checks
BACKUP_PLAN_MIN_FREQUENCY_AND_MIN_RETENTION_CHECK	Resources backed up at least N times per period with Y-day retention
BACKUP_RECOVERY_POINT_ENCRYPTED	All recovery points are KMS-encrypted
BACKUP_RECOVERY_POINT_MINIMUM_RETENTION_CHECK	Recovery points meet minimum retention requirement
BACKUP_RESOURCES_PROTECTED_BY_BACKUP_PLAN	Tagged resources have an active backup plan
BACKUP_RECOVERY_POINT_MANUAL_DELETION_DISABLED	Vault access policy denies manual deletion of recovery points
RESTORE_TIME_FOR_RESOURCES_MEET_TARGET	Restore jobs complete within the target restore time

# Create a backup framework with compliance controls
# (framework and report plan names allow only letters, digits, and underscores)
aws backup create-framework \
  --framework-name "prod_compliance_framework" \
  --framework-controls '[
    {
      "ControlName": "BACKUP_RESOURCES_PROTECTED_BY_BACKUP_PLAN",
      "ControlInputParameters": [],
      "ControlScope": {
        "ComplianceResourceTypes": ["RDS", "DynamoDB", "EFS", "EC2"],
        "Tags": {"Environment": "production"}
      }
    },
    {
      "ControlName": "BACKUP_RECOVERY_POINT_ENCRYPTED",
      "ControlInputParameters": []
    },
    {
      "ControlName": "BACKUP_RECOVERY_POINT_MINIMUM_RETENTION_CHECK",
      "ControlInputParameters": [
        {
          "ParameterName": "requiredRetentionDays",
          "ParameterValue": "7"
        }
      ]
    }
  ]'

# Create a daily compliance report
# (replace the framework ARN below with the FrameworkArn returned by create-framework)
aws backup create-report-plan \
  --report-plan-name "daily_backup_compliance" \
  --report-delivery-channel '{
    "S3BucketName": "techoral-backup-reports",
    "S3KeyPrefix": "compliance/",
    "Formats": ["CSV", "JSON"]
  }' \
  --report-setting '{
    "ReportTemplate": "RESOURCE_COMPLIANCE_REPORT",
    "FrameworkArns": ["arn:aws:backup:us-east-1:123456789012:framework/prod_compliance_framework"]
  }' \
  --report-plan-tags Environment=production
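
Controls that take parameters are easy to mistype in hand-written JSON. As an illustration, the sketch below builds the frequency-and-retention control from the table above as a Python dictionary. The helper name is hypothetical, and the parameter names (requiredFrequencyUnit, requiredFrequencyValue, requiredRetentionDays) and the unit values "hours"/"days" are given from memory of the AWS Backup Audit Manager documentation, so confirm them there before relying on this.

```python
import json


def min_frequency_control(frequency_unit, frequency_value, retention_days, resource_types):
    """Build a framework control requiring a minimum backup frequency and retention."""
    if frequency_unit not in ("hours", "days"):
        raise ValueError("frequency_unit is assumed to be 'hours' or 'days'")
    return {
        "ControlName": "BACKUP_PLAN_MIN_FREQUENCY_AND_MIN_RETENTION_CHECK",
        "ControlInputParameters": [
            {"ParameterName": "requiredFrequencyUnit", "ParameterValue": frequency_unit},
            # Parameter values are passed as strings
            {"ParameterName": "requiredFrequencyValue", "ParameterValue": str(frequency_value)},
            {"ParameterName": "requiredRetentionDays", "ParameterValue": str(retention_days)},
        ],
        "ControlScope": {"ComplianceResourceTypes": list(resource_types)},
    }


control = min_frequency_control("days", 1, 35, ["RDS", "EFS"])
print(json.dumps([control], indent=2))
```

The printed JSON array can be passed to `--framework-controls` alongside the other controls.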

Compliance Automation: Integrate Audit Manager reports with AWS Security Hub or send them to an SNS topic for Slack/PagerDuty alerts. A non-compliant resource (one that misses a backup window or fails encryption check) can trigger an automated remediation Lambda that enforces the correct backup plan.

7. Restore Testing and Validation

A backup that has never been tested is not a backup; it's an assumption. AWS Backup Restore Testing (launched in November 2023) automates restore validation on a schedule. It restores a recovery point as a temporary resource, gives you a window in which to run your own validation, then deletes the restored resource automatically.

# Create a restore testing plan
# (plan and selection names allow only letters, digits, and underscores)
aws backup create-restore-testing-plan \
  --restore-testing-plan '{
    "RestoreTestingPlanName": "weekly_restore_validation",
    "ScheduleExpression": "cron(0 8 ? * 1 *)",
    "StartWindowHours": 2,
    "RecoveryPointSelection": {
      "Algorithm": "LATEST_WITHIN_WINDOW",
      "IncludeVaults": [
        "arn:aws:backup:us-east-1:123456789012:backup-vault:prod-backup-vault"
      ],
      "RecoveryPointTypes": ["SNAPSHOT", "CONTINUOUS"],
      "SelectionWindowDays": 7
    }
  }'

# Add an RDS restore testing selection with a validation window
# (AWS Backup generates the identifier of the restored instance)
aws backup create-restore-testing-selection \
  --restore-testing-plan-name "weekly_restore_validation" \
  --restore-testing-selection '{
    "RestoreTestingSelectionName": "rds_restore_test",
    "ProtectedResourceType": "RDS",
    "IamRoleArn": "arn:aws:iam::123456789012:role/RestoreTestingRole",
    "ProtectedResourceArns": [
      "arn:aws:rds:us-east-1:123456789012:db:myapp-prod"
    ],
    "RestoreMetadataOverrides": {
      "MultiAZ": "false",
      "DBInstanceClass": "db.t3.small"
    },
    "ValidationWindowHours": 4
  }'

AWS Backup does not run the validation itself. When a restore testing job completes, it emits a "Restore Job State Change" event to EventBridge; route that event to a Lambda function that performs application-level checks and reports the outcome with PutRestoreValidationResult:

import json

import boto3

def lambda_handler(event, context):
    """
    AWS Backup restore testing validation Lambda.
    Triggered by the EventBridge "Restore Job State Change" event;
    reports SUCCESSFUL or FAILED back to AWS Backup.
    """
    # Job details sit under "detail" in the EventBridge envelope
    detail = event.get('detail', {})
    restore_job_id = detail.get('restoreJobId')
    restored_resource_arn = detail.get('createdResourceArn')

    # Parse the restored RDS instance identifier from ARN
    # arn:aws:rds:us-east-1:123456789012:db:restore-test-abc123
    db_identifier = restored_resource_arn.split(':')[-1]

    rds = boto3.client('rds')
    backup = boto3.client('backup')

    # Wait for the restored instance to be available
    waiter = rds.get_waiter('db_instance_available')
    waiter.wait(DBInstanceIdentifier=db_identifier)

    # Describe instance to verify it came up correctly
    response = rds.describe_db_instances(DBInstanceIdentifier=db_identifier)
    instance = response['DBInstances'][0]

    checks = {
        'status_available': instance['DBInstanceStatus'] == 'available',
        'encrypted': instance['StorageEncrypted'],
        'engine_correct': instance['Engine'] == 'postgres',
        'storage_sufficient': instance['AllocatedStorage'] >= 50
    }

    all_passed = all(checks.values())
    status = 'SUCCESSFUL' if all_passed else 'FAILED'

    print(json.dumps({
        'restoreJobId': restore_job_id,
        'restoredArn': restored_resource_arn,
        'checks': checks,
        'result': status
    }))

    # Report the outcome to AWS Backup; the handler's return value alone is ignored
    backup.put_restore_validation_result(
        RestoreJobId=restore_job_id,
        ValidationStatus=status,
        ValidationStatusMessage=json.dumps(checks)
    )

    return {'restoreJobId': restore_job_id, 'validationStatus': status}

8. Cost Optimization Tips

AWS Backup costs are primarily driven by storage of recovery points. Here are proven strategies to minimize cost without compromising protection:

1. Use Cold Storage Tiering

Move recovery points to cold storage once the hot-restore window has passed. Cold storage is priced well below warm storage, although the exact discount depends on resource type and Region. Typical pattern: warm for 7–30 days, cold for the remainder:

{
  "Lifecycle": {
    "MoveToColdStorageAfterDays": 7,
    "DeleteAfterDays": 365
  }
}

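Note: a recovery point must stay in cold storage for at least 90 days, so DeleteAfterDays must be at least 90 days greater than MoveToColdStorageAfterDays. If your total retention is under 90 days, skip the cold tier. Cold storage is also available only for some resource types; for the others, AWS Backup ignores the transition setting.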
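
The 90-day constraint can be checked before a plan is submitted. The following is a minimal sketch: `lifecycle_problems` is a hypothetical helper that takes a Lifecycle dictionary in the shape used by the plan JSON and returns any violations it finds.

```python
MIN_DAYS_IN_COLD = 90


def lifecycle_problems(lifecycle: dict) -> list:
    """Return a list of violations of the cold storage retention rule."""
    move = lifecycle.get("MoveToColdStorageAfterDays")
    delete = lifecycle.get("DeleteAfterDays")
    problems = []
    # The rule only applies when both a transition and an expiry are set.
    if move is not None and delete is not None and delete < move + MIN_DAYS_IN_COLD:
        problems.append(
            f"DeleteAfterDays ({delete}) must be at least "
            f"MoveToColdStorageAfterDays ({move}) + {MIN_DAYS_IN_COLD}"
        )
    return problems


if __name__ == "__main__":
    for lifecycle in (
        {"MoveToColdStorageAfterDays": 7, "DeleteAfterDays": 365},
        {"MoveToColdStorageAfterDays": 30, "DeleteAfterDays": 35},
        {"DeleteAfterDays": 35},
    ):
        print(lifecycle, "->", lifecycle_problems(lifecycle) or "ok")
```

A lifecycle that moves to cold after 30 days but deletes after 35 is flagged, while warm-only retention passes untouched.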

2. Deduplicate with Incremental Backups

EFS and S3 backups are automatically incremental: only changed data is stored after the first full backup. EC2 (EBS) snapshots are also incremental at the block level, so only the first snapshot is full and subsequent ones store only changed blocks. DynamoDB backups, by contrast, are full backups each time.
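
A rough back-of-the-envelope model shows how much incremental backups save. The sketch below is a simplification under stated assumptions: a constant daily change rate, no overlap between the blocks changed on different days, and no compression. The function name and the example figures (500 GiB, 2% daily change, 35 retained backups) are illustrative, not measurements.

```python
def stored_gib(full_size_gib: float, daily_change_rate: float,
               retained_backups: int, incremental: bool = True) -> float:
    """Estimate total stored GiB for a chain of daily backups."""
    if retained_backups <= 0:
        return 0.0
    if not incremental:
        # Every backup is a full copy
        return full_size_gib * retained_backups
    # One full backup plus one day's worth of changes per later backup
    return full_size_gib + full_size_gib * daily_change_rate * (retained_backups - 1)


if __name__ == "__main__":
    full = stored_gib(500, 0.02, 35, incremental=False)
    incr = stored_gib(500, 0.02, 35, incremental=True)
    print(f"full copies: {full:.0f} GiB, incremental: {incr:.0f} GiB")
```

With these inputs the incremental chain stores about 840 GiB against 17,500 GiB for full copies, which is why retention length matters far less for incremental resource types than for full-backup ones.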

3. Right-Size Retention by Tier

Resource Tier	Recommended Retention	Rationale
Tier 1 (Critical prod DB)	35 days warm + 1 year cold	Compliance + disaster recovery
Tier 2 (Application servers)	14 days warm	Rollback window for deployments
Tier 3 (Dev/test environments)	7 days warm, no cold	Cost control — dev data is recreatable
Tier 4 (Ephemeral)	Exclude from backup	Stateless instances do not need backup
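
The tier table translates directly into Lifecycle settings. The mapping below is one possible encoding, assuming a hypothetical `BackupTier` tag with values tier1 to tier4; "35 days warm + 1 year cold" is read as a move to cold storage after 35 days and deletion after 35 + 365 = 400 days.

```python
# Lifecycle per tier; None means the tier is excluded from backup
RETENTION_BY_TIER = {
    "tier1": {"MoveToColdStorageAfterDays": 35, "DeleteAfterDays": 400},
    "tier2": {"DeleteAfterDays": 14},
    "tier3": {"DeleteAfterDays": 7},
    "tier4": None,
}


def lifecycle_for(tier: str):
    """Return a copy of the Lifecycle dict for a tier, or None if it is not backed up."""
    if tier not in RETENTION_BY_TIER:
        raise KeyError(f"unknown tier: {tier}")
    lifecycle = RETENTION_BY_TIER[tier]
    return dict(lifecycle) if lifecycle is not None else None


if __name__ == "__main__":
    for tier in ("tier1", "tier2", "tier3", "tier4"):
        print(tier, "->", lifecycle_for(tier))
```

Keeping the mapping in one place means a retention change is a one-line edit that every generated backup rule picks up.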

4. Monitor and Alert on Backup Size Growth

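# Alarm on an unexpected spike in completed backup jobs, a proxy for storage growth
# (the AWS/Backup namespace publishes job and recovery point counts, not bytes stored)
aws cloudwatch put-metric-alarm \
  --alarm-name "backup-job-count-spike" \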
  --metric-name "NumberOfBackupJobsCompleted" \
  --namespace "AWS/Backup" \
  --dimensions Name=BackupVaultName,Value=prod-backup-vault \
  --statistic Sum \
  --period 86400 \
  --threshold 1000 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:backup-alerts \
  --alarm-description "Alert when daily backup job count exceeds 1000"

# Use Cost Explorer to track backup costs by the Environment cost allocation tag
aws ce get-cost-and-usage \
  --time-period Start=2026-06-01,End=2026-06-07 \
  --granularity DAILY \
  --filter '{"Dimensions":{"Key":"SERVICE","Values":["AWS Backup"]}}' \
  --metrics BlendedCost \
  --group-by Type=TAG,Key=Environment

5. Delete Orphaned Recovery Points

When you delete a resource (EC2 instance, RDS database), its recovery points remain in the vault and continue to accrue storage costs until their lifecycle expires them. A periodic cleanup job, for example a scheduled Lambda function, can find and remove them:

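import boto3
from botocore.exceptions import ClientError

# Error codes that mean "the source resource is gone"
NOT_FOUND_CODES = {'DBInstanceNotFound', 'ResourceNotFoundException', 'InvalidVolume.NotFound'}

def source_exists(resource_arn: str) -> bool:
    """Best-effort check that a backed-up resource still exists.

    Handles RDS instances, DynamoDB tables, and EBS volumes; any other
    type is reported as existing so its recovery points are never deleted.
    """
    _, _, service, region, _, resource = resource_arn.split(':', 5)
    try:
        if service == 'rds' and resource.startswith('db:'):
            boto3.client('rds', region_name=region).describe_db_instances(
                DBInstanceIdentifier=resource[len('db:'):])
        elif service == 'dynamodb' and resource.startswith('table/'):
            boto3.client('dynamodb', region_name=region).describe_table(
                TableName=resource[len('table/'):])
        elif service == 'ec2' and resource.startswith('volume/'):
            boto3.client('ec2', region_name=region).describe_volumes(
                VolumeIds=[resource[len('volume/'):]])
        return True
    except ClientError as err:
        if err.response['Error']['Code'] in NOT_FOUND_CODES:
            return False
        raise

def cleanup_orphaned_recovery_points(vault_name: str, dry_run: bool = True):
    """Find and delete recovery points whose source resources no longer exist."""
    backup = boto3.client('backup')

    paginator = backup.get_paginator('list_recovery_points_by_backup_vault')
    orphaned = []

    for page in paginator.paginate(BackupVaultName=vault_name):
        for rp in page['RecoveryPoints']:
            resource_arn = rp['ResourceArn']
            rp_arn = rp['RecoveryPointArn']

            # Ask the owning service directly. describe_protected_resource is
            # not a reliable test: it reports on backed-up resources and can
            # still succeed after the source has been deleted.
            if source_exists(resource_arn):
                continue

            orphaned.append(rp_arn)
            print(f"Orphaned: {rp_arn} (source: {resource_arn})")

            if not dry_run:
                backup.delete_recovery_point(
                    BackupVaultName=vault_name,
                    RecoveryPointArn=rp_arn
                )
                print(f"  Deleted: {rp_arn}")

    print(f"\nTotal orphaned recovery points: {len(orphaned)}")
    return orphaned

# Run in dry-run mode first
cleanup_orphaned_recovery_points('prod-backup-vault', dry_run=True)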

Cost Reality Check: For most organizations, the biggest backup cost driver is EC2 EBS snapshots, because an instance backup covers every attached EBS volume, not just the root volume. Instance store (ephemeral NVMe) volumes are never included, but swap and scratch volumes on EBS are. Consider backing up only the data volumes of such instances, and use separate, lighter backup plans for dev/test than for production environments.