AWS Lambda Serverless: Functions, Triggers, Performance (2026)
AWS Lambda runs your code in response to events without you managing servers. You pay only for execution time, and Lambda scales from zero to thousands of concurrent invocations automatically. But getting Lambda right — handling cold starts, choosing the right memory, wiring up error handling and dead letter queues — requires understanding the execution model deeply. This guide covers all of it with real code examples.
Function Anatomy and Runtimes
A Lambda function has a handler — the entry point AWS calls for each invocation — and an execution environment that is initialized once and reused across multiple invocations (warm start). Code outside the handler runs during initialization and can be reused.
import boto3
import json
import os

# Initialization code: runs once per execution environment
# and is reused across multiple invocations
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

def handler(event, context):
    """
    event: dict with trigger-specific payload (API GW, SQS, S3, etc.)
    context: runtime info (function name, remaining time, request ID)
    """
    # Log structured JSON for CloudWatch Logs Insights queries
    print(json.dumps({
        "level": "INFO",
        "requestId": context.aws_request_id,
        "message": "Processing event",
        "recordCount": len(event.get('Records', [event]))
    }))
    try:
        result = process_event(event)
        return {
            'statusCode': 200,
            'body': json.dumps(result),
            'headers': {'Content-Type': 'application/json'}
        }
    except Exception as e:
        print(json.dumps({"level": "ERROR", "error": str(e)}))
        raise  # Re-raise so Lambda marks invocation as failed

def process_event(event):
    # Your business logic here
    return {"status": "ok"}
Supported runtimes (managed by AWS, auto-patched): Python 3.12, Node.js 22, Java 21, .NET 8, Ruby 3.3. Use custom runtimes for Go, Rust, or any language via provided.al2023. For Java, choose Java 21 with SnapStart enabled — it reduces cold start from 5–10s to under 1s by snapshotting the initialized JVM state.
Triggers: API Gateway, SQS, S3, EventBridge
Lambda triggers define what invokes your function. Triggers fall into three invocation models: synchronous (API Gateway), asynchronous (S3, SNS, EventBridge), and poll-based event source mappings (SQS, Kinesis, DynamoDB Streams).
# SAM template: Lambda with multiple triggers
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  ProcessOrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      MemorySize: 512
      Timeout: 30
      Environment:
        Variables:
          TABLE_NAME: !Ref OrdersTable
      Events:
        # API Gateway trigger (synchronous)
        ApiEvent:
          Type: Api
          Properties:
            Path: /orders
            Method: POST
        # SQS trigger (polling, batched)
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt OrderQueue.Arn
            BatchSize: 10
            FunctionResponseTypes:
              - ReportBatchItemFailures  # partial batch failure support
        # S3 trigger (asynchronous)
        S3Event:
          Type: S3
          Properties:
            Bucket: !Ref UploadBucket
            Events: s3:ObjectCreated:*
            Filter:
              S3Key:
                Rules:
                  - Name: suffix
                    Value: '.csv'
        # EventBridge scheduled rule
        ScheduledEvent:
          Type: Schedule
          Properties:
            Schedule: rate(5 minutes)
            Description: "Periodic cleanup job"
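Because the template above attaches four triggers to one function, the handler has to work out which kind of event it received. The following is a minimal sketch of one way to do that, keyed on fields that each service places in its payload. The function name `classify_event` and the returned labels are illustrative assumptions, and the `httpMethod` check assumes the REST API proxy payload that SAM's `Api` event type produces.

```python
def classify_event(event):
    """Return a short label for the trigger that produced this event."""
    records = event.get("Records")
    if records:
        # SQS and S3 both deliver a "Records" list; eventSource tells them apart.
        source = records[0].get("eventSource", "")
        if source == "aws:sqs":
            return "sqs"
        if source == "aws:s3":
            return "s3"
    # EventBridge events (including scheduled rules) carry a top-level "source".
    if event.get("source") == "aws.events":
        return "schedule"
    # REST API proxy integrations include the HTTP method at the top level.
    if "httpMethod" in event:
        return "api"
    return "unknown"
```

A handler can branch on the returned label and pass the event to trigger-specific logic. In practice, separate functions per trigger are often easier to tune and monitor than one shared handler.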
A note on ReportBatchItemFailures: without it, if any message in a batch fails, the entire batch is retried, including messages that succeeded. With partial batch failure reporting, only the messages whose IDs the function reports as failed return to the queue for retry.
Cold Starts and Mitigation
A cold start happens when Lambda needs to initialize a new execution environment: download code, start runtime, run initialization code. Cold start duration varies by runtime (Python: ~100ms, Node: ~100ms, Java: 2–10s without SnapStart, .NET: ~500ms) and by package size.
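Cold starts occur when: (1) the function is invoked for the first time, (2) concurrency scales out beyond the existing warm environments, (3) the function has been idle long enough for Lambda to reclaim its environments (often around 15 minutes in practice, though AWS does not guarantee a fixed idle window).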
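To see how often these conditions occur for a given function, the handler can log whether each invocation was the first one in its execution environment. This is a minimal sketch that relies only on module-level state surviving warm invocations; the names `_is_cold`, `_INIT_TIME`, and `cold_start_handler` are illustrative, not part of any Lambda API.

```python
import json
import time

# Module scope runs once per execution environment (the init phase).
_INIT_TIME = time.time()
_is_cold = True

def cold_start_handler(event, context):
    global _is_cold
    # Read the flag, then clear it so later invocations report warm.
    cold, _is_cold = _is_cold, False
    print(json.dumps({
        "coldStart": cold,
        "envAgeSeconds": round(time.time() - _INIT_TIME, 3),
    }))
    return {"coldStart": cold}
```

Filtering the logs on `coldStart` gives a rough cold start rate, which helps decide whether Provisioned Concurrency is worth paying for.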
Mitigation strategies:
- Provisioned Concurrency: Pre-warms N execution environments so they are initialized before traffic arrives. Costs roughly $0.015 per hour for each GB of memory kept provisioned, whether or not the environments serve requests.
- Java SnapStart: Snapshots the initialized JVM after the init phase. Subsequent invocations restore the snapshot rather than re-initializing. Reduces Java cold starts to <1s.
- Keep packages small: Only include what you need. A 50 MB deployment package takes longer to initialize than a 5 MB one. Use Lambda Layers for shared dependencies.
- Choose lighter runtimes: By the figures above, Python and Node.js cold starts are roughly 20–100x faster than Java without SnapStart.
# Enable Provisioned Concurrency on an alias (not $LATEST)
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-function \
  --qualifier production \
  --provisioned-concurrent-executions 10

# Enable Java SnapStart
aws lambda update-function-configuration \
  --function-name my-java-function \
  --snap-start ApplyOn=PublishedVersions

# Use Application Auto Scaling to scale Provisioned Concurrency
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-api-function:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 5 \
  --max-capacity 50
Memory, Timeout, and Performance Tuning
Lambda allocates CPU proportionally to memory. More memory = more CPU = faster execution. For CPU-bound functions, increasing memory (even if you don't need the RAM) often reduces execution time enough to lower total cost.
| Memory | vCPU equivalent | Typical Use Case |
|---|---|---|
| 128 MB | ~0.08 vCPU | Simple event routing, lightweight transforms |
| 512 MB | ~0.32 vCPU | API handlers, DB queries |
| 1769 MB | 1 full vCPU | Image processing, data transformation |
| 3008 MB | ~2 vCPU | ML inference, heavy computation |
| 10240 MB | ~6 vCPU | Large data processing, video transcoding |
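The claim that more memory can lower total cost is easy to check with arithmetic, since compute is billed as memory multiplied by duration. The sketch below uses the per-GB-second price quoted elsewhere in this guide; the durations in `measurements` are hypothetical numbers for a CPU-bound function, not benchmarks, and should be replaced with your own measurements.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # x86 on-demand price quoted in this guide

def compute_cost(memory_mb, duration_ms, invocations=1_000_000):
    """Compute-only cost in USD (excludes the per-request charge)."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND

# Hypothetical average durations (ms) measured at each memory setting.
measurements = {512: 1200, 1024: 560, 2048: 300}

for memory_mb, duration_ms in measurements.items():
    cost = compute_cost(memory_mb, duration_ms)
    print(f"{memory_mb} MB at {duration_ms} ms: ${cost:.2f} per million invocations")
```

With these assumed durations the 1024 MB setting is both faster and slightly cheaper than 512 MB, while 2048 MB costs about the same as 512 MB but finishes four times sooner. Real functions need to be measured at each setting to find where the curve flattens.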
Set timeouts with some headroom, but not so much that a hung invocation runs (and bills) for minutes. The default is 3 seconds; the maximum is 15 minutes. For API-facing functions, keep the timeout under 30 seconds, because API Gateway's default integration timeout is 29 seconds. For async or batch processing, set it to the measured P99 execution time plus a buffer.
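For batch-style functions, one way to avoid being killed mid-item by the timeout is to check the time remaining before starting each unit of work. The sketch below uses `context.get_remaining_time_in_millis()`, which the Python runtime's context object provides; the helper name `process_until_deadline`, the `work` callback, and the 2-second margin are assumptions chosen for illustration.

```python
SAFETY_MARGIN_MS = 2000  # assumed buffer; tune to your slowest single item

def process_until_deadline(items, context, work):
    """Process items in order until the invocation is close to its timeout.

    Returns (results, unprocessed) so the caller can re-queue the remainder.
    """
    results = []
    remaining = list(items)
    while remaining and context.get_remaining_time_in_millis() > SAFETY_MARGIN_MS:
        results.append(work(remaining.pop(0)))
    return results, remaining
```

The unprocessed items can then be sent back to a queue or returned as failures, rather than being lost when Lambda terminates the invocation.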
Lambda Layers and Environment Variables
Layers are ZIP archives containing libraries, binaries, or configuration that multiple Lambda functions can share. They're mounted at /opt in the execution environment. Using layers keeps your deployment packages small and separates dependency management from application code.
# Create a layer with shared Python dependencies
pip install requests boto3-stubs -t python/
zip -r dependencies-layer.zip python/
aws lambda publish-layer-version \
  --layer-name common-dependencies \
  --zip-file fileb://dependencies-layer.zip \
  --compatible-runtimes python3.12 python3.11

# Attach layer to a function (--layers replaces the function's entire layer list)
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:us-east-1:123456789012:layer:common-dependencies:3
Environment variables store configuration that changes between environments (dev/staging/prod). For secrets, never store them as plain-text env vars — reference them from AWS Secrets Manager or SSM Parameter Store at runtime, or use Lambda's built-in encryption with a KMS key for env var values.
import boto3
import os
import json

# Pattern: cache secrets after first fetch (warm start reuse).
# Cached values live until the environment is recycled, so a rotated
# secret may lag behind until then.
_secrets_cache = {}

def get_secret(secret_name):
    if secret_name not in _secrets_cache:
        client = boto3.client('secretsmanager')
        response = client.get_secret_value(SecretId=secret_name)
        _secrets_cache[secret_name] = json.loads(response['SecretString'])
    return _secrets_cache[secret_name]

def handler(event, context):
    db_creds = get_secret(os.environ['DB_SECRET_ARN'])
    # Use db_creds['username'] and db_creds['password'] to connect;
    # never log or return the secret itself
    return {'statusCode': 200}
Lambda@Edge
Lambda@Edge runs your Lambda functions at CloudFront edge locations, globally, close to your users. It's used to customize HTTP requests and responses in the CDN layer — authentication, A/B testing, URL rewrites, header manipulation, dynamic content generation.
Four hook points in the CloudFront request lifecycle:
- Viewer Request: Before CloudFront checks its cache. Use for auth, URL normalization.
- Origin Request: After cache miss, before forwarding to origin. Use for URL rewrites, adding headers.
- Origin Response: After receiving from origin, before caching. Use for modifying cache headers.
- Viewer Response: Before returning to the viewer. Use for adding security headers.
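As an illustration of the last hook point, here is a minimal sketch of a viewer response function that adds security headers. It assumes the standard CloudFront event layout, where the response sits under `Records[0].cf.response` and each header is a lowercase key mapping to a list of key/value pairs. The specific header values and the name `viewer_response_handler` are example choices, not recommendations from the original text.

```python
# Lowercase lookup key -> (header name as sent, value). Values are examples only.
SECURITY_HEADERS = {
    "strict-transport-security": (
        "Strict-Transport-Security",
        "max-age=63072000; includeSubDomains",
    ),
    "x-content-type-options": ("X-Content-Type-Options", "nosniff"),
    "x-frame-options": ("X-Frame-Options", "DENY"),
}

def viewer_response_handler(event, context):
    # CloudFront passes the response it is about to return to the viewer.
    response = event["Records"][0]["cf"]["response"]
    headers = response["headers"]
    for lookup_key, (name, value) in SECURITY_HEADERS.items():
        headers[lookup_key] = [{"key": name, "value": value}]
    # Returning the (modified) response lets CloudFront continue with it.
    return response
```

Viewer-triggered functions run on every request, including cache hits, so they are best kept small and free of network calls.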
Error Handling and Dead Letter Queues
For async invocations (S3, SNS, EventBridge), Lambda retries failed invocations twice (3 total attempts) with delays between retries. After all retries are exhausted, the event can be sent to a Dead Letter Queue (SQS or SNS) for investigation or manual reprocessing.
# Configure retry behavior and destinations for async invocations.
# An on-failure destination plays the role of a DLQ here; a classic DLQ is
# set separately with update-function-configuration --dead-letter-config.
aws lambda put-function-event-invoke-config \
  --function-name my-async-function \
  --maximum-retry-attempts 2 \
  --maximum-event-age-in-seconds 3600 \
  --destination-config '{
    "OnFailure": {
      "Destination": "arn:aws:sqs:us-east-1:123456789012:my-dlq"
    },
    "OnSuccess": {
      "Destination": "arn:aws:sqs:us-east-1:123456789012:my-success-queue"
    }
  }'
# SQS handler: partial batch failure pattern
# (requires ReportBatchItemFailures on the event source mapping)
def process_record(record):
    # Your per-message business logic here; raise an exception on failure
    pass

def handler(event, context):
    failed_items = []
    for record in event['Records']:
        try:
            process_record(record)
        except Exception as e:
            print(f"Failed to process {record['messageId']}: {e}")
            failed_items.append({'itemIdentifier': record['messageId']})
    # Return failed items; only these become visible in the queue again
    return {'batchItemFailures': failed_items}
Frequently Asked Questions
What is the Lambda concurrency limit and how do I manage it?
The default account-level concurrency limit is 1,000 concurrent executions per region (it can be raised with a Service Quotas request). This limit is shared across all functions in the region. Use reserved concurrency to cap a function's concurrency, which prevents one noisy function from consuming all capacity, and to guarantee that the function always has that much capacity available. Use Provisioned Concurrency to pre-warm environments for latency-sensitive functions.
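To judge how close a workload is to that limit, concurrency can be estimated as average requests per second multiplied by average duration in seconds. A small sketch follows, with `estimated_concurrency` as an illustrative helper name and the traffic figures as made-up inputs.

```python
import math

ACCOUNT_LIMIT = 1000  # default regional limit mentioned above

def estimated_concurrency(requests_per_second, avg_duration_ms):
    """Steady-state concurrent executions: request rate times time in flight."""
    return math.ceil(requests_per_second * avg_duration_ms / 1000)

# Hypothetical workloads: (requests per second, average duration in ms)
for rps, duration_ms in [(100, 200), (5000, 250)]:
    needed = estimated_concurrency(rps, duration_ms)
    status = "within" if needed <= ACCOUNT_LIMIT else "over"
    print(f"{rps} req/s at {duration_ms} ms needs about {needed} concurrent executions ({status} the default limit)")
```

The estimate describes steady traffic only; bursts and cold starts push real concurrency higher, so leave headroom when setting reserved concurrency.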
Can Lambda functions access resources in a private VPC?
Yes. Configure Lambda with VPC settings (subnet IDs and security group IDs) and it will create an ENI in your VPC. Lambda then routes VPC traffic through that ENI. For internet access, the Lambda function's subnet needs a route to a NAT Gateway. For AWS services (DynamoDB, S3), use VPC Endpoints to avoid NAT costs and latency.
How do I handle database connections in Lambda?
Never open a new connection per invocation. Establish connections in the initialization code (outside the handler) so they're reused across warm invocations. For Aurora MySQL/PostgreSQL, use RDS Proxy — it pools connections at the proxy layer, solving the problem of Lambda functions exhausting database connection limits at scale.
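The connection reuse pattern can be sketched as a lazily created module-level connection. To keep the example runnable anywhere, it uses an in-memory `sqlite3` database as a stand-in; in a real function the `connect` call would instead point a PostgreSQL or MySQL driver at the database or RDS Proxy endpoint. The names `get_connection` and `db_handler` are illustrative.

```python
import sqlite3

# Module-level slot: survives across warm invocations of this environment.
_connection = None

def get_connection():
    global _connection
    if _connection is None:
        # Stand-in for a real database connection (assumption for illustration).
        _connection = sqlite3.connect(":memory:")
    return _connection

def db_handler(event, context):
    conn = get_connection()  # opened on the first call, reused afterwards
    value = conn.execute("SELECT 1").fetchone()[0]
    return {"result": value}
```

Creating the connection lazily rather than at import time keeps the init phase short and lets the function reconnect by resetting `_connection` if the database drops an idle connection.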
What's the best way to deploy Lambda functions?
Use AWS SAM (Serverless Application Model) or CDK for infrastructure-as-code deployments. For CI/CD, use CodePipeline with CodeBuild, or GitHub Actions with the AWS SAM CLI. Always deploy to a versioned alias (not $LATEST) for production — this enables traffic shifting (canary deployments: 10% to new version, 90% to old) and rollback without redeployment.
Is Lambda cost-effective compared to EC2?
For intermittent or event-driven workloads, Lambda is significantly cheaper: you pay $0.0000002 per request and $0.0000166667 per GB-second. A function handling 1 million requests/month at 512 MB and 200 ms average duration uses 100,000 GB-seconds, which costs about $1.67 in compute plus $0.20 in request charges, roughly $1.87 in total before any free tier. But for always-on workloads with consistent traffic, EC2 or containers (ECS/EKS) will be cheaper. The break-even point is typically around 20–30% CPU utilization equivalent.
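The arithmetic behind the 1 million request example can be reproduced in a few lines, which also makes it easy to plug in other workloads. The prices are the ones quoted above; the function name `monthly_cost` is illustrative, and the free tier is deliberately ignored.

```python
PRICE_PER_REQUEST = 0.0000002
PRICE_PER_GB_SECOND = 0.0000166667

def monthly_cost(requests, memory_mb, avg_duration_ms):
    """Return (compute_usd, request_usd) for one month, ignoring the free tier."""
    gb_seconds = requests * (memory_mb / 1024) * (avg_duration_ms / 1000)
    compute_usd = gb_seconds * PRICE_PER_GB_SECOND
    request_usd = requests * PRICE_PER_REQUEST
    return compute_usd, request_usd

compute_usd, request_usd = monthly_cost(1_000_000, 512, 200)
print(f"compute ${compute_usd:.2f} + requests ${request_usd:.2f} = ${compute_usd + request_usd:.2f}")
```

Compute dominates the bill at this memory size and duration; for very short, low-memory functions the per-request charge becomes the larger share.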