Kubernetes Jobs and CronJobs: Batch Processing Guide (2026)

Kubernetes Jobs and CronJobs are the workhorses of batch processing in modern cloud-native architectures. Whether you need to run a one-off database migration, process millions of records in parallel, or schedule nightly reports, Kubernetes provides battle-tested primitives to run finite workloads reliably at scale. This guide covers everything from basic Job specs to advanced KEDA-based autoscaling, with production-ready YAML examples throughout.

Jobs vs CronJobs vs Deployments
Basic Job Spec: completions, parallelism, backoffLimit
Job Completion Modes: NonIndexed vs Indexed
Indexed Jobs and Work Queue Processing
Job Failure Handling: restartPolicy and Pod Failure Policy
CronJob Spec: schedule, concurrencyPolicy, startingDeadlineSeconds
CronJob concurrencyPolicy: Allow vs Forbid vs Replace
Real-World: Database Backup CronJob with S3 Upload
Real-World: Report Generation Job with EventBridge
Monitoring Jobs: kubectl, Conditions, Prometheus
Cleanup: TTL, History Limits
KEDA: Scaling Jobs Based on Queue Depth

1. Jobs vs CronJobs vs Deployments — When to Use Each

Kubernetes offers three primary workload types for running containers, each suited to different execution patterns. Choosing the right one dramatically affects reliability, resource cost, and operational complexity.

Workload Type	Lifecycle	Restarts on Exit	Best For
Deployment	Runs forever	Yes (always)	Web servers, APIs, daemons
Job	Runs to completion	Only on failure (configurable)	One-off batch tasks, migrations, ETL
CronJob	Scheduled Jobs	Only on failure	Periodic tasks: backups, reports, cleanup

A Job creates one or more Pods and tracks successful completions. Once a specified number of Pods successfully complete, the Job is done. A CronJob is a controller that creates Jobs on a schedule — think of it as Kubernetes's built-in cron daemon. A Deployment keeps Pods running indefinitely, restarting them whenever they exit.

Key insight: If your workload has a natural end point — it processes a queue, generates a report, or runs a migration — use a Job, not a Deployment. Deployments that "run until done then exit" create operational headaches because Kubernetes will restart them unnecessarily.

Common Job use cases in 2026 production environments include: database schema migrations before rolling deploys, ML model training runs, invoice generation pipelines, data archival to cold storage, ETL batch loads, and compliance report generation. CronJobs handle the scheduling layer on top of all of these.

2. Basic Job Spec: completions, parallelism, backoffLimit, activeDeadlineSeconds

The core Job spec lets you control how many Pods run, how many must succeed, and what happens when things go wrong. Understanding these four fields is essential before writing production Jobs.

Key Fields

completions: Total number of successful Pod completions required. Default: 1.
parallelism: Maximum number of Pods running simultaneously. Default: 1.
backoffLimit: Number of retries before the Job is marked as failed. Default: 6.
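activeDeadlineSeconds: Hard timeout for the entire Job (wall clock). No default (no time limit). Takes precedence over backoffLimit: once the deadline passes, the Job fails even if retries remain.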

# basic-job.yaml — single completion, no parallelism
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
  namespace: production
  labels:
    app: db-migration
    version: "v2.4.0"
spec:
  completions: 1          # Only need one Pod to succeed
  parallelism: 1          # Run one Pod at a time
  backoffLimit: 3         # Retry up to 3 times before failing
  activeDeadlineSeconds: 600  # Fail the whole Job after 10 minutes
  ttlSecondsAfterFinished: 3600  # Auto-delete Job 1 hour after completion
  template:
    metadata:
      labels:
        app: db-migration
    spec:
      restartPolicy: Never   # Never restart; Job controller handles retries
      containers:
      - name: migrator
        image: myapp/db-migrator:v2.4.0
        env:
        - name: DB_HOST
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: host
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"

Parallel Batch Job

To process work in parallel — for example, 20 items processed 4 at a time — set both completions and parallelism:

# parallel-job.yaml — 20 completions, 4 at a time
apiVersion: batch/v1
kind: Job
metadata:
  name: image-resizer
  namespace: production
spec:
  completions: 20       # Need 20 successful completions total
  parallelism: 4        # Run 4 Pods simultaneously
  backoffLimit: 4       # Allow 4 total failures across all Pods
  activeDeadlineSeconds: 1800  # 30-minute hard timeout
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: resizer
        image: myapp/image-resizer:latest
        env:
        - name: BATCH_SIZE
          value: "50"
        - name: S3_BUCKET
          value: "myapp-images-raw"
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"

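backoffLimit vs activeDeadlineSeconds: backoffLimit counts Pod failures; activeDeadlineSeconds is a wall-clock timer that starts at the Job's start time (status.startTime) and keeps running across retries. The Job fails as soon as either limit is breached, and when the deadline passes all running Pods are terminated. Use activeDeadlineSeconds as a circuit breaker for runaway Jobs.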
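
As a rough sanity check before picking activeDeadlineSeconds, you can estimate how many sequential "waves" of Pods a parallel Job needs. The sketch below assumes every Pod takes about the same time and none fail; the function names and the 300-second Pod duration are illustrative assumptions, not Kubernetes APIs.

```python
import math

def estimate_waves(completions: int, parallelism: int) -> int:
    """Number of sequential batches needed if every Pod succeeds first time."""
    if completions < 1 or parallelism < 1:
        raise ValueError("completions and parallelism must be positive")
    return math.ceil(completions / parallelism)

def fits_deadline(completions, parallelism, pod_seconds, active_deadline_seconds):
    """True if the idealized run time stays within activeDeadlineSeconds."""
    total = estimate_waves(completions, parallelism) * pod_seconds
    return total <= active_deadline_seconds

# image-resizer example: 20 completions, 4 at a time, 30-minute deadline
print(estimate_waves(20, 4))                                   # 5
print(fits_deadline(20, 4, pod_seconds=300,
                    active_deadline_seconds=1800))             # True (5 * 300 = 1500)
```

Retries and backoff delays eat into the same deadline, so leave headroom beyond this idealized figure.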

3. Job Completion Modes: NonIndexed vs Indexed

Kubernetes 1.21 introduced the completionMode field (stable since 1.24), which changes how parallel Jobs assign work. The two modes are NonIndexed (the default) and Indexed.

NonIndexed Mode (Default)

In NonIndexed mode, all Pods are interchangeable. The Job tracks successful completions as a count — it does not care which Pod processed which work item. This works well when Pods pull from a shared queue and each Pod independently grabs the next available item.

Pods are stateless with respect to work assignment
Typically paired with a work queue (SQS, Redis, Kafka)
Simpler to implement but requires external coordination

Indexed Mode

In Indexed mode, each Pod receives a unique index from 0 to completions-1 via the environment variable JOB_COMPLETION_INDEX and the annotation batch.kubernetes.io/job-completion-index. This allows static work partitioning without a queue — each Pod knows exactly which slice of data to process.

Built-in work partitioning — no external queue needed
Deterministic — Pod 0 always processes chunk 0
Ideal for partitioned datasets: S3 key ranges, database shards, file lists

Choosing between modes: Use NonIndexed when work items are dynamic (queue-based). Use Indexed when the full dataset is known upfront and can be statically partitioned by index.

4. Indexed Jobs: JOB_COMPLETION_INDEX and S3 Chunk Processing

Indexed Jobs are powerful for processing pre-partitioned datasets. Here is a production pattern for processing S3 objects split into 10 chunks, where each Pod processes exactly one chunk using its index.

# indexed-job-s3.yaml — process S3 data in 10 indexed chunks
apiVersion: batch/v1
kind: Job
metadata:
  name: s3-data-processor
  namespace: production
spec:
  completions: 10          # 10 chunks = 10 indexed Pods (0..9)
  parallelism: 5           # Process 5 chunks at a time
  completionMode: Indexed  # Enable indexed mode
  backoffLimit: 2
  activeDeadlineSeconds: 3600
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: s3-reader
      containers:
      - name: processor
        image: myapp/data-processor:v3.1.0
        env:
        - name: JOB_COMPLETION_INDEX   # Set automatically for Indexed Jobs; redeclared here only to show its source annotation
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
        - name: TOTAL_CHUNKS
          value: "10"
        - name: S3_BUCKET
          value: "myapp-data-lake"
        - name: S3_PREFIX
          value: "raw/2026-06-10/"
        command:
        - /bin/sh
        - -c
        - |
          echo "Processing chunk ${JOB_COMPLETION_INDEX} of ${TOTAL_CHUNKS}"
          python /app/process_chunk.py \
            --chunk-index=${JOB_COMPLETION_INDEX} \
            --total-chunks=${TOTAL_CHUNKS} \
            --bucket=${S3_BUCKET} \
            --prefix=${S3_PREFIX}
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "4Gi"

The corresponding Python script uses the index to calculate which S3 keys to fetch:

# process_chunk.py — calculates key range from JOB_COMPLETION_INDEX
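import argparse
import math

import boto3

def process_object(s3, bucket, key):
    # Placeholder (not part of the original script): replace with the real per-object work.
    print(f"Would process s3://{bucket}/{key}")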

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--chunk-index', type=int, required=True)
    parser.add_argument('--total-chunks', type=int, required=True)
    parser.add_argument('--bucket', required=True)
    parser.add_argument('--prefix', required=True)
    args = parser.parse_args()

    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    all_keys = [
        obj['Key']
        for page in paginator.paginate(Bucket=args.bucket, Prefix=args.prefix)
        for obj in page.get('Contents', [])
    ]
    chunk_size = math.ceil(len(all_keys) / args.total_chunks)
    start = args.chunk_index * chunk_size
    my_keys = all_keys[start : start + chunk_size]
    print(f"Chunk {args.chunk_index}: processing {len(my_keys)} keys")
    for key in my_keys:
        process_object(s3, args.bucket, key)

if __name__ == '__main__':
    main()

Note on JOB_COMPLETION_INDEX: For Indexed Jobs, Kubernetes sets the JOB_COMPLETION_INDEX environment variable in every container automatically, so the explicit fieldRef in the manifest above is optional. The fieldRef form is mainly useful when you want the index exposed under a different variable name.
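
One property of the ceil-based slicing in process_chunk.py is worth seeing in numbers: when the key count does not divide evenly, the last indexes can end up with no work at all. The sketch below reproduces that slicing on 23 made-up keys and compares it with a strided alternative (every Nth key), which is an assumption offered for comparison, not something the original script does.

```python
import math

def contiguous_chunk(keys, index, total):
    """Same slicing as process_chunk.py: fixed-size contiguous blocks."""
    size = math.ceil(len(keys) / total)
    start = index * size
    return keys[start:start + size]

def strided_chunk(keys, index, total):
    """Alternative: every total-th key, so chunk sizes differ by at most one."""
    return keys[index::total]

keys = [f"raw/obj-{i:03d}" for i in range(23)]  # 23 hypothetical S3 keys

contiguous = [len(contiguous_chunk(keys, i, 10)) for i in range(10)]
strided = [len(strided_chunk(keys, i, 10)) for i in range(10)]
print(contiguous)  # [3, 3, 3, 3, 3, 3, 3, 2, 0, 0]
print(strided)     # [3, 3, 3, 2, 2, 2, 2, 2, 2, 2]
```

Both schemes cover every key exactly once as long as all Pods see the same listing, so avoid writing new objects under the prefix while the Job runs.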

5. Job Failure Handling: restartPolicy, Pod Failure Policy, and Backoff

Failures are inevitable in distributed systems. Kubernetes provides several layers of failure handling for Jobs, from simple retry counting to fine-grained per-exit-code policies (podFailurePolicy, enabled by default since Kubernetes 1.26 and stable since 1.31).

restartPolicy

Job Pods must use restartPolicy: Never or restartPolicy: OnFailure. They cannot use Always (that is for Deployments).

Never: Failed Pod is left as-is; Job controller creates a new Pod for each retry. Useful for debugging — you can inspect the failed Pod's logs.
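OnFailure: Kubelet restarts the container in the same Pod on failure. Fewer Pods are created, but only the most recent previous container's logs stay available (via kubectl logs --previous), and the Pod itself is terminated once backoffLimit is reached.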

Production recommendation: Use restartPolicy: Never in production. It produces one Pod per attempt, making debugging straightforward. Pair with ttlSecondsAfterFinished to avoid Pod accumulation.

Pod Failure Policy (Kubernetes 1.26+)

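The podFailurePolicy field lets you define rules based on exit codes or Pod conditions. This allows you to distinguish between transient errors (should retry) and permanent errors (should fail immediately). Rules are evaluated in order and the first match wins. The field requires restartPolicy: Never in the Pod template.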

# pod-failure-policy.yaml — fine-grained failure handling
apiVersion: batch/v1
kind: Job
metadata:
  name: etl-processor
  namespace: production
spec:
  completions: 1
  backoffLimit: 6
  podFailurePolicy:
    rules:
    # Exit code 42 means "bad input data" — don't retry, fail immediately
    - action: FailJob
      onExitCodes:
        containerName: etl
        operator: In
        values: [42]
    # Exit code 1 means transient error — count against backoffLimit and retry
    - action: Count
      onExitCodes:
        containerName: etl
        operator: In
        values: [1]
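    # Pod disruptions (preemption, eviction, node drain): ignore, don't count against backoffLimit.
    # Note: an OOMKilled container does not set DisruptionTarget; it surfaces as exit code 137.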
    - action: Ignore
      onPodConditions:
      - type: DisruptionTarget
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: etl
        image: myapp/etl:v2.0.0
        # Exit 42 for unrecoverable input errors
        # Exit 1 for transient failures (DB timeout, network blip)
        # Exit 0 for success

Backoff Behavior

When a Pod fails, Kubernetes applies an exponential backoff before creating the next retry Pod: 10s, 20s, 40s, 80s... up to a maximum of 6 minutes. With restartPolicy: Never, backoffLimit counts failed Pods; with restartPolicy: OnFailure, container restarts inside a still-running Pod count toward it as well. Once the limit is reached, the Job condition changes to Failed and no more Pods are created.
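
The backoff schedule described above is easy to tabulate. This sketch assumes a pure doubling from 10 seconds with a 360-second cap; actual timing in a cluster is approximate, and the function name is illustrative.

```python
def backoff_delays(failures: int, base: int = 10, cap: int = 360):
    """Delay in seconds before each retry: base * 2^n, capped at `cap`."""
    return [min(base * 2 ** n, cap) for n in range(failures)]

print(backoff_delays(8))       # [10, 20, 40, 80, 160, 320, 360, 360]
print(sum(backoff_delays(6)))  # 630
```

With the default backoffLimit of 6, the waiting time alone adds up to roughly ten minutes, which matters when you size activeDeadlineSeconds.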

6. CronJob Spec: schedule, concurrencyPolicy, startingDeadlineSeconds

A CronJob wraps a Job spec with a schedule and concurrency controls. The schedule uses standard Unix cron syntax. Understanding startingDeadlineSeconds is critical for avoiding missed runs in production.

# cronjob-basic.yaml — nightly report generation
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
  namespace: production
spec:
  # Standard cron: minute hour day-of-month month day-of-week
  schedule: "0 2 * * *"          # 2:00 AM every day, interpreted in the timeZone below
  timeZone: "Asia/Kolkata"        # Stable since Kubernetes 1.27; if omitted, the controller manager's local time zone applies
  concurrencyPolicy: Forbid       # Don't start new Job if previous is still running
  startingDeadlineSeconds: 300    # If Job can't start within 5 min of schedule, skip
  successfulJobsHistoryLimit: 3   # Keep last 3 successful Job records
  failedJobsHistoryLimit: 3       # Keep last 3 failed Job records
  jobTemplate:
    spec:
      completions: 1
      backoffLimit: 2
      activeDeadlineSeconds: 3600   # 1-hour hard timeout per Job run
      ttlSecondsAfterFinished: 86400 # Delete Job after 24 hours
      template:
        metadata:
          labels:
            app: nightly-report
        spec:
          restartPolicy: Never
          serviceAccountName: report-generator
          containers:
          - name: reporter
            image: myapp/reporter:v1.5.0
            env:
            - name: REPORT_DATE
              value: "yesterday"
            - name: OUTPUT_BUCKET
              value: "myapp-reports"
            resources:
              requests:
                cpu: "500m"
                memory: "512Mi"
              limits:
                cpu: "2"
                memory: "2Gi"

Cron Schedule Quick Reference

Schedule	Meaning
`*/5 * * * *`	Every 5 minutes
`0 * * * *`	Every hour (on the hour)
`0 2 * * *`	Daily at 2:00 AM
`0 2 * * 0`	Weekly on Sunday at 2:00 AM
`0 2 1 * *`	Monthly on the 1st at 2:00 AM
`@hourly`	Alias for `0 * * * *`
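
If you want to check a schedule before deploying it, a tiny matcher is enough for the expressions in this table. The sketch below is a simplified, assumption-laden model of five-field cron: it supports *, */n, ranges, and comma lists, but not month or weekday names, 7 as Sunday, or aliases other than the three listed. The function names are illustrative.

```python
from datetime import datetime

ALIASES = {"@hourly": "0 * * * *", "@daily": "0 0 * * *", "@weekly": "0 0 * * 0"}

def _field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field: *, */n, a, a-b, a-b/n, and comma-separated lists."""
    for part in expr.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            end = hi if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False

def cron_matches(schedule: str, when: datetime) -> bool:
    """True if `when` (minute precision) is a firing time of `schedule`."""
    minute, hour, dom, month, dow = ALIASES.get(schedule, schedule).split()
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    time_ok = (_field_matches(minute, when.minute, 0, 59)
               and _field_matches(hour, when.hour, 0, 23)
               and _field_matches(month, when.month, 1, 12))
    dom_ok = _field_matches(dom, when.day, 1, 31)
    dow_ok = _field_matches(dow, cron_dow, 0, 6)
    if dom != "*" and dow != "*":
        # Classic cron rule: when both day fields are restricted, either may match.
        return time_ok and (dom_ok or dow_ok)
    return time_ok and dom_ok and dow_ok

print(cron_matches("*/5 * * * *", datetime(2026, 6, 10, 2, 5)))  # True
print(cron_matches("0 2 * * 0", datetime(2026, 6, 14, 2, 0)))    # True (a Sunday)
print(cron_matches("0 2 * * 0", datetime(2026, 6, 10, 2, 0)))    # False (a Wednesday)
```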

startingDeadlineSeconds explained: If the CronJob controller is down or the cluster is overloaded, a scheduled run might be missed. startingDeadlineSeconds defines how late a Job may start before that run is skipped. If the field is unset, the controller counts every run missed since the last scheduled time, and once more than 100 have been missed it logs an error and does not start the Job. With the field set, only runs missed inside the deadline window are counted. Always set this field.
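
The skip-or-start decision can be written down in a few lines. This is a simplified model of the rule just described, not the controller's code; should_start is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def should_start(scheduled, now, starting_deadline_seconds=None):
    """True if a run scheduled at `scheduled` may still be started at `now`."""
    if now < scheduled:
        return False  # not due yet
    if starting_deadline_seconds is None:
        return True   # no deadline: a late start is still allowed
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)

scheduled = datetime(2026, 6, 10, 2, 0, tzinfo=timezone.utc)
print(should_start(scheduled, scheduled + timedelta(minutes=4), 300))  # True
print(should_start(scheduled, scheduled + timedelta(minutes=6), 300))  # False
```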

7. CronJob concurrencyPolicy: Allow vs Forbid vs Replace

The concurrencyPolicy field controls what happens when the next scheduled Job is due but the previous one is still running. This is one of the most important decisions for CronJob reliability.

Allow (Default)

Multiple Jobs can run simultaneously. If the 2:00 AM run takes 90 minutes, the 3:00 AM run starts while it is still running. Use this only when Jobs are idempotent and parallel runs do not conflict. Suitable for: log rotation, cache warming, metrics collection.

Forbid

If the previous Job is still running when the next run is due, the new run is skipped. No overlap is ever possible. Use for: database maintenance, report generation, any Job where concurrent runs would cause data conflicts. This is the safest default for most batch workflows.

Replace

If the previous Job is still running, it is terminated and a new Job starts. Use sparingly — you lose whatever work the previous Job had done. Suitable for: real-time dashboard generation where only the latest snapshot matters, or health-check aggregation jobs.
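
The three policies boil down to a small decision table. The sketch below models one scheduled tick; it is an illustration of the behavior described above, with a hypothetical function name, not controller code.

```python
def on_schedule_tick(policy: str, previous_running: bool):
    """Return (start_new_job, terminate_previous) for one scheduled tick."""
    if policy not in ("Allow", "Forbid", "Replace"):
        raise ValueError(f"unknown concurrencyPolicy: {policy}")
    if not previous_running:
        return (True, False)   # nothing to conflict with
    if policy == "Allow":
        return (True, False)   # run alongside the previous Job
    if policy == "Forbid":
        return (False, False)  # skip this run
    return (True, True)        # Replace: stop the old Job, start a new one

for policy in ("Allow", "Forbid", "Replace"):
    print(policy, on_schedule_tick(policy, previous_running=True))
```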

# concurrency-policy-comparison.yaml

# Pattern 1: Forbid — safe for stateful operations
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-vacuum
spec:
  schedule: "0 3 * * *"
  concurrencyPolicy: Forbid     # If yesterday's vacuum is still running, skip today's
  startingDeadlineSeconds: 600
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: vacuum
            image: postgres:15
            command: ["vacuumdb", "--all", "--analyze"]
            # Connection settings (PGHOST, PGUSER, PGPASSWORD) omitted for brevity

---
# Pattern 2: Replace — for snapshot-style jobs where only latest matters
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dashboard-snapshot
spec:
  schedule: "*/15 * * * *"
  concurrencyPolicy: Replace    # Always use the freshest run
  startingDeadlineSeconds: 60
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: snapshot
            image: myapp/dashboard-gen:latest

Operational tip: Monitor CronJob missed runs with kubectl describe cronjob <name> and look at the Events section. If you see "Missed scheduled time" warnings frequently, either your Jobs take too long or startingDeadlineSeconds is too tight.

8. Real-World Pattern: Database Backup CronJob with S3 Upload

This is a complete, production-ready pattern for backing up a PostgreSQL database every night and uploading the compressed dump to S3. It uses a dedicated ServiceAccount, reads database credentials from Secrets through environment variables, and handles cleanup of old backups.

# db-backup-cronjob.yaml — full production-ready backup pipeline
apiVersion: v1
kind: ServiceAccount
metadata:
  name: db-backup-sa
  namespace: production
  annotations:
    # AWS IRSA: bind to an IAM role allowed to list, write, and delete objects in the backup bucket
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/db-backup-role

---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: production
  labels:
    app: postgres-backup
    team: platform
spec:
  schedule: "30 1 * * *"           # 1:30 AM UTC daily
  timeZone: "UTC"
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 900      # 15-minute grace window
  successfulJobsHistoryLimit: 7     # Keep 1 week of successful records
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      completions: 1
      backoffLimit: 2
      activeDeadlineSeconds: 7200   # 2-hour hard limit
      ttlSecondsAfterFinished: 604800  # Auto-delete after 7 days
      template:
        metadata:
          labels:
            app: postgres-backup
        spec:
          restartPolicy: Never
          serviceAccountName: db-backup-sa
          initContainers:
          - name: verify-db
            image: postgres:15-alpine
            command:
            - sh
            - -c
            - |
              until pg_isready -h $PGHOST -p 5432 -U $PGUSER; do
                echo "Waiting for PostgreSQL..."
                sleep 5
              done
              echo "PostgreSQL is ready"
            env:
            - name: PGHOST
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: host
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
          containers:
          - name: backup
            image: myapp/pg-backup:v2.0.0   # Custom image with pg_dump + aws cli
            command:
            - /bin/sh
            - -c
            - |
              set -e
              TIMESTAMP=$(date +%Y%m%d_%H%M%S)
              BACKUP_FILE="/tmp/backup_${TIMESTAMP}.sql.gz"
              S3_KEY="backups/postgres/${TIMESTAMP}/dump.sql.gz"

              echo "Starting backup at $(date)"
              PGPASSWORD="$PGPASSWORD" pg_dump \
                -h "$PGHOST" \
                -U "$PGUSER" \
                -d "$PGDATABASE" \
                --no-owner \
                --no-acl \
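                --format=plain \
                --compress=9 \
                --file="$BACKUP_FILE"   # no pipe: with set -e alone, a pg_dump failure inside "pg_dump | gzip" would go unnoticed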

              SIZE=$(du -sh "$BACKUP_FILE" | cut -f1)
              echo "Backup size: $SIZE. Uploading to s3://${S3_BUCKET}/${S3_KEY}"

              aws s3 cp "$BACKUP_FILE" "s3://${S3_BUCKET}/${S3_KEY}" \
                --storage-class STANDARD_IA \
                --sse aws:kms \
                --metadata "db=${PGDATABASE},timestamp=${TIMESTAMP}"

              # Prune backups older than 30 days
              aws s3 ls "s3://${S3_BUCKET}/backups/postgres/" \
                | awk '{print $2}' \
                | while read prefix; do
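                    date_str=$(echo "$prefix" | cut -c1-8)
                    # Skip prefixes whose date cannot be parsed; treating them as epoch 0 would delete them
                    prefix_ts=$(date -d "$date_str" +%s 2>/dev/null) || continue
                    if [ "$prefix_ts" -lt "$(date -d '30 days ago' +%s)" ]; then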
                      aws s3 rm "s3://${S3_BUCKET}/backups/postgres/${prefix}" --recursive
                    fi
                  done

              echo "Backup complete at $(date)"
            env:
            - name: PGHOST
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: host
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            - name: PGDATABASE
              value: "myapp_production"
            - name: S3_BUCKET
              value: "myapp-database-backups"
            resources:
              requests:
                cpu: "250m"
                memory: "512Mi"
              limits:
                cpu: "1"
                memory: "2Gi"

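Security note: Always use AWS IRSA (IAM Roles for Service Accounts) or Workload Identity (GKE) rather than static AWS credentials in Secrets. The ServiceAccount annotation binds the Pod to an IAM role with the minimum required permissions. For this script that means s3:PutObject and s3:DeleteObject on the backup objects plus s3:ListBucket on the bucket (needed by aws s3 ls and the recursive delete), and KMS permissions as well if the bucket encrypts with a customer-managed key.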

9. Real-World Pattern: Report Generation Job Triggered by EventBridge

Not all batch Jobs run on a fixed schedule. A common pattern in AWS-based architectures is to trigger a Kubernetes Job in response to an event — for example, generating an invoice when a subscription renews. EventBridge routes the event to a Lambda function that uses the Kubernetes API to create a Job.

Job Template (applied via Lambda)

# report-job-template.yaml — triggered on-demand, not scheduled
apiVersion: batch/v1
kind: Job
metadata:
  generateName: invoice-generator-   # Unique name per invocation
  namespace: production
  labels:
    app: invoice-generator
    trigger: eventbridge
spec:
  completions: 1
  backoffLimit: 3
  activeDeadlineSeconds: 300         # 5-minute SLA for invoice generation
  ttlSecondsAfterFinished: 86400
  template:
    metadata:
      labels:
        app: invoice-generator
    spec:
      restartPolicy: Never
      serviceAccountName: invoice-sa
      containers:
      - name: generator
        image: myapp/invoice-generator:v4.2.0
        env:
        - name: SUBSCRIPTION_ID
          value: ""          # Populated by Lambda before applying
        - name: BILLING_PERIOD
          value: ""          # Populated by Lambda before applying
        - name: OUTPUT_BUCKET
          value: "myapp-invoices"
        - name: NOTIFY_WEBHOOK
          value: "https://hooks.slack.com/services/..."
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"

Lambda Function (Python) that Creates the Job

# lambda_handler.py — triggered by EventBridge subscription.renewed event
from kubernetes import client, config

def handler(event, context):
    subscription_id = event['detail']['subscriptionId']
    billing_period   = event['detail']['billingPeriod']

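    # NOTE: load_incluster_config() only works inside a Pod with a mounted
    # ServiceAccount token. Lambda runs outside the cluster, so a real function
    # must build a client.Configuration from the cluster endpoint, CA bundle,
    # and a bearer token (on EKS, an STS-signed token like `aws eks get-token` issues).
    config.load_incluster_config()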
    batch_v1 = client.BatchV1Api()

    job_manifest = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "generateName": "invoice-generator-",
            "namespace": "production",
            "labels": {"app": "invoice-generator", "trigger": "eventbridge"}
        },
        "spec": {
            "completions": 1,
            "backoffLimit": 3,
            "activeDeadlineSeconds": 300,
            "ttlSecondsAfterFinished": 86400,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "serviceAccountName": "invoice-sa",
                    "containers": [{
                        "name": "generator",
                        "image": "myapp/invoice-generator:v4.2.0",
                        "env": [
                            {"name": "SUBSCRIPTION_ID", "value": subscription_id},
                            {"name": "BILLING_PERIOD",   "value": billing_period},
                            {"name": "OUTPUT_BUCKET",    "value": "myapp-invoices"}
                        ],
                        "resources": {
                            "requests": {"cpu": "250m", "memory": "256Mi"}
                        }
                    }]
                }
            }
        }
    }

    response = batch_v1.create_namespaced_job(
        namespace="production",
        body=job_manifest
    )
    print(f"Created Job: {response.metadata.name}")
    return {"statusCode": 200, "jobName": response.metadata.name}

10. Monitoring Jobs: kubectl, Job Conditions, Prometheus Metrics

Observability for batch workloads requires a different approach than for long-running services. Since Jobs are ephemeral, ship their logs and metrics to external storage before the Job and its Pods are cleaned up.

kubectl Commands for Job Inspection

# List all Jobs in a namespace with status
kubectl get jobs -n production

# Detailed Job status including conditions
kubectl describe job db-migration -n production

# Watch Job completion in real time
kubectl get job db-migration -n production --watch

# Get logs from the Job's Pods (the selector matches every Pod the Job created, including failed ones)
kubectl logs -n production -l job-name=db-migration --tail=100

# Get logs from the previous container instance (after an in-place restart with restartPolicy: OnFailure)
kubectl logs -n production -l job-name=db-migration --previous

# List CronJobs and their last schedule time
kubectl get cronjobs -n production

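# View CronJob history (Jobs created by a CronJob are named <cronjob-name>-<suffix>;
# the app label in the earlier manifest is set on Pods, not on the Job objects)
kubectl get jobs -n production | grep '^nightly-report-'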

# Manually trigger a CronJob immediately (for testing)
kubectl create job --from=cronjob/nightly-report manual-run-001 -n production

Job Conditions

Jobs have two conditions that you should monitor programmatically:

Complete: type: Complete, status: "True" (all required completions succeeded)
Failed: type: Failed, status: "True" (the Job exceeded backoffLimit or activeDeadlineSeconds, or a podFailurePolicy rule with action FailJob matched)

# Check Job conditions via jsonpath
kubectl get job db-migration -n production \
  -o jsonpath='{.status.conditions[*].type}'

# Get structured status
kubectl get job db-migration -n production -o json | \
  jq '.status | {active, succeeded, failed, completionTime}'
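
For programmatic monitoring, the same check is straightforward in Python once you have the Job as a dictionary, for example parsed from kubectl get job -o json. The sketch below is a minimal classifier; the job_state name and the "Running"/"Pending" fallbacks are illustrative choices, and the sample document is hand-written rather than real cluster output.

```python
import json

def job_state(job: dict) -> str:
    """Classify a Job object by its terminal conditions, else by active Pods."""
    status = job.get("status", {})
    for cond in status.get("conditions") or []:
        if cond.get("status") == "True" and cond.get("type") in ("Complete", "Failed"):
            return cond["type"]
    return "Running" if status.get("active", 0) > 0 else "Pending"

sample = json.loads("""
{
  "metadata": {"name": "db-migration"},
  "status": {
    "failed": 4,
    "conditions": [
      {"type": "Failed", "status": "True", "reason": "BackoffLimitExceeded"}
    ]
  }
}
""")
print(job_state(sample))                     # Failed
print(job_state({"status": {"active": 2}}))  # Running
```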

Prometheus Metrics for Jobs

The kube-state-metrics exporter provides essential Job and CronJob metrics:

# Jobs currently running (active Pods > 0)
kube_job_status_active{namespace="production"} > 0

# Jobs that have failed
kube_job_failed{namespace="production", condition="true"} == 1

# CronJob last successful schedule time (unix timestamp)
kube_cronjob_status_last_successful_time{namespace="production"}

# Alert: CronJob hasn't run successfully in 25 hours (missed daily run)
time() - kube_cronjob_status_last_successful_time{cronjob="nightly-report"} > 90000

# Job duration histogram (requires custom instrumentation)
histogram_quantile(0.95,
  rate(job_duration_seconds_bucket{namespace="production"}[1h])
)

Alerting strategy: Create alerts for two scenarios: (1) a Job in the Failed state, and (2) a CronJob that has not had a successful run within 1.5x its schedule interval. The second alert catches both Job failures and missed schedules due to cluster issues.

11. Cleaning Up: ttlSecondsAfterFinished and CronJob History Limits

Without cleanup policies, completed Jobs and their Pods accumulate indefinitely. In a cluster running hundreds of CronJobs, this leads to significant etcd bloat and slow kubectl get responses. Kubernetes provides two mechanisms for automatic cleanup.

ttlSecondsAfterFinished (TTL Controller)

Set ttlSecondsAfterFinished on any Job to have the TTL controller delete it (and its Pods) automatically after it finishes — whether it succeeded or failed. This is the recommended approach for all production Jobs.

# ttl-cleanup-example.yaml — Job auto-deleted 1 hour after completion
apiVersion: batch/v1
kind: Job
metadata:
  name: data-export
  namespace: production
spec:
  completions: 1
  backoffLimit: 2
  ttlSecondsAfterFinished: 3600   # Delete Job + Pods 1 hour after finish
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: exporter
        image: myapp/exporter:latest

Choose ttlSecondsAfterFinished values based on your debugging needs:

0: Delete immediately on completion (production pipelines with external logging)
3600: 1 hour — enough time for on-call to inspect failed runs
86400: 24 hours — good for nightly Jobs that may need next-morning investigation

CronJob History Limits

successfulJobsHistoryLimit and failedJobsHistoryLimit control how many completed Job objects the CronJob controller retains. These are independent of ttlSecondsAfterFinished. Setting both to a small number is essential for clusters with many CronJobs:

# cronjob-with-history-limits.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-archiver
  namespace: production
spec:
  schedule: "0 * * * *"           # Hourly
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 300
  successfulJobsHistoryLimit: 3   # Keep only last 3 successful Jobs
  failedJobsHistoryLimit: 3       # Keep only last 3 failed Jobs
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 7200  # Belt AND suspenders: also use TTL
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: archiver
            image: myapp/log-archiver:v1.2.0

Recommended defaults: Set successfulJobsHistoryLimit: 3 and failedJobsHistoryLimit: 3 on all CronJobs. Set ttlSecondsAfterFinished: 86400 on all Jobs. This keeps etcd clean while giving your team time to investigate failures.
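
To see what the history limits actually remove, here is a simplified model of the pruning rule: keep the newest N Jobs per outcome and delete the rest. It is a sketch under the assumption that Jobs are ordered by start time; the tuple layout and function name are made up for illustration.

```python
def jobs_to_delete(finished, successful_limit=3, failed_limit=3):
    """finished: list of (name, succeeded, start_time); returns names to delete."""
    doomed = []
    for outcome, limit in ((True, successful_limit), (False, failed_limit)):
        group = sorted((j for j in finished if j[1] is outcome), key=lambda j: j[2])
        excess = len(group) - limit
        if excess > 0:
            doomed += [name for name, _, _ in group[:excess]]  # oldest first
    return doomed

history = [
    ("log-archiver-001", True, 1),
    ("log-archiver-002", True, 2),
    ("log-archiver-003", False, 3),
    ("log-archiver-004", True, 4),
    ("log-archiver-005", True, 5),
    ("log-archiver-006", True, 6),
]
print(jobs_to_delete(history))  # ['log-archiver-001', 'log-archiver-002']
```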

12. KEDA: Scaling Jobs Based on Queue Depth (SQS, Kafka)

Kubernetes Event-Driven Autoscaling (KEDA) extends Kubernetes to scale Jobs based on external event sources — SQS queue depth, Kafka consumer lag, Redis list length, and more. Instead of running a fixed number of Pods, KEDA spins up Job Pods only when there is actual work to process, and scales down to zero when the queue is empty.

How KEDA ScaledJob Works

KEDA polls the external scaler (e.g., SQS ApproximateNumberOfMessages)
Creates Job Pods proportional to queue depth (one Job per message, or batched)
Each Job Pod processes messages and exits when done
When queue empties, KEDA scales back to zero
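
The steps above can be approximated with a short calculation. The sketch below follows the "default" and "accurate" formulas as the KEDA ScaledJob documentation describes them, but treat it as an approximation to re-check against your KEDA version; the function and parameter names are illustrative.

```python
import math

def jobs_to_create(queue_length, target_per_job, max_replicas,
                   running=0, pending=0, strategy="default"):
    """Approximate number of new Jobs KEDA would create on one polling cycle."""
    max_scale = min(math.ceil(queue_length / target_per_job), max_replicas)
    if strategy == "default":
        wanted = max_scale - running
    elif strategy == "accurate":
        if max_scale + running > max_replicas:
            wanted = max_replicas - running
        else:
            wanted = max_scale - pending
    else:
        raise ValueError(f"unsupported strategy in this sketch: {strategy}")
    return max(wanted, 0)

# 95 messages, 10 per Job, cap of 50 concurrent Jobs
print(jobs_to_create(95, 10, 50))                                   # 10
print(jobs_to_create(95, 10, 50, running=4))                        # 6
print(jobs_to_create(95, 10, 50, running=4, pending=1,
                     strategy="accurate"))                          # 9
```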

# keda-scaledjob-sqs.yaml — scale Jobs based on SQS queue depth
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: sqs-processor
  namespace: production
spec:
  jobTargetRef:
    # Standard Job spec embedded here
    template:
      spec:
        restartPolicy: Never
        serviceAccountName: sqs-consumer-sa
        containers:
        - name: processor
          image: myapp/sqs-processor:v2.0.0
          env:
          - name: SQS_QUEUE_URL
            value: "https://sqs.ap-south-1.amazonaws.com/123456789/myapp-jobs"
          - name: MAX_MESSAGES
            value: "10"     # Each Job Pod processes up to 10 messages
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
  pollingInterval: 30         # Check queue depth every 30 seconds
  maxReplicaCount: 50         # Never create more than 50 concurrent Jobs
  minReplicaCount: 0          # Scale to zero when queue is empty
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  scalingStrategy:
    strategy: "accurate"      # Roughly ceil(queue messages / queueLength) Jobs, capped by maxReplicaCount
    # customScalingQueueLengthDeduction and customScalingRunningJobPercentage are
    # read only when strategy is "custom", so they are left out here.
  triggers:
  - type: aws-sqs-queue
    authenticationRef:
      name: keda-sqs-auth      # TriggerAuthentication with IRSA
    metadata:
      queueURL: "https://sqs.ap-south-1.amazonaws.com/123456789/myapp-jobs"
      queueLength: "10"        # Target: 10 messages per Job Pod
      awsRegion: "ap-south-1"
      scaleOnInFlight: "false" # The "accurate" strategy expects a count that excludes in-flight messages

---
# TriggerAuthentication using AWS IRSA
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-sqs-auth
  namespace: production
spec:
  podIdentity:
    provider: aws-eks   # Use IRSA, no static credentials needed (deprecated since KEDA 2.13 in favor of provider: aws)

KEDA with Kafka Consumer Lag

# keda-scaledjob-kafka.yaml — scale Jobs based on Kafka consumer group lag
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: kafka-event-processor
  namespace: production
spec:
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: consumer
          image: myapp/kafka-consumer:v3.0.0
          env:
          - name: KAFKA_BROKERS
            value: "kafka-broker.production.svc:9092"
          - name: KAFKA_TOPIC
            value: "order-events"
          - name: KAFKA_GROUP
            value: "order-processor"
  pollingInterval: 15
  maxReplicaCount: 30
  minReplicaCount: 0
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: "kafka-broker.production.svc:9092"
      consumerGroup: "order-processor"
      topic: "order-events"
      lagThreshold: "100"     # Create a new Job Pod for every 100 messages of lag
      offsetResetPolicy: "latest"

KEDA vs CronJob: Use KEDA ScaledJob when work arrives unpredictably (event-driven) and you want true scale-to-zero. Use CronJob when work is periodic and predictable. Many production systems combine both: a CronJob generates work items into a queue, and KEDA ScaledJobs process them.

Installing KEDA

# Install KEDA via Helm
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --version 2.14.0

# Verify installation
kubectl get pods -n keda
kubectl get crd | grep keda