AWS Comprehend: Natural Language Processing API for Text Analysis (2026)

Amazon Comprehend is a fully managed Natural Language Processing (NLP) service that uses machine learning to find insights and relationships in text — with no ML expertise required. From sentiment analysis and entity recognition to custom classifiers and topic modeling, Comprehend lets you extract structured meaning from unstructured text at any scale, paying only for what you use.

In 2026, NLP is no longer a specialist skill; it is table stakes for any application that touches user-generated content, support tickets, legal documents, medical records, or social media. Comprehend removes the infrastructure burden and model-training complexity, giving developers a set of pre-trained and customizable APIs that can be called from Python through boto3. This guide covers every major feature with working code, cost notes, and a production-ready integration pattern.

Table of Contents

  1. What is Amazon Comprehend?
  2. Sentiment Analysis
  3. Entity Recognition
  4. Key Phrase Extraction
  5. Language Detection
  6. PII Detection and Redaction
  7. Syntax Analysis
  8. Custom Classifiers
  9. Custom Entity Recognizers
  10. Topic Modeling
  11. Comprehend Medical
  12. Integration Patterns
  13. Cost Model

1. What is Amazon Comprehend?

Amazon Comprehend is part of the AWS AI/ML services family alongside Rekognition (image/video analysis) and SageMaker (custom model training). Unlike SageMaker, Comprehend requires zero model building — AWS maintains and continuously retrains the underlying foundation models.

Built-in vs. Custom NLP

Comprehend offers two tiers of capability:

  • Built-in APIs — Pre-trained, instantly available, no training data needed. Cover sentiment, entities, key phrases, language detection, PII, syntax, and topic modeling.
  • Custom models — You supply labeled training data (CSV or augmented manifest). Comprehend fine-tunes a model specific to your domain. Covers custom classification (multi-class, multi-label) and custom entity recognition.

Core Use Cases

  • Customer support routing — classify tickets by topic and urgency before they hit an agent queue
  • Social media monitoring — real-time sentiment tracking across product mentions
  • Document intelligence — extract entities, key phrases, and relationships from contracts, invoices, or research papers
  • Content moderation — detect PII before storing user-generated content
  • Healthcare analytics — extract diagnoses, medications, and procedures via Comprehend Medical
  • Compliance — redact sensitive information from documents at scale

Prerequisites: boto3 installed (pip install boto3), AWS credentials configured (aws configure), and an IAM role or user with comprehend:* permissions. See the IAM Roles and Policies guide for setup details.

2. Sentiment Analysis

Sentiment analysis classifies text as POSITIVE, NEGATIVE, NEUTRAL, or MIXED, and returns confidence scores for each label. It is one of the most heavily used Comprehend APIs — ideal for customer reviews, social posts, and survey responses.

Basic detect_sentiment() Call

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

response = comprehend.detect_sentiment(
    Text="The new AWS Comprehend API is incredibly fast and easy to use. Highly recommended!",
    LanguageCode='en'
)

print(f"Sentiment: {response['Sentiment']}")
print(f"Scores: {response['SentimentScore']}")
# Sentiment: POSITIVE
# Scores: {'Positive': 0.9876, 'Negative': 0.0012, 'Neutral': 0.0089, 'Mixed': 0.0023}

Response Format

The response object contains:

  • Sentiment — string: POSITIVE | NEGATIVE | NEUTRAL | MIXED
  • SentimentScore — dict with four float values summing to ~1.0
  • ResponseMetadata — standard AWS HTTP metadata

Batch Processing with batch_detect_sentiment()

For throughput, use the batch API to process up to 25 documents per call, significantly reducing round-trip latency.

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

reviews = [
    "Product arrived damaged. Very disappointed.",
    "Exactly what I needed. Fast shipping too!",
    "It was okay, nothing special but does the job.",
    "Terrible customer support, waited 3 days for a reply.",
    "Outstanding quality. Would buy again without hesitation."
]

# batch_detect_sentiment accepts a list of text strings + language
response = comprehend.batch_detect_sentiment(
    TextList=reviews,
    LanguageCode='en'
)

for item in response['ResultList']:
    print(f"[{item['Index']}] {item['Sentiment']} — "
          f"Positive: {item['SentimentScore']['Positive']:.3f}")

# Check for any errors
if response['ErrorList']:
    for err in response['ErrorList']:
        print(f"Error at index {err['Index']}: {err['ErrorMessage']}")

Tip: For very large datasets (thousands of documents), use start_sentiment_detection_job() to run an asynchronous batch job against S3 input files. Comprehend writes the results back to S3 as a compressed archive (output.tar.gz) of line-delimited JSON.
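
When a list is longer than the 25-document limit but too small to justify an async job, the batch call can be wrapped in a chunking loop. The following is a minimal sketch, not an official pattern: chunked() and batch_sentiment() are hypothetical helper names, and client is assumed to be a boto3 Comprehend client (or any object with a compatible batch_detect_sentiment method).

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 25  # per-call document limit described above


def chunked(items: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def batch_sentiment(client, texts: Sequence[str], language_code: str = "en") -> List[dict]:
    """Return one result dict per input text, in input order.

    Texts that the service rejects come back as {'Error': message}.
    """
    results: List[dict] = [{} for _ in texts]
    offset = 0
    for batch in chunked(texts):
        response = client.batch_detect_sentiment(
            TextList=batch, LanguageCode=language_code
        )
        # Index is relative to the batch, so shift it by the batch offset
        for item in response["ResultList"]:
            results[offset + item["Index"]] = {
                "Sentiment": item["Sentiment"],
                "SentimentScore": item["SentimentScore"],
            }
        for err in response.get("ErrorList", []):
            results[offset + err["Index"]] = {"Error": err["ErrorMessage"]}
        offset += len(batch)
    return results
```

Passing the client in as an argument keeps the helper easy to unit-test with a stub and avoids creating a new client on every call.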

3. Entity Recognition

Entity recognition (Named Entity Recognition / NER) identifies and classifies named entities in text. Comprehend recognizes nine built-in entity types: COMMERCIAL_ITEM, DATE, EVENT, LOCATION, ORGANIZATION, OTHER, PERSON, QUANTITY, and TITLE.

import boto3
import json

comprehend = boto3.client('comprehend', region_name='us-east-1')

text = """
Amazon Web Services announced its new data center in Hyderabad, India on March 15, 2026.
The $3.5 billion facility will be managed by CEO Andy Jassy and serve customers across Asia Pacific.
"""

response = comprehend.detect_entities(
    Text=text,
    LanguageCode='en'
)

for entity in response['Entities']:
    print(f"  Type: {entity['Type']:<20} Text: {entity['Text']:<30} Score: {entity['Score']:.3f}")

# Output:
#   Type: ORGANIZATION         Text: Amazon Web Services          Score: 0.998
#   Type: LOCATION             Text: Hyderabad, India             Score: 0.996
#   Type: DATE                 Text: March 15, 2026               Score: 0.999
#   Type: QUANTITY             Text: $3.5 billion                 Score: 0.994
#   Type: TITLE                Text: CEO                          Score: 0.981
#   Type: PERSON               Text: Andy Jassy                   Score: 0.997

Each entity result includes:

  • Text — the exact string matched in the source text
  • Type — entity category
  • Score — confidence from 0 to 1
  • BeginOffset / EndOffset — character positions, useful for highlighting in a UI

Use offsets for highlighting: Store BeginOffset and EndOffset alongside entity results to render inline annotations in document review tools without re-running the API.
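
As an illustration of offset-based highlighting, the sketch below wraps each stored span in a plain-text marker. highlight_entities() and the [[TYPE:text]] marker format are assumptions made for this example, and the code assumes entity spans do not overlap.

```python
from typing import List


def highlight_entities(text: str, entities: List[dict]) -> str:
    """Wrap each entity span as [[TYPE:matched text]] using stored offsets."""
    out = text
    # Work right to left so earlier offsets stay valid after each insertion
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        begin, end = ent["BeginOffset"], ent["EndOffset"]
        out = f"{out[:begin]}[[{ent['Type']}:{out[begin:end]}]]{out[end:]}"
    return out


sample = "Andy Jassy visited Hyderabad."
stored = [
    {"Type": "PERSON", "BeginOffset": 0, "EndOffset": 10},
    {"Type": "LOCATION", "BeginOffset": 19, "EndOffset": 28},
]
print(highlight_entities(sample, stored))
# [[PERSON:Andy Jassy]] visited [[LOCATION:Hyderabad]].
```

A real UI would emit HTML spans instead of bracket markers, but the right-to-left replacement order is the part that carries over.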

4. Key Phrase Extraction

Key phrase extraction identifies the most important noun phrases in a document. Unlike entity recognition which targets named things, key phrases capture conceptual topics — ideal for automatic document tagging, search index enrichment, and content recommendation.

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

document = """
Kubernetes cluster autoscaling relies on the Horizontal Pod Autoscaler and Cluster Autoscaler
working in tandem. Proper resource requests and limits are essential for predictable scaling
behavior in production environments running microservices workloads.
"""

response = comprehend.detect_key_phrases(
    Text=document,
    LanguageCode='en'
)

# Sort by score descending and print top phrases
phrases = sorted(response['KeyPhrases'], key=lambda x: x['Score'], reverse=True)
for phrase in phrases[:8]:
    print(f"  {phrase['Text']:<45} Score: {phrase['Score']:.3f}")

# Output:
#   Horizontal Pod Autoscaler                     Score: 0.999
#   Cluster Autoscaler                            Score: 0.998
#   Kubernetes cluster autoscaling                Score: 0.997
#   production environments                       Score: 0.994
#   predictable scaling behavior                  Score: 0.991
#   microservices workloads                       Score: 0.988
#   resource requests                             Score: 0.985
#   Proper resource requests and limits           Score: 0.982

Document Indexing Pipeline

A practical use case: when a new document is uploaded to S3, a Lambda function extracts key phrases and stores them as tags in DynamoDB. Users can then find documents by looking up a phrase in that index, which avoids scanning the raw text of every document at query time.
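
The tagging step in that pipeline might look like the sketch below. It works on the KeyPhrases list returned by detect_key_phrases() and builds an in-memory phrase-to-document index. The function names, the 0.9 score threshold, and the use of a plain dict in place of DynamoDB are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def phrases_to_tags(key_phrases: List[dict], min_score: float = 0.9) -> List[str]:
    """Lower-case, collapse whitespace, de-duplicate, and drop low-score phrases."""
    tags: List[str] = []
    for phrase in key_phrases:
        tag = " ".join(phrase["Text"].lower().split())
        if phrase["Score"] >= min_score and tag not in tags:
            tags.append(tag)
    return tags


def build_index(docs: Dict[str, List[dict]]) -> Dict[str, Set[str]]:
    """Map each tag to the set of document IDs that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, key_phrases in docs.items():
        for tag in phrases_to_tags(key_phrases):
            index[tag].add(doc_id)
    return dict(index)
```

Normalizing case and whitespace before storing a tag keeps "Cluster Autoscaler" and "cluster autoscaler" from becoming separate index entries.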

5. Language Detection

Comprehend can identify the dominant language of a text sample from over 100 languages. This is essential for multi-lingual applications where downstream processing (translation, sentiment analysis) must use the correct language code.

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

samples = [
    "Machine learning is transforming every industry.",       # English
    "El aprendizaje automático transforma cada industria.",   # Spanish
    "机器学习正在改变每个行业。",                                   # Chinese
    "Das maschinelle Lernen verändert jede Branche.",         # German
    "L'apprentissage automatique transforme chaque secteur.", # French
]

for text in samples:
    response = comprehend.detect_dominant_language(Text=text)
    top = response['Languages'][0]
    print(f"  Lang: {top['LanguageCode']}  Score: {top['Score']:.3f}  Text: {text[:50]}")

Multi-Language Content Routing

In a customer support system, detect the language first, then route to the appropriate sentiment/entity endpoint with the correct LanguageCode — or trigger an Amazon Translate job before NLP processing. This pattern keeps latency low while supporting a global user base without separate per-language pipelines.

Language support: Built-in APIs (sentiment, entities, key phrases, syntax) support a subset of all detected languages. Check the AWS docs for the supported language matrix per API — English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese, Arabic, and Hindi cover the vast majority of production use cases.
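
The routing decision described above can be sketched as a small function. The language set below simply mirrors the list in the note and is an assumption, not the authoritative per-API matrix. choose_route(), its return labels, and the 0.8 confidence threshold are likewise hypothetical.

```python
from typing import Tuple

# Assumed for illustration; confirm against the per-API language matrix
ANALYSIS_LANGUAGES = {"en", "es", "fr", "de", "it", "pt", "ja", "ko", "zh", "ar", "hi"}


def choose_route(client, text: str, min_score: float = 0.8) -> Tuple[str, str]:
    """Return (route, language_code).

    route is 'analyze' when the language can be passed straight to the NLP
    calls, 'translate' when it should go through translation first, and
    'review' when detection is missing or not confident enough.
    """
    languages = client.detect_dominant_language(Text=text)["Languages"]
    if not languages:
        return ("review", "")
    top = max(languages, key=lambda lang: lang["Score"])
    code = top["LanguageCode"]
    if top["Score"] < min_score:
        return ("review", code)
    if code in ANALYSIS_LANGUAGES:
        return ("analyze", code)
    return ("translate", code)
```

Here client is expected to be a boto3 Comprehend client. The returned code can be passed as LanguageCode to the sentiment or entity calls when the route is 'analyze'.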

6. PII Detection and Redaction

PII (Personally Identifiable Information) detection finds sensitive data like names, addresses, phone numbers, email addresses, SSNs, credit card numbers, and passport numbers. This is critical for GDPR, HIPAA, and CCPA compliance before storing or logging user content.

detect_pii_entities()

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

text = """
Please contact John Smith at john.smith@example.com or call +1-555-867-5309.
His account SSN is 123-45-6789 and credit card ending 4532 1234 5678 9012.
Shipping address: 742 Evergreen Terrace, Springfield, IL 62701.
"""

response = comprehend.detect_pii_entities(
    Text=text,
    LanguageCode='en'
)

print("PII Entities Found:")
for entity in response['Entities']:
    snippet = text[entity['BeginOffset']:entity['EndOffset']]
    print(f"  Type: {entity['Type']:<20} Value: {snippet:<30} Score: {entity['Score']:.3f}")

Redaction with detect_pii_entities()

For a quick compliance gate that does not need exact positions, contains_pii_entities() returns the PII types present in the text with a confidence score for each. To redact, use the offset data from detect_pii_entities():

import boto3

def redact_pii(text: str, language_code: str = 'en') -> str:
    """Replace each PII span with a marker naming its type, e.g. [EMAIL]."""
    comprehend = boto3.client('comprehend', region_name='us-east-1')
    response = comprehend.detect_pii_entities(Text=text, LanguageCode=language_code)

    # Sort entities in reverse order so offsets stay valid as we replace
    entities = sorted(response['Entities'], key=lambda e: e['BeginOffset'], reverse=True)

    result = list(text)
    for entity in entities:
        start = entity['BeginOffset']
        end = entity['EndOffset']
        replacement = f"[{entity['Type']}]"
        result[start:end] = list(replacement)

    return "".join(result)

# Usage (text is the sample string from the previous example)
clean_text = redact_pii(text)
print(clean_text)
# Illustrative output; the exact spans depend on what the model detects:
# Please contact [NAME] at [EMAIL] or call [PHONE].
# His account SSN is [SSN] and credit card ending [CREDIT_DEBIT_NUMBER].
# Shipping address: [ADDRESS].

Async redaction at scale: For bulk document processing, use start_pii_entities_detection_job() with a RedactionConfig to have Comprehend write pre-redacted output documents directly to S3, with no Lambda code needed for the replacement logic.
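
The compliance gate built on contains_pii_entities() could be sketched as follows. The response is assumed to carry a Labels list of {Name, Score} entries. pii_types_present() and the 0.5 threshold are hypothetical choices for this example.

```python
from typing import List


def pii_types_present(client, text: str, language_code: str = "en",
                      threshold: float = 0.5) -> List[str]:
    """Return the sorted PII type names whose score meets the threshold.

    An empty list means the text passed the gate.
    """
    response = client.contains_pii_entities(Text=text, LanguageCode=language_code)
    return sorted(
        label["Name"]
        for label in response["Labels"]
        if label["Score"] >= threshold
    )
```

A caller would typically store the content unchanged when the list is empty and fall back to offset-based redaction otherwise.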

7. Syntax Analysis

Syntax analysis (part-of-speech tagging) identifies the grammatical role of each token in a sentence — NOUN, VERB, ADJECTIVE, ADVERB, PROPN (proper noun), DET (determiner), etc. This is useful for building grammar-aware search, content simplification tools, or feeding downstream NLP pipelines.

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

response = comprehend.detect_syntax(
    Text="The serverless Lambda function processes incoming S3 events efficiently.",
    LanguageCode='en'
)

for token in response['SyntaxTokens']:
    pos = token['PartOfSpeech']
    print(f"  Token: {token['Text']:<15} POS: {pos['Tag']:<8} Score: {pos['Score']:.3f}")

# Output:
#   Token: The             POS: DET      Score: 1.000
#   Token: serverless      POS: ADJ      Score: 0.997
#   Token: Lambda          POS: PROPN    Score: 0.994
#   Token: function        POS: NOUN     Score: 0.999
#   Token: processes       POS: VERB     Score: 0.998
#   Token: incoming        POS: ADJ      Score: 0.992
#   Token: S3              POS: PROPN    Score: 0.989
#   Token: events          POS: NOUN     Score: 0.999
#   Token: efficiently     POS: ADV      Score: 0.998

Common POS tags: ADJ, ADP (adposition), ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, O (other), PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB.
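
One simple way to feed a downstream pipeline is to filter the SyntaxTokens list by tag. The sketch below keeps nouns and proper nouns. tokens_with_tags() and its default threshold are assumptions for illustration, and the sample tokens are copied from the output shown earlier in this section.

```python
from typing import Iterable, List


def tokens_with_tags(syntax_tokens: List[dict],
                     tags: Iterable[str] = ("NOUN", "PROPN"),
                     min_score: float = 0.9) -> List[str]:
    """Return token texts whose POS tag is in `tags` with sufficient confidence."""
    wanted = set(tags)
    return [
        token["Text"]
        for token in syntax_tokens
        if token["PartOfSpeech"]["Tag"] in wanted
        and token["PartOfSpeech"]["Score"] >= min_score
    ]


sample_tokens = [
    {"Text": "The", "PartOfSpeech": {"Tag": "DET", "Score": 1.000}},
    {"Text": "Lambda", "PartOfSpeech": {"Tag": "PROPN", "Score": 0.994}},
    {"Text": "function", "PartOfSpeech": {"Tag": "NOUN", "Score": 0.999}},
    {"Text": "processes", "PartOfSpeech": {"Tag": "VERB", "Score": 0.998}},
    {"Text": "events", "PartOfSpeech": {"Tag": "NOUN", "Score": 0.999}},
]
print(tokens_with_tags(sample_tokens))
# ['Lambda', 'function', 'events']
```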

8. Custom Classifiers

When the built-in sentiment labels (POSITIVE/NEGATIVE/etc.) are too coarse, train a custom classifier. Examples: routing support tickets to departments (Billing, Technical, Returns), classifying legal clauses by type, or labeling news articles by topic.

Training Data Format (CSV)

Prepare a UTF-8 CSV with two columns — label first, then text. No header row.

# training-data.csv
BILLING,"My invoice shows duplicate charges from last month."
BILLING,"I was charged twice for the annual subscription."
TECHNICAL,"The app keeps crashing when I open the settings screen."
TECHNICAL,"Login fails with error code 503 on mobile devices."
RETURNS,"I want to return the product I received yesterday."
RETURNS,"How do I initiate a refund for my recent order?"
GENERAL,"What are your business hours?"
GENERAL,"Can you tell me more about your premium plan?"

Upload the CSV to S3, then create the classifier:

import time

import boto3

comprehend = boto3.client('comprehend', region_name='us-east-1')

# Step 1: Create the classifier (training takes 30–90 minutes)
response = comprehend.create_document_classifier(
    DocumentClassifierName='SupportTicketClassifier-v1',
    DataAccessRoleArn='arn:aws:iam::123456789012:role/ComprehendDataAccessRole',
    InputDataConfig={
        'DataFormat': 'COMPREHEND_CSV',
        'S3Uri': 's3://my-comprehend-bucket/training-data/training-data.csv'
    },
    OutputDataConfig={
        'S3Uri': 's3://my-comprehend-bucket/classifier-output/'
    },
    LanguageCode='en',
    Mode='MULTI_CLASS'   # or 'MULTI_LABEL' for documents with multiple tags
)

classifier_arn = response['DocumentClassifierArn']
print(f"Classifier ARN: {classifier_arn}")

# Step 2: Wait for a terminal status (poll or use EventBridge).
# A failed training run reports IN_ERROR, so include it to avoid looping forever.
while True:
    status = comprehend.describe_document_classifier(
        DocumentClassifierArn=classifier_arn
    )['DocumentClassifierProperties']['Status']
    print(f"Status: {status}")
    if status in ('TRAINED', 'TRAINED_WITH_WARNING', 'IN_ERROR', 'STOPPED'):
        break
    time.sleep(60)

Real-Time vs. Async Inference

Real-time endpoint — create an endpoint from a trained classifier for low-latency synchronous classification (suitable for live ticket routing):

# Create real-time endpoint
endpoint_response = comprehend.create_endpoint(
    EndpointName='support-classifier-endpoint',
    ModelArn=classifier_arn,
    DesiredInferenceUnits=1   # scale up for higher throughput
)
endpoint_arn = endpoint_response['EndpointArn']

# The endpoint must reach IN_SERVICE before it can serve requests
while True:
    endpoint_status = comprehend.describe_endpoint(
        EndpointArn=endpoint_arn
    )['EndpointProperties']['Status']
    if endpoint_status in ('IN_SERVICE', 'FAILED'):
        break
    time.sleep(30)

# Classify a document synchronously
result = comprehend.classify_document(
    Text="I was charged twice on my credit card this billing cycle.",
    EndpointArn=endpoint_arn
)
print(result['Classes'])
# [{'Name': 'BILLING', 'Score': 0.9823}, {'Name': 'GENERAL', 'Score': 0.0124}, ...]

Async batch job — use start_document_classification_job() for bulk inference against an S3 folder without a persistent endpoint (more cost-efficient for non-real-time workloads).

Data tip: Aim for at least 50 examples per class for acceptable accuracy, 500+ per class for production quality. Balance classes within a 10:1 ratio to avoid majority-class bias.
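
Before uploading the CSV, the two rules of thumb in the data tip can be checked locally. This is a rough sketch that assumes the two-column, no-header format shown earlier. class_balance() and its defaults are hypothetical and simply encode the 50-example and 10:1 guidelines from the tip.

```python
import csv
import io
from collections import Counter


def class_balance(csv_text: str, min_per_class: int = 50,
                  max_ratio: float = 10.0) -> dict:
    """Summarise label counts for a (label, text) CSV with no header row."""
    counts = Counter(row[0] for row in csv.reader(io.StringIO(csv_text)) if row)
    if not counts:
        raise ValueError("no rows found")
    largest, smallest = max(counts.values()), min(counts.values())
    return {
        "counts": dict(counts),
        # Labels with fewer examples than the guideline minimum
        "too_small": sorted(l for l, n in counts.items() if n < min_per_class),
        # Ratio of the largest class to the smallest class
        "ratio": largest / smallest,
        "within_ratio": largest / smallest <= max_ratio,
    }
```

Calling class_balance(open("training-data.csv", encoding="utf-8").read()) on the real file gives a quick report of which labels need more examples.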

9. Custom Entity Recognizers

Custom entity recognizers let you teach Comprehend domain-specific entity types not covered by the nine built-in types. Examples: product SKU codes, internal employee IDs, medical drug names, or proprietary terminology.

Annotation Format

You supply two files: a plain text document corpus and an annotations CSV mapping entity spans to their type.

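# annotations.csv (columns: File, Line, Begin, End, Type)
# Offsets are zero-based character positions within the line; End is taken
# here as one past the last character, which should be checked against the
# AWS annotation format documentation.
documents.txt,0,15,31,PRODUCT_SKU
documents.txt,1,0,16,PRODUCT_SKU
documents.txt,2,21,30,EMPLOYEE_ID

# documents.txt (one document per line)
Order contains SKU-A1234-XL-BLK in the cart.
SKU-B9876-SM-RED was restocked today.
Assigned to employee EMP-00421 for processing.

import boto3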

comprehend = boto3.client('comprehend', region_name='us-east-1')

response = comprehend.create_entity_recognizer(
    RecognizerName='ProductSKURecognizer-v1',
    DataAccessRoleArn='arn:aws:iam::123456789012:role/ComprehendDataAccessRole',
    InputDataConfig={
        'EntityTypes': [
            {'Type': 'PRODUCT_SKU'},
            {'Type': 'EMPLOYEE_ID'}
        ],
        'Documents': {
            'S3Uri': 's3://my-comprehend-bucket/entity-training/documents.txt'
        },
        'Annotations': {
            'S3Uri': 's3://my-comprehend-bucket/entity-training/annotations.csv'
        }
    },
    LanguageCode='en'
)

recognizer_arn = response['EntityRecognizerArn']
print(f"Recognizer ARN: {recognizer_arn}")

Once trained, deploy via an endpoint and use detect_entities() with the EndpointArn parameter to invoke your custom recognizer in real time.
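
Offsets in an annotations file are easy to get wrong when counted by hand, so it can help to derive them from the text. The sketch below is one way to do that. annotation_row() is a hypothetical helper, and it assumes the End value is exclusive (one past the last character), which is worth confirming against the AWS annotation format documentation.

```python
def annotation_row(file_name: str, line_no: int, line_text: str,
                   entity_text: str, entity_type: str) -> str:
    """Build one annotations CSV row: File, Line, Begin, End, Type."""
    begin = line_text.find(entity_text)
    if begin < 0:
        raise ValueError(f"{entity_text!r} not found on line {line_no}")
    end = begin + len(entity_text)  # assumed exclusive end offset
    return f"{file_name},{line_no},{begin},{end},{entity_type}"


lines = [
    "Order contains SKU-A1234-XL-BLK in the cart.",
    "SKU-B9876-SM-RED was restocked today.",
    "Assigned to employee EMP-00421 for processing.",
]
print(annotation_row("documents.txt", 0, lines[0], "SKU-A1234-XL-BLK", "PRODUCT_SKU"))
print(annotation_row("documents.txt", 2, lines[2], "EMP-00421", "EMPLOYEE_ID"))
# documents.txt,0,15,31,PRODUCT_SKU
# documents.txt,2,21,30,EMPLOYEE_ID
```

str.find() returns only the first match, so a line that mentions the same entity twice would need a loop over all occurrences.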

10. Topic Modeling

Topic modeling uses Latent Dirichlet Allocation (LDA) to discover the dominant themes across a large document corpus — without any predefined labels. It is inherently asynchronous; input and output both live in S3.

import boto3
import time

comprehend = boto3.client('comprehend', region_name='us-east-1')

# Start an async topic detection job
response = comprehend.start_topics_detection_job(
    InputDataConfig={
        'S3Uri': 's3://my-comprehend-bucket/topic-input/',
        'InputFormat': 'ONE_DOC_PER_FILE'   # or ONE_DOC_PER_LINE
    },
    OutputDataConfig={
        'S3Uri': 's3://my-comprehend-bucket/topic-output/'
    },
    DataAccessRoleArn='arn:aws:iam::123456789012:role/ComprehendDataAccessRole',
    NumberOfTopics=10,   # 1–100; start with 10–20 and tune
    JobName='BlogTopicAnalysis-2026-06'
)

job_id = response['JobId']
print(f"Started job: {job_id}")

# Poll until complete
while True:
    job = comprehend.describe_topics_detection_job(JobId=job_id)
    status = job['TopicsDetectionJobProperties']['JobStatus']
    print(f"Status: {status}")
    if status in ('COMPLETED', 'FAILED', 'STOPPED'):
        break
    time.sleep(30)

Parsing S3 Output

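The completed job writes a single compressed archive, output.tar.gz, under a job-specific prefix in your output S3 path. The archive contains two CSV files:

  • topic-terms.csv: each row is topic,term,weight and lists the top terms defining each topic
  • doc-topics.csv: each row is docname,topic,proportion and shows how much each document belongs to each topic

import boto3
import csv
import io
import tarfile
from urllib.parse import urlparse

comprehend = boto3.client('comprehend', region_name='us-east-1')
s3 = boto3.client('s3')

# The job description reports the full S3 URI of the output archive
job = comprehend.describe_topics_detection_job(JobId=job_id)
output_uri = job['TopicsDetectionJobProperties']['OutputDataConfig']['S3Uri']
parsed = urlparse(output_uri)

# Download the archive and read topic-terms.csv from it
obj = s3.get_object(Bucket=parsed.netloc, Key=parsed.path.lstrip('/'))
with tarfile.open(fileobj=io.BytesIO(obj['Body'].read()), mode='r:gz') as tar:
    member = next(m for m in tar.getmembers() if m.name.endswith('topic-terms.csv'))
    reader = csv.DictReader(io.TextIOWrapper(tar.extractfile(member), encoding='utf-8'))
    current_topic = None
    for row in reader:
        if row['topic'] != current_topic:
            current_topic = row['topic']
            print(f"\n--- Topic {current_topic} ---")
        print(f"  {row['term']:<25} weight: {float(row['weight']):.4f}")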

Choosing NumberOfTopics: There is no automatic optimal number. Start with 10–15 for a new corpus, inspect the term lists, then re-run with a higher or lower count until topics are coherent and distinct. Coherence scores from external tools like Gensim can help guide the choice.

11. Comprehend Medical

Amazon Comprehend Medical is a separate but related service optimized for clinical text — physician notes, discharge summaries, lab reports. It understands medical ontologies and can extract entities, PHI (Protected Health Information), and map findings to standard codes.

import boto3

cm = boto3.client('comprehendmedical', region_name='us-east-1')

clinical_note = """
Patient: Jane Doe, DOB 1980-03-22. Diagnosed with Type 2 Diabetes Mellitus (E11.9).
Prescribed Metformin 500mg twice daily. Blood pressure 145/92 mmHg.
Referred to Dr. Patel at Mysore General Hospital for nephrology follow-up.
"""

# Detect medical entities
response = cm.detect_entities_v2(Text=clinical_note)

print("Medical Entities:")
for entity in response['Entities']:
    print(f"  Category: {entity['Category']:<22} Type: {entity['Type']:<25} Text: {entity['Text']}")

# Detect PHI separately
phi_response = cm.detect_phi(Text=clinical_note)
print("\nPHI Entities:")
for phi in phi_response['Entities']:
    print(f"  Type: {phi['Type']:<20} Text: {phi['Text']}")

Comprehend Medical entity categories include: MEDICATION, MEDICAL_CONDITION, ANATOMY, TEST_TREATMENT_PROCEDURE, TIME_EXPRESSION, and PROTECTED_HEALTH_INFORMATION. It also supports infer_icd10_cm() to map conditions to ICD-10 codes and infer_rx_norm() to map medications to RxNorm identifiers — critical for EHR integration and clinical analytics.
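
A sketch of the ICD-10 mapping step is shown below. It assumes each entity in the infer_icd10_cm() response carries an ICD10CMConcepts list with Code, Description, and Score fields. top_icd10_codes() and the 0.5 threshold are hypothetical, and cm stands for a boto3 comprehendmedical client.

```python
from typing import List, Tuple


def top_icd10_codes(cm, text: str, min_score: float = 0.5) -> List[Tuple[str, str, str]]:
    """Return (entity text, best ICD-10-CM code, description) per detected condition."""
    response = cm.infer_icd10_cm(Text=text)
    results = []
    for entity in response["Entities"]:
        concepts = entity.get("ICD10CMConcepts", [])
        if not concepts:
            continue
        # Keep only the highest-scoring candidate code for each entity
        best = max(concepts, key=lambda c: c["Score"])
        if best["Score"] >= min_score:
            results.append((entity["Text"], best["Code"], best["Description"]))
    return results
```

Suggested codes are candidates for human review, not a coding decision, so the threshold should be tuned to how the results will be used.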

HIPAA note: Comprehend Medical is HIPAA-eligible. Ensure your AWS account has signed a Business Associate Addendum (BAA) with AWS before processing real PHI in any environment.

12. Integration Patterns

Comprehend works best as a component in an event-driven pipeline. A canonical pattern for real-time document analysis:

S3 → Lambda → Comprehend → DynamoDB Pipeline

import boto3
import json
import os
from datetime import datetime, timezone
from urllib.parse import unquote_plus

comprehend = boto3.client('comprehend', region_name=os.environ['AWS_REGION'])
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb', region_name=os.environ['AWS_REGION'])
table = dynamodb.Table(os.environ['RESULTS_TABLE'])

MAX_BYTES = 5000  # synchronous size limit used in this guide

def lambda_handler(event, context):
    """Triggered by S3 PutObject event on the documents bucket."""
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])

        # Fetch the document and truncate by bytes, not characters.
        # errors='ignore' drops a multibyte character cut in half by the slice.
        obj = s3.get_object(Bucket=bucket, Key=key)
        text = obj['Body'].read()[:MAX_BYTES].decode('utf-8', errors='ignore')

        # These calls run one after another; use threads or asyncio for real parallelism
        sentiment = comprehend.detect_sentiment(Text=text, LanguageCode='en')
        entities  = comprehend.detect_entities(Text=text, LanguageCode='en')
        phrases   = comprehend.detect_key_phrases(Text=text, LanguageCode='en')

        # Store enriched metadata in DynamoDB
        table.put_item(Item={
            'documentId': key,
            'sentiment': sentiment['Sentiment'],
            'sentimentScores': {
                k: str(round(v, 4))
                for k, v in sentiment['SentimentScore'].items()
            },
            'topEntities': [
                {'text': e['Text'], 'type': e['Type']}
                for e in sorted(entities['Entities'],
                                key=lambda x: x['Score'], reverse=True)[:5]
            ],
            'keyPhrases': [p['Text'] for p in phrases['KeyPhrases'][:10]],
            'processedAt': datetime.now(timezone.utc).isoformat(),
            'requestId': context.aws_request_id
        })

    return {'statusCode': 200, 'body': json.dumps('Processed successfully')}

Architecture details:

  • S3 bucket notification triggers Lambda on new object creation
  • Lambda reads the document, calls three Comprehend APIs, writes to DynamoDB
  • Downstream dashboards query DynamoDB for aggregated sentiment trends, entity frequencies, and phrase clouds
  • For monitoring and alerting on pipeline health, use CloudWatch metrics and alarms
  • For more complex multi-step workflows (e.g., detect language → translate → analyze sentiment), orchestrate with Step Functions
  • For event fan-out (e.g., routing processed results to multiple consumers), use EventBridge

Text size limits: Synchronous APIs accept up to 5,000 bytes, not characters: UTF-8 multibyte characters count as 2–4 bytes each. For larger documents, split into paragraphs, process each, and aggregate results. Async batch jobs support individual files up to 100 KB.
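
The paragraph-splitting approach can be sketched as below. split_by_bytes() is a hypothetical helper that packs paragraphs into chunks measured in UTF-8 bytes and falls back to cutting a paragraph character by character when it exceeds the limit on its own. It assumes max_bytes is at least 4 so that any single character fits.

```python
from typing import List


def split_by_bytes(text: str, max_bytes: int = 5000) -> List[str]:
    """Split text into chunks whose UTF-8 encoding is at most max_bytes."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # The paragraph did not fit alongside the previous content; it may
        # also be too large on its own, so fill pieces character by character.
        piece, piece_bytes = "", 0
        for ch in paragraph:
            ch_bytes = len(ch.encode("utf-8"))
            if piece and piece_bytes + ch_bytes > max_bytes:
                chunks.append(piece)
                piece, piece_bytes = "", 0
            piece += ch
            piece_bytes += ch_bytes
        current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to a synchronous call and the results aggregated, for example by averaging sentiment scores weighted by chunk length.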

13. Cost Model

Comprehend charges per unit of 100 characters (with a minimum of 300 characters per API call). Pricing varies by API and tier.

Built-in API Pricing (us-east-1, 2026)

API                                                  Price per 100 chars   Notes
Sentiment, Entities, Key Phrases, Language, Syntax   $0.0001               First 10M units/mo; volume discounts after
PII Detection                                        $0.0001               Same tier structure
Async batch jobs                                     $0.0001               Same rate, lower per-request overhead

Custom Model Pricing

Activity                                         Price
Custom classifier / entity recognizer training   $3.00 per training hour
Async custom inference (batch)                   $0.0005 per 100 chars
Real-time endpoint (Inference Unit)              $0.0005 per second per IU (about $1.80 per hour)

Free Tier

New AWS accounts get 50,000 units (5 million characters) per month free for each of the standard APIs for the first 12 months. Because every request is billed at a minimum of 3 units, this covers roughly 16,000 short customer reviews (300 characters or fewer) per month at no charge, which is sufficient to build and validate a proof of concept.
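
The unit arithmetic can be worked through in a few lines. This sketch uses the figures quoted in this section (100 characters per unit, a 3-unit minimum per request, $0.0001 per unit, and a 50,000-unit free allowance) as assumptions. billable_units() and estimate_cost() are hypothetical helpers, and volume discounts are ignored.

```python
import math
from typing import Iterable

UNIT_CHARS = 100   # characters per billing unit
MIN_UNITS = 3      # minimum charge per request (300 characters)


def billable_units(char_count: int) -> int:
    """Units billed for one request of the given length."""
    return max(MIN_UNITS, math.ceil(char_count / UNIT_CHARS))


def estimate_cost(lengths: Iterable[int], price_per_unit: float = 0.0001,
                  free_units: int = 0) -> float:
    """Estimated charge in USD for one API across all requests."""
    units = sum(billable_units(n) for n in lengths)
    return max(0, units - free_units) * price_per_unit


# 20,000 reviews of 250 characters each are billed as 3 units apiece
monthly = estimate_cost([250] * 20_000, free_units=50_000)
print(f"${monthly:.2f}")
# $1.00
```

The example shows why short texts cost more per character than the headline rate suggests: a 250-character review is billed as 300 characters.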

Cost optimization tips:
  • Use async batch jobs instead of synchronous calls for large volumes — same rate but far fewer API round-trips.
  • Pre-filter documents before calling Comprehend (e.g., skip very short strings below 20 characters where NLP results are unreliable).
  • Cache results in DynamoDB keyed by a hash of the input text — avoid re-processing unchanged documents.
  • Delete custom endpoints when not in use — idle endpoints still accrue the hourly inference unit charge.
  • Compare against Amazon Bedrock foundation models for tasks where higher reasoning quality justifies the cost difference.
