Microsoft AutoGen is an open-source framework for building multi-agent AI applications in which LLM-powered agents collaborate through structured conversations to solve complex tasks. Unlike single-agent chains, AutoGen agents can debate, critique, write and execute code, call tools, and route sub-tasks to specialists, which often yields better results than a single prompt on multi-step tasks. AutoGen 0.4 introduced an actor-model runtime with async messaging aimed at long-running autonomous workflows. The examples in this guide use the classic pyautogen API (import autogen), which predates that redesign.
This guide covers the core building blocks: ConversableAgent, AssistantAgent, UserProxyAgent, group chat with a manager, tool registration, code execution sandboxing, custom agent patterns, and a real-world example of a software engineering team of agents.
AutoGen works with any OpenAI-compatible API: OpenAI, Azure OpenAI, Anthropic (via litellm), Ollama for local models, and more. Configuration is passed as a list of config dicts; if a call with one entry fails, AutoGen tries the next, which gives automatic fallback across models. Install the core package, plus optional extras for long-term memory and multimodal agents as needed.
pip install pyautogen
pip install "pyautogen[teachable]"  # Long-term memory with ChromaDB
pip install "pyautogen[lmm]"        # Multimodal agents (vision)
# Optional: litellm proxy (used for Anthropic; Ollama also exposes its own OpenAI-compatible /v1 endpoint)
pip install litellm
import autogen

# LLM configuration: supports multiple models with fallback
config_list = [
    {
        "model": "gpt-4o",
        "api_key": "sk-...",
    },
    # Fallback to GPT-4-turbo if gpt-4o fails
    {
        "model": "gpt-4-turbo",
        "api_key": "sk-...",
    },
]

# Or load from the OAI_CONFIG_LIST environment variable / file
config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

# LLM config with caching and timeout
llm_config = {
    "config_list": config_list,
    "cache_seed": 42,  # Cache responses; identical requests replay the cached reply
    "timeout": 120,  # Seconds per request
    "temperature": 0.1,
}

# Local model via Ollama
local_config = {
    "config_list": [{
        "model": "llama3.1:70b",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",  # Required but ignored by Ollama
    }]
}
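For reference, the OAI_CONFIG_LIST file is a JSON array of the same dicts used in config_list. The following is a minimal sketch, using only the standard library, that writes such a file, reads it back, and filters it by model. The helper name filter_configs, the placeholder keys, and the temporary path are illustrative assumptions; AutoGen's own loader may differ in details.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical contents of an OAI_CONFIG_LIST file: a JSON array of config dicts.
configs = [
    {"model": "gpt-4o", "api_key": "YOUR_KEY"},
    {"model": "gpt-4-turbo", "api_key": "YOUR_KEY"},
    {"model": "llama3.1:70b", "base_url": "http://localhost:11434/v1", "api_key": "ollama"},
]


def filter_configs(config_list, models):
    """Keep only entries whose model is in `models`, preserving order."""
    return [cfg for cfg in config_list if cfg["model"] in models]


# Round-trip through a file to show the on-disk shape.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "OAI_CONFIG_LIST"
    path.write_text(json.dumps(configs, indent=2), encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))

hosted = filter_configs(loaded, {"gpt-4o", "gpt-4-turbo"})
print([cfg["model"] for cfg in hosted])  # ['gpt-4o', 'gpt-4-turbo']
```

Keeping keys in a file or environment variable rather than in source code also keeps them out of version control.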
The simplest AutoGen pattern is a two-agent conversation: an AssistantAgent (the AI) and a UserProxyAgent (acts as the human). The proxy agent can automatically execute code blocks the assistant produces. Setting human_input_mode="NEVER" makes the workflow fully autonomous — no human approval needed.
import autogen

llm_config = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]}

# AI assistant agent
assistant = autogen.AssistantAgent(
    name="Assistant",
    llm_config=llm_config,
    system_message=(
        "You are a helpful AI assistant. Solve tasks step by step. "
        "When you write Python code, use print() to show results."
    ),
)

# Proxy agent: represents the human, can execute code
user_proxy = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",  # Fully autonomous
    max_consecutive_auto_reply=5,  # Safety limit
    # "or ''" guards against messages whose content is None
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "workspace",
        "use_docker": False,  # Set True for sandboxed execution
    },
)

# Start the conversation
user_proxy.initiate_chat(
    assistant,
    message=(
        "Calculate the first 20 Fibonacci numbers and plot them as a bar chart. "
        "Save the chart as fibonacci.png. Reply TERMINATE when done."
    ),
)
# The agents will converse, write Python code, execute it, fix any errors,
# and iterate until the task is complete.
Set use_docker=True in production to sandbox code execution inside a Docker container. Running arbitrary LLM-generated code without a sandbox is a security risk.
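The termination check receives each incoming message as a dict. Below is a small sketch of a named predicate that can be passed as is_termination_msg in place of a lambda. It assumes that content may be None for some messages (for example, ones that carry only a tool call), so it falls back to an empty string before calling string methods.

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when the message text ends with the TERMINATE keyword."""
    content = message.get("content") or ""  # assumption: content can be None
    return content.rstrip().endswith("TERMINATE")


print(is_termination_msg({"content": "Chart saved. TERMINATE\n"}))  # True
print(is_termination_msg({"content": None}))                        # False
print(is_termination_msg({"content": "Reply TERMINATE when done."}))  # False
```

Matching only at the end of the message avoids stopping early when the keyword is merely quoted in the middle of a reply.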
AutoGen's code execution loop is its signature feature: the assistant writes code, the proxy executes it and reports the output back, and the assistant fixes any errors automatically. This self-correcting loop handles bugs, missing imports, and runtime errors without human intervention, and simple tasks often converge within a few iterations.
import autogen
from autogen.coding import LocalCommandLineCodeExecutor
from pathlib import Path

# Create a command-line code executor (runs code locally, not in a sandbox)
executor = LocalCommandLineCodeExecutor(
    timeout=60,
    work_dir=Path("workspace"),
)

user_proxy = autogen.UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    code_execution_config={"executor": executor},
    # "or ''" guards against messages whose content is None
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
)

assistant = autogen.AssistantAgent(
    name="coder",
    llm_config={"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]},
    system_message=(
        "Write Python code to complete tasks. "
        "Always test your code by running it. "
        "Fix any errors and rerun until it works. "
        "End with TERMINATE when the task succeeds."
    ),
)

# Data analysis task
user_proxy.initiate_chat(
    assistant,
    message="""
Download the NYC taxi dataset from:
https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-01.parquet
Then:
1. Load it with pandas
2. Show shape, dtypes, and head(3)
3. Calculate average trip distance by hour of day
4. Save a matplotlib chart of this as 'hourly_distance.png'
Reply TERMINATE when done.
""",
)
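To make the write, execute, report cycle concrete, here is a standard-library sketch of one iteration: pull the first fenced Python block out of an assistant reply, run it in a subprocess with a timeout, and collect the exit code and output that would be sent back. The function names extract_code and run_code are hypothetical, and this is a simplified model of the idea, not AutoGen's implementation.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional, Tuple

FENCE = "`" * 3  # built programmatically so this example contains no literal fence
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_code(message: str) -> Optional[str]:
    """Return the first fenced Python block in an assistant message, if any."""
    match = CODE_BLOCK.search(message)
    return match.group(1) if match else None


def run_code(code: str, timeout: int = 60) -> Tuple[int, str]:
    """Run code in a subprocess; return (exit code, combined stdout and stderr)."""
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, timeout=timeout, cwd=work_dir,
        )
    return result.returncode, result.stdout + result.stderr


# A reply that works, and a snippet that fails with a missing import.
reply = f"Here is the code:\n{FENCE}python\nprint(2 + 2)\n{FENCE}\n"
exit_code, output = run_code(extract_code(reply))
print(exit_code, output.strip())

bad_code, bad_output = run_code("import not_a_real_module_xyz")
print(bad_code != 0, "ModuleNotFoundError" in bad_output)
```

In the failing case, the non-zero exit code and the traceback are what the assistant would see in the next turn and use to correct its code.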
Agents can call Python functions as tools via the @register_for_llm and @register_for_execution decorators. The LLM agent decides when and how to call tools; the proxy agent executes them. This pattern gives agents access to APIs, databases, file systems, and web search without unsafe code execution.
import autogen
import requests
from datetime import datetime

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
    # No "tools" key needed: register_for_llm adds the tool definitions
}
assistant = autogen.AssistantAgent("assistant", llm_config=llm_config)
user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,  # Using tools instead of code execution
    # Stop when the assistant signals completion instead of auto-replying indefinitely
    is_termination_msg=lambda m: (m.get("content") or "").rstrip().endswith("TERMINATE"),
)

# Register tools using decorator pattern
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get current weather for a city")
def get_weather(city: str) -> dict:
    """Fetch current weather from Open-Meteo (free, no key required)."""
    # Geocode the city (params= handles URL encoding of the city name)
    geo = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": city, "count": 1},
        timeout=10,
    ).json()
    if not geo.get("results"):
        return {"error": f"City '{city}' not found"}
    loc = geo["results"][0]
    lat, lon = loc["latitude"], loc["longitude"]
    # Get weather
    weather = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": lat, "longitude": lon, "current": "temperature_2m,wind_speed_10m"},
        timeout=10,
    ).json()
    current = weather["current"]
    return {
        "city": city,
        "temperature_c": current["temperature_2m"],
        "wind_speed_kmh": current["wind_speed_10m"],
        "time": current["time"],
    }

@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get today's date and time")
def get_datetime() -> str:
    return datetime.now().isoformat()

# The agent will automatically call tools as needed
user_proxy.initiate_chat(
    assistant,
    message="What's the weather like in Tokyo and London right now? Compare them.",
)
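Type hints matter here because the model only sees a JSON description of each tool. The sketch below shows roughly how a type-hinted function can be turned into an OpenAI-style function-tool definition. The helper tool_schema and the extra units parameter are hypothetical, and AutoGen's own schema generation is more thorough (this is not its implementation).

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number",
              bool: "boolean", dict: "object", list: "array"}


def tool_schema(fn, description: str) -> dict:
    """Build an OpenAI-style function-tool definition from fn's type hints."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # the return type is not part of the parameters
    properties = {name: {"type": JSON_TYPES.get(tp, "string")} for name, tp in hints.items()}
    required = [
        name for name, param in inspect.signature(fn).parameters.items()
        if param.default is inspect.Parameter.empty  # no default means required
    ]
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": description,
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def get_weather(city: str, units: str = "metric") -> dict:
    """Stand-in tool used only to demonstrate schema generation."""
    return {"city": city, "units": units}


schema = tool_schema(get_weather, "Get current weather for a city")
print(schema["function"]["name"], schema["function"]["parameters"]["required"])
```

A clear description and precise parameter types give the model a better chance of calling the tool with valid arguments.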
Group chat allows multiple specialist agents to collaborate on a task, with a GroupChatManager orchestrating the conversation flow. The manager selects the next speaker based on context — you can use speaker_selection_method="auto" (LLM-based) or "round_robin" for predictable turn-taking.
import autogen

llm_config = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]}

# Create specialist agents
planner = autogen.AssistantAgent(
    name="Planner",
    system_message=(
        "You are a project planner. Break down complex tasks into clear subtasks. "
        "Assign each subtask to the appropriate specialist."
    ),
    llm_config=llm_config,
)
developer = autogen.AssistantAgent(
    name="Developer",
    system_message=(
        "You are a senior Python developer. Write clean, tested, production-quality code. "
        "Always include docstrings and type hints."
    ),
    llm_config=llm_config,
)
reviewer = autogen.AssistantAgent(
    name="Reviewer",
    system_message=(
        "You are a code reviewer. Check code for bugs, security issues, and best practices. "
        "Provide specific, actionable feedback."
    ),
    llm_config=llm_config,
)
user_proxy = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "workspace", "use_docker": False},
    # "or ''" guards against messages whose content is None
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().endswith("TERMINATE"),
)

# Create group chat
group_chat = autogen.GroupChat(
    agents=[user_proxy, planner, developer, reviewer],
    messages=[],
    max_round=15,
    speaker_selection_method="auto",
)
manager = autogen.GroupChatManager(groupchat=group_chat, llm_config=llm_config)

user_proxy.initiate_chat(
    manager,
    message=(
        "Build a Python class for a thread-safe LRU cache with a configurable max size. "
        "Include unit tests. Reply TERMINATE when complete and reviewed."
    ),
)
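As a mental model for "round_robin" selection combined with max_round, the following plain-Python sketch cycles through agent names in list order and stops at the round cap or when a stop condition fires. The function round_robin_speakers and its should_stop callback are hypothetical illustrations, not AutoGen code, and the real manager may pick the starting speaker differently.

```python
from itertools import cycle


def round_robin_speakers(agent_names, max_round, should_stop=lambda speaker, round_no: False):
    """Return the speaker order for a round-robin chat capped at max_round turns."""
    order = []
    for round_no, speaker in enumerate(cycle(agent_names), start=1):
        if round_no > max_round:
            break  # hard cap, analogous to max_round
        order.append(speaker)
        if should_stop(speaker, round_no):
            break  # analogous to a termination message
    return order


team = ["User", "Planner", "Developer", "Reviewer"]
print(round_robin_speakers(team, max_round=6))
print(round_robin_speakers(team, max_round=15, should_stop=lambda s, r: s == "Reviewer"))
```

Round-robin makes runs predictable and cheap, since no LLM call is spent on choosing the speaker, while "auto" lets the manager skip agents that have nothing to add.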
Extend ConversableAgent, or a subclass such as AssistantAgent, to build domain-specific agents with custom behaviours: persistent memory, rate limiting, logging, or specialised response processing. Custom agents are the building block for production AutoGen systems that need observability and guardrails.
import autogen
import json
from datetime import datetime, timezone


class LoggedAssistantAgent(autogen.AssistantAgent):
    """Assistant agent with conversation logging (token/cost tracking left as a stub)."""

    def __init__(self, *args, log_file: str = "agent_log.jsonl", **kwargs):
        super().__init__(*args, **kwargs)
        self.log_file = log_file
        self.message_count = 0
        self.total_tokens = 0  # Placeholder: not updated in this example

    def generate_reply(self, messages=None, sender=None, **kwargs):
        self.message_count += 1
        reply = super().generate_reply(messages=messages, sender=sender, **kwargs)
        # Log the interaction as one JSON object per line
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent": self.name,
            "message_count": self.message_count,
            "reply_preview": str(reply)[:200] if reply else None,
        }
        with open(self.log_file, "a", encoding="utf-8") as f:
            f.write(json.dumps(log_entry) + "\n")
        return reply

    def print_summary(self):
        print(f"Agent {self.name}: {self.message_count} messages processed")
        print(f"Log saved to: {self.log_file}")


# Use the custom agent
logged_assistant = LoggedAssistantAgent(
    name="LoggedAssistant",
    llm_config={"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]},
    log_file="session_log.jsonl",
)
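Once a session has run, the JSON Lines log can be read back for a quick per-agent summary. The sketch below assumes entries shaped like the log_entry dict above; the helper summarize_log and the sample records are hypothetical.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_log(path):
    """Count logged replies per agent in a JSON Lines file."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                counts[json.loads(line)["agent"]] += 1
    return dict(counts)


# Hypothetical sample records in the same shape as log_entry.
entries = [
    {"timestamp": "2024-01-01T00:00:00+00:00", "agent": "LoggedAssistant",
     "message_count": 1, "reply_preview": "Here is a plan"},
    {"timestamp": "2024-01-01T00:00:05+00:00", "agent": "Reviewer",
     "message_count": 1, "reply_preview": "Looks good"},
    {"timestamp": "2024-01-01T00:00:09+00:00", "agent": "LoggedAssistant",
     "message_count": 2, "reply_preview": None},
]

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "session_log.jsonl"
    log_path.write_text("".join(json.dumps(e) + "\n" for e in entries), encoding="utf-8")
    summary = summarize_log(log_path)

print(summary)  # {'LoggedAssistant': 2, 'Reviewer': 1}
```

One JSON object per line keeps the log append-only and easy to process with standard tools.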
A practical example combining all patterns: a software engineering team where a Product Manager writes specs, a Developer implements them, a QA Engineer writes tests, and a DevOps Engineer creates the deployment config. Each agent has a clear role, and the group chat manager coordinates the workflow autonomously.
import autogen
llm_config = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}], "temperature": 0.1}
pm = autogen.AssistantAgent(
    "ProductManager", llm_config=llm_config,
    system_message="Write clear technical specs and acceptance criteria for features.")
dev = autogen.AssistantAgent(
    "Developer", llm_config=llm_config,
    system_message="Implement clean Python code matching the spec. Include type hints and docstrings.")
qa = autogen.AssistantAgent(
    "QAEngineer", llm_config=llm_config,
    system_message="Write comprehensive pytest test cases covering happy path, edge cases, and error handling.")
devops = autogen.AssistantAgent(
    "DevOpsEngineer", llm_config=llm_config,
    system_message="Create Dockerfile and GitHub Actions CI/CD workflow for the implemented code.")
executor = autogen.UserProxyAgent(
    "Executor", human_input_mode="NEVER",
    code_execution_config={"work_dir": "project_output", "use_docker": False},
    # "or ''" guards against messages whose content is None
    is_termination_msg=lambda x: "ALL_DONE" in (x.get("content") or ""))

gc = autogen.GroupChat(agents=[executor, pm, dev, qa, devops], messages=[], max_round=20)
manager = autogen.GroupChatManager(groupchat=gc, llm_config=llm_config)
executor.initiate_chat(manager, message="""
Build a URL shortener service with these requirements:
- FastAPI REST API with POST /shorten and GET /{code} endpoints
- In-memory storage (dict) with thread safety
- 6-character alphanumeric short codes
- Return 404 for unknown codes
Roles: PM writes spec → Dev implements → QA writes tests → DevOps creates Dockerfile.
Reply ALL_DONE when all files are ready.
""")