The MongoDB Aggregation Framework is a tool for processing and analyzing data inside the database. It passes documents through a pipeline of stages, where each stage transforms the documents it receives and hands the result to the next stage.
Aggregation Pipeline Flow
$match filters documents, passing only those that match the specified condition to the next stage.
db.orders.aggregate([
  { $match: { status: "completed" } }
])
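To make the filtering behavior concrete, here is a minimal plain JavaScript sketch that mimics this $match stage on an in-memory array. The sample documents and the `matchStatus` helper are hypothetical, and the sketch only covers equality on a scalar field, not the full MongoDB query language.

```javascript
// Hypothetical sample documents standing in for the orders collection.
const orders = [
  { _id: 1, status: "completed", amount: 50 },
  { _id: 2, status: "pending", amount: 20 },
  { _id: 3, status: "completed", amount: 30 },
];

// In-memory analogue of { $match: { status: <value> } } for scalar fields.
function matchStatus(docs, status) {
  return docs.filter((doc) => doc.status === status);
}

const completed = matchStatus(orders, "completed");
console.log(completed.map((doc) => doc._id)); // [ 1, 3 ]
```

Documents that fail the condition never reach later stages, which is why a $match is usually placed as early in a pipeline as possible.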
$group groups documents by a specified expression and applies accumulator expressions to each group.
db.orders.aggregate([
  { $group: {
      _id: "$customer_id",
      total_amount: { $sum: "$amount" },
      order_count: { $sum: 1 }
  }}
])
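As a rough illustration of what the accumulators compute, the following plain JavaScript sketch groups an in-memory array the same way. The sample data and the `groupByCustomer` helper are hypothetical, and the sketch assumes every `amount` is a number.

```javascript
// Hypothetical sample documents standing in for the orders collection.
const orders = [
  { customer_id: "c1", amount: 50 },
  { customer_id: "c2", amount: 20 },
  { customer_id: "c1", amount: 30 },
];

// In-memory analogue of the $group stage above.
function groupByCustomer(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc.customer_id;
    if (!groups.has(key)) {
      groups.set(key, { _id: key, total_amount: 0, order_count: 0 });
    }
    const group = groups.get(key);
    group.total_amount += doc.amount; // mirrors { $sum: "$amount" }
    group.order_count += 1;           // mirrors { $sum: 1 }
  }
  return [...groups.values()];
}

const grouped = groupByCustomer(orders);
console.log(grouped);
```

Each output document has the group key in `_id` plus one field per accumulator. Unlike this sketch, MongoDB does not guarantee the order in which $group emits its results.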
$sort sorts all input documents and returns them in sorted order. Use 1 for ascending and -1 for descending order.
db.orders.aggregate([
  { $sort: { total_amount: -1 } }
])
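The descending sort can be sketched in plain JavaScript as follows. The input documents and the `sortByTotalDesc` helper are hypothetical, and the sketch assumes `total_amount` is numeric in every document.

```javascript
// Hypothetical documents that already carry a total_amount field.
const totals = [
  { _id: "c1", total_amount: 80 },
  { _id: "c2", total_amount: 120 },
  { _id: "c3", total_amount: 15 },
];

// In-memory analogue of { $sort: { total_amount: -1 } }.
// The input array is copied so the original order is left untouched.
function sortByTotalDesc(docs) {
  return [...docs].sort((a, b) => b.total_amount - a.total_amount);
}

const sorted = sortByTotalDesc(totals);
console.log(sorted.map((doc) => doc._id)); // [ 'c2', 'c1', 'c3' ]
```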
$project reshapes each document by adding new fields or removing existing ones.
db.orders.aggregate([
  { $project: {
      customer_name: 1,
      order_date: 1,
      total: { $multiply: ["$price", "$quantity"] }
  }}
])
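The reshaping can be pictured as a per-document mapping. In this plain JavaScript sketch the sample order and the `projectOrder` helper are hypothetical, and `price` and `quantity` are assumed to be present and numeric. Note that `_id` is carried over because $project includes it unless it is explicitly excluded with `_id: 0`.

```javascript
// Hypothetical sample document standing in for one order.
const order = {
  _id: 1,
  customer_name: "Ada",
  order_date: "2024-01-15",
  status: "completed",
  price: 10,
  quantity: 3,
};

// In-memory analogue of the $project stage above.
function projectOrder(doc) {
  return {
    _id: doc._id,                     // included by default
    customer_name: doc.customer_name, // customer_name: 1
    order_date: doc.order_date,       // order_date: 1
    total: doc.price * doc.quantity,  // $multiply: ["$price", "$quantity"]
  };
}

const projected = projectOrder(order);
console.log(projected.total); // 30
```

Fields that are not listed, such as `status`, `price`, and `quantity` here, are dropped from the output document.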
Combining multiple stages for complex data analysis:
db.orders.aggregate([
  // Match completed orders
  { $match: { status: "completed" } },
  // Group by customer and calculate metrics
  { $group: {
      _id: "$customer_id",
      total_spent: { $sum: "$amount" },
      order_count: { $sum: 1 },
      avg_order_value: { $avg: "$amount" }
  }},
  // Sort by total spent
  { $sort: { total_spent: -1 } },
  // Limit to top 10 customers
  { $limit: 10 }
])
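To show how each stage feeds the next, the following plain JavaScript sketch chains the same four steps over an in-memory array. The sample data and the `topCustomers` helper are hypothetical, and every `amount` is assumed to be numeric.

```javascript
// Hypothetical sample documents; field names follow the pipeline above.
const orders = [
  { customer_id: "c1", status: "completed", amount: 50 },
  { customer_id: "c2", status: "completed", amount: 120 },
  { customer_id: "c1", status: "completed", amount: 30 },
  { customer_id: "c3", status: "pending", amount: 500 },
];

function topCustomers(docs, limit) {
  // $match: keep completed orders only.
  const completed = docs.filter((doc) => doc.status === "completed");

  // $group: accumulate per-customer totals and counts.
  const groups = new Map();
  for (const doc of completed) {
    const group = groups.get(doc.customer_id) ??
      { _id: doc.customer_id, total_spent: 0, order_count: 0 };
    group.total_spent += doc.amount;
    group.order_count += 1;
    groups.set(doc.customer_id, group);
  }

  return [...groups.values()]
    // $avg over numeric amounts equals total divided by count.
    .map((g) => ({ ...g, avg_order_value: g.total_spent / g.order_count }))
    // $sort: descending by total_spent.
    .sort((a, b) => b.total_spent - a.total_spent)
    // $limit: keep at most `limit` results.
    .slice(0, limit);
}

const top = topCustomers(orders, 10);
console.log(top);
```

The pending order from `c3` is removed in the first step, so it never contributes to any total even though its amount is the largest.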
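Working with arrays in aggregation: $unwind outputs one document for each element of an array field, so the elements can then be grouped individually: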
db.products.aggregate([
  { $unwind: "$categories" },
  { $group: {
      _id: "$categories",
      product_count: { $sum: 1 },
      avg_price: { $avg: "$price" }
  }}
])
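A plain JavaScript sketch of the same idea is shown below. The sample products and the helper names are hypothetical, and the sketch assumes `categories` is always an array and `price` is numeric. As with the default $unwind behavior, a product whose array is empty produces no output documents.

```javascript
// Hypothetical sample documents standing in for the products collection.
const products = [
  { name: "laptop", categories: ["electronics", "computers"], price: 1000 },
  { name: "phone", categories: ["electronics"], price: 500 },
  { name: "desk", categories: [], price: 200 },
];

// In-memory analogue of { $unwind: "$categories" }:
// one output document per array element.
function unwindCategories(docs) {
  return docs.flatMap((doc) =>
    doc.categories.map((category) => ({ ...doc, categories: category }))
  );
}

// In-memory analogue of the $group stage: count and average price per category.
function groupByCategory(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const g = groups.get(doc.categories) ??
      { _id: doc.categories, product_count: 0, price_sum: 0 };
    g.product_count += 1;
    g.price_sum += doc.price;
    groups.set(doc.categories, g);
  }
  return [...groups.values()].map((g) => ({
    _id: g._id,
    product_count: g.product_count,
    avg_price: g.price_sum / g.product_count,
  }));
}

const unwound = unwindCategories(products);
const byCategory = groupByCategory(unwound);
console.log(byCategory);
```

After unwinding, the `categories` field holds a single value instead of an array, which is what makes it usable as a group key.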
Now that you understand the Aggregation Framework, you can explore: