Beyond Basic Aggregation
While $match, $group, and $project handle most use cases, advanced stages like $lookup, $unwind, and $facet unlock powerful data processing capabilities -- from cross-collection joins to parallel sub-pipelines.
$lookup - Joining Collections
The $lookup stage performs a left outer join between two collections, similar to a SQL LEFT JOIN:
// Join orders with customer details
db.orders.aggregate([
{
$lookup: {
from: "customers", // collection to join
localField: "customerId", // field in orders
foreignField: "_id", // field in customers
as: "customer" // output array field
}
},
{ $unwind: "$customer" }, // flatten the array to a single object
{
$project: {
orderTotal: 1,
"customer.name": 1,
"customer.email": 1
}
}
])$lookup with Pipeline
For more control, use the pipeline form of $lookup to add filtering and transformation within the join:
db.customers.aggregate([
{
$lookup: {
from: "orders",
let: { custId: "$_id" },
pipeline: [
{ $match: { $expr: { $eq: ["$customerId", "$$custId"] } } },
{ $match: { status: "completed" } },
{ $sort: { createdAt: -1 } },
{ $limit: 5 }
],
as: "recentOrders"
}
}
])
// Each customer now has their 5 most recent completed orders$unwind - Deconstructing Arrays
The $unwind stage deconstructs an array field, creating one document per array element. This is essential after $lookup or for analyzing individual array entries:
// Count how many times each tag is used across articles
db.articles.aggregate([
{ $unwind: "$tags" },
{ $group: { _id: "$tags", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
// Preserve documents with empty/missing arrays
db.articles.aggregate([
{
$unwind: {
path: "$tags",
preserveNullAndEmptyArrays: true
}
}
])$facet - Parallel Pipelines
The $facet stage runs multiple aggregation pipelines in parallel on the same set of input documents and returns their results as separate fields:
db.products.aggregate([
{ $match: { inStock: true } },
{
$facet: {
// Pipeline 1: Price statistics
priceStats: [
{
$group: {
_id: null,
avgPrice: { $avg: "$price" },
minPrice: { $min: "$price" },
maxPrice: { $max: "$price" }
}
}
],
// Pipeline 2: Count by category
byCategory: [
{ $group: { _id: "$category", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
],
// Pipeline 3: Top 5 most expensive
topProducts: [
{ $sort: { price: -1 } },
{ $limit: 5 },
{ $project: { name: 1, price: 1 } }
]
}
}
])This returns a single document with three arrays: priceStats, byCategory, and topProducts. This is particularly useful for building dashboard data in a single query.
Combining Stages
A real-world pipeline often combines all three stages. Here is an example that builds an order summary with customer details:
db.orders.aggregate([
{ $match: { createdAt: { $gte: new Date("2025-01-01") } } },
{ $unwind: "$items" },
{
$group: {
_id: "$customerId",
totalSpent: { $sum: { $multiply: ["$items.price", "$items.qty"] } },
itemCount: { $sum: "$items.qty" }
}
},
{
$lookup: {
from: "customers",
localField: "_id",
foreignField: "_id",
as: "customer"
}
},
{ $unwind: "$customer" },
{ $sort: { totalSpent: -1 } },
{ $limit: 10 },
{
$project: {
_id: 0,
customerName: "$customer.name",
totalSpent: 1,
itemCount: 1
}
}
])Key Takeaways
$lookupperforms left outer joins between collections.- The pipeline form of
$lookupallows filtering and sorting within the join. $unwindflattens arrays into individual documents for per-element processing.$facetruns multiple pipelines in parallel on the same data set.
Try this query in UnifySQL
Write, optimize, and collaborate on MongoDB queries with AI assistance.
Start Free