2017 Update
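Such a well-put question deserves a modern response. The sort of array filtering requested can actually be done in modern MongoDB releases post 3.2 via simply $match and $project pipeline stages, much like the original plain query operation intends. The listing here uses $addFields (available from MongoDB 3.4) in place of $project, so the other document fields are retained.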
db.accounts.aggregate([
  { "$match": {
    "email": "john.doe@acme.com",
    "groups": {
      "$elemMatch": {
        "name": "group1",
        "contacts.localId": { "$in": [ "c1", "c3", null ] }
      }
    }
  }},
  { "$addFields": {
    "groups": {
      "$filter": {
        "input": {
          "$map": {
            "input": "$groups",
            "as": "g",
            "in": {
              "name": "$$g.name",
              "contacts": {
                "$filter": {
                  "input": "$$g.contacts",
                  "as": "c",
                  "cond": {
                    "$or": [
                      { "$eq": [ "$$c.localId", "c1" ] },
                      { "$eq": [ "$$c.localId", "c3" ] }
                    ]
                  }
                }
              }
            }
          }
        },
        "as": "g",
        "cond": {
          "$and": [
            { "$eq": [ "$$g.name", "group1" ] },
            { "$gt": [ { "$size": "$$g.contacts" }, 0 ] }
          ]
        }
      }
    }
  }}
])
This makes use of the $filter and $map operators to return only the array elements that meet the conditions, and it performs far better than using $unwind. Since the pipeline stages effectively mirror the structure of “query” and “project” from a .find() operation, the performance here is basically on par with such an operation.
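To make the nesting easier to follow, here is a rough plain-JavaScript sketch of what the $map and $filter expressions compute for a single document. It is only an illustration and does not touch the database. The sample document, the `address` field, and the names `wantedIds` and `filterGroups` are assumptions made up for this example.

```javascript
// Hypothetical sample document. The field names follow the pipeline above,
// but the values (and the "address" field) are assumptions for illustration.
const account = {
  email: "john.doe@acme.com",
  groups: [
    {
      name: "group1",
      contacts: [
        { localId: "c1", address: "one" },
        { localId: "c2", address: "two" },
        { localId: "c3", address: "three" }
      ]
    },
    {
      name: "group2",
      contacts: [ { localId: "c1", address: "four" } ]
    }
  ]
};

const wantedIds = [ "c1", "c3" ];

// Rough equivalent of the $addFields stage for one document:
//  - map() + inner filter() mirror $map with the inner $filter on contacts
//  - the outer filter() mirrors the outer $filter on the group name and size
function filterGroups(doc) {
  const groups = doc.groups
    .map(g => ({
      name: g.name,
      contacts: g.contacts.filter(c => wantedIds.includes(c.localId))
    }))
    .filter(g => g.name === "group1" && g.contacts.length > 0);

  // Like $addFields, keep every other field and overwrite only "groups"
  return { ...doc, groups };
}

const filteredAccount = filterGroups(account);
console.log(JSON.stringify(filteredAccount, null, 2));
```

The point of the comparison is that everything happens inside one document, which is why no $unwind is needed for this case.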
Note that where the intention is to actually work “across documents”, bringing details together out of “multiple” documents rather than “one”, this would usually require some type of $unwind operation, so that the array items become accessible for “grouping”.
This is basically the approach:
db.accounts.aggregate([
  // Match the documents by query
  { "$match": {
    "email": "john.doe@acme.com",
    "groups.name": "group1",
    "groups.contacts.localId": { "$in": [ "c1", "c3", null ] }
  }},

  // De-normalize nested array
  { "$unwind": "$groups" },
  { "$unwind": "$groups.contacts" },

  // Filter the actual array elements as desired
  { "$match": {
    "groups.name": "group1",
    "groups.contacts.localId": { "$in": [ "c1", "c3", null ] }
  }},

  // Group the intermediate result.
  { "$group": {
    "_id": { "email": "$email", "name": "$groups.name" },
    "contacts": { "$push": "$groups.contacts" }
  }},

  // Group the final result
  { "$group": {
    "_id": "$_id.email",
    "groups": { "$push": {
      "name": "$_id.name",
      "contacts": "$contacts"
    }}
  }}
])
This is “array filtering” on more than a single match, which the basic projection capabilities of .find() cannot do.
You have “nested” arrays, therefore you need to process $unwind twice, along with the other operations.
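As a rough illustration of why two $unwind stages are needed, the following plain-JavaScript sketch imitates the unwind, match and group steps on in-memory data. It is not how the server executes the pipeline. The sample documents and the helper names `unwindTwice`, `matchRow` and `regroup` are assumptions invented for this example.

```javascript
// Hypothetical sample data; only the field names come from the pipeline above.
const sampleAccounts = [
  {
    email: "john.doe@acme.com",
    groups: [
      { name: "group1", contacts: [ { localId: "c1" }, { localId: "c2" }, { localId: "c3" } ] },
      { name: "group2", contacts: [ { localId: "c1" } ] }
    ]
  }
];

// Imitates { $unwind: "$groups" } followed by { $unwind: "$groups.contacts" }:
// one output row per contact, with the parent fields repeated on each row.
function unwindTwice(docs) {
  return docs.flatMap(doc =>
    doc.groups.flatMap(g =>
      g.contacts.map(c => ({
        email: doc.email,
        groups: { name: g.name, contacts: c }
      }))
    )
  );
}

// Imitates the second $match. In a MongoDB query, null inside $in also
// matches a missing field, hence the "?? null" below.
function matchRow(row) {
  const id = row.groups.contacts.localId ?? null;
  return row.groups.name === "group1" && [ "c1", "c3", null ].includes(id);
}

// Imitates the two $group stages: contacts per (email, group name),
// then groups per email. A Map keeps insertion order, whereas $group
// itself makes no ordering promise.
function regroup(rows) {
  const byEmail = new Map();
  for (const row of rows) {
    if (!byEmail.has(row.email)) byEmail.set(row.email, new Map());
    const byName = byEmail.get(row.email);
    if (!byName.has(row.groups.name)) byName.set(row.groups.name, []);
    byName.get(row.groups.name).push(row.groups.contacts);
  }
  return [...byEmail].map(([email, byName]) => ({
    _id: email,
    groups: [...byName].map(([name, contacts]) => ({ name, contacts }))
  }));
}

const unwoundRows = unwindTwice(sampleAccounts);
const matchedRows = unwoundRows.filter(matchRow);
const regrouped = regroup(matchedRows);
console.log(JSON.stringify(regrouped, null, 2));
```

Each contact only becomes a separate row after both levels of nesting are flattened, which is what lets the second match and the grouping work on individual array items.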