pipeline operation symbol
Help you carry out complex operations. Each symbol will accept documents and do corresponding operations on these documents, and then pass the results to the next pipeline until the final results appear.
db.getCollection('sales').insertMany([ { "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-03-01T08:00:00Z") }, { "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-03-01T09:00:00Z") }, { "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }, { "_id" : 4, "item" : "xyz", "price" : 5, "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39.736Z") }, { "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13.331Z") } ]); db.sales.aggregate( [ { $project : { _id: 0, item : 1 , price : 1 } } ] ) // Only the fields set in will be displayed, and other columns will not be displayed.
$project
Suppose that there are dozens of fields in a collection, and there are only one or two fields we really need, then we can use project
$match
$match is mainly used to filter a group of files
$group
$group is used for grouping. You can decide what to group according to.
$unwind
The English explanation of $unwind is "split". It can split each value of the array field into a separate document
$sort
$sort can be sorted according to any field. Yes, it's the same as when searching, but there's one thing to pay attention to.
If a large amount of data needs to be sorted, it is recommended to sort in the first section of the pipeline, because indexes can be used.
$lookup
A query method in the aggregation framework added in mongodb version 3.2, which will add a field in the result returned by the current collection.
The statement on the official website is too classical. In short, when you want to add a field in the current collection and connect it to another field, and the associated key value exists in two collections, it can be used to match.
Let's take a look at the main usage:
db.collection.aggregate([{ $lookup: { from: "<collection to join>", localField: "<field from the input documents>", foreignField: "<field from the documents of the from collection>", as: "<output array field>" } })
Field | Description |
---|---|
from | Specify the collection from the database to perform the connection |
localField | Field name of the current collection |
foreignField | From the field in the collection specified by from, $lookup will treat the value as null for matching if the document in the from collection does not contain foreignField. |
as | New field name for storing results |
Take an example:
Insert data into an order named collection in the database
db.orders.insert([ { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 }, { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }, { "_id" : 3 } ])
Next, insert data from the inventory
db.inventory.insert([ { "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 }, { "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 }, { "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 }, { "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 }, { "_id" : 5, "sku": null, description: "Incomplete" }, { "_id" : 6 } ])
Use $lookup:
db.orders.aggregate([ { $lookup: { from: "inventory", localField: "item", foreignField: "sku", as: "inventory_docs" } } ])
Return results
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2, "inventory_docs" : [ { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 } ] } { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1, "inventory_docs" : [ { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 } ] } { "_id" : 3, "inventory_docs" : [ { "_id" : 5, "sku" : null, "description" : "Incomplete" }, { "_id" : 6 } ] }
Use $lookup for arrays
First, there are two collection s
comment
{ "_id" : ObjectId("5ec5e48b7f7fa92fe266d246"), "content" : "test", "reply" : [ { "content" : "Reply 1", "userId" : ObjectId("5ec5e4c37f7fa92fe266d27e") }, { "content" : "Reply 2", "userId" : ObjectId("5ec5e4ca7f7fa92fe266d285") }, { "content" : "Reply 3", "userId" : ObjectId("5ec5e4dd7f7fa92fe266d290") } ] }
And user
/* 1 */ { "_id" : ObjectId("5ec5e4c37f7fa92fe266d27e"), "name" : "u1" } /* 2 */ { "_id" : ObjectId("5ec5e4ca7f7fa92fe266d285"), "name" : "u2" } /* 3 */ { "_id" : ObjectId("5ec5e4dd7f7fa92fe266d290"), "name" : "u3" }
When we need to associate the user in the comment table and output the user's name, we need to use the reply field in the comment collection, which is an array. At this time, we can use unwind to split the array first
db.comment.aggregate([ { $unwind: '$reply' // First, split the reply array }, { $lookup: { from: 'user', // From which Schema localField: 'reply.userId', // Locally associated fields foreignField: '_id', // Associated fields used in user as: 'userInfo' // The field name put in after querying all users. This is user-defined and is an array type. } }, { $unwind:'$userInfo' //Because the data associated with lookup will return an array, continue to split it (see personal needs here) }, { // According to_ id grouping $group:{ _id:'$_id', content:{$first:'$content'}, reply:{ $push:{ 'content':'$reply.content', 'userId':'$reply.userId', 'user':'$userInfo.name' } } } } ])
Starting from MongoDB 3.4, if the localField is an array, the array elements can be matched with the scalar foreignField without the $unwind phase. This part will come back to the actual operation experiment when you have time
2021/1/30 - TODO
It is stated in the official document that if pipeline is used in $lookpup, the fields in the document cannot be accessed directly
At this time, you need to define it in the let before you can reference it in the pipeline.
To access the let variable in the pipeline, use the $expr operator.
For example:
// order collection db.orders.insert([ { "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2 }, { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1 }, { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60 } ]) // warehouses collection db.warehouses.insert([ { "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 }, { "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 80 }, { "_id" : 3, "stock_item" : "almonds", warehouse: "B", "instock" : 60 }, { "_id" : 4, "stock_item" : "cookies", warehouse: "B", "instock" : 40 }, { "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 } ]) db.orders.aggregate([ { $lookup: { from: "warehouses", let: { order_item: "$item", order_qty: "$ordered" }, pipeline: [ { $match: { $expr: { $and: [ { $eq: [ "$stock_item", "$$order_item" ] }, { $gte: [ "$instock", "$$order_qty" ] } ] } } }, { $project: { stock_item: 0, _id: 0 } } ], as: "stockdata" } } ])
Results:
{ "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2, "stockdata" : [ { "warehouse" : "A", "instock" : 120 }, { "warehouse" : "B", "instock" : 60 } ] } { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1, "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] } { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60, "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] }
pipeline expression
TODO to be updated
$first
$sum
$subtract
$eq = (equal to)
$GT > (greater than)
$GTE > = (greater than or equal to)
$LT < (less than)
$LTE < = (less than or equal to)
$ne != (not equal to)
$nin !in (not in)
$$ROOT
Query operator
$in
// people collction [ { "_id" : ObjectId("5ca7a4b0219efd687462f965"), "id" : 1, "name" : "jack", "age" : 73 ,"hobby" : [ "taichi" ] } { "_id" : ObjectId("5ca7a4c4219efd687462f968"), "id" : 4, "name" : "xiaogang", "age" : 13, "hobby" : [ "Shuttlecock", "basketball", "football" ] } { "_id" : ObjectId("5ca7a4JK2K3H5JK2H4K38"), "id" : 3, "name" : "anne", "age" : 13, "hobby" : [ "pingpong", "football" ] } ]
Suppose we need to find someone whose interest is billiards or basketball
db.people.find({ id : { $in : ["pingpong","basketball"] } } );
$all
Continuing with the above example, assume that the demand is for people interested in both basketball and football
db.people.find( { id: { $all:["basketball","football"] }})
$or
$or query multiple values of multiple or relationships
Continuing with the previous example, suppose the need is for people who are interested in both basketball and football, or tai chi or billiards
db.user.find({ $or: [ {id:{$all:["basketball","football"]},}, {id:{$in:["taichi","pingpong"]},} ] })
$and
Continuing with the previous example, assume that the need is for people who are interested in both basketball and football and Tai Chi or billiards
db.user.find({ $and: [ {id:{$all:["basketball","football"]},}, {id:{$in:["taichi","pingpong"]},} ] })
2021-02-05 to be updated
Todo: $push