MongoDB indexes and MySQL indexes serve much the same purpose and follow similar optimization principles. MySQL index types can be broadly divided into:
- Single key index - Joint (compound) index
- Primary key index (clustered index) - Non-primary key index (non-clustered index)
In addition to these basic classifications, MongoDB has some special index types, such as array (multikey) indexes, sparse indexes, geospatial indexes, and TTL indexes.
To make the following tests easier, we use a script to insert the following data:
for (var i = 0; i < 100000; i++) {
  db.users.insertOne({
    username: "user" + i,
    age: Math.random() * 100,
    sex: i % 2,
    phone: 18468150001 + i
  });
}
Single key index
A single key index is the most basic kind of index: it covers exactly one field.
Create a single key index on the username field of the collection. MongoDB automatically names the index username_1.
db.users.createIndex({username:1})
'username_1'
After creating the index, check the plan of a query on the username field. The input stage is IXSCAN, which indicates that an index scan is used.
db.users.find({username:"user40001"}).explain()
{
  queryPlanner: {
    winningPlan: {
      ......
      stage: 'FETCH',
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { username: 1 },
        indexName: 'username_1',
        ......
      }
    },
    rejectedPlans: []
  },
  ......
  ok: 1
}
One very important principle of index optimization is that an index should be built on a field with high cardinality. Cardinality is the number of distinct values in a field. If the age values in the users collection are whole numbers from 0 to 99, the age field has 100 distinct values, so its cardinality is 100. (The insert script above stores fractional ages from Math.random() * 100, so the real cardinality of age in the test data is far higher.) The sex field has only two values, 0 and 1, so its cardinality is 2, which is very low. An index on such a field is not selective: each value matches about half of the collection, so the index saves little work and the query planner may not use it.
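To make the idea of cardinality concrete, here is a minimal sketch in plain JavaScript. It works on an in-memory sample rather than a real collection; the `cardinality` helper and the sample data are hypothetical, and age is assumed to be stored as a whole number from 0 to 99.

```javascript
// Hypothetical helper: number of distinct values of a field in an in-memory sample.
function cardinality(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Sample shaped like the users collection.
// Assumption: age is a whole number 0-99, unlike the fractional ages in the insert script.
const sample = [];
for (let i = 0; i < 1000; i++) {
  sample.push({ username: "user" + i, age: i % 100, sex: i % 2 });
}

const report = ["username", "age", "sex"].map((field) => {
  const distinct = cardinality(sample, field);
  return {
    field,
    cardinality: distinct,
    // Average number of documents that share one value of this field
    docsPerValue: sample.length / distinct,
  };
});

console.log(report);
```

The lower the cardinality, the more documents a single index key points to. On a real collection, `db.users.distinct("sex").length` should give the same kind of count for a small set of values.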
The following creates an index on the sex field. The execution plan shows that the query uses a full collection scan (COLLSCAN) instead of the index.
db.users.createIndex({sex:1})
'sex_1'
db.users.find({sex:1}).explain()
{
  queryPlanner: {
    ......
    winningPlan: {
      stage: 'COLLSCAN',
      filter: { sex: { '$eq': 1 } },
      direction: 'forward'
    },
    rejectedPlans: []
  },
  ......
  ok: 1
}
Joint index
A joint index (called a compound index in the MongoDB documentation) contains multiple fields. Next, create an index on age and sex.
db.users.createIndex({age:1,sex:1})
'age_1_sex_1'
Then query with both fields and view the execution plan. The query uses the index as expected.
db.users.find({age:23,sex:1}).explain()
{
  queryPlanner: {
    ......
    winningPlan: {
      stage: 'FETCH',
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { age: 1, sex: 1 },
        indexName: 'age_1_sex_1',
        .......
        indexBounds: { age: [ '[23, 23]' ], sex: [ '[1, 1]' ] }
      }
    },
    rejectedPlans: [],
  },
  ......
  ok: 1
}
Array index
An array index is an index on an array field; MongoDB calls it a multikey index. To test it, add array fields to some documents in the users collection.
db.users.updateOne({username:"user1"},{$set:{hobby:["sing","Basketball","rap"]}})
......
Create an array index and view the execution plan. Note that isMultiKey: true indicates that the index used is a multikey index.
db.users.createIndex({hobby:1})
'hobby_1'
db.users.find({hobby:{$elemMatch:{$eq:"go fishing"}}}).explain()
{
  queryPlanner: {
    ......
    winningPlan: {
      stage: 'FETCH',
      filter: { hobby: { '$elemMatch': { '$eq': 'go fishing' } } },
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { hobby: 1 },
        indexName: 'hobby_1',
        isMultiKey: true,
        multiKeyPaths: { hobby: [ 'hobby' ] },
        ......
        indexBounds: { hobby: [ '["go fishing", "go fishing"]' ] }
      }
    },
    rejectedPlans: []
  },
  ......
  ok: 1
}
Compared with other indexes, an array index has many more entries and takes more space, because it stores one entry per array element. The growth is proportional to the array length, not exponential. For example, if the hobby array of each document has 10 elements on average, the hobby index of this collection has about 10 times as many entries as an ordinary index.
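The entry count can be illustrated with a small in-memory model. This is only a sketch of the idea that an array field contributes one index entry per element; the `indexEntries` helper is hypothetical and says nothing about how MongoDB lays out its B-tree.

```javascript
// Hypothetical model of index entries: one { key, id } pair per indexed value.
function indexEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    // An array contributes one entry per element, a scalar contributes one entry
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, id: doc._id });
    }
  }
  // Keep entries ordered by key, as an index would
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const users = [
  { _id: 1, username: "user1", hobby: ["sing", "Basketball", "rap"] },
  { _id: 2, username: "user2", hobby: ["go fishing", "Table Tennis"] },
  { _id: 3, username: "user3", hobby: ["sing"] },
];

const usernameEntries = indexEntries(users, "username");
const hobbyEntries = indexEntries(users, "hobby");

console.log(usernameEntries.length, hobbyEntries.length);
```

With three documents, the scalar field yields three entries while the array field yields six, one per hobby.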
Joint array index
A joint array index is a joint index that contains an array field. Such an index cannot cover multiple array fields: in any one document, at most one of the indexed fields may hold an array. This avoids explosive growth of index entries. If two array fields with n and m elements were allowed in one index, that document alone would need n*m index entries.
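The n*m growth can be seen by enumerating the keys a joint index would need for one document. The sketch below is a hypothetical model (`compoundKeys` and the `tags` field are made up for illustration); it simply takes the Cartesian product of the indexed values.

```javascript
// Hypothetical model: every combination of indexed values becomes one compound key.
function compoundKeys(doc, fields) {
  // Start with a single empty key and extend it field by field
  let keys = [[]];
  for (const field of fields) {
    const value = doc[field];
    const values = Array.isArray(value) ? value : [value];
    keys = keys.flatMap((prefix) => values.map((v) => [...prefix, v]));
  }
  return keys;
}

const doc = {
  username: "user1",
  hobby: ["sing", "Basketball", "rap"], // n = 3
  tags: ["a", "b"],                     // m = 2, hypothetical second array field
};

const oneArrayField = compoundKeys(doc, ["username", "hobby"]);
const twoArrayFields = compoundKeys(doc, ["hobby", "tags"]);

console.log(oneArrayField.length);  // one key per hobby
console.log(twoArrayFields.length); // n * m keys
```

One scalar plus one array gives 3 keys, while two arrays give 3 * 2 = 6 keys for a single document, which is the growth the restriction is meant to prevent.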
Geospatial index
Add some geographic information to the original users collection
for (var i = 0; i < 100000; i++) {
  db.users.updateOne(
    {username: "user" + i},
    {
      $set: {
        location: {
          type: "Point",
          coordinates: [100 + Math.random() * 4, 40 + Math.random() * 3]
        }
      }
    });
}
Create a 2dsphere geospatial index
db.users.createIndex({location:"2dsphere"})
'location_2dsphere'
//Query people within 500 meters
db.users.find({
  location: {
    $near: {
      $geometry: {type: "Point", coordinates: [102, 41.5]},
      $maxDistance: 500
    }
  }
})
A 2dsphere index supports several GeoJSON object types, including Point, LineString, and Polygon.
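To get a feel for what "within 500 meters" means for [longitude, latitude] pairs, here is a rough sketch that filters points by great-circle distance. It uses the haversine formula with a mean Earth radius, which is an assumption chosen for illustration and not necessarily the exact computation MongoDB performs; the helper name and sample points are hypothetical.

```javascript
// Assumption: mean Earth radius in meters, good enough for an illustration.
const EARTH_RADIUS_METERS = 6371000;

// Haversine distance between two GeoJSON-style [longitude, latitude] pairs.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

const center = [102, 41.5];
const candidates = [
  { username: "near", location: { type: "Point", coordinates: [102.001, 41.5] } },
  { username: "far", location: { type: "Point", coordinates: [102.1, 41.5] } },
];

// Keep only the points at most 500 meters from the center
const within500 = candidates.filter(
  (c) => haversineMeters(center, c.location.coordinates) <= 500
);

console.log(within500.map((c) => c.username));
```

Note that GeoJSON coordinates are ordered longitude first, then latitude, which is also the order used in the update script above.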
TTL index
TTL stands for time to live. A TTL index is mainly used to delete expired data automatically. To use it, the documents need a date field, and you set expireAfterSeconds when creating the TTL index on that field. The expiration time is in seconds. After the index is created, a MongoDB background task periodically checks the data in the collection. When:
$$ \text{current time} - \text{TTL index field time} > \text{expireAfterSeconds} $$
MongoDB automatically deletes these documents. This index also has the following requirements:
- A TTL index can only have one field; there is no joint TTL index
- TTL indexes cannot be used on capped collections
- Expired documents are found by traversing the index entries one by one and are removed with ordinary delete operations, which is not efficient
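The expiration condition above can be written as a small predicate. This is a sketch in plain JavaScript, not MongoDB's implementation; `isExpired` is a hypothetical helper, and it assumes that a document whose indexed field is not a date never expires.

```javascript
// Hypothetical predicate for: current time - TTL field time > expireAfterSeconds
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // Assumption: documents without a date value in the field are never expired
  if (!(value instanceof Date)) return false;
  const ageMillis = now.getTime() - value.getTime();
  return ageMillis > expireAfterSeconds * 1000;
}

const now = new Date("2024-01-01T00:02:00Z");
const docs = [
  { username: "old", createdDate: new Date("2024-01-01T00:00:00Z") },   // 120 s old
  { username: "fresh", createdDate: new Date("2024-01-01T00:01:30Z") }, // 30 s old
  { username: "noDate", createdDate: "2024-01-01" },                    // not a Date
];

const expired = docs.filter((doc) => isExpired(doc, "createdDate", 60, now));
console.log(expired.map((doc) => doc.username));
```

Only the document that is older than 60 seconds qualifies for deletion; the other two are kept.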
First, add a date field to some of the documents
for (var i = 90000; i < 100000; i++) {
  db.users.updateOne(
    {username: "user" + i},
    {
      $set: {
        createdDate: new Date()
      }
    });
}
Create a TTL index and set the expiration time to 60 seconds. Once the 60 seconds have passed and the background task has run, a query shows that these documents no longer exist
db.users.createIndex({createdDate:1},{expireAfterSeconds:60})
'createdDate_1'
You can also change the expiration time of a TTL index with the collMod command
db.runCommand({
  collMod: "users",
  index: {
    keyPattern: {createdDate: 1},
    expireAfterSeconds: 120
  }
})
{ expireAfterSeconds_old: 60, expireAfterSeconds_new: 120, ok: 1 }
Conditional index
A conditional index, called a partial index in MongoDB, only indexes the documents that meet a filter condition.
Create a username_1 index only for users older than 50. (If the username_1 index from the single key section still exists, drop it first with db.users.dropIndex("username_1").) In the execution plan, the isPartial field is true.
db.users.createIndex({username:1},{partialFilterExpression:{ age:{$gt:50} }})
'username_1'
db.users.find({$and:[{username:"user4"},{age:60}]}).explain()
{
  queryPlanner: {
    ......
    winningPlan: {
      stage: 'FETCH',
      filter: { age: { '$eq': 60 } },
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { username: 1 },
        indexName: 'username_1',
        ......
        isPartial: true,
        ......
      }
    },
    rejectedPlans: []
  },
  ......
  ok: 1
}
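Which documents end up in a partial index can be modelled in a few lines. The sketch below is a hypothetical in-memory model (`buildPartialIndex` and the sample documents are made up); the filter function stands in for the partialFilterExpression { age: { $gt: 50 } }.

```javascript
// Stand-in for partialFilterExpression: { age: { $gt: 50 } }
const olderThan50 = (doc) => typeof doc.age === "number" && doc.age > 50;

// Hypothetical model: only documents that pass the filter get an index entry.
function buildPartialIndex(docs, field, filter) {
  const index = new Map();
  for (const doc of docs) {
    if (!filter(doc)) continue; // skipped documents take no space in the index
    index.set(doc[field], doc._id);
  }
  return index;
}

const users = [
  { _id: 1, username: "user4", age: 60 },
  { _id: 2, username: "user5", age: 30 },
  { _id: 3, username: "user6", age: 50.5 },
];

const partialIndex = buildPartialIndex(users, "username", olderThan50);
console.log([...partialIndex.keys()]);
```

Because user5 is missing from the index, a query can only rely on it when the query condition itself guarantees age > 50, which is why the example query above also filters on age.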
Sparse index
An ordinary index covers every document in the collection. If a document does not have the indexed field, the index still holds an entry for it, with null as the key.
A sparse index skips documents that do not have the indexed field. If the field exists but its value is null, the document is still indexed.
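The difference between the two kinds of index can be sketched with an in-memory model. The helpers below are hypothetical and only mimic the rule described above: an ordinary index stores null for a missing field, a sparse index stores nothing.

```javascript
// Hypothetical model of an ordinary index: a missing field is stored as a null key.
function regularIndexEntries(docs, field) {
  return docs.map((doc) => ({ key: field in doc ? doc[field] : null, id: doc._id }));
}

// Hypothetical model of a sparse index: documents without the field are skipped.
function sparseIndexEntries(docs, field) {
  return docs
    .filter((doc) => field in doc)
    .map((doc) => ({ key: doc[field], id: doc._id }));
}

const users = [
  { _id: 1, username: "user5000", email: "120005000@qq.email" },
  { _id: 2, username: "user9000", email: null },
  { _id: 3, username: "user0" }, // no email field at all
];

const regular = regularIndexEntries(users, "email");
const sparse = sparseIndexEntries(users, "email");

// Looking up the null key finds different documents in each index
const nullIdsRegular = regular.filter((e) => e.key === null).map((e) => e.id);
const nullIdsSparse = sparse.filter((e) => e.key === null).map((e) => e.id);

console.log(nullIdsRegular, nullIdsSparse);
```

The ordinary index returns both the document with email: null and the one without an email field, while the sparse index returns only the former.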
To test a sparse index, add an email field to some documents in the users collection: users 5000 to 8999 get an address, and users 9000 to 9999 get email: null
for (var i = 5000; i < 10000; i++) {
  if (i < 9000) {
    db.users.updateOne(
      {username: "user" + i},
      { $set: {email: (120000000 + i) + "@qq.email"} }
    )
  } else {
    db.users.updateOne(
      {username: "user" + i},
      { $set: {email: null} }
    )
  }
}
When querying with the {email:null} condition before the index is created, the results include documents that have no email field at all
db.users.find({email:null})
{
  _id: ObjectId("61bdc01ba59136670f6536fd"),
  username: 'user0',
  age: 64.41483801726282,
  sex: 0,
  phone: 18468150001,
  location: { type: 'Point', coordinates: [ 101.42490900320335, 42.2576650823515 ] }
}
......
Then create a sparse index on the email field and query with the {email:null} condition, forcing the index with hint. Now every returned document has an email field whose value is null
db.users.createIndex({email:1},{sparse:true});
'email_1'
db.users.find({email:null}).hint({email:1})
{
  _id: ObjectId("61bdc12ca59136670f655a25"),
  username: 'user9000',
  age: 94.18397576757012,
  sex: 0,
  phone: 18468159001,
  hobby: [ 'go fishing', 'Table Tennis' ],
  location: { type: 'Point', coordinates: [ 101.25903151863596, 41.38450145025062 ] },
  email: null
}
......
Text index
A text index first splits the indexed fields into words (tokenization) and then searches on those words. Chinese word segmentation is not supported at present.
Next, insert documents with two text fields and create a joint text index on them
db.blog.insertMany([
  {title: "hello world", content: "mongodb is the best database"},
  {title: "index", content: "efficient data structure"}
])
//Create index
db.blog.createIndex({title:"text",content:"text"})
'title_text_content_text'
//Query using text index
db.blog.find({$text:{$search:"hello data"}})
{
  _id: ObjectId("61c092268c4037d17827d977"),
  title: 'index',
  content: 'efficient data structure'
},
{
  _id: ObjectId("61c092268c4037d17827d976"),
  title: 'hello world',
  content: 'mongodb is the best database'
}
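The "split into words, then search" idea can be sketched as a tiny inverted index. This is a simplified, hypothetical model: it lowercases and splits on non-alphanumeric characters, and it leaves out the stemming and stop-word handling that a real text index applies.

```javascript
// Simplified tokenizer (assumption): lowercase, split on anything that is not a-z or 0-9.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical inverted index: token -> set of document ids containing it.
function buildTextIndex(docs, fields) {
  const index = new Map();
  for (const doc of docs) {
    for (const field of fields) {
      for (const token of tokenize(doc[field])) {
        if (!index.has(token)) index.set(token, new Set());
        index.get(token).add(doc._id);
      }
    }
  }
  return index;
}

// A document matches when it contains any of the search terms (logical OR).
function search(index, query) {
  const ids = new Set();
  for (const token of tokenize(query)) {
    for (const id of index.get(token) ?? []) ids.add(id);
  }
  return [...ids].sort((a, b) => a - b);
}

const blog = [
  { _id: 1, title: "hello world", content: "mongodb is the best database" },
  { _id: 2, title: "index", content: "efficient data structure" },
];

const textIndex = buildTextIndex(blog, ["title", "content"]);
const hits = search(textIndex, "hello data");
console.log(hits);
```

As in the query above, searching for "hello data" matches both documents: the first through "hello" in its title and the second through "data" in its content.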
Unique index
A unique index does not allow duplicate values in the indexed field. Besides single field unique indexes, there are joint unique indexes and array unique indexes (for the latter, the arrays of different documents cannot share any element)
//Create a unique index on the title field
db.blog.createIndex({title:1},{unique:true})
'title_1'
//Insert an existing title value
db.blog.insertOne({title:"hello world",content:"mongodb is the best database"})
MongoServerError: E11000 duplicate key error collection: mock.blog index: title_1 dup key: { : "hello world" }
//Check the execution plan. isUnique is true
db.blog.find({"title":"index"}).explain()
{
  queryPlanner: {
    ......
    winningPlan: {
      stage: 'FETCH',
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { title: 1 },
        indexName: 'title_1',
        isMultiKey: false,
        multiKeyPaths: { title: [] },
        isUnique: true,
        ......
      }
    },
    rejectedPlans: []
  },
  .......
  ok: 1
}
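The uniqueness rule, including the array case, can be sketched with a small in-memory class. `UniqueIndex` is a hypothetical model and its error message is made up; it is not the server's E11000 error.

```javascript
// Hypothetical model of a unique index over one field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Map(); // key -> _id of the document that owns it
  }

  insert(doc) {
    const value = doc[this.field];
    // An array contributes one key per distinct element
    const keys = Array.isArray(value) ? [...new Set(value)] : [value];
    // Check every key first so a rejected document leaves the index unchanged
    for (const key of keys) {
      if (this.keys.has(key)) {
        throw new Error(`duplicate key on ${this.field}: ${JSON.stringify(key)}`);
      }
    }
    for (const key of keys) this.keys.set(key, doc._id);
  }
}

const titleIndex = new UniqueIndex("title");
titleIndex.insert({ _id: 1, title: "hello world" });

let rejected = false;
try {
  titleIndex.insert({ _id: 2, title: "hello world" }); // same title again
} catch (err) {
  rejected = true;
  console.log(err.message);
}

// Array case: two documents may not share any element
const tagIndex = new UniqueIndex("tags");
tagIndex.insert({ _id: 1, tags: ["a", "b"] });
let arrayRejected = false;
try {
  tagIndex.insert({ _id: 2, tags: ["b", "c"] }); // "b" is already taken
} catch (err) {
  arrayRejected = true;
}
console.log(rejected, arrayRejected);
```

Both the repeated title and the overlapping array are rejected, and the index keeps only the keys of the documents that were accepted.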