1, Scenario description
When querying Elasticsearch (ES), a common scenario is matching the user's search keywords against several fields at once while giving each field a different weight. For example, when matching both the company name and the company profile, a hit on the company name should usually count for more, so that the relevance score better reflects what the user is looking for.
In ES, the weight of each field in a multi-field query is controlled through the boost parameter.
2, Weight parameter boost
boost is a parameter that adjusts how much a field contributes to a document's relevance score. The default value is 1; a value greater than 1 increases the field's weight in the score.
There are two types of boost:
- boost at index time
- boost at query time
A boost specified in the mapping when the index is created is stored with the index, so the only way to change its value is to reindex the documents. Note also that the mapping-level boost parameter has been deprecated since Elasticsearch 5.0.
For these reasons, boosting at query time is recommended. It is more flexible, because field weights can be changed without reindexing any data.
1. boost at index time
Create the index:
Explanation:
When creating the index, set boost to 3 on the company field in the mappings to increase the relevance weight of the company name.
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "company": {
        "type": "text",
        "analyzer": "ik_smart",
        "boost": 3
      },
      "desc": {
        "type": "text",
        "analyzer": "ik_smart"
      }
    }
  }
}
Add data:
PUT my-index-000001/_doc/1
{
  "company": "Beijing Jingdong century Co., Ltd"
}

PUT my-index-000001/_doc/2
{
  "desc": "Beijing Jingdong century Co., Ltd"
}
Run a multi-field query with multi_match (the keyword must be a term that actually occurs in the indexed text, here "Jingdong"):
GET my-index-000001/_search
{
  "query": {
    "multi_match": {
      "query": "Jingdong",
      "fields": ["company", "desc"]
    }
  }
}
Execution results:
"hits" : [
  {
    "_index" : "my-index-000001",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.68324494,
    "_source" : {
      "company" : "Beijing Jingdong century Co., Ltd"
    }
  },
  {
    "_index" : "my-index-000001",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 0.2876821,
    "_source" : {
      "desc" : "Beijing Jingdong century Co., Ltd"
    }
  }
]
Conclusion:
Both documents contain the same string, but the company field carries a boost of 3 while desc keeps the default of 1, so the document matched on company scores clearly higher (0.68324494 versus 0.2876821, roughly 2.4 times).
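As a rough check of the numbers in the result above, the following sketch computes the observed score ratio and the score the company match would have had without the boost. It assumes that boost acts as a plain multiplier on the field's score; the class and method names are made up for illustration.

```java
/** Back-of-the-envelope check of the scores shown in the result above. */
final class BoostRatio {

    /** How many times larger the boosted score is than the unboosted one. */
    static double ratio(double boostedScore, double unboostedScore) {
        return boostedScore / unboostedScore;
    }

    /** Score the field would have had with boost 1, assuming boost is a plain multiplier. */
    static double impliedBase(double boostedScore, double boost) {
        return boostedScore / boost;
    }

    public static void main(String[] args) {
        double companyScore = 0.68324494; // document 1, matched on company (boost 3)
        double descScore = 0.2876821;     // document 2, matched on desc (boost 1)

        System.out.println("observed ratio: " + ratio(companyScore, descScore));
        System.out.println("company score without boost: " + impliedBase(companyScore, 3.0));
    }
}
```

The ratio comes out at about 2.375 rather than exactly 3, and the implied unboosted company score (about 0.2277) is lower than the desc score. Under the stated assumption, the remaining gap comes from the base scores of the two fields, not from the boost itself.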
2. boost at query time
Recreate the index without the mapping-level boost and add the same data:
DELETE my-index-000001

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "company": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "desc": {
        "type": "text",
        "analyzer": "ik_smart"
      }
    }
  }
}

PUT my-index-000001/_doc/1
{
  "company": "Beijing Jingdong century Co., Ltd"
}

PUT my-index-000001/_doc/2
{
  "desc": "Beijing Jingdong century Co., Ltd"
}
With a plain query and no boost, the two documents get the same score:
GET my-index-000001/_search
{
  "query": {
    "multi_match": {
      "query": "Jingdong",
      "fields": ["company", "desc"]
    }
  }
}
Increase the weight of the company field:
GET my-index-000001/_search
{
  "query": {
    "multi_match": {
      "query": "Jingdong",
      "fields": ["company^3", "desc"]
    }
  }
}
Explanation:
Appending the "^" symbol and a boost value to a field name raises the scoring weight of that field. A field without the suffix keeps the default boost of 1.
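When the weights live in application code, the "field^boost" strings can be generated instead of hard-coded. The sketch below uses only the Java standard library; the class and method names are hypothetical and not part of any ES client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Builds the entries of a multi_match "fields" array from a field -> boost map. */
final class BoostedFields {

    static List<String> toFieldSpecs(Map<String, Float> boosts) {
        List<String> specs = new ArrayList<>();
        for (Map.Entry<String, Float> entry : boosts.entrySet()) {
            float boost = entry.getValue();
            // Fields with the default boost of 1 need no "^" suffix.
            specs.add(boost == 1.0f ? entry.getKey() : entry.getKey() + "^" + boost);
        }
        return specs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("company", 3.0f);
        boosts.put("desc", 1.0f);
        System.out.println(toFieldSpecs(boosts)); // prints [company^3.0, desc]
    }
}
```

The resulting strings can be placed directly in the "fields" array of the request body, so changing a weight only requires changing the map.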
3, Weight control in the ES Java API
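import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.index.query.QueryBuilders;

// Field name -> boost: company counts 3x, desc keeps the default weight.
Map<String, Float> fields = new HashMap<>(2);
fields.put("company", 3.0f);
fields.put("desc", 1.0f);
// queryBuilder (a bool query builder) and paramsDto come from the surrounding application code.
queryBuilder.must(
    QueryBuilders.multiMatchQuery(paramsDto.getKeyword())
        .fields(fields)
        .analyzer("ik_smart"));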
Explanation:
The fields map specifies which fields to match and the boost to apply to each of them.
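Because query-time weights can change without reindexing, it can be convenient to keep them in configuration rather than in code. The following is a minimal sketch of that idea using only the standard library; the "field:weight" format and the FieldWeights class are assumptions made for illustration, not an ES feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Parses a hypothetical config string such as "company:3, desc" into field boosts. */
final class FieldWeights {

    static Map<String, Float> parse(String spec) {
        Map<String, Float> weights = new LinkedHashMap<>();
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split(":");
            String field = parts[0].trim();
            if (field.isEmpty()) {
                continue; // skip empty entries, e.g. from a trailing comma
            }
            // A field listed without a weight keeps the default boost of 1.
            float boost = parts.length > 1 ? Float.parseFloat(parts[1].trim()) : 1.0f;
            weights.put(field, boost);
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Float> fields = parse("company:3, desc");
        System.out.println(fields); // prints {company=3.0, desc=1.0}
        // The map has the same shape as the one passed to .fields(...) above.
    }
}
```

With this approach, tuning the weight of company or desc becomes a configuration change, which matches the reason for preferring query-time boost in the first place.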
Summary
After reading this article, you should know how to raise the scoring weight of individual fields in a multi-field query through the boost parameter:
1. The weight parameter boost can be used in two ways: set the field weight when creating the index, or set it at query time.
2. In the ES Java API, field weights are passed to the query as a map of field names to boost values.