1, Background
Elasticsearch (ES) has many basic data types. This article focuses on the two string types, text and keyword:
ES 2.x has neither of these types, only the string type.
ES 5.x and later versions deprecate the string type and introduce the text and keyword types.
The basic data types may differ slightly between ES versions. Refer to the official documentation for the version you use, for example: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/mapping-types.html
2, Difference between text and keyword
Any string field can be mapped as either the text type or the keyword type.
The difference is that a text field is analyzed: the value is first split into terms by the default analyzer, and those terms, not the original string, are stored in the index. You can also specify a different analyzer for the field.
A query on a text field does not simply report whether a document matches. It computes a relevance score and returns the results ordered from the highest score to the lowest. Because matching is done against the analyzed terms, data we expected to find may not be returned.
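To make the analysis step concrete, here is a minimal sketch in plain Java. It is not the real ES analyzer: it assumes a simplified rule (lowercase the value, then split on anything that is not a letter or digit), and the names TextAnalysisSketch, analyze and index are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Simplified stand-in for the analysis of a text field; not the real ES analyzer. */
final class TextAnalysisSketch {

    /** Lowercases the value and splits it on anything that is not a letter or digit. */
    static List<String> analyze(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Builds a tiny inverted index: term -> ids of the documents that contain it. */
    static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> inverted = new TreeMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyze(docs.get(id))) {
                inverted.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> inverted = index(List.of("Quick Brown Fox", "quick start guide"));
        // Both documents contain the term "quick" after analysis.
        System.out.println(inverted.get("quick"));
        // The original string as a whole is not a term in the index, so this lookup finds nothing.
        System.out.println(inverted.get("Quick Brown Fox"));
    }
}
```

The second lookup shows why an exact-value search against a text field can miss a document that is obviously there: only the individual terms are indexed.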
A field mapped as keyword is not analyzed by default; its value is indexed as is. Use the keyword type when a field needs to be filtered, sorted, or aggregated by its exact value.
A keyword value is indexed as a single term. At query time it is compared directly, and a document that does not match exactly is simply not returned. Keyword is therefore the type to use for exact matching.
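A rough illustration of exact matching on a keyword field, again in plain Java rather than against ES. It assumes no normalizer is configured on the field, so the comparison is case-sensitive; KeywordFieldSketch, termMatches and countByValue are hypothetical names used only here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified stand-in for a keyword field: each whole value is a single term. */
final class KeywordFieldSketch {

    /** Exact comparison of the whole value: no analysis and no lowercasing. */
    static boolean termMatches(String storedValue, String queryValue) {
        return storedValue.equals(queryValue);
    }

    /** Counts documents per exact value, similar in spirit to aggregating on a keyword field. */
    static Map<String, Integer> countByValue(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(termMatches("Jilin", "Jilin")); // true
        System.out.println(termMatches("Jilin", "jilin")); // false: case differs
        System.out.println(termMatches("Jilin", "Jil"));   // false: only part of the value
        System.out.println(countByValue(List.of("Jilin", "Beijing", "Jilin"))); // {Beijing=1, Jilin=2}
    }
}
```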
For fuzzy queries in ES, see this blog post:
https://blog.csdn.net/pony_maggie/article/details/113951893
In theory, a fuzzy query performs worse than term and match queries.
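One way to see where the extra cost comes from: ES fuzzy matching is based on edit distance, so a candidate term has to be compared character by character instead of being looked up directly. The sketch below is a textbook Levenshtein distance in plain Java, not how ES implements fuzzy queries internally; EditDistanceSketch, levenshtein and fuzzyMatches are names invented for this example.

```java
/** Textbook edit distance, for illustration only; ES does not compute it this way internally. */
final class EditDistanceSketch {

    /** Number of single-character insertions, deletions and substitutions needed to turn a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** A term counts as a fuzzy match when it is within maxEdits of the query. */
    static boolean fuzzyMatches(String term, String query, int maxEdits) {
        return levenshtein(term, query) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("Jilin", "Jilan"));       // 1
        System.out.println(fuzzyMatches("Jilin", "Jilan", 1));   // true
        System.out.println(fuzzyMatches("Jilin", "Beijing", 2)); // false
    }
}
```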
3, Code use
Example mapping structure:
{
  "mappings": {
    "example_test_type": {
      "dynamic": "false",
      "_all": {
        "enabled": false
      },
      "properties": {
        // User name: tester (fuzzy matching)
        "userName": {
          "type": "text"
        },
        // Registered residence: Jilin (exact matching)
        "userPlace": {
          "type": "keyword"
        },
        "createTime": {
          "type": "long"
        }
      }
    }
  }
}
GET query parameters (this query successfully returns one record):
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "userPlace": {
              "value": "Jilin",
              "boost": 1.0
            }
          }
        },
        {
          "match_phrase": {
            "userName": {
              // Matches when the analyzed user name contains the input as a phrase
              "query": "test",
              "slop": 0,
              "zero_terms_query": "NONE",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "sort": [
    {
      "createTime": {
        "order": "desc"
      }
    }
  ]
}
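The match_phrase clause with "slop": 0 can be pictured as follows: the query string is analyzed into terms, and those terms must appear in the field next to each other and in the same order. This is a plain-Java sketch under the same simplified tokenization assumption as before (lowercase, split on non-letters and non-digits), not ES code; PhraseMatchSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Illustrates phrase matching with slop 0 on analyzed terms; a simplification, not ES code. */
final class PhraseMatchSketch {

    /** Lowercases the value and splits it on anything that is not a letter or digit. */
    static List<String> analyze(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** True when the query terms occur in the field as one contiguous, ordered run. */
    static boolean matchesPhrase(String fieldValue, String query) {
        List<String> queryTerms = analyze(query);
        if (queryTerms.isEmpty()) {
            return false; // nothing to match, comparable to "zero_terms_query": "NONE"
        }
        return Collections.indexOfSubList(analyze(fieldValue), queryTerms) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(matchesPhrase("The Quick Brown Fox", "quick brown")); // true
        System.out.println(matchesPhrase("The Quick Brown Fox", "brown quick")); // false: wrong order
        System.out.println(matchesPhrase("The Quick Brown Fox", "quick fox"));   // false: not adjacent
    }
}
```

Under this tokenization a query for part of a word matches only if the analyzer emits that part as its own term, so the result of a phrase query depends on which analyzer the field uses.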
Java code call:
/* 1. Assemble the query conditions */
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
// Sort by creation time in descending order
List<FieldSortBuilder> sortBuilderList = new ArrayList<>();
sortBuilderList.add(new FieldSortBuilder("createTime").order(SortOrder.DESC));
if (CollectionUtils.isNotEmpty(sortBuilderList)) {
    for (FieldSortBuilder sortBuilder : sortBuilderList) {
        sourceBuilder.sort(sortBuilder);
    }
}
// User name: phrase match on the text field
boolQueryBuilder.must(QueryBuilders.matchPhraseQuery("userName", userName));
// Registered residence: exact match on the keyword field
boolQueryBuilder.must(QueryBuilders.termQuery("userPlace", userPlace));
sourceBuilder.query(boolQueryBuilder);

/* 2. Call the ES query */
SearchRequest searchRequest = new SearchRequest(example_test_index); // index
searchRequest.types(example_test_type); // type
searchRequest.source(sourceBuilder);
SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

/* 3. Process the returned results */
List<UserBO> resultList = new ArrayList<>();
SearchHits hits = response.getHits();
if (hits == null || hits.totalHits <= 0) {
    return null;
}
// Convert the results to objects
UserBO userBO = null;
for (SearchHit hit : hits.getHits()) {
    userBO = JsonUtil.parseObject(hit.getSourceAsString(), UserBO.class);
    resultList.add(userBO);
}
// Return the converted objects
return resultList;
} // closes the enclosing method (its declaration is not shown)
} // closes the enclosing class (its declaration is not shown)
In this article, string fields are queried mainly with matchPhraseQuery (for the text field) and termQuery (for the keyword field).