1. Configure Elasticsearch in Spring Boot
1.1 Add the required jar packages to the project
1.1.1 Add the dependencies to build.gradle
I created a Gradle project; a Maven project works the same way: just add the corresponding dependencies.
    // Spring Data Elasticsearch dependency
    compile('org.springframework.boot:spring-boot-starter-data-elasticsearch')
    // JNA dependency, needed by Java to access native packages of the current operating system
    compile('net.java.dev.jna:jna:4.3.0')
1.1.2 Add the Elasticsearch configuration to application.properties
    # Default cluster name of ES; if you installed ES without any special configuration, this is the name
    spring.data.elasticsearch.cluster-name=elasticsearch
    # Addresses of the Elasticsearch cluster nodes, separated by commas.
    # If none is specified, a client node is started. The default port for Java access is 9300.
    spring.data.elasticsearch.cluster-nodes=localhost:9300
    # Connection timeout
    spring.data.elasticsearch.properties.transport.tcp.connect_timeout=120s
1.2 Creating Document Entity Objects
    package site.wlss.blog.domain.es;

    import java.io.Serializable;
    import java.sql.Timestamp;

    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.Document;
    import org.springframework.data.elasticsearch.annotations.Field;
    import org.springframework.data.elasticsearch.annotations.FieldIndex;

    import site.wlss.blog.domain.Blog;

    /**
     * EsBlog Document class.
     *
     * @since 5 August 2018, 20:00
     * @author wangli
     */
    /*
     * Attributes of the @Document annotation, by analogy with MySQL:
     *   index    -> database
     *   type     -> table
     *   document -> row
     */
    @Document(indexName = "blog", type = "blog")
    public class EsBlog implements Serializable {

        private static final long serialVersionUID = 1L;

        // Primary key. Note that the id in Elasticsearch is a String, unlike the numeric ids we usually use.
        // With @Id, this field is the document's primary key in Elasticsearch and can be queried directly.
        @Id
        private String id;

        // Id of the Blog entity this document belongs to; not analyzed, so it is not full-text searched
        @Field(index = FieldIndex.not_analyzed)
        private Long blogId;

        private String title;

        private String summary;

        private String content;

        @Field(index = FieldIndex.not_analyzed) // not analyzed, so it is not full-text searched
The above is part of my code. Note the @Document annotation on the entity class, the @Id annotation on the object's id, and the @Field annotation that describes a field. These annotations are explained in detail below.
Explanation 1: @Document annotation
The attributes of the @Document annotation, by analogy with MySQL, are as follows:
indexName --> The name of the index. Naming it after the project is recommended; it is equivalent to a database.
type --> The type. Naming it after the entity is recommended; it is equivalent to a table in a database.
Document --> Equivalent to a row, i.e. one specific object.
The attributes declared by the annotation are:
    String indexName();                      // Name of the index; naming it after the project is recommended
    String type() default "";                // Type; naming it after the entity is recommended
    short shards() default 5;                // Default number of shards
    short replicas() default 1;              // Default number of replicas per shard
    String refreshInterval() default "1s";   // Refresh interval
    String indexStoreType() default "fs";    // Index file storage type
Explanation 2: @Id annotation
The field annotated with @Id is the document's primary key in Elasticsearch, so a document can be queried directly by it.
Explanation 3: @Field annotation
    public @interface Field {
        FieldType type() default FieldType.Auto;          // Detect the field type automatically
        FieldIndex index() default FieldIndex.analyzed;   // Analyzed (tokenized) by default
        DateFormat format() default DateFormat.none;
        String pattern() default "";
        boolean store() default false;                    // The original text is not stored by default
        String searchAnalyzer() default "";               // Analyzer used when searching the field
        String indexAnalyzer() default "";                // Analyzer used when indexing the field
        String[] ignoreFields() default {};               // Fields to ignore, if any
        boolean includeInParent() default false;
    }
2. Create the document repository in the Spring Data JPA style
Because we pulled in Spring Data Elasticsearch, it follows the Spring Data interfaces; in other words, working with Elasticsearch is done the same way as working with Spring Data JPA. We only need to make our document repository extend ElasticsearchRepository.
    package site.wlss.blog.repository.es;

    import org.springframework.data.domain.Page;
    import org.springframework.data.domain.Pageable;
    import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

    import site.wlss.blog.domain.es.EsBlog;

    /**
     * EsBlog Repository Interface.
     * @author Wang Li
     * @date 5 August 2018, 20:00
     */
    public interface EsBlogRepository extends ElasticsearchRepository<EsBlog, String> {

        // Two additional query methods, created according to the Spring Data JPA naming conventions

        /**
         * Fuzzy query (deduplicated): matches documents whose title, summary, content or tags contain the given text.
         * @param title
         * @param summary
         * @param content
         * @param tags
         * @param pageable
         * @return
         */
        Page<EsBlog> findDistinctEsBlogByTitleContainingOrSummaryContainingOrContentContainingOrTagsContaining(
                String title, String summary, String content, String tags, Pageable pageable);

        /**
         * Query the EsBlog by the id of its Blog.
         * @param blogId
         * @return
         */
        EsBlog findByBlogId(Long blogId);
    }
The two methods in the interface body are additional query methods I defined following the Spring Data JPA naming conventions.
3. Query documents through the repository
This works no differently from the usual JPA operations, i.e. the common create, read, update and delete operations.
4. Advanced complex queries in Elasticsearch: non-aggregated and aggregated queries
Here's what I want to focus on today.
4.1 Non-aggregated complex queries (the typical workflow is shown here)
    public List<EsBlog> elasticSearchTest() {
        // 1. Create the QueryBuilder (i.e. set the query conditions). Here we build a combined
        //    (multi-condition) query. More query types are introduced later.
        /*
         * Bool query builder:
         *   must(QueryBuilder)    : AND
         *   mustNot(QueryBuilder) : NOT
         *   should(QueryBuilder)  : OR
         */
        BoolQueryBuilder builder = QueryBuilders.boolQuery();
        // On the builder, must, should and mustNot correspond to AND, OR and NOT in SQL

        // Fuzzy search: the blog summary must contain a term close to "Study"
        builder.must(QueryBuilders.fuzzyQuery("summary", "Study"));
        // The keyword must appear in the given field: the constructor takes the keyword and
        // field() takes the field name (use "title" to search the blog title)
        builder.must(new QueryStringQueryBuilder("man").field("springdemo"));

        // Sort by number of comments, descending
        FieldSortBuilder sort = SortBuilders.fieldSort("commentSize").order(SortOrder.DESC);

        // Paging: first page, 10 items per page.
        // Note that page numbers start from 0, a bit like LIMIT in SQL.
        PageRequest pageRequest = new PageRequest(0, 10);

        // 2. Build the query
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        // Add the query conditions
        nativeSearchQueryBuilder.withQuery(builder);
        // Add the paging
        nativeSearchQueryBuilder.withPageable(pageRequest);
        // Add the sort
        nativeSearchQueryBuilder.withSort(sort);
        // Produce the NativeSearchQuery
        NativeSearchQuery query = nativeSearchQueryBuilder.build();

        // 3. Execution method 1: through the repository
        Page<EsBlog> page = esBlogRepository.search(query);

        // Execution method 2: through ElasticsearchTemplate.
        // This requires the following field in the class:
        // @Autowired
        // private ElasticsearchTemplate elasticsearchTemplate;
        List<EsBlog> blogList = elasticsearchTemplate.queryForList(query, EsBlog.class);

        // 4. Get the total number of hits (for front-end paging)
        int total = (int) page.getTotalElements();

        // 5. Get the queried content (returned to the front end)
        List<EsBlog> content = page.getContent();
        return content;
    }
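The remark above that page numbers start from 0, much like LIMIT in SQL, can be made concrete with a small plain-Java sketch. It does not use Spring Data at all; the class PagingSketch and its methods are hypothetical names invented for this illustration, and it only shows how a zero-based page index and a page size map to an offset and a slice of results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy illustration of zero-based paging; not part of Spring Data.
class PagingSketch {

    // Offset of the first item on a zero-based page, as in SQL "LIMIT offset, size".
    static long offset(int pageIndex, int pageSize) {
        return (long) pageIndex * pageSize;
    }

    // Items on the given zero-based page, or an empty list when the page is past the end.
    static <T> List<T> slice(List<T> all, int pageIndex, int pageSize) {
        long from = offset(pageIndex, pageSize);
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList((int) from, to));
    }

    // Number of pages needed to show totalElements items.
    static int totalPages(long totalElements, int pageSize) {
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }
}
```

With 25 hits and a page size of 10, page index 0 covers the first ten hits and page index 2 covers the last five, which is why the first page is requested with index 0.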
4.2 Examples of building query conditions with QueryBuilder
Before moving on to aggregated queries, it is worth looking at some common ways of creating query conditions, i.e. QueryBuilder objects.
4.2.1 Exact Query (Must Match Perfectly)
Single match: termQuery
    // Term query (the value is not analyzed). Parameter 1: field name; parameter 2: value to look for.
    // Because the value is not analyzed, it can only match a single character for Chinese text
    // and a single word for English text.
    QueryBuilder queryBuilder = QueryBuilders.termQuery("fieldName", "fieldValue");
    // Match query (the value is analyzed, using the default analyzer)
    QueryBuilder queryBuilder2 = QueryBuilders.matchQuery("fieldName", "fieldValue");
Multiple Matches
    // Terms query (the values are not analyzed). Parameter 1: field name; remaining parameters: the values to look for.
    // Because the values are not analyzed, each can only match a single character for Chinese text
    // and a single word for English text.
    QueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("fieldName", "fieldValue1", "fieldValue2" /* , ... */);
    // Multi-match query (the value is analyzed, using the default analyzer); the value comes first, then the field names
    QueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("fieldValue", "fieldName1", "fieldName2", "fieldName3");
    // Match all documents, i.e. no query condition is set
    QueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
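The difference between the non-analyzed term queries and the analyzed match queries above is easy to miss, so here is a toy model in plain Java. It is only a sketch under stated assumptions: the class TermVsMatchSketch is hypothetical, its analyzer (lowercase, split on anything that is not a letter or digit) merely stands in for a real Elasticsearch analyzer, and scoring is ignored. It shows why a term lookup for "Spring" can miss a document that a match lookup finds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of term vs. match lookups on an analyzed field; not Elasticsearch code.
class TermVsMatchSketch {

    // Inverted index: token -> ids of the documents that contain it
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Hypothetical analyzer: lowercase, then split on non-letter, non-digit characters
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Index a document: its text is analyzed and every token points back to the document
    void add(int docId, String text) {
        for (String token : analyze(text)) {
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Term lookup: the value is used as-is, without analysis
    Set<Integer> term(String value) {
        return new TreeSet<>(index.getOrDefault(value, Collections.<Integer>emptySet()));
    }

    // Match lookup: the value is analyzed first; a document matches if it contains any token
    Set<Integer> match(String value) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyze(value)) {
            hits.addAll(term(token));
        }
        return hits;
    }
}
```

After indexing "Spring Boot Guide" as document 1, term("Spring") finds nothing because the stored token is "spring", while match("Spring") analyzes the input to "spring" and finds document 1.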
4.2.2 Fuzzy Query (partial matches are enough)
    // Five common kinds of fuzzy query

    // 1. Query-string query: matches the value anywhere in the field
    QueryBuilders.queryStringQuery("fieldValue").field("fieldName");

    // 2. More-like-this query, commonly used to recommend similar content.
    //    If no field name is given, it defaults to _all.
    QueryBuilders.moreLikeThisQuery(new String[] {"fieldName"}).addLikeText("text to match");

    // 3. Prefix query: if the field is not analyzed, the prefix is matched against the whole field value
    QueryBuilders.prefixQuery("fieldName", "fieldValue");

    // 4. Fuzzy query: a term-based query that tolerates small differences, controlled by the fuzziness setting.
    //    With Fuzziness.ONE, a document matches when hotelName contains a term that is at most one
    //    character edit (insertion, deletion or substitution) away from "tel".
    QueryBuilders.fuzzyQuery("hotelName", "tel").fuzziness(Fuzziness.ONE);

    // 5. Wildcard query: * matches any string, ? matches any single character
    QueryBuilders.wildcardQuery("fieldName", "ctr*"); // first argument: field name; second: pattern with wildcards
    QueryBuilders.wildcardQuery("fieldName", "c?r?");
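To make the wildcard and fuzzy rules above more tangible, the following plain-Java sketch re-implements them on single terms. It is an illustration only, not how Elasticsearch evaluates these queries: FuzzyAndWildcardSketch and its methods are hypothetical names, the fuzzy check uses the classic Levenshtein distance (insertions, deletions, substitutions), and Elasticsearch may additionally count a swap of two adjacent characters as a single edit, which this sketch does not.

```java
import java.util.regex.Pattern;

// Toy re-implementation of wildcard and fuzzy term matching; not Elasticsearch code.
class FuzzyAndWildcardSketch {

    // Wildcard match on a whole term: '*' is any string, '?' is any single character.
    static boolean wildcardMatches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(term).matches();
    }

    // Levenshtein distance: minimum number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A term matches fuzzily when it is within maxEdits edits of the query term.
    static boolean fuzzyMatches(String queryTerm, String indexedTerm, int maxEdits) {
        return editDistance(queryTerm, indexedTerm) <= maxEdits;
    }
}
```

With one allowed edit, the query term "tel" matches the indexed term "tell" (one insertion) but not "hotel", which is two edits away.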
4.2.3 Range Query
    // Closed interval query
    QueryBuilder queryBuilder0 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2");
    // Open interval query; includeUpper and includeLower default to true, i.e. the bounds are included
    QueryBuilder queryBuilder1 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2").includeUpper(false).includeLower(false);
    // Greater than
    QueryBuilder queryBuilder2 = QueryBuilders.rangeQuery("fieldName").gt("fieldValue");
    // Greater than or equal to
    QueryBuilder queryBuilder3 = QueryBuilders.rangeQuery("fieldName").gte("fieldValue");
    // Less than
    QueryBuilder queryBuilder4 = QueryBuilders.rangeQuery("fieldName").lt("fieldValue");
    // Less than or equal to
    QueryBuilder queryBuilder5 = QueryBuilders.rangeQuery("fieldName").lte("fieldValue");
4.2.4 Combination Query/Multi-Conditional Query/Boolean Query
    // queryBuilder below stands for any QueryBuilder, e.g. one of those built in the previous sections
    QueryBuilders.boolQuery();
    QueryBuilders.boolQuery().must(queryBuilder);     // Documents must match the condition; equivalent to AND
    QueryBuilders.boolQuery().mustNot(queryBuilder);  // Documents must not match the condition; equivalent to NOT
    QueryBuilders.boolQuery().should(queryBuilder);   // Documents match if at least one should condition is met; equivalent to OR
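How must, mustNot and should combine can be sketched with plain Java predicates. This is a toy model with a hypothetical class BoolQuerySketch, not the Elasticsearch implementation, and it ignores scoring. It also bakes in one assumption worth stating: should clauses are treated as required (at least one must match) only when there is no must clause; when a must clause is present they are treated as optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of how a bool query decides whether a document matches; scoring is ignored.
class BoolQuerySketch<T> {

    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> mustNot = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();

    BoolQuerySketch<T> must(Predicate<T> condition) {
        must.add(condition);
        return this;
    }

    BoolQuerySketch<T> mustNot(Predicate<T> condition) {
        mustNot.add(condition);
        return this;
    }

    BoolQuerySketch<T> should(Predicate<T> condition) {
        should.add(condition);
        return this;
    }

    boolean matches(T doc) {
        // AND: every must condition has to hold
        for (Predicate<T> condition : must) {
            if (!condition.test(doc)) {
                return false;
            }
        }
        // NOT: no mustNot condition may hold
        for (Predicate<T> condition : mustNot) {
            if (condition.test(doc)) {
                return false;
            }
        }
        // OR: assumption of this sketch: should is only required when there is no must clause
        if (must.isEmpty() && !should.isEmpty()) {
            for (Predicate<T> condition : should) {
                if (condition.test(doc)) {
                    return true;
                }
            }
            return false;
        }
        return true;
    }
}
```

For example, a query with must "contains spring" and mustNot "contains draft" accepts "spring boot" and rejects "spring draft".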
4.3 Aggregated Query
Elasticsearch has a feature called aggregations, which lets you produce sophisticated analytics over your data. It is similar to GROUP BY in SQL, but more powerful.
To master aggregation, you only need to understand two main concepts: (refer to https://blog.csdn.net/dm_vincent/article/details/42387161)
Buckets: A collection of documents that satisfy a certain condition.
Metrics: Statistical information calculated for documents in a bucket.
That's it! Every aggregation is simply a combination of one or more buckets and zero or more metrics. It can be roughly translated into SQL:
SELECT COUNT(color) FROM table GROUP BY color
In the statement above, COUNT(color) plays the role of a metric and GROUP BY color plays the role of a bucket.
Buckets are conceptually similar to grouping in SQL, while metrics are similar to COUNT(), SUM(), MAX() and so on.
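The SQL statement above can be mirrored in plain Java to show the two roles side by side. This is only an analogy using the standard streams API, with a hypothetical class name ColorCountSketch; it is not how Elasticsearch computes aggregations. The groupingBy step plays the role of the bucket and counting() plays the role of the metric.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogy of "SELECT COUNT(color) FROM table GROUP BY color".
class ColorCountSketch {

    // Bucket: one group per distinct color. Metric: the number of items in each group.
    static Map<String, Long> countByColor(List<String> colors) {
        return colors.stream()
                .collect(Collectors.groupingBy((String c) -> c, TreeMap::new, Collectors.counting()));
    }
}
```

For the input red, blue, red, the result holds a "red" bucket with count 2 and a "blue" bucket with count 1.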
Let's take a closer look at these concepts.
Buckets
A bucket is a collection of documents that satisfy certain conditions:
An employee belongs to either a male bucket or a female bucket.
The city of Albany belongs to the New York State bucket.
The date 2014-10-28 belongs to the October bucket.
As aggregation is performed, the values in each document are calculated to determine whether they match the bucket conditions. If the match is successful, the document is placed in the bucket and the aggregation continues.
Buckets can also be nested inside other buckets, which lets you express hierarchical or conditional partitioning. For example, Cincinnati can be placed in the Ohio bucket, while the whole of Ohio can be placed in the United States bucket.
There are many types of buckets in ES that allow you to divide documents in many ways (by hour, by the most popular entries, by age, by geographical location, and more). But fundamentally, they all operate on the same principle: dividing documents according to conditions.
Metrics
Buckets allow us to divide documents meaningfully, but ultimately we need to calculate some metrics for the documents in each bucket. Bucketing is only a means to an end: it provides a way to group documents so that you can calculate the metrics you need.
Most metrics are simple mathematical operations (e.g., min, mean, max, and sum) that are calculated from values in the documents. In practice, metrics let you calculate, for example, the average salary, the maximum selling price, or the 95th percentile of query latency.
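As a rough illustration of what such metrics compute, here is a plain-Java sketch over the values of a single bucket. The class MetricsSketch is a hypothetical name, and the percentile uses the exact nearest-rank definition on a sorted array, which is an assumption chosen for simplicity; Elasticsearch computes percentiles with its own, approximate algorithm.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

// Plain-Java sketch of simple metrics computed over the values in one bucket.
class MetricsSketch {

    // min, max, sum, average and count in a single pass
    static DoubleSummaryStatistics summarize(double[] values) {
        return Arrays.stream(values).summaryStatistics();
    }

    // Nearest-rank percentile: the smallest value such that at least p percent
    // of all values are less than or equal to it (p in the range (0, 100]).
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For the latencies 10, 20, 30 and 40, the summary gives a minimum of 10, a maximum of 40, a sum of 100 and an average of 25, and the nearest-rank 95th percentile is 40.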
Combine the two
An aggregation is a combination of buckets and metrics. An aggregation may have just a bucket, just a metric, or one of each. Buckets may even contain several levels of nested buckets. For example, we can divide documents into buckets by the country they belong to, and then calculate the average salary (a metric) for each bucket.
Because buckets can be nested, we can implement a more complex aggregation operation:
- Documents are divided into buckets by country. (bucket)
- Then each country bucket is divided into buckets by gender. (bucket)
- Then each gender bucket is divided into buckets by age range. (bucket)
- Finally, the average salary is calculated for each age range. (metric)
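The four steps above can be sketched in plain Java with nested groupingBy collectors. This is an analogy only, not Elasticsearch code: the Employee class, its fields and the ten-year age ranges are all hypothetical and chosen just for this example. Each groupingBy level corresponds to one level of buckets, and averagingDouble is the metric computed in the innermost bucket.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogy of nested buckets (country -> gender -> age range) with one metric.
class NestedBucketsSketch {

    // Hypothetical document type, for illustration only
    static final class Employee {
        final String country;
        final String gender;
        final int age;
        final double salary;

        Employee(String country, String gender, int age, double salary) {
            this.country = country;
            this.gender = gender;
            this.age = age;
            this.salary = salary;
        }
    }

    // Hypothetical age ranges: ten years wide, labelled like "30-39"
    static String ageRange(int age) {
        int lower = (age / 10) * 10;
        return lower + "-" + (lower + 9);
    }

    // country bucket -> gender bucket -> age-range bucket -> average salary (metric)
    static Map<String, Map<String, Map<String, Double>>> averageSalary(List<Employee> employees) {
        return employees.stream().collect(
                Collectors.groupingBy((Employee e) -> e.country, TreeMap::new,
                        Collectors.groupingBy((Employee e) -> e.gender, TreeMap::new,
                                Collectors.groupingBy((Employee e) -> ageRange(e.age), TreeMap::new,
                                        Collectors.averagingDouble((Employee e) -> e.salary)))));
    }
}
```

Two employees from the same country, of the same gender, aged 34 and 38, with salaries 100 and 200, end up in the same "30-39" bucket with an average salary of 150.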
Aggregation queries are created with AggregationBuilders. Some common aggregations are as follows
(Reference: http://blog.csdn.net/u010454030/article/details/63266035)
    // (1) Count the values of a field
    ValueCountBuilder vcb = AggregationBuilders.count("count_uid").field("uid");
    // (2) Count the distinct values of a field (approximate, with a small error)
    CardinalityBuilder cb = AggregationBuilders.cardinality("distinct_count_uid").field("uid");
    // (3) Filter aggregation
    FilterAggregationBuilder fab = AggregationBuilders.filter("uid_filter").filter(QueryBuilders.queryStringQuery("uid:001"));
    // (4) Group by a field
    TermsBuilder tb = AggregationBuilders.terms("group_name").field("name");
    // (5) Sum
    SumBuilder sumBuilder = AggregationBuilders.sum("sum_price").field("price");
    // (6) Average
    AvgBuilder ab = AggregationBuilders.avg("avg_price").field("price");
    // (7) Maximum
    MaxBuilder mb = AggregationBuilders.max("max_price").field("price");
    // (8) Minimum
    MinBuilder min = AggregationBuilders.min("min_price").field("price");
    // (9) Group by date interval
    DateHistogramBuilder dhb = AggregationBuilders.dateHistogram("dh").field("date");
    // (10) Get the top matching documents within the aggregation
    TopHitsBuilder thb = AggregationBuilders.topHits("top_result");
    // (11) Nested aggregation
    NestedBuilder nb = AggregationBuilders.nested("nested_path").path("quests");
    // (12) Reverse nested aggregation
    AggregationBuilders.reverseNested("res_nested").path("kps");
The detailed usage steps of aggregated queries are as follows:
    public void test() {
        // Goal: find the users who have written the most blogs (each blog belongs to one user)
        // by counting how often each username occurs in the blog documents.

        // Collection that will hold the usernames
        List<String> userNameList = new ArrayList<>();

        // 1. Create the query condition, i.e. the QueryBuilder
        QueryBuilder matchAllQuery = QueryBuilders.matchAllQuery(); // match everything, i.e. no query condition

        // 2. Build the query
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        // 2.0 Set the QueryBuilder
        nativeSearchQueryBuilder.withQuery(matchAllQuery);
        // 2.1 Set the search type. The default is QUERY_THEN_FETCH: the matching documents are first
        //     queried on each shard, then merged and re-ranked, and the top `size` documents are fetched.
        //     See https://blog.csdn.net/wulex/article/details/71081042.
        nativeSearchQueryBuilder.withSearchType(SearchType.QUERY_THEN_FETCH);
        // 2.2 Specify the index and document type to query, i.e. the indexName and type
        //     set in the @Document annotation of our document class
        nativeSearchQueryBuilder.withIndices("blog").withTypes("blog");
        // 2.3 The key step: specify the aggregation. A terms (group by field) aggregation is used
        //     as the example here; choose whatever aggregation your own query needs.
        //     It counts how often each value of the field (assumed to be username) occurs across all
        //     documents and orders the buckets by count, descending (often used for popularity rankings).
        TermsBuilder termsAggregation = AggregationBuilders.terms("Name for aggregate queries")
                .field("username")
                .order(Terms.Order.count(false));
        nativeSearchQueryBuilder.addAggregation(termsAggregation);
        // 2.4 Build the query object
        NativeSearchQuery nativeSearchQuery = nativeSearchQueryBuilder.build();

        // 3. Execute the query
        // 3.1 Method 1: through the repository, which returns a Page-wrapped result set
        Page<EsBlog> search = esBlogRepository.search(nativeSearchQuery);
        List<EsBlog> content = search.getContent();
        for (EsBlog esBlog : content) {
            userNameList.add(esBlog.getUsername());
        }
        // Once we have the documents we can read their authors and work out the most active users.

        // 3.2 Method 2: through ElasticsearchTemplate.queryForList
        List<EsBlog> queryForList = elasticsearchTemplate.queryForList(nativeSearchQuery, EsBlog.class);

        // 3.3 Method 3 (commonly used): through ElasticsearchTemplate.query(), which gives access to the aggregations
        Aggregations aggregations = elasticsearchTemplate.query(nativeSearchQuery, new ResultsExtractor<Aggregations>() {
            @Override
            public Aggregations extract(SearchResponse response) {
                return response.getAggregations();
            }
        });
        // Convert to a map
        Map<String, Aggregation> aggregationMap = aggregations.asMap();
        // Get the aggregation result by the name given above; it holds the buckets we are after
        StringTerms stringTerms = (StringTerms) aggregationMap.get("Name for aggregate queries");
        // Get all the buckets
        List<Bucket> buckets = stringTerms.getBuckets();
        // Traverse the buckets with an iterator; if you do not need to remove elements
        // while traversing, a plain for-each loop works just as well
        Iterator<Bucket> iterator = buckets.iterator();
        while (iterator.hasNext()) {
            // Each bucket is keyed by the field value, so we just take its key
            String username = iterator.next().getKeyAsString(); // or bucket.getKey().toString()
            // Store the username so the corresponding users can be looked up afterwards
            userNameList.add(username);
        }

        // Finally, look up the corresponding users by userNameList
        List<User> listUsersByUsernames = userService.listUsersByUsernames(userNameList);
    }
Original address: https://blog.csdn.net/topdandan/article/details/81436141