10, ES
7. Advanced - aggregation
Aggregation lets you group and extract statistics from your data.
The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions (average, maximum, minimum).
An aggregation is applied to the documents matched by the query.
# Search addresses containing "mill", then aggregate on the results
GET bank/_search
{
  "query": {                # query for documents whose address contains "mill"
    "match": {
      "address": "Mill"
    }
  },
  "aggs": {                 # aggregations based on the query results
    "ageAgg": {             # the aggregation's name, chosen freely
      "terms": {            # terms: count documents per distinct value, similar to GROUP BY
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": {              # average of the age values
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": {              # average balance
        "field": "balance"
      }
    }
  },
  "size": 0                 # don't return the hit documents themselves
}
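For orientation, the response to the request above has roughly this shape (the numbers are made up for illustration; only the structure matters):

{
  "hits": {
    "total": { "value": 4 },    # number of matching documents
    "hits": []                  # empty because of "size": 0
  },
  "aggregations": {
    "ageAgg": {
      "buckets": [              # one bucket per distinct age value
        { "key": 38, "doc_count": 2 },
        { "key": 28, "doc_count": 1 }
      ]
    },
    "ageAvg": { "value": 34.0 },
    "balanceAvg": { "value": 25208.0 }
  }
}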
Sub aggregation
That is, nest another aggs inside an aggs.
GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {              # distribution across age groups
        "field": "age",
        "size": 100
      },
      "aggs": {               # sub-aggregation, written alongside terms
        "ageAvg": {           # average within each age bucket, e.g. the average balance of everyone aged 20
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
8. Mapping (field mappings)
Mappings are defined directly under the index. Since documents with the same field name were processed identically across different types, the type level added nothing; using mappings effectively masks types and places documents directly one level below the index.
Elasticsearch 7 removes the type concept.
Create index and specify mapping
PUT /my_index    # like creating a table in MySQL and declaring each column's type
{
  "mappings": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "keyword"    # keyword: matched exactly, not analyzed
      },
      "name": {
        "type": "text"       # full-text search: analyzed on save, analyzed again at query time
      }
    }
  }
}

PUT /my_index/_mapping    # add a new field to the existing index
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false    # false: the field is not indexed and cannot be searched; it is only a redundant stored field
    }
  }
}

GET /my_index    # view the mapping
Mappings cannot be updated
An existing field mapping cannot be updated.
This is like MySQL, where a table's column definitions cannot simply be changed.
You must create a new index and migrate the old data into it.
Create the new index
PUT /newbank
{
  "mappings": {
    "properties": {
      "account_number": { "type": "long" },
      "address":        { "type": "text" },
      "age":            { "type": "integer" },
      "balance":        { "type": "long" },
      "city":           { "type": "keyword" },
      "email":          { "type": "keyword" },
      "employer":       { "type": "keyword" },
      "firstname":      { "type": "text" },
      "gender":         { "type": "keyword" },
      "lastname": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "state": { "type": "keyword" }
    }
  }
}
Migrate data from bank to newbank
POST _reindex
{
  "source": {
    "index": "bank",
    "type": "account"    # name and type of the source index
  },
  "dest": {
    "index": "newbank"   # name of the destination index
  }
}
In the new index the type becomes _doc (the default type); the old index used account.
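Since Elasticsearch 7 deprecates types, the same migration can also be written without the source type; a minimal sketch:

POST _reindex
{
  "source": {
    "index": "bank"
  },
  "dest": {
    "index": "newbank"
  }
}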
9. Tokenization
A tokenizer receives a character stream, divides it into independent tokens (usually individual words), and then outputs the token stream.
POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 Brown-Foxes bone."
}
The standard analyzer splits on word boundaries, but it is not suitable for Chinese, because it treats every single character as a word (see the check below).
So we need a different tokenizer.
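A quick check of that claim, running the standard analyzer on a Chinese sentence meaning "I am Chinese":

POST _analyze
{
  "analyzer": "standard",
  "text": "我是中国人"
}
# tokens: 我 / 是 / 中 / 国 / 人 (one token per character)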
Install the ik tokenizer
When installing elasticsearch earlier, we mapped the container's /usr/share/elasticsearch/plugins directory to /mydata/elasticsearch/plugins on the host. The convenient way is therefore to download elasticsearch-analysis-ik-7.4.2.zip and unzip it into that folder. After installation, restart the elasticsearch container.
In short: download, unzip into /mydata/elasticsearch/plugins, restart the container.
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "我是中国人"    # "I am Chinese"
}

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "我是中国人"
}
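With the stock ik dictionary, the two modes should differ roughly as follows: ik_smart produces the coarsest split, while ik_max_word exhaustively emits every dictionary word it can find.

# ik_smart:    我 / 是 / 中国人
# ik_max_word: 我 / 是 / 中国人 / 中国 / 国人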
Supplement: editing files from the Linux command line
vi filename
i: enter insert mode
esc: exit insert mode
:wq: save and exit
Custom Dictionary
Modify IKAnalyzer.cfg.xml in /usr/share/elasticsearch/plugins/ik/config
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer Extended configuration</comment>
    <!-- Users can configure their own extended dictionary here -->
    <entry key="ext_dict"></entry>
    <!-- Users can configure their own extended stop-word dictionary here -->
    <entry key="ext_stopwords"></entry>
    <!-- Users can configure the remote extension dictionary here:
         point it at the custom word-segmentation file -->
    <entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>
    <!-- Users can configure the remote extended stop-word dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
You can look at the notes for details. I won't go into it here
https://blog.csdn.net/hancoder/article/details/113922398
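As a quick sanity check (assuming the remote dictionary above is reachable, for example served by the nginx installed in a later chapter, and that it contains a custom word such as 尚硅谷), that word should come back as a single token after restarting the container:

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "尚硅谷"
}
# expected once the custom dictionary is loaded: 尚硅谷 as one token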
10. elasticsearch-Rest-Client
Import dependency
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>
Since Spring Boot integrates ES version 6.8.5 by default, the ES version managed by the Spring Boot dependencies has to be overridden:
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.4.2</elasticsearch.version> <!-- was 6.8.5 -->
</properties>
If a microservice does not need a data source but its parent project pulls in data-source configuration, exclude the auto-configuration:
Annotate startup class
@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
Configuration class
@Configuration
public class GulimallElasticSearchConfig {

    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        // More than one es host can be specified here
        RestClientBuilder builder = RestClient.builder(
                new HttpHost("192.168.56.10", 9200, "http"));
        return new RestHighLevelClient(builder);
    }
}
Test class
Save / modify
@Test
public void indexData() throws IOException {
    // Specify the index
    IndexRequest indexRequest = new IndexRequest("users");
    indexRequest.id("1");

    User user = new User();
    user.setUserName("Zhang San");
    user.setAge(20);
    user.setGender("male");
    String jsonString = JSON.toJSONString(user);

    // Set the content to save, specifying the data and its type
    indexRequest.source(jsonString, XContentType.JSON);

    // Execute the save (creates the index if needed)
    IndexResponse index = client.index(indexRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);
    System.out.println(index);
}
If the save statement is sent again, it will become a modification operation.
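A minimal console sketch of that behavior, against a hypothetical users index: re-sending a document with the same id bumps _version, and the response's result changes from created to updated.

PUT users/_doc/1
{ "userName": "Zhang San", "age": 20 }
# response: "result": "created", "_version": 1

PUT users/_doc/1
{ "userName": "Zhang San", "age": 21 }
# response: "result": "updated", "_version": 2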
Retrieval and aggregation
@Test
public void find() throws IOException {
    // 1. Create a search request and specify the index
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("bank");

    // Build the search conditions; the SearchSourceBuilder is the JSON body of the query.
    // Every condition (match, agg, ...) is built with its own builder, placed into the
    // SearchSourceBuilder, which is then inserted into the SearchRequest.
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // sourceBuilder.from();
    // sourceBuilder.size();
    // Query for addresses containing "mill"
    sourceBuilder.query(QueryBuilders.matchQuery("address", "mill"));

    // First aggregation condition: group by age (terms); "agg1" is the aggregation name
    TermsAggregationBuilder agg1 = AggregationBuilders.terms("agg1").field("age").size(10);
    sourceBuilder.aggregation(agg1); // the parameter is an AggregationBuilder
    // Second aggregation condition: average balance
    AvgAggregationBuilder agg2 = AggregationBuilders.avg("agg2").field("balance");
    sourceBuilder.aggregation(agg2);

    System.out.println(sourceBuilder.toString()); // the whole query as a JSON string
    searchRequest.source(sourceBuilder);

    // 2. Execute the search
    SearchResponse response = client.search(searchRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);

    // 3. Analyze the response (also a JSON string)
    System.out.println(response.toString());
    // 3.1 Get the hits and turn each one into a Java bean
    SearchHits hits = response.getHits();
    SearchHit[] hits1 = hits.getHits();
    for (SearchHit hit : hits1) {
        hit.getId();
        hit.getIndex();
        // The real document is in _source; convert it with a JSON tool into your own
        // VO class (or an existing PO)
        String sourceAsString = hit.getSourceAsString();
        Account account = JSON.parseObject(sourceAsString, Account.class);
        System.out.println(account);
    }
    // 3.2 Get the aggregation results by name
    Aggregations aggregations = response.getAggregations();
    Terms agg21 = aggregations.get("agg1"); // the terms aggregation has buckets
    for (Terms.Bucket bucket : agg21.getBuckets()) {
        String keyAsString = bucket.getKeyAsString();
        System.out.println(keyAsString);
    }
    Avg agg22 = aggregations.get("agg2"); // the avg aggregation exposes getValue()
    System.out.println(agg22.getValue());
}
As a rule, a class name with an s appended (e.g. QueryBuilders, AggregationBuilders) is a factory class that constructs instances of the corresponding class.
11, Installing nginx
nginx can be understood as a web server, like Tomcat.
Start an nginx instance just to copy the configuration
docker run -p 80:80 --name nginx -d nginx:1.10
Copy the configuration files from the container to /mydata/nginx/conf/
mkdir -p /mydata/nginx/html
mkdir -p /mydata/nginx/logs
mkdir -p /mydata/nginx/conf
docker container cp nginx:/etc/nginx /mydata/nginx/conf/
# Copying leaves a nested nginx folder inside conf, so move its contents up
mv /mydata/nginx/conf/nginx/* /mydata/nginx/conf/
rm -rf /mydata/nginx/conf/nginx
Terminate original container:
docker stop nginx
Execute the command to delete the original container:
docker rm nginx
To create a new Nginx, execute the following command
docker run -p 80:80 --name nginx \
  -v /mydata/nginx/html:/usr/share/nginx/html \
  -v /mydata/nginx/logs:/var/log/nginx \
  -v /mydata/nginx/conf/:/etc/nginx \
  -d nginx:1.10
Make nginx start automatically (restart policy always):
docker update nginx --restart=always
Create a "/ mydata/nginx/html/index.html" file to test whether it can be accessed normally
echo '<h2>hello nginx!</h2>' > /mydata/nginx/html/index.html
Visit: http://<host ip>:80/index.html
12, Product es preparation
ES works in memory, so it beats MySQL at retrieval. ES also supports clustering, with data stored in shards.
1. Determine the index model
Two schemes
First: index by spu, i.e. search on spu fields and store only the sku ids per document. This saves space, but has a fatal flaw: when a matching spu is retrieved, a large number of sku ids come back at once, which can cause blocking under high concurrency.
Second: index by sku, i.e. search on sku fields and store every sku's full information. Since many skus share the same spu attributes this introduces many redundant fields, so it takes more space than the first option, but each query returns less data.
Weighing the two, the second scheme is chosen: trading space for time.
The index model is as follows:
PUT product
{
  "mappings": {
    "properties": {
      "skuId":    { "type": "long" },
      "spuId":    { "type": "keyword" },    # not analyzed
      "skuTitle": {
        "type": "text",
        "analyzer": "ik_smart"              # Chinese tokenizer
      },
      "skuPrice": { "type": "keyword" },    # keeps precision
      "skuImg": {
        "type": "keyword",
        "index": false                      # not searchable, no index generated; false in the video
      },
      "saleCount": { "type": "long" },
      "hasStock":  { "type": "boolean" },
      "hotScore":  { "type": "long" },
      "brandId":   { "type": "long" },
      "catalogId": { "type": "long" },
      "brandName": { "type": "keyword" },   # false in the video
      "brandImg": {
        "type": "keyword",
        "index": false,                     # not searchable; only used for page display
        "doc_values": false                 # cannot be aggregated; the default is true
      },
      "catalogName": {
        "type": "keyword",
        "index": false                      # not searchable, no index generated; false in the video
      },
      "attrs": {
        "type": "nested",
        "properties": {
          "attrId":   { "type": "long" },
          "attrName": {
            "type": "keyword",
            "index": false,
            "doc_values": false
          },
          "attrValue": { "type": "keyword" }
        }
      }
    }
  }
}
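For reference, a document that fits this model could look as follows (all values are made up for illustration):

PUT product/_doc/1
{
  "skuId": 1,
  "spuId": "11",
  "skuTitle": "华为 Mate 30 Pro",
  "skuPrice": "6299",
  "skuImg": "hw.jpg",
  "saleCount": 100,
  "hasStock": true,
  "hotScore": 0,
  "brandId": 1,
  "catalogId": 225,
  "brandName": "华为",
  "brandImg": "hw-logo.jpg",
  "catalogName": "手机",
  "attrs": [
    { "attrId": 4, "attrName": "网络", "attrValue": "5G" },
    { "attrId": 5, "attrName": "颜色", "attrValue": "星河银" }
  ]
}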
2. nested embedded objects
The attrs property is "type": "nested" because it is an array of inner objects that must stay searchable as whole objects.
By default, objects inside an array are flattened: each attribute of the objects is stored together in its own array.
user.name = ["aaa", "bbb"]
user.addr = ["ccc", "ddd"]
Stored this way, a search for the combination {aaa, ddd} matches even though that combination never existed.
Flattening thus lets searches retrieve non-existent combinations. To solve this, use the nested type whenever an array holds objects (an array of plain values does not need it); see the sketch below.
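A minimal sketch of the fix, against a hypothetical index (here called test_index) where user is mapped as nested: a nested query requires name and addr to match inside the same object, so the non-existent {aaa, ddd} combination from the example above correctly returns no hits.

GET test_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.name": "aaa" } },
            { "match": { "user.addr": "ddd" } }
          ]
        }
      }
    }
  }
}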
13, Goods on the shelf
1. Basic ideas
The admin system passes a spuId to the back end. The back end looks up all skus for that spuId in the MySQL database and uploads them to ES, which involves converting the PO classes into the ES model.
2. Batch-query whether skus have stock
// The skus share the same specification parameters, so query those in advance, only once
/**
 * Query whether the skus have stock
 * Returns skuId and stock
 */
@PostMapping("/hasstock")
public R getSkuHasStock(@RequestBody List<Long> skuIds) {
    List<SkuHasStockVo> vos = wareSkuService.getSkuHasStock(skuIds);
    return R.ok().setData(vos);
}
3. Batch-upload skuEsModels to ES
/**
 * Put goods on the shelves
 */
@PostMapping("/product") // ElasticSaveController
public R productStatusUp(@RequestBody List<SkuEsModel> skuEsModels) {
    boolean status;
    try {
        status = productSaveService.productStatusUp(skuEsModels);
    } catch (IOException e) {
        log.error("ElasticSaveController error while listing goods: {}", e);
        return R.error(BizCodeEnum.PRODUCT_UP_EXCEPTION.getCode(), BizCodeEnum.PRODUCT_UP_EXCEPTION.getMsg());
    }
    // status holds "hasFailures": false means every item was saved
    if (!status) {
        return R.ok();
    }
    return R.error(BizCodeEnum.PRODUCT_UP_EXCEPTION.getCode(), BizCodeEnum.PRODUCT_UP_EXCEPTION.getMsg());
}

public boolean productStatusUp(List<SkuEsModel> skuEsModels) throws IOException {
    // 1. Build a bulk request against the ES index "product"
    BulkRequest bulkRequest = new BulkRequest();
    // 2. Construct one save request per model
    for (SkuEsModel esModel : skuEsModels) {
        IndexRequest indexRequest = new IndexRequest(EsConstant.PRODUCT_INDEX);
        // Use the skuId as the document id
        indexRequest.id(esModel.getSkuId().toString());
        String jsonString = JSON.toJSONString(esModel);
        indexRequest.source(jsonString, XContentType.JSON);
        bulkRequest.add(indexRequest);
    }
    // Bulk batch save
    BulkResponse bulk = client.bulk(bulkRequest, GuliESConfig.COMMON_OPTIONS);
    // TODO: handle errors
    boolean hasFailures = bulk.hasFailures();
    if (hasFailures) {
        List<String> collect = Arrays.stream(bulk.getItems()).map(item -> item.getId()).collect(Collectors.toList());
        log.error("Error listing products: {}", collect);
    }
    return hasFailures;
}
4. Package the shelf data according to spuId
That is, assemble the ES models for the given spuId.
// SpuInfoServiceImpl
public void upSpuForSearch(Long spuId) {
    // 1. Find all sku information for the current spuId, plus the brand name
    List<SkuInfoEntity> skuInfoEntities = skuInfoService.getSkusBySpuId(spuId);

    // TODO 4. Of the current spu's specification attributes, find the searchable ones
    List<ProductAttrValueEntity> productAttrValueEntities = productAttrValueService.list(
            new QueryWrapper<ProductAttrValueEntity>().eq("spu_id", spuId));
    List<Long> attrIds = productAttrValueEntities.stream()
            .map(attr -> attr.getAttrId())
            .collect(Collectors.toList());
    // Queries the database by id; the SQL is shown below
    List<Long> searchIds = attrService.selectSearchAttrIds(attrIds);
    Set<Long> ids = new HashSet<>(searchIds);
    List<SkuEsModel.Attr> searchAttrs = productAttrValueEntities.stream()
            .filter(entity -> ids.contains(entity.getAttrId()))
            .map(entity -> {
                SkuEsModel.Attr attr = new SkuEsModel.Attr();
                BeanUtils.copyProperties(entity, attr);
                return attr;
            }).collect(Collectors.toList());

    // TODO 1. Remote-call the inventory system to check stock
    Map<Long, Boolean> stockMap = null;
    try {
        List<Long> longList = skuInfoEntities.stream().map(SkuInfoEntity::getSkuId).collect(Collectors.toList());
        List<SkuHasStockVo> skuHasStocks = wareFeignService.getSkuHasStocks(longList);
        stockMap = skuHasStocks.stream()
                .collect(Collectors.toMap(SkuHasStockVo::getSkuId, SkuHasStockVo::getHasStock));
    } catch (Exception e) {
        log.error("Remote call to inventory service failed, reason: {}", e);
    }

    // 2. Assemble the information for each sku
    Map<Long, Boolean> finalStockMap = stockMap;
    List<SkuEsModel> skuEsModels = skuInfoEntities.stream().map(sku -> {
        SkuEsModel skuEsModel = new SkuEsModel();
        BeanUtils.copyProperties(sku, skuEsModel);
        skuEsModel.setSkuPrice(sku.getPrice());
        skuEsModel.setSkuImg(sku.getSkuDefaultImg());
        // TODO 2. Hot score, initially 0
        skuEsModel.setHotScore(0L);
        // TODO 3. Query brand and category name information
        BrandEntity brandEntity = brandService.getById(sku.getBrandId());
        skuEsModel.setBrandName(brandEntity.getName());
        skuEsModel.setBrandImg(brandEntity.getLogo());
        CategoryEntity categoryEntity = categoryService.getById(sku.getCatalogId());
        skuEsModel.setCatalogName(categoryEntity.getName());
        // Set the searchable attributes
        skuEsModel.setAttrs(searchAttrs);
        // Set whether there is stock
        skuEsModel.setHasStock(finalStockMap == null ? false : finalStockMap.get(sku.getSkuId()));
        return skuEsModel;
    }).collect(Collectors.toList());

    // TODO 5. Send the data to es (gulimall-search) for saving
    R r = searchFeignService.saveProductAsIndices(skuEsModels);
    if (r.getCode() == 0) {
        this.baseMapper.upSpuStatus(spuId, ProductConstant.ProductStatusEnum.SPU_UP.getCode());
    } else {
        log.error("Remote es save of the goods failed");
    }
}
The persistence-layer SQL behind selectSearchAttrIds:
<resultMap type="com.atguigu.gulimall.product.entity.AttrEntity" id="attrMap">
    <result property="attrId" column="attr_id"/>
    <result property="attrName" column="attr_name"/>
    <result property="searchType" column="search_type"/>
    <result property="valueType" column="value_type"/>
    <result property="icon" column="icon"/>
    <result property="valueSelect" column="value_select"/>
    <result property="attrType" column="attr_type"/>
    <result property="enable" column="enable"/>
    <result property="catelogId" column="catelog_id"/>
    <result property="showDesc" column="show_desc"/>
</resultMap>
<!-- The resultMap maps database column names onto the returned PO.
     Long (unlike long) can hold null. -->

<select id="selectSearchAttrIds" resultType="java.lang.Long">
    SELECT attr_id FROM `pms_attr` WHERE attr_id IN
    <foreach collection="attrIds" item="id" separator="," open="(" close=")">
        #{id}
    </foreach>
    AND search_type = 1
</select>
<!-- attrIds is a collection, traversed in the XML with foreach.
     open/close are the symbols added at the beginning and end (the parentheses
     required after IN), and separator is the separator between elements. -->
Find all sku information for the spu, then check stock per sku and set the boolean hasStock accordingly (one of the differences between the ES model and the PO).
Upload the assembled ES models by calling the service from step 3.
5. A neat idiom
When you want to apply the same operation to every element of a collection, you can write:
List<CategoryEntity> newCategoryEntities = categoryEntities.stream()
        .filter(categoryEntity -> categoryEntity.getEntityNum() == 1)
        .map(categoryEntity -> {
            categoryEntity.setSize(100L);
            return categoryEntity;
        })
        .collect(Collectors.toList());
Every element of the collection is processed with lambda expressions and then collected back into a new list.
14, Nginx
1. Brief introduction
Forward proxy: I want to reach Google but cannot, so I ask another server to relay my request to Google; that server acts as a forward proxy on my behalf.
Reverse proxy: I visit Google, and Google hands my request over to Baidu behind the scenes; Google is reverse-proxying my request (I never set out to visit Baidu).
2. Logic of nginx + gateway
In effect, nginx shields the intranet ips and exposes only its own ip. Once a domain name exists, it maps to nginx's address.
The flow to implement: the local browser requests gulimall.com. With the hosts file configured, entering gulimall.com in the browser is resolved (as a DNS lookup would be) to 192.168.56.10, so the request does not hit the Java service directly; it reaches nginx first. Put differently, if the project goes live one day, gulimall.com should resolve to nginx's ip, and users talk to nginx.
After the request reaches nginx,
If the path matches /static/, nginx finds the static resource on its own server and returns it directly.
If it is not a static resource (the / location is configured after /static/*, so it has lower priority), nginx forwards the request upstream to 192.168.56.1:88, which is the gateway.
(During the upstream forwarding, remember to configure proxy_set_header Host $host;.)
At the gateway, predicates on the url decide which microservice registered in nacos the request is forwarded to (the url can also be rewritten before handing it on), and the response comes back.
3. Analysis of nginx configuration file
nginx.conf:
Global block: directives that affect nginx globally, such as the user and group, the pid file path of the nginx process, log paths, included configuration files, the number of worker processes allowed, and so on.
events block: settings that affect the network connections between the nginx server and its users. Common settings: whether to serialize accepts across multiple worker processes, whether a worker process may accept several connections at once, which event-driven model is used to handle connection requests, and the maximum number of simultaneous connections each worker process supports.
http block:
http global block: its directives include file imports, MIME-TYPE definitions, log customization, connection timeouts, the maximum number of requests per connection, error pages, and so on.
Server block: closely tied to virtual hosts. From the user's point of view, a virtual host is indistinguishable from an independent physical host. Each http block can contain multiple server blocks, and each server block amounts to one virtual host.
location blocks: configure request routing and the handling of various pages (a server block usually contains several location blocks).
4. Nginx + gateway configuration
Modify the local hosts file to map gulimall.com to 192.168.56.10 (i.e. add the line 192.168.56.10 gulimall.com). Turn off the firewall.
At this point, you can request index.html on the homepage of nginx by visiting gulimall.com
To make nginx reverse-proxy to port 10000 on the local machine, we mainly modify the server block:
server {
    listen       80;
    server_name  gulimall.com;

    #charset koi8-r;
    #access_log  /var/log/nginx/log/host.access.log  main;

    location / {
        proxy_pass http://192.168.56.1:10000;
    }
}
listen is the port to listen on; server_name is the domain name to match. Because of the hosts mapping, gulimall.com actually resolves to the host machine's ip.
location configures the ip + port to forward to.
Modify nginx/conf/nginx.conf to map upstream to our gateway service
upstream gulimall {
    # 88 is the gateway
    server 192.168.56.1:88;
}
Modify nginx/conf/conf.d/gulimall.conf: when a request for gulimall.com arrives with path /, pass it to the upstream above. Because nginx drops the Host header when forwarding, the gateway would not know the original host, so we add the header back:
location / {
    proxy_pass http://gulimall;
    proxy_set_header Host $host;
}
Gateway Routing and forwarding configuration
Configure the gateway route that forwards the domain **.gulimall.com to the product service. Note that the gateway matches routes in order of priority, so this broad rule should be placed after the more specific ones:
- id: gulimall_host_route
  uri: lb://gulimall-product
  predicates:
    - Host=**.gulimall.com
In short, nginx plays a role similar to a gateway: it mainly hides the real hosts. External requests target the domain name, the domain name maps to nginx's address, and nginx forwards the user's request to the real servers.