ElasticSearch getting started notes

  • After downloading and installing Elasticsearch, open cmd in the installation directory and start it. On Windows, run

    bin\elasticsearch.bat
    

    On Linux or other systems, run

    bin/elasticsearch
    
  • If curl is installed, you can test whether ES started successfully by running the following command in another cmd window (later requests in these notes are sent from Kibana's Dev Tools instead)

    curl "http://localhost:9200/?pretty"
    

    If curl is not installed, you can open the same URL directly in a browser. On success, a JSON response similar to the following is returned

    {
      "name": "PC-20160716OWUU",
      "cluster_name": "elasticsearch",
      "cluster_uuid": "xmZT-FzjTcqALxfoopscww",
      "version": {
        "number": "7.12.0",
        "build_flavor": "default",
        "build_type": "zip",
        "build_hash": "78722783c38caa25a70982b5b042074cde5d3b3a",
        "build_date": "2021-03-18T06:17:15.410153305Z",
        "build_snapshot": false,
        "lucene_version": "8.8.0",
        "minimum_wire_compatibility_version": "6.8.0",
        "minimum_index_compatibility_version": "6.0.0-beta1"
      },
      "tagline": "You Know, for Search"
    }
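
The same check can be scripted. The following is a minimal Python sketch using only the standard library. It assumes a node listening on http://localhost:9200 as above; the helper names parse_root_info and fetch_root_info are made up for illustration.

```python
import json
import urllib.request


def parse_root_info(body):
    """Pull the cluster name and version number out of the root response."""
    info = json.loads(body)
    return info["cluster_name"], info["version"]["number"]


def fetch_root_info(base_url="http://localhost:9200"):
    """Request the root endpoint; needs a running node (assumed address)."""
    with urllib.request.urlopen(base_url + "/?pretty") as resp:
        return parse_root_info(resp.read().decode("utf-8"))


# Trimmed copy of the response shown above, written as valid JSON.
sample = '{"cluster_name": "elasticsearch", "version": {"number": "7.12.0"}}'
print(parse_root_info(sample))
```

Calling fetch_root_info() against a live node should return the same pair of values for that node; without a running node it raises a connection error.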
    
  • By default, Elastic only allows local access. If remote access is required, edit the config/elasticsearch.yml file in the Elastic installation directory: uncomment network.host, change its value to 0.0.0.0, and then restart Elastic.

    network.host: 0.0.0.0
    

Install Kibana

  • Kibana is an open source analysis and visualization platform designed to work with Elasticsearch. The Sense plug-in no longer needs to be installed because Dev Tools has replaced it, so on Windows you can simply run kibana.bat. The same is true for Linux.

  • If you cannot open the Dev Tools page, download and run the matching ES version as the page prompts

  • The console UI is divided into two panes: the editor pane (left) and the response pane (right). Use the editor to type the request and submit it to Elasticsearch. The results are displayed in the response pane on the right.

  • As you type a request, the console makes suggestions that you can accept by pressing Enter or Tab. These suggestions are based on the request structure and on your indexes and types.

Use ES

I Basic concepts

  • The following is based on Elastic 6.0
  • Elastic is essentially a distributed database, which allows multiple servers to work together, and each server can run multiple Elastic instances.
  • A single Elastic instance is called a node. A group of nodes forms a cluster.
  • Elastic indexes all fields and, after processing, writes them to an inverted index. Searches then look up data directly in that index.

Index

  • The top-level unit of Elastic data management is called Index. It is synonymous with a single database. The name of each Index (i.e. database) must be lowercase.

  • The following command (based on ES 6.0) lists all Indexes on the current node. You can also send the equivalent request with cURL.

    GET _cat/indices?v
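
When the same endpoint is called outside Kibana, the reply is a plain-text table with a header row. The sketch below shows one way to turn that table into Python dictionaries. The sample row is invented for illustration (the uuid and sizes do not come from a real cluster), and parse_cat_indices is a hypothetical helper name.

```python
def parse_cat_indices(text):
    """Turn the whitespace-separated table from _cat/indices?v into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Hypothetical output; the column names follow the ?v header row.
sample = """\
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   weather AbCdEfGhIjKlMnOpQrStUv   5   1          0            0      1.1kb          1.1kb
"""

for row in parse_cat_indices(sample):
    print(row["index"], row["health"], row["docs.count"])
```

Splitting on whitespace is enough when every cell is filled in; rows with empty cells would need more careful handling.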
    

Document

  • A single record in the Index is called a Document. Many documents form an Index.

  • Document is expressed in JSON format. Here is an example.

    {
      "user": "Zhang San",
      "title": "engineer",
      "desc": "Database management"
    }
    

Type

  • Documents can be grouped. For example, in the weather Index, they can be grouped by city (Beijing and Shanghai) or by climate (sunny and rainy days). This grouping is called Type. It is a virtual logical grouping used to filter Documents.

  • Different Types should have similar schemas. For example, the id field cannot be a string in one group and a number in another. This is one difference from tables in a relational database. Data with completely different properties (such as products and logs) should be stored as two Indexes instead of two Types in one Index (although that can be done).

  • The following command lists the types contained in each Index.

    GET _mapping?pretty=true
    

II Create and delete Index

  • When you create a new Index, you can send a PUT request directly to the Elastic server. The following example is to create a new Index named weather.

    PUT weather
    
  • The server returns a JSON object in which the acknowledged field indicates that the operation is successful.

    {
      "acknowledged": true,
      "shards_acknowledged": true,
      "index": "weather"
    }
    
  • Then, we issue a DELETE request to delete the Index.

    DELETE weather
    
  • The acknowledged field is returned after success.

    {
      "acknowledged": true
    }
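
Outside the Kibana console, the same two operations are plain HTTP requests. The following sketch builds them with Python's standard library without sending anything. The base URL is the assumed local node from earlier, and index_request and is_acknowledged are hypothetical helper names.

```python
import json
import urllib.request


def index_request(method, index, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT or DELETE request for an index."""
    if method not in ("PUT", "DELETE"):
        raise ValueError("expected PUT or DELETE")
    return urllib.request.Request(f"{base_url}/{index}", method=method)


def is_acknowledged(body):
    """True when the server's JSON reply carries "acknowledged": true."""
    return json.loads(body).get("acknowledged") is True


create = index_request("PUT", "weather")
delete = index_request("DELETE", "weather")
print(create.get_method(), create.full_url)
print(delete.get_method(), delete.full_url)
print(is_acknowledged('{"acknowledged": true}'))
# To actually send one: urllib.request.urlopen(create) against a running node.
```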
    

III Chinese word segmentation settings

  • First, install a Chinese word segmentation plug-in. ik is used here; you can also consider other plug-ins (such as smartcn).

    ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.0.0/elasticsearch-analysis-ik-6.0.0.zip
    
  • The above code installs the plug-in version 6.0.0, which is used with Elastic 6.0.0.

    Then, restart Elastic and the newly installed plug-in will be loaded automatically.

  • Then, create a new Index and specify the fields that need word segmentation. This step varies according to the data structure, and the following commands are only for this article. Basically, all Chinese fields that need to be searched should be set separately.

    PUT /accounts
    {
      "mappings": {
        "person": {
          "properties": {
            "user": {
              "type": "text",
              "analyzer": "ik_max_word",
              "search_analyzer": "ik_max_word"
            },
            "title": {
              "type": "text",
              "analyzer": "ik_max_word",
              "search_analyzer": "ik_max_word"
            },
            "desc": {
              "type": "text",
              "analyzer": "ik_max_word",
              "search_analyzer": "ik_max_word"
            }
          }
        }
      }
    }
    

    Returned results

    {
      "acknowledged": true,
      "shards_acknowledged": true,
      "index": "accounts"
    }
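
Because every field in the mapping repeats the same three settings, the request body can be generated instead of typed by hand. This is only a sketch: ik_mapping is a hypothetical helper, and it reproduces the 6.0-style body used above, with the Type name (person) nested under mappings.

```python
import json


def ik_mapping(doc_type, fields, analyzer="ik_max_word"):
    """Build a create-index body giving every listed field the same analyzer."""
    field_spec = {
        "type": "text",
        "analyzer": analyzer,
        "search_analyzer": analyzer,
    }
    # Copy the spec per field so later edits to one field do not leak to others.
    properties = {name: dict(field_spec) for name in fields}
    return {"mappings": {doc_type: {"properties": properties}}}


body = ik_mapping("person", ["user", "title", "desc"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after PUT /accounts in the console. Later major versions changed how Types are handled in mappings, so check the documentation for the version you run before reusing this shape.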
    

IV Data operation

Add record

  • Send a PUT request to the specified /Index/Type to add a new record in the Index. For example, by sending a request to /accounts/person, you can add a personnel record.

    PUT accounts/person/1
    {
      "user": "Zhang San",
      "title": "engineer",
      "desc": "Database management"
    }
    
  • The JSON object returned by the server will give information such as Index, Type, Id, Version, etc.

    {
      "_index": "accounts",
      "_type": "person",
      "_id": "1",
      "_version": 1,
      "result": "created",
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "_seq_no": 0,
      "_primary_term": 1
    }
    
  • If you look closely, you will find that the request path is /accounts/person/1, and the final 1 is the Id of the record. It does not have to be a number; any string (such as abc) can be used.

  • You can also add a record without specifying the Id. In that case, use a POST request instead.

    POST accounts/person
    {
      "user": "Li Si",
      "title": "engineer",
      "desc": "system management"
    }
    

    The above code sends a POST request to /accounts/person to add a record. This time, the _id field in the JSON object returned by the server is a random string.

    {
      "_index": "accounts",
      "_type": "person",
      "_id": "JOMdEnkBNi6vmKXZ1RkS",
      "_version": 1,
      "result": "created",
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "_seq_no": 0,
      "_primary_term": 1
    }
    

    Note that if you execute the above command without first creating the Index (accounts in this example), Elastic will not report an error but will create the specified Index directly. Therefore, type carefully and do not misspell the Index name.
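
The PUT-with-Id versus POST-without-Id rule described in this section can be summarized in a few lines of Python. The function name index_doc_target is made up for this sketch; it only computes the method and path and does not contact a server.

```python
def index_doc_target(index, doc_type, doc_id=None):
    """Return the (HTTP method, path) pair for adding a record."""
    if doc_id is None:
        # No Id given: POST and let the server generate one.
        return "POST", f"/{index}/{doc_type}"
    # Id given: PUT to the full /Index/Type/Id path.
    return "PUT", f"/{index}/{doc_type}/{doc_id}"


print(index_doc_target("accounts", "person", 1))
print(index_doc_target("accounts", "person", "abc"))
print(index_doc_target("accounts", "person"))
```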

View records

  • You can view this record by issuing a GET request to /Index/Type/Id

    GET accounts/person/1?pretty=true
    

    The above code requests the record /accounts/person/1. The URL parameter pretty=true asks for the result in an easy-to-read format.

  • In the returned data, the found field indicates that the query succeeded, and the _source field returns the original record.

    {
      "_index": "accounts",
      "_type": "person",
      "_id": "1",
      "_version": 1,
      "found": true,
      "_source": {
        "user": "Zhang San",
        "title": "engineer",
        "desc": "Database management"
      }
    }
    
  • If the Id is incorrect, the data cannot be found, and the found field is false.

    GET accounts/person/abc?pretty=true
    
    {
      "_index": "accounts",
      "_type": "person",
      "_id": "abc",
      "found": false
    }
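
Client code usually branches on the found field before touching _source. A minimal sketch of that check follows; source_or_none is a hypothetical helper, and the two sample replies are trimmed versions of the ones shown above.

```python
import json


def source_or_none(body):
    """Return the stored record from a GET reply, or None when not found."""
    reply = json.loads(body)
    if not reply.get("found", False):
        return None
    return reply["_source"]


hit = '{"_id": "1", "found": true, "_source": {"user": "Zhang San"}}'
miss = '{"_id": "abc", "found": false}'
print(source_or_none(hit))
print(source_or_none(miss))
```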
    

Delete record

  • To delete a record, issue a DELETE request.

    DELETE accounts/person/1
    

    Don't delete this record here. It will be used later.

Update record

  • To update the record is to use the PUT request to resend the data.

    PUT accounts/person/1
    {
      "user": "Zhang San",
      "title": "engineer",
      "desc": "Database management,software development"
    }
    

    Return results

    {
      "_index": "accounts",
      "_type": "person",
      "_id": "1",
      "_version": 2,
      "result": "updated",
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "_seq_no": 1,
      "_primary_term": 1
    }
    

    In the above code, we changed the desc field from "Database management" to "Database management,software development". In the returned result, several fields have changed.

    "_version" : 2,
    "result" : "updated",
    "_seq_no" : 1
    

    You can see that the Id of the record has not changed, but the version has changed from 1 to 2, the operation type (result) has changed from created to updated, and _seq_no has changed from 0 to 1, because this is an update rather than a new record.
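
To see exactly which fields differ between the reply to the first PUT and the reply to the update, the two JSON objects can be compared key by key. The values below are copied from the two replies in this article and trimmed to a few keys; changed_fields is a hypothetical helper name.

```python
def changed_fields(before, after):
    """List the top-level keys whose values differ between two replies."""
    return sorted(k for k in after if before.get(k) != after[k])


created = {"_id": "1", "_version": 1, "result": "created", "_seq_no": 0}
updated = {"_id": "1", "_version": 2, "result": "updated", "_seq_no": 1}
print(changed_fields(created, updated))
```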

V Query data

Return all records

  • Send a GET request directly to /Index/Type/_search and all records will be returned.

    GET accounts/person/_search
    

    Return results

    {
      "took": 14,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 1,
            "_source": {
              "user": "Zhang San",
              "title": "engineer",
              "desc": "Database management,software development"
            }
          },
          {
            "_index": "accounts",
            "_type": "person",
            "_id": "JOMdEnkBNi6vmKXZ1RkS",
            "_score": 1,
            "_source": {
              "user": "Li Si",
              "title": "engineer",
              "desc": "system management"
            }
          }
        ]
      }
    }
    

    In the above code, the took field of the returned result gives the time the operation took (in milliseconds), the timed_out field indicates whether it timed out, and the hits field holds the matched records. The meaning of its subfields is as follows.

  • total: the number of matching records. In this example, there are 2 records.
  • max_score: the highest matching degree. This example is 1.
  • hits: an array of returned records.
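
The fields described above can be pulled out of a search reply with a few lines of Python. This is a sketch only: summarize_hits is a hypothetical helper, and the branch for an object-valued total is an assumption covering newer versions, which report hits.total as an object with a value field rather than a plain number.

```python
def summarize_hits(reply):
    """Return (took ms, total hits, list of matched Ids) from a search reply."""
    total = reply["hits"]["total"]
    if isinstance(total, dict):
        # Assumption: newer versions wrap the count as {"value": n, ...}.
        total = total["value"]
    ids = [hit["_id"] for hit in reply["hits"]["hits"]]
    return reply["took"], total, ids


# Trimmed copy of the reply shown in this section.
reply = {
    "took": 14,
    "timed_out": False,
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [{"_id": "1"}, {"_id": "JOMdEnkBNi6vmKXZ1RkS"}],
    },
}
print(summarize_hits(reply))
```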

Full text search

  • Elastic queries are unusual: they use their own query syntax, which requires the GET request to carry a data body.

    GET accounts/person/_search
    {
      "query" : { "match" : { "desc" : "Software" }}
    }
    

    The above code uses a match query; the matching condition is that the desc field contains the word "software". The returned results are as follows.

    {
      "took": 23,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "user": "Zhang San",
              "title": "engineer",
              "desc": "Database management,software development"
            }
          }
        ]
      }
    }
    
  • Elastic returns 10 results at a time by default. You can change this setting through the size field.

    GET accounts/person/_search
    {
      "query" : { "match" : { "desc" : "management" }},
      "size": 1
    }
    

    The above code specifies that only one result is returned at a time.

  • You can also specify the offset of the first result through the from field.

    GET accounts/person/_search
    {
      "query" : { "match" : { "desc" : "management" }},
      "from": 1,
      "size": 1
    }
    

    The above code specifies that only one result will be returned from position 1 (the default is from position 0).
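
The from and size fields together give simple pagination: from is the page number multiplied by the page size. The sketch below builds such a body; page_query and its zero-based page parameter are illustrative choices, not part of Elastic itself.

```python
def page_query(field, text, page, page_size=10):
    """Build a match query body for a zero-based page number."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    return {
        "query": {"match": {field: text}},
        "from": page * page_size,  # offset of the first result on this page
        "size": page_size,
    }


print(page_query("desc", "management", page=1, page_size=1))
```

With page=1 and page_size=1 the body asks for one result starting at position 1, the same request as in the example above.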

Logical operation

  • If there are multiple search keywords, Elastic treats them as an OR relationship.

    GET accounts/person/_search
    {
      "query" : { "match" : { "desc" : "Software system" }}
    }
    

    The returned results are as follows

    {
      "took": 9,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "user": "Zhang San",
              "title": "engineer",
              "desc": "Database management,software development"
            }
          },
          {
            "_index": "accounts",
            "_type": "person",
            "_id": "JOMdEnkBNi6vmKXZ1RkS",
            "_score": 0.2876821,
            "_source": {
              "user": "Li Si",
              "title": "engineer",
              "desc": "system management"
            }
          }
        ]
      }
    }
    

    The code above searches for records whose desc field contains "software" or "system".

  • If you want to perform an AND search for multiple keywords, you must use a Boolean query.

    GET accounts/person/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "desc": "Software" } },
            { "match": { "desc": "system" } }
          ]
        }
      }
    }
    

    The result is as follows

    {
      "took": 6,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
      }
    }
    

Keywords: Windows ElasticSearch

Added by abigbluewhale on Sat, 19 Feb 2022 14:13:09 +0200