In-depth Elasticsearch: index creation

To study Elasticsearch and Lucene in depth, this article starts with the Elasticsearch index-creation process and runs a small experiment on two important APIs, Refresh and Flush, in order to understand how Elasticsearch indexes a document after receiving it. After understanding the whole process, we will go on to study the in-memory data structures and on-disk file format of a Lucene index. Note: Elasticsearch and Kibana 7.10.2 are used throughout.

1. Experimental environment

1.1 index creation

Create an index named lucene-learning with one shard and no replicas, and effectively disable automatic refresh and flush (refresh_interval is set to -1, and the translog sync interval is stretched to one hour), so that nothing happens unless we trigger it manually.

PUT lucene-learning
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "refresh_interval": -1,
      "translog": {
        "sync_interval": "3600s"
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      }
    }
  }
}

1.2 document directory

Looking at the Elasticsearch data directory, we can see that the index UUID and the folder name correspond. Because the lucene-learning index has only one shard, shard 0, there is only one folder named 0, which contains two directories: index and translog. The former corresponds to a Lucene index (an Elasticsearch shard is exactly one Lucene index); this folder is managed entirely by Lucene, and Elasticsearch never writes files here directly, since all interaction goes through the Lucene API. The latter is the folder where the Elasticsearch translog is stored. At this point neither directory has any real content; we will inspect them later.

GET _cat/indices?v
health status index           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   lucene-learning yGvjXskJQlql1-b2Sx2MHA   1   0          0            0       208b           208b

$ pwd
 elasticsearch-7.10.2/data/nodes/0

$ tree -I '_state|5Uwl-yHUTjiCMn9XrZgM-g' -L 5
.
├── indices
│   └── yGvjXskJQlql1-b2Sx2MHA
│       └── 0
│           ├── index
│           │   ├── segments_2
│           │   └── write.lock
│           └── translog
│               ├── translog-2.tlog
│               └── translog.ckp
└── node.lock

$ cat index/segments_2
?�segments
translog_uuidh8Y1UqnyT9ePnvkI4mMQ3Qlocal_checkpoint-1
                                                     history_uuidvic9U3IUSvWWP0rDi-fEpA
max_seq_no-1max_unsafe_auto_id_timestamp-1�(��38kZ

$ cat translog/translog-2.tlog
?�translogh8Y1UqnyT9ePnvkI4mMQ3Qy�roW

Note: Elasticsearch does some internal processing of its own, which is why the generation of the Lucene segments file is already 2. This generation number does not appear to be directly related to the numbering of the actual segments.

2. Experimental design

2.1 steps

The steps of the whole experiment are as follows:

  1. Index a document
  2. The document is written to the in-memory buffer and the translog at the same time
  3. Manually execute refresh
  4. Elasticsearch creates segment 0
  5. Thanks to Lucene's near-real-time search, the document is now searchable
  6. Index another document and refresh again
  7. Segment 1 is created, and the new request is appended to the translog file
  8. Manually execute flush
  9. Segments 0 and 1 are fsync'ed to disk, the commit point in the segments file is updated, and the translog is cleared
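The steps above can be sketched as a toy model of the write path: an in-memory buffer, a translog acting as a write-ahead log, refresh turning the buffer into a searchable segment, and flush committing segments and clearing the translog. The class and method names below are illustrative only, not Elasticsearch internals.

```python
# Toy model of the Elasticsearch/Lucene write path. Refresh makes documents
# searchable; flush makes them durable and truncates the translog.

class Shard:
    def __init__(self):
        self.buffer = []        # in-memory indexing buffer
        self.translog = []      # write-ahead log of raw requests
        self.segments = []      # searchable (but possibly uncommitted) segments
        self.commit_point = []  # segments recorded by the last flush/commit

    def index(self, doc):
        # A new document goes to the buffer AND the translog at the same time.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Refresh turns the buffer into a new searchable segment,
        # but does NOT fsync it or update the commit point.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Flush fsyncs all segments, updates the commit point,
        # and clears the translog.
        self.refresh()
        self.commit_point = list(self.segments)
        self.translog.clear()

    def search(self, pred):
        # Only refreshed segments are visible to search.
        return [d for seg in self.segments for d in seg if pred(d)]


shard = Shard()
shard.index({"name": "Allen Hank", "age": 30})
assert shard.search(lambda d: d["age"] == 30) == []   # not yet refreshed
shard.refresh()                                       # "segment 0" created
assert len(shard.search(lambda d: d["age"] == 30)) == 1
shard.index({"name": "Tom Hank", "age": 25})
shard.refresh()                                       # "segment 1" created
assert len(shard.translog) == 2                       # translog keeps growing
shard.flush()                                         # commit + clear translog
assert shard.translog == [] and len(shard.commit_point) == 2
```

The key distinction the model captures is that refresh controls search visibility while flush controls durability; the two are independent, which is exactly what the experiment below demonstrates.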

2.2 interpretation

For interpretation, please refer to part 3 (the experimental process) together with the article Guide to Refresh and Flush Operations in Elasticsearch.

3. Experimental process

3.1 index document 1

Now let's index a document:

PUT lucene-learning/_doc/1
{
  "name": "Allen Hank",
  "age": 30
}

Even though no refresh has happened yet, by design Elasticsearch has already written the request to both the in-memory buffer and the translog. Otherwise, in the event of a crash, every request still in the buffer would be lost; by writing to the translog as soon as a request arrives, only the requests not yet persisted to the translog file can be lost. At this point the segments file is unchanged, but some empty Lucene files have appeared.
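The durability argument can be demonstrated with a minimal write-ahead-log sketch. The file name and the replay helper below are illustrative, not Elasticsearch's actual translog format:

```python
import json
import os
import tempfile

# Minimal write-ahead log: every request is appended (and fsync'ed) to a
# translog file before being held only in memory. After a simulated crash,
# the in-memory buffer is gone, but replaying the translog recovers it.

def append_to_translog(path, doc):
    with open(path, "a") as f:
        f.write(json.dumps(doc) + "\n")
        f.flush()
        os.fsync(f.fileno())  # force the entry onto disk

def replay_translog(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

translog = os.path.join(tempfile.mkdtemp(), "translog-2.tlog")
buffer = []

doc = {"name": "Allen Hank", "age": 30}
buffer.append(doc)                 # in-memory buffer (lost on crash)
append_to_translog(translog, doc)  # write-ahead log (survives crash)

buffer = []                        # simulate a crash: memory is wiped
recovered = replay_translog(translog)
assert recovered == [{"name": "Allen Hank", "age": 30}]
```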

$ pwd
 elasticsearch-7.10.2/data/nodes/0/indices/yGvjXskJQlql1-b2Sx2MHA/0
$ tree -I '_state'
.
├── index
│   ├── _0.fdm
│   ├── _0.fdt
│   ├── _0_Lucene85FieldsIndex-doc_ids_0.tmp
│   ├── _0_Lucene85FieldsIndexfile_pointers_1.tmp
│   ├── segments_2
│   └── write.lock
└── translog
    ├── translog-2.tlog
    └── translog.ckp

$ ll index
-rw-r--r--  1 daichen  staff     0B May 27 15:02 _0.fdm
-rw-r--r--  1 daichen  staff     0B May 27 15:02 _0.fdt
-rw-r--r--  1 daichen  staff     0B May 27 15:02 _0_Lucene85FieldsIndex-doc_ids_0.tmp
-rw-r--r--  1 daichen  staff     0B May 27 15:02 _0_Lucene85FieldsIndexfile_pointers_1.tmp
-rw-r--r--  1 daichen  staff   208B May 27 14:53 segments_2
-rw-r--r--  1 daichen  staff     0B May 27 14:53 write.lock

$ cat translog/translog-2.tlog
?�translogPpqSibBGQcK_Cowcf0190Q�$W
1_doc({
  "name": "Allen Hank",
  "age": 30
}
��������PQ�

[Optional] If you search now, the document indexed above cannot be found yet (note that performing this search may affect the refresh observation in the next step).

POST lucene-learning/_search
{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

After manually executing refresh, you can see that Lucene has merged the various files into two compound files, cfe and cfs (see the appendix for their meaning), which means segment 0 has been created. But it has not been committed yet, which is why the segments_N file has not changed. (If a search was performed earlier, Elasticsearch appears to flush automatically.)

POST lucene-learning/_refresh
{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  }
}

$ tree -I '_state'
.
├── index
│   ├── _0.cfe
│   ├── _0.cfs
│   ├── _0.si
│   ├── segments_2
│   └── write.lock
└── translog
    ├── translog-2.tlog
    └── translog.ckp

3.2 index document 2

Index a new document.

PUT lucene-learning/_doc/2
{
  "name": "Tom Hank",
  "age": 25
}

$ tree -I '_state'
.
├── index
│   ├── _0.cfe
│   ├── _0.cfs
│   ├── _0.si
│   ├── _1.fdm
│   ├── _1.fdt
│   ├── _1_Lucene85FieldsIndex-doc_ids_2.tmp
│   ├── _1_Lucene85FieldsIndexfile_pointers_3.tmp
│   ├── segments_2
│   └── write.lock
└── translog
    ├── translog-2.tlog
    └── translog.ckp

Manually execute refresh to get segment 1:

POST lucene-learning/_refresh
{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  }
}

$ tree -I '_state'
.
├── index
│   ├── _0.cfe
│   ├── _0.cfs
│   ├── _0.si
│   ├── _1.cfe
│   ├── _1.cfs
│   ├── _1.si
│   ├── segments_2
│   └── write.lock
└── translog
    ├── translog-2.tlog
    └── translog.ckp

$ cat translog/translog-2.tlog
?�translogh8Y1UqnyT9ePnvkI4mMQ3Qy�roW
1_doc({
  "name": "Allen Hank",
  "age": 30
}
��������PQ�U
2_doc&{
  "name": "Tom Hank",
  "age": 25
}
����������X�

3.3 commit

After flush executes, the segment data files can be considered truly on disk. You can see that the segments file generation has changed to 3, and the commit point now contains segments 0 and 1. Note that because of the operating system's own page cache, the files seen in the previous steps may not actually have been persisted to disk; for details, look up the relationship between the write system call and fsync.
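The write/fsync distinction mentioned above can be shown with plain system calls. This is a generic sketch of the pattern a commit follows, not Lucene's actual code; the file name is made up:

```python
import os
import tempfile

# os.write() hands data to the OS page cache; os.fsync() forces it to the
# storage device. The page cache is invisible from user space, but the call
# sequence below is the pattern followed at commit time: data is durable
# only after fsync returns.

path = os.path.join(tempfile.mkdtemp(), "segment.dat")
fd = os.open(path, os.O_WRONLY | os.O_CREAT)
os.write(fd, b"segment data")  # may still live only in the page cache
os.fsync(fd)                   # now durable: survives power loss
os.close(fd)

with open(path, "rb") as f:
    assert f.read() == b"segment data"
```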

POST lucene-learning/_flush
{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  }
}

$ tree -I '_state'
.
├── index
│   ├── _0.cfe
│   ├── _0.cfs
│   ├── _0.si
│   ├── _1.cfe
│   ├── _1.cfs
│   ├── _1.si
│   ├── segments_3
│   └── write.lock
└── translog
    ├── translog-3.tlog
    └── translog.ckp

$ cat index/segments_3
?�segments
��F�ߝᰆ�4�[3
translog_uuidh8Y1UqnyT9ePnvkI4mMQ3Qmin_retained_seq_no0�ߝᰆ�4�Z�_1��F�ߝᰆ�4Lucene87��������������������������F�ߝᰆ�4�[local_checkpoint1max_unsafe_auto_id_timestamp-1
                                                       history_uuidvic9U3IUSvWWP0rDi-fEpA
max_seq_no1�(��!��9

$ cat translog/translog-3.tlog
?�translogh8Y1UqnyT9ePnvkI4mMQ3Qy�ro


Added by kimbhoot on Tue, 01 Feb 2022 00:08:08 +0200