Elasticsearch combined with canal: building a master-slave replication architecture and integrating canal in practice

Contents

Preface

Build canal

Preface

Previously, we set up elasticsearch and elasticsearch-head and integrated Spring Data Elasticsearch. The next step is to use canal to connect MySQL and elasticsearch.

Build canal

First, go to the canal repository on GitHub, where the working principle of canal is explained.

Simply put, in MySQL master-slave replication the slave replays, in order, the operations that the master recorded in its binlog. canal presents itself to MySQL as a slave, so it receives the same binlog stream and can pass the changes on to another system.
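
To make the replay idea concrete, here is a toy sketch in Python. The event shape (type, pk, row) is invented for illustration and is not the real binlog or canal message format; the point is only that applying the same events in the same order reproduces the master's state.

```python
# Toy model of log-based replication. The event layout is hypothetical.

def apply_event(table, event):
    """Apply one row-level change event to an in-memory table keyed by primary key."""
    kind, pk = event["type"], event["pk"]
    if kind in ("INSERT", "UPDATE"):
        # Merge columns so a partial UPDATE keeps the untouched ones.
        table[pk] = {**table.get(pk, {}), **event["row"]}
    elif kind == "DELETE":
        table.pop(pk, None)
    else:
        raise ValueError(f"unknown event type: {kind}")


def replay(binlog):
    """Replay events strictly in log order and return the resulting table."""
    replica = {}
    for event in binlog:
        apply_event(replica, event)
    return replica


binlog = [
    {"type": "INSERT", "pk": 148, "row": {"title": "Migration Publishing", "read_count": 0}},
    {"type": "UPDATE", "pk": 148, "row": {"read_count": 1}},
    {"type": "INSERT", "pk": 149, "row": {"title": "Draft", "read_count": 0}},
    {"type": "DELETE", "pk": 149},
]

replica = replay(binlog)
print(replica)
```

Order matters here: replaying the DELETE before the matching INSERT would leave row 149 in the replica.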

First, download two packages, canal.adapter and canal.deployer, and extract them on the server.

Then enable binlog in MySQL:

[mysqld]
log-bin=mysql-bin # Enable binlog
binlog-format=ROW # Select ROW mode
server_id=1 # Required for MySQL replication; must not be the same as canal's slaveId
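
As a quick sanity check of these three settings, the following Python sketch parses the [mysqld] fragment and reports anything that would get in canal's way. It only inspects the text of the option file, not the running server, and the slaveId value 1234 used in the example is a placeholder rather than a value taken from this setup.

```python
import configparser

# The [mysqld] fragment from above, inline comments included.
MY_CNF = """
[mysqld]
log-bin=mysql-bin # Enable binlog
binlog-format=ROW # Select ROW mode
server_id=1 # Must differ from canal's slaveId
"""


def check_binlog_settings(cnf_text, canal_slave_id):
    """Return a list of problems found in the [mysqld] section; an empty list means OK."""
    parser = configparser.ConfigParser(
        allow_no_value=True,              # options such as a bare "log-bin" carry no value
        inline_comment_prefixes=("#",),   # strip trailing "# ..." comments
        interpolation=None,               # treat "%" literally
    )
    parser.read_string(cnf_text)
    # MySQL accepts '-' and '_' interchangeably in option names, so normalize.
    mysqld = {key.replace("-", "_"): value for key, value in parser["mysqld"].items()}

    problems = []
    if "log_bin" not in mysqld:
        problems.append("log-bin is not set")
    if (mysqld.get("binlog_format") or "").upper() != "ROW":
        problems.append("binlog-format must be ROW")
    if mysqld.get("server_id") is None:
        problems.append("server_id is not set")
    elif int(mysqld["server_id"]) == canal_slave_id:
        problems.append("server_id must differ from canal's slaveId")
    return problems


print(check_binlog_settings(MY_CNF, canal_slave_id=1234))
```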

Since canal needs to connect to MySQL, we create a dedicated account for it:

CREATE USER canal IDENTIFIED BY 'canal';  
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%' ;
FLUSH PRIVILEGES;

Then go to the canal.deployer directory and open conf/example/instance.properties.

Fill in the address, account and password of the database, then save and exit.
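
The sketch below shows one way to confirm that the three values were filled in. The key names are the ones used in the instance.properties shipped with canal 1.1.x as far as we know, so check them against your own copy; the address and credentials in SAMPLE are placeholders.

```python
# Keys expected in instance.properties (assumed from canal 1.1.x defaults).
REQUIRED_KEYS = (
    "canal.instance.master.address",
    "canal.instance.dbUsername",
    "canal.instance.dbPassword",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_settings(props):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


# Placeholder values for illustration only.
SAMPLE = """
# MySQL connection used by canal
canal.instance.master.address=127.0.0.1:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
"""

props = parse_properties(SAMPLE)
print(missing_settings(props))
```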

Then enter the bin directory and run the startup.sh script.

After it has run, check the log:

2021-09-04 16:43:38.523 [main] INFO  com.alibaba.otter.canal.deployer.CanalLauncher - ## set default uncaught exception handler
2021-09-04 16:43:38.558 [main] INFO  com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations
2021-09-04 16:43:38.568 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server.
2021-09-04 16:43:38.606 [main] INFO  com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[172.16.51.49(172.16.51.49):11111]
2021-09-04 16:43:39.812 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......

The log above indicates a successful startup.
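
If you want to check this from a script rather than by eye, a minimal sketch is to look for the final "running now" line. The two sample lines are copied from the log above; the function name is our own.

```python
# Marker text taken from the last line of the startup log above.
STARTED_MARKER = "the canal server is running now"


def canal_server_started(log_text):
    """Return True if any line of the log contains the startup marker."""
    return any(STARTED_MARKER in line for line in log_text.splitlines())


SAMPLE_LOG = """\
2021-09-04 16:43:38.568 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server.
2021-09-04 16:43:39.812 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......
"""

print(canal_server_started(SAMPLE_LOG))
```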

Next, configure canal.adapter. Enter its conf directory and open the application.yml file. Two things need to be configured there.

The first is the source database connection, and the second is the address, account and password of elasticsearch. With both in place, MySQL, canal and elasticsearch are linked together.
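
As a rough illustration, the relevant part of application.yml can be modeled as a Python dict. The key names below (srcDataSources, canalAdapters, groups, outerAdapters) follow the application.yml shipped with canal.adapter 1.1.5 as far as we recall, so verify them against your own file; all hosts and credentials are placeholders. The helper checks that a mapping file's dataSourceKey, destination and groupId point at entries that actually exist in this configuration.

```python
# Relevant part of application.yml as a plain dict (assumed layout, placeholder values).
ADAPTER_CONF = {
    "srcDataSources": {
        "defaultDS": {  # referenced by dataSourceKey in mapping files
            "url": "jdbc:mysql://127.0.0.1:3306/rokid_forum",
            "username": "canal",
            "password": "canal",
        }
    },
    "canalAdapters": [
        {
            "instance": "example",  # canal instance name, i.e. the destination
            "groups": [
                {
                    "groupId": "g1",
                    "outerAdapters": [
                        {"name": "logger"},
                        {
                            "name": "es7",
                            "hosts": "127.0.0.1:9200",
                            "properties": {"mode": "rest", "security.auth": "user:password"},
                        },
                    ],
                }
            ],
        }
    ],
}


def mapping_matches(conf, mapping):
    """True if the mapping's dataSourceKey, destination and groupId all exist in conf."""
    if mapping["dataSourceKey"] not in conf["srcDataSources"]:
        return False
    for adapter in conf["canalAdapters"]:
        if adapter["instance"] != mapping["destination"]:
            continue
        if any(group["groupId"] == mapping["groupId"] for group in adapter["groups"]):
            return True
    return False


# Header of a mapping file, as used later in this article.
MAPPING = {"dataSourceKey": "defaultDS", "destination": "example", "groupId": "g1"}
print(mapping_matches(ADAPTER_CONF, MAPPING))
```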

Next, configure the mapping file for each index to be synchronized, which is also the most critical step. We use elasticsearch 7.10.0, so we go to the es7 directory under conf and create the files there. Let's start with an example. In the previous article we defined a parent-child document in Spring Data Elasticsearch, so at least two files must be created in canal to synchronize it: one for the parent document and one for the child document.

dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: forum_post
  _id: _id
  relations:
    post_join:
      name: forum_post
#  upsert: true
#  pk: id
  sql: "select fp.post_id                                       as _id,
               fp.post_id, 
               fp.title,
               fp.title as title_keyword,
               fp.content,
               fp.content as content_keyword,
               fp.content_type,
               fp.plate_id,
               fpl.plate_name_en,
               fpl.plate_name_cn,
               date_format(fp.gmt_created, '%Y-%m-%d %H:%i:%s') as gmt_created,
               fp.created_by,
               fu.username,
               fu.username as username_keyword,
               fu.head_icon,
               fp.read_count,
               fp.reply_count,
               fp.like_count,
               fp.language_type,
               fp.is_top,
               fp.is_archive,
               date_format(fp.gmt_top, '%Y-%m-%d %H:%i:%s')     as gmt_top,
               date_format(fp.gmt_reply, '%Y-%m-%d %H:%i:%s')   as gmt_reply,
               1 as search_type
        from forum_post fp
               left join forum_plate fpl on fp.plate_id = fpl.plate_id
               left join forum_user fu on fu.user_account = fp.created_by"
#  objFields:
#    _labels: array:;
  etlCondition: "where fp.gmt_modified>={}"
  commitBatch: 3000

First, _index defines the name of the mapped index and _id defines the mapped primary key. The parent-child relationship is defined through relations, and the value of name determines whether the document is a parent or a child. The fields returned by the sql statement are all fields previously defined in elasticsearch. Note that although the mapping sql supports multiple tables, it must satisfy the restrictions that canal's ES adapter places on such statements.

An example of a child document mapping is shown below:

dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: forum_post
  _id: _id
  relations:
    post_join:
      name: forum_reply
      parent: post_id
#  upsert: true
#  pk: id
  sql: "select  fr.es_id                                      as _id,
                fr.reply_id,
                fr.content,
                fr.content as content_keyword,
                fr.content_type,
                fr.like_count,
                date_format(fr.gmt_created, '%Y-%m-%d %H:%i:%s') as gmt_created,
                fr.created_by,
                fu.username,
                fu.username as username_keyword,
                fu.head_icon,
                fr.post_id,
                2 as search_type
        from forum_reply fr left join forum_user fu on fu.user_account = fr.created_by"
#  objFields:
#    _labels: array:;
  etlCondition: "where fr.gmt_modified>={}"
  commitBatch: 3000

Its index is the same as the parent document's. The name in the join field is the child document's name, and parent is also declared. parent marks which field holds the parent's id and is used to associate the child document with its parent.
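
To show what the two mappings produce on the elasticsearch side, here is a small Python sketch of the resulting documents. It is an illustration of the elasticsearch join field, not canal's actual code: a parent carries only the relation name, while a child carries the relation name plus the parent id and must be routed by that id so both land on the same shard. The ids and field values are made up.

```python
INDEX = "forum_post"
JOIN_FIELD = "post_join"


def parent_doc(post_id, fields):
    """Parent document: the join field holds only the relation name."""
    return {
        "_index": INDEX,
        "_id": str(post_id),
        "_source": {**fields, JOIN_FIELD: {"name": "forum_post"}},
    }


def child_doc(es_id, post_id, fields):
    """Child document: the join field names the relation and the parent id."""
    return {
        "_index": INDEX,
        "_id": str(es_id),
        "_routing": str(post_id),  # children are routed by their parent's id
        "_source": {**fields, JOIN_FIELD: {"name": "forum_reply", "parent": str(post_id)}},
    }


# Hypothetical rows for illustration.
post = parent_doc(148, {"post_id": 148, "title": "Migration Publishing", "search_type": 1})
reply = child_doc("r_12", 148, {"reply_id": 12, "post_id": 148, "search_type": 2})

print(post["_source"][JOIN_FIELD])
print(reply["_source"][JOIN_FIELD], reply["_routing"])
```

Both documents live in the same index, which is why the two mapping files share _index: forum_post and differ only in the relations block and the sql.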

Save and exit, then enter the bin directory.

Run startup.sh.

You can follow the logs in the logs/adapter directory.

2021-10-28 13:50:44.440 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## syncSwitch refreshed.
2021-10-28 13:50:44.440 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## start the canal client adapters.
2021-10-28 13:50:44.444 [main] INFO  c.a.otter.canal.client.adapter.support.ExtensionLoader - extension classpath dir: /usr/local/src/canal.adapter-1.1.5/plugin
2021-10-28 13:50:44.683 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: logger succeed
2021-10-28 13:50:45.196 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ...
2021-10-28 13:50:45.255 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2021-10-28 13:50:45.656 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 succeed
2021-10-28 13:50:45.662 [main] INFO  c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /usr/local/src/canal.adapter-1.1.5/plugin
2021-10-28 13:50:45.686 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal-client mq topic: example-g1 succeed
2021-10-28 13:50:45.687 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
2021-10-28 13:50:45.693 [main] INFO  org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-8081"]
2021-10-28 13:50:45.695 [main] INFO  org.apache.tomcat.util.net.NioSelectorPool - Using a shared selector for servlet write/read
2021-10-28 13:50:45.698 [Thread-4] INFO  c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <=============
2021-10-28 13:50:45.806 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 8081 (http) with context path ''
2021-10-28 13:50:45.809 [main] INFO  c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Started CanalAdapterApplication in 4.909 seconds (JVM running for 5.689)

The above log indicates that the startup is successful.

The Spring Boot project exposes some CRUD HTTP endpoints for database operations. Calling these endpoints changes the data in the MySQL database.

You can then see the following log output:

2021-10-28 13:50:47.259 [pool-3-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"user_id":28,"user_account":"015","mobile":"","username":"Yang Mi Yang Mi","post_count":18,"freeze_type":0,"gmt_freeze":1631950395991,"gmt_created":1630940953000,"gmt_modified":1634650702000,"created_by":"admin","modified_by":"015","head_icon":"http://121.199.0.246/images/20181231010435_scjtm.png","gmt_last_login":1634650702000}],"database":"rokid_forum","destination":"example","es":1634650702000,"groupId":"g1","isDdl":false,"old":[{"gmt_modified":1634264602000,"gmt_last_login":1632987270000}],"pkNames":["user_id"],"sql":"","table":"forum_user","ts":1635400246555,"type":"UPDATE"}
Affected indexes: forum_post forum_reply forum_post forum_post forum_reply forum_post
2021-10-28 13:50:47.273 [pool-3-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"post_id":148,"plate_id":13,"title":"Migration Publishing","language_type":1,"content_type":1,"like_count":0,"reply_count":0,"read_count":0,"content":"<p>transfer
 release</p>","is_top":0,"is_archive":0,"gmt_reply":null,"gmt_top":null,"gmt_created":1634650722010,"gmt_modified":1634650722010,"created_by":"015","modified_by":"015"}],"database":"rokid_forum","destination":"example","es":1634650722000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["post_id"],"sql":"","table":"forum_post","ts":1635400246556,"type":"INSERT"}
Affected indexes: forum_post
2021-10-28 13:50:47.331 [pool-3-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"user_id":28,"user_account":"015","mobile":"","username":"Yang Mi Yang Mi","post_count":19,"freeze_type":0,"gmt_freeze":1631950395991,"gmt_created":1630940953000,"gmt_modified":1634650722000,"created_by":"admin","modified_by":"015","head_icon":"http://121.199.0.246/images/20181231010435_scjtm.png","gmt_last_login":1634650702000}],"database":"rokid_forum","destination":"example","es":1634650722000,"groupId":"g1","isDdl":false,"old":[{"post_count":18,"gmt_modified":1634650702000}],"pkNames":["user_id"],"sql":"","table":"forum_user","ts":1635400246557,"type":"UPDATE"}
Affected indexes: forum_post forum_reply forum_post forum_post forum_reply forum_post
2021-10-28 13:50:47.331 [pool-3-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"post_id":148,"plate_id":13,"title":"Migration Publishing","language_type":1,"content_type":1,"like_count":0,"reply_count":0,"read_count":1,"content":"<p>transfer
 release</p>","is_top":0,"is_archive":0,"gmt_reply":null,"gmt_top":null,"gmt_created":1634650722010,"gmt_modified":1634650801931,"created_by":"015","modified_by":"015"}],"database":"rokid_forum","destination":"example","es":1634650801000,"groupId":"g1","isDdl":false,"old":[{"read_count":0,"gmt_modified":1634650722010}],"pkNames":["post_id"],"sql":"","table":"forum_post","ts":1635400246567,"type":"UPDATE"}
Affected indexes: forum_post

For each statement you can see its type and the indexes it affected. Open elasticsearch-head and you will see that the data in elasticsearch has changed accordingly.
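
The JSON payload after "DML:" can also be read programmatically. The sketch below pulls out the table, the statement type and the columns that changed (the keys of the old rows). The sample is a trimmed copy of one of the messages above; the helper name is our own.

```python
import json


def summarize_dml(payload):
    """Return (table, statement type, sorted changed column names) for one DML message."""
    msg = json.loads(payload)
    old_rows = msg.get("old") or []  # "old" is null for INSERT
    changed = sorted({column for row in old_rows for column in row})
    return msg["table"], msg["type"], changed


# Trimmed copy of an UPDATE message from the log above.
sample = (
    '{"data":[{"post_id":148,"read_count":1}],"database":"rokid_forum",'
    '"old":[{"read_count":0,"gmt_modified":1634650722010}],'
    '"pkNames":["post_id"],"table":"forum_post","type":"UPDATE"}'
)

print(summarize_dml(sample))
```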

Keywords: Java ElasticSearch Distribution architecture

Added by rgpayne on Thu, 27 Jan 2022 10:37:27 +0200