Foreword
There are generally two approaches to synchronizing MySQL data into Elasticsearch:

- Polling MySQL with scheduled SQL queries
  - elasticsearch-jdbc
  - [Logstash JDBC](https://www.elastic.co/blog/l...) (official)
- Reading the MySQL binlog
  - The open-source project go-mysql-elasticsearch
This article focuses on Logstash JDBC. It is an official plugin, it is updated and maintained regularly, and it is therefore relatively safe to rely on, although it does have some drawbacks. Without further ado, let's get started.
Composing the Docker environment
Everything in this article runs on Docker, so install Docker first; the author's version is 18.09.2. Since we have been developing with laradock, Elasticsearch and Logstash are added on top of that environment. MySQL itself needs no changes, so refer to the laradock documentation for the MySQL setup. The compose definitions for Elasticsearch and Logstash are actually quite simple.
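For completeness, here is a minimal sketch of getting the laradock base environment in place. The repository URL and the `env-example` file are standard laradock conventions, not something specific to this article:

```bash
# Check the local Docker version (the author used 18.09.2)
docker --version

# Clone laradock and create the .env file it expects
git clone https://github.com/laradock/laradock.git
cd laradock
cp env-example .env
```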
Composing Elasticsearch
The Dockerfile for Elasticsearch:
```dockerfile
FROM elasticsearch:7.3.0

EXPOSE 9200 9300
```
The service definition in docker-compose.yml:
```yaml
elasticsearch:
  build: ./elasticsearch
  volumes:
    - elasticsearch:/usr/share/elasticsearch/data
  environment:
    - cluster.name=laradock-cluster
    - node.name=laradock-node
    - bootstrap.memory_lock=true
    - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    - cluster.initial_master_nodes=laradock-node
  ulimits:
    memlock:
      soft: -1
      hard: -1
  ports:
    - "${ELASTICSEARCH_HOST_HTTP_PORT}:9200"
    - "${ELASTICSEARCH_HOST_TRANSPORT_PORT}:9300"
  depends_on:
    - php-fpm
  networks:
    - frontend
    - backend
```
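Note that the service mounts a named volume called `elasticsearch`. In laradock this is declared in the top-level `volumes` section of docker-compose.yml; if your file does not already have it, a minimal declaration (an assumption based on standard compose syntax, not copied from laradock) looks like this:

```yaml
volumes:
  elasticsearch:
    driver: local
```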
Run `docker-compose up -d elasticsearch mysql` to start the containers. You can then check that both Elasticsearch and MySQL are up, as shown below.
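A quick sanity check, assuming the default laradock port mapping (9200 for Elasticsearch) and the root/root MySQL credentials used later in this article:

```bash
# Start Elasticsearch and MySQL in the background
docker-compose up -d elasticsearch mysql

# Elasticsearch should answer with its cluster and version info
curl http://localhost:9200

# MySQL should accept a connection inside its container
docker-compose exec mysql mysql -uroot -proot -e "SELECT VERSION();"
```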
Composing Logstash
Since we use the Logstash JDBC input plugin for the synchronization, the Logstash setup is where most of the work is.
Logstash JDBC connects to MySQL through Java, so we first need the MySQL JDBC driver jar. You can download it from the official MySQL site (or Maven Central) and copy it into the Logstash build directory, or have the Dockerfile download it during the build.
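If you prefer to download the jar up front, a sketch using Maven Central (the 8.0.18 URL follows the same pattern as the 5.1.47 URL used in the Dockerfile below):

```bash
# Download the MySQL Connector/J jar into the logstash build context
curl -L -o mysql-connector-java-8.0.18.jar \
  https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.18/mysql-connector-java-8.0.18.jar
```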
The author's logstash directory (the Docker build context) contains the Dockerfile below, the JDBC driver jar, and the pipeline/ and config/ directories that the Dockerfile copies in.
The Dockerfile:
```dockerfile
FROM logstash:7.3.0

USER root
RUN rm -f /usr/share/logstash/pipeline/logstash.conf
RUN curl -L -o /usr/share/logstash/lib/mysql-connector-java-5.1.47.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.47/mysql-connector-java-5.1.47.jar
ADD ./pipeline/ /usr/share/logstash/pipeline/
ADD ./config /usr/share/logstash/config/
ADD mysql-connector-java-8.0.18.jar /usr/share/logstash/logstash-core/lib/jars/mysql/mysql-connector-java-8.0.18.jar
RUN logstash-plugin install logstash-input-jdbc
```
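The docker-compose.yml entry for Logstash is not shown here; a minimal sketch, assuming the build context above and the same laradock backend network used by the Elasticsearch service, might look like this:

```yaml
logstash:
  build: ./logstash
  depends_on:
    - elasticsearch
  networks:
    - backend
```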
The synchronization pipeline configuration file, mysql/mysql.conf:
```conf
input {
  jdbc {
    # JDBC connection string. The hostname "mysql" works because the containers
    # share a Docker network; adjust it for your own setup.
    jdbc_connection_string => "jdbc:mysql://mysql:3306/koudai"
    jdbc_user => "root"
    jdbc_password => "root"
    # Path to the JDBC driver jar inside the container (here under /usr/share/logstash/config/mysql/)
    jdbc_driver_library => "/usr/share/logstash/config/mysql/mysql-connector-java-8.0.18.jar"
    # Driver class name
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    jdbc_default_timezone => "Asia/Shanghai"
    # File containing the SQL statement to execute
    statement_filepath => "/usr/share/logstash/config/mysql/task.sql"
    # Polling schedule in cron syntax (minute, hour, day of month, month, day of week); here: every minute
    schedule => "* * * * *"
    type => "user"
    # Record the last run. If true, the last value of the tracking column is saved
    # to the file given by last_run_metadata_path
    record_last_run => true
    # Track a specific column instead of the run timestamp. If true, tracking_column must be set;
    # otherwise the default tracked value is the timestamp of the last run
    use_column_value => true
    # The tracked column must be monotonically increasing, e.g. the primary key or an update timestamp
    tracking_column => "lastmodifiedTime"
    tracking_column_type => "timestamp"
    last_run_metadata_path => "./last_record/logstash_article_last_time"
    # If true, the state in last_run_metadata_path is ignored and every run re-reads all records
    clean_run => false
    # Whether to force column names to lowercase (false keeps the original case)
    lowercase_column_names => false
  }
}

output {
  elasticsearch {
    # Again, the hostname "elasticsearch" resolves over the shared Docker network
    hosts => ["http://elasticsearch:9200"]
    index => "user"
    document_id => "%{uid}"
  }
}
```
The synchronization SQL (task.sql):
```sql
select * from kdgx_partner_charge_user
```
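Because `use_column_value` and `tracking_column` are enabled above, the query can also filter on the plugin's built-in `:sql_last_value` parameter so that each run only picks up rows changed since the previous run. A sketch, assuming `lastmodifiedTime` is an update-timestamp column of this table:

```sql
SELECT *
FROM kdgx_partner_charge_user
WHERE lastmodifiedTime > :sql_last_value
ORDER BY lastmodifiedTime ASC
```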
For more configuration options, refer to the official logstash-input-jdbc documentation.
With that in place, new data in MySQL is synchronized to Elasticsearch on every polling run.
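Finally, a quick way to start Logstash and confirm that documents are landing in the index. The port and index name come from the configuration above; the uid value 1 is just an example:

```bash
# Start the logstash container (it will poll MySQL every minute)
docker-compose up -d logstash

# Watch the pipeline logs for the executed SQL and any errors
docker-compose logs -f logstash

# Count documents in the "user" index
curl "http://localhost:9200/user/_count?pretty"

# Fetch a single document by its MySQL uid (document_id => "%{uid}")
curl "http://localhost:9200/user/_doc/1?pretty"
```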