Multi-source merging and downstream synchronization with Flink CDC and Kafka

1. Foreword: This article mainly addresses the problem that Flink SQL with Flink CDC cannot perform multi-source merging across multiple databases and tables, and how to synchronize the merged result to downstream Kafka, because at present Flink SQL can only run single-table Flink CDC jobs, whic ...

Added by jefkin on Tue, 01 Feb 2022 23:22:40 +0200

Flink: Table & SQL environment setup and program structure

I have always been interested in Flink Table. I have now decided to skip some unnecessary background and learn Flink Table directly. This article mainly introduces the architecture and interface implementation of Flink Table. Apache Flink has two relational APIs for unified stream-batch proces ...

Added by alexboyer on Mon, 31 Jan 2022 12:26:30 +0200

Practical data lake Iceberg, Lesson 11: testing the complete partitioned-table workflow (generating data, creating tables, merging, deleting snapshots)

Catalogue of series articles: Practical data lake Iceberg, Lesson 1: introduction; Lesson 2: Iceberg's underlying data format on Hadoop; Lesson 3: reading data from Kafka into Iceberg in the SQL client; Lesson 4: reading data from Kafka into Iceberg in the SQL client ...

Added by kpmonroe on Sat, 29 Jan 2022 11:28:53 +0200

Apache Flink learning notes: Windows in stream processing

Window concept: In most scenarios, the data streams we need to analyze are unbounded, so we cannot wait until the whole stream terminates. Usually we only need to run statistics over data within a certain time or count range: for example, counting the hits on all goods over the past hour, every five minutes; or after every ...

Added by phpr0ck5 on Sat, 29 Jan 2022 09:53:54 +0200
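The excerpt above describes counting events over the past hour every five minutes, i.e. a sliding window. As a minimal JDK-only sketch (an analogy, not Flink's actual windowing API; the method name `countSliding` is hypothetical), counting event timestamps per sliding window looks like this:

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowSketch {
    // Count events in each sliding window of length `size`, advancing by `slide`.
    // Timestamps and window parameters share the same (arbitrary) time unit.
    static List<Integer> countSliding(long[] timestamps, long start, long end,
                                      long size, long slide) {
        List<Integer> counts = new ArrayList<>();
        for (long winStart = start; winStart + size <= end; winStart += slide) {
            int n = 0;
            for (long t : timestamps) {
                if (t >= winStart && t < winStart + size) n++;
            }
            counts.add(n);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] hits = {1, 2, 7, 8, 9, 14};
        // Windows of length 10 sliding by 5 over [0, 20): [0,10) and [5,15)
        System.out.println(countSliding(hits, 0, 20, 10, 5)); // [5, 4]
    }
}
```

In Flink proper this corresponds to a sliding event-time window, where the framework handles watermarks and late data for you.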

3. Introduction to Flink programming

1 Introduction to Flink programming. 1.1 Initializing a Flink project template. 1.1.1 Preparations: Maven 3.0.4 or above and JDK 8 are required. 1.1.2 Creating a Java project template with the Maven command: execute the Maven command; if the local Maven repository lacks the required dependency jars, network access is needed. mvn archetype:generate -DarchetypeGroupI ...

Added by dtasman7 on Thu, 27 Jan 2022 21:02:53 +0200

Basic steps of Flink programming and loading different types of data sources

Basic steps of Flink programming: 1. Create the stream execution environment: StreamExecutionEnvironment.getExecutionEnvironment() obtains the streaming environment. 2. Load a data Source. 3. Apply Transformations. 4. Output via a Sink: land the data in another data warehouse, or print it directly. Basic operations on Flink data fall into four categories: operations on a single ...

Added by Oxymen on Wed, 26 Jan 2022 23:28:31 +0200
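The four steps above can be sketched without a Flink cluster. The JDK-only pipeline below (an analogy with hypothetical names, not the real StreamExecutionEnvironment API) mirrors the same shape: load a source, transform it, and hand results to a sink:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class PipelineSketch {
    // Apply a transformation to every source element, then emit it to the sink.
    static <T, R> void run(List<T> source, Function<T, R> transform, Consumer<R> sink) {
        for (T element : source) {
            sink.accept(transform.apply(element));
        }
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>();
        // Source: raw words; Transformation: upper-case; Sink: collect (or print).
        run(List.of("flink", "kafka"), String::toUpperCase, collected::add);
        System.out.println(collected); // [FLINK, KAFKA]
    }
}
```

The difference in real Flink is that the pipeline is only a lazily built dataflow graph; nothing runs until `env.execute()` is called.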

Flink SQL stream-to-dimension-table joins and dual-stream joins

A dimension table is a data-warehouse concept. The dimension attributes in a dimension table supplement the fact table with context for analyzing the data. The real-time data warehouse also has the concepts of dimension table and fact table: the fact table is usually Kafka's real-time stream data, and the dimension table is usually st ...

Added by devai on Fri, 21 Jan 2022 21:44:47 +0200
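A dimension-table join of the kind described above is, at its core, a keyed lookup: each fact record is enriched with attributes fetched by its dimension id. A minimal JDK-only sketch (the record layout and names here are hypothetical, not taken from the article):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DimJoinSketch {
    // Enrich each fact record "itemId:amount" with the item name
    // looked up in the dimension map, emitting "name:amount".
    static List<String> enrich(List<String> facts, Map<String, String> dimension) {
        List<String> out = new ArrayList<>();
        for (String fact : facts) {
            String[] parts = fact.split(":");
            String name = dimension.getOrDefault(parts[0], "UNKNOWN");
            out.add(name + ":" + parts[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> items = Map.of("1", "book", "2", "pen");
        System.out.println(enrich(List.of("1:30", "2:5", "9:7"), items));
        // [book:30, pen:5, UNKNOWN:7]
    }
}
```

In a real-time warehouse the dimension map would live in an external store or in Flink state, and the lookup is typically async or temporal (`FOR SYSTEM_TIME AS OF`) rather than a static hash map.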

Flink (14): Flink's Transformation operators

Contents: 0. Links to related articles; 1. union and connect operators; 2. split, select and Side Outputs; 3. rebalance partitioning; 4. other partition operators. 0. Links to related articles. 1. union and connect operators. API: Union: the union operator can merge multiple data streams of the same type and generate data strea ...

Added by fourthe on Fri, 21 Jan 2022 16:32:09 +0200
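Flink's union merges several streams of the same element type into one (connect, by contrast, pairs streams of possibly different types). As a JDK analogy using java.util.stream rather than DataStream:

```java
import java.util.List;
import java.util.stream.Stream;

public class UnionSketch {
    // Merge two same-typed streams into one, mirroring the semantics
    // of DataStream#union: same element type in, single stream out.
    static List<Integer> union(List<Integer> a, List<Integer> b) {
        return Stream.concat(a.stream(), b.stream()).toList();
    }

    public static void main(String[] args) {
        System.out.println(union(List.of(1, 2), List.of(3, 4))); // [1, 2, 3, 4]
    }
}
```

Note that in Flink the relative order of elements from the two input streams is not guaranteed; this sketch's deterministic concatenation order is a simplification.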

Transform operators of the Flink stream processing API

transform: converts the source data into the required data. Common functions: map. The map operator is similar to map in Python: in Python, data is converted according to a lambda expression, while map in Flink is more general. By implementing a new MapFunction, the user-defi ...

Added by Kaitosoto on Thu, 20 Jan 2022 07:52:28 +0200
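The MapFunction shape mentioned above can be mimicked with plain JDK types: a one-method interface applied element-wise. This is a sketch of the idea, not Flink's actual `org.apache.flink.api.common.functions.MapFunction`, and `applyMap` is a hypothetical helper:

```java
import java.util.ArrayList;
import java.util.List;

public class MapSketch {
    // Single-abstract-method interface mirroring the shape of Flink's MapFunction.
    interface MapFunction<T, R> {
        R map(T value);
    }

    // Apply the user-defined mapping to each element, one output per input.
    static <T, R> List<R> applyMap(List<T> input, MapFunction<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T v : input) out.add(fn.map(v));
        return out;
    }

    public static void main(String[] args) {
        // The mapping can be a lambda or method reference, like in DataStream#map.
        System.out.println(applyMap(List.of("a", "bb", "ccc"), String::length)); // [1, 2, 3]
    }
}
```

Because the interface has a single abstract method, any lambda or method reference with a matching signature works, which is exactly why Flink's map accepts lambdas.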

Flink Tutorial: writing streaming data as ORC-format files in Flink 1.11

In Flink, StreamingFileSink is an important sink for writing streaming data to the file system. It supports writing data in row formats (JSON, CSV, etc.) and in column formats (ORC, Parquet). Hive is a widely used data store, and ORC, as a column storage format specially optimized for Hive, plays an important role among Hive's storage formats. Today ...

Added by ssmitra on Mon, 17 Jan 2022 10:46:53 +0200