Kettle data synchronization: improved version

An improved way to implement incremental data synchronization with Kettle. Preface: Some time ago I wrote about using Kettle for data synchronization, covering the installation and configuration of Kettle, creating a job, creating a transformation, and so on. At that time a hard-coded point in time was used (that is, the data wil ...

Added by EODC on Sat, 29 Jan 2022 21:55:08 +0200
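
The Kettle excerpt above contrasts a hard-coded time point with true incremental synchronization. One common alternative is to store a "last synced" watermark and advance it after each run. The sketch below illustrates that idea with two in-memory SQLite databases; it is not the article's Kettle job, and the table names `orders` and `sync_state` and the column `updated_at` are hypothetical.

```python
import sqlite3


def sync_increment(src, dst):
    """Copy rows changed since the stored watermark, then advance the watermark."""
    last = dst.execute("SELECT last_sync FROM sync_state").fetchone()[0]
    rows = src.execute(
        "SELECT id, name, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last,),
    ).fetchall()
    for row in rows:
        # Upsert so that a re-sent row overwrites the old copy.
        dst.execute(
            "INSERT OR REPLACE INTO orders (id, name, updated_at) VALUES (?, ?, ?)",
            row,
        )
    if rows:
        # The newest timestamp just copied becomes the next starting point.
        dst.execute("UPDATE sync_state SET last_sync = ?", (rows[-1][2],))
    dst.commit()
    return len(rows)


def demo():
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    for db in (src, dst):
        db.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
        )
    dst.execute("CREATE TABLE sync_state (last_sync TEXT)")
    dst.execute("INSERT INTO sync_state VALUES ('1970-01-01 00:00:00')")
    src.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "a", "2022-01-01 10:00:00"), (2, "b", "2022-01-02 10:00:00")],
    )
    first = sync_increment(src, dst)   # copies both existing rows
    src.execute("INSERT INTO orders VALUES (3, 'c', '2022-01-03 10:00:00')")
    second = sync_increment(src, dst)  # copies only the new row
    third = sync_increment(src, dst)   # nothing changed, copies nothing
    return first, second, third
```

Calling `demo()` runs three synchronization passes and returns how many rows each pass copied. Because the watermark lives in the target database, no time value has to be edited by hand between runs.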

Squid log analysis tool, yyds!!!

Preface: Today I'd like to introduce a commonly used Squid log analysis tool; I hope it will be useful in your daily work. Sarg (Squid Analysis Report Generator) is a Squid log analysis tool that lists the websites visited, time usage, rankings, connection tim ...

Added by Sekka on Sat, 29 Jan 2022 21:28:03 +0200

MySQL basics: common functions

1, Single line function 1. Character function 1.1 length gets the number of bytes of the parameter value SELECT LENGTH('john'); --'4' SELECT LENGTH('Huo Yuanjia lalala'); --'15' 1.2 concat splice string SELECT CONCAT(last_name,'_',first_name) full name FROM employees; 1.3 # upper and lower change the string to uppercase or lowe ...

Added by shawnplr on Sat, 29 Jan 2022 21:03:17 +0200

Hive partition notes

Hive partitions 1. Single-level partitions. A partition in Hive is a subdirectory. The idea is similar to input splits in the map phase, which also exist to improve parallelism. Partitioning stores a table's data in separate directories; when you query the table, specify the partition so that the whole table is not scanned. It is an optimization technique. The pa ...

Added by jeff21 on Sat, 29 Jan 2022 17:12:37 +0200
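
The Hive excerpt above describes a partition as a subdirectory that lets a query skip the rest of the table. The sketch below imitates that layout on the local filesystem using Hive's `key=value` directory naming; it is an illustration only and does not use Hive, and the function names, the `dt` partition key, and the `data.txt` file name are hypothetical.

```python
import os
import tempfile


def write_partitioned(root, rows, key):
    """Append each row's value to a file under the <key>=<value> subdirectory."""
    for row in rows:
        part_dir = os.path.join(root, f"{key}={row[key]}")
        os.makedirs(part_dir, exist_ok=True)
        with open(os.path.join(part_dir, "data.txt"), "a", encoding="utf-8") as f:
            f.write(row["value"] + "\n")


def read_partition(root, key, wanted):
    """Read one partition only; the other subdirectories are never opened."""
    part_dir = os.path.join(root, f"{key}={wanted}")
    if not os.path.isdir(part_dir):
        return []
    with open(os.path.join(part_dir, "data.txt"), encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def demo():
    rows = [
        {"dt": "2022-01-28", "value": "a"},
        {"dt": "2022-01-29", "value": "b"},
        {"dt": "2022-01-29", "value": "c"},
    ]
    with tempfile.TemporaryDirectory() as root:
        write_partitioned(root, rows, "dt")
        return sorted(os.listdir(root)), read_partition(root, "dt", "2022-01-29")
```

`demo()` returns the partition directories that were created and the rows found in one of them. Filtering on the partition key reduces to choosing a directory, which is what avoids the full-table scan.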

Importing Excel data into hbase

Table design. Column families: 1-2 are recommended; if one is enough, do not use two. Version design: if the project does not need to keep historical versions, the default configuration VERSIONS=1 is fine. If the project needs to keep a history of changes, set VERSIONS to a value greater than 1. Bu ...

Added by switchdoc on Sat, 29 Jan 2022 16:29:20 +0200

Practical data lake Iceberg, lesson 11: testing the complete partitioned-table workflow (generating data, creating tables, merging, deleting snapshots)

Series contents: Practical data lake Iceberg, lesson 1: introduction; lesson 2: Iceberg's underlying data format based on Hadoop; lesson 3: in the SQL client, read data from Kafka into Iceberg with SQL; lesson 4: in the SQL client, read data from Kafka into Iceberg with SQL ...

Added by kpmonroe on Sat, 29 Jan 2022 11:28:53 +0200

Big data: Spark SQL

1 Spark SQL overview 1.1 What is Spark SQL? Spark SQL is the Spark module for processing structured data. It provides two programming abstractions, DataFrame and Dataset, and acts as a distributed SQL query engine. We have already learned about Hive, which converts Hive SQL into MapReduce jobs and submits them to the cluster for execution, which great ...

Added by condoug on Sat, 29 Jan 2022 06:48:02 +0200

Best-practice guide to integrating Swagger into a Spring Boot project to generate API documentation automatically

We recently introduced Swagger into our project to generate documentation automatically. Many articles only explain how to integrate and use it, without offering best practices for real engineering work. I therefore reorganized the relevant material and documentation to sort out a set of be ...

Added by Bailz on Sat, 29 Jan 2022 05:29:29 +0200

How to make good use of functional interfaces | Java development practice

Opening: JDK 8 is best known for lambdas. Many people use them, for example through the Stream API on collections, but only in simple ways: they call the existing API without defining their own functional interfaces or APIs. Today we work through several cases to get more out of functional programming in Java. Case demonstration Functio ...

Added by ericbangug on Fri, 28 Jan 2022 02:25:46 +0200

Some experience of using Hadoop

Some experience with HDFS. Foreword: I have been working on big data at my company for some time, so I am taking the time to write up the problems I ran into and some useful optimizations. 1. HDFS multi-directory storage 1.1 Production server disks 1.2 In the hdfs-site.xml file, configure multiple directories, and pay attention t ...

Added by Soldier Jane on Fri, 28 Jan 2022 02:06:47 +0200