Kettle data synchronization: the perfected version
A perfected version of incremental data synchronization with Kettle
Preface
Some time ago, I described using Kettle for data synchronization, including installing and configuring Kettle, creating a job, creating a transformation, etc.
At that time, a hard-coded point in time was used (that is, the data wil ...
Added by EODC on Sat, 29 Jan 2022 21:55:08 +0200
Squid log analysis tool, yyds!!!
Preface
Today, I'd like to introduce a commonly used Squid log analysis tool. I hope it will be helpful to you in your daily work.
Sarg (full name: Squid Analysis Report Generator) is a Squid log analysis tool that lists the websites visited, time usage, rankings, connection tim ...
Added by Sekka on Sat, 29 Jan 2022 21:28:03 +0200
MySQL basics: common functions
1, Single-row functions
1. Character functions
1.1 length: gets the number of bytes in the parameter value
SELECT LENGTH('john'); --'4'
SELECT LENGTH('Huo Yuanjia lalala'); --'15'
1.2 concat: concatenates strings
SELECT CONCAT(last_name,'_',first_name) AS full_name FROM employees;
1.3 upper and lower: change the string to uppercase or lowe ...
Added by shawnplr on Sat, 29 Jan 2022 21:03:17 +0200
Hive partition notes
Hive partitions
1. Single-level partitioning
A partition in Hive is a subdirectory. The idea is similar to input splits in the map phase, which also exist to improve parallelism. Partitioning stores the table's data in separate directories; when you query the table, specify the partition so that Hive does not scan the whole table. It is an optimization.
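The effect of specifying the partition can be illustrated with a small Python sketch. It is not Hive code: the table contents, the `dt` partition column and the `scan` helper are hypothetical, and a dict stands in for the partition subdirectories (which Hive names `column=value`).

```python
# Hypothetical table: each partition is a subdirectory named dt=<date>,
# modeled here as a dict from directory name to the rows stored in it.
table = {
    "dt=2022-01-27": [("a", 1), ("b", 2)],
    "dt=2022-01-28": [("c", 3)],
    "dt=2022-01-29": [("d", 4), ("e", 5)],
}


def scan(table, dt=None):
    """Return (rows, directories_read).

    With dt given, only the matching subdirectory is read (partition pruning);
    without it, every subdirectory is read (full table scan).
    """
    if dt is None:
        dirs = list(table)
    else:
        dirs = [d for d in table if d == f"dt={dt}"]
    rows = [row for d in dirs for row in table[d]]
    return rows, dirs


rows, dirs = scan(table, dt="2022-01-29")
print(rows, dirs)                     # one directory read

all_rows, all_dirs = scan(table)
print(len(all_rows), len(all_dirs))   # every directory read
```

The filtered call touches one directory while the unfiltered call touches all three, which is the saving the partition filter buys.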
The pa ...
Added by jeff21 on Sat, 29 Jan 2022 17:12:37 +0200
Importing Excel data into hbase
Table design
Column family: 1-2 are recommended. If one is enough, do not use two.
Version design: if the project does not need to keep historical versions, the default configuration of VERSIONS=1 is fine. If you need to keep historical change information, set VERSIONS to a value greater than 1. Bu ...
Added by switchdoc on Sat, 29 Jan 2022 16:29:20 +0200
Data lake Iceberg in practice, lesson 11: testing the complete partitioned-table process (generating data, creating tables, merging, deleting snapshots)
Series table of contents
Data lake Iceberg in practice, lesson 1: introduction
Data lake Iceberg in practice, lesson 2: Iceberg's underlying data format based on Hadoop
Data lake Iceberg in practice, lesson 3: in the SQL client, read data from Kafka into Iceberg with SQL
Data lake Iceberg in practice, lesson 4: in the SQL client, read data from Kafka into Iceberg with SQL ...
Added by kpmonroe on Sat, 29 Jan 2022 11:28:53 +0200
Big data: Spark SQL
1 Spark SQL overview
1.1 What is Spark SQL
Spark SQL is Spark's module for processing structured data. It provides two programming abstractions, DataFrame and Dataset, and acts as a distributed SQL query engine. We have already learned about Hive, which converts Hive SQL into MapReduce jobs and then submits them to the cluster for execution, which great ...
Added by condoug on Sat, 29 Jan 2022 06:48:02 +0200
Best-practice guide to integrating Swagger into a Spring Boot project to generate API documentation automatically
Recently, Swagger was introduced into our project to support automatic documentation generation. I found that many articles only explain how to integrate and use it, without giving best practices for real engineering work. Therefore, I reorganized the relevant material and documentation to put together a set of be ...
Added by Bailz on Sat, 29 Jan 2022 05:29:29 +0200
How to make good use of functional interfaces | Java development practice
Opening
JDK 8 is best known for lambdas. Many people use them, for example through the Stream API of collections, but only in simple, ready-made ways; they do not define their own functional interfaces or APIs. Today, we use several cases to improve our use of functional programming in Java.
Case demonstration
Functio ...
Added by ericbangug on Fri, 28 Jan 2022 02:25:46 +0200
Some experience with using Hadoop
Some experience with using HDFS
Foreword:
I've been working on big data at my company for some time, so I am taking the time to sort out the problems I encountered and some useful optimizations.
1. HDFS multi-directory storage
1.1 Production server disks
1.2 Configure multiple directories in the hdfs-site.xml file, and pay attention t ...
Added by Soldier Jane on Fri, 28 Jan 2022 02:06:47 +0200