ZooKeeper client operations

1. Client command-line operations. 1.1 Basic syntax (command: explanation):
help: display all operation commands
ls path: view the child nodes of the current node (can be watched)
    -w: watch for changes in child nodes
    -s: additional secondary information
create: create a normal node
    -s: contains sequences
    -e: temporary (disappears on restart or timeout)
get path: obtain the value ...

Added by prueba123a on Thu, 30 Dec 2021 11:04:02 +0200

Hive tuning idea - knowledge summary

Hive tuning: choosing an appropriate storage format and compression method for the data being analyzed can improve Hive's analysis efficiency. Data compression format: when selecting a compression algorithm, consider whether its output can be split. If splitting is not supported (the integrity of a pi ...

Added by ZHarvey on Thu, 30 Dec 2021 02:06:19 +0200

4 - website log analysis cases - log data statistical analysis

I. Environment preparation and data import. 1. Start Hadoop. If it is running in a virtual environment such as lsn, format the NameNode first: hadoop namenode -format. Start Hadoop: start-dfs.sh, then start-yarn.sh. Check that it started: jps. 2. Import data. Upload ...

Added by D_tunisia on Wed, 29 Dec 2021 17:51:55 +0200

26 data analysis cases -- the second stop: Civil Aviation Customer Value Analysis Based on Hive

Environment required for the experiment: Python 3.x; Hadoop 2.7.2; Hive 2.2.0. Experimental background: people have more and more travel modes to choose from, such as aircraft, high-speed rail, cars and ships. In particular, aircraft ...

Added by abhic on Wed, 29 Dec 2021 16:51:45 +0200

Big data -- Introduction to Algorithms in Spark GraphX

1. ConnectedComponents algorithm. ConnectedComponents, the connected-components algorithm, labels each connected component in the graph with an ID, using the smallest vertex ID in the component as the component's ID. Given the following graph: // Create vertices val vertexRDD: RDD[(VertexId, (String,Int)) ...

Added by nvee on Wed, 29 Dec 2021 05:09:54 +0200
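
The labelling rule described in the GraphX entry above (every vertex receives the smallest vertex ID in its connected component) can be illustrated with a minimal pure-Python sketch. This is not the GraphX API: the function name, the sample vertices and edges, and the choice to treat edges as undirected are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def connected_components(vertices, edges):
    """Map each vertex ID to the smallest vertex ID in its component."""
    # Build an undirected adjacency list (assumption: edge direction is ignored).
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    labels = {}
    # Visiting vertices in ascending order guarantees that the first
    # unlabelled vertex we meet is the smallest one in its component.
    for start in sorted(vertices):
        if start in labels:
            continue
        labels[start] = start
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in labels:
                    labels[neighbour] = start
                    queue.append(neighbour)
    return labels


# Hypothetical sample graph: {1, 2, 3} and {4, 5} are connected, 6 is isolated.
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (4, 5)]
labels = connected_components(vertices, edges)
print(labels)
```

A breadth-first search is used here only because it is short to write; a distributed engine would propagate the minimum ID between neighbours iteratively instead.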

Introduction to ElasticSearch and its deployment, principle and use

Chapter 1: Introduction to Elasticsearch. Elasticsearch is a Lucene-based search server. It provides a distributed, multi-user full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and released as open source under the Apache license ter ...

Added by parijat_php on Tue, 28 Dec 2021 09:46:24 +0200

009 Optimization & new features & HA

1. Hadoop data compression

Algorithm  Original size  Compressed size  Compression speed  Decompression speed  Bundled with Hadoop  Splittable  Program changes required
gzip       8.3GB          1.8GB            17.5MB/s           58MB/s               yes                  no          no
bzip2      8.3GB          1.1GB            2.4MB/s            9.5MB/s              yes                  yes         no
LZO        8.3GB          2.9GB            49.3MB/s           74.6MB/s             no                   yes         yes

Input compression: (Hadoop uses the file extension to determine whether ...

Added by prbrowne on Mon, 27 Dec 2021 20:14:25 +0200

Hadoop data compression

1. Overview. 1) Advantages and disadvantages of compression. Advantage: reduces disk I/O and disk storage space. Disadvantage: increases CPU overhead. 2) Compression principles: (1) compute-intensive jobs should use less compression; (2) I/O-intensive jobs should use more compression. 2. Compression codecs supported by MR 1 ...

Added by madhukar_garg on Mon, 27 Dec 2021 09:56:33 +0200
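
The trade-off named in the entry above (less data to store and move, at the price of extra CPU time) can be observed with a small sketch that uses only Python's standard-library gzip and bz2 modules. It does not involve Hadoop; the sample log line, the repeat count and the `measure` helper are assumptions for illustration, and timings will vary by machine.

```python
import bz2
import gzip
import time


def measure(name, compress, data):
    """Compress data once and report the size ratio and elapsed CPU-bound time."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    return {
        "codec": name,
        "ratio": len(compressed) / len(data),  # below 1.0 means space was saved
        "seconds": elapsed,                    # the CPU cost paid for that saving
    }


# Hypothetical, highly repetitive log data (repetition compresses well).
sample = b"2021-12-27 09:56:33 INFO request served in 12 ms\n" * 20000

results = [
    measure("gzip", gzip.compress, sample),
    measure("bzip2", bz2.compress, sample),
]
for result in results:
    print(f"{result['codec']}: ratio={result['ratio']:.4f}, time={result['seconds']:.4f}s")
```

Real data compresses far less than this synthetic sample, so the numbers only show the direction of the trade-off, not its size.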

CDH 6.2: the whole process of no-brainer setup and configuration (beginner's version)

The software download link is at the bottom. Thanks to: Travel through IT (CSDN) and amoscloud2013 (bilibili). 1. Preliminary preparation. Five 8 GB virtual machines, named CDH1, cdh2, cdh3, CDH4 and cdh5. The JDK is installed on all virtual machines. 2. Modify the IP and host name. Select CentOS 7 for cluster deployment. All three vir ...

Added by thefollower on Mon, 27 Dec 2021 05:46:50 +0200

Detailed explanation of Elasticsearch Template

In ES, Index Templates and Dynamic Templates help us manage and configure indexes and mappings. 1. Index Template. For example, suppose we use ES for log management. Log data volumes are very large, and storing all log data in a single index may cause performance problems. We can automaticall ...

Added by TubeRev on Sun, 26 Dec 2021 23:24:49 +0200
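
As a rough illustration of the Index Template idea in the entry above, the sketch below builds a template body as a Python dictionary and checks which index names it would apply to. The template name, the `logs-*` pattern, the settings and the field names are all assumptions, not taken from the article. The body follows the legacy `_template` format; newer Elasticsearch versions also offer `_index_template`, which nests settings and mappings under a `template` key. The `fnmatch` check only approximates Elasticsearch's wildcard matching.

```python
import json
from fnmatch import fnmatch

# Hypothetical template name and body (legacy _template format).
template_name = "logs_template"
template_body = {
    "index_patterns": ["logs-*"],          # applies to e.g. one index per day
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
    },
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            "message": {"type": "text"},
        }
    },
}


def matches_template(index_name, body):
    """Approximate check: does a new index name match any template pattern?"""
    return any(fnmatch(index_name, pattern) for pattern in body["index_patterns"])


# Print the request that would register the template; nothing is sent.
print(f"PUT _template/{template_name}")
print(json.dumps(template_body, indent=2))
```

With a template like this registered, any newly created index whose name matches the pattern would pick up the same settings and mappings, so log data can be spread across many smaller indexes without configuring each one by hand.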