Development of Big Data Module--Statistical Analysis
After the data warehouse is built, users can write Hive SQL statements to query it and analyze the data.
In production, the statistical indicators to compute are usually specified by the departments that need the data, and new statistical requirements keep emerging. The following are some typical indicators in website ...
Added by davey10101 on Fri, 23 Aug 2019 06:58:37 +0300
Hive Installation & First Experience
Download & Unzip
Download Hive 1.2.1 from this address: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
Then use the following command to extract it to the specified directory:
tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /root/apps/
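As a minimal sketch of what the `-C` option does, the snippet below builds a throwaway archive and extracts it into a separate target directory. The `demo-1.0` name and the temporary paths are hypothetical and only stand in for the Hive tarball and `/root/apps/`. Note that `tar` does not create the `-C` directory, so it must exist before extracting.

```shell
#!/bin/sh
# Sketch only: all paths and the demo-1.0 name are made up for illustration.
set -eu

work="$(mktemp -d)"                        # scratch area, stands in for the real filesystem
mkdir -p "$work/src/demo-1.0/bin" "$work/apps"
echo "hello" > "$work/src/demo-1.0/bin/run.txt"

# Pack the demo directory into a gzip-compressed archive.
tar -zcf "$work/demo-1.0.tar.gz" -C "$work/src" demo-1.0

# Extract into the target directory; -C switches to it before unpacking,
# and the directory has to exist already.
tar -zxf "$work/demo-1.0.tar.gz" -C "$work/apps"

# The archive's top-level directory now sits under the target.
ls "$work/apps/demo-1.0/bin"
```

If the real target directory does not exist yet, running `mkdir -p /root/apps` before the extraction avoids an error from `tar`.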
Then change the name with the following command:
mv apache-hive- ...
Added by jimmyhumbled on Mon, 15 Jul 2019 23:25:59 +0300
The Optimizing Thought of Xiaobai's Deduction of HIVE Database
Xiaobai previously worked with relational databases such as Oracle and summed up the knack of relational database optimization: read the explain plan. Oracle is a mature product, and its explain plans come in many categories, real and virtual. By observing different kinds of explain plan data, we can grasp the vast majority of SQL data from inp ...
Added by Mirkules on Sun, 19 May 2019 14:22:20 +0300
Big Data Tutorial (14.2) Website Data Analysis
The previous article introduced the business background of the website clickstream data analysis project; this post continues with further background on website analysis.
I. Overall technical process and architecture
1.1. Data Processing Flow
This project is a pure data analysis project, and its overall process is basically b ...
Added by gonsman on Wed, 15 May 2019 19:12:18 +0300
Storage formats of Hive tables; using the ORC format
Hive tables support several storage formats for their source files:
1. TEXTFILE
The default format, used when no format is specified at table creation. When importing data, the data files are copied directly to HDFS for processing. Source files can be viewed directly with hadoop fs -cat
2. SEQUENCEFILE is a binary ...
Added by poppy28 on Sun, 12 May 2019 10:09:32 +0300