Development of Big Data Module--Statistical Analysis

Once the data warehouse is built, users can write Hive SQL statements to query and analyze the data. In production, the statistical indicators to compute are usually proposed by the departments that need the data, and new statistical requirements keep emerging. The following are some typical indicators in website ...

Added by davey10101 on Fri, 23 Aug 2019 06:58:37 +0300

Hive Installation & First Experience

Download & Unzip: Download Hive 1.2.1 from https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz Then extract it to the target directory with the following command: tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /root/apps/ Then rename the directory with the following command: mv apache-hive- ...

Added by jimmyhumbled on Mon, 15 Jul 2019 23:25:59 +0300

Xiaobai's Reasoning About Hive Database Optimization

Xiaobai previously worked with relational databases such as Oracle and summed up the key to relational database optimization: read the execution plan. Oracle is a mature product. Its execution plans fall into several categories, real and virtual. By examining the different kinds of execution plan data, we can grasp the vast majority of sql data from inp ...

Added by Mirkules on Sun, 19 May 2019 14:22:20 +0300

Big Data Tutorial (14.2) Website Data Analysis

The previous article introduced the business background of the website clickstream data analysis project; this article continues with more background on website analysis. I. Overall technical process and architecture 1.1. Data Processing Flow This project is a pure data analysis project, and its overall process is basically b ...

Added by gonsman on Wed, 15 May 2019 19:12:18 +0300

Storage formats of Hive tables; using the ORC format

Hive tables support several source file storage formats: 1. TEXTFILE: the default format, used when no format is specified at table creation. When data is imported, the data files are copied directly to HDFS for processing. Source files can be viewed directly with hadoop fs -cat. 2. SEQUENCEFILE is a binary ...

Added by poppy28 on Sun, 12 May 2019 10:09:32 +0300