DDL and DML data operations of Hive

Contents: 1. DDL (Data Definition Language): 1. Create database 2. Query databases 3. Modify database 4. Delete database 5. Create table (managed/internal tables, external tables, converting between the two, modify table, delete table, truncate table) 2. DML data operations: 1. Data import: load data into a table (load), insert data ...
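
As a quick illustration of the operations in this list, a minimal HiveQL sketch follows; the database, table, and file names (test_db, student, /opt/data/student.txt) are assumptions made for the example, not taken from the article.

CREATE DATABASE IF NOT EXISTS test_db;                               -- create database
USE test_db;
CREATE TABLE IF NOT EXISTS student (id INT, name STRING)            -- managed (internal) table
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA LOCAL INPATH '/opt/data/student.txt' INTO TABLE student;  -- data import (load)
INSERT INTO TABLE student VALUES (1001, 'zhangsan');                -- insert data
TRUNCATE TABLE student;                                             -- clear table (managed tables only)
DROP TABLE student;                                                 -- delete table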

Added by jtapoling on Thu, 23 Dec 2021 19:57:52 +0200

Hive DDL data definition

DDL data definition. Create database: CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment] // comment [LOCATION hdfs_path] // the HDFS path where the database directory is stored [WITH DBPROPERTIES (property_name=property_value, ...)]; // other properties. 1) Create a database. The default storage path of the database on HDFS is / ...
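
One possible concrete use of the syntax above; the database name, HDFS path, and property are illustrative, not from the article.

CREATE DATABASE IF NOT EXISTS db_hive
COMMENT 'demo database'
LOCATION '/user/hive/warehouse/db_hive.db'
WITH DBPROPERTIES ('createtime' = '2021-12-22');
-- DESC DATABASE EXTENDED db_hive;  -- shows the comment, location and properties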

Added by akillez on Wed, 22 Dec 2021 20:39:21 +0200

One SQL trick a day: how to use HQL to extract string elements from fixed positions [explaining Hive's string position lookup functions]

Contents: 0 Problem description 1 Solution 2 Summary. 0 Problem description: extracting string elements from a fixed position in SQL. You have a string that contains a continuous piece of log data. You want to parse the string and extract some information from it. However, the information you need is not at a fixed position in t ...
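
A sketch of the position-lookup approach the title refers to, combining instr()/locate() with substr(); the sample log line and the 'uid=' marker are invented for illustration.

WITH t AS (SELECT 'ts=2021-12-22 level=INFO uid=10086 msg=ok' AS log_line)
SELECT substr(log_line,
              instr(log_line, 'uid=') + 4,   -- skip the 4-character 'uid=' marker
              locate(' ', log_line, instr(log_line, 'uid=')) - instr(log_line, 'uid=') - 4) AS uid
FROM t;
-- when the element is not at a fixed position, regexp_extract(log_line, 'uid=(\\d+)', 1) is the usual fallback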

Added by colbyg on Wed, 22 Dec 2021 20:23:21 +0200

Hive - common tuning methods

Enterprise-level tuning. Explain. Basic syntax: EXPLAIN [EXTENDED | DEPENDENCY | AUTHORIZATION] query. Case practice: (1) View the execution plan of the following statement (no MR task is generated): hive (default)> explain select * from emp; OK Explain STAGE DEPENDENCIES: Stage-0 is a root stage // a stage STAGE PLANS: Stage: Stage ...
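
For contrast with the plan shown above, a grouping query does produce a map-reduce stage, and EXTENDED prints a more detailed plan; emp and deptno are assumed to be the demo table and column used throughout the article.

hive (default)> explain select deptno, count(*) from emp group by deptno;
hive (default)> explain extended select * from emp;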

Added by calumstevens on Mon, 20 Dec 2021 14:23:08 +0200

HiveSQL advanced -- 10: test whether a value exists in a set of data

0 Problem description. Testing with SQL whether a value exists in a group of data: you want to generate a Boolean value according to whether a group of rows contains a specific value. Imagine an example where a student takes several exams over a period of time. Suppose he takes three exams every three months. As long as he has pas ...
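
A common way to turn "does any row match?" into a Boolean, following the exam example above; the exams table, its columns, and the pass mark of 60 are assumptions for this sketch.

SELECT student_id,
       max(CASE WHEN score >= 60 THEN 1 ELSE 0 END) = 1 AS passed_at_least_one_exam
FROM exams
GROUP BY student_id;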

Added by Satria Ox41464b on Thu, 16 Dec 2021 13:58:28 +0200

Sogou log query analysis (MapReduce + Hive + IDEA comprehensive experiment)

Prerequisites: install Hadoop 2.7.3 (on Linux), install MySQL (on Windows or Linux), install Hive (on Linux). Reference: Hive installation and configuration. Task: download search data from Sogou Lab for analysis. The downloaded data contains 6 fields, and the data format is described as follows: access time, user ID ...
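
A sketch of how such a tab-separated log file is typically mapped onto a Hive table; only the first two fields are named in the excerpt, so every identifier, the HDFS path, and the remaining columns are placeholders.

CREATE EXTERNAL TABLE IF NOT EXISTS sogou_log (
  access_time STRING,
  user_id     STRING
  -- the other four fields of the dataset would be declared here
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/sogou/data';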

Added by cheekychop on Sun, 12 Dec 2021 13:58:21 +0200

Summary of integrating Spark with Hive

I won't say much about installing Spark here. To install MySQL and Hive: install the RPM repository package and download MySQL: sudo yum localinstall https://repo.mysql.com//mysql80-community-release-el7-1.noarch.rpm sudo yum install mysql-community-server Start the MySQL service and check its status: systemctl start mysqld.service service ...
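
Once mysql-community-server is running, the Hive metastore database is usually prepared with statements along these lines; the database name, user, and password are examples, not taken from the article.

CREATE DATABASE IF NOT EXISTS metastore;            -- backing database for the Hive metastore
CREATE USER 'hive'@'%' IDENTIFIED BY 'Hive_1234!';  -- example credentials only
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;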

Added by jaimitoc30 on Tue, 07 Dec 2021 22:24:11 +0200

Hive data types

1. Basic data types
Hive data type | Java data type | Length | Example
TINYINT  | byte    | 1-byte signed integer           | 20
SMALLINT | short   | 2-byte signed integer           | 20
INT      | int     | 4-byte signed integer           | 20
BIGINT   | long    | 8-byte signed integer           | 20
BOOLEAN  | boolean | Boolean type, true or false     | TRUE / FALSE
FLOAT    | float   | single-precision floating point | 3.14159
DOUBLE   | doubl ...
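
A small table that exercises the basic types listed above; the table and column names are made up for illustration.

CREATE TABLE type_demo (
  tiny_col   TINYINT,
  small_col  SMALLINT,
  int_col    INT,
  big_col    BIGINT,
  bool_col   BOOLEAN,
  float_col  FLOAT,
  double_col DOUBLE
);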

Added by choubix on Sun, 05 Dec 2021 16:11:35 +0200

[Hive] A record of Hive commands used at work

Preliminary commands. Initialize the metastore schema: schematool -dbType mysql -initSchema Start the metastore and hiveserver2: nohup hive --service metastore 1>/mnt/metastore.log 2>&1 & nohup hive --service hiveserver2 1>/mnt/hiveserver2.log 2>&1 & 1. Connect to Hive with beeline (no credentials): beeline -u "jdbc:hive2:// ...
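
After beeline connects, a quick sanity check that the schema initialized by schematool is usable; the database name is purely illustrative.

SHOW DATABASES;
CREATE DATABASE IF NOT EXISTS smoke_test;
SHOW TABLES IN smoke_test;
DROP DATABASE smoke_test;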

Added by Ellypsys on Tue, 30 Nov 2021 13:30:36 +0200

Flink Core Programming

Flink Core Programming. 1. Environment. When a Flink job is submitted for execution, it first establishes a connection with the Flink framework, that is, the current Flink runtime environment; tasks can be scheduled to the different TaskManagers for execution only if this environment information is available. This environment object is relatively simp ...

Added by MadRhino on Wed, 24 Nov 2021 22:44:02 +0200