DDL and DML data operations of Hive
Contents
1, DDL(Data Definition Language)
1. Create database
2. Query database
3. Modify database
4. Delete database
5. Create table
Management table (internal table)
External table
Conversion between two tables
Modify table
Delete table
Clear table
2, DML data operation
1. Data import
Load data into a table (load)
Insert data ...
Added by jtapoling on Thu, 23 Dec 2021 19:57:52 +0200
Hive DDL data definition
DDL data definition
Create database
CREATE DATABASE [IF NOT EXISTS] database_name
[COMMENT database_comment] -- description of the database
[LOCATION hdfs_path] -- HDFS path where the database directory is stored
[WITH DBPROPERTIES (property_name=property_value, ...)]; -- additional key-value properties
1) Create a database. The default storage path of the database on HDFS is / ...
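For illustration, a statement that fills in every optional clause of the syntax above might look like the sketch below. The database name, comment, HDFS path, and property are all hypothetical and not taken from the original post.

```sql
-- Hypothetical names and path, chosen only for illustration.
CREATE DATABASE IF NOT EXISTS demo_db
COMMENT 'Scratch database for examples'
LOCATION '/tmp/hive_demo/demo_db.db'
WITH DBPROPERTIES ('owner' = 'demo_user');

-- Inspect the stored comment, location, and properties.
DESCRIBE DATABASE EXTENDED demo_db;
```

IF NOT EXISTS makes the statement safe to re-run, because an existing database of the same name no longer raises an error.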
Added by akillez on Wed, 22 Dec 2021 20:39:21 +0200
One SQL trick a day: how to use HQL to extract string elements from fixed positions [Hive string position lookup functions explained]
Contents
0 problem description
1 problem solving
2 Summary
0 problem description
Extracting string elements from a fixed location with SQL. You have a string that contains a continuous piece of log data, and you want to parse it and extract some information from it. However, the information you need does not exist in the fixed position of t ...
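As a rough sketch of the position-lookup idea, and not necessarily the approach the post takes, Hive's built-in instr, locate, and substr functions can be combined to pull out a value whose offset varies from row to row. The log line, the `user=` key, and the `;` delimiter are assumptions made up for this example. It also assumes a Hive version that allows SELECT without a FROM clause.

```sql
-- Hypothetical log layout: key=value pairs separated by ';'.
SELECT substr(log_line,
              val_start,
              locate(';', log_line, val_start) - val_start) AS user_name
FROM (
  -- val_start is the 1-based position of the first character after 'user='.
  SELECT log_line,
         instr(log_line, 'user=') + length('user=') AS val_start
  FROM (SELECT 'ts=2021-12-22;user=alice;status=ok' AS log_line) src
) t;
```

For this sample line the query should yield `alice`. It does not handle rows where the key or the trailing delimiter is missing; those cases would need an extra guard.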
Added by colbyg on Wed, 22 Dec 2021 20:23:21 +0200
Hive - common tuning methods
Enterprise level tuning
Explain
Basic syntax
EXPLAIN [EXTENDED | DEPENDENCY | AUTHORIZATION] query
Worked example
(1) View the execution plan of the following statement
No MR task was generated
hive (default)> explain select * from emp;
OK
Explain
STAGE DEPENDENCIES:
Stage-0 is a root stage // A stage
STAGE PLANS:
Stage: Stage ...
Added by calumstevens on Mon, 20 Dec 2021 14:23:08 +0200
HiveSQL advanced -- 10: Testing whether a value exists in a set of data
0 - problem description
Using SQL to test whether a value exists in a group of data. You want to generate a Boolean value according to whether a group of rows contains a specific value. Imagine a student who takes several exams over a period of time. Suppose he takes three exams every three months. As long as he has pas ...
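One common way to derive such a per-group flag is conditional aggregation, sketched below. This is an illustration under assumptions and may differ from the solution in the post: the inline rows, the `student_id` and `passed` columns, and the `exam_results` alias are all hypothetical.

```sql
-- Hypothetical data: one row per exam, passed = 1 means the exam was passed.
SELECT student_id,
       -- MAX over a 0/1 marker is 1 exactly when at least one row matched.
       MAX(CASE WHEN passed = 1 THEN 1 ELSE 0 END) = 1 AS passed_any
FROM (
  SELECT 1 AS student_id, 0 AS passed
  UNION ALL SELECT 1 AS student_id, 1 AS passed
  UNION ALL SELECT 2 AS student_id, 0 AS passed
) exam_results
GROUP BY student_id;
```

The CASE expression turns each row into a 0/1 marker, so the aggregate answers "does any row in this group match?" without a self-join or a subquery.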
Added by Satria Ox41464b on Thu, 16 Dec 2021 13:58:28 +0200
Sogou log query analysis (MapReduce+Hive+idea comprehensive experiment)
Prerequisites:
Install Hadoop 2.7.3 (on Linux)
Install MySQL (on Windows or Linux)
Install Hive (on Linux); reference: Hive installation configuration
Title:
Download search data from Sogou lab for analysis
The downloaded data contains 6 fields, and the data format is described as follows:
Access time user ID ...
Added by cheekychop on Sun, 12 Dec 2021 13:58:21 +0200
Summary of integrating Spark with Hive
I won't say much about installing Spark here~
!!! Look! To install MySQL and Hive:
Install the RPM package and download MySQL:
sudo yum localinstall https://repo.mysql.com//mysql80-community-release-el7-1.noarch.rpm
sudo yum install mysql-community-server
Start the MySQL service and view its status:
systemctl start mysqld.service
service ...
Added by jaimitoc30 on Tue, 07 Dec 2021 22:24:11 +0200
Hive data types
1. Basic data types
Hive data type | Java data type | Length | Example
TINYINT | byte | 1-byte signed integer | 20
SMALLINT | short | 2-byte signed integer | 20
INT | int | 4-byte signed integer | 20
BIGINT | long | 8-byte signed integer | 20
BOOLEAN | boolean | Boolean type, true or false | TRUE FALSE
FLOAT | float | Single-precision floating-point | 3.14159
DOUBLE | doubl ...
Added by choubix on Sun, 05 Dec 2021 16:11:35 +0200
[Hive] A record of the Hive commands used at work
Preliminary commands
Initialize the metadata information
schematool -dbType mysql -initSchema
Start the Hive metastore and hiveserver2
nohup hive --service metastore 1>/mnt/metastore.log 2>&1 &
nohup hive --service hiveserver2 1>/mnt/hiveserver2.log 2>&1 &
1. Connect to Hive
Connect to Hive with beeline without authentication
beeline -u "jdbc:hive2:// ...
Added by Ellypsys on Tue, 30 Nov 2021 13:30:36 +0200
Flink Core Programming
Flink Core Programming
1. Environment
When a Flink job is submitted for execution, it must first establish a connection with the Flink framework, that is, with the current Flink runtime environment; tasks can be scheduled to different TaskManagers for execution only once the environment information is available. This environment object is relatively simp ...
Added by MadRhino on Wed, 24 Nov 2021 22:44:02 +0200