Catalog
- 1. Tabulation statement
- 2. build tables
- 1. General table
- 2. external table
- 3. partition table
- 2. Importing data
- 1. Load data
- 2. Insert ... Select
- 3. alter partition operation
1. Tabulation statement
TABLE statement
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name -- build(external)surface,(Existence of deletion tables) [(col_name data_type [COMMENT col_comment], ...)] -- field [COMMENT table_comment] -- Table description [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)] -- partition [CLUSTERED BY (col_name, col_name, ...) -- bucket [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] [ROW FORMAT row_format] -- Specify column separators [STORED AS file_format] -- Loaded data type,compress [LOCATION hdfs_path] -- file location
Field types supported by Hive
string tinyint smallint int bigint boolean double float
2. build tables
1. General table
The default data type is testfile, and internal tables are defaulted without external.
CREATE TABLE table1(id INT, name STRING) ROW FORMAT DELIMITED '\t';
2. external table
You need to specify the data location, deleting tables will only delete metadata of tables, not HDFS data.
CREATE EXTERNAL TABLE table1(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 'hdfs://...';
3. partition table
PARTITIONED BY specifies partitions that can have multiple partitions, each of which corresponds to a directory in HDFS.
CREATE TABLE table1(id INT, name STRING) PARTITIONED BY (partition STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
2. Importing data
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
set hive.exec.dynamic.partition=true
1. Load data
-- Import HDFS Data to specified table LOAD DATA INPATH '/tmp/app.log' OVERWRITE INTO TABLE cp_app; -- Import server data to the specified partition LOAD DATA LOCAL INPATH '/tmp/app.log.1' OVERWRITE INTO TABLE cp_app PARTITION(`date`='');
2. Insert ... Select
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement -- Example insert overwrite table cp_app partition(`date`) select * from cp_app_log;
FROM from_statement INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 [INSERT OVERWRITE TABLE tablename2 [PARTITION ...] select_statement2] ... -- Example from cp_app_log insert overwrite table cp_app partition(`date`) selet * from
3. alter partition operation
-- Add partition(And load the data to the partition) ALTER TABLE table_name ADD partition_spec [ LOCATION 'location1' ]; -- delete a partition ALTER TABLE table_name drop partition (partcol1[=value1]);