Hive builds tables and imports data

Catalog

1. Tabulation statement

TABLE statement

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name      -- build(external)surface,(Existence of deletion tables)
  [(col_name data_type [COMMENT col_comment], ...)]     -- field
  [COMMENT table_comment]                               -- Table description
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]    -- partition
  [CLUSTERED BY (col_name, col_name, ...)                             -- bucket
  [SORTED BY (col_name [ASC|DESC], ...)]
  INTO num_buckets BUCKETS]
  [ROW FORMAT row_format]    -- Specify column separators
  [STORED AS file_format]    -- Loaded data type,compress
  [LOCATION hdfs_path]       -- file location

Field types supported by Hive

string
tinyint smallint int bigint
boolean
double
float

2. build tables

1. General table

The default data type is testfile, and internal tables are defaulted without external.

CREATE TABLE table1(id INT, name STRING) 
ROW FORMAT DELIMITED '\t';

2. external table

You need to specify the data location, deleting tables will only delete metadata of tables, not HDFS data.

CREATE EXTERNAL TABLE table1(id INT, name STRING) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'hdfs://...';

3. partition table

PARTITIONED BY specifies partitions that can have multiple partitions, each of which corresponds to a directory in HDFS.

CREATE TABLE table1(id INT, name STRING) PARTITIONED BY (partition STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

 

2. Importing data

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename 
[PARTITION (partcol1=val1, partcol2=val2 ...)]
set hive.exec.dynamic.partition=true

1. Load data

-- Import HDFS Data to specified table
LOAD DATA INPATH '/tmp/app.log' OVERWRITE INTO TABLE cp_app;

-- Import server data to the specified partition
LOAD DATA LOCAL INPATH '/tmp/app.log.1' OVERWRITE INTO TABLE cp_app PARTITION(`date`='');

2. Insert ... Select

INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] 
select_statement1 FROM from_statement 

-- Example
insert overwrite table cp_app partition(`date`) select * from cp_app_log;
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] 
select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION ...] select_statement2] ...

-- Example
from cp_app_log
insert overwrite table cp_app partition(`date`)
selet * from

3. alter partition operation

-- Add partition(And load the data to the partition)
ALTER TABLE table_name ADD partition_spec [ LOCATION 'location1' ];

-- delete a partition
ALTER TABLE table_name drop partition (partcol1[=value1]);

 

Keywords: hive

Added by xt3mp0r~ on Wed, 02 Oct 2019 11:05:23 +0300