explode(Official website link)
Expand is a UDTF (table generation function) that converts a single input row to multiple output rows. Generally, it is used in combination with general view, mainly in two ways:
Input type |
Usage method |
describe |
T |
explode(ARRAY<T> a) |
Decompose the array into multiple rows, return a single column and multiple rows, each row represents an element of the array |
Tkey,Tvalue |
explode(MAP<Tkey,Tvalue> m) |
Decompose the MAP into multiple rows. The returned rows have two columns (key value). Each row represents a key value pair in the input |
Example:
explode(array)
hive (default)> select explode(array('A','B','C')); OK col A B C Time taken: 0.402 seconds, Fetched: 3 row(s) hive (default)> select explode(array('A','B','C')) as col1; OK col1 A B C Time taken: 0.145 seconds, Fetched: 3 row(s) hive (default)> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf; OK tf.col A B C Time taken: 0.191 seconds, Fetched: 3 row(s) hive (default)> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col1; OK tf.col1 A B C
explode(map)
hive (default)> select explode(map('A',10,'B',20,'C',30)); OK key value A 10 B 20 C 30 Time taken: 0.153 seconds, Fetched: 3 row(s) hive (default)> select explode(map('A',10,'B',20,'C',30)) as (my_key,my_value); OK my_key my_value A 10 B 20 C 30 Time taken: 0.137 seconds, Fetched: 3 row(s) hive (default)> select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf; OK tf.key tf.value A 10 B 20 C 30 Time taken: 0.128 seconds, Fetched: 3 row(s) hive (default)> select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as my_key,my_value; OK tf.my_key tf.my_value A 10 B 20 C 30 Time taken: 0.109 seconds, Fetched: 3 row(s)
LateralView(Official website link)
grammar
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
describe
lateralview is used in conjunction with a user-defined table generation function (UDTF), such as expand, which generates zero or more output rows for each input row. lateralview first applies UDTF to each row of the underlying table, and then joins the resulting output row with the input row to form a virtual table with the provided table alias.
Example
The base table, pageads, has two columns: pageid (page name) and add list (an array of ads displayed on the page).
hive (test)> !cat pageads > ; front_page 1,2,3 contact_page 3,4,5 hive (test)> create table pageads( > pageid string, > add_list array<int>) > ROW FORMAT delimited > fields terminated by '\t' > collection items terminated by ',' > lines terminated by '\n' > ; OK Time taken: 0.099 seconds hive (test)> load data local inpath '/home/hadoop/pageads' into table pageads; Loading data to table test.pageads OK Time taken: 0.331 seconds hive (test)> select * from pageads; OK pageads.pageid pageads.add_list front_page [1,2,3] contact_page [3,4,5] Time taken: 0.106 seconds, Fetched: 2 row(s) hive (test)> select pageid,addid from pageads lateral view explode(add_list) adTable as addid; OK pageid addid front_page 1 front_page 2 front_page 3 contact_page 3 contact_page 4 contact_page 5 Time taken: 0.105 seconds, Fetched: 6 row(s) hive (test)> select addid,count(1) from pageads lateral view explode(add_list) adTable as addid group by addid; ....... OK addid _c1 1 1 2 1 3 2 4 1 5 1
Multiple lateralview s
The from clause can have multiple lateralview clauses. Subsequent lateralview clauses can refer to any column in the table to the left of lateralview. For example, the following query:
SELECT * FROM exampleTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(myCol1) myTable2 AS myCol2;
Note that the later view clause is applied in the order in which it appears.
Example
hive (test)> !cat basetable; 1,2 a,b,c 3,4 d,e,f hive (test)> create table basetable( > col1 array<int>, > col2 array<string>) > ROW FORMAT delimited > fields terminated by '\t' > collection items terminated by ',' > lines terminated by '\n' > ; OK Time taken: 0.113 seconds hive (test)> load data local inpath '/home/hadoop/basetable' into table basetable; Loading data to table test.basetable OK Time taken: 0.329 seconds hive (test)> select * from basetable; OK basetable.col1 basetable.col2 [1,2] ["a","b","c"] [3,4] ["d","e","f"] Time taken: 0.104 seconds, Fetched: 2 row(s) hive (test)> SELECT myCol1, col2 FROM basetable > LATERAL VIEW explode(col1) myTable1 AS myCol1; OK mycol1 col2 1 ["a","b","c"] 2 ["a","b","c"] 3 ["d","e","f"] 4 ["d","e","f"] Time taken: 0.089 seconds, Fetched: 4 row(s) hive (test)> SELECT myCol1, myCol2 FROM baseTable > LATERAL VIEW explode(col1) myTable1 AS myCol1 > LATERAL VIEW explode(col2) myTable2 AS myCol2; OK mycol1 mycol2 1 a 1 b 1 c 2 a 2 b 2 c 3 d 3 e 3 f 4 d 4 e 4 f Time taken: 0.093 seconds, Fetched: 12 row(s) Outer Lateral Views hive (test)> SELECT * FROM basetable LATERAL VIEW explode(array()) C AS a limit 10; OK basetable.col1 basetable.col2 c.a Time taken: 0.063 seconds hive (test)> SELECT * FROM basetable LATERAL VIEW OUTER explode(array()) C AS a limit 10; OK basetable.col1 basetable.col2 c.a [1,2] ["a","b","c"] NULL [3,4] ["d","e","f"] NULL Time taken: 0.092 seconds, Fetched: 2 row(s)