Basic introduction
Apache Tez is a data processing framework built on Apache Hadoop YARN that models a job as a directed acyclic graph (DAG) of tasks.
Main design themes:
- Empowering end users
  - Expressive dataflow definition APIs
  - Flexible Input-Processor-Output runtime model
  - Data type agnostic
  - Simple deployment
- Execution performance
  - Performance gains over MapReduce
  - Optimal resource management
  - Plan reconfiguration at run time
  - Dynamic physical data flow decisions
Because projects such as Apache Hive and Apache Pig can run a complex DAG of tasks on Tez, work that previously required multiple MapReduce jobs can now be done in a single Tez job, as shown below.

Download address
https://tez.apache.org/releases/index.html
Installation and deployment
Version compatibility
Tez 0.8.3 and later requires Apache Hadoop 2.6.0 or later. Tez 0.9.0 and later requires Apache Hadoop 2.7.0 or later. So before choosing a Tez release, determine which Hadoop version you are running.
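The version rule above can be checked mechanically. The following is a minimal sketch, assuming GNU `sort -V` is available for version comparison; the function names `version_ge`, `min_hadoop_for_tez`, and `tez_hadoop_compatible` are hypothetical helpers introduced here, not part of Tez or Hadoop.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes a sort that supports -V (version sort), as GNU sort does.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# min_hadoop_for_tez TEZ_VERSION: prints the minimum Hadoop version
# according to the two rules stated in this article.
min_hadoop_for_tez() {
  if version_ge "$1" "0.9.0"; then
    echo "2.7.0"
  elif version_ge "$1" "0.8.3"; then
    echo "2.6.0"
  else
    echo "unknown"   # older Tez releases are not covered by this article
  fi
}

# tez_hadoop_compatible TEZ_VERSION HADOOP_VERSION: succeeds when the
# Hadoop version meets the minimum for the given Tez version.
tez_hadoop_compatible() {
  min_required="$(min_hadoop_for_tez "$1")"
  [ "$min_required" != "unknown" ] && version_ge "$2" "$min_required"
}

# Example with the versions used in this article:
if tez_hadoop_compatible 0.9.2 3.2.0; then
  echo "tez 0.9.2 can be built against hadoop 3.2.0"
fi
```

On a real cluster, the Hadoop version to pass in can be read from the first line of `hadoop version`.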
Compiling the Tez source code against your Hadoop version
Compiling platform
Operating system: CentOS 7.6
CPU architecture: x86_64
Installing dependencies
First make sure the following are installed:
- JDK 8
- Maven 3
Installing protobuf-2.5.0
yum install protobuf protobuf-devel
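Before starting the build, it can help to confirm that the required tools are on the PATH. This is a minimal sketch; `check_tools` is a hypothetical helper name, and it only tests for presence, not for the exact versions listed above.

```shell
# check_tools NAME...: reports each tool as found or missing and
# returns non-zero if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage for this build (not executed here):
#   check_tools java mvn protoc
# Versions can then be inspected with:
#   java -version; mvn -version; protoc --version
```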
Source code compilation
Once you know which Hadoop version you are using, select a suitable Tez release and compile it from source. This article uses:
- tez-0.9.2
- hadoop-3.2.0
The steps below compile Tez 0.9.2 against Hadoop 3.2.0.
Download and extract the source
wget https://mirror.olnevhost.net/pub/apache/tez/0.9.2/apache-tez-0.9.2-src.tar.gz
tar zxvf apache-tez-0.9.2-src.tar.gz
Compile
cd apache-tez-0.9.2-src && mvn clean package -Dtar -Dhadoop.version=3.2.0 -DskipTests
After compilation, the package is at tez-dist/target/tez-0.9.2.tar.gz

Functional testing
First, make sure Hadoop is installed and working, including HDFS and YARN.
Reference: How to install Hadoop YARN
Upload tez-0.9.2.tar.gz to the /app/tez directory on HDFS:
hdfs dfs -put tez-0.9.2.tar.gz /app/tez/
Create a local tez directory, copy tez-0.9.2.tar.gz into it, and extract it:
mkdir -p /data/tez/conf
cp tez-0.9.2.tar.gz /data/tez
cd /data/tez && tar zxvf tez-0.9.2.tar.gz
Create a new tez-site.xml in /data/tez/conf with the following content:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>/app/tez/tez-0.9.2.tar.gz</value>
  </property>
</configuration>
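If you prefer to generate this file from a script, the sketch below writes a minimal tez-site.xml with a here-document. `write_tez_site` is a hypothetical helper name; it takes the configuration directory and the HDFS path of the Tez tarball as arguments, and it omits the license comment for brevity.

```shell
# write_tez_site CONF_DIR LIB_URI: creates CONF_DIR if needed and writes
# a tez-site.xml whose tez.lib.uris points at LIB_URI.
write_tez_site() {
  conf_dir="$1"
  lib_uri="$2"
  mkdir -p "$conf_dir"
  cat > "$conf_dir/tez-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${lib_uri}</value>
  </property>
</configuration>
EOF
}

# Usage with the paths from this article (not executed here):
#   write_tez_site /data/tez/conf /app/tez/tez-0.9.2.tar.gz
```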
Edit /etc/profile and add:
export TEZ_CONF_DIR=/data/tez/conf
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_CONF_DIR:/data/tez/*:/data/tez/lib/*
Edit mapred-site.xml and change
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
to
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn-tez</value>
</property>
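The mapred-site.xml change can also be scripted. The sketch below switches mapreduce.framework.name from yarn to yarn-tez, the value the Tez install guide uses to run MapReduce jobs on Tez. It assumes the `<name>` and `<value>` elements sit on consecutive lines, as in a conventionally formatted Hadoop config file; `enable_tez_for_mapreduce` is a hypothetical helper name.

```shell
# enable_tez_for_mapreduce FILE: on the line following the
# mapreduce.framework.name <name> element, replace the value "yarn"
# with "yarn-tez". A backup of the original is kept as FILE.bak.
enable_tez_for_mapreduce() {
  sed -i.bak \
    '/<name>mapreduce\.framework\.name<\/name>/{n;s#<value>yarn</value>#<value>yarn-tez</value>#;}' \
    "$1"
}

# Usage (not executed here; the path depends on your Hadoop install):
#   enable_tez_for_mapreduce "$HADOOP_HOME/etc/hadoop/mapred-site.xml"
```

Other properties whose value happens to be yarn are left untouched, because the substitution only applies to the line right after the matching name.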
Run the test job from the Hadoop installation directory:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar wordcount /test/ output-1
Results obtained:


This is an original article by "xiaozhch5" of the blog "From Big Data to Artificial Intelligence", published under the CC 4.0 BY-SA license. When reprinting, please include the original source link and this statement.
Original link: https://lrting.top/backend/2078/