Introduction to Hadoop cluster configuration
Hadoop overview
Hadoop composition
HDFS Architecture Overview
Hadoop Distributed File System (HDFS for short) is a distributed file system.
NameNode (nn): stores file metadata, such as the file name, the directory structure, file attributes (creation time, number of replicas, file permissions), the block list of each file, DataNo ...
Added by Irap on Thu, 24 Feb 2022 08:51:49 +0200
Hadoop ecosystem - MapReduce Job submission source code analysis
1. Debug environment preparation
1.1 Debug code: WordCount, the classic introductory MapReduce example
1.1.1 Mapper class
public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String ...
Added by zuzupus on Sun, 06 Feb 2022 08:42:00 +0200
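The Mapper excerpt above is cut off before its body, so its exact logic is not shown. As a rough illustration of the data flow WordCount normally follows, here is a minimal plain-Java sketch that assumes the usual approach: the map step emits a (word, 1) pair per token and the reduce step sums the pairs per word. It uses no Hadoop classes; the class name WordCountSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the WordCount data flow; no Hadoop classes are used.
class WordCountSketch {

    // "Map" step: emit a (word, 1) pair for every whitespace-separated token in one line.
    static List<Map.Entry<String, Long>> map(String line) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1L));
            }
        }
        return pairs;
    }

    // "Reduce" step: sum the counts emitted for each distinct word.
    static Map<String, Long> reduce(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> counts = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return counts;
    }

    // Run the map step on every line, then reduce all emitted pairs together.
    static Map<String, Long> countWords(List<String> lines) {
        List<Map.Entry<String, Long>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // Prints {hadoop=1, hello=2, mapreduce=1}
        System.out.println(countWords(List.of("hello hadoop", "hello mapreduce")));
    }
}
```

In a real job the framework performs the grouping between the two steps; here a single in-memory list stands in for that shuffle.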
Some experience of using Hadoop
Some experience on the use of HDFS
Foreword:
I've been working on big data at my company for some time, so I am taking the time to sort out the problems I have encountered and some of the better optimization methods.
1. HDFS multi-directory storage
1.1 Production server disks
1.2 Configure multiple directories in the hdfs-site.xml file, and pay attention t ...
Added by Soldier Jane on Fri, 28 Jan 2022 02:06:47 +0200
[hadoop job] Call MapReduce to count the number of occurrences of each word in the file
1, Environment introduction
Install an Ubuntu virtual machine using VirtualBox. In Ubuntu, install Hadoop and the Eclipse 3.8 compiler. Download and install the Java environment (the JDK) and complete the pseudo-distributed configuration of Hadoop. Import into Eclipse all the JAR packages the compiler requires. Start Had ...
Added by IRON FART on Tue, 04 Jan 2022 09:13:29 +0200
The simplest way to optimize service response time, bar none
Preface - From Wan Junfeng Kevin
The average latency of our services is about 30 ms. A major reason is that we make extensive use of MapReduce techniques, so that even when a service calls many other services, its latency usually depends only on the slowest request.
For your existing services, you do not need to opti ...
Added by freakuency on Sun, 02 Jan 2022 19:58:37 +0200
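The claim in the excerpt above, that total latency depends only on the slowest request when downstream calls run concurrently, can be illustrated with a small sketch. This is not the MapReduce utility the article refers to; it is a stand-in built on the JDK's CompletableFuture, and fakeCall, callAll, and the delays are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ParallelCallsSketch {

    // Hypothetical stand-in for a downstream service call that takes delayMillis to answer.
    static Supplier<String> fakeCall(String name, long delayMillis) {
        return () -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return name;
        };
    }

    // Start every call at once, then wait for all of them; results keep the input order.
    static List<String> callAll(List<Supplier<String>> calls) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            List<CompletableFuture<String>> futures = calls.stream()
                    .map(call -> CompletableFuture.supplyAsync(call, pool))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = callAll(List.of(
                fakeCall("user", 30), fakeCall("order", 20), fakeCall("stock", 10)));
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // Elapsed time should be close to the slowest call (30 ms), not the 60 ms sum.
        System.out.println(results + " in about " + elapsedMillis + " ms");
    }
}
```

One thread per call keeps the sketch simple; a production service would more likely share a bounded pool and add timeouts.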
Hadoop data compression: related algorithms and a MapReduce code example
Check which compression algorithms Hadoop supports
[lqs@bdc112 hadoop-3.1.3]$ bin/hadoop checknative
2021-12-15 16:20:12,342 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2021-12-15 16:20:12,345 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2021-12-15 16:20:12,3 ...
Added by thebighere on Wed, 15 Dec 2021 19:16:34 +0200
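The checknative output above reports the zlib library, which implements the DEFLATE algorithm behind the gzip format. To give a feel for what such a codec does to data, here is a minimal round-trip sketch using only the JDK's java.util.zip classes. It does not use Hadoop's own codec classes, and the class name GzipRoundTripSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipRoundTripSketch {

    // Compress a byte array into gzip format held in memory.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Decompress gzip-format bytes back into the original byte array.
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Highly repetitive input, so it should shrink considerably.
        byte[] raw = "hadoop ".repeat(1000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        byte[] unpacked = decompress(packed);
        System.out.println(raw.length + " bytes -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(raw, unpacked));
    }
}
```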
Sogou log query analysis (MapReduce + Hive + IDEA comprehensive experiment)
Prerequisites:
Install Hadoop 2.7.3 (on Linux)
Install MySQL (on Windows or Linux)
Install Hive (on Linux). Reference: Hive installation configuration
Title:
Download search data from Sogou lab for analysis
The downloaded data contains 6 fields, and the data format is described as follows:
Access time user ID ...
Added by cheekychop on Sun, 12 Dec 2021 13:58:21 +0200
MapReduce program 3 in a Maven project: computing the total salary of employees in each department (optimized)
This article builds on the program that computes the total salary of employees in each department. If you have not implemented it yet, please refer to: Realize the function of counting the total salary of employees in each department
Optimization items:
1. Use serialization
2. Implement a custom partitioner
3. Use a Combiner on the map side
...
Added by baw on Sun, 05 Dec 2021 11:00:33 +0200
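The partitioning and Combiner items listed above can be pictured with a small plain-Java sketch. It is not the article's code, which is truncated here. It assumes records of (department number, salary), mirrors the arithmetic of Hadoop's default hash partitioning rather than any custom rule the article may define, and models the Combiner as a per-split pre-sum. All names and sample figures are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SalaryTotalsSketch {

    // Hypothetical input record: one employee's department number and salary.
    static final class Employee {
        final int deptNo;
        final long salary;

        Employee(int deptNo, long salary) {
            this.deptNo = deptNo;
            this.salary = salary;
        }
    }

    // Same arithmetic as Hadoop's default hash partitioning:
    // clear the sign bit of the key's hash, then take the remainder by the reducer count.
    static int partitionFor(int deptNo, int numReducers) {
        return (Integer.hashCode(deptNo) & Integer.MAX_VALUE) % numReducers;
    }

    // Combiner step: pre-sum salaries per department within one map task's output,
    // so fewer records have to be shuffled to the reducers.
    static Map<Integer, Long> combine(List<Employee> mapOutput) {
        Map<Integer, Long> partial = new TreeMap<>();
        for (Employee e : mapOutput) {
            partial.merge(e.deptNo, e.salary, Long::sum);
        }
        return partial;
    }

    // Reduce step: merge the partial sums coming from all map tasks.
    static Map<Integer, Long> reduce(List<Map<Integer, Long>> partials) {
        Map<Integer, Long> totals = new TreeMap<>();
        for (Map<Integer, Long> partial : partials) {
            partial.forEach((dept, sum) -> totals.merge(dept, sum, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Employee> split1 = List.of(
                new Employee(10, 5000), new Employee(20, 3000), new Employee(10, 1300));
        List<Employee> split2 = List.of(
                new Employee(20, 800), new Employee(30, 2850));
        Map<Integer, Long> totals = reduce(List.of(combine(split1), combine(split2)));
        System.out.println(totals);              // {10=6300, 20=3800, 30=2850}
        System.out.println(partitionFor(10, 3)); // 1
    }
}
```

Pre-summing is safe here because addition is associative and commutative, which is the property a Combiner relies on.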
[Introduction to Cloud Computing Experiment 3] MapReduce programming
Prerequisite environment
You need to set up a Hadoop pseudo-distributed cluster. For instructions, see the tutorial: Quick Start Tutorial for Hadoop Big Data Technology and Pseudo-Distributed Clustering
Eclipse Environment Configuration
Eclipse (local Windows system)
1. Install plug-ins:
hadoop-eclipse-plugin-2.7.3.jar
Address: https:// ...
Added by bc2013 on Thu, 25 Nov 2021 20:05:01 +0200
Hadoop: implementing word counting with MapReduce (illustrated, very detailed version)
1, Previously
The last article introduced how to call the MapReduce API and how to configure Eclipse. This time, we will use MapReduce to count the words in English article files!
You are welcome to read my previous article: MapReduce related eclipse configuration and Api call
2, Preconditions
Installation required | Download method
IDEA | O ...
Added by webdes03 on Wed, 10 Nov 2021 21:15:32 +0200