Getting started with Hadoop cluster configuration

Hadoop overview. Hadoop composition. HDFS architecture overview: the Hadoop Distributed File System (HDFS for short) is a distributed file system. NameNode (nn): stores file metadata, such as the file name, directory structure, file attributes (creation time, number of replicas, file permissions), the block list of each file, DataNo ...

Added by Irap on Thu, 24 Feb 2022 08:51:49 +0200

Hadoop ecosystem - MapReduce Job submission source code analysis

1. Debug environment preparation 1.1 Debug code: the classic introductory MapReduce example, WordCount 1.1.1 Mapper class public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> { @Override protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String ...

Added by zuzupus on Sun, 06 Feb 2022 08:42:00 +0200

Some experience with using Hadoop

Some experience with using HDFS. Foreword: I have been working on big data at my company for some time, so I am taking the time to sort out the problems I ran into and some better optimization methods. 1. HDFS multi-directory storage 1.1 Production server disks 1.2 Configure multiple directories in the hdfs-site.xml file, and pay attention t ...

Added by Soldier Jane on Fri, 28 Jan 2022 02:06:47 +0200

[hadoop job] Call MapReduce to count the number of occurrences of each word in the file

1. Environment introduction: Install an Ubuntu virtual machine using VirtualBox. Install Hadoop and Eclipse 3.0 in Ubuntu 8 compiler. Download and install the Java environment (JDK) and complete the pseudo-distributed configuration of Hadoop. Import all the JAR packages the compiler requires into Eclipse. Start Had ...

Added by IRON FART on Tue, 04 Jan 2022 09:13:29 +0200
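
The word-count job described in the entry above follows the usual map, shuffle, reduce flow. What follows is a minimal in-memory sketch of that flow in plain Java. It is not the post's code and does not use the Hadoop API: the class name WordCountSketch, its methods, and the sample lines are all hypothetical, and words are assumed to be separated by whitespace.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of the WordCount data flow; hypothetical names, no Hadoop classes.
final class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Long>> map(String line) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1L));
            }
        }
        return pairs;
    }

    // Shuffle + reduce phase: group the pairs by word and sum the counts per word.
    static Map<String, Long> reduce(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>(); // keys kept in sorted order
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    // Runs the map step on every line, then reduces all emitted pairs together.
    static Map<String, Long> wordCount(List<String> lines) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        List<String> lines = List.of("hello hadoop", "hello mapreduce");
        System.out.println(wordCount(lines)); // prints {hadoop=1, hello=2, mapreduce=1}
    }
}
```

In a real job the two steps would live in Mapper and Reducer subclasses, and the framework would do the grouping between them across machines.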

The simplest way to optimize service response time, bar none

Preface (from Wan Junfeng, Kevin): The average latency of our services is about 30 ms. A major reason is that we make extensive use of the MapReduce technique, so that even when a service calls many other services, its response time often depends only on the slowest request. For your existing services, you do not need to opti ...

Added by freakuency on Sun, 02 Jan 2022 19:58:37 +0200
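
The claim in the entry above, that a service calling many others need only wait for the slowest call, can be illustrated with a small fan-out sketch. This is an assumption-laden illustration in plain Java, not the author's implementation: ParallelCallsSketch, fakeCall, callAll, the service names, and the delays are all hypothetical, and each downstream call is simulated with a sleep.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical fan-out/fan-in sketch; downstream services are simulated with sleeps.
final class ParallelCallsSketch {

    // Simulated downstream call: sleeps for delayMillis, then returns its name.
    static Supplier<String> fakeCall(String name, long delayMillis) {
        return () -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return name;
        };
    }

    // "Map": start every call concurrently. "Reduce": wait for all and collect results in order.
    static List<String> callAll(List<Supplier<String>> calls) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            // All futures are created (and therefore started) before any join happens.
            List<CompletableFuture<String>> futures = calls.stream()
                    .map(call -> CompletableFuture.supplyAsync(call, pool))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = callAll(List.of(
                fakeCall("user", 100), fakeCall("orders", 200), fakeCall("stock", 300)));
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // Elapsed time should be close to the slowest call (300 ms), not the 600 ms sum.
        System.out.println(results + " in about " + elapsedMillis + " ms");
    }
}
```

Because the three simulated calls run at the same time, the total wait tracks the 300 ms call rather than the sum of all three delays.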

Hadoop data compression, related algorithms, and a (MapReduce) code example

To see which compression algorithms Hadoop supports: [lqs@bdc112 hadoop-3.1.3]$ bin/hadoop checknative 2021-12-15 16:20:12,342 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native 2021-12-15 16:20:12,345 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library 2021-12-15 16:20:12,3 ...

Added by thebighere on Wed, 15 Dec 2021 19:16:34 +0200

Sogou log query analysis (MapReduce+Hive+idea comprehensive experiment)

Prerequisites: install Hadoop 2.7.3 (on Linux), MySQL (on Windows or Linux), and Hive (on Linux). Reference: Hive installation configuration. Task: download search data from Sogou Lab for analysis. The downloaded data contains 6 fields, and the data format is as follows: access time, user ID ...

Added by cheekychop on Sun, 12 Dec 2021 13:58:21 +0200

MapReduce program 3 in a Maven project --- counting the total salary of employees in each department (optimized)

This article builds on the earlier implementation that counts the total salary of employees in each department. If you have not implemented it yet, please refer to: Realize the function of counting the total salary of employees in each department. Optimizations: 1. Use serialization 2. Implement partitioning (Partition) 3. Use a Combiner on the map side ...

Added by baw on Sun, 05 Dec 2021 11:00:33 +0200
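
Two of the optimizations listed in the entry above, partitioning and a map-side Combiner, can be sketched in plain Java without a cluster. This is only an illustration under stated assumptions, not the post's code: DeptSalarySketch, Pay, the department ids, and the salaries are hypothetical, and the partition function copies the formula of Hadoop's default HashPartitioner but applies it to a String rather than a Hadoop key type. Serialization is not covered here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of a combiner and a hash partitioner; no Hadoop classes.
final class DeptSalarySketch {

    // One made-up input record: a department id and one employee's salary.
    static final class Pay {
        final String dept;
        final long salary;

        Pay(String dept, long salary) {
            this.dept = dept;
            this.salary = salary;
        }
    }

    // Partitioner: non-negative hash of the key modulo the number of reducers,
    // so every record for a department is routed to the same reducer.
    static int partition(String dept, int numReducers) {
        return (dept.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Combiner: sum salaries per department inside one map task, so only one
    // partial sum per department leaves that task.
    static Map<String, Long> combine(List<Pay> mapTaskRecords) {
        Map<String, Long> partial = new HashMap<>();
        for (Pay pay : mapTaskRecords) {
            partial.merge(pay.dept, pay.salary, Long::sum);
        }
        return partial;
    }

    // Reducer: add up the partial sums from all map tasks. For brevity this merges
    // every department in one place instead of running one reducer per partition.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((dept, sum) -> totals.merge(dept, sum, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> task1 = combine(List.of(
                new Pay("10", 1000), new Pay("20", 800), new Pay("10", 500)));
        Map<String, Long> task2 = combine(List.of(
                new Pay("20", 200), new Pay("30", 950)));
        Map<String, Long> totals = reduce(List.of(task1, task2));
        totals.forEach((dept, sum) -> System.out.println(
                "dept " + dept + " -> reducer " + partition(dept, 2) + ", total " + sum));
    }
}
```

Summation is associative and commutative, which is why the same logic can serve as both combiner and reducer; an operation such as an average would need a different combiner design.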

[Introduction to Cloud Computing Experiment 3] MapReduce programming

Prerequisites: You need to set up a Hadoop pseudo-distributed cluster platform; see the tutorial Quick Start Tutorial for Hadoop Big Data Technology and Pseudo-Distributed Clustering. Eclipse environment configuration: Eclipse (local Windows system) 1. Install the plug-in: hadoop-eclipse-plugin-2.7.3.jar Address: https:// ...

Added by bc2013 on Thu, 25 Nov 2021 20:05:01 +0200

Hadoop -- counting words with MapReduce (illustrated, highly detailed version)

1. Previously: The last article introduced how to call the MapReduce API and how to configure Eclipse. This time, we will use MapReduce to count the words in English article files. You are welcome to read my previous article: MapReduce related eclipse configuration and Api call. 2. Prerequisites: Installation required, Download method, IDEA O ...

Added by webdes03 on Wed, 10 Nov 2021 21:15:32 +0200