MapReduce practice: a handwritten WordCount case

Requirement: count the total number of occurrences of each word in a given set of text files. The figure below shows the analysis chart for WordCount in MapReduce. The map stage reads the data from the file; the line number is the key, and the content of each line is the value. Each key/value pair is ou ...
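The map and reduce stages described above can be sketched in plain Python, without Hadoop. This is only an illustration under assumptions: the function names are hypothetical, and emitting `(word, 1)` from the map stage is the usual WordCount convention rather than something the truncated excerpt confirms.

```python
from collections import defaultdict


def map_phase(lines):
    """Map stage: the line number is the input key, the line text is the value.

    Emits one (word, 1) pair per word (assumed convention for WordCount).
    """
    for line_number, line in enumerate(lines):
        for word in line.split():
            yield word, 1


def reduce_phase(pairs):
    """Reduce stage: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map stage and feed its output to the reduce stage."""
    return reduce_phase(map_phase(lines))
```

Calling `word_count` on a list of lines returns a dictionary mapping each word to its total count.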

Added by jber on Tue, 10 Dec 2019 22:02:08 +0200

Spark case: finding how long a mobile phone number stays at each base station and the phone's current location

1. Business requirements: using the log of a mobile phone number's dwell time at each base station, together with the base station information, calculate (base station, dwell time) and (current longitude, current latitude) for that phone number. The log information generated when a mobile phone connects to a base station looks similar to the following: ...

Added by compguru910 on Tue, 10 Dec 2019 21:16:10 +0200

Big data: Hive installation in detail

What is Hive? It was open-sourced by Facebook to handle statistics over massive structured logs. It is a data warehouse tool built on Hadoop that stores data in HDFS, maps structured data files to tables, and provides SQL-like queries. The underlying computation uses MR (MapReduce). In essence, it transforms HQL into ...

Added by calbolino on Tue, 10 Dec 2019 20:06:18 +0200

Python series 56 - send email with attachment and HTML body

I. How to send an HTML email: 1. Prepare the HTML code as the content. 2. Set the message subtype to html. 3. Send. 4. Example: send an HTML file to yourself: from email.mime.text import MIMEText main_content = """ <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> ...
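A minimal sketch of steps 1 and 2, assuming placeholder addresses and a hypothetical helper name `build_html_message`; it only builds the message and leaves the actual sending step as a comment, since the SMTP server details are not given in the excerpt.

```python
from email.mime.text import MIMEText


def build_html_message(html_body, subject, sender, recipient):
    """Build an email whose body is HTML (message subtype 'html')."""
    msg = MIMEText(html_body, "html", "utf-8")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    return msg


# Placeholder content and addresses, for illustration only.
message = build_html_message(
    "<html><body><h1>Hello</h1></body></html>",
    "HTML test",
    "me@example.com",
    "me@example.com",
)

# Sending would go through smtplib with your own server and credentials, e.g.:
#   import smtplib
#   with smtplib.SMTP_SSL("smtp.example.com", 465) as server:  # hypothetical host
#       server.login("me@example.com", "password")             # hypothetical login
#       server.send_message(message)
```

The subtype argument is what makes mail clients render the body as HTML instead of plain text.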

Added by Flukey on Tue, 10 Dec 2019 09:30:32 +0200

Setting up a Scala development environment

At the beginning, I used Eclipse as my development tool. After installing Scala, I downloaded the Scala Eclipse plug-in and copied the features and plugins folders from the archive into the corresponding Eclipse directories. However, developing Scala projects with Eclipse and Maven is a bit of a pain. So, after a lot of back and forth, I gave up and ...

Added by AjithTV on Mon, 09 Dec 2019 03:52:28 +0200

msql: notes on creating views

Because I had kept no notes on the views I wrote before, when the customer's company asked me to write three views this time, I found I had forgotten how to write them! So I searched Baidu and got it done in minutes. Still, I am keeping my own notes so that I won't need to read other people's write-ups in the future. #Vehicle member query view select * from v_v ...

Added by blindeddie on Sun, 08 Dec 2019 15:42:36 +0200

Hive lateral view and explode

explode (official website link): explode is a UDTF (table-generating function) that converts a single input row into multiple output rows. It is generally used in combination with lateral view, mainly in two ways. Return type: T. Usage: explode(ARRAY<T> a). Description: splits the array into multiple rows and returns a single column a ...
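The row-multiplying behaviour described above can be imitated in plain Python. This is not Hive code: the functions `explode` and `lateral_view` below are hypothetical stand-ins that only mimic how one input row with an array column becomes several output rows.

```python
def explode(array):
    """Yield one output value per array element, like a table-generating function."""
    for element in array:
        yield element


def lateral_view(rows, array_column, alias):
    """Join each input row with the rows generated from its own array column.

    Rows whose array is empty produce no output, which matches a plain
    (non-OUTER) lateral view.
    """
    output = []
    for row in rows:
        for element in explode(row[array_column]):
            new_row = {k: v for k, v in row.items() if k != array_column}
            new_row[alias] = element
            output.append(new_row)
    return output
```

For example, a row `{"id": 1, "tags": ["a", "b"]}` expands into two rows, one per tag, each keeping `id`.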

Added by leony on Sun, 08 Dec 2019 13:56:09 +0200

Group-by operation on text data by field

Requirement: the text data format is as follows:
akc190|id_drg|name_drg|pdxCode|pdxName|sdxCodes|sdxNames|yka055
0001369157| 101| seizure (-) | G40.901| epilepsy | G40.901 $| epilepsy $| 1946.56
0001370448| 101| seizure (-) | G40.901| epilepsy | G40.901$J40.x00 $| epilepsy $bronchitis $| 2842.77
0001374918| 101| seizure (-) | R56.001| febrile c ...
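A rough sketch of grouping such pipe-delimited text in Python, using only the two complete rows shown above. The excerpt is cut off before it says which field to group on, so grouping by `pdxCode` and summing `yka055` is purely an assumption for illustration, as are the function names.

```python
from collections import defaultdict
from decimal import Decimal

# Header plus the two complete sample rows from the excerpt.
SAMPLE = """akc190|id_drg|name_drg|pdxCode|pdxName|sdxCodes|sdxNames|yka055
0001369157| 101| seizure (-) | G40.901| epilepsy | G40.901 $| epilepsy $| 1946.56
0001370448| 101| seizure (-) | G40.901| epilepsy | G40.901$J40.x00 $| epilepsy $bronchitis $| 2842.77
"""


def parse_records(text):
    """Split pipe-delimited text into dicts keyed by the header names."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [name.strip() for name in lines[0].split("|")]
    return [
        dict(zip(header, (field.strip() for field in line.split("|"))))
        for line in lines[1:]
    ]


def group_sum(records, key_field, value_field):
    """Group records by key_field and sum value_field exactly with Decimal."""
    totals = defaultdict(Decimal)
    for record in records:
        totals[record[key_field]] += Decimal(record[value_field])
    return dict(totals)


# Assumed grouping: total yka055 per pdxCode.
totals_by_code = group_sum(parse_records(SAMPLE), "pdxCode", "yka055")
```

`Decimal` is used instead of `float` so that summed amounts stay exact to two decimal places.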

Added by CroNiX on Sun, 08 Dec 2019 01:39:30 +0200

Phoenix duplicate records: the cause of and solution for duplicate query results

Problem description. Issue A: with the parameter turned on (phoenix.stats.enabled=true), querying data with Phoenix SQL returns duplicates (more rows are returned than are actually stored in HBase). Issue B: with the parameter turned off (phoenix.stats.enabled=false), Phoenix SQL performance decreases. Environment: Phoe ...

Added by evan18h on Fri, 06 Dec 2019 21:55:32 +0200

Oracle for beginners: a self-join query experiment

Creating the experiment table. Field descriptions: id: employee number; name: employee name; ano: manager number. create table admin(id varchar2(4),name varchar2(10),ano varchar2(4)); insert into admin values('001','XiongDa','004'); insert into admin values('002','XiongEr','004'); insert into admin values('003','ZhangSan','003'); insert into admin ...
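A self-join on this table can be tried without an Oracle instance by using Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for Oracle, `text` columns replace `varchar2`, and only the three rows visible in the excerpt are inserted, so manager 004 is absent here. The function name `manager_pairs` is hypothetical.

```python
import sqlite3


def manager_pairs():
    """Return (employee name, manager name) pairs via a self-join on admin."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table admin(id text, name text, ano text)")
    conn.executemany(
        "insert into admin values(?, ?, ?)",
        [
            ("001", "XiongDa", "004"),
            ("002", "XiongEr", "004"),
            ("003", "ZhangSan", "003"),
        ],
    )
    # The table is joined to itself: alias e is the employee, alias m the manager.
    # A left join keeps employees whose manager row is missing.
    rows = conn.execute(
        "select e.name, m.name "
        "from admin e left join admin m on e.ano = m.id "
        "order by e.id"
    ).fetchall()
    conn.close()
    return rows
```

With only these three rows, the employees managed by 004 come back with no manager name, while ZhangSan is paired with himself.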

Added by stuart7398 on Fri, 06 Dec 2019 17:19:16 +0200