MapReduce practice: a hand-written WordCount example
Requirement: count the total number of occurrences of each word in a given set of text files
The figure below shows how MapReduce performs the WordCount computation:
The map stage reads the data from the file line by line: each line's position in the file is the key (the byte offset, when Hadoop's default TextInputFormat is used), and the text of the line is the value. Each key/value pair is ou ...
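The map, shuffle, and reduce steps of WordCount can be sketched in plain Python without Hadoop. This is only an illustration of the idea, not the post's own code: the function names are hypothetical, the line index stands in for the input key, and splitting on whitespace is an assumption about how words are separated.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word; the line index plays the role of the input key."""
    for _line_index, line in enumerate(lines):
        for word in line.split():
            yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines)))


counts = word_count(["hello world", "hello mapreduce"])
```

In a real job the framework performs the shuffle across machines; here it is a single in-memory dictionary.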
Added by jber on Tue, 10 Dec 2019 22:02:08 +0200
Spark case study: finding how long a mobile phone number stays under each base station and the phone's current location
1. Business requirements
Using the log of a phone number's dwell time at each base station together with the base station information, compute (base station, dwell time) and (current longitude, current latitude) for that phone number
The log records generated when a mobile phone connects to a base station look like the following: ...
Added by compguru910 on Tue, 10 Dec 2019 21:16:10 +0200
Big data: installation details of Hive
What is Hive?
Open-sourced by Facebook to handle statistics over massive volumes of structured logs;
A data warehouse tool built on Hadoop that stores data in HDFS, maps structured data files to tables, and provides SQL-like queries. The underlying computation uses MapReduce (MR);
In essence, it translates HQL into ...
Added by calbolino on Tue, 10 Dec 2019 20:06:18 +0200
Python series 56: sending email with attachments and an HTML body
I. How to send an HTML body
1. Prepare the HTML code as the content
2. Set the message subtype to html
3. Send
4. For example: send an HTML file to yourself
from email.mime.text import MIMEText
main_content = """
<!DOCTYPE html>
<html lang = "en">
<head>
<meta charset = "UTF-8">
...
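The first two steps above can be sketched as follows. This is a minimal sketch, not the post's full script: the helper name build_html_message and the example.com addresses are hypothetical, and the sending step is left out because the SMTP server details are not shown in the excerpt.

```python
from email.mime.text import MIMEText


def build_html_message(subject, sender, recipient, html_body):
    """Build a message whose body is HTML rather than plain text."""
    # Passing "html" as the subtype makes the content type text/html.
    message = MIMEText(html_body, "html", "utf-8")
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = recipient
    return message


# Placeholder addresses; replace them with a real mailbox before sending.
example = build_html_message(
    "HTML test",
    "me@example.com",
    "me@example.com",
    "<html><body><h1>Hello</h1></body></html>",
)
```

The resulting object can then be handed to smtplib.SMTP.send_message once a connection to a mail server has been opened.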
Added by Flukey on Tue, 10 Dec 2019 09:30:32 +0200
Setting up a Scala development environment
At first I used Eclipse. After installing Scala, I downloaded the Scala Eclipse plug-in and copied the features and plugins directories from the archive into the corresponding directories of the Eclipse installation. However, developing Scala projects with Eclipse and Maven is rather painful. So, after much tinkering, I gave up and ...
Added by AjithTV on Mon, 09 Dec 2019 03:52:28 +0200
MySQL: notes on creating views
I had kept no notes on the views I wrote before, so when a customer asked me for three views this time, I found I had forgotten how to write them. A quick search on Baidu solved it in minutes, but I am recording it here so that I don't have to look up other people's write-ups in the future.
#Vehicle member query view
select * from v_v ...
Added by blindeddie on Sun, 08 Dec 2019 15:42:36 +0200
Hive lateral view and explode
explode (official website link)
explode is a UDTF (user-defined table-generating function) that converts a single input row into multiple output rows. It is generally used together with lateral view, mainly in two ways:
Input type | Usage method | Description
T | explode(ARRAY<T> a) | Decomposes the array into multiple rows, returning a single column a ...
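The row-multiplying behaviour of explode combined with lateral view can be imitated in plain Python. This is a rough sketch of the semantics only, not Hive code: the function name, the column names, and the sample rows are all hypothetical.

```python
def lateral_view_explode(rows, array_column, alias):
    """Return one output row per array element, keeping the other columns of the source row."""
    result = []
    for row in rows:
        # A row with an empty array produces no output rows,
        # as with a plain (non-OUTER) lateral view.
        for element in row[array_column]:
            expanded = dict(row)
            expanded[alias] = element
            result.append(expanded)
    return result


# Hypothetical sample rows: an id plus an array column named "tags".
sample = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
]
exploded = lateral_view_explode(sample, "tags", "tag")
```

Each element of the array becomes its own row, which is why the function is described as turning one input row into multiple output rows.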
Added by leony on Sun, 08 Dec 2019 13:56:09 +0200
Grouping text data by field (a group-by operation)
Requirement:
The text data format is as follows:
akc190|id_drg|name_drg|pdxCode|pdxName|sdxCodes|sdxNames|yka055
0001369157| 101| seizure (-) | G40.901| epilepsy | G40.901 $| epilepsy $| 1946.56
0001370448| 101| seizure (-) | G40.901| epilepsy | G40.901$J40.x00 $| epilepsy $bronchitis $| 2842.77
0001374918| 101| seizure (-) | R56.001| febrile c ...
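A group-by over pipe-delimited text like the sample above could be sketched in Python as follows. The excerpt does not say which field the post groups on, so grouping by pdxCode is an assumption made purely for illustration; the function name is hypothetical, and only the two complete sample rows are used.

```python
from collections import defaultdict


def group_by_field(lines, field):
    """Group pipe-delimited records by one field; the first line is the header."""
    header = [name.strip() for name in lines[0].split("|")]
    index = header.index(field)
    groups = defaultdict(list)
    for line in lines[1:]:
        # Strip the padding around each value, since the sample has spaces after the pipes.
        values = [value.strip() for value in line.split("|")]
        groups[values[index]].append(dict(zip(header, values)))
    return dict(groups)


SAMPLE = [
    "akc190|id_drg|name_drg|pdxCode|pdxName|sdxCodes|sdxNames|yka055",
    "0001369157| 101| seizure (-) | G40.901| epilepsy | G40.901 $| epilepsy $| 1946.56",
    "0001370448| 101| seizure (-) | G40.901| epilepsy | G40.901$J40.x00 $| epilepsy $bronchitis $| 2842.77",
]
groups = group_by_field(SAMPLE, "pdxCode")
```

Any other header name can be passed as the field to group on.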
Added by CroNiX on Sun, 08 Dec 2019 01:39:30 +0200
Phoenix duplicate records: causes of and fixes for duplicate query results
Problem description
Issue A: with the parameter turned on (phoenix.stats.enabled=true), Phoenix SQL queries return duplicate rows (more rows than are actually stored in HBase)
Issue B: with the parameter turned off (phoenix.stats.enabled=false), Phoenix SQL performance degrades.
Environment
Phoe ...
Added by evan18h on Fri, 06 Dec 2019 21:55:32 +0200
Oracle beginner's notes: a self-join query experiment
Creating the experiment table
Table field description:
id: employee number; name: employee name; ano: manager number
create table admin(id varchar2(4),name varchar2(10),ano varchar2(4));
insert into admin values('001','XiongDa','004');
insert into admin values('002','XiongEr','004');
insert into admin values('003','ZhangSan','003');
insert into admin ...
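What a self-join over this table does can be tried out with Python's built-in sqlite3 module. SQLite stands in for Oracle here, which is an assumption made only so the sketch runs anywhere; the query itself is illustrative rather than the post's own, and only the three complete rows shown above are loaded.

```python
import sqlite3


def managers_by_employee(rows):
    """Pair each employee with the name of their manager by joining admin to itself."""
    connection = sqlite3.connect(":memory:")
    connection.execute("create table admin(id text, name text, ano text)")
    connection.executemany("insert into admin values(?, ?, ?)", rows)
    # e is the employee side and m is the manager side of the same table.
    query = """
        select e.name, m.name
        from admin e
        left join admin m on e.ano = m.id
        order by e.id
    """
    result = connection.execute(query).fetchall()
    connection.close()
    return result


ROWS = [
    ("001", "XiongDa", "004"),
    ("002", "XiongEr", "004"),
    ("003", "ZhangSan", "003"),
]
pairs = managers_by_employee(ROWS)
```

Because the row for manager 004 is not among the three loaded here, the left join reports None as the manager of XiongDa and XiongEr.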
Added by stuart7398 on Fri, 06 Dec 2019 17:19:16 +0200