Hadoop Distributed File System (HDFS)
Hadoop Distributed File System
Brief introduction
HDFS (Hadoop Distributed File System) is a core component of Hadoop and a distributed storage service.
Distributed file systems can span multiple computers and have broad application prospects in the era of big data. They provide the required scalability for storing and processing s ...
Added by ajaybuilder on Mon, 03 Jan 2022 16:43:34 +0200
PCT package using R language: drawing road network map (British bicycle database)
This article mainly draws on: PCT Get started; International application of the PCT methods
This article introduces the R package pct, whose goal is to improve the accessibility and reproducibility of the data generated by the Propensity to Cycle Tool (PCT), which is hosted on www.pct.bike.
The bicycle use data study (Propensity to Cycle - ...
Added by DJTim666 on Mon, 03 Jan 2022 16:29:51 +0200
Elasticsearch 7.X IK source code interpretation, and custom remote dynamic thesaurus
1, IK remote thesaurus
The previous article explained IK as a whole, including the remote dynamic thesaurus. However, that article was based on nginx serving a static txt file: after the file is modified, nginx automatically sets the Last-Modified attribute. This method is also officially recommended: Officials recommend using another t ...
Added by socalnate on Mon, 03 Jan 2022 12:30:30 +0200
[review] Spark core programming --- RDD
To process data with high concurrency and high throughput, the Spark computing framework encapsulates three data structures for different application scenarios. The three data structures are: RDD: resilient distributed dataset; accumulator: distributed shared write-only variable; broadcast variable: distributed shared read-o ...
Added by faraco on Mon, 03 Jan 2022 03:37:59 +0200
Hive [environment setup 02] [hive-3.1.2 version HiveServer2/beeline configuration use]
Hive has built-in HiveServer and HiveServer2 services, both of which allow clients to connect using multiple programming languages. However, HiveServer cannot handle concurrent requests from multiple clients, so HiveServer2 was introduced. HiveServer2 (HS2) allows remote clients to submit requests to Hive and retrieve results in various programmi ...
Added by greenber on Sun, 02 Jan 2022 03:01:29 +0200
[Tushare big data community - saving your financial data needs]
Tushare big data community - everything you need
Wind is too expensive? Can't write a web crawler? But what if you still need financial data? Tushare big data community: it has everything! (tushare ID: 436348)
For economics and management researchers, financial data is essential. You can't make bricks without straw. In most empirica ...
Added by davidjam on Sat, 01 Jan 2022 13:06:47 +0200
Detailed explanation of Scala pattern matching
Pattern matching in Scala is similar to the switch syntax in Java
int i = 10;
switch (i) {
    case 10:
        System.out.println("10");
        break;
    case 20 ...
Added by Altairzq on Sat, 01 Jan 2022 03:26:02 +0200
2, Build Hadoop cluster
1, Create template machine
1.1. Modify the IP settings in the configuration file
vim /etc/sysconfig/network-scripts/ifcfg-ens33
#Modification:
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.150.211
NETMASK=255.255.255.0
GATEWAY=192.168.150.2
DNS1=192.168.150.2
1.2. Modify the host name to hadoop01
vim /etc/hostname
1.3. Restart network servic ...
Added by SoccerGloves on Fri, 31 Dec 2021 05:15:31 +0200
Deep Dive into Elasticsearch - Bar Chart / Aggregation by Time Statistics / Range Limited
1. Data preparation
1. Create an index mapping:
PUT /cars
{
  "mappings": {
    "properties": {
      "price": {
        "type": "integer"
      },
      "color": {
        "type": "keyword"
      },
      "make": {
        "type": "keyword"
      },
      "sold": {
        "type": "date"
      }
    }
  }
}
2. Index documents:
POST /cars ...
Added by hws on Fri, 31 Dec 2021 04:48:02 +0200
Hive: permission management
Storage Based Authorization in the Metastore Server
Storage based authorization protects the metadata in the Metastore, but it does not provide more fine-grained access control (such as column level and row level).
SQL Standards Based Authorization in HiveServer2
Hive authorization based on the SQL standard is fully compatible with SQL auth ...
Added by bgbs on Thu, 30 Dec 2021 15:39:51 +0200