Film recommendation system (Xiamen University Database Laboratory version)

Resource address: http://dblab.xmu.edu.cn/post/movierecommend/ Project introduction: 1. Recommendation system: discovers users' potential needs from their historical data. 2. Long-tail goods: whereas popular goods represent users' general needs, long-tail goods represent the personalized need ...

Added by linusx007 on Sun, 23 Jan 2022 18:04:39 +0200

Learning Spark development: using IDEA to develop Spark applications

Spark learning: using IDEA to develop Spark applications. This article assumes JDK 1.8, with the IDEA development tool and Maven already configured. Background: the Spark service is deployed on a remote CentOS server, but the Spark application code is developed in a local IDEA, so how to make the locally developed Spark code ...

Added by jon2396 on Thu, 20 Jan 2022 19:55:28 +0200

Spark performance optimization guide - approach

Preface: Spark job optimization is a broad topic, because jobs that look equally slow can need very different fixes. I want to cover every aspect of optimization so that an overall optimization plan can be drawn up systematically. Sorting out optimization ideas: how should the so-called "slow" problem be approached? I made a classification: themeresou ...

Added by jber on Fri, 14 Jan 2022 22:46:36 +0200

org.apache.spark.SparkException: Task not serializable

Preface: This article belongs to the column "Spark exception summary" and is the author's original work; please credit the source when quoting. Please point out any deficiencies or errors in the comment area. Thank you! For the directory structure and references of this column, see "Spark exception summary". Text: If ...

Added by johnska7 on Fri, 14 Jan 2022 14:37:54 +0200

How to invoke a Python script in a Spark Scala/Java application

Abstract: This article introduces how to call a Python script from a Spark Scala program; the procedure for a Spark Java program is basically the same. This article is shared from the Huawei Cloud community post "[Spark] how to invoke Python script in Spark Scala/Java application", author: little rabbit 615. 1. PythonRunner: For programs run ...

Added by MrRosary on Thu, 13 Jan 2022 09:18:20 +0200

RPC architecture in the Spark source code

1. Overview: In Spark, network communication is involved in many places, such as message exchange between Spark components, upload of user files and JAR packages, shuffle data transmission between nodes, and replication and backup of block data. Before Spark 1.6, Spark RPC was implemented on Akka, which is an asynchronous message ...

Added by ploppy on Tue, 11 Jan 2022 01:14:43 +0200

Big data - Summary of common operators of Spark RDD

Spark's core is built on a single abstraction, the Resilient Distributed Dataset (RDD), which lets Spark's components integrate seamlessly and handle big data processing within the same application. 1. Basic concepts of RDD: The RDD is the most important abstraction that Spark provides. It is a special data set with fault-tolerant mechani ...

Added by cainfool on Mon, 10 Jan 2022 21:37:05 +0200

How to call a Python script in a Spark Scala/Java application

Abstract: This article introduces how to call a Python script from a Spark Scala program; the procedure for a Spark Java program is basically the same. This article is shared from the Huawei Cloud community post "[Spark] how to invoke Python script in Spark Scala/Java application", author: little rabbit 615. 1. PythonRunner: For programs run ...

Added by billynastie on Mon, 10 Jan 2022 04:18:39 +0200

Spark SQL basics: DataFrame and Dataset

Spark SQL overview: Spark SQL is the Spark module for structured data processing. For developers, Spark SQL simplifies RDD-based development, improves development efficiency, and executes efficiently, so it is what is mostly used in practice. In order to simplify the development of RDD and impr ...

Added by Asnom on Thu, 06 Jan 2022 08:03:44 +0200

[Big data framework and practice] - Chapter 1: Spark basics

Section 1: Introduction to Spark. 1. What is Spark? 1. Apache Spark is a unified computing engine and a set of libraries. Processing data with Spark is said to be up to 100 times faster than the traditional way. 2. This does not mean that Spark is 100 times faster than Python on a single computer; rather, Spark is mainly used for parallel data processing on c ...

Added by hotcigar on Wed, 05 Jan 2022 08:29:00 +0200