Datawhale zero-basics introduction to data mining - Task 3: Feature Engineering

3. Feature engineering objectives. Competition title: Zero-basics introduction to data mining - used car transaction price prediction. 3.1 Feature engineering objectives: further analyze the features and process the data. Complete the analysis of feature enginee ...

Added by noiseusse on Sat, 05 Mar 2022 02:57:32 +0200

Simple practice of logistic regression and record of problems encountered

Objective: predict whether a student will be admitted to a university based on exam scores. Methods: call an advanced optimization algorithm or write your own gradient descent function (choose the learning rate and number of iterations yourself). Data: ex2data1.txt 1. Read in the data. 1.1 Import toolkits import numpy as np import pa ...

Added by fullyscintilla on Fri, 04 Mar 2022 21:59:56 +0200

Python project practice: analyze big data with PySpark

Big data, as its name implies, is a large amount of data, generally at the PB level or above. PB is a unit of data storage capacity equal to 2 to the 50th power bytes, or about 1000 TB. These data are characterized by a wide variety, including video, voice, pictu ...

Added by ztealmax on Fri, 04 Mar 2022 19:19:29 +0200

[source code analysis] NVIDIA HugeCTR, GPU version parameter server - (8) - Distributed Hash backward propagation

0x00 Summary: In this series, we introduce HugeCTR, an industry-oriented recommender system training framework optimized for large-scale CTR models, with model-parallel embeddings and data-parallel dense networks. ...

Added by techite on Fri, 04 Mar 2022 13:30:43 +0200

HugeCTR source code reading

Introduction to HugeCTR: Large-scale sparse training based on the parameter server architecture had seen little change or progress for several years, until Baidu's AIBox paper appeared and NVIDIA open-sourced HugeCTR. Finally, we can see that the parameter server architecture has taken anoth ...

Added by Travis Estill on Fri, 04 Mar 2022 03:22:37 +0200

Training YOLOv5 on your own data set

All code and related articles referenced here are only for sharing experience and technical exchange. Applying these techniques improperly is prohibited, and the author bears no responsibility for any misuse. This article records some of my study notes. Getting started: Recently, we plan to do an object detection project again. ...

Added by focus310 on Thu, 03 Mar 2022 23:20:55 +0200

Interpretability study - XGNN

Core goal of the paper: The author addresses the graph classification problem for GNNs and studies model-level interpretation methods. The specific approach is to train a graph generator use f(.) ...

Added by JoeyT2007 on Thu, 03 Mar 2022 21:14:30 +0200

[machine learning] How to use halving grid search to shorten grid search time?

Contents of this chapter: the principle and operation process of halving grid search (theoretical description); description of the HalvingGridSearchCV parameters in sklearn; 🤷‍♀️ case: halving grid search on a house price data set (Python). Index: 🔣 functions and parameters, 🗣 case, 🤷‍♀️ case, 📖 extract. 1 (Theory) principle and process of ha ...

Added by asaschool on Thu, 03 Mar 2022 16:38:46 +0200

Principal component analysis and its application in face recognition

Recently, I have been studying the Turing textbook "Python Basic Course of Machine Learning" on my own and taking notes as blog posts on CSDN. We may have many purposes in using unsupervised learning for data transformation. The most common purposes are to visualize data, compress data, and find a more informative data representation for further ...

Added by jateeq on Thu, 03 Mar 2022 13:47:38 +0200

Data mining project -- prediction of accommodation reservation results for new Airbnb users

Abstract: Based on predicting the accommodation booking outcomes of new Airbnb users, this article describes the whole process from data exploration through feature engineering to model construction. Project address: Airbnb New User Bookings | Kaggle. Of which: 1. The data exploration part is mainly based on the pandas library, using t ...

Added by TheBrandon on Wed, 02 Mar 2022 16:27:52 +0200