Filter-based feature selection + random forest modeling: Kaggle Elo Merchant Category Recommendation

Data preprocessing workflow. import pandas as pd; import numpy as np. Load the data: train = pd.read_csv("preprocess/train.csv"); test = pd.read_csv("preprocess/test.csv"). Random forest model prediction. Feature selection with the Pearson correlation coefficient: (train.shape, test.shape) returns ((201917, 1700), (123623, 1699)). # Ext ...

Added by nickholas on Tue, 01 Feb 2022 20:18:25 +0200
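The Pearson-based filter step named in the entry above can be sketched as follows. This is a minimal illustration using only the standard library, not the post's own code (the excerpt is cut off before that point). The helper names `pearson` and `top_k_features`, the toy columns, and the choice of `k` are all assumptions made for the example; with pandas, the same ranking would more typically come from the correlation methods on a DataFrame.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat it as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def top_k_features(columns, target, k):
    """Return the names of the k features with the largest |r| against target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data standing in for the real training table.
columns = {
    "f_linear": [1, 2, 3, 4, 5],
    "f_inverse": [5, 4, 3, 2, 2],
    "f_noise": [2, 1, 2, 1, 2],
    "f_const": [7, 7, 7, 7, 7],
}
target = [2, 4, 6, 8, 10]

selected = top_k_features(columns, target, k=2)
print(selected)  # ['f_linear', 'f_inverse']
```

Ranking by the absolute value of the coefficient keeps strongly negative features as well as strongly positive ones, which is usually what a filter method wants before a tree model is fitted.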

100 Days of Machine Learning, Day 1: Data preprocessing

Hello, everyone. I'm xiaok. I've been learning on my mobile phone for nearly a month. I've been exposed to a lot of machine learning knowledge, but most of it has stayed at a surface level. Starting today, I want to gradually build a deeper understanding of machine learning. I will share some knowledge that I think is useful o ...

Added by Mark.P.W on Tue, 01 Feb 2022 15:41:43 +0200

Configuring a TensorFlow GPU environment with Anaconda and using it in Jupyter Notebook (two methods: command line / Anaconda)

This article explains how to configure a TensorFlow GPU environment with Anaconda, either through the command line or through Anaconda itself; readers can choose whichever they prefer. Let's go! Chapters 1 and 2 in the table of contents below cover the same content using different methods, so choose one of them. 1. Creating tensorflow virtual enviro ...

Added by Bac on Tue, 01 Feb 2022 13:50:29 +0200

Three particularly practical Python modules worth bookmarking

Hello, everyone. Today I will introduce three little-known Python modules that are particularly easy to use: Psutil, Pendulum, and Pyfiglet. Psutil: the psutil module is a cross-platform Python library that makes it easy to obtain information about running processes and system utilization, including CPU, m ...

Added by sb on Tue, 01 Feb 2022 13:45:00 +0200

Forecasting from existing stock data and plotting with matplotlib

Baidu Cloud link to the file used in this article: https://pan.baidu.com/s/15-qbrbtRs4frup24Y1i5og (extraction code: pm2c). Linear prediction: assuming that a data series follows a linear law, we can predict the values that will appear in the future. Given the series a b c d e f g h ...., we have ax + by + cz = d, bx + cy + dz = e, cx + dy + ez = f. In ...

Added by gernot on Mon, 31 Jan 2022 22:36:44 +0200
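The three equations in the linear prediction entry above can be solved for x, y, z and then reused to extrapolate one step ahead (g = dx + ey + fz). The sketch below is an illustration under assumptions, not the post's own code: the function names `det3`, `solve3`, and `predict_next` are hypothetical, the example series is made up, and Cramer's rule with exact fractions is used only to keep the example free of third-party dependencies. A least-squares solver over a longer window would be a common alternative for noisy data.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("singular system: this window does not determine x, y, z")
    solution = []
    for col in range(3):
        # Replace one column with the right-hand side, as Cramer's rule requires.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


def predict_next(series):
    """Predict the next value from the last six values a, b, c, d, e, f."""
    a, b, c, d, e, f = (Fraction(v) for v in series[-6:])
    # ax + by + cz = d;  bx + cy + dz = e;  cx + dy + ez = f
    x, y, z = solve3([[a, b, c], [b, c, d], [c, d, e]], [d, e, f])
    return d * x + e * y + f * z  # g = dx + ey + fz


# Made-up series in which each value is the sum of the previous three.
print(predict_next([1, 1, 2, 4, 7, 13]))  # 24
```

Note that a window whose rows are linearly dependent (for example a purely arithmetic series such as 1, 2, 3, 4, 5, 6) makes the system singular, so the sketch raises an error instead of returning a guess.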

How to improve the accuracy of regression model

In this article, we will see how to approach a regression problem and how to improve the accuracy of a machine learning model using feature transformation, feature engineering, clustering, boosting algorithms, and so on. Data science is an iterative process. Only through repeated experiments can we get the most suitable mo ...

Added by SmoshySmosh on Mon, 31 Jan 2022 16:52:40 +0200

Using graphviz in Python to generate a decision tree .dot file and convert it to PNG and other image formats (with the source code of a helper function I wrote myself)

Recently, because of an innovation project, I began to learn machine learning. While studying decision trees, I came across the operations used to visualize them. First, the tree module of the sklearn library is used to build and train the model: from sklearn import tree # Establish decision tree classifier dtc = t ...

Added by $SuperString on Mon, 31 Jan 2022 02:26:22 +0200

scikit-learn, a machine learning powerhouse: a step-by-step introductory tutorial

A step-by-step introductory tutorial to scikit-learn. Scikit-learn is a well-known Python machine learning library, widely used in data science fields such as statistical analysis and machine learning modeling. Powerful modeling: users can implement a variety of supervised and unsupervised learning models with scikit-learn. Varied functionality: a ...

Added by MattG on Sun, 30 Jan 2022 20:15:07 +0200

Dimensionality reduction algorithms in machine learning: principal component analysis (PCA)

1. PCA theory. Common ways to address overfitting: add more training data, add regularization terms, and reduce the feature dimensionality. Dimensionality reduction methods: direct feature selection, linear dimensionality reduction (PCA, MDS), and nonlinear dimensionality reduction (manifold learning). 1.1 representation of mean ...

Added by dukeu03 on Sun, 30 Jan 2022 10:57:03 +0200

Automatically adjusting loss function weights in multi-task learning

0 Introduction. Multi-task learning: given m learning tasks, all or some of which are related but not identical. The goal of multi-task learning is to use this method m ...

Added by eziitiss on Sun, 30 Jan 2022 06:24:58 +0200