As a data analyst, it is necessary to master a database language.
Today, Mr. Huang will take you to learn about two relational databases MySQL and Oracle and a non relational database MongoDB.
For data analysts, the most important thing to learn about databases is to learn their query functions. This article takes this as the starting point to tell you how to operate these three databases with Python.
1. Operating Oracle database with Python
The difficulty of this part is that the environment configuration is a little cumbersome. Don't worry, I wrote an article about Oracle environment configuration Article.
Python operates Oracle using cx_Oracle library. We need to use the following command to install in advance:
pip insatll cx_Oracle
① Three ways for Python to link Oracle server
# ① User name, password and monitor are written together import cx_Oracle db = cx_Oracle.connect('scott/a123456@DESKTOP-V4LKB10:1521/orcl') # ② User name, password and monitoring are written separately import cx_Oracle db = cx_Oracle.connect("scott","a123456","192.168.2.1:1521/orcl") # ③ Configure listening and connection import cx_Oracle moniter = cx_Oracle.makedsn('192.168.2.1',1521,'orcl') db = cx_Oracle.connect('scott','a123456',moniter)
② How does Python get data from Oracle?
Here are three common methods to introduce to you.
Ⅰ fetchone(): obtain one record at a time;
import cx_Oracle # Note: the following two lines of code must be added to be responsible for Chinese garbled code; import os os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8' db = cx_Oracle.connect("scott","a123456","192.168.2.1:1521/orcl") cursor = db.cursor() cursor.execute('select count(*) from emp1') aa = cursor.fetchone() print(aa) cursor.execute('select ename,deptno,sal from emp1') for i in range(aa[0]): a,b,c = cursor.fetchone() d = "My name is{},The Department is{},Salary is{}dollar".format(a,b,c) display(d) db.close()
The results are as follows:
Ⅱ fetchall(): obtain all records at one time;
import cx_Oracle # Note: the following two lines of code must be added to be responsible for Chinese garbled code; import os os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8' db = cx_Oracle.connect("scott","a123456","192.168.2.1:1521/orcl") cursor = db.cursor() cursor.execute('select ename,deptno,sal from emp1') aa = cursor.fetchall() # print(aa) for a,b,c in aa: d = "My name is{},The Department is{},Salary is{}dollar".format(a,b,c) display(d) db.close()
The results are as follows:
III. use read in pandas_ SQL () method to directly convert the extracted data into DataFrame for operation;
import cx_Oracle import pandas as pd import os os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8' db = cx_Oracle.connect("scott","a123456","192.168.2.1:1521/orcl") cursor = db.cursor() df1 = pd.read_sql("select * from emp where deptno=20",db) display(df1) df2 = pd.read_sql("select * from emp where deptno=30",db) display(df2)
The results are as follows:
2. Python operation MySQL database
MySQL database should be the most widely used database in China. Most companies generally use this database. This means that many students will choose to learn the database knowledge for interview before graduation.
Python operates MySQL using cx_Oracle library. We need to use the following command to install in advance:
pip insatll pymysql
For more details: Detailed explanation of Python operation Oracle!
① Python linked MySQL server
import pymysql db = pymysql.connect(host='localhost',user='root',password='123456',port=3306,db='spiders',charset=' utf8')
There are six parameters, which need to be introduced one by one:
- Parameter host: ip of the host where the mysql server is located;
- Parameter user: user name;
- Parameter password: password;
- Parameter port: the port of the connected mysql host, which is 3306 by default;
- Parameter db: name of the connected database;
- Parameter charset: we need to set the code when the Chinese code will be garbled when reading data; When we use python to operate the database, python is equivalent to a client. We use this client to operate the mysql server. python 3 uses the utf8 character set by default, and my mysql server uses the latin1 character set by default. Therefore, each table created in mysql is encoded with utf8 when creating a table, Therefore, the code set here should be the code of the connection connector;
② How does Python get data in MySQL?
Ⅰ fetchone(): obtain one record at a time;
import pymysql db = pymysql.connect(host='localhost',user='root',db='huangwei',password='123456',port=3306,charset='utf8') cursor = db.cursor() cursor.execute('select count(*) from person') aa = cursor.fetchone() print(aa) cursor.execute('select name,age from person') for i in range(aa[0]): a,b = cursor.fetchone() c = "My name is{},this year{}year".format(a,b) display(c) db.close()
The results are as follows:
Ⅱ fetchall(): obtain all records at one time;
import pymysql db = pymysql.connect(host='localhost',user='root',db='huangwei',password='123456',port=3306,charset='utf8') cursor = db.cursor() cursor.execute('select name,age from person') aa = cursor.fetchall() # print(aa) for a,b in aa: c = "My name is{},this year{}year".format(a,b) display(c) db.close()
The results are as follows:
III. use read in pandas_ SQL () method to directly convert the extracted data into DataFrame for operation;
import pymysql import pandas as pd db = pymysql.connect(host='localhost',user='root',db='huangwei',password='123456',port=3306,charset='utf8') cursor = db.cursor() df1 = pd.read_sql("select * from student where ssex='male'",db) display(df1) df2 = pd.read_sql("select * from student where ssex='female'",db) display(df2)
The results are as follows:
3. Python operation MongoDB database
This part mainly takes you to compare and learn: the differences between relational data and non relational database. Let's just have a look. We don't have to study it too deeply, because data analysts basically don't use this kind of database.
Python operates MongoDB using the pymongo library. We need to use the following command to install in advance:
pip insatll pymongo
For more details: Detailed explanation of Python operation MongoDB!
① Python link MongoDB server
from pymongo import MongoClient conn = MongoClient("localhost",27017)
② How does Python get the data in MongoDB?
I. query some documents;
res = collection.find({"age": {"$gte": 19}}) for row in res: print(row)
II. Query all documents;
res = collection.find() for row in res: print(row)
III. statistical query;
res = collection.find().count() print(res)
Ⅳ query by id;
Third party libraries need to be introduced here.
from bson.objectid import ObjectId res = collection.find({"_id": ObjectId("5cc506289e1d88c95465488e")}) print(res[0])
Ⅴ ascending sort;
res = collection.find().sort("age") for row in res: print(row)
Ⅵ descending sort;
Third party libraries also need to be introduced here.
import pymongo res = collection.find().sort("age",pymongo.DESCENDING) for row in res: print(row)
VII paging query
res = collection.find().limit(3).skip(5) for row in res: print(row)