1, Data analysis workflow
1. Clarify the purpose and approach of the analysis / propose hypotheses
2. Data collection
3. Data processing / cleaning
4. Data analysis / hypothesis validation
5. Data presentation / visualization
6. Report writing
2, Analysis purpose
- Demand 01: unit price of second-hand housing per square meter (total price, monthly average price)
- Demand 02: total built-up area in each region, sorted in descending order
- Demand 03: analyzed week by week, is the number of second-hand housing transactions in Beijing rising, falling, or basically unchanged?
- Demand 04: analyzed week by week, what is the trend of the average unit price of second-hand housing transactions?
- Demand 05: average listing cycle by large area / community / region
3, Data collection
The data from a real-estate website has already been saved to a CSV file.
4, Data processing
4.1 importing the Python data analysis libraries
import numpy as np               # general numerical operations
import pandas as pd              # data analysis: loading, feature extraction, cleaning, transformation
import matplotlib as mpl         # data visualization
import matplotlib.pyplot as plt  # quick plotting of 2D charts
4.2 enabling Chinese text in plots
mpl.rcParams["font.family"] = "SimHei"        # set the font
mpl.rcParams["axes.unicode_minus"] = False    # display the minus sign correctly
plt.rcParams["font.sans-serif"] = ["SimHei"]  # display Chinese labels correctly
# Show matplotlib figures inside the notebook page instead of a pop-up window
%matplotlib inline
4.3 reading the data
lianjia = pd.read_csv("XXXXXXXX.csv", encoding="utf-8", sep="\t")  # read the tab-separated file
pd.set_option("display.max_colwidth", 60)  # show at most 60 characters per field
pd.set_option("display.max_columns", 50)   # show at most 50 columns per DataFrame
lianjia.head(3)  # view the first three rows
Result:
Columns: Transaction price (10000 yuan), Closing time, Residential area, Apartment layout, Built-up area, Listing price (10000 yuan), Transaction period (days), Price adjustment (Times), Take a look (Times), Attention (person), Browse (Times), Chain number, Trading Right, Listing time, Housing use, Housing life, Ownership of premises, House type, Floor, Structure of apartment layout, Inner area(㎡), Building type, House facing, Built in, Decoration situation, Building structure, Heating mode, Scale of ladder households, Property right years, Equipped with elevator, xx1, xx2
Row 0: Daxing, NaN, NaN, ... (every other field is NaN)
Row 1: 297, 2019-10-29 Deal, green cloud villa, 3 rooms 1 hall, 89.95, 300, 608, 1.0, 4.0, 43.0, 6424, 1.01E+11, commercial housing, 2018/3/1, ordinary residence, two years in total, non shared, 3 rooms 1 hall 1 kitchen 1 bathroom, middle floor (9 floors in total), no data, 73.63, tower, north and south, 2014, hardbound, steel concrete structure, central heating, one ladder two households, 70, NaN, NaN, NaN
Row 2: 366, 2019-10-29 Deal, sanyangli, 2 rooms 1 hall, 89.79, 368, 31, 0.0, 3.0, 4.0, 118, 1.01E+11, commercial housing, 2019/9/29, ordinary residence, two years in total, non shared, 2 rooms 1 hall 1 kitchen 1 bathroom, middle floor (6 floors in total), flat floor, 80.44, plank building, south north, 2009, other, steel concrete structure, central heating, one ladder two households, 70, no, NaN, NaN
View the overall structure of data
lianjia.info() #View the overall structure of data
Result:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38393 entries, 0 to 38392
Data columns (total 32 columns):
Transaction price (10000 yuan)    38393 non-null object
Closing time                      38379 non-null object
Residential area                  38379 non-null object
Apartment layout                  38379 non-null object
Built-up area                     38379 non-null object
Listing price (10000 yuan)        38379 non-null object
Transaction period (days)         38379 non-null object
Price adjustment (Times)          38379 non-null float64
Take a look (Times)               38379 non-null float64
Attention (person)                38379 non-null float64
Browse (Times)                    38379 non-null object
Chain number                      38379 non-null object
Trading Right                     38379 non-null object
Listing time                      38379 non-null object
Housing use                       38379 non-null object
Housing life                      38379 non-null object
Ownership of premises             38379 non-null object
House type                        38379 non-null object
Floor                             38379 non-null object
Structure of apartment layout     38379 non-null object
Inner area(㎡)                     38379 non-null object
Building type                     38379 non-null object
House facing                      38379 non-null object
Built in                          38379 non-null object
Decoration situation              38379 non-null object
Building structure                38379 non-null object
Heating mode                      38379 non-null object
Scale of ladder households        38379 non-null object
Property right years              38379 non-null object
Equipped with elevator            38379 non-null object
xx1                               3022 non-null object
xx2                               3022 non-null object
dtypes: float64(3), object(29)
memory usage: 9.4+ MB
4.4 data preprocessing
4.4.1 handling the region
The transaction price field has more non-null values than the other fields (38393 versus 38379). Checking the original table shows that the region names are also stored in this field, so a new "Large area" field is added below to hold the region.
lianjia["Large area"] = lianjia["Transaction price (10000 yuan)"]  # copy the transaction price field into the new region field
lianjia[["Large area", "Transaction price (10000 yuan)"]].head(10)  # view the newly added field
   Large area  Transaction price (10000 yuan)
0  Daxing      Daxing
1  297         297
2  366         366
3  226         226
4  548         548
5  245         245
6  254         254
7  193         193
8  280         280
9  347         347
Replace the numbers in the region field with the region name
# Remove the "-" characters, then replace numeric values with NaN
lianjia["Large area"] = lianjia["Large area"].str.replace("-", "").replace(r"\d+", np.nan, regex=True)
# Fill each NaN with the region name that precedes it
lianjia["Large area"] = lianjia["Large area"].ffill()
# Delete the region header rows, which are NaN in almost every field
lianjia.dropna(axis=0, inplace=True, thresh=20)
# Move the region field to the first column
lianjia_daqu = lianjia["Large area"]
lianjia.drop("Large area", axis=1, inplace=True)
lianjia.insert(0, "Large area", lianjia_daqu)
display(lianjia[["Large area", "Closing time", "Residential area", "Apartment layout", "Built-up area"]].sample(10))  # view the result
       Large area                 Closing time     Residential area               Apartment layout  Built-up area
6654   Fangshan                   2019-08 Deal     Biguiyuan community zone 2     1room0office      58
15175  Mentougou                  2019-09 Deal     Xinqiao Road community         2room1office      51.47
29127  Westlife                   2019-05-12 Deal  Dragon claw and Sophora Hutong 2room1office      67.36
25210  Tongzhou                   2019-06 Deal     Haitangwan phase I             2room2office      91.07
1797   Daxing                     2019-07 Deal     Liyuan C area                  3room2office      141.48
33204  Changping                  2019.09.03       New Dragon City                2room1office      100.2
36054  Dongcheng                  1905/7/11        Qianmen East Street            3room1office      69.79
5760   Chaoyang                   2019-09-04 Deal  Fuli City D area               2room1office      85.04
12756  Haidian                    2019-09-30 Deal  Ding Hui Bei Li                3room1office      82.44
32146  Yizhuang Development Zone  2017-03-07 Deal  Rong Jing Lido                 1room0office      41.44
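The forward-fill step above can be illustrated on a tiny made-up table. This is only a sketch: the column names price, area and region, and the five rows, are invented for the example and are not taken from the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the raw layout: a region header row, then its listings.
raw = pd.DataFrame({
    "price": ["Daxing", "297", "366", "Haidian", "548"],
    "area": [np.nan, "89.95", "89.79", np.nan, "120.5"],
})

# Keep only the non-numeric entries (the region names); numeric cells become NaN.
region = raw["price"].where(~raw["price"].str.fullmatch(r"[\d.\-]+"))

# Propagate each region name down to the listings that follow it.
raw["region"] = region.ffill()

# Drop the header rows themselves: they carry no data besides the region name.
listings = raw.dropna(subset=["area"]).reset_index(drop=True)
print(listings)
```

Each listing ends up labelled with the nearest region header above it, which is the same idea the article applies with ffill followed by dropna(thresh=20).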
4.4.2 handling xx1 and xx2
The earlier info() listing shows that these two fields consist mostly of missing values (3022 non-null values out of 38393 rows).
display(lianjia["xx1"].unique())  # distinct values of xx1
display(lianjia["xx2"].unique())  # distinct values of xx2
array([nan, '70', '40', '50', 'Unknown'], dtype=object)
array([nan, 'Yes', 'nothing', 'No data'], dtype=object)
# Delete the xx1 and xx2 columns
lianjia.drop(columns=["xx1", "xx2"], inplace=True)
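Instead of reading the non-null counts off the info() listing by eye, the share of missing values per column can be computed directly. A minimal sketch on invented data follows; the 0.5 threshold is an arbitrary assumption for the example, not a rule from the article.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column "a" is complete, column "xx1" is mostly empty.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "xx1": [np.nan, np.nan, np.nan, "70"],
})

# isna() gives a boolean frame; its column mean is the fraction of missing values.
missing_ratio = df.isna().mean()

# Columns where more than half of the values are missing (threshold is an assumption).
mostly_empty = missing_ratio[missing_ratio > 0.5].index.tolist()

df = df.drop(columns=mostly_empty)
print(missing_ratio)
print(df.columns.tolist())
```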
After this step no field has missing values:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 35357 entries, 1 to 35369
Data columns (total 31 columns):
Large area                        35357 non-null object
Transaction price (10000 yuan)    35357 non-null object
Closing time                      35357 non-null object
Residential area                  35357 non-null object
Apartment layout                  35357 non-null object
Built-up area                     35357 non-null object
Listing price (10000 yuan)        35357 non-null object
Transaction period (days)         35357 non-null object
Price adjustment (Times)          35357 non-null float64
Take a look (Times)               35357 non-null float64
Attention (person)                35357 non-null float64
Browse (Times)                    35357 non-null object
Chain number                      35357 non-null object
Trading Right                     35357 non-null object
Listing time                      35357 non-null object
Housing use                       35357 non-null object
Housing life                      35357 non-null object
Ownership of premises             35357 non-null object
House type                        35357 non-null object
Floor                             35357 non-null object
Structure of apartment layout     35357 non-null object
Inner area(㎡)                     35357 non-null object
Building type                     35357 non-null object
House facing                      35357 non-null object
Built in                          35357 non-null object
Decoration situation              35357 non-null object
Building structure                35357 non-null object
Heating mode                      35357 non-null object
Scale of ladder households        35357 non-null object
Property right years              35357 non-null object
Equipped with elevator            35357 non-null object
dtypes: float64(3), object(28)
memory usage: 8.6+ MB
4.4.3 handling dates
# View all date-related fields
display(lianjia[["Closing time", "Transaction period (days)", "Listing time"]].sample(10))
display(lianjia[["Closing time", "Transaction period (days)", "Listing time"]].dtypes)
# First, remove the " Deal" suffix from the "Closing time" field
lianjia["Closing time"] = lianjia["Closing time"].str.replace(" Deal", "")
# Convert to a uniform datetime format
# (on pandas 2.x, a column that mixes several date formats may need format="mixed")
lianjia["Closing time"] = pd.to_datetime(lianjia["Closing time"])
lianjia["Listing time"] = pd.to_datetime(lianjia["Listing time"])
# Recompute the transaction period and convert it to days
lianjia["Transaction cycle(new)"] = lianjia["Closing time"] - lianjia["Listing time"]
lianjia["Transaction period (days)"] = lianjia["Transaction cycle(new)"].dt.days
# Extract the year and the week of the closing time
lianjia["Transaction time (year)"] = lianjia["Closing time"].dt.year
lianjia["Closing time (week)"] = lianjia["Closing time"].dt.isocalendar().week  # .dt.week was removed in pandas 2.0
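The date handling above can be tried on two rows. This is a sketch under assumptions: the sample strings follow the formats seen in the earlier output ("2019-10-29 Deal", "2018/3/1"), the second row is invented, and the variable names are placeholders.

```python
import pandas as pd

closing = pd.Series(["2019-10-29 Deal", "2019-09-04 Deal"])
listing = pd.Series(["2018/3/1", "2019/9/1"])

# Strip the literal " Deal" suffix, then parse both columns as datetimes.
closed = pd.to_datetime(closing.str.replace(" Deal", "", regex=False))
listed = pd.to_datetime(listing, format="%Y/%m/%d")

# Subtracting datetimes gives a timedelta; .dt.days converts it to whole days.
days = (closed - listed).dt.days

# isocalendar() returns ISO year, week and weekday columns.
iso = closed.dt.isocalendar()

print(days.tolist())         # [607, 3]
print(iso["week"].tolist())  # [44, 36]
```

For the first row the recomputed period is 607 days, while the website's own field shows 608 for the same listing, which is one reason to recompute the value rather than trust the scraped column. Note also that isocalendar() reports the ISO year alongside the week; around New Year it can differ from the calendar year returned by .dt.year.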
4.4.4 handling other fields
lianjia[["Large area", "Residential area", "Apartment layout", "Built-up area"]].loc[[30922, 32852, 8784, 31629]]
Large area, Residential area, Apartment layout, Built-up area
30922  Yizhuang Development Zone, Lincoln Park Phase II Area C, --
32852  Changping, first smart Club, parking space, 6.99 house structure
8784   Fangshan, palace garden, #NAME?, --
31629  Yizhuang Development Zone, new Hainan Island, #NAME?, --
# Delete rows whose apartment layout is "parking space": parking spaces are not analyzed and their built-up area is not standardized
lianjia.drop(lianjia[lianjia["Apartment layout"] == "parking space"].index, inplace=True)
# Delete rows whose apartment layout is "#NAME?" or whose built-up area is "--"
lianjia.drop(lianjia[(lianjia["Apartment layout"] == "#NAME?") | (lianjia["Built-up area"] == "--")].index, inplace=True)
# Strip Chinese characters and whitespace from the built-up area, then convert it to floating point
lianjia["Built-up area"] = lianjia["Built-up area"].str.replace(r"[\s\u4e00-\u9fa5]", "", regex=True)
lianjia["Built-up area"] = lianjia["Built-up area"].astype(np.float32)
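To see what the character class does, here is a small sketch on invented strings. The pattern removes whitespace and characters in the range U+4E00 to U+9FA5 (common CJK ideographs); the sample values are assumptions, not rows from the file.

```python
import pandas as pd

# Hypothetical raw values: numbers mixed with a Chinese unit suffix and stray spaces.
area = pd.Series(["89.95平米", " 58 ", "141.48 平米"])

# Remove whitespace and CJK ideographs, leaving only the numeric text.
cleaned = area.str.replace(r"[\s\u4e00-\u9fa5]", "", regex=True)

as_float = cleaned.astype(float)
print(as_float.tolist())  # [89.95, 58.0, 141.48]
```

The range does not cover symbols such as ㎡ (U+33A1), so a value carrying that symbol would still fail the float conversion and would need its own replacement.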
lianjia[["Transaction price (10000 yuan)", "Listing price (10000 yuan)", "Price adjustment (Times)", "Take a look (Times)", "Attention (person)", "Browse (Times)"]].sample(10)
       Transaction price (10000 yuan)  Listing price (10000 yuan)  Price adjustment (Times)  Take a look (Times)  Attention (person)  Browse (Times)
10648  290                             290                         0.0                       2.0                  8.0                 288
30125  363                             390                         0.0                       6.0                  14.0                3680
14244  648                             680                         0.0                       0.0                  0.0                 No data
21619  209-214                         226                         1.0                       2.0                  3.0                 155
27061  327-334                         355                         1.0                       6.0                  132.0               1423
4598   499                             480                         0.0                       27.0                 42.0                924
35070  805                             850                         1.0                       24.0                 51.0                13856
5253   437                             437                         1.0                       46.0                 82.0                962
1369   293-324                         No data                     0.0                       0.0                  0.0                 No data
12051  586                             600                         1.0                       84.0                 55.0                5399
# Some transaction prices are ranges such as 293-324; take the average of the two numbers.
# The function splits on "-": a single number is returned unchanged, two numbers are replaced by their average.
def handle(value):
    values2 = str(value).split("-")
    if len(values2) == 1:
        return value
    else:
        result = (float(values2[0]) + float(values2[1])) / 2
        return str(result)

lianjia["Transaction price (10000 yuan)"] = lianjia["Transaction price (10000 yuan)"].map(handle)  # apply the function to every value
lianjia["Transaction price (10000 yuan)"] = lianjia["Transaction price (10000 yuan)"].astype(np.float32)
# Replace the "No data" placeholder with 0
lianjia["Listing price (10000 yuan)"] = lianjia["Listing price (10000 yuan)"].str.replace("No data", "0")
lianjia["Browse (Times)"] = lianjia["Browse (Times)"].str.replace("No data", "0")
# Type conversion
lianjia["Listing price (10000 yuan)"] = lianjia["Listing price (10000 yuan)"].astype(np.float32)
lianjia["Transaction period (days)"] = lianjia["Transaction period (days)"].astype(np.float32)
lianjia["Browse (Times)"] = lianjia["Browse (Times)"].astype(np.float32)
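The range-averaging step can also be written without a Python-level function. The following is an alternative sketch, not the author's method: it splits every value on "-" into two columns and takes the row mean, which ignores the empty second column for single prices. The sample values are taken from the table above; the variable names are placeholders.

```python
import pandas as pd

prices = pd.Series(["290", "209-214", "293-324"])

# Split into at most two columns; a single price leaves the second column empty.
parts = prices.str.split("-", expand=True).apply(pd.to_numeric)

# Row-wise mean skips missing values, so "290" stays 290.0.
mean_price = parts.mean(axis=1)
print(mean_price.tolist())  # [290.0, 211.5, 308.5]
```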
lianjia[["Chain number", "Trading Right", "Housing use", "Housing life"]].sample(10)
       Chain number  Trading Right             Housing use         Housing life
30379  1.01E+11      Commercial housing        Ordinary residence  No data
32736  1.01E+11      Commercial housing        Ordinary residence  No data
4078   1.01E+11      Commercial housing        Ordinary residence  Five years
9624   1.01E+11      Commercial housing        Ordinary residence  Five years
7003   1.01E+11      Commercial housing        Ordinary residence  Five years
31937  1.01E+11      Commercial housing        apartment           No data
13796  1.01E+11      Commercial housing        Ordinary residence  No data
5560   1.01E+11      Purchased public housing  Ordinary residence  Five years
11346  1.01E+11      Commercial housing        Five years in general
598    1.01E+11      Commercial housing        Ordinary residence  Five years
lianjia[["Ownership of premises", "Apartment layout", "Floor", "Structure of apartment layout", "Inner area(㎡)"]].sample(10)
       Ownership of premises  Apartment layout          Floor                         Structure of apartment layout  Inner area(㎡)
29047  Non co ownership       2room2office1kitchen1Wei  Low floor(common7layer)       Flat layer                     No data
17853  Share                  2room1office1kitchen1Wei  Top floor(common6layer)       Flat layer                     No data
29596  No data                2room1office1kitchen2Wei  Middle floor(common19layer)   Flat layer                     No data
14038  Non co ownership       2room1office1kitchen1Wei  Tall building(common18layer)  No data                        Temporarily numerous
27348  Non co ownership       1room0office1kitchen1Wei  Middle floor(common26layer)   No data                        18.53
31512  Non co ownership       2room1office1kitchen1Wei  Middle floor(common6layer)    Flat layer                     77.18
25027  Non co ownership       1room0office1kitchen1Wei  Low floor(common7layer)       No data                        No data
687    Non co ownership       1room1office1kitchen1Wei  Middle floor(common15layer)   Flat layer                     46.22
3567   Share                  3room1office1kitchen1Wei  Low floor(common6layer)       Flat layer                     No data
15897  Non co ownership       1room1office1kitchen1Wei  Bottom(common6layer)          Flat layer                     No data
# Fill the missing inner area using listings with the same apartment layout
cols = ["Ownership of premises", "Apartment layout", "Floor", "Structure of apartment layout", "Inner area(㎡)"]
# Rows whose inner area holds a placeholder string rather than a number
is_placeholder = lianjia["Inner area(㎡)"].str.contains("No time|build|number")
temp_df1 = lianjia[~is_placeholder][cols]
temp_df2 = lianjia[is_placeholder][cols].copy()
temp_df2["Inner area(㎡)"] = temp_df2["Inner area(㎡)"].replace("No data", np.nan)
temp_df2["Inner area(㎡)"] = temp_df2["Inner area(㎡)"].replace("Temporarily numerous", np.nan)
temp_df2["Inner area(㎡)"] = temp_df2["Inner area(㎡)"].replace(r"\d+\s+.*", np.nan, regex=True)
lianjia_new5 = pd.concat((temp_df1, temp_df2))
lianjia[cols] = lianjia_new5[cols]
lianjia[cols].tail(10)
       Ownership of premises  Apartment layout          Floor                         Structure of apartment layout  Inner area(㎡)
35360  No data                2room1office1kitchen1Wei  Middle floor(common6layer)    Flat layer                     NaN
35361  Non co ownership       2room2office1kitchen1Wei  Bottom(common6layer)          Flat layer                     NaN
35362  Non co ownership       3room1office1kitchen2Wei  Tall building(common21layer)  Flat layer                     117.11
35363  No data                1room1office1kitchen1Wei  Bottom(common5layer)          Flat layer                     NaN
35364  Non co ownership       3room1office1kitchen1Wei  Tall building(common6layer)   Flat layer                     84.11
35365  No data                3room1office1kitchen1Wei  Tall building(common6layer)   Flat layer                     NaN
35366  Non co ownership       1room0office0kitchen1Wei  Low floor(common28layer)      Flat layer                     NaN
35367  No data                1room0office0kitchen1Wei  Low floor(common28layer)      Flat layer                     NaN
35368  Non co ownership       1room2office1kitchen1Wei  Tall building(common6layer)   Flat layer                     NaN
35369  Non co ownership       2room2office1kitchen2Wei  Tall building(common6layer)   Flat layer                     NaN
# Type conversion
lianjia["Inner area(㎡)"] = lianjia["Inner area(㎡)"].astype(np.float32)
# Replace the remaining missing values with the mean inner area of the same apartment layout
lianjia["Inner area(㎡)"] = lianjia["Inner area(㎡)"].fillna(lianjia.groupby("Apartment layout")["Inner area(㎡)"].transform("mean"))
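The group-mean fill used above can be seen on a five-row example. This is a sketch with invented values; layout and inner_area are placeholder column names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "layout": ["2room1hall", "2room1hall", "2room1hall", "1room0hall", "1room0hall"],
    "inner_area": [70.0, np.nan, 80.0, 40.0, np.nan],
})

# transform("mean") returns one value per row: the mean of that row's group,
# computed from the non-missing values only.
group_mean = df.groupby("layout")["inner_area"].transform("mean")

# fillna with an aligned Series fills each gap with its own group's mean.
df["inner_area"] = df["inner_area"].fillna(group_mean)
print(df["inner_area"].tolist())  # [70.0, 75.0, 80.0, 40.0, 40.0]
```

A layout in which every inner area is missing has no mean to borrow, so its rows stay NaN. That is consistent with the later info() listing, where the inner area has 34475 non-null values out of 34520 rows.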
lianjia[["Building type", "House facing", "Built in", "Decoration situation", "Building structure"]].sample(10)
Building type, House facing, Built in, Decoration situation, Building structure
28721  plank building north south 1960 simple installation mixed structure
21679  plank building north south 2009 other steel concrete structure
20617  board building east west 2004 simple installation mixed structure
30381  plate building north south 2001 simple installation mixed structure
32116  slab building south north 2013 simple steel concrete structure
2539   slab building north south 2003 hardcover mixed structure
15401  board building north south 1980 hardbound mixed structure
21363  plate building south south 2012 simple steel concrete structure
13273  plank building south north 1996 other mixed structure
5942   plank building south northwest 2007 other steel concrete structure
lianjia[["Heating mode", "Scale of ladder households", "Property right years", "Equipped with elevator"]].sample(10)
Heating mode, Scale of ladder households, Property right years, Equipped with elevator
22469  central heating, one ladder, four households, 70 no
29064  central heating, two ladders, seven households, 70 households
23215  central heating, one ladder, three households, 70 no
26982  central heating, one ladder, three households, 70 no
3056   central heating, one ladder, three households, 70 households
25895  central heating, one ladder, two households, 70 no
33206  self heating, one ladder, four households, 70 no
31214  self heating, two ladders, three households and 70 households
7638   central heating, one ladder, nine households, 70 no
23420  central heating, one ladder, three households, 70 no
# If the property right period is unknown, fill it with 70
lianjia["Property right years"] = lianjia["Property right years"].str.replace("Unknown", "70")
lianjia["Property right years"] = lianjia["Property right years"].astype(np.int32)
# Delete the "Transaction cycle(new)" field
lianjia.drop("Transaction cycle(new)", axis=1, inplace=True)
4.4.5 checking the processed data
lianjia.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 34520 entries, 1 to 35369
Data columns (total 34 columns):
Large area                        34520 non-null object
Transaction price (10000 yuan)    34520 non-null float32
Closing time                      34520 non-null datetime64[ns]
Residential area                  34520 non-null object
Apartment layout                  34520 non-null object
Built-up area                     34520 non-null float32
Listing price (10000 yuan)        34520 non-null float32
Transaction period (days)         34520 non-null float32
Price adjustment (Times)          34520 non-null float64
Take a look (Times)               34520 non-null float64
Attention (person)                34520 non-null float64
Browse (Times)                    34520 non-null float32
Chain number                      34520 non-null object
Trading Right                     34520 non-null object
Listing time                      34520 non-null datetime64[ns]
Housing use                       34520 non-null object
Housing life                      34520 non-null object
Ownership of premises             34520 non-null object
House type                        34520 non-null object
Floor                             34520 non-null object
Structure of apartment layout     34520 non-null object
Inner area(㎡)                     34475 non-null float32
Building type                     34520 non-null object
House facing                      34520 non-null object
Built in                          34520 non-null object
Decoration situation              34520 non-null object
Building structure                34520 non-null object
Heating mode                      34520 non-null object
Scale of ladder households        34520 non-null object
Property right years              34520 non-null int32
Equipped with elevator            34520 non-null object
Transaction cycle(new)            34520 non-null timedelta64[ns]
Transaction time (year)           34520 non-null int64
Closing time (week)               34520 non-null int64
dtypes: datetime64[ns](2), float32(6), float64(3), int32(1), int64(2), object(19), timedelta64[ns](1)
memory usage: 8.3+ MB
lianjia.sample(3)
Columns in the same order as the info() listing; one sampled row per line:
17501  Mentougou 160.0 2017-12-20 Shuangyu Road community 2 rooms 1 hall 53.779999 180.0 151.0 1.0 85.0 144.0 8739.0 1.01E+11 commercial housing 2017-07-22 ordinary residence two years old non shared 2 rooms 1 hall 1 kitchen 1 bathroom ground floor (5 floors in total) flat floor 38.1500002 plank building south 1980 simple installation mixed structure central heating one ladder three households 70 no 151 days 2017 51
1675   Daxing 203.0 2019-07-11 Kangtai Garden 1 room 1 hall 60.099998 203.0 492.0 1.0 62.0 192.0 10405.0 1.01E+11 commercial housing 2018-03-06 ordinary residence no data non shared 1 room 1 hall 1 kitchen 1 bathroom ground floor (18 floors in total) flat floor 46.389999 board building north south 2009 simple steel concrete structure central heating one ladder two households 70 492 days 2019 28
14601  Haidian 938.5 2019-08-01 today's home 4 rooms 1 hall 179.509995 1380.0 66.0 1.0 21.0 17.0 842.0 1.01E+11 commercial housing 2019-05-27 ordinary residence no data non shared 4 rooms 1 hall 1 kitchen 3 bathroom middle floor (9 floors in total) 156.351532 plank building south 2000 hardbound steel concrete structure central heating one ladder six households 70 66 days 2019 31
5, Analyze requirements
5.1 unit price of second-hand housing per square meter
# Total built-up area
lianjia["Built-up area"].sum()
2961666.2
# Total transaction amount
lianjia["Transaction price (10000 yuan)"].sum()
15145877.0
# Unit price per square meter
result = lianjia["Transaction price (10000 yuan)"].sum() / lianjia["Built-up area"].sum()
display(str(result) + " ten thousand")
'5.1139717 ten thousand'
That is, about 51,140 yuan per square meter.
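The figure above divides total price by total area, which weights every square meter equally. Averaging each listing's own unit price would weight every listing equally, and the two can differ noticeably. A small sketch with two invented listings (price_10k and area_m2 are placeholder names):

```python
import pandas as pd

df = pd.DataFrame({
    "price_10k": [300.0, 900.0],  # transaction price in units of 10000 yuan
    "area_m2": [50.0, 300.0],     # built-up area in square meters
})

# Ratio of sums: the approach used in this section.
overall = df["price_10k"].sum() / df["area_m2"].sum()

# Mean of per-listing ratios: an alternative definition, shown for contrast.
per_listing = (df["price_10k"] / df["area_m2"]).mean()

print(round(overall, 4), per_listing)  # 3.4286 4.5
```

Which definition fits depends on the question being asked; the ratio of sums is less sensitive to small, expensive units.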
5.2 total built-up area in each region, in descending order
result_df = lianjia.groupby("Large area")[["Built-up area"]].sum()
result_df = result_df.sort_values("Built-up area", ascending=False)
display(result_df)
Large area                            Built-up area
Yizhuang Development Zone             300565.843750
Changping                             289278.125000
Shunyi                                286023.375000
Fangshan                              259237.218750
Daxing                                253338.468750
Tongzhou                              251888.953125
Chaoyang                              247918.687500
Haidian                               238134.578125
Mentougou                             230544.812500
Fengtai                               229984.734375
Xicheng                               202130.656250
Shijingshan                           165606.250000
Others (Pinggu Miyun huairouyanqing)  7014.509766
5.3 weekly change in the number of second-hand housing transactions in Beijing
result_df = lianjia.groupby(["Transaction time (year)", "Closing time (week)"]).size()
display(result_df.loc[2019].head(60))
Closing time (week)
1      459
2      110
3      144
4      157
5      283
6        1
7       46
8      104
9      620
10     190
11     176
12     151
13     171
14     690
15     270
16     286
17     292
18    1252
19     365
20     445
21     439
22    1642
23     418
24     470
25     486
26     651
27    1625
28     602
29     700
30     913
31    2432
32     660
33     733
34     769
35    2596
36     956
37    1007
38     994
39    1229
40    1541
41     861
42     988
43    1157
44     508
dtype: int64
year = 2019
mpl.rcParams["font.size"] = 12
plt.figure(figsize=(12, 6))
plt.bar(result_df.loc[year].index, result_df.loc[year].values)
plt.xticks(result_df.loc[year].index)
plt.yticks(np.linspace(0, 2750, 20))
font = {"family": "Kaiti", "style": "oblique", "weight": "normal", "color": "green", "size": 20}
plt.xlabel("week", fontdict=font)
plt.ylabel("Transaction number", fontdict=font)
plt.grid(axis="y", color="g", ls=":", lw=1)
plt.title(str(year) + " Volume of second-hand housing transactions in Beijing", fontdict=font, color="r")
Trading volume jumps sharply roughly every four to five weeks (for example in weeks 18, 22, 27, 31 and 35). This may correspond to the beginning or the end of a month and needs further exploration.
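One explanation worth checking, offered as a hypothesis rather than a finding: the sample in 4.4.1 contains closing times recorded as year and month only (for example 2019-08), and pd.to_datetime places such a value on the first day of the month. The sketch below uses only the standard library to list the ISO week containing the first day of each month from January to October 2019, for comparison with the weekly counts above.

```python
from datetime import date

# ISO week number of the 1st of each month, January to October 2019.
first_day_weeks = [date(2019, month, 1).isocalendar()[1] for month in range(1, 11)]
print(first_day_weeks)  # [1, 5, 9, 14, 18, 22, 27, 31, 35, 40]
```

These are the same weeks in which the counts jump (459, 283, 620, 690, 1252, 1642, 1625, 2432, 2596 and 1541), so part of each spike may come from month-only dates being assigned to the first of the month rather than from real trading activity. Counting how many raw closing times lack a day component would confirm or rule this out.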
5.4 weekly trend of the average unit price of second-hand housing transactions
result_df = lianjia.groupby(["Transaction time (year)", "Closing time (week)"])[["Transaction price (10000 yuan)", "Built-up area"]].sum()
# Weekly unit price = weekly total price / weekly total built-up area
result_df["Unit average price"] = result_df["Transaction price (10000 yuan)"] / result_df["Built-up area"]
display(result_df.loc[2019].head(60))
year = 2019
mpl.rcParams["font.size"] = 12
plt.figure(figsize=(12, 6))
plt.plot(result_df.loc[year].index, result_df.loc[year]["Unit average price"])
The average unit price rose with fluctuations and has remained broadly stable since week 20.
5.5 average transaction period per week
result_df2 = lianjia.groupby(["Transaction time (year)", "Closing time (week)"])[["Transaction period (days)"]].mean()
display(result_df2.loc[2019].head(60))
year = 2019
mpl.rcParams["font.size"] = 12
plt.figure(figsize=(12, 6))
plt.bar(result_df2.loc[year].index, result_df2.loc[year]["Transaction period (days)"])
plt.xticks(result_df2.loc[year].index)
plt.yticks(np.linspace(0, 225, 10))
font = {"family": "Kaiti", "style": "oblique", "weight": "normal", "color": "green", "size": 20}
plt.xlabel("week", fontdict=font)
plt.ylabel("Transaction period (days)", fontdict=font)
plt.grid(axis="y", color="g", ls=":", lw=1)
plt.title(str(year) + " Sales cycle of second-hand houses in Beijing", fontdict=font, color="r")
Weeks 6, 18, 22, 27, 31, 35 and 40 show unusually long transaction periods, which needs further analysis.
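Demand 05 asks for the average listing cycle by region and by community, which the weekly chart does not break down. A minimal sketch of that grouping on a small invented table follows; region, community and days are placeholder column names standing in for "Large area", "Residential area" and "Transaction period (days)", and the numbers are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["Daxing", "Daxing", "Haidian", "Haidian", "Chaoyang"],
    "community": ["A", "B", "C", "C", "D"],
    "days": [608, 31, 66, 120, 45],
})

# Average transaction period per region, longest first.
by_region = df.groupby("region")["days"].mean().sort_values(ascending=False)

# Average transaction period per community within each region.
by_community = df.groupby(["region", "community"])["days"].mean()

print(by_region)
print(by_community)
```

On the real data, communities with only a handful of sales will produce noisy averages, so reporting the group size alongside the mean (for example with agg(["mean", "size"])) is a reasonable precaution.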