Construct dataset
import pandas as pd

data = {
    'full name': [' Classmate Huang', 'Huang Zhizun', 'Huang Laoxie ', 'Da Mei Chen', 'Sun Shangxiang'],
    'English name': ['Huang tong_xue', 'huang zhi_zun', 'Huang Lao_xie', 'Chen Da_mei', 'sun shang_xiang'],
    'Gender': ['male', 'women', 'men', 'female', 'male'],
    'ID': ['463895200003128433', '429475199912122345', '420934199110102311', '431085200005230122', '420953199509082345'],
    'height': ['mid:175_good', 'low:165_bad', 'low:159_bad', 'high:180_verygood', 'low:172_bad'],
    'Home address': ['Guangshui, Hubei', 'Xinyang, Henan', 'Guangxi Guilin', 'Hubei Xiaogan', 'Guangzhou, Guangdong'],
    'Telephone number': ['13434813546', '19748672895', '16728613064', '14561586431', '19384683910'],
    'income': ['1.1 ten thousand', '8.5 thousand', '0.9 ten thousand', '6.5 thousand', '2.0 ten thousand'],
}
df = pd.DataFrame(data)
df
1. cat function
This function is mainly used to concatenate strings element by element, with an optional separator:
df["full name"].str.cat(df["Home address"], sep='-' * 3)
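As a small sketch of how cat treats missing values and a missing second argument, the example below uses made-up names and cities (not the dataset above). The na_rep parameter is part of the pandas API; the variable names are only illustrative.

```python
import pandas as pd

# Illustrative values, not the article's dataset
names = pd.Series(["Ann", "Bob", None])
cities = pd.Series(["Paris", "Rome", "Oslo"])

# Element-wise concatenation; a missing value on either side gives a missing result
joined = names.str.cat(cities, sep="---")

# na_rep substitutes a placeholder for missing values instead
joined_filled = names.str.cat(cities, sep="---", na_rep="?")

# With no second argument, the whole Series is joined into one string
single = cities.str.cat(sep=",")

print(joined_filled.tolist())
print(single)
```

Passing na_rep is usually worth considering whenever either column may contain missing values, since one missing side otherwise blanks out the whole result for that row.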
2. contains function
This function is mainly used to check whether each string contains a given substring (the pattern is treated as a regular expression by default):
df["Home address"].str.contains("Guang")
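The sketch below illustrates three options of contains that are easy to overlook: na for missing values, case for case-insensitive matching, and regex=False for literal matching. The sample addresses are illustrative only.

```python
import pandas as pd

# Illustrative values; the last entry is missing on purpose
addresses = pd.Series(["Guangshui, Hubei", "Xinyang, Henan", None])

# na=False turns missing values into False so the result can be used as a mask
mask = addresses.str.contains("Guang", na=False)

# case=False ignores upper/lower case
mask_ci = addresses.str.contains("hubei", case=False, na=False)

# "." is a regex wildcard by default; regex=False matches a literal dot
dots = pd.Series(["a.b", "axb"])
as_regex = dots.str.contains(".")
as_literal = dots.str.contains(".", regex=False)

print(addresses[mask].tolist())
```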
3. startswith and endswith functions
These functions are mainly used to check whether each string starts or ends with a given substring:
# The name in the first row begins with a space
df["full name"].str.startswith("Huang")
df["English name"].str.endswith("e")
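A leading space is enough to make startswith return False, which is why the comment above matters. The following sketch, with made-up names, shows the effect and one possible remedy (stripping first).

```python
import pandas as pd

# Illustrative names; the first one has a leading space
names = pd.Series([" Huang A", "Huang B", "Chen C"])

# The leading space prevents a match in the first row
raw = names.str.startswith("Huang")

# Stripping whitespace first gives the intended result
clean = names.str.strip().str.startswith("Huang")

print(raw.tolist())
print(clean.tolist())
```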
4. count function
This function is mainly used to count how many times a given pattern appears in each string:
df["Telephone number"].str.count("3")
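Because count interprets its argument as a regular expression, special characters need escaping. The sketch below, using illustrative strings, contrasts a plain digit, an escaped dot, and an unescaped dot.

```python
import pandas as pd

# Illustrative values
s = pd.Series(["13434813546", "a.b.c"])

# Plain character: number of "3" characters in each string
threes = s.str.count("3")

# Escaped dot: counts literal dots only
literal_dots = s.str.count(r"\.")

# Unescaped dot matches any character, so this equals the string length
any_char = s.str.count(".")

print(threes.tolist(), literal_dots.tolist(), any_char.tolist())
```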
5. get function
This function is mainly used to get the element at a specified position, either a character of each string or an item of each list:
df["full name"].str.get(-1)
df["height"].str.split(":")
df["height"].str.split(":").str.get(0)
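The sketch below shows get applied to strings and to the lists produced by split, and what happens when the position does not exist. The two sample values are copied from the height column for illustration.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# On strings, get returns a single character
last_char = heights.str.get(-1)

# On lists (here produced by split), get returns a list item
parts = heights.str.split(":")
labels = parts.str.get(0)

# An out-of-range position yields a missing value instead of raising an error
missing = parts.str.get(5)

print(last_char.tolist(), labels.tolist())
```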
6. len function
This function is mainly used to compute the length of each string:
df["Gender"].str.len()
7. upper and lower functions
These functions are mainly used to convert English text to upper or lower case:
df["English name"].str.upper()
df["English name"].str.lower()
8. pad+side parameter / center function
This function is mainly used to pad each string to a given width by adding a fill character on the left, on the right, or on both sides:
df["Home address"].str.pad(10, fillchar="*")  # side="left" is the default; equivalent to rjust()
df["Home address"].str.pad(10, side="right", fillchar="*")  # Equivalent to ljust()
df["Home address"].str.center(10, fillchar="*")
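The naming can be confusing: padding on the left right-justifies the text, and padding on the right left-justifies it. This sketch checks the equivalences on two illustrative strings.

```python
import pandas as pd

# Illustrative values
s = pd.Series(["ab", "abcd"])

# Default side="left": fill characters go on the left, like rjust()
left = s.str.pad(5, fillchar="*")

# side="right": fill characters go on the right, like ljust()
right = s.str.pad(5, side="right", fillchar="*")

# side="both": fill characters go on both sides, like center()
both = s.str.pad(6, side="both", fillchar="*")

print(left.tolist())
print(right.tolist())
print(both.tolist())
```

Strings already at least as long as the requested width are returned unchanged.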
9. repeat function
This function is mainly used to repeat each string a given number of times:
df["Gender"].str.repeat(3)
10. slice_replace function
This function is mainly used to replace the characters in a specified slice of each string with a given string:
df["Telephone number"].str.slice_replace(4, 8, "*" * 4)
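To make the slice boundaries concrete, the sketch below masks one phone number taken from the dataset and also shows which characters the slice covers. The start position is inclusive and the stop position is exclusive, as with ordinary Python slicing.

```python
import pandas as pd

phones = pd.Series(["13434813546"])

# Characters at positions 4, 5, 6 and 7 are the ones that get replaced
covered = phones.str.slice(4, 8)

# Replace that slice with four asterisks
masked = phones.str.slice_replace(4, 8, "*" * 4)

print(covered.tolist(), masked.tolist())
```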
11. replace function
This function is mainly used to replace every occurrence of a given substring with another string:
df["height"].str.replace(":", "-")
It also accepts a regular expression as the pattern. In pandas 2.0 and later the pattern is treated as a literal string unless regex=True is passed:
df["income"].str.replace(r"\d+\.\d+", "regular", regex=True)
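Passing regex explicitly avoids surprises across pandas versions. The sketch below, using two values from the income column, contrasts a literal replacement with a regular-expression replacement.

```python
import pandas as pd

income = pd.Series(["1.1 ten thousand", "8.5 thousand"])

# Literal replacement: the dot is an ordinary character here
literal = income.str.replace(".", ",", regex=False)

# Regex replacement: a raw string keeps the backslashes intact
masked = income.str.replace(r"\d+\.\d+", "N", regex=True)

print(literal.tolist())
print(masked.tolist())
```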
12. split method + expand parameter
This function is mainly used to split each string on a separator; with expand=True, the pieces are spread across several columns:
# Common usage
df["height"].str.split(":")
# split method with expand parameter
df[["Height description", "final height"]] = df["height"].str.split(":", expand=True)
df
# split method with join method
df["height"].str.split(":").str.join("?" * 5)
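The sketch below shows the shape of the expand=True result and the n parameter, which limits the number of splits. The column names label and value are hypothetical, chosen only for this example.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# expand=True returns a DataFrame with one column per piece
parts = heights.str.split(":", expand=True)
parts.columns = ["label", "value"]  # hypothetical column names

# n=1 splits only at the first separator and keeps the rest together
limited = pd.Series(["a:b:c"]).str.split(":", n=1, expand=True)

print(parts)
print(limited)
```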
13. strip, rstrip and lstrip functions
These functions are mainly used to remove leading and trailing whitespace, including line breaks (lstrip works on the left side only, rstrip on the right side only):
df["full name"].str.len()
df["full name"] = df["full name"].str.strip()
df["full name"].str.len()
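As a short sketch of the difference between the three functions, the example below uses illustrative names with whitespace on different sides, plus one call that strips a custom character instead of whitespace.

```python
import pandas as pd

# Illustrative values with a leading space, a trailing space, and a tab/newline pair
names = pd.Series([" Ann", "Bob ", "\tCy\n"])

before = names.str.len()
both_sides = names.str.strip()
left_only = names.str.lstrip()
right_only = names.str.rstrip()

# A string argument strips those characters instead of whitespace
stars = pd.Series(["**x**"]).str.strip("*")

print(before.tolist(), both_sides.str.len().tolist())
```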
14. findall function
This function is mainly used to match a regular expression against each string and return all matches as a list:
df["height"]
df["height"].str.findall("[a-zA-Z]+")
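The following sketch shows the list-per-row result of findall on two values from the height column, once for runs of letters and once for runs of digits.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# Every run of letters in each string
words = heights.str.findall("[a-zA-Z]+")

# Every run of digits in each string
numbers = heights.str.findall(r"\d+")

print(words.tolist())
print(numbers.tolist())
```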
15. extract, extractall functions
These functions are mainly used to extract the parts of each string that match a regular expression; the pattern must contain at least one capture group in parentheses:
df["height"].str.extract("([a-zA-Z]+)")
# extractall returns every match, indexed by a MultiIndex
df["height"].str.extractall("([a-zA-Z]+)")
# extract with expand parameter
df["height"].str.extract("([a-zA-Z]+).*?([a-zA-Z]+)", expand=True)
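Named capture groups are one way to get readable column names out of extract. The sketch below assumes the height values follow a label:number_rating layout, as the sample data does; the group names label, cm and rating are hypothetical.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# extract keeps only the first match per row; one column per capture group
first = heights.str.extract(r"([a-zA-Z]+)")

# Named groups (hypothetical names) become the column names
named = heights.str.extract(r"(?P<label>[a-z]+):(?P<cm>\d+)_(?P<rating>[a-z]+)")

# extractall keeps every match; the second index level is called "match"
all_words = heights.str.extractall(r"([a-zA-Z]+)")

print(named)
print(all_words)
```

The extracted values are still strings, so a column such as cm would need an explicit conversion (for example with astype(int)) before numeric use.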