Python string processing is widely used in web-crawler data parsing, large-scale text cleaning, and everyday file handling. Python has many efficient built-in string functions that are powerful and convenient to use. Today I will summarize the string-processing methods I use most often. I hope they make string handling easier for you.
1. Slicing and multiplying strings
(1) Slicing
text = 'Monday is a busy day'
print(text[0:7])  # the first seven characters (indexes 0 to 6): 'Monday '
print(text[-3:])  # from the third-to-last character to the end: 'day'
print(text[::])   # a copy of the whole string
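Slices also accept a third value, the step, and they tolerate out-of-range bounds. The following is a small sketch of both behaviours; the variable names are made up for illustration.

```python
# Sample string reused from the example above; the names are illustrative only.
day = 'Monday is a busy day'

first_word = day[:6]      # characters at indexes 0..5
every_other = day[::2]    # every second character, starting at index 0
reversed_day = day[::-1]  # a negative step walks the string backwards
beyond_end = day[15:100]  # an out-of-range bound does not raise, the slice just stops

print(first_word)
print(every_other)
print(reversed_day)
print(beyond_end)
```

Indexing a single character past the end (for example day[100]) does raise IndexError; only slices are forgiving.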
(2) Multiplication
When writing Python scripts we often need a separator line in the output. Multiplying a string by an integer repeats it, which makes such a line easy to build.
line = '*' * 30
print(line)  # ******************************
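As one possible use, a repeated string can frame a centered heading. This is only a sketch: the title text and the width of 30 are arbitrary choices, and str.center is used for the padding.

```python
title = 'Report'   # hypothetical heading text
width = 30         # arbitrary line width

rule = '=' * width
centered = title.center(width)  # pads with spaces on both sides

print(rule)
print(centered)
print(rule)

# Repetition works for longer strings too, and a count of 0 gives an empty string.
dashes = '-=' * 3
nothing = '*' * 0
```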
2. Splitting strings
(1) Simple splitting uses the split method. It only splits on one fixed separator string and does not support several different separators at once.
phone = '400-800-800-1234'
print(phone.split('-'))  # ['400', '800', '800', '1234']
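A few variations of split are handy in practice. The sketch below shows the optional maxsplit argument, rsplit, and split with no separator; the variable names are illustrative.

```python
phone = '400-800-800-1234'

# maxsplit=1 stops after the first separator.
prefix, rest = phone.split('-', 1)

# rsplit counts separators from the right instead.
head, last = phone.rsplit('-', 1)

# With no argument, split() cuts on any run of whitespace and drops empty pieces.
words = '  a  b\tc\n'.split()

print(prefix, rest)
print(head, last)
print(words)
```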
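(2) Complex splitting uses re.split. The r prefix marks a raw string, so backslashes are not treated as escapes. In the pattern below the separator is a semicolon, a comma, or a whitespace character, followed by zero or more additional whitespace characters.
import re

line = 'hello world; python, I ,like, it'
print(re.split(r'[;,\s]\s*', line))
# ['hello', 'world', 'python', 'I', '', 'like', 'it']
# The empty string comes from ' ,' in 'I ,like': the space and the comma each match as a separator.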
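When separators can sit next to each other, re.split may return empty strings. The sketch below uses a made-up sample string (raw) to show two common ways of avoiding them; neither is the only option.

```python
import re

raw = 'a, b;;c  d'   # hypothetical input with adjacent separators

# One separator character followed by optional whitespace: ';;' yields an empty piece.
pieces = re.split(r'[;,\s]\s*', raw)

# Option 1: filter the empty strings out afterwards.
filtered = [p for p in pieces if p]

# Option 2: treat any run of separator characters as a single separator.
merged = re.split(r'[;,\s]+', raw)

print(pieces)
print(filtered)
print(merged)
```

Option 2 can still produce an empty string at either end if the input starts or ends with a separator, so the filter remains useful in that case.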
3. Concatenating and joining strings
(1) Concatenation: two strings can be joined easily with "+".
str1 = 'Hello'
str2 = 'World'
new_str = str1 + str2
print(new_str)  # HelloWorld
(2) Joining a list of strings with the join method
url = ['www', 'python', 'org']
print('.'.join(url))  # www.python.org
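One detail that often trips people up: join only accepts strings. The sketch below, with made-up sample values, converts the items first and shows the error raised otherwise.

```python
parts = [2016, 11, 27]   # hypothetical non-string items

# Convert each item to a string before joining.
date = '-'.join(str(p) for p in parts)

# Joining the integers directly raises TypeError.
try:
    '-'.join(parts)
    join_failed = False
except TypeError:
    join_failed = True

print(date)
print(join_failed)
```

When many pieces are built in a loop, collecting them in a list and calling join once at the end is the usual idiom, rather than repeated "+".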
4. Determine whether a string starts or ends with a specified prefix or suffix
Suppose we want to check what a file name begins or ends with.
filename = 'trace.h'
print(filename.endswith('h'))        # True
print(filename.startswith('trace'))  # True
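Both startswith and endswith also accept a tuple of candidates, which is convenient for filtering file names. A small sketch with a made-up list of names:

```python
filenames = ['trace.h', 'main.c', 'notes.txt', 'README']   # hypothetical names

# A tuple means "ends with any of these".
sources = [name for name in filenames if name.endswith(('.c', '.h'))]

# Including the dot avoids false positives: 'graph' ends with 'h' but not with '.h'.
loose = 'graph'.endswith('h')
strict = 'graph'.endswith('.h')

print(sources)
print(loose, strict)
```

The argument must be a tuple; passing a list raises TypeError.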
5. Searching and matching strings
(1) General search
The find method locates a substring inside a longer string. It returns the index where the substring first starts, or -1 if the substring is not found. An optional second argument gives the index at which the search begins.
str1 = "this is string example....wow!!!"
str2 = "exam"
print(str1.find(str2))      # 15
print(str1.find(str2, 10))  # 15
print(str1.find(str2, 40))  # -1
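Related tools are worth a quick look: the in operator when only a yes/no answer is needed, index when a missing substring should be an error, and rfind to search from the right. A brief sketch, reusing the sample string from above:

```python
text = 'this is string example....wow!!!'

has_exam = 'exam' in text       # membership test, no index needed
first_is = text.find('is')      # first occurrence (inside 'this')
last_is = text.rfind('is')      # last occurrence

# index() behaves like find() but raises ValueError instead of returning -1.
try:
    text.index('missing')
    index_failed = False
except ValueError:
    index_failed = True

print(has_exam, first_is, last_is, index_failed)
```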
(2) Complex matching requires regular expressions.
import re

mydate = '11/27/2016'
if re.match(r'\d+/\d+/\d+', mydate):
    print('ok.match')
else:
    print('not match')
# prints: ok.match
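Note that re.match only checks the beginning of the string. The sketch below contrasts it with fullmatch and search and shows how capture groups pull out the pieces; the sample inputs are made up.

```python
import re

date_re = re.compile(r'(\d+)/(\d+)/(\d+)')

# match() only anchors at the start, so trailing junk is still accepted.
loose = date_re.match('11/27/2016abc') is not None

# fullmatch() requires the whole string to fit the pattern.
strict = date_re.fullmatch('11/27/2016abc') is not None

# search() scans for the first match anywhere in the string.
found = date_re.search('due on 11/27/2016').group()

# Parenthesized groups return the matched pieces as strings.
month, day, year = date_re.fullmatch('11/27/2016').groups()

print(loose, strict, found, (month, day, year))
```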
6. Counting occurrences of a character in a string
text = "thing example....wow!!!"
print(text.count('i', 0, 5))  # 1 (searches only indexes 0 to 4)
print(text.count('e'))        # 2
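To count every character in one pass, collections.Counter from the standard library is one option. A minimal sketch on the same sample string:

```python
from collections import Counter

text = 'thing example....wow!!!'

counts = Counter(text)           # maps each character to its number of occurrences
top = counts.most_common(1)      # the single most frequent character
missing = counts['z']            # absent keys count as 0 instead of raising

# str.count counts non-overlapping occurrences of a substring.
pairs = 'aaaa'.count('aa')

print(counts['e'], top, missing, pairs)
```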
7. Replacing text in strings
(1) For simple substitution, the replace method is enough.
text = 'python is an easy to learn,powerful programming language.'
print(text.replace('learn', 'study'))
# python is an easy to study,powerful programming language.
(2) Complex substitution needs the sub function of the re module.
import re

students = 'Boy 103,girl 105'
print(re.sub(r'\d+', '100', students))  # Boy 100,girl 100
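re.sub can also reuse what it matched. The sketch below shows a backreference in the replacement string and a function as the replacement; the sample values are only illustrative.

```python
import re

# Backreferences (\1, \2, \3) reorder the captured groups.
mydate = '11/27/2016'
iso = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', mydate)

# A function receives each match object and returns the replacement text.
students = 'Boy 103,girl 105'
bumped = re.sub(r'\d+', lambda m: str(int(m.group()) + 1), students)

# str.replace takes an optional count that limits how many replacements are made.
first_only = 'a-b-c'.replace('-', '+', 1)

print(iso)
print(bumped)
print(first_only)
```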
8. Removing specific characters from a string
(1) Stripping whitespace: when reading lines from a file during text processing, we often need to remove the spaces, tabs, or newline characters at the ends of each line.
text = ' python str '
print(text)           # the original string, spaces intact
print(text.strip())   # remove leading and trailing whitespace
print(text.lstrip())  # remove leading whitespace only
print(text.rstrip())  # remove trailing whitespace only
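The strip family also takes an optional argument listing the characters to remove. It is treated as a set of characters, not as a prefix or suffix, which can surprise. A short sketch with made-up inputs:

```python
padded = '\t python str \n'
clean = padded.strip()                  # tabs and newlines count as whitespace too

framed = '----hello====='.strip('-=')   # remove any '-' or '=' from both ends

# Pitfall: rstrip('.h') removes trailing '.' and 'h' characters, not the suffix '.h'.
name = 'graph.h'
pitfall = name.rstrip('.h')

# One safe way to drop a suffix (hypothetical helper logic, works on any Python 3).
suffix = '.h'
safe = name[:-len(suffix)] if name.endswith(suffix) else name

print(repr(clean), framed, pitfall, safe)
```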
(2) For complex text cleaning, str.translate can be used.
For example, first build a translation table with str.maketrans. Here the table maps each 't' to 'T' and each 'o' to 'O', character by character, and deletes every character listed in remove ('12345'). Then old_str.translate(table) applies the table to the whole string.
instr = 'to'
outstr = 'TO'
old_str = 'Hello world , welcome to use Python. 123456'
remove = '12345'
table = str.maketrans(instr, outstr, remove)
new_str = old_str.translate(table)
print(new_str)  # HellO wOrld , welcOme TO use PyThOn. 6
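str.maketrans also accepts a single dictionary, which allows replacements longer than one character. The sketch below shows that form and a common punctuation-stripping table; the sample strings are made up.

```python
import string

# Dictionary form: keys are single characters, values are strings or None (delete).
table = str.maketrans({'t': 'T', 'o': 'O', '1': None})
mapped = 'to 101'.translate(table)

# A value may be longer than one character.
expanded = 'R&D'.translate(str.maketrans({'&': ' and '}))

# Empty first two arguments plus a third one gives a delete-only table.
no_punct = 'Hello, world!'.translate(str.maketrans('', '', string.punctuation))

print(mapped)
print(expanded)
print(no_punct)
```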
Summary
We often use Python for scripting, and string processing is the most frequent task in those scripts, so I have collected these commonly used string methods here in the hope that they are useful to you.