Python Regular Expressions: match, search, and findall, Their Differences and Worked Examples

Content 

match
Matches only at the beginning of the string. Returns a Match object on success and None on failure; at most one match is returned.

search
Searches anywhere in the string, not only at the beginning. Returns a Match object on success and None on failure; only the first match is returned.

findall
Finds every non-overlapping match in the string and returns them as a list. Without parentheses in the pattern, each item is the full matched string; with one group, each item is that group's text; with several groups, each item is a tuple of the groups.
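To make the difference concrete, here is a minimal sketch that runs all three functions against one string. The sample text and the pattern are chosen only for illustration.

```python
import re

text = 'I love python but hate pig'  # sample string, chosen for illustration
pattern = r'p[a-z]+'                 # a 'p' followed by one or more lowercase letters

at_start = re.match(pattern, text)   # anchored at position 0, so this is None
first = re.search(pattern, text)     # first occurrence anywhere in the string
every = re.findall(pattern, text)    # all non-overlapping occurrences, as strings

print(at_start)       # None
print(first.group())  # python
print(every)          # ['python', 'pig']
```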

 

 

1. match

re.match() always tries to match from the beginning of the string and returns a Match object for the matched text. If the pattern does not match at the very start of the string, re.match() returns None, even when the pattern occurs later in the string.

e.g.1

In the example below, only string3 produces a match (the leading 'p'); the other three strings return None.

import re

string1='I love python but hate pig'
string2='I love python'
string3='python'
string4='123'
for s in (string1, string2, string3, string4):
    result = re.match(r'[p]', s)
    print(result)  # a Match object for string3, None for the others

 

import re
 
# Compile the regular expression into a Pattern object
pattern = re.compile(r'hello')
 
# Match the text against the Pattern; the result is None if there is no match
match = pattern.match('hello world, hello word')
 
if match:
    # Use the Match object to get the matched text
    print(match.group())

output:
hello

As this shows, re.match() is narrow in scope: it matches only at the start of the string and returns at most one match.

It can still be useful, depending on your needs. The following subsections introduce a range of pattern elements that can be used with it.

1.1 Matching characters between a and z

string3='python'
string4='123'
result = re.match(r'[a-z]', string3)
print(result.group()) # p

1.2 Matching characters between A and Z

string3='Python'
string4='123'
result = re.match(r'[A-Z]', string3)
print(result.group()) # P

1.3 Match characters between 0 and 9

ma = re.match(r'[0-9]', string4)
print(ma.group()) # 1

1.4 a-z, A-Z, and 0-9 can be combined

string3='python'
string4='123'
result = re.match(r'[a-zA-Z0-9]', string3)
print(result)

\w and \W are used in the same way: \w matches a word character (for ASCII text, [a-zA-Z0-9_], underscore included) and \W matches any non-word character.
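A small sketch of the two classes side by side may help. The sample string is made up for illustration; note that the underscore counts as a word character.

```python
import re

sample = 'user_1@example'  # made-up string for illustration

words = re.findall(r'\w+', sample)     # runs of word characters
non_words = re.findall(r'\W', sample)  # single non-word characters
underscore = re.match(r'\w', '_x')     # the underscore is matched by \w

print(words)               # ['user_1', 'example']
print(non_words)           # ['@']
print(underscore.group())  # _
```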

 

1.5 Matching numbers/non-numbers

string4 = '[];;:'
ma1 = re.match(r'\D', string4)  # match a non-digit
ma2 = re.match(r'\d', string2)  # match a digit; None here, since string2 starts with 'I'
print(ma1.group())    # [
# print(ma2.group())  # raises AttributeError because ma2 is None

1.6 Match whitespace and non-whitespace characters

\s and \S match whitespace and non-whitespace characters, respectively, and are used in the same way as the classes above.
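As a rough illustration, the sketch below applies both classes to a string that contains a space and a tab. The string is an assumption made for the example.

```python
import re

line = 'I love\tpython'  # contains one space and one tab

first_gap = re.search(r'\s', line)    # first whitespace character
first_word = re.match(r'\S+', line)   # run of non-whitespace at the start
pieces = re.findall(r'\S+', line)     # every run of non-whitespace

print(first_gap.span())    # (1, 2)
print(first_word.group())  # I
print(pieces)              # ['I', 'love', 'python']
```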

 

1.7 Match zero or more times: * (asterisk)

ma = re.match(r'[a-z][a-z]*', string1)  # None here: string1 starts with the uppercase 'I'
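The line above can be explored with a few more inputs. This is only a sketch with made-up strings; it shows that * accepts zero repetitions, so a pattern made of a starred item alone can succeed with an empty match.

```python
import re

star = r'[a-z][a-z]*'  # one lowercase letter, then zero or more further ones

whole_word = re.match(star, 'python 3')
single = re.match(star, 'p')
empty = re.match(r'[a-z]*', '123')          # * alone may match nothing at all
no_match = re.match(star, 'I love python')  # uppercase 'I' fails the first class

print(whole_word.group())   # python
print(single.group())       # p
print(repr(empty.group()))  # ''
print(no_match)             # None
```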

1.8 Match one or more times: + (plus sign)
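Unlike the asterisk, the plus sign requires at least one occurrence of the preceding item. A minimal sketch, using sample strings chosen for illustration:

```python
import re

letters = re.match(r'[a-z]+', 'python 3')
digits = re.match(r'\d+', '123 45')
nothing = re.match(r'[a-z]+', '123')  # + needs at least one letter at the start

print(letters.group())  # python
print(digits.group())   # 123
print(nothing)          # None
```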

 

1.9 Match m to n times: {m,n}

ma = re.match(r'[\w]{1,4}', string1)  # a letter, digit, or underscore repeated 1 to 4 times
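The bounds can be seen more clearly with a few extra inputs. The strings below are assumptions made for the example.

```python
import re

capped = re.match(r'\w{1,4}', 'python')   # stops after four characters
short = re.match(r'\w{1,4}', 'I love')    # the space ends the run after one
too_few = re.match(r'\w{2,4}', 'I love')  # fewer than two word characters at the start

print(capped.group())  # pyth
print(short.group())   # I
print(too_few)         # None
```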

2. search

Searches anywhere in the string, not only at the beginning. Returns a Match object on success and None on failure; only the first match is returned.

The pattern syntax is the same as before.

For more pattern syntax, refer to the following blog post:

https://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

e.g.

import re
string5 = 'alibetjgi676$gjgk@126.com'
ma6 = re.search(r'[\d]+', string5)  # match a run of digits
print(ma6)
print(ma6.group())


output:
<re.Match object; span=(9, 12), match='676'>
676

import re

string1='I love python but hate pig'
string2='I love python'
string3='python'
string4='123'
result = re.search(r'[\w]+', string3)
print(result.group())  # python

result2 = re.search(r'[\w]+', string2)
print(result2.group())  # I

Notice that for string2 the result of the search is 'I'. The match stops there because the space between 'I' and 'love' is not a word character, so \w cannot match it.

Put plainly, a match ends at the first character the pattern cannot consume, and search returns only that first match.

string4='123 45'
result = re.search(r'[\d]+', string4)
print(result.group())  # 123
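The Match object also records where the match was found, which makes it easy to see what search left untouched. A small sketch on the same string:

```python
import re

string4 = '123 45'
result = re.search(r'\d+', string4)

matched = result.group()            # text of the first match
start, end = result.span()          # positions of that match in the string
remainder = string4[result.end():]  # everything search did not report

print(matched)          # 123
print(start, end)       # 0 3
print(repr(remainder))  # ' 45'
```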

Of course, you can write the pattern so that it spans the separator and matches the whole string:

str2 = 'char|johljh'
ma6 = re.search(r'char[\W][\w]+',str2)
print(ma6.group())

output:
char|johljh

str1 = 'oajfs|char|dhddfgdfg'
str2 = 'char|johljh|jjgkhk'
str3 = 'dlkngldnfk|flmgkdm|char'

ma6 = re.search(r'char[\W][\w]+', str1)  # this pattern matches in both str1 and str2
print(ma6.group())  # char|dhddfgdfg

ma6 = re.search(r'[\W]char', str3)  # this pattern matches in str3
print(ma6.group())  # |char

3. findall

Finds every non-overlapping match in the string and returns them as a list. Without parentheses in the pattern, each item is the full matched string; with one group, each item is that group's text; with several groups, each item is a tuple of the groups.
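The effect of parentheses on the return value is easiest to see by running the same text through three patterns. This is a sketch only; the addresses are made up for illustration.

```python
import re

text = 'alice@126.com, bob@example.org'  # made-up addresses for illustration

no_group = re.findall(r'\w+@\w+', text)        # whole matches
one_group = re.findall(r'(\w+)@\w+', text)     # only the captured group
two_groups = re.findall(r'(\w+)@(\w+)', text)  # one tuple per match

print(no_group)    # ['alice@126', 'bob@example']
print(one_group)   # ['alice', 'bob']
print(two_groups)  # [('alice', '126'), ('bob', 'example')]
```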

import re

string1='I love python but hate pig'
string2='I love python'
string3='python'
string4='123 45'

result0 = re.findall(r'[p]+', string1)
result1 = re.findall(r'[p][a-z]+', string1)
result2 = re.findall(r'[\w]+', string2)
result4 = re.findall(r'[\d]+', string4)

print(result0)
print(result1)
print(result2)
print(result4)

output:

['p', 'p']
['python', 'pig']
['I', 'love', 'python']
['123', '45']

The second pattern, r'[p][a-z]+', is the one to prefer here: it returns each whole word that starts with p rather than the bare letter.

string2 = '1,2,3,4'
ma = re.findall(r'\d+', string2)
print(ma)

# ['1', '2', '3', '4']

import re

p = re.compile(r'\d+')
print(p.findall('one1two2three3four456'))
 
### output ###
# ['1', '2', '3', '456']

 

 

References:

https://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

https://blog.csdn.net/ali197294332/article/details/50894419

https://blog.csdn.net/tp7309/article/details/72823258

https://blog.csdn.net/djskl/article/details/44357389


Keywords: Python Asterisk

Added by cx323 on Sun, 12 Jan 2020 05:18:01 +0200