Python regular expression

Introduction to Python regular expressions

Python has included the re module since version 1.5. It provides Perl-style regular expression patterns.
The re module gives Python full regular expression support.
Before using regular expressions in Python, you need to import the re module:

import re

It mainly provides the following functions:

Purpose	Functions
Match string	match(); search(); findall(); finditer()
Replace string	sub()
Split string	split()

Match string

re.match()

Matches only at the start of the string

re.match(pattern, string[, flags=0])

Parameter description:

Parameter	Description
pattern	The regular expression to match
string	The string to search
flags	Flags that control how the regular expression is matched, such as case sensitivity and multi-line matching. See below for details.

If the match succeeds, a Match object is returned; otherwise, None is returned.

import re

# The match must begin at position 0, so the sample string starts with the text we look for
s = 'Green grassland is where the children play.'
pattern = r'Green.*land'
match = re.match(pattern, s, 0)

print('output match Object:', match)
# Output: output match Object: <re.Match object; span=(0, 15), match='Green grassland'>

Methods of the Match object

group() returns the matched string
span() returns a tuple containing the (start, end) positions of the match
start() returns the position where the match starts
end() returns the position where the match ends
Note: these methods exist only on a Match object. If the match failed, the result is None, and calling any of them raises an AttributeError.

import re

# The match must begin at position 0, so the sample string starts with the text we look for
s = 'Green grassland is where the children play.'
pattern = r'Green.*land'
match = re.match(pattern, s, 0)

print('Matching data:', match.group())
print('Match location:', match.span())
print('Start position:', match.start())
print('End position:', match.end())

# The output is as follows:
# Matching data: Green grassland
# Match location: (0, 15)
# Start position: 0
# End position: 15
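
The note above about a failed match can be handled with an explicit None check before calling any Match method. The following is a minimal sketch; the helper name first_word is hypothetical and exists only for illustration.

```python
import re

def first_word(text):
    """Hypothetical helper: return the leading word, or None if the text does not start with one."""
    m = re.match(r'\w+', text)
    if m is None:
        # No match at position 0, so there is no Match object to query
        return None
    return m.group()

print(first_word('green grass'))       # green
print(first_word('  leading spaces'))  # None
```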

re.search()

Scans the whole string and returns the first match

re.search(pattern, string[, flags=0])

If the match succeeds, a Match object is returned; otherwise, None is returned.
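
To make the difference from re.match() concrete, here is a small sketch. The sample sentence and pattern are chosen only for illustration.

```python
import re

s = 'The children are playing on the green grass.'
pattern = r'green\s\w+'

# re.match() only tries position 0, so it fails on this string
at_start = re.match(pattern, s)
# re.search() scans forward and returns the first occurrence
anywhere = re.search(pattern, s)

print(at_start)                           # None
print(anywhere.group(), anywhere.span())  # green grass (32, 43)
```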

re.findall()

Finds all matches and returns them as a list

re.findall(pattern, string[, flags=0])

Find all substrings matched by the regular expression in the string and return a list. If no matching is found, return an empty list.
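
A short sketch of how this behaves, using made-up sample strings. Note that when the pattern contains capturing groups, findall returns the group contents rather than the whole match.

```python
import re

s = 'Order 66 shipped 3 boxes on day 12'

# No groups in the pattern: a list of the matched substrings
numbers = re.findall(r'\d+', s)
print(numbers)  # ['66', '3', '12']

# Several capturing groups: a list of tuples, one tuple per match
pairs = re.findall(r'(\w+)=(\d+)', 'a=1, b=22')
print(pairs)  # [('a', '1'), ('b', '22')]

# Nothing matches: an empty list
nothing = re.findall(r'\d+', 'no digits here')
print(nothing)  # []
```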

Extension: re.finditer()

Similar to findall, it finds all substrings matched by the regular expression in the string, but returns them as an iterator of Match objects.

re.finditer(pattern, string[, flags=0])
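
Because each item is a Match object, positions are available as well as the matched text. A minimal sketch with an illustrative sample string:

```python
import re

s = 'cat bat rat'

# Each item yielded by finditer is a Match object
found = [(m.group(), m.span()) for m in re.finditer(r'[a-z]at', s)]
print(found)  # [('cat', (0, 3)), ('bat', (4, 7)), ('rat', (8, 11))]
```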

Replace string

re.sub()

Replaces matches in a string

re.sub(pattern, repl, string[, count=0][, flags=0])

Parameter	Description
pattern	The regular expression pattern.
repl	The replacement string; it can also be a function.
string	The original string to search and replace in.
count	The maximum number of replacements. The default value 0 replaces all matches.
flags	Matching flags, given as integer constants such as re.I.
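
The effect of count can be seen in a small sketch; the date string is just sample data.

```python
import re

s = '2022-01-23'

# Default count=0 replaces every match
all_replaced = re.sub(r'-', '/', s)
print(all_replaced)  # 2022/01/23

# count=1 replaces only the first match
first_only = re.sub(r'-', '/', s, count=1)
print(first_only)  # 2022/01-23
```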

Using a function as the repl parameter

The repl parameter can be a function. It is called with each Match object and must return the replacement string, for example:

import re

def double(matched):
    # Repeat the text captured by the named group 'string'
    value = matched.group('string')
    return value * 2

s = 'The snow fell for a long time'
print(re.sub('(?P<string>long )', double, s))
# Output: The snow fell for a long long time

Split string

re.split()

Splits the string at every substring that matches the pattern and returns the pieces as a list

re.split(pattern, string[, maxsplit=0, flags=0])

Parameter	Description
pattern	The regular expression to match
string	The string to split
maxsplit	The maximum number of splits; maxsplit=1 splits once. The default value 0 means no limit.
flags	Flags that control how the regular expression is matched, such as case sensitivity and multi-line matching. See below for details.
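
A brief sketch of splitting on several separators at once, with and without maxsplit. The sample string is made up for illustration.

```python
import re

s = 'apple, banana;cherry  date'

# Split on any run of commas, semicolons, or whitespace
parts = re.split(r'[,;\s]+', s)
print(parts)  # ['apple', 'banana', 'cherry', 'date']

# maxsplit=1 splits only once; the rest of the string stays intact
limited = re.split(r'[,;\s]+', s, maxsplit=1)
print(limited)  # ['apple', 'banana;cherry  date']
```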

Regular expression modifiers - optional flags

Flags control the matching mode. Multiple flags can be combined with the bitwise OR operator (flag1 | flag2). For example, re.I | re.M sets both the I and M flags:

Modifier	Description
re.I	Makes matching case insensitive
re.L	Does locale-aware matching
re.M	Multi-line matching, affecting ^ and $
re.S	Makes . match all characters, including line breaks
re.U	Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B. In Python 3 it is already the default for str patterns.
re.X	Allows a more flexible format (whitespace and comments inside the pattern) so that regular expressions are easier to read.
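
The flags in the table can be tried out with a short sketch. The two-line sample text is an assumption made for illustration.

```python
import re

text = 'first line\nSecond LINE'

# re.I: ignore case
ignore_case = re.findall(r'line', text, re.I)
print(ignore_case)  # ['line', 'LINE']

# re.M: ^ matches at the start of every line, not only the start of the string
line_starts = re.findall(r'^\w+', text, re.M)
print(line_starts)  # ['first', 'Second']

# Flags combined with |
combined = re.findall(r'^second', text, re.I | re.M)
print(combined)  # ['Second']

# re.S: . also matches the newline, so the match can span both lines
with_dotall = re.search(r'first.*LINE', text, re.S)
without_dotall = re.search(r'first.*LINE', text)
print(with_dotall is not None, without_dotall)  # True None
```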

Keywords: Python regex

Added by j4v1 on Sun, 23 Jan 2022 10:00:02 +0200
