Python regular expression

Introduction to Python regular expressions

Python has included the re module since version 1.5. It provides Perl-style regular expression patterns.
The re module gives the Python language full regular expression functionality.
Before using regular expressions in Python, you need to import the re module:

import re

The module mainly provides the following functions:

Purpose          Functions
Match string     match(), search(), findall()
Replace string   sub()
Split string     split()

Match string

re.match()

Matches the pattern only at the start of the string.

re.match(pattern, string[, flags=0])

Parameter description:

Parameter   Description
pattern     The regular expression to match
string      The string to match against
flags       Optional flags that control how matching works, such as case-insensitive or multi-line matching; see the flags section below

If the match succeeds, a Match object is returned; otherwise None is returned.

import re

s = 'green grassland under a blue sky'
pattern = r'green.*land'
# match() only succeeds when the pattern matches at the start of the string
match = re.match(pattern, s)

print('Match object:', match)
# Output: Match object: <re.Match object; span=(0, 15), match='green grassland'>

Methods of the Match object

group() returns the matched string
span() returns a tuple containing the (start, end) positions of the match
start() returns the position where the match starts
end() returns the position where the match ends
Note: these methods exist only on a Match object. If the match failed and the result is None, calling them raises an AttributeError.

import re

s = 'green grassland under a blue sky'
pattern = r'green.*land'
# The pattern matches at the start of s, so a Match object is returned
match = re.match(pattern, s)

print('Matching data:', match.group())
print('Match location:', match.span())
print('Start position:', match.start())
print('End position:', match.end())

# The output is as follows:
# Matching data: green grassland
# Match location: (0, 15)
# Start position: 0
# End position: 15
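Since re.match() returns None when the pattern does not match, it is safer to check the result before calling group() or span(). A minimal sketch of that check follows; the helper name describe_match and the sample strings are made up for illustration and are not part of the re module.

```python
import re

def describe_match(pattern, text):
    """Return (matched text, span), or None if the pattern does not match at the start."""
    match = re.match(pattern, text)
    if match is None:
        # Calling match.group() here would raise AttributeError.
        return None
    return match.group(), match.span()

found = describe_match(r'green', 'green grass')
missing = describe_match(r'green', 'the green grass')

print(found)    # ('green', (0, 5))
print(missing)  # None
```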

re.search()

Scans the whole string and returns the first match.

re.search(pattern, string[, flags=0])

If the match succeeds, a Match object is returned; otherwise None is returned.
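The difference from re.match() is easiest to see on a string where the pattern occurs somewhere other than the start. The following is a small sketch with an illustrative sample sentence:

```python
import re

s = 'The children are playing on the green grass.'

# match() only tries the start of the string, so it finds nothing here.
at_start = re.match(r'green', s)
# search() scans forward and stops at the first occurrence.
anywhere = re.search(r'green', s)

print(at_start)          # None
print(anywhere.group())  # green
print(anywhere.span())   # (32, 37)
```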

re.findall()

Finds every match in the string and returns them as a list.

re.findall(pattern, string[, flags=0])

Finds all substrings matched by the regular expression in the string and returns them as a list. If no match is found, an empty list is returned.
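As a short illustration (the sample text is made up), findall() returns the matched strings themselves; when the pattern contains one capturing group, it returns the group contents instead of the whole match:

```python
import re

s = 'Order 12 shipped on day 7, order 305 on day 19.'

# Every run of digits in the string.
numbers = re.findall(r'\d+', s)
# With a single capturing group, only the group contents are returned.
days = re.findall(r'day (\d+)', s)
# No match gives an empty list rather than None.
nothing = re.findall(r'xyz', s)

print(numbers)  # ['12', '7', '305', '19']
print(days)     # ['7', '19']
print(nothing)  # []
```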

Extension: re.finditer()

Similar to findall(): it finds all substrings matched by the regular expression in the string, but returns an iterator of Match objects instead of a list of strings.

re.finditer(pattern, string[, flags=0])
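Because finditer() yields Match objects, the position of each match is available as well as its text. A minimal sketch with an illustrative sample string:

```python
import re

s = 'green grass, green leaves'

positions = []
for m in re.finditer(r'green', s):
    # Each m is a Match object, so start() and end() are available.
    positions.append((m.group(), m.start(), m.end()))

print(positions)  # [('green', 0, 5), ('green', 13, 18)]
```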

Replace string

re.sub()

Replaces matches in a string.

re.sub(pattern, repl, string[, count=0][, flags=0])

Parameter   Description
pattern     The regular expression pattern
repl        The replacement string; it can also be a function
string      The original string to search and replace in
count       The maximum number of matches to replace. The default value of 0 means all matches are replaced
flags       Matching flags, given as an integer; see the flags section below
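The effect of count and of a backreference in repl can be sketched as follows. The sample string is illustrative; count is passed by keyword here to keep it clearly separate from flags.

```python
import re

s = 'a1b22c333'

# Replace every run of digits.
all_replaced = re.sub(r'\d+', '-', s)
# Replace only the first match.
first_only = re.sub(r'\d+', '-', s, count=1)
# \1 in the replacement refers to the text captured by group 1.
wrapped = re.sub(r'(\d+)', r'<\1>', s)

print(all_replaced)  # a-b-c-
print(first_only)    # a-b22c333
print(wrapped)       # a<1>b<22>c<333>
```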

Using a function as the repl parameter

The repl parameter can be a function. It is called with the Match object for each match and must return the replacement string, for example:

import re

def double(matched):
    # Look up the named group 'string' and repeat its text twice
    value = matched.group('string')
    return value * 2

s = 'A snowflake fell from the sky'
print(re.sub(r'(?P<string>snow)', double, s))
# Result: A snowsnowflake fell from the sky

Split string

re.split()

Splits the string at every substring matched by the pattern and returns the pieces as a list.

re.split(pattern, string[, maxsplit=0, flags=0])

Parameter   Description
pattern     The regular expression to match
string      The string to split
maxsplit    The maximum number of splits; maxsplit=1 splits once. The default value of 0 means no limit
flags       Optional flags that control how matching works, such as case-insensitive or multi-line matching; see the flags section below
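A brief sketch of splitting on several kinds of separator at once, and of how maxsplit limits the result. The sample string is illustrative.

```python
import re

s = 'apple, banana;cherry  date'

# Split on any run of commas, semicolons, or whitespace.
parts = re.split(r'[,;\s]+', s)
# maxsplit=1 stops after the first split; the rest stays in one piece.
limited = re.split(r'[,;\s]+', s, maxsplit=1)

print(parts)    # ['apple', 'banana', 'cherry', 'date']
print(limited)  # ['apple', 'banana;cherry  date']
```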

Regular expression modifier - optional flag

Flags control the matching mode. Multiple flags can be combined with the bitwise OR operator (flag1 | flag2). For example, re.I | re.M sets both the I and M flags:

Flag   Description
re.I   Makes matching case-insensitive
re.L   Performs locale-aware matching
re.M   Multi-line matching; affects ^ and $
re.S   Makes . match any character, including line breaks
re.U   Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B
re.X   Verbose mode: allows a more flexible layout so that regular expressions are easier to read
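The effect of the most commonly used flags can be sketched on a small two-line string. The sample text and variable names are illustrative.

```python
import re

text = 'Green grass\ngreen leaves'

# re.I: 'green' also matches 'Green'.
ignore_case = re.findall(r'green', text, re.I)

# Without re.M, ^ matches only at the very start of the string.
first_line_only = re.findall(r'^green', text, re.I)
# With re.M, ^ also matches at the start of each line.
every_line = re.findall(r'^green', text, re.I | re.M)

# By default . does not match a newline; with re.S it does.
without_s = re.search(r'grass.green', text)
with_s = re.search(r'grass.green', text, re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
two_words = re.compile(r'''
    (\w+)   # first word
    \s+     # separating whitespace
    (\w+)   # second word
''', re.X)
words = two_words.match(text).groups()

print(ignore_case)         # ['Green', 'green']
print(first_line_only)     # ['Green']
print(every_line)          # ['Green', 'green']
print(without_s)           # None
print(with_s is not None)  # True
print(words)               # ('Green', 'grass')
```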

Keywords: Python regex

Added by j4v1 on Sun, 23 Jan 2022 10:00:02 +0200