Regular expressions for headaches


Learning regular expressions has always given me a headache. The usual excuse is that there are too many characters and rules, but for me the real reason is laziness. I can change that by summarizing them carefully and slowly.

A regular expression is a tool for string pattern matching, used to implement search and replace. As the name suggests, it is an expression that describes a rule. The underlying principle is also simple: pattern matching is done with the idea of a state machine. regexper.com is a good tool for visualizing the regular expressions you write.
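
To make the state machine idea a little more concrete, here is a minimal hand-written sketch of a machine that accepts the same strings as the pattern ^ab*$. The function name matchesAB and the state names are made up for illustration, and this is only a toy: it is not how JavaScript engines actually implement RegExp.

```javascript
// Hypothetical hand-rolled state machine for the pattern ^ab*$.
// States: 'start' (nothing read yet) and 'afterA' (read "a", then any number of "b").
function matchesAB(input) {
  let state = 'start';
  for (const ch of input) {
    if (state === 'start' && ch === 'a') {
      state = 'afterA';
    } else if (state === 'afterA' && ch === 'b') {
      state = 'afterA'; // stay in the same state for every extra "b"
    } else {
      return false; // no valid transition for this character
    }
  }
  return state === 'afterA'; // accept only if we ended in the accepting state
}

console.log(matchesAB('abbb'), /^ab*$/.test('abbb')); // true true
console.log(matchesAB('ba'), /^ab*$/.test('ba'));     // false false
```

Each character either moves the machine to a new state or causes a rejection, which is roughly the picture regexper.com draws for a pattern.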

Starting from characters

Let's first understand some common characters




After understanding the specific rules, let's sum them up with some examples:

  1. Match a string beginning with abc (\babc matches abc at any word boundary, while ^abc matches only at the start of the string):
\babc or ^abc
  2. Match an 8-digit QQ number:
^\d\d\d\d\d\d\d\d$
^\d{8}$
  3. Match an 11-digit mobile phone number starting with 1:
^1\d\d\d\d\d\d\d\d\d\d$
^1\d{10}$
  4. Match a bank card number of 14 to 18 digits:
^\d{14,18}$
  5. Match a string that consists of an a followed by zero or more b's:
^ab*$
  6. Match a China Unicom mobile number, whose prefixes are 130 / 131 / 132 / 155 / 156 / 185 / 186 / 145 / 176:
^(130|131|132|155|156|185|186|145|176)\d{8}$
^((13[0-2])|(15[56])|(18[56])|145|176)\d{8}$
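
The patterns in the list above can be tried out quickly in JavaScript. The following is only a sketch: the sample strings are invented for illustration and are not real account or phone numbers.

```javascript
// Each entry: [description, pattern, sample that should match, sample that should not].
const cases = [
  ['starts with abc', /^abc/, 'abcdef', 'xabc'],
  ['8-digit QQ number', /^\d{8}$/, '12345678', '1234567'],
  ['11-digit mobile number starting with 1', /^1\d{10}$/, '13812345678', '23812345678'],
  ['14 to 18 digit card number', /^\d{14,18}$/, '62220212345678', '6222021234567'],
  ['a followed by zero or more b', /^ab*$/, 'abbb', 'abc'],
  ['China Unicom prefix', /^(13[0-2]|15[56]|18[56]|145|176)\d{8}$/, '18612345678', '13312345678'],
];

for (const [name, pattern, good, bad] of cases) {
  // Every line should print "true false".
  console.log(name, pattern.test(good), pattern.test(bad));
}
```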

Regex methods

.test() is a method of regular expressions. It returns a Boolean indicating whether the pattern is found in the string:

const dog = /dog/;
dog.test('I saw a dog!');  // returns true
/rat/.test('I saw a dog!');  // returns false

Special characters:
Special characters modify or specify combinations of characters in a regular expression. Among the most useful are square brackets, which let you say that a character in the target string can be any one of several characters. Let's see what they do:

const bt = /b[aeiou]t/;
bt.test('bat');  // returns true
bt.test('bet');  // returns true
bt.test('bit');  // returns true
bt.test('bot');  // returns true
bt.test('but');  // returns true
bt.test('bpt');  // returns false

Think about it: everything inside the square brackets stands for a single character in the string you are searching. On top of this, we can use the "-" character to specify a range of characters!

const nums = /[0-5]/;
nums.test('0');  //  returns true
nums.test('3');  //  returns true
nums.test('7');  //  returns false

Furthermore, for example, to specify all the letters, you would do something like this:

const letters = /[A-Za-z]/;
letters.test('M');  // returns true
letters.test('y');  // returns true
letters.test('5');  // returns false

Another special character to remember is "+". It indicates that the preceding element can be repeated one or more times (but not zero). Let's see what it does.

const bomb = /boo+m/;  
bomb.test('boom!');  // returns true
bomb.test('Boom!');  // returns false
bomb.test('boooooooooooom!');  // returns true

If you want to ignore case, add the i flag after the closing slash.

const bomb = /boo+m/i;  
bomb.test('boom!');  // returns true
bomb.test('Boom!');  // returns true
bomb.test('boooooooooooom!');  // returns true
bomb.test('BOOOOOOOOOOOOM!');  // returns true

The familiar "?" character is another useful special character. It indicates that the preceding character is optional: it may appear zero times or once.

const color = /colou?r/; 
color.test('color');  // returns true
color.test('colour');  // returns true
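
To check that "?" really means zero or one and nothing more, here is a small sketch. The anchors ^ and $ are added (they are not in the snippet above) so that the whole string has to match, and the name strictColor is made up for this example.

```javascript
// Anchored version of the pattern so the whole string must match.
const strictColor = /^colou?r$/;

console.log(strictColor.test('color'));   // true  (zero "u")
console.log(strictColor.test('colour'));  // true  (one "u")
console.log(strictColor.test('colouur')); // false (two "u" is one too many)
```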

Another special character to pay attention to is ".", the wildcard. A "." matches any single character except a line break. To match several characters, repeat it or combine it with a quantifier.

const anything = /./; 
anything.test('a');  // returns true
anything.test('1');  // returns true
anything.test('[');  // returns true
/._./.test('qqwwa_aassdddss')   // returns true
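
A few more checks, as a quick sketch, show where the wildcard stops matching. The sample strings are arbitrary.

```javascript
console.log(/./.test('\n'));      // false: "." does not match a line break
console.log(/a.c/.test('abc'));   // true: exactly one character between a and c
console.log(/a.c/.test('ac'));    // false: "." needs a character to consume
console.log(/a.c/.test('abbc'));  // false: one "." cannot cover two characters
console.log(/a.+c/.test('abbc')); // true: ".+" repeats the wildcard
```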

The "\w" character class matches any alphanumeric character (plus the underscore). Its opposite, "\W", matches any character that is not alphanumeric or an underscore.

const alphaNumber = /\w/;  
alphaNumber.test('a');  // returns true
alphaNumber.test('1');  // returns true
alphaNumber.test('&');  // returns false
/\w/.test('1222sssaa')   // returns true

const notAlphaNumber = /\W/; 
notAlphaNumber.test('a');  // returns false
notAlphaNumber.test('1');  // returns false
notAlphaNumber.test('&');  // returns true
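
One detail that is easy to forget is that "\w" also accepts the underscore, which makes it behave like [A-Za-z0-9_]. A short sketch, with made-up sample strings:

```javascript
console.log(/\w/.test('_'));           // true: underscore counts as a word character
console.log(/\W/.test('_'));           // false
console.log(/^\w+$/.test('user_01'));  // true: letters, digits and underscore only
console.log(/^\w+$/.test('user-01'));  // false: "-" is not a word character
```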

Similarly, "\s" matches any whitespace character, while "\S" matches any non-whitespace character.

const whitespace = /\s/;  
whitespace.test('a');  // returns false
whitespace.test('1');  // returns false
whitespace.test('&');  // returns false
whitespace.test(' ');  // returns true
whitespace.test('\n');  // returns true
/\s/.test('     ssss')   // returns true

const notWhitespace = /\S/; 
notWhitespace.test('a');  // returns true
notWhitespace.test('1');  // returns true
notWhitespace.test('&');  // returns true
notWhitespace.test(' ');  // returns false
notWhitespace.test('\n');  // returns false

Well, I can't cover everything here, so you will have to look up the rest yourself. Is there a summary chart? Of course there is.

This is a third-party website. I recommend using it together with the syntax reference on the right-hand side of the page:
https://c.runoob.com/front-end/854
The most important categories are the following. In the end, regex is mostly a matter of memorization!

  • Expressions for validating numbers
  • Expressions for validating characters
  • Expressions for special requirements

These can be found through the above website when necessary.

RegExp: four common regex methods

match() method
The match() method retrieves a specified value from a string. It is somewhat similar to the indexOf() and lastIndexOf() methods of arrays and strings, except that those two return the index of the first or last occurrence of the value, while match() returns one of two kinds of result:

1. When searching for a plain string, the result is an array that also carries extra properties, which makes it look like a mix of an array and an object. The matched text is available at index 0, while the index and input properties give the position of the first match and the original string that was searched. If nothing matches, the result is null.

'adddddd'.match('ad')
["ad", index: 0, input: "adddddd", groups: undefined]

'adddddd'.match('d')
["d", index: 1, input: "adddddd", groups: undefined]

'adddddd'.match('12')
null

'123456'.match('5')
["5", index: 4, input: "123456", groups: undefined]

let ss = '123456'.match('4')

ss[0]
"4"
ss[1]
undefined
'1212121212'.match('12')
["12", index: 0, input: "1212121212", groups: undefined]

Searching with a string only returns the first match.

2. When searching with a regular expression without the g flag, the result is much the same as searching with a string. With the g flag, match() returns an array of every match in the string. That array has no index, input, or other extra properties; it is purely a list of matched values.

'12345623123312'.match(/12/g)
(3) ["12", "12", "12"]
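
To see the two behaviours side by side, here is a minimal sketch that runs the same pattern with and without the g flag. The variable names are made up for this example.

```javascript
const text = '12345623123312';

const first = text.match(/12/);      // no g flag: first match plus details
console.log(first[0], first.index);  // 12 0

const all = text.match(/12/g);       // g flag: plain array of every match
console.log(all);                    // [ '12', '12', '12' ]
console.log(all.index);              // undefined

console.log(text.match(/99/g));      // null when nothing matches
```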

replace() method
This method finds a specified substring (or pattern) in a string and replaces it with another string. With a string as the first argument, replace() only replaces the first occurrence, so I also like to use replaceAll().

'122222222'.replace('12', 'ouyang')
"ouyang2222222"

'12345661222333'.replace(/12/g, 'ouyang')
"ouyang34566ouyang22333"

'12345678909122'.replaceAll('12','ouyang')
"ouyang345678909ouyang2"

This method takes two parameters. The first can be a regular expression or a string. The second is either the replacement string or a function that generates the replacement text.
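
The function form of the second parameter is not shown in the examples above, so here is a small sketch of it. The input string is invented for illustration; the callback receives the matched text and returns whatever should replace it.

```javascript
// Double every run of digits in the string.
const doubled = 'a1b22c333'.replace(/\d+/g, (match) => String(Number(match) * 2));

console.log(doubled); // a2b44c666
```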

exec() method
exec() searches the string passed to it for a match and returns an array holding the result. If there is no match, it returns null.

Element 0 of the returned array is the text matched by the whole regular expression. Element 1 is the text matched by the first subexpression (capture group), element 2 is the text matched by the second subexpression, and so on. The array also carries the index and input properties.

let str = 'abcdefgabcdefg'
let reg = /b/
let res = reg.exec(str)
console.log(res)

// Print results
['b', index: 1, input: "abcdefgabcdefg"]
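
The example above has no subexpressions, so elements 1 and 2 never appear. As a sketch, here is a pattern with two capture groups; the pattern and the sample string are made up for illustration.

```javascript
const dateReg = /(\d{4})-(\d{2})/; // two subexpressions (capture groups)
const result = dateReg.exec('date: 2022-02-18');

console.log(result[0]);    // 2022-02  (whole match)
console.log(result[1]);    // 2022     (first group)
console.log(result[2]);    // 02       (second group)
console.log(result.index); // 6        (where the match starts)
console.log(result.input); // date: 2022-02-18
```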

For a global match, you can call exec() repeatedly in a loop:

let str = 'abcdefgabcdefg'
let reg = /b/g
let newRes
while ((newRes = reg.exec(str)) !== null) {
  console.log(newRes)
}

// Print results
["b", index: 1, input: "abcdefgabcdefg", groups: undefined]
["b", index: 8, input: "abcdefgabcdefg", groups: undefined]
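
The loop works because a regular expression with the g flag remembers where the last match ended in its lastIndex property, and the next exec() call starts from there. A minimal sketch that prints lastIndex between calls (the names globalB and sample are made up for this example):

```javascript
const globalB = /b/g;
const sample = 'abcdefgabcdefg';

console.log(globalB.lastIndex);    // 0
globalB.exec(sample);
console.log(globalB.lastIndex);    // 2  (just past the first "b" at index 1)
globalB.exec(sample);
console.log(globalB.lastIndex);    // 9  (just past the second "b" at index 8)
console.log(globalB.exec(sample)); // null: no more matches
console.log(globalB.lastIndex);    // 0  (reset after a failed match)
```

This is also why reusing the same global regex on a different string without resetting lastIndex can give surprising results.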

Why does this feel like match()? You're right! They are just called differently: match() is a method of strings, while exec() is a method of regular expressions.

'122233311222'.match('1')
["1", index: 0, input: "122233311222", groups: undefined]

/1/.exec('123331211111')
["1", index: 0, input: "123331211111", groups: undefined]

test() method
test() was already covered above. So, when slacking off at work every day, just read more, write more, and memorize more. Regex won't scare me any more...

Keywords: Javascript regex

Added by orlandinho on Fri, 18 Feb 2022 18:02:00 +0200