regular expression
1. Learning objectives
- Understand what regular expressions are used for
- Master the syntax of regular expressions
- Become familiar with common regular expressions
2. Content explanation
2.1 concept of regular expression
A regular expression is a logical formula for operating on strings. It combines ordinary characters with predefined special characters into a pattern string, and that pattern expresses a filtering rule for strings. In plain terms: a regular expression is a formula used to check whether a string conforms to a given rule.
2.2 use of regular expressions
Regular expressions have three main uses:
- Pattern validation: check whether a string complies with a rule, for example whether a mobile phone number or ID number conforms to the required format.
- Match and read: read the parts of the target string that satisfy the rule, such as the email addresses in a body of text.
- Match and replace: replace the parts of the target string that satisfy the rule with another string, such as replacing "hello" with "haha" throughout a text.
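A minimal sketch of the three uses listed above. The sample text, the phone rule (11 digits starting with 1), and the simplified email pattern are assumptions made for illustration, not rules taken from this tutorial:

```javascript
// Pattern validation: assumed rule, 11 digits starting with 1.
var phonePattern = /^1\d{10}$/;
var isValidPhone = phonePattern.test('13812345678');

// Match and read: pull the first email-like token out of a longer text.
// The email pattern is deliberately simplified.
var text = 'Contact tom@example.com and say hello, hello!';
var emailPattern = /[\w.-]+@[\w-]+(\.[\w-]+)+/;
var firstEmail = text.match(emailPattern)[0];

// Match and replace: swap every "hello" for "haha".
var replaced = text.replace(/hello/g, 'haha');

console.log(isValidPhone); // true
console.log(firstEmail);   // tom@example.com
console.log(replaced);     // Contact tom@example.com and say haha, haha!
```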
2.3 regular expression syntax
2.3.1 creating regular expression objects
- Object form: var reg = new RegExp("regular expression"); this form is convenient when the pattern contains "/" (no escaping is needed) or when the pattern is built from a string at run time.
- Literal form: var reg = /regular expression/; this is the form generally used.
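The difference between the two forms can be sketched as follows. The variable names and sample strings are hypothetical; note that backslashes have to be doubled when the pattern is written as a string:

```javascript
// Literal form: a "/" inside the pattern has to be escaped as "\/".
var literalReg = /a\/b/;

// Object form: the pattern is a string, so "/" needs no escaping.
var objectReg = new RegExp('a/b');

// Inside a string, '\\d' produces the pattern \d; flags go in the second argument.
var digitReg = new RegExp('\\d+', 'g');

// The object form also accepts a pattern assembled at run time.
var keyword = 'World'; // hypothetical user-supplied value
var dynamicReg = new RegExp(keyword, 'i');

var literalResult = literalReg.test('a/b');
var objectResult = objectReg.test('a/b');
var digitResult = 'a1b22'.match(digitReg);
var dynamicResult = dynamicReg.test('hello world');

console.log(literalResult); // true
console.log(objectResult);  // true
console.log(digitResult);   // [ '1', '22' ]
console.log(dynamicResult); // true
```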
2.3.2 regular expression introduction case
2.3.2.1 pattern validation: check whether the string contains the letter 'o'
Note: here the method is called on the regular expression object.

    // Create the simplest regular expression object
    var reg = /o/;
    // Create a string object as the target string
    var str = 'Hello World!';
    // Call the test() method of the regular expression object to check whether
    // the target string matches the pattern; the result here is true
    console.log("Whether the string contains 'o'=" + reg.test(str));

2.3.2.2 match and read: read the first 'o' in the string

    // Reuses reg and str from the previous example.
    // Find the matching characters in the target string and return an array of results
    var resultArr = str.match(reg);
    // Array length is 1, because reg has no g flag and only the first match is returned
    console.log("resultArr.length=" + resultArr.length);
    // The content of the array is o
    console.log("resultArr[0]=" + resultArr[0]);

2.3.2.3 match and replace: replace the first 'o' in the string with '@'

    var newStr = str.replace(reg, '@');
    // Only the first o is replaced, because without the g flag
    // the regular expression matches only the first occurrence
    console.log("str.replace(reg)=" + newStr); // str.replace(reg)=Hell@ World!
    // The original string does not change; a new string is returned
    console.log("str=" + str); // str=Hello World!

2.3.3 matching modes of regular expressions
2.3.3.1 global search
If the regular expression object does not carry the g flag, a search with it returns only the first match; with g, all matches are returned.

    // Target string
    var targetStr = 'Hello World!';
    // Regular expression without global matching
    var reg = /[A-Z]/;
    // Without g, match() returns only the first match
    var resultArr = targetStr.match(reg);
    // Array length is 1
    console.log("resultArr.length=" + resultArr.length);
    // Traverse the array: only 'H' is obtained
    for (var i = 0; i < resultArr.length; i++) {
        console.log("resultArr[" + i + "]=" + resultArr[i]);
    }

Comparison code:

    // Target string
    var targetStr = 'Hello World!';
    // Regular expression with global matching
    var reg = /[A-Z]/g;
    // Get all matches
    var resultArr = targetStr.match(reg);
    // Array length is 2
    console.log("resultArr.length=" + resultArr.length);
    // Traverse the array: both 'H' and 'W' are obtained
    for (var i = 0; i < resultArr.length; i++) {
        console.log("resultArr[" + i + "]=" + resultArr[i]);
    }

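The g flag affects replace() in the same way as match(). A short sketch, reusing the tutorial's sample string (the variable names are illustrative):

```javascript
var targetStr = 'Hello World!';

// Without g, replace() changes only the first match.
var firstOnly = targetStr.replace(/o/, '@');

// With g, replace() changes every match.
var allMatches = targetStr.replace(/o/g, '@');

console.log(firstOnly);  // Hell@ World!
console.log(allMatches); // Hell@ W@rld!
```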
2.3.3.2 ignore case

    // Target string
    var targetStr = 'Hello WORLD!';
    // Regular expression that does not ignore case
    var reg = /o/g;
    // Get all matches
    var resultArr = targetStr.match(reg);
    // Array length is 1
    console.log("resultArr.length=" + resultArr.length);
    // Traverse the array: only 'o' is obtained
    for (var i = 0; i < resultArr.length; i++) {
        console.log("resultArr[" + i + "]=" + resultArr[i]);
    }

Comparison code:

    // Target string
    var targetStr = 'Hello WORLD!';
    // Regular expression that ignores case
    var reg = /o/gi;
    // Get all matches
    var resultArr = targetStr.match(reg);
    // Array length is 2
    console.log("resultArr.length=" + resultArr.length);
    // Traverse the array: 'o' and 'O' are obtained
    for (var i = 0; i < resultArr.length; i++) {
        console.log("resultArr[" + i + "]=" + resultArr[i]);
    }

2.3.3.3 multi line search
Without the multiline flag m, the target string is treated as a single line whether or not it contains line breaks, so ^ and $ match only at the start and end of the whole string.

    // Target string 1
    var targetStr01 = 'Hello\nWorld!';
    // Target string 2
    var targetStr02 = 'Hello';
    // Regular expression that matches a string ending in 'Hello', without multiline matching
    var reg = /Hello$/;
    console.log(reg.test(targetStr01)); // false
    console.log(reg.test(targetStr02)); // true

Comparison code:

    // Target string 1
    var targetStr01 = 'Hello\nWorld!';
    // Target string 2
    var targetStr02 = 'Hello';
    // Regular expression that matches a line ending in 'Hello', with multiline matching
    var reg = /Hello$/m;
    console.log(reg.test(targetStr01)); // true, because $ also matches before the line break
    console.log(reg.test(targetStr02)); // true

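The m flag changes ^ in the same way as $. As a small sketch (the three-line sample string is made up for illustration), combining m with g reads the first word of every line:

```javascript
var lines = 'Hello\nWorld\nHello';

// Without m, ^ matches only at the very start of the string.
var withoutM = lines.match(/^\w+/g);

// With m, ^ also matches right after each line break.
var withM = lines.match(/^\w+/gm);

console.log(withoutM); // [ 'Hello' ]
console.log(withM);    // [ 'Hello', 'World', 'Hello' ]
```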
2.3.4 metacharacters
Characters that are given a special meaning in regular expressions are called metacharacters and cannot be used directly as ordinary characters. To match a metacharacter itself, escape it by adding "\" in front of it, for example: \^
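A short sketch of why escaping matters, using "." and "^" as the metacharacters. The sample strings are hypothetical:

```javascript
// Unescaped, "." matches any character, so "3x50" passes as well.
var unescapedDot = /3.50/.test('3x50');

// Escaped, "\." matches only a literal dot.
var escapedDotWrong = /3\.50/.test('3x50');
var escapedDotRight = /3\.50/.test('Total: 3.50');

// "\^" matches a literal caret instead of anchoring to the start of the string.
var escapedCaret = /2\^3/.test('2^3');

console.log(unescapedDot);    // true
console.log(escapedDotWrong); // false
console.log(escapedDotRight); // true
console.log(escapedCaret);    // true
```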
2.3.4.1 common metacharacters
Code | Description |
---|---|
. | Matches any character except line break characters. |
\w | Matches a letter, digit, or underscore. Equivalent to [a-zA-Z0-9_]. |
\W | Matches any non-word character. Equivalent to [^a-zA-Z0-9_]. |
\s | Matches any whitespace character, including spaces, tabs, form feeds, and so on. For ASCII input, equivalent to [ \f\n\r\t\v]. |
\S | Matches any non-whitespace character. For ASCII input, equivalent to [^ \f\n\r\t\v]. |
\d | Matches a digit. Equivalent to [0-9]. |
\D | Matches a non-digit character. Equivalent to [^0-9]. |
\b | Matches a word boundary, that is, the beginning or end of a word. |
^ | Matches the beginning of the string; as the first character inside [] it indicates negation. |
$ | Matches the end of the string. |
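The following sketch exercises three entries of the table that are easy to misread: \b, \D, and \W. All sample strings and variable names are made up for illustration:

```javascript
// \b marks a word boundary, so only the standalone "cat" words match;
// "concat" and "cat_1" are skipped because a word character touches "cat".
var words = 'cat concat cat_1 cat'.match(/\bcat\b/g);

// \D matches any non-digit, so removing those characters leaves only digits.
var digitsOnly = 'Tel: 010-1234'.replace(/\D/g, '');

// \W matches anything that is not a letter, digit, or underscore.
var parts = 'a_b, c-d'.split(/\W+/);

console.log(words.length); // 2
console.log(digitsOnly);   // 0101234
console.log(parts);        // [ 'a_b', 'c', 'd' ]
```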
2.3.4.2 example 1

    var str = 'one two three four';
    // Match all spaces
    var reg = /\s/g;
    // Replace spaces with @
    var newStr = str.replace(reg, '@');
    // newStr=one@two@three@four
    console.log("newStr=" + newStr);

2.3.4.3 example 2

    var str = 'This year is 2014';
    // Match at least one digit
    var reg = /\d+/g;
    str = str.replace(reg, 'abcd');
    console.log('str=' + str); // str=This year is abcd

2.3.4.4 example 3

    var str01 = 'I love Java';
    var str02 = 'Java love me';
    // Match strings that start with Java (no g flag is needed for test())
    var reg = /^Java/;
    console.log('reg.test(str01)=' + reg.test(str01)); // false
    console.log('reg.test(str02)=' + reg.test(str02)); // true

2.3.4.5 example 4

    var str01 = 'I love Java';
    var str02 = 'Java love me';
    // Match strings that end with Java (no g flag is needed for test())
    var reg = /Java$/;
    console.log('reg.test(str01)=' + reg.test(str01)); // true
    console.log('reg.test(str02)=' + reg.test(str02)); // false

2.3.5 character set
Syntax format | Example | Explanation |
---|---|---|
[character list] | Regular expression: [abc]. Meaning: the target string contains any one of the characters a, b, c. Target string: plain. Match: yes. Reason: the "a" in plain is in the list "abc". | Matches if any character in the target string appears in the character list. |
[^character list] | Regular expression: [^abc]. Meaning: the target string contains any character other than a, b, c. Target string: plain. Match: yes. Reason: plain contains "p", "l", "i", "n". | Matches any character not contained in the character list. |
[character range] | Regular expression: [a-z]. Meaning: the character list of all lowercase English letters. Regular expression: [A-Z]. Meaning: the character list of all uppercase English letters. | Matches any character within the specified range. |

    var str01 = 'Hello World';
    var str02 = 'I am Tom';
    // Match any one of a, b, c
    var reg = /[abc]/;
    console.log('reg.test(str01)=' + reg.test(str01)); // false
    console.log('reg.test(str02)=' + reg.test(str02)); // true

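Ranges and negated sets from the table can be sketched the same way. The sample string below is hypothetical:

```javascript
var str = 'Room 42B';

// [0-9] is a range: every digit.
var digits = str.match(/[0-9]/g);

// [^a-z] is a negated set: every character that is not a lowercase letter.
var notLower = str.match(/[^a-z]/g);

// Several ranges can share one set: letters of either case.
var letters = str.match(/[a-zA-Z]/g);

console.log(digits);   // [ '4', '2' ]
console.log(notLower); // [ 'R', ' ', '4', '2', 'B' ]
console.log(letters);  // [ 'R', 'o', 'o', 'm', 'B' ]
```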
2.3.6 occurrence times
Code | Description |
---|---|
* | Zero or more occurrences |
+ | One or more occurrences |
? | Zero or one occurrence |
{n} | Exactly n occurrences |
{n,} | n or more occurrences |
{n,m} | n to m occurrences |

    // The pattern is not anchored, so it matches wherever three consecutive a's occur
    console.log("/[a]{3}/.test('aa')=" + /[a]{3}/.test('aa'));     // false
    console.log("/[a]{3}/.test('aaa')=" + /[a]{3}/.test('aaa'));   // true
    console.log("/[a]{3}/.test('aaaa')=" + /[a]{3}/.test('aaaa')); // true

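Because an unanchored pattern only needs to match somewhere in the string, quantifiers are usually combined with ^ and $ when the whole string must obey the count. A sketch with made-up sample strings:

```javascript
// Anchored {3}: the whole string must be exactly three a's.
var exactThree = /^a{3}$/.test('aaa');
var tooMany = /^a{3}$/.test('aaaa');

// * allows zero b's, + requires at least one.
var starZero = /^ab*$/.test('a');
var plusZero = /^ab+$/.test('a');

// ? makes the preceding character optional.
var optionalU = /^colou?r$/.test('color') && /^colou?r$/.test('colour');

// {2,4}: between two and four digits.
var threeDigits = /^\d{2,4}$/.test('123');
var fiveDigits = /^\d{2,4}$/.test('12345');

console.log(exactThree, tooMany);     // true false
console.log(starZero, plusZero);      // true false
console.log(optionalU);               // true
console.log(threeDigits, fiveDigits); // true false
```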
2.3.7 express "or" in regular expression
Use the | symbol:

    // Target strings
    var str01 = 'Hello World!';
    var str02 = 'I love Java';
    // Match 'World' or 'Java'
    var reg = /World|Java/g;
    console.log("str01.match(reg)[0]=" + str01.match(reg)[0]); // World
    console.log("str02.match(reg)[0]=" + str02.match(reg)[0]); // Java

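One point worth sketching: | splits the entire pattern unless the alternatives are wrapped in parentheses. Grouping with () is not covered elsewhere in this section, so treat the example as an illustrative aside with made-up patterns:

```javascript
// Without a group, | splits the whole pattern: "^I love Java" OR "World$".
var loose = /^I love Java|World$/;

// With a group, | applies only inside the parentheses.
var grouped = /^I love (Java|World)$/;

var looseResult = loose.test('Hello World');
var groupedWrong = grouped.test('Hello World');
var groupedJava = grouped.test('I love Java');
var groupedWorld = grouped.test('I love World');

console.log(looseResult);  // true, because the "World$" branch matches
console.log(groupedWrong); // false
console.log(groupedJava);  // true
console.log(groupedWorld); // true
```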
2.4 common regular expressions
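Requirement | Regular expression |
---|---|
User name | /^[a-zA-Z_][a-zA-Z0-9_-]{5,9}$/ |
Password (6 to 12 characters from the allowed character list) | /^[character list]{6,12}$/ |
Leading and trailing whitespace | /^\s+|\s+$/g |
Email (the character list holds the characters allowed before the @) | /^[character list]+@([a-zA-Z0-9-]+[.]{1})+[a-zA-Z]+$/ |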
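A brief sketch of two of these patterns in use. The user name pattern is written here with the hyphen at the end of the character set so that it is read as a literal character (an assumption about the intended rule), and the sample inputs are hypothetical:

```javascript
// Trim leading and trailing whitespace.
var trimmed = '  Hello World!  '.replace(/^\s+|\s+$/g, '');

// User name: a letter or underscore first, then 5 to 9 letters, digits, "_" or "-".
var userNameReg = /^[a-zA-Z_][a-zA-Z0-9_-]{5,9}$/;

var validName = userNameReg.test('tom_2014'); // 8 characters, starts with a letter
var startsWithDigit = userNameReg.test('2014tom');
var tooShort = userNameReg.test('tom');

console.log('[' + trimmed + ']'); // [Hello World!]
console.log(validName);           // true
console.log(startsWithDigit);     // false
console.log(tooShort);            // false
```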