On Regular Expressions in JS

Regular expressions have confused me for a long time. Recently I took the time to study them systematically, and my notes are organized below:

Creation of Regular Expressions

There are two ways to create a regular expression: write a literal, which consists of a pattern enclosed between slashes, or call the constructor of the RegExp object.

Both approaches look like this:

// Direct creation
const regex1 = /ab+c/;
const regex2 = /^[a-zA-Z]+[0-9]*\W?_$/gi;

// Call the constructor
const regex3 = new RegExp('ab+c');
const regex4 = new RegExp(/^[a-zA-Z]+[0-9]*\W?_$/, "gi");
const regex5 = new RegExp('^[a-zA-Z]+[0-9]*\\W?_$', 'gi');    // in a string, the backslash must be doubled

As you can see, when you call the RegExp constructor to create a regular expression, the first parameter can be either a string or a directly created regular expression.
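As a small illustration of the constructor form, here is a sketch that builds a pattern from a runtime string and reuses an existing regular expression with a different flag. The variable names are made up for this example, and passing flags alongside a regex object assumes an ES2015 or later engine.

```javascript
// Build a pattern from a value that is only known at runtime.
const keyword = 'ab+c';                        // hypothetical input
const fromString = new RegExp(keyword);        // same pattern as /ab+c/

// Reuse an existing regular expression, but with a different flag.
const base = /ab+c/;
const insensitive = new RegExp(base, 'i');

console.log(fromString.test('xabbbcx'));          // true
console.log(base.test('ABBC'));                   // false
console.log(insensitive.test('ABBC'));            // true
console.log(insensitive.source === base.source);  // true
```

The constructor form is mainly useful when the pattern is not known until the program runs; for fixed patterns the literal form is shorter.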

Note that the toString() and toLocaleString() methods of a RegExp instance both return the regular expression in literal form, regardless of how it was created.

For example:

const ncname = '[a-zA-Z_][\\w\\-\\.]*';
const qnameCapture = '((?:' + ncname + '\\:)?' + ncname + ')';
const startTagOpen = new RegExp('^<' + qnameCapture);
startTagOpen.toString();        // '/^<((?:[a-zA-Z_][\w\-\.]*\:)?[a-zA-Z_][\w\-\.]*)/'

Special Characters in Regular Expressions

\ (Backslash)

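1. Before a non-special character, a backslash indicates that the next character is special and is not to be taken literally (for example, \b means a word boundary rather than the letter b);

2. Before a special character, a backslash escapes it so that it is matched literally (for example, \* matches an asterisk);

Note: When passing a string to the RegExp constructor, the backslash itself must be escaped (written as \\), because \ is also the escape character in string literals.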
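The double-escaping rule is easy to get wrong, so here is a minimal sketch comparing a literal, a correctly escaped string, and a string where the backslash was forgotten. The variable names are only for illustration.

```javascript
// In a regex literal, one backslash is enough.
const literal = /\d+\.\d+/;

// In a string, "\\" produces a single backslash in the pattern.
const escaped = new RegExp('\\d+\\.\\d+');

// Forgetting to double the backslash: '\d' in a string is just 'd'.
const broken = new RegExp('\d+');

console.log(literal.source);        // \d+\.\d+
console.log(escaped.source);        // \d+\.\d+
console.log(broken.source);         // d+
console.log(escaped.test('3.14'));  // true
console.log(broken.test('3.14'));   // false
```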

^

1. Matches the beginning of the input;

2. When it is the first character inside [], it negates the character set.

Example:

/^A/.exec('an A')        // null
/^A/.exec('An E')        // ["A", index: 0, input: "An E"]
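The two meanings of ^ can be put side by side. The following is only a sketch with made-up variable names:

```javascript
// Outside brackets, ^ anchors the match to the start of the input.
const startsWithDigit = /^[0-9]/;

// As the first character inside brackets, ^ negates the set.
const nonDigit = /[^0-9]/;

console.log(startsWithDigit.test('7 apples'));  // true
console.log(startsWithDigit.test('apples 7'));  // false
console.log(nonDigit.exec('2019-06')[0]);       // -
console.log(nonDigit.test('201906'));           // false
```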

$

Matches the end of the input.

/t$/.exec('eater')        // null
/t$/.exec('eat')          // ["t", index: 2, input: "eat"]

*, +, . (decimal point)

*: Matches the preceding expression 0 or more times. Equivalent to {0,};

+: Matches the preceding expression 1 or more times. Equivalent to {1,};

. (decimal point): Matches any single character except line terminators such as the newline character;
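A short sketch of these three characters in use (the patterns and names are chosen only for illustration):

```javascript
const star = /ab*c/;   // 'a', then zero or more 'b', then 'c'
const plus = /ab+c/;   // 'a', then one or more 'b', then 'c'
const dot = /a.c/;     // 'a', any one character except a line break, then 'c'

console.log(star.test('ac'));     // true
console.log(plus.test('ac'));     // false
console.log(plus.test('abbbc'));  // true
console.log(dot.test('a-c'));     // true
console.log(dot.test('a\nc'));    // false
```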

? (question mark)

1. Matches the preceding expression 0 or 1 times. Equivalent to {0,1};
2. If it immediately follows any of the quantifiers *, +, ? or {}, it makes the quantifier non-greedy (matching as few characters as possible), the opposite of the greedy mode used by default;
3. It is also used in lookahead assertions, x(?=y) and x(?!y).

Example:

/\d+/.exec('123abc')             // ["123", index: 0, input: "123abc"]
/\d+?/.exec('123abc')            // ["1", index: 0, input: "123abc"]
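The difference between greedy and non-greedy matching is most visible when the pattern has a closing delimiter. A small sketch with a made-up input string:

```javascript
const html = '<b>bold</b>';

// Greedy: .+ grabs as much as possible, then backs off to the last '>'.
const greedy = html.match(/<.+>/)[0];
// Non-greedy: .+? stops at the first '>' it can reach.
const lazy = html.match(/<.+?>/)[0];
// Plain ?: the 'u' is optional.
const optional = /colou?r/;

console.log(greedy);                   // <b>bold</b>
console.log(lazy);                     // <b>
console.log(optional.test('color'));   // true
console.log(optional.test('colour'));  // true
```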

(x)

Matches 'x' and remembers the match. The parentheses are called capturing parentheses;

Example:

/(foo) (bar) \1 \2/.test('bar foo bar foo');   // false
/(bar) (foo) \1 \2/.test('bar foo bar foo');   // true
/(bar) (foo) \1 \2/.test('bar foo');           // false
/(bar) (foo) \1 \2/.test('bar foo foo bar');   // false
/(bar) (foo) \2 \1/.test('bar foo foo bar');   // true

'bar foo bar foo'.replace( /(bar) (foo)/, '$2 $1' );    // "foo bar bar foo"

The pattern /(foo) (bar) \1 \2/ matches and remembers 'foo' and 'bar', the first two words in the string "foo bar foo bar". The \1 and \2 in the pattern match the last two words of the string.

Note: \1, \2, ..., \n are used inside the pattern itself, while in the replacement string the syntax $1, $2, ..., $n is used. For example, 'bar foo'.replace(/(...) (...)/, '$2 $1').

(?:x)

Matches 'x' but does not remember the match. These are called non-capturing parentheses.

Example:

'foo'.match(/foo{1,2}/)                // ["foo", index: 0, input: "foo"]
'foo'.match(/(?:foo){1,2}/)            // ["foo", index: 0, input: "foo"]
'foofoo'.match(/(?:foo){1,2}/)         // ["foofoo", index: 0, input: "foofoo"]
'foofoo'.match(/foo{1,2}/)             // ["foo", index: 0, input: "foofoo"]

Use case: take the expression /(?:foo){1,2}/. If the expression were /foo{1,2}/, the {1,2} would apply only to the last character of 'foo', the 'o'. With non-capturing parentheses, {1,2} applies to the entire word 'foo'.
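To see what "not remembering" means in practice, the sketch below compares the result arrays of a capturing group and a non-capturing group on the same input:

```javascript
const capturing = 'foofoo'.match(/(foo){1,2}/);
const nonCapturing = 'foofoo'.match(/(?:foo){1,2}/);

// Both match the same text...
console.log(capturing[0]);         // foofoo
console.log(nonCapturing[0]);      // foofoo

// ...but only the capturing version stores a group in the result.
console.log(capturing.length);     // 2
console.log(capturing[1]);         // foo
console.log(nonCapturing.length);  // 1
```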

x(?=y), x(?!y), x|y

x(?=y): Matches 'x' only if 'x' is followed by 'y';

x(?!y): Matches 'x' only if 'x' is not followed by 'y';

x|y: Matches either 'x' or 'y'.

For the two lookahead forms, the match result does not include y.

Example:

'JackSprat'.match(/Jack(?=Sprat)/)            // ["Jack", index: 0, input: "JackSprat"]
'JackWprat'.match(/Jack(?=Sprat)/)            // null
'JackWprat'.match(/Jack(?=Sprat|Wprat)/)    // ["Jack", index: 0, input: "JackWprat"]
/\d+(?!\.)/.exec("3.141")        // ["141", index: 2, input: "3.141"]
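Because a lookahead matches a position without consuming characters, it is handy for inserting text. The helper below is a hypothetical sketch (the function name is mine) that adds thousands separators; it assumes the input is a string of digits only.

```javascript
// Hypothetical helper: insert commas as thousands separators.
function addThousandsSeparators(digits) {
    // \B           a position that is not a word boundary (between two digits)
    // (?=(\d{3})+  followed by one or more groups of three digits
    // (?!\d))      with no further digit after those groups
    return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567'));  // 1,234,567
console.log(addThousandsSeparators('1000'));     // 1,000
console.log(addThousandsSeparators('123'));      // 123
```

Both x(?=y) and x(?!y) appear here: the positive lookahead finds the right positions, and the negative lookahead makes sure the groups of three run to the end of the number.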

{n}, {n,m}:

{n}: Matches exactly n occurrences of the preceding expression;

{n,m}: Matches at least n and at most m occurrences of the preceding expression, where n and m are non-negative integers and n <= m. If m is omitted ({n,}), there is no upper limit.

Example:

    /a{2}/.exec('candy')         // null
    /a{2}/.exec('caandy')        // ["aa", index: 1, input: "caandy"]
    /a{2}/.exec('caaandy')       // ["aa", index: 1, input: "caaandy"]

    /a{1,3}/.exec('candy')       // ["a", index: 1, input: "candy"]
    /a{1,3}/.exec('caandy')      // ["aa", index: 1, input: "caandy"]
    /a{1,3}/.exec('caaandy')     // ["aaa", index: 1, input: "caaandy"]
    /a{1,3}/.exec('caaaandy')    // ["aaa", index: 1, input: "caaaandy"]
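Counted quantifiers are most often used for simple format checks. The formats below are invented for this sketch and are not meant as real validation rules:

```javascript
// Hypothetical format: three digits, a hyphen, then four digits.
const phone = /^\d{3}-\d{4}$/;
// Between 3 and 8 word characters, nothing else.
const username = /^\w{3,8}$/;
// Two or more consecutive spaces (no upper limit).
const extraSpaces = / {2,}/g;

console.log(phone.test('555-1234'));                 // true
console.log(phone.test('5555-1234'));                // false
console.log(username.test('ab'));                    // false
console.log(username.test('kaizix'));                // true
console.log('a    b  c'.replace(extraSpaces, ' '));  // a b c
```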

[xyz], [^xyz]

[xyz]: A character set. Matches any one of the characters in the square brackets;

[^xyz]: A negated character set. Matches any one character that is not in the square brackets;

Both forms can use a hyphen (-) to specify a character range, and most special characters (such as . * + ?) lose their special meaning inside a character set.

Example:

function escapeRegExp(string) {
    // $& in the replacement string stands for the entire matched substring
    return string.replace(/([.*+?^=!:${}()|[\]\/\\])/g, "\\$&");
}

In the example, the characters .*+?^=!:${}()| inside the square brackets are literals and have no special meaning.
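A typical use of such a helper is to search for user-supplied text literally. The sketch below repeats the helper so that it runs on its own; the input value is made up.

```javascript
// Same helper as above, repeated so this snippet is self-contained.
function escapeRegExp(string) {
    return string.replace(/([.*+?^=!:${}()|[\]\/\\])/g, '\\$&');
}

const userInput = '1+1=2';                         // hypothetical input
const unsafe = new RegExp(userInput);              // here '+' acts as a quantifier
const safe = new RegExp(escapeRegExp(userInput));  // here '+' is a literal

console.log(escapeRegExp(userInput));  // 1\+1\=2
console.log(unsafe.test('1+1=2'));     // false
console.log(safe.test('1+1=2'));       // true
```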

Other

\b: Matches a word boundary. The boundary itself is not included in the matched content; in other words, a word boundary match has length 0;

\B: Matches a non-word boundary;

Example:

    /\bm/.exec('moon')                        // ["m", index: 0, input: "moon"]
    /\bm/.exec('san moon')                    // ["m", index: 4, input: "san moon"]
    /oo\b/.exec('moon')                       // null
    /\B../.exec('noonday')                    // ["oo", index: 1, input: "noonday"]
    /y\B../.exec('possibly yesterday')        // ["yes", index: 9, input: "possibly yesterday"]
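A common use of \b is replacing a whole word without touching longer words that contain it. A minimal sketch with a made-up sentence:

```javascript
const text = 'cat concat cat';

// Without \b, the 'cat' inside 'concat' is replaced too.
const loose = text.replace(/cat/g, 'dog');
// With \b on both sides, only the standalone word is replaced.
const wholeWord = text.replace(/\bcat\b/g, 'dog');

console.log(loose);      // dog condog dog
console.log(wholeWord);  // dog concat dog
```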

\d: Matches a digit, equivalent to [0-9];

\D: Matches a non-digit character, equivalent to [^0-9];

\f: Matches a form feed (U+000C);

\n: Matches a line feed (U+000A);

\r: Matches a carriage return (U+000D);

\s: Matches a single whitespace character, including space, tab, form feed, and line feed, equivalent to [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff];

\S: Matches a single non-whitespace character, equivalent to [^ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff];

\w: Matches a single word character (letter, digit, or underscore), equivalent to [A-Za-z0-9_];

\W: Matches a non-word character, equivalent to [^A-Za-z0-9_];
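The shorthand classes combine naturally with quantifiers and with string methods. A small sketch on an invented input line:

```javascript
const line = 'id_42   score:\t98\n';

const digits = line.match(/\d+/g);         // runs of digits
const fields = line.trim().split(/\s+/);   // split on any run of whitespace
const nonWord = line.match(/\W/g).length;  // 3 spaces, ':', tab, newline

console.log(digits);   // [ '42', '98' ]
console.log(fields);   // [ 'id_42', 'score:', '98' ]
console.log(nonWord);  // 6
```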

Regular expression flags

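g: global search (find all matches instead of stopping after the first);

i: case-insensitive search;

m: multi-line search (^ and $ also match at the start and end of each line);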
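The effect of each flag is easiest to see on the same input. The three-line string below is made up for this sketch:

```javascript
const text = 'Cat\ncat\nCAT';

console.log(text.match(/cat/).index);  // 4 (first match only)
console.log(text.match(/cat/g));       // [ 'cat' ]
console.log(text.match(/cat/gi));      // [ 'Cat', 'cat', 'CAT' ]
console.log(/^cat$/.test(text));       // false
console.log(/^cat$/m.test(text));      // true
```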

Regular expression usage

RegExp has the exec() and test() methods.

exec returns an array containing the matched text and the captured groups, with index and input properties, or null if there is no match.

test returns true or false. When you only need to know whether a match exists, it is generally more efficient than exec.
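With the g flag, exec can be called repeatedly to walk through all matches, because the regular expression keeps its position in lastIndex. The pattern and input below are invented for this sketch:

```javascript
const pattern = /(\w+)@(\w+)\.com/g;
const input = 'amy@example.com, bob@test.com';

const found = [];
let result;
// With the g flag, each exec call resumes from pattern.lastIndex.
while ((result = pattern.exec(input)) !== null) {
    found.push({
        match: result[0],    // whole match
        user: result[1],     // first capture group
        index: result.index  // position of the match in the input
    });
}

console.log(found.length);               // 2
console.log(found[0].user);              // amy
console.log(found[1].index);             // 17
console.log(/\d/.test('abc1'));          // true
```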

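String has the match(), replace(), search(), and split() methods.

Without the g flag, match returns the same result as RegExp's exec; with the g flag, it returns an array of all matched substrings. replace substitutes the parts matched by a regular expression. search returns the index of the first match, or -1 if there is none. split divides a string using a regular expression as the separator.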
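A brief sketch of search, split, and match on a made-up string with inconsistent separators:

```javascript
const csv = 'a, b;c ,d';

const position = csv.search(/;/);        // index of the first ';'
const parts = csv.split(/\s*[,;]\s*/);   // separators with optional spaces
const allSeparators = csv.match(/[,;]/g);

console.log(position);       // 4
console.log(parts);          // [ 'a', 'b', 'c', 'd' ]
console.log(allSeparators);  // [ ',', ';', ',' ]
console.log(csv.search(/x/));  // -1
```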

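When the second argument of replace is a function, it is called with the following parameters:

* The matched substring
* The remembered items (the captured groups in parentheses), one argument per group
* ...
* The index of the match
* The input string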
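The parameter order above can be checked with a small sketch. The callback parameter names are my own choice; only their positions matter.

```javascript
const calls = [];
const swapped = 'John Smith'.replace(
    /(\w+)\s(\w+)/,
    (match, first, last, offset, input) => {
        // Record what the callback received, then build the replacement.
        calls.push({ match, first, last, offset, input });
        return last + ', ' + first;
    }
);

console.log(swapped);          // Smith, John
console.log(calls[0].match);   // John Smith
console.log(calls[0].first);   // John
console.log(calls[0].last);    // Smith
console.log(calls[0].offset);  // 0
console.log(calls[0].input);   // John Smith
```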

Keywords: Javascript

Added by kaizix on Thu, 20 Jun 2019 03:37:24 +0300
