Learning JavaScript: regular expressions

Regular expressions

  • A regular expression is also called a "rule expression"

  • We write the "rule" ourselves, and it is used specifically to check whether a string meets the requirements of that rule

  • We define the rule with special characters and symbols, and then use it to test whether a string qualifies

    var reg = /\d+/
    var str1 = '123'
    var str2 = 'abc'
    console.log(reg.test(str1)) // true
    console.log(reg.test(str2)) // false
    
    • The variable reg above is the custom rule
    • The string str1 satisfies the rule
    • The string str2 does not satisfy the rule

Create a regular expression

  • To write a "rule", you must follow the syntax that the language defines
  • Letters and symbols written between / / form a regular expression, such as /abcdefg/
  • There are two ways to create regular expressions: literal and constructor

Literal creation

// Here is the literal to create a regular expression
var reg = /abcdefg/
  • This regular expression can now be used to test strings

Constructor creation

// Here is the constructor to create a regular expression
var reg = new RegExp('abcdefg')
console.log(reg) //  /abcdefg/
  • The literal and the constructor produce the same regular expression
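
  • The constructor form is mainly useful when the pattern is only known at runtime. Below is a minimal sketch, assuming a hypothetical helper buildDigitRegex that is not part of the original text. Note that a backslash must be doubled inside a string, and that flags go in a second argument:

```javascript
// Hypothetical helper for illustration: builds a pattern that matches
// `count` digits in a row.
function buildDigitRegex(count) {
  // Inside a string literal, '\\d' is needed to produce the regex token \d.
  return new RegExp('\\d{' + count + '}')
}

const threeDigits = buildDigitRegex(3)
console.log(threeDigits.source)          // \d{3}
console.log(threeDigits.test('abc123'))  // true
console.log(threeDigits.test('abc12'))   // false

// Flags are passed as the second argument of the constructor.
const ignoreCase = new RegExp('abc', 'i')
console.log(ignoreCase.test('ABC'))      // true
```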

Symbols in regular expressions

  • Now that we know how to create a regular expression, let's look in detail at the symbols it can contain

Metacharacters

  • . : matches any single character except a newline

  • \ : the escape character; it turns a character with special meaning into a literal character, and gives certain ordinary characters a special meaning

  • \s: matches a whitespace character (space, tab, ...)

  • \S: matches a non-whitespace character

  • \d: matches a digit

  • \D: matches a non-digit

  • \w: matches a letter, digit, or underscore

  • \W: matches any character other than a letter, digit, or underscore

  • With metacharacters we can already write some simple rules

    var reg = /\s/
    var str = 'a b'
    var str2 = 'ab'
    console.log(reg.test(str)) // true
    console.log(reg.test(str2)) // false
    
    var reg = /\d/
    var str = 'abc1'
    var str2 = 'abc'
    console.log(reg.test(str)) // true
    console.log(reg.test(str2)) // false
    
    var reg = /\w/
    var str = 'a1'
    var str2 = '#@$'
    console.log(reg.test(str)) // true
    console.log(reg.test(str2)) // false
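
  • The examples above do not cover . and \. Here is a small sketch of how escaping changes the meaning of the dot, plus the negated forms \S and \D (the variable names are only illustrative):

```javascript
// An unescaped dot matches any single character except a line terminator.
const anyChar = /a.c/
console.log(anyChar.test('abc'))   // true
console.log(anyChar.test('a.c'))   // true
console.log(anyChar.test('a\nc'))  // false, . does not match a newline

// An escaped dot matches only a literal "." character.
const literalDot = /a\.c/
console.log(literalDot.test('abc'))  // false
console.log(literalDot.test('a.c'))  // true

// \S and \D are the negated forms of \s and \d.
console.log(/\S/.test('   '))  // false, the string is only whitespace
console.log(/\D/.test('123'))  // false, the string is only digits
```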
    

Quantifiers

  • *: the preceding item is repeated 0 or more times

  • +: the preceding item is repeated 1 or more times

  • ?: the preceding item is repeated 0 or 1 times

  • {n}: the preceding item is repeated exactly n times

  • {n,}: the preceding item is repeated at least n times

  • {n,m}: the preceding item is repeated at least n and at most m times

  • Quantifiers are used together with metacharacters

    // \d* : a digit appears 0 or more times, so every string passes (even one with no digits)
    var reg = /\d*/
    var str = 'abc'
    var str2 = 'abc1'
    var str3 = 'abc123'
    console.log(reg.test(str)) // true
    console.log(reg.test(str2)) // true
    console.log(reg.test(str3)) // true
    
    // \d+ : a digit appears 1 or more times
    var reg = /\d+/
    var str = 'abc'
    var str2 = 'abc1'
    var str3 = 'abc123'
    console.log(reg.test(str)) // false
    console.log(reg.test(str2)) // true
    console.log(reg.test(str3)) // true
    
    // \d? : a digit appears 0 or 1 times, so every string passes
    var reg = /\d?/
    var str = 'abc'
    var str2 = 'abc1'
    console.log(reg.test(str)) // true
    console.log(reg.test(str2)) // true
    
    // \d{3} : three digits in a row must appear
    var reg = /\d{3}/
    var str = 'abc'
    var str2 = 'abc1'
    var str3 = 'abc123'
    console.log(reg.test(str)) // false
    console.log(reg.test(str2)) // false
    console.log(reg.test(str3)) // true
    
    // \d{3,} : at least three digits in a row must appear
    var reg = /\d{3,}/
    var str = 'abc'
    var str2 = 'abc1'
    var str3 = 'abc123'
    var str4 = 'abcd1234567'
    console.log(reg.test(str)) // false
    console.log(reg.test(str2)) // false
    console.log(reg.test(str3)) // true
    console.log(reg.test(str4)) // true
    
    // \d{3,5} : three to five digits in a row must appear (without ^ and $, a longer run of digits still passes)
    var reg = /\d{3,5}/
    var str = 'abc'
    var str2 = 'abc1'
    var str3 = 'abc123'
    var str4 = 'abc12345'
    console.log(reg.test(str)) // false
    console.log(reg.test(str2)) // false
    console.log(reg.test(str3)) // true
    console.log(reg.test(str4)) // true
    

Boundary characters

  • ^: matches the beginning of the string

  • $: matches the end of the string

  • Boundary characters pin the rule to the beginning and end of the string

    // From beginning to end, the string may contain only digits, and there must be 3 to 5 of them
    var reg = /^\d{3,5}$/
    var str = 'abc'
    var str2 = 'abc123'
    var str3 = '1'
    var str4 = '1234567'
    var str5 = '123'
    var str6 = '12345'
    console.log(reg.test(str)) // false
    console.log(reg.test(str2)) // false
    console.log(reg.test(str3)) // false
    console.log(reg.test(str4)) // false
    console.log(reg.test(str5)) // true
    console.log(reg.test(str6)) // true
    

Special symbols

  • (): defines a group

  • []: character set; matches any one of the characters written inside []

  • [^]: negated character set; matches any one character that is not written inside [^]

  • -: range, used inside []; for example a-z means the letters a through z

  • |: or; a|b matches the letter a or the letter b

  • [\u4e00-\u9fa5]: matches Chinese characters

  • Groups are especially useful with replace

  • Groups are defined with () in the regular expression

  • Groups are numbered from left to right: group 1, group 2, group 3, ...

  • When replacing, the replacement string can refer to the text matched by a group: $1 stands for what group 1 matched, $2 for what group 2 matched

  • Now we can combine several symbols

    // A simple mailbox (email) check
    // Must not start with _ or $, followed by at least 6 of any character, then an @ sign, one of (163|126|qq|sina), a dot, and one of (com|cn|net)
    var reg = /^[^_$].{6,}@(163|126|qq|sina)\.(com|cn|net)$/
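
  • To make the grouping and $1 / $2 rules above concrete, here is a minimal sketch. The date string, the sample addresses, and the variable names are made up for illustration; the last part simply runs the mailbox pattern above against a few inputs:

```javascript
// A group lets a quantifier apply to a whole sequence, not one character.
console.log(/^(ab)+$/.test('abab'))  // true
console.log(/^ab+$/.test('abab'))    // false, + only repeats "b"

// Groups are numbered from left to right: (\d{4}) is group 1,
// the first (\d{2}) is group 2, the second (\d{2}) is group 3.
const datePattern = /(\d{4})-(\d{2})-(\d{2})/
const isoDate = '2022-01-03'
// $1, $2, $3 in the replacement refer to what each group matched.
console.log(isoDate.replace(datePattern, '$3/$2/$1'))  // 03/01/2022

// The mailbox pattern from the text, tried on sample addresses.
const mailPattern = /^[^_$].{6,}@(163|126|qq|sina)\.(com|cn|net)$/
console.log(mailPattern.test('zhangsan@163.com'))  // true
console.log(mailPattern.test('_zhangsan@qq.com'))  // false, starts with _
console.log(mailPattern.test('abc@163.com'))       // false, name is too short
```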
    

Flags (identifiers)

  • i: ignore case
    • The i is written after the closing / of the regular expression
    • /\w/i
    • Matching is not case sensitive
  • g: global match
    • The g is written after the closing / of the regular expression
    • /\w/g
    • Matches every letter, digit, or underscore in the string instead of stopping at the first one
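
  • A short sketch of what i and g change in practice. It uses the string method match, which is covered later in this article; the sample strings are illustrative:

```javascript
// Without i, the match is case sensitive.
console.log(/hello/.test('HELLO'))   // false
console.log(/hello/i.test('HELLO'))  // true

// Without g, match reports only the first match;
// with g, it returns every match.
const sample = 'a1b2c3'
console.log(sample.match(/\d/)[0])  // 1
console.log(sample.match(/\d/g))    // [ '1', '2', '3' ]

// Flags can be combined.
console.log('Aa'.match(/a/gi))      // [ 'A', 'a' ]
```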

Regular expression methods

  • Regular expressions provide some methods for us to use
  • They are used to test a string and to capture content from it

test

  • test checks whether a string satisfies the regular expression

  • Syntax: regex.test(string)

  • Return value: boolean

    console.log(/\d+/.test('123')) // true
    console.log(/\d+/.test('abc')) // false
    

exec

  • exec captures the content in the string that satisfies the regular expression

  • Syntax: regex.exec(string)

  • Return value: an array holding the first match in the string plus some extra information, or null if nothing matches

    var reg = /\d{3}/
    var str = 'hello123world456 Hello 789'
    var res = reg.exec(str)
    console.log(res)
    /*
        ["123", index: 5, input: "hello123world456 Hello 789", groups: undefined]
        0: "123"
        groups: undefined
        index: 5
        input: "hello123world456 Hello 789"
        length: 1
        __proto__: Array(0)
    */
    
    • Item 0 of the array is the matched text
    • The index property is the position in the string where the match starts
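
  • One detail worth knowing: when the regular expression carries the g flag, exec remembers where the previous match ended (in its lastIndex property), so calling it repeatedly walks through all the matches. A minimal sketch using the same string as above (the variable names are illustrative):

```javascript
const digitTriple = /\d{3}/g
const greeting = 'hello123world456 Hello 789'

const found = []
let result
// exec returns null once there are no more matches,
// and lastIndex is then reset to 0.
while ((result = digitTriple.exec(greeting)) !== null) {
  found.push(result[0] + '@' + result.index)
}
console.log(found)  // [ '123@5', '456@13', '789@23' ]
```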

String methods

  • Strings also have some methods that can be used with regular expressions

search

  • search looks for content in the string that satisfies the regular expression

  • Syntax: string.search(regex)

  • Return value: the start index of the first match if there is one, otherwise -1

    var reg = /\d{3}/
    var str = 'hello123'
    var str2 = 'hello'
    console.log(str.search(reg)) // 5
    console.log(str2.search(reg)) // -1
    
    

match

  • match finds the content in the string that satisfies the regular expression and returns it

  • Syntax: string.match(regex)

  • Return value:

    • Without the g flag, the result is the same as with the exec method
    • With the g flag, it returns an array containing every match

    var reg = /\d{3}/
    var str = 'hello123world456'
    var str2 = 'hello'
    console.log(str.match(reg)) 
    // ["123", index: 5, input: "hello123world456", groups: undefined]
    console.log(str2.match(reg)) // null
    
    var reg = /\d{3}/g
    var str = 'hello123world456'
    var str2 = 'hello'
    console.log(str.match(reg)) 
    // ["123", "456"]
    console.log(str2.match(reg)) // null
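
  • The difference between the two return shapes is easiest to see when the pattern contains groups. A small sketch with a made-up log line and pattern:

```javascript
const timePattern = /(\d{2}):(\d{2})/
const logLine = 'start 09:30, end 17:45'

// Without g: same shape as exec, including the captured groups and index.
const first = logLine.match(timePattern)
console.log(first[0], first[1], first[2], first.index)  // 09:30 09 30 6

// With g: only the full matches, without groups or index.
console.log(logLine.match(/(\d{2}):(\d{2})/g))  // [ '09:30', '17:45' ]
```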
    

replace

  • replace replaces the part of the string that satisfies the regular expression

  • Syntax: string.replace(regex, replacement string)

  • Return value: the string after replacement

    var reg = /\d{3}/
    var str = 'hello123world456'
    var str2 = 'hello'
    console.log(str.replace(reg, '666')) // hello666world456
    console.log(str2.replace(reg, '666')) // hello
    
    var reg = /\d{3}/g
    var str = 'hello123world456'
    var str2 = 'hello'
    console.log(str.replace(reg, '666')) // hello666world666
    console.log(str2.replace(reg, '666')) // hello
    
  • Greedy matching: as long as the rule allows it, a quantifier matches as many characters as possible. This behaviour is called greedy matching

  • Quantifiers such as + are greedy by default

  • Non-greedy matching: the match ends as soon as the rule is satisfied

  • To make a quantifier non-greedy, append ? to it, for example use +? in place of + (or *? in place of *)
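
  • A minimal sketch of the difference, using a made-up HTML fragment. The only change between the greedy and non-greedy patterns is the extra ? after the quantifier:

```javascript
const html = '<b>one</b><b>two</b>'

// Greedy: .+ grabs as much as possible, so the match runs
// from the first <b> to the last </b>.
console.log(html.match(/<b>.+<\/b>/)[0])   // <b>one</b><b>two</b>

// Non-greedy: .+? stops at the first position where the rest of
// the pattern can match.
console.log(html.match(/<b>.+?<\/b>/)[0])  // <b>one</b>

// The same applies to other quantifiers, e.g. {2,4} versus {2,4}?
console.log('12345'.match(/\d{2,4}/)[0])   // 1234
console.log('12345'.match(/\d{2,4}?/)[0])  // 12
```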

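Zero-width assertions

  • Zero-width: the assertion itself does not take up any part of the match result

  • Assertion: a condition that must hold at that position

  • (?=condition): positive lookahead; what follows must match the condition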

    var string = "<span>hello world</span><div>hello www</div>"
    // - Assertion condition: the text must be followed by </span>
    // - The \ escape turns a character with special meaning in a regular expression into a plain character
    var reg = /[a-z\s]+(?=<\/span>)/g
    console.log( string.match( reg ) ); // ["hello world"]
    
  • (?!condition): negative lookahead; what follows must not match the condition

    var string = "Hello, Xiaohong, I'm Xiaolv's friend! Do you remember me, Xiaohong?"
    var reg2 = /Xiao(hong|lv)(?!'s friend)/g
    console.log( string.match( reg2 ) ) // ["Xiaohong", "Xiaohong"]
    
  • (?<=condition): positive lookbehind; what precedes must match the condition

    var string = "<span>hello world</span><div>hello www</div>"
    // - Assertion condition: the text must be preceded by <span>
    // - < and > have no special meaning here, so they need no escape
    var reg = /(?<=<span>)[a-z\s]+/g
    console.log( string.match( reg ) ); // ["hello world"]
    
  • (?<!condition): negative lookbehind; what precedes must not match the condition

    var string = "Hello, Xiaohong, I'm Xiaolv's friend! Do you remember me, Xiaohong?"
    var reg2 = /(?<!Hello, )Xiao(hong|lv)/g
    console.log( string.match( reg2 ) ) // ["Xiaolv", "Xiaohong"]
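
  • Because assertions match a position rather than characters, they combine well with replace and match. The sketch below is illustrative only (the sample strings are made up, and lookbehind needs an engine that supports it, such as a current browser or Node.js):

```javascript
// Lookahead: insert a comma at every position that is followed by
// one or more complete groups of three digits up to the end of the number.
// \B avoids a comma at the very start; (?!\d) marks the end of the digits.
const withCommas = '1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ',')
console.log(withCommas)  // 1,234,567

// Lookbehind: take only the digits that directly follow a "$" sign.
const prices = 'apple $30, pear €45, plum $7'
console.log(prices.match(/(?<=\$)\d+/g))     // [ '30', '7' ]

// Negative lookbehind: digits not preceded by "$" or by another digit.
console.log(prices.match(/(?<![$\d])\d+/g))  // [ '45' ]
```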
    


Keywords: Javascript Front-end

Added by lanrat on Mon, 03 Jan 2022 10:00:05 +0200