Difference between the replace() and replaceAll() methods of String (source code analysis)

Preface

Let's start with a brief overview:

  1. replace() and replaceAll() are both commonly used methods for replacing parts of a string;
  2. Both replace every occurrence: each match of the target in the source string is replaced with the specified character or string;
  3. If you only want to replace the first occurrence, use replaceFirst();
    Like replaceAll(), this method is based on regular expressions, but it replaces only the first match;
  4. The first parameter of replaceAll() and replaceFirst() is always interpreted as a regular expression; an ordinary string without metacharacters is simply a regex that matches itself;
  5. If the pattern passed to replaceAll() contains no regex metacharacters (and the replacement contains no "\" or "$"), the result is the same as with replace(), while replaceFirst() still replaces only the first match. In the source quoted below, replace() also goes through Pattern and Matcher, so the cost is comparable; other JDK versions may implement replace() differently.

Note: after the replacement operation, a new object will be returned, and the content of the source string will not be changed.
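As a minimal sketch of the overview above (the class and method names are hypothetical, chosen only for illustration), the following shows that replace() changes every occurrence, replaceFirst() only the first, and that the source string is left untouched:

```java
class ReplaceOverviewDemo {
    // Replaces every literal "-" with "+" (no regex involved).
    static String replaceEvery(String source) {
        return source.replace("-", "+");
    }

    // Replaces only the first match; "-" has no special meaning
    // in a regex outside a character class, so it matches literally.
    static String replaceOnlyFirst(String source) {
        return source.replaceFirst("-", "+");
    }

    public static void main(String[] args) {
        String source = "a-b-c";
        System.out.println(replaceEvery(source));     // a+b+c
        System.out.println(replaceOnlyFirst(source)); // a+b-c
        System.out.println(source);                   // a-b-c (unchanged)
    }
}
```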

Source code analysis

Let's look at how the two methods are defined in the source code. Here is an excerpt of each:

/* String.class */
...
/**
 * Replaces each substring of this string that matches the literal target sequence
 * with the specified literal replacement sequence ...
 */
public String replace(CharSequence target, CharSequence replacement) {
	return Pattern.compile(target.toString(), Pattern.LITERAL).
		matcher(this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
...

/**
 * Replaces each substring of this string that matches the given regular expression
 * with the given replacement...
 */
public String replaceAll(String regex, String replacement) {
	return Pattern.compile(regex).
		matcher(this).replaceAll(replacement);
}
...

From these definitions we can see that replaceAll() is designed to match regular expressions.

Two points can be summarized from the source code:

  1. String.replace() and String.replaceAll() both end up calling the same method, Matcher.replaceAll();
  2. replaceAll() does not pass the flag "Pattern.LITERAL" to Pattern.compile(), and it does not wrap the replacement in Matcher.quoteReplacement();

This small difference means that replaceAll() always compiles its regex parameter as a regular expression:

  1. If regex contains metacharacters, they take their regex meaning;
  2. If regex is an ordinary string without metacharacters, the pattern matches that text literally, so the result is the same as with replace().
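To make this concrete, here is a small sketch (class and method names are hypothetical) in which the same target text gives different results depending on whether it is read as a regex. It also uses Pattern.quote(), a standard way to make replaceAll() treat its pattern as plain text:

```java
import java.util.regex.Pattern;

class RegexVersusLiteralDemo {
    // The target is read as a regex: "1+" means "one or more '1' characters".
    static String asRegex(String source) {
        return source.replaceAll("1+", "x");
    }

    // Pattern.quote() turns the target into a literal pattern, so "1+"
    // only matches the two characters '1' followed by '+'.
    // Note: the replacement is still subject to "\" and "$" handling.
    static String asLiteral(String source) {
        return source.replaceAll(Pattern.quote("1+"), "x");
    }

    public static void main(String[] args) {
        String source = "1+1=2";
        System.out.println(asRegex(source));           // x+x=2
        System.out.println(asLiteral(source));         // x1=2
        System.out.println(source.replace("1+", "x")); // x1=2
    }
}
```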

replaceAll() source code analysis

If you like to get to the bottom of things, let's see how the flag "Pattern.LITERAL" affects the logic:

  1. String.replaceAll(String regex, String replacement), quoted above, calls three functions:
  • Pattern.compile(String regex) – compiles (parses) the regular expression and returns a Pattern object;
  • Pattern.matcher(CharSequence input) – creates the Matcher;
  • Matcher.replaceAll(String replacement) – performs the replacement;
    As the names suggest, the part we need to look at is the Pattern.compile(String regex) method.
  2. Pattern.compile(String regex) returns a Pattern object.
  3. The constructor of Pattern is private and cannot be called directly by other classes. It can only be reached through compile(String regex) and compile(String regex, int flags) of the Pattern class.
    The constructor calls the private compile() method, and this is where the regex parameter is processed.
  4. Two places in the private Pattern.compile() method matter here, marked ① and ②:
    ① is an if-else statement that checks the "LITERAL" flag, the small difference mentioned above, which replaceAll() does not pass; it determines whether ② is executed;
    ② is matchRoot = expr(lastAccept); which builds the root node of the regular expression match. If this statement is executed, the pattern is parsed as a regular expression.

I won't walk through the rest of the code here. Interested readers can explore it on their own.
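The effect of the flag can also be observed from the outside, without reading the Pattern internals. The sketch below (class and method names are hypothetical) runs the same call chain that replace() and replaceAll() use, once with flags set to 0 and once with Pattern.LITERAL:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralFlagDemo {
    // Same call chain as String.replace()/replaceAll(), with the flags exposed.
    // The replacement is quoted so that only the pattern side differs.
    static String replaceWithFlags(String input, String pattern, int flags, String replacement) {
        return Pattern.compile(pattern, flags)
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "a.b";
        // No flags: "." is a metacharacter that matches any character.
        System.out.println(replaceWithFlags(input, ".", 0, "#"));               // ###
        // Pattern.LITERAL: "." only matches a real dot.
        System.out.println(replaceWithFlags(input, ".", Pattern.LITERAL, "#")); // a#b
    }
}
```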

Code demonstration

After all that theory, let's write a few lines of code to verify it:

	@Test
	public void replaceTest() {
		String str1 = "Aoc.Iop.Aoc.Iop.Aoc";		//Define three identical strings
		String str2 = "Aoc.Iop.Aoc.Iop.Aoc";
		String str3 = "Aoc.Iop.Aoc.Iop.Aoc";

		String str11 = str1.replace(".", "#");		// str11 = "Aoc#Iop#Aoc#Iop#Aoc"
		String str22 = str2.replaceAll(".", "#");	// str22 = "###################"
		String str33 = str3.replaceFirst(".", "#");	// str33 = "#oc.Iop.Aoc.Iop.Aoc"
	}

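Because "." is a regex metacharacter that matches any character, replaceAll() replaces every character of the string and replaceFirst() replaces only its first character, while replace() treats "." as an ordinary dot.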
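If the intent is to replace real dots with the regex-based methods, the dot has to be escaped. A minimal sketch (hypothetical class and method names) using the same test string:

```java
class EscapedDotDemo {
    // "\\." in Java source is the regex \. which matches a literal dot.
    static String allDotsToHash(String source) {
        return source.replaceAll("\\.", "#");
    }

    static String firstDotToHash(String source) {
        return source.replaceFirst("\\.", "#");
    }

    public static void main(String[] args) {
        String source = "Aoc.Iop.Aoc.Iop.Aoc";
        System.out.println(allDotsToHash(source));  // Aoc#Iop#Aoc#Iop#Aoc
        System.out.println(firstDotToHash(source)); // Aoc#Iop.Aoc.Iop.Aoc
    }
}
```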

The escape character "\" needs special attention:

  1. "\" is an escape character in Java string literals, so you need two to represent one.
    For example, System.out.println("\\"); prints only one "\";
  2. "\" is also an escape character in regular expressions and in the replacement string of replaceAll(), so there too it takes two to represent one.
    Therefore, "\\\\" is first converted into "\\" by Java, and "\\" is then converted into "\" by the regular expression engine.

For example:

	@Test
	public void replaceTest() {
		String str1 = "blog.csdn.net/weixin_44259720/";
		String str2 = "blog.csdn.net/weixin_44259720/";
		
		String str11 = str1.replace("/", "\\");			// Java escaping only
		String str22 = str2.replaceAll("/", "\\\\");	// Java escaping + replaceAll() escaping
	}
	
	The output results are the same:
	str11 = "blog.csdn.net\weixin_44259720\"
	str22 = "blog.csdn.net\weixin_44259720\"
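Counting backslashes by hand is error-prone. One alternative, sketched here with hypothetical class and method names, is to let Matcher.quoteReplacement() escape the replacement string so that "\" and "$" are taken literally:

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // "\\" in Java source is a single backslash; quoteReplacement() adds
    // the extra escaping that replaceAll() expects in its replacement.
    static String slashesToBackslashes(String path) {
        return path.replaceAll("/", Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        String path = "blog.csdn.net/weixin_44259720/";
        // Prints blog.csdn.net\weixin_44259720\ just like replace("/", "\\").
        System.out.println(slashesToBackslashes(path));
        System.out.println(path.replace("/", "\\"));
    }
}
```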

Summary

  1. replace() has two overloads, replace(char, char) and replace(CharSequence, CharSequence), so it supports both character replacement and string replacement (CharSequence is the interface for character sequences that String implements);
  2. The first parameter of replaceAll() is regex, that is, the replacement is based on regular expressions. For example, you can replace every digit in a string with an asterisk through replaceAll("\\d", "*");
  3. A replacement operation on a String returns a new object; the content of the source String is not changed.
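The digit example from point 2 can be written out as follows. This is only an illustrative sketch; the class name, method name and sample input are made up:

```java
class MaskDigitsDemo {
    // "\\d" in Java source is the regex \d, which matches a single digit.
    static String maskDigits(String source) {
        return source.replaceAll("\\d", "*");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Order 2022-01-30")); // Order ****-**-**
    }
}
```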

Added by eyalrosen on Sun, 30 Jan 2022 19:15:52 +0200