Analysis and design of string functions
Characters and strings are processed very frequently in C, but the language itself has no string type. Strings are usually stored as string literals (constants) or in character arrays. String literals are suitable only for string functions that do not modify them. Let's look at some of the string functions in the library and write our own versions of them.
String length function:
strlen
The C library function size_t strlen(const char *str) calculates the length of the string str up to, but not including, the terminating null character.
Header file | <string.h> |
---|---|
Declaration | size_t strlen(const char *str) |
Parameter | str – the string whose length is to be calculated |
Return value | This function returns the length of the string as an unsigned integer (size_t) |
Notes:
1. A string uses '\0' as its end marker. strlen returns the number of characters that appear before '\0' in the string (not counting '\0' itself).
2. The string pointed to by the parameter must end with '\0'.
3. The return value has type size_t, which is unsigned (a common source of errors).
Three ways to implement our own version of strlen:

    #include <assert.h>

    // Counter method: count characters until '\0' is reached
    int my_strlen1(const char* p)
    {
        assert(p);
        int count = 0;
        while (*p++)
        {
            count++;
        }
        return count;
    }

    // Pointer method: subtract the start pointer from the end pointer
    int my_strlen2(const char* p)
    {
        assert(p);
        const char* tmp = p;
        while (*tmp++)
        {
            ;
        }
        // tmp has moved one past '\0', hence the extra -1
        return (int)(tmp - p - 1);
    }

    // Recursive method: 1 + length of the rest of the string
    int my_strlen3(const char* p)
    {
        assert(p);
        if (*p)
        {
            return 1 + my_strlen3(p + 1);
        }
        else
        {
            return 0;
        }
    }

However, note that the versions we wrote return a signed integer (int), while the strlen in the library returns an unsigned integer (size_t).
String functions without a length limit:
strcpy
The C library function char *strcpy(char *dest, const char *src) copies the string pointed to by src to dest.
Note that if the destination array dest is not large enough to hold the source string, a buffer overflow occurs.
Header file | <string.h> |
---|---|
Declaration | char *strcpy(char *dest, const char *src) |
Parameter | dest – points to the destination array used to store the copied content |
Parameter | src – the string to copy |
Return value | This function returns a pointer to the destination string dest |
Notes:
1. strcpy copies the C string pointed to by src into the array pointed to by dest, including the terminating null character (and stops at that point).
2. The source string must end with '\0'.
3. The '\0' in the source string is copied to the destination as well.
4. The destination must be large enough to hold the source string.
5. The destination must be modifiable (for example a character array, not a string literal).
Our own implementation of strcpy:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    char* my_strcpy(char* dest, const char* src)
    {
        assert(dest && src);
        char* tmp = dest;
        // Copy each character, including '\0'; the loop ends once '\0' has been copied
        while ((*dest++ = *src++))
        {
            ;
        }
        return tmp;
    }

    int main()
    {
        char arr1[] = "afghrdt";
        char arr2[20] = "cddfg";
        // arr1 has room for 8 bytes, enough for "cddfg" plus '\0'
        char* ret = my_strcpy(arr1, arr2);
        printf("%s", ret);
        return 0;
    }
strcat
The C library function char *strcat(char *dest, const char *src) appends the string pointed to by src to the end of the string pointed to by dest.
Header file | <string.h> |
---|---|
Declaration | char *strcat(char *dest, const char *src) |
Parameter 1 | dest – points to the destination array, which contains a C string and is large enough to hold the result after appending |
Parameter 2 | src – points to the string to append, which must not overlap the destination string |
Return value | This function returns a pointer to the destination string dest |
Notes:
1. The source string must end with '\0'.
2. The destination must be large enough to hold its own contents plus the contents of the source string.
3. The destination must be modifiable.
4. A string cannot be appended to itself. The first appended character overwrites the '\0' at the end of the destination, which is also the '\0' of the source, so the end of the source is never found and the loop does not terminate.
Our own implementation of strcat:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    char* my_strcat(char* dest, const char* src)
    {
        assert(dest && src);
        char* tmp = dest;
        // Find the '\0' in the destination string first
        while (*dest)
        {
            dest++;
        }
        // Append from there, including the '\0' of the source
        while ((*dest++ = *src++))
        {
            ;
        }
        return tmp;
    }

    int main()
    {
        char arr1[20] = "abcd";
        char arr2[] = "efghijk";
        char* ret = my_strcat(arr1, arr2);
        printf("%s", ret);
        return 0;
    }
strcmp
This function starts by comparing the first character of each string. If they are equal, it continues with the following pairs until the characters differ or the terminating null character is reached. It returns a negative value, zero, or a positive value when the first string is less than, equal to, or greater than the second.
Note: strcmp compares the values of the characters, not the lengths of the strings.
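Our own implementation of strcmp:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    int my_strcmp(const char* str1, const char* str2)
    {
        assert(str1 && str2);
        // Advance while the characters at the current position are equal
        while (*str1 == *str2)
        {
            // Both strings ended at the same position: they are equal
            if (*str1 == '\0')
            {
                return 0;
            }
            str1++;
            str2++;
        }
        // Compare as unsigned char, as the library function does
        return (unsigned char)*str1 - (unsigned char)*str2;
    }

    int main()
    {
        char arr1[] = "abxdef";
        char arr2[] = "adddcfg";
        int ret = my_strcmp(arr1, arr2);
        printf("%d", ret); // 'b' - 'd' = -2
        return 0;
    }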
String functions with a length limit:
You may have noticed that the string functions above, which are not limited by length, are dangerous to use: we cannot control how many characters are processed, so overflows and similar problems can occur. Next come the length-limited string functions, which take a count and are easier to control.
strncpy
The C library function char *strncpy(char *dest, const char *src, size_t n) copies the string pointed to by src to dest, up to a maximum of n characters. When the length of src is less than n, the rest of dest is filled with null bytes.
Header file | <string.h> |
---|---|
Declaration | char *strncpy(char *dest, const char *src, size_t n) |
Parameter | dest – points to the destination array used to store the copied content |
Parameter | src – the string to copy |
Parameter | n – the maximum number of characters to copy from the source |
Return value | This function returns a pointer to the destination string dest |
Notes:
1. strncpy does not add a terminating '\0' when the source string has n or more characters.
2. If the length of the source string is less than n, the source string is copied and the destination is then padded with '\0' until n characters in total have been written.
Our own implementation of strncpy:

    #include <stdio.h>
    #include <assert.h>

    char* my_strncpy(char* dest, const char* src, size_t n)
    {
        assert(dest && src);
        char* tmp = dest;
        // Copy at most n characters, stopping early at the '\0' of the source
        while (n && *src != '\0')
        {
            *dest++ = *src++;
            n--;
        }
        // If the source was shorter than n, pad the destination with '\0'
        while (n--)
        {
            *dest++ = '\0';
        }
        return tmp;
    }

    int main()
    {
        char arr1[20] = "zbcdkkkkkkk";
        char arr2[] = "xxxxpo";
        char* ret = my_strncpy(arr1, arr2, 11);
        printf("%s", ret);
        return 0;
    }
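Debugging verification: after the call, arr1 holds xxxxpo followed by five '\0' bytes (11 bytes written in total), so printf prints xxxxpo.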
strncat
The C library function char *strncat(char *dest, const char *src, size_t n) appends the string pointed to by src to the end of the string pointed to by dest, appending at most n characters.
Header file | <string.h> |
---|---|
Declaration | char *strncat(char *dest, const char *src, size_t n) |
Parameter | dest – points to the destination array, which contains a C string and is large enough to hold the appended characters plus the terminating null character |
Parameter | src – the string to append |
Parameter | n – the maximum number of characters to append |
Return value | This function returns a pointer to the destination string dest |
Notes:
1. strncat always writes a terminating '\0' after the appended characters.
2. If the length of the source string is less than n, the source string is appended and nothing more: unlike strncpy, the destination is not padded with extra '\0' bytes.
Our own implementation of strncat:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    char* my_strncat(char* dest, const char* src, size_t n)
    {
        assert(dest && src);
        char* tmp = dest;
        // Find the '\0' in the destination string first
        while (*dest)
        {
            dest++;
        }
        // Append at most n characters, stopping early at the '\0' of the source
        while (n && *src != '\0')
        {
            *dest++ = *src++;
            n--;
        }
        // Always terminate the result; no further padding is written
        *dest = '\0';
        return tmp;
    }

    int main()
    {
        char arr1[20] = "zbc\0kkkkkkkkkk";
        char arr2[] = "xxxx";
        char* ret = my_strncat(arr1, arr2, 2);
        printf("%s", ret); // zbcxx
        return 0;
    }
strncmp
The C library function int strncmp(const char *str1, const char *str2, size_t n) compares str1 and str2, up to the first n bytes.
The comparison stops when a differing character is found, when one of the strings ends, or when all n characters have been compared.
Header file | <string.h> |
---|---|
Declaration | int strncmp(const char *str1, const char *str2, size_t n) |
Parameter | str1 – the first string to compare |
Parameter | str2 – the second string to compare |
Parameter | n – the maximum number of characters to compare |
Return value 1 | If the return value < 0, str1 is less than str2 |
Return value 2 | If the return value > 0, str2 is less than str1 |
Return value 3 | If the return value = 0, str1 is equal to str2 within the first n characters |
Our own implementation of strncmp:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    int my_strncmp(const char* str1, const char* str2, size_t n)
    {
        assert(str1 && str2);
        while (n--)
        {
            if (*str1 != *str2)
            {
                // Compare as unsigned char, as the library function does
                return (unsigned char)*str1 - (unsigned char)*str2;
            }
            // Both strings ended before n characters were compared: equal
            if (*str1 == '\0')
            {
                return 0;
            }
            str1++;
            str2++;
        }
        return 0;
    }

    int main()
    {
        char arr1[20] = "xxxkkkkkkkkkk";
        char arr2[] = "xxxx";
        int ret = my_strncmp(arr1, arr2, 3);
        printf("%d", ret); // the first 3 characters match: 0
        return 0;
    }
String lookup functions:
strstr
The C library function char *strstr(const char *haystack, const char *needle) finds the position where the string needle first appears in the string haystack. The terminating '\0' of needle is not part of the comparison.
Usage: this function checks whether one string contains another. For example, to find out whether the string hello world contains world, you can use this function. If the substring is found, the address of its first occurrence is returned. If it is not found, a NULL pointer is returned.
Header file | <string.h> |
---|---|
Declaration | char *strstr(const char *haystack, const char *needle) |
Parameter | haystack – the C string to be searched |
Parameter | needle – the substring to search for within haystack |
Return value | This function returns a pointer to the position where needle first appears in haystack. If it is not found, it returns NULL |
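Our own implementation of strstr:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <assert.h>

    char* my_strstr(const char* str1, const char* str2)
    {
        assert(str1 && str2);
        const char* p1;
        const char* p2;
        const char* cp = str1;
        // An empty str2 matches at the start of str1, as in the library
        if (*str2 == '\0')
        {
            return (char*)str1;
        }
        while (*cp)
        {
            // Try to match str2 starting at the candidate position cp
            p1 = cp;
            p2 = str2;
            while (*p1 != '\0' && *p1 == *p2)
            {
                p1++;
                p2++;
            }
            // All of str2 was matched
            if (*p2 == '\0')
            {
                return (char*)cp;
            }
            cp++;
        }
        return NULL;
    }

    int main()
    {
        char arr1[20] = "xxxkkkkkkkkkk";
        char arr2[] = "";
        char* ret = my_strstr(arr1, arr2);
        // Passing NULL to %s is undefined, so check first
        if (ret != NULL)
        {
            printf("%s", ret);
        }
        return 0;
    }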
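The process works as follows: cp marks each candidate starting position in str1. For each candidate, p1 and p2 advance together while the characters match. If p2 reaches the terminating '\0' of str2, the whole of str2 has been matched and cp is returned. Otherwise cp moves forward by one character and the comparison starts again.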
strtok
The C library function char *strtok(char *str, const char *delim) splits the string str into a series of tokens, using the characters in delim as separators. strtok is a string-splitting function.
Notes:
1. The delim parameter is a string that defines the set of characters used as delimiters.
2. The first parameter specifies a string that contains zero or more tokens separated by one or more of the delimiters in delim.
3. strtok finds the next token in str, ends it with '\0', and returns a pointer to that token. (Note: strtok modifies the string it operates on, so the string passed to it is usually a temporary, modifiable copy.)
4. When the first parameter is not NULL, the function finds the first token in str and saves its position in the string.
5. When the first parameter is NULL, the function starts from the position saved in the same string and finds the next token.
6. If there are no more tokens in the string, a NULL pointer is returned.
Usage example:

    #define _CRT_SECURE_NO_WARNINGS 1
    #include <stdio.h>
    #include <string.h>

    int main()
    {
        char arr1[] = "This, Ordinary pointer@Blog.";
        char arr2[30];
        // Work on a temporary copy, because strtok modifies its input
        strcpy(arr2, arr1);
        // arr3 holds the delimiter characters
        char arr3[] = ",@.";
        char* p = NULL;
        for (p = strtok(arr2, arr3); p != NULL; p = strtok(NULL, arr3))
        {
            printf("%s\n", p);
        }
        return 0;
    }
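The string is split into three tokens, printed one per line: "This", " Ordinary pointer" (with a leading space, because the space is not a delimiter), and "Blog".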