[C++STL] Part 1: introduction and Simulation Implementation of string class

preface

1. The string class in the standard library

1.1 string class

  1. The string class represents a sequence of characters.
  2. The standard string class provides support for such objects. Its interface is similar to that of a standard container of bytes, with additional features designed specifically for operating on strings of single-byte characters.
  3. The string class uses char as its character type, together with its default char_traits and allocator types (for more information about the template, see basic_string).
  4. The string class is an instantiation of the basic_string class template: it instantiates basic_string with char, and uses char_traits and allocator as the default template arguments (for more information about the template, see basic_string).
  5. Note that this class handles bytes independently of the encoding used: if it is used to hold sequences of multi-byte or variable-length characters (such as UTF-8), all members of the class, such as length or size, and its iterators still operate in bytes rather than in actual encoded characters.

2.2 description of common interfaces in string class + simulation implementation

2.2.1 common construction + simulation implementation of string class objects
Function name | Function description
string(); | Construct an empty string
string (const string& str); | Copy constructor
string (const string& str, size_t pos, size_t len = npos); | Copy the portion of str that begins at position pos and spans len characters (or runs to the end of str if len is npos or str is too short)
string (const char* s); | Construct a string object from a C string
string (const char* s, size_t n); | Construct a string from the first n characters of the array pointed to by s
string (size_t n, char c); | Construct a string consisting of n copies of character c

Code demonstration:

#include<iostream>
#include<string>
using namespace std;

int main()
{
	string s1;
	string s4("hello world");
	string s5("hello world", 7);
	string s6(10, 'x');
	string s2(s4);
	string s3(s4, 6, 3);

	cout << "s1:"<< s1.c_str() << endl;
	cout << "s4:" << s4.c_str() << endl;
	cout << "s5:" << s5.c_str() << endl;
	cout << "s6:" << s6.c_str() << endl;
	cout << "s2:" << s2.c_str() << endl;
	cout << "s3:" << s3.c_str() << endl;	
}

Operation results:

s1:
s4:hello world
s5:hello w
s6:xxxxxxxxxx
s2:hello world
s3:wor

Simulation Implementation
Because some of the interfaces above are rarely used, I only simulate the commonly used ones.

string (const char* s)
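#include <cstring> // strlen, strcpy

namespace cxy
{
	class string
	{
	public:
		string(const char*s = "")
		{
			// Treat a null pointer as an empty string so that every member is initialized
			if (s==nullptr)
				s = "";
			_size = strlen(s);
			_capacity = _size;
			// One extra byte for the terminating '\0'
			_str = new char[_capacity + 1];
			strcpy(_str, s);
		}
		
		// const, so that it can also be called on const objects
		const char* c_str() const
		{
			return _str;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}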
string (const string& str)
void swap (string& str)
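#include <cstring> // strlen, strcpy
#include <utility> // std::swap

namespace cxy
{
	class string
	{
	public:
		void swap(string& str)
		{
			// Call the library swap on each member; the std:: qualifier
			// keeps the call from resolving to this member function
			std::swap(_size, str._size);
			std::swap(_capacity, str._capacity);
			std::swap(_str, str._str);
		}

		string(const char*s = "")
		{
			// Treat a null pointer as an empty string so that every member is initialized
			if (s==nullptr)
				s = "";
			_size = strlen(s);
			_capacity = _size;
			_str = new char[_capacity + 1];

			strcpy(_str, s);
		}

		// Initializers are listed in the order in which the members are declared
		string(const string& str)
			:_size(0), _capacity(0), _str(nullptr)
		{
			// Build a temporary from the C string, then take over its buffer
			string tmp(str._str);
			swap(tmp);
		}

		const char* c_str() const
		{
			return _str;
		}

	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}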
2.2.2 capacity operation + simulation implementation of string object
Function | Function description
size_t size() const; (key point) | Returns the number of valid characters in the string
size_t length() const; | Returns the number of valid characters in the string
size_t capacity() const; | Returns the total size of the allocated space
bool empty() const; (key point) | Returns true if the string is empty, otherwise false
void clear(); (key point) | Clears the valid characters
void reserve (size_t n = 0); (key point) | Requests a change in capacity, reserving space for the string
void resize (size_t n, char c); (key point) | Changes the number of valid characters to n; any added positions are filled with character c

Code demonstration:

int main()
{
	string s1("hello world");
	cout <<"s1.size(): " <<s1.size() << endl;
	cout <<"s1.length(): "<< s1.length() << endl;
	cout <<"s1.capacity(): "<<s1.capacity() << endl;
	cout <<"s1:"<< s1 << endl;
	cout << endl;

	s1.clear();
	cout <<"s1:"<< s1 << endl;
	cout << "s1.size(): " << s1.size() << endl;
	cout << "s1.capacity(): " << s1.capacity() << endl;
	cout << endl;

	s1 = "hello world";
	cout << "s1:" << s1 << endl;
	cout << "s1.size(): " << s1.size() << endl;
	cout << "s1.capacity(): " << s1.capacity() << endl;
	s1.resize(17,'x');
	//If n > capacity, the capacity is expanded first; the new positions from the old size up to n - 1 are filled with 'x'
	cout << "s1:" << s1 << endl;
	cout << "s1.size(): " << s1.size() << endl;
	cout << "s1.capacity(): " << s1.capacity() << endl;
	s1.resize(27, 'x');
	//If size < n <= capacity, no expansion is needed; the new positions from the old size up to n - 1 are filled with 'x'
	cout << "s1:" << s1 << endl;
	cout << "s1.size(): " << s1.size() << endl;
	cout << "s1.capacity(): " << s1.capacity() << endl;
	s1.resize(5, 'x');
	//If n < size, only the first n characters are kept, and the capacity remains unchanged
	cout << "s1:" << s1 << endl;
	cout << "s1.size(): " << s1.size() << endl;
	cout << "s1.capacity(): " << s1.capacity() << endl;
	cout << endl;

	string s2("hello world");
	s2.reserve(5);
	//When n <= capacity, the content does not change and the capacity is normally left as it is (a request to shrink is not binding)
	cout << "s2:" << s2 << endl;
	cout << "s2.size(): " << s2.size() << endl;
	cout << "s2.capacity(): " << s2.capacity() << endl;

	s2.reserve(100);
	//When n > capacity, the space increases
	cout << "s2:" << s2 << endl;
	cout << "s2.size(): " << s2.size() << endl;
	cout << "s2.capacity(): " << s2.capacity() << endl;
}

Operation results:

Note:
The difference between reserve and resize: reserve only changes the capacity and never affects the content, whereas resize changes the number of valid characters and therefore the content.

Simulation Implementation

size_t size() const
Returns the valid length of a string
namespace cxy
{
	class string
	{
	public:
		size_t size()const
		{
			return _size;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
size_t capacity() const
Returns the size of the space
namespace cxy
{
	class string
	{
	public:
		size_t capacity()const
		{
			return _capacity;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
bool empty() const
Returns true if the string is empty (its size is 0); otherwise returns false
namespace cxy
{
	class string
	{
	public:
		bool empty()const
		{
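			// Empty means that there are no valid characters; comparing the pointer _str with 0 would not test that
			return _size == 0;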
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
void clear()
Clear valid characters without changing capacity
namespace cxy
{
	class string
	{
	public:
		void clear()
		{
			_size = 0;
			_str[_size] = '\0';
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
void reserve (size_t n = 0)
Requests a change in capacity. This function does not affect the length of the string and cannot change its content
  • If n is greater than the current capacity, the function makes the container increase its capacity to n characters (or more)
  • If n is less than or equal to the current capacity, this implementation does nothing
namespace cxy
{
	class string
	{
	public:
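		void reserve(size_t n=0)
		{
			if (n > _capacity)
			{
				// Allocate n + 1 bytes: n characters plus the terminating '\0'
				char *tmp = new char[n + 1];
				// Copy the _size valid characters together with the terminating '\0'
				strncpy(tmp,_str,_size+1);
				delete[]_str;

				_str = tmp;
				_capacity = n;
			}
		}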
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}

Supplement: strncpy is a function in C language

char * strncpy ( char * destination, const char * source, size_t num )

Function:

  1. Copies the first num characters of source to destination. If the end of source (a null character) is reached before num characters have been copied, destination is padded with zeros until a total of num characters have been written.
  2. If source has num or more characters before its null terminator, no null character ('\0') is implicitly appended to the end of destination.
  3. In that case destination must not be treated as a null-terminated C string (reading it as one would overflow), so remember to add '\0' at the end yourself.
void resize (size_t n, char c)
void resize (size_t n)
Changes the number of valid characters to n; any added positions are filled with character c
  • Resizes the string to a length of n characters.
  • If n is less than the current string length, the value is shortened to its first n characters, removing the characters beyond the nth.
  • If n is greater than the current string length, the content is extended by appending as many characters c as needed to reach a size of n.
  • If c is specified, the new elements are initialized as copies of c; otherwise, they are value-initialized characters (null characters).
namespace cxy
{
	class string
	{
	public:
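		void resize(size_t n,char c='\0')
		{
			if (n<_size)
			{
				// Shrink: keep the first n characters; the capacity is unchanged
				_size = n;
				_str[_size] = '\0';
			}
			else
			{		
				if (n > _capacity)
				{
					reserve(n);
				}
				// Grow: fill positions _size .. n - 1 with c (memset is declared in <cstring>)
				memset(_str + _size, c, n - _size);			
				_size = n;
				_str[_size] = '\0';
			}
		}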
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}

Supplement: memset is a function in C language

void * memset ( void * ptr, int value, size_t num )

Function:

  • Sets the first num bytes of the memory block pointed to by ptr to value (converted to unsigned char), and returns ptr.

[summary]

  1. The underlying implementations of size() and length() are exactly the same. size() was introduced to keep the interface consistent with the other containers, and in practice size() is the one generally used.
  2. clear() only clears the valid characters in the string; it does not change the size of the underlying space.
  3. Both resize(size_t n) and resize(size_t n, char c) change the number of valid characters in the string to n. The difference appears when the number of characters increases: resize(n) fills the extra positions with '\0', while resize(size_t n, char c) fills them with character c. Note: when resize increases the number of elements, the underlying capacity may change as well.
  4. reserve(size_t res_arg=0) reserves space for the string without changing the number of valid elements. When the argument is less than the current capacity of the string, reserve normally leaves the capacity unchanged.
2.2.3 string object access and traversal operation + simulation implementation
Function | Function description
char& operator[] (size_t pos); and const char& operator[] (size_t pos) const; | Returns the character at position pos; the const version is the one called for const string objects
iterator begin(); | Returns an iterator to the first character
iterator end(); | Returns an iterator to the position after the last character
Range for | C++11 adds the range for, a more concise way to traverse

Code demonstration:

int main()
{
	string s("hello world");
	cout << "operator[] :";
	for (size_t i = 0; i < s.size(); i++)
		cout << s[i] ;
	cout << endl;
	//iterator 
	string::iterator it = s.begin();
	cout << "iterator :";
	while (it != s.end())
	{
		cout << *it ;
		++it;
	}
	cout << endl;
	//Range for
	cout << "Range for :";
	for (auto ch : s)
	{
		cout << ch ;
	}
	cout << endl;
}


Simulation Implementation

const char& operator[] (size_t pos) const
namespace cxy
{
	class string
	{
	public:
		const char& operator[](size_t pos)const
		{
			assert(pos < _size);
			return _str[pos];
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
iterator begin()
iterator end()
namespace cxy
{
	class string
	{
	public:
		typedef char* iterator;		
		iterator begin()
		{
			return _str;
		}
		iterator end()
		{
			return _str+_size;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
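C++11: range for

It needs no separate implementation here: the compiler expands a range for into a loop over begin() and end(), so it works as soon as the two iterator functions above exist. Just know how to use it.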

2.2.4 modification operation of string class object + simulation implementation
functionFunction description
Function | Function description
void push_back (char c); | Appends character c to the end of the string
string& append (const char*s); | Appends the C string s to the end of the string
string& operator+= (const char* s); | Appends the C string s to the end of the string
const char* c_str() const; | Returns the content as a C-style string
size_t find (const char* s, size_t pos = 0) const; | Searches for the string s starting at position pos, and returns the position of its first occurrence
string& erase (size_t pos = 0, size_t len = npos); | Erases part of the string, reducing its length

Code demonstration:

int main()
{
	string s("hello world");
	s.push_back('K');
	cout << s << endl;
	s.append("SSSSS");
	cout << s << endl;
	s += "FF";
	cout << s << endl;
	cout << s.find("KSS") << endl;
	s.erase(11, 8);
	cout << s << endl;
}

Operation results:

hello worldK
hello worldKSSSSS
hello worldKSSSSSFF
11
hello world

Simulation Implementation: only some common interfaces are implemented

void push_back (char c)
Appends character c to the end of the string
namespace cxy
{
	class string
	{
	public:
		void push_back(char c)
		{
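			if (_size == _capacity)
			{
				// Doubling a capacity of 0 would still give 0, so start from a small non-zero capacity
				reserve(_capacity == 0 ? 4 : _capacity * 2);
			}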
			_str[_size] = c;
			_str[_size+1] = '\0';
			_size++;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
string& append (const char*s)
Appends the C string s to the end of the string
namespace cxy
{
	class string
	{
	public:
		string &append(const char*s)
		{
			size_t len = strlen(s)+_size;
			if (len > _capacity)
			{
				reserve(len);
			}
			strncpy(_str + _size, s, len - _size+1);
			_size = len;
			return *this;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
string& operator+= (const char* s)
Appends the C string s to the end of the string
namespace cxy
{
	class string
	{
	public:
		string& operator+=(const char*s)
		{
			append(s);
			return *this;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
const char* c_str() const
Returns the content as a C-style string
namespace cxy
{
	class string
	{
	public:
		const char* c_str()const
		{
			return _str;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
size_t find (const char* s, size_t pos = 0) const
Searches for the string s starting at position pos, and returns the position of its first occurrence
namespace cxy
{
	class string
	{
	public:
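		size_t find(const char*s,size_t pos=0)const
		{
			// A start position past the end can never match
			if (pos > _size)
				return -1;
			const char *str = _str+pos;
			while (*str)
			{
				const char* str_s = str;
				const char* tmp = s;
				// Compare s against the text that starts at str
				while (*str_s&&*tmp==*str_s)
				{
					tmp++;
					str_s++;
				}
				// Reaching the end of s means that all of s matched
				if (*tmp=='\0')
					return str - _str;
				else
					str++;
			}
			// Not found: -1 converts to the largest size_t value, the same value as std::string::npos
			return -1;
		}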
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
string& erase (size_t pos = 0, size_t len = npos)
Erase a part of a string and reduce its length
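// npos is the largest value of size_t (-1 converted to unsigned); as len it means "to the end of the string"
static const size_t npos = -1;

namespace cxy
{
	class string
	{
	public:
		string &erase(size_t pos = 0, size_t len = npos)
		{
			assert(pos < _size);

			// Compare len with _size - pos; computing len + pos would overflow when len is npos
			if (len >= _size - pos)
			{
				_str[pos] = '\0';
				_size = pos;
			}
			else
			{
				// The two ranges overlap, so memmove (from <cstring>) is used instead of strcpy;
				// the + 1 moves the terminating '\0' as well
				memmove(_str + pos, _str + pos + len, _size - pos - len + 1);
				_size -= len;
			}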
			return *this;
		}
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}
2.2.5 string class non member function + simulation implementation
functionFunction introduction
Function | Function introduction
istream& operator>> (istream& is, string& str); (key point) | Input operator overload
ostream& operator<< (ostream& os, const string& str); (key point) | Output operator overload
istream& getline (istream& is, string& str); (key point) | Reads one line into the string

We are already familiar with the above, so we will not do code demonstration here, but directly simulate the implementation.

Simulation Implementation

istream& operator>> (istream& is, string& str)
namespace cxy
{
	class string
	{
	public:
		
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
	
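	// Requires <cctype> for isspace and <cstdio> for EOF
	istream& operator >> (istream& is, string& str)
	{
		str.clear();
		// get() returns an int so that EOF can be told apart from any character
		int ch = is.get();
		// Skip leading whitespace, as the standard operator>> does
		while (ch != EOF && isspace(ch))
		{
			ch = is.get();
		}
		// Read until the next whitespace character or the end of input
		while (ch != EOF && !isspace(ch))
		{
			// push_back is used because only operator+=(const char*) is implemented above
			str.push_back(static_cast<char>(ch));
			ch = is.get();
		}
		return is;
	}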
}

Note: this function is implemented as a non-member because the stream is must be the left operand. If it were implemented inside the string class, the left operand would be the this pointer, that is, the str object, and the call would have to be written as str >> is, which is very awkward.

ostream& operator<< (ostream& os, const string& str);
namespace cxy
{
	class string
	{
	public:
		
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
	
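	ostream& operator<< (ostream& os, const string& str)
	{
		// size() and the const operator[] are used so that const objects can be printed;
		// a range for would need const versions of begin() and end()
		for (size_t i = 0; i < str.size(); i++)
		{
			os << str[i];
		}
		return os;
	}	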
}

istream& getline (istream& is, string& str)
Get one line of string
namespace cxy
{
	class string
	{
	public:
		
	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};	
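	istream&getline(istream&is ,string&s)
	{
		s.clear();
		// get() returns an int so that EOF (defined in <cstdio>) can be told apart from any character
		int ch = is.get();
		// Read up to the end of the line or the end of input; the '\n' itself is consumed but not stored
		while (ch != EOF && ch != '\n')
		{
			// push_back is used because only operator+=(const char*) is implemented above
			s.push_back(static_cast<char>(ch));
			ch = is.get();
		}
		return is;
	}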
}

summary

That is all for this part. I have only briefly introduced some of the interfaces of string in the STL, together with some rough simulated implementations. As I am a beginner, there are bound to be weak spots and errors. I hope readers will point them out so that we can make progress together. Goodbye.

Keywords: C++ STL

Added by pankirk on Fri, 14 Jan 2022 21:20:27 +0200