[C++ Basics: the string class] Documentation overview | Capacity operations

Introduction to string

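Preface
1. Strings in C
2. The string class in the standard library
- 2.1 Capacity operations on string objects
- 2.2 Access and traversal operations on string objects
- - Access operations: [ ] and at
A mock implementation of string internals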

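Preface

A string is essentially a sequential list (a dynamic array) that manages a character array. Because character arrays are used so widely, C++ provides a dedicated string class, and because of character-encoding differences it is written as a template. A typical string implementation has three members: char* _str, size_t _size, and size_t _capacity.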


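1. Strings in C

In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of functions, but these functions are separate from the string data they operate on, which does not fit OOP (Object Oriented Programming) well. The caller also has to manage the underlying storage, and a small slip can lead to an out-of-bounds access.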
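To make the contrast concrete, here is a minimal sketch that builds the same text both ways. The names join_c and join_cpp are made up for this illustration; they are not library functions. With the C functions the caller has to size the buffer correctly, while std::string manages its own storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C style: the caller owns the buffer and must guarantee it is large enough.
// Returns false instead of overflowing when the result would not fit.
bool join_c(char* out, std::size_t out_size, const char* a, const char* b)
{
    assert(out != nullptr);
    std::size_t needed = std::strlen(a) + std::strlen(b) + 1; // +1 for '\0'
    if (needed > out_size)
        return false; // strcpy/strcat would have written past the buffer
    std::strcpy(out, a);
    std::strcat(out, b);
    return true;
}

// C++ style: std::string grows its own storage as needed.
std::string join_cpp(const std::string& a, const std::string& b)
{
    return a + b;
}
```

The size check in join_c is exactly the bookkeeping that is easy to forget in C; join_cpp needs no equivalent.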

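2. The string class in the standard library

Documentation overview of the string class

Strings are objects that represent sequences of characters.
The standard string class provides support for such objects, with an interface similar to that of a standard container of bytes, plus features designed specifically for operating on strings of single-byte characters.
The string class uses char (i.e., bytes) as its character type, with its default char_traits and allocator types.
The string class is an instantiation of the basic_string class template: it instantiates basic_string with char, and uses char_traits and allocator as the default template arguments.
Note that this class handles bytes independently of the encoding used: if it is used to hold sequences of multi-byte or variable-length characters (such as UTF-8), all members of the class (such as length or size), as well as its iterators, still operate in terms of bytes, not actual encoded characters.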

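Summary:
1. string is the class that represents character strings.
2. Its interface is largely the same as that of the regular containers, with some additional operations specific to strings.
3. Under the hood, string is an alias for an instantiation of the basic_string template: typedef basic_string<char, char_traits<char>, allocator<char>> string;
4. It is not aware of multi-byte or variable-length character sequences; it stores them, but operates on them byte by byte. To use the string class, write #include <string>, and either add using namespace std; or spell the name as std::string.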
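The byte-oriented behavior is easy to observe. The sketch below is an illustration only: utf8_code_points and sample_utf8 are hypothetical helpers, and the counting function assumes its input is valid UTF-8. The two Chinese characters "你好" take three bytes each in UTF-8, so size() reports 6 even though there are 2 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// std::string is basic_string<char> with the default traits and allocator.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");

// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes valid UTF-8; no validation is performed.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// "你好" spelled out as explicit UTF-8 bytes, so the encoding of the
// source file does not affect the result.
std::string sample_utf8()
{
    return std::string("\xE4\xBD\xA0\xE5\xA5\xBD");
}
```

Calling sample_utf8().size() gives the byte count, while utf8_code_points(sample_utf8()) gives the character count; string itself only ever knows the former.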

2.1 Capacity operations on string objects

#include<string>
#include<iostream>
using namespace std;
void test_string1()
{
	// 1. size | length
	string s1("hello world");
	cout << s1.size() << endl;
	cout << s1.length() << endl;
	cout << "----------cut1----------" << endl;
	// 2. max_size
	string s2;
	cout << s1.max_size() << endl;
	cout << s2.max_size() << endl;	
	cout << "----------cut2----------" << endl;
	// 3. capacity
	cout << s1.capacity() << endl;
	cout << "----------cut3----------" << endl;
	// 4. resize
	string s3("hello world");
	cout << s3.size() << endl;
	cout << s3 << endl;
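	//s3.resize(20); // n exceeds the current length and no c is given, so: hello world\0\0\0\0...
	//s3.resize(5);  // n is less than the current length, so the characters from position n onward are removed
	s3.resize(20, 'x'); // n exceeds the current length and c is given, so: hello worldxxxx...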
	cout << s3.size() << endl;
	cout << s3 << endl;
	cout << "----------cut4----------" << endl;
	// 5. reserve
	string s4("hello world");
	s4.reserve(20);
	cout << s4 << endl;
	cout << s4.size() << endl;
	cout << s4.capacity() << endl;
	s4.reserve(10);
	cout << s4 << endl;
	cout << s4.size() << endl;
	cout << s4.capacity() << endl;
	cout << "----------cut5----------" << endl;
	// 6. clear | empty
	string s5("hello world");
	cout << s5 << endl;
	cout << s5.empty() << endl;
	s5.clear();
	cout << s5 << endl;
	cout << s5.empty() << endl;
	cout << "----------cut6----------" << endl;
	// 7. shrink_to_fit (not demonstrated for now)
}   
void test_string2()
{
	string s;
	size_t sz = s.capacity();
	cout << "making s grow:\n" << sz << endl;
	for(int i = 0; i < 500; ++i)
	{
		s.push_back('c');
		if(sz != s.capacity())
		{
			sz = s.capacity();
			cout << "capacity changed:" << sz << '\n';
		}
	}
	cout << "----------cut7----------" << endl;
}
int main()
{
	test_string1();
	test_string2();

	return 0;
}

📝 Notes

size || length ❗
The two do the same thing; size is the one used more often.

string was designed before the STL conventions existed, which is why it is not listed under Containers in the reference documentation.
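Early on the goal was to measure the length of a string, so the original interface was named length. size was added later for consistency with the containers: for tree-based containers such as map and set, calling the number of elements a "length" does not fit.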

max_size ❗

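max_size is another early design and has little practical use. Its intent is to tell you the longest string you could ever create, but that cannot be defined precisely, because it depends on too many factors (the implementation, the platform, the memory actually available). In the environment used here it simply reports 2^32, i.e. 4G; the exact value is implementation-defined. It carries little information, so you can skip it when you run into it.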

capacity ❗

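For a string object, a capacity of 15 means it actually holds 16 bytes of storage, because one position is reserved for \0. The capacity grows as the string grows; the default here (under VS) is 15.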
resize ❗

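resize has two overloads: resize(n) changes the length of the string to n, and resize(n, c) does the same but fills any newly added characters with c.

If n is less than the current length, the string is truncated to its first n characters; everything from position n onward is removed.

If n is greater than the current length, the string is extended by appending as many characters as needed to reach a length of n. If c is given, the new characters are copies of c; otherwise they are value-initialized characters (the null character \0).

After s3.resize(20), s3[19] is a valid position: size is now 20, and a checked operator[] does assert(pos < _size).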
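The three cases can be checked with a small helper. This is a sketch only; resized is a hypothetical name introduced for the illustration, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns a copy of s resized to n characters.
// When the string grows, the new characters are copies of fill.
std::string resized(std::string s, std::size_t n, char fill = '\0')
{
    s.resize(n, fill);
    assert(s.size() == n); // resize always leaves size equal to n
    return s;
}
```

For example, resized("hello world", 5) yields "hello" and resized("hello", 8, 'x') yields "helloxxx". resized("hello", 8) has size 8 with three trailing \0 characters, so it does not compare equal to "hello" even though it prints the same.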
reserve ❗

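Requests a change in capacity.

Note that you do not necessarily get exactly the capacity you ask for. Each implementation applies its own growth rules, so the resulting capacity is not necessarily equal to the requested value: it may be equal, or it may be larger.

If n is greater than the current capacity, the string grows its storage.

If n is less than or equal to the current capacity, the behavior depends on the platform. The documentation describes this case as a non-binding request to shrink: the implementation may shrink (allocate a new block, copy the data, and release the old block), or it may leave the storage, and therefore the capacity, unchanged. In the author's tests neither VS nor Linux g++ shrank. The choice is left to the library implementers, and since C++20 the standard specifies that reserve never reduces capacity.

After s4.reserve(20), s4[19] is an invalid position: reserve changes only the capacity, size is still 11, and a checked operator[] does assert(pos < _size).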
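The difference between the s3[19] and s4[19] cases can be verified without triggering an assertion by using at(), which throws std::out_of_range for an invalid position. The helper names below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true when s.at(pos) succeeds, false when it throws std::out_of_range.
bool in_range(const std::string& s, std::size_t pos)
{
    try
    {
        (void)s.at(pos);
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}

// reserve(n) only requests storage: size and contents stay the same.
bool accessible_after_reserve(std::string s, std::size_t n, std::size_t pos)
{
    const std::size_t old_size = s.size();
    s.reserve(n);
    assert(s.size() == old_size);
    assert(s.capacity() >= n);
    return in_range(s, pos);
}

// resize(n) changes size itself, so every position below n becomes valid.
bool accessible_after_resize(std::string s, std::size_t n, std::size_t pos)
{
    s.resize(n);
    assert(s.size() == n);
    return in_range(s, pos);
}
```

With "hello world" and n = 20, position 19 is accessible after resize but not after reserve.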
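In the author's run on Windows with VS, the initial capacity is 15, and apart from the first growth the capacity grows by roughly 1.5x each time.

What about g++?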
Growth rules differ between compilers: in the author's run on Linux with g++, the initial capacity is 0, and the capacity then doubles each time.

clear | empty ❗

clear empties the string | empty tests whether the string is empty
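A small sketch of what clear does and does not guarantee. ClearResult and observe_clear are hypothetical names used only here. The size becomes 0 and empty() becomes true; what happens to the capacity is not spelled out by the standard, although common implementations keep it, so the sketch records it rather than asserting a value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ClearResult
{
    bool empty_before;
    bool empty_after;
    std::size_t size_after;
    std::size_t capacity_before; // recorded for inspection only
    std::size_t capacity_after;  // usually unchanged, but not guaranteed
};

// Clears a copy of s and reports what changed.
ClearResult observe_clear(std::string s)
{
    ClearResult r;
    r.empty_before = s.empty();
    r.capacity_before = s.capacity();
    s.clear();
    r.empty_after = s.empty();
    r.size_after = s.size();
    r.capacity_after = s.capacity();
    return r;
}
```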
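What are resize and reserve good for ❓

Use resize when you need to allocate space and also initialize it: s.resize(20, 'x');

Use reserve when you know in advance how much space is needed, so the space can be allocated up front and the cost of repeated growth is avoided: s.reserve(500);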
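The saving from reserve can be measured by counting how often capacity changes while appending. The function name count_capacity_changes is an assumption made for this sketch; the count is a proxy for the number of reallocations, and the exact number without reserve depends on the implementation's growth rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends n characters one at a time and counts how often capacity changes.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first)
{
    std::string s;
    if (reserve_first)
        s.reserve(n); // one allocation up front
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        s.push_back('c');
        if (s.capacity() != last)
        {
            last = s.capacity();
            ++changes;
        }
    }
    return changes;
}
```

count_capacity_changes(500, true) returns 0, because the storage requested up front is already large enough. count_capacity_changes(500, false) returns a positive number that varies between standard libraries.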

2.2 Access and traversal operations on string objects

Access operations: [ ] and at


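[ ] is the more commonly used of the two. They differ in how they treat an invalid index: at throws std::out_of_range, while [ ] performs no check in release builds (debug builds typically assert).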

#include<string>
#include<iostream>
using namespace std;
int main()
{
	string s1("hello world");

	// Both read the character at index 4
	cout << s1[4] << endl;
	cout << s1.at(4) << endl;
	return 0;
}
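Because at reports a bad index by throwing, it can be wrapped into a lookup that never fails. char_at_or is a hypothetical helper written for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns s.at(pos), or fallback when pos is out of range.
char char_at_or(const std::string& s, std::size_t pos, char fallback)
{
    try
    {
        return s.at(pos); // throws std::out_of_range when pos >= s.size()
    }
    catch (const std::out_of_range&)
    {
        return fallback;
    }
}
```

For "hello world", index 4 gives 'o', and index 11 (one past the last character) gives the fallback. The same out-of-range index with [ ] would not throw, so it cannot be handled this way.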

A mock implementation of string internals

#pragma once
#include<iostream>
#include<utility>   // std::swap
#include<string.h>
#include<assert.h>
using namespace std;
namespace bit {
	class string {
	public:
		
		typedef char* iterator;
		typedef const char* const_iterator;
		// The const overloads return const_iterator, so a const string cannot be modified through them
		const_iterator begin() const
		{
			return _str;
		}
		const_iterator end() const
		{
			return _str + _size;
		}
		iterator begin()
		{
			return _str;
		}
		iterator end()
		{
			return _str+_size;
		}

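		// Default constructor
		/*string() 
			:_str(nullptr)
			,_size(0)
			,_capacity(0)
		{}*/
		// Constructor from a C string
		/*string(const char* str)
			:_str(new char[strlen(str)+1])
			,_size(strlen(str))
			,_capacity(strlen(str))
		{
		}*/
		// Note: the default argument must be "" (an empty C string); '\0' is a char and will not work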
		string(const char* str="")
			: _size(strlen(str))
		{
			_capacity = _size;
			_str = new char[_capacity + 1];
			strcpy(_str, str);
		}
		~string()
		{
			delete[] _str;
			_str = nullptr;
			_size = _capacity = 0;
		}
		const char* c_str() const
		{
			return _str;
		}
		// Traversal
		size_t size() const 
		{
			return _size;
		}

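		char& operator[](size_t pos) 
		{
			// Bounds check: assert(pos) would wrongly reject index 0 and accept any other index
			assert(pos < _size);
			return _str[pos];
		}

		const char& operator[](size_t pos) const
		{
			assert(pos < _size);
			return _str[pos];
		}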

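		void reserve(size_t n)
		{
			// Only grow; a request that does not exceed the current capacity is ignored
			if (n > _capacity)
			{
				char* tmp = new char[n+1];
				// Copy _size characters plus the terminating '\0'
				memcpy(tmp, _str, _size + 1);
				delete[] _str;
				_str = tmp;
				// Update capacity only when the buffer was actually reallocated
				_capacity = n;
			}
		}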

		void push_back(char ch)
		{
			// Grow when full
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : 2 * _capacity);
			}
			_str[_size] = ch;
			_size++;
			_str[_size] = '\0';
		}
		void append(const char* str)
		{
			/*size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}
			strcpy(_str + _size, str);
			_size += len;*/
			insert(_size, str);
		}
		void insert(size_t pos, char ch)
		{
			assert(pos <= _size);
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : 2 * _capacity);
			}
			size_t end = _size+1;
			while (end > pos)
			{
				_str[end] = _str[end-1];
				end--;
			}
			_str[pos] = ch;
			++_size;
		}
		void resize(size_t n, char ch = '\0')
		{
			if (n <= _size)
			{
				_str[n] = '\0';
				_size = n;
			}
			else
			{
				reserve(n);
				for (size_t i = _size; i < n; i++)
				{
					_str[i] = ch;
				}
				_str[n] = '\0';
				_size = n;
			}
		}
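		// A deep copy avoids the problems caused by shallow copying
		// Copy constructor
		// s1(s2)
	/*	string(const string& s)
		{
			_str = new char[s._capacity + 1];
			strcpy(_str, s._str);
			_size = s._size;
			_capacity = s._capacity;
		}*/
		string(const string& s)
			: _str(nullptr)
			, _size(0)
			, _capacity(0)
		{
			// Members start in a valid empty state, so after the swap
			// tmp's destructor calls delete[] on nullptr, which is safe
			string tmp(s._str);
			swap(tmp);
		}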
		//s1=s2
		string& operator=(const string& s)
		{
			char* tmp = new char[s._capacity + 1];
			strcpy(tmp, s._str);
			delete[] _str;
			_str = tmp;
			_size = s._size;
			_capacity = s._capacity;
			return *this;
		}

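		// Checked 6.10: insert the string str at position pos
		void insert(size_t pos, const char* str)
		{
			assert(pos <= _size);
			size_t len = strlen(str);
			if (len == 0)
			{
				return; // nothing to insert; also keeps pos+len-1 below from wrapping around
			}
			// Grow if needed
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}
			// Shift the tail (including '\0') right by len positions, back to front
			size_t end = _size + len;
			while (end > pos+len-1)
			{
				_str[end] = _str[end - len];
				end--;
			}
			strncpy(_str + pos, str,len);
			_size += len;
		}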

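		void erase(size_t pos, size_t len = npos)
		{
			assert(pos <= _size);
			// Erase everything from pos to the end
			if (len == npos || len >= _size - pos)
			{
				_str[pos] = '\0';
				_size = pos;
			}
			else
			{
				// The ranges overlap, so use memmove; the +1 also moves the '\0'
				memmove(_str + pos, _str + pos + len, _size - pos - len + 1);
				_size -= len;
			}
		}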

		void swap(string& s)
		{
			std::swap(_str, s._str);
			std::swap(_size, s._size);
			std::swap(_capacity, s._capacity);
		}
		// Returns the index if found, otherwise npos
		size_t find(char ch,size_t pos=0) const
		{
			assert(pos <= _size);
			for (size_t i = pos; i < _size; i++)
			{
				if (_str[i] == ch)
					return i;
			}
			return npos;
		}
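		// Find a substring, starting the search at pos
		size_t find(const char* sub, size_t pos = 0) const
		{
			assert(pos <= _size);
			// strstr returns a null pointer when there is no match
			const char* p=strstr(_str + pos, sub);
			if (p)
			{
				return p - _str;
			}
			else
			{
				return npos;
			}
		}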
		// Extract a substring
		// Returned by value: a reference to the local sub would dangle
		string substr(size_t pos=0, size_t len = npos) const
		{
			assert(pos <= _size);
			string sub;
			if (len == npos||len>_size-pos)
			{
				for (size_t i = pos; i < _size; i++)
				{
					sub += _str[i];
				}
			}
			else
			{
				for (size_t i = pos; i < pos+len; i++)
				{
					sub += _str[i];
				}
			}
			return sub;
		}
		string& operator+=(char ch)
		{
			push_back(ch);
			return *this;
		}
		string& operator+=(const char* str)
		{
			append(str);
			return *this;
		}

		void clear()
		{
			_size = 0;
			_str[_size] = '\0';
		}
	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	public:
		static const size_t npos;
	};

	// -1 converts to the largest size_t value, like std::string::npos
	const size_t string::npos = -1;

	// A non-template function: when it shares a name with the library's swap template, overload resolution prefers this one
	void swap(string& x, string& y)
	{
		x.swap(y);
	}
	bool operator==(const string& s1,const string& s2)
	{
		int ret = strcmp(s1.c_str(), s2.c_str());
		return ret == 0;
	}
	bool operator<(const string& s1, const string& s2)
	{
		int ret = strcmp(s1.c_str(), s2.c_str());
		return ret < 0;
	}

	bool operator<=(const string& s1, const string& s2)
	{
		return s1 < s2 || s1 == s2;
	}

	bool operator>(const string& s1, const string& s2)
	{
		return !(s1 <= s2);
	}

	bool operator>=(const string& s1, const string& s2)
	{
		return !(s1 < s2);
	}

	bool operator!=(const string& s1, const string& s2)
	{
		return !(s1 == s2);
	}

	ostream& operator<<(ostream& out, const string& s) // by const reference: avoids a deep copy on every print
	{
		for (auto ch : s)
		{
			out << ch;
		}
		return out;
	}

	istream& operator>>(istream& in, string& s)
	{
		s.clear();
		char ch;
		// Collect characters in a local buffer and append in chunks to limit reallocations
		char buff[128];
		size_t i = 0;
		// in.get(ch) fails at end of input, so the loop also stops on EOF
		while (in.get(ch) && ch != ' ' && ch != '\n')
		{
			buff[i++] = ch;
			if (i == 127)
			{
				buff[127] = '\0';
				s += buff;
				i = 0;
			}
		}
		if (i > 0)
		{
			buff[i] = '\0';
			s += buff;
		}
		return in;
	}

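	istream& getline(istream& in, string& s)
	{
		s.clear();
		char ch;
		// Collect characters in a local buffer and append in chunks to limit reallocations
		char buff[128];
		size_t i = 0;
		// Read up to the newline; in.get(ch) fails at end of input, so the loop also stops on EOF
		while (in.get(ch) && ch != '\n')
		{
			buff[i++] = ch;
			if (i == 127)
			{
				buff[127] = '\0';
				s += buff;
				i = 0;
			}
		}
		if (i > 0)
		{
			buff[i] = '\0';
			s += buff;
		}
		return in;
	}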

	void test_string1() 
	{
		string s1("hello world");
		//cout << s1 << endl;
		// size() returns size_t, so use size_t for the index to avoid a signed/unsigned comparison
		for (size_t i = 0; i < s1.size(); i++)
		{
			 s1[i]++;
		}
		for (size_t i = 0; i < s1.size(); i++)
		{
			cout << s1[i] <<" ";
		}
	}


	void test_string2()
	{
		string s1("hello world");
		string::iterator it1 = s1.begin();
		while (it1 != s1.end())
		{
			//*it1 -= 3;
			cout << *it1 << " ";
			++it1;
		}
		cout << endl;
		for (auto ch : s1)
		{
			cout << ch << " ";
		}
		cout << endl;
		string s2("x x x x");
		for (auto ch : s2)
		{
			cout << ch << " ";
		}
		cout << endl;
	}
	void test_string3()
	{
		string s3("hello world");
		s3.push_back('1');
		s3.push_back('2');
		cout << s3.c_str() << endl;
		s3 += '4';
		cout << s3.c_str() << endl;
		s3 += "333";
		cout << s3.c_str() << endl;
		s3.insert(3, "xxxxxx");
		cout << s3.c_str() << endl;
		s3.erase(6, 3);
		cout << s3.c_str() << endl;
		s3.resize(5);
		cout << s3.c_str() << endl;
		s3.resize(20, 'x');
		cout << s3.c_str() << endl;
	}
	void test_string4()
	{
		string s4("hello world");
		s4.insert(6, "xxxxxx");
		string s5("shijiaqing");
		cout << s4.c_str() << endl;
		cout << s5.c_str() << endl;
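		// The library's generic swap costs three deep copies plus a destruction, so it is not recommended for string
		// (this unqualified call resolves to the bit::swap overload, which forwards to the member swap)
		swap(s4, s5);
		// The member swap is cheaper: it only exchanges the pointer and the two sizes
		//s4.swap(s5);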
		cout << s4.c_str() << endl;
		cout << s5.c_str() << endl;
	}
	void test_string5()
	{
		string s1("hello");
		string s2("hello");
		cout << (s1 == s2) << endl;
		cout << ("hello" == s2) << endl;
		cout << (s1 == "hello") << endl;
		//cin >> s1 >> s2;
		cout << s1 << endl;
		cout << s2 << endl;
		string s3;
		getline(cin, s3); 
		cout << s3 << endl;
	}
	void test_string6()
	{
		string s1("hello world");
		string s2(s1);
		cout << s1 << endl;
		cout << s2 << endl;
	}
}
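The copy constructor above builds a temporary and swaps with it. The same idea extends to assignment, which is usually called the copy-and-swap idiom. The sketch below uses a hypothetical, stripped-down class MiniStr rather than bit::string, and is only one possible way to write it: the assignment operator takes its argument by value, so the copy is made by the copy constructor and the old buffer is released when the parameter is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// MiniStr is a hypothetical minimal string used only to illustrate copy-and-swap.
class MiniStr
{
public:
    MiniStr(const char* str = "")
        : _str(nullptr), _size(std::strlen(str))
    {
        _str = new char[_size + 1];
        std::memcpy(_str, str, _size + 1); // copy including '\0'
    }

    MiniStr(const MiniStr& other)
        : _str(nullptr), _size(0)          // start from a valid empty state
    {
        MiniStr tmp(other._str);           // deep copy
        swap(tmp);                         // take tmp's buffer; tmp now holds nullptr
    }

    // The by-value parameter is the copy; swapping installs it in *this.
    MiniStr& operator=(MiniStr other)
    {
        swap(other);
        return *this;
    }

    ~MiniStr() { delete[] _str; }          // delete[] on nullptr is a no-op

    void swap(MiniStr& other)
    {
        std::swap(_str, other._str);
        std::swap(_size, other._size);
    }

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }

private:
    char* _str;
    std::size_t _size;
};
```

Compared with the hand-written operator= in the listing, this version has no separate new/strcpy/delete sequence to keep in sync with the copy constructor, and it is safe for self-assignment because the copy is made before anything is released.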