C++ - set 和 map详解

0. 引言

1. 关联式容器

2. 键值对

3. 树形结构

4. set

4.1 set 的定义

4.2 set 的构造

4.3 set 的常用函数

4.4 set 的特点

5. multiset

5.1 multiset 插入冗余数据

5.2 multiset - count 的使用

6. map

6.1 map 的定义

6.2 map 的构造

6.3 map的常用函数

6.4 map - operator[ ]

7. multimap

0. 引言

在之前的博客中，我们已经介绍过 STL 中的部分容器，例如：vector，list，deque等，这些容器统称为序列式容器，因为其底层为线性序列的数据结构，里面存储的是元素本身。而本节我们介绍的 map 以及set 属于关联式容器，那什么是关联式容器？它与序列式容器有什么区别？

关联式容器也是用来存储数据的，与序列式容器不同的是，其里面存储的是结构的键值对，在数据检索时比序列式容器效率更高。

那么接下来我们一起逐步开始学习吧！

1. 关联式容器

我们之前学习过的序列式容器底层为线性序列的数据结构，例如 list ，其节点是线性存储的，一个节点存储一个数据，但这些数据未必有序。而关联式容器则比较特殊，存储的是 <key, value> 的键值对，这意味着可以按照键值大小 key 以某种规则放置在适当的位置，因此，关联式容器没有首位的概念，因此没有头插尾插等相关操作。

那什么是键值对呢？我们接着来看。

2. 键值对

键值对是用来表示具有一一对应关系的一种结构，该结构中一般只包含两个成员变量key和value，key代表键值，value表示与key对应的信息。

比如：现在要建立一个英汉互译的字典，那该字典中必然有英文单词与其对应的中文含义，而且，英文单词与其中文含义是一一对应的关系，即通过该应该单词，在词典中就可以找到与其对应的中文含义。

在标准库中，专门定义了键值对 pair :

//SGI 版 STL 中的实现
template <class T1, class T2>
struct pair {
  typedef T1 first_type;
  typedef T2 second_type;

  T1 first;	
  T2 second;	
  pair() : first(T1()), second(T2()) {}
  pair(const T1& a, const T2& b) : first(a), second(b) {}

#ifdef __STL_MEMBER_TEMPLATES
  template <class U1, class U2>
  pair(const pair<U1, U2>& p) : first(p.first), second(p.second) {}
#endif
};

其中， pair 中的 first 表示键值， second 表示实值，在给关联式容器中插入数据时，就可以构建键值对 pair 对象。

例如，下面构建了一个键值 key 为 string，实值 value 为 int 的匿名键值对 pair 对象。

pair<string, int>("hello", 123);

此外，库中还设计了一个函数模板 make_pair, 可以根据传入的参数，去调用 pair 构建对象并返回。make_pair 实际会被编译器优化为 内联函数，因此不会造成过多消耗，可以放心使用

//make_pair 定义：
template <class T1,class T2>
pair<T1,T2> make_pair (T1 x, T2 y)
{
  return ( pair<T1,T2>(x,y) );
}

make_pair("hello", 123)

3. 树形结构

在 C++ 中提供了四种树形结构的关联式容器。set / multiset / map / multimap。

基于哈希表的，我们将在后续详细介绍。

4. set

4.1 set 的定义

set 实则是二叉搜索树中的 key 模型。key 模型主要的应用场景是判断在不在，例如：门禁系统，车库系统，检查文章中单词拼写是否正确等。

set 只包含实值 value，或者说， 实值就是键值，键值就是实值。

其参数的解释如下：

template < class T,                        // 实值(键值)
           class Compare = less<T>,        // 存储依据，默认为升序
           class Alloc = allocator<T>      // 空间配置器
           > class set;

接着我们来看 set 的使用。

4.2 set 的构造

我们首先来看构造函数：

在这里我们创建时，需要指定实值的类型。

#include<iostream>
#include<vector>
#include<set>
using namespace std;

int main()
{
	vector<int> arr = {1,4,5,6,8,8,5,3,2,9};
	set<int> s1;//创建一个空的 set
	set<int> s2(arr.begin(), arr.end());//创建包含数据的 set

	cout << "s1: ";
	for (auto e : s1)
		cout << e << " ";
	cout << endl;

	cout << "s2: ";
	for (auto e : s2)
		cout << e << " ";
	cout << endl;

	return 0;
}

与二叉搜索树一样， set 不支持数据冗余，冗余数据则无法插入，如果需要插入冗余数据，则需要使用 multiset，我们后续会讲到。

4.3 set 的常用函数

功能	作用
迭代器	遍历容器
empty	判断容器是否为空
size	当前容器中的元素个数
max_set	容器的最大容量
insert	将元素插入到特定位置
erase	删除指定元素
swap	交换两个容器
clear	清空容器中的所有元素
find	查找实值是否存在并返回迭代器位置
count	统计容器中指定键值的数量

下面我们实际演示相关函数的使用，代码如下：

#include<iostream>
#include<vector>
#include<set>
using namespace std;

int main()
{
	vector<int> arr = {1,4,5,6,8,3,2,9};
	set<int> s1(arr.begin(), arr.end());//创建包含数据的 set
	
	//迭代器遍历
	cout << "迭代器遍历结果: ";
	set<int>::iterator it = s1.begin();
	while (it != s1.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;

	cout << endl;
	cout << "empty(): " << s1.empty() << endl;
	cout << "size(): " << s1.size() << endl;
	cout << "max_size(): " << s1.max_size() << endl;

	//插入元素7
	cout << endl;
	cout << "insert(7): ";
	s1.insert(7);
	for (auto e : s1) cout << e << " ";
	cout << endl;

	//删除元素9
	cout << endl;
	cout << "erase(9): ";
	s1.erase(9);
	for (auto e : s1) cout << e << " ";
	cout << endl;

	//交换 查找 清理
	cout << endl;
	set<int> s2(arr.begin() + 3, arr.end());
	s1.swap(s2);
	cout << "s1: ";
	for (auto e : s1) cout << e << " ";
	cout << endl;
	cout << "s2: ";
	for (auto e : s2) cout << e << " ";
	cout << endl;

	cout << endl;
	cout << "s1.find(3): ";
	cout << (s1.find(3) != s1.end()) << endl;

	cout << endl;
	cout << "s2.clear(): " << endl;
	s2.clear();
	cout << "s1: ";
	for (auto e : s1) cout << e << " ";
	cout << endl;
	cout << "s2: ";
	for (auto e : s2) cout << e << " ";
	cout << endl;

	return 0;
}

结果如下：

那么，count 在 set 中只能用作查找，因为 set 键值就是实值，因为其不允许冗余，因此只有一个键值。

#include <iostream>
#include <vector>
#include <set>
using namespace std;

int main()
{
	vector<int> arr = { 1,4,5,6,8,3,2,9 };
	set<int> s1(arr.begin(), arr.end());
	for (int i = 0; i < 10; i++)
	{
		if (s1.count(i))
			cout << i << " 在 set 中" << endl;
		else
			cout << i << " 不在 set 中" << endl;
	}
	return 0;
}

我们也可以根据 set 的构造函数将其排为降序。

#include <iostream>
#include <vector>
#include <set>
using namespace std;

int main()
{
	vector<int> arr = { 1,4,5,6,8,3,2,9 };
	set<int, greater<int>> s1(arr.begin(), arr.end());

	for (auto e : s1)
		cout << e << " ";

	return 0;
}

最后需要注意的是，键值 key 不允许修改，因为如果改变了，会破坏二叉搜索树的原则，set 中的普通迭代器在本质上也是 const 迭代器。

4.4 set 的特点

最后，我们总结一下 set 的特点:

5. multiset

multiset 与 set 的唯一区别就在于 multiset 插入冗余数据时，不会失败。

由于 multiset 的操作与 set 基本相同，在这里我们主要演示区别。

5.1 multiset 插入冗余数据

先看代码，从中我们可以发现，multiset 允许数据冗余。

int main()
{
	vector<int> arr = { 1,3,3,4,5,6,7,5,6,7 };
	multiset<int> ms1(arr.begin(), arr.end());

	for (auto e : ms1)
		cout << e << " ";
	cout << endl;

	return 0;
}

此外，multiset 查找冗余的数据时，返回的是中序遍历中，第一次出现的元素 ：

int main()
{
	vector<int> arr = { 1,3,3,4,5,6,7,5,6,7 };
	multiset<int> ms1(arr.begin(), arr.end());

	auto it = ms1.begin();
	while (it != ms1.end())
	{
		cout << *it << " | " << &*(it) << endl;
		++it;
	}

	cout << endl << endl;

	cout << "ms1.find(3): " << &*(ms1.find(3)) << endl;

	return 0;
}

5.2 multiset - count 的使用

cout 在 multiset中。真正发挥了统计键值数的效果。

int main()
{
	vector<int> arr = { 1,3,3,4,5,6,7,5,6,7 };
	multiset<int> ms1(arr.begin(), arr.end());

	for (int i = 0; i < 10; i++)
		cout << i << "在 multiset 中的数量: " << ms1.count(i) << endl;

	return 0;
}

6. map

6.1 map 的定义

map 是二叉搜索树改造后的 key / value 模型，是真正意义上的键值对，存在关于应用搜索的应用场景，例如：中英文互译字典，电话号码查询快速信息，电话号码+验证码等。

key / value 模型除了可以用来查找之外，还可以用来统计，其实就是哈希的思想，建立映射关系

模板中 key 就是键值对中的键值， T 为键值对中的实值。所以 map 也会使用到 pair 结构，其中first 表示键值，second 表示实值。map 也存在迭代器，且为双向迭代器。

6.2 map 的构造

map 有以下构造方法：

#include <iostream>
#include <vector>
#include <map>
using namespace std;

int main()
{
	vector<pair<string, int>> arr = { make_pair("A", 65), make_pair("B", 66), make_pair("C", 67), make_pair("D", 68) };

	map<string, int> m1;//创建一个空的 map
	map<string, int> m2(arr.begin(), arr.end());//创建包含数据的 map

	cout << "m1: " << endl;
	for (auto e : m1)
		cout << e.first << " | " << e.second << endl;

	cout << endl;

	cout << "m2: " << endl;
	for (auto e : m2)
		cout << e.first << " | " << e.second << endl;

	return 0;
}

这里在访问 map 中的键值和实值时，需要通过 pair 指定对象访问，例如这里的：e.first 。

6.3 map的常用函数

功能	作用
迭代器	遍历容器
empty	判断容器是否为空
size	当前容器中的元素数
max_size	容器的最大容量
operator[ ]	按照键值，访问实值，如果没有，则新插入
insert	元素插入，根据特定条件插入至合适位置
erase	删除指定元素
swap	交换两个容器
clear	清空容器中的所有元素
find	查找实值是否存在并返回迭代器位置
count	统计容器中指定键值的数量

这里函数的使用方法与 set 基本相同，但新增了一个 operator[ ]。下面我们来看演示：

#include <iostream>
#include <vector>
#include <string>
#include <map>
using namespace std;

int main()
{
	vector<pair<string, int>> arr{ make_pair("C", 67),make_pair("B", 66), make_pair("E",69), make_pair("A", 65), make_pair("D", 68), };

	map<string, int> m1(arr.begin(), arr.end());

	//迭代器遍历
	cout << "迭代器遍历结果: ";
	map<string, int>::iterator it = m1.begin();
	while (it != m1.end())
	{
		cout << "<" << it->first << ":" << it->second << "> ";
		++it;
	}
	cout << endl;

	//判空、求大小、解引用
	cout << endl;
	cout << "empty(): " << m1.empty() << endl;
	cout << "size(): " << m1.size() << endl;
	cout << "max_size(): " << m1.max_size() << endl;
	cout << "m1[""A""]: " << m1["A"] << endl;
	cout << "m1[""B""]: " << m1["B"] << endl;

	//插入元素
	cout  << endl;
	cout << "insert(""F"", 69): ";
	m1.insert(make_pair("F", 69));
	for (auto e : m1) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	//删除元素
	cout << endl;
	cout << "erase(""C""): ";
	m1.erase("C");
	for (auto e : m1) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	//交换、查找、清理
	cout << endl;
	map<string, int> m2(arr.begin() + 1, arr.end()-1);
	m1.swap(m2);
	cout << "m1.swap(m2)" << endl;
	cout << "m1: ";
	for (auto e : m1) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	cout << "m2: ";
	for (auto e : m2) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	cout << endl;
	cout << "m1.find(""B""): ";
	cout << (m1.find("B") != m1.end()) << endl;

	cout << endl;
	cout << "m2.clear()" << endl;
	m2.clear();

	cout << "m1: ";
	for (auto e : m1) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	cout << "m2: " << endl;
	for (auto e : m2) cout << "<" << e.first << ":" << e.second << "> ";
	cout << endl;

	return 0;
}

map 依然不允许数据冗余，如果需要插入重复的数据，则需要使用 multimap。

map 插入操作返回值是一个 pair ，因为既要表示是否成功，也要返回插入成功的迭代器

map 中 find 和 count 跟 set 一样。可以用来判断元素是否存在，find 返回迭代器，count 返回则是键值数。此外，map可以根据普通迭代器修改实值。

6.4 map - operator[ ]

operator[ ]返回的是当前键值对应的实值，如果当前键值不存在，则会插入新的键值对。

int main()
{
	vector<string> word = { "西瓜", "西瓜", "苹果", "西瓜", "苹果", "苹果", "西瓜", "苹果", "香蕉", "苹果", "香蕉", "梨" };
	map<string, int> table;

	for (auto& e : word)
		table[e]++;

	for (auto e : table)
		cout << "<" << e.first << ":" << e.second << ">" << endl;

	return 0;
}