當前位置：首頁 > 人文社科 > 生活经验 >内容正文

生活经验

C++ STL: 基本六大部件概览及各个容器使用方式和底层实现概览

發布時間：2023/11/27 生活经验 35 豆豆

生活随笔收集整理的這篇文章主要介紹了 C++ STL: 基本六大部件概览及各个容器使用方式和底层实现概览小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

文章目錄

- - STL六大部件
  - 容器的使用
  - - Array
    - vector
    - list
    - deque
    - mutiset
    - multimap
    - unordered_multiset/set

使用一個東西，卻不明白它的道理，不高明。

STL六大部件

容器 Containers 用來存放數據
分配器 Allocators 為容器內的數據分配存儲空間
算法 Algorithms 計算數據
迭代器 Iterators 訪問數據的式（算法使用其訪問容器內大數據）
適配器 Adapters
仿函數 Functors 類似于函數的功能

六大組件之間的關系示意圖如下：

容器和容器之間的實現關系圖如下（這個圖是候捷老師總結的）：
大體意思是，左側的序列式容器中，heap相關的算法實現使用的是上一層的vector，而priority_queue優先級隊列的實現使用的是上一層的heap。
縮進的容器代表的是被實現的容器，之上未縮進的代表的是底層結構。像stack和queue（縮進）了容器適配器底層是用deque實現的。

同時對于關聯式容器，set,map,mutiset,multimap(縮緊)的底層實現都是rb_tree(未縮進)；且hash_set,hash_map等容器的底層是hashtable。

圖中顯示的是C++11 版本之前的容器內容，當C++11推出之后，slist就變成了forward_list,hash_set和hash_map就變成了unordered_set和unordered_map;hash_multiset 和 hash_multimap就變成了unordered_multiset和unordered_multimap

舉例如下代碼：
核心功能是計算數組arr中大于40的元素的個數

所有的容器區間都是前閉后開區間[ )，所以遍歷容器的形態可以如下

Containter<T> c;
...
Container<T>::iterator ite = c.begin();
for(; ite != c.end(); ++ ite)...

C++11 的新特性，提供range-based for statement，來簡化以上的容器遍歷形態

for ( dec1 : coll ) {statement
}

for ( int i : {2,3,4,5,7,8}) {std::cout << i << std::endl;
}

std::vector <double> vec;
...
for ( auto elem : vec) {std::cout << elem << std::endl;
}for (auto &elem :vec) {elem *= 3;
}

容器的使用

Array

靜態數組，空間大小是直接分配好的
基本接口如下

size()
front() 返回第一個元素
back() 返回最后一個元素
data() 返回第一個元素的首地址

#include <iostream>
#include <array>using namespace std;int main()
{array<int,100> arr_test;for(int i = 0;i < 100; ++i) {arr_test[i]=i;}cout << arr_test.size() << endl;cout << arr_test.data() << endl;return 0;
}

vector

基本接口如下：

push_back()
size()
front() 第一個元素
back() 最后一個元素
data() 內存中的起始地址

vector的內存增長是兩倍的方式，即 push_back 第一個元素時發現內存空間不夠，allocator分配器會分配能夠存放2個vector元素的內存空間；放第三個元素的時候，又會分配4個元素的內存空間；放第五個的時候又會分配 8個元素的存儲空間。。。

同時每次分配新的大小的內存空間的時候會做一個元素的整體拷貝，即將之前的空間元素整體拷貝到新的擴容后的內存空間。
利用data()函數驗證如下：

#include <vector> // vector
#include <iostream>using namespace std;int main()
{vector<int> test_arr;for(int i = 0; i<100;++i) {test_arr.push_back(i);if(i % 2 == 0) cout << test_arr.data() << endl;} return 0;
}

可以看到最后的輸出，當分配空間之后內存中的起始地址都會發生變化：

0x1885050 #兩個元素空間
0x1885050
0x1885090 #四個元素空間
0x1885090
0x18850c0 #八個元素空間
0x18850c0
0x18850c0
0x18850c0
...

后續會進行相關的源碼實現的分析，為什么要這樣做？

list

基本操作如下

push_back()
size()
max_size() 存放最多的元素的個數，只要內存足夠，就能夠一直分配
front()
back()

deque

雙端隊列
基本操作如下：

push_back()
size()
front()
back()
max_size()

內存分配的示意圖如下（元素內存中的存放方式并不是連續的）：

deque內部每一個區域存儲一個指針，指針指向的是內存一段buffer區域，buffer里面存儲的是一個個具體元素。當需要擴容的時候，會分配一個新的指針區域，并為這個指針區域分配對應的buffer。

mutiset

multiset和set的區別就是 muti允許元素重復，來體現multi的作用。

insert() 插入元素
size()
max_size()

底層數據結構是紅黑樹的實現
這里它和multimap的基本是一致的，只是multimap容器的對外使用是允許key-value不同，而multiset的key-value是一樣的。且元素插入本身有序，因為底層紅黑樹的結構就是有序的。

一些測試代碼如下：

#include <iostream>
#include <string>
#include <ctime>
#include <set>
#include <algorithm>#define MAX_LEN 1000000
using namespace std;int main()
{multiset<string> c;for(int i = 0; i<MAX_LEN ;++i) {try {c.insert(to_string(i));}catch(exception &p) {cout << "i = " << i << " " << p.what() << endl;abort();}} cout << "c.max_size: " << c.max_size() << endl; cout << "c.size: " << c.size() << endl; string target = to_string(23456);cout << "find target is: " << target << endl;	/*對比全局的find函數和mutiset自帶的函數查詢效率*/clock_t timestart = clock();auto item = ::find(c.begin(),c.end(),target);cout << "::find,milli-seconds: " << clock() - timestart << endl;if (item != c.end()) cout << "found " << endl;else cout << "not found" << endl;timestart = clock();auto pItem = c.find(target);cout << "c.find,milli-seconds: " << clock() - timestart << endl;if(pItem != c.end()) cout << "found" << endl;else cout << "not found" << endl;	return 0;
}

輸出如下，結果非常明顯了：

c.max_size: 288230376151711743
c.size: 1000000
find target is: 23456
::find,milli-seconds: 6629
found 
c.find,milli-seconds: 5
found

multimap

基本接口如下：

insert() 插入元素
size()
max_size()
multiset 不允許使用 [ ] 進行insert

同樣使用紅黑樹實現，底層數據按key有序。

簡單應用代碼:

#include <iostream>
#include <string> //to_string()
#include <ctime> //clock()
#include <map>
#include <algorithm> // find()#define MAX_LEN 1000000
using namespace std;int main()
{multimap<long, string> c;for(int i = 0; i<MAX_LEN ;++i) {try {c.insert(make_pair(i,to_string(i*10)));}catch(exception &p) {cout << "i = " << i << " " << p.what() << endl;abort();}} cout << "c.max_size: " << c.max_size() << endl; cout << "c.size: " << c.size() << endl; long target = 23456;cout << "find target is: " << target << endl;	clock_t timestart = clock();auto pItem = c.find(target);cout << "c.find,milli-seconds: " << clock() - timestart << endl;if(pItem != c.end()) cout << "found" << endl;else cout << "not found" << endl;	return 0;
}

輸出如下：

c.max_size: 256204778801521550 #懷疑這里的max_size與系統內存大小有關系
c.size: 1000000
find target is: 23456
c.find,milli-seconds: 5
found

unordered_multiset/set

基本操作接口：

insert()
size()
max_size()
bucket_count()
load_factor()
max_load_factor()
max_bucket_count()

數據結構如下：

unordered_multiset/set 以及unordered_multimap/map的底層實現都是hash，為了解決hash沖突，也使用hash 鏈。為了防止鏈過長，使用鏈hash bucket的方式，當鏈的元素個數達到一定程度會分配一個雙倍的存儲空間，但是該鏈上的元素會打rehash重新分配。

簡單應用代碼：

#include <iostream>
#include <string> // to_string()
#include <ctime> // clock()
#include <unordered_set> // unordered_multiset 
#include <algorithm> //::find()#define MAX_LEN 1000000
using namespace std;int main()
{unordered_multiset<string> c;for(int i = 0; i<MAX_LEN ;++i) {try {c.insert(to_string(i));}catch(exception &p) {cout << "i = " << i << " " << p.what() << endl;abort();}} cout << "c.max_size: " << c.max_size() << endl; cout << "c.size: " << c.size() << endl; cout << "c.bucket_count: " << c.bucket_count() << endl; cout << "c.load_factor(): " << c.load_factor() << endl; cout << "c.max_load_factor(): " << c.max_load_factor() << endl; cout << "c.max_bucket_count: " << c.max_bucket_count() << endl; string target = to_string(23456);cout << "find target is: " << target << endl;	/*對比全局::find函數提供的查找效率和容器本身提供的find函數查找效率的差異*/clock_t timestart = clock();auto item = ::find(c.begin(),c.end(),target);cout << "::find,milli-seconds: " << clock() - timestart << endl;if (item != c.end()) cout << "found " << endl;else cout << "not found" << endl;timestart = clock();auto pItem = c.find(target);cout << "c.find,milli-seconds: " << clock() - timestart << endl;if(pItem != c.end()) cout << "found" << endl;else cout << "not found" << endl;	for(int i = 0;i < 20; ++i) cout << "bucket " << i << "'s element is: " << c.bucket_size(i) << endl;return 0;
}