
Regular Expression Extractor: C++11 New Features 7 - Regular Expressions

Published: 2024/7/19


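C++11 added regular expression support to the standard library. This article gives a brief introduction to using regular expressions in C++.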

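Using regular expressions in C++ is not much different from using them in other languages. The types and functions live in the <regex> header: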

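#include <iostream>
#include <regex>
using namespace std;

int main() {
    regex e("abc*");
    // regex_search succeeds if any part of the input matches
    bool m = regex_search("abccc", e);
    // prints "yes"
    cout << (m ? "yes" : "no") << endl;
}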

C++11 ships with support for 6 regular expression grammars:

ECMAScript

basic

extended

awk

grep

egrep

C++11 uses the ECMAScript grammar by default, which is also the most powerful of the six. To use one of the other five, specify it when constructing the regex object:

regex e("^a.", regex_constants::grep);
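To make the effect of the grammar flag concrete, here is a minimal sketch that compiles the same pattern under different grammars. The helper names full_match and plus_is_quantifier are made up for illustration. The sketch assumes the usual POSIX rules, in which the basic and grep grammars treat '+' as an ordinary character while ECMAScript and extended treat it as "one or more".

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of `text` matches `pattern`
// when the pattern is compiled with the given grammar flag.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex_constants::syntax_option_type grammar) {
    std::regex e(pattern, grammar);
    return std::regex_match(text, e);
}

// Hypothetical helper: does "a+" mean "one or more a" under this grammar?
// Assumption: basic and grep follow POSIX basic syntax, where '+' is literal.
bool plus_is_quantifier(std::regex_constants::syntax_option_type grammar) {
    return full_match("aaa", "a+", grammar);
}
```

Under that assumption, plus_is_quantifier(regex_constants::ECMAScript) is true and plus_is_quantifier(regex_constants::grep) is false, so a pattern written for one grammar may silently change meaning under another.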

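Suppose we want to know not only whether a regular expression matches a string, but also which parts matched. For example, to extract the user name and the domain from an email address, we need match_results: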

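#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str("Email a@bc.com abc");
    // smatch is equivalent to match_results<string::const_iterator>
    smatch m;
    // "\\." matches a literal dot; a bare "." would match any character
    regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    bool found = regex_search(str, m, e);
    if (!found) return 1;
    // m.size=3: three results are stored
    cout << "m.size=" << m.size() << endl;
    /* Iterate over match_results; prints
       m[0]=a@bc.com (the whole match)
       m[1]=a (group 1)
       m[2]=bc.com (group 2) */
    for (size_t n = 0; n < m.size(); n++) {
        cout << "m[" << n << "]=" << m[n].str() << endl;
        // equivalent: m.str(n), *(m.begin()+n)
    }
    // m.prefix=Email (everything before the match, trailing space included)
    cout << "m.prefix=" << m.prefix().str() << endl;
    // m.suffix= abc (everything after the match)
    cout << "m.suffix=" << m.suffix().str() << endl;
}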
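The same extraction can be wrapped in a small function that checks the result of regex_search before touching the groups. This is only a sketch: the name extract_email and its out-parameter interface are assumptions made for illustration, and the pattern is the article's simplified one, not a real email validator.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: finds the first e-mail-like token in `text`.
// Returns false and leaves `user` and `domain` untouched when nothing matches.
bool extract_email(const std::string& text, std::string& user, std::string& domain) {
    // Compiled once; constructing a regex is comparatively expensive.
    static const std::regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    std::smatch m;
    if (!std::regex_search(text, m, e)) {
        return false;  // no match, so there are no groups to read
    }
    user = m[1].str();    // first parenthesised group
    domain = m[2].str();  // second parenthesised group
    return true;
}
```

Called with "Email a@bc.com abc", it fills user with "a" and domain with "bc.com".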

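Suppose the string contains several substrings that match the regular expression and we want to find all of them, for example a string that holds several email addresses. In that case we need regex_iterator: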

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str("a@bc.com, d@ef.com, aa@b.com");
    regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    sregex_iterator pos(str.cbegin(), str.cend(), e);  // define the regex_iterator
    // C++ convention: a default-constructed iterator marks the end of the sequence
    sregex_iterator end;
    /* prints
       email=a@bc.com, user=a, domain=bc.com
       email=d@ef.com, user=d, domain=ef.com
       email=aa@b.com, user=aa, domain=b.com */
    for (; pos != end; pos++) {
        cout << "email=" << pos->str(0) << ", user=" << pos->str(1)
             << ", domain=" << pos->str(2) << endl;
    }
}

As shown above, regex_iterator simply iterates over the match_results of every match of the regular expression in the string.
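Because each element a regex_iterator yields is a full match_results, collecting one group from every match takes only a short loop. The sketch below uses a hypothetical helper name, collect_users, and the article's simplified email pattern.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the user part (group 1) of every address in `text`.
std::vector<std::string> collect_users(const std::string& text) {
    static const std::regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    std::vector<std::string> users;
    // `end` is default-constructed and therefore marks the end of the sequence.
    for (std::sregex_iterator pos(text.cbegin(), text.cend(), e), end; pos != end; ++pos) {
        users.push_back(pos->str(1));  // *pos is an smatch for the current match
    }
    return users;
}
```

For the input "a@bc.com, d@ef.com, aa@b.com" this returns the three strings "a", "d" and "aa".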

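In addition, C++ provides another iterator, regex_token_iterator. The difference is that regex_token_iterator iterates over a chosen subexpression of every match, or over the substrings that were not matched.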

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str("a@bc.com, d@ef.com, aa@bb.com");
    regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    sregex_token_iterator pos(str.cbegin(), str.cend(), e);  // define the regex_token_iterator
    sregex_token_iterator end;  // end of the sequence
    /* prints
       Matched: a@bc.com
       Matched: d@ef.com
       Matched: aa@bb.com */
    for (; pos != end; pos++) {
        cout << "Matched: " << *pos << endl;
    }
}

We can change the definition of pos so that each step yields the second group of the current match:

// the 4th argument selects which group to iterate over
sregex_token_iterator pos(str.cbegin(), str.cend(), e, 2);
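A sketch of that group selection as a complete function is given below. The helper name collect_groups is hypothetical. Besides a single int, the regex_token_iterator constructor also accepts a std::initializer_list<int>, in which case the requested groups are yielded in turn for each match; the sketch relies on that overload so it can cover both cases.

```cpp
#include <initializer_list>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of the requested groups for every match,
// in the order the token iterator yields them.
std::vector<std::string> collect_groups(const std::string& text,
                                        std::initializer_list<int> groups) {
    static const std::regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    std::vector<std::string> out;
    std::sregex_token_iterator pos(text.cbegin(), text.cend(), e, groups), end;
    for (; pos != end; ++pos) {
        out.push_back(pos->str());  // *pos is a sub_match, not a full match_results
    }
    return out;
}
```

With the string "a@bc.com, d@ef.com, aa@bb.com", collect_groups(str, {2}) yields the three domains, while collect_groups(str, {1, 2}) alternates user and domain for each address.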

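Note that if we set this argument to -1, the iterator walks over every part of the string that does not match the regular expression, which amounts to splitting the string with the regular expression as the delimiter: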

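#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str("a bb cd");
    regex e("\\s+");  // matches one or more whitespace characters
    // iterate over the parts that do not match the regular expression
    sregex_token_iterator pos(str.cbegin(), str.cend(), e, -1);
    sregex_token_iterator end;
    /* prints
       Matched: a
       Matched: bb
       Matched: cd */
    for (; pos != end; pos++) {
        cout << "Matched: " << *pos << endl;
    }
}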
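The -1 trick packages naturally into a reusable splitter. The following is a minimal sketch; the function name split and its signature are assumptions made for illustration, not part of the standard library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` wherever `delimiter_pattern` matches.
std::vector<std::string> split(const std::string& text,
                               const std::string& delimiter_pattern) {
    std::regex e(delimiter_pattern);
    std::vector<std::string> parts;
    // -1 selects the stretches between matches instead of the matches themselves.
    std::sregex_token_iterator pos(text.cbegin(), text.cend(), e, -1), end;
    for (; pos != end; ++pos) {
        parts.push_back(pos->str());
    }
    return parts;
}
```

For example, split("a bb   cd", "\\s+") returns "a", "bb" and "cd", and split("a,b;c", "[,;]") returns "a", "b" and "c".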

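Another common use of regular expressions is string replacement. In C++ we can use regex_replace, where $1 and $2 in the replacement text refer to the first and second group: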

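#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string str("a@bc.com, d@ef.com, aa@bb.com");
    regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    // every match is replaced; text between matches is copied unchanged
    cout << regex_replace(str, e, "$1 is on $2");
}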

The output is:

a is on bc.com, d is on ef.com, aa is on bb.com
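regex_replace also takes an optional flags argument that the example above leaves at its default. As a sketch of two flags from std::regex_constants, format_first_only replaces only the first match, and format_no_copy drops the text that did not match. The helper names describe_first and describe_only are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites only the first address, leaving the rest untouched.
std::string describe_first(const std::string& text) {
    static const std::regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    return std::regex_replace(text, e, "$1 is on $2",
                              std::regex_constants::format_first_only);
}

// Hypothetical helper: emits one rewritten line per address and nothing else.
std::string describe_only(const std::string& text) {
    static const std::regex e("([[:w:]]+)@([[:w:]]+\\.com)");
    return std::regex_replace(text, e, "$1 is on $2\n",
                              std::regex_constants::format_no_copy);
}
```

For "a@bc.com, d@ef.com", describe_first gives "a is on bc.com, d@ef.com", while describe_only gives the two lines "a is on bc.com" and "d is on ef.com" without the separating comma.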

This article is mainly translated from Bo Qian's YouTube videos:

C++ 11 Library: Regular Expression 1 (youtu.be)
C++ 11 Library: Regular Expression 2 -- Submatch (youtu.be)
C++ 11 Library: Regular Expression 3 -- Iterators (youtu.be)
