mysql strtok: strtok() and strtok_r()
The comment below is taken from the Linux kernel 2.6.29 source. It shows that the kernel no longer uses strtok(), which was replaced by the faster strsep().
/*
* linux/lib/string.c
*
* Copyright (C) 1991, 1992 Linus Torvalds
*/
/*
* stupid library routines.. The optimized versions should generally be found
* as inline code in
*
* These are buggy as well..
*
* * Fri Jun 25 1999, Ingo Oeser
* - Added strsep() which will replace strtok() soon (because strsep() is
* reentrant and should be faster). Use only strsep() in new code, please.
*
* * Sat Feb 09 2002, Jason Thomas ,
* Matthew Hawkins
* - Kissed strtok() goodbye
*/
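The comment above recommends strsep() for new code. As a rough userspace sketch of what that looks like (assuming a glibc or BSD system, since strsep() is an extension rather than standard C; `split_fields` is a hypothetical helper name, not something from the kernel source):

```c
#define _DEFAULT_SOURCE   /* exposes strsep() on glibc; a platform assumption */
#include <stddef.h>
#include <string.h>

/*
 * Split `line` in place on any character in `delims`, storing up to `max`
 * field pointers in `out`. Returns the number of fields stored.
 * Unlike strtok(), strsep() reports empty fields between adjacent delimiters.
 */
size_t split_fields(char *line, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    char *cursor = line;          /* caller-owned state, no hidden static */
    char *field;

    while (n < max && (field = strsep(&cursor, delims)) != NULL)
        out[n++] = field;
    return n;
}
```

Called on a writable copy of "a,,b" with "," as the delimiter, this stores three fields, the middle one empty. strtok() would skip the empty field and return only two tokens.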
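Most of us have run into strtok() at some point, yet it always seems to cause trouble, so this post takes a close look at it.
Let's start with an example:
#include <stdio.h>
#include <string.h>

int main() {
    char test1[] = "feng,ke,wei";   /* writable array initialized from the literal */
    char *test2 = "feng,ke,wei";    /* pointer to the literal itself; discussed below */
    char *p;

    p = strtok(test1, ",");         /* first call: pass the string */
    while (p)
    {
        printf("%s\n", p);
        p = strtok(NULL, ",");      /* later calls: pass NULL */
    }
    return 0;
}
Output: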
feng
ke
wei
But if we call p = strtok(test2, ",") instead, the program fails with a memory error. Why? Does it have something to do with the static variable inside strtok()? Let's look at its source code:
/***
*strtok.c - tokenize a string with given delimiters
*
*       Copyright (c) Microsoft Corporation. All rights reserved.
*
*Purpose:
*       defines strtok() - breaks string into series of token
*       via repeated calls.
*
*******************************************************************************/
#include
#include
#ifdef _MT
#include
#endif /* _MT */
/***
*char *strtok(string, control) - tokenize string with delimiter in control
*
*Purpose:
*       strtok considers the string to consist of a sequence of zero or more
*       text tokens separated by spans of one or more control chars. the first
*       call, with string specified, returns a pointer to the first char of the
*       first token, and will write a null char into string immediately
*       following the returned token. subsequent calls with zero for the first
*       argument (string) will work thru the string until no tokens remain. the
*       control string may be different from call to call. when no tokens remain
*       in string a NULL pointer is returned. remember the control chars with a
*       bit map, one bit per ascii char. the null char is always a control char.
*       // Blogger's note: this is already very detailed, even better than MSDN!
*
*Entry:
*       char *string - string to tokenize, or NULL to get next token
*       char *control - string of characters to use as delimiters
*
*Exit:
*       returns pointer to first token in string, or if string
*       was NULL, to next token
*       returns NULL when no more tokens remain.
*
*Uses:
*
*Exceptions:
*
*******************************************************************************/
char * __cdecl strtok (
char * string,
const char * control
)
{
unsigned char *str;
const unsigned char *ctrl = control;
unsigned char map[32];
int count;
#ifdef _MT
_ptiddata ptd = _getptd();
#else /* _MT */
static char *nextoken;        // static variable that holds the rest of the string
#endif /* _MT */
/* Clear control map */
for (count = 0; count < 32; count++)
map[count] = 0;
/* Set bits in delimiter table */
do {
map[*ctrl >> 3] |= (1 << (*ctrl & 7));
} while (*ctrl++);
/* Initialize str. If string is NULL, set str to the saved
* pointer (i.e., continue breaking tokens out of the string
* from the last strtok call) */
if (string)
str = string;        // the original string, passed on the first call
else
#ifdef _MT
str = ptd->_token;
#else /* _MT */
str = nextoken;        // the saved remainder, used when the first argument is NULL
#endif /* _MT */
/* Find beginning of token (skip over leading delimiters). Note that
* there is no token iff this loop sets str to point to the terminal
* null (*str == '\0') */
while ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
str++;
string = str;        // string now points to the start of the token to be returned
/* Find the end of the token. If it is not the end of the string,
* put a null there. */
// This is the core of the routine: find the next delimiter and overwrite it with '\0', which then terminates the returned token
for ( ; *str ; str++ )
if ( map[*str >> 3] & (1 << (*str & 7)) ) {
*str++ = '\0';        // this writes into the caller's string ①
break;
}
/* Update nextoken (or the corresponding field in the per-thread data
* structure */
#ifdef _MT
ptd->_token = str;
#else /* _MT */
nextoken = str;        // save the remainder in the static variable for the next call
#endif /* _MT */
/* Determine if a token has been found. */
if ( string == str )
return NULL;
else
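return string;
}
1. Introducing strtok
As is well known, strtok splits a string at the delimiters the caller supplies (the delimiter set may hold several characters, such as ",.") and keeps going until it reaches the terminating "\0".
For example, with the delimiter "," and the string "Fred,John,Ann", strtok extracts the three strings "Fred", "John", and "Ann".
The C code for this is: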
QUOTE:
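int in = 0;
char buffer[] = "Fred,John,Ann";
char *p[4];              /* three tokens plus the final NULL stored by the loop */
char *buf = buffer;
while ((p[in] = strtok(buf, ",")) != NULL) {
    in++;
    buf = NULL;
}
As the code shows, the first call to strtok takes the address of the target string as its first argument (buf = buffer), and every later call takes NULL (buf = NULL). The pointer array p[] holds the results: p[0]="Fred", p[1]="John", p[2]="Ann", and buffer itself now contains Fred\0John\0Ann\0.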
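The loop above trusts that the array is large enough for every token. A minimal sketch of the same calling pattern with an explicit bound might look like this (`split_tokens` is a hypothetical helper name, not a library function):

```c
#include <stddef.h>
#include <string.h>

/*
 * Tokenize `text` in place with strtok() and store at most `max` token
 * pointers in `out`. Returns the number of tokens stored.
 */
size_t split_tokens(char *text, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    char *tok = strtok(text, delims);   /* first call: pass the string */

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, delims);     /* later calls: pass NULL */
    }
    return n;
}
```

Because the bound is checked before each store, a string with more tokens than expected is cut short rather than written past the end of the array.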
2. The weakness of strtok
Let's change the plan. We now have the string "Fred male 25,John male 62,Anna female 16" and want to sort its contents into a struct:
QUOTE:
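struct person {
    char name[25];
    char sex[7];     /* "female" needs 6 characters plus the terminating '\0' */
    char age[4];
};
One way to do this is to first extract each piece delimited by "," and then split that piece at " " (space).
For example: take "Fred male 25" and split it into "Fred", "male", and "25".
I wrote the small program below to show the process: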
QUOTE:
#include <stdio.h>
#include <string.h>
#define INFO_MAX_SZ 255
int main()
{
    int in = 0;
    char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
    char *p[20];
    char *buf = buffer;
    while ((p[in] = strtok(buf, ",")) != NULL) {       /* outer loop: one record per "," */
        buf = p[in];
        while ((p[in] = strtok(buf, " ")) != NULL) {   /* inner loop: fields of the record */
            in++;
            buf = NULL;
        }
        p[in++] = "***";   /* marks the end of a record */
        buf = NULL;
    }
    printf("Here we have %d strings\n", in);
    for (int j = 0; j < in; j++)
        printf(">%s<\n", p[j]);
    return 0;
}
The program outputs:
Here we have 4 strings
>Fred<
>male<
>25<
>***<
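That is only a small part of the data, not what we wanted. Why? Because strtok works through a single static pointer, and the inner loop overwrites the position the outer loop was relying on. Let me trace how the code above runs:
In the trace, each \0 is a delimiter that strtok has overwritten, and the comment on each step notes where strtok's internal pointer is left.
1. "Fred male 25,John male 62,Anna female 16" // outer loop starts
2. "Fred male 25\0John male 62,Anna female 16" // pointer at "John"; enter the inner loop
3. "Fred\0male 25\0John male 62,Anna female 16" // pointer at "male"
4. "Fred\0male\025\0John male 62,Anna female 16" // pointer at "25"
5. "Fred\0male\025\0John male 62,Anna female 16" // pointer at the "\0" after "25"; the inner loop hits "\0" and returns to the outer loop
6. "Fred\0male\025\0John male 62,Anna female 16" // the outer loop starts from that same "\0" and ends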
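3. Using strtok_r
In this situation we should use strtok_r, the reentrant version of strtok (a POSIX function).
char *strtok_r(char *s, const char *delim, char **ptrptr);
Unlike strtok, which keeps its position in a built-in pointer, strtok_r requires the caller to supply a pointer for it to work with.
Code: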
QUOTE:
#include <stdio.h>
#include <string.h>
#define INFO_MAX_SZ 255
int main()
{
    int in = 0;
    char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
    char *p[20];
    char *buf = buffer;
    char *outer_ptr = NULL;   /* position of the outer (",") tokenizer */
    char *inner_ptr = NULL;   /* position of the inner (" ") tokenizer */
    while ((p[in] = strtok_r(buf, ",", &outer_ptr)) != NULL) {
        buf = p[in];
        while ((p[in] = strtok_r(buf, " ", &inner_ptr)) != NULL) {
            in++;
            buf = NULL;
        }
        p[in++] = "***";   /* marks the end of a record */
        buf = NULL;
    }
    printf("Here we have %d strings\n", in);
    for (int j = 0; j < in; j++)
        printf(">%s<\n", p[j]);
    return 0;
}
This time the output is:
Here we have 12 strings
>Fred<
>male<
>25<
>***<
>John<
>male<
>62<
>***<
>Anna<
>female<
>16<
>***<
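Let me trace how the code above runs:
In the trace, each \0 is a delimiter that strtok_r has overwritten, and the comment on each step notes where outer_ptr and inner_ptr point.
1. "Fred male 25,John male 62,Anna female 16" // outer loop starts
2. "Fred male 25\0John male 62,Anna female 16" // outer_ptr at "John"; enter the inner loop
3. "Fred\0male 25\0John male 62,Anna female 16" // inner_ptr at "male"
4. "Fred\0male\025\0John male 62,Anna female 16" // inner_ptr at "25"
5. "Fred\0male\025\0John male 62,Anna female 16" // inner_ptr at the "\0" after "25"; the inner loop hits "\0" and returns to the outer loop
6. "Fred\0male\025\0John male 62\0Anna female 16" // outer_ptr, untouched by the inner loop, moves on to "Anna"; enter the inner loop again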
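Putting the pieces together, the original goal of filling a struct could be sketched roughly as follows. This assumes a POSIX system for strtok_r() and a fixed "name sex age" layout separated by spaces; `parse_people` is a hypothetical helper name, and the struct is repeated here (with room for "female") so the block stands alone:

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r(); assumes a POSIX system */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct person {
    char name[25];
    char sex[7];    /* "female" plus the terminating '\0' */
    char age[4];
};

/*
 * Parse records such as "Fred male 25,John male 62" into `out`.
 * Returns the number of complete records stored (at most `max`).
 */
size_t parse_people(char *text, struct person *out, size_t max)
{
    size_t n = 0;
    char *outer_ptr = NULL;
    char *record = strtok_r(text, ",", &outer_ptr);

    while (record != NULL && n < max) {
        char *inner_ptr = NULL;
        char *name = strtok_r(record, " ", &inner_ptr);
        char *sex  = strtok_r(NULL, " ", &inner_ptr);
        char *age  = strtok_r(NULL, " ", &inner_ptr);

        if (name != NULL && sex != NULL && age != NULL) {
            /* snprintf truncates instead of overflowing the fixed arrays */
            snprintf(out[n].name, sizeof out[n].name, "%s", name);
            snprintf(out[n].sex,  sizeof out[n].sex,  "%s", sex);
            snprintf(out[n].age,  sizeof out[n].age,  "%s", age);
            n++;
        }
        record = strtok_r(NULL, ",", &outer_ptr);
    }
    return n;
}
```

Records with fewer than three fields are skipped rather than stored half filled, which is a design choice of this sketch and not something the original program does.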
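Back to the strtok() source: it turns out that the function modifies the original string.
So when char *test2 = "feng,ke,wei" is passed as the first argument, the write at position ① fails with a memory error, because test2 points to a string literal, which lives in a constant area that must not be modified (typically read-only memory). In contrast, char test1[] = "feng,ke,wei" makes test1 an array that holds its own copy of the characters on the stack, so it can be modified.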
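When only a pointer to a string literal (or any const string) is available, one common workaround is to tokenize a writable copy instead. A minimal sketch, where `count_tokens_copy` is a hypothetical helper name:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Count the tokens in `text` without modifying it: the tokenizing is done
 * on a heap copy. Returns -1 if the copy cannot be allocated.
 */
int count_tokens_copy(const char *text, const char *delims)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    int count = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);            /* writable duplicate of the input */

    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Passing the test2 pointer from the first example to this helper is safe, because strtok() only ever writes into the copy.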
By now you should have a more concrete picture of the string-literal constant area.
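One more detail from the strtok() source above is worth isolating: the delimiter set is stored as a 32-byte bitmap, one bit per possible character value. The sketch below reproduces that lookup on its own; the `charset` type and its two functions are hypothetical names introduced here, not part of any library:

```c
#include <stddef.h>

/* 256 bits, one per possible unsigned char value */
struct charset {
    unsigned char map[32];
};

void charset_init(struct charset *set, const char *chars)
{
    const unsigned char *c = (const unsigned char *)chars;

    for (size_t i = 0; i < sizeof set->map; i++)
        set->map[i] = 0;
    /* the do/while also records the terminating '\0', as strtok() does */
    do {
        set->map[*c >> 3] |= (unsigned char)(1u << (*c & 7));
    } while (*c++);
}

int charset_contains(const struct charset *set, unsigned char ch)
{
    /* byte index is ch / 8, bit index is ch % 8 */
    return (set->map[ch >> 3] >> (ch & 7)) & 1;
}
```

Building the bitmap costs one pass over the delimiter string, after which each membership test is a shift and a mask regardless of how many delimiters there are.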