Understanding C/C++ Strict Aliasing
or - Why won't the #$@##@^% compiler let me do what I need to do!
What's The Problem?
There's a lot of confusion about strict aliasing rules. The main source of the confusion is that two different audiences talk about aliasing: developers who use compilers, and compiler writers. In this document I'm going to try to clear it all up for you. The things that I'm going to cover are based on the aliasing rules in C89/90 (6.3), C99 (6.5/7), C++98 (3.10/15), and C++11 (3.10/10). To find the aliasing rules in any current version of the C or C++ standards, search for "may not be aliased", which will find a footnote that refers back up to the section on allowable forms of aliasing. For information about what was on the minds of the creators of the spec, see C89 Rationale Section 3.3 Expressions, where they talk about why and how the aliasing rules came about.
Developers get interested in aliasing when a compiler gives them a warning about type punning and strict aliasing rules and they try to understand what the warnings mean. They Google for the warning message, they find references to the section on aliasing in one of the C or C++ specs and think, "Yes, that's what I'm trying to do, alias." Then they study that section of the appropriate spec like they're studying arcane runes and try to divine the rules that will let them do the things that they're trying to do. They think that the aliasing rules are written to tell them how to do type punning. They couldn't be more wrong.
The compiler writers know what the strict aliasing rules are for. They are written to let compiler writers know when they can safely assume that a change made through one variable won't affect the value of another variable, and conversely when they have to assume that two variables might actually refer to the same spot in memory.
So this document is divided into two parts. First I'll talk about what strict aliasing is and why it exists, and then I'll talk about how to do the kinds of things developers need to do in ways that won't come in conflict with those rules.
Part the first. What is aliasing exactly?
Aliasing is when more than one lvalue refers to the same memory location. (An lvalue is an expression that designates an object; loosely, think of things that can appear on the left-hand side of an assignment.) As an example:
    int anint;
    int *intptr = &anint;
If you change the value of *intptr, the value referenced by anint also changes because *intptr aliases anint; it's just another name for the same thing. Another example is:
    int anint;
    void foo(int &i1, int &i2);
    foo(anint, anint);
Within the body of foo, since we used anint for both arguments, the two references i1 and i2 alias, i.e. they refer to the same location when foo is called this way.
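Here's a minimal sketch of why the reference case matters in practice. The function name add_twice is made up for illustration; the point is only that the result depends on whether the two references alias.

```cpp
// Hypothetical helper: adds i2 to i1 twice.
// If i1 and i2 name the same int, the first addition also changes i2,
// so the second addition sees the updated value.
int add_twice(int &i1, const int &i2) {
    i1 += i2;
    i1 += i2;
    return i1;
}
```

Called with two distinct ints that both hold 1, this yields 3. Called as add_twice(x, x) with x equal to 1, it yields 4, because the first addition doubles the value that the second one reads.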
What's the problem?
Examine the following code:
    int anint;
    void bar(int);
    void foo(double *dblptr) {
        anint = 1;
        *dblptr = 3.14159;
        bar(anint);
    }
Looking at this, it looks safe to assume that the argument to bar() is a constant 1. In the bad old days compiler writers had to make worst-case aliasing assumptions, to support lots of crazy wild west legacy code, and could not say that it was safe to assume the argument to bar was 1. They had to insert code to reload the value of anint for the call, because the intervening assignment through dblptr could have changed the value of anint if dblptr pointed to it. It's possible that the call to foo was foo((double *) &anint).
That's the problem that strict aliasing is intended to fix. There was low hanging fruit for compiler optimizer writers to pick and they wanted programmers to follow the aliasing rules so that they could pluck those fruit. Aliasing, and the problems it leads to, have been there as long as C has existed. The difference lately is that compiler writers are being strict about the rules and enforcing them when optimization is in effect.
In their respective standards, C and C++ include lists of the things that can legitimately alias (see the next section), and in all other cases, compiler writers are allowed to assume no interactions between lvalues. Anything not on the list can be assumed to not alias, and compiler writers are free to do optimizations that make that assumption. For anything on the list, aliasing could possibly occur and compiler writers have to assume that it does. When compiler writers follow these lists, and assume that your code follows the rules, it's called strict-aliasing. Under strict aliasing, the compiler writer is free to optimize the function foo above because incompatible types, double and int, can't alias. That means that if you do call foo as foo((double *)&anint) something will quickly go wrong, but you get what you deserve.
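To make the two cases concrete, here's a small sketch. The function names are mine, not from any standard or library. The first function takes two pointers of the same type, which are on the list of things that may alias; the second mixes int and double, which are not.

```cpp
// Both parameters are int*, so they are allowed to alias.
// The compiler has to reload *a after the store through b.
int store_and_read(int *a, int *b) {
    *a = 1;
    *b = 2;
    return *a;   // 2 if a == b, otherwise 1
}

// int and double are not on the list of types that may alias, so under
// strict aliasing an optimizer may return the constant 1 without reloading.
// Calling this with both pointers aimed at the same storage is undefined
// behavior, so the only valid calls pass distinct objects.
int store_and_read_mixed(int *a, double *d) {
    *a = 1;
    *d = 3.14159;
    return *a;
}
```

With store_and_read(&x, &x) you get 2, and the compiler is obliged to produce that. With store_and_read_mixed the compiler owes you nothing if the pointers overlap.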
So what can alias?
From ISO/IEC 9899:201x (C11) 6.5 Expressions:
7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
These can be summarized as follows:
- Things that are compatible types or differ only by the addition of any combination of signed, unsigned, const, or volatile. For most purposes compatible type just means the same type. If you want more details you can read the specs. (Example: If you get a pointer to long, and a pointer to const unsigned long, they could point to the same thing.)
- An aggregate (struct or class) or union type can alias types contained inside them. (Example: If a function gets passed a pointer to an int, and a pointer to a struct or union containing an int, or possibly containing another struct or union containing an int, or containing...ad infinitum, it's possible that the int* points to an int contained inside the struct or union pointed at by the other pointer.)
- A character type. A char*, signed char*, or unsigned char* is specifically allowed by the specs to point to anything. That means it can alias anything in memory.
- For C++ only, a possibly CV (const and/or volatile) qualified base class type of a dynamic type can alias the child type. (Example: if class dog has class animal for a base class, pointers or references to class dog and class animal can alias.)
Of course references have all these same issues and pointers and references can alias. Any lvalue has to be assumed to possibly alias to another lvalue if these rules say that they can alias. An aliasing issue is just as likely to come up with values passed by reference as it is with values passed as pointer to values. Additionally any combination of pointers and references have a possibility of aliasing, and you'd have to consult the aliasing rules to see if it might happen.
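Two of the items in that summary are easy to show in a few lines. This is only a sketch; Wrapper, bump_member, and count_nonzero_bytes are names I made up for the example.

```cpp
#include <cstddef>

// Hypothetical aggregate that contains an int.
struct Wrapper {
    int value;
};

// An int* may point at the int inside a Wrapper, so the store through p
// can change w->value and the compiler has to reload it.
int bump_member(Wrapper *w, int *p) {
    w->value = 1;
    *p = 5;
    return w->value;   // 5 if p == &w->value, otherwise 1
}

// unsigned char* may alias anything, so it is fine to inspect the bytes
// of any object through it.
std::size_t count_nonzero_bytes(const void *obj, std::size_t size) {
    const unsigned char *bytes = static_cast<const unsigned char *>(obj);
    std::size_t n = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] != 0) {
            ++n;
        }
    }
    return n;
}
```

Both functions stay inside the rules: the first because an aggregate can alias a member of its own type, the second because character types can alias anything.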
Part the second. How to do something the compiler doesn't like.
The following program swaps the halves of a 32 bit integer, and is typical of code you might use to handle data passed between a little-endian and big-endian machine. It also generates 6 warnings about breaking strict-aliasing rules. Many would dismiss them. The correct output of the program is:
    00000020
    00200000
but when optimization is turned on it's:
    00000020
    00000020
THAT's what the warning is trying to tell you, that the optimizer is going to do things that you don't like. Don't think this means that the optimizer broke your code. It's already broken. The optimizer just pointed it out for you.
Broken Version
    #include <cstdint>
    #include <iomanip>
    #include <iostream>
    using namespace std;
    uint32_t swaphalves(uint32_t a) {
        uint32_t acopy = a;
        uint16_t *ptr = (uint16_t *)&acopy; // can't use static_cast<>, not legal.
                                            // you should be warned by that.
        uint16_t tmp = ptr[0];
        ptr[0] = ptr[1];
        ptr[1] = tmp;
        return acopy;
    }
    int main(void) {
        uint32_t a;
        a = 32;
        cout << hex << setfill('0') << setw(8) << a << endl;
        a = swaphalves(a);
        cout << setw(8) << a << endl;
    }
So what goes wrong? Since a uint16_t can't alias a uint32_t under the rules, the compiler ignores the accesses through ptr when considering what to do with acopy. Since it sees that nothing is done with acopy inside the swaphalves function, it just returns the original value of a. Here's the (annotated) x86 assembler generated by gcc 4.4.1 for swaphalves, let's see what went wrong:
    _Z10swaphalvesj:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $16, %esp
        movl    8(%ebp), %eax       # get a in %eax
        movl    %eax, -8(%ebp)      # and store it in acopy
        leal    -8(%ebp), %eax      # now get eax pointing at acopy (ptr=&acopy)
        movl    %eax, -12(%ebp)     # save that ptr at -12(%ebp)
        movl    -12(%ebp), %eax     # get the ptr back in %eax
        movzwl  (%eax), %eax        # get 16 bits from ptr[0] in eax
        movw    %ax, -2(%ebp)       # store the 16 bits into tmp
        movl    -12(%ebp), %eax     # get the ptr back in eax
        addl    $2, %eax            # bump up by two to get to ptr[1]
        movzwl  (%eax), %edx        # get that 16 bits into %edx
        movl    -12(%ebp), %eax     # get ptr into eax
        movw    %dx, (%eax)         # store the 16 bits into ptr[0]
        movl    -12(%ebp), %eax     # get the ptr again
        leal    2(%eax), %edx       # get the address of ptr[1] into edx
        movzwl  -2(%ebp), %eax      # get tmp into eax
        movw    %ax, (%edx)         # store into ptr[1]
        movl    -8(%ebp), %eax      # forget all that, return original a.
        leave
        ret
Scary, isn't it? Of course, if you are using gcc, you could use -fno-strict-aliasing to get the output you expect, but the generated code won't be as good, and you're just treating the symptom instead of curing the problem. A better way to accomplish the same thing without the warnings or the incorrect output is to define swaphalves with a union, as in the union version below. N.B. this is supported in C99 and later C specs, as noted in this footnote to 6.5.2.3 Structure and union members:
85. If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
but your mileage may vary in C++. All C++ compilers that I know of support it, but the C++ spec doesn't allow it, so it would be risky to count on it. After the union version I'll show another solution with memcpy that may be (but probably isn't) slightly less efficient, and is supported by both C and C++.
Another Broken version, referencing a twice.
    uint32_t swaphalves(uint32_t a) {
        a = (a >>= 16) | (a <<= 16);
        return a;
    }
This version looks reasonable, but you don't know if the right and left sides of the | will each get the original version of a or if one of them will get the result of the other. There's no sequence point between the two compound assignments, so a is modified more than once without any ordering, which is undefined behavior. You may get different results from the same compiler using different levels of optimization.
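For comparison, here's a sketch of the same idea with the problem removed: both shifts only read a and nothing is modified in place, so the order of evaluation no longer matters. I've named it swaphalves_shift to keep it apart from the other versions; it isn't from the original discussion.

```cpp
#include <cstdint>

// Swap the upper and lower 16 bits using plain shifts.
// Neither operand of | modifies a, so there is nothing left unsequenced.
std::uint32_t swaphalves_shift(std::uint32_t a) {
    return (a >> 16) | (a << 16);
}
```

This involves no type punning at all, so strict aliasing never comes into it.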
Union version. Fixed for C but not guaranteed portable to C++.
    uint32_t swaphalves(uint32_t a) {
        typedef union {
            uint32_t as32bit;
            uint16_t as16bit[2];
        } swapem;
        swapem s = {a};
        uint16_t tmp;
        tmp = s.as16bit[0];
        s.as16bit[0] = s.as16bit[1];
        s.as16bit[1] = tmp;
        return s.as32bit;
    }
The C++ compiler knows that members of a union fill the same memory, and this helps the compiler generate MUCH better code:
    _Z10swaphalvesj:
        pushl   %ebp            # save the original value of ebp
        movl    %esp, %ebp      # point ebp at the stack frame
        movl    8(%ebp), %eax   # get a in eax
        popl    %ebp            # get the original ebp value back
        roll    $16, %eax       # swap the two halves of a and return it
        ret
So do it wrong, via strange casts and get incorrect code, or by turning off strict-aliasing get inefficient code, or do it right and get efficient code.
You can also accomplish the same thing by using memcpy with char* to move the data around for the swap, and it will probably be as efficient. Wait, you ask me, how can that be? There will be at least two calls to memcpy added to the mix! Well, gcc and other modern compilers have smart optimizers and will, in many cases (including this one), elide the calls to memcpy. That makes it the most portable, and as efficient as any other method. Here's how it would look:
memcpy version, compliant to C and C++ specs and efficient
    #include <stdint.h>
    #include <string.h>
    uint32_t swaphalves(uint32_t a) {
        uint16_t as16bit[2], tmp;
        memcpy(as16bit, &a, sizeof(a));
        tmp = as16bit[0];
        as16bit[0] = as16bit[1];
        as16bit[1] = tmp;
        memcpy(&a, as16bit, sizeof(a));
        return a;
    }
For the above code, a C compiler will generate code similar to the previous solution, but with the addition of two calls to memcpy (possibly optimized out). gcc generates code identical to the previous solution. You can imagine other variants that substitute reading and writing through a char pointer locally for the calls to memcpy.
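One such char pointer variant might look like the sketch below. The name swaphalves_bytes is mine, and I haven't compared the code it generates against the other versions, so treat it as an illustration of the rule rather than a recommendation.

```cpp
#include <cstddef>
#include <cstdint>

// Swap the two 16-bit halves by moving bytes through an unsigned char*,
// which is allowed to alias any object.
std::uint32_t swaphalves_bytes(std::uint32_t a) {
    unsigned char *p = reinterpret_cast<unsigned char *>(&a);
    const std::size_t half = sizeof(a) / 2;
    for (std::size_t i = 0; i < half; ++i) {
        unsigned char tmp = p[i];
        p[i] = p[i + half];
        p[i + half] = tmp;
    }
    return a;
}
```

Because it exchanges the first two bytes with the last two, the result is the same on little-endian and big-endian machines.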
Similar issues arise in networking code where you don't know what type of packet you have until you examine it. Unions and/or memcpy are your friends here as well.
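As a rough sketch of the networking case, assume a made-up wire format: one byte of packet type followed by a 4-byte length in host byte order. PacketHeader and read_header are hypothetical names, and a real protocol would also need byte order conversion. The point is that the length is copied out with memcpy instead of casting the buffer to uint32_t*.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header layout, for illustration only.
struct PacketHeader {
    std::uint8_t type;
    std::uint32_t length;
};

// Parse a header out of a raw byte buffer without type punning.
// Returns false if the buffer is too short to hold a header.
bool read_header(const unsigned char *buf, std::size_t size, PacketHeader &out) {
    if (size < 1 + sizeof(out.length)) {
        return false;
    }
    out.type = buf[0];
    // memcpy copies the bytes; no uint32_t lvalue ever refers to buf.
    std::memcpy(&out.length, buf + 1, sizeof(out.length));
    return true;
}
```

A side benefit is that memcpy doesn't care that the length field sits at an odd offset in the buffer.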
The restrict keyword
In C99 and later C Standards, but not in any C++ Standard, you can promise the compiler that a pointer to something is not aliased by using the restrict qualifier keyword. In a situation where the compiler would have to expect that things could alias, you can tell the compiler that you promise it will not be so. So in this:
    void foo(int * restrict i1, int * restrict i2);
you're telling the compiler that you promise that i1 and i2 will never point at the same memory. You have to know well the implementation of foo and only pass into it things that will keep the promise that things accessed through i1 and i2 will never alias. The compiler believes you and may be able to do a better job of optimization. If you break the promise your mileage may vary (and by that I mean that you will almost certainly cry).
Current C++ Standards specify that when C libraries are used from C++ the restrict qualifier shall be omitted. restrict is not a keyword for C++ and is not part of the C++ Standard in any version. Nonetheless, as pointed out by Ian Mallett, many compilers, as a non-standard extension, allow the use of __restrict__ or __restrict as qualifiers. Since restrict is not a keyword of C++ you can't use it directly, but in g++, clang, and MSVC you can do something like #define restrict __restrict and accomplish the same thing. In spite of this, they are still required by the C++ Standard to omit the qualifier from linked C libraries. Use at your own risk ;)
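Here's a small sketch of the extension in use. It assumes a compiler that accepts __restrict (g++, clang, and MSVC do); it is not standard C++. The function scale_into is a made-up example.

```cpp
// __restrict is a non-standard compiler extension, not ISO C++.
// The caller promises that dst and src never overlap, which lets the
// optimizer keep src values in registers or vectorize the loop.
void scale_into(float *__restrict dst, const float *__restrict src,
                int n, float k) {
    for (int i = 0; i < n; ++i) {
        dst[i] = src[i] * k;
    }
}
```

Call it only with buffers that don't overlap. Passing the same array for both arguments breaks the promise, and the result is undefined.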
Let me know if it can be better
If you have comments, corrections, suggestions for improvement, or examples, feel free to email me.
Thanks,
Patrick Horgan
patrick at dbp-consulting dot com
Kudos and Thanks
Particular thanks go to people who participated in the discussion of this document on the boost-users and gcc-help mailing lists. In particular I'd like to thank Václav Haisman, Thomas Heller who wrote the memcpy version I use here and pointed out that it will generate exactly the same assembler, and Andrew Haley who pointed out a more portable way to define the union, and also pointed out that gcc will elide the calls to memcpy.
Thanks to Mike Dyckhoff for catching a thinko in an example. Additionally I'd like to thank Gabe Jones for catching a thinko, and Alex Markin who did the Russian translation:) Thanks to Ian Mallett who pointed out the availability of __restrict in many C++ compilers.
Original article: http://dbp-consulting.com/tutorials/StrictAliasing.html