Effective C# Item 11: Prefer foreach Loops
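The C# foreach statement is more than a variation on the do, while, or for loops. It generates the best iteration code for any collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the compiler generates the best code for the actual collection type. When you iterate a collection, use foreach instead of the other looping constructs. Examine these three loops:
int [] foo = new int[100];
// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));
// Loop 2:
for ( int index = 0; index < foo.Length; index++ )
  Console.WriteLine( foo[index].ToString( ));
// Loop 3:
int len = foo.Length;
for ( int index = 0; index < len; index++ )
  Console.WriteLine( foo[index].ToString( ));
For current C# compilers (version 1.1 and later), loop 1 is best. It is also less typing, which raises your personal productivity. (The C# 1.0 compiler produced much slower code for loop 1, so loop 2 is best for that version.) Loop 3, which most C and C++ programmers would consider the most efficient, is the worst. Because it reads Length outside the loop, it keeps the JIT compiler from removing the bounds check inside the loop.
C# code runs in a safe, managed environment. Every memory location, including array indexes, is checked. Taking a few liberties, the code for loop 3 actually looks something like this: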
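// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0; index < len; index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}
The JIT compiler does not appreciate this kind of help. Your attempt to hoist the Length property out of the loop made the compiler do more work and produced slower code. One of the CLR's guarantees is that you cannot write code that accesses memory your variables do not own. Before each element access, the runtime checks the bounds of the actual array (not your len variable). You have turned one bounds check into two.
You still pay for an array index check on every iteration, and you pay for it twice. Loops 1 and 2 are faster because the JIT compiler can verify that the loop bounds are safe. Whenever the loop bound is not the array's length, the bounds check runs on every iteration. (Translator's note: the JIT compiler mentioned here is the one that compiles IL to native code, not the one that compiles C# or another language to IL. Unsafe code that uses pointers bypasses these checks, which can make it run faster.)
The original C# compiler generated very slow code for foreach over arrays because boxing was involved. Boxing is covered in detail in Item 17. Arrays are type safe, and foreach now generates different IL for arrays than for other collections. The array version no longer uses the IEnumerator interface, which is what requires boxing and unboxing: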
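IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}
Instead, the foreach statement generates this construct for arrays:
for ( int index = 0; index < foo.Length; index++ )
  Console.WriteLine( foo[index].ToString( ));
(Translator's note: keep the difference between arrays and collections in mind. An array is contiguous memory allocated once; a collection can grow and change dynamically. ArrayList, for example, is backed by a resizable array. The jagged arrays that C# also supports sit somewhere in between.)
foreach always gives you the best code. You do not need to worry about which looping construct is most efficient: foreach and the compiler take care of it.
If efficiency is not enough for you, consider language interop. Some people in the world (yes, the ones using other programming languages) firmly believe that array indexes start at 1, not 0. No matter how hard we try, we will not break them of the habit. The .NET team tried. So in C# you have to write initialization code like this to get an array that starts at something other than 0: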
// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
new int[ ]{ 5 }, new int[ ]{ 1 });
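This code should be enough to make anyone cringe (translator's note: it does for me, a little). But some people are stubborn; however hard you try, they will start counting at 1. Luckily, this is one of those problems you can foist off on the compiler. Iterate the test array using foreach:
foreach( int j in test )
  Console.WriteLine ( j );
The foreach statement knows how to check the array's upper and lower bounds, so you do not have to. It is just as fast as a for loop, no matter which lower bound someone chose.
foreach gives you the same benefits for multidimensional arrays. Suppose you are creating a chessboard. You would write these two fragments: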
private Square[,] _theBoard = new Square[ 8, 8 ];
// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );
Instead, you can paint the board this simply:
foreach( Square sq in _theBoard )
  sq.PaintSquare( );
(Translator's note: I do not favor this approach. It hides the logical relationship between rows and columns. The loop runs in row-major order; if that is not the order you need, this loop is not a good fit.)
The foreach statement generates the proper code to iterate over all dimensions of the array. If you later create a 3D chessboard, the foreach loop works unchanged, whereas the other loop needs this modification:
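for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );
(Translator's note: this looks like more code, but I think any programmer can see at a glance that it loops over a three-dimensional array, whereas nobody can tell at a glance what the foreach is doing. That is my personal view; depending on how you look at it, this could equally be counted as an advantage of foreach.)
In fact, the foreach loop also works on a multidimensional array whose lower bound differs in each dimension. I do not want to write that kind of code, even as an example. But when someone else writes that kind of collection, foreach can handle it.
foreach also gives you flexibility: if you later find that you need to change the data structure underlying an array, it keeps as much of your code intact as possible. We started this discussion with a simple array: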
int [] foo = new int[100];
Suppose that later you find you need capabilities the array class does not provide. You might simply change the array to an ArrayList:
// Set the initial size:
ArrayList foo = new ArrayList( 100 );
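Any hand-coded for loop is broken:
int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];
The foreach loop, however, compiles to different code depending on the object it operates on and casts each element to the proper type automatically. Nothing needs to change. This is not limited to the standard collection classes either: any collection type can be used with foreach.
If your collection follows the .NET environment's rules, your users can iterate your type with foreach. For the foreach statement to treat a class as a collection type, the class must have one of several properties. A public GetEnumerator() method makes a class a collection. Explicitly implementing the IEnumerable interface makes a class a collection. In both cases the object that GetEnumerator() returns supplies the IEnumerator members, MoveNext() and Current. foreach works with either form.
foreach has one more benefit, concerning resource management. The IEnumerable interface contains one method: GetEnumerator(). On an enumerable type, the foreach statement generates the following code, with some optimizations: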
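IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}
If the compiler can determine for certain whether the enumerator implements IDisposable, it optimizes the code in the finally clause. What matters more for you is that, in every case, foreach generates correct code.
foreach is a very versatile statement. It generates the right code for the upper and lower bounds of arrays, iterates multidimensional arrays, coerces elements to the proper type (using the most efficient construct), and, most important, generates the most efficient looping construct. It is the best way to iterate collections. The code you write with it is more likely to last (translator's note: it will not need many changes when errors turn up) and is simpler to write in the first place. It is a small productivity gain, but it adds up over time.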
=========================
Item 11: Prefer foreach Loops
The C# foreach statement is more than just a variation of the do, while, or for loops. It generates the best iteration code for any collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the C# compiler generates the best code for the particular type of collection. When you iterate collections, use foreach instead of other looping constructs. Examine these three loops:
int [] foo = new int[100];
// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));
// Loop 2:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
// Loop 3:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
For the current and future C# compilers (version 1.1 and up), loop 1 is best. It's even less typing, so your personal productivity goes up. (The C# 1.0 compiler produced much slower code for loop 1, so loop 2 is best in that version.) Loop 3, the construct most C and C++ programmers would view as most efficient, is the worst option. By hoisting the Length variable out of the loop, you make a change that hinders the JIT compiler's chance to remove range checking inside the loop.
C# code runs in a safe, managed environment. Every memory location is checked, including array indexes. Taking a few liberties, the actual code for loop 3 is something like this:
// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}
The JIT C# compiler just doesn't like you trying to help it this way. Your attempt to hoist the Length property access out of the loop just made the JIT compiler do more work to generate even slower code. One of the CLR guarantees is that you cannot write code that overruns the memory that your variables own. The runtime generates a test of the actual array bounds (not your len variable) before accessing each particular array element. You get one bounds check for the price of two.
You still pay to check the array index on every iteration of the loop, and you do so twice. The reason loops 1 and 2 are faster is that the C# compiler and the JIT compiler can verify that the bounds of the loop are guaranteed to be safe. Anytime the loop variable is not the length of the array, the bounds check is performed on each iteration.
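As a minimal sketch of the three loop forms discussed above, the following program wraps each one in a method and sums an array. The class and method names (LoopForms, SumForeach, SumForLength, SumForHoisted) are hypothetical and not from the original text. The sketch only shows that the three forms are interchangeable in behavior; whether a particular JIT removes the bounds check in each form depends on the runtime version and is not something this code measures.

```csharp
using System;

public static class LoopForms
{
    // Loop 1: foreach over the array.
    public static int SumForeach(int[] values)
    {
        int sum = 0;
        foreach (int v in values)
            sum += v;
        return sum;
    }

    // Loop 2: for loop bounded directly by values.Length.
    public static int SumForLength(int[] values)
    {
        int sum = 0;
        for (int index = 0; index < values.Length; index++)
            sum += values[index];
        return sum;
    }

    // Loop 3: for loop bounded by a hoisted local copy of Length.
    public static int SumForHoisted(int[] values)
    {
        int sum = 0;
        int len = values.Length;
        for (int index = 0; index < len; index++)
            sum += values[index];
        return sum;
    }

    public static void Main()
    {
        int[] foo = { 1, 2, 3, 4 };
        Console.WriteLine(SumForeach(foo));
        Console.WriteLine(SumForLength(foo));
        Console.WriteLine(SumForHoisted(foo));
    }
}
```

To compare the forms on your own runtime, time each method over a large array rather than relying on the version-specific behavior described in the text.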
The reason that foreach and arrays generated very slow code in the original C# compiler concerns boxing, which is covered extensively in Item 17. Arrays are type safe. foreach now generates different IL for arrays than other collections. The array version does not use the IEnumerator interface, which would require boxing and unboxing:
IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}
Instead, the foreach statement generates this construct for arrays:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
foreach always generates the best code. You don't need to remember which construct generates the most efficient looping construct: foreach and the compiler will do it for you.
If efficiency isn't enough for you, consider language interop. Some folks in the world (yes, most of them use other programming languages) strongly believe that index variables start at 1, not 0. No matter how much we try, we won't break them of this habit. The .NET team tried. You have to write this kind of initialization in C# to get an array that starts at something other than 0:
// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
  new int[ ]{ 5 }, new int[ ]{ 1 });
This code should be enough to make anybody cringe and just write arrays that start at 0. But some people are stubborn. Try as you might, they will start counting at 1. Luckily, this is one of those problems that you can foist off on the compiler. Iterate the test array using foreach:
foreach( int j in test )
  Console.WriteLine ( j );
The foreach statement knows how to check the upper and lower bounds on the array, so you don't have to, and it's just as fast as a hand-coded for loop, no matter what different lower bound someone decides to use.
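To make the lower-bound point concrete, here is a small sketch that fills the 1-based array and sums it twice: once with foreach, and once with a hand-coded for loop that has to ask for the bounds through GetLowerBound and GetUpperBound. The class and method names are hypothetical, and the sketch compares only the results, not the speed.

```csharp
using System;

public static class LowerBoundDemo
{
    // Builds a one-dimensional int array whose valid indexes are 1..5
    // and stores index * 10 in each slot.
    public static Array MakeOneBased()
    {
        Array test = Array.CreateInstance(typeof(int),
            new int[] { 5 }, new int[] { 1 });
        for (int i = test.GetLowerBound(0); i <= test.GetUpperBound(0); i++)
            test.SetValue(i * 10, i);
        return test;
    }

    // foreach needs no knowledge of where the indexes start.
    public static int SumWithForeach(Array values)
    {
        int sum = 0;
        foreach (int j in values)
            sum += j;
        return sum;
    }

    // The hand-coded loop must query both bounds explicitly.
    public static int SumWithFor(Array values)
    {
        int sum = 0;
        for (int i = values.GetLowerBound(0); i <= values.GetUpperBound(0); i++)
            sum += (int)values.GetValue(i);
        return sum;
    }

    public static void Main()
    {
        Array test = MakeOneBased();
        Console.WriteLine(SumWithForeach(test)); // 150
        Console.WriteLine(SumWithFor(test));     // 150
    }
}
```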
foreach adds other language benefits for you. The loop variable is read-only: You can't replace the objects in a collection using foreach. Also, there is explicit casting to the correct type. If the collection contains the wrong type of objects, the iteration throws an exception.
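The two properties just described can be sketched as follows. TrySum is a hypothetical helper, not part of the original text: it iterates an ArrayList with an int loop variable and reports whether every element could be cast. The commented-out assignment marks the line the compiler would reject because the loop variable is read-only.

```csharp
using System;
using System.Collections;

public static class CastDemo
{
    // Sums the list with foreach. Returns false if some element is not an int,
    // in which case sum holds only the elements read before the failure.
    public static bool TrySum(ArrayList items, out int sum)
    {
        sum = 0;
        try
        {
            // The compiler inserts an explicit (int) cast for each element.
            foreach (int value in items)
            {
                // value = 0;  // would not compile: the loop variable is read-only
                sum += value;
            }
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        int sum;
        Console.WriteLine(TrySum(new ArrayList { 1, 2, 3 }, out sum));   // True
        Console.WriteLine(TrySum(new ArrayList { 1, "two" }, out sum));  // False
    }
}
```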
foreach gives you similar benefits for multidimensional arrays. Suppose that you are creating a chess board. You would write these two fragments:
private Square[,] _theBoard = new Square[ 8, 8 ];
// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );
Instead, you can simplify painting the board this way:
foreach( Square sq in _theBoard )
  sq.PaintSquare( );
The foreach statement generates the proper code to iterate across all dimensions in the array. If you make a 3D chessboard in the future, the foreach loop just works. The other loop needs modification:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );
In fact, the foreach loop would work on a multidimensional array that had different lower bounds in each direction. I don't want to write that kind of code, even as an example. But when someone else codes that kind of collection, foreach can handle it.
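One detail worth seeing in code is the order in which foreach visits a multidimensional array: the rightmost index varies fastest, so a two-dimensional array is walked row by row. The sketch below uses int cells in place of the Square type, purely as an assumption to keep it self-contained; BoardDemo and Flatten are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class BoardDemo
{
    // Collects the elements in the order foreach visits them.
    public static List<int> Flatten(int[,] board)
    {
        var visited = new List<int>();
        foreach (int cell in board)
            visited.Add(cell);
        return visited;
    }

    public static void Main()
    {
        int[,] board =
        {
            { 1, 2, 3 },
            { 4, 5, 6 }
        };
        Console.WriteLine(string.Join(",", Flatten(board))); // 1,2,3,4,5,6
    }
}
```

If the work depends on the row and column of each element, or needs a different traversal order, the nested for loops remain the clearer choice.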
foreach also gives you the flexibility to keep much of the code intact if you find later that you need to change the underlying data structure from an array. We started this discussion with a simple array:
int [] foo = new int[100];
Suppose that, at some later point, you realize that you need capabilities that are not easily handled by the array class. You can simply change the array to an ArrayList:
// Set the initial size:
ArrayList foo = new ArrayList( 100 );
Any hand-coded for loops are broken:
int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];
However, the foreach loop compiles to different code that automatically casts each element to the proper type. No changes are needed. It's not just changing to standard collection classes, either: any collection type can be used with foreach.
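A short sketch of that claim: the two methods below differ only in the declared type of foo, and the foreach loop inside them is character-for-character the same. SumDemo, SumArray, and SumList are hypothetical names introduced for the illustration.

```csharp
using System;
using System.Collections;

public static class SumDemo
{
    // Original data structure: a plain int array.
    public static int SumArray(int[] foo)
    {
        int sum = 0;
        foreach (int value in foo)
            sum += value;
        return sum;
    }

    // After the change to ArrayList, the loop text is unchanged;
    // the compiler now casts each object element to int.
    public static int SumList(ArrayList foo)
    {
        int sum = 0;
        foreach (int value in foo)
            sum += value;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumArray(new int[] { 1, 2, 3 }));    // 6
        Console.WriteLine(SumList(new ArrayList { 1, 2, 3 })); // 6
    }
}
```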
Users of your types can use foreach to iterate across members if you support the .NET environment's rules for a collection. For the foreach statement to consider it a collection type, a class must have one of a number of properties. The presence of a public GetEnumerator() method makes a collection class. Explicitly implementing the IEnumerable interface creates a collection type. In both cases, the object that GetEnumerator() returns supplies the IEnumerator members, MoveNext() and Current. foreach works with either form.
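As a sketch of the first rule, the hypothetical Countdown class below implements no interface at all. It only exposes a public GetEnumerator() whose return type has a MoveNext() method and a Current property, which is enough for foreach to compile against it. All names here are invented for the example.

```csharp
using System;

public class Countdown
{
    private readonly int _start;

    public Countdown(int start) { _start = start; }

    // A public GetEnumerator() is what foreach looks for; no interface is declared.
    public Enumerator GetEnumerator() { return new Enumerator(_start); }

    public class Enumerator
    {
        private int _next;
        private int _current;

        public Enumerator(int start) { _next = start; }

        public int Current { get { return _current; } }

        // Yields start, start - 1, ..., 1 and then stops.
        public bool MoveNext()
        {
            if (_next <= 0) return false;
            _current = _next--;
            return true;
        }
    }
}

public static class CountdownDemo
{
    public static int Sum(Countdown countdown)
    {
        int sum = 0;
        foreach (int n in countdown)
            sum += n;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new Countdown(3))); // 6
    }
}
```

Implementing IEnumerable as well is usually the friendlier choice, since it lets the type be passed to code that expects the interface.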
foreach has one added benefit regarding resource management. The IEnumerable interface contains one method: GetEnumerator(). The foreach statement on an enumerable type generates the following, with some optimizations:
IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}
The compiler automatically optimizes the code in the finally clause if it can determine for certain whether the enumerator implements IDisposable. But for you, it's more important to see that, no matter what, foreach generates correct code.
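The cleanup behavior can be observed with a small sketch. TrackedSequence is a hypothetical collection whose enumerator implements IDisposable and counts how often Dispose() runs; the loop returns from inside the foreach body to show that disposal still happens on an early exit.

```csharp
using System;
using System.Collections;

public class TrackedSequence : IEnumerable
{
    // Incremented each time an enumerator of this sequence is disposed.
    public int DisposeCount;

    public IEnumerator GetEnumerator() { return new TrackedEnumerator(this); }

    private sealed class TrackedEnumerator : IEnumerator, IDisposable
    {
        private static readonly int[] Data = { 1, 2, 3 };
        private readonly TrackedSequence _owner;
        private int _index = -1;

        public TrackedEnumerator(TrackedSequence owner) { _owner = owner; }

        public object Current { get { return Data[_index]; } }
        public bool MoveNext() { return ++_index < Data.Length; }
        public void Reset() { _index = -1; }
        public void Dispose() { _owner.DisposeCount++; }
    }
}

public static class DisposeDemo
{
    // Returns the first element, leaving the loop before it finishes.
    public static int First(TrackedSequence sequence)
    {
        foreach (int value in sequence)
            return value;
        return -1;
    }

    public static void Main()
    {
        var sequence = new TrackedSequence();
        Console.WriteLine(First(sequence));        // 1
        Console.WriteLine(sequence.DisposeCount);  // 1
    }
}
```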
foreach is a very versatile statement. It generates the right code for upper and lower bounds in arrays, iterates multidimensional arrays, coerces the operands into the proper type (using the most efficient construct), and, on top of that, generates the most efficient looping constructs. It's the best way to iterate collections. With it, you'll create code that is more likely to last, and it's simpler for you to write in the first place. It's a small productivity improvement, but it adds up over time.