Deep Learning (9) TensorFlow Basic Operations V: Broadcasting
- 1. Core Idea
- 2. Concrete Examples
- 3. Understanding
- (1) How to understand?
- (2) Why Broadcasting?
- 4. Conditions for Broadcasting
- 5. Advantages of Broadcasting
- 6. x+tf.random.normal([3])
- 7. tf.broadcast_to()
- 8. Broadcast VS Tile
Broadcasting
- expand
- without copying data
- VS tf.tile
- tf.broadcast_to
In Y = X@W + b, the computation proceeds as [b, 784]@[784, 10]+[10] → [b, 10]+[10] → [b, 10]. In the step [b, 10]+[10] → [b, 10], the [10] tensor must first be expanded to [b, 10] before the addition can produce out. This expansion from [10] → [b, 10] is what we call Broadcasting.
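A minimal runnable sketch of this linear-layer broadcast (the batch size b=4 here is an arbitrary choice for illustration):

```python
import tensorflow as tf

x = tf.random.normal([4, 784])    # [b, 784], with b=4 for illustration
w = tf.random.normal([784, 10])   # [784, 10]
b = tf.zeros([10])                # [10]

# [4, 784] @ [784, 10] -> [4, 10]; then [4, 10] + [10] broadcasts to [4, 10]
out = x @ w + b
print(out.shape)                  # (4, 10)
```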
1. Core Idea
- Insert 1 dim ahead if needed;
  If a and b have different numbers of dims, insert dims. For example, with a.shape=[4, 16, 16, 32] and b.shape=[32], we first align b against a starting from the small dims: call b's [32] the "small dim" and a's [4] the "big dim". After aligning on the small dims, we insert dims of size 1 going toward the big-dim side, so b becomes [1, 1, 1, 32];
- Expand dims with size 1 to the same size;
  Each inserted size-1 dim of b is then expanded to match the corresponding dim of a, i.e., b.shape=[4, 16, 16, 32]. After these two steps, a and b can take part in the usual elementwise math operations.
- Feature maps: [4, 32, 32, 3]
- Bias: [3] → [1, 1, 1, 3] → [4, 32, 32, 3]

Note: Broadcasting does not copy or add any data; it merely adds virtual dims during the computation. It is an optimization of the computation itself, and the virtual dims added at compute time occupy no memory.
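A short sketch of the feature-map bias add described above (the bias values are arbitrary placeholders):

```python
import tensorflow as tf

feature_maps = tf.random.normal([4, 32, 32, 3])
bias = tf.constant([0.1, 0.2, 0.3])   # shape [3]; arbitrary placeholder values

# bias is virtually expanded [3] -> [1, 1, 1, 3] -> [4, 32, 32, 3];
# no copy of its data is made during the add.
out = feature_maps + bias
print(out.shape)                      # (4, 32, 32, 3)
```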
2. Concrete Examples
(1) [4, 3] + [4, 3] → [4, 3] + [4, 3] → [4, 3]
(2) [4, 3] + [1, 3] → [4, 3] + [4, 3] → [4, 3]
(3) [4, 1] + [1, 3] → [4, 3] + [4, 3] → [4, 3]
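These three cases can be verified directly; a quick sketch using tf.ones:

```python
import tensorflow as tf

# (1) same shapes: no expansion needed
print((tf.ones([4, 3]) + tf.ones([4, 3])).shape)  # (4, 3)
# (2) [1, 3] expands to [4, 3]
print((tf.ones([4, 3]) + tf.ones([1, 3])).shape)  # (4, 3)
# (3) both sides expand: [4, 1] -> [4, 3] and [1, 3] -> [4, 3]
print((tf.ones([4, 1]) + tf.ones([1, 3])).shape)  # (4, 3)
```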
3. Understanding
(1) How to understand?
- When it has no axis
- Create a new concept
- [classes, students, scores] + [scores]
- When it has dim of size 1
- Treat it shared by all
- [classes, students, scores] + [students, 1]
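A sketch of the two interpretations above, assuming 4 classes, 32 students, and 8 scores per student (all shape choices are illustrative):

```python
import tensorflow as tf

scores = tf.random.normal([4, 32, 8])     # [classes, students, scores]

# No axis: [8] becomes a per-score offset shared by every class and student.
per_score = tf.random.normal([8])
print((scores + per_score).shape)         # (4, 32, 8)

# Dim of size 1: [32, 1] is a per-student offset shared across all scores
# (and, after a leading dim is inserted, across all classes too).
per_student = tf.random.normal([32, 1])
print((scores + per_student).shape)       # (4, 32, 8)
```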
(2) Why Broadcasting?
- for real demand (practical needs)
  - [classes, students, scores];
  - Add bias for every student: +5 points;
- memory savings
  The computation itself needs no extra memory: if we expanded the dims by hand instead of using Broadcasting, the expanded copy would occupy memory, whereas Broadcasting avoids that cost entirely, as the sketch after this list illustrates.
  - [4, 32, 8] → 1024;
  - bias = [8]: [5.0, 5.0, 5.0, …] → 8;
  Without Broadcasting, 1024 units are needed to store the expanded data; with Broadcasting, only 8 units are needed to store the bias.
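A sketch of this element count, using tf.tile to materialize the expanded tensor for comparison (shapes follow the example above):

```python
import tensorflow as tf

bias = tf.fill([8], 5.0)                             # stores 8 elements
expanded = tf.tile(bias[None, None, :], [4, 32, 1])  # physically copied

print(int(tf.size(bias)))      # 8: all Broadcasting ever needs to store
print(int(tf.size(expanded)))  # 1024: the cost of expanding by hand
```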
4. Conditions for Broadcasting
Broadcastable?
- Match from the last dim! (Always align starting from the small dims!)
- If current dim=1, expand to same;
- If either has no dim, insert one dim and expand to same;
- otherwise, NOT broadcastable.
Note: Always align starting from the small dims, i.e., from the rightmost dim!
(1) Situation 1:
- [4, 32, 14, 14]
- [1, 32, 1, 1] → [4, 32, 14, 14]
(2) Situation 2:
- [4, 32, 14, 14]
- [14, 14] → [1, 1, 14, 14] → [4, 32, 14, 14]
(3) Situation 3:
- [4, 32, 14, 14]
- [2, 32, 14, 14]
The dim sizes do not match (2 vs. 4, and neither is 1), so Broadcasting is not possible!
If [2, 32, 14, 14] is changed to [1, 32, 14, 14], Broadcasting becomes possible.
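The three situations can be checked in code; a sketch in which the incompatible case is caught and reported:

```python
import tensorflow as tf

x = tf.ones([4, 32, 14, 14])

print((x + tf.ones([1, 32, 1, 1])).shape)  # Situation 1: (4, 32, 14, 14)
print((x + tf.ones([14, 14])).shape)       # Situation 2: (4, 32, 14, 14)

try:
    x + tf.ones([2, 32, 14, 14])           # Situation 3: 2 vs 4, neither is 1
except tf.errors.InvalidArgumentError:
    print("[2, 32, 14, 14] is NOT broadcastable against [4, 32, 14, 14]")

print((x + tf.ones([1, 32, 14, 14])).shape)  # fixed shape works: (4, 32, 14, 14)
```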
5. Advantages of Broadcasting
It’s efficient and intuitive!
A tensor of shape [4, 32, 32, 3] can be combined directly with any of:
- [3]
- [32, 32, 1]
- [4, 1, 1, 1]
6. x+tf.random.normal([3])
(1) x+tf.random.normal([3]): add x and a bias of shape [3];
- The system automatically checks whether the Broadcasting conditions are satisfied;
(2) x+tf.random.normal([32, 32, 1]): add x and a bias of shape [32, 32, 1];
(3) x+tf.random.normal([4, 1, 1, 1]): add x and a bias of shape [4, 1, 1, 1];
(4) x+tf.random.normal([1, 4, 1, 1]): the Broadcasting conditions are not satisfied; after right-alignment the dim sizes do not match (4 vs. 32), so an error is raised;
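A sketch of the four cases, assuming x has shape [4, 32, 32, 3] as in the earlier feature-map example:

```python
import tensorflow as tf

x = tf.random.normal([4, 32, 32, 3])

print((x + tf.random.normal([3])).shape)           # (1) -> (4, 32, 32, 3)
print((x + tf.random.normal([32, 32, 1])).shape)   # (2) -> (4, 32, 32, 3)
print((x + tf.random.normal([4, 1, 1, 1])).shape)  # (3) -> (4, 32, 32, 3)

try:
    x + tf.random.normal([1, 4, 1, 1])             # (4) 4 vs 32: mismatch
except tf.errors.InvalidArgumentError:
    print("[1, 4, 1, 1] cannot broadcast against [4, 32, 32, 3]")
```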
7. tf.broadcast_to()
(1) b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [4, 32, 32, 3]): calling tf.broadcast_to() expands [4, 1, 1, 1] to the new shape=[4, 32, 32, 3];
(2) If the dim sizes are incompatible, an error is raised. For example: b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [3, 32, 32, 3]) fails because the first dims of [4, 1, 1, 1] and [3, 32, 32, 3] differ (4 vs. 3, and 4 is not 1), which violates the Broadcasting conditions;
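A sketch of both calls, with the incompatible one caught rather than left to crash:

```python
import tensorflow as tf

b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [4, 32, 32, 3])
print(b.shape)                                     # (4, 32, 32, 3)

try:
    tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [3, 32, 32, 3])
except tf.errors.InvalidArgumentError:
    print("first dims differ (4 vs. 3) and neither is 1: error")
```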
8. Broadcast VS Tile
(1) a1 = tf.broadcast_to(a, [2, 3, 4]): broadcast a from a.shape=[3, 4] → a1.shape=[2, 3, 4];
(2) a2 = tf.expand_dims(a, axis=0): insert a new dim at the first position of a, giving a2.shape=[1, 3, 4];
(3) a2 = tf.tile(a2, [2, 1, 1]): tile() replicates the 1st dim of a2 twice and the 2nd and 3rd dims once each, so after tile() a2.shape=[2, 3, 4].
- a1 and a2 are functionally equivalent, but a2 physically copies the data and occupies more memory, so a1 is more efficient.
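A combined sketch of the two approaches, assuming a starts at shape [3, 4]:

```python
import tensorflow as tf

a = tf.random.normal([3, 4])

a1 = tf.broadcast_to(a, [2, 3, 4])     # broadcast: no explicit data copy

a2 = tf.expand_dims(a, axis=0)         # [1, 3, 4]
a2 = tf.tile(a2, [2, 1, 1])            # [2, 3, 4]: data physically replicated

print(a1.shape, a2.shape)              # (2, 3, 4) (2, 3, 4)
print(bool(tf.reduce_all(a1 == a2)))   # True: identical values
```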