Deep Learning (9) TensorFlow Basic Operations V: Broadcasting
- 1. Core Idea
- 2. Concrete Examples
- 3. Understanding
- (1) How to understand?
- (2) Why Broadcasting?
- 4. Conditions for Broadcasting
- 5. Advantages of Broadcasting
- 6. x+tf.random.normal([3])
- 7. tf.broadcast_to()
- 8. Broadcast VS Tile
Broadcasting
- expand
- without copying data
- VS tf.tile
- tf.broadcast_to
In Y = X@W + b, the computation proceeds as [b, 784]@[784, 10]+[10] → [b, 10]+[10] → [b, 10]. In the step [b, 10]+[10] → [b, 10], the [10] tensor must first be expanded to [b, 10] before the addition can produce out. This expansion from [10] → [b, 10] is what we call Broadcasting.
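A minimal runnable sketch of this linear-layer broadcast (the batch size b=4 here is an arbitrary choice for illustration):

```python
import tensorflow as tf

x = tf.random.normal([4, 784])    # [b, 784], with b=4 for illustration
w = tf.random.normal([784, 10])   # [784, 10]
b = tf.zeros([10])                # [10]

# [4, 784] @ [784, 10] -> [4, 10]; then [4, 10] + [10] broadcasts to [4, 10]
out = x @ w + b
print(out.shape)                  # (4, 10)
```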
1. Core Idea
- Insert 1 dim ahead if needed;
  If a and b have different numbers of dims, insert dims. For example, with a.shape=[4, 16, 16, 32] and b.shape=[32], we first align b against a starting from the small dims: call b's [32] the "small dim" and a's [4] the "big dim". After aligning on the small dims, we insert dims of size 1 going toward the big-dim side, so b becomes [1, 1, 1, 32];
- Expand dims with size 1 to the same size;
  Each inserted size-1 dim of b is then expanded to match the corresponding dim of a, i.e., b.shape=[4, 16, 16, 32]. After these two steps, a and b can take part in the usual elementwise math operations.
- Feature maps: [4, 32, 32, 3]
- Bias: [3] → [1, 1, 1, 3] → [4, 32, 32, 3]

Note: Broadcasting does not copy or add any data; it merely adds virtual dims during the computation. It is an optimization of the computation itself, and the virtual dims added at compute time occupy no memory.
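A short sketch of the feature-map bias add described above (the bias values are arbitrary placeholders):

```python
import tensorflow as tf

feature_maps = tf.random.normal([4, 32, 32, 3])
bias = tf.constant([0.1, 0.2, 0.3])   # shape [3]; arbitrary placeholder values

# bias is virtually expanded [3] -> [1, 1, 1, 3] -> [4, 32, 32, 3];
# no copy of its data is made during the add.
out = feature_maps + bias
print(out.shape)                      # (4, 32, 32, 3)
```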
2. Concrete Examples
(1) [4, 3] + [4, 3] → [4, 3] + [4, 3] → [4, 3]
(2) [4, 3] + [1, 3] → [4, 3] + [4, 3] → [4, 3]
(3) [4, 1] + [1, 3] → [4, 3] + [4, 3] → [4, 3]
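These three cases can be verified directly; a quick sketch using tf.ones:

```python
import tensorflow as tf

# (1) same shapes: no expansion needed
print((tf.ones([4, 3]) + tf.ones([4, 3])).shape)  # (4, 3)
# (2) [1, 3] expands to [4, 3]
print((tf.ones([4, 3]) + tf.ones([1, 3])).shape)  # (4, 3)
# (3) both sides expand: [4, 1] -> [4, 3] and [1, 3] -> [4, 3]
print((tf.ones([4, 1]) + tf.ones([1, 3])).shape)  # (4, 3)
```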
3. Understanding
(1) How to understand?
- When it has no axis
- Create a new concept
- [classes, students, scores] + [scores]
- When it has dim of size 1
- Treat it shared by all
- [classes, students, scores] + [students, 1]
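A sketch of the two interpretations above, assuming 4 classes, 32 students, and 8 scores per student (all shape choices are illustrative):

```python
import tensorflow as tf

scores = tf.random.normal([4, 32, 8])     # [classes, students, scores]

# No axis: [8] becomes a per-score offset shared by every class and student.
per_score = tf.random.normal([8])
print((scores + per_score).shape)         # (4, 32, 8)

# Dim of size 1: [32, 1] is a per-student offset shared across all scores
# (and, after a leading dim is inserted, across all classes too).
per_student = tf.random.normal([32, 1])
print((scores + per_student).shape)       # (4, 32, 8)
```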
(2) Why Broadcasting?
- for real demand (practical needs)
  - [classes, students, scores];
  - Add bias for every student: +5 points;
- memory savings
  The computation itself needs no extra memory: if we expanded the dims by hand instead of using Broadcasting, the expanded copy would occupy memory, whereas Broadcasting avoids that cost entirely, as the sketch after this list illustrates.
  - [4, 32, 8] → 1024;
  - bias = [8]: [5.0, 5.0, 5.0, …] → 8;
  Without Broadcasting, 1024 units are needed to store the expanded data; with Broadcasting, only 8 units are needed to store the bias.
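A sketch of this element count, using tf.tile to materialize the expanded tensor for comparison (shapes follow the example above):

```python
import tensorflow as tf

bias = tf.fill([8], 5.0)                             # stores 8 elements
expanded = tf.tile(bias[None, None, :], [4, 32, 1])  # physically copied

print(int(tf.size(bias)))      # 8: all Broadcasting ever needs to store
print(int(tf.size(expanded)))  # 1024: the cost of expanding by hand
```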
4. Conditions for Broadcasting
Broadcastable?
- Match from the last dim! (Always align starting from the small dims!)
- If current dim=1, expand to same;
- If either has no dim, insert one dim and expand to same;
- otherwise, NOT broadcastable.
Note: Always align starting from the small dims, i.e., from the rightmost dim!
(1) Situation 1:
- [4, 32, 14, 14]
- [1, 32, 1, 1] → [4, 32, 14, 14]
(2) Situation 2:
- [4, 32, 14, 14]
- [14, 14] → [1, 1, 14, 14] → [4, 32, 14, 14]
(3) Situation 3:
- [4, 32, 14, 14]
- [2, 32, 14, 14]
The dim sizes do not match (2 vs. 4, and neither is 1), so Broadcasting is not possible!
If [2, 32, 14, 14] is changed to [1, 32, 14, 14], Broadcasting becomes possible.
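The three situations can be checked in code; a sketch in which the incompatible case is caught and reported:

```python
import tensorflow as tf

x = tf.ones([4, 32, 14, 14])

print((x + tf.ones([1, 32, 1, 1])).shape)  # Situation 1: (4, 32, 14, 14)
print((x + tf.ones([14, 14])).shape)       # Situation 2: (4, 32, 14, 14)

try:
    x + tf.ones([2, 32, 14, 14])           # Situation 3: 2 vs 4, neither is 1
except tf.errors.InvalidArgumentError:
    print("[2, 32, 14, 14] is NOT broadcastable against [4, 32, 14, 14]")

print((x + tf.ones([1, 32, 14, 14])).shape)  # fixed shape works: (4, 32, 14, 14)
```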
5. Advantages of Broadcasting
It’s efficient and intuitive!
A tensor of shape [4, 32, 32, 3] can be combined directly with any of:
- [3]
- [32, 32, 1]
- [4, 1, 1, 1]
6. x+tf.random.normal([3])
(1) x+tf.random.normal([3]): add x and a bias of shape [3];
- The system automatically checks whether the Broadcasting conditions are satisfied;
(2) x+tf.random.normal([32, 32, 1]): add x and a bias of shape [32, 32, 1];
(3) x+tf.random.normal([4, 1, 1, 1]): add x and a bias of shape [4, 1, 1, 1];
(4) x+tf.random.normal([1, 4, 1, 1]): the Broadcasting conditions are not satisfied; after right-alignment the dim sizes do not match (4 vs. 32), so an error is raised;
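A sketch of the four cases, assuming x has shape [4, 32, 32, 3] as in the earlier feature-map example:

```python
import tensorflow as tf

x = tf.random.normal([4, 32, 32, 3])

print((x + tf.random.normal([3])).shape)           # (1) -> (4, 32, 32, 3)
print((x + tf.random.normal([32, 32, 1])).shape)   # (2) -> (4, 32, 32, 3)
print((x + tf.random.normal([4, 1, 1, 1])).shape)  # (3) -> (4, 32, 32, 3)

try:
    x + tf.random.normal([1, 4, 1, 1])             # (4) 4 vs 32: mismatch
except tf.errors.InvalidArgumentError:
    print("[1, 4, 1, 1] cannot broadcast against [4, 32, 32, 3]")
```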
7. tf.broadcast_to()
(1) b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [4, 32, 32, 3]): calling tf.broadcast_to() expands [4, 1, 1, 1] to the new shape=[4, 32, 32, 3];
(2) If the dim sizes are incompatible, an error is raised. For example: b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [3, 32, 32, 3]) fails because the first dims of [4, 1, 1, 1] and [3, 32, 32, 3] differ (4 vs. 3, and 4 is not 1), which violates the Broadcasting conditions;
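A sketch of both calls, with the incompatible one caught rather than left to crash:

```python
import tensorflow as tf

b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [4, 32, 32, 3])
print(b.shape)                                     # (4, 32, 32, 3)

try:
    tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [3, 32, 32, 3])
except tf.errors.InvalidArgumentError:
    print("first dims differ (4 vs. 3) and neither is 1: error")
```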
8. Broadcast VS Tile
(1) a1 = tf.broadcast_to(a, [2, 3, 4]): broadcast a from a.shape=[3, 4] → a1.shape=[2, 3, 4];
(2) a2 = tf.expand_dims(a, axis=0): insert a new dim at the first position of a, giving a2.shape=[1, 3, 4];
(3) a2 = tf.tile(a2, [2, 1, 1]): tile() replicates the 1st dim of a2 twice and the 2nd and 3rd dims once each, so after tile() a2.shape=[2, 3, 4].
- a1 and a2 are functionally equivalent, but a2 physically copies the data and occupies more memory, so a1 is more efficient.
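A combined sketch of the two approaches, assuming a starts at shape [3, 4]:

```python
import tensorflow as tf

a = tf.random.normal([3, 4])

a1 = tf.broadcast_to(a, [2, 3, 4])     # broadcast: no explicit data copy

a2 = tf.expand_dims(a, axis=0)         # [1, 3, 4]
a2 = tf.tile(a2, [2, 1, 1])            # [2, 3, 4]: data physically replicated

print(a1.shape, a2.shape)              # (2, 3, 4) (2, 3, 4)
print(bool(tf.reduce_all(a1 == a2)))   # True: identical values
```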