GPU performance tuning
DDR output bandwidth:
640M * 8 bytes ≈ 5.1 GB/s (the 8-byte width is limited by the DMC/bus width)
read latency: 107 ns
write latency: 43 ns
outstanding:
read: 9 transactions, write: 3 transactions
burst length: 7
transfer size: 3
L2 cache: 64K
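The peak figure above is just clock rate times bus width. A rough second bound comes from Little's law: with a fixed number of outstanding transactions, throughput cannot exceed outstanding × bytes per transaction / latency. The sketch below is only illustrative. The 64 bytes per transaction is an assumption: it reads "burst length: 7" and "transfer size: 3" as AXI-style encoded fields (8 beats of 8 bytes), which the notes do not state.

```python
def peak_bandwidth(freq_hz, bytes_per_transfer):
    """Peak bandwidth in bytes/s: one transfer of the given width per clock."""
    return freq_hz * bytes_per_transfer


def latency_bound_bandwidth(outstanding, bytes_per_transaction, latency_s):
    """Little's law bound: bytes in flight divided by the round-trip latency."""
    return outstanding * bytes_per_transaction / latency_s


# Assumption (not stated in the notes): 8 beats x 8 bytes per transaction.
ASSUMED_BYTES_PER_TRANSACTION = 64

peak = peak_bandwidth(640e6, 8)
read_bound = latency_bound_bandwidth(9, ASSUMED_BYTES_PER_TRANSACTION, 107e-9)
write_bound = latency_bound_bandwidth(3, ASSUMED_BYTES_PER_TRANSACTION, 43e-9)

print(f"peak DDR bandwidth : {peak / 1e9:.2f} GB/s")
print(f"read latency bound : {read_bound / 1e9:.2f} GB/s")
print(f"write latency bound: {write_bound / 1e9:.2f} GB/s")
```

If the real transaction size differs, only ASSUMED_BYTES_PER_TRANSACTION needs to change; the two bounds scale linearly with it.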
target performance: GFXBench
Mali-T820 MP1
Manhattan:
FPS: 2.1 fps
BW (rd+wr): 255.2 MB/frame = 536.134 MB/s
T-Rex:
FPS: 6.5 fps
BW (rd+wr): 166.7 MB/frame = 1.08355 GB/s
Egypt HD:
FPS: 18.5 fps
BW (rd+wr): 56.1 MB/frame = 1.03785 GB/s
Egypt classic:
FPS: 18.5 fps
BW (rd+wr): 20.7 MB/frame = 836.28 MB/s
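Each per-second figure above is the frame rate multiplied by the traffic per frame. The following sketch recomputes that product from the listed values and also derives the frame rate each stated bandwidth implies. The function names are illustrative. Note that the Egypt classic row does not multiply out: 18.5 fps × 20.7 MB/frame is about 383 MB/s, while the stated 836.28 MB/s corresponds to about 40.4 fps; the notes do not say which figure is the misprint.

```python
# (benchmark, fps, MB per frame, stated MB/s), copied from the list above
RESULTS = [
    ("Manhattan", 2.1, 255.2, 536.134),
    ("T-Rex", 6.5, 166.7, 1083.55),
    ("Egypt HD", 18.5, 56.1, 1037.85),
    ("Egypt classic", 18.5, 20.7, 836.28),
]


def bandwidth_mb_s(fps, mb_per_frame):
    """Bandwidth in MB/s from frame rate and per-frame traffic."""
    return fps * mb_per_frame


def implied_fps(stated_mb_s, mb_per_frame):
    """Frame rate that would reproduce the stated bandwidth."""
    return stated_mb_s / mb_per_frame


for name, fps, mb_per_frame, stated in RESULTS:
    computed = bandwidth_mb_s(fps, mb_per_frame)
    print(f"{name:14s} computed={computed:8.2f} MB/s "
          f"stated={stated:8.2f} MB/s "
          f"implied fps={implied_fps(stated, mb_per_frame):.2f}")
```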
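i) Theoretical bandwidth: rbw = wbw = 1.6795 GB/s
   Bandwidth measured by the bus monitor: rbw = 1.80783 GB/s, wbw = 209.408 MB/s
ii) The read-channel bandwidth is close to the theoretical value. The gap is mainly because the real GPU also reads a small number of job descriptors, which the theoretical calculation ignores.
The write-channel bandwidth differs greatly from the theoretical value, mainly because the GPU's TE (Transaction Elimination) optimizes writes. According to busmon_wbw_cnt in the waveform, the total number of bytes written is
about 133.1988 Mbyte, roughly the size of two 16K FBs (4096*4096*4*2).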
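The comparison above can be checked with a few lines of arithmetic. This sketch computes the framebuffer size from the 4096*4096*4*2 expression and the measured-to-theoretical ratios for both channels. It assumes "Mbyte" means 10^6 bytes and 4 bytes per pixel, neither of which the notes spell out.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit pixels, matching the "*4" factor

# Size of two 4096x4096 framebuffers, per the expression in the notes.
fb_bytes = 4096 * 4096 * BYTES_PER_PIXEL * 2

# Byte count read from busmon_wbw_cnt; assumes "Mbyte" = 10^6 bytes.
measured_write_bytes = 133.1988e6
fb_fill_ratio = measured_write_bytes / fb_bytes

# Measured vs. theoretical bandwidth for each channel.
read_ratio = 1.80783 / 1.6795      # both in GB/s
write_ratio = 209.408 / 1679.5     # both in MB/s

print(f"two framebuffers      : {fb_bytes} bytes")
print(f"written / framebuffers: {fb_fill_ratio:.3f}")
print(f"read  measured/theory : {read_ratio:.3f}")
print(f"write measured/theory : {write_ratio:.3f}")
```

The read channel comes out a few percent above the theoretical figure, consistent with the extra job descriptor reads, while the write channel carries only about an eighth of the theoretical traffic.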
garden: fps*1000*0.62
maroon: fps*1000*1.35
GPU write test conditions:
1) GPU OD: 600 MHz, MP1
2) DDR freq: 640M
3) CPU: 1.3 GHz
GPU RD/WR bandwidth:
t = 8.0685 s, BW = 3.1014 GB/s
GPU fillrate:
t = 56.436028 s, fillrate = 567.013375 M pixels/s
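As a sanity check on these two results, the totals moved during each run follow from rate × time, and the fillrate can be compared with the GPU clock. The sketch below is only a rough check; the 1 pixel per clock per core figure is an assumed parameter for illustration, not something the notes state.

```python
def total_from_rate(rate_per_s, seconds):
    """Total amount transferred at a constant rate over the given time."""
    return rate_per_s * seconds


# Values from the test results above.
total_bytes = total_from_rate(3.1014e9, 8.0685)
total_pixels = total_from_rate(567.013375e6, 56.436028)

# Assumption: each shader core can write 1 pixel per clock.
ASSUMED_PIXELS_PER_CLOCK = 1
GPU_CLOCK_HZ = 600e6
CORES = 1  # MP1

fill_efficiency = 567.013375e6 / (GPU_CLOCK_HZ * CORES * ASSUMED_PIXELS_PER_CLOCK)

print(f"bytes moved in bandwidth test: {total_bytes / 1e9:.2f} GB")
print(f"pixels drawn in fillrate test: {total_pixels / 1e9:.2f} G pixels")
print(f"fillrate vs. assumed peak    : {fill_efficiency:.1%}")
```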