『Numpy』Memory Analysis: Advanced Slicing and Interpreting Memory Data

Published: 2024/9/5

In a computer, no data has an inherently fixed type; what a block of memory means depends entirely on how it is interpreted. numpy.ndarray.view reinterprets the same memory region under a different dtype, so the data type can be converted without an extra copy, which saves memory. A view can be thought of as a way of presenting memory. For example:

    import numpy as np

    # np.int and np.float were removed in NumPy 1.24; use explicit 64-bit types
    x = np.arange(10, dtype=np.int64)
    print('An integer array:', x)
    print('A float array:', x.view(np.float64))

Output (the exact float formatting depends on the NumPy version):

    An integer array: [0 1 2 3 4 5 6 7 8 9]
    A float array: [ 0.00000000e+000 4.94065646e-324 9.88131292e-324 1.48219694e-323
     1.97626258e-323 2.47032823e-323 2.96439388e-323 3.45845952e-323
     3.95252517e-323 4.44659081e-323]

In practice we usually pass a more complex dtype to view in order to read values out of memory. A more involved use of view on structured arrays is shown later in this post.
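To make the idea of "the same bytes, read differently" concrete, here is a minimal sketch that reinterprets a small int32 array as raw bytes. The variable names are illustrative and not from the original post.

```python
import numpy as np

x = np.array([1, 256], dtype=np.int32)   # 2 items * 4 bytes = 8 bytes
raw = x.view(np.uint8)                   # the same 8 bytes, read one byte at a time

print(raw.shape)                  # (8,)
print(np.shares_memory(x, raw))   # True: no data was copied

# Writing through the view changes the original array as well.
raw[:] = 0
print(x)                          # [0 0]
```

Because both arrays sit on one buffer, view is cheap, but any write through one name is visible through the other.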

I. view and copy

We start with numpy.reshape(). Its documentation describes the return value as follows:

    Returns
    -------
    reshaped_array : ndarray
        This will be a new view object if possible; otherwise, it will
        be a copy.  Note there is no guarantee of the *memory layout* (C- or
        Fortran- contiguous) of the returned array.

So the return value may be a view or a copy:

1. A view is returned when the new shape can be described by strides over the existing buffer. This always holds when the data region is contiguous.
2. Otherwise, a copy is returned.

This introduces a new concept: whether an array's memory is contiguous. NumPy arrays expose flags['C_CONTIGUOUS'] to report this, and np.may_share_memory checks whether two arrays might share memory (it only compares memory bounds, so True is not a guarantee; np.shares_memory performs the exact check):
    a = np.zeros([2,10], dtype=np.int32)
    b = a.T                    # the transpose breaks C-contiguity
    a.flags['C_CONTIGUOUS']    # True
    b.flags['C_CONTIGUOUS']    # False
    np.may_share_memory(a,b)   # True
    b.base is a                # True
    id(b)==id(a)               # False
    a.shape = 20               # a's shape is changed in place
    a.flags['C_CONTIGUOUS']    # True
    # b.shape = 20             # AttributeError: incompatible shape for a non-contiguous array
    # Assigning to .shape only works when no copy is needed, so it fails for b.
    # reshape() does not modify the original array (it copies if it must), so it is unaffected.
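The view-or-copy rule for reshape can be checked directly. The sketch below uses np.shares_memory, the exact counterpart of np.may_share_memory; the array names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C-contiguous
b = a.T                          # transposed view, not C-contiguous

flat_a = a.reshape(6)   # expressible with strides alone -> view
flat_b = b.reshape(6)   # elements must be reordered -> copy

print(a.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])   # True False
print(np.shares_memory(a, flat_a))   # True
print(np.shares_memory(a, flat_b))   # False
```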

Does slicing an array copy data?

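Note, however, that a slice of an array is not contiguous either, yet reshaping it does not necessarily copy. In the example below the column slice keeps a uniform stride, so reshape returns a view, while flattening the transpose forces a copy: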

    a = np.arange(16).reshape(4,4)
    print(a.T.flags['C_CONTIGUOUS'], a[:,0].flags['C_CONTIGUOUS'])
    # False False
    print(np.may_share_memory(a, a.T.reshape(16)), np.may_share_memory(a, a[:,0].reshape(4)))
    # False True
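Looking at strides helps explain why the column slice can be reshaped without a copy. This is a small sketch with an explicit int64 dtype (an assumption, so that the byte counts are the same on every platform).

```python
import numpy as np

a = np.arange(16, dtype=np.int64).reshape(4, 4)
col = a[:, 0]                      # every 4th element of the buffer

print(a.strides)                   # (32, 8): 32 bytes per row, 8 per column
print(col.strides)                 # (32,): uniform step, but not contiguous
print(col.flags['C_CONTIGUOUS'])   # False

# A uniform stride can be split into new axes without moving data.
print(np.shares_memory(a, col.reshape(2, 2)))   # True

# Flattening the transpose would need two different steps, so NumPy copies.
print(np.shares_memory(a, a.T.reshape(16)))     # False
```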

As the next section shows, however, advanced slicing (indexing with a list or array of indices) does copy the array into newly allocated memory.

II. NumPy structured arrays

np.dtype can be used to build structured arrays, and numpy.ndarray.base returns the object that owns an array's memory. Its documentation reads:

Help on getset descriptor numpy.ndarray.base:

base
    Base object if memory is from some other object.

    Examples
    --------
    The base of an array that owns its memory is None:

    >>> x = np.array([1,2,3,4])
    >>> x.base is None
    True

    Slicing creates a view, whose memory is shared with x:

    >>> y = x[2:]
    >>> y.base is x
    True

1. Creating a structured array

    persontype = np.dtype({'names':['name','age','weight','height'],
                           'formats':['S30','i','f','f']}, align=True)
    a = np.array([('Zhang',32,72.5,167),('Wang',24,65,170)], dtype=persontype)
    a['age'].base

Output:

    array([(b'Zhang', 32, 72.5, 167.),
           (b'Wang', 24, 65. , 170.)],
          dtype={'names':['name','age','weight','height'],
                 'formats':['S30','<i4','<f4','<f4'],
                 'offsets':[0,32,36,40],
                 'itemsize':44,
                 'aligned':True})
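The offsets and itemsize in that output come from align=True. As a rough sketch, the same field list can be built with and without alignment to see where the padding goes (explicit 'i4' and 'f4' are used here instead of 'i' and 'f', an assumption to keep the sizes platform-independent).

```python
import numpy as np

spec = {'names': ['name', 'age', 'weight', 'height'],
        'formats': ['S30', 'i4', 'f4', 'f4']}

packed = np.dtype(spec)                # fields laid out back to back
aligned = np.dtype(spec, align=True)   # fields padded to their natural alignment

print([packed.fields[n][1] for n in packed.names])     # [0, 30, 34, 38]
print(packed.itemsize)                                 # 42
print([aligned.fields[n][1] for n in aligned.names])   # [0, 32, 36, 40]
print(aligned.itemsize)                                # 44
```

The 30-byte string is followed by 2 bytes of padding in the aligned layout so that the 4-byte 'age' field starts on a multiple of 4.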

2. How advanced slicing differs from basic slicing

In this session a is a plain 1-D integer array, array([123, 4, 5, 6, 78]) as Out[28] shows, not the structured array above:

    In [26]: a.base

    In [27]: a[0].base

    In [28]: a[:1].base
    Out[28]: array([123, 4, 5, 6, 78])

    In [29]: a[[0,1]].base

    In [30]: a.base is None
    Out[30]: True

    In [31]: a[0].base is None
    Out[31]: True

    In [32]: a[:1].base is None
    Out[32]: False

    In [33]: a[[0,1]].base is None
    Out[33]: True

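As shown above, advanced slicing allocates new memory and copies the selected data. The selected elements generally do not sit at a fixed step from one another, so this irregular access pattern is inefficient on the original memory layout (logically adjacent elements are not adjacent in memory, whereas a basic slice fixes the start and the step and so behaves like access to adjacent elements, which is efficient). The copy is an ordinary contiguous array.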
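One practical consequence is that writes go through a basic slice but not through an advanced-index result. A minimal sketch, reusing the values of the 1-D array from the session above (the names s and f are illustrative):

```python
import numpy as np

a = np.array([123, 4, 5, 6, 78])

s = a[:2]        # basic slice -> view on a's memory
f = a[[0, 1]]    # index list  -> independent copy

s[0] = -1        # modifies a
f[1] = -2        # modifies only the copy

print(a)                         # [-1  4  5  6 78]
print(np.shares_memory(a, s))    # True
print(np.shares_memory(a, f))    # False
```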

3. Advanced slicing without allocating new memory

Returning to the structured array from subsection 1:

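    print(a['age'].base is a)
    print(a[['age', 'height']].base is None)

The first check prints:

    True

Selecting a single field returns a view on a. The result of the second check depends on the NumPy version: before NumPy 1.16, indexing with a list of field names returned a copy, so base is None; from 1.16 on it returns a view, so the check prints False.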

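By specifying how the memory should be interpreted, we can avoid allocating new memory and read the original buffer as a structured array that contains only the fields selected by the advanced slice: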

    def fields_view(arr, fields):
        # Build a dtype that keeps each selected field at its original offset
        dtype2 = np.dtype({name: arr.dtype.fields[name] for name in fields})
        # print(dtype2)
        # {'names':['age','weight'], 'formats':['<i4','<f4'], 'offsets':[32,36], 'itemsize':40}
        # print([(name, arr.dtype.fields[name]) for name in fields])
        # [('age', (dtype('int32'), 32)), ('weight', (dtype('float32'), 36))]
        # print(arr.strides)
        # (44,)
        # Reuse arr's buffer and strides; only the dtype changes
        return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)

    '''
    ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)

    Parameter  Type                                  Purpose
    shape      tuple of ints                         shape of the array
    dtype      data-type                             type of the array elements
    buffer     object exposing the buffer interface  buffer used to initialize the array
    offset     int                                   offset in bytes of the first array element within buffer
    strides    tuple of ints                         bytes the data pointer advances when the index on each axis grows by 1
    order      'C' or 'F'                            'C': row-major; 'F': column-major
    '''

    v = fields_view(a, ['age', 'weight'])
    print(v.base is a)
    v['age'] += 10
    print('+++'*10)
    print(v)
    print(v.dtype)
    print(v.dtype.fields)
    print('+++'*10)
    print(a)
    print(a.dtype)
    print(a.dtype.fields)

Output:

    True
    ++++++++++++++++++++++++++++++
    [(42, 72.5) (34, 65. )]
    {'names':['age','weight'], 'formats':['<i4','<f4'], 'offsets':[32,36], 'itemsize':40}
    {'age': (dtype('int32'), 32), 'weight': (dtype('float32'), 36)}
    ++++++++++++++++++++++++++++++
    [(b'Zhang', 42, 72.5, 167.) (b'Wang', 34, 65. , 170.)]
    {'names':['name','age','weight','height'], 'formats':['S30','<i4','<f4','<f4'], 'offsets':[0,32,36,40], 'itemsize':44, 'aligned':True}
    {'name': (dtype('S30'), 0), 'age': (dtype('int32'), 32), 'weight': (dtype('float32'), 36), 'height': (dtype('float32'), 40)}

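Note the 'itemsize' entry of the dtype: it is the number of bytes one record (row) occupies. Because the 'offsets' are preserved, the dtype we built describes a sparse layout with a gap before the first field, but no record has a trailing tail: itemsize ends right after the last selected field (36 + 4 = 40), since the empty space only arises from the offsets recorded for the real fields.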
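The itemsize behaviour can be seen without any array at all. The sketch below rebuilds the two-field dtype from the offsets of persontype and compares it with a freshly packed dtype of the same fields; sub and packed are illustrative names, and 'i4'/'f4' stand in for 'i'/'f' as an assumption.

```python
import numpy as np

persontype = np.dtype({'names': ['name', 'age', 'weight', 'height'],
                       'formats': ['S30', 'i4', 'f4', 'f4']}, align=True)

# Same construction as in fields_view: keep each field at its original offset.
sub = np.dtype({name: persontype.fields[name] for name in ['age', 'weight']})

# The same two fields laid out from scratch, with no gaps.
packed = np.dtype([('age', 'i4'), ('weight', 'f4')])

print(sub.fields['age'][1], sub.fields['weight'][1])   # 32 36
print(sub.itemsize)      # 40: 32-byte gap, then 4 + 4 bytes, no tail
print(packed.itemsize)   # 8
```

The view keeps the 32-byte gap because the data never moves; only a copy into the packed layout would remove it.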

The post 『Numpy』Memory Analysis: Interpreting Memory Data with numpy.dtype covers methods for interpreting array memory in more detail.

Reposted from: https://www.cnblogs.com/hellcat/p/8715830.html
