Python Scientific Computing Notes (12): Resampling with pandas resample

Published: 2024/1/23 · python · 33 · 豆豆

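Downsampling: converting high-frequency data to a lower frequency.

Upsampling: converting low-frequency data to a higher frequency.

Main function: resample() (available on both Series and DataFrame; the object needs a datetime-like index).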
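As a quick illustration of the two directions, here is a minimal sketch. The six-point series, the frame, and the column names `a` and `b` are made up for this example and are not from the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: six observations, one per minute
idx = pd.date_range('2017-06-20', periods=6, freq='1min')
s = pd.Series(np.arange(6), index=idx)
df = pd.DataFrame({'a': np.arange(6), 'b': np.arange(6) * 10}, index=idx)

# Downsampling: 1-minute data -> 3-minute bins, aggregated with mean()
down = s.resample('3min').mean()

# Upsampling: 1-minute data -> 30-second grid, new rows filled forward
up = s.resample('30s').ffill()

# A DataFrame exposes the same method; each column is aggregated separately
df_down = df.resample('3min').sum()

print(down)
print(up)
print(df_down)
```

Downsampling always needs an aggregation (here `mean` or `sum`), while upsampling needs a fill rule (here `ffill`).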

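Parameters of the resample method

Parameter	Description

freq	The target frequency, e.g. 'M', '5min', or Second(15). In current pandas this first argument is named rule.
how='mean'	Function name or array function used to produce the aggregated values, e.g. 'mean', 'ohlc', np.max. Other common values: 'first', 'last', 'median', 'max', 'min'. Deprecated: call the aggregation on the result instead, e.g. .resample('5min').mean().
axis=0	The axis to resample along. The default is 0 (the row index); use axis=1 for the columns.
fill_method=None	How to fill when upsampling, e.g. 'ffill' or 'bfill'. Deprecated: call .resample(...).ffill() or .bfill() instead.
closed	When downsampling, which end of each interval is closed: 'right' or 'left'. The default is 'left' for most frequencies and 'right' for period-end frequencies such as 'M', 'A', 'Q', and 'W'.
label	When downsampling, which bin edge is used to label the aggregated value; e.g. whether the 9:30-9:35 bin is labeled 9:30 or 9:35. The default follows the same rule as closed, so a 5-minute bin is labeled 9:30.
loffset=None	Time adjustment applied to the bin labels, e.g. '-1s' or Second(-1) to shift the labels one second earlier. Removed in recent pandas versions; shift the index of the result instead.
limit=None	Maximum number of periods to fill when forward- or backward-filling. Now passed to the fill method, e.g. .ffill(limit=2).
kind=None	Aggregate to periods ('period') or timestamps ('timestamp'). Defaults to the index type of the input.
convention=None	When resampling periods, the convention ('start' or 'end') used to convert a low frequency to a high frequency. The default is 'start' in current pandas.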
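Several of the keyword arguments in the table (`how`, `fill_method`, `loffset`) belong to the older API. The sketch below shows one way to express the same intent with method chaining; the variable names are illustrative, and shifting the index by hand is just one possible substitute for `loffset`.

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(12),
               index=pd.date_range('2017-06-20', periods=12, freq='1min'))

# Old: ts.resample('5min', how='mean')
means = ts.resample('5min').mean()

# Several aggregations at once, passed by name
stats = ts.resample('5min').agg(['min', 'max'])

# Old: ts.resample('5min', how='sum', loffset='-1s')
# Assumed replacement: aggregate first, then shift the labels one second earlier
shifted = ts.resample('5min').sum()
shifted.index = shifted.index - pd.Timedelta(seconds=1)

print(means)
print(stats)
print(shifted)
```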

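Downsampling

Two things need to be decided:

1) Which side of each interval is closed (parameter: closed)

2) How each aggregated bin is labeled, with the start or the end of the interval (parameter: label)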
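The effect of these two choices is easiest to see side by side. The sketch below uses a twelve-point, one-minute series and sums it into 5-minute bins with each combination; the variable names are chosen for this illustration.

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(12),
               index=pd.date_range('2017-06-20', periods=12, freq='1min'))

# Bins [00:00, 00:05), [00:05, 00:10), ... labeled by their left edge
left = ts.resample('5min', closed='left', label='left').sum()

# Bins (23:55, 00:00], (00:00, 00:05], ... labeled by their right edge
right = ts.resample('5min', closed='right', label='right').sum()

# Same right-closed bins, but labeled by their left edge
right_left = ts.resample('5min', closed='right', label='left').sum()

print(left)        # 10, 35, 21
print(right)       # 0, 15, 40, 11
print(right_left)  # same sums as `right`, labels five minutes earlier
```

With closed='right', the 00:00 observation falls into a bin of its own that ends at 00:00, which is why that variant produces four rows instead of three.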

    import numpy as np
    import pandas as pd

    # sample data, one observation per minute
    ts_index = pd.date_range('2017-06-20', periods=12, freq='1min')
    ts = pd.Series(np.arange(12), index=ts_index)
    ts

    2017-06-20 00:00:00     0
    2017-06-20 00:01:00     1
    2017-06-20 00:02:00     2
    2017-06-20 00:03:00     3
    2017-06-20 00:04:00     4
    2017-06-20 00:05:00     5
    2017-06-20 00:06:00     6
    2017-06-20 00:07:00     7
    2017-06-20 00:08:00     8
    2017-06-20 00:09:00     9
    2017-06-20 00:10:00    10
    2017-06-20 00:11:00    11
    Freq: T, dtype: int32

Aggregating into 5-minute bins

    # how='sum' is deprecated (pandas emits a FutureWarning);
    # the current syntax is .resample(...).sum()
    ts.resample('5min').sum()

    2017-06-20 00:00:00    10
    2017-06-20 00:05:00    35
    2017-06-20 00:10:00    21
    Freq: 5T, dtype: int32

    ts.resample('5min', closed='left').sum()

    2017-06-20 00:00:00    10
    2017-06-20 00:05:00    35
    2017-06-20 00:10:00    21
    Freq: 5T, dtype: int32

    ts.resample('5min', closed='left', label='left').sum()

    2017-06-20 00:00:00    10
    2017-06-20 00:05:00    35
    2017-06-20 00:10:00    21
    Freq: 5T, dtype: int32

All three calls return the same result, because closed='left' and label='left' are already the defaults for a '5min' frequency.
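Besides `sum`, the 'ohlc' aggregation named in the parameter table is a common choice for downsampling. A minimal sketch on the same kind of one-minute series (the name `bars` is only illustrative):

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(12),
               index=pd.date_range('2017-06-20', periods=12, freq='1min'))

# Open / high / low / close of each 5-minute bin, returned as a DataFrame
bars = ts.resample('5min').ohlc()

print(bars)
```

Each row holds the first, largest, smallest, and last value in its bin, so the first bin (values 0 to 4) gives open 0, high 4, low 0, close 4.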

Resampling with groupby

This is another way to downsample.

    # 100 days of daily data
    ts1_index = pd.date_range('2017-6-01', periods=100, freq='d')
    ts1 = pd.Series(np.arange(100), index=ts1_index)
    ts1.head()

    2017-06-01    0
    2017-06-02    1
    2017-06-03    2
    2017-06-04    3
    2017-06-05    4
    Freq: D, dtype: int32

    # group by calendar month
    ts1.groupby(lambda x: x.month).mean()

    6    14.5
    7    45.0
    8    76.0
    9    95.5
    dtype: float64

    # group by day of the week (Monday=0 ... Sunday=6);
    # dayofweek is used because weekday is a method, not an attribute, on a single Timestamp
    ts1.groupby(lambda x: x.dayofweek).mean()

    0    49.5
    1    50.5
    2    51.5
    3    49.0
    4    50.0
    5    47.5
    6    48.5
    dtype: float64

    df1 = pd.DataFrame(np.arange(200).reshape(100, 2), index=ts1_index)
    df1.groupby(lambda x: x.dayofweek).mean()

         0    1
    0   99  100
    1  101  102
    2  103  104
    3   98   99
    4  100  101
    5   95   96
    6   97   98

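For a pandas object with a time-series index, a function passed to groupby is applied to the index values, and rows that share the same result are aggregated together.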
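One difference from resample() is worth keeping in mind: grouping by `month` pools every June together regardless of year, whereas resampling keeps one bin per actual calendar month. The sketch below compares the two on 100 days of daily data; `pd.Grouper` is included as a third, equivalent spelling, and the 'MS' (month start) frequency is chosen here only so that the bins are labeled by the first day of each month.

```python
import numpy as np
import pandas as pd

ts1 = pd.Series(np.arange(100),
                index=pd.date_range('2017-06-01', periods=100, freq='D'))

# Keys are the month numbers 6..9; the same month in another year would share a key
by_month = ts1.groupby(ts1.index.month).mean()

# One bin per calendar month, labeled by the month's first day
by_resample = ts1.resample('MS').mean()

# groupby with a time-based Grouper produces the same bins as resample
by_grouper = ts1.groupby(pd.Grouper(freq='MS')).mean()

print(by_month)
print(by_resample)
print(by_grouper)
```

On data that spans a single year, the three results carry the same values and differ only in their index.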

Upsampling

Upsampling involves no aggregation, but the newly created rows have to be filled.

    df2 = pd.DataFrame(np.arange(200).reshape(100, 2), index=ts1_index, columns=['add1', 'add2'])
    df2.head()

                add1  add2
    2017-06-01     0     1
    2017-06-02     2     3
    2017-06-03     4     5
    2017-06-04     6     7
    2017-06-05     8     9

    # fill_method='ffill' is deprecated (pandas emits a FutureWarning);
    # the current syntax is .resample(...).ffill()
    df2.resample('W-THU').ffill()

                add1  add2
    2017-06-01     0     1
    2017-06-08    14    15
    2017-06-15    28    29
    2017-06-22    42    43
    2017-06-29    56    57
    2017-07-06    70    71
    2017-07-13    84    85
    2017-07-20    98    99
    2017-07-27   112   113
    2017-08-03   126   127
    2017-08-10   140   141
    2017-08-17   154   155
    2017-08-24   168   169
    2017-08-31   182   183
    2017-09-07   196   197
    2017-09-14   198   199

Note that 'W-THU' is a lower frequency than the daily input, so strictly speaking this is a frequency conversion rather than upsampling. ffill() takes the value observed on each Thursday; the last label, 2017-09-14, lies past the final observation (2017-09-08) and is therefore filled forward from it.
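For a case where the target frequency really is higher than the source, here is a minimal sketch that goes from weekly to daily data. The two-row frame is invented for this example and simply reuses the column names add1 and add2.

```python
import pandas as pd

# Hypothetical weekly data: two consecutive Thursdays
weekly = pd.DataFrame({'add1': [0, 14], 'add2': [1, 15]},
                      index=pd.date_range('2017-06-01', periods=2, freq='W-THU'))

# No fill rule: the six new days in between are NaN
daily_nan = weekly.resample('D').asfreq()

# Forward fill: each new day repeats the previous Thursday's values
daily_ffill = weekly.resample('D').ffill()

# Forward fill at most two days; the remaining gaps stay NaN
daily_limited = weekly.resample('D').ffill(limit=2)

print(daily_nan)
print(daily_ffill)
print(daily_limited)
```

The `limit` argument is useful when a stale value should not be carried forward indefinitely.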

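Summary

This post covered:

1) Generating dates over a given time span at a given frequency

2) Resampling pandas data that has a time index, including downsampling and upsampling.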
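As a recap of the first point, the sketch below shows three common ways of calling pd.date_range; the specific dates and variable names are only illustrative.

```python
import pandas as pd

# Start plus a number of periods
by_periods = pd.date_range('2017-06-20', periods=12, freq='1min')

# Start and end, both inclusive
by_end = pd.date_range('2017-06-01', '2017-06-08', freq='D')

# Anchored weekly frequency: every Thursday
thursdays = pd.date_range('2017-06-01', periods=3, freq='W-THU')

print(by_periods[-1])
print(len(by_end))
print(thursdays)
```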
