
python3-pandas data structures: Series and DataFrame basics

Published: 2024/9/27 python 27 豆豆


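Using Pandas
The main data structures in Pandas are Series (one-dimensional data) and DataFrame (two-dimensional data). These two structures cover most typical use cases in finance, statistics, social science, engineering, and similar fields.
Data structures
A Series is an object similar to a one-dimensional array. It consists of a set of values (of any NumPy data type) and an associated set of data labels (the index).
A DataFrame is a tabular data structure. It holds an ordered set of columns, and each column can have a different value type (numeric, string, boolean). A DataFrame has both a row index and a column index, and it can be viewed as a dictionary of Series that all share one index.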
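The "dictionary of Series sharing one index" view can be made concrete with a small sketch. The labels and numbers below are illustrative values, not data from the article.

```python
import pandas as pd

# Two Series with the same labels; names and values are illustrative only.
age = pd.Series([10, 12, 13], index=["Google", "Runoob", "Wiki"])
rank = pd.Series([1, 2, 3], index=["Google", "Runoob", "Wiki"])

# A dict of Series becomes a DataFrame: the dict keys turn into column
# labels and the shared Series index turns into the row index.
df = pd.DataFrame({"Age": age, "Rank": rank})
print(df)

# Selecting one column gives back a Series that carries the same row index.
col = df["Age"]
print(type(col).__name__)
print(col.index.equals(df.index))
```

Selecting a column with `df["Age"]` is the same lookup pattern as reading a key from a dictionary, which is why the analogy is a useful mental model.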

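1. Pandas data structures - Series

A Series is a labelled one-dimensional array.
pandas.Series(data, index, dtype, name, copy)

Parameters:

data: the values to store. The source names ndarray; lists, dicts, and scalars are accepted as well.
index: the index labels for the data. If omitted, the labels default to integers starting from 0.
dtype: the data type. By default it is inferred from the data.
name: a name for the Series.
copy: whether to copy the input data. The default is False.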
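As a rough illustration of the parameters listed above, the following sketch passes most of them explicitly. The labels, values, and the name "score" are made up for the example.

```python
import pandas as pd

# Explicit data, index labels, dtype, and name (all values are illustrative).
s = pd.Series(data=[1, 2, 3], index=["x", "y", "z"], dtype="float64", name="score")
print(s)

# A scalar is broadcast to every label when an index is given.
filled = pd.Series(0, index=["x", "y", "z"])
print(filled)
```

Passing `dtype` up front avoids a separate conversion step after construction.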

If no index is specified, the index labels start from 0:

import pandas as pd

t = pd.Series([4, 5, 6])
print(t)
print(type(t))  # <class 'pandas.core.series.Series'>
print(t[1])     # 5
"""
0    4
1    5
2    6
dtype: int64
<class 'pandas.core.series.Series'>
5
"""

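Specify the index labels, change the data type, and filter by a condition:

t2 = pd.Series([2, 4, 6, 8], index=list("abcd"))
print(t2)
print(t2["c"])           # 6
print(t2.astype(float))  # returns a new Series with float64 values
print(t2[t2 > 5])        # keeps only the values greater than 5
"""
a    2
b    4
c    6
d    8
dtype: int64
6
a    2.0
b    4.0
c    6.0
d    8.0
dtype: float64
c    6
d    8
dtype: int64
"""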
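The expression `t2[t2 > 5]` works in two steps: the comparison first builds a boolean Series, and that Series is then used as a mask. A minimal sketch that spells this out, with variable names (`mask`, `middle`, `as_float`) chosen only for illustration:

```python
import pandas as pd

t2 = pd.Series([2, 4, 6, 8], index=list("abcd"))

# The comparison yields a boolean Series aligned on the same index.
mask = t2 > 5
print(mask)

# Indexing with the mask keeps the entries where the mask is True.
selected = t2[mask]

# Conditions are combined with & and |, each wrapped in parentheses.
middle = t2[(t2 > 2) & (t2 < 8)]

# astype returns a new Series; the original keeps its integer dtype.
as_float = t2.astype(float)
print(selected, middle, as_float, sep="\n")
```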

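Create a Series from a key/value object such as a dictionary:

temp_dict = {"name": "wang1", "age": 18, "tel": 10010}
t3 = pd.Series(temp_dict)
print(t3)
print(t3["age"])             # 18, lookup by label
print(t3.iloc[1])            # 18, lookup by position (plain t3[1] is deprecated for a labelled index)
print(t3.iloc[:2])           # first two entries by position
print(t3.iloc[[1, 2]])       # several positions at once
print(t3[["name", "tel"]])   # several labels at once
"""
name    wang1
age        18
tel     10010
dtype: object
18
18
name    wang1
age        18
dtype: object
age       18
tel    10010
dtype: object
name    wang1
tel     10010
dtype: object
"""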
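Label-based and position-based access are easy to mix up on a Series with string labels. The sketch below uses the explicit `.loc` and `.iloc` accessors to keep the two apart; it reuses the article's `t3` data, and the other variable names are illustrative.

```python
import pandas as pd

t3 = pd.Series({"name": "wang1", "age": 18, "tel": 10010})

# .loc looks up by label, .iloc looks up by integer position.
by_label = t3.loc["age"]
by_position = t3.iloc[1]

# A label slice includes both endpoints ...
label_slice = t3.loc["name":"age"]
# ... while a positional slice excludes the stop position, like a Python list.
pos_slice = t3.iloc[0:1]

print(by_label, by_position)
print(label_slice)
print(pos_slice)
```

Using the accessors makes the intent visible at the call site and avoids depending on how plain `[]` resolves an integer key.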

Get the index and the values of a Series:

print(t3.index)         # Index(['name', 'age', 'tel'], dtype='object')
print(type(t3.index))   # <class 'pandas.core.indexes.base.Index'>
print(t3.values)        # ['wang1' 18 10010]
print(type(t3.values))  # <class 'numpy.ndarray'>
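Because a Series keeps its labels and its values in separate objects, membership tests behave like they do on a dictionary: `in` checks the labels. A short sketch, with illustrative variable names:

```python
import pandas as pd

t3 = pd.Series({"name": "wang1", "age": 18, "tel": 10010})

# `in` on a Series tests the index labels, not the stored values.
has_label = "age" in t3
value_as_label = 18 in t3
# To test the stored values, check the underlying array instead.
has_value = 18 in t3.values

# Labels and values can be pulled out as plain Python / NumPy objects.
labels = t3.index.tolist()
arr = t3.to_numpy()
pairs = list(t3.items())

print(has_label, value_as_label, has_value)
print(labels, arr, pairs)
```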

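2. Pandas data structures - DataFrame

As described above, a DataFrame is a tabular data structure with an ordered set of columns, each of which can have a different value type. It has both a row index and a column index, and can be viewed as a dictionary of Series that share one index.

The DataFrame constructor is:
pandas.DataFrame(data, index, columns, dtype, copy)
A DataFrame is two-dimensional: a container of Series.
Parameters:

data: the values to store (ndarray, Series, map, list, dict, and similar types).
index: the index values, also called the row labels.
columns: the column labels. The default is RangeIndex (0, 1, 2, …, n).
dtype: the data type.
copy: whether to copy the input data. The source lists the default as False; in recent pandas versions the default is None, which acts like copy=True for dict input and like copy=False for DataFrame or 2-D ndarray input.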
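The constructor parameters above can be sketched as follows. The row labels, column labels, and values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# data, index, columns, and dtype passed explicitly (illustrative values).
df = pd.DataFrame(
    data=np.arange(6).reshape(2, 3),
    index=["r1", "r2"],
    columns=["a", "b", "c"],
    dtype="float64",
)
print(df)

# With dict input, `columns` selects and orders keys instead of renaming them.
subset = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, columns=["b"])
print(subset)
```

Note the different role of `columns`: for array input it assigns labels, while for dict input it picks which keys become columns.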

import pandas as pd
import numpy as np

t = pd.DataFrame(np.arange(12).reshape(3, 4))
print(t)
"""
   0  1   2   3
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
"""

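A DataFrame object has both a row index and a column index.
The row index labels the rows. It is called index and corresponds to axis 0 (axis=0).
The column index labels the columns. It is called columns and corresponds to axis 1 (axis=1).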
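The axis numbers matter most when an operation has to know which direction to work in. A minimal sketch using `sum` and `drop`, with a 3x4 frame built the same way as in the surrounding examples:

```python
import numpy as np
import pandas as pd

t1 = pd.DataFrame(np.arange(12).reshape(3, 4), index=list("abc"), columns=list("wxyz"))

# axis=0 collapses the rows: one result per column.
col_totals = t1.sum(axis=0)
# axis=1 collapses the columns: one result per row.
row_totals = t1.sum(axis=1)

# drop uses the axis to decide whether the label names a row or a column.
without_row_a = t1.drop("a", axis=0)
without_col_w = t1.drop("w", axis=1)

print(col_totals)
print(row_totals)
print(without_row_a.shape, without_col_w.shape)
```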

2.1. Using index and columns:

t1 = pd.DataFrame(np.arange(12).reshape(3, 4), index=list("abc"), columns=list("wxyz"))
print(t1)
"""
   w  x   y   z
a  0  1   2   3
b  4  5   6   7
c  8  9  10  11
"""

2.2. Creating a DataFrame from a list; missing values are filled with NaN

data = [['Google', 10], ['Runoob', 12], ['Wiki', 13]]
df = pd.DataFrame(data, columns=['Site', 'Age'])
print(df)
"""
     Site  Age
0  Google   10
1  Runoob   12
2    Wiki   13
"""

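2.3. Creating a DataFrame from a dictionary; missing values are filled with NaN

# A dict of lists: each key becomes a column.
data = {'Site': ['Google', 'Runoob', 'Wiki'], 'Age': [10, 12, 13]}
df = pd.DataFrame(data)
print(df)
"""
     Site  Age
0  Google   10
1  Runoob   12
2    Wiki   13
"""

# A list of dicts: each dict becomes a row; keys missing from a row become NaN.
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
"""
   a   b     c
0  1   2   NaN
1  5  10  20.0
"""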
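In the list-of-dicts output, column c shows 20.0 rather than 20: NaN is a floating-point value, so a column that contains it is stored as float64. The sketch below shows a few common ways to inspect and handle the gap; the choice of 0 as a fill value is arbitrary and only for illustration.

```python
import pandas as pd

data = [{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}]
df = pd.DataFrame(data)

# The missing key turned column c into a float column.
print(df["c"].dtype)

# isna marks the missing cells with True.
missing = df.isna()

# fillna replaces NaN with a chosen value; dropna removes rows containing NaN.
filled = df.fillna(0)
dropped = df.dropna()

print(missing)
print(filled)
print(dropped)
```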

2.4. Basic DataFrame attributes

DataFrame.shape       # number of rows, number of columns
DataFrame.dtypes      # data type of each column
DataFrame.ndim        # number of dimensions
DataFrame.index       # row index
DataFrame.columns     # column index
DataFrame.values      # the underlying values
DataFrame.head(3)     # first rows, 5 by default
DataFrame.tail(3)     # last rows, 5 by default
DataFrame.info()      # overview: row count, column count, column index, non-null count per column, column types, memory usage
DataFrame.describe()  # quick summary statistics: count, mean, standard deviation, min, quartiles, max

data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
"""
   a   b     c
0  1   2   NaN
1  5  10  20.0
"""
print(df.index)    # RangeIndex(start=0, stop=2, step=1)
print(df.columns)  # Index(['a', 'b', 'c'], dtype='object')
print(df.values)   # [[ 1.  2. nan] [ 5. 10. 20.]]
print(df.shape)    # (2, 3)
print(df.ndim)     # number of dimensions: 2
print(df.dtypes)   # data type of each column
"""
a      int64
b      int64
c    float64
dtype: object
"""
print("*" * 80)
df.info()  # info() prints its report itself and returns None, so no print() is needed
"""
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   a       2 non-null      int64
 1   b       2 non-null      int64
 2   c       1 non-null      float64
dtypes: float64(1), int64(2)
memory usage: 176.0 bytes
"""
print(df.describe())
"""
              a          b     c
count  2.000000   2.000000   1.0
mean   3.000000   6.000000  20.0
std    2.828427   5.656854   NaN
min    1.000000   2.000000  20.0
25%    2.000000   4.000000  20.0
50%    3.000000   6.000000  20.0
75%    4.000000   8.000000  20.0
max    5.000000  10.000000  20.0
"""
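The attributes above return ordinary objects, so their results can be used in code as well as printed. A small sketch under that assumption, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame([{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}])

# shape is a plain tuple and can be unpacked.
n_rows, n_cols = df.shape

# count gives the number of non-null values per column.
non_null = df.count()

# describe returns a DataFrame, so single statistics can be looked up by label.
stats = df.describe()
mean_a = stats.loc["mean", "a"]
max_b = stats.loc["max", "b"]

# select_dtypes filters columns by their data type.
float_cols = df.select_dtypes(include="float").columns.tolist()

print(n_rows, n_cols, mean_a, max_b, float_cols)
print(non_null)
```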

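2.5. Sorting a DataFrame

data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
"""
   a   b     c
0  1   2   NaN
1  5  10  20.0
"""
# ascending=True sorts in ascending order
# ascending=False sorts in descending order
# sort_values returns a new DataFrame; rows with NaN in the sort column go last
df = df.sort_values("c", ascending=False)
print(df)
"""
   a   b     c
1  5  10  20.0
0  1   2   NaN
"""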
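A few variations on `sort_values` are worth knowing: where NaN rows are placed, sorting by several columns, and restoring the original row order. The three-row frame below is made-up data chosen so that the orderings differ.

```python
import pandas as pd

# Illustrative data; None becomes NaN in the float column c.
df = pd.DataFrame({"a": [1, 5, 3], "b": [2, 10, 2], "c": [None, 20.0, 7.0]})

# Descending by c; NaN rows are placed last by default.
by_c = df.sort_values("c", ascending=False)

# na_position="first" moves NaN rows to the top instead.
nan_first = df.sort_values("c", na_position="first")

# Several keys: b ascending, ties broken by a descending.
multi = df.sort_values(["b", "a"], ascending=[True, False])

# sort_index orders by the row labels, which restores the original order here.
restored = by_c.sort_index()

print(by_c, nan_first, multi, restored, sep="\n\n")
```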

https://www.runoob.com/pandas/pandas-series.html
https://www.bilibili.com/video/BV1hx411d7jb?p=23
https://www.bilibili.com/video/BV1hx411d7jb?p=24
https://www.bilibili.com/video/BV1hx411d7jb?p=25
https://www.bilibili.com/video/BV1hx411d7jb?p=26
