Python for Data Analysis – Pandas

Published: 2025/3/11

Pandas

Pandas is an open-source library built on top of NumPy
It allows for fast analysis, data cleaning, and data preparation
It excels in performance and productivity
It also has built-in visualization features (plotting methods built on Matplotlib)
It can work with data from a wide variety of sources, such as CSV, Excel, JSON, and SQL databases

How to install Pandas?

Using pip

(venv) -bash-4.2$ pip install pandas
Requirement already satisfied: pandas in ./venv/lib/python3.6/site-packages (0.25.1)
Requirement already satisfied: python-dateutil>=2.6.1 in ./venv/lib/python3.6/site-packages (from pandas) (2.8.0)
Requirement already satisfied: pytz>=2017.2 in ./venv/lib/python3.6/site-packages (from pandas) (2019.2)
Requirement already satisfied: numpy>=1.13.3 in ./venv/lib/python3.6/site-packages (from pandas) (1.17.2)
Requirement already satisfied: six>=1.5 in ./venv/lib/python3.6/site-packages (from python-dateutil>=2.6.1->pandas) (1.12.0)
(venv) -bash-4.2$
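
To confirm from inside Python that the install worked, one quick check is to import the package and print its version. This is only a minimal sketch; the version string you see depends on your own environment and need not match the 0.25.1 shown in the pip output.

```python
import pandas as pd

# __version__ is a plain string; its value depends on what pip installed
print(pd.__version__)
```

If the import raises ModuleNotFoundError, the interpreter you are running is probably not the one from the virtual environment where pip installed pandas.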

Series

A Series is a one-dimensional ndarray with axis labels, and it also covers time series. It can hold data of any type. The axis labels are collectively known as the index. A Series is very similar to a NumPy array and is built on top of the NumPy array object; the difference is that a Series can be indexed by label.
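
As a rough illustration of how the labels, the values, and the underlying NumPy array relate, consider the sketch below. The variable names and the sample readings are made up for this example and are not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three readings labeled by weekday
temps = pd.Series([21.5, 19.0, 23.2], index=['mon', 'tue', 'wed'])

labels = list(temps.index)   # the axis labels, i.e. the index
by_label = temps['tue']      # label-based lookup, which a plain NumPy array lacks
values = temps.to_numpy()    # the underlying data as a NumPy array

print(labels, by_label, type(values).__name__)
```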

Syntax (as documented for pandas 0.25, the version installed above):

class pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

The snippet below shows several ways to create a Series:

import numpy as np
import pandas as pd

labels = ['a', 'e', 'i', 'o']  # python list
data = [1, 2, 3, 4]  # python list
arr = np.array(data)  # NumPy array
d = {'a': 1, 'b': 2, 'c': 3}  # python dict

# creating a series object with default index
print(pd.Series(data=data))

# creating a series object with labels as index
print(pd.Series(data=data, index=labels))

# creating a series with NumPy array
print(pd.Series(arr, index=labels))

# creating a series with dictionary,
# here the key becomes the index
print(pd.Series(d))

# Series can also hold built-in func
print(pd.Series(data=[sum, print, len]))

Output

0    1
1    2
2    3
3    4
dtype: int64
a    1
e    2
i    3
o    4
dtype: int64
a    1
e    2
i    3
o    4
dtype: int64
a    1
b    2
c    3
dtype: int64
0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object
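
The constructor's dtype and name parameters, and the effect of passing a dict together with an explicit index, are not exercised by the examples above. The sketch below is one way to try them; the variable names `scores` and `picked` and the label 'z' are invented for illustration.

```python
import pandas as pd

d = {'a': 1, 'b': 2, 'c': 3}

# dtype forces the element type; name labels the Series as a whole
scores = pd.Series(d, dtype='float64', name='scores')

# With a dict plus an explicit index, values are looked up by key;
# a label with no matching key ('z' here) becomes NaN
picked = pd.Series(d, index=['c', 'a', 'z'])

print(scores)
print(picked)
```

Because NaN is a floating-point value, `picked` ends up with dtype float64 even though the dict holds integers.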

Operations on Series

Create two Series objects:

import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])
print(ser1)

ser2 = pd.Series([1, 2, 5, 4], ['Delhi', 'Bangalore', 'Vizag', 'Pune'])
print(ser2)

Output

Delhi        1
Bangalore    2
Mysore       3
Pune         4
dtype: int64
Delhi        1
Bangalore    2
Vizag        5
Pune         4
dtype: int64

Retrieving a value from a Series works much like a lookup in a Python dictionary: pass the index label, using the same data type as the index. In the example above, the index labels are strings.

print(ser1['Delhi']) # Output: 1
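
Beyond square-bracket lookup, a Series offers a few other dictionary-like and positional accessors. The following is a small sketch, not an exhaustive list; it rebuilds `ser1` so that it runs on its own, and the result variable names are invented for illustration.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])

by_label = ser1.loc['Pune']      # explicit label-based lookup
by_position = ser1.iloc[0]       # first element, regardless of its label
has_vizag = 'Vizag' in ser1      # membership tests the index labels, not the values
fallback = ser1.get('Vizag', 0)  # like dict.get: returns the default instead of raising KeyError

print(by_label, by_position, has_vizag, fallback)
```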

Now let's try adding the two Series:

print(ser1 + ser2)
'''
Output:
Bangalore    4.0
Delhi        2.0
Mysore       NaN
Pune         8.0
Vizag        NaN
dtype: float64
'''

pandas aligns the two Series on their index labels and adds the values that share a label. Where a label appears in only one of the Series, the result is NaN (a missing value). Because NaN is a floating-point value, the result is upcast from int64 to float64; when every label matches, the integer dtype is preserved.
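
If NaN for unmatched labels is not what you want, one option is the `Series.add` method with its `fill_value` parameter, which substitutes a value for the label that is missing on one side before adding. The sketch below rebuilds `ser1` and `ser2` so that it runs on its own; the names `total` and `doubled` are invented for illustration.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])
ser2 = pd.Series([1, 2, 5, 4], ['Delhi', 'Bangalore', 'Vizag', 'Pune'])

# Plain + leaves NaN where a label exists in only one Series
plain = ser1 + ser2

# add() with fill_value=0 treats the missing side as 0 instead
total = ser1.add(ser2, fill_value=0)

# When all labels match, no NaN is introduced and the integer dtype survives
doubled = ser1 + ser1

print(total)
print(doubled.dtype)
```

Whether filling with 0 is appropriate depends on the data: it suits counts or totals, but it would be misleading if a missing label really means "unknown".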

Translated from: https://www.includehelp.com/python/python-for-data-analysis-pandas.aspx
