Python for Data Analysis – Pandas
Pandas
- Pandas is an open-source library built on top of NumPy
- It allows for fast analysis, data cleaning, and data preparation
- It excels in both performance and productivity
- It has built-in visualization features
- It can work with data from a wide variety of sources
How to install Pandas?
Using pip
    (venv) -bash-4.2$ pip install pandas
    Requirement already satisfied: pandas in ./venv/lib/python3.6/site-packages (0.25.1)
    Requirement already satisfied: python-dateutil>=2.6.1 in ./venv/lib/python3.6/site-packages (from pandas) (2.8.0)
    Requirement already satisfied: pytz>=2017.2 in ./venv/lib/python3.6/site-packages (from pandas) (2019.2)
    Requirement already satisfied: numpy>=1.13.3 in ./venv/lib/python3.6/site-packages (from pandas) (1.17.2)
    Requirement already satisfied: six>=1.5 in ./venv/lib/python3.6/site-packages (from python-dateutil>=2.6.1->pandas) (1.12.0)
    (venv) -bash-4.2$

Series
A Series is a one-dimensional ndarray with axis labels (time series included). It can hold data of any type. The axis labels are collectively known as the index. A Series is very similar to a NumPy array and is built on the NumPy array object. The difference is that a Series can be indexed by labels.
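As a minimal sketch of the difference described above, the snippet below builds a NumPy array and a Series from the same values. The values and the labels `'x'`, `'y'`, `'z'` are made up for illustration and do not come from the article.

```python
import numpy as np
import pandas as pd

# Illustrative values and labels (assumptions, not from the article).
values = [10, 20, 30]
arr = np.array(values)
ser = pd.Series(values, index=['x', 'y', 'z'])

# A NumPy array is addressed by integer position only.
first_from_array = arr[0]

# A Series can be addressed by label, and by position through .iloc.
by_label = ser['y']
by_position = ser.iloc[0]

# The axis labels are exposed as the index.
index_labels = list(ser.index)

print(first_from_array, by_label, by_position, index_labels)
```

The underlying values are the same in both objects; the Series adds the index on top of them.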
Syntax:
    class pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

The snippets below show examples of creating a Series:
    import numpy as np
    import pandas as pd

    labels = ['a', 'e', 'i', 'o']    # Python list
    data = [1, 2, 3, 4]              # Python list
    arr = np.array(data)             # NumPy array
    d = {'a': 1, 'b': 2, 'c': 3}     # Python dict

    # creating a Series object with the default index
    print(pd.Series(data=data))

    # creating a Series object with labels as the index
    print(pd.Series(data=data, index=labels))

    # creating a Series from a NumPy array
    print(pd.Series(arr, index=labels))

    # creating a Series from a dictionary;
    # here the keys become the index
    print(pd.Series(d))

    # a Series can also hold built-in functions
    print(pd.Series(data=[sum, print, len]))

Output

    0    1
    1    2
    2    3
    3    4
    dtype: int64
    a    1
    e    2
    i    3
    o    4
    dtype: int64
    a    1
    e    2
    i    3
    o    4
    dtype: int64
    a    1
    b    2
    c    3
    dtype: int64
    0      <built-in function sum>
    1    <built-in function print>
    2      <built-in function len>
    dtype: object

Operations on Series
Create two Series objects:
    import pandas as pd

    ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])
    print(ser1)

    ser2 = pd.Series([1, 2, 5, 4], ['Delhi', 'Bangalore', 'Vizag', 'Pune'])
    print(ser2)

Output
    Delhi        1
    Bangalore    2
    Mysore       3
    Pune         4
    dtype: int64
    Delhi        1
    Bangalore    2
    Vizag        5
    Pune         4
    dtype: int64

Retrieving a value from a Series works much like a lookup in a Python dictionary: pass the index label of the value you want. In the example above, the index labels are strings.
    print(ser1['Delhi'])  # Output: 1

Now let's try adding the two Series:
    print(ser1 + ser2)
    '''
    Output:
    Bangalore    4.0
    Delhi        2.0
    Mysore       NaN
    Pune         8.0
    Vizag        NaN
    dtype: float64
    '''

pandas adds the values that share an index label. Where a label appears in only one of the two Series, the result for that label is NaN (a missing value). Because NaN is a floating-point value, the integers are converted to float whenever the operation introduces missing values.
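The label alignment described above can be sketched as follows, reusing `ser1` and `ser2` from the article. `Series.add` with `fill_value=0` is not part of the original article; it is shown here only as one possible way to treat a missing label as 0 instead of producing NaN. The last step adds a Series to itself, as an illustration that the float conversion appears only when NaN has to be introduced.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['Delhi', 'Bangalore', 'Mysore', 'Pune'])
ser2 = pd.Series([1, 2, 5, 4], ['Delhi', 'Bangalore', 'Vizag', 'Pune'])

# Plain addition: labels found in only one Series give NaN,
# so the result is stored as float.
total = ser1 + ser2

# Series.add with fill_value=0 treats a label missing from one side as 0.
filled = ser1.add(ser2, fill_value=0)

# When both operands carry exactly the same labels, no NaN is needed
# and the integer dtype is kept.
doubled = ser1 + ser1

print(total.dtype, doubled.dtype)
print(filled)
```

Whether 0 is a sensible stand-in for a missing city depends on what the numbers mean, so `fill_value` is a choice to make per dataset rather than a default.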
Translated from: https://www.includehelp.com/python/python-for-data-analysis-pandas.aspx
