Introduction to Statistics in Python
Data Analytics
What is Statistics
Statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.
Central Tendencies:
A central tendency is a central or typical value for a probability distribution. It may also be called a center or location of the distribution. Colloquially, measures of central tendency are often called averages.
Dispersion:
Dispersion is the extent to which a distribution is stretched or squeezed. Common measures of statistical dispersion are the variance, the standard deviation, and the interquartile range.
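The three dispersion measures named above can be sketched in plain Python. This is a minimal illustration, not code from the original article: the function names, the `sample` data, the choice of the sample variance (n - 1 denominator), and the simple index-based quartiles are all assumptions made for the example.

```python
import math

def variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def standard_deviation(xs):
    """Square root of the sample variance, in the same units as the data."""
    return math.sqrt(variance(xs))

def interquartile_range(xs):
    """Difference between the 75th and 25th percentile values.

    Uses a simple index-based quantile (an assumption; other conventions exist).
    """
    ordered = sorted(xs)
    upper = ordered[int(0.75 * len(ordered))]
    lower = ordered[int(0.25 * len(ordered))]
    return upper - lower

# Hypothetical data, used only to exercise the functions.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(sample), standard_deviation(sample), interquartile_range(sample))
```

The interquartile range ignores the tails of the data, so it is less sensitive to a few extreme values than the variance or the standard deviation.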
Correlation:
Correlation, or dependence, is any statistical relationship, whether causal or not, between two random variables or bivariate data. In the broadest sense correlation is any statistical association, though it commonly refers to the degree to which a pair of variables are linearly related.
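For the linear sense of correlation, a common measure is the Pearson correlation coefficient. The sketch below is an illustration under stated assumptions, not the article's own code: the function name `correlation` is hypothetical, and returning 0.0 when one variable has no variation is just a convention chosen here.

```python
import math

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences, in the range [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to each variance).
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        return 0.0  # convention assumed here: no variation means no correlation
    return cov / math.sqrt(spread_x * spread_y)

# Hypothetical data: ys is exactly twice xs, so the relationship is perfectly linear.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```

A value near 1 or -1 indicates a strong linear relationship. A value near 0 indicates little linear relationship, although the variables may still be related in a non-linear way.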
Simpson's Paradox:
Simpson's paradox, which goes by several names, is a phenomenon in probability and statistics in which a trend appears in several different groups of data but disappears or reverses when these groups are combined.
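A small numeric sketch can make the reversal concrete. The data below are hypothetical and chosen only to show the effect, and `correlation` is a helper written for this example. Each group shows a rising trend on its own, while the pooled data show a falling one.

```python
import math

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(spread_x * spread_y)

# Hypothetical data: within each group, y rises as x rises.
group_a_x, group_a_y = [1, 2, 3], [6, 7, 8]
group_b_x, group_b_y = [6, 7, 8], [1, 2, 3]

within_a = correlation(group_a_x, group_a_y)  # positive trend inside group A
within_b = correlation(group_b_x, group_b_y)  # positive trend inside group B

# Pooling the groups: group B has larger x but smaller y, so the trend reverses.
combined = correlation(group_a_x + group_b_x, group_a_y + group_b_y)

print(within_a, within_b, combined)
```

The reversal happens because group membership is related to both variables, so ignoring it changes the apparent direction of the relationship.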
What is Data Analytics at a high level
Data Analytics solutions offer a convenient way to leverage business data. But the number of solutions on the market can be daunting — and many may seem to cover a different category of analytics. How can organizations make sense of it all? Start by understanding the different types of analytics, including descriptive, diagnostic, predictive, and prescriptive analytics.
Descriptive Analytics tells you what happened in the past.
Diagnostic Analytics helps you understand why something happened in the past.
Predictive Analytics predicts what is most likely to happen in the future.
Prescriptive Analytics recommends actions you can take to affect those outcomes.
Applied Statistics Methods in Python
Imagine we have to analyze the number of friends that each member of our staff has. The numbers of friends are stored in a Python list like the one below:
num_friends = [100, 49, 41, 40, 25, 100, 100, 100, 41, 41, 49, 59, 25, 25, 4, 4, 4, 4, 4, 4, 10, 10, 10, 10,]
We will display num_friends as a histogram with matplotlib:
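The plotting code itself is not included in this text. A minimal sketch, assuming matplotlib is installed, might count the values with `collections.Counter` and draw them with `plt.bar`. The function name `plot_friend_histogram` and the axis labels are assumptions made for illustration.

```python
from collections import Counter

num_friends = [100, 49, 41, 40, 25, 100, 100, 100, 41, 41, 49, 59,
               25, 25, 4, 4, 4, 4, 4, 4, 10, 10, 10, 10]

# Count how many staff members have each number of friends.
friend_counts = Counter(num_friends)

def plot_friend_histogram(counts):
    """Draw a bar chart of the counts; requires matplotlib (assumed installed)."""
    # Imported here so the counting code above runs even without matplotlib.
    import matplotlib.pyplot as plt

    xs = sorted(counts)               # distinct numbers of friends
    ys = [counts[x] for x in xs]      # how many people have each number
    plt.bar(xs, ys)
    plt.title("Histogram friends counter")
    plt.xlabel("# of friends")        # axis labels are assumptions
    plt.ylabel("# of people")
    plt.show()
```

Calling `plot_friend_histogram(friend_counts)` opens the chart in a window or renders it inline in a notebook.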
The resulting histogram:
[Figure: Histogram friends counter]
Central Tendencies
- mean
We would like to get the mean of the number of friends:
def mean(x):
    return sum(x) / len(x)
Applying this function to num_friends gives:
35.791666666666664
- median
The median is a simple measure of central tendency. To find the median, we arrange the observations in order from smallest to largest value. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values.
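The rule described above can be sketched as a short function. This is an illustrative implementation, not necessarily the one the author used. The name `median` and the error raised for empty input are assumptions.

```python
def median(values):
    """Return the middle value of the data, averaging the two middle values if needed."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    midpoint = n // 2
    if n % 2 == 1:
        # Odd number of observations: the middle value.
        return ordered[midpoint]
    # Even number of observations: average of the two middle values.
    return (ordered[midpoint - 1] + ordered[midpoint]) / 2
```

Unlike the mean, the median depends only on the middle of the sorted data, so a few very large or very small values do not move it much.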
Applying the median to num_friends gives:
25.0
- quantile
A generalization of the median is the quantile, which is the value below which a given fraction of the data lies. (The median is the value below which 50% of the data lies.)
def quantile(x, p):
    """returns the pth-percentile value in x"""
    p_index = int(p * len(x))
    return sorted(x)[p_index]
Applying quantile to num_friends with p = 0.8 gives:
59
- mode (or most common values)
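The mode function is not shown in this text. One possible sketch, assuming the function returns a list because several values can tie for most common, uses `collections.Counter`. The name `mode` is an assumption.

```python
from collections import Counter

def mode(x):
    """Return a list of the most common value(s); ties produce more than one."""
    counts = Counter(x)                    # value -> number of occurrences
    max_count = max(counts.values())       # highest occurrence count
    return [value for value, count in counts.items() if count == max_count]
```

Returning a list rather than a single value keeps ties visible instead of silently picking one of them.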
Applying a mode function to num_friends returns:
[4]
Conclusion
Studying statistics helps us understand the fundamental concepts of data analysis and data science in general. There is a lot more to statistics, such as hypothesis testing, correlation, and estimation, which I have not gone over, so feel free to learn more about them.
Translated from: https://towardsdatascience.com/introduction-to-statistics-in-python-6f5a8876c994