Regression analysis
Regression analysis is a statistical method for determining whether a variable is influenced by one or more other variables. One of its strengths is that several variables can influence the variable of interest at the same time. Regression analysis can also be used for prediction.
To get started with regression analysis, you have to understand two types of variables:
Dependent variable: the variable that you want to examine, understand, or predict.
Independent variable(s): all the other variables that you hypothesize influence the dependent variable.
To start a regression analysis, first choose the dependent variable. Then choose the independent variable or variables that you hypothesize affect it.
The next step is obtaining data for the regression analysis. This is usually a dataset that contains the identified dependent and independent variables. For instance, if separate datasets are available for each variable, the variables of interest can be extracted and combined into a new dataset.
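As a minimal sketch of that combining step, assume two separate datasets that share an identifier. The store IDs, the `ad_spend` and `sales` names, and all the numbers below are hypothetical and only serve as illustration; records missing from either dataset are dropped.

```python
# Hypothetical data: two separate datasets keyed by the same store ID.
ad_spend = {"store_a": 10.0, "store_b": 20.0, "store_c": 30.0, "store_d": 40.0}
sales = {"store_a": 130.0, "store_b": 190.0, "store_d": 310.0, "store_e": 95.0}


def combine(independent, dependent):
    """Pair up x and y values for the keys present in both datasets."""
    shared_keys = sorted(independent.keys() & dependent.keys())
    return [(key, independent[key], dependent[key]) for key in shared_keys]


# Each row is (identifier, independent value, dependent value).
dataset = combine(ad_spend, sales)
for key, x, y in dataset:
    print(key, x, y)
```

Keeping only the shared keys is one possible design choice; depending on the data, filling in missing values may be more appropriate.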
After that, the data should be plotted. By convention, the independent variable goes on the x-axis and the dependent variable on the y-axis.
From the plot, you can observe initial trends and correlation that suggest what kind of relationship the dependent and independent variables have. In the example plot, the hypothetical data points show an increasing trend: as the independent variable increases, the dependent variable increases as well.
A trend can be observed from the plot, but to what precise degree is the dependent variable influenced by the independent one? To answer that, a regression line should be calculated. This is usually done in software such as STATA or Excel. The regression line is the line that best approximates the data points on the plot.
In other words, explains Redman, “The red line is the best explanation of the relationship between the independent variable and dependent variable.”
Calculating the regression line
Calculating a regression line means finding the best-fit line through all the data points. Simple linear regression usually uses the least-squares method.
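Stated as a formula, and assuming n observed pairs (x_i, y_i) and a line with slope m and intercept b (this notation is introduced here for illustration), the least-squares method picks the line that minimizes the sum of squared vertical distances between the data points and the line:

$$\min_{m,\,b} \; \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2$$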
The linear regression line is a straight line of the form y=mx+b. To find the best-fit line for your data, you first need five summary statistics:
1. The mean of the x values
2. The mean of the y values
3. The standard deviation of the x values (denoted sx)
4. The standard deviation of the y values (denoted sy)
5. The correlation between X and Y (denoted r)
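The five statistics above can be computed with Python's standard library, as in the sketch below. It assumes sample standard deviations (dividing by n - 1) and the Pearson correlation coefficient; the function name and the sample data are made up for illustration.

```python
import statistics


def summary_statistics(xs, ys):
    """Return (mean_x, mean_y, s_x, s_y, r) for paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (n - 1)
    s_y = statistics.stdev(ys)
    # Pearson correlation: sample covariance divided by s_x * s_y.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    r = cov / (s_x * s_y)
    return mean_x, mean_y, s_x, s_y, r


# Hypothetical paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
print(summary_statistics(xs, ys))
```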
The formula for calculating the slope m of the regression line is the following:

$$m = r \, \frac{s_y}{s_x}$$
This formula calculates the slope for the regression line equation of the form y=mx+b. The last part to calculate is the y-intercept b. It can be calculated using the formula below:

$$b = \bar{y} - m \, \bar{x}$$

Here $\bar{x}$ and $\bar{y}$ are the means of the x values and y values respectively, and m is the already calculated slope.
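As a worked sketch, the slope and intercept can be computed from the five summary statistics using the standard least-squares results m = r * sy / sx and b = mean(y) - m * mean(x). The function name and the data are hypothetical, and sample standard deviations are assumed.

```python
import statistics


def fit_line(xs, ys):
    """Return (m, b) of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    # Pearson correlation from the sample covariance.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov / (s_x * s_y)
    m = r * s_y / s_x       # slope
    b = mean_y - m * mean_x  # y-intercept
    return m, b


# Hypothetical paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
m, b = fit_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")  # prints: y = 0.60x + 2.20
```

For this data the line passes through the point of means (3, 4), which is a property of every least-squares line.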
For example, the regression line that Excel produces will look something like y=6x+70+error_term. It differs from the simple regression line calculated above in that it includes an error term.
Regression lines always include an error term because, in reality, independent variables are never perfect predictors of dependent variables.
In reality, the dependent variable might be determined by a number of different factors. The regression line is only an estimate based on the data available to you, and the larger the error term is, the less certain your regression line is.
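One common way to gauge the size of the error is to look at the residuals, the differences between the observed y values and the values the line predicts. The sketch below assumes a line that has already been fitted (slope 0.6 and intercept 2.2 for the hypothetical data shown) and summarizes the residuals with the residual standard error, which divides by n - 2.

```python
import math


def residuals(xs, ys, m, b):
    """Observed y minus the value predicted by y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical data and the least-squares line fitted to it.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
m, b = 0.6, 2.2

res = residuals(xs, ys, m, b)
sse = sum(e * e for e in res)  # sum of squared errors
# Residual standard error: typical distance of a point from the line.
rse = math.sqrt(sse / (len(xs) - 2))
print([round(e, 2) for e in res], round(rse, 3))
```

A smaller residual standard error means the points lie closer to the line, so the line summarizes the data more reliably.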
Conclusion
Regression analysis helps determine the effect of some variables on another. It is widely used in business analysis to identify the factors that influence a target variable and to predict its future values.
We’ve discussed what regression analysis is and how to calculate the regression line.
Translated from: https://medium.com/swlh/regression-analysis-86e6a8bee0b7