Does R-squared Measure Predictive Capacity or Statistical Adequacy?

Published: 2023/12/15

The fact that R-squared should not be used to decide whether you have an adequate model is counter-intuitive and is rarely explained clearly. This demonstration outlines how the R-squared goodness-of-fit measure works in regression analysis and correlation, and shows why it is not a measure of statistical adequacy and so should not be taken to suggest anything about future predictive performance.

The R-squared goodness-of-fit measure is one of the most widely available statistics accompanying the output of regression analysis in statistical software. Perhaps partly because of its widespread availability, it is also one of the most frequently misunderstood.

First, a brief refresher on R-squared (R2). In a regression with a single independent variable, R2 is calculated as the ratio between the variation explained by the model and the total observed variation. It is often called the coefficient of determination and can be interpreted as the proportion of variation explained by the proposed predictor. In such a case, it is equal to the square of the correlation coefficient of the observed and fitted values of the variable. In multiple regression, it is known as the coefficient of multiple determination and is often calculated using an adjustment that penalizes its value depending on the number of predictors used.
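
The definitions above can be checked numerically. The following is a minimal sketch on simulated data; the coefficients, noise level, and variable names are arbitrary assumptions chosen for illustration and are not taken from the article's examples.

```r
# Simulated data: intercept, slope and noise sd are arbitrary assumptions
set.seed(42)
x <- seq(1, 10, length.out = 200)
y <- 3 + 2 * x + rnorm(200, mean = 0, sd = 2)

fit <- lm(y ~ x)

# R-squared as explained variation over total observed variation
ss_total     <- sum((y - mean(y))^2)
ss_explained <- sum((fitted(fit) - mean(y))^2)
r2_ratio     <- ss_explained / ss_total

# The same quantity obtained three other ways
r2_lm      <- summary(fit)$r.squared   # as reported by lm
r2_cor_xy  <- cor(x, y)^2              # squared Pearson correlation of x and y
r2_cor_fit <- cor(y, fitted(fit))^2    # squared correlation of observed and fitted values

# Adjusted R-squared penalizes the number of predictors (p = 1 here)
n <- length(y)
p <- 1
r2_adj <- 1 - (1 - r2_lm) * (n - 1) / (n - p - 1)

print(c(ratio = r2_ratio, lm = r2_lm, cor_xy = r2_cor_xy,
        cor_fit = r2_cor_fit, adjusted = r2_adj))
```

With a single predictor the first four values coincide, and the adjusted value is slightly smaller.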

In neither of these cases, however, does R2 measure whether the right model was chosen, and consequently it does not measure the predictive capacity of the obtained fit. This is correctly noted in multiple sources, but few make it clear that statistical adequacy is a prerequisite for correctly interpreting a coefficient of determination. Exceptions include Spanos (2019) [1], who emphasizes that such statistics (R-squared and others) are meaningful only when the estimated linear regression model is statistically adequate, and Hagquist and Stenbeck (1998) [2]. It is even rarer to see examples of why that is the case, an exception being Ford (2015) [3].

The present article includes a broader set of examples that illustrate the role and limitations of the coefficient of determination. To keep it manageable, only single-variable regression is examined.

The Appeal of R-squared

First, let us examine the utility of R2 and see why it is so easy to incorrectly interpret it as a measure of statistical adequacy and predictive accuracy when it is neither. Using a comparison with the simple linear correlation coefficient will help us understand why it behaves the way it does.

Figure 1 below is based on extracted data from 32 price offers for second-hand cars of one high-end model. The idea is to examine the relationship between car age (x-axis) and price (y-axis, in my local currency unit).

Entering the data into a correlation coefficient calculator, we obtain a Pearson's r of -0.8978 (p-value 0 to the eighth decimal place, 95% CI: -0.9493, -0.7992). This results in a satisfactory R2 of 0.8059, which would in fact be considered high by many standards. The correlation calculation is equivalent to running a linear regression. Eyeballing the fit makes it clear that it can likely be improved by choosing a non-linear relationship from the exponential family instead:

Figure 2

The fit obtained with the regression equation shown in Figure 2 above has an R2 value of 0.9465, so one is tempted to replace the linear model with the exponential one. As the exponential model explains more of the variance, one might think it is a better representation of the underlying data-generating mechanism, and perhaps that this also means it will have better predictive accuracy. That is the intuitive appeal of using R-squared to choose between one model and another. However, this is not necessarily so, as the following sections will demonstrate.
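
The kind of comparison described above can be sketched as follows. The car-price data behind Figures 1 and 2 are not reproduced in the article, so this sketch uses hypothetical simulated offers; the starting price, decay rate, and noise level are assumptions made purely for illustration.

```r
# Hypothetical data: 32 simulated offers, NOT the article's dataset
set.seed(7)
age   <- seq(1, 15, length.out = 32)                   # car age in years
price <- 60000 * exp(-0.2 * age + rnorm(32, 0, 0.1))   # assumed exponential decay with noise

fit_lin <- lm(price ~ age)        # straight-line model
fit_exp <- lm(log(price) ~ age)   # exponential model, fitted on the log scale

# R-squared computed on the original price scale so the two values are comparable
r2_on_price_scale <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}
r2_lin <- r2_on_price_scale(price, fitted(fit_lin))
r2_exp <- r2_on_price_scale(price, exp(fitted(fit_exp)))

print(c(linear = r2_lin, exponential = r2_exp))
```

Here the exponential fit scores higher because the data were simulated from an exponential relationship. On real data, where the generating mechanism is unknown, the same ranking would not by itself establish which model is adequate.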

Different Underlying Model, Same R-squared?

What if I told you that you can get the same R2 statistic for a linear regression on two different datasets while knowing that the underlying model is quite different in each? This can be demonstrated by a simple simulation. The following R code generates predictor and response values based on two separate true models: one is linear, and the other is non-linear (quadratic):

set.seed(1) # set seed for replicability
x <- seq(1, 10, length.out = 1000) # predictor values
y <- 1 + 1.5*x + rnorm(1000, 0, 0.80) # response: linear function of x plus random noise with sd of 0.80
summary(lm(y ~ x))$r.squared # print R-squared of the straight-line fit to the linear data
y <- x^2 + rnorm(1000, 0, 0.80) # response: quadratic function of x plus random noise with sd of 0.80
summary(lm(y ~ x))$r.squared # print R-squared of the straight-line fit to the quadratic data

The result of the linear model fit is R2 = 0.957 for both. However, we know that one set of data comes from a linear dependence and the other from a quadratic one. This can be further explored using the plots of the response variable y, as shown in Figure 3.

Figure 3

If one uses an R-squared threshold to accept a model, one would likely accept a linear model for both dependencies. Despite the same R-squared statistic being produced, the predictive validity would be rather different depending on what the true dependence is. If it is truly linear, then the predictive accuracy would be quite good. Otherwise, it will be much poorer. In this sense, R-squared is not a good measure of predictive error. The residual standard error would have been a much better guide, being several times smaller in the first case.
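
The claim about the standard error can be checked by extending the simulation above. This is a sketch that reuses the article's seed and parameters; the variable names (`y_lin`, `y_quad`, and so on) are introduced here for illustration.

```r
set.seed(1)
x      <- seq(1, 10, length.out = 1000)
y_lin  <- 1 + 1.5 * x + rnorm(1000, 0, 0.80)   # truly linear dependence
y_quad <- x^2 + rnorm(1000, 0, 0.80)           # truly quadratic dependence

fit_lin  <- lm(y_lin ~ x)    # straight line fitted to linear data
fit_quad <- lm(y_quad ~ x)   # straight line fitted to quadratic data

r2_lin  <- summary(fit_lin)$r.squared
r2_quad <- summary(fit_quad)$r.squared
se_lin  <- summary(fit_lin)$sigma    # residual standard error
se_quad <- summary(fit_quad)$sigma

print(rbind(r_squared   = c(linear = r2_lin, quadratic = r2_quad),
            residual_se = c(linear = se_lin, quadratic = se_quad)))
```

The two R-squared values are nearly identical, while the residual standard error, which is expressed in the units of y, differs by a large factor between the two fits.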

Low R-squared Doesn't Necessarily Mean an Inadequate Statistical Model

In the first example, the standard deviation of the random noise was kept the same, and we changed only the type of dependence. However, the coefficient of determination is significantly influenced by the dispersion of the random error term, and this is what we will examine next. The code below produces simulated R2 values for different levels of the standard deviation of the error term for y while keeping the type of dependence the same. The data is generated with an adequate model for the error term, and the relationship is linear. It satisfies the simple regression model on all counts: normality, zero mean, homoskedasticity, no autocorrelation, and no collinearity, as only a single variable is involved.

r2 <- function(sd){
  x <- seq(1, 10, length.out = 1000) # predictor values
  y <- 2 + 1.2*x + rnorm(1000, 0, sd = sd) # response: linear function of x plus random noise with standard deviation sd
  summary(lm(y ~ x))$r.squared # return R-squared of the linear fit
}
sds <- seq(0.5, 40, length.out = 40) # 40 evenly spaced sd values from 0.5 to 40
res <- sapply(sds, r2) # compute R-squared for each sd value
plot(res ~ sds, type="b", main="R-squared for different values of Sigma", xlab="Sigma (Standard Deviation)", ylab="Average R-squared")

Figure 4

Even though the statistical model satisfies all requirements and is therefore well specified, the R2 value tends to zero as the variance of the error term increases. This demonstrates why a low R2 cannot be used as a measure of statistical inadequacy.
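
The shape of this curve can also be reasoned about directly. As a sketch under stated assumptions (a correctly specified simple linear regression with slope $\beta_1$, sample variance $s_x^2$ of the predictor, error standard deviation $\sigma$, and a large sample), R2 is approximately the share of the variance of y that comes from the linear signal:

```latex
R^2 \;\approx\; \frac{\beta_1^2\, s_x^2}{\beta_1^2\, s_x^2 + \sigma^2}
```

With the simulation's slope of 1.2 and x spread evenly over [1, 10] (so $s_x^2 \approx 6.76$), this gives roughly 0.97 at $\sigma = 0.5$ and below 0.01 at $\sigma = 40$, even though the model is correctly specified throughout. The symbols here are notation introduced for this note, not the article's.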

High R-squared Doesn't Necessarily Mean an Adequate Statistical Model

The opposite of the above scenario can occur if the model is misspecified but the standard deviation is sufficiently small. This will tend to produce high R2 values, as shown by running the code below.

r2 <- function(sd){
  x <- seq(1, 10, length.out = 1000) # predictor values
  y <- x^2 + rnorm(1000, 0, sd = sd) # response: quadratic (non-linear) function of x plus random noise with standard deviation sd
  summary(lm(y ~ x))$r.squared # return R-squared of the (misspecified) linear fit
}
sds <- seq(0.5, 40, length.out = 40) # 40 evenly spaced sd values from 0.5 to 40
res <- sapply(sds, r2) # compute R-squared for each sd value
plot(res ~ sds, type="b", main="R-squared for different values of Sigma", xlab="Sigma (Standard Deviation)", ylab="Average R-squared")

It should produce a plot like the one shown in Figure 5.

Figure 5

As is evident, even with a decidedly non-linear underlying model, high R-squared values can be observed for a wide range of sigma values. This means that even arbitrarily high values of the statistic cannot be taken as evidence of model adequacy.
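
One way to see the misspecification that R-squared misses is to look at what the straight-line fit leaves behind. The sketch below is an illustration only, not the article's method: it assumes a noise sd of 2 and uses a nested-model F-test and a residual correlation as simple checks for curvature.

```r
set.seed(3)
x <- seq(1, 10, length.out = 1000)
y <- x^2 + rnorm(1000, 0, 2)       # quadratic truth; sd of 2 is an assumption

fit_linear <- lm(y ~ x)            # misspecified straight-line model
fit_quad   <- lm(y ~ x + I(x^2))   # model that allows for curvature

r2_linear <- summary(fit_linear)$r.squared   # high despite the misspecification

# F-test comparing the nested models: a small p-value signals that
# the straight-line model omits systematic structure
cmp     <- anova(fit_linear, fit_quad)
p_value <- cmp$`Pr(>F)`[2]

# Residuals of the straight-line fit remain strongly related to curvature in x
resid_curvature_cor <- cor(resid(fit_linear), (x - mean(x))^2)

print(c(r2_linear = r2_linear, p_value = p_value,
        resid_curvature_cor = resid_curvature_cor))
```

The straight-line fit reports a high R-squared while its residuals are clearly not random noise, which is the situation the section describes.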

Conclusion

Hopefully, the above examples serve as a sufficient illustration of the dangers of over-interpreting the coefficient of determination. While it is a measure of goodness-of-fit, it only gains meaning if the model is adequate with respect to the underlying mechanism generating the data. Using it as a measure of model adequacy is not warranted, and to the extent that homing in on the correct model affects predictive error, it is not a measure of that either. Whether that leaves any useful place for it at all is still a matter of debate.

[1] Spanos, A. (2019). “Probability Theory and Statistical Inference: Empirical Modeling with Observational Data.” Cambridge University Press. p.635

[2] Hagquist, C., Stenbeck, M. (1998). “Goodness of Fit in Regression Analysis – R2 and G2 Reconsidered.” Quality and Quantity. 32, pp.229–245

[3] Ford, C. (2015). “Is R-squared Useless?” https://data.library.virginia.edu/is-r-squared-useless/

Bio: Arpit Bhushan Sharma (B.Tech, 2016–2020) Electrical & Electronics Engineering, KIET Group of Institutions Ghaziabad, Uttar Pradesh, India. | Project Manager — Project4jungle | Chief Organising Officer — Waco | Project Manager— Edusmith | Student Member R10 IEEE | Student Member PES | Voice: +91 8445726929 | E-mail: bhushansharmaarpit@gmail.com

Translated from: https://medium.com/@bhushansharmaarpit/does-r-square-measure-the-predictive-capacity-or-statistical-sufficiency-73812bb95c38