Consider a simple linear model:

$\pmb{y}=X'\pmb{\beta}+\epsilon$

where $\epsilon_i\sim\mathrm{i.i.d.}\;\mathcal{N}(0,\sigma^2)$, $X\in\mathbb{R}^{n\times p}$ with $p\geq2$, and $X$ contains a column of constants.

My question is: given $\mathrm{E}(X'X)$, $\beta$ and $\sigma$, is there a formula for a non-trivial upper bound on $\mathrm{E}(R^2)$*? (Assuming the model was estimated by OLS.)

* In writing this, I assumed that obtaining $E(R^2)$ itself would not be feasible.

EDIT1

Using the solution derived by Stéphane Laurent (see below), we can obtain a non-trivial upper bound on $E(R^2)$. Some numerical simulations (below) show that this bound is actually quite tight.

Stéphane Laurent derived the following: $R^2\sim\mathrm{B}(p-1,n-p,\lambda)$, where $\mathrm{B}(p-1,n-p,\lambda)$ is a non-central Beta distribution with non-centrality parameter $\lambda$ given by

$\lambda=\frac{||X'\beta-\mathrm{E}(X)'\beta1_n||^2}{\sigma^2}$

So

$\mathrm{E}(R^2)=\mathrm{E}\left(\frac{\chi^2_{p-1}(\lambda)}{\chi^2_{p-1}(\lambda)+\chi^2_{n-p}}\right)\leq\frac{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)}{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)+\mathrm{E}\left(\chi^2_{n-p}\right)}$

where $\chi^2_{k}(\lambda)$ is a non-central $\chi^2$ with parameter $\lambda$ and $k$ degrees of freedom. So a non-trivial upper bound for $\mathrm{E}(R^2)$ is

$\frac{\lambda+p-1}{\lambda+n-1}$
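
A sketch of why this quantity is indeed an upper bound, assuming the convention used here in which $\mathrm{E}\left(\chi^2_k(\lambda)\right)=k+\lambda$. The variable $K$ is an auxiliary index introduced only for this argument. A non-central $\chi^2_{p-1}(\lambda)$ is a Poisson mixture of central chi-squares: given $K\sim\mathrm{Poisson}(\lambda/2)$, it is distributed as $\chi^2_{p-1+2K}$. Conditionally on $K$, $R^2$ is therefore a ratio of independent central chi-squares, with mean

$\mathrm{E}(R^2\mid K)=\frac{p-1+2K}{n-1+2K}=1-\frac{n-p}{n-1+2K}$

The right-hand side is concave in $K$, so Jensen's inequality with $\mathrm{E}(2K)=\lambda$ gives

$\mathrm{E}(R^2)\leq\frac{\lambda+p-1}{\lambda+n-1}$

with equality when $\lambda=0$.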

It is very tight (much tighter than I had expected would be possible):

for example, using:

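rho<-0.75                 # common correlation (presumably between the non-constant regressors)
p<-10                     # number of columns of X
n<-25*p                   # sample size
Su<-matrix(rho,p-1,p-1)   # (p-1) x (p-1) matrix with rho off the diagonal
diag(Su)<-1               # unit diagonal
su<-1                     # presumably the error standard deviation sigma
set.seed(123)
bet<-runif(p)             # coefficient vector beta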

the mean of the $R^2$ over 1000 simulations is 0.960819. The theoretical upper bound above gives 0.9609081. The bound seems to be equally precise across many values of $R^2$. Truly astounding!
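
The simulation loop itself is not shown in the post. The following is a minimal sketch of one way to run such a check, not the code that produced the numbers quoted above, so its output need not match them. It assumes that the design is drawn once (an intercept plus $p-1$ Gaussian regressors with covariance `Su`) and then held fixed, and that $\lambda$ is computed from the centred mean vector. The names `Z`, `mu`, `bound` and `r2_once` are hypothetical.

```r
# Sketch: Monte Carlo check of E(R^2) <= (lambda + p - 1) / (lambda + n - 1).
# Assumption: X is drawn once and kept fixed across replications.
rho <- 0.75; p <- 10; n <- 25 * p
Su <- matrix(rho, p - 1, p - 1); diag(Su) <- 1
su <- 1
set.seed(123)
bet <- runif(p)

# Design: intercept column plus p - 1 correlated Gaussian regressors.
Z <- matrix(rnorm(n * (p - 1)), n, p - 1) %*% chol(Su)
X <- cbind(1, Z)
mu <- drop(X %*% bet)

# Noncentrality parameter: squared norm of the centred mean over sigma^2.
lambda <- sum((mu - mean(mu))^2) / su^2
bound <- (lambda + p - 1) / (lambda + n - 1)

# R^2 of one OLS fit (with intercept) on a fresh response vector.
r2_once <- function() {
  y <- mu + su * rnorm(n)
  res <- lm.fit(X, y)$residuals
  1 - sum(res^2) / sum((y - mean(y))^2)
}
r2 <- replicate(1000, r2_once())
print(c(mean_r2 = mean(r2), bound = bound))
```

Holding `X` fixed keeps $\lambda$ constant across replications, which is the setting in which the non-central Beta result applies directly.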

EDIT2:

After further investigation, it appears that the quality of the upper bound as an approximation to $E(R^2)$ improves as $\lambda+p$ increases (and, all else being equal, $\lambda$ increases with $n$).

linear-model expected-value

— user603

$R^2$ has a Beta distribution with parameters depending only on $n$ and $p$. No?

— Stéphane Laurent

Oops, sorry, my previous claim is true only under the hypothesis of the "null model" (intercept only). Otherwise the distribution of $R^2$ should be something like a noncentral Beta distribution, with a noncentrality parameter involving the unknown parameters.

— Stéphane Laurent

@StéphaneLaurent: thanks. Would you know more about the relationship between the unknown parameters and the parameters of the Beta? I'm stuck, so any pointer would be welcome...

— user603

Do you absolutely need to deal with $E[R^2]$? There may be a simple exact formula for $E[R^2/(1-R^2)]$.

— Stéphane Laurent

With the notation of my answer, $R^2/(1-R^2) = k F$ for some scalar $k$, and the first moment of the noncentral $F$ distribution is simple.

— Stéphane Laurent
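
That first moment can be written out explicitly. This is a sketch, assuming $n-m>2$ and the convention $\mathrm{E}\left(\chi^2_k(\lambda)\right)=k+\lambda$. Here $k=(m-\ell)/(n-m)$, and the numerator and denominator of $F$ are independent, so

$\mathrm{E}\left(\frac{R^2}{1-R^2}\right)=\mathrm{E}\left(\chi^2_{m-\ell}(\lambda)\right)\,\mathrm{E}\left(\frac{1}{\chi^2_{n-m}}\right)=\frac{m-\ell+\lambda}{n-m-2}$

which, with $m=p$ and $\ell=1$, equals $(\lambda+p-1)/(n-p-2)$.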

Any linear model can be written $\boxed{Y=\mu+\sigma G}$ where $G$ has the standard normal distribution on $\mathbb{R}^n$ and $\mu$ is assumed to belong to a linear subspace $W$ of $\mathbb{R}^n$ . In your case $W=\text{Im}(X)$ .

Let $[1] \subset W$ be the one-dimensional linear subspace generated by the vector $(1,1,\ldots,1)$ . Taking $U=[1]$ below, the $R^2$ is highly related to the classical Fisher statistic

$F = \frac{{\Vert P_Z Y\Vert}^2/(m-\ell)}{{\Vert P_W^\perp Y\Vert}^2/(n-m)},$

for the hypothesis test of $H_0\colon\{\mu \in U\}$ where $U\subset W$ is a linear subspace, and denoting by $Z=U^\perp \cap W$ the orthogonal complement of $U$ in $W$, and denoting $m=\dim(W)$ and $\ell=\dim(U)$ (then $m=p$ and $\ell=1$ in your situation).

Indeed,

$\dfrac{{\Vert P_Z Y\Vert}^2}{{\Vert P_W^\perp Y\Vert}^2} = \frac{R^2}{1-R^2}$

because the definition of $R^2$ is

$R^2 = \frac{{\Vert P_Z Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}=1 - \frac{{\Vert P^\perp_W Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}.$

Obviously $\boxed{P_Z Y = P_Z \mu + \sigma P_Z G}$ and $\boxed{P_W^\perp Y = \sigma P_W^\perp G}$ .

When $H_0\colon\{\mu \in U\}$ is true then $P_Z \mu = 0$ and therefore

$F = \frac{{\Vert P_Z G\Vert}^2/(m-\ell)}{{\Vert P_W^\perp G\Vert}^2/(n-m)} \sim F_{m-\ell,n-m}$

has the Fisher $F_{m-\ell,n-m}$ distribution. Consequently, from the classical relation between the Fisher distribution and the Beta distribution, $R^2 \sim {\cal B}(m-\ell, n-m)$.
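
The relation used in that last step can be made explicit. As a sketch, write $d_1=m-\ell$ and $d_2=n-m$ (shorthand introduced here). The identity $R^2/(1-R^2)=(d_1/d_2)F$ inverts to

$R^2=\frac{d_1F}{d_1F+d_2}$

and for $F\sim F_{d_1,d_2}$ this ratio has a Beta distribution whose shape parameters, in the usual parametrisation, are $d_1/2$ and $d_2/2$. This assumes that ${\cal B}(m-\ell,n-m)$ is shorthand indexed by the degrees of freedom.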

In the general situation we have to deal with $P_Z Y = P_Z \mu + \sigma P_Z G$ when $P_Z\mu \neq 0$ . In this general case one has ${\Vert P_Z Y\Vert}^2 \sim \sigma^2\chi^2_{m-\ell}(\lambda)$ , the noncentral $\chi^2$ distribution with $m-\ell$ degrees of freedom and noncentrality parameter $\boxed{\lambda=\frac{{\Vert P_Z \mu\Vert}^2}{\sigma^2}}$ , and then $\boxed{F \sim F_{m-\ell,n-m}(\lambda)}$ (noncentral Fisher distribution). This is the classical result used to compute power of $F$ -tests.

The classical relation between the Fisher distribution and the Beta distribution holds in the noncentral situation too. Finally, $R^2$ has the noncentral beta distribution with "shape parameters" $m-\ell$ and $n-m$ and noncentrality parameter $\lambda$. I think the moments are available in the literature, but they are possibly highly complicated.

Finally let us write down $P_Z\mu$ . Note that $P_Z = P_W - P_U$ . One has $P_U \mu = \bar\mu 1$ when $U=[1]$ , and $P_W \mu = \mu$ . Hence $P_Z \mu =\mu - \bar\mu 1$ where here $\mu=X\beta$ for the unknown parameters vector $\beta$ .
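
The result can be checked numerically. The following is a minimal sketch with made-up numbers: the sizes, coefficients and the names `f_once`, `f_sim` and `f_mean` are all hypothetical. It computes $\lambda$ from $P_Z\mu=\mu-\bar\mu 1$, simulates the $F$ statistic for a fixed design, and compares it with R's noncentral $F$ distribution (`pf` with `ncp`), assuming that R's `ncp` matches the $\lambda$ defined above.

```r
# Sketch with made-up numbers: does F follow a noncentral F distribution?
set.seed(1)
n <- 30; p <- 3; sigma <- 1              # hypothetical sizes and noise level
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # fixed design with intercept
beta <- c(1, 0.3, -0.2)                  # hypothetical coefficients
mu <- drop(X %*% beta)

# P_Z mu = mu - mean(mu) * 1, hence the noncentrality parameter:
lambda <- sum((mu - mean(mu))^2) / sigma^2

m <- p; l <- 1                           # dim(W) and dim(U)
f_once <- function() {
  y <- mu + sigma * rnorm(n)
  rss_w <- sum(lm.fit(X, y)$residuals^2) # ||P_W^perp Y||^2
  rss_u <- sum((y - mean(y))^2)          # ||P_U^perp Y||^2
  # ||P_Z Y||^2 = rss_u - rss_w, since P_U^perp = P_Z + P_W^perp
  ((rss_u - rss_w) / (m - l)) / (rss_w / (n - m))
}
f_sim <- replicate(5000, f_once())

# Simulated mean against the noncentral F mean d2 (d1 + lambda) / (d1 (d2 - 2)).
d1 <- m - l; d2 <- n - m
f_mean <- d2 * (d1 + lambda) / (d1 * (d2 - 2))
print(c(simulated = mean(f_sim), theoretical = f_mean))

# Kolmogorov-Smirnov comparison with the noncentral F distribution.
ks <- ks.test(f_sim, "pf", df1 = d1, df2 = d2, ncp = lambda)
print(ks$p.value)
```

A small $\lambda$ is used on purpose so that the noncentral `pf` evaluation stays cheap; the same check should carry over to $R^2$ through $R^2=d_1F/(d_1F+d_2)$.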

— Stéphane Laurent

$P_Z x$ is the orthogonal projection of $x$ on the linear subspace $Z$. And $P^\perp$ denotes projection on the orthogonal.

— Stéphane Laurent

Beware of $Px \neq \Vert P x \Vert^2$. I'm going to edit my post to write the formulas.

— Stéphane Laurent

Done - do you see any simplification ?

— Stéphane Laurent

$\bar \mu = \frac{1}{n} \sum \mu_i$

— Stéphane Laurent

Type I, obviously: type II are distributed on $(0, \infty)$. Actually $R^2/(1-R^2)$ has the type II distribution. I have done the last corrections for today.

— Stéphane Laurent

Conditional expectation of R-squared
