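Given three vectors $a$, $b$, and $c$, is it possible for the correlations between $a$ and $b$, $a$ and $c$, and $b$ and $c$ to all be negative? I.e., is this possible?
$$\operatorname{cor}(a,b) < 0, \quad \operatorname{cor}(a,c) < 0, \quad \operatorname{cor}(b,c) < 0$$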
Answers:
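This is possible if the vectors have length 3 or more. For example:
$$a = (-1, 1, 1), \quad b = (1, -1, 1), \quad c = (1, 1, -1)$$
The correlations are as follows:
$$\operatorname{cor}(a,b) = \operatorname{cor}(a,c) = \operatorname{cor}(b,c) = -0.5$$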
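A quick way to check a length-3 example in R. The three vectors below are an illustrative choice (each has a single -1 in a different position), and the names `x_a`, `x_b`, `x_c` are mine; any vectors with the same pattern would do.

```r
# Illustrative vectors (an assumption for this sketch): one -1 per vector,
# each in a different position.
x_a <- c(-1,  1,  1)
x_b <- c( 1, -1,  1)
x_c <- c( 1,  1, -1)

# Pairwise Pearson correlations as a 3 x 3 matrix
pairwise <- cor(cbind(a = x_a, b = x_b, c = x_c))
print(pairwise)   # every off-diagonal entry is -0.5
```

Each vector has mean 1/3, so the centered vectors have squared norm 8/3 and pairwise dot products of -4/3, which gives -0.5.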
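We can prove that this is impossible for vectors of length 2. The sign of the correlation is the sign of the covariance, so
$$\operatorname{cor}(a,b) < 0 \;\Rightarrow\; 2(a_1 b_1 + a_2 b_2) - (a_1 + a_2)(b_1 + b_2) < 0 \;\Rightarrow\; a_1 b_1 + a_2 b_2 - a_1 b_2 - a_2 b_1 < 0 \;\Rightarrow\; (a_1 - a_2)(b_1 - b_2) < 0$$
The formula makes sense: if $a_1$ is larger than $a_2$, then $b_2$ must be larger than $b_1$ to make the correlation negative.
Similarly, for the correlations between $(a, c)$ and $(b, c)$ we get
$$(a_1 - a_2)(c_1 - c_2) < 0, \quad (b_1 - b_2)(c_1 - c_2) < 0$$
Clearly, these three inequalities cannot all hold at the same time: the product of their left-hand sides is $(a_1 - a_2)^2 (b_1 - b_2)^2 (c_1 - c_2)^2 \ge 0$, whereas the product of three negative numbers is negative.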
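The length-2 argument can also be checked numerically. This is only a sketch: the uniform sampling, the seed, and the variable names are my own choices, and it relies on the fact that for two-component vectors the sign of the correlation is the sign of the product of the component differences.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only
trials <- 10000
A <- matrix(runif(2 * trials), ncol = 2)     # each row is one length-2 vector a
B <- matrix(runif(2 * trials), ncol = 2)     # same for b
C <- matrix(runif(2 * trials), ncol = 2)     # same for c

# Difference between the two components of each vector
da <- A[, 1] - A[, 2]
db <- B[, 1] - B[, 2]
dc <- C[, 1] - C[, 2]

# Spot check on the first 100 trials: sign of cor() matches the product rule
agree <- sapply(1:100, function(i) {
  sign(cor(A[i, ], B[i, ])) == sign(da[i] * db[i])
})
all(agree)

# Count trials in which all three pairwise products are negative
all_negative <- (da * db < 0) & (da * dc < 0) & (db * dc < 0)
sum(all_negative)   # 0
```

The count is zero however many trials are drawn, since the three products multiply to a square.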
Yes, they can.
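Suppose you have a multivariate normal distribution $X \sim N(0, \Sigma)$ with $X \in \mathbb{R}^3$. The only restriction on $\Sigma$ is that it has to be positive semi-definite.
So take the following example:
$$\Sigma = \begin{pmatrix} 1 & -0.2 & -0.2 \\ -0.2 & 1 & -0.2 \\ -0.2 & -0.2 & 1 \end{pmatrix}$$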
Its eigenvalues are all positive (1.2, 1.2, 0.6), and you can create vectors with negative correlation.
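A minimal sketch of this construction in base R. It assumes the equicorrelated matrix with -0.2 off the diagonal, which is the unit-diagonal matrix of that form whose eigenvalues are (1.2, 1.2, 0.6). Sample size and seed are arbitrary, and sampling goes through a Cholesky factor so no extra package is needed.

```r
# Assumed covariance matrix: unit variances, all pairwise correlations -0.2
Sigma <- matrix(-0.2, nrow = 3, ncol = 3)
diag(Sigma) <- 1

# Eigenvalues: approximately 1.2, 1.2, 0.6, so Sigma is positive definite
eigen(Sigma, symmetric = TRUE)$values

set.seed(42)                          # arbitrary seed
n <- 10000
Z <- matrix(rnorm(n * 3), nrow = n)   # independent standard normal columns

# chol() returns upper-triangular R with t(R) %*% R == Sigma,
# so the rows of Z %*% R have covariance Sigma.
X <- Z %*% chol(Sigma)

# Sample correlation matrix: off-diagonal entries close to -0.2
round(cor(X), 2)
```

The three columns of `X` then play the roles of the three vectors.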
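Let's start with a correlation matrix for 3 variables:
$$\Sigma = \begin{pmatrix} 1 & p & q \\ p & 1 & r \\ q & r & 1 \end{pmatrix}$$
Non-negative definiteness creates constraints for the pairwise correlations $p, q, r$, which can be written as
$$pqr \ge \frac{p^2 + q^2 + r^2 - 1}{2}$$
For example, if $p = q = -1$, the value of $r$ is restricted by $2r \ge r^2 + 1$, which forces $r = 1$. On the other hand, if $p = q = -\tfrac{1}{2}$, $r$ can be anywhere within the $[-\tfrac{1}{2}, 1]$ range.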
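To explore the constraint numerically, write $p$, $q$, $r$ for the three pairwise correlations. Non-negativity of the determinant, $1 + 2pqr - p^2 - q^2 - r^2 \ge 0$, is a quadratic inequality in $r$, and solving it gives $r \in pq \pm \sqrt{(1 - p^2)(1 - q^2)}$. The helpers `r_range` and `min_eigenvalue` below are hypothetical names of my own, meant as a sketch rather than a reference implementation.

```r
# Admissible interval for r given p and q (hypothetical helper)
r_range <- function(p, q) {
  half_width <- sqrt((1 - p^2) * (1 - q^2))
  c(lower = p * q - half_width, upper = p * q + half_width)
}

# Smallest eigenvalue of the 3 x 3 correlation matrix built from p, q, r
min_eigenvalue <- function(p, q, r) {
  S <- matrix(c(1, p, q,
                p, 1, r,
                q, r, 1), nrow = 3, byrow = TRUE)
  min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}

r_range(-1, -1)        # lower = upper = 1
r_range(-0.5, -0.5)    # lower = -0.5, upper = 1

# At an endpoint the matrix is singular (smallest eigenvalue about 0);
# just outside the interval it is no longer non-negative definite.
min_eigenvalue(-0.5, -0.5, -0.5)
min_eigenvalue(-0.5, -0.5, -0.6)
```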
Answering the interesting follow up question by @amoeba: "what is the lowest possible correlation that all three pairs can simultaneously have?"
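Let $p = q = r = x < 0$. The non-negative definiteness constraint $pqr \ge (p^2 + q^2 + r^2 - 1)/2$ then becomes $2x^3 - 3x^2 + 1 \ge 0$. Find the smallest root of $2x^3 - 3x^2 + 1 = (x - 1)^2 (2x + 1)$, which will give you $x = -\tfrac{1}{2}$. Perhaps not surprising for some.
A stronger argument can be made if one of the correlations, say $r$, equals $-1$. From the same constraint, $-2pq \ge p^2 + q^2$, i.e. $(p + q)^2 \le 0$, we can deduce that $p = -q$. Therefore, if two correlations are $-1$, the third one must be $1$.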
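The bound for three equal correlations can be checked in R. With all three pairwise correlations equal to $x$, the determinant condition reduces to $2x^3 - 3x^2 + 1 \ge 0$. The sketch below finds the roots of that cubic with `polyroot` and confirms through the eigenvalues that the matrix stops being non-negative definite just below the smallest root. The helper name `min_eig_equal` is my own.

```r
# polyroot() takes coefficients in increasing order: 1 + 0*x - 3*x^2 + 2*x^3
roots <- polyroot(c(1, 0, -3, 2))
lowest <- min(Re(roots))
lowest                      # approximately -0.5

# Smallest eigenvalue of the 3 x 3 matrix with all correlations equal to x
min_eig_equal <- function(x) {
  S <- matrix(x, nrow = 3, ncol = 3)
  diag(S) <- 1
  min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}

min_eig_equal(-0.5)         # approximately 0: boundary case
min_eig_equal(-0.51)        # negative: not a valid correlation matrix
```

For negative $x$ the smallest eigenvalue of this matrix is $1 + 2x$, which is why the boundary sits at $-\tfrac{1}{2}$.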
A simple R function to explore this:
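# Estimate how often three independent uniform vectors of length n
# have all three pairwise correlations negative.
f <- function(n, trials = 10000) {
  count <- 0
  for (i in 1:trials) {
    a <- runif(n)
    b <- runif(n)
    c <- runif(n)
    # The conditions are scalars, so use short-circuit && rather than &
    if (cor(a, b) < 0 && cor(a, c) < 0 && cor(b, c) < 0) {
      count <- count + 1
    }
  }
  count / trials   # proportion of trials with all three correlations negative
}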
As a function of n, f(n) starts at 0, becomes nonzero at n = 3 (with typical values around 0.06), then increases to around 0.11 by n = 15, after which it seems to stabilize.
So, not only is it possible to have all three correlations negative, it doesn't seem to be terribly uncommon (at least for uniform distributions).
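For anyone who wants to reproduce the trend, here is a self-contained sketch that tabulates the estimate over several vector lengths. The function name `prop_all_negative`, the seed, and the grid of lengths are my own choices. The estimates are Monte Carlo values and will vary from run to run.

```r
# Proportion of trials in which all three pairwise correlations are negative
prop_all_negative <- function(n, trials = 10000) {
  hits <- replicate(trials, {
    m <- matrix(runif(3 * n), nrow = n)   # columns play the roles of a, b, c
    r <- cor(m)                           # 3 x 3 correlation matrix
    all(r[upper.tri(r)] < 0)
  })
  mean(hits)
}

set.seed(123)                             # arbitrary seed
sizes <- c(2, 3, 5, 10, 15, 30)
estimates <- sapply(sizes, prop_all_negative)
data.frame(n = sizes, proportion = estimates)
```

The n = 2 row is exactly 0, in line with the length-2 argument given earlier in the thread.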