Clever Geek Handbook

Multiple correlation coefficient

The multiple correlation coefficient characterizes the strength of the linear relationship between one random variable and a set of other random variables. More precisely, if $(\xi_1, \xi_2, \ldots, \xi_k)$ is a random vector in $\mathbb{R}^k$, then the multiple correlation coefficient $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}$ between $\xi_1$ and $\xi_2, \ldots, \xi_k$ is numerically equal to the pairwise linear correlation coefficient between $\xi_1$ and its best linear approximation $M(\xi_1 \mid \xi_2, \ldots, \xi_k)$ in the variables $\xi_2, \ldots, \xi_k$, that is, the linear regression of $\xi_1$ on $\xi_2, \ldots, \xi_k$.
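As an illustration of this definition, the following Python sketch estimates the coefficient from data for the case k = 3. It fits the first variable on the other two by least squares and returns the ordinary correlation between the first variable and the fitted values. The function names (`cov`, `corr`, `multiple_corr`) and the sample numbers are illustrative assumptions, not part of the source.

```python
from math import sqrt

def cov(a, b):
    """Covariance of two equal-length sequences (divides by n)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

def corr(a, b):
    """Pairwise linear (Pearson) correlation coefficient."""
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))

def multiple_corr(y, x2, x3):
    """Correlation between y and its least-squares linear fit on x2, x3.

    Assumes x2 and x3 are not collinear. The intercept is omitted from the
    fit because a constant shift does not change a correlation coefficient.
    """
    s22, s33, s23 = cov(x2, x2), cov(x3, x3), cov(x2, x3)
    s12, s13 = cov(y, x2), cov(y, x3)
    d = s22 * s33 - s23 ** 2             # determinant of the 2x2 normal equations
    b2 = (s12 * s33 - s13 * s23) / d     # Cramer's rule
    b3 = (s13 * s22 - s12 * s23) / d
    fit = [b2 * u + b3 * v for u, v in zip(x2, x3)]
    return corr(y, fit)

# Hypothetical data, chosen only for illustration.
x2 = [1, 2, 3, 4, 5, 6]
x3 = [2, 1, 4, 3, 6, 5]
y = [1.0, 2.5, 2.0, 4.5, 4.0, 7.0]
print(multiple_corr(y, x2, x3))
```

The value returned is never smaller than the absolute value of either pairwise correlation `corr(y, x2)` or `corr(y, x3)`, since each single predictor is itself one of the admissible linear combinations.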

Properties

The multiple correlation coefficient has the following property. Suppose that

$$M\xi_1 = M\xi_2 = \ldots = M\xi_k = 0$$

and that

$$\xi_1^* = \beta_2 \xi_2 + \beta_3 \xi_3 + \cdots + \beta_k \xi_k$$

is the regression of $\xi_1$ on $\xi_2, \ldots, \xi_k$.

Then, among all linear combinations of the variables $\xi_2, \ldots, \xi_k$, the variable $\xi_1$ has its maximum correlation coefficient with $\xi_1^*$, and this maximum coincides with $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}$. In this sense, the multiple correlation coefficient is a special case of the canonical correlation coefficient. For k = 2, the multiple correlation coefficient coincides in absolute value with the pairwise linear correlation coefficient $\rho_{12}$ between $\xi_1$ and $\xi_2$.
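The case k = 2 can be checked directly. The short derivation below is a sketch that assumes zero means and $\rho_{12} \neq 0$; the symbols $\sigma_1$ and $\sigma_2$ for the standard deviations of $\xi_1$ and $\xi_2$ are notation introduced here for convenience.

```latex
\xi_1^* = \beta_2 \xi_2, \qquad
\beta_2 = \frac{M(\xi_1 \xi_2)}{M \xi_2^2} = \rho_{12} \frac{\sigma_1}{\sigma_2},

\rho_{\xi_1 \bullet \xi_2}
  = \frac{M(\xi_1 \xi_1^*)}{\sigma_1 \sqrt{M (\xi_1^*)^2}}
  = \frac{\beta_2 \, \rho_{12} \sigma_1 \sigma_2}{\sigma_1 \, |\beta_2| \, \sigma_2}
  = \operatorname{sign}(\beta_2) \, \rho_{12}
  = |\rho_{12}|.
```

The last step uses the fact that $\beta_2$ and $\rho_{12}$ have the same sign, which is why the multiple correlation coefficient is never negative even when $\rho_{12}$ is.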

Calculation

The multiple correlation coefficient is calculated from the correlation matrix $\mathbf{R} = \{\rho_{i,j}\}$, $i, j = 1, \ldots, k$, according to the formula

$$\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = 1 - \frac{|\mathbf{R}|}{R_{11}},$$

where $|\mathbf{R}|$ is the determinant of the correlation matrix and $R_{11}$ is the cofactor of the element $\rho_{11} = 1$; here $0 \leqslant \rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} \leqslant 1$. If $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 1$, then with probability 1 the values of $\xi_1$ coincide with a linear combination of $\xi_2, \ldots, \xi_k$, so the joint distribution of $\xi_1, \xi_2, \ldots, \xi_k$ is concentrated on a hyperplane in the space $\mathbb{R}^k$. On the other hand, if $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 0$, then all pairwise correlation coefficients vanish, $\rho_{12} = \rho_{13} = \ldots = \rho_{1k} = 0$, so $\xi_1$ is uncorrelated with each of $\xi_2, \ldots, \xi_k$. The converse is also true. The multiple correlation coefficient can also be calculated by the formula

$$\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = 1 - \frac{\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2}{\sigma_1^2},$$

where $\sigma_1^2$ is the variance of $\xi_1$, and $\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = M\left(\xi_1 - (\beta_2 \xi_2 + \beta_3 \xi_3 + \cdots + \beta_k \xi_k)\right)^2$ is the variance of $\xi_1$ about the regression (the residual variance).
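The determinant formula and the residual-variance formula can be compared numerically. The Python sketch below is only an illustration: the helper names (`det`, `rho2_from_corr`, `rho2_from_residual`) and the example matrix are assumptions made here, and the second function is written out for k = 3 only, to keep the linear algebra short.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d                      # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

def rho2_from_corr(R):
    """rho^2 = 1 - |R| / R_11 for a k x k correlation matrix R.

    R_11 is the determinant of R without its first row and column
    (the cofactor sign is +). Assumes that determinant is non-zero.
    """
    minor = [row[1:] for row in R[1:]]
    return 1.0 - det(R) / det(minor)

def rho2_from_residual(S):
    """rho^2 = 1 - (residual variance) / sigma_1^2, for k = 3 only.

    S is the 3x3 covariance matrix of zero-mean (xi1, xi2, xi3).
    """
    s11, c2, c3 = S[0][0], S[0][1], S[0][2]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    d = s22 * s33 - s23 ** 2
    b2 = (c2 * s33 - c3 * s23) / d      # regression coefficients beta_2, beta_3
    b3 = (c3 * s22 - c2 * s23) / d
    # M(xi1 - (b2*xi2 + b3*xi3))^2 expanded term by term
    resid = (s11 - 2 * (b2 * c2 + b3 * c3)
             + b2 * b2 * s22 + 2 * b2 * b3 * s23 + b3 * b3 * s33)
    return 1.0 - resid / s11

# Hypothetical correlation matrix; it is also the covariance matrix
# of the corresponding standardized variables.
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.2],
     [0.5, 0.2, 1.0]]
print(rho2_from_corr(R))       # |R| = 0.47, R_11 = 0.96, so about 0.5104
print(rho2_from_residual(R))   # same value up to rounding
```

Both routes give the same squared coefficient, and the result does not change if the variables are rescaled, because only the ratio of the residual variance to $\sigma_1^2$ enters.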

Sample multiple correlation coefficient

The sample analogue of the multiple correlation coefficient is the quantity

$$r_{1 \bullet 2, \ldots, k} = \sqrt{1 - \frac{s_{1 \bullet 2, \ldots, k}^2}{s_1^2}},$$

where $s_{1 \bullet 2, \ldots, k}^2$ and $s_1^2$ are estimates of $\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2$ and $\sigma_1^2$ obtained from a sample of size n. The null hypothesis of no correlation is tested using the distribution of the statistic $r_{1 \bullet 2, \ldots, k}$. If the sample is drawn from a multivariate normal distribution and $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 0$, then $r_{1 \bullet 2, \ldots, k}^2$ has a beta distribution with parameters $\left(\frac{k-1}{2}, \frac{n-k}{2}\right)$. For the case $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} \neq 0$ the distribution of $r_{1 \bullet 2, \ldots, k}^2$ is known, but it is almost never used because it is cumbersome.
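The null distribution can be checked with a small simulation. The sketch below is illustrative only and makes several assumptions that are not in the source: it takes k = 3 and n = 12, uses independent standard normal columns as the multivariate normal sample with zero multiple correlation, and computes the squared sample coefficient from the three pairwise sample correlations (the k = 3 form of the determinant formula). For k = 3 the first beta parameter is 1, so the tail probability has the simple form $P(r^2 > x) = (1 - x)^{(n-3)/2}$, and the mean of the beta distribution is $(k-1)/(n-1)$.

```python
import random

def sample_r2(y, x2, x3):
    """Squared sample multiple correlation of y on x2, x3 (k = 3)."""
    def corr(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
        saa = sum((u - ma) ** 2 for u in a)
        sbb = sum((v - mb) ** 2 for v in b)
        return sab / (saa * sbb) ** 0.5
    r12, r13, r23 = corr(y, x2), corr(y, x3), corr(x2, x3)
    # 1 - |R| / R_11 written out for a 3x3 sample correlation matrix
    return (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)

def simulate_null(n, reps, seed=0):
    """Draw r^2 repeatedly when the true multiple correlation is zero."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3)]
        values.append(sample_r2(*cols))
    return values

n, k = 12, 3
values = simulate_null(n, reps=5000)

mean_r2 = sum(values) / len(values)
beta_mean = (k - 1) / (n - 1)            # mean of Beta((k-1)/2, (n-k)/2)

tail = sum(v > 0.5 for v in values) / len(values)
beta_tail = (1 - 0.5) ** ((n - k) / 2)   # exact tail, valid because (k-1)/2 = 1

print(mean_r2, beta_mean)
print(tail, beta_tail)
```

The simulated mean and tail fraction should land close to the beta values, up to Monte Carlo error. For general k the tail probability has no such closed form and would normally be taken from a statistics library's beta distribution.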

See also

  • Coefficient of determination

Source - https://ru.wikipedia.org/w/index.php?title=Multiple_correlation coefficient_old&oldid = 98801700


Clever Geek | 2019