The multiple correlation coefficient characterizes the strength of the linear relationship between one random variable and a set of other random variables. More precisely, if $(\xi_1, \xi_2, \ldots, \xi_k)$ is a random vector in $\mathbb{R}^k$, then the multiple correlation coefficient $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}$ between $\xi_1$ and $\xi_2, \ldots, \xi_k$ is numerically equal to the pairwise linear correlation coefficient between $\xi_1$ and its best linear approximation $M(\xi_1 \mid \xi_2, \ldots, \xi_k)$ in terms of the variables $\xi_2, \ldots, \xi_k$, that is, the linear regression of $\xi_1$ on $\xi_2, \ldots, \xi_k$.
Properties
The multiple correlation coefficient has the following property. Suppose that $M\xi_1 = M\xi_2 = \ldots = M\xi_k = 0$, where $M$ denotes the expectation, and that $\xi_1^* = \beta_2 \xi_2 + \beta_3 \xi_3 + \cdots + \beta_k \xi_k$ is the regression of $\xi_1$ on $\xi_2, \ldots, \xi_k$. Then, among all linear combinations of the variables $\xi_2, \ldots, \xi_k$, the variable $\xi_1$ has its maximum correlation coefficient with $\xi_1^*$, and this maximum equals $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}$. In this sense, the multiple correlation coefficient is a special case of the canonical correlation coefficient. For $k = 2$, the multiple correlation coefficient coincides with the absolute value of the pairwise linear correlation coefficient $\rho_{12}$ between $\xi_1$ and $\xi_2$.
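The maximizing property can be checked numerically for a small case. The sketch below assumes $k = 3$, zero means and unit variances, and a made-up correlation matrix `R`; the helper names are hypothetical and exist only for this illustration. It solves the two normal equations for the regression coefficients and compares the resulting correlation with the best value found on a grid of other linear combinations.

```python
import math

# Hypothetical correlation matrix of (xi_1, xi_2, xi_3); values are illustrative only.
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]


def corr_with_combination(R, b2, b3):
    """Correlation between xi_1 and b2*xi_2 + b3*xi_3 (zero means, unit variances)."""
    cov = b2 * R[0][1] + b3 * R[0][2]
    var = b2 * b2 + b3 * b3 + 2.0 * b2 * b3 * R[1][2]
    return cov / math.sqrt(var)


def regression_coefficients(R):
    """Solve the 2x2 normal equations for (beta_2, beta_3) by Cramer's rule."""
    det = 1.0 - R[1][2] ** 2
    beta2 = (R[0][1] - R[1][2] * R[0][2]) / det
    beta3 = (R[0][2] - R[1][2] * R[0][1]) / det
    return beta2, beta3


beta2, beta3 = regression_coefficients(R)
rho_multiple = corr_with_combination(R, beta2, beta3)

# Scan 720 directions (cos t, sin t); rescaling a combination does not change
# its correlation, so directions cover all nonzero linear combinations.
best_on_grid = max(
    corr_with_combination(R, math.cos(t), math.sin(t))
    for t in (2.0 * math.pi * i / 720 for i in range(720))
)

print(rho_multiple, best_on_grid)
```

No direction on the grid exceeds the correlation attained by the regression, and the grid maximum approaches it as the grid is refined.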
Calculation
The multiple correlation coefficient is calculated from the correlation matrix $\mathbf{R} = \{\rho_{i,j}\}$, $i, j = 1, \ldots, k$, according to the formula
$$\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = 1 - \frac{\left\vert R \right\vert}{R_{11}},$$
where $\left\vert R \right\vert$ is the determinant of the correlation matrix and $R_{11}$ is the cofactor of the element $\rho_{11} = 1$; here $0 \leqslant \rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} \leqslant 1$. If $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 1$, then with probability 1 the values of $\xi_1$ coincide with a linear combination of $\xi_2, \ldots, \xi_k$, so the joint distribution of $\xi_1, \xi_2, \ldots, \xi_k$ is concentrated on a hyperplane in $\mathbb{R}^k$. On the other hand, if $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 0$, then all pairwise correlation coefficients vanish, $\rho_{12} = \rho_{13} = \ldots = \rho_{1k} = 0$, so $\xi_1$ is uncorrelated with $\xi_2, \ldots, \xi_k$. The converse is also true. The multiple correlation coefficient can also be calculated by the formula
$$\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = 1 - \frac{\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2}{\sigma_1^2},$$
where $\sigma_1^2$ is the variance of $\xi_1$ and $\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2 = M(\xi_1 - (\beta_2 \xi_2 + \beta_3 \xi_3 + \cdots + \beta_k \xi_k))^2$ is the variance of $\xi_1$ about the regression.
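As an illustration of the determinant formula, the following minimal sketch computes the squared multiple correlation coefficient from a correlation matrix using only the Python standard library. The function names and the example matrices are assumptions made for this example, and the code assumes that the correlation matrix of the predictor variables is nonsingular, so that the cofactor of the first element is nonzero.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def multiple_correlation_squared(R):
    """rho^2 = 1 - |R| / R_11, with R_11 the cofactor of the (1, 1) element."""
    minor = [row[1:] for row in R[1:]]  # delete the first row and column
    cofactor_11 = determinant(minor)    # sign (-1)^(1+1) = +1
    return 1.0 - determinant(R) / cofactor_11


# Illustrative matrices (made-up values).
R3 = [[1.0, 0.6, 0.5], [0.6, 1.0, 0.2], [0.5, 0.2, 1.0]]
R2 = [[1.0, -0.7], [-0.7, 1.0]]

print(multiple_correlation_squared(R3))
print(multiple_correlation_squared(R2))
```

For the two-variable matrix the determinant is 1 minus the squared pairwise correlation and the cofactor is 1, so the result reduces to the squared pairwise correlation, in agreement with the case of two variables.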
Sample multiple correlation coefficient
The sample analogue of the multiple correlation coefficient is the quantity $r_{1 \bullet 2, \ldots, k} = \sqrt{1 - \frac{s_{1 \bullet 2, \ldots, k}^2}{s_1^2}}$, where $s_{1 \bullet 2, \ldots, k}^2$ and $s_1^2$ are estimates of $\sigma_{\xi_1 \bullet \xi_2, \ldots, \xi_k}^2$ and $\sigma_1^2$ obtained from a sample of size $n$. To test the null hypothesis of no correlation, the distribution of the statistic $r_{1 \bullet 2, \ldots, k}$ is used. Provided that the sample is drawn from a multivariate normal distribution, the quantity $r_{1 \bullet 2, \ldots, k}^2$ has a beta distribution with parameters $\left(\frac{k-1}{2}, \frac{n-k}{2}\right)$ if $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} = 0$. For the case $\rho_{\xi_1 \bullet \xi_2, \ldots, \xi_k} \neq 0$, the form of the distribution of $r_{1 \bullet 2, \ldots, k}^2$ is known, but it is almost never used because it is unwieldy.
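The test of the null hypothesis can be sketched as follows: given an observed squared sample coefficient, the p-value is the upper-tail probability of the beta distribution with parameters (k-1)/2 and (n-k)/2. The code below is a rough standard-library approximation, not a reference implementation. The function names are hypothetical, the tail integral is evaluated with composite Simpson's rule, and the sketch assumes that the squared coefficient lies strictly between 0 and 1 and that n - k is at least 2, so that the integrand stays bounded.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution for 0 < x <= 1, assuming b >= 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    if x >= 1.0:
        # Limit at the right endpoint: nonzero only when b == 1.
        return math.exp(log_norm) if b == 1.0 else 0.0
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log1p(-x))


def p_value_no_correlation(r_squared, n, k, steps=2000):
    """Approximate P(Beta((k-1)/2, (n-k)/2) >= r_squared) by Simpson's rule."""
    if not 0.0 < r_squared < 1.0:
        raise ValueError("r_squared must lie strictly between 0 and 1")
    if k < 2 or n - k < 2:
        raise ValueError("this sketch requires k >= 2 and n - k >= 2")
    a, b = (k - 1) / 2.0, (n - k) / 2.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (1.0 - r_squared) / steps
    total = beta_pdf(r_squared, a, b) + beta_pdf(1.0, a, b)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * beta_pdf(r_squared + i * h, a, b)
    return total * h / 3.0


# Hypothetical inputs: squared coefficient 0.5 from n = 7 observations of k = 3 variables.
print(p_value_no_correlation(0.5, n=7, k=3))
```

With these inputs the beta parameters are 1 and 2, for which the exact tail probability is (1 - 0.5)^2 = 0.25, so the printed value should be very close to 0.25. A statistical library's beta distribution routine would be the more robust choice in practice.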
See also
- Coefficient of determination