
Kruskal-Wallis test

The Kruskal-Wallis test is designed to test whether several samples have equal medians. It is a generalization of the Wilcoxon-Mann-Whitney test to more than two samples. The Kruskal-Wallis test is a rank test, and is therefore invariant under any monotonic transformation of the measurement scale.

Also known as: the Kruskal-Wallis H-test, Kruskal-Wallis one-way analysis of variance (by ranks), or simply the Kruskal-Wallis test. It is named after the American statisticians William Kruskal and W. Allen Wallis.

Example problem

Suppose the FIFA World Cup is underway. The first sample is a poll of fans asking "What are the chances of the Russian national team winning?" conducted before the start of the championship; the second sample is collected after the first game, the third after the second match, and so on. The values in the samples are Russia's chances of winning on a ten-point scale (1 for "no prospects", 10 for "taking the cup to Russia is only a matter of time"). The task is to check whether the poll results depend on the course of the championship.
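For a quick sketch of how such a comparison could be run, SciPy provides an implementation of the test as `scipy.stats.kruskal`; the poll values below are invented purely for illustration.

```python
from scipy.stats import kruskal  # SciPy's implementation of the Kruskal-Wallis H-test

# Hypothetical poll results (ten-point scale) at three moments of the championship
before = [3, 4, 5, 4, 6]
after_game1 = [6, 7, 8, 7, 6]
after_game2 = [5, 5, 6, 7, 6]

stat, p = kruskal(before, after_game1, after_game2)
# A small p-value would suggest the poll results depend on the course of the championship
print(f"H = {stat:.3f}, p = {p:.3f}")
```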

Description of the test

Given $k$ samples:

$$x_1^{n_1}=\{x_{11},\;\ldots,\;x_{1n_1}\},\;\ldots,\;x_k^{n_k}=\{x_{k1},\;\ldots,\;x_{kn_k}\}.$$

The combined sample is:

$$x=x_1^{n_1}\cup x_2^{n_2}\cup\ldots\cup x_k^{n_k}.$$

Additional assumptions:

  1. all samples are simple, and the combined sample is independent;
  2. the samples are drawn from unknown continuous distributions $F_1(x),\;\ldots,\;F_k(x)$.

The null hypothesis $H_0\colon F_1(x)=\ldots=F_k(x)$ is tested against the shift alternative $H_1\colon F_1(x)=F_2(x-\Delta_1)=\ldots=F_k(x-\Delta_{k-1})$.

Arrange all $N=\sum_{i=1}^{k}n_i$ sample elements in ascending order, and denote by $R_{ij}$ the rank of the $j$-th element of the $i$-th sample in the resulting variational series.
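The ranking step can be sketched as a minimal pure-Python function; tied values receive the average of the ranks they occupy, which is the standard convention for rank tests:

```python
def average_ranks(values):
    """Return 1-based ranks of values in the pooled sample; ties get average ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        # Find the run of positions i..j holding equal values
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Positions i..j correspond to ranks i+1..j+1; assign their mean
        avg = (i + j + 2) / 2
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks
```

For example, `average_ranks([10, 20, 20, 30])` gives `[1.0, 2.5, 2.5, 4.0]`: the two tied values share the mean of ranks 2 and 3.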

The Kruskal-Wallis test statistic for testing the hypothesis of a shift in the location parameters of the compared samples has the form:

$$H=\sum_{i=1}^{k}\left(1-\frac{n_i}{N}\right)\left\{\frac{\bar{R}_i-\dfrac{N+1}{2}}{\sqrt{\dfrac{(N-n_i)(N+1)}{12n_i}}}\right\}^2=\frac{12}{N(N+1)}\sum_{i=1}^{k}n_i\left(\bar{R}_i-\frac{N+1}{2}\right)^2=\frac{12}{N(N+1)}\sum_{i=1}^{k}\frac{R_i^2}{n_i}-3(N+1),$$

where

$R_i=\sum_{j=1}^{n_i}R_{ij}$ is the rank sum of the $i$-th sample;
$\bar{R}_i=\frac{1}{n_i}R_i$ is its average rank.
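A minimal sketch of the last form of the statistic, $H=\frac{12}{N(N+1)}\sum_i R_i^2/n_i-3(N+1)$, assuming no tied values (ties would require average ranks and the $H^*$ correction described later):

```python
def kruskal_wallis_h(samples):
    """H = 12/(N(N+1)) * sum_i R_i^2/n_i - 3(N+1), with R_i the rank sum of sample i."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    # Rank of a value = its 1-based position in the pooled variational series
    # (assumes all values are distinct; ties would need average ranks)
    rank = {v: r + 1 for r, v in enumerate(sorted(pooled))}
    total = 0.0
    for s in samples:
        r_i = sum(rank[x] for x in s)  # rank sum of this sample
        total += r_i ** 2 / len(s)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
```

For perfectly interleaved samples such as `[[1, 3, 5], [2, 4, 6]]` the rank sums are 9 and 12, giving the small value $H = 3/7$, consistent with no shift.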

The shift hypothesis is rejected at significance level $\alpha$ if $H\geqslant H_\alpha$, where $H_\alpha$ is the critical value, which for $k\leqslant 5$ and $n_i\leqslant 8$ is obtained from tables. For larger values, various approximations apply.

Kruskal-Wallis approximation

Let

$$M=\frac{N^3-\sum_{i=1}^{k}n_i^3}{N(N+1)};$$
$$V=2(k-1)-\frac{2\left\{3k^2-6k+N(2k^2-6k+1)\right\}}{5N(N+1)}-\frac{6}{5}\sum_{i=1}^{k}\frac{1}{n_i};$$
$$\nu_1=(k-1)\,\frac{(k-1)(M-k+1)-V}{\tfrac{1}{2}MV};$$
$$\nu_2=\frac{M-k+1}{k-1}\,\nu_1.$$

Then, in the absence of a shift, the statistic $F=\dfrac{H(M-k+1)}{(k-1)(M-H)}$ has an $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom. Thus, the null hypothesis is rejected at significance level $\alpha$ if $F>F_\alpha(\nu_1,\;\nu_2)$.
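The quantities $M$, $V$, $\nu_1$, $\nu_2$ and the statistic $F$ can be computed directly from $H$ and the sample sizes; a sketch following the formulas above:

```python
def kruskal_wallis_f_approx(h, sample_sizes):
    """Map the H statistic to an F statistic with degrees of freedom (nu1, nu2)."""
    k = len(sample_sizes)
    n = sum(sample_sizes)
    m = (n ** 3 - sum(ni ** 3 for ni in sample_sizes)) / (n * (n + 1))
    v = (2 * (k - 1)
         - 2 * (3 * k ** 2 - 6 * k + n * (2 * k ** 2 - 6 * k + 1)) / (5 * n * (n + 1))
         - 1.2 * sum(1.0 / ni for ni in sample_sizes))
    nu1 = (k - 1) * ((k - 1) * (m - k + 1) - v) / (0.5 * m * v)
    nu2 = (m - k + 1) / (k - 1) * nu1
    f = h * (m - k + 1) / ((k - 1) * (m - h))
    return f, nu1, nu2
```

The returned `f` is then compared against the critical value $F_\alpha(\nu_1,\nu_2)$, which would come from tables or a distribution library.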

Iman-Davenport approximation

According to this approximation, the null hypothesis of no shift is rejected at significance level $\alpha$ if $J\geqslant J_\alpha$, where

$$J=\frac{H}{2}\left(1+\frac{N-k}{N-1-H}\right);\qquad J_\alpha=\frac{1}{2}\left\{(k-1)\,F_\alpha(k-1;\;N-k)+\chi_\alpha^2(k-1)\right\},$$

and $F_\alpha(f_1;\;f_2)$ and $\chi_\alpha^2(a)$ are, respectively, the critical values of the Fisher and chi-squared statistics with the corresponding degrees of freedom.
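A one-line sketch of the statistic $J$ (the critical value $J_\alpha$ still requires tabulated $F$ and $\chi^2$ quantiles):

```python
def iman_davenport_j(h, n, k):
    """J = (H/2) * (1 + (N - k)/(N - 1 - H)) for N observations in k samples."""
    return h / 2.0 * (1.0 + (n - k) / (n - 1.0 - h))
```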

This is a more accurate approximation than the Kruskal-Wallis approximation. If there are tied ranks (that is, when values from different samples coincide and are assigned the same average ranks), the modified statistic must be used:

$$H^*=H\left\{1-\sum_{j=1}^{q}\frac{T_j}{N^3-N}\right\}^{-1},$$

where $T_j=t_j^3-t_j$; $t_j$ is the size of the $j$-th group of identical elements; and $q$ is the number of groups of identical elements. For $n_i\geqslant 20$, the distribution of the statistic $H$ is well approximated by the $\chi^2$ distribution with $f=k-1$ degrees of freedom; that is, the null hypothesis is rejected if $H\geqslant\chi_\alpha^2(k-1)$.
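The tie correction can be sketched as:

```python
from collections import Counter

def tie_corrected_h(h, pooled):
    """H* = H * {1 - sum_j T_j/(N^3 - N)}^{-1}, with T_j = t_j^3 - t_j per tied group."""
    n = len(pooled)
    correction = sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    return h / (1.0 - correction)
```

Groups of size 1 contribute nothing ($1^3-1=0$), so with no ties at all `tie_corrected_h` returns `h` unchanged.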

See also

  • Cochran's test


Source - https://ru.wikipedia.org/w/index.php?title=Kraskela_ criterion_ — _Wallis&oldid = 100993162

