
Odds ratio

The odds ratio (abbreviated OR) is a characteristic used in mathematical statistics to quantify the strength of the association between a trait “A” and a trait “B” in a given statistical population.

Let us consider how this indicator is calculated, using a hypothetical example. Suppose several volunteers are asked two questions:

  1. What is your blood pressure?
  2. How much alcohol do you drink?

Next, for each participant it can be determined whether he or she has property “A” (for example, “high blood pressure (BP)”) and property “B” (for example, “moderate alcohol consumption”). From the survey of the whole group of participants, we want to construct a single indicator that quantitatively characterizes the relationship between having trait “A” and having trait “B” in the population. There are three characteristics of this kind, and one of them is the odds ratio (OR), which is calculated in three steps:

  1. Among the observations that have property “B”, calculate the odds that such an observation has property “A”.
  2. Among the observations that do not have property “B”, calculate the odds that such an observation has property “A”.
  3. Divide the odds obtained in step 1 by the odds obtained in step 2; the result is the odds ratio (OR). A short computational sketch follows this list.
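
For illustration, here is a minimal Python sketch of these three steps. All counts are hypothetical, and the variable names are ours:

    # Hypothetical 2x2 counts: rows = has "B" / lacks "B", columns = has "A" / lacks "A".
    a_with_b, not_a_with_b = 40, 60        # participants with "B": 40 have "A", 60 do not
    a_without_b, not_a_without_b = 20, 80  # participants without "B": 20 have "A", 80 do not

    odds_a_given_b = a_with_b / not_a_with_b              # step 1: odds of "A" among those with "B"
    odds_a_given_not_b = a_without_b / not_a_without_b    # step 2: odds of "A" among those without "B"
    odds_ratio = odds_a_given_b / odds_a_given_not_b      # step 3: the odds ratio (OR)
    print(odds_ratio)  # (40/60) / (20/80) ~ 2.67: "B" is associated with higher odds of "A"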

The term “participant” does not necessarily mean a person: a population may consist of any objects, animate or inanimate.

If the OR exceeds 1, the presence of trait “A” is associated with trait “B” in the sense that having “B” raises the odds of having “A” relative to not having “B”.

An important note: although such an association exists, it does not imply a causal relationship between “B” and “A”; it is quite possible that the association is spurious, mediated by some other property “C” that induces both “A” and “B” (a spurious correlation). In our example, a spurious correlation could show up as follows: in the studied group of volunteers, people who drink alcohol moderately tend to have lower blood pressure, yet if we were to make volunteers who had never drunk alcohol start consuming it (in moderate amounts, naturally), we would find that on average their blood pressure does not change. Such contradictory results could be explained, hypothetically, by an extraneous factor: for example, the study group may consist mainly of people who have long and regularly consumed alcohol in moderate amounts and whose adaptation mechanisms are well developed, which could, hypothetically, show up as lower blood pressure. The “adaptation” factor is thus the extraneous (confounding) factor here.

The other two ways of quantifying the association between two qualitative traits are the relative risk (RR) and the absolute risk reduction (ARR). In clinical trials and in many other settings, the characteristic of greatest interest is the relative risk, which is calculated in the same way as the odds ratio except that probabilities are used instead of odds. Unfortunately, researchers often face situations in which the available data allow only the odds ratio to be calculated, notably in case-control studies. Nevertheless, when one of the traits, say “A”, is sufficiently rare (the “rare disease assumption”), the odds ratio for having “A” given that the participant has “B” is a good approximation of the relative risk (the qualification “A given B” matters, because the odds ratio treats the two properties symmetrically, while the relative risk and the other characteristics do not).

In technical terms, the odds ratio is a measure of effect size that describes the strength of association between two binary variables. It is used as a descriptive statistic and plays an important role in logistic regression.

Definition and basic properties

A rare disease study example

Imagine a rare disease that affects, say, only one adult in many thousands in a country. Suppose there is a factor (for example, a certain childhood trauma) that makes the later development of this disease in an adult more likely. The most informative indicator here would be the relative risk (RR). But to calculate it we would have to find out, for every adult in the population, (a) whether they experienced the trauma in childhood and (b) whether they have the disease now. We would then know the total number of people who had the trauma in childhood (the size of the exposed group) N_E, of whom D_E later became ill and H_E remained healthy, as well as the total number of people who had no such trauma (the size of the unexposed group) N_NE, of whom D_NE became ill and H_NE remained healthy. Since N_E = D_E + H_E, and the analogous identity holds for the “NE” indices, we have four independent numbers that we can arrange in a table:

                Sick    Healthy
  Had trauma    D_E     H_E
  No trauma     D_NE    H_NE

To avoid misunderstanding, we emphasize that all these numbers refer to the entire population, not to a sample.

Now the risk of developing the disease given the trauma is D_E / N_E (where N_E = D_E + H_E), and the risk of developing the disease in the absence of the trauma is D_NE / N_NE. The relative risk (RR) is the ratio of these two numbers:

RR = \frac{D_{E}/N_{E}}{D_{NE}/N_{NE}},

which can be rewritten as RR = \frac{D_{E}\,N_{NE}}{D_{NE}\,N_{E}} = \frac{D_{E}/D_{NE}}{N_{E}/N_{NE}}.

Now consider the odds of developing the disease: given the trauma they are D_E / H_E, and in the absence of the trauma they are D_NE / H_NE. The odds ratio (OR) is the ratio of these two numbers: OR = \frac{D_{E}/H_{E}}{D_{NE}/H_{NE}},

which can be rewritten as OR = \frac{D_{E}\,H_{NE}}{D_{NE}\,H_{E}} = \frac{D_{E}/D_{NE}}{H_{E}/H_{NE}}.

Since the disease is rare, OR ≈ RR. Indeed, for a rare disease we have D_E ≪ H_E, so D_E + H_E ≈ H_E and therefore D_E / (D_E + H_E) ≈ D_E / H_E; in other words, for the exposed group the risk of developing the disease is approximately equal to the odds. Analogous reasoning shows that risk is approximately equal to odds for the unexposed group as well; but then the ratio of risks, which is the RR, is approximately equal to the ratio of odds, which is the OR. One can also note that the rare disease assumption implies N_E ≈ H_E and N_NE ≈ H_NE, whence N_E / N_NE ≈ H_E / H_NE; in other words, the denominators in the final expressions for RR and OR are approximately equal. The numerators are exactly the same, so again we conclude that OR ≈ RR.
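
A minimal numeric sketch of this approximation, with made-up population counts for a rare disease:

    # Hypothetical population counts; the disease is rare in both groups.
    D_E, H_E = 30, 99_970       # exposed: diseased, healthy
    D_NE, H_NE = 10, 199_990    # unexposed: diseased, healthy

    RR = (D_E / (D_E + H_E)) / (D_NE / (D_NE + H_NE))  # relative risk
    OR = (D_E / H_E) / (D_NE / H_NE)                   # odds ratio
    print(RR, OR)  # 6.0 and ~6.001: nearly equal because D << H in each group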

Returning to our hypothetical study, a very common problem is that we may not have the data needed to estimate all four of these numbers. For example, we may not have population-wide data on who did or did not experience the trauma in childhood.

Often we can get around this problem by drawing a random sample from the population: if neither the disease nor the exposure to childhood trauma is rare, we can randomly select, say, a hundred people and determine these four numbers in the sample; provided the sample is sufficiently representative, the relative risk computed from it will be a good approximation of the relative risk in the whole population.

At the same time, some diseases may be so rare that even a very large sample is likely to contain no cases at all (or so few that statistical significance is out of the question), which makes the relative risk impossible to estimate in this way. Nevertheless, we can still estimate the odds ratio in these circumstances, because, unlike the disease, the exposure to childhood trauma is not rare. And because the disease is rare, this estimate will also be a good estimate of the relative risk.

Look at the last expression for the OR: the fraction in the numerator, D_E / D_NE, can be estimated by collecting all known cases of the disease (such cases are assumed to exist, otherwise we would not have started the study at all) and checking how many of the sick were exposed and how many were not. The fraction in the denominator, H_E / H_NE, is the odds that a healthy person in the population was exposed to the childhood trauma. These odds can in turn be estimated from a random sample of the population since, as noted above, the exposure to childhood trauma is common enough that a random sample of sufficient size will almost certainly contain a substantial number of exposed people. So here the disease is very rare, but the factor associated with it is not; similar situations are quite common in practice.

Thus, we can estimate the odds ratio and then, invoking the rarity of the disease, argue that this estimate is also a good approximation of the relative risk. Incidentally, the scenario just described is the classic case-control study. [1]

Similar reasoning can be carried out without resorting to the odds ratio at all, for example as follows: since N_E ≈ H_E and N_NE ≈ H_NE, we get N_E / N_NE ≈ H_E / H_NE. Therefore, if we estimate the ratio H_E / H_NE by random sampling, then under the rare disease assumption its value will be a good estimate of N_E / N_NE, which is what we need (while D_E / D_NE is already known from the collected cases of the disease) to calculate the RR. Nevertheless, it is considered good form when publishing results to report the odds ratio, with the proviso that the relative risk is approximately the same.
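
The case-control logic above can also be sketched in code. The population below is entirely hypothetical: all cases are assumed to be known, while the healthy majority is only sampled:

    import random

    random.seed(0)
    # Hypothetical population: exposure is common, the disease is rare.
    D_E, H_E, D_NE, H_NE = 30, 99_970, 10, 199_990

    case_ratio = D_E / D_NE                 # D_E / D_NE is observed from the collected cases

    healthy = ["E"] * H_E + ["NE"] * H_NE   # the healthy part of the population
    sample = random.sample(healthy, 1000)   # random sample of 1000 healthy people
    control_ratio = sample.count("E") / sample.count("NE")  # estimates H_E / H_NE

    OR_hat = case_ratio / control_ratio
    true_RR = (D_E / (D_E + H_E)) / (D_NE / (D_NE + H_NE))
    print(OR_hat, true_RR)  # the estimated OR is close to the true RR of 6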

Definition through Odds in Groups

The odds ratio is a fraction whose numerator is the odds of some event in one group and whose denominator is the odds of the same event in another group. The same expression is used to compute sample estimates of the association. The groups may be men and women, an experimental group and a control group, or any other dichotomy. If the probability of the event in the two groups is denoted p_1 (first group) and p_2 (second group), the odds ratio is:

\frac{p_{1}/(1-p_{1})}{p_{2}/(1-p_{2})} = \frac{p_{1}/q_{1}}{p_{2}/q_{2}} = \frac{p_{1}\,q_{2}}{p_{2}\,q_{1}},

where q_x = 1 − p_x. An odds ratio of 1 means that the event has equal odds in both groups. An odds ratio greater than 1 means that the event is more likely in the first group, and an odds ratio less than 1 means that it is less likely in the first group. The odds ratio is always non-negative (when it is defined). It is undefined if p_2 q_1 equals zero, that is, if p_2 is zero or q_1 is zero.
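
A small sketch of this definition; the group probabilities are hypothetical:

    def odds_ratio(p1, p2):
        """Odds ratio of an event between group 1 and group 2."""
        return (p1 / (1 - p1)) / (p2 / (1 - p2))

    print(odds_ratio(0.9, 0.2))  # 36.0: the event is far more likely in group 1
    print(odds_ratio(0.5, 0.5))  # 1.0: equal odds in both groups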

Definition through joint and conditional probabilities

The odds ratio can also be defined in terms of the joint probability distribution of two binary random variables. The joint distribution of binary random variables X and Y is given by the table

           Y = 1    Y = 0
  X = 1    p_11     p_10
  X = 0    p_01     p_00

where p_11, p_10, p_01 and p_00 are non-negative joint probabilities that sum to 1. The odds for Y in the two groups defined by X = 1 and X = 0 are computed from the conditional probabilities given X, that is, P(Y | X):

           Y = 1                   Y = 0
  X = 1    p_11 / (p_11 + p_10)    p_10 / (p_11 + p_10)
  X = 0    p_01 / (p_01 + p_00)    p_00 / (p_01 + p_00)

Thus, the odds ratio will be equal to

\frac{p_{11}/(p_{11}+p_{10})}{p_{10}/(p_{11}+p_{10})} \bigg/ \frac{p_{01}/(p_{01}+p_{00})}{p_{00}/(p_{01}+p_{00})} = \frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}.

The fraction on the right-hand side is easy to remember as the product of the probabilities of the concordant cells (X = Y) divided by the product of the probabilities of the discordant cells (X ≠ Y). Although labelling the categories with 0 and 1 is arbitrary, the rule about concordant and discordant cells still applies.
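
A one-line check of the concordant-over-discordant rule, with hypothetical joint probabilities:

    # Hypothetical joint probabilities; they must sum to 1.
    p11, p10, p01, p00 = 0.4, 0.1, 0.1, 0.4

    # Product of the concordant cells (X == Y) over the product of the discordant cells (X != Y).
    OR = (p11 * p00) / (p10 * p01)
    print(OR)  # 16.0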

Symmetry

If we calculate the odds ratio using the conditional probabilities given Y,

           Y = 1                   Y = 0
  X = 1    p_11 / (p_11 + p_01)    p_10 / (p_10 + p_00)
  X = 0    p_01 / (p_11 + p_01)    p_00 / (p_10 + p_00)

we get the same result

\frac{p_{11}/(p_{11}+p_{01})}{p_{01}/(p_{11}+p_{01})} \bigg/ \frac{p_{10}/(p_{10}+p_{00})}{p_{00}/(p_{10}+p_{00})} = \frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}.

Other measures of effect size for binary data, such as the relative risk, do not have this symmetry property.

Relationship with the property of statistical independence

If X and Y are independent, their joint probabilities can be expressed in terms of the marginal probabilities p x = P ( X = 1) and p y = P ( Y = 1) as follows:

           Y = 1             Y = 0
  X = 1    p_x p_y           p_x (1 − p_y)
  X = 0    (1 − p_x) p_y     (1 − p_x)(1 − p_y)

In this case the odds ratio equals one; conversely, if the odds ratio equals one, the joint probabilities can be factored into such products. Thus, the odds ratio equals one if and only if X and Y are independent.

Determining the joint probabilities from the odds ratio and the marginal probabilities

The odds ratio is a function of the joint probabilities; conversely, the joint probabilities can be recovered when the odds ratio and the marginal probabilities P(X = 1) = p_11 + p_10 and P(Y = 1) = p_11 + p_01 are known. If the odds ratio R differs from 1, then

p_{11} = \frac{1 + (p_{1\cdot} + p_{\cdot 1})(R - 1) - S}{2(R - 1)},

where p_1· = p_11 + p_10, p_·1 = p_11 + p_01, and

S = \sqrt{(1 + (p_{1\cdot} + p_{\cdot 1})(R - 1))^{2} + 4R(1 - R)\,p_{1\cdot}\,p_{\cdot 1}}.

In the case R = 1 we have independence, so p_11 = p_1· p_·1.

Once p_11 is known, the other three probabilities follow easily from the marginals.
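
A minimal sketch of this reconstruction, using the hypothetical values R = 16 and equal marginals of 0.5 (the helper function name is ours):

    from math import sqrt

    def joint_from_or(R, p1_dot, p_dot1):
        """Recover p_11 from the odds ratio R and the marginals P(X=1), P(Y=1)."""
        if R == 1:                            # independence
            return p1_dot * p_dot1
        s = sqrt((1 + (p1_dot + p_dot1) * (R - 1)) ** 2
                 + 4 * R * (1 - R) * p1_dot * p_dot1)
        return (1 + (p1_dot + p_dot1) * (R - 1) - s) / (2 * (R - 1))

    p11 = joint_from_or(16, 0.5, 0.5)
    p10 = 0.5 - p11            # the remaining cells follow from the marginals
    p01 = 0.5 - p11
    p00 = 1 - p11 - p10 - p01
    print(p11, p10, p01, p00)  # 0.4 0.1 0.1 0.4, and (p11*p00)/(p10*p01) is again 16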

Example

 
The diagram shows the relationship between the logarithm of the odds ratio and the corresponding probabilities of event X in two groups, A and B. The log odds ratios are computed from the odds of the event in group B relative to the odds in group A; thus, if the probability of X in group B is higher than in group A, the odds ratio is greater than 1 and its logarithm is greater than 0.

Suppose that in a sample of 100 men, 90 drank wine in the previous week, while in a sample of 100 women only 20 drank wine over the same period. The odds of a man drinking wine are 90 to 10, or 9:1, while the odds of a woman drinking wine are only 20 to 80, or 1:4 = 0.25:1. The odds ratio is 9/0.25, or 36, which shows that men are far more likely to drink wine. In more detail:

\frac{0.9/0.1}{0.2/0.8} = \frac{0.9 \times 0.8}{0.1 \times 0.2} = \frac{0.72}{0.02} = 36.

This example also shows how odds ratios differ from other ways of comparing the groups: there are 90/20 = 4.5 times as many wine-drinking men as wine-drinking women, yet the men's odds are 36 times higher. The logarithm of the odds ratio, which is the difference of the logits of the probabilities, tempers this effect and is symmetric with respect to the order of the groups. For example, taking the natural logarithm of the odds ratio 36/1 gives 3.584, while doing the same with the ratio 1/36 gives −3.584.
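
The arithmetic of the wine example can be reproduced in a few lines (a sketch):

    from math import log

    p_men, p_women = 0.9, 0.2
    OR = (p_men / (1 - p_men)) / (p_women / (1 - p_women))
    print(OR)       # 36.0
    print(log(OR))  # 3.584 (and log(1/OR) is exactly -3.584)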

Statistical inference

 
The graph shows the minimum value of the log odds ratio that reaches significance at the 0.05 level for a given sample size. The three curves correspond to different marginal probabilities of the 2x2 contingency table (the row and column marginal probabilities are assumed equal).

Several approaches have been developed to test statistical hypotheses about odds ratios.

One approach is based on approximating the sampling distribution of the logarithm of the odds ratio (that is, of the natural logarithm of the odds ratio). In terms of the joint probabilities, the logarithm of the population odds ratio is

\log\left(\frac{p_{11}\,p_{00}}{p_{01}\,p_{10}}\right) = \log(p_{11}) + \log(p_{00}) - \log(p_{10}) - \log(p_{01}).

If we present the results of the experiment in the form of a contingency table

           Y = 1    Y = 0
  X = 1    n_11     n_10
  X = 0    n_01     n_00

probability estimates for the joint distribution can be defined as follows:

           Y = 1    Y = 0
  X = 1    p̂_11    p̂_10
  X = 0    p̂_01    p̂_00

where p̂_ij = n_ij / n and n = n_11 + n_10 + n_01 + n_00 is the sum of all four cells of the table. The logarithm of the sample odds ratio is:

L = \log\left(\frac{\hat{p}_{11}\,\hat{p}_{00}}{\hat{p}_{10}\,\hat{p}_{01}}\right) = \log\left(\frac{n_{11}\,n_{00}}{n_{10}\,n_{01}}\right).

The distribution of the logarithm of the odds ratio is well approximated by the normal distribution with the parameters:

X \sim \mathcal{N}(\log(\mathrm{OR}),\,\sigma^{2}).

The standard error of the logarithm of the odds ratio is estimated by the formula

\mathrm{SE} = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{10}} + \frac{1}{n_{01}} + \frac{1}{n_{00}}}.

This approximation is asymptotic and can therefore give meaningless results if any of the cells contains a very small count. If L denotes the logarithm of the sample odds ratio, an approximate 95% confidence interval for the logarithm of the population odds ratio under the normal model is L ± 1.96 SE. [2] Exponentiating, (exp(L − 1.96 SE), exp(L + 1.96 SE)) gives a 95% confidence interval for the odds ratio itself. To test the hypothesis that the population odds ratio equals one, the two-sided p-value can be computed as 2 P(Z < −|L| / SE), where P denotes probability and Z is a standard normal random variable.
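
A compact sketch of these calculations for a hypothetical 2x2 table (only the Python standard library is used):

    from math import log, sqrt, exp, erf

    # Hypothetical 2x2 contingency table of counts.
    n11, n10, n01, n00 = 20, 10, 10, 20

    L = log((n11 * n00) / (n10 * n01))         # log of the sample odds ratio
    SE = sqrt(1/n11 + 1/n10 + 1/n01 + 1/n00)   # its estimated standard error

    lo, hi = exp(L - 1.96 * SE), exp(L + 1.96 * SE)   # approximate 95% CI for the OR

    def norm_cdf(z):                           # standard normal CDF via the error function
        return 0.5 * (1 + erf(z / sqrt(2)))

    p_value = 2 * norm_cdf(-abs(L) / SE)       # two-sided test of OR = 1
    print(exp(L), (lo, hi), p_value)           # OR = 4, CI roughly (1.4, 11.7), p ~ 0.011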

Another approach reconstructs, to some extent, the sampling distribution of the sample odds ratio. To do this, the marginal frequencies of X and Y are held fixed while the cell values of the table are varied systematically or at random. It is easy to see that only one cell of the table can change freely, since the remaining cells are then determined by the fixed marginal frequencies.

Role in Logistic Regression

Logistic regression is one way of estimating the odds ratio between two binary variables. Suppose there is a binary dependent variable Y, a binary independent variable (predictor) X, and a group of additional predictors Z_1, ..., Z_p that may take any values. If we fit a multiple logistic regression of Y on X, Z_1, ..., Z_p, the estimated coefficient β̂_x for X is related to a conditional odds ratio. Namely, at the population level,

\exp(\beta_{x}) = \frac{P(Y=1 \mid X=1, Z_{1}, \ldots, Z_{p}) \,/\, P(Y=0 \mid X=1, Z_{1}, \ldots, Z_{p})}{P(Y=1 \mid X=0, Z_{1}, \ldots, Z_{p}) \,/\, P(Y=0 \mid X=0, Z_{1}, \ldots, Z_{p})},

so exp(β̂_x) is an estimate of this conditional odds ratio. In this setting, exp(β̂_x) is interpreted as an estimate of the odds ratio between Y and X at fixed values of the variables Z_1, ..., Z_p.
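
A sketch of this relationship on simulated data, using the statsmodels package (the package choice and the simulated coefficients are our assumptions; any logistic regression routine would do):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 5000
    z = rng.normal(size=n)                        # one continuous covariate Z
    x = rng.integers(0, 2, size=n)                # binary predictor X
    logit = -1.0 + 0.7 * x + 0.5 * z              # true beta_x = 0.7
    y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)  # binary outcome Y

    model = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=0)
    beta_x = model.params[1]                      # coefficient of X
    print(np.exp(beta_x))                         # close to exp(0.7) ~ 2.01, the conditional OR for X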

Sample Type Insensitivity

When the data form a representative sample, the cell probabilities p̂_ij are interpreted as the population frequencies of each of the four groups defined by the combinations of X and Y values. In many settings a representative sample is impractical to obtain, so a selective sample is drawn instead. For example, the objects may be chosen so that those with X = 1 appear with a given probability f, regardless of their actual frequency in the population (so that objects with X = 0 appear with probability 1 − f). In this case the joint probabilities become:

           Y = 1                            Y = 0
  X = 1    f p_11 / (p_11 + p_10)           f p_10 / (p_11 + p_10)
  X = 0    (1 − f) p_01 / (p_01 + p_00)     (1 − f) p_00 / (p_01 + p_00)

The odds ratio for this distribution, p_11 p_00 / (p_01 p_10), does not depend on f. This shows that the odds ratio (and hence its logarithm) is invariant to non-random sampling with respect to one of the variables under study. Note, however, that the standard error of the logarithm of the odds ratio does depend on f.
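
This invariance is easy to verify numerically; the joint probabilities below are hypothetical:

    # Hypothetical joint probabilities in the population.
    p11, p10, p01, p00 = 0.4, 0.1, 0.1, 0.4

    def sampled_or(f):
        """OR after including X=1 objects with probability f and X=0 objects with 1-f."""
        q11 = f * p11 / (p11 + p10)
        q10 = f * p10 / (p11 + p10)
        q01 = (1 - f) * p01 / (p01 + p00)
        q00 = (1 - f) * p00 / (p01 + p00)
        return (q11 * q00) / (q10 * q01)

    print(sampled_or(0.1), sampled_or(0.5), sampled_or(0.9))  # the same value (16) for every f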

The invariance property is used in two very important situations:

  • Obtaining a representative sample is inconvenient or impractical, but it is possible to obtain a suitable sample of objects with different X values such that, within the subsamples X = 0 and X = 1, the Y values are representative of the population (that is, they follow the correct conditional probabilities).
  • The marginal distribution of one of the variables, say X, is strongly skewed. For example, when studying the relationship between heavy alcohol consumption and pancreatic cancer, the incidence of the cancer may be so low that a very large representative sample would be needed to capture even a few cases. Instead, we can use hospital data to reach most or all patients with pancreatic cancer and then draw a random sample of an equal number of subjects without the cancer (this design is called a case-control study).

In both situations, the odds ratio can be estimated without bias from the data of the selective sample.

Use in quantitative research

Because logistic regression is so widely used, the odds ratio appears frequently in medical and social research. It is commonly used in survey research, in epidemiology, and in reporting the results of clinical studies such as case-control studies. In reports it is usually abbreviated “OR”. When the results of several surveys are combined, the term “pooled OR” is used.

Relative Risk

In clinical and other studies, the relative risk rather than the odds ratio is usually the characteristic of interest. The relative risk is best estimated from the full population, but if the rare disease assumption holds, the odds ratio is a good approximation to the relative risk: the odds are of the form p / (1 − p), so as p approaches zero, 1 − p approaches one, the odds approach the risk, and hence the odds ratio approaches the relative risk. [3] When the rare disease assumption does not hold, the odds ratio can overstate the relative risk. [4] [5] [6]

If the absolute risk in the control group is known, one value can be converted into the other using the following expression (a small numeric sketch follows the list of symbols): [4]

RR \approx \frac{OR}{1 - R_{C} + (R_{C} \times OR)}

Where:

  • RR = relative risk
  • OR = odds ratio
  • R C = absolute risk in the unexposed group, specified as a fraction (for example, a risk value of 10% is entered into the formula as 0.1)
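
A small numeric sketch of this conversion (the odds ratios and baseline risks below are made up):

    def rr_from_or(odds_ratio, control_risk):
        """Approximate relative risk from an odds ratio and the unexposed-group risk."""
        return odds_ratio / (1 - control_risk + control_risk * odds_ratio)

    # With a 10% baseline risk, an OR of 2.5 corresponds to a noticeably smaller RR:
    print(rr_from_or(2.5, 0.10))   # ~2.17
    # For a rare outcome the two nearly coincide:
    print(rr_from_or(2.5, 0.001))  # ~2.50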

Confusion and exaggeration

In the medical literature the odds ratio is often confused with the relative risk. For audiences of non-statisticians the concept of the odds ratio is difficult to grasp, and it tends to produce a more impressive effect on the reader. [7] Most authors, however, find that relative risk is readily understood. [8] One study reported that members of a national disease foundation were actually 3.5 times more likely than non-members to know the general principles of treating the disease, but the odds ratio was 24, and the article presented this as members being “more than 20 times more likely to know about treatment”. [9] A study of articles in two journals found that in 26% of them the odds ratio was interpreted as a risk ratio. [10]

This may indicate that authors who do not fully grasp the meaning of the quantity prefer it because it looks more impressive in a publication. [8] But in some cases its use can be misleading. [11] As noted above, it has been suggested that the odds ratio should be used as the measure of effect only when the risk ratio cannot be estimated directly. [7]

Reversibility and Invariance

Another distinctive feature of the odds ratio is its direct mathematical invertibility: depending on how the problem is posed, one may study either freedom from a disease or the presence of the disease, and the OR for freedom from the disease is simply the reciprocal (1/OR) of the OR for the presence of the disease. This is the property of “invariance of the odds ratio”, which the relative risk does not have. Consider an example:

Suppose a clinical trial finds a risk of the event of 4/100 in the group taking the drug and 2/100 in the placebo group. This gives RR = 2 and OR ≈ 2.0417 for the event when comparing the drug group with the placebo group. If instead we turn the analysis around and look at the absence of the event, the drug group has a no-event risk of 96/100 and the placebo group 98/100, which gives RR ≈ 0.9796 for the absence of the event in the drug-versus-placebo comparison, but OR ≈ 0.4898. As can be seen, RR = 0.9796 is not the reciprocal of RR = 2, whereas OR ≈ 0.4898 is indeed the reciprocal of OR ≈ 2.0417.

This is the “invariance of the odds ratio”: the RR for freedom from an event is not the reciprocal of the RR for the event, while the OR retains this symmetry whether one analyses freedom from the event or its risk. The danger for the clinical interpretation of the OR arises when the event probability is high, since the differences are then exaggerated because the rare disease assumption is not met. On the other hand, when the disease really is rare, describing freedom from the event with the RR (for example, RR ≈ 0.9796 from the example above) can obscure a clinically important doubling of the risk of the event associated with the drug or exposure.
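
The invariance claim from the example can be checked directly (a sketch with the same hypothetical counts):

    # Hypothetical trial counts from the example above.
    drug_event, drug_total = 4, 100
    placebo_event, placebo_total = 2, 100

    RR_event = (drug_event / drug_total) / (placebo_event / placebo_total)    # 2.0
    OR_event = (drug_event / (drug_total - drug_event)) / (
        placebo_event / (placebo_total - placebo_event))                      # ~2.0417

    RR_no_event = ((drug_total - drug_event) / drug_total) / (
        (placebo_total - placebo_event) / placebo_total)                      # ~0.9796
    OR_no_event = ((drug_total - drug_event) / drug_event) / (
        (placebo_total - placebo_event) / placebo_event)                      # ~0.4898

    print(RR_no_event, 1 / RR_event)  # 0.9796 vs 0.5: the RR is not invariant
    print(OR_no_event, 1 / OR_event)  # both ~0.4898: the OR is invariant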

Alternative estimators of the odds ratio

The sample odds ratio n_11 n_00 / (n_10 n_01) is easy to compute and, for moderate and large samples, is a good estimator of the population odds ratio. When one or more cells of the contingency table contain a small count, however, the sample odds ratio can be biased and highly variable. Several alternative estimators with better properties in such situations have been proposed. One alternative is the conditional maximum likelihood estimator, which conditions on the row and column totals when forming the likelihood to be maximized (just as in Fisher's exact test). [12] Another alternative is the Mantel-Haenszel estimator.
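
A minimal sketch of the Mantel-Haenszel pooled estimate across strata (the two example tables are hypothetical):

    def mantel_haenszel_or(tables):
        """Pooled odds ratio over strata; each table is (n11, n10, n01, n00)."""
        num = sum(n11 * n00 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in tables)
        den = sum(n10 * n01 / (n11 + n10 + n01 + n00) for n11, n10, n01, n00 in tables)
        return num / den

    strata = [(20, 10, 10, 20), (8, 4, 5, 11)]  # two hypothetical 2x2 tables
    print(mantel_haenszel_or(strata))           # pooled OR across both strata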

Numerical examples

The following four contingency tables contain joint absolute frequencies, together with the corresponding sample odds ratios (OR) and log sample odds ratios (LOR):

           OR = 1, LOR = 0     OR = 1, LOR = 0     OR = 4, LOR = 1.39     OR = 0.25, LOR = −1.39
           Y = 1    Y = 0      Y = 1    Y = 0      Y = 1    Y = 0         Y = 1    Y = 0
  X = 1    10       10         100      100        20       10            10       20
  X = 0    5        5          50       50         10       20            20       10

The following tables of joint distributions contain population joint probabilities, together with the corresponding population odds ratios (OR) and log population odds ratios (LOR):

           OR = 1, LOR = 0     OR = 1, LOR = 0     OR = 16, LOR = 2.77    OR = 0.67, LOR = −0.41
           Y = 1    Y = 0      Y = 1    Y = 0      Y = 1    Y = 0         Y = 1    Y = 0
  X = 1    0.2      0.2        0.4      0.4        0.4      0.1           0.1      0.3
  X = 0    0.3      0.3        0.1      0.1        0.1      0.4           0.2      0.4

Real examples

                        Example 1: risk reduction                              Example 2: risk increase
                        Experimental (E)       Control (C)          Total      Experimental (E)    Control (C)   Total
  Events (E)            EE = 15                CE = 100             115        EE = 75             CE = 100      175
  Non-events (N)        EN = 135               CN = 150             285        EN = 75             CN = 150      225
  Total subjects (S)    ES = EE + EN = 150     CS = CE + CN = 250   400        ES = 150            CS = 250      400
  Event rate (ER)       EER = EE / ES = 0.1 (10%)    CER = CE / CS = 0.4 (40%)            EER = 0.5 (50%)     CER = 0.4 (40%)

  Formula                 Quantity                          Abbr.   Example 1             Example 2
  EER − CER               < 0: absolute risk reduction      ARR     (−)0.3, or (−)30%     N/A
                          > 0: absolute risk increase       ARI     N/A                   0.1, or 10%
  (EER − CER) / CER       < 0: relative risk reduction      RRR     (−)0.75, or (−)75%    N/A
                          > 0: relative risk increase       RRI     N/A                   0.25, or 25%
  1 / (EER − CER)         < 0: number needed to treat       NNT     (−)3.33               N/A
                          > 0: number needed to harm        NNH     N/A                   10
  EER / CER               relative risk                     RR      0.25                  1.25
  (EE / EN) / (CE / CN)   odds ratio                        OR      0.167                 1.5
  EER − CER               attributable risk                 AR      (−)0.30, or (−)30%    0.1, or 10%
  (RR − 1) / RR           attributable risk percent         ARP     N/A                   20%
  1 − RR (or 1 − OR)      preventive fraction               PF      0.75, or 75%          N/A

See also

  • Hazard rate (in survival analysis)
  • Relative risk

Notes

  1. LaMorte, Wayne W. (May 13, 2013). Case-Control Studies. Boston University School of Public Health. http://sph.bu.edu/otlt/MPH-Modules/EP/EP713_AnalyticOverview/EP713_AnalyticOverview5.html# Retrieved September 2, 2013.
  2. Morris J.A., Gardner M.J. Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates // British Medical Journal. 1988. Vol. 296, no. 6632. P. 1313-1316. DOI: 10.1136/bmj.296.6632.1313. PMID 3133061.
  3. Viera A.J. Odds ratios and risk ratios: what's the difference and why does it matter? // Southern Medical Journal. 2008, July. Vol. 101, no. 7. P. 730-734. DOI: 10.1097/SMJ.0b013e31817a7ee4. PMID 18580722.
  4. Zhang J., Yu K.F. What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes // JAMA. 1998, November. Vol. 280, no. 19. P. 1690-1691. DOI: 10.1001/jama.280.19.1690. PMID 9832001. (inaccessible link)
  5. Robbins A.S., Chao S.Y., Fonseca V.P. What's the relative risk? A method to directly estimate risk ratios in cohort studies of common outcomes // Annals of Epidemiology. 2002, October. Vol. 12, no. 7. P. 452-454. DOI: 10.1016/S1047-2797(01)00278-2. PMID 12377421.
  6. Nurminen, Markku. To Use or Not to Use the Odds Ratio in Epidemiologic Analyses? // European Journal of Epidemiology. 1995. Vol. 11, no. 4. P. 365-371. DOI: 10.1007/BF01721219.
  7. Taeger D., Sun Y., Straif K. On the use, misuse and interpretation of odds ratios. 10 August 1998. DOI: 10.1136/bmj.316.7136.989. http://www.bmj.com/content/316/7136/989?tab=responses
  8. A'Court, Christine; Stevens, Richard; Heneghan, Carl. Against all odds? Improving the understanding of risk reporting // British Journal of General Practice. Vol. 62, no. 596, March 2012. P. e220-e223(4). DOI: 10.3399/bjgp12X630223.
  9. Nijsten T., Rolstad T., Feldman S.R., Stern R.S. Members of the national psoriasis foundation: more extensive disease and better informed about treatment options // Archives of Dermatology. 2005. Vol. 141, no. 1. P. 19-26 (p. 24, table 3 and text). http://archderm.ama-assn.org/cgi/reprint/141/1/19.pdf
  10. Holcomb W.L., Chaiworapongsa T., Luke D.A., Burgdorf K.D. An Odd Measure of Risk: Use and Misuse of the Odds Ratio // Obstetrics and Gynecology. 2001. Vol. 98, no. 4. P. 685-688.
  11. Sibanda, Thabani. The trouble with odds ratios. 1 May 2003. DOI: 10.1136/bmj.316.7136.989. http://www.bmj.com/content/316/7136/989?tab=responses
  12. Rothman, Kenneth J. Modern Epidemiology. Lippincott Williams & Wilkins, 2008. ISBN 0-7817-5564-6.

Links

  • A simple calculator for calculating odds ratios - website
  • Odds ratio calculator with various tests - website
  • OpenEpi Internet application that calculates the odds ratio for different situations
Source: https://ru.wikipedia.org/w/index.php?title=Отношение_шансов&oldid=101073245

